Hi there,
As I described in my previous post here, I created a LogFileParser and did some work on it.
Download: https://github.com/ddneves/LogFileParser
First of all, I want to show you some of my findings from this project, not all of which I foresaw:
Findings:
- The first of the following two lines is much faster than the second (even more so for big files):
$t = (Get-Content -Path $Path -ReadCount 1000).Split([Environment]::NewLine)
$t = Get-Content -Path $Path
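A rough way to check this on your own files is to time both reads with Measure-Command. The following is only a sketch: the sample file, its size, and the variable names are assumptions, and the batched read is flattened with ForEach-Object rather than with the Split call shown above.

```powershell
# Hypothetical sample file; point $Path at a real log to get meaningful numbers.
$Path = Join-Path -Path ([System.IO.Path]::GetTempPath()) -ChildPath 'readcount-sample.log'
1..5000 | ForEach-Object { "line $_" } | Set-Content -Path $Path

# Batched read: each pipeline object is an array of up to 1000 lines,
# which ForEach-Object unrolls back into single lines.
$batched = Measure-Command {
    $script:linesBatched = Get-Content -Path $Path -ReadCount 1000 | ForEach-Object { $_ }
}

# Default read: one pipeline object per line.
$default = Measure-Command {
    $script:linesDefault = Get-Content -Path $Path
}

'Batched: {0:N0} ms, default: {1:N0} ms' -f $batched.TotalMilliseconds, $default.TotalMilliseconds

Remove-Item -Path $Path
```

The timings depend heavily on file size and disk caching, so run each variant several times before drawing conclusions.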
- Performance of the parsing loop (slowest to fastest):
StreamReader with While <<<< Foreach() << Foreach-Object < piped Functions (fastest by 20% vs. Foreach-Object)
- Parallelizing with e.g. Invoke-Parallel has not worked out so far: it was not as fast as I expected (10-30%) and caused memory problems with larger or multiple files
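For reference, the four loop styles compared above look roughly like this. It is a sketch under assumptions: ConvertTo-Entry is a hypothetical stand-in for the real per-line parsing, and the sample file is generated only for the demo. No timings are claimed here.

```powershell
# Hypothetical sample file for the demo.
$Path = Join-Path -Path ([System.IO.Path]::GetTempPath()) -ChildPath 'loop-sample.log'
1..1000 | ForEach-Object { "info entry $_" } | Set-Content -Path $Path

# Hypothetical stand-in for the real parsing work, usable in a pipeline.
function ConvertTo-Entry {
    param([Parameter(ValueFromPipeline = $true)][string] $Line)
    process { $Line.ToUpper() }
}

# 1. StreamReader with while
$reader = New-Object -TypeName System.IO.StreamReader -ArgumentList $Path
try {
    $viaReader = while ($null -ne ($line = $reader.ReadLine())) { $line.ToUpper() }
}
finally {
    $reader.Dispose()
}

$lines = Get-Content -Path $Path

# 2. foreach statement
$viaForeach = foreach ($line in $lines) { $line.ToUpper() }

# 3. ForEach-Object cmdlet
$viaCmdlet = $lines | ForEach-Object { $_.ToUpper() }

# 4. Piped function
$viaFunction = $lines | ConvertTo-Entry

Remove-Item -Path $Path
```

Wrapping each variant in Measure-Command lets you repeat the comparison with your own log files.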
- Filtering Performance
- Where-Object {} << .Where{} (fastest). Take a look here.
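The two filtering styles side by side, as a small sketch. The $entries collection and its Level property are made up for the demo; the .Where() method needs PowerShell 4 or later.

```powershell
# Made-up entries: every tenth one is an error.
$entries = 1..10000 | ForEach-Object {
    [pscustomobject]@{
        Id    = $_
        Level = $(if ($_ % 10 -eq 0) { 'Error' } else { 'Info' })
    }
}

# Cmdlet: every object travels through the pipeline.
$viaCmdlet = $entries | Where-Object { $_.Level -eq 'Error' }

# Method: filters the collection in place, without the pipeline.
$viaMethod = $entries.Where({ $_.Level -eq 'Error' })
```

Both return the same entries; only the result type differs (an array from the cmdlet, a collection from the method).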
- Classes in PowerShell are fun!
- Overriding ToString() in some classes makes sense and creates better overviews
- ToString() is called when you list the class, for example:
Listing a list of ParsedLogFile would show { ParsedLogFile, ParsedLogFile }.
To give the user a better overview, you override ToString in ParsedLogFile:
#Overriding ToString to show the LogFilenames in the overview
[string] ToString()
{
    return ($this.LogFilePath).ToString()
}
and now you see a list of the filepaths.
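To try the effect in isolation, here is a minimal stand-in for the class. The real ParsedLogFile in the repository has more members; the constructor and the sample paths below are assumptions for the demo.

```powershell
# Minimal stand-in, not the class from the repository.
class ParsedLogFile {
    [string] $LogFilePath

    ParsedLogFile([string] $Path) {
        $this.LogFilePath = $Path
    }

    # Overriding ToString to show the file path instead of the type name
    [string] ToString() {
        return $this.LogFilePath
    }
}

$files = [ParsedLogFile]::new('C:\Logs\a.log'), [ParsedLogFile]::new('C:\Logs\b.log')

# Converting the objects to strings now yields the paths.
$overview = $files -join ', '
$overview
```

Without the override, the same join would only repeat the type name ParsedLogFile.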
- Generic Lists with classes work!
$this.ParsedLogFiles = New-Object -TypeName System.Collections.Generic.List``1[ParsedLogFile]
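A small sketch of what the typed list buys you. It uses the [List[T]]::new() spelling instead of New-Object, and a stripped-down ParsedLogFile with only one property; both are assumptions for the demo.

```powershell
# Stripped-down stand-in for the real class.
class ParsedLogFile {
    [string] $LogFilePath
}

$list = [System.Collections.Generic.List[ParsedLogFile]]::new()

# A hashtable cast fills the properties through the default constructor.
$list.Add([ParsedLogFile]@{ LogFilePath = 'C:\Logs\a.log' })

# Anything that cannot be converted to ParsedLogFile is rejected.
$rejected = $false
try {
    $list.Add('not a log file')
}
catch {
    $rejected = $true
}

'Items: {0}, string rejected: {1}' -f $list.Count, $rejected
```

The list only ever holds ParsedLogFile objects, so code that consumes it does not need to check types.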
- Export-Clixml for self-made nested classes works.
- But be careful: the object you get after importing it again is a deserialized object, which cannot be cast back to its original class. You can still work with it easily by not declaring any data type.
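The round trip can be sketched as follows. The class is a stripped-down stand-in and the temp file name is made up; the point is only to show what comes back from Import-Clixml.

```powershell
# Stripped-down stand-in for the real class.
class ParsedLogFile {
    [string] $LogFilePath
}

$original = [ParsedLogFile]@{ LogFilePath = 'C:\Logs\a.log' }

# Made-up temp file for the demo.
$xmlPath = Join-Path -Path ([System.IO.Path]::GetTempPath()) -ChildPath 'parsedlogfile-demo.xml'
$original | Export-Clixml -Path $xmlPath

# No type constraint on the variable: take the deserialized object as it comes.
$restored = Import-Clixml -Path $xmlPath

$restored.PSObject.TypeNames[0]      # type name carries a 'Deserialized.' prefix
$restored -is [ParsedLogFile]        # no longer an instance of the class
$restored.LogFilePath                # the property data is still there

Remove-Item -Path $xmlPath
```

The data survives, but methods defined on the class (such as an overridden ToString) are not available on the imported object.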
New stuff: