Hi there,
as I described in my previous post here, I created a LogFileParser and did some work on it.
Download: https://github.com/ddneves/LogFileParser
First of all, I want to show you some findings from this project, not all of which I foresaw:
Findings:
- The first of the following two lines is much faster than the second (even more so for big files):
$t = (Get-Content -Path $Path -ReadCount 1000).Split([Environment]::NewLine)
$t = Get-Content -Path $Path
- Performance of the parsing loop
StreamReader with While <<<< Foreach() << Foreach-Object < piped functions (fastest, by 20% vs. Foreach-Object)
- Parallelizing with e.g. Invoke-Parallel has not worked out so far
- it was not as fast as I expected (10-30%) and caused memory problems with larger or multiple files
- Filtering Performance
- Where-Object {} << .Where{} (fastest). Take a look here
- Classes in Powershell are fun!
- Overriding ToString() in some classes makes sense and creates better overviews
- ToString() is called when objects of the class are listed. For example:
Listing a list of ParsedLogFile objects would show { ParsedLogFile, ParsedLogFile }
To give the user a better overview, you override ToString() in ParsedLogFile:
#Overriding ToString to show the LogFilenames in the overview
[string] ToString()
{
    return ($this.LogFilePath).ToString()
}
and now you see a list of the filepaths.
- Generic Lists with classes work!
$this.ParsedLogFiles = New-Object -TypeName System.Collections.Generic.List``1[ParsedLogFile]
- Export-CliXML for self made nested classes works.
- But be careful: the object you get after importing it again is a deserialized object, which cannot be cast back to its previous class. You can still work with it easily by not specifying any data type.
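The deserialization finding can be reproduced with a small sketch. DemoLogFile and the temp-file path are hypothetical stand-ins for illustration only; the real ParsedLogFile class lives in the repository.

```powershell
# Hypothetical stand-in class, not the ParsedLogFile from the repository.
class DemoLogFile {
    [string] $LogFilePath
    DemoLogFile([string] $path) { $this.LogFilePath = $path }
    [string] ToString() { return $this.LogFilePath }
}

$original = [DemoLogFile]::new('C:\Temp\demo.log')
$xmlPath  = Join-Path ([System.IO.Path]::GetTempPath()) 'DemoLogFile.xml'

# Round trip through CliXML
Export-Clixml -InputObject $original -Path $xmlPath
$imported = Import-Clixml -Path $xmlPath

# The imported object is a property bag, not a DemoLogFile instance
$imported -is [DemoLogFile]        # False
$imported.PSObject.TypeNames[0]    # Deserialized.DemoLogFile
$imported.LogFilePath              # C:\Temp\demo.log
```

The properties survive the round trip, but methods defined on the class do not, which is why the imported object is best kept in an untyped variable.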
New stuff:
I managed to increase the performance by about 20% by using piped functions (vs. Foreach-Object), and I also added LogFileTypeClasses to make it easier to extend this LogFileParser with new LogFileTypes.
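As a rough illustration of what a piped function looks like, here is a minimal sketch. ConvertTo-ParsedLine, the pattern, and the sample lines are all made up for this example and are not taken from the repository; it only shows the shape of the technique, not the measured speedup.

```powershell
# Hypothetical piped function: the process block runs once per pipeline item,
# and $_ holds the current line.
function ConvertTo-ParsedLine {
    param([regex] $Pattern)
    process {
        $m = $Pattern.Match($_)
        if ($m.Success) {
            [pscustomobject]@{
                Level   = $m.Groups['Level'].Value
                Message = $m.Groups['Message'].Value
            }
        }
    }
}

# Made-up pattern and input, for illustration only
$pattern = [regex] '^(?<Level>\w+):\s+(?<Message>.*)$'
$lines   = 'INFO: started', 'ERROR: disk full', 'garbage'

# Lines that do not match the pattern are silently dropped
$parsed = $lines | ConvertTo-ParsedLine -Pattern $pattern
$parsed.Count   # 2
```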
A LogFileTypeClass consists of the following:
#SCCM
$newClass = [LogFileTypeClass]::new()
$newClass.LogFileType = 'SCCM'
$newClass.Description = 'All SCCM log-files.'
$newClass.RegExString = '<!\[LOG\[(?<Entry>.*)]LOG]!><time="(?<Time>.*)\.\d{3}-\d{3}"\s+date="(?<Date>.*)"\s+component="(?<Component>.*)"\s+context="(?<Context>.*)"\s+type="(?<Type>.*)"\s+thread="(?<Thread>.*)"\s+file="(?<File>.*):(?<CodeLine>\d*)">'
$newClass.LogFiles = 'default'
$newClass.LocationsLogFiles = ('c:\windows\ccm\logs\*','c:\Program Files\System Center Configuration Manager*')
($this.LoadedClasses).Add($newClass)
This can be found in the constructor of LogFileTypeClasses.
LogFileType – the unique, short name for a type of log file.
Description – a description of the LogFileType.
RegExString – the regular expression used to parse this kind of log file. Look here
LogFiles – array of strings identifying the log files of this type; it is used in a -like comparison.
LocationsLogFiles – the locations where this kind of log file can be found. This property is purely informational and is not used.
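To see how the named groups in a RegExString turn into fields, here is a small sketch using the SCCM pattern with -match. The sample log line is invented for illustration and does not come from a real log; the parser itself may apply the pattern differently.

```powershell
# SCCM pattern as used for the RegExString property
$regEx = '<!\[LOG\[(?<Entry>.*)]LOG]!><time="(?<Time>.*)\.\d{3}-\d{3}"\s+date="(?<Date>.*)"\s+component="(?<Component>.*)"\s+context="(?<Context>.*)"\s+type="(?<Type>.*)"\s+thread="(?<Thread>.*)"\s+file="(?<File>.*):(?<CodeLine>\d*)">'

# Made-up sample line in the SCCM log format (assumption, not real data)
$line = '<![LOG[Some message]LOG]!><time="12:34:56.789-120" date="01-15-2016" component="CcmExec" context="" type="1" thread="1234" file="source.cpp:123">'

if ($line -match $regEx) {
    # Each named group becomes a key in the automatic $Matches table
    $Matches['Entry']      # Some message
    $Matches['Time']       # 12:34:56
    $Matches['Component']  # CcmExec
    $Matches['File']       # source.cpp
    $Matches['CodeLine']   # 123
}
```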
You can also export and import the LogFileTypeClasses with the following code:
$newLogFileTypeClasses = [LogFileTypeClasses]::new()
Export-Clixml -InputObject $newLogFileTypeClasses -Path '.\LogFileParser\Classes.xml'
and loading it is also pretty simple.
I created an overloaded constructor for this, which can be used like this:
#Loading a specific file with own classes
$newLogParser = [LogFileParser]::new('.\DemoLogs\DISM\dism.log','.\LogFileParser\Classes.xml')
And – last but not least – I also added some worker functions on top, which can be found in Examples.ps1. As you can see, you can add logic to the classes to retrieve the information you need from the logs:
# Gets lines with errors
$newLogParser.ParsedLogFiles[0].GetLinesWithErrors() | Out-GridView
# Gets lines with warnings
$newLogParser.ParsedLogFiles[0].GetLinesWithWarnings() | Out-GridView
# gather only rows, which contain errors and show also all 20 lines before and after the error-lines
$newLogParser.ParsedLogFiles[0].GetLinesWithErrorsWithRange(20) | Out-GridView
# gather only rows, which contain warnings and show also all 20 lines before and after the warning-lines
$newLogParser.ParsedLogFiles[0].GetLinesWithWarningsWithRange(20) | Out-GridView
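The idea behind the "WithRange" workers can be sketched as a standalone function. Get-LinesWithRange, its parameters, and the sample data are hypothetical; this is not the implementation from the repository, just one possible way to collect matching lines plus their surrounding context.

```powershell
# Hypothetical helper: returns every line matching $Pattern
# plus $Range lines before and after each match.
function Get-LinesWithRange {
    param(
        [string[]] $Lines,
        [string]   $Pattern = 'error',
        [int]      $Range   = 20
    )
    # A sorted set keeps indices unique and in file order, even when ranges overlap
    $keep = [System.Collections.Generic.SortedSet[int]]::new()
    for ($i = 0; $i -lt $Lines.Count; $i++) {
        if ($Lines[$i] -match $Pattern) {
            $from = [Math]::Max(0, $i - $Range)
            $to   = [Math]::Min($Lines.Count - 1, $i + $Range)
            for ($j = $from; $j -le $to; $j++) { [void] $keep.Add($j) }
        }
    }
    foreach ($index in $keep) { $Lines[$index] }
}

# Made-up sample data
$sample = 'start', 'step 1', 'step 2', 'Error: disk full', 'step 3', 'step 4', 'done'
$result = Get-LinesWithRange -Lines $sample -Range 1
$result   # step 2, Error: disk full, step 3
```

Collecting indices first, rather than emitting lines directly, avoids duplicated output when two error lines sit close together.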
Summary? It is very easy to use and now also very easy to extend. I personally like it as an example for how you might work with classes and some of the new Powershell features in v5.
As for the parser itself – you can now add new LogFileTypes via LogFileTypeClasses and also integrate new worker functions into the class ParsedLogFile, if you like. Its performance is quite decent, and it can load complete folders with all the logs they contain in just one step.
I hope you like it and feel free to comment.
Best regards,
David