Parsing all the files in a directory using PowerShell



Problem

I want to parse all the files in a given directory using PowerShell. I'm looking for a particular bit of information and need to search for it in all the files. Is there an easy way to do this?  Check out this tip to learn more.

Solution

Yes, there is. PowerShell is very powerful because it:

  • Handles everything as objects.
  • Supports regular expressions.
  • Has proper looping structures.

As a result, looping through a list of files to find a particular text string (or several text strings) is easy to do. For instance, consider the output from this previous tip where we audited for the members of the sysadmin role across multiple SQL Servers. This PowerShell script produces a text file per SQL Server instance. This is the perfect scenario for parsing for a particular group, such as BUILTIN\Administrators.

First, let's define a variable containing the path to our directory, plus a second variable holding an array list (this is what New-Object System.Collections.ArrayList creates) to collect our findings.

$fileDirectory = "c:\scripts\reports";
$parse_results = New-Object System.Collections.ArrayList;

Now we'll need a foreach loop combined with a Get-ChildItem cmdlet call to get a list of all the files in the directory.

# Use a foreach to loop through all the files in a directory.
# This method allows us to easily track the file name so we can report
# our findings by file.
foreach($file in Get-ChildItem $fileDirectory)
{
# Processing code goes here
}
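As a minimal sketch of the loop on its own, the snippet below builds a throwaway directory and lists only the report files in it. The temporary directory, the sample file names, and the use of the -File and -Filter parameters are assumptions for illustration (they are not part of the original script, and -File needs PowerShell 3.0 or later). -File keeps subdirectories out of the loop, which matters because Get-Content cannot read a directory.

```powershell
# Build a throwaway directory so the sketch runs anywhere (hypothetical sample data).
$fileDirectory = Join-Path ([System.IO.Path]::GetTempPath()) ("reports_" + [guid]::NewGuid().ToString("N"))
New-Item -ItemType Directory -Path $fileDirectory | Out-Null
New-Item -ItemType Directory -Path (Join-Path $fileDirectory "archive") | Out-Null
Set-Content -Path (Join-Path $fileDirectory "server1_sysadmins.txt") -Value "sa"
Set-Content -Path (Join-Path $fileDirectory "notes.log") -Value "ignore me"

# -File skips the "archive" subdirectory; -Filter keeps only the .txt reports.
$fileNames = foreach ($file in Get-ChildItem -Path $fileDirectory -File -Filter "*.txt")
{
    $file.Name
}
$fileNames

# Clean up the sample directory.
Remove-Item -Path $fileDirectory -Recurse -Force
```

Because each $file is an object, properties such as Name and FullName are available inside the loop without any string handling.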

So far this is pretty straightforward. Now, to parse the files, we will use the switch statement. The switch statement in PowerShell is similar in function to its counterpart in other languages: you can think of it as stacking multiple IF statements together. When it is handed a collection, such as the lines returned by Get-Content, it evaluates each element in turn. The following code goes where we have the comment "# Processing code goes here" in the previous block.

# Get-Content needs the full path to the file; the file object already
# carries it in its FullName property.
$filePath = $file.FullName;
# parse the file line by line using a regular expression
Switch -regex (Get-Content -path $filePath)
{
    # send the index returned by Add() to $null so it doesn't display on screen
    'BUILTIN\\Administrators' { $parse_results.add($file.name + " > " + $switch.current `
        + "`r`n") > $null }
}

Note that each regular expression pattern in single quotes, followed by a script block in curly braces, is one evaluation: if a line matches the pattern, the block runs. In this case I'm only looking for one case, when BUILTIN\Administrators is present (the backslash is doubled because the backslash is the escape character in a regular expression). If I were also looking for Users, I could add another pattern line to the switch.
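To illustrate the idea of a second pattern, here is a small sketch that runs switch -regex over an in-memory array instead of a file. The sample lines, the CONTOSO\dba_team entry, and the "Administrators >" / "Users >" labels are hypothetical. The sketch uses $_, which inside a switch block holds the item currently being tested.

```powershell
# Hypothetical sample lines standing in for the contents of one report file.
$lines = @(
    "sa",
    "BUILTIN\Administrators",
    "BUILTIN\Users",
    "CONTOSO\dba_team"
)
$parse_results = New-Object System.Collections.ArrayList

# Each line is tested against every pattern; a matching pattern runs its block.
switch -regex ($lines)
{
    'BUILTIN\\Administrators' { $parse_results.Add("Administrators > " + $_) > $null }
    'BUILTIN\\Users'          { $parse_results.Add("Users > " + $_) > $null }
}
$parse_results
```

Lines that match neither pattern are simply skipped, since no default block is defined.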

As for what is actually happening: whenever the Switch statement finds a line that matches the pattern, that is, a line containing BUILTIN\Administrators, it adds another entry to the array list. The entry is a concatenated string of the name of the file where the text was found along with the entire matching line (that's what $switch.current refers to). Once all the files are processed, all that's left to do is write out what was captured. That's the reason for the "`r`n" appended to each string: it puts a carriage return/new line at the end of each entry so that the output displays one finding per line.
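The redirection to $null in the script exists because the Add method of an ArrayList returns the index of the element it just added. The short sketch below, using made-up file names, shows that return value and how the redirection discards it.

```powershell
$parse_results = New-Object System.Collections.ArrayList

# Add returns the position of the new element; here we capture it to inspect it.
$firstIndex  = $parse_results.Add("a.txt > BUILTIN\Administrators")
$secondIndex = $parse_results.Add("b.txt > BUILTIN\Administrators")

# Redirecting to $null throws the returned index away, so nothing is displayed.
$parse_results.Add("c.txt > BUILTIN\Administrators") > $null

# The list itself still holds all three entries.
$parse_results.Count
```

Without the capture or the redirection, each call would write a number (0, 1, 2, and so on) to the screen in the middle of the script's output.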

Our finished script looks like this:

$fileDirectory = "c:\scripts\reports";
$parse_results = New-Object System.Collections.ArrayList;

# Use a foreach to loop through all the files in a directory.
# This method allows us to easily track the file name so we can report
# our findings by file.
foreach($file in Get-ChildItem $fileDirectory)
{
    # Get-Content needs the full path to the file; the file object already
    # carries it in its FullName property.
    $filePath = $file.FullName;
    # parse the file line by line using a regular expression
    Switch -regex (Get-Content -path $filePath)
    {
        # send the index returned by Add() to $null so it doesn't display on screen
        'BUILTIN\\Administrators' { $parse_results.add($file.name + " > " + $switch.current `
            + "`r`n") > $null }
    }
}

write-host $parse_results;

And the output will look like this for a couple of files that I have in the directory:

localhost,5555_sysadmins.txt >  BUILTIN\Administrators
localhost_sysadmins.txt >  BUILTIN\Administrators
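For comparison, a report in the same "file > line" format can be sketched with the Select-String cmdlet in a pipeline. This is an alternative technique, not the script above. The temporary directory and sample files are hypothetical, and -File assumes PowerShell 3.0 or later.

```powershell
# Build a throwaway directory with two sample report files (hypothetical data).
$fileDirectory = Join-Path ([System.IO.Path]::GetTempPath()) ("reports_" + [guid]::NewGuid().ToString("N"))
New-Item -ItemType Directory -Path $fileDirectory | Out-Null
Set-Content -Path (Join-Path $fileDirectory "localhost_sysadmins.txt") -Value @("sa", "BUILTIN\Administrators")
Set-Content -Path (Join-Path $fileDirectory "other_sysadmins.txt") -Value @("sa")

# Select-String emits one match object per matching line; its Filename and Line
# properties carry the file name and the full text of the line.
$parse_results = Get-ChildItem -Path $fileDirectory -File |
    Select-String -Pattern 'BUILTIN\\Administrators' |
    ForEach-Object { $_.Filename + " > " + $_.Line }
$parse_results

# Clean up the sample directory.
Remove-Item -Path $fileDirectory -Recurse -Force
```

In this form the file objects are passed down the pipeline to Select-String rather than being looped over with foreach, and the match objects can be kept as objects if the line number or full path is also wanted.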

 


About the author
K. Brian Kelley is a SQL Server author and columnist focusing primarily on SQL Server security.

This author pledges the content of this article is based on professional experience and not AI generated.




Comments For This Article




Thursday, January 23, 2014 - 11:23:51 AM - [email protected]

How could we split a 30 gig Unix text file into N text files split on a line width of 1700 characters?

Thanks


Wednesday, February 6, 2013 - 10:24:18 AM - snorkel

You should probably point out that this method is going to consume a large amount of system memory if run on a large directory. If you find yourself writing a script to parse files in a directory, chances are you are doing it because the directory is huge.

foreach($file in Get-ChildItem $fileDirectory) is going to grab the file object for every file in that directory and store it in memory prior to entering the foreach loop. And since it returns an entire object containing a large amount of data per file, this can build up to several GB of data in a hurry.
Unfortunately, to my knowledge, there is no way native to PowerShell around this. What I've done in the past is execute a dir of the directory that I am interested in and pipe the results to a txt file. I then loop through that text file in PowerShell so that I can do a Get-ChildItem on a single file at a time.

Wednesday, August 22, 2012 - 6:44:06 PM - Laerte Junior

Too much code. Try:

Select-String -Path 'c:\scripts\reports\*.*' -Pattern 'BUILTIN\\Administrators' -AllMatches














