Parsing all the files in a directory using PowerShell

By: K. Brian Kelley   |   Related Tips: More > PowerShell
Problem

I want to parse all the files in a given directory using PowerShell. I'm looking for a particular bit of information and need to search for it in all the files. Is there an easy way to do this?  Check out this tip to learn more.

Solution

Yes, there is. PowerShell is very powerful because it:

  • Handles everything as objects.
  • Supports regular expressions.
  • Has proper looping structures.

As a result, looping through a list of files to find a particular text string (or several text strings) is easy to do. For instance, consider the output from this previous tip where we audited for the members of the sysadmin role across multiple SQL Servers. This PowerShell script produces a text file per SQL Server instance. This is the perfect scenario for parsing for a particular group, such as BUILTIN\Administrators.
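For illustration only (these contents are hypothetical, not actual output from the earlier tip), one of those per-instance report files, say localhost_sysadmins.txt, might list one sysadmin login per line:

sa
BUILTIN\Administrators
MYDOMAIN\DBAdmins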

First, let's start by defining a variable containing a path to our directory as well as another variable, an object that is an array list (this is what New-Object System.Collections.ArrayList does), to hold our findings.

$fileDirectory = "c:\scripts\reports";
$parse_results = New-Object System.Collections.ArrayList;

Now we'll need a foreach loop combined with a Get-ChildItem cmdlet call to get a list of all the files in the directory.

# Use a foreach to loop through all the files in a directory.
# This method allows us to easily track the file name so we can report
# our findings by file.
foreach($file in Get-ChildItem $fileDirectory)
{
# Processing code goes here
}
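As an aside, if the directory holds anything besides the reports, the loop can be limited to just the report files with Get-ChildItem's -Filter parameter (the *.txt extension here is an assumption about how the reports are named):

foreach($file in Get-ChildItem $fileDirectory -Filter *.txt)
{
# Processing code goes here
}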

This part so far is pretty straightforward. Now, to parse the files, we will want to use the Switch command. The Switch command in PowerShell is similar in function to the same command in other languages; you can think of it as capable of stacking multiple IF statements together. This is what goes where we have the comment "# Processing code goes here" in the previous block.

# We will need to tell the Switch command exactly where to parse, so we'll put together
# the full file path.
$filePath = $fileDirectory + "\" + $file;
# parse all files using a regular expression
Switch -regex (Get-Content -path $filePath)
{
# ArrayList.Add returns the index of the new entry; redirect it to $null so it doesn't display on screen
'BUILTIN\\Administrators' { $parse_results.add($file.name + " > " + $switch.current `
+ "`r`n") > $null }
}

Note that each pattern, a regular expression in single quotes followed by curly braces, gets its own evaluation. In this case I'm only looking for one pattern, when BUILTIN\Administrators is present (the backslash is doubled since we're using a regular expression and the backslash is an escape character). If I were looking for Users, I could add another line to parse for that as well.
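As a sketch of that, a second pattern for BUILTIN\Users (a hypothetical target, not something the audit output is known to contain) is just another pattern/script-block pair inside the same Switch:

Switch -regex (Get-Content -path $filePath)
{
'BUILTIN\\Administrators' { $parse_results.add($file.name + " > " + $switch.current + "`r`n") > $null }
'BUILTIN\\Users' { $parse_results.add($file.name + " > " + $switch.current + "`r`n") > $null }
}

Note that Switch evaluates every pattern against every line unless a script block issues a break, so a line matching both patterns would be recorded twice.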

As to exactly what is being done: when the Switch statement detects a line that matches the condition I've specified (the line contains BUILTIN\Administrators), it adds another entry to my array list. The entry is a concatenated string of the file name where the text was detected, along with the entire matching line (that's what $switch.current refers to). Once all this is done, all that's left is to write out what was captured. That's the reason for the "`r`n" added to the end of each string: it puts a carriage return/line feed at the end of the string so that it'll output properly.

Our finished script looks like this:

 

$fileDirectory = "c:\scripts\reports";
$parse_results = New-Object System.Collections.ArrayList;

# Use a foreach to loop through all the files in a directory.
# This method allows us to easily track the file name so we can report
# our findings by file.
foreach($file in Get-ChildItem $fileDirectory)
{
# We will need to tell the Switch command exactly where to parse, so we'll put together
# the full file path.
$filePath = $fileDirectory + "\" + $file;
# parse all files using a regular expression
Switch -regex (Get-Content -path $filePath)
{
# ArrayList.Add returns the index of the new entry; redirect it to $null so it doesn't display on screen
'BUILTIN\\Administrators' { $parse_results.add($file.name + " > " + $switch.current `
+ "`r`n") > $null }
}
}

write-host $parse_results;

And the output will look like this for a couple of files that I have in the directory:

localhost,5555_sysadmins.txt >  BUILTIN\Administrators
localhost_sysadmins.txt >  BUILTIN\Administrators
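If you'd rather capture the findings in a file than on screen, the array list can be redirected with Out-File (the output path here is illustrative):

$parse_results | Out-File "c:\scripts\reports\parse_results.txt";

One wrinkle: Out-File already writes each pipeline object on its own line, so with the "`r`n" still appended to every entry the file will contain a blank line between results; dropping the "`r`n" from the Add call avoids that.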

 

Last Update: 8/22/2012


About the author
K. Brian Kelley is a SQL Server author and columnist focusing primarily on SQL Server security.

Comments and Feedback:
Wednesday, August 22, 2012 - 6:44:06 PM - Laerte Junior

Too Much Code. Try :

Select-String -Path 'c:\scripts\reports\*.*' -Pattern 'BUILTIN\\Administrators' -AllMatches
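Select-String returns MatchInfo objects, so if the tip's file > line format is wanted, a small pipeline over the standard Filename and Line properties reproduces it (the *.txt path is an assumption about how the reports are named):

Select-String -Path 'c:\scripts\reports\*.txt' -Pattern 'BUILTIN\\Administrators' |
    ForEach-Object { $_.Filename + " > " + $_.Line }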


Wednesday, February 06, 2013 - 10:24:18 AM - snorkel

You should probably point out that this method is going to consume a large amount of system memory if run on a large directory, and if you find yourself writing a script to parse files in a directory, chances are you're doing it because the directory is huge.

 

foreach($file in Get-ChildItem $fileDirectory) is going to grab the file object for every file in that directory and store it in memory prior to entering the loop. And since it returns an entire object containing a large amount of data per file, this can build up to several GB of data in a hurry.
Unfortunately, to my knowledge, there is no native-to-PowerShell way around this. What I've done in the past is execute a dir of the directory that I am interested in and pipe the results to a text file. I then loop through that text file in PowerShell so that I can do a Get-ChildItem on a single file at a time.
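For what it's worth, a streaming variant is possible: piping Get-ChildItem into ForEach-Object hands one file object down the pipeline at a time instead of building the full list before the loop body runs. A sketch of the tip's loop rewritten that way (note that $_ inside the Switch script block refers to the current line, so the file object is captured first):

Get-ChildItem $fileDirectory | ForEach-Object {
$file = $_;
Switch -regex (Get-Content -path $file.FullName)
{
'BUILTIN\\Administrators' { $parse_results.add($file.Name + " > " + $switch.current + "`r`n") > $null }
}
}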

Thursday, January 23, 2014 - 11:23:51 AM - drew.georgopulos@yahoo.com

How could we split a 30 gig Unix text file into N text files split on a line width of 1700 characters?

Thanks



Copyright (c) 2006-2014 Edgewood Solutions, LLC All rights reserved