Free SQL Server Learning - Backup compression and storage deduplication: A perfect match?
solving sql server problems for millions of dbas and developers since 2006



SQL Server DBA Tips SQL Server Developer Tips SQL Server Business Intelligence Tips SQL Server Career Tips SQL Server Tip Categories SQL Server Tutorials SQL Server Webcasts SQL Server Whitepapers SQL Server Tools SQL Server Questions and Answers MSSQLTips Authors About MSSQLTips SQL Server User Groups SQL Server Events I am MSSQLTips MSSQLTips Advertising Options

MSSQLTips Facebook Page MSSQLTips LinkedIn Page MSSQLTips RSS Feed MSSQLTips Pinterest Page MSSQLTips Twitter Page MSSQLTips Google+ Page









SQL Product Highlight

Idera - SQL safe backup

SQL Server backup compression with network fault tolerance and zero impact encryption

  • Fast, compressed & encrypted SQL backup and restore
  • SQL Server backup compression of 95%
  • At least 50% faster than native SQL backups
  • Learn more!






































Parsing all the files in a directory using PowerShell

By:   |   Read Comments (2)   |   Related Tips: More > PowerShell

Problem

I want to parse all the files in a given directory using PowerShell. I'm looking for a particular bit of information and need to search for it in all the files. Is there an easy way to do this?  Check out this tip to learn more.

Solution

Yes, there is. PowerShell is very powerful because it:

  • Handles everything as objects.
  • Supports regular expressions.
  • Has proper looping structures.

As a result, looping through a list of files to find a particular text string (or several text strings) is easy to do. For instance, consider the output from this previous tip where we audited for the members of the sysadmin role across multiple SQL Servers. This PowerShell script produces a text file per SQL Server instance. This is the perfect scenario for parsing for a particular group, such as BUILTIN\Administrators.

First, let's start by defining a variable containing a path to our directory as well as another variable, an object that is an array list (this is what New-Object System.Collections.ArrayList does), to hold our findings.

$fileDirectory = "c:\scripts\reports";
$parse_results = New-Object System.Collections.ArrayList;

Now we'll need a foreach loop combined with a Get-ChildItem cmdlet call to get a list of all the files in the directory.

# Use a foreach to loop through all the files in a directory.
# This method allows us to easily track the file name so we can report
# our findings by file.
foreach($file in Get-ChildItem $fileDirectory)
{
# Processing code goes here
}

This part so far is pretty straight forward. Now, to be able to parse the files, we will want to use the switch command. The switch command in PowerShell is similar in function to the same command in other languages. You can think of it as being capable of stacking multiple IF statements together.  This is what goes where we have the comment "# Processing code goes here" in the previous block.

# We will need to tell the Switch command exactly where to parse, so we'll put together
# the full file path.
$filePath = $fileDirectory + "\" + $file;
# parse all files using a regular expression
Switch -regex (Get-Content -path $filePath)
{
# send the counter to $null so it doesn't display on screen
'BUILTIN\\Administrators' { $parse_results.add($file.name + " > " + $switch.current `
+ "`r`n") > $null }
}

Note that for each pattern of a regular expression in single quotes followed by the curly braces, we have an evaluation. In this case I'm only looking for one case, when BUILTIN\Administrators is present (the backslash is doubled since we're using a regular expression and the backslash is an escape character). If I was looking for Users, I could add another line to parse this as well.

As to exactly what is being done, when the Switch statement detects a line that matches the condition I've specified, the line contains BUILTIN\Administrators, it's adding another entry to my array list. The entry will be a concatenated string of the file name where the text was detected,  along with the entire line (that's what $switch.current refers to).  Once all this is done, all that's left to do is write out what was captured. That's actually the reason for the "`r`n" added to the end of each string. That puts a carriage return/new line at the end of the string so that it'll output properly.

Our finished script looks like this:

 

$fileDirectory = "c:\scripts\reports";
$parse_results = New-Object System.Collections.ArrayList;

# Use a foreach to loop through all the files in a directory.
# This method allows us to easily track the file name so we can report
# our findings by file.
foreach($file in Get-ChildItem $fileDirectory)
{
# We will need to tell the Switch command exactly where to parse, so we'll put together
# the full file path.
$filePath = $fileDirectory + "\" + $file;
# parse all files using a regular expression
Switch -regex (Get-Content -path $filePath)
{
# send the counter to $null so it doesn't display on screen
'BUILTIN\\Administrators' { $parse_results.add($file.name + " > " + $switch.current `
+ "`r`n") > $null }
}
}

write-host $parse_results;

And the output will look like this for a couple of files that I have in the directory:

localhost,5555_sysadmins.txt >  BUILTIN\Administrators
localhost_sysadmins.txt >  BUILTIN\Administrators

 

Next Steps



Last Update: 8/22/2012

About the author

Brian is a SQL Server author and columnist focusing primarily on SQL Server security.

View all my tips


Print  
Become a paid author


Comments and Feedback:

Wednesday, August 22, 2012 - 6:44:06 PM - Laerte Junior Read The Tip

Too Much Code. Try :

Select-String -Path 'c:\scripts\reports\*.*' -Pattern 'BUILTIN\\Administrators' -AllMatches


Wednesday, February 06, 2013 - 10:24:18 AM - snorkel Read The Tip

You should probably point out that this method is going to consume a large amount of system memory if ran on a large directory, which if you find yourself writing a script to parse files in a directory, chances are you doing it because the directory is huge.

 

foreach($file in Get-ChildItem $fileDirectory) is going to grab the file object for every file in that directory and store it in memory prior to entering the for loop.  And since it returns an entire object containing a large amount of data per file this can build up to several GB of data in a hurry.  
Unfortunately, to my knowledge, there is no native to Powershell way around this.  What I've done in the past is execute a dir of the directory that I am interested in and pipe the results to a txt file . I then loop through that text file in Powershell so that I can do a get-childitem on a single file at a time.


Post a Comment or Question

Keep it clean and stay on the subject or we may delete your comment.
Your email address is not published. Required fields are marked with an asterisk (*)

*Name   *Email Notify for updates

Signup for our newsletter


Comments
*Enter Code refresh code


 
Sponsor Information
"Amazing, Amazing, Amazing! SQL doctor is truly one of the most powerful tools I have seen."

NEW! Top 5 hard-earned lessons of a DBA from Grant Fritchey & the DBA Team. Read it now

Is “blocking” a bad word at your company? Contact the Edgewood SQL Server Consultants for the resolution.

Secure column & whole database on all versions and editions of SQL Server with NetLib’s TDE

Free Learning - Using SQL Server DMVs to Help Improve Performance


Copyright (c) 2006-2013 Edgewood Solutions, LLC All rights reserved
privacy | disclaimer | copyright | advertise | about
authors | contribute | feedback | giveaways | user groups
Some names and products listed are the registered trademarks of their respective owners.


Edgewood Solutions LLC | MSSharePointTips.com | MSSQLTips.com