Learn more about SQL Server tools

mssqltips logo
giveaway
 

Tutorials          DBA          Dev          BI          Career          Categories          Webcasts          Whitepapers          Today's Tip          Join

Tutorials      DBA      Dev      BI      Categories      Webcasts

DBA    Dev    BI    Categories

 

Get Total Line Counts for all SQL Server Source Objects Using PowerShell


By:   |   Read Comments   |   Related Tips: More > PowerShell

Problem

I want to count the number of lines of code for all of our non-CLR procedures, views and UDFs in a SQL Server database, how can I do this?  I also want to exclude comments and empty lines from the counts.

Solution

This line-counting task is really interesting. I remember years ago when my team wanted to present how much work we had done from product version N to version N+1, one KPI is the total number of code lines in all stored procedures (non-CLR SPs). At that time, we simply counted the lines of the code using the following T-SQL query (this will work for SQL Server 2005 and later versions).

use [DB_name] -- change to your own db name
declare @crlf nchar(2)
set @crlf = char(0x0d) + char(0x0a);

select [Schema]=schema_name(p.schema_id), [Proc_Name]=p.name
, Num_of_LineCode=(len(m.definition) -len(replace(m.definition, @crlf, ''))) /2
from sys.sql_modules m
inner join sys.procedures p
on m.object_id = p.object_id;
GO

However, this T-SQL solution has a few shortcomings that compromise the quality, such as:

  • Empty lines are counted
  • Comments are counted. Yes, code comments are important, but there are many cases where developers comment out big chunks of old code, which should be ignored when counting the effective lines of code
  • No way to know the average line length in words, which is an important indicator for code volume in addition to lines of code

Solution To Get Count of Code Lines Per Object

The solution is actually very natural if we know the following facts:

  • PowerShell has a cmdlet called Measure-Object that can count lines and words of a string.
  • In SQL SMO, there is a class corresponding to stored procedure, i.e. [Microsoft.SQLServer.Management.SMO.StoredProcedure], and it has two properties of interest to our topic, they are TextHeader and TextBody (for Views or UDFs, it is similar.)
  • .Net has a Regex class that implements Regular Expression functions, which make it the perfect tool when dealing with text/string manipulation, such as finding comments or empty lines, etc.
  • We can use PowerShell to manipulate a stored procedure's TextHeader and TextBody properties via Regex and finally use Measure-Object to get the result.

So here are the details.

Import the SQLPS module to load SQLServer PSProvider, and list all stored procedures (SPs) in a database via "DIR", loop through each stored procedure to retrieve its TextHeader and TextBody properties, then remove all comments and empty lines via a RegEx function, finally the cleansed stored procedure's text header and body will be processed by the Measure-object, and all results are put into a PSObject array (for further processing if needed).

Here is the PowerShell code running against AdventureWorks2012. Note we need to deal with two types of comments, one is block comment, i.e. /* ... */ and the other is a line comment, i.e. -- ..

#count the lines of code for views/stored proc/UDFs (non-CLR) in a database
import-module sqlps -DisableNameChecking; 

function MatchEval-Function # as a delegate function 
{ param ([string]$x)
    
  if ($x.StartsWith('/*') -or $x.StartsWith('--'))
  { return "";}
  else {return $x;}
  
}

[string]$database = 'AdventureWorks2012'; # change to your own db name
[string]$type = 'sp' # 'SP' | 'vw' | 'udf';

[string]$object='' 

switch ($type)
{
   'vw' {$object='Views'; break;} 
   'sp' {$object='StoredProcedures'; break;} 
   'udf' {$object='UserDefinedFunctions'; break;} 
}

$objs=@();

[string]$reg_exp_1 = @'
["'][^'"]*?['"]|(?m)--.*$|(?s)/\*.*?\*/[\r\n]?
'@; 

$RegEx= new-object "System.Text.RegularExpressions.RegEx" ($reg_exp_1);

[string]$reg_exp_2 = '(?m)^[;\s]*$[\r\n]'; #remove empty lines
$RegEx2= new-object "System.Text.RegularExpressions.RegEx" ($reg_exp_2);

#note TP_W520 is my local computer name, and you need to change it to your own,
#also if your sql instance is not the default one, you need to change "default" to your sql instance name
dir sqlserver:\sql\TP_W520\default\databases\$database\$object | 
% { $s = $_.textheader + $_.textbody;
    $ss = $RegEx.replace($s, {MatchEval-Function $args[0].value}); #remove all comments
    $ss = $RegEx2.replace($ss, ''); #remove all empty lines;
    $p= measure-object -inputObject $ss -Line -Word -Character -IgnoreWhiteSpace; 
    $objs +=new-object -type PSObject -Property @{Type = $type; Name=($_.schema +'.'+ $_.name); Line=$($p.Lines); Word=$($p.Words)} }
    
$objs | select type, name, line, word, @{label='Words_Per_Line'; e={"{0:N2}" -f ($_.word/$_.line)}} | Out-GridView

When run, we will get the following results:

Lines of Code

Regular Expression Explained

Here is a brief explanation of the Regular Expressions used in the solution:
  • ["'][^'"]*?['"] : Find strings starting with either single or double quotes["'], match any characters that are not single/double quotes as few times as possible, [^'"]*? until next single/double quote ['"]
  • (?m)--.*$ : In multi-line mode (?m), find strings starting with "--" until the end of the line ".*$"
  • (?s)/\*.*?\*/[\r\n]* : in single line mode (?s), find string blocks that start with /* and ends with */, i.e. /\*.*?\*/, followed by carriage line return zero or one time, i.e. [\r\n]?
  • (?m)^[;\s]*$[\r\n] : In multi-line mode (?m), find lines that start with "^" with ';' or blank space [\s] as many as possible, until the end of the line "$" which is then followed by carriage new line "[\r\n]"

The code is using .Net Regular Expression class, RegEx and its method Replace.  It first processes any comments by replacing them with an empty string and then reprocesses the result by removing all empty lines.

Note: This solution has a small issue, i.e. when there is a nested block comment, such as the following:

/* There will be issues where
   there is /* nested block comments */
   inside a comment
*/

Because of this issue, the counting may not be totally accurate. This is fixable, but needs a very complex regular expression and the performance is not ideal either when you have to scan big chunks of code.  In my environment, luckily the nested block comments did not exist, so I opted to ignore this scenario. But if you are really interested in a solution for nested block comments, you may take a look at this article.

Next Steps

The script is tested on SQL Server 2012 (where SQLPS module is installed) in a PowerShell V3 (or above) environment. You can run the above PowerShell script by opening a PowerShell ISE window and then copy and paste the above code into the PowerShell window. Modify the SQL Server instance / database name in the script and then just run.

Regular Expression provides a powerful mechanism in doing text analysis, and it can be very useful for DBAs when handling text-based tasks, such as SQL Server Error Log analysis or string searches in T-SQL code or tables. You can learn more about regular expressions through the following links.



Last Update:






About the author
MSSQLTips author Jeffrey Yao Jeffrey Yao is a senior SQL Server consultant, striving to automate DBA work as much as possible to have more time for family, life and more automation.

View all my tips
Related Resources





More SQL Server Solutions











Post a comment or let the author know this tip helped.

All comments are reviewed, so stay on subject or we may delete your comment. Note: your email address is not published. Required fields are marked with an asterisk (*).

*Name    *Email    Notify for updates 


Get free SQL tips:

*Enter Code refresh code     



Learn more about SQL Server tools