By: Aaron Bertrand | Updated: 2015-09-30 | Comments (2) | Related: 1 | 2 | 3 | 4 | 5 | 6 | 7 | More > Scripts
Problem
In a previous tip, Greg Robidoux showed us a method for searching and finding a string value in all columns in a table, and pointed out a script by Vyas Kondreddi that will search all columns in all tables of a single database. But what if I want to search across all string columns, across all tables, across all databases?
Solution
I've come across this problem multiple times in my career, so I thought I would put together my approach, which differs from others I have seen. In particular, I have a tendency to avoid explicit loops and cursors where possible, and I also like to avoid concatenating any user input (like the search string) into dynamic SQL - I would rather strongly parameterize those values. As I've explained before, any arbitrary string you accept from a user is a potential threat.
My solution is, admittedly, a bit complex; it involves three layers of nested dynamic SQL. You could probably argue that this offsets the benefits achieved by avoiding the scaffolding of loops or cursors, and that it may make the code harder to read for some people. I can't argue with those points, but still, this is the solution I prefer.
The approach is three-part (basically in reverse of how the code actually has to be structured):
- Build a dynamic SQL command to find every eligible column in every table.
- Build a dynamic SQL command for each column to be searched.
- Build a dynamic SQL command to run that in each database.
And before I get into any code, I'll iterate a few of the intentions and limitations of my solution:
- This is not meant to be fast. Searching for patterns within strings is not what relational databases are designed for, so you'll need some patience when running this against larger databases. (And as such, this is one case where I feel the problems with
NOLOCK
can be overlooked. Of course if you need absolute certainty and are willing to run this concurrently with other users, you may want to comment the isolation level out. Better yet, run this against restored copies of your databases - you're testing your restores, right?)
- I intend to support searches of text, ntext, char, nchar, varchar, and nvarchar. With some effort, it could be expanded to support other data types, like sql_variant and xml. Currently, I have constructed the code to avoid implicit conversions from Unicode to non-Unicode and vice-versa, by using two variables for the search term and constructing the dynamic SQL accordingly. However I do not go to the extent of covering different collations or dealing with case sensitive searches.
- Rather than return all matching rows from any given table, the point is to identify each table that has at least one row, and return the first 1,000 characters from an arbitrary, example value from that row. (The primary purpose of such a script is usually to identify which table(s) are victims of SQL injection, not to return every affected row in a grid.) You can return all rows by removing the
TOP 1
, and the full value of any tuple by changing the #temp table to useNVARCHAR(MAX)
and removing theLEFT(,1000)
bit.
The full stored procedure is listed below, but I should highlight the key portions and the purpose of each.
Stored procedure signature
The stored procedure takes a single @search
argument, using nvarchar
for greatest compatibility. I use SET NOCOUNT ON
to prevent DONE_IN_PROC
messages from being transmitted, and immediately return if the search argument is NULL
or only contains wildcard characters.
USE UtilityDatabase; GO CREATE PROCEDURE dbo.SearchAllDatabases @SearchTerm NVARCHAR(255) = NULL AS BEGIN SET NOCOUNT ON; SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; IF @SearchTerm IS NULL OR @SearchTerm NOT LIKE N'%[^%^_]%' BEGIN RAISERROR(N'Please enter a valid search term.', 11, 1); RETURN; END
Temporary table
I create a simple #temp table, with self-explanatory columns to hold the results of the search. This is created at the outermost scope, which will be visible to dynamic SQL statements, no matter how many levels deep they go.
CREATE TABLE #results ( [database] SYSNAME, [schema] SYSNAME, [table] SYSNAME, [column] SYSNAME, ExampleValue NVARCHAR(1000) );
@DatabaseCommands
First, we need to build a set of commands so that we can pass the search argument to each database, regardless of how many eligible tables are found there.
SELECT @DatabaseCommands = @DatabaseCommands + N' /* layer 1 */ EXEC ' + QUOTENAME(name) + '.sys.sp_executesql @ColumnCommands, N''@SearchTerm NVARCHAR(MAX)'', @SearchTerm;' FROM sys.databases WHERE database_id > 4 -- non-system databases AND [state] = 0 -- online AND user_access = 0; -- multi-user
You may want to add other checks here, like HAS_DBACCESS(name) = 1
. You may also want to include system databases (in case you were really exposed) or not exclude single-user databases (since that user could be you, on purpose, while you fix this).
This will lead to a variable that looks like this:
EXEC [AdventureWorks].sys.sp_executesql @ColumnCommands, N'@search NVARCHAR(MAX)', @search; EXEC [ContosoYogurts].sys.sp_executesql @ColumnCommands, N'@search NVARCHAR(MAX)', @search; EXEC [Microsoft Bobs].sys.sp_executesql @ColumnCommands, N'@search NVARCHAR(MAX)', @search; ...
@ColumnCommands & @SearchCommands
This is the most complex part of the code, where we generate a search command for each table + eligible column combination. This requires nesting of dynamic SQL, which also means ugly things with single quotes (but I do try to eliminate some of that with the variable @q
). And I try to use indenting to show the two layers being used here:
SET @ColumnCommands = N' /* layer 2 */ DECLARE @q NCHAR(1), -- eliminates nested single-quote nausea @SearchCommands NVARCHAR(MAX); SELECT @q = NCHAR(39), @SearchCommands = N''DECLARE @VSearchTerm VARCHAR(255) = @SearchTerm;''; SELECT @SearchCommands = @SearchCommands + CHAR(10) + N'' /* layer 3 */ SELECT TOP (1) [db] = DB_NAME(), [schema] = N'' + @q + s.name + @q + '', [table] = N'' + @q + t.name + @q + '', [column] = N'' + @q + c.name + @q + '', ExampleValue = LEFT('' + QUOTENAME(c.name) + '', 1000) FROM '' + QUOTENAME(s.name) + ''.'' + QUOTENAME(t.name) + '' WHERE '' + QUOTENAME(c.name) + N'' LIKE @'' + CASE WHEN c.system_type_id IN (35, 167, 175) THEN ''V'' ELSE '''' END + ''SearchTerm;'' /* end layer 3 */ FROM sys.schemas AS s INNER JOIN sys.tables AS t ON s.[schema_id] = t.[schema_id] INNER JOIN sys.columns AS c ON t.[object_id] = c.[object_id] WHERE c.system_type_id IN (35, 99, 167, 175, 231, 239) AND c.max_length >= LEN(@SearchTerm); /* end layer 2 */ PRINT @SearchCommands; EXEC sys.sp_executesql @SearchCommands, N''@SearchTerm NVARCHAR(255)'', @SearchTerm; /* end layer 1 */';
Inside layer 3 you can see the CASE
expression magic that changes @SearchTerm
to @VSearchTerm
in the event the column is non-Unicode. I also exclude any columns that can't possibly contain the search term because of their length; ColName CHAR(3)
can't contain '%some_search%'
.
There is also a PRINT
command in this portion so you can troubleshoot at least a portion of the code (note that PRINT
output is inherently limited by SSMS, as I described in a tip last year, Validate the contents of large dynamic SQL strings). A selective sample from the PRINT
output of @SearchCommands
against AdventureWorks:
DECLARE @VSearchTerm VARCHAR(255) = @SearchTerm; SELECT TOP (1) [db] = DB_NAME(), [schema] = N'Production', [table] = N'ScrapReason', [column] = N'Name', ExampleValue = LEFT([Name], 1000) FROM [Production].[ScrapReason] WHERE [Name] LIKE @SearchTerm; SELECT TOP (1) [db] = DB_NAME(), [schema] = N'Sales', [table] = N'Customer', [column] = N'AccountNumber', ExampleValue = LEFT([AccountNumber], 1000) FROM [Sales].[Customer] WHERE [AccountNumber] LIKE @VSearchTerm;
Note that the second query uses the VARCHAR
version of the variable, since AccountNumber
is not Unicode. Also note that the procedure has no way to know when you are using a string column to store things that are unlikely to contain non-numeric data; it blindly includes all columns that match the six data types identified above. If you have common naming conventions for columns that can't possibly match your searches, you could consider filtering them out in layer 2.
Final execution
With all the messy parts out of the way, now we can populate the #temp table, and return a single resultset (we use the #temp table to avoid a separate resultset for every table/column combination or for every database).
-- insert/exec to run all searches -- against al columns in all databases INSERT #Results ( [database], [schema], [table], [column], ExampleValue ) EXEC [master].sys.sp_executesql @DatabaseCommands, N'@ColumnCommands NVARCHAR(MAX), @SearchTerm NVARCHAR(255)', @ColumnCommands, @SearchTerm; -- in case you execute multiple searches, this will help -- identify each individual search in your results: SELECT [Searched for] = @SearchTerm; -- final select SELECT [database],[schema],[table],[column],ExampleValue FROM #Results ORDER BY [database],[schema],[table],[column];
Putting it all together
Here is the entire procedure body or open in a new tab.
CREATE PROCEDURE dbo.SearchAllDatabases @SearchTerm NVARCHAR(255) = NULL AS BEGIN SET NOCOUNT ON; SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; IF @SearchTerm IS NULL OR @SearchTerm NOT LIKE N'%[^%^_]%' BEGIN RAISERROR(N'Please enter a valid search term.', 11, 1); RETURN; END CREATE TABLE #results ( [database] SYSNAME, [schema] SYSNAME, [table] SYSNAME, [column] SYSNAME, ExampleValue NVARCHAR(1000) ); DECLARE @DatabaseCommands NVARCHAR(MAX) = N'', @ColumnCommands NVARCHAR(MAX) = N''; SELECT @DatabaseCommands = @DatabaseCommands + N' EXEC ' + QUOTENAME(name) + '.sys.sp_executesql @ColumnCommands, N''@SearchTerm NVARCHAR(MAX)'', @SearchTerm;' FROM sys.databases WHERE database_id > 4 -- non-system databases AND [state] = 0 -- online AND user_access = 0; -- multi-user SET @ColumnCommands = N'DECLARE @q NCHAR(1), @SearchCommands NVARCHAR(MAX); SELECT @q = NCHAR(39), @SearchCommands = N''DECLARE @VSearchTerm VARCHAR(255) = @SearchTerm;''; SELECT @SearchCommands = @SearchCommands + CHAR(10) + N'' SELECT TOP (1) [db] = DB_NAME(), [schema] = N'' + @q + s.name + @q + '', [table] = N'' + @q + t.name + @q + '', [column] = N'' + @q + c.name + @q + '', ExampleValue = LEFT('' + QUOTENAME(c.name) + '', 1000) FROM '' + QUOTENAME(s.name) + ''.'' + QUOTENAME(t.name) + '' WHERE '' + QUOTENAME(c.name) + N'' LIKE @'' + CASE WHEN c.system_type_id IN (35, 167, 175) THEN ''V'' ELSE '''' END + ''SearchTerm;'' FROM sys.schemas AS s INNER JOIN sys.tables AS t ON s.[schema_id] = t.[schema_id] INNER JOIN sys.columns AS c ON t.[object_id] = c.[object_id] WHERE c.system_type_id IN (35, 99, 167, 175, 231, 239) AND c.max_length >= LEN(@SearchTerm); PRINT @SearchCommands; EXEC sys.sp_executesql @SearchCommands, N''@SearchTerm NVARCHAR(255)'', @SearchTerm;'; INSERT #Results ( [database], [schema], [table], [column], ExampleValue ) EXEC [master].sys.sp_executesql @DatabaseCommands, N'@ColumnCommands NVARCHAR(MAX), @SearchTerm NVARCHAR(255)', @ColumnCommands, @SearchTerm; SELECT [Searched for] = @SearchTerm; SELECT [database],[schema],[table],[column],ExampleValue FROM #Results ORDER BY [database],[schema],[table],[column]; END GO
Sample Calls to Find Text
-- generate an error because this is not a sensible search: EXEC dbo.SearchAllDatabases @SearchTerm = N'%'; GO -- search for columns containing "http://" EXEC dbo.SearchAllDatabases @SearchTerm = N'%http://%'; -- search for columns containing "%" or "_": EXEC dbo.SearchAllDatabases @SearchTerm = N'%[%]%'; EXEC dbo.SearchAllDatabases @SearchTerm = N'%[_]%'; -- search for columns containing "update" followed by "dbo.": EXEC dbo.SearchAllDatabases @SearchTerm = N'%update%dbo.%';
A Caveat
Dynamic SQL is *always* a risky endeavor, and not one you should take lightly. While I tend to go to great lengths to make code safe, nothing is 100% foolproof, and I urge you to put this code through rigorous testing before deploying to any important system. A hint: anyone who can create tables or columns in any database could name those entities anything they like, and since there's no way to parameterize entity names, you may want additional protections in there that validate that all tables/columns actually exist.
Next Steps
- On a test or development system, create this stored procedure and experiment with it. Feel free to add additional clauses to
@DatabaseCommands
to limit the databases targeted. - See these tips and other resources:
- Searching and finding a string value in all columns in a SQL Server table
- Protecting Yourself from SQL Injection in SQL Server - Part 1
- Protecting Yourself from SQL Injection in SQL Server - Part 2
- Validate the contents of large dynamic SQL strings in SQL Server
- How to search all columns of all tables in a database for a keyword?
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2015-09-30