Search all string columns in all SQL Server databases

By: Aaron Bertrand | Comments (2) | Related: 1 | 2 | 3 | 4 | 5 | 6 | 7 | More > Scripts

Problem

In a previous tip, Greg Robidoux showed us a method for searching and finding a string value in all columns in a table, and pointed out a script by Vyas Kondreddi that will search all columns in all tables of a single database. But what if I want to search across all string columns, across all tables, across all databases?

Solution

I've come across this problem multiple times in my career, so I thought I would put together my approach, which differs from others I have seen. In particular, I have a tendency to avoid explicit loops and cursors where possible, and I also like to avoid concatenating any user input (like the search string) into dynamic SQL - I would rather strongly parameterize those values. As I've explained before, any arbitrary string you accept from a user is a potential threat.

My solution is, admittedly, a bit complex; it involves three layers of nested dynamic SQL. You could probably argue that this offsets the benefits achieved by avoiding the scaffolding of loops or cursors, and that it may make the code harder to read for some people. I can't argue with those points, but still, this is the solution I prefer.

The approach is three-part (basically in reverse of how the code actually has to be structured):

Build a dynamic SQL command to find every eligible column in every table.
Build a dynamic SQL command for each column to be searched.
Build a dynamic SQL command to run that in each database.

And before I get into any code, I'll iterate a few of the intentions and limitations of my solution:

This is not meant to be fast. Searching for patterns within strings is not what relational databases are designed for, so you'll need some patience when running this against larger databases. (And as such, this is one case where I feel the problems with NOLOCK can be overlooked. Of course if you need absolute certainty and are willing to run this concurrently with other users, you may want to comment the isolation level out. Better yet, run this against restored copies of your databases - you're testing your restores, right?)
I intend to support searches of text, ntext, char, nchar, varchar, and nvarchar. With some effort, it could be expanded to support other data types, like sql_variant and xml. Currently, I have constructed the code to avoid implicit conversions from Unicode to non-Unicode and vice-versa, by using two variables for the search term and constructing the dynamic SQL accordingly. However I do not go to the extent of covering different collations or dealing with case sensitive searches.
Rather than return all matching rows from any given table, the point is to identify each table that has at least one row, and return the first 1,000 characters from an arbitrary, example value from that row. (The primary purpose of such a script is usually to identify which table(s) are victims of SQL injection, not to return every affected row in a grid.) You can return all rows by removing the TOP 1, and the full value of any tuple by changing the #temp table to use NVARCHAR(MAX) and removing the LEFT(,1000) bit.

The full stored procedure is listed below, but I should highlight the key portions and the purpose of each.

Stored procedure signature

The stored procedure takes a single @search argument, using nvarchar for greatest compatibility. I use SET NOCOUNT ON to prevent DONE_IN_PROC messages from being transmitted, and immediately return if the search argument is NULL or only contains wildcard characters.

USE UtilityDatabase;
GO

CREATE PROCEDURE dbo.SearchAllDatabases
  @SearchTerm NVARCHAR(255) = NULL
AS
BEGIN
  SET NOCOUNT ON;
  SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
  
  IF @SearchTerm IS NULL OR @SearchTerm NOT LIKE N'%[^%^_]%'
  BEGIN
    RAISERROR(N'Please enter a valid search term.', 11, 1);
    RETURN;
  END

Temporary table

I create a simple #temp table, with self-explanatory columns to hold the results of the search. This is created at the outermost scope, which will be visible to dynamic SQL statements, no matter how many levels deep they go.

CREATE TABLE #results
  (
    [database]   SYSNAME,
    [schema]     SYSNAME,
    [table]      SYSNAME, 
    [column]     SYSNAME,
    ExampleValue NVARCHAR(1000)
  );

@DatabaseCommands

First, we need to build a set of commands so that we can pass the search argument to each database, regardless of how many eligible tables are found there.

  SELECT @DatabaseCommands = @DatabaseCommands + N'

  /* layer 1 */

  EXEC ' + QUOTENAME(name) + '.sys.sp_executesql 
        @ColumnCommands, N''@SearchTerm NVARCHAR(MAX)'', @SearchTerm;'
    FROM sys.databases 
    WHERE database_id  > 4  -- non-system databases  
      AND [state]      = 0  -- online 
      AND user_access  = 0; -- multi-user

You may want to add other checks here, like HAS_DBACCESS(name) = 1. You may also want to include system databases (in case you were really exposed) or not exclude single-user databases (since that user could be you, on purpose, while you fix this).

This will lead to a variable that looks like this:

  EXEC [AdventureWorks].sys.sp_executesql @ColumnCommands, N'@search NVARCHAR(MAX)', @search;
  EXEC [ContosoYogurts].sys.sp_executesql @ColumnCommands, N'@search NVARCHAR(MAX)', @search;
  EXEC [Microsoft Bobs].sys.sp_executesql @ColumnCommands, N'@search NVARCHAR(MAX)', @search;
  ...

@ColumnCommands & @SearchCommands

This is the most complex part of the code, where we generate a search command for each table + eligible column combination. This requires nesting of dynamic SQL, which also means ugly things with single quotes (but I do try to eliminate some of that with the variable @q). And I try to use indenting to show the two layers being used here:

  SET @ColumnCommands = N'
    /* layer 2 */
    DECLARE @q NCHAR(1),  -- eliminates nested single-quote nausea
            @SearchCommands NVARCHAR(MAX);
    
    SELECT @q = NCHAR(39), 
      @SearchCommands = N''DECLARE @VSearchTerm VARCHAR(255) = @SearchTerm;'';
  
    SELECT @SearchCommands = @SearchCommands + CHAR(10) + N''

      /* layer 3 */

      SELECT TOP (1)
        [db]     = DB_NAME(),
        [schema] = N'' + @q + s.name + @q + '', 
        [table]  = N'' + @q + t.name + @q + '',
        [column] = N'' + @q + c.name + @q + '',
        ExampleValue = LEFT('' + QUOTENAME(c.name) + '', 1000) 
      FROM '' + QUOTENAME(s.name) + ''.'' + QUOTENAME(t.name) + ''
      WHERE '' + QUOTENAME(c.name) + N'' LIKE @'' + CASE 
        WHEN c.system_type_id IN (35, 167, 175) THEN ''V'' 
        ELSE '''' END + ''SearchTerm;'' 

      /* end layer 3 */

    FROM sys.schemas AS s
    INNER JOIN sys.tables AS t
    ON s.[schema_id] = t.[schema_id]
    INNER JOIN sys.columns AS c
    ON t.[object_id] = c.[object_id]
    WHERE c.system_type_id IN (35, 99, 167, 175, 231, 239)
      AND c.max_length >= LEN(@SearchTerm);

    /* end layer 2 */
    
    PRINT @SearchCommands;
    EXEC sys.sp_executesql @SearchCommands, N''@SearchTerm NVARCHAR(255)'', @SearchTerm;

  /* end layer 1 */';

Inside layer 3 you can see the CASE expression magic that changes @SearchTerm to @VSearchTerm in the event the column is non-Unicode. I also exclude any columns that can't possibly contain the search term because of their length; ColName CHAR(3) can't contain '%some_search%'.

There is also a PRINT command in this portion so you can troubleshoot at least a portion of the code (note that PRINT output is inherently limited by SSMS, as I described in a tip last year, Validate the contents of large dynamic SQL strings). A selective sample from the PRINT output of @SearchCommands against AdventureWorks:

DECLARE @VSearchTerm VARCHAR(255) = @SearchTerm;

      SELECT TOP (1)
        [db]     = DB_NAME(),
        [schema] = N'Production', 
        [table]  = N'ScrapReason',
        [column] = N'Name',
        ExampleValue = LEFT([Name], 1000) 
      FROM [Production].[ScrapReason]
      WHERE [Name] LIKE @SearchTerm;

      SELECT TOP (1)
        [db]     = DB_NAME(),
        [schema] = N'Sales', 
        [table]  = N'Customer',
        [column] = N'AccountNumber',
        ExampleValue = LEFT([AccountNumber], 1000) 
      FROM [Sales].[Customer]
      WHERE [AccountNumber] LIKE @VSearchTerm;

Note that the second query uses the VARCHAR version of the variable, since AccountNumber is not Unicode. Also note that the procedure has no way to know when you are using a string column to store things that are unlikely to contain non-numeric data; it blindly includes all columns that match the six data types identified above. If you have common naming conventions for columns that can't possibly match your searches, you could consider filtering them out in layer 2.

Final execution

With all the messy parts out of the way, now we can populate the #temp table, and return a single resultset (we use the #temp table to avoid a separate resultset for every table/column combination or for every database).

  -- insert/exec to run all searches 
  -- against al columns in all databases
  
  INSERT #Results
  (
    [database],
    [schema],
    [table],
    [column],
    ExampleValue
  )
  EXEC [master].sys.sp_executesql @DatabaseCommands, 
    N'@ColumnCommands NVARCHAR(MAX), @SearchTerm NVARCHAR(255)', 
    @ColumnCommands, @SearchTerm;

  -- in case you execute multiple searches, this will help 
  -- identify each individual search in your results:
  SELECT [Searched for] = @SearchTerm;
  
  -- final select
  SELECT [database],[schema],[table],[column],ExampleValue 
    FROM #Results 
    ORDER BY [database],[schema],[table],[column];

Putting it all together

Here is the entire procedure body or open in a new tab.

CREATE PROCEDURE dbo.SearchAllDatabases
  @SearchTerm NVARCHAR(255) = NULL
AS
BEGIN
  SET NOCOUNT ON;
  SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
  
  IF @SearchTerm IS NULL OR @SearchTerm NOT LIKE N'%[^%^_]%'
  BEGIN
    RAISERROR(N'Please enter a valid search term.', 11, 1);
    RETURN;
  END
 
  CREATE TABLE #results
  (
    [database]   SYSNAME,
    [schema]     SYSNAME,
    [table]      SYSNAME, 
    [column]     SYSNAME,
    ExampleValue NVARCHAR(1000)
  );

  DECLARE
    @DatabaseCommands  NVARCHAR(MAX) = N'', 
    @ColumnCommands    NVARCHAR(MAX) = N'';

  SELECT @DatabaseCommands = @DatabaseCommands + N'
    EXEC ' + QUOTENAME(name) + '.sys.sp_executesql 
        @ColumnCommands, N''@SearchTerm NVARCHAR(MAX)'', @SearchTerm;'
    FROM sys.databases 
    WHERE database_id  > 4  -- non-system databases  
      AND [state]      = 0  -- online 
      AND user_access  = 0; -- multi-user

    SET @ColumnCommands = N'DECLARE @q NCHAR(1),
          @SearchCommands NVARCHAR(MAX);
    
    SELECT @q = NCHAR(39), 
      @SearchCommands = N''DECLARE @VSearchTerm VARCHAR(255) = @SearchTerm;'';
  
    SELECT @SearchCommands = @SearchCommands + CHAR(10) + N''

      SELECT TOP (1)
        [db]     = DB_NAME(),
        [schema] = N'' + @q + s.name + @q + '', 
        [table]  = N'' + @q + t.name + @q + '',
        [column] = N'' + @q + c.name + @q + '',
        ExampleValue = LEFT('' + QUOTENAME(c.name) + '', 1000) 
      FROM '' + QUOTENAME(s.name) + ''.'' + QUOTENAME(t.name) + ''
      WHERE '' + QUOTENAME(c.name) + N'' LIKE @'' + CASE 
        WHEN c.system_type_id IN (35, 167, 175) THEN ''V'' 
        ELSE '''' END + ''SearchTerm;'' 
 
    FROM sys.schemas AS s
    INNER JOIN sys.tables AS t
    ON s.[schema_id] = t.[schema_id]
    INNER JOIN sys.columns AS c
    ON t.[object_id] = c.[object_id]
    WHERE c.system_type_id IN (35, 99, 167, 175, 231, 239)
      AND c.max_length >= LEN(@SearchTerm);

    PRINT @SearchCommands;
    EXEC sys.sp_executesql @SearchCommands, 
      N''@SearchTerm NVARCHAR(255)'', @SearchTerm;';
  
  INSERT #Results
  (
    [database],
    [schema],
    [table],
    [column],
    ExampleValue
  )
  EXEC [master].sys.sp_executesql @DatabaseCommands, 
    N'@ColumnCommands NVARCHAR(MAX), @SearchTerm NVARCHAR(255)', 
    @ColumnCommands, @SearchTerm;

  SELECT [Searched for] = @SearchTerm;
  
  SELECT [database],[schema],[table],[column],ExampleValue 
    FROM #Results 
    ORDER BY [database],[schema],[table],[column];
END
GO

Sample Calls to Find Text

-- generate an error because this is not a sensible search:
EXEC dbo.SearchAllDatabases @SearchTerm = N'%';
GO

-- search for columns containing "http://"
EXEC dbo.SearchAllDatabases @SearchTerm = N'%http://%';

-- search for columns containing "%" or "_":
EXEC dbo.SearchAllDatabases @SearchTerm = N'%[%]%';
EXEC dbo.SearchAllDatabases @SearchTerm = N'%[_]%';

-- search for columns containing "update" followed by "dbo.":
EXEC dbo.SearchAllDatabases @SearchTerm = N'%update%dbo.%';

A Caveat

Dynamic SQL is *always* a risky endeavor, and not one you should take lightly. While I tend to go to great lengths to make code safe, nothing is 100% foolproof, and I urge you to put this code through rigorous testing before deploying to any important system. A hint: anyone who can create tables or columns in any database could name those entities anything they like, and since there's no way to parameterize entity names, you may want additional protections in there that validate that all tables/columns actually exist.

Next Steps

On a test or development system, create this stored procedure and experiment with it. Feel free to add additional clauses to @DatabaseCommands to limit the databases targeted.
See these tips and other resources:

About the author

Aaron Bertrand (@AaronBertrand) is a passionate technologist with industry experience dating back to Classic ASP and SQL Server 6.5. He is editor-in-chief of the performance-related blog, SQLPerformance.com, and also blogs at sqlblog.org.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips