Validate Dynamic SQL to Prevent SQL Injection in SQL Server

By: Tim Smith | Comments | Related: > SQL Injection

Problem

We use dynamic SQL in a customer facing application and throughout our data's life cycle, as it allows us and customers to use the data in a range of table objects that we and our customers may need to use for reports or data. We think this is convenient since it saves us time in development and also allows customers to have access to new tables with data that we may add later. Are there alternatives to using dynamic SQL that will add security?

Solution

Injection of Code is a Risk

First, we should understand that injection is a risk in systems that allow input, even if we tend to only hear about "SQL injection" as if it's the only risk of injection that we face. I call this the injection risk myth and it's stated as "Only SQL databases are susceptible to injection risks." I've heard derivatives of this stated, especially when people sometimes bring up NoSQL as an alternative and while there are use cases for NoSQL engines, they are not immune to injection risks (since most don't allow SQL, they are immune to SQL injection because the interaction isn't SQL-based). Any system that allows input is at risk of injection, where we define injection as malicious influence or infiltration on the system. If an application server allows an attacker to edit or view a configuration file, the attacker may exploit this vulnerability by injection a malicious configuration or have access to parts of the system the attacker should not have access. This true outside computing system as well: since our brain receives input, we can be influenced by that input, and some influence may be malicious. Any system with input has injection risks.

Validate Input

Since we know that input of any type adds injection risks, we can move to our second point, which is to validate our input. We want to define strict rules about what is allowed in our system and what is not allowed. For an example, if a field allows TINYINTs, then we want the input to be strict in only allowing TINYINTs - this means that anything about 255 is invalid and anything over a byte is invalid. The data that the client should be entering becomes the rule about what we allow during the input. This matters and let's look at an example of this - using DATALENGTH, we'll get the bytes of 1 without casting 1 as a TINYINT and with casting 1 as a TINYINT:

SELECT DATALENGTH(1) AS NoCast
SELECT DATALENGTH(CAST(1 AS TINYINT)) AS StrictCast

My immediate thought when I see the NoCast result is concern because 1 doesn't start off on the lowest possible byte value - but when we cast it, we see it's strict in result. Notice what happens if we try to cast a value that's outside the definition of TINYINT:

SELECT DATALENGTH(CAST(324 AS TINYINT)) AS BreakOutputCast

We can also consider strings while considering what the effect may be if we allow someone to enter any amount of characters:

SELECT DATALENGTH('valueOne') LimitedCharacterAmount
SELECT DATALENGTH('Multi line textbox with any allowable text amount that users can enter along with any characters they want entered allowed') AnyCharacterAmount
SELECT DATALENGTH('A1') StrictCharacterAmount

The purpose of these three exercises is to demonstrate how we can think a value will come across as a default in one way, but it may be different and this nuance can matter for an attacker. We're only talking one rule here - the size in bytes. This holds true for using software combinations where tiny differences in application libraries allow for a range of attacks.

Adding Extra Validation

Since the question involves uses dynamic SQL for looping over tables, we'll look at an example of adding extra validation - even with extra work and extra performance use - of validating input. This serves as an idea of adding extra validation, but should not be taken as "do this." For an example, giving a customer facing application the ability to query any table is a bad idea; I would use a preset combination box and add to it and rebuild the application as new tables are created. Still, I know some developers will go the dynamic route because it's easier. In the below code, we start with a DECLARE @table for testing the input that will come externally through either a code or procedure (from developers or customers).

DECLARE @table VARCHAR(25) = 'etlRunners'

SELECT 
    TABLE_NAME
  , TABLE_SCHEMA
  , QUOTENAME(TABLE_SCHEMA) + '.' + QUOTENAME(TABLE_NAME) WrappedFormatMix
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME = @table

What we see in the above is an example of further validation: we're not going to execute a query against the table parameter declared here, but use the table parameter to validate that a table with that name exists. From here, we'll use some of the output from this query to dynamically query the table - but we're never using what's passed into the code directly - as a contrast to this approach see the section Accept anything:

DECLARE @table VARCHAR(25) = 'etlRunners'

--- Validate example:
SELECT 
    TABLE_NAME
  , TABLE_SCHEMA
  , QUOTENAME(TABLE_SCHEMA) + '.' + QUOTENAME(TABLE_NAME) WrappedFormatMix
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME = @table

--- Accept anything
EXEC ('SELECT * FROM ' + @table)

Rather than the above where we're directly using what's entered, we're using what's entered to be used for validation, then using what the INFORMATION_SCHEMA.TABLES returns as the dynamic table. If a table returns, we query it. But if no table returns, we query nothing. Only in the context of validation could this be useful. Querying a non-existent table may result in an error that becomes useful to an attacker (assuming we don't have a catch in place on another layer).

DECLARE @table VARCHAR(25) = 'etlRunners', @tableValidated VARCHAR(35)

--- Use a validation step:
SELECT @tableValidated = QUOTENAME(TABLE_SCHEMA) + '.' + QUOTENAME(TABLE_NAME)
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME = @table

EXEC ('SELECT * FROM ' + @tableValidated)

Next, we'll wrap this with an encrypted stored procedure:

CREATE PROCEDURE stpReturnQuery
@table VARCHAR(25)
WITH ENCRYPTION
AS
BEGIN 

	DECLARE @tableValidated VARCHAR(35), @sql NVARCHAR(50)

	SELECT @tableValidated = QUOTENAME(TABLE_SCHEMA) + '.' + QUOTENAME(TABLE_NAME)
	FROM INFORMATION_SCHEMA.TABLES
	WHERE TABLE_NAME = @table
	
	SET @sql = (N'SELECT * FROM ' + @tableValidated)

	EXEC sp_executesql @sql

END

Summary

As a final thought, this exercise does not mean we should only do one step for validating input, or that we should only validate input on the back-end (it may be too late relative to the attacker). It also doesn't mean we should add other security measures, such as returning a generic result regardless of whether an object is found.

In addition, we want to make sure that only the fewest possible users can execute this stored procedure - permissions is another side to this security design. We also have to consider that an attacker will see the above procedure differently than developers will - he'll already be looking at the procedure for weaknesses and any revelation gives him time to develop a counter strategy.

Validation may be something we want to do with several steps and in several layers. Our rules for security will dictate how thorough we want this to be. We have seen how adding extra validation can be a useful technique in some situations, but by no means is this the only validation or technique we should use.

Finally, does this add extra costs? Yes, but delay when accountable offers robust security.

Next Steps

Any system that allows input is at risk of injection from the database layer to the application layer to the server layer. For an example, if a hacker is able to influence our configuration files, as an example, they can inject malicious text there that possibly results in a compromise or breach. While we tend to hear "injection" in the context of databases, injections risks exist wherever input is allowed.
Giving a client the ability to "enter anything" adds increased risk to an application. In the same manner, allowing customer facing applications to dynamically query anything is a highly questionable practice - though there may be a few contexts where it's needed. When possible, limit the range of input that a client can enter, even if it may cost more to create this structure. As for other parts of the data life cycle, such as early in the ETL flow, be careful about who may have permissions to execute dynamic SQL. Are there use cases in development? Yes, but we should still follow incredibly strict permissions.
When you have to allow input, validate the input with strict rules across multiple layers along with rules for output regardless of input. Be careful about sharing detailed rules, as it gives attackers an insight into what you're doing and how to get around your validation. For every action there is an opposite and equal reaction, when the action is known.

About the author

Tim Smith works as a DBA and developer and also teaches Automating ETL on Udemy.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips