Why You Should Avoid SELECT * in SQL Server T-SQL Code


By:   |   Updated: 2015-12-10   |   Comments (2)   |   Related: More > Views

Problem

I have read and been told that using SELECT * is a no-no in my SQL Server T-SQL code. From my perspective, it's easy and it means I don't have to type all the column names for a query. If it is a problem, I can always rewrite the query in question. Is there something I'm missing?

Solution

Given the scenario you've described, it does seem like folks are making SELECT * to be an issue when it isn't. However, there are some things to consider:

  • If you have to return to the code after some length of time, say a year, will you remember that you used SELECT *?
  • Will you remember everywhere you used SELECT *?
  • If someone else has to edit your code, will they catch all of your uses of SELECT *?
  • What if SELECT * was used for other objects, such as a view definition? How will an underlying table change affect things?

The last example, when there is dependent code using SELECT *, which other systems could be using as an interface, is easy to see. Let's take a simple example, say a table which stores info on domain logins:

CREATE TABLE dbo.DomainUser_Live (
	UserID NVARCHAR(50) NOT NULL,
	FirstName NVARCHAR(30) NOT NULL,
	Surname NVARCHAR(30) NOT NULL,
	CONSTRAINT PK_DomainUser PRIMARY KEY (UserID)
);
GO 

We'll put in some dummy data:

INSERT INTO dbo.DomainUser_Live ( UserID, FirstName, Surname )
VALUES ( N'John.Doe', N'John', N'Doe');

INSERT INTO dbo.DomainUser_Live( UserID, FirstName, Surname )
VALUES ( N'Jane.Doe', N'Jane', N'Doe');
GO

Then we'll create a view using SELECT * against the table.

CREATE VIEW dbo.DomainUser
AS 
  SELECT * FROM dbo.DomainUser_Live;
GO 

If you're wondering about the name with the suffix _Live, think about a set of tables for Domain Users. One is live and active for applications and people to use. The other can be loaded without interruption. A couple of renames and the data is "refreshed." Meanwhile, the view allows an unchanging interface for anyone or any system that needs the data, without having to understand what's going on behind the scenes. This is just one example of where you'd have a dependent object like a view referring to a table. So let's query against the view dbo.DomainUser:

Query Against the Original View/Table

Scenario: A New Requirement

Now let's say there's a new requirement. An application needs the Distinguished Name of the user, which is the path in Active Directory to find where the user account is located. Therefore, the following change is made:

ALTER TABLE dbo.DomainUser_Live
ADD DN NVARCHAR(255) NULL;
GO 

And after data is loaded for every user object, then we would flip the column to NOT NULL like so. First, the data:

UPDATE dbo.DomainUser_Live
SET DN = 'CN=' + UserID + ', OU=Users, DC=MyDomain, DC=local';
GO 

Then the ALTER COLUMN to set the column to NOT NULL:

ALTER TABLE dbo.DomainUser_Live
ALTER COLUMN DN NVARCHAR(255) NOT NULL;
GO 

Now let's query the view and see what we get:

Query the View and the DN column is missing

The view, though it's defined as SELECT *, doesn't return the column DN! Let's check the table to make sure we've made the changes:

Table has the DN column

SO the SELECT * isn't actually returning all columns! Imagine trying to troubleshoot this issue at 3 AM in the morning! Let's issue a query against sys.columns to figure out what's going on:

SELECT OBJECT_NAME(c.object_id) AS 'View', c.name AS 'Column'
FROM sys.columns AS c
  JOIN sys.views AS v
    ON c.object_id = v.object_id
WHERE v.name = 'DomainUser';
GO

The results of the query reveals the problem:

View only has three columns according to sys.columns

The table's original columns are specified for the view, but not the new one, DN. With this example, it's not hard to do the rewrite. However, consider the case where you have a lot of tables and a lot of views. If you have code relying upon those views, rewriting from SELECT * to add the columns needed can become a significant amount of work.

How did this happen? The problem is that the SELECT * for the view definition was automatically expanded by SQL Server at the time the view was created. So even though specific column names weren't specified, SQL Server figured out what they were and assigned them to the view. Therefore, the view only returns the original columns. Now take the scenario a step further and imaging that someone is only looking at the view definition from what was checked into source control. The view definition looks right, only we know from this exercise that it's not. Therefore, a person reviewing code in source control will miss the problem.

Next Steps


Last Updated: 2015-12-10


get scripts

next tip button



About the author
MSSQLTips author K. Brian Kelley K. Brian Kelley is a SQL Server author and columnist focusing primarily on SQL Server security.

View all my tips




Post a comment or let the author know this tip helped.

All comments are reviewed, so stay on subject or we may delete your comment. Note: your email address is not published. Required fields are marked with an asterisk (*).

*Name
*Email
Email me updates

Signup for our newsletter

I agree by submitting my data to receive communications, account updates and/or special offers about SQL Server from MSSQLTips and/or its Sponsors. I have read the privacy statement and understand I may unsubscribe at any time.





Monday, December 14, 2015 - 5:19:46 PM - Gene Wirchenko Back To Top

1) That is not even the important point. Suppose you do refigure the views. That new column might have sensitive data that not everyone using the system should be able to see. Anyone accessing the data through a view defined with select * can now potentially see the data. Big error! 2) Another smaller point is how much data is sent. Does it all need to be sent? With appropriate columns specified, less data gets sent. 3) Do you want to correct the uses of select *? Since any of the data might be used, you have to check through programs to see what is used. (The same applies to too many specified columns, but you do have their names which will make it easier to check.)


Thursday, December 10, 2015 - 6:09:42 AM - raju Back To Top

1. DECLARE @CAN VARCHAR(MAX)

   SELECT @CAN='CodeSetAcuityID'+' '+CAST(CodeSetAcuityID AS VARCHAR(10))  from CodeSetAcuity  

    SELECT @CAN

 

 

2. SELECT 'CodeSetAcuityID'+' '+CAST(CodeSetAcuityID AS VARCHAR(10))  from CodeSetAcuity  

 

 

 

what is the difference of this two.



download

























get free sql tips

I agree by submitting my data to receive communications, account updates and/or special offers about SQL Server from MSSQLTips and/or its Sponsors. I have read the privacy statement and understand I may unsubscribe at any time.



Learn more about SQL Server tools