SQL NOT IN Operator
I am new to T-SQL and need to know about the NOT IN operator in the Microsoft SQL Server DBMS. What does this logical operator do? When and how do I use it? Can the SQL NOT IN operator be run in a SELECT statement or a stored procedure? If so, what is the syntax?
This tip will cover the NOT IN operator in detail. It will explain the functionality and explain the use cases. It will show several NOT IN examples. For any reader that is not familiar with the IN operator, review this tip to learn about SQL IN.
All of the demos in this SQL tutorial will use the WideWorldImporters sample database which can be downloaded for free from Github.
The SQL Server NOT IN operator is used to replace a group of arguments using the <> (or !=) operator that are combined with an AND. It can make code easier to read and understand for SELECT, UPDATE or DELETE SQL commands. Generally, it will not change performance characteristics. Consider this SQL query:
SELECT * FROM Sales.Invoices WHERE LastEditedBy <> 11 AND LastEditedBy <> 17 AND LastEditedBy <> 13;
This isn’t hard to comprehend, but makes for a long query.
A better approach would be the SQL NOT IN operator. The NOT IN operator works by comparing one value, usually a column, to a comma-separated list of potential match values held in a set of parentheses. The value to the left of the NOT IN operator is then compared, one at a time, to entire list and an exact match to any one member of the list will cause the argument to evaluate as false. This can be hard to understand, but should make more sense after seeing several examples.
The following SQL statement converts the 3 "LastEditedBy <>" arguments using a single NOT IN operator instead.
SELECT * FROM Sales.Invoices WHERE LastEditedBy NOT IN (11,17,13);
This query is functionally equivalent to the one above, but is shorter and easier to read.
Rules and Best Practices for SQL NOT IN
The NOT IN operator can only replace the <> or != operators. It cannot replace =, <, >, <=, >=, BETWEEN, or LIKE. It will only find and exclude exact matches. Duplicate values in the list are ignored.
This means that
WHERE LastEditedBy NOT IN (11,17,13)
is the same as
WHERE LastEditedBy NOT IN (11,17,13,11,17)
The NOT keyword can be placed either at the start of the argument or in the operator. The 2 options are functionally equivalent and choosing which way to code is purely a style choice. This example shows both methods.
WHERE LastEditedBy NOT IN (11,17,13)
is the same as
WHERE NOT LastEditedBy IN (11,17,13)
The NOT IN operator can be used anywhere any other operator is used including WHERE clauses, HAVING clauses, IF statements, or join predicates – although they should be extremely rare in join predicates (SQL JOINS - SQL INNER JOIN, SQL LEFT JOIN, SQL RIGHT JOIN). In fact, this author has never used a NOT IN operator in a join predicate.
The list of values can have as few as one item with no defined maximum limit, although extremely large lists are generally frowned upon. As a rule, this author doesn’t use NOT IN statements with more than 5 or 10 list items. With value lists that contain a large number of values it often makes more sense to use an OUTER JOIN to accomplish the same thing while giving the query optimizer more options.
Finally, the list of values doesn’t have to be a hard-coded, comma-separated list. It can be defined by a query. When this happens, it opens up the possibility that the list has zero values and the argument will evaluate to true for every row. This option will be covered in detail later in the tip.
SQL NOT IN with Strings
The NOT IN operator can be used to compare a string column (char, nchar, varchar, nvarchar) to a list of strings. The script below shows 2 functionally equivalent IF statements – one that uses multiple arguments and a second that uses the NOT IN operator.
--Ignore test users --AND Version IF @UserName <> 'TrainingUser' AND @UserName <> 'TestUser' BEGIN END --Ignore test users --IN Version IF @UserName NOT IN ('TrainingUser', 'TestUser') BEGIN END
The quotes in the list items are necessary because the data types are of the string variety.
SQL NOT IN with Numbers
This next set of example queries is looking for account persons who have made exactly 6, 8, or 9 sales. Since the values are not concurrent, neither a BETWEEN operator nor a combination of arguments utilizing < and > will work.
This SQL SELECT example introduces putting the NOT IN operator in a HAVING clause and uses the != operator instead of <> -- both of which work exactly the same way as the prior IF, WHERE, and <> examples.
--AND Version SELECT AccountsPersonID, COUNT(*) TotalInvoices FROM Sales.Invoices GROUP BY AccountsPersonID HAVING COUNT(*) != 6 AND COUNT(*) != 8 AND COUNT(*) != 9; --IN Version SELECT AccountsPersonID, COUNT(*) TotalInvoices FROM Sales.Invoices GROUP BY AccountsPersonID HAVING COUNT(*) NOT IN (6,8,9);
Since the result of the COUNT function is a number there are no quotes in the value list. This would be true for any column with a numeric data type (int, bigint, smallint, tinyint, numeric, decimal, float, real, money) as well.
SQL NOT IN with Dates
The NOT IN operator can be used to search for any date or datetime value except those match that match one from a list of date or datetime values. Imagine a scenario where the WideWorldImporters wanted to get an average number of items ordered daily per customer for the calendar year 2013, but wanted to exclude some holidays as they may skew the results lower. The query to determine this output may look something like below.
--AND Version SELECT CustomerID, AVG(ItemsPurchased) AverageDay FROM ( SELECT CustomerID, InvoiceDate, COUNT(*) ItemsPurchased FROM Sales.Invoices INNER JOIN Sales.InvoiceLines ON Invoices.InvoiceID = InvoiceLines.InvoiceID WHERE InvoiceDate != '25-Dec-2013' AND InvoiceDate != '4-Jul-2013' AND InvoiceDate != '28-Nov-2013' GROUP BY CustomerID, InvoiceDate) SubQuery GROUP BY CustomerID; --IN Version SELECT CustomerID, AVG(ItemsPurchased) AverageDay FROM ( SELECT CustomerID, InvoiceDate, COUNT(*) ItemsPurchased FROM Sales.Invoices INNER JOIN Sales.InvoiceLines ON Invoices.InvoiceID = InvoiceLines.InvoiceID WHERE InvoiceDate NOT IN ('25-Dec-2013', '4-Jul-2013', '28-Nov-2013') GROUP BY CustomerID, InvoiceDate) SubQuery GROUP BY CustomerID;
There are several ways to handle entering date and datetime values, but the most common method is to use quotes and type in a string that can be easily converted to a date or datetime.
SQL NOT IN with Subqueries
All examples to this point have shown a hard coded list following the NOT IN operator. A very common use case for NOT IN is for the list to be generated during query execution by another TSQL query.
There are 2 important things to remember when building a subquery that will be placed in the parentheses of a NOT IN operator. First, it must return exactly 1 column. An error will be raised if any other number of columns are selected. Second, the query must be able to be run by itself – meaning it has no outside dependencies.
Consider a scenario where an unexpected weather pattern is limiting commerce in the state of California. WideWorldImporters needs to build a list of items that it can order and for which it can get prompt delivery. These would be any stock items that are not sourced from California-based suppliers. Can a query be written to help identify available items?
This query lists the 3 suppliers that are based in California. Notice that the query fits the rules above in that it can be executed and returns exactly one column.
SELECT SupplierID FROM Purchasing.Suppliers INNER JOIN [Application].Cities ON PostalCityID = Cities.CityID INNER JOIN [Application].StateProvinces ON Cities.StateProvinceID = StateProvinces.StateProvinceID WHERE StateProvinces.StateProvinceName = 'California';
Simply place this query inside the parentheses following the NOT IN operator, but do so without the semi colon. The following query does just that. It will search for items in that are sourced from any non-California supplier.
SELECT * FROM Warehouse.StockItems WHERE SupplierID NOT IN ( SELECT SupplierID FROM Purchasing.Suppliers INNER JOIN [Application].Cities ON PostalCityID = Cities.CityID INNER JOIN [Application].StateProvinces ON Cities.StateProvinceID = StateProvinces.StateProvinceID WHERE StateProvinces.StateProvinceName = 'California');
SQL NOT IN Performance Considerations
Arguments that are created with the NOT IN operator are not conducive to index seek operations and commonly result in slower scan operations. For this reason, as often as possible NOT IN arguments should be used in conjunction with additional arguments that can use indexes to reduce the number of rows to be processed. That way the NOT IN argument is only being compared on this reduced set of rows and not an entire table.
While not as popular as some other operators, including IN, the NOT IN operator can be very useful in the right use case and should definitely be a part of any T-SQL writer’s repertoire.
- SQL Server 101
- Learning SQL Server
- SQL Server Indexing Tutorial
- SQL IN Operator
- Comparing performance of IN to alternatives
- SQL LIKE Syntax with Wildcard Characters
- Find text strings in character data types using SQL Server LIKE Operator
- SQL LIKE Examples
About the author
View all my tips
Article Last Updated: 2021-06-30