Excluding Characters With Regular Expressions with T-SQL
By: Tim Smith
One of the most useful techniques with regular expressions involves using the not ^ character, as we are often looking for specific data and our queries can be inverted to find these data. In some cases, it’s easier to find data by finding what the data are not, as opposed to what the data are. In this section, we’ll look at some applications with the not character before moving into business examples with regular expressions.
Using the Not Character for Regex with T-SQL
In the last section, we looked at finding punctuations within our sentences. What if we wanted to find all of our data rows from the alphareg table that started with any special character (query one below this), or we wanted to find any of our data rows from our alphanumreg table where the column AlphabeticNum started with a non-numerical character (query two below this)?
In the below queries, we see the not ^ character which seeks information that doesn’t match what we specify – in query one, the first character is not an alphabetic character in the range of A through Z, and in query two, the first character is not a numerical character in the range of 0 through 9.
SELECT * FROM alphareg WHERE Alphabetic LIKE '[^A-Z]%' SELECT AlphabeticNum FROM alphanumreg WHERE AlphabeticNum LIKE '[^0-9]%'
Three results each from the “not” queries.
Regex to Exclude Numbers and Letters
This offers us a lot of use, especially in a simple example of finding data rows with all the special characters. We could enter every single special character, or we could just as easily write our query to exclude all the alphabetic and numerical character, such as the below example using the alphareg table.
SELECT * FROM alphareg WHERE Alphabetic LIKE '[^A-Z0-9]%'
One result from the not any alphabetic or numerical character query.
Once we enter the not character [^], any range that follows will be excluded. We don’t have to re-enter it for the numbers, as we see in the above query. The result is that all rows which start with any special character are returned.
Regex to Find Text Patterns Not Including Characters
We can use our combination techniques with special characters, like we did with alphabetic and numerical characters. In the below query, we look for sentences that start with any alphabetic character, end with any alphabetic character or period, and have a special character within them.
SELECT * FROM alphareg WHERE Alphabetic LIKE '[A-Z]%[^A-Z0-9 ]%[A-Z.]'
Five results from the query.
Another Regex Example to Not Include Characters
This query also highlights that spaces are considered special characters, so if we’re interested in all special characters that are not spaces, we can include the space in the not regular expression specification. We can further see this by adding a value to our table with a leading, ending and middle space and see the results of a query that retrieves this information by looking for a special character that starts, ends and is within a data row (the new row only qualifies):
INSERT INTO alphareg VALUES (' space space ') SELECT * FROM alphareg WHERE Alphabetic LIKE '[^A-Z0-9]%[^A-Z0-9]%[^A-Z0-9]'
Insert statement with one result from the following query.
In some situations, we may not want a space or another special character, so we can include the special character that we want to exclude in our not regular expression, such as [^A-Z0-9 ] to exclude a space.
- Suppose that we want all special characters, except commas. How would we write that query?