Learn more about SQL Server tools

mssqltips logo
 

Tutorials          DBA          Dev          BI          Career          Categories          Webcasts          Whitepapers          Today's Tip          Join

Tutorials      DBA      Dev      BI      Categories      Webcasts

DBA    Dev    BI    Categories

Free SQL Server Webcast > Building Really Fast SQL Server VMs
 

Using Regular Expressions With T-SQL: From Beginner To Advanced



By:
Overview

We can use some comparable expressions to a full regular expression library for matching certain patterns with T-SQL using the like operator. In this tutorial, we will practice using these expressions (referred to as regular expressions in the context only of T-SQL) for filtering price phrases involving alphabetic, numeric, and special characters. By the end of this tutorial, we will have another tool that we may use for precisely filtering data for some applicable situations.

I’ve used regex in numerous situations and it can be very useful to know and practice, which we’ll be doing in this tutorial.   While regular expressions can sometimes seem counterintuitive, we will experience many situations where this skill can quickly solve problems that multiple ANDs in a WHERE clause may not be as efficient to use. 

As for its use, some examples where regular expressions can provide us with assistance and make complex problems easy:

  • Applying very specific filters on text, numeric or special character data, especially when precision is paramount, and we cannot allow any possible error at all.
  • Parsing data for ETL purposes, finding patterns in code or in word use, or creating rules for inbound or outbound traffic.
  • In older versions of SQL Server, some functions to validate numbers may allow some characters, like e, and regular expressions can provide more accurate filters in these situations.
  • Muddying or intentionally corrupting data in some environments or in some situations to confuse or mislead infiltrators.

We can apply regex in situations where we need to look at text data, numerical data, or data that uses a combination, such as identifiers with letters, numbers and special characters.

The outline for this tutorial is as follows:

  1. Using regular expressions with alphabetic characters
    • Introduction to alphabetic regular expressions
    • Precise alphabetic filtering with regular expressions
    • Case sensitivity and regular expressions
    • Putting it all together
  2. Using regular expressions with numeric characters
    • Introduction to numeric regular expressions
    • Numerical ranges with any combinations or special characters
    • Putting it all together
  3. Using regular expressions with special characters and applications with it
    • Introduction to special character regular expressions
    • Using the “not” character with regular expressions
  4. Wrapping up with business applications using regular expressions examples
    • With date examples
    • With credit card examples
    • With URL examples
    • With email examples
  5. Summary

Last Update: 4/6/2018




More SQL Server Solutions











Post a comment or let the author know this tip helped.

All comments are reviewed, so stay on subject or we may delete your comment. Note: your email address is not published. Required fields are marked with an asterisk (*).

*Name    *Email    Email me updates 


Signup for our newsletter
 I agree by submitting my data to receive communications, account updates and/or special offers about SQL Server from MSSQLTips and/or its Sponsors. I have read the privacy statement and understand I may unsubscribe at any time.



    



Tuesday, August 21, 2018 - 10:36:20 AM - SL Back To Top

Hi Tim.

I read your excellent article. 

I waas wondering if you  added the following row, wouldn't you get an undesirable result in your query below?

INSERT INTO alphareg 

VALUES ('Twree')

SELECT *

FROM alphareg

WHERE Alphabetic LIKE '%[T][HW][OR][E ]%' 


Tuesday, July 03, 2018 - 1:18:03 PM - Solomon Rutzky Back To Top

these expressions (referred to as regular expressions in the context only of T-SQL)

Hi Tim. My point is that the "expressions" you are describing (i.e. the 3 wildcard options supported by LIKE and PATINDEX) are very much not Regular Expressions, and so should not be referred to as being "regular expressions". Regular Expressions (commonly called "RegEx") is a very specific feature / functionality, one that is not natively supported by SQL Server. If you want to use Regular Expressions in T-SQL, then you need to use SQLCLR. For example, several RegEx functions are available in the free version of SQL# (a library of many SQLCLR stored procedures and functions that I created).

The wildcard options supported by LIKE and PATINDEX are no more Regular Expressions than the analogous wildcards supported by DOS / CMD where * means "0 or more of any", and ? means "0 or 1 of any". But, if being (very) loosely similar to something else is all that is required, then it would be ok to refer to Microsoft SQL Server as "Oracle" since both are RDBMSs, both use SQL, both support INSERT / UPDATE / DELETE / SELECT statements, and so on.

SQL wildcards and RegEx are only similar if you don't know anything, or know that much, about Regular Expressions. And the problem is that because many DBAs know very little (if anything at all) about Regular Expressions, misusing the term only serves to increase the general confusion about them. I think it's great that you are showing people how to use the SQL wildcards, but please stop referring to them as being Regular Expressions. And please change the title of this series to avoid confusion and misrepresentation.

Thank you.


Friday, June 15, 2018 - 4:52:10 AM - Tim Back To Top

@Solomon

We can use some comparable expressions to a full regular expression library for matching certain patterns with T-SQL using the like operator. In this tutorial, we will practice using these expressions (referred to as regular expressions in the context only of T-SQL) for filtering price phrases involving alphabetic, numeric, and special characters. [1rst paragraph]


Thursday, June 14, 2018 - 11:26:09 AM - Solomon Rutzky Back To Top

 

Hi Tim. Interesting article. One note here that might seem nit-picky but is an important distinction: the feature that you are describing here is very specifically NOT Regular Expressions (RegEx). The LIKE operator and PATINDEX function both support wildcards in the form of: "%" (zero or more of any), "_" (only one of any), and "[]" (only one of specific combination of list and/or range(s)). There is a good reason why the Microsoft documention only ever refers to these as "wildcards" and never uses the term "Regular Expressions". It is a common misconception that these wildcards are RegEx because the use of the "[]" list/range, but RegEx doesn't have a single-character "any" option that would equate to the "%" wildcard here. RegEx has "." that equates to the "_" wildcard in T-SQL, but you would then need to use the "*" (zero-or-more) quantifier to make it equivalent to the "%" T-SQL wildcard. In fact, quantifiers ("*" = 0 or more, "+" = 1 or more, "?" = 0 or 1, etc) is something that cannot be done with T-SQL wildcards. Because in T-SQL, they are just wildcards and not Regular Expressions.

 


Learn more about SQL Server tools