Implementing Luhn's Algorithm in T-SQL to validate credit card numbers

By: Derek Colley | Comments (16) | Related: > Functions User Defined UDF

Problem

You are tasked with checking account or credit card numbers for validity, for example as part of a web application handling sensitive card data. You wish to implement this as a function within SQL Server, with the ability to be called with any number and return either TRUE (valid) or FALSE (invalid). Alternatively, you are looking for a method to validate a number based on a check digit: you could be using this as part of a password reset check, for account verification or similar purposes.

Solution

The industry standard for checking credit card numbers for validity is known in colloquial terms as the 'MOD 10 Check', or more formally as Luhn's Algorithm. It has no cryptographic validity, but it is a useful rule-of-thumb check that can be used to validate that a card number is correct, and can be used in its opposite form to generate account (and credit card) numbers - to help discourage potential fraudsters, this function is NOT covered here.

Much literature already exists on Luhn's Algorithm, links for which are provided in the Next Steps section at the end of this tip. The algorithm is simple, but powerful. In the following sections I will provide a step-by-step guide to validating these numbers by hand, then show how we can replicate this as a User-Defined Function (UDF) within SQL Server. I will also show you a method to remove 'noise' from the number - for example, spaces or leading/trailing characters.

One final note - any card numbers used here for illustration / example are fictitious - these pass the MOD 10 check, but are not real credit card numbers, only samples. The samples are from PayPal and may be found here: www.paypalobjects.com/en_US/vhelp/paypalmanager_help/credit_card_numbers.htm

Algorithm Overview

The algorithm can be broken down into 4 stages. All stages are carried out using simple mathematics and rules.

Stage 1: N * 2 for every 2nd N from the right

Let's first get a number which we are testing for validity and split it out into its component digits, in a format we can easily understand. The number we will use is an AMEX sample card number:

The first step is to apply N*2 to every 2nd digit starting from the right. This is true regardless of the length of the number (odd or even number of digits), and is because the final digit is normally a check digit. Let's do this now.

Stage 2: If (N*2) > 9 sum the digits, else carry the digits

In this stage, we carry forward ALL the digits from either the original number or N*2 to produce a final number for verification. The rule is: If the number was doubled in Stage 1 AND this number > 9, then sum the digits. So if the original number was 9, for example, and was selected for doubling in Stage 1 to become 18, we now need to add 1 + 8 = 9 since 18 > 9. Likewise, if the number 2 was doubled in Stage 1 to equal 4, because 4 < 9 we do not need to transform it further. All other digits - i.e. those NOT selected in Stage 1 - are carried forward without transformation.

This is easier to show than describe. Like so:

If (N*2) > 9 sum the digits, else carry the digits

So now we have a final number that we can check for validity.

Stage 3: Sum the digits

This stage is simple - take this output number and sum the digits.

This results in a single output integer we'll call X, in this case 80. We now need to test this number - more formally, IF X % 10 = 0 THEN TRUE ELSE FALSE. For those of you unfamiliar with the MOD operation, it's almost interchangeable with the term 'is divisible by'. When we MOD two numbers, we fetch the remainder of the division operation between the two numbers. So, 10 MOD 2 = 0 because 2 evenly divides into 10 with no remainder. 250 MOD 30 = 10, because 250 is divisible by 30 a total of 8 times (240) with 10 remainder. In SQL and in mathematics, MOD is represented by the percentage (%) sign.

Stage 4: IF X MOD 10 = 0 THEN TRUE ELSE FALSE

Applying this then, we know that 80 MOD 10 = 0, therefore the card number is valid.

For balance, let's show the same grid, but this time using an invalid card number. You can see the algorithm quickly generates an output X that fails the MOD 10 check.

You can see the algorithm quickly generates an output X that fails the MOD 10 check.

Rewriting this for SQL Server

Let's now turn our attention to rewriting this in T-SQL.

The code we write must be fairly efficient, but we're constrained in a unique way by using T-SQL - we're writing in the language of relational set theory, not in object oriented language. So whereas in Python or C we might implement the card number as an array of numbers then apply transformations to all elements of the array, we don't have that luxury in SQL Server. What we CAN do, however, is use a temporary table variable to the same effect.

We must also cleanse the card number and make sure we don't have any trailing spaces. In the real world, we'll receive a number for testing as a string, or VARCHAR of indeterminate length, so we'll work on this basis rather than using an INT/BIGINT.

Luckily we can combine this and use a temporary table variable both for cleansing the number and for validity checking.

Let's now set up the function header, and put in a couple of simple constraints - the function can accept a string no longer than 20 characters, and if there aren't at least three digits present in the string, return an error code:

CREATE FUNCTION dbo.usp_LuhnsAlgorithm_New ( @inputString VARCHAR(20) )
RETURNS TINYINT 
AS BEGIN 
-------------------------------------------------------------------------------
-- Function to calculate whether a number is valid according to the 'MOD 10'
-- check, a.k.a. Luhn's Algorithm.
-- Author:  Derek Colley, August 2014
-- Parameters:  @inputString VARCHAR(20) 
-- Outputs:  TINYINT:    2 = an error occurred, validity undetermined
--       0 = number is not valid 
--       1 = number is valid
-------------------------------------------------------------------------------

-- first a quick check to ensure we have at least 3 numbers in the string - 
-- you can change this to any arbitrary amount, i.e. if you are just 
-- checking credit card numbers, make it 13 digits

DECLARE @result TINYINT

IF @inputString NOT LIKE ('%[0-9]%[0-9]%[0-9]%') 
 RETURN 2

Now, we're going to use a table variable to handle the calculation.

-- set up our table for algorithm calculation

DECLARE @charTable TABLE ( 
 Position INT NOT NULL, 
 ThisChar CHAR(1) NOT NULL, 
 Doubled TINYINT, 
 Summed TINYINT )

There's a neat trick we can use now. The instinctive approach to parsing this string is to step through it one character at a time, removing whitespace and non-numeric data. However, we're in a set-based environment. Neither do we want to encapsulate the string in many string-parsing functions like SUBSTRING, REPLACE, CHARINDEX and the like, as this will get very messy very quickly, and our key tool LIKE isn't rich in functionality compared to the non-database equivalent functions.

For this reason, we'll pivot the string into a table, each row containing one character. Using this methodology, we can quickly eliminate unwanted characters while retaining order, and additionally we can use multiple columns for our calculations.

-- convert the @inputString to a fixed width char datatype 
-- we can then process the string as a set with a known number of elements
-- this avoids RBAR substringing each char to a table in a cursor

SET @inputString = CAST(@inputString AS CHAR(20))
INSERT INTO @charTable(Position, ThisChar) 
 SELECT 1, SUBSTRING(@inputString, 1, 1) UNION ALL 
 SELECT 2, SUBSTRING(@inputString, 2, 1) UNION ALL 
 SELECT 3, SUBSTRING(@inputString, 3, 1) UNION ALL 
 SELECT 4, SUBSTRING(@inputString, 4, 1) UNION ALL 
 SELECT 5, SUBSTRING(@inputString, 5, 1) UNION ALL 
 SELECT 6, SUBSTRING(@inputString, 6, 1) UNION ALL 
 SELECT 7, SUBSTRING(@inputString, 7, 1) UNION ALL 
 SELECT 8, SUBSTRING(@inputString, 8, 1) UNION ALL 
 SELECT 9, SUBSTRING(@inputString, 9, 1) UNION ALL 
 SELECT 10, SUBSTRING(@inputString, 10, 1) UNION ALL 
 SELECT 11, SUBSTRING(@inputString, 11, 1) UNION ALL 
 SELECT 12, SUBSTRING(@inputString, 12, 1) UNION ALL 
 SELECT 13, SUBSTRING(@inputString, 13, 1) UNION ALL 
 SELECT 14, SUBSTRING(@inputString, 14, 1) UNION ALL 
 SELECT 15, SUBSTRING(@inputString, 15, 1) UNION ALL 
 SELECT 16, SUBSTRING(@inputString, 16, 1) UNION ALL 
 SELECT 17, SUBSTRING(@inputString, 17, 1) UNION ALL 
 SELECT 18, SUBSTRING(@inputString, 18, 1) UNION ALL 
 SELECT 19, SUBSTRING(@inputString, 19, 1) UNION ALL 
 SELECT 20, SUBSTRING(@inputString, 20, 1)

We could have also achieved this with a WHILE loop instead, but this way is also valid and a loop hardly seems worthwhile for 20 characters.

So now let's clean up the data.

-- remove non-numerics inc. whitespace from the string 

DELETE FROM @charTable
WHERE  ThisChar NOT LIKE('[0-9]')

As we all know, DELETing from a table with an IDENTITY column doesn't reset the identities in that table to become continuous - we land up with holes in the identity sequence. So, for example, if we had five rows in a table with identities 1-5 and we DELETE the row with identity 4, we don't have a table with identities 1-4 - we have a table with identities 1-3 and 5.

We need to fix this, as we're going to use this measure to work out which rows to calculate upon based on whether they are odd or even. And this opens a real can of worms - my first attempt to do this was met with protests by SQL Server that an UPDATE is side-effecting and not allowed in functions. I came up with this:

-- unfortunately this messes up the Position indicator, 
-- so let's 'reset' this like so... 
DECLARE @tempTable TABLE ( 
 NewPosition INT IDENTITY(1,1), 
 OldPosition INT ) 
INSERT INTO @tempTable (OldPosition)
 SELECT Position 
 FROM @charTable 
 ORDER BY Position ASC 

UPDATE  @charTable
SET   Position = t2.NewPosition 
FROM  @charTable t1 
INNER JOIN  @tempTable t2 ON t1.Position = t2.OldPosition

Bonus Material

If you're interested, here's my original version. It uses a Common Table Expression to expose a rendition of the table with a corrected version of the position, then joins the tables using the original positions.

;WITH Repositioned (NewPosition, OldPosition) AS ( 
      SELECT      ROW_NUMBER() OVER ( ORDER BY Position ) [NewPosition], 
                  Position [OldPosition]
      FROM  @charTable ) 
UPDATE            t1
SET               t1.Position = t2.NewPosition 
FROM        @charTable t1 
INNER JOIN  Repositioned t2 ON t1.Position = t2.OldPosition

Also, in the main code, it's worth noting that in functions you cannot issue a statement that changes the database - a function is designed to return values. This extends to temporary objects too, even though they're allowed. Confused? You'll see that in this segment:

UPDATE  @charTable
SET   Position = t2.NewPosition 
FROM  @charTable t1 
INNER JOIN  @tempTable t2 ON t1.Position = t2.OldPosition

I have specified to update @charTable rather than the alias t1. UPDATE statements on table variables are allowed in functions, but aliases will fail.

Calculations

Now let's do our calculations.

-- now for every 2nd digit from the right of the numeric, 
-- double it and store the result in the Doubled column 

IF ( SELECT MAX(Position) % 2 FROM @charTable ) = 0 -- evens 
BEGIN 
 UPDATE @charTable
 SET  Doubled = CAST(ThisChar AS TINYINT) * 2 
 WHERE Position % 2 <> 0 
END
ELSE BEGIN -- odds
 UPDATE @charTable 
 SET  Doubled = CAST(ThisChar AS TINYINT) * 2 
 WHERE Position % 2 = 0 
END

We can take advantage of SQL Server's unwillingness to divide two INTs correctly (the result is reported as an INT) to simplify our equation:

-- now if the doubled digit is > 9, sum the digits, else carry forward
-- to the Summed column.  This goes for non-doubled digits too.
UPDATE @charTable
SET  Summed = 
   CASE WHEN Doubled IS NULL 
     THEN CAST(ThisChar AS TINYINT) 
     WHEN Doubled IS NOT NULL AND Doubled <= 9 
     THEN Doubled 
     WHEN Doubled IS NOT NULL AND Doubled >= 10 
     -- sum the digits.  Luckily SQL Server butchers int division...
     THEN (Doubled / 10) + (Doubled - 10) 
   END

And now let's define our validity conditions based on the modulo output.

-- finally, sum the Summed column and if the result % 10 = 0, it's valid 
IF ( SELECT SUM(Summed) % 10 FROM @charTable ) = 0
 SET @result = 1
ELSE 
 SET @result = 0

RETURN @result 

END

Let's take a look at the working table. My code doesn't expose this as there's no need - we're only interested in the output value. But for curiosity:

Complete Script

Here is the complete T-SQL code.

CREATE FUNCTION dbo.usp_LuhnsAlgorithm_New ( @inputString VARCHAR(20) )
RETURNS TINYINT 
AS BEGIN 
-------------------------------------------------------------------------------
-- Function to calculate whether a number is valid according to the 'MOD 10'
-- check, a.k.a. Luhn's Algorithm.
-- Author:  Derek Colley, August 2014
-- Parameters:  @inputString VARCHAR(20) 
-- Outputs:  TINYINT:    2 = an error occurred, validity undetermined
--       0 = number is not valid 
--       1 = number is valid
-------------------------------------------------------------------------------

-- first a quick check to ensure we have at least 3 numbers in the string - 
-- you can change this to any arbitrary amount, i.e. if you are just 
-- checking credit card numbers, make it 13 digits

DECLARE @result TINYINT

IF @inputString NOT LIKE ('%[0-9]%[0-9]%[0-9]%') 
 RETURN 2

-- set up our table for algorithm calculation

DECLARE @charTable TABLE ( 
 Position INT NOT NULL, 
 ThisChar CHAR(1) NOT NULL, 
 Doubled TINYINT, 
 Summed TINYINT ) 

-- convert the @inputString to a fixed width char datatype 
-- we can then process the string as a set with a known number of elements
-- this avoids RBAR substringing each char to a table in a cursor

SET @inputString = CAST(@inputString AS CHAR(20))
INSERT INTO @charTable(Position, ThisChar) 
 SELECT 1, SUBSTRING(@inputString, 1, 1) UNION ALL 
 SELECT 2, SUBSTRING(@inputString, 2, 1) UNION ALL 
 SELECT 3, SUBSTRING(@inputString, 3, 1) UNION ALL 
 SELECT 4, SUBSTRING(@inputString, 4, 1) UNION ALL 
 SELECT 5, SUBSTRING(@inputString, 5, 1) UNION ALL 
 SELECT 6, SUBSTRING(@inputString, 6, 1) UNION ALL 
 SELECT 7, SUBSTRING(@inputString, 7, 1) UNION ALL 
 SELECT 8, SUBSTRING(@inputString, 8, 1) UNION ALL 
 SELECT 9, SUBSTRING(@inputString, 9, 1) UNION ALL 
 SELECT 10, SUBSTRING(@inputString, 10, 1) UNION ALL 
 SELECT 11, SUBSTRING(@inputString, 11, 1) UNION ALL 
 SELECT 12, SUBSTRING(@inputString, 12, 1) UNION ALL 
 SELECT 13, SUBSTRING(@inputString, 13, 1) UNION ALL 
 SELECT 14, SUBSTRING(@inputString, 14, 1) UNION ALL 
 SELECT 15, SUBSTRING(@inputString, 15, 1) UNION ALL 
 SELECT 16, SUBSTRING(@inputString, 16, 1) UNION ALL 
 SELECT 17, SUBSTRING(@inputString, 17, 1) UNION ALL 
 SELECT 18, SUBSTRING(@inputString, 18, 1) UNION ALL 
 SELECT 19, SUBSTRING(@inputString, 19, 1) UNION ALL 
 SELECT 20, SUBSTRING(@inputString, 20, 1)


-- remove non-numerics inc. whitespace from the string 
DELETE FROM @charTable
WHERE  ThisChar NOT LIKE('[0-9]') 


-- unfortunately this messes up the Position indicator, 
-- so let's 'reset' this like so... 
DECLARE @tempTable TABLE ( 
 NewPosition INT IDENTITY(1,1), 
 OldPosition INT ) 
INSERT INTO @tempTable (OldPosition)
 SELECT Position 
 FROM @charTable 
 ORDER BY Position ASC 

UPDATE  @charTable
SET   Position = t2.NewPosition 
FROM  @charTable t1 
INNER JOIN  @tempTable t2 ON t1.Position = t2.OldPosition 

-- now for every 2nd digit from the right of the numeric, 
-- double it and store the result in the Doubled column 

IF ( SELECT MAX(Position) % 2 FROM @charTable ) = 0 -- evens 
BEGIN 
 UPDATE @charTable
 SET  Doubled = CAST(ThisChar AS TINYINT) * 2 
 WHERE Position % 2 <> 0 
END
ELSE BEGIN -- odds
 UPDATE @charTable 
 SET  Doubled = CAST(ThisChar AS TINYINT) * 2 
 WHERE Position % 2 = 0 
END 


-- now if the doubled digit is > 9, sum the digits, else carry forward
-- to the Summed column.  This goes for non-doubled digits too.
UPDATE @charTable
SET  Summed = 
   CASE WHEN Doubled IS NULL 
     THEN CAST(ThisChar AS TINYINT) 
     WHEN Doubled IS NOT NULL AND Doubled <= 9 
     THEN Doubled 
     WHEN Doubled IS NOT NULL AND Doubled >= 10 
     -- sum the digits.  Luckily SQL Server butchers int division...
     THEN (Doubled / 10) + (Doubled - 10) 
   END      


-- finally, sum the Summed column and if the result % 10 = 0, it's valid 
IF ( SELECT SUM(Summed) % 10 FROM @charTable ) = 0
 SET @result = 1
ELSE 
 SET @result = 0

RETURN @result 

END

Testing

Here is a test script that you can use:

SELECT '371449635398431' [Test String], [dbo].[usp_LuhnsAlgorithm_New]('371449635398431') [Valid Card?] UNION ALL
SELECT '3714 4963 5398 431' [Test String], [dbo].[usp_LuhnsAlgorithm_New]('3714 4963 5398 431') [Valid Card?] UNION ALL
SELECT '37XX XXXX 5398431' [Test String], [dbo].[usp_LuhnsAlgorithm_New]('37XX XXXX 5398431') [Valid Card?] UNION ALL
SELECT 'This is not a valid string' [Test String], [dbo].[usp_LuhnsAlgorithm_New]('This is not a valid string' ) [Valid Card?] UNION ALL
SELECT '1234123412341234' [Test String], [dbo].[usp_LuhnsAlgorithm_New]('1234123412341234') [Valid Card?]

Here's how to call the function, and some test results using one of our sample card numbers discussed earlier and some other input. Recall that output 1 = valid, output 0 = invalid, output 2 = error/bad string:

Recall that output 1 = valid, output 0 = invalid, output 2 = error/bad string

Next Steps

If you're curious to learn more about UDF functions, please see http://www.mssqltips.com/sql-server-tip-category/160/functions--user-defined-udf/
For sample card numbers for your own experiments, see: http://www.paypalobjects.com/en_US/vhelp/paypalmanager_help/credit_card_numbers.htm
For more information on Luhn's Algorithm from Wikipedia, see: http://en.wikipedia.org/wiki/Luhn_algorithm

About the author

Dr. Derek Colley is a UK-based DBA and BI Developer with more than a decade of experience working with SQL Server, Oracle and MySQL.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips

Thursday, August 28, 2014 - 11:21:36 AM - Malcolm	Back To Top (34321)
Thanks for the updated select statement, I never knew you could do that.

Wednesday, August 27, 2014 - 1:23:18 PM - Brent Ozar	Back To Top (34288)
Jeremy - you're missing the point. If unencrypted credit card numbers made it into the database layer, then they're a huge security risk. Think Profiler trace or plan cache query.

Wednesday, August 27, 2014 - 11:25:55 AM - Tibor Nagy	Back To Top (34283)
Interesting article but I agree with Brent. You should not have card numbers in your DB. A stored procedure to check the card number when it is entered can be interesting but it is better if the clear card number does not reach the DB.

Wednesday, August 27, 2014 - 8:19:39 AM - Brent Ozar	Back To Top (34280)
This is an interesting tip, but I'd zoom out a little and ask, why on earth would you ever want to let unencrypted credit card numbers get into the database layer? That's a huge, huge, HUGE security no-no.