By: Derek Colley | Updated: 2014-08-27 | Comments (16) | Related: > Functions User Defined UDF
Problem
You are tasked with checking account or credit card numbers for validity, for example as part of a web application handling sensitive card data. You wish to implement this as a function within SQL Server, with the ability to be called with any number and return either TRUE (valid) or FALSE (invalid). Alternatively, you are looking for a method to validate a number based on a check digit: you could be using this as part of a password reset check, for account verification or similar purposes.
Solution
The industry standard for checking credit card numbers for validity is known in colloquial terms as the 'MOD 10 Check', or more formally as Luhn's Algorithm. It has no cryptographic validity, but it is a useful rule-of-thumb check that can be used to validate that a card number is correct, and can be used in its opposite form to generate account (and credit card) numbers - to help discourage potential fraudsters, this function is NOT covered here.
Much literature already exists on Luhn's Algorithm, links for which are provided in the Next Steps section at the end of this tip. The algorithm is simple, but powerful. In the following sections I will provide a step-by-step guide to validating these numbers by hand, then show how we can replicate this as a User-Defined Function (UDF) within SQL Server. I will also show you a method to remove 'noise' from the number - for example, spaces or leading/trailing characters.
One final note - any card numbers used here for illustration / example are fictitious - these pass the MOD 10 check, but are not real credit card numbers, only samples. The samples are from PayPal and may be found here: www.paypalobjects.com/en_US/vhelp/paypalmanager_help/credit_card_numbers.htm
Algorithm Overview
The algorithm can be broken down into 4 stages. All stages are carried out using simple mathematics and rules.
Stage 1: N * 2 for every 2nd N from the right
Let's first get a number which we are testing for validity and split it out into its component digits, in a format we can easily understand. The number we will use is an AMEX sample card number:
The first step is to apply N*2 to every 2nd digit starting from the right. This is true regardless of the length of the number (odd or even number of digits), and is because the final digit is normally a check digit. Let's do this now.
Stage 2: If (N*2) > 9 sum the digits, else carry the digits
In this stage, we carry forward ALL the digits from either the original number or N*2 to produce a final number for verification. The rule is: If the number was doubled in Stage 1 AND this number > 9, then sum the digits. So if the original number was 9, for example, and was selected for doubling in Stage 1 to become 18, we now need to add 1 + 8 = 9 since 18 > 9. Likewise, if the number 2 was doubled in Stage 1 to equal 4, because 4 < 9 we do not need to transform it further. All other digits - i.e. those NOT selected in Stage 1 - are carried forward without transformation.
This is easier to show than describe. Like so:
So now we have a final number that we can check for validity.
Stage 3: Sum the digits
This stage is simple - take this output number and sum the digits.
This results in a single output integer we'll call X, in this case 80. We now need to test this number - more formally, IF X % 10 = 0 THEN TRUE ELSE FALSE. For those of you unfamiliar with the MOD operation, it's almost interchangeable with the term 'is divisible by'. When we MOD two numbers, we fetch the remainder of the division operation between the two numbers. So, 10 MOD 2 = 0 because 2 evenly divides into 10 with no remainder. 250 MOD 30 = 10, because 250 is divisible by 30 a total of 8 times (240) with 10 remainder. In SQL and in mathematics, MOD is represented by the percentage (%) sign.
Stage 4: IF X MOD 10 = 0 THEN TRUE ELSE FALSE
Applying this then, we know that 80 MOD 10 = 0, therefore the card number is valid.
For balance, let's show the same grid, but this time using an invalid card number. You can see the algorithm quickly generates an output X that fails the MOD 10 check.
Rewriting this for SQL Server
Let's now turn our attention to rewriting this in T-SQL.
The code we write must be fairly efficient, but we're constrained in a unique way by using T-SQL - we're writing in the language of relational set theory, not in object oriented language. So whereas in Python or C we might implement the card number as an array of numbers then apply transformations to all elements of the array, we don't have that luxury in SQL Server. What we CAN do, however, is use a temporary table variable to the same effect.
We must also cleanse the card number and make sure we don't have any trailing spaces. In the real world, we'll receive a number for testing as a string, or VARCHAR of indeterminate length, so we'll work on this basis rather than using an INT/BIGINT.
Luckily we can combine this and use a temporary table variable both for cleansing the number and for validity checking.
Let's now set up the function header, and put in a couple of simple constraints - the function can accept a string no longer than 20 characters, and if there aren't at least three digits present in the string, return an error code:
CREATE FUNCTION dbo.usp_LuhnsAlgorithm_New ( @inputString VARCHAR(20) ) RETURNS TINYINT AS BEGIN ------------------------------------------------------------------------------- -- Function to calculate whether a number is valid according to the 'MOD 10' -- check, a.k.a. Luhn's Algorithm. -- Author: Derek Colley, August 2014 -- Parameters: @inputString VARCHAR(20) -- Outputs: TINYINT: 2 = an error occurred, validity undetermined -- 0 = number is not valid -- 1 = number is valid ------------------------------------------------------------------------------- -- first a quick check to ensure we have at least 3 numbers in the string - -- you can change this to any arbitrary amount, i.e. if you are just -- checking credit card numbers, make it 13 digits DECLARE @result TINYINT IF @inputString NOT LIKE ('%[0-9]%[0-9]%[0-9]%') RETURN 2
Now, we're going to use a table variable to handle the calculation.
-- set up our table for algorithm calculation DECLARE @charTable TABLE ( Position INT NOT NULL, ThisChar CHAR(1) NOT NULL, Doubled TINYINT, Summed TINYINT )
There's a neat trick we can use now. The instinctive approach to parsing this string is to step through it one character at a time, removing whitespace and non-numeric data. However, we're in a set-based environment. Neither do we want to encapsulate the string in many string-parsing functions like SUBSTRING, REPLACE, CHARINDEX and the like, as this will get very messy very quickly, and our key tool LIKE isn't rich in functionality compared to the non-database equivalent functions.
For this reason, we'll pivot the string into a table, each row containing one character. Using this methodology, we can quickly eliminate unwanted characters while retaining order, and additionally we can use multiple columns for our calculations.
-- convert the @inputString to a fixed width char datatype -- we can then process the string as a set with a known number of elements -- this avoids RBAR substringing each char to a table in a cursor SET @inputString = CAST(@inputString AS CHAR(20)) INSERT INTO @charTable(Position, ThisChar) SELECT 1, SUBSTRING(@inputString, 1, 1) UNION ALL SELECT 2, SUBSTRING(@inputString, 2, 1) UNION ALL SELECT 3, SUBSTRING(@inputString, 3, 1) UNION ALL SELECT 4, SUBSTRING(@inputString, 4, 1) UNION ALL SELECT 5, SUBSTRING(@inputString, 5, 1) UNION ALL SELECT 6, SUBSTRING(@inputString, 6, 1) UNION ALL SELECT 7, SUBSTRING(@inputString, 7, 1) UNION ALL SELECT 8, SUBSTRING(@inputString, 8, 1) UNION ALL SELECT 9, SUBSTRING(@inputString, 9, 1) UNION ALL SELECT 10, SUBSTRING(@inputString, 10, 1) UNION ALL SELECT 11, SUBSTRING(@inputString, 11, 1) UNION ALL SELECT 12, SUBSTRING(@inputString, 12, 1) UNION ALL SELECT 13, SUBSTRING(@inputString, 13, 1) UNION ALL SELECT 14, SUBSTRING(@inputString, 14, 1) UNION ALL SELECT 15, SUBSTRING(@inputString, 15, 1) UNION ALL SELECT 16, SUBSTRING(@inputString, 16, 1) UNION ALL SELECT 17, SUBSTRING(@inputString, 17, 1) UNION ALL SELECT 18, SUBSTRING(@inputString, 18, 1) UNION ALL SELECT 19, SUBSTRING(@inputString, 19, 1) UNION ALL SELECT 20, SUBSTRING(@inputString, 20, 1)
We could have also achieved this with a WHILE loop instead, but this way is also valid and a loop hardly seems worthwhile for 20 characters.
So now let's clean up the data.
-- remove non-numerics inc. whitespace from the string DELETE FROM @charTable WHERE ThisChar NOT LIKE('[0-9]')
As we all know, DELETing from a table with an IDENTITY column doesn't reset the identities in that table to become continuous - we land up with holes in the identity sequence. So, for example, if we had five rows in a table with identities 1-5 and we DELETE the row with identity 4, we don't have a table with identities 1-4 - we have a table with identities 1-3 and 5.
We need to fix this, as we're going to use this measure to work out which rows to calculate upon based on whether they are odd or even. And this opens a real can of worms - my first attempt to do this was met with protests by SQL Server that an UPDATE is side-effecting and not allowed in functions. I came up with this:
-- unfortunately this messes up the Position indicator, -- so let's 'reset' this like so... DECLARE @tempTable TABLE ( NewPosition INT IDENTITY(1,1), OldPosition INT ) INSERT INTO @tempTable (OldPosition) SELECT Position FROM @charTable ORDER BY Position ASC UPDATE @charTable SET Position = t2.NewPosition FROM @charTable t1 INNER JOIN @tempTable t2 ON t1.Position = t2.OldPosition
Bonus Material
If you're interested, here's my original version. It uses a Common Table Expression to expose a rendition of the table with a corrected version of the position, then joins the tables using the original positions.
;WITH Repositioned (NewPosition, OldPosition) AS ( SELECT ROW_NUMBER() OVER ( ORDER BY Position ) [NewPosition], Position [OldPosition] FROM @charTable ) UPDATE t1 SET t1.Position = t2.NewPosition FROM @charTable t1 INNER JOIN Repositioned t2 ON t1.Position = t2.OldPosition
Also, in the main code, it's worth noting that in functions you cannot issue a statement that changes the database - a function is designed to return values. This extends to temporary objects too, even though they're allowed. Confused? You'll see that in this segment:
UPDATE @charTable SET Position = t2.NewPosition FROM @charTable t1 INNER JOIN @tempTable t2 ON t1.Position = t2.OldPosition
I have specified to update @charTable rather than the alias t1. UPDATE statements on table variables are allowed in functions, but aliases will fail.
Calculations
Now let's do our calculations.
-- now for every 2nd digit from the right of the numeric, -- double it and store the result in the Doubled column IF ( SELECT MAX(Position) % 2 FROM @charTable ) = 0 -- evens BEGIN UPDATE @charTable SET Doubled = CAST(ThisChar AS TINYINT) * 2 WHERE Position % 2 <> 0 END ELSE BEGIN -- odds UPDATE @charTable SET Doubled = CAST(ThisChar AS TINYINT) * 2 WHERE Position % 2 = 0 END
We can take advantage of SQL Server's unwillingness to divide two INTs correctly (the result is reported as an INT) to simplify our equation:
-- now if the doubled digit is > 9, sum the digits, else carry forward -- to the Summed column. This goes for non-doubled digits too. UPDATE @charTable SET Summed = CASE WHEN Doubled IS NULL THEN CAST(ThisChar AS TINYINT) WHEN Doubled IS NOT NULL AND Doubled <= 9 THEN Doubled WHEN Doubled IS NOT NULL AND Doubled >= 10 -- sum the digits. Luckily SQL Server butchers int division... THEN (Doubled / 10) + (Doubled - 10) END
And now let's define our validity conditions based on the modulo output.
-- finally, sum the Summed column and if the result % 10 = 0, it's valid IF ( SELECT SUM(Summed) % 10 FROM @charTable ) = 0 SET @result = 1 ELSE SET @result = 0 RETURN @result END
Let's take a look at the working table. My code doesn't expose this as there's no need - we're only interested in the output value. But for curiosity:
Complete Script
Here is the complete T-SQL code.
CREATE FUNCTION dbo.usp_LuhnsAlgorithm_New ( @inputString VARCHAR(20) ) RETURNS TINYINT AS BEGIN ------------------------------------------------------------------------------- -- Function to calculate whether a number is valid according to the 'MOD 10' -- check, a.k.a. Luhn's Algorithm. -- Author: Derek Colley, August 2014 -- Parameters: @inputString VARCHAR(20) -- Outputs: TINYINT: 2 = an error occurred, validity undetermined -- 0 = number is not valid -- 1 = number is valid ------------------------------------------------------------------------------- -- first a quick check to ensure we have at least 3 numbers in the string - -- you can change this to any arbitrary amount, i.e. if you are just -- checking credit card numbers, make it 13 digits DECLARE @result TINYINT IF @inputString NOT LIKE ('%[0-9]%[0-9]%[0-9]%') RETURN 2 -- set up our table for algorithm calculation DECLARE @charTable TABLE ( Position INT NOT NULL, ThisChar CHAR(1) NOT NULL, Doubled TINYINT, Summed TINYINT ) -- convert the @inputString to a fixed width char datatype -- we can then process the string as a set with a known number of elements -- this avoids RBAR substringing each char to a table in a cursor SET @inputString = CAST(@inputString AS CHAR(20)) INSERT INTO @charTable(Position, ThisChar) SELECT 1, SUBSTRING(@inputString, 1, 1) UNION ALL SELECT 2, SUBSTRING(@inputString, 2, 1) UNION ALL SELECT 3, SUBSTRING(@inputString, 3, 1) UNION ALL SELECT 4, SUBSTRING(@inputString, 4, 1) UNION ALL SELECT 5, SUBSTRING(@inputString, 5, 1) UNION ALL SELECT 6, SUBSTRING(@inputString, 6, 1) UNION ALL SELECT 7, SUBSTRING(@inputString, 7, 1) UNION ALL SELECT 8, SUBSTRING(@inputString, 8, 1) UNION ALL SELECT 9, SUBSTRING(@inputString, 9, 1) UNION ALL SELECT 10, SUBSTRING(@inputString, 10, 1) UNION ALL SELECT 11, SUBSTRING(@inputString, 11, 1) UNION ALL SELECT 12, SUBSTRING(@inputString, 12, 1) UNION ALL SELECT 13, SUBSTRING(@inputString, 13, 1) UNION ALL SELECT 14, SUBSTRING(@inputString, 14, 1) UNION ALL SELECT 15, SUBSTRING(@inputString, 15, 1) UNION ALL SELECT 16, SUBSTRING(@inputString, 16, 1) UNION ALL SELECT 17, SUBSTRING(@inputString, 17, 1) UNION ALL SELECT 18, SUBSTRING(@inputString, 18, 1) UNION ALL SELECT 19, SUBSTRING(@inputString, 19, 1) UNION ALL SELECT 20, SUBSTRING(@inputString, 20, 1) -- remove non-numerics inc. whitespace from the string DELETE FROM @charTable WHERE ThisChar NOT LIKE('[0-9]') -- unfortunately this messes up the Position indicator, -- so let's 'reset' this like so... DECLARE @tempTable TABLE ( NewPosition INT IDENTITY(1,1), OldPosition INT ) INSERT INTO @tempTable (OldPosition) SELECT Position FROM @charTable ORDER BY Position ASC UPDATE @charTable SET Position = t2.NewPosition FROM @charTable t1 INNER JOIN @tempTable t2 ON t1.Position = t2.OldPosition -- now for every 2nd digit from the right of the numeric, -- double it and store the result in the Doubled column IF ( SELECT MAX(Position) % 2 FROM @charTable ) = 0 -- evens BEGIN UPDATE @charTable SET Doubled = CAST(ThisChar AS TINYINT) * 2 WHERE Position % 2 <> 0 END ELSE BEGIN -- odds UPDATE @charTable SET Doubled = CAST(ThisChar AS TINYINT) * 2 WHERE Position % 2 = 0 END -- now if the doubled digit is > 9, sum the digits, else carry forward -- to the Summed column. This goes for non-doubled digits too. UPDATE @charTable SET Summed = CASE WHEN Doubled IS NULL THEN CAST(ThisChar AS TINYINT) WHEN Doubled IS NOT NULL AND Doubled <= 9 THEN Doubled WHEN Doubled IS NOT NULL AND Doubled >= 10 -- sum the digits. Luckily SQL Server butchers int division... THEN (Doubled / 10) + (Doubled - 10) END -- finally, sum the Summed column and if the result % 10 = 0, it's valid IF ( SELECT SUM(Summed) % 10 FROM @charTable ) = 0 SET @result = 1 ELSE SET @result = 0 RETURN @result END
Testing
Here is a test script that you can use:
SELECT '371449635398431' [Test String], [dbo].[usp_LuhnsAlgorithm_New]('371449635398431') [Valid Card?] UNION ALL SELECT '3714 4963 5398 431' [Test String], [dbo].[usp_LuhnsAlgorithm_New]('3714 4963 5398 431') [Valid Card?] UNION ALL SELECT '37XX XXXX 5398431' [Test String], [dbo].[usp_LuhnsAlgorithm_New]('37XX XXXX 5398431') [Valid Card?] UNION ALL SELECT 'This is not a valid string' [Test String], [dbo].[usp_LuhnsAlgorithm_New]('This is not a valid string' ) [Valid Card?] UNION ALL SELECT '1234123412341234' [Test String], [dbo].[usp_LuhnsAlgorithm_New]('1234123412341234') [Valid Card?]
Here's how to call the function, and some test results using one of our sample card numbers discussed earlier and some other input. Recall that output 1 = valid, output 0 = invalid, output 2 = error/bad string:
Next Steps
- If you're curious to learn more about UDF functions, please see http://www.mssqltips.com/sql-server-tip-category/160/functions--user-defined-udf/
- For sample card numbers for your own experiments, see: http://www.paypalobjects.com/en_US/vhelp/paypalmanager_help/credit_card_numbers.htm
- For more information on Luhn's Algorithm from Wikipedia, see: http://en.wikipedia.org/wiki/Luhn_algorithm
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2014-08-27