SQL Server function to validate Social Security Numbers SSN

Problem

Most of us have to deal with storing SSN information in databases, and how do we know if the SSN entered is valid or not. In this tip, the High Group list issued by the Social Security Administration is used to check if a particular SSN is valid.

Solution

SSN is broken into three parts xxx-xx-xxxx, the first three represent the AreaCode (geographical location), the next two represent the GroupCode and the last four represent the sequence. SSA website publishes the list of High Group data that is used to generate the next set of SSNs, and the list can be downloaded from the following link http://www.ssa.gov/employer/ssns/HGDec0110.txt. For a particular AreaCode, the GroupCode is issued in a "Odd", "Even", "Even", "Odd" pattern as described below:

1. Odd numbered GroupCodes "01", "03", "05", "07", "09" are issued first for any areacode
2. Even numbered GroupCodes between "10" and "98" are issued next
3. Even numbered GroupCodes "02", "04", "06", "08" are issued next
4. Finally odd numbered GroupCodes between "11" and "99" are issued.

To setup this test we will do the following:

1. Create a table to store the High Group information
2. Populate this table using the data from SSA website
3. Run the T-SQL code to validate the SSN

Step 1

`CREATE TABLE [dbo].[SSN_RangeList](    [AreaCode] [smallint] NOT NULL,    [GroupCode] [tinyint] NOT NULL) ON [PRIMARY]`

Step 2

Load the data from the SSA website.  You will need to remove the '*' in the data.  To make this simpler for this test you can download this data that has been extracted already from the December 2010 file.

Step 3

The SELECT statement below uses the High Group Data and the pattern described above to validate SSNs. Sub select is used to break the given SSN into the three parts (Area, HighGroup and Sequence) which are used to compare against the SSN_RangeList table (HighGroup data from SSA)

`DECLARE @ssn VARCHAR(15)SET @ssn = '900-00-0000'SELECT         @ssn SSN,        CASE         WHEN ISNUMERIC(REPLACE(@SSN, '-', '')) <> 1 OR Sequence = 0 OR HighGroup = 0 THEN 'Invalid SSN'         ELSE            CASE WHEN Groupcode % 2 = 1 THEN --odd groupcode                 CASE WHEN GroupCode <10 THEN                     CASE WHEN HighGroup % 2 = 1 AND HighGroup <= GroupCode THEN 'Valid SSN'                     ELSE 'Invalid SSN'                     END                ELSE                    CASE WHEN HighGroup % 2 = 0 THEN 'Valid SSN'                     WHEN HighGroup % 2 = 1 THEN                         CASE WHEN HighGroup <= GroupCode THEN 'Valid SSN' ELSE 'Invalid SSN' END                    END                END            ELSE --even groupcode                CASE WHEN GroupCode >= 10 THEN                    CASE WHEN HighGroup %2 =1 THEN                        CASE WHEN HighGroup <=9 THEN 'Valid SSN' ELSE 'Invalid SSN' END                    WHEN HighGroup %2 = 0 THEN                        CASE WHEN HighGroup <= GroupCode THEN 'Valid SSN' ELSE 'Invalid SSN' END                    END                ELSE                    CASE WHEN HighGroup %2 =1 THEN                        CASE WHEN HighGroup <=9 THEN 'Valid SSN' ELSE 'Invalid SSN' END                    WHEN HighGroup %2 = 0 THEN                         CASE WHEN HighGroup >=10 THEN 'Valid SSN'                         WHEN HighGroup < 10 THEN                             CASE WHEN HighGroup <= GroupCode THEN 'Valid SSN' ELSE 'Invalid SSN' END                        END                    END                END            END        END [ValidSSN]FROM (SELECT     LEFT(REPLICATE('0', 9 - LEN(LTRIM(REPLACE(@SSN, '-', '')))) + REPLACE(@SSN, '-', ''), 3) Area,    SUBSTRING(REPLICATE('0', 9 - LEN(LTRIM(REPLACE(@SSN, '-', '')))) + REPLACE(@SSN, '-', ''), 4, 2) HighGroup,    RIGHT(REPLICATE('0', 9 - LEN(REPLACE(@SSN, '-', ''))) + REPLACE(@SSN, '-', ''), 4) Sequence,    REPLICATE('0', 9 - LEN(REPLACE(@SSN, '-', ''))) + REPLACE(@SSN, '-', '') SSN) vLEFT JOIN SSN_RangeList sr    ON v.Area = CAST(sr.AreaCode AS VARCHAR(3))`
Next Steps
• This SELECT statement can be made into a view and can be used to join against any table containing SSNs.
• You can also create a function using this code to test and validate.
• Additional checks can be added on the sub select to remove non numeric characters based on the input SSN data.
• Note: The data that was used for this sample was from December 2010.  New lists are generated each month. You can download the latest SSA High Group data from this website.
• Note: As of June 25, 2011 the SSA will begin a new process for this, read more here.

Last Updated: 2011-01-10 Vamsi Bogullu is a Database Architect administering database servers, developing applications and designing enterprise BI solutions.

View all my tips