By: Rick Dobson | Comments (6) | Related: 1 | 2 | 3 | 4 | > Other Database Platforms
Problem
Have you ever needed to store and use results computed by a SAS Analytics package in a SQL Server database? This kind of requirement comes along very infrequently, but it is imperative to understand the different outputs and columns for matching results between your SAS package and SQL Server. How can a SQL Server professional with limited or no SAS experience get a SAS dataset of computed results into SQL Server quickly and easily?
Solution
The first part of this two-part series on SQL Server/SAS interoperability demonstrated how to transfer data from SQL Server to SAS. This conclusion of the two-part series covers how to transfer results computed in a SAS package to a SQL Server database. The popularity of SQL Server and SAS for computer applications enhances the value of a basic understanding of how to make them work together.
Just as with the first part, this second part demonstrates how to use SAS and SQL Server wizards and .csv files to exchange data. However, the focus for this article switches to features that make SAS computational results ready for storage and use in SQL Server. Some attention also goes to how the coding in a SAS package can impact how you process and interpret the data that comes back from SAS to SQL Server.
Picking up from the Solution Created in Part 1
A major objective of the previous tip was the creation of a SAS dataset based on a SQL Server table. The following screen shot shows the dataset in a SAS Viewtable container.
- The Viewtable title shows the name of the dataset is Table1_from_ss in the Mssqltip library.
- The first column, named GUID_val, may initially look like it contains uniqueidentifier values, but closer inspection shows opening and closing braces ({}). This field has a character data type; SAS has no data type comparable to a uniqueidentifier data type.
- Missing numeric values in SAS are represented by default with a ".". Notice this representation for the fourth row in the Var1 column and the first row in the Var2 column.
- Missing character values in SAS are represented by a blank space (as opposed to a "." for a SAS numeric missing value or a NULL value for a SQL Server missing value). Notice blank values in the fifth row of the Name column and the second row of the Var3 column.
A Simple SAS Package that Creates a SAS Dataset
The following SAS program window shows a very simple SAS package whose primary purpose is to illustrate two features. First, it reveals how to create a new dataset based on a previously existing dataset. Second, it shows how to include calculations in the new dataset based on expressions applied to values in the previously existing dataset.
The DATA keyword permits the creation of one or more SAS datasets. For the code in the figure, the DATA keyword specifies the creation of a new dataset named Table1_from_sas. The new dataset belongs to the MSSQLTip library. The KEEP keyword in parentheses at the end of the DATA statement (a statement ends with a semi-colon) indicates a subset of columns to be retained in the new dataset. There are two sources for columns in the new dataset. The SET keyword designates a previously existing dataset (Table1_from_ss) as one source for the new dataset. The GUID_val and ID_val variables in the KEEP clause are from the previously existing Table1_from_ss dataset. The second source for the new dataset are the expressions in the data step. The SumOfVar1ThruVar2 and AddVar1ToVar2 in the KEEP clause are from expressions in the data step to create the new dataset.
The two expressions in the data step illustrate two different approaches to creating a sum for variables on each row of a SAS dataset. Both approaches operate on data in the previously existing dataset to populate variable values in the new dataset. The SUM function generates values for the SumOfVar1ThruVar2 variable in the Table1_from_sas dataset. The SUM function does not return a missing value unless all its inputs are also missing values. In contrast, the + operator, which assigns values for the AddVar1ToVar2 variable, returns a missing value if any source for the operator is a missing value.
The next display presents the results from the PROC PRINT statement in the preceding SAS program window. These are the values that will eventually be passed back to a SQL Server database. For this demonstration, the sample application returns two entity identifier variables -- namely, GUID_val and ID_val. In a typical, real application, you will return just one entity identifier value per row.
Notice that the AddVar1ToVar2 variable contains missing value indicators (.), but the SumOfVar1ThruVar2 has no rows with missing value indicators. This distinction is because of the differences in operation between the SUM function and the + operator.
Using Wizards to Export a Dataset from SAS to SQL Server
In the first part of the seriesyou learned how to export a table from SQL Server to SAS. The second part of the series flips the direction around by exporting a SAS dataset as a .csv file and importing the file into a SQL Server table.
- After a dataset, such as Table1_from_sas, is ready for export, you can open an export wizard by choosing Export Data on the File menu.
- On the wizard's initial screen, select the Library in which the dataset you want to export is stored and then select the member name for the dataset in the library.
- Click Next and accept the default export type settings -- Standard data source with a .csv file.
- After clicking Next again, click Browse. In the Save As dialog box, navigate to the path in which you want to save the .csv file, type the file's name (for example, ForTable1_From_SAS), and click Save.
- After entering your settings, launch the wizard's operation by clicking Finish.
The SAS Export Data wizard creates a .csv file with column names in the first row. Subsequent rows in the .csv file contain data values for import into SQL Server. When the wizard finishes populating the .csv file, you can exit SAS.
At this point, you are ready to start SQL Server so you can import the values from the exported SAS dataset. From the Object Explorer in SQL Server Management Studio, launch and use the Import Data wizard as follows.
- Right click the database in which you want to create a table based on the .csv file created in SAS. Choose Import Data from the Tasks menu.
- Select a Flat File Source and then click the Browse button.
- In the Open dialog box, choose .csv for the file type.
- Then, navigate to the .csv file exported from SAS.
- Select the check box indicating column names are in the first row of the .csv file.
- Click Next until you arrive at the Select Source Tables and Views dialog box. Then, edit the Destination box so that it points at the schema and table name that you want (if the default setting is not acceptable).
- Finally, click Next, Finish, and Finish again. After the wizard indicates all rows are transferred, click Close.
Processing the Imported SAS Dataset Values
After loading the .csv file from SAS into a SQL Server table, you are ready to start using the table of imported data values. Before working with the table based on the .csv file, let's look at the data types of columns in the table (Table1_from_SAS) based on data values exported from SAS and the original staging table (Table1_from_SS) exported to SAS in the preceding tip. Using the INFORMATION_SCHEMA.COLUMNS view, you can generate a result set showing data type information about the columns in both of these tables. See the column data type information below.
The staging table has a mix of data types for its columns, including varchar, uniqueidentifier, int, and float. In contrast, all the columns for Table1_from_SAS have varchar data types. This is a result of the exceedingly easy approach that we used for creating the table from its source .csv file. We could have assigned more appropriate data types to the table's columns, but this is not strictly necessary as you will see. Remember our goal is to demonstrate using the SQL Server and SAS data wizards in the easiest, quickest way possible to facilitate interoperability between the two packages.
There are many operations that you can perform on imported data from a SAS pack, but two especially common ones are to join a table of imported values with a staging table and also perform some arithmetic operation on the values in a column for the table with imported values. The SQL statement and result set in the following screen shot illustrate both of these operations for Table1_from_SS and Table1_from_SAS. Notice the join on ID_val succeeds although the column has an int data type in Table1_from_SS and a varchar data type in Table1_from_SAS. The join will also succeed if we use the GUID_val columns from both tables, which also have different data types.
While performing a join between columns with substantially different data types will work, you do need to transform a column with a varchar data type to a numeric data type before you can perform a numeric calculation on a column's values. The CAST or CONVERT functions can assist with this task. The SELECT statement in the screen shot below demonstrates the use of the CAST function to transform column values with a varchar data type to a float data type before invoking a SUM function. Please note that the SUM function results are different for each column. This is because the SAS SUM function returns a non-null value if any of its arguments is not null, but the + operator returns a null value if any of its arguments have null values. This difference in results reminds you of the value of learning the basics of SAS code when using data imported from a SAS Analytics package.
Next Steps
Between this second installment and the first one in this series, you have witnessed a full round trip of moving data from SQL Server to SAS and back again. The two articles demonstrated how to accomplish this goal with nothing but the most basic operation of the SQL Server and SAS data import and export wizards. I recommend this very basic approach when you are just starting to work with interoperability between SQL Server and SAS and when you encounter the need for interoperability on a very infrequent basis.
There are many ways to enhance the basic interoperability in our round-trip example. For example, the Next Steps section for the first article in this series, discusses how to facilitate interoperability with SSIS. In the context of the demonstration within this article, it would have been relatively easy to assign data types to the columns of data as they were imported via the SQL Server Data Import Wizard. Assigning appropriate data types during the import task would have removed the need to recast two columns before they could be used by a SQL Server SUM function.
Yet another especially powerful approach to facilitating interoperability between SQL Server and SAS is via a special package named SAS/ACCESS, which you can license as an add-on to Base SAS from the SAS Institute. You can link directly from SAS to SQL Server data sources via SAS/Access and ODBC. After your link is established you can select data from SQL Server data sources via PROC SQL in a SAS program. SQL Server professionals who are already handy with T-SQL may be able to learn the capabilities of PROC SQL quickly because it relies on ANSI SQL (and, of course, custom SAS extensions). Using PROC SQL with SAS/ACCESS can dramatically reduce your development time for interoperability between SAS and ODBC data sources, such as SQL Server. Three helpful links for learning more about SAS/ACCESS and PROC SQL follow.
- This link provides an overview of SAS/ACCESS.
- This link demonstrates how to forge a connection from SAS to a SQL Server database via ODBC.
- This link describes a case study with demonstrations of using PROC SQL to extract and to process data from a SQL Server source.
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips