By: Bhavesh Patel | Updated: 2019-11-15 | Comments | Related: More > Integration Services Development
The Raw File source in SQL Server Integration Services (SSIS) can read data quicker than other file sources because there is not a need to translate the data or parse the data when processing the file, so this can be faster to use these types of data files over flat files. In this tip we will take a look at how to create Raw Files using SSIS that can also be used as a source files to take advantage of this storage format.
Why use raw files in SSIS
There many use cases for Raw Files in SSIS such as:
- When using raw files, we can transfer datasets between SSIS packages faster.
- During the SSIS development, we could avoid the overhead of staging tables by writing and reading from raw files.
- We can use the raw file construct to create an inexpensive data lake.
- We can use the raw file to debug and log actions.
- Raw files provide an efficient intermediate format for serializing a temporary result.
- The same result set (raw file) can be reused many times as a part of different data flows or packages.
In SQL Server Integration Services, for the Data Flow, there are two options related to raw files.
- Raw File Source
- Raw File Destination
SSIS Raw File Source
In the Data Flow, the raw file source is available to read raw files. It can only read the data from the file. These files require little parsing and no translation, so the raw file can be read more quickly. It is used to read data which was previously written using the Raw File Destination. The raw file format also contains sort information which is used during processing. You also have the option of configuring the Raw File Source to ignore the sort flags in the file.
SSIS Raw File Destination
In the Data Flow, the raw file destination is available to write raw files. Because the file format of the data is native to the destination, the raw file destination can write data that can be read faster compared to other destination options available in SSIS such as flat files or OLE DB destinations. In addition to writing raw data to a file, you can also use the Raw File Destination to generate an empty raw file that contains just metadata. There is an also option to create a new file or append data to existing files.
Use Raw Files in SQL Server Integration Services Example
Due to that fact that we can move data quickly with raw files, the raw files source/destination is recommended for large volumes of data whenever performance may be an issue. Due to the way the data is stored in raw files, it could be a great benefit to collect the data in staging as raw files before performing large writes to other destinations instead of writing staging data to a database table.
Sample SQL Server Database Script
The following script creates a table we will use to walk through the example.
Use Master GO Create Database RawData GO USE RawData GO CREATE TABLE Customer_rowData ( Id BIGINT PRIMARY KEY IDENTITY(1,1), Name Nvarchar (400), city Nvarchar (200), Address Nvarchar (1000) ) GO Insert Customer_rowData Select N'Bhavesh Patel',N'Ahmedabad',N'At-Vaishnodevi,Ahmedabad,Pin-382470,Gujarat,India' GO 100000
Create New SSIS Project
I created a new SSIS (SQL Server Integration Services) project and assign the name working_with_rawfiles.
After creating the project, in the Solution Explorer I renamed the package to PrepareRawFile.dtsx (just right click on package.dtsx and rename it).
Configure SSIS Control Flow
I dragged a Data Flow Task into the Control Flow from the SSIS Toolbox and renamed it "Prepare_RawFile".
Configure SSIS Source Data
Double click on the Prepare_RawFile task to open the Data Flow.
In the Data Flow, drag an ADO NET Source from the Other Sources section in the SSIS Toolbox. Double click on this task and configure as follows, connecting to your SQL Server and selecting the table we just created and then press OK.
After pressing OK, the Data Flow should look like below.
Configure SSIS Raw File Destination
Drag the Raw File Destination from the Other Destinations section in the Toolbox and link to the ADO NET Source as follows.
Double click on the Raw File Destination to configure.
On the Connection page you need to configure the following:
- Access Mode: For writing to the file destination location, specify the File name or File name from variable.
- File Name: Enter the file full path for writing to the destination location.
- Write Option: There are 4 options available for the write option.
|Create always||This option always creates a new file.|
|Append||This appends data to an existing file. The file metadata is appended over the data, but it must match the file format.|
|Create Once||It creates a new file. If the file exists, then the component fails.|
|Truncate and Append||This truncates the file and adds the new data. The metadata of the newly appended data must match the existing file format.|
After that, go to the Columns page and configure the input columns and output alias columns.
After applying the above steps, the package is ready for execution which we will run.
During executing, the Data Flow tasks look as follows.
A file is created successfully in the desired location.
Reading Raw Files
Since this data is stored in a format specific for SSIS, in order to read this raw file, we need use a Raw File Source in the SSIS package. The steps are similiar to the above steps. There are not many options when configuring these Raw Files sources.
- Kindly run in the test server before rolling-out in production.
- SQL Server Integration Services (SSIS) Tutorial.
- Configure the Flat File Source in SQL Server Integration Services.
- Import Text and CSV Files into SQL Server Database with SSIS Script Task
- I prefer to use raw file in stage while work with data synchronization activity.
- You can download sample project working_with_rawfiles.
Last Updated: 2019-11-15
About the author
View all my tips