SSIS - Configure a source flat file with a fixed width from an existing table


By:   |   Updated: 2014-08-29   |   Comments (8)   |   Related: More > Integration Services Data Flow Transformations

Problem

SQL Server Integration Services (SSIS) can import data from a fixed width flat text file (i.e. no field delimiters), using the Flat File Source component. When importing this type of file, you must set the field names, field widths, and starting positions ahead of time. Typing in this information when there are only a handful of fields is not too burdensome, but when there are hundreds of fields, manually configuring and validating each field can take a long time. Here is a workaround using an existing import table with table columns already defined. Thank Tim Katheeder for coming up with this workaround.

NOTE: I use the term "field" when referring to positional data in the flat text file (e.g. field Name consist of character positions 1 to 10). The term "column" is used when referencing columnar data in a SQL Server table (e.g. column Name is varchar (10)).

Solution

Background

For one project I worked on, the data came from an old COBOL system. Each row of data had 3,500 characters organized into 530 fields. The text file was 100 MB. So yes, we are talking about a substantial amount of typing to specify the fields to import and defining the table columns where the data will reside.

My preparation consisted of two activities:

a) copying the first few rows of the 100 MB data file into a test file for import development, and

b) using Excel to convert the flat file design specifications into a SQL statement creating an import destination table.

In the example presented here, I created a test file having 3 rows of data with 35 characters representing 5 fields, the design specifications for the test file, and a create table statement based on the test file design specifications.

Example: Test text file

Johnson   1133557712345678140207
Wilson    22446688  352741130522
Anderson  13245768   76543140321

Example: Test file design specifications

Design Specifications

Example: Import destination table structure

CREATE TABLE ImportText (
    AcctName varchar (10) DEFAULT '' 
    ,AccountNo varchar (8) DEFAULT '' 
    ,LastBalance varchar (8) DEFAULT '' 
    ,TransDate varchar (6) DEFAULT '' 
    ,Filler varchar (3) DEFAULT ''
)

About the Import Destination Table

This tip does not cover how to translate the fixed width flat text file design specifications into a SQL create table query as there are a variety of approaches depending on your particular situation. However, there is one detail that merits special attention. For this fixed width flat file import process to work, the import destination table columns are all defined as text (e.g. VARCHAR). Keeping in mind the source system, COBOL may export a numeric field defined as "s9(6)v99" that is represented by an 8 character field. The "v" in the format is referred to as an implied decimal. This means a flat file field value of "12345678" (8 characters) translates to a table column value of "123456.78" (9 characters). When importing a fixed width flat text file, the field widths must be exact or subsequent fields may receive an incorrect value. By loading all fields as text the individual peculiarities can be addressed on a case-by-case basis. Transforming data from the Import destination table structure to the final table structure is a standard SSIS data transformation.

Example: Import destination table with raw data (ImportText)

Imoprt destination table with raw data

Example: Final table structure

CREATE TABLE ImportFinal ( 
    AcctName varchar (10) DEFAULT ''
    ,AccountNo varchar (8) DEFAULT ''
    ,LastBalance decimal (9,2) 
    ,TransDate smallDateTime 
)

Example: Final table with formatted data (ImportFinal)

Final table with formatted data
 

We've described the data we are starting with and how we want the data represented when finished. Here are the steps to make it happen.

Step 1: Create a Data Flow Task

Create a Data Flow task in SSIS to house the workaround.

Step 2: Create and configure an OLE DB Source

a) Double click the Data Flow Task to build the Data Flow.

b) Create an OLE DB Source component.

c) Double click the component to open the OLE DB Source Editor and configure the Connection Manager to point to the import destination table (ImportText).

OLE DB Source Editor
OLE DB Source Editor - Columns

When an OLE DB Source component is configured, SSIS stores the table's information in the component for reference. Now the import destination table columns (ImportText) are available for reference.

Step 3: Create a Flat File Destination

a) Create a Flat File Destination component.

b) Draw a Data Flow Path between the OLE DB Source component and the Flat File Destination component. Do not open the Flat File Destination component until after the Data Flow Path is established.

After the Flat File Destination component is connected to the OLE DB Source, SSIS automatically copies the column information from the OLE DB Source component (ImportText) to the Flat File Destination component. In other words, the Flat File Destination component now has ImportText table specifications.

Data Flow tasks

Note: The red circle X in the Flat File Destination component indicates the component configuration is incomplete and will be addressed in the next step.

Step 4: Configure the Flat File Destination

a) Double click the Flat File Destination component to open the Flat File Destination Editor.

b) Click on the New… button and the Flat File Format dialog will open.

c) Select the Ragged Right option and click OK.

d) Back in the Flat File Connection Manager Editor, type in "f" as File name, and then close the editor.

e) In the Flat File Destination Editor click on Mappings and you will see the Flat File Destination component configured with the import table specifications. Click OK.

The Ragged Right option is used in step 4c because it is not unusual for data files with large record structures to have varying record lengths. Step 4D is a little misleading as we needed to complete the Flat File Destination component configuration. Once the configuration is done, the OLE DB Source import destination table columns (ImportText) are loaded into the Flat File Destination component. The "f" is not an actual file and was used for configuration purposes only.

Flat file connection manager
Flat file destination editor

Step 5: Create the Flat File Source

a) Create a Flat File Source component.

b) Delete the Data Flow Path between the OLE DB Source component and the Flat File Destination component.

c) Draw a Data Flow Path between the Flat File Source component and the Flat File Destination component. Do not open the Flat File Source component until after the Data Flow Path is established.

d) Open the Flat File Source Editor component to confirm the columns are populated.

Completing step 5d means the Flat File Source component now has the field definitions from the Flat File Destination component courtesy of SSIS automatic import. In other words, the ImportText table column definitions are now the Flat File Source component field definitions.

Flat file source component

Note: The red circle X in the Flat File Destination component indicates the component configuration is incomplete due to changing the data source from the OLE DB Source to Flat File Source component.

Flat file source editor

Step 6: Create and configure the OLE DB Destination

a) Create an OLE DB Destination component.

b) Delete the Data Flow Path between the Flat File Source component and the Flat File Destination component.

c) Draw a Data Flow Path between the Flat File Source component and the OLE DB Destination component.

d) Double click the component to open the OLE DB Destination Editor.

e) Configure the OLE DB Destination component Connection Manager to point to the import destination table (ImportText). Select Mappings to confirm the input and destination columns are mapped correctly.

f) Click OK to save the configuration changes.

With the completion of step 6f, the Flat File Source and OLE DB Destination components are compatible and ready for use. Both components are using the test file design specifications.

OLE DB destination editor
OLE DB destination editor mapping
OLE DB destination component

NOTE: If you get this warning, you will need to start over as these steps need to be completed in one session.

Component Warning

Step 7: Cleanup

The OLE DB Source and Flat File Destination components are no longer needed and can be deleted. The Data Flow Task is ready to import data from a flat file to the defined table.

The imported data is in a text format in the import destination table (ImportText). Transferring the data to the final table structure (ImportFinal) is a typical SSIS transformation activity and not covered here.

Cleanup components
Next Steps

Now that you have easy access to the data, the next step is cleaning and transforming the raw text data. Here are some helpful tips on making that happen.



Last Updated: 2014-08-29


get scripts

next tip button



About the author
MSSQLTips author Ron Kirchgessner Ron Kirchgessner is a developer with 30 years experience developing marketing and data management software.

View all my tips




Post a comment or let the author know this tip helped.

All comments are reviewed, so stay on subject or we may delete your comment. Note: your email address is not published. Required fields are marked with an asterisk (*).

*Name
*Email
Email me updates

Signup for our newsletter

I agree by submitting my data to receive communications, account updates and/or special offers about SQL Server from MSSQLTips and/or its Sponsors. I have read the privacy statement and understand I may unsubscribe at any time.





Thursday, January 03, 2019 - 5:08:32 PM - Gokul Back To Top

 HI Ron, Thank you so much for this post. Time Saver. This helps lot!! 


Sunday, March 25, 2018 - 4:07:07 PM - Brandon Back To Top

 Thank you so much, this is a life saver. Just did it with a delimited file with a bunch of columns, and it worked like a charm. Picked ragged right for the first connection like you said. Then after I was done, just created a new connection with the delimited flat file source I needed.

 


Thursday, November 30, 2017 - 1:29:38 PM - Jennifer Back To Top

 Hi Ron,

I tried several times. Why I always get error:

[Flat File Source [2]] Error: Cannot open the datafile "f".

[SSIS.Pipeline] Error: Flat File Source failed the pre-execute phase and returned error code 0xC020200E.

Thank you,

Jennifer

 

 


Friday, September 30, 2016 - 11:35:04 AM - Leo Back To Top

 It is a very interesting, but I have the same problem when I create a new flat file source and loose External Column defenitions. My input file is tab delimited. How can I fix  that?

 

Thanks

 


Friday, November 13, 2015 - 8:02:25 PM - Rampraveen Back To Top

Such a wicked way! Awesome. I was working with flat file with 699 filelds. This helps a lot ! Thanks :)


Thursday, May 28, 2015 - 2:17:22 PM - Alex Back To Top

Awesome. Thanks!

 

Will this work with sparse columns as well? Any different approaches for sparse?

 

 


Thursday, October 16, 2014 - 1:56:28 PM - Scott Back To Top

Very interesting and well-written article, but the first problem that occurred to me was that the input file columns might not exactly match your SQL table definition.  Columns might be in a different order, there might be missing or extra columns (like those pesky Fillers), etc.  If you are demented enough to edit the dtsx files in a text editor, there's another way.  I would not normally recommend text editing an SSIS package, but if you are facing having to add hundreds of column specs for a large import then drastic measures are needed.

Create a package and create the Flat File connection manager.  Define a few columns, it doesn't matter what they are as long as the component ends up in a valid state.  Save the package to a dtsx file and close the solution.

Create a SQL table with the import file layout specs.  It can be in a dev database somewhere, it is only needed temporarily and won't be used in the package.  I will assume it looks like this:

CREATE TABLE dbo.ImportFileLayout(
	ColumnPosition int NOT NULL PRIMARY KEY CLUSTERED,
	ColumnName sysname NOT NULL,
	ColumnWidth smallint NOT NULL
)

Now you can generate the XML for all the connection manager columns in SQL, as required for fixed-length files.  This assumes that all column names are unique, except for columns named "Filler".  The XML column definitions created will append ColumnPosition to all the Filler column names to make them unique.

WITH XMLNAMESPACES ('www.microsoft.com/SqlServer/Dts' as DTS)
SELECT	[DTS:FlatFileColumn/@DTS:ColumnType] = CASE ColumnDelimiter WHEN '' THEN NULL ELSE 'Delimited' END,
		[DTS:FlatFileColumn/@DTS:ColumnDelimiter] = IL.ColumnDelimiter,
		[DTS:FlatFileColumn/@DTS:ColumnWidth] = CASE ColumnDelimiter WHEN '' THEN ColumnWidth ELSE NULL END,
		[DTS:FlatFileColumn/@DTS:MaximumWidth] = IL.ColumnWidth,
		[DTS:FlatFileColumn/@DTS:DataType] = '129',
		[DTS:FlatFileColumn/@DTS:ObjectName] = ColumnName,
		[DTS:FlatFileColumn/@DTS:DTSID] = '{' + CAST(NEWID() AS CHAR(36)) + '}',
		[DTS:FlatFileColumn/@DTS:CreationName] = ''
FROM (
	SELECT	ColumnPosition, 
			ColumnName = CASE ColumnName WHEN 'Filler' THEN 'Filler' + LTRIM(ColumnPosition) ELSE ColumnName END, 
			ColumnWidth,
			ColumnDelimiter = CASE WHEN ColumnPosition = LastColumn THEN '_x000D__x000A_' ELSE '' END
	FROM dbo.ImportFileLayout
	CROSS JOIN (SELECT LastColumn = MAX(ColumnPosition) FROM dbo.ImportFileLayout) x
) IL
ORDER BY IL.ColumnPosition
FOR XML PATH(''), ROOT('DTSFlatFileColumns')

If in Results to Grid mode, open the XML result in a new window by clicking on it.  Copy all the DTS:FlatFileColumn text (everything but the root DTS:FlatFileColumns tags in the first and last line).  Open the .dtsx file in a decent text editor (TextPad, NotePad++, etc), not NotePad, WordPad, or Word.  Find the <DTS:ConnectionManagers> section, then the <DTS:ConnectionManager section for the Flat File connection manager created earlier.  Paste the text copied from SSMS between the DTS:FlatFileColumns tags.

The XML generated by SSIS will have each attribute on a separate line, but this query will create more concise XML with one line per FlatFileColumn node.  One of the advantages of XML is that the formatting is irrelevant, SSIS will have no problem interpreting these column definitions.

Make sure the connection manager properties are correct before saving the file.  You wouldn't want to change them in SSIS and have it reset the column definitions.  Specifically, you want to see DTS:Format="RaggedRight", DTS:ColumnNamesInFirstDataRow="False", DTS:RowDelimiter="_x000D__x000A_", and DTS:TextQualifier="_x003C_none_x003E_".  If you used a random text file to create a valid Flat File connection manager, and you have a sample of a real imput file, you could update DTS:ConnectionString with the real file name if you like.

Save the text file and reopen the package in the designer.  All the column definitions should be valid according to the original layout definition.


Wednesday, September 03, 2014 - 5:22:41 AM - L-P Larsson Back To Top

Great, thanks!



download

























get free sql tips

I agree by submitting my data to receive communications, account updates and/or special offers about SQL Server from MSSQLTips and/or its Sponsors. I have read the privacy statement and understand I may unsubscribe at any time.



Learn more about SQL Server tools