Fixing Cannot Convert Between Unicode and Non-Unicode String Data Types in SSIS

By:   |   Updated: 2023-01-20   |   Comments   |   Related: More > Integration Services Development


Problem

Most of the time string values are stored as Unicode within data sources such as Excel worksheets and flat files, while the destination data column in the SQL database only supports encoded strings. Building the package throws the following exception: "Cannot convert between Unicode and non-Unicode string data types". How do I resolve this?

Solution

In this article, we will reproduce the "Cannot convert between Unicode and non-Unicode string data types" exception and explain several solutions to this issue.

We will cover 5 solutions to handle this issue:

  1. Using a Data Conversion Transformation
  2. Using a Derived Column Transformation
  3. Converting the Data Type Implicitly in the Data Source Component
  4. Using a Script Component
  5. Converting the Data Type Explicitly in the Data Source Component

What is a Unicode String? Why is it Used?

In general, Unicode strings are used to store data in multiple languages within one database. As the Internet grows, it becomes even more critical to support multiple client computers in different geographical locations. Selecting a code page that supports all the characters required by a worldwide audience would be challenging. This is why several data sources use stored strings as Unicode. In contrast, local databases usually store strings with a specific code page since it saves more disk space and could guarantee higher performance.

Reproducing the Error

To reproduce the error, we created a text file containing one column that stores username information. Then, we saved the text file as Unicode, as shown in the screenshot below.

saving the text file as unicode

Now, we created a new Integration Services project using Visual Studio. Then we added a flat file connection manager to read data from the text file and an OLE DB connection manager to establish a connection with the destination SQL Server database. In the SQL Server database, we create a table with one column of type VARCHAR(50).

Within the SSIS package control flow, we added a new data flow task that contains a Flat File source component and an OLE DB Source component. Once we link both components and map the source and destination columns in the OLE DB destination, an error icon appears on the OLE DB destination component with the following error message:

Column "Username" cannot convert between Unicode and non-Unicode string data types.
cannot convert between Unicode and non-Unicode string data types exception

1 - Using a Data Conversion Transformation

The first solution that may solve this issue is by using data conversion transformation. As described in the SSIS toolbox, this component "converts data from one data type to another. Converted data is written to a new output column. For example, convert data from column of decimal type to an output column of integer type, or convert ANSI string to Unicode."

data conversion transformation description in the SSIS toolbox

We should add a data conversion transformation to convert data from Unicode to non-Unicode. Within the data conversion transformation editor, we should select the Username column, change the data type from DT_WSTR to DT_STR, and specify the desired code page, as shown in the screenshot below.

data conversion editor

Next, open the OLE DB destination editor and map the newly generated column with destination columns, as shown below.

ole db destination component columns mapping

After closing the OLE DB destination editor, we should note that the error message disappeared.

data flow task screenshot where the error message disappeared

2 - Using a Derived Column Transformation

The second solution that may solve this error is using a derived column transformation. The SSIS toolbox describes that a derived column transformation "creates new column values by applying an expression to transformation input column."

The derived column transformation description in the SSIS toolbox

We should add a derived column transformation to the data flow task and write an expression that converts the Username column to a non-Unicode column. The image below shows that the derived column transformation supports several type cast functions. To solve the "cannot convert between Unicode and non-Unicode" error, we should use the (DT_STR) type cast function.

type casts in the derived column transformation

As shown in the image below, we added a new derived column with the following expression:

(DT_STR, 50, 1252) [Username]
derived column editor

Now, after mapping the source and destination column within the OLE DB destination component, we can check that the error message has disappeared.

data flow task screenshot where the error message disappeared

3 - Converting the Data Type Implicitly in the Data Source Component

The third solution that may solve this issue is changing the source column data types within SSIS. This can be done by opening the source component advanced editor, as shown below.

Opening the  source advanced editor

Then, in the advanced editor, go to the "Input and Output Properties" tab. Within the Source output > external columns, select the column with a Unicode string data type and change it to an encoded string, as shown below.

changing the output column data type

After changing the column data type, the following warning will appear on the source component:

"The external columns for Flat File Source are out of synchronization with the data source columns. The external column "Username" needs to be updated."

data type conflict warning

This warning appears because we need to refresh the source component metadata by only opening the source component editor and clicking the "OK" button. We can note that the output column data type also changed within the columns tab in the Source editor.

output column data type changed

4 - Using a Script Component

In several cases, the three solutions mentioned above do not solve the error, and the exception is still thrown on the package execution. This usually occurs once the values within the source file contain Unicode characters that cannot be converted.

text file with unicode characters

In this case, we can use a script component that removes or replaces all Unicode characters. This is done mainly using a regular expression:

row.UsernameNonUni = Regex.Replace(row.Username, @"[^\u0000-\u007F]+", "");

Another option is to stage the data within a SQL table with Unicode columns, then use a SQL command to convert the values to an encoded string:

SELECT CAST([Username] as VARCHAR(50)) from staging_table

5 - Converting the Data Type Explicitly in the Data Source Component

The last solution we will provide in this tip is to convert or cast the source column in the source component if this option is available. For example, if we are reading the data using an OLE DB source component, we can use the "SQL Command" data access mode instead of the "Table or View".

Then we can use the data conversion functions in the SQL Command to convert the data from Unicode to non-Unicode characters.

SELECT CAST([Username] as VARCHAR(50)), CONVERT(VARCHAR(50),[Username]) from source_table
using SQL Command data access mode
Next Steps

Read more about the SSIS package performance optimization methods:

Before putting your SSIS package into production, it is highly recommended to read more about SSIS performance optimization techniques and the bad habits to avoid:






get scripts

next tip button



About the author
MSSQLTips author Hadi Fadlallah Hadi Fadlallah is an SQL Server professional with more than 10 years of experience. His main expertise is in data integration. He's one of the top ETL and SQL Server Integration Services contributors at Stackoverflow.com. On the academic level, Hadi holds two master's degrees in computer science and business computing and is seeking a Ph.D. in data science.

View all my tips


Article Last Updated: 2023-01-20

Comments For This Article

















get free sql tips
agree to terms