SQL Server differences of char, nchar, varchar and nvarchar data types

By:   |   Updated: 2016-06-14   |   Comments (4)   |   Related: 1 | 2 | 3 | > Data Types


Problem

The differences of SQL Server char, nchar, varchar and nvarchar are frequently discussed not just during interviews, but also by developers during discussions on database design. In this tip I would like to share not only the basic differences, but also what we need to know and be aware of when using each data type.

Solution

Char, nchar, varchar and nvarchar are all used to store text or string data in SQL Server databases.

  • char - is the SQL-92 synonym for character. Data is padded with blanks/spaces to fill the field size. Fixed length data type.
  • nchar - is the SQL-92 synonym for national char and national character. Fixed length data type.
  • varchar - is the SQL-92 synonym for character varying. Variable length data type.
  • nvarchar - is the SQL-92 synonym for national char varying and national character varying. Variable length data type.

What does N mean in SQL Server

You might wonder what the N stands for? N stands for National Language Character Set and is used to specify a Unicode string. When using Unicode data types, a column can store any character defined by the Unicode Standard, which includes all of the characters defined in the various character sets. Note that Unicode data types take twice as much storage space as non-Unicode data types.

Unicode is typically used in database applications which are designed to facilitate code pages which extend beyond the English and Western Europe code pages. It is designed so that extended character sets can still "fit" into database columns. What this means is that Unicode character data types are limited to half the space, because each byte actually takes two bytes to store the data (Unicode is sometimes referred to as "double-wide"). SQL Server has supported Unicode since SQL Server 7.0 by providing nchar/nvarchar/ntext data types. SQL Server doesn't support UTF-8 encoding for Unicode data, but it does support UTF-16 encoding.

I made a table below that will serve as a quick reference.

Differences of char, nchar, varchar and nvarchar in SQL Server

  char nchar varchar nvarchar
Character Data Type Non-Unicode fixed-length Unicode fixed-length can store both non-Unicode and Unicode characters (i.e. Japanese, Korean etc.) Non-Unicode variable length Unicode variable length can store both non-Unicode and Unicode characters (i.e. Japanese, Korean etc.)
Maximum Length up to 8,000 characters up to 4,000 characters up to 8,000 characters up to 4,000 characters
Character Size takes up 1 byte per character takes up 2 bytes per Unicode/Non-Unicode character takes up 1 byte per character takes up 2 bytes per Unicode/Non-Unicode character
Storage Size n bytes 2 times n bytes Actual Length (in bytes) 2 times Actual Length (in bytes)
Usage use when data length is constant or fixed length columns use only if you need Unicode support such as the Japanese Kanji or Korean Hangul characters due to storage overhead used when data length is variable or variable length columns and if actual data is always way less than capacity use only if you need Unicode support such as the Japanese Kanji or Korean Hangul characters due to storage overhead
    query that uses a varchar parameter does an index seek due to column collation sets query that uses a nvarchar parameter does an index scan due to column collation sets

Advantages and Disadvantages of char, nchar, varchar and nvarchar in SQL Server

Data Types Advantages Disadvantages
char Query performance is better since no need to move the column while updating.

No need to store the length of string in last two bytes.
If not properly used, it can take more space than varchar since it is fixed length and we don't know the length of string to be stored. It is not good for compression since it embeds space characters at the end.
varchar Since it is variable length it takes less memory spaces. Decreases the performance of some SQL queries.
nchar/nvarchar Supports many client computers that are running different locales. If not properly used it may use up a lot of extra storage space.

With the growth and innovation of web applications, it is even more important to support client computers that are running different locales. The easiest way to manage character data in international databases is to always use the Unicode nchar, nvarchar, and ntext data types, instead of their non-Unicode equivalents, char, varchar, and text.

Unicode is a standard for mapping code points to characters. Because it is designed to cover all the characters of all the languages of the world, there is no need for different code pages to handle different sets of characters. SQL Server supports the Unicode Standard, Version 3.2. If all the applications that work with international databases also use Unicode variables instead of non-Unicode variables, character translations do not have to be performed anywhere in the system. Clients will see the same characters in the data as all other clients.

SQL Server stores all textual system catalog data in columns having Unicode data types. The names of database objects, such as tables, views, and stored procedures, are stored in Unicode columns. This enables applications to be developed by using only Unicode, and helps avoid issues with code page conversions.

Remember when developing new applications to consider if it will be used globally because this will help you determine whether to use nchar and nvarchar to support different languages.

Next Steps

Learn more by reading and exploring the following:



sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Sherlee Dizon Sherlee Dizon is an IT consultant and web application developer with over 14 years of experience as a software and web developer.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2016-06-14

Comments For This Article




Tuesday, December 25, 2018 - 9:02:31 AM - Philip van Gass Back To Top (78553)

I would like to know if it is possible to store more than one extra foreign language in addition to English in a NCHAR or NVARCHAR data types ?

I would not think so. 


Thursday, July 21, 2016 - 2:51:38 PM - Andy Yun Back To Top (41942)

@Dman2306 - your recommendation to always use NCHAR/NVARCHAR due to UNICODE, can be extremely detrimental to SQL Server query performance.  Disk storage is not the only thing impacted by a data type decision.  That storage cost compounds in numerous other ways.  Wider records means less records can be stored in an 8KB data page construct.  And all work done by SQL Server are done via pages, not records.  More data pages to consume & process for a query equates to more I/O, both reading & writing from disk, but also impacts RAM usage (due to storage of those data pages in the buffer pool).  Wider data types also impacts the amount of transaction log that must be written for a given DML query.  If you're in Azure, there is a direct dollar cost correlation to the amount of data you are moving around.

If you don't believe me regarding the above, go Google for my Every Byte Counts: Why Your Data Type Choices Matter presentation.  There are two (older) recordings of it available online.  Watch it and hopefully you will gain a better apprecation as to why one should right size your data types.  

---

@Sherlee

"query that uses a varchar parameter does an index seek due to column collation sets"

"query that uses a nvarchar parameter does an index scan due to column collation sets"

 

These two statements are misleading.  They indicate that queries that use varchar/nvarchar will only ever result in a seek/scan operation respectively.  That is not accurate.  When it comes to data types, what impacts seek vs scan is whether the underlying data types match.  
nchar/nvarchar = nchar/nvarchar -> seek
char/varchar = char/varchar -> seek
char/varchar = nchar/nvarchar -> scan due to implicit conversion

 


Thursday, July 21, 2016 - 11:54:56 AM - Robert Back To Top (41941)

The "Table of Differences" is not accurate for variable character data types (varchar and nvarchar).  Both have two additional bytes for storage.  If your string is 5 chracters, varchar requires 7 bytes for varchar and 12 bytes for nvarchar.  If using varchar(max) or nvarchar(max), an additional 24 bytes is required.  See https://msdn.microsoft.com/en-us/library/ms176089(v=sql.110).aspx and https://msdn.microsoft.com/en-us/library/ms186939(v=sql.110).aspx.


Thursday, July 21, 2016 - 8:23:03 AM - Dman2306 Back To Top (41938)

I very much disagree with your statement of "use only if you need Unicode support such as the Japanese Kanji or Korean Hangul characters due to storage overhead". This is shortsighted and exactly what leads to problems like the Y2K fiasco. I have built MANY applications that at the time I built them, were US English only. Then, suddenly, we got an overseas customer. Now I had the task of tracking down every char/varchar, not just in tables, but in sprocs, udfs, etc. and changing them all to Unicode. Then of course making sure we didn't break anything. However, if the developers had the foresight to just support Unicode from the getgo there would have been no issues. Yes, Unicode uses more storage space, but storage space is cheap these days. It is the reason why languages like C#/VB.NET don't even support ASCII strings natively! My recommendation is ALWAYS use nvarchar/nchar unless you are 100% CERTAIN that the field will NEVER require any non-western European characters (e.g. an alphanumeric id that is only allowed 0-9,a-Z). Otherwise, years from now, when your salesmen begin selling outside of the English speaking world you're going to have a daunting refactoring task ahead of you.















get free sql tips
agree to terms