What is the difference between varchar and nvarchar in SQL Server?

In SQL Server, VARCHAR and NVARCHAR are both used to store variable-length string data, but they differ significantly in their handling of character encoding, storage, and use cases. Below is a comprehensive breakdown with detailed examples to illustrate their differences.

1. Character Encoding and Storage

VARCHAR

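  • Encoding: Stores non-Unicode characters using the code page of the column's collation (for example, code page 1252 for Latin1_General collations). From SQL Server 2019, a UTF-8 collation lets VARCHAR hold Unicode as well.
  • Storage:
  • 1 byte per character for single-byte code pages, which can represent at most 256 distinct characters; double-byte code pages (e.g., Japanese, Chinese) use up to 2 bytes per character.
  • Maximum length: 8,000 bytes (e.g., VARCHAR(8000)); the n in VARCHAR(n) counts bytes, not characters.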
  • Example:
  CREATE TABLE Product (
      ProductID INT,
      ProductCode VARCHAR(10) -- Up to 10 bytes (10 characters in a single-byte code page)
  );

NVARCHAR

  • Encoding: Stores Unicode characters (UTF-16 encoding).
  • Storage:
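  • 2 bytes per character for the Basic Multilingual Plane (about 65,000 code points, covering most multilingual text); supplementary characters such as emojis take 4 bytes (a surrogate pair).
  • Maximum length: 4,000 byte-pairs (8,000 bytes); the n in NVARCHAR(n) counts byte-pairs, so a string containing emojis fits fewer than n characters.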
  • Example:
  CREATE TABLE UserProfile (
      UserID INT,
      UserName NVARCHAR(50) -- Up to 50 byte-pairs (100 bytes)
  );

2. Character Handling Examples

Example 1: Basic Text Storage

-- VARCHAR: Keeps 'é' when the collation's code page includes it (e.g., 1252); otherwise the accent is lost
INSERT INTO Product (ProductCode) VALUES ('Café'); -- Might save as 'Cafe' under a code page without 'é'

-- NVARCHAR: Preserves Unicode characters
INSERT INTO UserProfile (UserName) VALUES (N'Café'); -- Saved as 'Café'

Example 2: Multilingual Text

-- VARCHAR: Cannot represent characters outside its code page (no error is raised)
INSERT INTO Product (ProductCode) VALUES ('中文'); -- Stored as '??' under a Latin1 collation

-- NVARCHAR: Stores multilingual text
INSERT INTO UserProfile (UserName) VALUES (N'山田太郎'); -- Saved correctly
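
Because the substitution with '?' happens silently, it can help to check whether a value survives the trip into VARCHAR before changing a column's type. The following is a minimal sketch, assuming a database whose default collation uses a single-byte code page such as 1252; the variable and column names are illustrative. Each value is cast to VARCHAR and back, then compared with the original under a binary collation so that only an exact match counts as safe.

```sql
-- Hypothetical sample values; in practice these would come from a column.
DECLARE @latin NVARCHAR(100) = N'Café';
DECLARE @japanese NVARCHAR(100) = N'山田太郎';

SELECT
    CASE WHEN CAST(CAST(@latin AS VARCHAR(100)) AS NVARCHAR(100))
              = @latin COLLATE Latin1_General_BIN2
         THEN 'safe' ELSE 'lossy' END AS LatinCheck,
    CASE WHEN CAST(CAST(@japanese AS VARCHAR(100)) AS NVARCHAR(100))
              = @japanese COLLATE Latin1_General_BIN2
         THEN 'safe' ELSE 'lossy' END AS JapaneseCheck;
```

Under the stated code page 1252 assumption, the first value should round-trip unchanged and the second should not. With a UTF-8 or Japanese collation as the database default, the outcome would differ.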

Example 3: Emojis

-- VARCHAR: Replaces emojis with '?' under a non-UTF-8 collation (no error is raised)
INSERT INTO Product (ProductCode) VALUES ('😊'); -- Saved as '??'

-- NVARCHAR: Supports emojis
INSERT INTO UserProfile (UserName) VALUES (N'😊'); -- Saved correctly
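
Emojis are stored correctly in NVARCHAR, but each one occupies two UTF-16 code units rather than one. The sketch below, with illustrative variable names, compares the byte size of an accented letter and an emoji. What LEN reports for the emoji depends on whether the collation in effect is supplementary-character aware (an _SC collation), so no fixed value is given for it.

```sql
DECLARE @accented NVARCHAR(10) = N'é';
DECLARE @emoji NVARCHAR(10) = N'😊';

SELECT
    DATALENGTH(@accented) AS AccentedBytes, -- 2: one UTF-16 code unit
    DATALENGTH(@emoji) AS EmojiBytes,       -- 4: a surrogate pair
    LEN(@emoji) AS EmojiLength;             -- 1 or 2, depending on the collation
```

This matters when sizing columns: an NVARCHAR(50) column holds at most 25 such emojis.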

3. Storage Space Comparison

Example 4: Byte Usage

DECLARE @varcharText VARCHAR(100) = 'Hello'; -- 5 bytes
DECLARE @nvarcharText NVARCHAR(100) = N'Hello'; -- 10 bytes

SELECT 
    DATALENGTH(@varcharText) AS VarcharBytes, -- 5
    DATALENGTH(@nvarcharText) AS NvarcharBytes; -- 10

Example 5: Large Data Storage

CREATE TABLE Log_VARCHAR (LogMessage VARCHAR(8000)); -- Max 8000 bytes (8000 chars in a single-byte code page)
CREATE TABLE Log_NVARCHAR (LogMessage NVARCHAR(4000)); -- Max 8000 bytes (4000 chars, fewer with supplementary characters)

4. Performance Considerations

Example 6: Indexing Overhead

-- VARCHAR index (smaller keys)
CREATE INDEX IX_ProductCode ON Product(ProductCode); -- Up to 10 bytes of key data per entry

-- NVARCHAR index (twice the bytes for the same text)
CREATE INDEX IX_UserName ON UserProfile(UserName); -- Up to 100 bytes of key data per entry
  • Impact: Wider keys mean fewer entries per index page, so the same query reads more pages and uses more memory.

Example 7: Implicit Conversion

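-- Potentially slow query (VARCHAR column vs. NVARCHAR literal)
-- NVARCHAR has higher type precedence, so the column is converted, not the literal.
-- This can turn an index seek into a scan, most notably with SQL_* collations.
SELECT * 
FROM Product 
WHERE ProductCode = N'Café';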

-- Optimized query
SELECT * 
FROM Product 
WHERE ProductCode = CAST(N'Café' AS VARCHAR(10)); -- Compares VARCHAR to VARCHAR, so an index seek is possible
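
The same mismatch often comes from parameters rather than literals, since many client libraries send strings as NVARCHAR unless told otherwise. A minimal sketch of declaring the parameter with the column's own type, using sp_executesql against the Product table defined earlier (the parameter name @code is illustrative):

```sql
-- The parameter is declared as VARCHAR(10) to match Product.ProductCode,
-- so the comparison needs no implicit conversion of the column.
EXEC sp_executesql
    N'SELECT ProductID, ProductCode FROM Product WHERE ProductCode = @code;',
    N'@code VARCHAR(10)',
    @code = 'Café';
```

In application code, the equivalent step is to set the parameter's database type explicitly instead of relying on the driver's default.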

5. Collation and Sorting

Example 8: Collation Conflicts

-- VARCHAR: the collation sets both the code page and the sort rules
CREATE TABLE Product_VARCHAR (Name VARCHAR(50) COLLATE Latin1_General_CI_AS);
INSERT INTO Product_VARCHAR VALUES ('cafe'), ('Café');

-- NVARCHAR: the collation sets the sort rules only (data is always UTF-16)
CREATE TABLE Product_NVARCHAR (Name NVARCHAR(50) COLLATE Latin1_General_CI_AS);
INSERT INTO Product_NVARCHAR VALUES (N'cafe'), (N'Café');

-- With a Windows collation such as Latin1_General_CI_AS, both types sort by the same rules.
-- Differences can appear with SQL_* collations, which use legacy sort rules for VARCHAR only.
SELECT * FROM Product_VARCHAR ORDER BY Name; -- 'cafe', 'Café'
SELECT * FROM Product_NVARCHAR ORDER BY Name; -- 'cafe', 'Café'
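
To see which collation and code page are actually in play, the built-in COLLATIONPROPERTY function and the sys.columns catalog view can be queried. This is a small sketch that assumes the Product_VARCHAR table from the example above exists in the current database; the column aliases are illustrative.

```sql
-- Code page used for VARCHAR data under two common collations.
SELECT
    COLLATIONPROPERTY('Latin1_General_CI_AS', 'CodePage') AS WindowsCollationCodePage,
    COLLATIONPROPERTY('SQL_Latin1_General_CP1_CI_AS', 'CodePage') AS SqlCollationCodePage;

-- Data type and collation of each column in the example table.
SELECT c.name AS ColumnName, t.name AS TypeName, c.collation_name
FROM sys.columns AS c
JOIN sys.types AS t ON t.user_type_id = c.user_type_id
WHERE c.object_id = OBJECT_ID('Product_VARCHAR');
```

Both collations in the first query map to code page 1252, which is why they store the same VARCHAR characters even though their sort rules differ.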

6. Use Cases

When to Use VARCHAR

  • Example 9: Internal system codes (ASCII-only).
  CREATE TABLE ZipCodes (
      Zip VARCHAR(10) -- e.g., '90210'
  );
  • Example 10: Log files (English text).
  CREATE TABLE ErrorLogs (
      LogMessage VARCHAR(1000)
  );

When to Use NVARCHAR

  • Example 11: User-generated content (multilingual).
  CREATE TABLE Comments (
      CommentText NVARCHAR(MAX)
  );
  • Example 12: Global user profiles.
  CREATE TABLE Users (
      Username NVARCHAR(50),
      Bio NVARCHAR(1000)
  );

7. Advanced Scenarios

Example 13: VARCHAR(MAX) vs. NVARCHAR(MAX)

-- VARCHAR(MAX): up to 2 GB (2^31-1 bytes) of non-Unicode data
CREATE TABLE Article_VARCHAR (Content VARCHAR(MAX));

-- NVARCHAR(MAX): up to 2 GB of Unicode data (about 1 billion 2-byte characters)
CREATE TABLE Article_NVARCHAR (Content NVARCHAR(MAX));
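
Values longer than 8,000 bytes only survive if the expression that builds them is itself a MAX type. The sketch below, with illustrative variable names, builds a 10,000-character string in each type. The inner CAST matters because REPLICATE truncates its result at 8,000 bytes when its input is not a MAX type.

```sql
DECLARE @v VARCHAR(MAX) = REPLICATE(CAST('a' AS VARCHAR(MAX)), 10000);
DECLARE @n NVARCHAR(MAX) = REPLICATE(CAST(N'a' AS NVARCHAR(MAX)), 10000);

SELECT
    DATALENGTH(@v) AS VarcharBytes,  -- 10000
    DATALENGTH(@n) AS NvarcharBytes; -- 20000
```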

Example 14: Mixing Data Types

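-- Joining VARCHAR and NVARCHAR columns: the VARCHAR side is implicitly converted to NVARCHAR.
-- An explicit COLLATE is required only when the two columns' collations differ.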
SELECT *
FROM Product_VARCHAR V
INNER JOIN Product_NVARCHAR N 
    ON V.Name = N.Name COLLATE Latin1_General_CI_AS;

8. Best Practices

  1. Default to NVARCHAR for user-facing text (e.g., names, comments).
  2. Use VARCHAR for:
  • ASCII-only data (e.g., GUIDs, codes).
  • Large tables where storage optimization is critical.
  3. Avoid mixing VARCHAR and NVARCHAR in comparisons/joins to prevent performance hits.
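
When two columns are joined often, one way to follow the last point is to give them the same type. The following is a sketch only, reusing the Product_VARCHAR and Product_NVARCHAR tables from Example 8 and assuming the column has no index or constraint that would block the change. On a large table the ALTER rewrites every row, so it should be tested first.

```sql
-- Widen the VARCHAR column to NVARCHAR so both sides of the join match.
ALTER TABLE Product_VARCHAR
    ALTER COLUMN Name NVARCHAR(50) COLLATE Latin1_General_CI_AS NULL;

-- The join now compares NVARCHAR to NVARCHAR with no implicit conversion.
SELECT V.Name
FROM Product_VARCHAR AS V
INNER JOIN Product_NVARCHAR AS N
    ON V.Name = N.Name;
```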

Summary Table

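Feature | VARCHAR | NVARCHAR
--- | --- | ---
Encoding | Non-Unicode code page (1 byte/char for single-byte code pages) | UTF-16 (2 bytes/char, 4 for supplementary characters)
Max Length | 8,000 bytes, or MAX (2 GB) | 4,000 byte-pairs (8,000 bytes), or MAX (2 GB)
Storage Efficiency | High (1x) | Low (2x)
Use Case | Latin text, internal systems | Multilingual text, emojis
Collation | Sets code page and sort rules | Sets sort rules only
Example | ProductCode VARCHAR(10) | UserName NVARCHAR(50)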

Key Takeaways

  • Use VARCHAR for storage efficiency in ASCII-only scenarios.
  • Use NVARCHAR for global applications requiring multilingual support.
  • Always prefix Unicode literals with N (e.g., N'Café').

By understanding these differences, you can optimize both storage and functionality in your SQL Server databases!