CIQUERY - technical support experts on software ,hardware,IT platforms  
 CIQUERY - technical support experts on software ,hardware,IT platforms IT Job search  CIQUERY - technical support experts on software ,hardware,IT platforms  
Register here | FAQ | All about CIQUERY | Technical Support Categories| ARTICLES

Free SQL Server Tips

UK IT Jobs - www.itjobfeed.com

Unicode Fun
Unicode has been part of the mix since SQL Server 7.0 .It was introduced primarily to deal with storing data using a mix of international character standards.

As a consequence the same bit pattern is always converted to the same character on all computers.Recently I took over the management of a SQL Server db and one of the checks required analysing the data for damages and improper storage of data.

One of the interesting things I discovered was that mixing different Unicode multiple encoding methods can cause problems. SQL Server uses UCS-2 or UTF-16 Unicode format, which reflects the 32-bit Windows kernel method of storing inforrmation. The main purpose is so that data doesn't have to converted back and forth which leads to enhanced performance.In contrast , web based applications use the UTF-8 format to store Unicode data. One of the UTF-8 features is that stores all 7-bit ASCII characters as seen , then uses control characters to store the remainining Unicode set.

This gets interesting when UTF-16 data is retrieved and misinterpreted as UTF-8. The consequnce is usually mangled data. If this data is reentered , the most probable outcome is ruined data

Instructions to avoid this problem:

1.Pages that contain data retrieved from the server should set the codepage to 65001 . This will automatically convert any UCS-2/UTF-16 data to UTF-8 format when the page is rendered and sent to the client.

2. The web server automatically converts UTF-8 data to UCS-2 (into SQL Server) , if you set the codepage to 65001 any data sent from a Web page

It's worth remembering that a database won't explicitly support Unicode data unless the field types in question are also explicitly supported. If you use text instead of ntext, for instance, any Unicode entities sent to the database will instead be translated into ISO-8859-1 encoding, which should not be intermixed with true Unicode entities to avoid data damage.
Hopefully this helps in either troubleshooting or planning a web application interacting with a SQL Server database.

 

Digg This : Click to email a colleague


Author details
------------------------------------------------------------------------
Jack Vamvas is a SQL Server Consultant. To contact him about this article , or any issues related to SQL Server register on www.ciquery.com and post a question.
For SQL Server Consultancy of any SQL Server task - check SQL Server Specialists
Make SQL Server go faster - click here


© CIQUERY 2008Factsheet  |Blog  |Browse IT jobs  |Advertisers  | Job Tools  |Resources  |Companies  | Contact Us  | ITjobfeed Home