XML Tricks for C#
By: Michael Youssef
    2004-03-24

    Table of Contents:
  • XML Tricks for C#
  • Attributes and Document Complexity
  • A First Look at Encoding
  • Unicode
  • Encoding with XML

    XML Tricks for C# - Unicode


    (Page 4 of 5 )

    The Unicode Consortium states: "Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language." I like that, and I think you will like it too once you know more about it.

    Looking at all these different character sets, such as ASCII, Extended ASCII, ANSI (windows-1252), Shift-JIS, ISO-2022-JP, EUC-JP, and many others, it became clear that some kind of standardization was needed. The Unicode Consortium set out to solve the problems caused by these competing character sets, and in 1996 it released version 2.0 of the Unicode Standard.

    The Unicode Standard provides a single, very large character set that covers the characters of the world's languages. With it, we don't have to switch from one character set to another when developing international applications. Unicode is also built into most common software applications and is fully supported by Windows NT, the Windows 2000 family, Windows XP, and the Windows Server 2003 family (previously known as Windows .NET Server).
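    To make the idea of "one number per character" concrete, here is a minimal C# sketch that prints the Unicode code point of a few characters from different scripts. The class name CodePointDemo and the helper Describe are made up for this illustration; char.ConvertToUtf32 is the standard .NET method that does the real work.

```csharp
using System;

public static class CodePointDemo
{
    // Hypothetical helper: returns the code point of the first
    // character in s, formatted in the usual U+XXXX notation.
    public static string Describe(string s)
    {
        int codePoint = char.ConvertToUtf32(s, 0);
        return "U+" + codePoint.ToString("X4");
    }

    public static void Main()
    {
        // Escapes are used so the sample does not depend on the
        // encoding of the source file itself.
        Console.WriteLine("Latin A:      " + Describe("A"));       // U+0041
        Console.WriteLine("Latin e-acute: " + Describe("\u00E9")); // U+00E9
        Console.WriteLine("Kanji 'day':  " + Describe("\u65E5"));  // U+65E5
        Console.WriteLine("Arabic meem:  " + Describe("\u0645"));  // U+0645
    }
}
```

    Whatever the script, each character has exactly one number, and that number is the same on every platform.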

    But if Unicode is such an important character set, why don't all vendors support it? The short answer is cost. Unicode is a single character set that can represent any language in your application, which is a valuable feature. However, some would say that efficiency is sacrificed by using a character set whose characters need 16 bits or more (as the Asian languages do) when all they need to program for are the shorter Latin-based characters (which take only 7 or 8 bits). This was a major debate among the folks in the Unicode Consortium. As a result, although Unicode uses one character set for all known characters (with room for characters not yet assigned), the Unicode Standard offers three types of encoding.

    Each encoding uses one or more fixed-size units to store (encode) each Unicode character. Here are the three types of encoding:

    • UTF-32, which uses a single 32-bit unit to encode each character
    • UTF-16, which uses one or two 16-bit units to encode each character
    • UTF-8, which uses one to four 8-bit units to encode each character
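
    The unit counts in the list above can be checked from C#. The following is a minimal sketch, not a definitive tool: the class name EncodingSizes and the helper ByteCount are invented for this example, while Encoding.UTF8, Encoding.Unicode (the .NET name for UTF-16) and Encoding.UTF32 come from System.Text.

```csharp
using System;
using System.Text;

public static class EncodingSizes
{
    // Hypothetical helper: number of bytes needed to store text
    // in the given encoding (no byte order mark is counted).
    public static int ByteCount(string text, Encoding encoding)
    {
        return encoding.GetByteCount(text);
    }

    public static void Main()
    {
        // One character each: Latin A, e-acute, a kanji, and an emoji
        // that lies outside the 16-bit range.
        string[] samples = { "A", "\u00E9", "\u65E5", "\U0001F600" };

        foreach (string s in samples)
        {
            Console.WriteLine("U+{0:X4}: UTF-8={1}, UTF-16={2}, UTF-32={3}",
                char.ConvertToUtf32(s, 0),
                ByteCount(s, Encoding.UTF8),
                ByteCount(s, Encoding.Unicode),
                ByteCount(s, Encoding.UTF32));
        }
        // Prints:
        // U+0041: UTF-8=1, UTF-16=2, UTF-32=4
        // U+00E9: UTF-8=2, UTF-16=2, UTF-32=4
        // U+65E5: UTF-8=3, UTF-16=2, UTF-32=4
        // U+1F600: UTF-8=4, UTF-16=4, UTF-32=4
    }
}
```

    Note how UTF-8 is smallest for the Latin letter, UTF-16 is smallest for the kanji, and UTF-32 always costs four bytes.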

    UTF-32 uses 4 bytes to encode every character, so it is rarely used for storing or exchanging text. UTF-16 and UTF-8, on the other hand, are extensively supported, and every XML parser is required to support both. If the text contains many characters that need more than 2 bytes in UTF-8 (most Asian characters need 3), UTF-16 becomes the more efficient choice, because each of those characters fits in a single 16-bit unit. UTF-8 is most efficient when you are storing mostly Latin characters, which take only one byte each.

    Now, after this simple introduction, let's get down to Encoding with XML.

    © 2003-2008 by Developer Shed. All rights reserved.