RAD Studio
Unicode Character Types and Literals (C++0x)

C++Builder 2009 implements new character types and character literals for Unicode. These types are among the C++0x features added to C++Builder 2009.

Two new types represent Unicode characters:

  • char16_t is a 16-bit character type. char16_t is a C++ keyword. This type can be used for UTF-16 code units.
  • char32_t is a 32-bit character type. char32_t is a C++ keyword. This type can be used for UTF-32 characters.

The existing wchar_t type represents a wide character in the execution wide-character set. A wchar_t wide-character literal begins with an uppercase L (such as L'c').
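The relationship between the three character types can be sketched as follows. This is a minimal illustration that assumes a compiler following the finalized C++11 rules for these types; the helper type_name is a hypothetical name introduced only for this example and is not part of C++Builder or the standard library.

```cpp
#include <cassert>
#include <climits>
#include <cstring>
#include <type_traits>

// char16_t and char32_t are keywords, so no header is needed to name them.
static_assert(sizeof(char16_t) * CHAR_BIT >= 16, "char16_t holds at least 16 bits");
static_assert(sizeof(char32_t) * CHAR_BIT >= 32, "char32_t holds at least 32 bits");

// They are distinct types, not typedefs for wchar_t or an integer type.
static_assert(!std::is_same<char16_t, wchar_t>::value, "char16_t is not wchar_t");
static_assert(!std::is_same<char32_t, wchar_t>::value, "char32_t is not wchar_t");
static_assert(!std::is_same<char16_t, unsigned short>::value, "char16_t is not unsigned short");

// Hypothetical helper: because the types are distinct, each one can
// select its own overload.
inline const char* type_name(char16_t) { return "char16_t"; }
inline const char* type_name(char32_t) { return "char32_t"; }
inline const char* type_name(wchar_t)  { return "wchar_t"; }
```

Because the types are distinct, functions can be overloaded on char16_t, char32_t, and wchar_t without ambiguity.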

There are two new forms to create literals of the new types:

  • u'character' is a literal for a single char16_t character, such as u'g'. A multicharacter literal such as u'kh' is ill-formed. The value of a char16_t literal is equal to its ISO 10646 code point value, provided that the code point is representable as a 16-bit value. Only characters in the Basic Multilingual Plane (BMP) can be represented.
  • U'character' is a literal for a single char32_t character, such as U't'. A multicharacter literal such as U'de' is ill-formed. The value of a char32_t literal is equal to its ISO 10646 code point value.

Previously, wide-character literals had only the form L'characters', representing one or more characters of type wchar_t. The value of a wide-character literal that contains a single character is that character's encoding in the execution wide-character set.
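The literal forms described above can be sketched as follows. The example assumes a compiler that supports C++11 constexpr and static_assert, which may go beyond what C++Builder 2009 provides. The constant names and the helper fits_in_char16 are hypothetical and exist only for illustration.

```cpp
#include <cassert>

// Each literal's value is the ISO 10646 code point of its character.
constexpr char16_t small_g = u'g';            // U+0067
constexpr char32_t small_t = U't';            // U+0074
constexpr char16_t euro    = u'\u20AC';       // U+20AC lies in the BMP
constexpr char32_t clef    = U'\U0001D11E';   // U+1D11E lies outside the BMP

static_assert(small_g == 0x0067, "u'g' has the value of code point U+0067");
static_assert(small_t == 0x0074, "U't' has the value of code point U+0074");
static_assert(euro == 0x20AC, "a BMP character fits in a char16_t literal");
static_assert(clef == 0x1D11E, "a char32_t literal can hold any code point");

// Hypothetical helper: a simple range check for whether a code point
// is representable as a 16-bit value. It does not validate the code point.
constexpr bool fits_in_char16(char32_t code_point) {
    return code_point <= 0xFFFF;
}
```

A character such as U+1D11E fails the range check, so it can be written as a U'...' literal but not as a u'...' literal.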

Copyright(C) 2009 Embarcadero Technologies, Inc. All Rights Reserved.