C++ char16_t Keyword
The char16_t keyword in C++ names a character type introduced in C++11 for representing UTF-16 code units. It is primarily used to handle UTF-16 encoded text, which is common in applications requiring internationalization. Unlike the traditional char, which is typically 8 bits wide, char16_t is a distinct unsigned type that is at least 16 bits wide. Most characters fit in a single char16_t; characters outside the Basic Multilingual Plane are stored as two code units (a surrogate pair).
Character and string literals using char16_t are prefixed with u. A string literal such as u"string" is an array of const char16_t that converts implicitly to const char16_t*.
Syntax
char16_t variable_name = u'character';
const char16_t* string_name = u"string";
- char16_t: The keyword used to declare a variable that stores one UTF-16 code unit.
- variable_name: The name of the variable that stores the character.
- string_name: The name of a pointer to the first code unit of a null-terminated UTF-16 string.
- u: The prefix that marks a character or string literal as UTF-16.
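The types behind the syntax above can be checked at compile time. The following is a minimal sketch, not a definitive reference; the helper name code_unit_count is hypothetical and exists only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// A u'...' literal has type char16_t.
static_assert(std::is_same<decltype(u'A'), char16_t>::value,
              "u'A' should be a char16_t");

// A u"..." literal is an array of const char16_t, including the
// terminating null, so u"Hi" has three elements.
static_assert(std::is_same<decltype(u"Hi"), const char16_t(&)[3]>::value,
              "u\"Hi\" should be const char16_t[3]");

// Hypothetical helper: number of code units in a null-terminated
// UTF-16 string, not counting the terminator.
std::size_t code_unit_count(const char16_t* text) {
    assert(text != nullptr);
    return std::char_traits<char16_t>::length(text);
}
```

Calling code_unit_count(u"Hi") returns 2, because the array decays to const char16_t* and the terminator is not counted.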
Examples
Example 1: Declaring a UTF-16 Character
This example demonstrates how to declare a char16_t variable and print it both as a character and as its integer code value.
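#include <iostream>
using namespace std;

int main() {
    char16_t ch = u'A'; // Declare a UTF-16 character
    // Narrowing to char is only safe here because 'A' is an ASCII character
    cout << "Character: " << static_cast<char>(ch) << endl;
    // Converting to int shows the numeric value of the code unit
    cout << "Unicode Value: " << static_cast<int>(ch) << endl;
    return 0;
}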
Output:
Character: A
Unicode Value: 65
Explanation:
- The char16_t variable ch is initialized with u'A', representing a UTF-16 character.
- The character is cast to char for display in the first output line.
- The character is cast to int to display its Unicode value, which is 65.
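The cast to char in this example only works because 'A' lies in the ASCII range; a char16_t holding a larger value would be truncated. One way to guard against that is sketched below. The helper names is_ascii and narrow_or are hypothetical and not part of the standard library.

```cpp
// Hypothetical helper: true when the code unit is in the 7-bit ASCII range.
bool is_ascii(char16_t unit) {
    return unit < 0x80;
}

// Hypothetical helper: narrow an ASCII code unit to char, or return
// `fallback` for any value that a single char cannot represent.
char narrow_or(char16_t unit, char fallback) {
    return is_ascii(unit) ? static_cast<char>(unit) : fallback;
}
```

With these helpers, narrow_or(u'A', '?') yields 'A', while a non-ASCII code unit such as u'\u3053' yields the fallback '?'.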
Example 2: Declaring and Printing a UTF-16 String
This example demonstrates how to declare and print a UTF-16 encoded string using char16_t.
#include <iostream>
#include <string>

int main() {
    // Create a UTF-16 encoded string
    const char16_t* greeting = u"Hello, UTF-16!";
    // Copy each UTF-16 code unit into a wchar_t (the width of wchar_t is
    // platform-specific). This is only a faithful conversion for text
    // that contains no surrogate pairs.
    std::wstring wide_greeting(greeting, greeting + std::char_traits<char16_t>::length(greeting));
    // Print using wcout
    std::wcout << L"Message: " << wide_greeting << std::endl;
    return 0;
}
Output:
Message: Hello, UTF-16!
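Explanation:
- The string u"Hello, UTF-16!" is a UTF-16 encoded string literal, and the variable greeting (a const char16_t*) points to its first code unit.
- The program copies the code units one by one into a std::wstring and prints that with std::wcout. This element-wise copy is sufficient here because every character in the string occupies a single code unit.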
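Copying code units one at a time relies on each character occupying exactly one char16_t. Characters outside the Basic Multilingual Plane occupy two (a high surrogate followed by a low surrogate). The sketch below counts code points rather than code units to make the difference visible; the function names are hypothetical, and it treats an unpaired surrogate as one code point, which is an assumption of this example rather than a rule.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helpers: classify a UTF-16 code unit.
bool is_high_surrogate(char16_t unit) { return unit >= 0xD800 && unit <= 0xDBFF; }
bool is_low_surrogate(char16_t unit)  { return unit >= 0xDC00 && unit <= 0xDFFF; }

// Hypothetical helper: count code points in a UTF-16 string.
// A valid high+low surrogate pair counts once; every other code unit
// (including an unpaired surrogate) also counts once.
std::size_t code_point_count(const std::u16string& text) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (is_high_surrogate(text[i]) && i + 1 < text.size() &&
            is_low_surrogate(text[i + 1])) {
            ++i; // skip the low half of the pair
        }
        ++count;
    }
    return count;
}
```

For u"Hello" both the size and the code point count are 5. For a string holding only U+1F600, written u"\U0001F600", size() is 2 while code_point_count returns 1.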
Example 3: Working with Non-ASCII Characters
This example shows how to use char16_t with UTF-16 encoded non-ASCII characters.
#include <iostream>
#include <string>
#include <codecvt> // For std::codecvt_utf8_utf16 (deprecated since C++17)
#include <locale>  // For std::wstring_convert (deprecated since C++17)

int main() {
    const char16_t* japanese = u"こんにちは"; // "Hello" in Japanese
    // Copy the literal into a std::u16string
    std::u16string utf16_string(japanese);
    // Convert UTF-16 to UTF-8
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> converter;
    std::string utf8_string = converter.to_bytes(utf16_string);
    // Output the UTF-8 encoded string
    std::cout << "UTF-8 String: " << utf8_string << std::endl;
    return 0;
}
Output:
UTF-8 String: こんにちは
Explanation:
- A UTF-16 string is declared using the const char16_t* type: const char16_t* japanese = u"こんにちは";
- The UTF-16 string is copied into a std::u16string for compatibility with the conversion utilities.
- The std::wstring_convert class is used with std::codecvt_utf8_utf16 to perform the UTF-16 to UTF-8 conversion. Both facilities are deprecated since C++17, so compilers may warn about them.
- The to_bytes method of std::wstring_convert is called to convert the std::u16string into a UTF-8 encoded std::string.
- The resulting UTF-8 encoded string is printed to the console using std::cout.
- The program outputs the UTF-8 encoded version of the Japanese text “こんにちは”, which displays correctly on a terminal that expects UTF-8.
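Because std::wstring_convert and the <codecvt> facets are deprecated since C++17, the same UTF-16 to UTF-8 conversion can also be written by hand. The following is a minimal sketch under stated assumptions, not a replacement for a tested Unicode library: the function name utf16_to_utf8 is hypothetical, and replacing unpaired surrogates with U+FFFD is a choice made for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: convert a UTF-16 string to UTF-8.
// Assumption of this sketch: unpaired surrogates become U+FFFD.
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD; // low surrogate without a preceding high surrogate
        }
        // Encode the code point as one to four UTF-8 bytes.
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

In the example program, the call converter.to_bytes(utf16_string) could then be replaced by utf16_to_utf8(utf16_string), and the two deprecated headers would no longer be needed.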
Key Points about char16_t Keyword
- char16_t is a character type at least 16 bits wide, introduced in C++11 for handling UTF-16 encoded text.
- Character and string literals of type char16_t must be prefixed with u.
- One char16_t holds one UTF-16 code unit; characters outside the Basic Multilingual Plane need two.
- Outputting char16_t data typically requires casting or conversion, as std::cout does not natively support it.
- Use char16_t when working with UTF-16 strings for internationalization and Unicode compliance in modern applications.
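As an illustration of the point about output, the sketch below renders each code unit as hexadecimal text that std::cout can print. The helper name hex_dump is hypothetical, and the format (four uppercase hex digits per code unit, separated by spaces) is simply a choice made for this example.

```cpp
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: format every UTF-16 code unit as four uppercase
// hex digits, separated by single spaces.
std::string hex_dump(const std::u16string& text) {
    std::ostringstream out;
    out << std::hex << std::uppercase << std::setfill('0');
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (i != 0) {
            out << ' ';
        }
        // Convert to an integer type first; streaming a char16_t
        // directly to a narrow stream is not portable.
        out << std::setw(4) << static_cast<unsigned>(text[i]);
    }
    return out.str();
}
```

For instance, hex_dump(u"Hi") produces 0048 0069, and the resulting std::string can be written with std::cout as usual.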