C++ char32_t Keyword
The char32_t keyword in C++ names a character type introduced in C++11 for representing 32-bit Unicode characters. It is primarily used for handling UTF-32 encoded text, in which every Unicode code point is stored as a single 32-bit code unit. The char32_t type is distinct from char and wchar_t and is wide enough to hold any Unicode code point, which makes it suitable for applications that need fixed-width character handling.
Character and string literals of this type are prefixed with U. A string literal such as U"text" is an array of const char32_t, which converts implicitly to const char32_t*.
Syntax
char32_t variable_name = U'character';
const char32_t* string_name = U"string";
- char32_t: The keyword used to declare a variable to store a 32-bit Unicode character.
- variable_name: The name of the variable that stores the Unicode character.
- U: A prefix used for UTF-32 encoded strings or characters.
Examples
Example 1: Declaring a UTF-32 Character
This example demonstrates how to declare a char32_t variable and print its value as a character and as an integer.
#include <iostream>
using namespace std;

int main() {
    char32_t ch = U'A'; // Declare a UTF-32 character
    cout << "Character: " << static_cast<char>(ch) << endl;     // Narrow to char for display
    cout << "Unicode Value: " << static_cast<int>(ch) << endl;  // Show the numeric code point
    return 0;
}
Output:
Character: A
Unicode Value: 65
Explanation:
- The char32_t variable ch is initialized with U'A', representing a UTF-32 character.
- The character is cast to char for display in the first output line. This works because 'A' is in the ASCII range; a code point above 127 would not survive the narrowing cast.
- The character is cast to int to display its Unicode value, which is 65.
Example 2: Declaring and Printing a UTF-32 String
This example demonstrates how to declare and print a UTF-32 encoded string using char32_t.
#include <iostream>
#include <string>
#include <codecvt> // For conversion (deprecated in C++17)
#include <locale>  // For std::wstring_convert
using namespace std;

int main() {
    // UTF-32 encoded string
    const char32_t* greeting = U"Hello, UTF-32!";

    // Convert to std::u32string
    std::u32string utf32_string(greeting);

    // Convert UTF-32 to UTF-8 using std::wstring_convert
    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> converter;
    std::string utf8_string = converter.to_bytes(utf32_string);

    // Output the UTF-8 string
    cout << "Message: " << utf8_string << endl;
    return 0;
}
Output:
Message: Hello, UTF-32!
Explanation:
- UTF-32 String Declaration: The string greeting is declared as const char32_t* to represent a UTF-32 encoded literal.
- Convert to std::u32string: The greeting pointer is converted to a std::u32string for compatibility with conversion utilities.
- UTF-32 to UTF-8 Conversion: The std::wstring_convert class is used with std::codecvt_utf8<char32_t> to convert the UTF-32 encoded string to a UTF-8 encoded std::string.
- Output with std::cout: The UTF-8 encoded string is printed using std::cout.
Example 3: Working with Non-ASCII Characters
This example shows how to use char32_t with UTF-32 encoded non-ASCII characters.
#include <iostream>
#include <string>
#include <codecvt> // For conversion (deprecated in C++17)
#include <locale>  // For std::wstring_convert

int main() {
    // UTF-32 encoded string (Japanese for "Hello")
    const char32_t* japanese = U"こんにちは";

    // Convert to std::u32string
    std::u32string utf32_string(japanese);

    // Convert UTF-32 to UTF-8
    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> converter;
    std::string utf8_string = converter.to_bytes(utf32_string);

    // Print the UTF-8 encoded string
    std::cout << "UTF-8 String: " << utf8_string << std::endl;
    return 0;
}
Output:
UTF-8 String: こんにちは
Explanation:
- UTF-32 String Declaration: The japanese string is a UTF-32 encoded literal accessed through a const char32_t*.
- Conversion to std::u32string: The raw UTF-32 pointer is wrapped in a std::u32string to simplify handling and compatibility with conversion utilities.
- UTF-32 to UTF-8 Conversion: The std::wstring_convert class is used along with std::codecvt_utf8<char32_t> to convert the UTF-32 string to a UTF-8 encoded std::string.
- Output with std::cout: The UTF-8 encoded string is printed using std::cout, which writes the bytes unchanged. The text displays correctly when the terminal interprets its input as UTF-8.
Key Points about char32_t Keyword
- char32_t is a 32-bit data type introduced in C++11 for handling UTF-32 encoded characters.
- Character and string literals of type char32_t must be prefixed with U.
- char32_t is wide enough to hold any Unicode code point, so each element of a UTF-32 string is one complete code point.
- Outputting char32_t data requires a cast or a conversion to another encoding, as std::cout does not natively support it.
- Use char32_t when working with UTF-32 strings that need fixed-width access to every Unicode character.