ICU 56.1
C++ API: Unicode String. More...
#include "unicode/utypes.h"
#include "unicode/rep.h"
#include "unicode/std_string.h"
#include "unicode/stringpiece.h"
#include "unicode/bytestream.h"
#include "unicode/ucasemap.h"
Data Structures | |
class | icu::UnicodeString |
UnicodeString is a string class that stores Unicode characters directly and provides functionality similar to that of the Java String and StringBuffer/StringBuilder classes. More... | |
Namespaces | |
icu | |
File coll.h. | |
Macros | |
#define | U_COMPARE_CODE_POINT_ORDER 0x8000 |
Option bit for u_strCaseCompare, u_strcasecmp, unorm_compare, etc: Compare strings in code point order instead of code unit order. More... | |
#define | U_STRING_CASE_MAPPER_DEFINED |
#define | US_INV icu::UnicodeString::kInvariant |
Constant to be used in the UnicodeString(char *, int32_t, EInvariant) constructor which constructs a Unicode string from an invariant-character char * string. More... | |
#define | UNICODE_STRING(cs, _length) icu::UnicodeString(TRUE, (const UChar *)L ## cs, _length) |
Unicode String literals in C++. More... | |
#define | UNICODE_STRING_SIMPLE(cs) UNICODE_STRING(cs, -1) |
Unicode String literals in C++. More... | |
#define | UNISTR_FROM_CHAR_EXPLICIT |
This can be defined to be empty or "explicit". More... | |
#define | UNISTR_FROM_STRING_EXPLICIT |
This can be defined to be empty or "explicit". More... | |
#define | UNISTR_OBJECT_SIZE 64 |
Desired sizeof(UnicodeString) in bytes. More... | |
Typedefs | |
typedef int32_t | UStringCaseMapper(const UCaseMap *csm, UChar *dest, int32_t destCapacity, const UChar *src, int32_t srcLength, UErrorCode *pErrorCode) |
Internal string case mapping function type. More... | |
Functions | |
int32_t | u_strlen (const UChar *s) |
Determine the length of an array of UChar. More... | |
U_COMMON_API UnicodeString | icu::operator+ (const UnicodeString &s1, const UnicodeString &s2) |
Create a new UnicodeString with the concatenation of two others. More... | |
C++ API: Unicode String.
Definition in file unistr.h.
#define U_COMPARE_CODE_POINT_ORDER 0x8000 |
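Code unit order and code point order differ only when supplementary code points are involved: a surrogate pair starts with a code unit in the range 0xD800..0xDBFF, which sorts below BMP characters such as U+FF5E even though the code point it encodes is larger. The following is a minimal sketch of that difference using std::u16string. The helper names (toCodePoints, lessCodeUnitOrder, lessCodePointOrder) are hypothetical and are not ICU functions, and this is not how ICU implements the option.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not an ICU function: decode UTF-16 into code points.
// Assumes well-formed input (each lead surrogate is followed by a trail).
std::u32string toCodePoints(const std::u16string &s) {
    std::u32string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        char32_t c = s[i];
        if (c >= 0xD800 && c <= 0xDBFF && i + 1 < s.size()) {
            char32_t trail = s[++i];
            // Combine lead and trail surrogates into one supplementary code point.
            c = 0x10000 + ((c - 0xD800) << 10) + (trail - 0xDC00);
        }
        out.push_back(c);
    }
    return out;
}

// Code unit order: compare the 16-bit units directly.
bool lessCodeUnitOrder(const std::u16string &a, const std::u16string &b) {
    return a < b;
}

// Code point order: compare the decoded code points.
bool lessCodePointOrder(const std::u16string &a, const std::u16string &b) {
    return toCodePoints(a) < toCodePoints(b);
}
```

With these helpers, U+10000 (stored as D800 DC00) sorts before U+FF5E in code unit order but after it in code point order.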
#define U_STRING_CASE_MAPPER_DEFINED |
#define UNICODE_STRING | ( | cs, | |
_length | |||
) | icu::UnicodeString(TRUE, (const UChar *)L ## cs, _length) |
Unicode String literals in C++.
Depending on platform properties, different UnicodeString constructors should be used to create a UnicodeString object from a string literal. The macros are defined for maximum performance. They work only for strings that contain "invariant characters", i.e., only Latin letters, digits, and some punctuation. See utypes.h for details.
The string parameter must be a C string literal. The length of the string, not including the terminating NUL, must be specified as a constant. The U_STRING_DECL macro should be invoked exactly once for one such string variable before it is used.
#define UNICODE_STRING_SIMPLE | ( | cs | ) | UNICODE_STRING(cs, -1) |
Unicode String literals in C++.
Depending on platform properties, different UnicodeString constructors should be used to create a UnicodeString object from a string literal. The macros are defined for improved performance. They work only for strings that contain "invariant characters", i.e., only Latin letters, digits, and some punctuation. See utypes.h for details.
The string parameter must be a C string literal.
#define UNISTR_FROM_CHAR_EXPLICIT |
#define UNISTR_FROM_STRING_EXPLICIT |
This can be defined to be empty or "explicit".
If explicit, then the UnicodeString(const char *) and UnicodeString(const UChar *) constructors are marked as explicit, preventing their inadvertent use.
In particular, this helps prevent accidentally depending on ICU conversion code by passing a string literal into an API with a const UnicodeString & parameter.
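The effect of defining such a macro as "explicit" can be sketched without ICU. In the example below, DEMO_FROM_CHAR_EXPLICIT, DemoString, and demoLength are hypothetical stand-ins (they are not ICU names) that mirror the pattern described above: the macro expands in front of a converting constructor, so a string literal can no longer be passed silently where a const reference to the string class is expected.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical stand-in for UNISTR_FROM_CHAR_EXPLICIT.
// Defining it as empty would allow implicit conversion again.
#define DEMO_FROM_CHAR_EXPLICIT explicit

// Hypothetical string class, not ICU's UnicodeString.
class DemoString {
public:
    DEMO_FROM_CHAR_EXPLICIT DemoString(const char *s) : text_(s) {}
    std::size_t length() const { return text_.size(); }
private:
    std::string text_;
};

// An API with a const reference parameter, analogous to const UnicodeString &.
std::size_t demoLength(const DemoString &s) { return s.length(); }

// With the constructor marked explicit, demoLength("abc") does not compile;
// the caller has to write demoLength(DemoString("abc")) instead.
static_assert(!std::is_convertible<const char *, DemoString>::value,
              "implicit conversion from const char * is disabled");
static_assert(std::is_constructible<DemoString, const char *>::value,
              "explicit construction is still allowed");
```

Making the conversion visible at the call site is what lets a reader, or a build without conversion code, notice the dependency.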
#define UNISTR_OBJECT_SIZE 64 |
Desired sizeof(UnicodeString) in bytes.
It should be a multiple of sizeof(pointer) to avoid unusable space for padding. It may also be desirable for the object size to be a multiple of 16 bytes, which is a common granularity for heap allocation.
Any space inside the object beyond sizeof(vtable pointer) + 2 is available for storing short strings inside the object. The bigger the object, the longer a string that can be stored inside the object, without additional heap allocation.
Depending on a platform's pointer size, pointer alignment requirements, and struct padding, the compiler will usually round up sizeof(UnicodeString) to 4 * sizeof(pointer) (or 3 * sizeof(pointer) for P128 data models), to hold the fields for heap-allocated strings. Such a minimum size also ensures that the object is easily large enough to hold at least 2 UChars, for one supplementary code point (U16_MAX_LENGTH).
sizeof(UnicodeString) >= 48 should work for all known platforms.
For example, on a 64-bit machine where sizeof(vtable pointer) is 8, sizeof(UnicodeString) = 64 would leave space for (64 - sizeof(vtable pointer) - 2) / U_SIZEOF_UCHAR = (64 - 8 - 2) / 2 = 27 UChars stored inside the object.
The minimum object size on a 64-bit machine would be 4 * sizeof(pointer) = 4 * 8 = 32 bytes, and the internal buffer would hold up to 11 UChars in that case.
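The arithmetic above can be checked with a small helper. This is only a sketch of the formula as stated in the text: inlineUCharCapacity is a hypothetical name, not part of ICU, and the UChar size of 2 bytes is taken from the U_SIZEOF_UCHAR value used in the example above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of ICU: number of UChars that fit inside an
// object of objectSize bytes, following the formula in the text:
// (objectSize - sizeof(vtable pointer) - 2) / U_SIZEOF_UCHAR.
std::size_t inlineUCharCapacity(std::size_t objectSize,
                                std::size_t vtablePointerSize) {
    const std::size_t ucharSize = 2;  // U_SIZEOF_UCHAR: one 16-bit code unit
    assert(objectSize >= vtablePointerSize + 2);
    return (objectSize - vtablePointerSize - 2) / ucharSize;
}
```

For an 8-byte vtable pointer, inlineUCharCapacity(64, 8) evaluates to 27 and inlineUCharCapacity(32, 8) evaluates to 11, matching the two cases described above.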
#define US_INV icu::UnicodeString::kInvariant |
Constant to be used in the UnicodeString(char *, int32_t, EInvariant) constructor which constructs a Unicode string from an invariant-character char * string.
For details about invariant characters, see utypes.h. This constructor has no runtime dependency on conversion code and is therefore recommended over constructors that take a charset name string (where the empty string "" indicates invariant-character conversion).