A single extended grapheme cluster that approximates a user-perceived character.
SDK
- Xcode 6.3+
Framework
- Swift Standard Library
Declaration
Overview
The Character type represents a character made up of one or more Unicode scalar values, grouped by a Unicode boundary algorithm. Generally, a Character instance matches what the reader of a string will perceive as a single character. Strings are collections of Character instances, so the number of visible characters is generally the most natural way to count the length of a string.
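As a minimal sketch of counting by visible characters, the snippet below treats a string as a collection of Character values. The variable names are illustrative, and the count property on String is assumed to be available (Swift 4 or later).

```swift
// "Hello! " followed by U+1F425 FRONT-FACING BABY CHICK.
let greeting = "Hello! \u{1F425}"

// A String is a collection of Character values, so it can be
// iterated directly or copied into an array.
let characters: [Character] = Array(greeting)

for character in greeting {
    print(character)
}

// The count is measured in Character instances, not bytes or code units.
print("Length: \(greeting.count)")
```

Here the emoji contributes one Character to the count, the same as each ASCII letter.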
Because each character in a string can be made up of one or more Unicode scalar values, the number of characters in a string may not match the length of the Unicode scalar value representation or the length of the string in a particular binary representation.
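The difference between these lengths can be sketched with a string whose last character is built from two scalar values. The example string is an illustrative choice, not one taken from the original page, and it assumes the String views count, unicodeScalars, utf16, and utf8 from Swift 4 or later.

```swift
// "cafe" followed by U+0301 COMBINING ACUTE ACCENT. The final "e" and the
// accent form a single extended grapheme cluster.
let word = "cafe\u{301}"

print(word.count)                 // Character instances: 4
print(word.unicodeScalars.count)  // Unicode scalar values: 5
print(word.utf16.count)           // UTF-16 code units: 5
print(word.utf8.count)            // UTF-8 code units (bytes): 6
```

Each view answers a different question about the same string, so the appropriate count depends on whether the code is dealing with user-visible text or with a specific encoding.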
Every Character instance is composed of one or more Unicode scalar values that are grouped together as an extended grapheme cluster. The way these scalar values are grouped is defined by a canonical, localized, or otherwise tailored Unicode segmentation algorithm.
For example, a country’s Unicode flag character is made up of two regional indicator scalar values that correspond to that country’s ISO 3166-1 alpha-2 code. The alpha-2 code for the United States is “US”, so its flag character is made up of the Unicode scalar values "\u{1F1FA}" (REGIONAL INDICATOR SYMBOL LETTER U) and "\u{1F1F8}" (REGIONAL INDICATOR SYMBOL LETTER S). When placed next to each other in a string literal, these two scalar values are combined into a single grapheme cluster, represented by a Character instance in Swift.
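The flag example can be sketched as follows. The constant names are illustrative, and the scalar values are read back through a String so that the snippet does not depend on which Swift version exposes scalars directly on Character.

```swift
// Two regional indicator scalars written side by side in one literal
// form a single Character.
let usFlag: Character = "\u{1F1FA}\u{1F1F8}"

// Read the underlying scalar values back out of the cluster.
let scalarValues = String(usFlag).unicodeScalars.map { $0.value }

print(usFlag)
print(String(usFlag).count)   // one Character
print(scalarValues.count)     // two Unicode scalar values
```

Assigning the two-scalar literal to a Character compiles only because the scalars combine into exactly one extended grapheme cluster.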
For more information about the Unicode terms used in this discussion, see the Unicode.org glossary. In particular, this discussion mentions extended grapheme clusters and Unicode scalar values.