unicode.rst - OpenGrok cross reference for /aosp_15_r20/external/python/cpython3/Doc/c-api/unicode.rst

Lines Matching refs:Unicode
5 Unicode Objects and Codecs
11 Unicode Objects
14 Since the implementation of :pep:`393` in Python 3.3, Unicode objects internally
16 of Unicode characters while staying memory efficient.  There are special cases
18 points must be below 1114112 (which is the full Unicode range).
21 in the Unicode object.  The :c:expr:`Py_UNICODE*` representation is deprecated
24 Due to the transition between the old APIs and the new APIs, Unicode objects
27 * "canonical" Unicode objects are all objects created by a non-deprecated
28   Unicode API.  They use the most efficient representation allowed by the
31 * "legacy" Unicode objects have been created through one of the deprecated
37    The "legacy" Unicode object will be removed in Python 3.12 along with the deprecated
38    APIs. All Unicode objects will be "canonical" from then on. See :pep:`623`
42 Unicode Type
45 These are the basic Unicode object types used for the Unicode implementation in
54    single Unicode characters, use :c:type:`Py_UCS4`.
66       whether you selected a "narrow" or "wide" Unicode version of Python at
74    These subtypes of :c:type:`PyObject` represent a Python Unicode object.  In
76    that deal with Unicode objects take and return :c:type:`PyObject` pointers.
83    This instance of :c:type:`PyTypeObject` represents the Python Unicode type.  It
88 access to internal read-only data of Unicode objects:
92    Return true if the object *o* is a Unicode object or an instance of a Unicode
98    Return true if the object *o* is a Unicode object, but not an instance of a
120    Return the length of the Unicode string, in code points.  *o* has to be a
121    Unicode object in the "canonical" representation (not checked).
155    bytes per character this Unicode object uses to store its data.  *o* has to
156    be a Unicode object in the "canonical" representation (not checked).
165    Return a void pointer to the raw Unicode buffer.  *o* has to be a Unicode
195    Read a character from a Unicode object *o*, which must be in the "canonical"
215    Unicode object (not checked).
218       Part of the old-style Unicode API; please migrate to using
225    bytes.  *o* has to be a Unicode object (not checked).
228       Part of the old-style Unicode API; please migrate to using
240    a Unicode object (not checked).
250       Part of the old-style Unicode API; please migrate to using the
264 Unicode Character Properties
267 Unicode provides many different character properties. The most often needed ones
325    Nonprintable characters are those characters defined in the Unicode character
399 Creating and accessing Unicode strings
402 To create Unicode objects and access their basic sequence properties, use these
407    Create a new Unicode object.  *maxchar* should be the true maximum code point
411    This is the recommended way to allocate a new Unicode object.  Objects
420    Create a new Unicode object with the given *kind* (possible values are
436    Create a Unicode object from the char buffer *u*.  The bytes will be
448    Create a Unicode object from a UTF-8 encoded null-terminated char buffer
455    arguments, calculate the size of the resulting Python Unicode string and return
529    | :attr:`%U`        | PyObject\*          | A Unicode object.                |
531    | :attr:`%V`        | PyObject\*,         | A Unicode object (which may be   |
577    Copy an instance of a Unicode subtype to a new true Unicode object if
578    necessary. If *obj* is already a true Unicode object (not a subtype),
581    Objects other than Unicode or its subtypes will cause a :exc:`TypeError`.
587    Decode an encoded object *obj* to a Unicode object.
595    All other objects, including Unicode objects, cause a :exc:`TypeError` to be
604    Return the length of the Unicode object, in code points.
615    Copy characters from one Unicode object into another.  This function performs
642    :c:func:`PyUnicode_New`.  Since Unicode strings are supposed to be immutable,
645    This function checks that *unicode* is a Unicode object, that the index is
655    Unicode object and the index is not out of bounds, in contrast to
703    Create a Unicode object from the Py_UNICODE buffer *u* of the given size. *u*
709    Therefore, modification of the resulting Unicode object is only allowed when
717       Part of the old-style Unicode API; please migrate to using
724    Return a read-only pointer to the Unicode object's internal
733       Part of the old-style Unicode API; please migrate to using
749       Part of the old-style Unicode API; please migrate to using
760       Part of the old-style Unicode API; please migrate to using
810    Encode a Unicode object to UTF-8 on Android and VxWorks, or to the current
914    Encode a Unicode object to :c:data:`Py_FileSystemDefaultEncoding` with the
942    Create a Unicode object from the :c:expr:`wchar_t` buffer *w* of the given *size*.
950    Copy the Unicode object contents into the :c:expr:`wchar_t` buffer *w*.  At most
963    Convert the Unicode object to a wide character string. The output string
1020    Create a Unicode object by decoding *size* bytes of the encoded string *s*.
1030    Encode a Unicode object and return the result as a Python bytes object.
1032    name in the Unicode :meth:`~str.encode` method. The codec to be used is looked up
1045    Create a Unicode object by decoding *size* bytes of the UTF-8 encoded string
1060    Encode a Unicode object using UTF-8 and return the result as a Python bytes
1067    Return a pointer to the UTF-8 encoding of the Unicode object, and
1076    This caches the UTF-8 representation of the string in the Unicode object, and
1079    pointers to it become invalid when the Unicode object is garbage collected.
1110    corresponding Unicode object.  *errors* (if non-``NULL``) defines the error
1122    not copied into the resulting Unicode string.  If ``*byteorder`` is ``-1`` or
1160    corresponding Unicode object.  *errors* (if non-``NULL``) defines the error
1172    not copied into the resulting Unicode string.  If ``*byteorder`` is ``-1`` or
1209    Create a Unicode object by decoding *size* bytes of the UTF-7 encoded string
1222 Unicode-Escape Codecs
1225 These are the "Unicode Escape" codec APIs:
1231    Create a Unicode object by decoding *size* bytes of the Unicode-Escape encoded
1237    Encode a Unicode object using Unicode-Escape and return the result as a
1242 Raw-Unicode-Escape Codecs
1245 These are the "Raw Unicode Escape" codec APIs:
1251    Create a Unicode object by decoding *size* bytes of the Raw-Unicode-Escape
1257    Encode a Unicode object using Raw-Unicode-Escape and return the result as
1265 These are the Latin-1 codec APIs: Latin-1 corresponds to the first 256 Unicode
1271    Create a Unicode object by decoding *size* bytes of the Latin-1 encoded string
1277    Encode a Unicode object using Latin-1 and return the result as a Python bytes
1291    Create a Unicode object by decoding *size* bytes of the ASCII encoded string
1297    Encode a Unicode object using ASCII and return the result as a Python bytes
1316    Create a Unicode object by decoding *size* bytes of the encoded string *s*
1322    to Unicode strings, integers (which are then interpreted as Unicode
1331    Encode a Unicode object using the given *mapping* object and return the
1335    The *mapping* object must map Unicode ordinal integers to bytes objects,
1341 The following codec API is special in that it maps Unicode to Unicode.
1346    resulting Unicode object. Return ``NULL`` if an exception was raised by the
1349    The mapping table must map Unicode ordinal integers to Unicode ordinal integers
1370    Create a Unicode object by decoding *size* bytes of the MBCS encoded string *s*.
1385    Encode a Unicode object using MBCS and return the result as a Python bytes
1392    Encode the Unicode object using the specified code page and return a Python
1408 The following APIs are capable of handling Unicode objects and strings on input
1409 (we refer to them as strings in the descriptions) and return Unicode objects or
1417    Concatenate two strings, giving a new Unicode string.
1422    Split a string, giving a list of Unicode strings.  If *sep* is ``NULL``, splitting
1430    Split a Unicode string at line breaks, returning a list of Unicode strings.
1438    Unicode string.
1485    return the resulting Unicode object. *maxcount* == ``-1`` means replace all
1500    Compare a Unicode object, *uni*, with *string* and return ``-1``, ``0``, ``1`` for less
1510    Rich compare two Unicode strings and return one of the following:
1531    *element* has to coerce to a one-element Unicode string. ``-1`` is returned
1538    pointer variable pointing to a Python Unicode string object.  If there is an
1551    :c:func:`PyUnicode_InternInPlace`, returning either a new Unicode string