Lines Matching refs:Unicode
5 Unicode Objects and Codecs
11 Unicode Objects
14 Since the implementation of :pep:`393` in Python 3.3, Unicode objects internally
16 of Unicode characters while staying memory efficient. There are special cases
18 points must be below 1114112 (which is the full Unicode range).
21 in the Unicode object. The :c:expr:`Py_UNICODE*` representation is deprecated
24 Due to the transition between the old APIs and the new APIs, Unicode objects
27 * "canonical" Unicode objects are all objects created by a non-deprecated
28 Unicode API. They use the most efficient representation allowed by the
31 * "legacy" Unicode objects have been created through one of the deprecated
37 The "legacy" Unicode object will be removed in Python 3.12 along with the deprecated
38 APIs. All Unicode objects will be "canonical" from then on. See :pep:`623`
42 Unicode Type
45 These are the basic Unicode object types used for the Unicode implementation in
54 single Unicode characters, use :c:type:`Py_UCS4`.
66 whether you selected a "narrow" or "wide" Unicode version of Python at
74 These subtypes of :c:type:`PyObject` represent a Python Unicode object. In
76 that deal with Unicode objects take and return :c:type:`PyObject` pointers.
83 This instance of :c:type:`PyTypeObject` represents the Python Unicode type. It
88 access to internal read-only data of Unicode objects:
92 Return true if the object *o* is a Unicode object or an instance of a Unicode
98 Return true if the object *o* is a Unicode object, but not an instance of a
120 Return the length of the Unicode string, in code points. *o* has to be a
121 Unicode object in the "canonical" representation (not checked).
155 bytes per character this Unicode object uses to store its data. *o* has to
156 be a Unicode object in the "canonical" representation (not checked).
165 Return a void pointer to the raw Unicode buffer. *o* has to be a Unicode
195 Read a character from a Unicode object *o*, which must be in the "canonical"
215 Unicode object (not checked).
218 Part of the old-style Unicode API, please migrate to using
225 bytes. *o* has to be a Unicode object (not checked).
228 Part of the old-style Unicode API, please migrate to using
240 a Unicode object (not checked).
250 Part of the old-style Unicode API, please migrate to using the
264 Unicode Character Properties
267 Unicode provides many different character properties. The most often needed ones
325 Nonprintable characters are those characters defined in the Unicode character
399 Creating and accessing Unicode strings
402 To create Unicode objects and access their basic sequence properties, use these
407 Create a new Unicode object. *maxchar* should be the true maximum code point
411 This is the recommended way to allocate a new Unicode object. Objects
420 Create a new Unicode object with the given *kind* (possible values are
436 Create a Unicode object from the char buffer *u*. The bytes will be
448 Create a Unicode object from a UTF-8 encoded null-terminated char buffer
455 arguments, calculate the size of the resulting Python Unicode string and return
529 | :attr:`%U` | PyObject\* | A Unicode object. |
531 | :attr:`%V` | PyObject\*, | A Unicode object (which may be |
577 Copy an instance of a Unicode subtype to a new true Unicode object if
578 necessary. If *obj* is already a true Unicode object (not a subtype),
581 Objects other than Unicode or its subtypes will cause a :exc:`TypeError`.
587 Decode an encoded object *obj* to a Unicode object.
595 All other objects, including Unicode objects, cause a :exc:`TypeError` to be
604 Return the length of the Unicode object, in code points.
615 Copy characters from one Unicode object into another. This function performs
642 :c:func:`PyUnicode_New`. Since Unicode strings are supposed to be immutable,
645 This function checks that *unicode* is a Unicode object, that the index is
655 Unicode object and the index is not out of bounds, in contrast to
703 Create a Unicode object from the Py_UNICODE buffer *u* of the given size. *u*
709 Therefore, modification of the resulting Unicode object is only allowed when
717 Part of the old-style Unicode API, please migrate to using
724 Return a read-only pointer to the Unicode object's internal
733 Part of the old-style Unicode API, please migrate to using
749 Part of the old-style Unicode API, please migrate to using
760 Part of the old-style Unicode API, please migrate to using
810 Encode a Unicode object to UTF-8 on Android and VxWorks, or to the current
914 Encode a Unicode object to :c:data:`Py_FileSystemDefaultEncoding` with the
942 Create a Unicode object from the :c:expr:`wchar_t` buffer *w* of the given *size*.
950 Copy the Unicode object contents into the :c:expr:`wchar_t` buffer *w*. At most
963 Convert the Unicode object to a wide character string. The output string
1020 Create a Unicode object by decoding *size* bytes of the encoded string *s*.
1030 Encode a Unicode object and return the result as a Python bytes object.
1032 name in the Unicode :meth:`~str.encode` method. The codec to be used is looked up
1045 Create a Unicode object by decoding *size* bytes of the UTF-8 encoded string
1060 Encode a Unicode object using UTF-8 and return the result as a Python bytes
1067 Return a pointer to the UTF-8 encoding of the Unicode object, and
1076 This caches the UTF-8 representation of the string in the Unicode object, and
1079 pointers to it become invalid when the Unicode object is garbage collected.
1110 corresponding Unicode object. *errors* (if non-``NULL``) defines the error
1122 not copied into the resulting Unicode string. If ``*byteorder`` is ``-1`` or
1160 corresponding Unicode object. *errors* (if non-``NULL``) defines the error
1172 not copied into the resulting Unicode string. If ``*byteorder`` is ``-1`` or
1209 Create a Unicode object by decoding *size* bytes of the UTF-7 encoded string
1222 Unicode-Escape Codecs
1225 These are the "Unicode Escape" codec APIs:
1231 Create a Unicode object by decoding *size* bytes of the Unicode-Escape encoded
1237 Encode a Unicode object using Unicode-Escape and return the result as a
1242 Raw-Unicode-Escape Codecs
1245 These are the "Raw Unicode Escape" codec APIs:
1251 Create a Unicode object by decoding *size* bytes of the Raw-Unicode-Escape
1257 Encode a Unicode object using Raw-Unicode-Escape and return the result as
1265 These are the Latin-1 codec APIs: Latin-1 corresponds to the first 256 Unicode
1271 Create a Unicode object by decoding *size* bytes of the Latin-1 encoded string
1277 Encode a Unicode object using Latin-1 and return the result as a Python bytes
1291 Create a Unicode object by decoding *size* bytes of the ASCII encoded string
1297 Encode a Unicode object using ASCII and return the result as a Python bytes
1316 Create a Unicode object by decoding *size* bytes of the encoded string *s*
1322 to Unicode strings, integers (which are then interpreted as Unicode
1331 Encode a Unicode object using the given *mapping* object and return the
1335 The *mapping* object must map Unicode ordinal integers to bytes objects,
1341 The following codec API is special in that it maps Unicode to Unicode.
1346 resulting Unicode object. Return ``NULL`` if an exception was raised by the
1349 The mapping table must map Unicode ordinal integers to Unicode ordinal integers
1370 Create a Unicode object by decoding *size* bytes of the MBCS encoded string *s*.
1385 Encode a Unicode object using MBCS and return the result as a Python bytes
1392 Encode the Unicode object using the specified code page and return a Python
1408 The following APIs are capable of handling Unicode objects and strings on input
1409 (we refer to them as strings in the descriptions) and return Unicode objects or
1417 Concatenate two strings, giving a new Unicode string.
1422 Split a string, giving a list of Unicode strings. If *sep* is ``NULL``, splitting
1430 Split a Unicode string at line breaks, returning a list of Unicode strings.
1438 Unicode string.
1485 return the resulting Unicode object. *maxcount* == ``-1`` means replace all
1500 Compare a Unicode object, *uni*, with *string* and return ``-1``, ``0``, ``1`` for less
1510 Rich compare two Unicode strings and return one of the following:
1531 *element* has to coerce to a one-element Unicode string. ``-1`` is returned
1538 pointer variable pointing to a Python Unicode string object. If there is an
1551 :c:func:`PyUnicode_InternInPlace`, returning either a new Unicode string