# [`codecs`](#module-codecs "codecs: Encode and decode data and streams.") --- Codec registry and base classes
**Source code:** [Lib/codecs.py](https://github.com/python/cpython/tree/3.7/Lib/codecs.py)
- - - - - -
This module defines base classes for standard Python codecs (encoders and decoders) and provides access to the internal Python codec registry, which manages the codec and error handling lookup process. Most standard codecs are [text encodings](../glossary.xhtml#term-text-encoding), which encode text to bytes, but there are also codecs provided that encode text to text, and bytes to bytes. Custom codecs may encode and decode between arbitrary types, but some module features are restricted to use specifically with [text encodings](../glossary.xhtml#term-text-encoding), or with codecs that encode to [`bytes`](stdtypes.xhtml#bytes "bytes").
The module defines the following functions for encoding and decoding with any codec:
`codecs.``encode`(*obj*, *encoding='utf-8'*, *errors='strict'*)Encodes *obj* using the codec registered for *encoding*.
*Errors* may be given to set the desired error handling scheme. The default error handler is `'strict'` meaning that encoding errors raise [`ValueError`](exceptions.xhtml#ValueError "ValueError") (or a more codec specific subclass, such as [`UnicodeEncodeError`](exceptions.xhtml#UnicodeEncodeError "UnicodeEncodeError")). Refer to [Codec Base Classes](#codec-base-classes) for more information on codec error handling.
`codecs.``decode`(*obj*, *encoding='utf-8'*, *errors='strict'*)Decodes *obj* using the codec registered for *encoding*.
*Errors* may be given to set the desired error handling scheme. The default error handler is `'strict'` meaning that decoding errors raise [`ValueError`](exceptions.xhtml#ValueError "ValueError") (or a more codec specific subclass, such as [`UnicodeDecodeError`](exceptions.xhtml#UnicodeDecodeError "UnicodeDecodeError")). Refer to [Codec Base Classes](#codec-base-classes) for more information on codec error handling.
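As a minimal sketch of the two functions above, the following round-trips a string through UTF-8 and shows how the *errors* argument changes the outcome. The sample string and variable names are illustrative only.

```python
import codecs

# Text encoding: str -> bytes and back again.
data = codecs.encode("caf\u00e9", "utf-8")
text = codecs.decode(data, "utf-8")

# With the default 'strict' handler an unencodable character raises
# UnicodeEncodeError, which is a subclass of ValueError.
try:
    codecs.encode("caf\u00e9", "ascii")
except ValueError as exc:
    error_type = type(exc).__name__

# Choosing another handler substitutes a marker instead of raising.
lossy = codecs.encode("caf\u00e9", "ascii", errors="replace")
```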
The full details for each codec can also be looked up directly:
`codecs.``lookup`(*encoding*)Looks up the codec info in the Python codec registry and returns a [`CodecInfo`](#codecs.CodecInfo "codecs.CodecInfo") object as defined below.
Encodings are first looked up in the registry's cache. If not found, the list of registered search functions is scanned. If no [`CodecInfo`](#codecs.CodecInfo "codecs.CodecInfo") object is found, a [`LookupError`](exceptions.xhtml#LookupError "LookupError") is raised. Otherwise, the [`CodecInfo`](#codecs.CodecInfo "codecs.CodecInfo") object is stored in the cache and returned to the caller.
*class* `codecs.``CodecInfo`(*encode*, *decode*, *streamreader=None*, *streamwriter=None*, *incrementalencoder=None*, *incrementaldecoder=None*, *name=None*)Codec details when looking up the codec registry. The constructor arguments are stored in attributes of the same name:
`name`The name of the encoding.
`encode``decode`The stateless encoding and decoding functions. These must be functions or methods which have the same interface as the [`encode()`](#codecs.Codec.encode "codecs.Codec.encode") and [`decode()`](#codecs.Codec.decode "codecs.Codec.decode") methods of Codec instances (see [Codec Interface](#codec-objects)). The functions or methods are expected to work in a stateless mode.
`incrementalencoder``incrementaldecoder`Incremental encoder and decoder classes or factory functions. These have to provide the interface defined by the base classes [`IncrementalEncoder`](#codecs.IncrementalEncoder "codecs.IncrementalEncoder") and [`IncrementalDecoder`](#codecs.IncrementalDecoder "codecs.IncrementalDecoder"), respectively. Incremental codecs can maintain state.
`streamwriter``streamreader`Stream writer and reader classes or factory functions. These have to provide the interface defined by the base classes [`StreamWriter`](#codecs.StreamWriter "codecs.StreamWriter") and [`StreamReader`](#codecs.StreamReader "codecs.StreamReader"), respectively. Stream codecs can maintain state.
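The lookup behaviour and the `CodecInfo` attributes described above can be explored with a short sketch like this one. The alias `"UTF8"` and the bogus name `"no_such_codec_demo"` are chosen purely for illustration.

```python
import codecs

# An alias resolves to the same codec; name holds the canonical name.
info = codecs.lookup("UTF8")
canonical = info.name

# The stateless functions return (output, length consumed).
encoded, consumed = info.encode("h\u00e9llo")
decoded, used = info.decode(encoded)

# An unknown encoding raises LookupError.
try:
    codecs.lookup("no_such_codec_demo")
    missing = False
except LookupError:
    missing = True
```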
To simplify access to the various codec components, the module provides these additional functions which use [`lookup()`](#codecs.lookup "codecs.lookup") for the codec lookup:
`codecs.``getencoder`(*encoding*)Look up the codec for the given encoding and return its encoder function.
Raises a [`LookupError`](exceptions.xhtml#LookupError "LookupError") in case the encoding cannot be found.
`codecs.``getdecoder`(*encoding*)Look up the codec for the given encoding and return its decoder function.
Raises a [`LookupError`](exceptions.xhtml#LookupError "LookupError") in case the encoding cannot be found.
`codecs.``getincrementalencoder`(*encoding*)Look up the codec for the given encoding and return its incremental encoder class or factory function.
Raises a [`LookupError`](exceptions.xhtml#LookupError "LookupError") in case the encoding cannot be found or the codec doesn't support an incremental encoder.
`codecs.``getincrementaldecoder`(*encoding*)Look up the codec for the given encoding and return its incremental decoder class or factory function.
Raises a [`LookupError`](exceptions.xhtml#LookupError "LookupError") in case the encoding cannot be found or the codec doesn't support an incremental decoder.
`codecs.``getreader`(*encoding*)Look up the codec for the given encoding and return its [`StreamReader`](#codecs.StreamReader "codecs.StreamReader") class or factory function.
Raises a [`LookupError`](exceptions.xhtml#LookupError "LookupError") in case the encoding cannot be found.
`codecs.``getwriter`(*encoding*)Look up the codec for the given encoding and return its [`StreamWriter`](#codecs.StreamWriter "codecs.StreamWriter") class or factory function.
Raises a [`LookupError`](exceptions.xhtml#LookupError "LookupError") in case the encoding cannot be found.
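A brief sketch of these convenience functions follows. Each one returns a single component of the codec, so the caller does not have to go through the `CodecInfo` object. The in-memory streams stand in for real files.

```python
import codecs
import io

# Stateless encoder function for Latin-1.
encode = codecs.getencoder("latin-1")
encoded, length = encode("\u00f1")

# Stream reader and writer classes, instantiated around byte streams.
reader = codecs.getreader("utf-8")(io.BytesIO(b"caf\xc3\xa9"))
content = reader.read()

sink = io.BytesIO()
writer = codecs.getwriter("utf-8")(sink)
writer.write("caf\u00e9")
written = sink.getvalue()
```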
Custom codecs are made available by registering a suitable codec search function:
`codecs.``register`(*search\_function*)Register a codec search function. Search functions are expected to take one argument, being the encoding name in all lower case letters, and return a [`CodecInfo`](#codecs.CodecInfo "codecs.CodecInfo") object. In case a search function cannot find a given encoding, it should return `None`.
Note
Search function registration is not currently reversible, which may cause problems in some cases, such as unit testing or module reloading.
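The following is a minimal sketch of a search function, assuming a made-up codec called `demo_reversed` that reverses the text before encoding it as UTF-8. The codec, its helper functions and its name are hypothetical and exist only for this example.

```python
import codecs

def _demo_encode(text, errors="strict"):
    # Hypothetical encoder: reverse the text, then encode as UTF-8.
    return text[::-1].encode("utf-8", errors), len(text)

def _demo_decode(data, errors="strict"):
    # Hypothetical decoder: decode UTF-8, then undo the reversal.
    raw = bytes(data)
    return raw.decode("utf-8", errors)[::-1], len(raw)

def demo_search(name):
    # Called with the normalized (lower-case) encoding name.
    if name == "demo_reversed":
        return codecs.CodecInfo(_demo_encode, _demo_decode, name="demo_reversed")
    return None  # let other search functions try

codecs.register(demo_search)

encoded = codecs.encode("abc", "demo_reversed")
decoded = codecs.decode(encoded, "demo_reversed")
```

Returning `None` for every other name matters: the registry consults each search function in turn, and `None` signals that this one does not know the encoding.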
While the builtin [`open()`](functions.xhtml#open "open") and the associated [`io`](io.xhtml#module-io "io: Core tools for working with streams.") module are the recommended approach for working with encoded text files, this module provides additional utility functions and classes that allow the use of a wider range of codecs when working with binary files:
`codecs.``open`(*filename*, *mode='r'*, *encoding=None*, *errors='strict'*, *buffering=1*)Open an encoded file using the given *mode* and return an instance of [`StreamReaderWriter`](#codecs.StreamReaderWriter "codecs.StreamReaderWriter"), providing transparent encoding/decoding. The default file mode is `'r'`, meaning to open the file in read mode.
Note
Underlying encoded files are always opened in binary mode. No automatic conversion of `'\n'` is done on reading and writing. The *mode* argument may be any binary mode acceptable to the built-in [`open()`](functions.xhtml#open "open") function; the `'b'` is automatically added.
*encoding* specifies the encoding which is to be used for the file. Any encoding that encodes to and decodes from bytes is allowed, and the data types supported by the file methods depend on the codec used.
*errors* may be given to define the error handling. It defaults to `'strict'`, which causes a [`ValueError`](exceptions.xhtml#ValueError "ValueError") to be raised in case an encoding error occurs.
*buffering* has the same meaning as for the built-in [`open()`](functions.xhtml#open "open") function. It defaults to line buffered.
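A small sketch of `codecs.open()` in use, assuming a throwaway file in a temporary directory (the path and file name are illustrative). It also checks the point made in the note above: the file is handled in binary mode, so `'\n'` is written as-is.

```python
import codecs
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "example.txt")

# Write text; the StreamReaderWriter encodes it to UTF-8 bytes.
with codecs.open(path, "w", encoding="utf-8") as f:
    f.write("na\u00efve\n")

# Read it back; the bytes are decoded transparently.
with codecs.open(path, "r", encoding="utf-8") as f:
    text = f.read()

# Inspect the raw bytes that actually reached the file.
with open(path, "rb") as f:
    raw = f.read()
```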
`codecs.``EncodedFile`(*file*, *data\_encoding*, *file\_encoding=None*, *errors='strict'*)Return a [`StreamRecoder`](#codecs.StreamRecoder "codecs.StreamRecoder") instance, a wrapped version of *file* which provides transparent transcoding. The original file is closed when the wrapped version is closed.
Data written to the wrapped file is decoded according to the given *data\_encoding* and then written to the original file as bytes using *file\_encoding*. Bytes read from the original file are decoded according to *file\_encoding*, and the result is encoded using *data\_encoding*.
If *file\_encoding* is not given, it defaults to *data\_encoding*.
*errors* may be given to define the error handling. It defaults to `'strict'`, which causes [`ValueError`](exceptions.xhtml#ValueError "ValueError") to be raised in case an encoding error occurs.
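To make the direction of the two encodings concrete, here is a sketch in which the caller works with Latin-1 bytes while the underlying stream holds UTF-8. An `io.BytesIO` object stands in for a real file.

```python
import codecs
import io

backing = io.BytesIO()
wrapped = codecs.EncodedFile(backing, data_encoding="latin-1",
                             file_encoding="utf-8")

# Latin-1 bytes go in; UTF-8 bytes land in the underlying stream.
wrapped.write(b"caf\xe9")
stored = backing.getvalue()

# Reading converts in the opposite direction.
backing.seek(0)
read_back = wrapped.read()
```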
`codecs.``iterencode`(*iterator*, *encoding*, *errors='strict'*, *\*\*kwargs*)Uses an incremental encoder to iteratively encode the input provided by *iterator*. This function is a [generator](../glossary.xhtml#term-generator). The *errors* argument (as well as any other keyword argument) is passed through to the incremental encoder.
This function requires that the codec accept text [`str`](stdtypes.xhtml#str "str") objects to encode. Therefore it does not support bytes-to-bytes encoders such as `base64_codec`.
`codecs.``iterdecode`(*iterator*, *encoding*, *errors='strict'*, *\*\*kwargs*)Uses an incremental decoder to iteratively decode the input provided by *iterator*. This function is a [generator](../glossary.xhtml#term-generator). The *errors* argument (as well as any other keyword argument) is passed through to the incremental decoder.
This function requires that the codec accept [`bytes`](stdtypes.xhtml#bytes "bytes") objects to decode. Therefore it does not support text-to-text encoders such as `rot_13`, although `rot_13` may be used equivalently with [`iterencode()`](#codecs.iterencode "codecs.iterencode").
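As an illustration of both generators, the sketch below decodes UTF-8 data whose chunks split a multi-byte character, then encodes a sequence of strings. The chunk boundaries are arbitrary.

```python
import codecs

# The two bytes of U+00E9 arrive in different chunks.
chunks = [b"caf", b"\xc3", b"\xa9!"]
decoded = "".join(codecs.iterdecode(chunks, "utf-8"))

# iterencode() works the other way, from str pieces to bytes.
encoded = b"".join(codecs.iterencode(["caf", "\u00e9!"], "utf-8"))
```

Because an incremental decoder is used, the incomplete byte sequence at the end of the second chunk is buffered rather than reported as an error.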
The module also provides the following constants which are useful for reading and writing to platform dependent files:
`codecs.``BOM`, `codecs.``BOM_BE`, `codecs.``BOM_LE`, `codecs.``BOM_UTF8`, `codecs.``BOM_UTF16`, `codecs.``BOM_UTF16_BE`, `codecs.``BOM_UTF16_LE`, `codecs.``BOM_UTF32`, `codecs.``BOM_UTF32_BE`, `codecs.``BOM_UTF32_LE`

These constants define various byte sequences, being Unicode byte order marks (BOMs) for several encodings. They are used in UTF-16 and UTF-32 data streams to indicate the byte order used, and in UTF-8 as a Unicode signature. [`BOM_UTF16`](#codecs.BOM_UTF16 "codecs.BOM_UTF16") is either [`BOM_UTF16_BE`](#codecs.BOM_UTF16_BE "codecs.BOM_UTF16_BE") or [`BOM_UTF16_LE`](#codecs.BOM_UTF16_LE "codecs.BOM_UTF16_LE") depending on the platform's native byte order, [`BOM`](#codecs.BOM "codecs.BOM") is an alias for [`BOM_UTF16`](#codecs.BOM_UTF16 "codecs.BOM_UTF16"), [`BOM_LE`](#codecs.BOM_LE "codecs.BOM_LE") for [`BOM_UTF16_LE`](#codecs.BOM_UTF16_LE "codecs.BOM_UTF16_LE") and [`BOM_BE`](#codecs.BOM_BE "codecs.BOM_BE") for [`BOM_UTF16_BE`](#codecs.BOM_UTF16_BE "codecs.BOM_UTF16_BE"). The others represent the BOM in UTF-8 and UTF-32 encodings.
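One common use of these constants is detecting or stripping a BOM from raw bytes. The helper `strip_utf8_bom` below is a hypothetical name introduced for this sketch, not part of the module.

```python
import codecs
import sys

def strip_utf8_bom(data):
    # Hypothetical helper: drop a leading UTF-8 signature if present.
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

stripped = strip_utf8_bom(codecs.BOM_UTF8 + b"hello")
untouched = strip_utf8_bom(b"hello")

# BOM_UTF16 follows the platform's native byte order.
if sys.byteorder == "little":
    native_bom = codecs.BOM_UTF16_LE
else:
    native_bom = codecs.BOM_UTF16_BE
```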
## Codec Base Classes
The [`codecs`](#module-codecs "codecs: Encode and decode data and streams.") module defines a set of base classes which define the interfaces for working with codec objects, and can also be used as the basis for custom codec implementations.
Each codec has to define four interfaces to make it usable as codec in Python: stateless encoder, stateless decoder, stream reader and stream writer. The stream reader and writers typically reuse the stateless encoder/decoder to implement the file protocols. Codec authors also need to define how the codec will handle encoding and decoding errors.
### Error Handlers
To simplify and standardize error handling, codecs may implement different error handling schemes by accepting the *errors* string argument. The following string values are defined and implemented by all standard Python codecs:
| Value | Meaning |
| --- | --- |
| `'strict'` | Raise [`UnicodeError`](exceptions.xhtml#UnicodeError "UnicodeError") (or a subclass); this is the default. Implemented in [`strict_errors()`](#codecs.strict_errors "codecs.strict_errors"). |
| `'ignore'` | Ignore the malformed data and continue without further notice. Implemented in [`ignore_errors()`](#codecs.ignore_errors "codecs.ignore_errors"). |
The following error handlers are only applicable to [text encodings](../glossary.xhtml#term-text-encoding):
| Value | Meaning |
| --- | --- |
| `'replace'` | Replace with a suitable replacement marker; Python will use the official `U+FFFD` REPLACEMENT CHARACTER for the built-in codecs on decoding, and '?' on encoding. Implemented in [`replace_errors()`](#codecs.replace_errors "codecs.replace_errors"). |
| `'xmlcharrefreplace'` | Replace with the appropriate XML character reference (only for encoding). Implemented in [`xmlcharrefreplace_errors()`](#codecs.xmlcharrefreplace_errors "codecs.xmlcharrefreplace_errors"). |
| `'backslashreplace'` | Replace with backslashed escape sequences. Implemented in [`backslashreplace_errors()`](#codecs.backslashreplace_errors "codecs.backslashreplace_errors"). |
| `'namereplace'` | Replace with `\N{...}` escape sequences (only for encoding). Implemented in [`namereplace_errors()`](#codecs.namereplace_errors "codecs.namereplace_errors"). |
| `'surrogateescape'` | On decoding, replace each undecodable byte with an individual surrogate code ranging from `U+DC80` to `U+DCFF`. This code will then be turned back into the same byte when the `'surrogateescape'` error handler is used when encoding the data. (See [**PEP 383**](https://www.python.org/dev/peps/pep-0383) for more.) |
In addition, the following error handler is specific to the given codecs:
| Value | Codecs | Meaning |
| --- | --- | --- |
| `'surrogatepass'` | utf-8, utf-16, utf-32, utf-16-be, utf-16-le, utf-32-be, utf-32-le | Allow encoding and decoding of surrogate codes. These codecs normally treat the presence of surrogates as an error. |
New in version 3.1: The `'surrogateescape'` and `'surrogatepass'` error handlers.

Changed in version 3.4: The `'surrogatepass'` error handler now works with utf-16\* and utf-32\* codecs.

New in version 3.5: The `'namereplace'` error handler.

Changed in version 3.5: The `'backslashreplace'` error handler now works with decoding and translating.
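The handlers listed above can be compared side by side. The sketch below encodes a string containing U+03A9 (GREEK CAPITAL LETTER OMEGA) to ASCII with each encoding-side handler, then decodes an invalid UTF-8 byte with two decoding-side handlers. The sample data is illustrative.

```python
text = "\u03a9 ok"

# Encoding-side handlers.
ignored = text.encode("ascii", "ignore")
replaced = text.encode("ascii", "replace")
xml = text.encode("ascii", "xmlcharrefreplace")
backslashed = text.encode("ascii", "backslashreplace")
named = text.encode("ascii", "namereplace")

# Decoding-side handlers applied to an invalid UTF-8 byte.
bad = b"\xff ok"
decoded_replace = bad.decode("utf-8", "replace")

# 'surrogateescape' preserves the byte so it survives a round trip.
escaped = bad.decode("utf-8", "surrogateescape")
roundtrip = escaped.encode("utf-8", "surrogateescape")
```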
The set of allowed values can be extended by registering a new named error handler:
`codecs.``register_error`(*name*, *error\_handler*)Register the error handling function *error\_handler* under the name *name*. The *error\_handler* argument will be called during encoding and decoding in case of an error, when *name* is specified as the errors parameter.
For encoding, *error\_handler* will be called with a [`UnicodeEncodeError`](exceptions.xhtml#UnicodeEncodeError "UnicodeEncodeError") instance, which contains information about the location of the error. The error handler must either raise this or a different exception, or return a tuple with a replacement for the unencodable part of the input and a position where encoding should continue. The replacement may be either [`str`](stdtypes.xhtml#str "str") or [`bytes`](stdtypes.xhtml#bytes "bytes"). If the replacement is bytes, the encoder will simply copy them into the output buffer. If the replacement is a string, the encoder will encode the replacement. Encoding continues on original input at the specified position. Negative position values will be treated as being relative to the end of the input string. If the resulting position is out of bound an [`IndexError`](exceptions.xhtml#IndexError "IndexError") will be raised.
Decoding and translating work similarly, except that a [`UnicodeDecodeError`](exceptions.xhtml#UnicodeDecodeError "UnicodeDecodeError") or [`UnicodeTranslateError`](exceptions.xhtml#UnicodeTranslateError "UnicodeTranslateError") is passed to the handler, and the replacement returned by the error handler is put into the output directly.
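A minimal sketch of a custom handler follows. The handler `underscore_errors` and the name `demo_underscore` are hypothetical; the handler substitutes one underscore per offending character or byte and resumes right after the failing span.

```python
import codecs

def underscore_errors(exc):
    # Hypothetical handler: works for both encoding and decoding errors.
    if isinstance(exc, (UnicodeEncodeError, UnicodeDecodeError)):
        replacement = "_" * (exc.end - exc.start)
        return replacement, exc.end  # (replacement, resume position)
    raise exc

codecs.register_error("demo_underscore", underscore_errors)

encoded = "na\u00efve caf\u00e9".encode("ascii", "demo_underscore")
decoded = b"ab\xffcd".decode("ascii", "demo_underscore")
```

When encoding, the returned `str` replacement is itself encoded by the codec, so it has to be encodable in the target encoding; when decoding, it goes into the output unchanged.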
Previously registered error handlers (including the standard error handlers) can be looked up by name:
`codecs.``lookup_error`(*name*)Return the error handler previously registered under the name *name*.
Raises a [`LookupError`](exceptions.xhtml#LookupError "LookupError") in case the handler cannot be found.
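For illustration, a looked-up handler is an ordinary callable and can be invoked directly with an exception instance. The exception below is constructed by hand and the name `no_such_handler_demo` is deliberately bogus.

```python
import codecs

handler = codecs.lookup_error("replace")

# Build an encoding error covering the single character at index 0.
exc = UnicodeEncodeError("ascii", "\u00e9", 0, 1, "demo error")
replacement, resume_at = handler(exc)

# Unknown handler names raise LookupError.
try:
    codecs.lookup_error("no_such_handler_demo")
    found = True
except LookupError:
    found = False
```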
The following standard error handlers are also made available as module level functions:
`codecs.``strict_errors`(*exception*)Implements the `'strict'` error handling: each encoding or decoding error raises a [`UnicodeError`](exceptions.xhtml#UnicodeError "UnicodeError").
`codecs.``replace_errors`(*exception*)Implements the `'replace'` error handling (for [text encodings](../glossary.xhtml#term-text-encoding) only): substitutes `'?'` for encoding errors (to be encoded by the codec), and `'\ufffd'` (the Unicode replacement character) for decoding errors.
`codecs.``ignore_errors`(*exception*)Implements the `'ignore'` error handling: malformed data is ignored and encoding or decoding is continued without further notice.
`codecs.``xmlcharrefreplace_errors`(*exception*)Implements the `'xmlcharrefreplace'` error handling (for encoding with [text encodings](../glossary.xhtml#term-text-encoding) only): the unencodable character is replaced by an appropriate XML character reference.
`codecs.``backslashreplace_errors`(*exception*)Implements the `'backslashreplace'` error handling (for [text encodings](../glossary.xhtml#term-text-encoding) only): malformed data is replaced by a backslashed escape sequence.
`codecs.``namereplace_errors`(*exception*)Implements the `'namereplace'` error handling (for encoding with [text encodings](../glossary.xhtml#term-text-encoding) only): the unencodable character is replaced by a `\N{...}` escape sequence.
New in version 3.5.
### Stateless Encoding and Decoding
The base `Codec` class defines these methods which also define the function interfaces of the stateless encoder and decoder:
`Codec.``encode`(*input*\[, *errors*\])Encodes the object *input* and returns a tuple (output object, length consumed). For instance, [text encoding](../glossary.xhtml#term-text-encoding) converts a string object to a bytes object using a particular character set encoding (e.g., `cp1252` or `iso-8859-1`).
The *errors* argument defines the error handling to apply. It defaults to `'strict'` handling.
The method may not store state in the `Codec` instance. Use [`StreamWriter`](#codecs.StreamWriter "codecs.StreamWriter") for codecs which have to keep state in order to make encoding efficient.
The encoder must be able to handle zero length input and return an empty object of the output object type in this situation.
`Codec.``decode`(*input*\[, *errors*\])Decodes the object *input* and returns a tuple (output object, length consumed). For instance, for a [text encoding](../glossary.xhtml#term-text-encoding), decoding converts a bytes object encoded using a particular character set encoding to a string object.
For text encodings and bytes-to-bytes codecs, *input* must be a bytes object or one which provides the read-only buffer interface -- for example, buffer objects and memory mapped files.
The *errors* argument defines the error handling to apply. It defaults to `'strict'` handling.
The method may not store state in the `Codec` instance. Use [`StreamReader`](#codecs.StreamReader "codecs.StreamReader") for codecs which have to keep state in order to make decoding efficient.
The decoder must be able to handle zero length input and return an empty object of the output object type in this situation.
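The tuple return value and the zero-length requirement can be seen with the stateless functions of a standard codec. This sketch uses `cp1252`, in which the euro sign is the single byte `0x80`.

```python
import codecs

encode = codecs.getencoder("cp1252")
decode = codecs.getdecoder("cp1252")

# Both functions return (output object, length consumed).
encoded, chars_consumed = encode("\u20ac5")
decoded, bytes_consumed = decode(encoded)

# Zero-length input yields an empty object of the output type.
empty_bytes, _ = encode("")
empty_text, _ = decode(b"")
```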
### Incremental Encoding and Decoding
The [`IncrementalEncoder`](#codecs.IncrementalEncoder "codecs.IncrementalEncoder") and [`IncrementalDecoder`](#codecs.IncrementalDecoder "codecs.IncrementalDecoder") classes provide the basic interface for incremental encoding and decoding. Encoding/decoding the input isn't done with one call to the stateless encoder/decoder function, but with multiple calls to the [`encode()`](#codecs.IncrementalEncoder.encode "codecs.IncrementalEncoder.encode")/[`decode()`](#codecs.IncrementalDecoder.decode "codecs.IncrementalDecoder.decode") method of the incremental encoder/decoder. The incremental encoder/decoder keeps track of the encoding/decoding process during method calls.
The joined output of calls to the [`encode()`](#codecs.IncrementalEncoder.encode "codecs.IncrementalEncoder.encode")/[`decode()`](#codecs.IncrementalDecoder.decode "codecs.IncrementalDecoder.decode") method is the same as if all the single inputs were joined into one, and this input was encoded/decoded with the stateless encoder/decoder.
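The equivalence stated above can be checked with a sketch that feeds UTF-8 data to an incremental decoder two bytes at a time, which deliberately splits the three-byte characters across calls.

```python
import codecs

data = "\u65e5\u672c\u8a9e".encode("utf-8")  # three 3-byte characters
pieces = [data[i:i + 2] for i in range(0, len(data), 2)]

decoder = codecs.getincrementaldecoder("utf-8")()
parts = [decoder.decode(piece) for piece in pieces]
parts.append(decoder.decode(b"", final=True))  # flush at the end

incremental = "".join(parts)
stateless = data.decode("utf-8")
```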
#### IncrementalEncoder Objects
The [`IncrementalEncoder`](#codecs.IncrementalEncoder "codecs.IncrementalEncoder") class is used for encoding an input in multiple steps. It defines the following methods which every incremental encoder must define in order to be compatible with the Python codec registry.
*class* `codecs.``IncrementalEncoder`(*errors='strict'*)Constructor for an [`IncrementalEncoder`](#codecs.IncrementalEncoder "codecs.IncrementalEncoder") instance.
All incremental encoders must provide this constructor interface. They are free to add additional keyword arguments, but only the ones defined here are used by the Python codec registry.
The [`IncrementalEncoder`](#codecs.IncrementalEncoder "codecs.IncrementalEncoder") may implement different error handling schemes by providing the *errors* keyword argument. See [Error Handlers](#error-handlers) for possible values.
The *errors* argument will be assigned to an attribute of the same name. Assigning to this attribute makes it possible to switch between different error handling strategies during the lifetime of the [`IncrementalEncoder`](#codecs.IncrementalEncoder "codecs.IncrementalEncoder") object.
`encode`(*object*\[, *final*\])Encodes *object* (taking the current state of the encoder into account) and returns the resulting encoded object. If this is the last call to [`encode()`](#codecs.encode "codecs.encode") *final* must be true (the default is false).
`reset`()Reset the encoder to the initial state. The output is discarded: call `.encode(object, final=True)`, passing an empty byte or text string if necessary, to reset the encoder and to get the output.
`getstate`()Return the current state of the encoder which must be an integer. The implementation should make sure that `0` is the most common state. (States that are more complicated than integers can be converted into an integer by marshaling/pickling the state and encoding the bytes of the resulting string into an integer).
`setstate`(*state*)Set the state of the encoder to *state*. *state* must be an encoder state returned by [`getstate()`](#codecs.IncrementalEncoder.getstate "codecs.IncrementalEncoder.getstate").
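As an example of encoder state, the standard `utf-16` incremental encoder emits a BOM only at the start of the output. The sketch below shows that `reset()` returns it to that initial state; the exact integer values returned by `getstate()` are an implementation detail and are not relied on here.

```python
import codecs

encoder = codecs.getincrementalencoder("utf-16")()

first = encoder.encode("a")    # starts with the BOM
second = encoder.encode("b")   # no BOM: the encoder remembers it wrote one
state = encoder.getstate()     # an integer, per the interface above

encoder.reset()
after_reset = encoder.encode("a")  # BOM is emitted again
```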
#### IncrementalDecoder Objects
The [`IncrementalDecoder`](#codecs.IncrementalDecoder "codecs.IncrementalDecoder") class is used for decoding an input in multiple steps. It defines the following methods which every incremental decoder must define in order to be compatible with the Python codec registry.
*class* `codecs.``IncrementalDecoder`(*errors='strict'*)Constructor for an [`IncrementalDecoder`](#codecs.IncrementalDecoder "codecs.IncrementalDecoder") instance.
All incremental decoders must provide this constructor interface. They are free to add additional keyword arguments, but only the ones defined here are used by the Python codec registry.
The [`IncrementalDecoder`](#codecs.IncrementalDecoder "codecs.IncrementalDecoder") may implement different error handling schemes by providing the *errors* keyword argument. See [Error Handlers](#error-handlers) for possible values.
The *errors* argument will be assigned to an attribute of the same name. Assigning to this attribute makes it possible to switch between different error handling strategies during the lifetime of the [`IncrementalDecoder`](#codecs.IncrementalDecoder "codecs.IncrementalDecoder") object.
`decode`(*object*\[, *final*\])Decodes *object* (taking the current state of the decoder into account) and returns the resulting decoded object. If this is the last call to [`decode()`](#codecs.decode "codecs.decode") *final* must be true (the default is false). If *final* is true the decoder must decode the input completely and must flush all buffers. If this isn't possible (e.g. because of incomplete byte sequences at the end of the input) it must initiate error handling just like in the stateless case (which might raise an exception).
`reset`()Reset the decoder to the initial state.
`getstate`()Return the current state of the decoder. This must be a tuple with two items, the first must be the buffer containing the still undecoded input. The second must be an integer and can be additional state info. (The implementation should make sure that `0` is the most common additional state info.) If this additional state info is `0` it must be possible to set the decoder to the state which has no input buffered and `0` as the additional state info, so that feeding the previously buffered input to the decoder returns it to the previous state without producing any output. (Additional state info that is more complicated than integers can be converted into an integer by marshaling/pickling the info and encoding the bytes of the resulting string into an integer.)
`setstate`(*state*)Set the state of the decoder to *state*. *state* must be a decoder state returned by [`getstate()`](#codecs.IncrementalDecoder.getstate "codecs.IncrementalDecoder.getstate").
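The sketch below shows the state tuple in practice with the standard UTF-8 incremental decoder: an incomplete byte sequence is held in the buffer, can be transferred to another decoder with `setstate()`, and triggers error handling if *final* is true.

```python
import codecs

make_decoder = codecs.getincrementaldecoder("utf-8")

decoder = make_decoder()
# First two of the three bytes encoding U+20AC (euro sign).
partial = decoder.decode(b"\xe2\x82")
buffered, extra = decoder.getstate()

# Transfer the pending input to a fresh decoder and finish there.
other = make_decoder()
other.setstate((buffered, extra))
completed = other.decode(b"\xac", final=True)

# With final=True, incomplete input is an error under 'strict'.
try:
    make_decoder().decode(b"\xe2\x82", final=True)
    raised = False
except UnicodeDecodeError:
    raised = True
```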
### Stream Encoding and Decoding
The [`StreamWriter`](#codecs.StreamWriter "codecs.StreamWriter") and [`StreamReader`](#codecs.StreamReader "codecs.StreamReader") classes provide generic working interfaces which can be used to implement new encoding submodules very easily. See `encodings.utf_8` for an example of how this is done.
#### StreamWriter Objects
The [`StreamWriter`](#codecs.StreamWriter "codecs.StreamWriter") class is a subclass of `Codec` and defines the following methods which every stream writer must define in order to be compatible with the Python codec registry.
*class* `codecs.``StreamWriter`(*stream*, *errors='strict'*)Constructor for a [`StreamWriter`](#codecs.StreamWriter "codecs.StreamWriter") instance.
All stream writers must provide this constructor interface. They are free to add additional keyword arguments, but only the ones defined here are used by the Python codec registry.
The *stream* argument must be a file-like object open for writing text or binary data, as appropriate for the specific codec.
The [`StreamWriter`](#codecs.StreamWriter "codecs.StreamWriter") may implement different error handling schemes by providing the *errors* keyword argument. See [Error Handlers](#error-handlers) for the standard error handlers the underlying stream codec may support.
The *errors* argument will be assigned to an attribute of the same name. Assigning to this attribute makes it possible to switch between different error handling strategies during the lifetime of the [`StreamWriter`](#codecs.StreamWriter "codecs.StreamWriter") object.
`write`(*object*)Writes the object's contents encoded to the stream.
`writelines`(*list*)Writes the concatenated list of strings to the stream (possibly by reusing the [`write()`](#codecs.StreamWriter.write "codecs.StreamWriter.write") method). The standard bytes-to-bytes codecs do not support this method.
`reset`()Flushes and resets the codec buffers used for keeping state.
Calling this method should ensure that the data on the output is put into a clean state that allows appending of new fresh data without having to rescan the whole stream to recover state.
In addition to the above methods, the [`StreamWriter`](#codecs.StreamWriter "codecs.StreamWriter") must also inherit all other methods and attributes from the underlying stream.
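A short sketch of a stream writer wrapped around an in-memory byte stream, including a switch of error handling strategy through the `errors` attribute. The ASCII codec is used so that an unencodable character is easy to produce.

```python
import codecs
import io

raw = io.BytesIO()
writer = codecs.getwriter("ascii")(raw, errors="strict")

writer.write("abc\n")
writer.writelines(["d", "e"])

# Under 'strict', a non-ASCII character raises.
try:
    writer.write("\u00e9")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# Switching the attribute changes behaviour for later writes.
writer.errors = "replace"
writer.write("\u00e9")

result = raw.getvalue()
```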
#### StreamReader Objects
The [`StreamReader`](#codecs.StreamReader "codecs.StreamReader") class is a subclass of `Codec` and defines the following methods which every stream reader must define in order to be compatible with the Python codec registry.
*class* `codecs.``StreamReader`(*stream*, *errors='strict'*)Constructor for a [`StreamReader`](#codecs.StreamReader "codecs.StreamReader") instance.
All stream readers must provide this constructor interface. They are free to add additional keyword arguments, but only the ones defined here are used by the Python codec registry.
The *stream* argument must be a file-like object open for reading text or binary data, as appropriate for the specific codec.
The [`StreamReader`](#codecs.StreamReader "codecs.StreamReader") may implement different error handling schemes by providing the *errors* keyword argument. See [Error Handlers](#error-handlers) for the standard error handlers the underlying stream codec may support.
The *errors* argument will be assigned to an attribute of the same name. Assigning to this attribute makes it possible to switch between different error handling strategies during the lifetime of the [`StreamReader`](#codecs.StreamReader "codecs.StreamReader") object.
The set of allowed values for the *errors* argument can be extended with [`register_error()`](#codecs.register_error "codecs.register_error").
`read`(\[*size*\[, *chars*\[, *firstline*\]\]\])Decodes data from the stream and returns the resulting object.
The *chars* argument indicates the number of decoded code points or bytes to return. The [`read()`](#codecs.StreamReader.read "codecs.StreamReader.read") method will never return more data than requested, but it might return less, if there is not enough available.
The *size* argument indicates the approximate maximum number of encoded bytes or code points to read for decoding. The decoder can modify this setting as appropriate. The default value -1 indicates to read and decode as much as possible. This parameter is intended to prevent having to decode huge files in one step.
The *firstline* flag indicates that it would be sufficient to only return the first line, if there are decoding errors on later lines.
The method should use a greedy read strategy meaning that it should read as much data as is allowed within the definition of the encoding and the given size, e.g. if optional encoding endings or state markers are available on the stream, these should be read too.
`readline`(\[*size*\[, *keepends*\]\])Read one line from the input stream and return the decoded data.
*size*, if given, is passed as size argument to the stream's [`read()`](#codecs.StreamReader.read "codecs.StreamReader.read") method.
If *keepends* is false line-endings will be stripped from the lines returned.
`readlines`(\[*sizehint*\[, *keepends*\]\])Read all lines available on the input stream and return them as a list of lines.
Line-endings are implemented using the codec's decoder method and are included in the list entries if *keepends* is true.
*sizehint*, if given, is passed as the *size* argument to the stream's [`read()`](#codecs.StreamReader.read "codecs.StreamReader.read") method.
`reset`()Resets the codec buffers used for keeping state.
Note that no stream repositioning should take place. This method is primarily intended to be able to recover from decoding errors.
In addition to the above methods, the [`StreamReader`](#codecs.StreamReader "codecs.StreamReader") must also inherit all other methods and attributes from the underlying stream.
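The reading methods can be tried out on an in-memory stream, as in this sketch. The three-line payload is illustrative.

```python
import codecs
import io

payload = "first\nsecond\nthird\n".encode("utf-8")
reader = codecs.getreader("utf-8")(io.BytesIO(payload))

first = reader.readline()                 # line ending kept by default
second = reader.readline(keepends=False)  # line ending stripped
rest = reader.readlines()                 # remaining lines as a list
```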
#### StreamReaderWriter Objects
The [`StreamReaderWriter`](#codecs.StreamReaderWriter "codecs.StreamReaderWriter") is a convenience class that allows wrapping streams which work in both read and write modes.
The design is such that one can use the factory functions returned by the [`lookup()`](#codecs.lookup "codecs.lookup") function to construct the instance.
*class* `codecs.``StreamReaderWriter`(*stream*, *Reader*, *Writer*, *errors='strict'*)Creates a [`StreamReaderWriter`](#codecs.StreamReaderWriter "codecs.StreamReaderWriter") instance. *stream* must be a file-like object. *Reader* and *Writer* must be factory functions or classes providing the [`StreamReader`](#codecs.StreamReader "codecs.StreamReader") and [`StreamWriter`](#codecs.StreamWriter "codecs.StreamWriter") interface respectively. Error handling is done in the same way as defined for the stream readers and writers.
[`StreamReaderWriter`](#codecs.StreamReaderWriter "codecs.StreamReaderWriter") instances define the combined interfaces of [`StreamReader`](#codecs.StreamReader "codecs.StreamReader") and [`StreamWriter`](#codecs.StreamWriter "codecs.StreamWriter") classes. They inherit all other methods and attributes from the underlying stream.
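A short sketch of building a `StreamReaderWriter` from the factories that `lookup()` returns. The in-memory `io.BytesIO` backing stream and the sample text are assumptions made for illustration.

```python
import codecs
import io

info = codecs.lookup("utf-8")  # CodecInfo carrying streamreader/streamwriter
backing = io.BytesIO()         # illustrative read/write byte stream

stream = codecs.StreamReaderWriter(backing, info.streamreader, info.streamwriter)

stream.write("na\u00efve\n")   # text is encoded to UTF-8 on the way in
stream.seek(0)                 # rewind; this also resets the codec state
text = stream.read()           # bytes are decoded back to str on the way out
encoded = backing.getvalue()   # the raw bytes actually stored
```

For files on disk, `codecs.open()` returns an object of this class, so constructing it by hand is mostly useful for streams that are already open.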
#### StreamRecoder Objects
The [`StreamRecoder`](#codecs.StreamRecoder "codecs.StreamRecoder") translates data from one encoding to another, which is sometimes useful when dealing with different encoding environments.
The design is such that one can use the factory functions returned by the [`lookup()`](#codecs.lookup "codecs.lookup") function to construct the instance.
*class* `codecs.``StreamRecoder`(*stream*, *encode*, *decode*, *Reader*, *Writer*, *errors='strict'*)Creates a [`StreamRecoder`](#codecs.StreamRecoder "codecs.StreamRecoder") instance which implements a two-way conversion: *encode* and *decode* work on the frontend (the data visible to code calling `read()` and `write()`), while *Reader* and *Writer* work on the backend (the data in *stream*).
You can use these objects to do transparent transcodings from e.g. Latin-1 to UTF-8 and back.
The *stream* argument must be a file-like object.
The *encode* and *decode* arguments must adhere to the `Codec` interface. *Reader* and *Writer* must be factory functions or classes providing objects of the [`StreamReader`](#codecs.StreamReader "codecs.StreamReader") and [`StreamWriter`](#codecs.StreamWriter "codecs.StreamWriter") interface respectively.
Error handling is done in the same way as defined for the stream readers and writers.
[`StreamRecoder`](#codecs.StreamRecoder "codecs.StreamRecoder") instances define the combined interfaces of [`StreamReader`](#codecs.StreamReader "codecs.StreamReader") and [`StreamWriter`](#codecs.StreamWriter "codecs.StreamWriter") classes. They inherit all other methods and attributes from the underlying stream.
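The frontend/backend split can be sketched as follows, assuming a backend stream that holds Latin-1 bytes and callers that want to see UTF-8. The sample data and variable names are illustrative.

```python
import codecs
import io

utf8 = codecs.lookup("utf-8")      # frontend: what read()/write() callers see
latin1 = codecs.lookup("latin-1")  # backend: how data is stored in the stream

backing = io.BytesIO("caf\u00e9".encode("latin-1"))  # b"caf\xe9"

recoder = codecs.StreamRecoder(
    backing,
    utf8.encode, utf8.decode,                  # frontend codec functions
    latin1.streamreader, latin1.streamwriter,  # backend stream factories
)

# Reading decodes the Latin-1 bytes and re-encodes them as UTF-8.
as_utf8 = recoder.read()

# Writing takes UTF-8 bytes and stores them as Latin-1 in the backend.
recoder.write(" cr\u00e8me".encode("utf-8"))
stored = backing.getvalue()
```

Note that both `read()` and `write()` deal in bytes here: the recoder converts between two byte encodings and never hands a `str` to the caller.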
## Encodings and Unicode
Strings are stored internally as sequences of code points in range `0x0`--`0x10FFFF`. (See [**PEP 393**](https://www.python.org/dev/peps/pep-0393) \[https://www.python.org/dev/peps/pep-0393\] for more details about the implementation.) Once a string object is used outside of CPU and memory, endianness and how these arrays are stored as bytes become an issue. As with other codecs, serialising a string into a sequence of bytes is known as *encoding*, and recreating the string from the sequence of bytes is known as *decoding*. There are a variety of different text serialisation codecs, which are collectively referred to as [text encodings](../glossary.xhtml#term-text-encoding).
The simplest text encoding (called `'latin-1'` or `'iso-8859-1'`) maps the code points 0--255 to the bytes `0x0`--`0xff`, which means that a string object that contains code points above `U+00FF` can't be encoded with this codec. Doing so will raise a [`UnicodeEncodeError`](exceptions.xhtml#UnicodeEncodeError "UnicodeEncodeError") that looks like the following (although the details of the error message may differ):
```
UnicodeEncodeError: 'latin-1' codec can't encode character '\u1234' in
position 3: ordinal not in range(256)
```
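The failure described above is easy to reproduce. This sketch uses an arbitrary sample string, and additionally shows the standard `'replace'` error handler as one way to avoid the exception.

```python
text = "abc\u1234"  # U+1234 lies outside the 0-255 range Latin-1 covers

try:
    text.encode("latin-1")
    failed_span = None
except UnicodeEncodeError as exc:
    # start/end give the slice of the input that could not be encoded.
    failed_span = (exc.start, exc.end)

# Code points up to U+00FF map one-to-one onto single bytes.
ok = "abc\u00ff".encode("latin-1")

# With errors="replace" the unencodable character becomes "?".
replaced = text.encode("latin-1", errors="replace")
```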
There's another group of encodings (the so called charmap encodings) that choose a different subset of all Unicode code points and how these code points are mapped to the bytes `0x0`--`0xff`. To see how this is done simply open e.g. `encodings/cp1252.py` (which is an encoding that is used primarily on Windows). There's a string constant with 256 characters that shows you which character is mapped to which byte value.
All of these encodings can only encode 256 of the 1114112 code points defined in Unicode. A simple and straightforward way to store every Unicode code point is to store each one as four consecutive bytes. There are two possibilities: store the bytes in big endian or in little endian order. These two encodings are called `UTF-32-BE` and `UTF-32-LE` respectively. Their disadvantage is that if, for example, you use `UTF-32-BE` on a little endian machine you will always have to swap bytes on encoding and decoding. `UTF-32` avoids this problem: bytes will always be in natural endianness. When these bytes are read by a CPU with a different endianness, however, the bytes have to be swapped.
To be able to detect the endianness of a `UTF-16` or `UTF-32` byte sequence, there's the so-called BOM ("Byte Order Mark"). This is the Unicode character `U+FEFF`. This character can be prepended to every `UTF-16` or `UTF-32` byte sequence. The byte-swapped version of this character (`0xFFFE`) is an illegal character that may not appear in a Unicode text. So when the first character in a `UTF-16` or `UTF-32` byte sequence appears to be a `U+FFFE`, the bytes have to be swapped on decoding.
Unfortunately the character `U+FEFF` had a second purpose as a `ZERO WIDTH NO-BREAK SPACE`: a character that has no width and doesn't allow a word to be split. It can e.g. be used to give hints to a ligature algorithm. With Unicode 4.0 using `U+FEFF` as a `ZERO WIDTH NO-BREAK SPACE` has been deprecated (with `U+2060` (`WORD JOINER`) assuming this role). Nevertheless Unicode software still must be able to handle `U+FEFF` in both roles: as a BOM it's a device to determine the storage layout of the encoded bytes, and vanishes once the byte sequence has been decoded into a string; as a `ZERO WIDTH NO-BREAK SPACE` it's a normal character that will be decoded like any other.
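The two roles of `U+FEFF` can be observed directly. The sketch below uses UTF-16 and a two-letter sample string; the `codecs.BOM_UTF16*` constants are the module's own, everything else is illustrative.

```python
import codecs

text = "hi"

be = text.encode("utf-16-be")   # b"\x00h\x00i", no BOM written
le = text.encode("utf-16-le")   # b"h\x00i\x00", no BOM written
native = text.encode("utf-16")  # BOM first, then the platform's byte order

# The generic decoder reads the BOM, picks the byte order and drops the BOM.
from_be = (codecs.BOM_UTF16_BE + be).decode("utf-16")
from_le = (codecs.BOM_UTF16_LE + le).decode("utf-16")

# With an explicit byte order the BOM is not consumed: it is decoded as
# an ordinary U+FEFF character at the start of the string.
kept = (codecs.BOM_UTF16_BE + be).decode("utf-16-be")
```

The same pattern applies to `utf-32`, `utf-32-be` and `utf-32-le`, with four bytes per code point instead of two.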
There's another encoding that is able to encode the full range of Unicode characters: UTF-8. UTF-8 is an 8-bit encoding, which means there are no issues with byte order in UTF-8. Each byte in a UTF-8 byte sequence consists of two parts: marker bits (the most significant bits) and payload bits. The marker bits are a sequence of zero to four `1` bits followed by a `0` bit. Unicode characters are encoded like this (with x being payload bits, which when concatenated give the Unicode character):

| Range | Encoding |
| --- | --- |
| `U-00000000` ... `U-0000007F` | 0xxxxxxx |
| `U-00000080` ... `U-000007FF` | 110xxxxx 10xxxxxx |
| `U-00000800` ... `U-0000FFFF` | 1110xxxx 10xxxxxx 10xxxxxx |
| `U-00010000` ... `U-0010FFFF` | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx |

The least significant bit of the Unicode character is the rightmost x bit.
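To make the bit layout concrete, here is a hand-written encoder for a single code point that follows the four patterns above. The function name `encode_code_point` is hypothetical and exists only for illustration; real code should simply call `str.encode("utf-8")`.

```python
def encode_code_point(cp: int) -> bytes:
    """Encode one code point to UTF-8 by hand (illustrative helper)."""
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not an encodable Unicode code point")
    if cp <= 0x7F:       # 0xxxxxxx
        return bytes([cp])
    if cp <= 0x7FF:      # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp <= 0xFFFF:     # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


# Compare against the built-in codec for one code point from each range.
samples = [0x41, 0xE9, 0x20AC, 0x1F600]
matches = [encode_code_point(cp) == chr(cp).encode("utf-8") for cp in samples]
```

Each continuation byte carries six payload bits, which is why the shifts step down by six.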
As UTF-8 is an 8-bit encoding no BOM is required and any `U+FEFF` character in the decoded string (even if it's the first character) is treated as a `ZERO WIDTH NO-BREAK SPACE`.
Without external information it's impossible to reliably determine which encoding was used for encoding a string. Each charmap encoding can decode any random byte sequence. However that's not possible with UTF-8, as UTF-8 byte sequences have a structure that doesn't allow arbitrary byte sequences. To increase the reliability with which a UTF-8 encoding can be detected, Microsoft invented a variant of UTF-8 (that Python 2.5 calls `"utf-8-sig"`) for its Notepad program: Before any of the Unicode characters is written to the file, a UTF-8 encoded BOM (which looks like this as a byte sequence: `0xef`, `0xbb`, `0xbf`) is written. As it's rather improbable that any charmap encoded file starts with these byte values (which would e.g. map to LATIN SMALL LETTER I WITH DIAERESIS, RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK and INVERTED QUESTION MARK in iso-8859-1), this increases the probability that a `utf-8-sig` encoding can be correctly guessed from the byte sequence. So here the BOM is not used to be able to determine the byte order used for generating the byte sequence, but as a signature that helps in guessing the encoding. On encoding the `utf-8-sig` codec will write `0xef`, `0xbb`, `0xbf` as the first three bytes to the file. On decoding `utf-8-sig` will skip those three bytes if they appear as the first three bytes in the file. In UTF-8, the use of the BOM is discouraged and should generally be avoided.
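The difference between the two codecs is visible in a few lines. This sketch uses an arbitrary sample string; `codecs.BOM_UTF8` is the module's constant for the three signature bytes.

```python
import codecs

plain = "hello".encode("utf-8")       # no signature
signed = "hello".encode("utf-8-sig")  # signature bytes come first

# utf-8-sig strips the signature when present and tolerates its absence.
from_signed = signed.decode("utf-8-sig")
from_plain = plain.decode("utf-8-sig")

# Plain utf-8 does not strip it: the BOM survives as a leading U+FEFF.
leaked = signed.decode("utf-8")
```

Decoding input of unknown origin with `utf-8-sig` is therefore a common way to avoid a stray `U+FEFF` at the start of the text.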
## Standard Encodings
Python comes with a number of codecs built-in, either implemented as C functions or with dictionaries as mapping tables. The following table lists the codecs by name, together with a few common aliases, and the languages for which the encoding is likely used. Neither the list of aliases nor the list of languages is meant to be exhaustive. Notice that spelling alternatives that only differ in case or use a hyphen instead of an underscore are also valid aliases; therefore, e.g. `'utf-8'` is a valid alias for the `'utf_8'` codec.
**CPython implementation detail:** Some common encodings can bypass the codecs lookup machinery to improve performance. These optimization opportunities are only recognized by CPython for a limited set of (case insensitive) aliases: utf-8, utf8, latin-1, latin1, iso-8859-1, iso8859-1, mbcs (Windows only), ascii, us-ascii, utf-16, utf16, utf-32, utf32, and the same using underscores instead of dashes. Using alternative aliases for these encodings may result in slower execution.
Changed in version 3.6: Optimization opportunity recognized for us-ascii.
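The alias rules can be checked with `codecs.lookup()`, which returns the same codec for every accepted spelling. The list of spellings below is an illustrative selection.

```python
import codecs

# Case differences, hyphen/underscore variants and listed aliases
# all resolve to the same registered codec.
spellings = ["utf-8", "UTF-8", "utf_8", "U8", "utf8"]
names = {codecs.lookup(s).name for s in spellings}

# The same holds for the Latin-1 family of aliases.
same_latin = codecs.lookup("latin-1").name == codecs.lookup("iso-8859-1").name
```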
Many of the character sets support the same languages. They vary in individual characters (e.g. whether the EURO SIGN is supported or not), and in the assignment of characters to code positions. For the European languages in particular, the following variants typically exist:
- an ISO 8859 codeset
- a Microsoft Windows code page, which is typically derived from an 8859 codeset, but replaces control characters with additional graphic characters
- an IBM EBCDIC code page
- an IBM PC code page, which is ASCII compatible
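
| Codec | Aliases | Languages |
| --- | --- | --- |
| ascii | 646, us-ascii | English |
| big5 | big5-tw, csbig5 | Traditional Chinese |
| big5hkscs | big5-hkscs, hkscs | Traditional Chinese |
| cp037 | IBM037, IBM039 | English |
| cp273 | 273, IBM273, csIBM273 | German. New in version 3.4. |
| cp424 | EBCDIC-CP-HE, IBM424 | Hebrew |
| cp437 | 437, IBM437 | English |
| cp500 | EBCDIC-CP-BE, EBCDIC-CP-CH, IBM500 | Western Europe |
| cp720 | | Arabic |
| cp737 | | Greek |
| cp775 | IBM775 | Baltic languages |
| cp850 | 850, IBM850 | Western Europe |
| cp852 | 852, IBM852 | Central and Eastern Europe |
| cp855 | 855, IBM855 | Bulgarian, Byelorussian, Macedonian, Russian, Serbian |
| cp856 | | Hebrew |
| cp857 | 857, IBM857 | Turkish |
| cp858 | 858, IBM858 | Western Europe |
| cp860 | 860, IBM860 | Portuguese |
| cp861 | 861, CP-IS, IBM861 | Icelandic |
| cp862 | 862, IBM862 | Hebrew |
| cp863 | 863, IBM863 | Canadian |
| cp864 | IBM864 | Arabic |
| cp865 | 865, IBM865 | Danish, Norwegian |
| cp866 | 866, IBM866 | Russian |
| cp869 | 869, CP-GR, IBM869 | Greek |
| cp874 | | Thai |
| cp875 | | Greek |
| cp932 | 932, ms932, mskanji, ms-kanji | Japanese |
| cp949 | 949, ms949, uhc | Korean |
| cp950 | 950, ms950 | Traditional Chinese |
| cp1006 | | Urdu |
| cp1026 | ibm1026 | Turkish |
| cp1125 | 1125, ibm1125, cp866u, ruscii | Ukrainian. New in version 3.4. |
| cp1140 | ibm1140 | Western Europe |
| cp1250 | windows-1250 | Central and Eastern Europe |
| cp1251 | windows-1251 | Bulgarian, Byelorussian, Macedonian, Russian, Serbian |
| cp1252 | windows-1252 | Western Europe |
| cp1253 | windows-1253 | Greek |
| cp1254 | windows-1254 | Turkish |
| cp1255 | windows-1255 | Hebrew |
| cp1256 | windows-1256 | Arabic |
| cp1257 | windows-1257 | Baltic languages |
| cp1258 | windows-1258 | Vietnamese |
| cp65001 | | Windows only: Windows UTF-8 (`CP_UTF8`). New in version 3.3. |
| euc\_jp | eucjp, ujis, u-jis | Japanese |
| euc\_jis\_2004 | jisx0213, eucjis2004 | Japanese |
| euc\_jisx0213 | eucjisx0213 | Japanese |
| euc\_kr | euckr, korean, ksc5601, ks\_c-5601, ks\_c-5601-1987, ksx1001, ks\_x-1001 | Korean |
| gb2312 | chinese, csiso58gb231280, euc-cn, euccn, eucgb2312-cn, gb2312-1980, gb2312-80, iso-ir-58 | Simplified Chinese |
| gbk | 936, cp936, ms936 | Unified Chinese |
| gb18030 | gb18030-2000 | Unified Chinese |
| hz | hzgb, hz-gb, hz-gb-2312 | Simplified Chinese |
| iso2022\_jp | csiso2022jp, iso2022jp, iso-2022-jp | Japanese |
| iso2022\_jp\_1 | iso2022jp-1, iso-2022-jp-1 | Japanese |
| iso2022\_jp\_2 | iso2022jp-2, iso-2022-jp-2 | Japanese, Korean, Simplified Chinese, Western Europe, Greek |
| iso2022\_jp\_2004 | iso2022jp-2004, iso-2022-jp-2004 | Japanese |
| iso2022\_jp\_3 | iso2022jp-3, iso-2022-jp-3 | Japanese |
| iso2022\_jp\_ext | iso2022jp-ext, iso-2022-jp-ext | Japanese |
| iso2022\_kr | csiso2022kr, iso2022kr, iso-2022-kr | Korean |
| latin\_1 | iso-8859-1, iso8859-1, 8859, cp819, latin, latin1, L1 | Western Europe |
| iso8859\_2 | iso-8859-2, latin2, L2 | Central and Eastern Europe |
| iso8859\_3 | iso-8859-3, latin3, L3 | Esperanto, Maltese |
| iso8859\_4 | iso-8859-4, latin4, L4 | Baltic languages |
| iso8859\_5 | iso-8859-5, cyrillic | Bulgarian, Byelorussian, Macedonian, Russian, Serbian |
| iso8859\_6 | iso-8859-6, arabic | Arabic |
| iso8859\_7 | iso-8859-7, greek, greek8 | Greek |
| iso8859\_8 | iso-8859-8, hebrew | Hebrew |
| iso8859\_9 | iso-8859-9, latin5, L5 | Turkish |
| iso8859\_10 | iso-8859-10, latin6, L6 | Nordic languages |
| iso8859\_11 | iso-8859-11, thai | Thai |
| iso8859\_13 | iso-8859-13, latin7, L7 | Baltic languages |
| iso8859\_14 | iso-8859-14, latin8, L8 | Celtic languages |
| iso8859\_15 | iso-8859-15, latin9, L9 | Western Europe |
| iso8859\_16 | iso-8859-16, latin10, L10 | South-Eastern Europe |
| johab | cp1361, ms1361 | Korean |
| koi8\_r | | Russian |
| koi8\_t | | Tajik. New in version 3.5. |
| koi8\_u | | Ukrainian |
| kz1048 | kz\_1048, strk1048\_2002, rk1048 | Kazakh. New in version 3.5. |
| mac\_cyrillic | maccyrillic | Bulgarian, Byelorussian, Macedonian, Russian, Serbian |
| mac\_greek | macgreek | Greek |
| mac\_iceland | maciceland | Icelandic |
| mac\_latin2 | maclatin2, maccentraleurope | Central and Eastern Europe |
| mac\_roman | macroman, macintosh | Western Europe |
| mac\_turkish | macturkish | Turkish |
| ptcp154 | csptcp154, pt154, cp154, cyrillic-asian | Kazakh |
| shift\_jis | csshiftjis, shiftjis, sjis, s\_jis | Japanese |
| shift\_jis\_2004 | shiftjis2004, sjis\_2004, sjis2004 | Japanese |
| shift\_jisx0213 | shiftjisx0213, sjisx0213, s\_jisx0213 | Japanese |
| utf\_32 | U32, utf32 | all languages |
| utf\_32\_be | UTF-32BE | all languages |
| utf\_32\_le | UTF-32LE | all languages |
| utf\_16 | U16, utf16 | all languages |
| utf\_16\_be | UTF-16BE | all languages |
| utf\_16\_le | UTF-16LE | all languages |
| utf\_7 | U7, unicode-1-1-utf-7 | all languages |
| utf\_8 | U8, UTF, utf8 | all languages |
| utf\_8\_sig | | all languages |
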
Changed in version 3.4: The utf-16\* and utf-32\* encoders no longer allow surrogate code points (`U+D800`--`U+DFFF`) to be encoded. The utf-32\* decoders no longer decode byte sequences that correspond to surrogate code points.
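The surrogate restriction can be seen with a string that contains a lone surrogate. The sample string is illustrative; `'surrogatepass'` is the standard error handler that explicitly allows surrogates through, shown here as an assumption about what a caller might want instead.

```python
text = "a\udc80b"  # contains a lone low surrogate, U+DC80

try:
    text.encode("utf-16")
    rejected = False
except UnicodeEncodeError:
    rejected = True  # the strict encoder refuses surrogate code points

# Opting in with 'surrogatepass' writes the surrogate as a plain code unit
# and reads it back the same way.
raw = text.encode("utf-16-le", errors="surrogatepass")
restored = raw.decode("utf-16-le", errors="surrogatepass")
```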
## Python Specific Encodings
A number of predefined codecs are specific to Python, so their codec names have no meaning outside Python. These are listed in the tables below based on the expected input and output types (note that while text encodings are the most common use case for codecs, the underlying codec infrastructure supports arbitrary data transforms rather than just text encodings). For asymmetric codecs, the stated purpose describes the encoding direction.
### Text Encodings
The following codecs provide [`str`](stdtypes.xhtml#str "str") to [`bytes`](stdtypes.xhtml#bytes "bytes") encoding and [bytes-like object](../glossary.xhtml#term-bytes-like-object) to [`str`](stdtypes.xhtml#str "str") decoding, similar to the Unicode text encodings.

| Codec | Aliases | Purpose |
| --- | --- | --- |
| idna | | Implements [**RFC 3490**](https://tools.ietf.org/html/rfc3490.html) \[https://tools.ietf.org/html/rfc3490.html\], see also [`encodings.idna`](#module-encodings.idna "encodings.idna: Internationalized Domain Names implementation"). Only `errors='strict'` is supported. |
| mbcs | ansi, dbcs | Windows only: Encode operand according to the ANSI codepage (CP\_ACP) |
| oem | | Windows only: Encode operand according to the OEM codepage (CP\_OEMCP). New in version 3.6. |
| palmos | | Encoding of PalmOS 3.5 |
| punycode | | Implements [**RFC 3492**](https://tools.ietf.org/html/rfc3492.html) \[https://tools.ietf.org/html/rfc3492.html\]. Stateful codecs are not supported. |
| raw\_unicode\_escape | | Latin-1 encoding with `\uXXXX` and `\UXXXXXXXX` for other code points. Existing backslashes are not escaped in any way. It is used in the Python pickle protocol. |
| undefined | | Raise an exception for all conversions, even empty strings. The error handler is ignored. |
| unicode\_escape | | Encoding suitable as the contents of a Unicode literal in ASCII-encoded Python source code, except that quotes are not escaped. Decodes from Latin-1 source code. Beware that Python source code actually uses UTF-8 by default. |
| unicode\_internal | | Return the internal representation of the operand. Stateful codecs are not supported. Deprecated since version 3.3: This representation is obsoleted by [**PEP 393**](https://www.python.org/dev/peps/pep-0393) \[https://www.python.org/dev/peps/pep-0393\]. |

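A few of the Python-specific text encodings in action. The sample strings are illustrative; the codec names are the ones listed above.

```python
text = "caf\u00e9 \u20ac"  # "café €"

# unicode_escape produces pure ASCII, escaping everything above U+007F.
escaped = text.encode("unicode_escape")

# raw_unicode_escape keeps Latin-1 characters as single bytes and only
# escapes code points above U+00FF.
raw_escaped = text.encode("raw_unicode_escape")

# punycode is the building block used by the idna codec.
puny = "m\u00fcnchen".encode("punycode")

# The undefined codec raises for every conversion, even an empty string.
try:
    "".encode("undefined")
    undefined_raised = False
except UnicodeError:
    undefined_raised = True
```

Both escape codecs round-trip: decoding `escaped` with `unicode_escape` returns the original string.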
### Binary Transforms
The following codecs provide binary transforms: [bytes-like object](../glossary.xhtml#term-bytes-like-object) to [`bytes`](stdtypes.xhtml#bytes "bytes") mappings. They are not supported by [`bytes.decode()`](stdtypes.xhtml#bytes.decode "bytes.decode") (which only produces [`str`](stdtypes.xhtml#str "str") output).

| Codec | Aliases | Purpose | Encoder / decoder |
| --- | --- | --- | --- |
| base64\_codec [1](#b64) | base64, base\_64 | Convert operand to multiline MIME base64 (the result always includes a trailing `'\n'`). Changed in version 3.4: accepts any [bytes-like object](../glossary.xhtml#term-bytes-like-object) as input for encoding and decoding | [`base64.encodebytes()`](base64.xhtml#base64.encodebytes "base64.encodebytes") / [`base64.decodebytes()`](base64.xhtml#base64.decodebytes "base64.decodebytes") |
| bz2\_codec | bz2 | Compress the operand using bz2 | [`bz2.compress()`](bz2.xhtml#bz2.compress "bz2.compress") / [`bz2.decompress()`](bz2.xhtml#bz2.decompress "bz2.decompress") |
| hex\_codec | hex | Convert operand to hexadecimal representation, with two digits per byte | [`binascii.b2a_hex()`](binascii.xhtml#binascii.b2a_hex "binascii.b2a_hex") / [`binascii.a2b_hex()`](binascii.xhtml#binascii.a2b_hex "binascii.a2b_hex") |
| quopri\_codec | quopri, quotedprintable, quoted\_printable | Convert operand to MIME quoted printable | [`quopri.encode()`](quopri.xhtml#quopri.encode "quopri.encode") with `quotetabs=True` / [`quopri.decode()`](quopri.xhtml#quopri.decode "quopri.decode") |
| uu\_codec | uu | Convert the operand using uuencode | [`uu.encode()`](uu.xhtml#uu.encode "uu.encode") / [`uu.decode()`](uu.xhtml#uu.decode "uu.decode") |
| zlib\_codec | zip, zlib | Compress the operand using gzip | [`zlib.compress()`](zlib.xhtml#zlib.compress "zlib.compress") / [`zlib.decompress()`](zlib.xhtml#zlib.decompress "zlib.decompress") |

[1](#id5) In addition to [bytes-like objects](../glossary.xhtml#term-bytes-like-object), `'base64_codec'` also accepts ASCII-only instances of [`str`](stdtypes.xhtml#str "str") for decoding
New in version 3.2: Restoration of the binary transforms.
Changed in version 3.4: Restoration of the aliases for the binary transforms.
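Since these transforms map bytes to bytes, they are reached through `codecs.encode()` and `codecs.decode()` rather than the `bytes` methods. A small sketch with illustrative sample data:

```python
import codecs

data = b"hello world"

hexed = codecs.encode(data, "hex")      # two hex digits per byte
b64 = codecs.encode(data, "base64")     # note the trailing newline
packed = codecs.encode(data * 100, "zlib")
unpacked = codecs.decode(packed, "zlib")

# bytes.decode() only accepts text encodings, so this is rejected.
try:
    hexed.decode("hex")
    refused = False
except LookupError:
    refused = True
```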
### Text Transforms
The following codec provides a text transform: a [`str`](stdtypes.xhtml#str "str") to [`str`](stdtypes.xhtml#str "str") mapping. It is not supported by [`str.encode()`](stdtypes.xhtml#str.encode "str.encode") (which only produces [`bytes`](stdtypes.xhtml#bytes "bytes") output).

| Codec | Aliases | Purpose |
| --- | --- | --- |
| rot\_13 | rot13 | Returns the Caesar-cypher encryption of the operand |

New in version 3.2: Restoration of the `rot_13` text transform.
Changed in version 3.4: Restoration of the `rot13` alias.
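As with the binary transforms, the text transform is used through `codecs.encode()` and `codecs.decode()`. The sample string is illustrative.

```python
import codecs

secret = codecs.encode("Hello, World!", "rot_13")
plain = codecs.decode(secret, "rot13")  # the alias works as well

# str.encode() must return bytes, so it refuses a str-to-str codec.
try:
    "Hello".encode("rot_13")
    refused = False
except LookupError:
    refused = True
```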
## [`encodings.idna`](#module-encodings.idna "encodings.idna: Internationalized Domain Names implementation") --- Internationalized Domain Names in Applications
This module implements [**RFC 3490**](https://tools.ietf.org/html/rfc3490.html) \[https://tools.ietf.org/html/rfc3490.html\] (Internationalized Domain Names in Applications) and [**RFC 3491**](https://tools.ietf.org/html/rfc3491.html) \[https://tools.ietf.org/html/rfc3491.html\] (Nameprep: A Stringprep Profile for Internationalized Domain Names (IDN)). It builds upon the `punycode` encoding and [`stringprep`](stringprep.xhtml#module-stringprep "stringprep: String preparation, as per RFC 3453").
These RFCs together define a protocol to support non-ASCII characters in domain names. A domain name containing non-ASCII characters (such as `www.Alliancefrançaise.nu`) is converted into an ASCII-compatible encoding (ACE, such as `www.xn--alliancefranaise-npb.nu`). The ACE form of the domain name is then used in all places where arbitrary characters are not allowed by the protocol, such as DNS queries, HTTP *Host* fields, and so on. This conversion is carried out in the application; if possible invisible to the user: The application should transparently convert Unicode domain labels to IDNA on the wire, and convert back ACE labels to Unicode before presenting them to the user.
Python supports this conversion in several ways: the `idna` codec performs conversion between Unicode and ACE, separating an input string into labels based on the separator characters defined in [**section 3.1 of RFC 3490**](https://tools.ietf.org/html/rfc3490.html#section-3.1) \[https://tools.ietf.org/html/rfc3490.html#section-3.1\] and converting each label to ACE as required, and conversely separating an input byte string into labels based on the `.` separator and converting any ACE labels found into Unicode. Furthermore, the [`socket`](socket.xhtml#module-socket "socket: Low-level networking interface.") module transparently converts Unicode host names to ACE, so that applications need not be concerned about converting host names themselves when they pass them to the socket module. On top of that, modules that have host names as function parameters, such as [`http.client`](http.client.xhtml#module-http.client "http.client: HTTP and HTTPS protocol client (requires sockets).") and [`ftplib`](ftplib.xhtml#module-ftplib "ftplib: FTP protocol client (requires sockets)."), accept Unicode host names ([`http.client`](http.client.xhtml#module-http.client "http.client: HTTP and HTTPS protocol client (requires sockets).") then also transparently sends an IDNA hostname in the *Host* field if it sends that field at all).
When receiving host names from the wire (such as in reverse name lookup), no automatic conversion to Unicode is performed: Applications wishing to present such host names to the user should decode them to Unicode.
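The codec-level conversion described above, sketched with the example domain from the text. Note that decoding returns the nameprepped (lower-cased) form, not the original spelling.

```python
host = "www.Alliancefran\u00e7aise.nu"

# Unicode to ACE: each non-ASCII label is nameprepped and punycoded.
ace = host.encode("idna")

# ACE back to Unicode, e.g. for a host name received from the wire.
shown = ace.decode("idna")

# Pure ASCII host names pass through unchanged.
ascii_host = "python.org".encode("idna")
```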
The module [`encodings.idna`](#module-encodings.idna "encodings.idna: Internationalized Domain Names implementation") also implements the nameprep procedure, which performs certain normalizations on host names, to achieve case-insensitivity of international domain names, and to unify similar characters. The nameprep functions can be used directly if desired.
`encodings.idna.``nameprep`(*label*)Return the nameprepped version of *label*. The implementation currently assumes query strings, so `AllowUnassigned` is true.
`encodings.idna.``ToASCII`(*label*)Convert a label to ASCII, as specified in [**RFC 3490**](https://tools.ietf.org/html/rfc3490.html) \[https://tools.ietf.org/html/rfc3490.html\]. `UseSTD3ASCIIRules` is assumed to be false.
`encodings.idna.``ToUnicode`(*label*)Convert a label to Unicode, as specified in [**RFC 3490**](https://tools.ietf.org/html/rfc3490.html) \[https://tools.ietf.org/html/rfc3490.html\].
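The three functions operate on a single label rather than a whole host name. A minimal sketch with an illustrative label:

```python
from encodings import idna

# nameprep case-folds and normalizes a label.
prepped = idna.nameprep("M\u00dcNCHEN")

# ToASCII returns the ACE form of one label as bytes.
ace_label = idna.ToASCII("m\u00fcnchen")

# ToUnicode reverses the conversion for one label.
label = idna.ToUnicode(ace_label)
```

To convert a full host name, either split it on the label separators yourself or use the `idna` codec, which does the splitting for you.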
## [`encodings.mbcs`](#module-encodings.mbcs "encodings.mbcs: Windows ANSI codepage") --- Windows ANSI codepage
Encode operand according to the ANSI codepage (CP\_ACP).
[Availability](intro.xhtml#availability): Windows only.
Changed in version 3.3: Support any error handler.
Changed in version 3.2: Before 3.2, the *errors* argument was ignored; `'replace'` was always used to encode, and `'ignore'` to decode.
## [`encodings.utf_8_sig`](#module-encodings.utf_8_sig "encodings.utf_8_sig: UTF-8 codec with BOM signature") --- UTF-8 codec with BOM signature
This module implements a variant of the UTF-8 codec: On encoding a UTF-8 encoded BOM will be prepended to the UTF-8 encoded bytes. For the stateful encoder this is only done once (on the first write to the byte stream). For decoding an optional UTF-8 encoded BOM at the start of the data will be skipped.
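The "only once" behaviour of the stateful encoder can be checked with an incremental encoder. The two input chunks are illustrative.

```python
import codecs

encoder = codecs.getincrementalencoder("utf-8-sig")()

# Only the first chunk is prefixed with the UTF-8 encoded BOM.
chunks = [encoder.encode("first "), encoder.encode("second")]

# Decoding the joined output skips the optional BOM at the start.
decoded = b"".join(chunks).decode("utf-8-sig")
```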
© [Copyright](../copyright.xhtml) 2001-2019, Python Software Foundation.
The Python Software Foundation is a non-profit corporation. [Please donate.](https://www.python.org/psf/donations/)
Last updated on May 21, 2019. [Found a bug](../bugs.xhtml)?
Created using [Sphinx](http://sphinx.pocoo.org/) 1.8.4.
- Python文档内容
- Python 有什么新变化?
- Python 3.7 有什么新变化
- 摘要 - 发布重点
- 新的特性
- 其他语言特性修改
- 新增模块
- 改进的模块
- C API 的改变
- 构建的改变
- 性能优化
- 其他 CPython 实现的改变
- 已弃用的 Python 行为
- 已弃用的 Python 模块、函数和方法
- 已弃用的 C API 函数和类型
- 平台支持的移除
- API 与特性的移除
- 移除的模块
- Windows 专属的改变
- 移植到 Python 3.7
- Python 3.7.1 中的重要变化
- Python 3.7.2 中的重要变化
- Python 3.6 有什么新变化A
- 摘要 - 发布重点
- 新的特性
- 其他语言特性修改
- 新增模块
- 改进的模块
- 性能优化
- Build and C API Changes
- 其他改进
- 弃用
- 移除
- 移植到Python 3.6
- Python 3.6.2 中的重要变化
- Python 3.6.4 中的重要变化
- Python 3.6.5 中的重要变化
- Python 3.6.7 中的重要变化
- Python 3.5 有什么新变化
- 摘要 - 发布重点
- 新的特性
- 其他语言特性修改
- 新增模块
- 改进的模块
- Other module-level changes
- 性能优化
- Build and C API Changes
- 弃用
- 移除
- Porting to Python 3.5
- Notable changes in Python 3.5.4
- What's New In Python 3.4
- 摘要 - 发布重点
- 新的特性
- 新增模块
- 改进的模块
- CPython Implementation Changes
- 弃用
- 移除
- Porting to Python 3.4
- Changed in 3.4.3
- What's New In Python 3.3
- 摘要 - 发布重点
- PEP 405: Virtual Environments
- PEP 420: Implicit Namespace Packages
- PEP 3118: New memoryview implementation and buffer protocol documentation
- PEP 393: Flexible String Representation
- PEP 397: Python Launcher for Windows
- PEP 3151: Reworking the OS and IO exception hierarchy
- PEP 380: Syntax for Delegating to a Subgenerator
- PEP 409: Suppressing exception context
- PEP 414: Explicit Unicode literals
- PEP 3155: Qualified name for classes and functions
- PEP 412: Key-Sharing Dictionary
- PEP 362: Function Signature Object
- PEP 421: Adding sys.implementation
- Using importlib as the Implementation of Import
- 其他语言特性修改
- A Finer-Grained Import Lock
- Builtin functions and types
- 新增模块
- 改进的模块
- 性能优化
- Build and C API Changes
- 弃用
- Porting to Python 3.3
- What's New In Python 3.2
- PEP 384: Defining a Stable ABI
- PEP 389: Argparse Command Line Parsing Module
- PEP 391: Dictionary Based Configuration for Logging
- PEP 3148: The concurrent.futures module
- PEP 3147: PYC Repository Directories
- PEP 3149: ABI Version Tagged .so Files
- PEP 3333: Python Web Server Gateway Interface v1.0.1
- 其他语言特性修改
- New, Improved, and Deprecated Modules
- 多线程
- 性能优化
- Unicode
- Codecs
- 文档
- IDLE
- Code Repository
- Build and C API Changes
- Porting to Python 3.2
- What's New In Python 3.1
- PEP 372: Ordered Dictionaries
- PEP 378: Format Specifier for Thousands Separator
- 其他语言特性修改
- New, Improved, and Deprecated Modules
- 性能优化
- IDLE
- Build and C API Changes
- Porting to Python 3.1
- What's New In Python 3.0
- Common Stumbling Blocks
- Overview Of Syntax Changes
- Changes Already Present In Python 2.6
- Library Changes
- PEP 3101: A New Approach To String Formatting
- Changes To Exceptions
- Miscellaneous Other Changes
- Build and C API Changes
- 性能
- Porting To Python 3.0
- What's New in Python 2.7
- The Future for Python 2.x
- Changes to the Handling of Deprecation Warnings
- Python 3.1 Features
- PEP 372: Adding an Ordered Dictionary to collections
- PEP 378: Format Specifier for Thousands Separator
- PEP 389: The argparse Module for Parsing Command Lines
- PEP 391: Dictionary-Based Configuration For Logging
- PEP 3106: Dictionary Views
- PEP 3137: The memoryview Object
- 其他语言特性修改
- New and Improved Modules
- Build and C API Changes
- Other Changes and Fixes
- Porting to Python 2.7
- New Features Added to Python 2.7 Maintenance Releases
- Acknowledgements
- Python 2.6 有什么新变化
- Python 3.0
- Changes to the Development Process
- PEP 343: The 'with' statement
- PEP 366: Explicit Relative Imports From a Main Module
- PEP 370: Per-user site-packages Directory
- PEP 371: The multiprocessing Package
- PEP 3101: Advanced String Formatting
- PEP 3105: print As a Function
- PEP 3110: Exception-Handling Changes
- PEP 3112: Byte Literals
- PEP 3116: New I/O Library
- PEP 3118: Revised Buffer Protocol
- PEP 3119: Abstract Base Classes
- PEP 3127: Integer Literal Support and Syntax
- PEP 3129: Class Decorators
- PEP 3141: A Type Hierarchy for Numbers
- 其他语言特性修改
- New and Improved Modules
- Deprecations and Removals
- Build and C API Changes
- Porting to Python 2.6
- Acknowledgements
- What's New in Python 2.5
- PEP 308: Conditional Expressions
- PEP 309: Partial Function Application
- PEP 314: Metadata for Python Software Packages v1.1
- PEP 328: Absolute and Relative Imports
- PEP 338: Executing Modules as Scripts
- PEP 341: Unified try/except/finally
- PEP 342: New Generator Features
- PEP 343: The 'with' statement
- PEP 352: Exceptions as New-Style Classes
- PEP 353: Using ssize_t as the index type
- PEP 357: The 'index' method
- 其他语言特性修改
- New, Improved, and Removed Modules
- Build and C API Changes
- Porting to Python 2.5
- Acknowledgements
- What's New in Python 2.4
- PEP 218: Built-In Set Objects
- PEP 237: Unifying Long Integers and Integers
- PEP 289: Generator Expressions
- PEP 292: Simpler String Substitutions
- PEP 318: Decorators for Functions and Methods
- PEP 322: Reverse Iteration
- PEP 324: New subprocess Module
- PEP 327: Decimal Data Type
- PEP 328: Multi-line Imports
- PEP 331: Locale-Independent Float/String Conversions
- 其他语言特性修改
- New, Improved, and Deprecated Modules
- Build and C API Changes
- Porting to Python 2.4
- Acknowledgements
- What's New in Python 2.3
- PEP 218: A Standard Set Datatype
- PEP 255: Simple Generators
- PEP 263: Source Code Encodings
- PEP 273: Importing Modules from ZIP Archives
- PEP 277: Unicode file name support for Windows NT
- PEP 278: Universal Newline Support
- PEP 279: enumerate()
- PEP 282: The logging Package
- PEP 285: A Boolean Type
- PEP 293: Codec Error Handling Callbacks
- PEP 301: Package Index and Metadata for Distutils
- PEP 302: New Import Hooks
- PEP 305: Comma-separated Files
- PEP 307: Pickle Enhancements
- Extended Slices
- 其他语言特性修改
- New, Improved, and Deprecated Modules
- Pymalloc: A Specialized Object Allocator
- Build and C API Changes
- Other Changes and Fixes
- Porting to Python 2.3
- Acknowledgements
- What's New in Python 2.2
- 概述
- PEPs 252 and 253: Type and Class Changes
- PEP 234: Iterators
- PEP 255: Simple Generators
- PEP 237: Unifying Long Integers and Integers
- PEP 238: Changing the Division Operator
- Unicode Changes
- PEP 227: Nested Scopes
- New and Improved Modules
- Interpreter Changes and Fixes
- Other Changes and Fixes
- Acknowledgements
- What's New in Python 2.1
- 概述
- PEP 227: Nested Scopes
- PEP 236: future Directives
- PEP 207: Rich Comparisons
- PEP 230: Warning Framework
- PEP 229: New Build System
- PEP 205: Weak References
- PEP 232: Function Attributes
- PEP 235: Importing Modules on Case-Insensitive Platforms
- PEP 217: Interactive Display Hook
- PEP 208: New Coercion Model
- PEP 241: Metadata in Python Packages
- New and Improved Modules
- Other Changes and Fixes
- Acknowledgements
- What's New in Python 2.0
- 概述
- What About Python 1.6?
- New Development Process
- Unicode
- 列表推导式
- Augmented Assignment
- 字符串的方法
- Garbage Collection of Cycles
- Other Core Changes
- Porting to 2.0
- Extending/Embedding Changes
- Distutils: Making Modules Easy to Install
- XML Modules
- Module changes
- New modules
- IDLE Improvements
- Deleted and Deprecated Modules
- Acknowledgements
- 更新日志
- Python 下一版
- Python 3.7.3 最终版
- Python 3.7.3 发布候选版 1
- Python 3.7.2 最终版
- Python 3.7.2 发布候选版 1
- Python 3.7.1 最终版
- Python 3.7.1 RC 2版本
- Python 3.7.1 发布候选版 1
- Python 3.7.0 正式版
- Python 3.7.0 release candidate 1
- Python 3.7.0 beta 5
- Python 3.7.0 beta 4
- Python 3.7.0 beta 3
- Python 3.7.0 beta 2
- Python 3.7.0 beta 1
- Python 3.7.0 alpha 4
- Python 3.7.0 alpha 3
- Python 3.7.0 alpha 2
- Python 3.7.0 alpha 1
- Python 3.6.6 final
- Python 3.6.6 RC 1
- Python 3.6.5 final
- Python 3.6.5 release candidate 1
- Python 3.6.4 final
- Python 3.6.4 release candidate 1
- Python 3.6.3 final
- Python 3.6.3 release candidate 1
- Python 3.6.2 final
- Python 3.6.2 release candidate 2
- Python 3.6.2 release candidate 1
- Python 3.6.1 final
- Python 3.6.1 release candidate 1
- Python 3.6.0 final
- Python 3.6.0 release candidate 2
- Python 3.6.0 release candidate 1
- Python 3.6.0 beta 4
- Python 3.6.0 beta 3
- Python 3.6.0 beta 2
- Python 3.6.0 beta 1
- Python 3.6.0 alpha 4
- Python 3.6.0 alpha 3
- Python 3.6.0 alpha 2
- Python 3.6.0 alpha 1
- Python 3.5.5 final
- Python 3.5.5 release candidate 1
- Python 3.5.4 final
- Python 3.5.4 release candidate 1
- Python 3.5.3 final
- Python 3.5.3 release candidate 1
- Python 3.5.2 final
- Python 3.5.2 release candidate 1
- Python 3.5.1 final
- Python 3.5.1 release candidate 1
- Python 3.5.0 final
- Python 3.5.0 release candidate 4
- Python 3.5.0 release candidate 3
- Python 3.5.0 release candidate 2
- Python 3.5.0 release candidate 1
- Python 3.5.0 beta 4
- Python 3.5.0 beta 3
- Python 3.5.0 beta 2
- Python 3.5.0 beta 1
- Python 3.5.0 alpha 4
- Python 3.5.0 alpha 3
- Python 3.5.0 alpha 2
- Python 3.5.0 alpha 1