Python 3 text file encoding is still system-dependent

Python 3 assumes a default encoding of UTF-8 in a variety of useful scenarios. For example, s: str = buf.decode() decodes the byte buffer as UTF-8, and buf = s.encode() likewise encodes the string as UTF-8. Python 3 source code is also read as UTF-8 by default, and filenames are encoded as UTF-8 on most platforms. This is in contrast to Python 2, which either defaulted to ASCII or required an explicit encoding.
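As a quick illustration of those defaults, here is a minimal sketch. The sample string and variable names are made up for the example; the calls themselves are standard library behavior.

```python
import sys

# Hypothetical sample text containing non-ASCII characters.
text = "naïve café"

# With no argument, str.encode() and bytes.decode() both use UTF-8.
buf = text.encode()
same_as_explicit = buf == text.encode("utf-8")
round_tripped = buf.decode()

# The interpreter-wide default for str/bytes conversions.
default_codec = sys.getdefaultencoding()
```

Here same_as_explicit should be True and round_tripped should equal the original string, regardless of the platform's locale settings.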

However, one noteworthy, gotcha-worthy exception is the open() call: when used in text mode without an encoding argument, it uses a platform-dependent encoding (the locale's preferred encoding) rather than UTF-8. So it's best to always specify an explicit encoding: open('path/to/file', 'r', encoding='utf-8')
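The sketch below shows one way to check what the platform would pick and to sidestep it with an explicit encoding. The temporary file name and sample text are assumptions for illustration, and locale.getpreferredencoding(False) is used here as an approximation of what open() falls back to; the exact lookup can vary between Python versions and modes.

```python
import locale
import os
import tempfile

# Roughly the encoding open() falls back to in text mode when none is given.
# It varies by OS and locale (for example cp1252 on some Windows setups).
platform_default = locale.getpreferredencoding(False)

# Hypothetical sample text with characters outside ASCII.
text = "naïve café ☕"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")

    # Explicit encoding: the bytes on disk are UTF-8 on every platform.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)

    # Read back with the same explicit encoding.
    with open(path, "r", encoding="utf-8") as f:
        round_tripped = f.read()

    # Inspect the raw bytes to confirm what was actually written.
    with open(path, "rb") as f:
        raw = f.read()
```

Because both open() calls name the encoding, the file reads back identically even when platform_default is something other than UTF-8.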

Send us an email if you'd like to learn more about this!

Benjy Weinberger

Co-founder
