Message227346
The encoding used impacts the result:
>>> s = 'abc\udcc3\udca9'
>>> s.encode('ascii', 'surrogateescape').decode('ascii', 'replace')
'abc��'
>>> s.encode('utf-8', 'surrogateescape').decode('utf-8', 'replace')
'abcé'
The original string ('abc\udcc3\udca9') was obtained by decoding valid UTF-8 bytes with the 'ascii' codec and the 'surrogateescape' error handler.
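For reference, a minimal sketch of how such a string arises and why the codec chosen for the round trip matters. The variable names and the starting text 'abcé' are illustrative assumptions, chosen to match the output shown above.

```python
# UTF-8 encoding of 'abcé': the 'é' becomes the two bytes 0xC3 0xA9.
raw = 'abc\u00e9'.encode('utf-8')

# Decoding as ASCII with surrogateescape does not fail on the non-ASCII
# bytes; each undecodable byte 0xNN is mapped to the lone surrogate U+DCNN.
s = raw.decode('ascii', 'surrogateescape')

# Encoding with surrogateescape turns the surrogates back into the
# original bytes, with either codec.
bytes_via_ascii = s.encode('ascii', 'surrogateescape')
bytes_via_utf8 = s.encode('utf-8', 'surrogateescape')

# The difference only appears when decoding again: ASCII cannot interpret
# 0xC3 0xA9 and substitutes U+FFFD twice, while UTF-8 recovers 'é'.
as_ascii = bytes_via_ascii.decode('ascii', 'replace')
as_utf8 = bytes_via_utf8.decode('utf-8', 'replace')

print(ascii(s), ascii(as_ascii), ascii(as_utf8))
```

Both encode calls give back the same bytes, so the information is still present at that point; it is the final decode step that decides whether the data is recovered or replaced.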
If anything, the default encoding should probably be sys.getfilesystemencoding().
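A rough sketch of what such a default could look like. The helper name `replace_escaped` and its signature are hypothetical, made up for illustration; this is not the API proposed in the issue.

```python
import sys


def replace_escaped(text, encoding=None):
    """Hypothetical helper: re-interpret surrogate-escaped bytes in `text`.

    If no encoding is given, fall back to sys.getfilesystemencoding(),
    as suggested above (an assumption, not a decided default).
    """
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    # surrogateescape restores the original bytes; 'replace' then turns
    # whatever the codec cannot decode into U+FFFD.
    # Note: ordinary characters that `encoding` cannot represent would
    # still raise UnicodeEncodeError in the encode step.
    return text.encode(encoding, 'surrogateescape').decode(encoding, 'replace')


s = 'abc\udcc3\udca9'
print(ascii(replace_escaped(s, 'utf-8')))
print(ascii(replace_escaped(s, 'ascii')))
```

With an explicit 'utf-8' the escaped bytes are recovered as 'é'; with 'ascii' they become two replacement characters. What the no-argument call returns depends on the platform's filesystem encoding.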
| Date | User | Action | Args |
| 2014-09-23 11:23:47 | pitrou | set | recipients: + pitrou, lemburg, ncoghlan, vstinner, ezio.melotti, Arfrever, r.david.murray, serhiy.storchaka |
| 2014-09-23 11:23:47 | pitrou | set | messageid: <[email protected]> |
| 2014-09-23 11:23:47 | pitrou | link | issue18814 messages |
| 2014-09-23 11:23:47 | pitrou | create | |