Problem description
- Fetching a web page with r = requests.get(url, params), then creating a new file and saving r.text to it, raises an error:
>>> with open("webtext.txt", 'w') as f:
        f.write(r.text)
Traceback (most recent call last):
  File "<pyshell#18>", line 2, in <module>
    f.write(r.text)
UnicodeEncodeError: 'gbk' codec can't encode character '\xf6' in position 395497: illegal multibyte sequence
- In other words, the 'gbk' codec cannot encode the Unicode character '\xf6' (ö), which occurs in r.text.
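The same failure can be reproduced without any network request. The following is only a minimal sketch using the standard library; the sample string is made up for illustration and is not taken from the page that was crawled.

```python
# '\xf6' is U+00F6 (ö). The traceback above shows that GBK cannot encode it.
sample = "K\xf6ln"  # hypothetical sample text standing in for r.text

try:
    sample.encode("gbk")
    gbk_ok = True
except UnicodeEncodeError as exc:
    gbk_ok = False
    # exc.object is the string being encoded; start/end mark the bad span.
    failed_char = exc.object[exc.start:exc.end]

# UTF-8 can encode every Unicode character, so this call succeeds.
utf8_bytes = sample.encode("utf-8")
```

Writing to a file opened in text mode performs this same encoding step, which is why the error surfaces inside f.write().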
Solution
>>> import locale
>>> locale.getpreferredencoding()
'cp936'
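- cp936 is the Windows code page for GBK, and open() falls back to this locale encoding when no encoding is given. So setting encoding to 'utf-8' in open() is enough: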
>>> with open("webtext.txt", 'w', encoding='utf-8') as f:
        f.write(r.text)
1205641
>>>
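As a script rather than a REPL session, the fix might look like the sketch below. It assumes a made-up string in place of r.text and writes to a temporary directory, so the path and the sample text are illustrative only.

```python
import os
import tempfile

# Hypothetical stand-in for r.text, containing a character GBK cannot encode.
text = "K\xf6ln \u4e2d\u6587"

# Write into a temporary directory so the sketch does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "webtext.txt")

# Passing encoding explicitly avoids depending on the locale default (cp936 here).
with open(path, "w", encoding="utf-8") as f:
    written = f.write(text)  # write() returns the number of characters written

# Read the file back with the same encoding to confirm nothing was lost.
with open(path, encoding="utf-8") as f:
    round_trip = f.read()
```

The return value of write() is the character count, which is what the number 1205641 in the session above represents. Reading the file later also needs encoding='utf-8', otherwise the locale default is used again.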