Upgrading from Python 2 to Python 3

Monday, March 22, 2021 | By honmaple | In Python | python

print
string
Division
Encoding issues
cmp
sorted
Exception
SSL
file.read
redis
requests
django

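First, convert the code with the 2to3 tool (-w writes the changes back to the files, -n skips creating backup files):
2to3 <directory> -w -n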

print

https://stackoverflow.com/questions/55559825/how-to-fix-print-double-parentheses-after-2to3-conversion

Problem: if the Python 2 project already uses Python 3 syntax, for example print("test"), 2to3 converts it to
```
print(("test"))
```
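So search for print(( and revert the conversion (leaving it as is does no harm, though). The -Q flag makes ag treat the pattern literally, since the unbalanced parentheses are not a valid regular expression:
```
ag -Gpy -Q 'print(('
```
Fix: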
```
- print(("test"))
+ print("test")
```
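
The doubled parentheses appear because 2to3 preserves Python 2 semantics, where print("test") is the print statement applied to a parenthesized expression. The sketch below, using a hypothetical helper named `render`, shows when the extra parentheses are harmless and when removing them changes the output:

```python
import io
from contextlib import redirect_stdout

def render(*args):
    """Return what print(*args) writes (hypothetical helper for this demo)."""
    buf = io.StringIO()
    with redirect_stdout(buf):
        print(*args)
    return buf.getvalue()

# One argument: the extra parentheses only group the expression.
single_plain = render("test")
single_wrapped = render(("test"))

# Several arguments: the inner parentheses build a tuple, so the output differs.
multi_plain = render("a", "b")
multi_wrapped = render(("a", "b"))
```

Only the single-argument form can be unwrapped without thinking; a multi-argument print((a, b)) printed a tuple in Python 2 as well, so unwrapping it changes behavior.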

string

Problem 1:

AttributeError: module 'string' has no attribute 'letters'

Fix:

- string.letters
+ string.ascii_letters

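Problem 2:
```
AttributeError: 'str' object has no attribute 'decode'
```

Fix:

- str.decode(xxx)
+ str.encode(xxx).decode('unicode_escape')

If the value is already readable text, dropping the .decode() call is enough, because str is Unicode in Python 3.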
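
When the same code path can receive either bytes or str, a small normalizing helper may be simpler than patching each call site. This is a minimal sketch; the name `to_text` and the UTF-8 default are assumptions, not part of the original project:

```python
def to_text(value, encoding="utf-8"):
    """Return str for either bytes or str input (hypothetical helper)."""
    if isinstance(value, bytes):
        # Only bytes objects still have .decode() in Python 3.
        return value.decode(encoding)
    return value

from_bytes = to_text("世界".encode("utf-8"))
from_str = to_text("世界")
```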

Division

TypeError: slice indices must be integers or None or have an __index__ method

Problem: in Python 2, 3/2 returns the integer 1; in Python 3, 3/2 returns the float 1.5, which cannot be used as a slice index.
Fix:
```
print(3//2)
# or
print(int(3/2))
```
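
The two fixes are not interchangeable for negative operands: // floors, as Python 2 integer division did, while int() truncates toward zero. A short sketch with made-up values:

```python
items = list(range(10))

# Floor division keeps the index an int, so slicing works again.
mid = len(items) // 2
first_half = items[:mid]

# The two spellings disagree once a negative number is involved.
floor_neg = -3 // 2      # floors toward negative infinity
trunc_neg = int(-3 / 2)  # truncates toward zero
```

For code that relied on the Python 2 result, // is the closer match.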

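Encoding issues

sys.setdefaultencoding('utf-8')

Remove the Python 2 call sys.setdefaultencoding('utf-8'). The function no longer exists in Python 3, where the default string encoding is already UTF-8.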
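
A quick check of what Python 3 provides instead. Note that open() still picks its default encoding from the locale, so passing encoding explicitly is the usual replacement for the old global switch. The temporary file and its name are only for illustration:

```python
import os
import sys
import tempfile

default_encoding = sys.getdefaultencoding()
has_setter = hasattr(sys, "setdefaultencoding")

# Be explicit about the encoding at every I/O boundary instead.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("hello世界")
with open(path, encoding="utf-8") as f:
    content = f.read()
```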

encode and decode

Problem:

LookupError: 'base64' is not a text encoding; use codecs.encode() to handle arbitrary codecs

Behavior:

python2

Python 2.7.16 (default, Dec 21 2020, 23:00:36)
[GCC Apple LLVM 12.0.0 (clang-1200.0.30.4) [+internal-os, ptrauth-isa=sign+stri on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> a = "a"
>>> a.encode("base64")
'YQ==\n'
>>> b = a.encode("base64")
>>> b.decode("base64")
'a'

python3

Python 3.7.4 (default, Sep  7 2020, 15:30:33)
[Clang 11.0.3 (clang-1103.0.32.29)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> a = "a"
>>> a.encode("base64")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
LookupError: 'base64' is not a text encoding; use codecs.encode() to handle arbitrary codecs

Fix:

import base64

def b64encode(s):
    if isinstance(s, bytes):
        return base64.b64encode(s)
    return base64.b64encode(s.encode('utf-8')).decode('utf-8')

def b64decode(s):
    return base64.b64decode(s).decode('utf-8')

- x.encode('base64')
- x.decode('base64')
+ b64encode(x)
+ b64decode(x)
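
The error message itself points at codecs.encode(). The sketch below compares that route with the base64 module; both work on bytes only. The codecs variant keeps the trailing newline that Python 2 produced, while base64.b64encode does not:

```python
import base64
import codecs

# Bytes in, bytes out; the result ends with a newline, as in Python 2.
encoded = codecs.encode(b"a", "base64")
decoded = codecs.decode(encoded, "base64")

# The base64 module returns the compact form without the newline.
compact = base64.b64encode(b"a")
```

If stored values are compared byte for byte with data written by the Python 2 code, that newline difference is worth checking.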

Encoding conversion

Converting UTF-8 to GBK

python2

'hello世界'.decode('utf-8').encode('gbk')

python3

''.join([chr(i) for i in 'hello世界'.encode('gbk')])
# or
'hello世界'.encode('gbk').decode('unicode_escape')
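
In much Python 3 code the bytes object returned by .encode('gbk') is already the equivalent of the Python 2 str and needs no further conversion. When a str of byte-valued characters really is required, decoding with latin-1 looks like a safer spelling than unicode_escape, because a GBK trail byte can be 0x5C (a backslash) and unicode_escape would interpret it as the start of an escape. A sketch with a made-up byte sequence:

```python
text = "hello世界"
raw = text.encode("gbk")

# Three ways to look at the GBK bytes.
as_chars = "".join(chr(b) for b in raw)
as_latin1 = raw.decode("latin-1")   # maps every byte 1:1 to a character
round_trip = raw.decode("gbk")      # back to the original text

# Bytes 0x81 0x5C followed by "n": the backslash is consumed as an escape.
tricky = b"\x81\x5cn"
escaped = tricky.decode("unicode_escape")
literal = tricky.decode("latin-1")
```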

cmp

The built-in cmp function no longer exists in Python 3, so define your own:

def cmp(a, b):
    return (a > b) - (a < b)
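
The replacement works because bool is a subclass of int, so subtracting the two comparisons yields -1, 0 or 1. A quick check with arbitrary sample values:

```python
def cmp(a, b):
    """Python 2 style three-way comparison: -1, 0 or 1."""
    return (a > b) - (a < b)

results = [cmp(1, 2), cmp(2, 2), cmp(3, 2), cmp("a", "b")]
```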

sorted

In Python 2, sorted accepts a cmp argument; Python 3 supports only the key argument.

python2

sorted(keys, lambda x, y: cmp(len(x), len(y)), reverse=True)

python3

from functools import cmp_to_key

sorted(keys, key=cmp_to_key(lambda x, y: cmp(len(x), len(y))), reverse=True)

Note: the key function must be passed as a keyword argument. Passing it positionally fails with "must use keyword argument for key function".
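
When the comparison only compares one derived value, as in the length comparison above, a plain key function gives the same ordering without cmp_to_key. A sketch with made-up sample data:

```python
from functools import cmp_to_key

def cmp(a, b):
    return (a > b) - (a < b)

keys = ["bb", "a", "dddd", "ccc", "ee"]

# Direct port of the Python 2 call.
via_cmp = sorted(keys, key=cmp_to_key(lambda x, y: cmp(len(x), len(y))), reverse=True)

# Equivalent here, since only the lengths are compared.
via_key = sorted(keys, key=len, reverse=True)
```

cmp_to_key remains useful for comparisons that cannot be expressed as a single key, such as multi-field rules with mixed directions.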

Exception

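Exceptions in Python 3 have no .message attribute, so replace e.message with str(e).

In Python 3, the name bound by except ... as is deleted when the except block ends, so it cannot be used to assign a variable that is read afterwards: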

  def main():
      err = None
      try:
          raise ValueError("sss")
-     except Exception as err:
-         pass
+     except Exception as e:
+         err = e
      return err

print(main())

Without this change, Python 3 raises:

UnboundLocalError: local variable 'err' referenced before assignment
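
Both variants side by side as a runnable sketch; the function names `broken` and `fixed` are only for illustration. The exact wording of the error message varies between Python 3 releases, but the exception type is the same:

```python
def broken():
    err = None
    try:
        raise ValueError("sss")
    except Exception as err:
        pass
    # 'err' was deleted when the except block ended.
    return err

def fixed():
    err = None
    try:
        raise ValueError("sss")
    except Exception as e:
        err = e  # copy to a name that outlives the except block
    return err

try:
    broken()
    outcome = "no error"
except UnboundLocalError:
    outcome = "UnboundLocalError"

# str(e) replaces the Python 2 e.message attribute.
message = str(fixed())
```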

SSL

Problem:

File "/usr/local/lib/python3.6/dist-packages/OpenSSL/SSL.py", line 1591, in set_tlsext_host_name
raise TypeError("name must be a byte string")

Fix:

- s.set_tlsext_host_name(hostname)
+ s.set_tlsext_host_name(hostname.encode('utf-8'))
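
If the hostname can arrive as either str or bytes, a tiny guard avoids calling .encode() on a value that is already bytes. This is a sketch only; `to_host_bytes` is a hypothetical helper and does not touch pyOpenSSL itself:

```python
def to_host_bytes(hostname, encoding="utf-8"):
    """Return the hostname as bytes, whatever type it came in as."""
    if isinstance(hostname, bytes):
        return hostname
    return hostname.encode(encoding)

from_str = to_host_bytes("example.com")
from_bytes = to_host_bytes(b"example.com")
```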

file.read

with open("test.txt", "rb") as f:
    for i in f.read(10):
        print(i, type(i))

python2

('\x7f', <type 'str'>)
('E', <type 'str'>)
('\x00', <type 'str'>)

python3

127 <class 'int'>
69 <class 'int'>
0 <class 'int'>

Converting between the two:
```
ord('\x7f') == 127
chr(127) == '\x7f'
```
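
In Python 3, iterating over bytes yields int values, while slicing yields bytes of length one. Note that chr(127) returns a str, so bytes([...]) is the way back to a bytes object. The sketch uses an in-memory buffer with made-up content instead of test.txt:

```python
import io

f = io.BytesIO(b"\x7fE\x00")
chunk = f.read(3)

# Iteration gives integers ...
by_index = [(b, type(b).__name__) for b in chunk]

# ... while one-byte slices keep the bytes type, like Python 2 str did.
by_slice = [chunk[i:i + 1] for i in range(len(chunk))]

# Building bytes back from integers.
back = bytes([127, 69])
```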

redis

In Python 3, values read from redis are bytes by default. Create the client with decode_responses=True to get str values instead.
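
A minimal sketch of both options. The factory `make_client`, its host and port defaults, and the helper `decode_reply` are assumptions for illustration; the redis package is imported lazily so the snippet loads without it:

```python
def make_client(host="localhost", port=6379):
    """Hypothetical factory; requires the third-party redis package."""
    import redis  # imported here so the rest of the sketch runs without it
    # decode_responses=True makes the client return str instead of bytes.
    return redis.Redis(host=host, port=port, decode_responses=True)

def decode_reply(value, encoding="utf-8"):
    """Manual alternative for clients created without decode_responses."""
    return value.decode(encoding) if isinstance(value, bytes) else value

decoded = decode_reply(b"value")
untouched = decode_reply("value")
```

decode_responses applies to every reply, so a client that also stores binary data (pickles, images) probably needs the manual route for those keys.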

requests

Requests with a custom encoding

https://github.com/psf/requests/issues/4133

post json

python2

data = {'test': 'hello世界'.decode('utf-8').encode('gbk')}
headers = {'Content-Type': 'application/json;charset=gbk'}
data = json.dumps(data, ensure_ascii=False)
resp = requests.post("/dynamic/test", data=data, headers=headers)

python3

data = {'test': 'hello世界'}
headers = {'Content-Type': 'application/json;charset=gbk'}
data = json.dumps(data, ensure_ascii=False)
resp = requests.post("/dynamic/test", data=data.encode('gbk'), headers=headers)
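
The body that ends up on the wire can be inspected without sending a request. The sketch below uses only the json module and shows why ensure_ascii=False matters: with the default, non-ASCII characters become \uXXXX escapes and the GBK encoding step has no visible effect:

```python
import json

payload = {"test": "hello世界"}

# Real characters in the JSON text, then encoded as GBK bytes.
body = json.dumps(payload, ensure_ascii=False).encode("gbk")

# Default settings: pure ASCII escapes, identical in any ASCII-compatible encoding.
ascii_body = json.dumps(payload).encode("gbk")

restored = json.loads(body.decode("gbk"))
```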

post form

python2

data = {'test': 'hello世界'.decode('utf-8').encode('gbk')}
headers = {'Content-Type': 'application/x-www-form-urlencoded;charset=gbk'}
resp = requests.post("/dynamic/test", data=data, headers=headers)

python3

data = {'test': 'hello世界'.encode('gbk')}
headers = {'Content-Type': 'application/x-www-form-urlencoded;charset=gbk'}
resp = requests.post("/dynamic/test", data=data, headers=headers)
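
For form data, the standard library can show what percent-encoded GBK looks like. This sketch uses urllib.parse only; that requests produces the same body for bytes values is an assumption worth verifying against the server:

```python
from urllib.parse import parse_qs, urlencode

# str value, with the target encoding given explicitly.
from_str = urlencode({"test": "hello世界"}, encoding="gbk")

# bytes value that was already encoded, as in the Python 3 snippet above.
from_bytes = urlencode({"test": "hello世界".encode("gbk")})

# Decoding with the same encoding restores the original text.
decoded = parse_qs(from_str, encoding="gbk")
```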

Response encoding

resp = requests.get("...")
print(type(resp.content))
print(type(resp.text))

In Python 2, resp.content is str and resp.text is unicode.
In Python 3, resp.content is bytes and resp.text is str.

Request header order

The root cause lies mainly in the CaseInsensitiveDict class in requests.structures.

from requests.structures import CaseInsensitiveDict
from collections import OrderedDict

headers = {
    'Accept-Language': 'en-US,en;q=0.8,zh-CN;q=0.6,zh;q=0.4',
    'Accept-Encoding': 'gzip, deflate, sdch',
    'cache': 0,
    'host': 'Host1.com',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'User-Agent': 'curl/7.29.0',
    'Host': 'Host2.com'
}
print(headers)

r = CaseInsensitiveDict()
r.update(headers)
print(r)

The output differs between Python versions:

python2

OrderedDict([('Accept-Language', 'en-US,en;q=0.8,zh-CN;q=0.6,zh;q=0.4'), ('Accept-Encoding', 'gzip, deflate, sdch'), ('cache', 0), ('Host', 'Host2.com'), ('Accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8'), ('User-Agent', 'curl/7.29.0'), ('host', 'Host1.com')])
CaseInsensitiveDict({'Accept-Language': 'en-US,en;q=0.8,zh-CN;q=0.6,zh;q=0.4', 'Accept-Encoding': 'gzip, deflate, sdch', 'cache': 0, 'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8', 'User-Agent': 'curl/7.29.0', 'host': 'Host1.com'})

python3

OrderedDict([('Accept-Language', 'en-US,en;q=0.8,zh-CN;q=0.6,zh;q=0.4'), ('Accept-Encoding', 'gzip, deflate, sdch'), ('cache', 0), ('host', 'Host1.com'), ('Accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8'), ('User-Agent', 'curl/7.29.0'), ('Host', 'Host2.com')])
{'Accept-Language': 'en-US,en;q=0.8,zh-CN;q=0.6,zh;q=0.4', 'Accept-Encoding': 'gzip, deflate, sdch', 'cache': 0, 'Host': 'Host2.com', 'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8', 'User-Agent': 'curl/7.29.0'}

So to stay compatible with both versions, pass the headers explicitly in a fixed order:

headers = OrderedDict(sorted(headers.items(), key=lambda x: x[0]))
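
A short sketch of what the sorted OrderedDict looks like, using a reduced, made-up header set. Sorting is by code point, so capitalized names come before lowercase ones, and both host and Host are still present at this stage:

```python
from collections import OrderedDict

headers = {
    "host": "Host1.com",
    "Accept": "*/*",
    "User-Agent": "curl/7.29.0",
    "Host": "Host2.com",
}

# Same construction as above: a deterministic order on every Python version.
ordered = OrderedDict(sorted(headers.items(), key=lambda x: x[0]))
names = list(ordered)
```

Because the duplicate host entries survive the sort, it is probably cleaner to remove one of them before the headers reach requests.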

django

- for k, v in request.GET.iterlists():
+ for k, v in request.GET.lists():

Also, k and v are str by default in Python 3, so there is no need to convert them with k.encode("utf-8").

Author: honmaple

Link: https://honmaple.me/articles/2021/03/python2升级至python3.html

License:

Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License
