Commit e6cd291

Merge pull request ictar#46 from ictar/pd_4
Pd 4
2 parents 4995d44 + 487314e commit e6cd291

8 files changed

+844
-657
lines changed

Others/Lists和Tuples大对决.md

Lines changed: 118 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,118 @@
1+
Original: [Lists vs. Tuples](http://nedbatchelder.com/blog/201608/lists_vs_tuples.html "Link to this post" )
2+
3+
---
4+
5+
A common Python beginner question: what is the difference between a list and a tuple?
6+
7+
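The answer is that there are two distinct differences, with a complex interplay between them. These are the technical difference and the cultural difference.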
8+
9+
First, what they have in common: lists and tuples are both containers, that is, sequences of objects:
10+
11+
```python
>>> my_list = [1, 2, 3]
>>> type(my_list)
<class 'list'>
>>> my_tuple = (1, 2, 3)
>>> type(my_tuple)
<class 'tuple'>
```
19+
20+
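Either one can hold elements of any type, even within a single sequence. Both maintain the order of their elements (unlike sets, and unlike dicts before Python 3.7).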
21+
22+
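Now for the differences. The technical difference between lists and tuples is that lists are mutable (can be changed) and tuples are immutable (cannot be changed). This is the only distinction the Python language makes between them: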
23+
24+
```python
>>> my_list[1] = "two"
>>> my_list
[1, 'two', 3]
>>> my_tuple[1] = "two"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
```
33+
34+
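That is the only technical difference between lists and tuples, though it shows up in several ways. For example, lists have an .append() method for adding more elements to the list, and tuples do not: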
35+
36+
```python
>>> my_list.append("four")
>>> my_list
[1, 'two', 3, 'four']
>>> my_tuple.append("four")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'tuple' object has no attribute 'append'
```
45+
46+
Tuples have no need for an .append() method, because you can't modify a tuple.
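Because a tuple cannot change in place, "adding" an element really means building a new tuple. A minimal sketch of that idea (the variable names are illustrative and not from the original post):

```python
my_tuple = (1, 2, 3)

# Concatenation builds a brand-new tuple; the original is left untouched.
# Note the trailing comma: ("four",) is a one-element tuple,
# while ("four") is just a parenthesized string.
longer = my_tuple + ("four",)

print(my_tuple)            # (1, 2, 3)
print(longer)              # (1, 2, 3, 'four')
print(longer is my_tuple)  # False
```

The same trailing-comma rule explains why `(123)` is an `int` and `(123,)` is a tuple.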
47+
48+
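The cultural difference concerns how lists and tuples are actually used: use a list when you have a homogeneous sequence of unknown length; use a tuple when you know the number of elements in advance, because the position of each element is semantically significant.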
49+
50+
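For example, suppose you have a function that looks in a directory for files ending in *.py. It should return a list, because you don't know how many files you will find, and they all share the same semantics: each is just another file you found.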
51+
52+
```python
>>> find_files("*.py")
["control.py", "config.py", "cmdline.py", "backward.py"]
```
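The post never shows `find_files` itself, so the following is only one plausible sketch of such a function. The `directory` parameter and the sorted result are assumptions made for illustration:

```python
from pathlib import Path


def find_files(pattern, directory="."):
    """Return the names of files in *directory* that match *pattern*."""
    # A list fits here: the number of matches is unknown in advance,
    # and every element means the same thing (one more matching file).
    return sorted(p.name for p in Path(directory).glob(pattern) if p.is_file())
```

Callers can iterate over the result or take its length without caring which position a given file occupies.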
56+
57+
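On the other hand, suppose you need to store five values that represent the location of a weather observation station: id, city, state, latitude, and longitude. A tuple suits this better than a list: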
58+
59+
```python
>>> denver = (44, "Denver", "CO", 40, 105)
>>> denver[1]
'Denver'
```
64+
65+
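(For the moment, let's not discuss using a class instead.) Here the first element is the id, the second element is the city, and so on. The position determines the meaning.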
66+
67+
To map the cultural difference onto the C language: lists are like arrays, and tuples are like structs.
68+
69+
Python has a namedtuple facility that makes the meaning more explicit:
70+
71+
```python
>>> from collections import namedtuple
>>> Station = namedtuple("Station", "id, city, state, lat, long")
>>> denver = Station(44, "Denver", "CO", 40, 105)
>>> denver
Station(id=44, city='Denver', state='CO', lat=40, long=105)
>>> denver.city
'Denver'
>>> denver[1]
'Denver'
```
82+
83+
A clever summary of the cultural difference between tuples and lists: a tuple is a namedtuple without the names.
84+
85+
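The technical difference and the cultural difference form an uneasy alliance, because they are sometimes at odds. Why should homogeneous sequences be mutable while heterogeneous sequences are not? For example, because a namedtuple is a tuple, it is immutable, so I can't modify my weather station: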
86+
87+
```python
>>> denver.lat = 39.7392
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: can't set attribute
```
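The post does not mention it, but namedtuples do offer a way around this restriction: the `_replace()` method returns a new tuple with some fields changed. A short sketch, reusing the `Station` definition from above (the name `moved` is illustrative):

```python
from collections import namedtuple

Station = namedtuple("Station", "id, city, state, lat, long")
denver = Station(44, "Denver", "CO", 40, 105)

# _replace() returns a new Station; the original tuple is unchanged.
moved = denver._replace(lat=39.7392)

print(moved.lat)   # 39.7392
print(denver.lat)  # 40
```

This keeps the immutability intact: nothing is modified, a second object is created instead.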
93+
94+
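Sometimes technical considerations override cultural ones. You can't use a list as a dictionary key, because only immutable values can be hashed, so only immutable values can be keys. To use a list as a key, you can convert it to a tuple: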
95+
96+
```python
>>> d = {}
>>> nums = [1, 2, 3]
>>> d[nums] = "hello"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
>>> d[tuple(nums)] = "hello"
>>> d
{(1, 2, 3): 'hello'}
```
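One caveat worth sketching, which is not in the original post: a tuple is only hashable if everything inside it is hashable too. The names `point`, `mixed`, and `error` below are illustrative:

```python
point = (1, 2, 3)
# Equal tuples produce equal hashes, which is what makes them usable as keys.
print(hash(point) == hash((1, 2, 3)))  # True

mixed = (1, [2, 3])  # a tuple that holds a mutable list
error = None
try:
    hash(mixed)
except TypeError as exc:
    # Hashing the tuple hashes its elements, and the inner list refuses.
    error = exc
print(type(error).__name__)  # TypeError
```

So converting a list to a tuple works as a key only when the elements themselves are immutable values such as numbers or strings.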
107+
108+
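Another conflict between the technical and the cultural: Python itself uses tuples in places where a list would make more sense. When you define a function with *args, args is passed to you as a tuple, even though, as far as Python knows, the position of the values isn't significant. You might say it's a tuple because you can't change what you were passed, but that just values the technical difference over the cultural one.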
109+
110+
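I know, I know: in *args, the position could be significant, because they are positional arguments. But in a function that accepts *args and passes it along to another function, it is just a sequence of arguments, none different from another, and their number can vary between calls.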
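A small sketch of that pass-through case (the function names `describe` and `forward` are made up for illustration):

```python
def describe(*args):
    # Inside the function, the extra positional arguments arrive as a tuple.
    return type(args).__name__, len(args)


def forward(*args):
    # A pass-through wrapper: args is just a sequence handed on unchanged.
    return describe(*args)


print(describe())            # ('tuple', 0)
print(forward("a", "b", 3))  # ('tuple', 3)
```

The length of `args` differs from call to call, which is exactly the "homogeneous sequence of unknown length" situation where the cultural rule would suggest a list.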
111+
112+
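Python uses tuples here because they are a little more space-efficient than lists. Lists are over-allocated to make appending faster. This shows Python's pragmatic side: rather than quibbling over list/tuple semantics for *args, it uses the data structure that works best for the situation.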
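The space difference can be observed with `sys.getsizeof`. This is a rough sketch that assumes CPython; the exact byte counts vary between versions and platforms, so none are shown here:

```python
import sys

items_list = [1, 2, 3]
items_tuple = (1, 2, 3)

# getsizeof reports the container's own size, not the size of its elements.
print(sys.getsizeof(items_tuple) < sys.getsizeof(items_list))

# Appending one element at a time shows the list's size growing in steps
# rather than on every append, which is the over-allocation at work.
growing = []
sizes = []
for i in range(20):
    growing.append(i)
    sizes.append(sys.getsizeof(growing))
print(sizes)
```

Repeated values in `sizes` mark appends that fit into space the list had already reserved.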
113+
114+
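In most cases, you should choose between a list and a tuple based on the cultural difference. Think about what your data means. If it could have different lengths depending on what your program encounters in the real world, you probably want a list. If you know when you write the code what the third element means, you probably want a tuple.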
115+
116+
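On the other hand, functional programming emphasizes immutable data structures as a way to avoid side effects that make code hard to reason about. If you are a functional programming fan, you may prefer tuples for their immutability.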
117+
118+
So: should you use a tuple or a list? The answer is: it's not always a simple answer.

Others/README.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -118,3 +118,6 @@
118118

119119
- [Requests vs. urllib: What Problem Does It Solve?](./Requests vs. urllib:它解决了什么问题?.md)
120120

121+
- [Lists vs. Tuples](./Lists和Tuples大对决.md)
122+
123+
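A common Python beginner question: what is the difference between a list and a tuple? The answer is that there are two distinct differences, with a complex interplay between them. These are the technical difference and the cultural difference.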

Python Weekly/Python Weekly Issue 258.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -25,9 +25,9 @@
2525

2626
This article revisits best practices for using RNNs in Tensorflow, especially features that are not well documented on the official site.
2727

28-
[Lists vs. Tuples](http://nedbatchelder.com/blog/201608/lists_vs_tuples.html)
28+
[Lists vs. Tuples](http://nedbatchelder.com/blog/201608/lists_vs_tuples.html) | [Chinese version](../Others/Lists和Tuples大对决.md)
2929

30-
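A common Python beginner question: what is the difference between a list and a tuple? The answer is that there are two distinct differences, with a complex interplay between them. There is also the technical difference and the cultural difference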
30+
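A common Python beginner question: what is the difference between a list and a tuple? The answer is that there are two distinct differences, with a complex interplay between them. These are the technical difference and the cultural difference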
3131

3232
[Podcast.__init__ Episode 71 - Gensim with Radim Řehůřek](https://podcastinit.com/radim-rehurek-gensim.html)
3333

Lines changed: 147 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,147 @@
1+
Original: [Better Python Object Serialization](https://hynek.me/articles/serialization/)
2+
3+
---
4+
5+
The Python standard library is full of underappreciated gems. One of them
6+
allows for simple and elegant function dispatching based on argument types.
7+
This makes it perfect for serialization of arbitrary objects – for example to
8+
JSON in web APIs and structured logs.
9+
10+
Who hasn’t seen it:
11+
12+
```python
TypeError: datetime.datetime(...) is not JSON serializable
```
17+
18+
While this shouldn’t be a big deal, it is. The `json` module – that inherited
19+
its API from `simplejson` – offers two ways to serialize objects:
20+
21+
1. Implement a `default()` _function_ that takes an object and returns something that [`JSONEncoder`](https://docs.python.org/3/library/json.html#json.JSONEncoder) understands.
22+
2. Implement or subclass a `JSONEncoder` yourself and pass it as `cls` to the dump methods. You can implement it on your own or just override the `JSONEncoder.default()` _method_.
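The two options above can be sketched side by side. This is a minimal illustration only; the class name `DatetimeEncoder` and the sample `payload` are assumptions made for the example:

```python
import json
from datetime import datetime


def to_serializable(val):
    """Option 1: a plain function passed as default=."""
    if isinstance(val, datetime):
        return val.isoformat()
    raise TypeError(f"{type(val).__name__} is not JSON serializable")


class DatetimeEncoder(json.JSONEncoder):
    """Option 2: a JSONEncoder subclass passed as cls=."""

    def default(self, o):
        if isinstance(o, datetime):
            return o.isoformat()
        # Let the base class raise TypeError for anything else.
        return super().default(o)


payload = {"ts": datetime(2016, 8, 20, 13, 8, 59)}
via_function = json.dumps(payload, default=to_serializable)
via_class = json.dumps(payload, cls=DatetimeEncoder)
print(via_function)  # {"ts": "2016-08-20T13:08:59"}
print(via_class)     # {"ts": "2016-08-20T13:08:59"}
```

Either way, every supported type has to be handled inside that single fallback.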
23+
24+
And since alternative implementations want to be drop-in, they imitate the
25+
`json` module’s API to various degrees [1].
26+
27+
## Expandability
28+
29+
What both approaches have in common is that they’re not expandable: adding
30+
support for new types is not provided for. Your single `default()` fallback
31+
has to know about all custom types you want to serialize. Which means you
32+
either write functions like:
33+
34+
```python
import enum
from datetime import datetime

import attr  # third-party "attrs" package


def to_serializable(val):
    if isinstance(val, datetime):
        return val.isoformat() + "Z"
    elif isinstance(val, enum.Enum):
        return val.value
    elif attr.has(val.__class__):
        return attr.asdict(val)
    elif isinstance(val, Exception):
        return {
            "error": val.__class__.__name__,
            "args": val.args,
        }
    return str(val)
```
51+
52+
Which is painful since you have to add serialization for all objects in one
53+
place [2].
54+
55+
Alternatively you can try to come up with general solutions on your own like
56+
Pyramid’s JSON renderer did in [`JSON.add_adapter`](http://docs.pylonsproject.org/projects/pyramid/en/latest/narr/renderers.html#using-the-add-adapter-method-of-a-custom-json-renderer) which uses the widely underappreciated
`zope.interface`’s adapter registry [3].
60+
61+
Django on the other hand satisfies itself with a `DjangoJSONEncoder` that is a
62+
subclass of `json.JSONEncoder` and knows how to encode dates, times, UUIDs,
63+
and promises. But other than that, you’re on your own again. If you want to go
64+
further with Django and web APIs, you’re probably already using the Django
65+
REST framework anyway. They came up with a whole [serialization
66+
system](http://www.django-rest-framework.org/api-guide/serializers/) that does
67+
a lot more than just making data `json.dumps()`-ready.
68+
69+
Finally for the sake of completeness I feel like I have to mention my own
70+
solution in [`structlog`](http://www.structlog.org/en/stable/) that I fiercely
71+
hated from day one: adding a `__structlog__` method to your classes that
72+
returns a serializable representation in the tradition of `__str__`. Please
73+
don’t repeat my mistake; hashtag [software clown](https://softwareclown.com).
74+
75+
* * *
76+
77+
Given how prevalent JSON is, it’s surprising that we have only siloed
78+
solutions so far. What _I_ personally would like to have is a way to register
79+
serializers in a central place but in a decentralized fashion that doesn’t
80+
require any changes to my (or worse: third party) classes.
81+
82+
## Enter PEP 443
83+
84+
Turns out, Python 3.4 came with a nice solution to this problem in the form of
85+
[PEP 443](https://www.python.org/dev/peps/pep-0443/): [`functools.singledispatch`](https://docs.python.org/3/library/functools.html#functools.singledispatch) (also available on [PyPI](https://pypi.org/project/singledispatch/) for
legacy Python versions).
89+
90+
Put simply, you define a default function and then register additional
91+
versions of that function depending on the type of the first argument:
92+
93+
```python
from datetime import datetime
from functools import singledispatch


@singledispatch
def to_serializable(val):
    """Used by default."""
    return str(val)


@to_serializable.register(datetime)
def ts_datetime(val):
    """Used if *val* is an instance of datetime."""
    return val.isoformat() + "Z"
```
109+
110+
Now you can call `to_serializable()` on `datetime` instances too and single
111+
dispatch will pick the correct function:
112+
113+
```python
>>> import json
>>> json.dumps({"msg": "hi", "ts": datetime.now()},
...            default=to_serializable)
'{"ts": "2016-08-20T13:08:59.153864Z", "msg": "hi"}'
```
120+
121+
This gives you the power to put your serializers wherever you want: along with
122+
the classes, in a separate module, or along with JSON-related code? _You_
123+
choose! But your _classes_ stay clean and you don’t have a huge `if-elif-else`
124+
branch that you cargo-cult between your projects.
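To make the decentralized registration concrete, here is a sketch that moves the remaining branches of the earlier `if-elif-else` function into separate registrations. The `Color` enum and the sample payload are hypothetical, and each `register` call could live in whatever module owns the type:

```python
import enum
import json
from datetime import datetime
from functools import singledispatch


@singledispatch
def to_serializable(val):
    """Fallback for types without a registered serializer."""
    return str(val)


@to_serializable.register(datetime)
def ts_datetime(val):
    return val.isoformat() + "Z"


# These registrations could sit in any module that imports to_serializable.
@to_serializable.register(enum.Enum)
def ts_enum(val):
    return val.value


@to_serializable.register(Exception)
def ts_exception(val):
    return {"error": val.__class__.__name__, "args": val.args}


class Color(enum.Enum):  # hypothetical type, for illustration only
    RED = "red"


result = json.dumps(
    {"color": Color.RED, "err": ValueError("boom")},
    default=to_serializable,
    sort_keys=True,
)
print(result)
# {"color": "red", "err": {"args": ["boom"], "error": "ValueError"}}
```

Dispatch follows the class hierarchy, so `Color.RED` reaches the `enum.Enum` serializer and `ValueError` reaches the `Exception` one without either class knowing about JSON.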
125+
126+
## Going Further
127+
128+
Obviously the utility of `@singledispatch` goes far beyond JSON. Binding
129+
different behaviors to different types in general and object serialization in
130+
particular are universally useful [4]. Some of my proofreaders mentioned they
131+
tried a ghetto approximation using `dict`s of classes to callables and other
132+
similar atrocities.
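The article does not show what those approximations looked like. As a guess at the general shape, a hand-rolled registry might resemble the sketch below, where `SERIALIZERS` and the lookup loop are entirely hypothetical:

```python
from datetime import datetime

# Hypothetical hand-rolled registry mapping classes to serializer callables.
SERIALIZERS = {
    datetime: lambda val: val.isoformat() + "Z",
    Exception: lambda val: {"error": type(val).__name__, "args": val.args},
}


def to_serializable(val):
    # Walk the MRO so subclasses find a serializer registered for a base class.
    for cls in type(val).__mro__:
        if cls in SERIALIZERS:
            return SERIALIZERS[cls](val)
    return str(val)


print(to_serializable(KeyError("missing")))
# {'error': 'KeyError', 'args': ('missing',)}
print(to_serializable(42))
# 42
```

This covers plain inheritance, but `@singledispatch` also handles abstract base classes and caches lookups, which a dictionary like this would have to reimplement.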
133+
134+
In other words, `@singledispatch` just may be the function that you’ve been
135+
missing although it was there all along.
136+
137+
P.S. Of course there’s also a `multipledispatch` on
138+
[PyPI](https://pypi.org/project/multipledispatch/).
139+
140+
## Footnotes
141+
142+
* * *
143+
144+
1. However, from the popular ones: [UltraJSON](https://github.com/esnme/ultrajson) doesn’t support custom object serialization at all and [`python-rapidjson`](https://github.com/kenrobbins/python-rapidjson) only supports the `default()` function. ↩︎
145+
2. Although as you can see it’s manageable with `attrs`; maybe [you should use `attrs`](https://glyph.twistedmatrix.com/2016/08/attrs.html)! ↩︎
146+
3. Unfortunately the API Pyramid uses is currently [undocumented](https://github.com/zopefoundation/zope.interface/issues/41) after being transplanted from [`zope.component`](https://docs.zope.org/zope.component/). ↩︎
147+
4. I’ve been told the original incentive for adding single dispatch to the standard library was a more elegant reimplementation of [`pprint`](https://docs.python.org/3.5/library/pprint.html) (that never happened). ↩︎
