Skip to content

Commit c40f804

Browse files
committed
Merge pull request yidao620c#68 from tylinux/master
12.5小节完成
2 parents 6a9f41e + 23da500 commit c40f804

File tree

1 file changed

+152
-174
lines changed

1 file changed

+152
-174
lines changed

source/c12/p05_locking_with_deadlock_avoidance.rst

Lines changed: 152 additions & 174 deletions
Original file line numberDiff line numberDiff line change
@@ -5,189 +5,167 @@
55
----------
66
问题
77
----------
8-
You’re writing a multithreaded program where threads need to acquire more than one
9-
lock at a time while avoiding deadlock.
8+
你正在写一个多线程程序,其中线程需要一次获取多个锁,此时如何避免死锁问题。
109

1110
|
1211
1312
----------
1413
解决方案
1514
----------
16-
In multithreaded programs, a common source of deadlock is due to threads that attempt
17-
to acquire multiple locks at once. For instance, if a thread acquires the first lock, but
18-
then blocks trying to acquire the second lock, that thread can potentially block the
19-
progress of other threads and make the program freeze.
20-
One solution to deadlock avoidance is to assign each lock in the program a unique
21-
number, and to enforce an ordering rule that only allows multiple locks to be acquired
22-
in ascending order. This is surprisingly easy to implement using a context manager as
23-
follows:
24-
25-
import threading
26-
from contextlib import contextmanager
27-
28-
# Thread-local state to stored information on locks already acquired
29-
_local = threading.local()
30-
31-
@contextmanager
32-
def acquire(*locks):
33-
# Sort locks by object identifier
34-
locks = sorted(locks, key=lambda x: id(x))
35-
36-
# Make sure lock order of previously acquired locks is not violated
37-
acquired = getattr(_local,'acquired',[])
38-
if acquired and max(id(lock) for lock in acquired) >= id(locks[0]):
39-
raise RuntimeError('Lock Order Violation')
40-
41-
# Acquire all of the locks
42-
acquired.extend(locks)
43-
_local.acquired = acquired
44-
45-
try:
46-
for lock in locks:
47-
lock.acquire()
48-
yield
49-
finally:
50-
# Release locks in reverse order of acquisition
51-
for lock in reversed(locks):
52-
lock.release()
53-
del acquired[-len(locks):]
54-
55-
To use this context manager, you simply allocate lock objects in the normal way, but use
56-
the acquire() function whenever you want to work with one or more locks. For
57-
example:
58-
59-
import threading
60-
x_lock = threading.Lock()
61-
y_lock = threading.Lock()
62-
63-
def thread_1():
64-
while True:
65-
with acquire(x_lock, y_lock):
66-
print('Thread-1')
67-
68-
def thread_2():
69-
while True:
70-
with acquire(y_lock, x_lock):
71-
print('Thread-2')
72-
73-
t1 = threading.Thread(target=thread_1)
74-
t1.daemon = True
75-
t1.start()
76-
77-
t2 = threading.Thread(target=thread_2)
78-
t2.daemon = True
79-
t2.start()
80-
81-
If you run this program, you’ll find that it happily runs forever without deadlock—even
82-
though the acquisition of locks is specified in a different order in each function.
83-
The key to this recipe lies in the first statement that sorts the locks according to object
84-
identifier. By sorting the locks, they always get acquired in a consistent order regardless
85-
of how the user might have provided them to acquire().
86-
The solution uses thread-local storage to solve a subtle problem with detecting potential
87-
deadlock if multiple acquire() operations are nested. For example, suppose you wrote
88-
the code like this:
89-
90-
import threading
91-
x_lock = threading.Lock()
92-
y_lock = threading.Lock()
93-
94-
def thread_1():
95-
96-
while True:
97-
with acquire(x_lock):
98-
with acquire(y_lock):
99-
print('Thread-1')
100-
101-
def thread_2():
102-
while True:
103-
with acquire(y_lock):
104-
with acquire(x_lock):
105-
print('Thread-2')
106-
107-
t1 = threading.Thread(target=thread_1)
108-
t1.daemon = True
109-
t1.start()
110-
111-
t2 = threading.Thread(target=thread_2)
112-
t2.daemon = True
113-
t2.start()
114-
115-
If you run this version of the program, one of the threads will crash with an exception
116-
such as this:
117-
118-
Exception in thread Thread-1:
119-
Traceback (most recent call last):
120-
File "/usr/local/lib/python3.3/threading.py", line 639, in _bootstrap_inner
121-
self.run()
122-
File "/usr/local/lib/python3.3/threading.py", line 596, in run
123-
self._target(*self._args, **self._kwargs)
124-
File "deadlock.py", line 49, in thread_1
125-
with acquire(y_lock):
126-
File "/usr/local/lib/python3.3/contextlib.py", line 48, in __enter__
127-
return next(self.gen)
128-
File "deadlock.py", line 15, in acquire
129-
raise RuntimeError("Lock Order Violation")
130-
RuntimeError: Lock Order Violation
131-
>>>
132-
133-
This crash is caused by the fact that each thread remembers the locks it has already
134-
acquired. The acquire() function checks the list of previously acquired locks and en‐
135-
forces the ordering constraint that previously acquired locks must have an object ID
136-
that is less than the new locks being acquired.
15+
在多线程程序中,死锁问题很大一部分是由于线程同时获取多个锁造成的。举个例子:一个线程获取了第一个锁,然后在获取第二个锁的
16+
时候发生阻塞,那么这个线程就可能阻塞其他线程的执行,从而导致整个程序假死。
17+
解决死锁问题的一种方案是为程序中的每一个锁分配一个唯一的id,然后只允许按照升序规则来使用多个锁,这个规则使用上下文管理器
18+
是非常容易实现的,示例如下:
19+
20+
.. code-block:: python
21+
import threading
22+
from contextlib import contextmanager
23+
24+
# Thread-local state to stored information on locks already acquired
25+
_local = threading.local()
26+
27+
@contextmanager
28+
def acquire(*locks):
29+
# Sort locks by object identifier
30+
locks = sorted(locks, key=lambda x: id(x))
31+
32+
# Make sure lock order of previously acquired locks is not violated
33+
acquired = getattr(_local,'acquired',[])
34+
if acquired and max(id(lock) for lock in acquired) >= id(locks[0]):
35+
raise RuntimeError('Lock Order Violation')
36+
37+
# Acquire all of the locks
38+
acquired.extend(locks)
39+
_local.acquired = acquired
40+
41+
try:
42+
for lock in locks:
43+
lock.acquire()
44+
yield
45+
finally:
46+
# Release locks in reverse order of acquisition
47+
for lock in reversed(locks):
48+
lock.release()
49+
del acquired[-len(locks):]
50+
51+
如何使用这个上下文管理器呢?你可以按照正常途径创建一个锁对象,但不论是单个锁还是多个锁中都使用 ``acquire()`` 函数来申请锁,
52+
示例如下:
53+
54+
.. code-block:: python
55+
import threading
56+
x_lock = threading.Lock()
57+
y_lock = threading.Lock()
58+
59+
def thread_1():
60+
while True:
61+
with acquire(x_lock, y_lock):
62+
print('Thread-1')
63+
64+
def thread_2():
65+
while True:
66+
with acquire(y_lock, x_lock):
67+
print('Thread-2')
68+
69+
t1 = threading.Thread(target=thread_1)
70+
t1.daemon = True
71+
t1.start()
72+
73+
t2 = threading.Thread(target=thread_2)
74+
t2.daemon = True
75+
t2.start()
76+
77+
如果你执行这段代码,你会发现它即使在不同的函数中以不同的顺序获取锁也没有发生死锁。
78+
其关键在于,在第一段代码中,我们对这些锁进行了排序。通过排序,使得不管用户以什么样的顺序来请求锁,这些锁都会按照固定的顺序被获取。
79+
如果有多个 ``acquire()`` 操作被嵌套调用,可以通过线程本地存储(TLS)来检测潜在的死锁问题。
80+
假设你的代码是这样写的:
81+
82+
.. code-block:: python
83+
import threading
84+
x_lock = threading.Lock()
85+
y_lock = threading.Lock()
86+
87+
def thread_1():
88+
89+
while True:
90+
with acquire(x_lock):
91+
with acquire(y_lock):
92+
print('Thread-1')
93+
94+
def thread_2():
95+
while True:
96+
with acquire(y_lock):
97+
with acquire(x_lock):
98+
print('Thread-2')
99+
100+
t1 = threading.Thread(target=thread_1)
101+
t1.daemon = True
102+
t1.start()
103+
104+
t2 = threading.Thread(target=thread_2)
105+
t2.daemon = True
106+
t2.start()
107+
108+
如果你运行这个版本的代码,必定会有一个线程发生崩溃,异常信息可能像这样:
109+
110+
.. code-block:: python
111+
Exception in thread Thread-1:
112+
Traceback (most recent call last):
113+
File "/usr/local/lib/python3.3/threading.py", line 639, in _bootstrap_inner
114+
self.run()
115+
File "/usr/local/lib/python3.3/threading.py", line 596, in run
116+
self._target(*self._args, **self._kwargs)
117+
File "deadlock.py", line 49, in thread_1
118+
with acquire(y_lock):
119+
File "/usr/local/lib/python3.3/contextlib.py", line 48, in __enter__
120+
return next(self.gen)
121+
File "deadlock.py", line 15, in acquire
122+
raise RuntimeError("Lock Order Violation")
123+
RuntimeError: Lock Order Violation
124+
>>>
125+
126+
发生崩溃的原因在于,每个线程都记录着自己已经获取到的锁。 ``acquire()`` 函数会检查之前已经获取的锁列表,
127+
由于锁是按照升序排列获取的,所以函数会认为之前已获取的锁的id必定小于新申请到的锁,这时就会触发异常。
137128

138129
|
139130
140131
----------
141132
讨论
142133
----------
143-
The issue of deadlock is a well-known problem with programs involving threads (as
144-
well as a common subject in textbooks on operating systems). As a rule of thumb, as
145-
long as you can ensure that threads can hold only one lock at a time, your program will
146-
be deadlock free. However, once multiple locks are being acquired at the same time, all
147-
bets are off.
148-
149-
Detecting and recovering from deadlock is an extremely tricky problem with few elegant
150-
solutions. For example, a common deadlock detection and recovery scheme involves
151-
the use of a watchdog timer. As threads run, they periodically reset the timer, and as
152-
long as everything is running smoothly, all is well. However, if the program deadlocks,
153-
the watchdog timer will eventually expire. At that point, the program “recovers” by
154-
killing and then restarting itself.
155-
Deadlock avoidance is a different strategy where locking operations are carried out in
156-
a manner that simply does not allow the program to enter a deadlocked state. The
157-
solution in which locks are always acquired in strict order of ascending object ID can
158-
be mathematically proven to avoid deadlock, although the proof is left as an exercise to
159-
the reader (the gist of it is that by acquiring locks in a purely increasing order, you can’t
160-
get cyclic locking dependencies, which are a necessary condition for deadlock to occur).
161-
As a final example, a classic thread deadlock problem is the so-called “dining philoso‐
162-
pher’s problem.” In this problem, five philosophers sit around a table on which there
163-
are five bowls of rice and five chopsticks. Each philosopher represents an independent
164-
thread and each chopstick represents a lock. In the problem, philosophers either sit and
165-
think or they eat rice. However, in order to eat rice, a philosopher needs two chopsticks.
166-
Unfortunately, if all of the philosophers reach over and grab the chopstick to their left,
167-
they’ll all just sit there with one stick and eventually starve to death. It’s a gruesome
168-
scene.
169-
Using the solution, here is a simple deadlock free implementation of the dining philos‐
170-
opher’s problem:
171-
172-
import threading
173-
174-
# The philosopher thread
175-
def philosopher(left, right):
176-
while True:
177-
with acquire(left,right):
178-
print(threading.currentThread(), 'eating')
179-
180-
# The chopsticks (represented by locks)
181-
NSTICKS = 5
182-
chopsticks = [threading.Lock() for n in range(NSTICKS)]
183-
184-
# Create all of the philosophers
185-
for n in range(NSTICKS):
186-
t = threading.Thread(target=philosopher,
187-
args=(chopsticks[n],chopsticks[(n+1) % NSTICKS]))
188-
t.start()
189-
190-
Last, but not least, it should be noted that in order to avoid deadlock, all locking oper‐
191-
ations must be carried out using our acquire() function. If some fragment of code
192-
decided to acquire a lock directly, then the deadlock avoidance algorithm wouldn’t work.
193-
134+
死锁是每一个多线程程序都会面临的一个问题(就像它是每一本操作系统课本的共同话题一样)。根据经验来讲,尽可能保证每一个
135+
线程只能同时保持一个锁,这样程序就不会被死锁问题所困扰。一旦有线程同时申请多个锁,一切就不可预料了。
136+
137+
死锁的检测与恢复是一个几乎没有优雅的解决方案的扩展话题。一个比较常用的死锁检测与恢复的方案是引入看门狗计数器。当线程正常
138+
运行的时候会每隔一段时间重置计数器,在没有发生死锁的情况下,一切都正常进行。一旦发生死锁,由于无法重置计数器导致定时器
139+
超时,这时程序会通过重启自身恢复到正常状态。
140+
141+
避免死锁是另外一种解决死锁问题的方式,在进程获取锁的时候会严格按照对象id升序排列获取,经过数学证明,这样保证程序不会进入
142+
死锁状态。证明就留给读者作为练习了。避免死锁的主要思想是,单纯地按照对象id递增的顺序加锁不会产生循环依赖,而循环依赖是
143+
死锁的一个必要条件,从而避免程序进入死锁状态。
144+
145+
下面以一个关于线程死锁的经典问题:“哲学家就餐问题”,作为本节最后一个例子。题目是这样的:五位哲学家围坐在一张桌子前,每个人
146+
面前有一碗饭和一只筷子。在这里每个哲学家可以看做是一个独立的线程,而每只筷子可以看做是一个锁。每个哲学家可以处在静坐、
147+
思考、吃饭三种状态中的一个。需要注意的是,每个哲学家吃饭是需要两只筷子的,这样问题就来了:如果每个哲学家都拿起自己左边的筷子,
148+
那么他们五个都只能拿着一只筷子坐在那儿,直到饿死。此时他们就进入了死锁状态。
149+
下面是一个简单的使用死锁避免机制解决“哲学家就餐问题”的实现:
150+
151+
.. code-block:: python
152+
import threading
153+
154+
# The philosopher thread
155+
def philosopher(left, right):
156+
while True:
157+
with acquire(left,right):
158+
print(threading.currentThread(), 'eating')
159+
160+
# The chopsticks (represented by locks)
161+
NSTICKS = 5
162+
chopsticks = [threading.Lock() for n in range(NSTICKS)]
163+
164+
# Create all of the philosophers
165+
for n in range(NSTICKS):
166+
t = threading.Thread(target=philosopher,
167+
args=(chopsticks[n],chopsticks[(n+1) % NSTICKS]))
168+
t.start()
169+
170+
最后,要特别注意到,为了避免死锁,所有的加锁操作必须使用 ``acquire()`` 函数。如果代码中的某部分绕过acquire
171+
函数直接申请锁,那么整个死锁避免机制就不起作用了。

0 commit comments

Comments
 (0)