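As we know, the threads of a single process share memory. When one thread modifies a global variable, every other thread sees the change, which is dangerous. To keep several threads from modifying a global variable at the same time, we have to guard those modifications with a lock.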
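As a minimal sketch of the locking idea (the names counter, counter_lock and add_many are made up for illustration and are not from the original article), each thread below takes a threading.Lock before touching the shared global:

```python
from threading import Thread, Lock

counter = 0            # shared global, visible to every thread
counter_lock = Lock()  # guards every modification of counter

def add_many(times):
    global counter
    for _ in range(times):
        # Only one thread at a time may run the read-modify-write below.
        with counter_lock:
            counter += 1

threads = [Thread(target=add_many, args=(10000,)) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 50000
```

Without the lock, counter += 1 is a read-modify-write sequence that two threads can interleave, so updates may be lost.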
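Besides locking modifications to global variables, you may also have thought of using each thread's own local variables: a local variable is visible only to the thread that owns it and cannot be accessed by the other threads of the same process. That is indeed the case. Let's look at an example first: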
    from threading import Thread, current_thread

    def echo(num):
        # Print the calling thread's name and the value passed in.
        print('%s %s' % (current_thread().name, num))

    def calc():
        print('thread %s is running...' % current_thread().name)
        local_num = 0  # local variable, private to this thread
        for _ in range(10000):
            local_num += 1
        echo(local_num)
        print('thread %s ended.' % current_thread().name)

    if __name__ == '__main__':
        print('thread %s is running...' % current_thread().name)

        threads = []
        for i in range(5):
            threads.append(Thread(target=calc))
            threads[i].start()
        for i in range(5):
            threads[i].join()

        print('thread %s ended.' % current_thread().name)
The code above creates 5 threads, each of which adds 1 to its own local variable local_num 10000 times. Because modifying a thread's local variable does not affect any other thread, every thread prints 10000 for local_num when it finishes. The output is:
thread MainThread is running...
thread Thread-4 is running...
Thread-4 10000
thread Thread-4 ended.
thread Thread-5 is running...
Thread-5 10000
thread Thread-5 ended.
thread Thread-6 is running...
Thread-6 10000
thread Thread-6 ended.
thread Thread-7 is running...
Thread-7 10000
thread Thread-7 ended.
thread Thread-8 is running...
Thread-8 10000
thread Thread-8 ended.
thread MainThread ended.
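Having each thread use its own local variables does avoid conflicts when multiple threads access the same variable, but it has a drawback: the value has to be passed explicitly to every function that needs it, just as local_num is passed to echo above. In real development we call many functions, each with many local variables, and passing arguments through every function this way is clearly impractical.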
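To make the argument-passing problem concrete, here is a small hypothetical call chain (handle, load, check and render are invented names, not part of the original example). Only the last function needs the per-thread value, yet every function in between has to accept it and pass it on:

```python
from threading import Thread

results = []

def handle(user):
    # Entry point of each thread: 'user' is the per-thread value.
    results.append(load(user))

def load(user):
    # Does not use 'user' itself; only forwards it.
    return check(user)

def check(user):
    # Also only forwards it.
    return render(user)

def render(user):
    # The only function that actually needs the value.
    return 'hello, %s' % user

threads = [Thread(target=handle, args=(name,)) for name in ('alice', 'bob')]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))  # ['hello, alice', 'hello, bob']
```

With more per-thread values and deeper call chains, every signature along the way grows accordingly.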
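To solve this, an obvious approach is to create a global dictionary keyed by thread (the code below uses the Thread object returned by current_thread() as the key), with the thread's local data as the value. This removes the need to pass arguments between functions. The code is as follows: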
    from threading import Thread, current_thread

    global_dict = {}  # maps each Thread object to that thread's data

    def echo():
        # Each thread looks up its own data, keyed by its Thread object.
        num = global_dict[current_thread()]
        print('%s %s' % (current_thread().name, num))

    def calc():
        print('thread %s is running...' % current_thread().name)
        global_dict[current_thread()] = 0
        for _ in range(10000):
            global_dict[current_thread()] += 1
        echo()
        print('thread %s ended.' % current_thread().name)

    if __name__ == '__main__':
        print('thread %s is running...' % current_thread().name)

        threads = []
        for i in range(5):
            threads.append(Thread(target=calc))
            threads[i].start()
        for i in range(5):
            threads[i].join()

        print('thread %s ended.' % current_thread().name)
Here is the output:
thread MainThread is running...
thread Thread-64 is running...
thread Thread-65 is running...
thread Thread-66 is running...
thread Thread-67 is running...
thread Thread-68 is running...
Thread-67 10000
thread Thread-67 ended.
Thread-65 10000
thread Thread-65 ended.
Thread-68 10000
thread Thread-68 ended.
Thread-66 10000
thread Thread-66 ended.
Thread-64 10000
thread Thread-64 ended.
thread MainThread ended.
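This approach removes the argument-passing problem, but it is still not ideal. To get its local data, a thread first has to obtain its own ID. In addition, global_dict is a global variable that every thread can modify, so it is still somewhat dangerous.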
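The risk can be sketched as follows. This is an illustrative assumption about how things could go wrong, with a made-up worker function: nothing stops one thread (here the main thread) from overwriting the entries that belong to other threads.

```python
from threading import Thread, current_thread

global_dict = {}

def worker():
    # Each worker stores its data under its own Thread object.
    global_dict[current_thread()] = 0
    for _ in range(1000):
        global_dict[current_thread()] += 1

threads = [Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

before = sorted(global_dict.values())

# The main thread can reach into every other thread's entry.
for key in list(global_dict):
    global_dict[key] = -1

after = sorted(global_dict.values())
print(before)  # [1000, 1000, 1000]
print(after)   # [-1, -1, -1]
```

The dictionary gives each thread its own slot only by convention. Isolation depends on every function remembering to use its own key.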
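So what should we do?
In fact, Python provides the ThreadLocal object (threading.local), which truly isolates data between threads, and we no longer have to look anything up in a dict ourselves. The code is as follows: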
    from threading import Thread, current_thread, local

    global_data = local()  # one object; attribute values are per thread

    def echo():
        # Reads the num attribute belonging to the calling thread.
        num = global_data.num
        print('%s %s' % (current_thread().name, num))

    def calc():
        print('thread %s is running...' % current_thread().name)
        global_data.num = 0
        for _ in range(10000):
            global_data.num += 1
        echo()
        print('thread %s ended.' % current_thread().name)

    if __name__ == '__main__':
        print('thread %s is running...' % current_thread().name)

        threads = []
        for i in range(5):
            threads.append(Thread(target=calc))
            threads[i].start()
        for i in range(5):
            threads[i].join()

        print('thread %s ended.' % current_thread().name)
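In the code above, global_data is the ThreadLocal object. You can treat it as a global variable, but each of its attributes, such as global_data.num, is local to the thread that sets it, so there are no access conflicts.
Let's look at the output: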
thread MainThread is running...
thread Thread-94 is running...
thread Thread-95 is running...
thread Thread-96 is running...
thread Thread-97 is running...
thread Thread-98 is running...
Thread-96 10000
thread Thread-96 ended.
Thread-97 10000
thread Thread-97 ended.
Thread-95 10000
thread Thread-95 ended.
Thread-98 10000
thread Thread-98 ended.
Thread-94 10000
thread Thread-94 ended.
thread MainThread ended.
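To see the isolation directly, here is a small sketch (the names data, seen and worker are chosen for illustration). An attribute set on a threading.local object in the main thread does not exist in a worker thread, and a value set by the worker does not leak back:

```python
from threading import Thread, local

data = local()
data.num = 1  # set in the main thread only

seen = {}     # ordinary dict used to report what the worker observed

def worker():
    # The attribute set by the main thread does not exist in this thread.
    seen['before'] = hasattr(data, 'num')
    data.num = 99  # creates a separate, worker-only value
    seen['after'] = data.num

t = Thread(target=worker)
t.start()
t.join()

print(seen['before'])  # False
print(seen['after'])   # 99
print(data.num)        # 1: the main thread's value is untouched
```

Because each thread starts with an empty set of attributes, a thread must assign an attribute before reading it, which is why calc sets global_data.num = 0 before incrementing it.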