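I'm writing a crawler and need to store the scraped data in a database. Each page contains many entries (one user may have many visitors, for example), so the insert is written inside a loop: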
try:
    sql_visitor = 'INSERT INTO visitor (ownername,owneruid,visitorname,visitoruid,visittime) VALUE ("%s",%d,"%s",%d,"%s")' % (ownername, owneruid, visitorname, visitoruid, visitortime)
    print(sql_visitor)
    self.cursor.execute(sql_visitor)
    self.connect.commit()
except Exception as e:
    print(e)
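As a side note, the statement above is built with string formatting, so a name containing a double quote would break it. Below is a minimal sketch of the same insert with bound parameters. It uses the standard-library sqlite3 module as a stand-in so that it runs without a MySQL server; the table definition, its primary key, and the helper name insert_visitor are assumptions made for illustration. With a MySQL driver such as pymysql the placeholder would be %s rather than ?.

```python
import sqlite3


def insert_visitor(conn, row):
    """Insert one visitor row using bound parameters instead of string formatting."""
    sql = ("INSERT INTO visitor "
           "(ownername, owneruid, visitorname, visitoruid, visittime) "
           "VALUES (?, ?, ?, ?, ?)")
    try:
        conn.execute(sql, row)
        conn.commit()
        return True
    except sqlite3.Error as e:
        # Undo the failed statement and report it, as the original code does
        conn.rollback()
        print(e)
        return False


# Hypothetical schema: the composite primary key is an assumption
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visitor ("
    "ownername TEXT, owneruid INTEGER, visitorname TEXT, "
    "visitoruid INTEGER, visittime TEXT, "
    "PRIMARY KEY (owneruid, visitoruid, visittime))"
)

row = ('say "hi"', 4893, "Liang2017", 7252799, "2017-5-22 21:06")
ok = insert_visitor(conn, row)   # quotes in the name are handled by the driver
dup = insert_visitor(conn, row)  # the same key again is rejected
```

The driver does the quoting here, so the values never need to be escaped by hand.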
Each page gets its own thread. That felt too slow, so I started 5 of them:
max_threads = 5
while uid < 8000000 or threadlist:
    # Drop finished threads; iterate over a copy so removal is safe
    for thread1 in threadlist[:]:
        if not thread1.is_alive():
            threadlist.remove(thread1)
    # Start new threads until the limit is reached
    while len(threadlist) < max_threads and uid < 8000000:
        uid += 1
        thread2 = threading.Thread(target=run, args=(uid,))
        thread2.daemon = True
        thread2.start()
        threadlist.append(thread2)
    time.sleep(5)
It ran smoothly.
So I set max_threads to 10, and the result was as follows:
INSERT INTO visitor (ownername,owneruid,visitorname,visitoruid,visittime) VALUE ("huosai7",4893,"Liang2017",7252799,"2017-5-22 21:06")
INSERT INTO personalinfo (ownername,owneruid,jifen,huajiao,xiaomijiao,jinbi,haoyou,zhuti,rizhi,xiangce,fenxiang,kongjianfangwenliang,youxiangyanzheng,shipinrenzheng,juzhudi,chushengdi,shangcifabiaoshijian,shangcihuodongshijian,zuihoufangwen,zhuceshijian,zaixianshijian,shengri,xingbie) VALUE("huosai7",4893,0,0,0,0,0,0,0,0,0,0,0,0,"","","2100-01-01 12:00","2100-01-01 12:00","2100-01-01 12:00","2004-1-3 19:28",0,"2100-01-01 12:00",0)
INSERT INTO visitor (ownername,owneruid,visitorname,visitoruid,visittime) VALUE ("龍樂",4894,"Liang2017",7252799,"2017-5-22 21:06")
(1062, "Duplicate entry '489-entry '489. 7252799-2017-05-22 21:06:00' for key 'PRIMARY'")
INSERT INTO personalinfo (ownername,owneruid,jifen,huajiao,xiaomijiao,jinbi,haoyou,zhuti,rizhi,xiangce,fenxiang,kongjianfangwenliang,youxiangyanzheng,shipinrenzheng,juzhudi,chushengdi,shangcifabiaoshijian,shangcihuodongshijian,zuihoufangwen,zhuceshijian,zaixianshijian,shengri,xingbie) VALUE("龍樂",4894,0,0,0,0,0,0,0,0,0,0,0,0,"","","2100-01-01 12:00","2100-01-01 12:00","2100-01-01 12:00","2004-1-3 20:21",0,"2100-01-01 12:00",0)
.......
INSERT INTO visitor (ownername,owneruid,visitorname,visitoruid,visittime) VALUE ("xiao61",4889,"Liang2017",7252799,"2017-5-22 21:06")
(2006, 'MySQL server has gone away')
INSERT INTO personalinfo (ownername,owneruid,jifen,huajiao,xiaomijiao,jinbi,haoyou,zhuti,rizhi,xiangce,fenxiang,kongjianfangwenliang,youxiangyanzheng,shipinrenzheng,juzhudi,chushengdi,shangcifabiaoshijian,shangcihuodongshijian,zuihoufangwen,zhuceshijian,zaixianshijian,shengri,xingbie) VALUE("xiao61",4889,0,0,0,0,0,0,0,0,0,0,0,0,"","","2100-01-01 12:00","2100-01-01 12:00","2100-01-01 12:00","2004-1-3 15:56",0,"2100-01-01 12:00",0)
(2006, 'MySQL server has gone away')
INSERT INTO visitor (ownername,owneruid,visitorname,visitoruid,visittime) VALUE ("糊塗酷酷熊",4897,"Liang2017",7252799,"2017-5-22 21:06")
(2006, 'MySQL server has gone away')
INSERT INTO personalinfo (ownername,owneruid,jifen,huajiao,xiaomijiao,jinbi,haoyou,zhuti,rizhi,xiangce,fenxiang,kongjianfangwenliang,youxiangyanzheng,shipinrenzheng,juzhudi,chushengdi,shangcifabiaoshijian,shangcihuodongshijian,zuihoufangwen,zhuceshijian,zaixianshijian,shengri,xingbie) VALUE("糊塗酷酷熊",4897,611,0,1655,0,0,2,0,0,0,34,0,0,"","","2007-3-27 00:37","2007-3-27 00:37","2007-3-27 00:37","2004-1-3 21:08",0,"2100-01-01 12:00",1)
(2006, 'MySQL server has gone away')
.......
As you can see, error 2006 has now appeared. I then set max_threads to 30, and the result was as follows:
That's all. Is this detailed enough? If not, just tell me what else you need!
See here: my guess is that you are using pymysql. It declares a threadsafety level of 1, which PEP 249 describes in detail:
Threads may share the module, but not connections. In other words, you will probably need to create a separate connection in each thread.
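One way to get a connection per thread is to keep it in threading.local() storage. The following is only a rough sketch under stated assumptions: get_connection, worker and FakeConnection are hypothetical names, and FakeConnection is a stand-in so the example runs without a database. With pymysql you would pass something like lambda: pymysql.connect(...) with your own settings as the connect argument.

```python
import threading

# Attributes set on this object are visible only to the thread that set them
_local = threading.local()


class FakeConnection(object):
    """Stand-in for a real database connection (assumption for this sketch)."""


def get_connection(connect):
    """Return the calling thread's connection, opening it on first use.

    `connect` is any zero-argument callable that opens a new connection.
    """
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = connect()
        _local.conn = conn
    return conn


def worker(uid, seen, lock):
    # Two calls in the same thread should hand back the same connection
    first = get_connection(FakeConnection)
    second = get_connection(FakeConnection)
    with lock:
        seen.append((uid, first, first is second))


seen = []
lock = threading.Lock()
threads = [threading.Thread(target=worker, args=(uid, seen, lock))
           for uid in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each thread ends up with its own connection object, and repeated calls inside one thread reuse it instead of reconnecting. In the crawler, run(uid) would call get_connection at the start and use that connection's cursor for every insert and commit.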
By the way, why not use an ORM for this?