插入语句有两条,循环插入这两条
只是简单写了下插入语句,没有捕捉到异常
def process_item(self, item, spider):
#print(item)
try:
with self.connection.cursor() as cursor:
#Create a new record
sql1 = "INSERT INTO staff (XNXQ, \
department, \
teacher, \
gender, \
title, \
note1, \
note2) VALUES (%s, %s, %s, %s, %s, %s, %s)"
cursor.execute(sql1, (item['first']['XNXQ'],
item['first']['department'],
item['first']['teacher'],
item['first']['gender'],
item['first']['title'],
item['first']['note1'],
item['first']['note2']))
self.connection.commit()
#Create a new record
cursor.execute("select max(id) from staff")
teacherId = cursor.fetchone()['max(id)']
print('teacherId:' + str(teacherId))
print(item['second'])
sql2 = "INSERT INTO staffCourse (teacherId, \
snum, \
course, \
credit, \
teachWay, \
courseType, \
classNum, \
className, \
stuNum, \
week, \
section, \
location) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)"
cursor.execute(sql2, (teacherId,
item['second']['snum'],
item['second']['course'],
item['second']['credit'],
item['second']['teachWay'],
item['second']['courseType'],
item['second']['classNum'],
item['second']['className'],
item['second']['stuNum'],
item['second']['week'],
item['second']['section'],
item['second']['location']))
self.connection.commit()
except Exception as e:
print('------------------------------------------')
print(e)
查看数据库时,发现少了很多,我猜应该是频繁插入导致数据丢失的,因为我在插入数据库之前把数据print了一下,没少。
怎么解决这个问题?
Did you loop it many times at once?
If I remember correctly. The database has a queue cache. If too much data is stuffed into the cache at once and fills up the cache, it will be lost
If there is a large amount of data to be inserted, you must implement the queue yourself and then insert it regularly
Or try transactions
Since I don’t understand python syntax, I only provide 2 solutions from the perspective of SQL:
1. Use transaction method to write data, and submit every 1000 pieces of data, for example:
fake code
2. Change sql to batch writing, and the performance will be greatly improved
You can check the database log and the execution record.
Although you write insert in the code, commit. But when to commit is controlled by the transaction in your project, not by you here. The project may control the transaction from aspects. Solution:
1. Insert in pages, configure transactions, do not insert in one go, insert in batches, and commit data in batches.