How to save customer data in Python: 1. Use [with open()] to create a file object and write the data to it; 2. Use the pandas package to save the data, after importing it with [import pandas as pd].
Methods for saving customer data permanently in Python:
1. Save with the open function
Use with open() to create a new file object
Write the data (here we use the short reviews of a book on Douban Reading as an example)
    import requests
    from lxml import etree

    # Send the request
    url = 'https://book.douban.com/subject/1054917/comments/'
    head = {'User-Agent':'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.94 Safari/537.36'}
    r = requests.get(url, headers=head)

    # Parse the HTML
    s = etree.HTML(r.text)
    comments = s.xpath('//div[@class="comment"]/p/text()')
    # print(str(comments))  # While writing the code, it helps to print what was read

    # Save the data with the open function
    # Use with open() to create the file object f
    with open('D:/PythonWorkSpace/TestData/pinglun.txt', 'w', encoding='utf-8') as f:
        for i in comments:
            print(i)
            # Write the data; the file is saved at the path given above,
            # and \n adds a line break so the file is easier to read
            f.write(i + '\n')
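The write step above depends on a live page, so a minimal offline sketch may be easier to try first. It assumes a hypothetical list, sample_comments, in place of the scraped reviews, and it writes to a temporary directory instead of the D: path used above.

```python
import os
import tempfile

# Hypothetical stand-in for the scraped comments (not real Douban data)
sample_comments = ["A moving story.", "Worth reading twice.", "The ending felt rushed."]

# Write into a temporary directory so the sketch runs on any machine
out_dir = tempfile.mkdtemp()
out_path = os.path.join(out_dir, "pinglun.txt")

with open(out_path, "w", encoding="utf-8") as f:
    for comment in sample_comments:
        f.write(comment + "\n")  # one comment per line

# The with block closes the file automatically when it ends
file_is_closed = f.closed

# Read the file back to confirm what was saved
with open(out_path, "r", encoding="utf-8") as f:
    saved_lines = f.read().splitlines()

print(file_is_closed)
print(saved_lines)
```

Because the with block closes the file on exit, no explicit f.close() call is needed, even if an error occurs inside the loop.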
The second argument passed to open above is the open mode of the open function:
Parameter    Usage
r            Read only. An error is raised if the file does not exist.
w            Write only. The file is created automatically if it does not exist; if it already exists, its contents are overwritten.
a            Append. Data is added to the end of the file.
rb, wb, ab   The same as r, w and a, but operate on binary data.
r+           Opens the file in read-and-write mode.
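The modes listed above can be tried out with a short sketch. It uses a temporary file whose name, modes_demo.txt, is only an illustration; note that Python spells the read-and-write mode 'r+'.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

# 'r' on a file that does not exist raises FileNotFoundError
try:
    open(path, "r", encoding="utf-8")
    missing_file_raised = False
except FileNotFoundError:
    missing_file_raised = True

# 'w' creates the file (and would empty it if it already existed)
with open(path, "w", encoding="utf-8") as f:
    f.write("first\n")

# 'a' adds to the end of the file instead of overwriting it
with open(path, "a", encoding="utf-8") as f:
    f.write("second\n")

with open(path, "r", encoding="utf-8") as f:
    after_append = f.read()

# 'r+' allows reading and writing; writing starts at the beginning of the file
with open(path, "r+", encoding="utf-8") as f:
    f.write("FIRST")
    f.seek(0)
    after_rplus = f.read()

# 'rb' returns bytes instead of str
with open(path, "rb") as f:
    raw = f.read()

print(missing_file_raised, repr(after_append), repr(after_rplus), type(raw))
```

For saving scraped text, 'w' suits a fresh export on each run, while 'a' suits adding new records to an existing file.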
2. Save with the pandas package
Before using pandas, it is worth mentioning two data analysis packages that are closely related to it, numpy and matplotlib (note: pandas, numpy and matplotlib all need to be installed in advance. For installation details, please see the previous blog post about installing packages with pip)
numpy: short for Numerical Python, a basic package for high-performance scientific computing and data analysis
pandas: a Python package built on numpy that provides advanced data structures and manipulation tools to make data analysis easier
matplotlib: a plotting package for creating publication-quality charts (mainly 2D)
    import pandas as pd  # Import pandas
    import numpy as np  # Import numpy
    import matplotlib.pyplot as plt  # Import matplotlib
Next, I will demonstrate how to save data to CSV and Excel with pandas:
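    # Import the packages
    import pandas as pd
    import numpy as np

    df = pd.DataFrame(np.random.randn(10, 4))  # Create random values
    # View the first rows of the DataFrame; the default is the first 5 rows
    # (all rows are shown if there are fewer than 5), or pass the number of rows to show
    # print(df.head(2))
    # View the last rows of the DataFrame; the default is the last 5 rows
    # (all rows are shown if there are fewer than 5), or pass the number of rows to show
    print(df.tail())
    df.to_csv('D:/PythonWorkSpace/TestData/PandasNumpy.csv')  # Save to CSV
    # Save to Excel (the openpyxl library must be installed first: pip install openpyxl)
    # df.to_excel('D:/PythonWorkSpace/TestData/PandasNumpy.xlsx')

The code for saving the Douban Reading short reviews from the example is as follows:

    import requests
    from lxml import etree

    # Send the request
    url = 'https://book.douban.com/subject/1054917/comments/'
    head = {'User-Agent':'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.94 Safari/537.36'}
    r = requests.get(url, headers=head)

    # Parse the HTML
    s = etree.HTML(r.text)
    comments = s.xpath('//div[@class="comment"]/p/text()')
    # print(str(comments))  # While writing the code, it helps to print what was read

    '''
    # Save the data with the open function
    # Use with open() to create the file object f
    with open('D:/PythonWorkSpace/TestData/pinglun.txt', 'w', encoding='utf-8') as f:
        for i in comments:
            print(i)
            # Write the data; the file is saved at the path given above,
            # and \n adds a line break so the file is easier to read
            f.write(i + '\n')
    '''

    # Save the data to CSV and Excel with pandas
    import pandas as pd
    df = pd.DataFrame(comments)
    # print(df.head())  # head() shows the first 5 rows by default
    df.to_csv('D:/PythonWorkSpace/TestData/PandasNumpyCSV.csv')
    # df.to_excel('D:/PythonWorkSpace/TestData/PandasNumpyEx.xlsx')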
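By default, to_csv also writes the row index as an extra first column, which may not be wanted when the file is only a list of comments. The following is a minimal sketch of one way to control that, not part of the original example: the sample_comments list and the column name comment are hypothetical, and the file goes to a temporary directory rather than the D: path.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the scraped comments (not real Douban data)
sample_comments = ["A moving story.", "Worth reading twice.", "The ending felt rushed."]

# Give the single column an explicit name instead of the default 0
df = pd.DataFrame({"comment": sample_comments})

csv_path = os.path.join(tempfile.mkdtemp(), "comments.csv")

# index=False leaves the row index out of the file
df.to_csv(csv_path, index=False, encoding="utf-8")

# Read the file back to check that the data round-trips
loaded = pd.read_csv(csv_path, encoding="utf-8")
print(loaded.shape)
print(loaded["comment"].tolist())
```

Reading the file back with read_csv is a quick way to confirm that the saved data matches what was written.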