Teach you step by step how to use Python web crawler to obtain fund information

Release: 2023-07-24 14:53:20

## 1. Preface

A few days ago, a fan came to me asking how to get fund information, so I am sharing the approach here. Friends who are interested are welcome to try it themselves.


## 2. Data Acquisition

Our target here is the official website of a certain fund platform, and the data that needs to be crawled is shown in the figure below.

[screenshot]

You can see that the fund code column in the figure above contains different codes. Click any one of them to open that fund's detail page. The detail links are also very regular, using the fund code as the identifier.
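
If you want to see that idea in code form, a detail-page link can be pieced together from the fund code. The domain, path, and sample code below are made-up placeholders, not the real site's addresses:

```python
# Hypothetical URL pattern: the real domain and path depend on the target site.
fund_code = "123456"  # placeholder fund code taken from the list page
detail_url = f"http://fund.example.com/{fund_code}.html"
print(detail_url)
```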

In fact, this website is not difficult: the data is not encrypted, and the information on the page can be seen directly in the page source.

[screenshot]

This lowers the difficulty of the crawl. By capturing packets in the browser, you can see the specific request parameters. Only the `pi` parameter changes between requests, and its value happens to correspond to the page number, so you can construct the request parameters directly.
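
Based on that observation, a paging loop can be sketched as below. The URL, headers, and other parameter names here are placeholder assumptions; only the idea of incrementing `pi` for each page comes from the packet capture described above.

```python
import requests

url = "http://example.com/fund/list"       # placeholder for the captured API address
headers = {"User-Agent": "Mozilla/5.0"}    # a basic request header

for page in range(1, 11):                  # e.g. crawl the first 10 pages
    params = {
        "pi": page,                        # pi is the page number, as observed above
        # ...keep the other captured parameters unchanged...
    }
    response = requests.get(url, headers=headers, params=params, verify=False)
    print(page, response.status_code)
```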

Code implementation process

After finding the data source, the next step is to implement the code. Some of the key snippets are shown below.

Get the stock id data

import re
import requests

response = requests.get(url, headers=headers, params=params, verify=False)
pattern = re.compile(r'.*?"(?P<items>.*?)".*?', re.S)    # capture the content inside each pair of double quotes
result = re.finditer(pattern, response.text)
ids = []
for item in result:
    # print(item.group('items'))
    gp_id = item.group('items').split(',')[0]            # the first comma-separated field is the fund code
    ids.append(gp_id)

The result is as shown below:

[screenshot]

Next, construct the detail-page link for each fund and obtain the fund information from that page. The key code is as follows:

import requests
from lxml import etree

response = requests.get(url, headers=headers)
response.encoding = response.apparent_encoding           # avoid garbled Chinese text
selectors = etree.HTML(response.text)
# unit net value (danweijingzhi) and the value shown next to it
danweijingzhi1 = selectors.xpath('//dl[@class="dataItem02"]/dd[1]/span[1]/text()')[0]
danweijingzhi2 = selectors.xpath('//dl[@class="dataItem02"]/dd[1]/span[2]/text()')[0]
# accumulated net value (leijijingzhi)
leijijingzhi = selectors.xpath('//dl[@class="dataItem03"]/dd[1]/span/text()')[0]
# all text from the fund's basic-information table
lst = selectors.xpath('//div[@class="infoOfFund"]/table//text()')

The result is as shown in the figure below:

[screenshot]

The extracted fields are then processed into the corresponding strings and saved to a CSV file.
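
A minimal sketch of that saving step is shown below. The file name and column order are assumptions; the field names come from the snippets above.

```python
import csv

# Append one row per fund; write a header row separately before the crawl starts.
with open("fund_info.csv", "a", newline="", encoding="utf-8-sig") as f:
    writer = csv.writer(f)
    writer.writerow([gp_id, danweijingzhi1, danweijingzhi2, leijijingzhi, " ".join(lst)])
```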

The results are shown below:

[screenshot]

With this data in hand, you can go on to do further statistics and data analysis.
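
For example, once the CSV file exists, a quick first look with pandas might go like this (the file and columns follow the sketch above, so they are assumptions):

```python
import pandas as pd

df = pd.read_csv("fund_info.csv")
print(df.head())        # preview the saved fund records
print(df.describe())    # basic statistics for any numeric columns
```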

## 3. Summary

Hello everyone, I am an advanced Python learner. This article mainly shared how to use a Python web crawler to obtain fund data. The project is not particularly difficult, but there are a few pitfalls; everyone is welcome to try it. If you run into any problems, add me as a friend and I will help you solve them.

This article only covers the [stock type] category; I have not done the other types, and you are welcome to try them. The logic is exactly the same: only the parameters need to change.

Source: Go语言进阶学习