I'd suggest posting the URL, or better yet your own code, so that people can help you debug it. Some difference is normal. If the page your crawler saves as static HTML differs from what you see in the browser, the site's anti-crawler mechanism has most likely recognized the crawler, so the server returns different content. There are many ways to identify crawlers. If you still have questions, feel free to ask again.
I tested this, and my conclusion is that bs4 (BeautifulSoup) changes the order of attributes.
1. Right-click the page in the browser and select:
2. Compare in a Python 3 program:
Result:
Only the order of the class and id attributes differs.
If you view the source of the same web page in Chrome and in Firefox, the attribute order differs there too.
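If attribute order is the only difference, comparing the raw strings will always report a mismatch. A minimal sketch of an order-insensitive comparison follows. It uses the standard-library `html.parser` instead of bs4 so that it runs without extra installs, and the two HTML snippets and the names `TagCollector` and `tag_signature` are hypothetical, made up for illustration.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collect (tag, attributes) pairs from a piece of HTML."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # dict equality ignores insertion order, so attribute order stops mattering
        self.tags.append((tag, dict(attrs)))


def tag_signature(html):
    """Return the list of start tags with their attributes as dicts."""
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return collector.tags


# Hypothetical snippets: the same element with attributes in a different order
from_browser = '<div id="main" class="box">hello</div>'
from_crawler = '<div class="box" id="main">hello</div>'

print(from_browser == from_crawler)                                # raw strings differ
print(tag_signature(from_browser) == tag_signature(from_crawler))  # parsed tags match
```

This sketch only looks at start tags and their attributes, not at text content. If the parsed comparison still reports a difference, the server probably did send different content.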
OP, I'd recommend posting all of your source code, because a website can tell whether a request comes from a real browser operated by a person or from a crawler.
Looking at your current code, I'd suggest adding request headers, specifically the line that sets User-Agent!
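As a rough illustration of adding that header, here is a sketch using the standard-library `urllib.request`. The original code is not shown in the thread, so the URL, the User-Agent string, and the function names `build_request` and `fetch` are all assumptions.

```python
import urllib.request

# Hypothetical values: replace with the real target URL and a current browser User-Agent
URL = "https://example.com/"
HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/120.0.0.0 Safari/537.36"
    )
}


def build_request(url):
    """Build a request that carries a browser-like User-Agent header."""
    return urllib.request.Request(url, headers=HEADERS)


def fetch(url):
    """Download the page and return its body as text."""
    with urllib.request.urlopen(build_request(url), timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


# Usage (performs a real network request):
# html = fetch(URL)
```

If the code uses the `requests` library instead, the same dictionary can be passed as `requests.get(url, headers=HEADERS)`. A User-Agent header alone may not be enough, since sites have other ways to identify crawlers.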