Is there an open source tool for collecting data from web pages?
For example, it should support chained capture rules: crawl the pagination pages first, get the detail page links from them, and then extract only the DOM fields actually needed from each detail page.
It should save the final results to a database in a customizable way,
be able to spoof the IP address,
provide an automatic queue mechanism and automatic delays,
and so on.
Thank you
Yes, you can try the Archer Cloud Crawler Development Platform.
Archer Cloud Crawler is a SaaS platform that helps JS developers build crawler systems quickly. It provides an easy-to-use, flexible, and open cloud crawler development framework, so a developer can implement a crawler by writing just a few lines of JS code online. The crawler then runs automatically on cloud servers, which the platform says makes crawling faster and more efficient.
phpcrawler: a PHP crawler / collector with multi-process and multi-thread support
phpQuery
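
If none of the tools above fit, the flow described in the question (list pages first, then detail pages, then fields saved to a database, with a queue and a delay) can be sketched with Python's standard library alone. This is only a rough sketch under assumptions: the names `crawl` and `LinkAndTitleParser`, the `/detail/` and `page=` URL markers, the `items` table, and extracting just the first `<h1>` are all made up for illustration and would need adapting to the real site. Proxy or IP rotation is not covered here.

```python
import sqlite3
import time
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkAndTitleParser(HTMLParser):
    """Collects every <a href> and the text of the first <h1> on a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.title = ""
        self._in_h1 = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        elif tag == "h1" and not self.title:
            self._in_h1 = True

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_h1 = False

    def handle_data(self, data):
        if self._in_h1:
            self.title += data.strip()


def fetch(url):
    """Download a page as text (no retries, no proxy handling)."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl(list_url, db, fetch=fetch, delay=1.0,
          detail_marker="/detail/", page_marker="page="):
    """Breadth-first crawl: list pages feed the queue, detail pages are stored.

    detail_marker and page_marker are hypothetical URL patterns used to
    decide which links to follow; adjust them to the target site.
    """
    db.execute(
        "CREATE TABLE IF NOT EXISTS items (url TEXT PRIMARY KEY, title TEXT)"
    )
    queue = deque([list_url])
    seen = {list_url}
    while queue:
        url = queue.popleft()
        parser = LinkAndTitleParser()
        parser.feed(fetch(url))
        if detail_marker in url:
            # Detail page: keep only the field we care about.
            db.execute(
                "INSERT OR REPLACE INTO items VALUES (?, ?)", (url, parser.title)
            )
        else:
            # List page: enqueue unseen detail links and further list pages.
            for href in parser.links:
                link = urljoin(url, href)
                wanted = detail_marker in link or page_marker in link
                if wanted and link not in seen:
                    seen.add(link)
                    queue.append(link)
        if queue:
            time.sleep(delay)  # fixed delay between requests
    db.commit()
```

To try it, something like `crawl("https://example.com/list?page=1", sqlite3.connect("items.db"))` would work, where the URL is a placeholder. The `fetch` parameter can be swapped for a function that goes through a proxy or a different HTTP client without changing the rest.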