We need to read data from a third-party interface. The data set is fairly large, so the third party paginates it. My current plan is to call the interface with curl in a loop, decode each JSON response into an array, merge the arrays, and then filter the data and store it in the database. However, the third-party interface is unstable, so a read may fail. Is there a better solution?
If the amount of data is large, you can pull it with a scheduled script.
When pulling, sort by a stable field such as an auto-incrementing id, so that changes to the data between requests do not shift the pages.
When the script finishes, or at the end of each loop iteration, record the largest id processed so far, and on the next run add a condition that only fetches ids greater than this value.
If an interface call fails while the script is running, retry it a few times. If it still fails, stop the script, record the last id, and then send an alert so that someone can step in manually.
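The steps above can be sketched roughly as follows. This is only an illustration in Python (the same structure carries over to a PHP curl loop). It assumes the third party accepts an "id greater than" condition and a page size, which the thread does not confirm. `fetch_page`, `load_checkpoint`, `save_checkpoint`, `handle` and `FetchError` are hypothetical names made up for the example, not part of any real API.

```python
import time
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


class FetchError(Exception):
    """Raised by the (hypothetical) fetch function when the interface call fails."""


def fetch_with_retry(
    fetch_page: Callable[[int, int], List[Record]],
    after_id: int,
    limit: int,
    max_attempts: int = 3,
    base_delay: float = 1.0,
) -> List[Record]:
    """Call fetch_page, retrying on FetchError with a linearly growing delay."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch_page(after_id, limit)
        except FetchError:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller stop the script
            time.sleep(base_delay * attempt)
    return []  # only reached if max_attempts < 1


def pull_all(
    fetch_page: Callable[[int, int], List[Record]],
    load_checkpoint: Callable[[], int],
    save_checkpoint: Callable[[int], None],
    handle: Callable[[List[Record]], None],
    limit: int = 100,
    max_attempts: int = 3,
    base_delay: float = 1.0,
) -> bool:
    """Pull every record with id > checkpoint, one page at a time.

    Returns True when the source is exhausted, False when it gave up
    after repeated failures (the checkpoint then still points at the
    last page that was fully handled).
    """
    last_id = load_checkpoint()
    while True:
        try:
            page = fetch_with_retry(fetch_page, last_id, limit, max_attempts, base_delay)
        except FetchError:
            # This is where an alert would be sent for manual intervention.
            return False
        if not page:
            return True
        handle(page)  # filter and store the page
        last_id = max(record["id"] for record in page)
        save_checkpoint(last_id)  # the next run resumes after this id
```

Saving the checkpoint after every page, rather than only at the end, means a failed run loses at most one page of work. A page may still be handled twice if the script dies between `handle` and `save_checkpoint`, so the storage step should tolerate duplicates.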
Well, a simple idea:
1. Set up a database of the raw fetched data, keyed by id or by an md5 hash, so that each record is uniquely identified.
2. The interface is unstable, so handle curl failures with exception handling (or similar) and do your best to make sure each fetch eventually succeeds.
3. Be prepared to fetch the same data more than once. Relying on point 1, make sure no record gets processed twice.
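Points 1 and 3 could look something like the sketch below, written in Python with an SQLite table standing in for the raw database. The table name `raw_records`, the helper names, and the choice to hash the canonical JSON when a record has no `id` field are all assumptions made for illustration.

```python
import hashlib
import json
import sqlite3
from typing import Any, Dict, Iterable, List

Record = Dict[str, Any]


def record_key(record: Record) -> str:
    """Use the source id when present, otherwise an md5 of the canonical JSON."""
    if "id" in record:
        return f"id:{record['id']}"
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return "md5:" + hashlib.md5(canonical.encode("utf-8")).hexdigest()


def open_raw_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the raw table; the primary key enforces uniqueness."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_records ("
        "record_key TEXT PRIMARY KEY, payload TEXT NOT NULL)"
    )
    return conn


def store_new(conn: sqlite3.Connection, records: Iterable[Record]) -> List[Record]:
    """Insert records and return only those that were not stored before."""
    fresh: List[Record] = []
    with conn:  # one transaction per batch
        for record in records:
            cur = conn.execute(
                "INSERT OR IGNORE INTO raw_records (record_key, payload) VALUES (?, ?)",
                (record_key(record), json.dumps(record, sort_keys=True)),
            )
            if cur.rowcount == 1:  # 0 means the key already existed
                fresh.append(record)
    return fresh
```

Only the records returned by `store_new` go on to filtering and final storage, so fetching the same page again after a failure does no harm.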