I have a Go colly crawler that I use to crawl many websites. My terminal prints a lot of:
2023/05/30 02:22:56 Max depth limit reached
2023/05/30 02:22:56 Max depth limit reached
2023/05/30 02:22:56 Max depth limit reached
2023/05/30 02:22:56 Max depth limit reached
2023/05/30 02:22:56 Max depth limit reached
2023/05/30 02:22:56 Max depth limit reached
2023/05/30 02:22:56 Max depth limit reached
This makes it difficult to read the output I print myself. Is there any way to stop this message from being printed in the terminal? Thank you.
"Max depth limit reached" is the message of colly.ErrMaxDepth. Your project must have code like this:

c := colly.NewCollector(colly.MaxDepth(5))
// ...
if err := c.Visit("http://go-colly.org/"); err != nil {
    log.Println(err)
}
If you don't want this error logged, add a simple check to exclude it:

c := colly.NewCollector(colly.MaxDepth(5))
// ...
if err := c.Visit("http://go-colly.org/"); err != nil {
    // Log the error only when it is not ErrMaxDepth.
    if err != colly.ErrMaxDepth {
        log.Println(err)
    }
}
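Another option is to redirect the output to a file. The log package writes to standard error by default, so redirect standard output to the file first and then point standard error at it:

go run . >log.txt 2>&1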
Or use tee to copy the output to both a file and standard output:

go run . 2>&1 | tee log.txt