I need to visit multiple links and extract information from each of them. The problem is that when I call "c.Visit(URL)" inside a loop, the number of logged visits grows with every iteration. Example:
package main

import (
	"fmt"

	"github.com/gocolly/colly"
)

func main() {
	CATETORIES := []string{
		"cate1",
		"cate2",
		"cate3",
	}

	c := colly.NewCollector()

	for _, cate := range CATETORIES {
		c.OnRequest(func(r *colly.Request) {
			fmt.Println("Visiting categories", r.URL)
		})

		c.Visit(cate)
	}
}
This will print:
Visiting categories http://cate1
Visiting categories http://cate2
Visiting categories http://cate2
Visiting categories http://cate3
Visiting categories http://cate3
Visiting categories http://cate3
I tried creating a new collector on each iteration, which worked fine: each category was then visited exactly once, in order (http://cate1, http://cate2, http://cate3). But doing it this way I lose my login session. Any suggestions?
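You are adding a new OnRequest handler on every loop iteration. Handlers accumulate on the collector rather than replacing each other, so the second visit runs two handlers and the third runs three, which is exactly the 1, 2, 3 pattern in your output. Register the handler once, outside the loop: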
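package main

import (
	"fmt"

	"github.com/gocolly/colly"
)

func main() {
	CATETORIES := []string{
		"cate1",
		"cate2",
		"cate3",
	}

	c := colly.NewCollector()

	// Register the handler once; it runs for every request made by c.
	c.OnRequest(func(r *colly.Request) {
		fmt.Println("Visiting categories", r.URL)
	})

	// Reusing the same collector keeps its state across visits.
	for _, cate := range CATETORIES {
		c.Visit(cate)
	}
}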
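To see why the output grows, here is a minimal sketch that uses only the standard library. The toyCollector type and the visitLog helper are hypothetical stand-ins written for this illustration, not colly's actual implementation; the only assumption carried over is that registering a handler appends it to a list and every visit runs the whole list.

```go
package main

import "fmt"

// toyCollector is a hypothetical stand-in for a collector: it keeps
// request handlers in a slice and runs all of them on every visit.
type toyCollector struct {
	onRequest []func(url string)
}

// OnRequest appends a handler; it never replaces earlier ones.
func (c *toyCollector) OnRequest(f func(url string)) {
	c.onRequest = append(c.onRequest, f)
}

// Visit runs every registered handler for the given URL.
func (c *toyCollector) Visit(url string) {
	for _, f := range c.onRequest {
		f(url)
	}
}

// visitLog visits each URL and returns the lines the handlers produced.
// With insideLoop set, a handler is registered on every iteration.
func visitLog(urls []string, insideLoop bool) []string {
	c := &toyCollector{}
	var log []string
	handler := func(url string) {
		log = append(log, "Visiting categories "+url)
	}
	if !insideLoop {
		c.OnRequest(handler)
	}
	for _, u := range urls {
		if insideLoop {
			c.OnRequest(handler)
		}
		c.Visit(u)
	}
	return log
}

func main() {
	urls := []string{"http://cate1", "http://cate2", "http://cate3"}
	fmt.Println(len(visitLog(urls, true)))  // prints 6: 1 + 2 + 3 handler calls
	fmt.Println(len(visitLog(urls, false))) // prints 3: one handler call per visit
}
```

Registering inside the loop yields six log lines for three URLs, matching the output in the question, while registering once yields three.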