How does Go Colly find the requested element?

PHPz
Release: 2024-02-13 13:57:08

In this article, PHP editor Banana looks at Go Colly, a lightweight web-scraping framework written in Go that is known for high performance, good concurrency support, and easy extensibility. When scraping a page with Colly, you often need to locate one specific element. So how does Colly find the element you ask for? The question and answer below walk through a case where a table selector matched nothing.

Question content

I am trying to use Colly to loop through the contents of a specific table, but the table is not recognized. This is what I have so far:

package main

import (
    "fmt"
    
    "github.com/gocolly/colly"
)

func main() {
    c := colly.NewCollector(
        colly.AllowedDomains("wikipedia.org", "en.wikipedia.org"),
    )
    
    links := make([]string, 0)

    c.OnHTML("div.mw-parser-output", func(e *colly.HTMLElement) {
        
        e.ForEach("table.wikitable.sortable.jquery-tablesorter > tbody > tr", func(_ int, elem *colly.HTMLElement) {
            fmt.Println(elem.ChildAttr("a[href]", "href"))
            links = append(links, elem.ChildAttr("a[href]", "href"))
        })
    })
    
    c.OnRequest(func(r *colly.Request) {
        fmt.Println("Visiting", r.URL.String())
    })

    c.Visit("https://en.wikipedia.org/wiki/List_of_countries_and_dependencies_by_population")
    fmt.Println("Found urls for", len(links), "countries.")
}

I need to loop through all tr elements in the table.

Workaround

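It turns out that the selector should be table.wikitable.sortable, even though the Chrome console shows the table's classes as wikitable sortable jquery-tablesorter. The most likely reason for the difference is that the jquery-tablesorter class is added by JavaScript after the page loads in the browser. Colly does not execute JavaScript, so it only sees the classes present in the HTML the server sends. Dropping that class from the selector solved my problem.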

source:stackoverflow.com