Go's regexp package falls short for stream processing: nearly all of its methods require a string or []byte. The regexpscanner module makes it easy to extract tokens that match a regular expression pattern from a stream.
https://pkg.go.dev/github.com/tonymet/regexpscanner
go get github.com/tonymet/regexpscanner@latest
Use ProcessTokens when a simple callback-based stream tokenizer is needed.
ProcessTokens calls handler(string) for each matching token from the Scanner.
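package main

import (
	"fmt"
	"regexp"
	"strings"

	rs "github.com/tonymet/regexpscanner"
)

func main() {
	rs.ProcessTokens(
		// Any io.Reader works as input; a string reader keeps the example short.
		strings.NewReader("<html><body><p>Welcome to My Website</p></body></html>"),
		// Match opening and closing tags made of lowercase letters.
		regexp.MustCompile(`</?[a-z]+>`),
		// The handler is called once per matching token.
		func(text string) {
			fmt.Println(text)
		})
}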
<html>
<body>
<p>
</p>
</body>
</html>
Give it a try, and see the Go module page for more examples.