jsoup: Summary of HTML parsing usage

WBOY
Release: 2016-06-24 11:42:42

1. Parsing methods

(1) Parse from a string

String html = "<html><head><title>First parse</title></head>"
    + "<body><p>Parse HTML into a doc.</p></body></html>";

Document doc = Jsoup.parse(html);
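As a minimal sketch of what the parsed Document gives you, the example below parses a small in-memory page and reads back its title and body text. The class name ParseFromString and the exact markup are illustrative assumptions; only Jsoup.parse(String), title(), body() and text() are jsoup calls.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class ParseFromString {
    // Parses an in-memory HTML string; no network or file access is involved.
    static Document parse() {
        // Assumed sample markup, used only for illustration
        String html = "<html><head><title>First parse</title></head>"
                + "<body><p>Parse HTML into a doc.</p></body></html>";
        return Jsoup.parse(html);
    }

    public static void main(String[] args) {
        Document doc = parse();
        System.out.println(doc.title());       // First parse
        System.out.println(doc.body().text()); // Parse HTML into a doc.
    }
}
```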

(2) Parse from a URL

Document doc = Jsoup.connect("http://example.com/").get();

String title = doc.title();

Document doc = Jsoup.connect("http://example.com").data("query", "Java").userAgent("Mozilla").cookie("auth", "token").timeout(3000).post();
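Because get() and post() perform network I/O, they can throw IOException. The sketch below separates building the request from sending it and handles the failure case. The class FetchFromUrl and the helper names configure and postAndGetTitle are hypothetical; the values mirror the call above.

```java
import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

import java.io.IOException;

public class FetchFromUrl {
    // Builds the request but does not send it yet.
    static Connection configure(String url) {
        return Jsoup.connect(url)
                .data("query", "Java")   // request parameter
                .userAgent("Mozilla")    // User-Agent request header
                .cookie("auth", "token") // cookie sent with the request
                .timeout(3000);          // timeout in milliseconds
    }

    // Sends the request as a POST and returns the page title, or null on failure.
    static String postAndGetTitle(String url) {
        try {
            Document doc = configure(url).post();
            return doc.title();
        } catch (IOException e) {
            // Covers network errors, timeouts and HTTP error statuses
            System.err.println("Fetch failed: " + e.getMessage());
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(postAndGetTitle("http://example.com/"));
    }
}
```

Nothing is sent until get() or post() is called, so the configured Connection can be inspected or reused beforehand.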

(3) Parse from a file

File input = new File("/tmp/input.html");

Document doc = Jsoup.parse(input, "UTF-8", "http://example.com/");
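The third argument is the base URI, which jsoup uses to resolve relative links found in the file. A small sketch of that behavior follows, assuming a temporary file in place of /tmp/input.html; the class ParseFromFile and the helper resolveLink are hypothetical names.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

public class ParseFromFile {
    // Writes a small page to a temporary file, parses it, and resolves its link.
    static String resolveLink() {
        try {
            File input = File.createTempFile("input", ".html");
            input.deleteOnExit();
            Files.write(input.toPath(),
                    "<p><a href=\"/about\">About</a></p>".getBytes(StandardCharsets.UTF_8));

            Document doc = Jsoup.parse(input, "UTF-8", "http://example.com/");
            // absUrl resolves the relative href against the base URI passed to parse
            return doc.getElementsByTag("a").first().absUrl("href");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolveLink()); // http://example.com/about
    }
}
```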

2. Working with elements

(1) Find elements

getElementsByTag(String tag)

getElementsByClass(String className)

getElementsByAttribute(String key)

siblingElements(), firstElementSibling(), lastElementSibling(), nextElementSibling(), previousElementSibling()

parent(), children(), child(int index)
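The lookup and traversal methods above can be tried on a small list. This is a sketch only: the class NavigateDom, the helper sample and the markup (including the data-active attribute) are assumptions made for the example.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class NavigateDom {
    // Assumed sample markup: a list with three items, the second one flagged
    static Document sample() {
        return Jsoup.parse(
                "<ul id=\"menu\">"
              + "<li class=\"item\">One</li>"
              + "<li class=\"item\" data-active>Two</li>"
              + "<li class=\"item\">Three</li>"
              + "</ul>");
    }

    public static void main(String[] args) {
        Document doc = sample();

        // Lookups return an Elements collection, even for a single match
        Elements byTag = doc.getElementsByTag("li");
        Elements byClass = doc.getElementsByClass("item");
        Elements byAttr = doc.getElementsByAttribute("data-active");
        System.out.println(byTag.size() + " " + byClass.size() + " " + byAttr.size()); // 3 3 1

        Element two = byAttr.first();
        System.out.println(two.previousElementSibling().text()); // One
        System.out.println(two.nextElementSibling().text());     // Three
        System.out.println(two.firstElementSibling().text());    // One
        System.out.println(two.lastElementSibling().text());     // Three
        System.out.println(two.siblingElements().size());        // 2 (excludes the element itself)

        Element list = two.parent();
        System.out.println(list.id());              // menu
        System.out.println(list.children().size()); // 3
        System.out.println(list.child(0).text());   // One
    }
}
```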

(2) Get element data

attr(String key): get the value of the attribute named key

attributes(): get all attributes

id(), className(), classNames()

text(): get the text content

html(): get the HTML content inside the element

outerHtml(): get the HTML content including this element

data(): get the data content, such as the content of a script or style element
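To show how these accessors differ, the sketch below reads the same small fragment in several ways. The class ElementData, the helper sample and the markup are assumptions for illustration. No expected output is shown for html() and outerHtml(), since their whitespace depends on jsoup's output settings.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ElementData {
    // Assumed sample markup: one div with nested inline markup, plus a script
    static Document sample() {
        return Jsoup.parse(
                "<div id=\"box\" class=\"note big\"><p>Hello <b>world</b></p></div>"
              + "<script>var x = 1;</script>");
    }

    public static void main(String[] args) {
        Document doc = sample();
        Element div = doc.getElementById("box");

        System.out.println(div.attr("class"));         // note big
        System.out.println(div.attributes().size());   // 2
        System.out.println(div.id());                  // box
        System.out.println(div.className());           // note big
        System.out.println(div.classNames().size());   // 2 (a set of the individual class names)
        System.out.println(div.text());                // Hello world

        // Inner markup versus markup including the div itself
        System.out.println(div.html());
        System.out.println(div.outerHtml());

        // Script content is exposed through data(), not text()
        Element script = doc.getElementsByTag("script").first();
        System.out.println(script.data());             // var x = 1;
    }
}
```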