Parsing a Website for Information with Jsoup
To extract information from a web page in a Java program, you can use an HTML parser such as Jsoup. Jsoup stands out because it supports jQuery-like CSS selectors and returns the matches as a collection of elements that is easy to iterate over.
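As a rough illustration of selector-based extraction and iteration, the sketch below parses a small inline HTML string instead of a live page. The markup, the class name `SelectorDemo`, and the helper `describeItems` are all invented for this example; only `Jsoup.parse`, `select`, and `text` are Jsoup calls.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class SelectorDemo {
    // Hypothetical markup, made up for this illustration (not taken from any real site)
    static final String SAMPLE_HTML =
          "<ul id=\"products\">"
        + "<li class=\"item\"><span class=\"name\">Headphones</span><span class=\"price\">$399.99</span></li>"
        + "<li class=\"item\"><span class=\"name\">Speaker</span><span class=\"price\">$129.99</span></li>"
        + "</ul>";

    // Returns one "name: price" line per list item found in the given HTML
    static List<String> describeItems(String html) {
        Document doc = Jsoup.parse(html);
        List<String> lines = new ArrayList<>();
        // select() takes a CSS selector and returns every matching element
        for (Element item : doc.select("ul#products > li.item")) {
            // Selectors can be applied again, relative to each matched element
            String name = item.select(".name").text();
            String price = item.select(".price").text();
            lines.add(name + ": " + price);
        }
        return lines;
    }

    public static void main(String[] args) {
        describeItems(SAMPLE_HTML).forEach(System.out::println);
    }
}
```

Running `main` with Jsoup on the classpath should print one line per item, such as `Headphones: $399.99`. The same loop applies to a `Document` fetched over the network, provided the selectors match that page's markup.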
To begin, add the latest Jsoup JAR file to your classpath. The following example fetches a Best Buy product page and extracts its title, price, and description:
<code class="java">import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class WebsiteScanner {
    public static void main(String[] args) throws Exception {
        String url = "https://www.bestbuy.com/site/sony-wh-1000xm5-wireless-bluetooth-noise-canceling-over-the-ear-headphones-black/6497835.p?skuId=6497835";

        // Fetch the page over HTTP and parse it into a Document
        Document document = Jsoup.connect(url).get();

        // Pick out elements with CSS selectors; the class names must match the page's current markup
        String title = document.select("h1.v-pdp-product-title").text();
        String price = document.select(".v-pdp-price-amount").text();
        String description = document.select(".v-pdp-main-description").text();

        // Print the extracted values
        System.out.println("Title: " + title);
        System.out.println("Price: " + price);
        System.out.println("Description: " + description);
    }
}
</code>
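Calling `select(...).text()` returns an empty string when nothing matches, so a changed class name on the page fails silently. The sketch below shows one possible way to make that case explicit with `selectFirst`, and to set a user agent and timeout on the connection. The class `SafeExtract`, the helpers `textOrDefault` and `fetch`, the user-agent string, and the sample markup are assumptions made for illustration, not part of the original example.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.io.IOException;

public class SafeExtract {
    // Returns the text of the first element matching cssQuery, or fallback if none matches
    static String textOrDefault(Document doc, String cssQuery, String fallback) {
        Element first = doc.selectFirst(cssQuery); // null when there is no match
        return first != null ? first.text() : fallback;
    }

    // Fetches a page with an explicit user agent and a 10-second timeout
    // (both values are arbitrary choices for this sketch)
    static Document fetch(String url) throws IOException {
        return Jsoup.connect(url)
                .userAgent("Mozilla/5.0 (compatible; ExampleScanner/1.0)")
                .timeout(10_000)
                .get();
    }

    public static void main(String[] args) {
        // Inline markup keeps the demo runnable without network access
        Document doc = Jsoup.parse("<h1 class=\"title\">Sample product</h1>");
        System.out.println(textOrDefault(doc, "h1.title", "(no title)"));
        System.out.println(textOrDefault(doc, ".price", "(no price)"));
    }
}
```

With a live page, `fetch(url)` would take the place of `Jsoup.parse(...)`. Jsoup does not execute JavaScript, so content that a page renders in the browser may be missing from the fetched document, in which case the fallback value is returned.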