Web Page Data Extraction for Java Programs
Extracting information from web pages into a Java program is done through web scraping: fetching a page's HTML and parsing it to locate and extract the data you need.
One highly recommended approach is to use the Jsoup HTML parser, which supports jQuery-like CSS selectors and returns its matches as an Elements collection that can be iterated with an enhanced for loop. The following sample Java code demonstrates the web scraping process:
<code class="java">import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class WebScraper {
    public static void main(String[] args) throws Exception {
        // Target URL
        String url = "https://www.bestbuy.com/site/best-buy-insignia-55-class-f30-series-led-4k-uhd-smart-fire-tv/6494164.p?skuId=6494164";

        // Connect to the URL and parse the HTML content
        Document document = Jsoup.connect(url).get();

        // Get product information using CSS selectors
        // (the selectors depend on the page's current markup and may need adjusting)
        String title = document.select("h1.page-title").text();
        String price = document.select(".priceView-customer-price").text();
        String description = document.select(".product-lang-en-us .product-description-rich-html").text();

        // Print results
        System.out.println("Title: " + title);
        System.out.println("Price: " + price);
        System.out.println("Description: " + description);
    }
}</code>
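Because the sample above depends on a live page whose markup can change, it can help to try the same selector calls on a fixed HTML string first. The following is a minimal sketch, assuming Jsoup is on the classpath; the class name SelectorSketch, the helper methods, and the sample markup are hypothetical and chosen only for illustration. It shows Jsoup.parse on a string, CSS selectors, and an enhanced for loop over the Elements that select returns.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.util.ArrayList;
import java.util.List;

public class SelectorSketch {

    // Hypothetical markup standing in for a downloaded product listing page.
    static final String SAMPLE_HTML =
            "<html><body>"
            + "<h1 class=\"page-title\">Sample TVs</h1>"
            + "<ul>"
            + "<li class=\"product\"><span class=\"name\">TV A</span>"
            + "<span class=\"price\">$199.99</span></li>"
            + "<li class=\"product\"><span class=\"name\">TV B</span>"
            + "<span class=\"price\">$249.99</span></li>"
            + "</ul>"
            + "</body></html>";

    // Returns the text of the first-level heading with class "page-title".
    static String extractTitle(String html) {
        Document document = Jsoup.parse(html);
        return document.select("h1.page-title").text();
    }

    // Returns one "name: price" line per <li class="product"> element.
    static List<String> extractProducts(String html) {
        Document document = Jsoup.parse(html);
        Elements products = document.select("li.product");

        List<String> lines = new ArrayList<>();
        // Elements is a list of Element, so an enhanced for loop works directly.
        for (Element product : products) {
            String name = product.select(".name").text();
            String price = product.select(".price").text();
            lines.add(name + ": " + price);
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(extractTitle(SAMPLE_HTML));
        for (String line : extractProducts(SAMPLE_HTML)) {
            System.out.println(line);
        }
    }
}
```

Once the selectors behave as expected on the sample string, the same select calls can be applied to a Document obtained with Jsoup.connect(url).get(). A selector that matches nothing yields an empty Elements collection, so text() returns an empty string rather than throwing.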