HtmlAgilityPack is an open-source class library for parsing HTML. Its most notable feature is that it lets you query HTML with XPath, so if you have worked with XML in C# before, HtmlAgilityPack will feel familiar. The latest version at the time of writing is 1.4.6. It can be downloaded from:
http://htmlagilitypack.codeplex.com/
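To illustrate the XPath-based querying mentioned above, here is a minimal sketch that collects the href of every link in an HTML string. The class and method names (XPathDemo, GetLinks) are made up for this example and are not part of the library; the HtmlAgilityPack calls used are HtmlDocument.LoadHtml, SelectNodes, and GetAttributeValue.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical helper for illustration only; not part of HtmlAgilityPack.
public static class XPathDemo
{
    // Returns the href value of every <a> element that has an href attribute.
    public static List<string> GetLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var links = new List<string>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return links;
        }

        foreach (HtmlNode anchor in anchors)
        {
            links.Add(anchor.GetAttributeValue("href", string.Empty));
        }
        return links;
    }
}
```

The XPath expression is the same kind you would pass to XmlNode.SelectNodes when working with XML, which is why the API feels familiar.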
The following simple example introduces the use of HtmlAgilityPack. To simulate a login against a website built with ASP.NET, it is not enough to know the name attributes of the user name and password text boxes. You also need the values of the page's two hidden fields, __VIEWSTATE and __EVENTVALIDATION, and the name attribute of the submit button. Let's see how to use HtmlAgilityPack to retrieve these additional values.
1. Add a reference to HtmlAgilityPack.dll to the project.
2. Place several text box controls (tbUrl, tbViewState, tbEventValidation, tbSubmitName) and a button control (btnHtml) on the .aspx page.
3. The button's click handler in the code-behind is as follows:
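    // Requires: using System; using HtmlAgilityPack;
    protected void btnHtml_Click(object sender, EventArgs e)
    {
        if (tbUrl.Text.Length > 0)
        {
            // Download and parse the target page.
            HtmlWeb htmlWeb = new HtmlWeb();
            HtmlDocument htmlDoc = htmlWeb.Load(this.tbUrl.Text);

            // ASP.NET Web Forms renders these hidden fields with a
            // double-underscore prefix: __VIEWSTATE and __EVENTVALIDATION.
            // Note: SelectSingleNode returns null if no node matches.
            HtmlNode htmlNode = htmlDoc.DocumentNode.SelectSingleNode("//input[@id='__VIEWSTATE']");
            string viewStateValue = htmlNode.Attributes["value"].Value;

            htmlNode = htmlDoc.DocumentNode.SelectSingleNode("//input[@id='__EVENTVALIDATION']");
            string eventValidation = htmlNode.Attributes["value"].Value;

            // The first submit button on the page; its name is posted with the form.
            htmlNode = htmlDoc.DocumentNode.SelectSingleNode("//input[@type='submit']");
            string submitName = htmlNode.Attributes["name"].Value;

            // Show the retrieved values on the page.
            tbViewState.Text = viewStateValue;
            tbEventValidation.Text = eventValidation;
            tbSubmitName.Text = submitName;
        }
    }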
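4. Taking the Blog Park login page as an example, enter its URL and click the button. The three text boxes are then filled with the retrieved view state value, event validation value, and submit button name.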
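The handler in step 3 throws a NullReferenceException if the page lacks any of the three elements. As a hedged alternative, the sketch below does the same extraction on an HTML string and returns null for anything that is missing. The LoginFormFields and LoginFormParser types are hypothetical names introduced for this example. It also assumes the hidden fields carry the ids __VIEWSTATE and __EVENTVALIDATION, as ASP.NET Web Forms pages normally render them.

```csharp
using HtmlAgilityPack;

// Hypothetical container for the three values; not part of HtmlAgilityPack.
public sealed class LoginFormFields
{
    public string ViewState { get; set; }
    public string EventValidation { get; set; }
    public string SubmitName { get; set; }
}

// Hypothetical helper for illustration only.
public static class LoginFormParser
{
    public static LoginFormFields Parse(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        HtmlNode root = doc.DocumentNode;

        return new LoginFormFields
        {
            ViewState = ReadAttribute(root, "//input[@id='__VIEWSTATE']", "value"),
            EventValidation = ReadAttribute(root, "//input[@id='__EVENTVALIDATION']", "value"),
            SubmitName = ReadAttribute(root, "//input[@type='submit']", "name")
        };
    }

    // Returns the attribute value, or null when the node or attribute is absent.
    private static string ReadAttribute(HtmlNode root, string xpath, string attribute)
    {
        HtmlNode node = root.SelectSingleNode(xpath);
        if (node == null)
        {
            return null;
        }
        return node.GetAttributeValue(attribute, (string)null);
    }
}
```

Working on a string rather than a URL also makes the extraction easy to try against saved page source, without a network request.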
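Once the hidden-field values and the submit button name are known, they have to be sent back together with the credentials. The following is only a sketch of how such a form-encoded request body might be assembled. LoginFormBody.Build and all of its parameters are hypothetical, the user name and password field names must come from the target page, and real login pages may require further fields or cookies.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only.
public static class LoginFormBody
{
    // Builds an application/x-www-form-urlencoded body for a Web Forms login post.
    public static string Build(
        string viewState, string eventValidation,
        string userFieldName, string userName,
        string passwordFieldName, string password,
        string submitName, string submitValue)
    {
        var fields = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("__VIEWSTATE", viewState),
            new KeyValuePair<string, string>("__EVENTVALIDATION", eventValidation),
            new KeyValuePair<string, string>(userFieldName, userName),
            new KeyValuePair<string, string>(passwordFieldName, password),
            new KeyValuePair<string, string>(submitName, submitValue)
        };

        var parts = new List<string>();
        foreach (KeyValuePair<string, string> field in fields)
        {
            // View state is Base64 and may contain '/', '+' and '=', so it must be escaped.
            parts.Add(Uri.EscapeDataString(field.Key) + "=" +
                      Uri.EscapeDataString(field.Value ?? string.Empty));
        }
        return string.Join("&", parts);
    }
}
```

The resulting string would be sent as the body of a POST request to the login page with the content type application/x-www-form-urlencoded.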