Download Bitcoin prices using Html Agility Pack written in C#
P粉156532706
2023-09-05 17:17:03
<p>I need to get the Bitcoin price from https://coinmarketcap.com/currencies/bitcoin/ using Html Agility Pack. I'm using this example and it works fine:</p>
<pre class="brush:php;toolbar:false;">var html = @"http://html-agility-pack.net/";
HtmlWeb web = new HtmlWeb();
var htmlDoc = web.Load(html);
var node = htmlDoc.DocumentNode.SelectSingleNode("//head/title");
Console.WriteLine("Node Name: " + node.Name + "\n" + node.OuterHtml);</pre>
<p>XPath is: <code>//*[@id="__next"]/div/div[1]/div[2]/div/div[1]/div[2]/div/div[2]/div[1]/div</code></p>
<p>HTML code: </p>
<pre class="brush:php;toolbar:false;"><div class="priceValue "><span>$17,162.42</span></div></pre>
<p>I tried the following code, but it throws "Object reference not set to an instance of an object":</p>
<pre class="brush:php;toolbar:false;">var html = @"https://coinmarketcap.com/currencies/bitcoin/";
HtmlWeb web = new HtmlWeb();
var htmlDoc = web.Load(html);
var node = htmlDoc.DocumentNode.SelectSingleNode("//div[@class='priceValue']/span");
Console.WriteLine("Node Name: " + node.Name + "\n" + node.InnerText);</pre>
<p>TLDR: configure <code>HtmlWeb</code> to decompress the response (or use a suitable HTTP client).</p>
<p>Apparently, the <code>SelectSingleNode()</code> call returns <code>null</code> because it cannot find the node.</p>
<p>In cases like this, it helps to inspect the loaded HTML. You can do that by reading the value of <code>htmlDoc.DocumentNode.InnerHtml</code>. I tried it, and the "HTML" that came back was gibberish.</p>
<p>The reason is that <code>HtmlWeb</code> does not decompress the response it receives by default. See this github issue for details. If you had used a proper HTTP client (like this), or if the HtmlAgilityPack developers had been more proactive, I don't think you would have run into this problem.</p>
<p>If you insist on using <code>HtmlWeb</code>, you have to configure it to decompress the response before loading the page.</p>
<p>Please note that the class of the element you are looking for is actually <code>priceValue </code> (with a space character at the end), and there is another <code>div</code> on the page with class <code>priceValue</code>. That's another question, though, and you should eventually be able to find a more robust selector. Maybe try this:</p>
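Here is a minimal sketch of both fixes together, not a definitive implementation. It assumes the installed HtmlAgilityPack version exposes the `AutomaticDecompression` property on `HtmlWeb`, and that the page still serves the price in a `div` whose class contains `priceValue` (the site's markup may well have changed since). The `BitcoinPriceScraper` class and the `ExtractPrice` helper are hypothetical names used only for illustration.

```csharp
using System;
using System.Net;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

public static class BitcoinPriceScraper
{
    // Hypothetical helper: returns the text of the first matching price node,
    // or null when no such node exists (instead of throwing a NullReferenceException).
    public static string ExtractPrice(HtmlDocument doc)
    {
        // contains() matches the class "priceValue " (trailing space) as well as "priceValue".
        // SelectSingleNode returns the first match in document order, so if several
        // elements match, this may need a more specific XPath.
        HtmlNode node = doc.DocumentNode.SelectSingleNode(
            "//div[contains(@class, 'priceValue')]/span");
        return node?.InnerText.Trim();
    }

    public static void Main()
    {
        const string url = "https://coinmarketcap.com/currencies/bitcoin/";

        // Assumption: this HtmlAgilityPack version has HtmlWeb.AutomaticDecompression.
        var web = new HtmlWeb
        {
            AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
        };

        HtmlDocument doc = web.Load(url);
        string price = ExtractPrice(doc);

        // If nothing was found, dump doc.DocumentNode.InnerHtml to see what was actually loaded.
        Console.WriteLine(price ?? "Price node not found.");
    }
}
```

Returning `null` from the helper and checking it in the caller avoids the "Object reference not set to an instance of an object" error from the question when the selector does not match.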