Use Excel\VBA to scrape information from web pages
P粉555696738
2023-09-02 18:48:32
<p>I'm trying to scrape data from a web page, I need 4 bits of data, 2 of which I've been able to retrieve, but the other 2 have been hinted to me. </p>
<p>This is not a fixed URL, but it has been retrieved from IE after the redirect, so for this example I will use:
https://cpc.farnell.com/warton/microprint-p2010-15-32um-250g/solder-paste-15-32um-250g-pot/dp/SD02808
</p>
<p>I hope to get the title:</p>
<p><strong>Microprint P2010 No-Clean, Lead-Free Solder Paste, 15-32um, 250g Jar - MICROPRINT P2010 15-32UM, 250G</strong></p>
<p>and manufacturer part number</p>
<p><strong>Microprint P2010 15-32UM, 250G</strong></p>
<p>I've been using <strong>getElementbyID</strong> to get the product overview/product information and it works great, but the other text fields don't seem to work (I looked at the Xpath and it errors out on Object) VBE required)</p>
<p>I successfully implemented the other 2 elements using: </p>
<pre class="brush:php;toolbar:false;">Sub Mani()
'Declare variables
Dim objWeb As Object
Dim objHTML As Object
Dim objElement As Object
Dim strData, StrData1 As String
Set objWeb = CreateObject("internetexplorer.Application")
str = "https://cpc.farnell.com/"
objWeb.navigate str & Cells(1, 1).Value
While objWeb.Busy = True
Wend
FullURL = objWeb.LocationURL
Range("b2").Value = FullURL
objWeb.navigate FullURL
'IE.Visible = True
While objWeb.Busy = True
Wend
Set objHTML = objWeb.document
strData = objHTML.getElementById("pdpSection_FAndB").innerText
ActiveSheet.Range("C3").Value = strData
strData = objHTML.getElementById("pdpSection_pdpProdDetails").innerText
Set strData = Nothing
ActiveSheet.Range("D3").Value = strData
objWeb.Quit
End Sub</pre>
<p>URL reloading is for navigation around URL redirects, </p>
<p>https://cpc.farnell.com/SD02808</p>
<p>Redirect to main URL</p>
<p>https://cpc.farnell.com/warton/microprint-p2010-15-32um-250g/solder-paste-15-32um-250g-pot/dp/SD02808</p>
<p>I want to enter the part number into A1, Excel use it as a variable in VB, and then re-read the full url from the IE instance.
This is a legacy issue from trying to use <strong>Get Data from the Web</strong>, which didn't work well due to a page script error, so an instance of IE was used. </p>
IE is dead and should no longer be used. However, it is also possible to use the shortened URL https://cpc.farnell.com/SD02808 via xhr (XML HTTP request). The following code only shows how to read the required information from the loaded HTML code. All safeguards are missing.
The object error occurs in the code shown above because you are trying to set a variable of data type
String
toNothing
, just like an object variable usingSet strData = none
.