Parse HTML elements within frames and iframes
You are having trouble finding the <video> tag while trying to extract a video link from the website. This happens because the site uses frames (iframes), each of which isolates part of the content in its own separate HTML document.
To solve this problem, you need to dig into the collection of frames in the main document. Each frame contains its own HTML document, and access to these individual documents is necessary to extract data from all parts of the website.
Solution:
Use the WebBrowser.Document.Window.Frames property to access the frame collection. Each HtmlWindow in this collection has its own HtmlDocument object.
Modify your code to iterate over each frame's document, calling Frame.Document.Body.GetElementsByTagName() to retrieve the element you need. Use HtmlElement.GetAttribute to read that element's attributes.
Example:
<code class="language-csharp">// Requires: using System; using System.Collections.Generic; using System.Linq; using System.Windows.Forms;
List<MovieLink> moviesLinks = new List<MovieLink>();

private void Browser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    var browser = sender as WebBrowser;
    if (browser == null || browser.ReadyState != WebBrowserReadyState.Complete) return;

    var documentFrames = browser.Document.Window.Frames;
    foreach (HtmlWindow frame in documentFrames)
    {
        try
        {
            // Each frame exposes its own document; take the first <video> element in it, if any.
            var videoElement = frame.Document.Body
                .GetElementsByTagName("VIDEO").OfType<HtmlElement>().FirstOrDefault();

            if (videoElement != null)
            {
                string videoLink = videoElement.GetAttribute("src");
                int hash = videoLink.GetHashCode();
                if (moviesLinks.Any(m => m.Hash == hash))
                {
                    return; // This URL has already been parsed
                }
                string sourceImage = videoElement.GetAttribute("poster");
                moviesLinks.Add(new MovieLink()
                {
                    Hash = hash, VideoLink = videoLink, ImageLink = sourceImage
                });
            }
        }
        catch (UnauthorizedAccessException) { } // Ignore this exception
        catch (InvalidOperationException) { }   // Ignore this exception
    }
}</code>
Note:
The DocumentCompleted event may fire multiple times as the browser loads each frame's document, which is why the example checks the hash of each link before adding it to the list.
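Because the handler can run once per frame document, the same video link may be seen more than once, and the example guards against that with a hash lookup. The MovieLink type it uses is not shown, so the following is only a minimal sketch: it assumes MovieLink is a plain class with the three properties the handler assigns, and MovieLinkStore.TryAdd is a hypothetical helper that isolates the duplicate check.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of MovieLink, inferred from the properties set in the handler.
public class MovieLink
{
    public int Hash { get; set; }
    public string VideoLink { get; set; }
    public string ImageLink { get; set; }
}

// Hypothetical helper (not part of the original answer).
public static class MovieLinkStore
{
    // Adds the link only if no stored entry has the same hash.
    // Returns true when a new entry was added, false otherwise.
    public static bool TryAdd(List<MovieLink> links, string videoLink, string imageLink)
    {
        // GetAttribute returns an empty string for a missing attribute, so skip empty links.
        if (string.IsNullOrEmpty(videoLink)) return false;

        int hash = videoLink.GetHashCode();
        if (links.Any(m => m.Hash == hash)) return false; // already parsed

        links.Add(new MovieLink { Hash = hash, VideoLink = videoLink, ImageLink = imageLink });
        return true;
    }
}
```

Inside the handler, the body of the `if (videoElement != null)` branch could then become a single call such as `MovieLinkStore.TryAdd(moviesLinks, videoElement.GetAttribute("src"), videoElement.GetAttribute("poster"))`. Two different strings can share a hash code, so comparing the VideoLink strings directly would be the stricter check.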