This article explains why the pyquery parser can fail to select DOM nodes by tag name, and how to fix it.
As a front-end developer learning Python, I naturally chose pyquery as my parser. Its jQuery-like API saves a lot of learning time.
However, I ran into a problem while using it: pyquery could not select DOM nodes as conveniently as jQuery does.
After some investigation, I found that selecting by class name still worked, but selecting by plain tag names such as a, p, or img returned nothing at all.
This was very frustrating for a while...
The culprit
<div xmlns="http://www.w3.org/1999/xhtml" class="image-item-inner" style="width: 398px; height: 598px;"><img src="http://p3.pstatp.com/origin/3f240001a4f84996876d" data-src="http://p3.pstatp.com/origin/3f240001a4f84996876d" alt="" /> <a href="http://p3.pstatp.com/origin/3f240001a4f84996876d" title="查看原图" target="_blank" ga_event="view_original_photo" class="image-origin"><i class="bui-icon icon-enlarge" style="font-size: 14px; color: rgb(255, 255, 255);" /></a></div>
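The problem lies in the xmlns="http://www.w3.org/1999/xhtml" attribute. By default, pyquery first tries to parse the document as XML, and in XML mode this attribute places every element in the XHTML namespace. A plain tag selector such as img no longer matches the namespaced elements, while class selectors keep working because they only test the class attribute.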
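To make the effect of the namespace concrete, here is a minimal sketch. It uses the standard library's ElementTree as a stand-in (an assumption made to keep the example dependency-free; pyquery itself is built on lxml, which names namespaced tags the same way). The shortened snippet and the a.jpg file name are made up for illustration.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Shortened, hypothetical version of the markup shown above.
snippet = (
    f'<div xmlns="{XHTML_NS}" class="image-item-inner">'
    '<img src="a.jpg" alt="" />'
    "</div>"
)

root = ET.fromstring(snippet)

# In XML mode the tag name carries the namespace as a prefix.
root_tag = root.tag

# Searching for the bare tag name finds nothing...
plain_matches = root.findall(".//img")

# ...while the namespace-qualified name finds the element.
qualified_matches = root.findall(f".//{{{XHTML_NS}}}img")

print(root_tag)                 # {http://www.w3.org/1999/xhtml}div
print(len(plain_matches))       # 0
print(len(qualified_matches))   # 1
```

The bare name img and the qualified name {http://www.w3.org/1999/xhtml}img are different tags as far as an XML parser is concerned, which is why the plain selector comes back empty.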
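The solution is to pass parser="html" when building the document, so that pyquery uses the HTML parser, which treats xmlns as an ordinary attribute:
from pyquery import PyQuery as pq

# browser.page_source holds the HTML of the loaded page
doc = pq(browser.page_source, parser="html")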