[Translation] XPath and CSS Selectors
Original text: http://ejohn.org/blog/xpath-css-selectors
Recently I have done a lot of work implementing a parser that supports both XPath and CSS 3, and what surprised me is that the two are very similar in some respects and completely different in others. The difference is that CSS is designed to work with HTML: you can use #id to get an element by ID and .class to get elements by class. Expressing these in XPath is not as simple. On the other hand, XPath can use .. to step up to the parent node in the DOM tree, and foo[bar] to get a foo element that has a bar child element. CSS selectors cannot do either. In short, CSS selectors are usually shorter than their XPath equivalents, but unfortunately they are not as powerful.
I think it is worth comparing how the two kinds of selectors are written.
Target | CSS 3 | XPath
---|---|---
All elements | * | //*
All p elements | p | //p
All child elements of p elements | p > * | //p/*
Element by ID | #foo | //*[@id='foo']
Elements by class | .foo | depends on the class attribute value (see the cases below)
Elements with a given attribute | *[title] | //*[@title]
First child of every p element | p > *:first-child | //p/*[1]
All p elements that have an a child element | not possible | //p[a]
Next sibling element | p + * | //p/following-sibling::*[1]

For matching by class, the XPath depends on where foo sits in the class attribute value:

class attribute | XPath | Notes
---|---|---
class="foo" | //*[@class='foo'] | The class attribute has the single value foo
class="foobar foo bar" | //*[contains(@class,' foo ')] | foo is in the middle, with other values on both sides
class="foo bar" | //*[starts-with(@class,'foo ')] | foo is the leftmost value
class="bar foo" | //*[substring(@class,string-length(@class)-3)=' foo'] | foo is the rightmost value. XPath 1.0 has no ends-with function; 2.0 does, but browsers currently implement only 1.0
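The four class cases above can be folded into one expression with a common XPath 1.0 idiom: pad the normalized class value with spaces and look for the padded name. The sketch below is only an illustration; `classXPath` is a hypothetical helper name, not something from the original article, and it accepts only simple class names so that no quoting is needed.

```javascript
// Hypothetical helper (not from the original article): build an XPath 1.0
// expression matching elements whose class attribute contains `name` as a
// whole whitespace-separated token, wherever it sits in the value.
function classXPath(name) {
  // Assumption: only simple names are accepted, so the name can be embedded
  // in a single-quoted XPath string literal without escaping.
  if (!/^[A-Za-z_][\w-]*$/.test(name)) {
    throw new Error("unsupported class name: " + name);
  }
  // normalize-space() trims the value and collapses runs of whitespace;
  // padding both sides with a space lets one contains() call cover the
  // "only value", "leftmost", "middle", and "rightmost" cases.
  return "//*[contains(concat(' ', normalize-space(@class), ' '), ' " + name + " ')]";
}
```

In a browser the result could be passed straight to `document.evaluate`, for example `document.evaluate(classXPath("foo"), document, null, 6, null)`.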
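So can we use XPath in web development? Originally jQuery supported XPath selectors, but it later dropped that support for performance reasons. As it happens, Google released Wicked Good XPath last month. It is a pure JavaScript implementation of the DOM Level 3 XPath specification and currently the fastest implementation of its kind, and we can combine this script with jQuery.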
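    // Load Wicked Good XPath, then expose a static jQuery.xpath() helper.
    // .done() replaces the jqXHR .success() alias, which was removed in jQuery 3.0.
    jQuery.getScript("http://wicked-good-xpath.googlecode.com/files/wgxpath.install.js").done(function () {
        wgxpath.install();
        jQuery.xpath = function (xpath) {
            var elements = [];
            // 6 is XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE
            var xpathResult = document.evaluate(xpath, document, null, 6, null);
            for (var i = 0; i < xpathResult.snapshotLength; i++) {
                elements.push(xpathResult.snapshotItem(i));
            }
            return jQuery(elements);
        };
    });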
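With this in place, elements can be selected through the static method $.xpath(), which also returns a jQuery object, just like $(). This page has already loaded the script, so you can open the console now and try the $.xpath method.
Since we already have CSS selectors, why use XPath at all? The answer is that XPath is sometimes more powerful. For example:
In John Resig's table above, there is one thing CSS cannot do: find a parent element that contains a given child element. Current CSS indeed cannot do this, but the future CSS4 selectors will include a parent selector: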
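E! > F
Note: in 2011 the parent selector syntax was $E > F, and this year's draft changed it again. Some blog posts introducing CSS4 selectors still show the old syntax. There is a polyfill that lets you use the parent selector in CSS files: https://github.com/Idered/cssParentSelector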
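This selector matches E elements that contain a child element F. But even once CSS4 is implemented, change the requirement slightly, to finding E elements that contain a descendant element F, and how would the CSS selector be written? There is probably no way to do it. Readers familiar with jQuery may point out that jQuery has the :has pseudo-class, so you can write E:has(F). True, with jQuery's custom filters almost any requirement can be met by traversing the DOM, but that will certainly be slow. XPath is different: Firefox and Chrome already implement the XPath interface, the document.evaluate method (Wicked Good XPath is presumably aimed mainly at providing the same interface on IE), and it should be faster than traversing the DOM by hand. The XPath for this is //E[.//F], which is simple and clear enough.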
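The two requirements differ only in the path placed inside the predicate: a bare child step versus a `.//` descendant step. A minimal sketch of that difference follows; `containing` is a hypothetical helper introduced purely for illustration.

```javascript
// Hypothetical helper (illustration only): build an XPath for `outer`
// elements that contain an `inner` element, either as a direct child
// or at any depth below them.
function containing(outer, inner, anyDepth) {
  // "inner" tests direct children; ".//inner" tests all descendants
  // of the element being filtered.
  var predicate = anyDepth ? ".//" + inner : inner;
  return "//" + outer + "[" + predicate + "]";
}

containing("p", "a", false); // "//p[a]"
containing("p", "a", true);  // "//p[.//a]"
```

The leading dot matters: `//p[//a]` would test whether the document contains any a element at all, not whether the current p does.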
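Another important point: CSS was designed to add styles to HTML. Of the 12 node types, only element nodes (nodeType equal to 1) can be styled, so CSS selectors can only select element nodes in the page. XPath is different. It can be used in XML as well as HTML, and besides element nodes it can also select attribute nodes (//@*), text nodes (//text()), and so on. If XPath 2.0 is implemented in the future, it will become even more powerful.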
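Because attribute and text nodes are not elements, it can be more convenient to collect them into a plain array than to wrap them in a jQuery object. The sketch below assumes a browser-like `doc` object that exposes the standard `evaluate` method; `selectNodes` is a hypothetical name used only for this example.

```javascript
// Hypothetical helper (illustration only): run an XPath expression against
// `doc` and return every matching node as a plain array, in document order.
function selectNodes(doc, xpath) {
  // 7 is XPathResult.ORDERED_NODE_SNAPSHOT_TYPE
  var ORDERED_NODE_SNAPSHOT_TYPE = 7;
  var result = doc.evaluate(xpath, doc, null, ORDERED_NODE_SNAPSHOT_TYPE, null);
  var nodes = [];
  for (var i = 0; i < result.snapshotLength; i++) {
    nodes.push(result.snapshotItem(i));
  }
  return nodes;
}
```

In a browser console, `selectNodes(document, "//text()")` would gather text nodes and `selectNodes(document, "//@*")` attribute nodes, neither of which a CSS selector can return.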