Table of Contents
Beautiful Soup解析器比较
Home Web Front-end HTML Tutorial BeautifulSoup中各种html解析器的比较及使用_html/css_WEB-ITnose

BeautifulSoup中各种html解析器的比较及使用_html/css_WEB-ITnose

Jun 24, 2016 am 11:38 AM

Beautiful Soup解析器比较

·Beautiful Soup支持各种html解析器,包括python自带的标准库,还有其他的许多第三方库模块。其中一个就是lxml parser,至于lxml parser的安装,可以通过以下方法安装:

1)easy_install lxml   2)pip install lxml    

另外,python对于模块的安装,可以查看博客说明,分为两种:easy_install 和 pip.

另外一种纯python解析器为html5lib解析器,可以像web浏览器那样解析html页面,你可以通过下面两种方式安装html5lib:

1)easy_install html5lib   2)pip install html5lib

下面对各种html解析器的优缺点做一下对比:


解析器 使用方法 优点 缺点
Python’s html.parser BeautifulSoup(markup,"html.parser")
  • python自身带有
  • 速度比较快
  • 能较好兼容 (as of Python 2.7.3 and 3.2.)
  • 不能很好地兼容(before Python 2.7.3 or 3.2.2)
    lxml’s HTML parser BeautifulSoup(markup,"lxml")
  • 速度很快
  • 兼容性好
  • External C dependency
    lxml’s XML parser BeautifulSoup(markup, "lxml-xml") BeautifulSoup(markup,"xml")    速度很快
  • The only currently supported XML parser
  • External C dependency
    html5lib BeautifulSoup(markup, "html5lib") 1)兼容性很好
    2)可以像web浏览器一样解析html页面
    3) Creates valid HTML5
  • 速度很慢
  • External Python dependency

  • 如果你想追求速度的话,建议使用 lxml,如果你使用的python版本2.x是2.7.3之前的版本,或者python3.x的是3.2.2之前的版本,你很有必要安装使用 html5lib或lxml使用,因为python内建的html解析器不能很好地适应于这些老版本。



    版权声明:本文为博主原创文章,未经博主允许不得转载。

    Statement of this Website
    The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

    Hot AI Tools

    Undresser.AI Undress

    Undresser.AI Undress

    AI-powered app for creating realistic nude photos

    AI Clothes Remover

    AI Clothes Remover

    Online AI tool for removing clothes from photos.

    Undress AI Tool

    Undress AI Tool

    Undress images for free

    Clothoff.io

    Clothoff.io

    AI clothes remover

    AI Hentai Generator

    AI Hentai Generator

    Generate AI Hentai for free.

    Hot Article

    R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)
    2 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
    Hello Kitty Island Adventure: How To Get Giant Seeds
    1 months ago By 尊渡假赌尊渡假赌尊渡假赌
    Two Point Museum: All Exhibits And Where To Find Them
    1 months ago By 尊渡假赌尊渡假赌尊渡假赌

    Hot Tools

    Notepad++7.3.1

    Notepad++7.3.1

    Easy-to-use and free code editor

    SublimeText3 Chinese version

    SublimeText3 Chinese version

    Chinese version, very easy to use

    Zend Studio 13.0.1

    Zend Studio 13.0.1

    Powerful PHP integrated development environment

    Dreamweaver CS6

    Dreamweaver CS6

    Visual web development tools

    SublimeText3 Mac version

    SublimeText3 Mac version

    God-level code editing software (SublimeText3)

    What is the purpose of the <datalist> element? What is the purpose of the <datalist> element? Mar 21, 2025 pm 12:33 PM

    The article discusses the HTML &lt;datalist&gt; element, which enhances forms by providing autocomplete suggestions, improving user experience and reducing errors.Character count: 159

    How do I use HTML5 form validation attributes to validate user input? How do I use HTML5 form validation attributes to validate user input? Mar 17, 2025 pm 12:27 PM

    The article discusses using HTML5 form validation attributes like required, pattern, min, max, and length limits to validate user input directly in the browser.

    What is the purpose of the <iframe> tag? What are the security considerations when using it? What is the purpose of the <iframe> tag? What are the security considerations when using it? Mar 20, 2025 pm 06:05 PM

    The article discusses the &lt;iframe&gt; tag's purpose in embedding external content into webpages, its common uses, security risks, and alternatives like object tags and APIs.

    What is the purpose of the <progress> element? What is the purpose of the <progress> element? Mar 21, 2025 pm 12:34 PM

    The article discusses the HTML &lt;progress&gt; element, its purpose, styling, and differences from the &lt;meter&gt; element. The main focus is on using &lt;progress&gt; for task completion and &lt;meter&gt; for stati

    What is the purpose of the <meter> element? What is the purpose of the <meter> element? Mar 21, 2025 pm 12:35 PM

    The article discusses the HTML &lt;meter&gt; element, used for displaying scalar or fractional values within a range, and its common applications in web development. It differentiates &lt;meter&gt; from &lt;progress&gt; and ex

    What are the best practices for cross-browser compatibility in HTML5? What are the best practices for cross-browser compatibility in HTML5? Mar 17, 2025 pm 12:20 PM

    Article discusses best practices for ensuring HTML5 cross-browser compatibility, focusing on feature detection, progressive enhancement, and testing methods.

    What is the viewport meta tag? Why is it important for responsive design? What is the viewport meta tag? Why is it important for responsive design? Mar 20, 2025 pm 05:56 PM

    The article discusses the viewport meta tag, essential for responsive web design on mobile devices. It explains how proper use ensures optimal content scaling and user interaction, while misuse can lead to design and accessibility issues.

    How do I use the HTML5 <time> element to represent dates and times semantically? How do I use the HTML5 <time> element to represent dates and times semantically? Mar 12, 2025 pm 04:05 PM

    This article explains the HTML5 &lt;time&gt; element for semantic date/time representation. It emphasizes the importance of the datetime attribute for machine readability (ISO 8601 format) alongside human-readable text, boosting accessibilit

    See all articles