Table of Contents
1.SAX parsing
1.1.SAX parsing mechanism
1.2. SAX parsing example

XML—SAX for XML parsing

Feb 24, 2017 pm 03:06 PM

1.SAX parsing

  • When you parse an XML document with DOM, the parser reads the entire document and builds a Document object for the whole DOM tree in memory before you can work with it. If the XML document is very large, this consumes a lot of memory and, in severe cases, can cause an out-of-memory error.

  • SAX parsing lets you process the document while it is being read, without waiting for the entire document to load.

  • To use SAX, write an event handler by extending DefaultHandler.

【Note】SAX only reads XML documents; it cannot modify, delete, or add elements.
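To make the streaming idea concrete, here is a minimal sketch that counts elements as they are read, without ever holding a tree in memory. ElementCounter and countElements are hypothetical names introduced only for this illustration, and the input is a small in-memory string rather than a large file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not part of the original article.
class ElementCounter extends DefaultHandler {
    private int count = 0;

    // Called once per start tag while the input is being read.
    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) {
        count++;
    }

    // Parses the given XML text and returns the number of elements seen.
    static int countElements(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            ElementCounter handler = new ElementCounter();
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            parser.parse(new ByteArrayInputStream(bytes), handler);
            return handler.count;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Elements: a, b, b, c
        System.out.println(countElements("<a><b/><b/><c>text</c></a>")); // prints 4
    }
}
```

The handler keeps only a single counter, so its memory use does not grow with the size of the document.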

1.1.SAX parsing mechanism

SAX uses a push model: you create a SAX parser, and the parser notifies you whenever it finds content in the XML document (it pushes events to you, much like event listeners in Java Swing). What to do with each event is up to the programmer.

SAX-based programs most commonly use five events:

1.startDocument()–> the parser has found the beginning of the document and is about to start scanning it
2.endDocument()–> the parser has found the end of the document
3.startElement()–> the parser has found a start tag. The event gives you the tag name and all of the element's attribute names and values
4.characters()–> the parser has found some text. You receive a character array, a start offset into that array, and a length; together these three values give you the text the parser found
5.endElement()–> the parser has found an end tag. The event gives you the name of the element
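The order in which these events arrive can be seen with a small recording handler. This is a sketch under assumptions: EventTrace and trace are hypothetical names, the document is a tiny in-memory string, and whitespace-only text is skipped to keep the trace short.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that records each SAX event as a line of text.
class EventTrace extends DefaultHandler {
    private final List<String> events = new ArrayList<>();

    @Override
    public void startDocument() {
        events.add("startDocument");
    }

    @Override
    public void endDocument() {
        events.add("endDocument");
    }

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) {
        events.add("startElement:" + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("endElement:" + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The text is the slice ch[start .. start + length - 1].
        String text = new String(ch, start, length).trim();
        if (!text.isEmpty()) {
            events.add("characters:" + text);
        }
    }

    // Parses the XML text and returns the recorded events in order.
    static List<String> trace(String xml) {
        try {
            EventTrace handler = new EventTrace();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.events;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        for (String event : trace("<note><to>Ann</to></note>")) {
            System.out.println(event);
        }
    }
}
```

For the document in main, the trace begins with startDocument and ends with endDocument, and every startElement is matched by a later endElement for the same name.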

1.2. SAX parsing example

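This example reuses the XML document from the DOM parsing example, shown below. The element and attribute names are Chinese: 班级 = class, 学生 = student, 地址 = address, 名字 = name, 年龄 = age, 介绍 = introduction.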

<?xml version="1.0" encoding="utf-8" standalone="no"?>
<班级>
    <学生 地址="香港">
        <名字>周小星</名字>
        <年龄>23</年龄>
        <介绍>学习刻苦</介绍>
    </学生>
    <学生 地址="澳门">
        <名字>林晓</名字>
        <年龄>25</年龄>
        <介绍>是一个好学生</介绍>
    </学生>
</班级>

[Steps]:

1. Use SAXParserFactory to create a SAX parsing factory

SAXParserFactory spf = SAXParserFactory.newInstance();

2. Get the parser object through the SAX parsing factory

SAXParser sp = spf.newSAXParser();

3. Pass the XML file and the event handler object to the parser to start parsing

sp.parse("src/myClass.xml",new MyHandler());
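
The three steps can be combined into one small method. The sketch below rests on several assumptions: SaxSteps and rootName are hypothetical names, the document is read from a string instead of src/myClass.xml, and an anonymous DefaultHandler stands in for the handler. It also shows the checked exceptions these calls declare: newSAXParser() can throw ParserConfigurationException and SAXException, and parse() can throw SAXException and IOException.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class for illustration; not part of the original article.
class SaxSteps {
    // Returns the name of the root element of the given XML text.
    static String rootName(String xml) {
        final String[] root = new String[1];
        try {
            // Step 1: create the factory.
            SAXParserFactory spf = SAXParserFactory.newInstance();
            // Step 2: obtain a parser from the factory.
            SAXParser sp = spf.newSAXParser();
            // Step 3: hand the input and a handler to the parser.
            DefaultHandler handler = new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                        String qName, Attributes attributes) {
                    if (root[0] == null) {
                        root[0] = qName; // the first start tag is the root
                    }
                }
            };
            sp.parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return root[0];
    }

    public static void main(String[] args) {
        System.out.println(rootName("<class><student/></class>")); // prints class
    }
}
```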

MyHandler is a class you define yourself. It must extend DefaultHandler and override the five SAX event methods described above, although you can override only the ones you need.
For example, here is my MyHandler:

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class MyHandler extends DefaultHandler {
    /**
     * Called when the start of the document is found; called only once.
     */
    @Override
    public void startDocument() throws SAXException {
        System.out.println("startDocument");
    }

    /**
     * Called when the end of the document is found; called only once.
     */
    @Override
    public void endDocument() throws SAXException {
        System.out.println("endDocument");
    }

    /**
     * Called at the start of each element; called repeatedly.
     */
    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException {
        System.out.println("Element name: " + qName);
    }

    /**
     * Called at the end of each element; called repeatedly.
     */
    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
    }

    /**
     * Called when text is found in the XML file; called repeatedly.
     */
    @Override
    public void characters(char[] ch, int start, int length)
            throws SAXException {
        // Print the text, skipping whitespace-only runs between tags
        String text = new String(ch, start, length);
        if (!text.trim().isEmpty()) {
            System.out.println(text);
        }
    }
}

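Running the program prints startDocument, then the name of each element as its start tag is found, followed by any non-blank text inside it, and finally endDocument.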

As you can see, this is simply a traversal of the XML document, and traversal is all that SAX can do.
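
The startElement event also carries the element's attributes, which can be read with the getValue method of org.xml.sax.Attributes. The following sketch collects one attribute from every element with a given name. AttributeReader and read are hypothetical names, and the sample in main uses ASCII names (student, address) in place of the article's 学生 and 地址.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that collects one attribute from matching elements.
class AttributeReader extends DefaultHandler {
    private final String elementName;
    private final String attributeName;
    private final List<String> values = new ArrayList<>();

    AttributeReader(String elementName, String attributeName) {
        this.elementName = elementName;
        this.attributeName = attributeName;
    }

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) {
        if (qName.equals(elementName)) {
            // getValue returns null when the attribute is absent.
            String value = attributes.getValue(attributeName);
            if (value != null) {
                values.add(value);
            }
        }
    }

    // Parses the XML text and returns the attribute values in document order.
    static List<String> read(String xml, String elementName, String attributeName) {
        try {
            AttributeReader handler = new AttributeReader(elementName, attributeName);
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.values;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<class><student address=\"HK\"/><student address=\"Macau\"/></class>";
        System.out.println(read(xml, "student", "address")); // prints [HK, Macau]
    }
}
```

For the article's sample document, the equivalent call would pass 学生 as the element name and 地址 as the attribute name.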


Now suppose we have this requirement: display only the names and ages of all students, not their introductions. How can we implement it?

We can define two boolean fields, isName and isAge, in the MyHandler class. The startElement method sets them to record whether the current element is a name or an age element, and the characters method prints the text only when one of them is set, as follows:

1. Define two Boolean variables

private boolean isName = false;
private boolean isAge = false;

2. Add a judgment in the startElement method

@Override
public void startElement(String uri, String localName, String qName,
        Attributes attributes) throws SAXException {
    if (qName.equals("名字")) {
        this.isName = true;
    } else if (qName.equals("年龄")) {
        this.isAge = true;
    }
}

3. In the characters method, use the flags to decide whether to print the text

@Override
public void characters(char[] ch, int start, int length)
        throws SAXException {
    // Print the text only for name and age elements
    String text = new String(ch, start, length);
    if (!text.trim().equals("") && (isName || isAge)) {
        System.out.println(text);
    }
    // Reset both flags so that later text is not printed
    isName = false;
    isAge = false;
}

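Remember to reset both flags to false at the end of the characters method. Otherwise, once a flag is set it stays set, and the text of later elements such as the introduction would be printed too.
Running the program now prints only the names and ages: 周小星, 23, 林晓 and 25.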
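
One caveat about resetting the flags inside characters: the SAX specification allows a parser to deliver one run of text in several characters calls, so the first chunk would clear the flag and any later chunks would be lost. A more defensive variant, sketched below, buffers the text and finishes in endElement. This is an alternative to the approach above, not the original author's code. TextCollector and collect are hypothetical names, the wanted element names are passed in as parameters, and the sample uses ASCII names (name, age) in place of 名字 and 年龄.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that collects the text of selected elements.
class TextCollector extends DefaultHandler {
    private final Set<String> wanted;
    private final List<String> results = new ArrayList<>();
    private final StringBuilder buffer = new StringBuilder();
    private boolean capturing = false;

    TextCollector(String... names) {
        this.wanted = new HashSet<>(Arrays.asList(names));
    }

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) {
        if (wanted.contains(qName)) {
            capturing = true;
            buffer.setLength(0); // start a fresh buffer for this element
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (capturing) {
            buffer.append(ch, start, length); // may be called more than once
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (capturing && wanted.contains(qName)) {
            results.add(qName + "=" + buffer.toString().trim());
            capturing = false; // reset here, once the element is complete
        }
    }

    // Parses the XML text and returns "name=text" for each wanted element.
    static List<String> collect(String xml, String... names) {
        try {
            TextCollector handler = new TextCollector(names);
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.results;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<class><student><name>Ann</name><age>23</age>"
                + "<intro>hard-working</intro></student></class>";
        System.out.println(collect(xml, "name", "age")); // prints [name=Ann, age=23]
    }
}
```

This sketch assumes the wanted elements contain only text and are not nested inside one another.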


