
Java Basics Series 17: Detailed explanation of using DOM, SAX, JDOM, and DOM4J to parse XML files

Release: 2016-10-13 15:57:12

1 Introduction

In Java, there are many ways to parse XML files. The four most common are probably DOM, SAX, JDOM, and DOM4J. DOM and SAX are supported by APIs that ship with the JDK, so no third-party jar packages are needed. JDOM and DOM4J, by contrast, are third-party open source projects, so using either of them requires adding the corresponding jar packages to the project.

(1) DOM

DOM is the official W3C standard for representing XML documents in a platform- and language-independent way. A DOM is a collection of nodes, or pieces of information, organized in a hierarchy that lets developers look for specific information in the tree. Analyzing this structure usually requires loading the whole document and building the hierarchy before any work can be done. When DOM is used to parse an XML file, the parser therefore reads the entire file into memory and builds a tree structure, which makes subsequent operations convenient.

Advantages: the entire document tree is in memory, so it is easy to work with; deletion, modification, rearrangement and other operations are supported

Disadvantages: loading the entire document into memory (including nodes that are never used) wastes time and memory; if the XML file is too large, it can easily cause an out-of-memory error
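
To make the "easy to operate" point concrete, here is a minimal sketch of editing a DOM tree that is already in memory and writing it back out. It uses only JDK classes. The class name `DomEditSketch`, the `users` root element and the sample data are assumptions made for illustration, not part of the original article.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomEditSketch {
    // Hypothetical sample data; the root element name is an assumption
    static final String XML =
            "<users><user id=\"1\"><name>zifangsky</name></user>"
          + "<user id=\"2\"><name>admin</name></user></users>";

    /** Parses the XML, changes one name, deletes one user, and serializes the tree again. */
    static String edit(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // The NodeList is live, so fetch both elements before changing the tree
        NodeList users = doc.getElementsByTagName("user");
        Element first = (Element) users.item(0);
        Element second = (Element) users.item(1);

        // Modify: replace the text of the first user's <name>
        first.getElementsByTagName("name").item(0).setTextContent("root");
        // Delete: detach the second <user> from its parent
        second.getParentNode().removeChild(second);

        // Serialize the modified tree back to a string
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(edit(XML));
    }
}
```

Because the whole tree is held in memory, both changes are plain method calls on nodes. The same property is what makes DOM expensive for very large files.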

(2) SAX

Because DOM has to read the entire file at once when parsing XML, it has many shortcomings when the file is large. SAX, an event-driven parsing method, was introduced to solve this problem.

SAX parses an XML file by reading its content into memory piece by piece, from top to bottom. When the parser encounters a start tag, an end tag, text, the start of the document, the end of the document and so on, it fires the corresponding event. All we need to do is write custom code in the methods that handle these events to save the data we want.
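
The event sequence described above can be observed with a small handler that only records which callbacks fire. This is a sketch using the JDK's SAX API; the class name `SaxEventTraceSketch` and the `start:`/`end:`/`text:` labels are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxEventTraceSketch {
    /** Parses the XML and returns the events in the order the parser fired them. */
    static List<String> trace(String xml) throws Exception {
        List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() {
                events.add("startDocument");
            }

            @Override
            public void endDocument() {
                events.add("endDocument");
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                events.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Skip whitespace-only text such as line breaks between tags
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    events.add("text:" + text);
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        for (String event : trace("<user><name>admin</name></user>")) {
            System.out.println(event);
        }
    }
}
```

Nothing is kept by the parser once an event has been delivered, so anything the program needs later has to be stored by the handler itself, as the `events` list does here.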

Advantages: there is no need to load the entire document in advance, so fewer resources (memory) are used; code written for SAX parsing is shorter than code written for DOM parsing

Disadvantages: events are not persistent, so data that is not saved when an event fires is lost; the parser is stateless, so a text event delivers only the text and not the element it belongs to
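
A common way to work around the stateless nature of SAX is for the handler to remember which elements are currently open. The following is a minimal sketch of that idea, assuming simple elements that contain either text or child elements but not both. The class names and the `element=text` output format are invented for this example.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxContextSketch {
    /** Collects "element=text" pairs by tracking the stack of open elements. */
    static class ContextHandler extends DefaultHandler {
        final List<String> pairs = new ArrayList<>();
        private final Deque<String> open = new ArrayDeque<>();
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            open.push(qName);
            text.setLength(0); // start collecting text for the new element
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // The parser may deliver one text run in several calls, so append
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            String value = text.toString().trim();
            if (!value.isEmpty()) {
                // The top of the stack is the element this text belongs to
                pairs.add(open.peek() + "=" + value);
            }
            open.pop();
            text.setLength(0);
        }
    }

    static List<String> parse(String xml) throws Exception {
        ContextHandler handler = new ContextHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.pairs;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<users><user id=\"1\"><name>zifangsky</name><age>10</age></user></users>";
        System.out.println(parse(xml)); // prints [name=zifangsky, age=10]
    }
}
```

The parser itself still keeps no state; the stack and the buffer live entirely in the handler, which is the extra bookkeeping this disadvantage refers to.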

(3) JDOM

Parsing XML files with JDOM is similar to parsing them with DOM, and the code follows much the same approach. JDOM differs from DOM in two main ways. First, JDOM uses only concrete classes rather than interfaces, which simplifies the API in some respects but also limits flexibility. Second, the JDOM API makes extensive use of the Collections classes, which makes it easier to use for Java developers who already know them.

Advantages: open source project; easier to understand than DOM

Disadvantages: JDOM itself does not contain a parser; it usually relies on a SAX2 parser to parse and validate the input XML document

(4) DOM4J

DOM4J is an excellent Java XML API that offers strong performance, powerful features and great ease of use. It is also open source software. More and more Java software now uses DOM4J to read and write XML.

Because DOM4J performs well and is convenient to code against, it remains efficient even when the XML file is large. If you need to parse XML files, it is therefore worth considering DOM4J first. Of course, if the file is very small, parsing it with DOM is also fine.


Open source project

DOM4J was originally an intelligent fork of JDOM and incorporates many features beyond basic XML document representation

Excellent performance, good flexibility and ease of use

2 Parsing XML files with DOM

(1) Before writing and testing the code, prepare an XML file. The file I prepared here is demo1.xml:


<?xml version="1.0" encoding="UTF-8" ?>
    <user id="1">
    <user id="2">

(2) Code example:

package cn.zifangsky.xml;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
public class DomParseTest {
    public static void main(String[] args) {
        DocumentBuilderFactory dFactory = DocumentBuilderFactory.newInstance();
        try {
            DocumentBuilder dBuilder = dFactory.newDocumentBuilder();
            // Load an XML file
            Document document = dBuilder
                    .parse("src/cn/zifangsky/xml/demo1.xml");
            // Get the collection of 'user' nodes
            NodeList userList = document.getElementsByTagName("user");
            int userListLength = userList.getLength();
            System.out.println("此xml文件一共有" + userListLength + "个'user'节点\n");
            // Iterate over the 'user' nodes
            for (int i = 0; i < userListLength; i++) {
                // Get the node at the given index with item()
                Node userNode = userList.item(i);
                // ********************* Parse attributes ***********************
                // Get all attributes of this node, e.g. id="1"
                NamedNodeMap userAttributes = userNode.getAttributes();
                System.out.println("'user'节点" + i + "有"
                        + userAttributes.getLength() + "个属性:");
                /*
                 * 1 When the attribute names are not known in advance, iterate over
                 * all attributes and read the name and value of each one
                 * */
                for (int j = 0; j < userAttributes.getLength(); j++) {
                    // Each attribute of the 'user' node is itself a node
                    Node attrnNode = userAttributes.item(j);
                    System.out.println("属性" + j + ": 属性名: "
                            + attrnNode.getNodeName() + " ,属性值: "
                            + attrnNode.getNodeValue());
                }
                /*
                 * 2 When the attribute names are known, read the value of a given attribute by name
                 * */
                Element userElement = (Element) userList.item(i);
                System.out.println("属性为'id'的对应值是: "
                        + userElement.getAttribute("id"));
                // ********************* Parse child nodes ************************
                NodeList childNodes = userNode.getChildNodes();
                System.out.println("\n该节点一共有" + childNodes.getLength()
                        + "个子节点,分别是:");
                // Iterate over the child nodes
                for (int k = 0; k < childNodes.getLength(); k++) {
                    Node childNode = childNodes.item(k);
                    // The output shows that the line break after each line is also treated as a node, hence 4+5=9 child nodes
                    // System.out.println("节点名: " + childNode.getNodeName() +
                    // ",节点值: " + childNode.getTextContent());
                    // Keep only the ELEMENT_NODE children; the nodes made of line breaks are TEXT_NODE
                    if (childNode.getNodeType() == Node.ELEMENT_NODE) {
                        // System.out.println("节点名: " + childNode.getNodeName()
                        // + ",节点值: " + childNode.getTextContent());
                        // The lowest level is a text node whose node name is '#text'
                        System.out.println("节点名: " + childNode.getNodeName()
                                + ",节点值: "
                                + childNode.getFirstChild().getNodeValue());
                    }
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

As the code above shows, parsing an XML file with DOM generally involves the following steps:

Create a DocumentBuilderFactory instance

Use that DocumentBuilderFactory to create a new document builder (DocumentBuilder)

Use the DocumentBuilder to parse an XML file and build a document tree (Document)

Use the Document to get the node with a given id, or get the set of all matching nodes by node name

Traverse each node to read its attributes, attribute values and other related data

If a node has child nodes, traverse them in the same way
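
One detail of the "node with a given id" step is worth a sketch: `Document.getElementById` only finds attributes that are typed as IDs, and an attribute that merely happens to be named `id` does not count unless a DTD or schema declares it. The example below is a hedged illustration using JDK classes only; the class name `DomLookupSketch`, the `users` root element and the sample data are assumptions.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomLookupSketch {
    static Document load(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Marks the "id" attribute of every <user> as an ID so getElementById can find it. */
    static void registerIds(Document doc) {
        NodeList users = doc.getElementsByTagName("user");
        for (int i = 0; i < users.getLength(); i++) {
            ((Element) users.item(i)).setIdAttribute("id", true);
        }
    }

    public static void main(String[] args) throws Exception {
        Document doc = load("<users><user id=\"1\"/><user id=\"2\"/></users>");

        // No DTD or schema declares "id" as an ID here, so the lookup finds nothing
        System.out.println("before: " + doc.getElementById("2"));

        registerIds(doc);
        Element found = doc.getElementById("2");
        System.out.println("after: " + found.getTagName() + " id=" + found.getAttribute("id"));
    }
}
```

When the document has no DTD, looking nodes up by name with `getElementsByTagName`, as the article's code does, is usually the simpler route.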

(3) The output of the code above is as follows:

属性0: 属性名: id ,属性值: 1
属性为'id'的对应值是: 1
节点名: name,节点值: zifangsky
节点名: age,节点值: 10
节点名: sex,节点值: male
节点名: contact,节点值: https://www.zifangsky.cn
属性0: 属性名: id ,属性值: 2
属性为'id'的对应值是: 2
节点名: name,节点值: admin
节点名: age,节点值: 20
节点名: sex,节点值: male
节点名: contact,节点值: https://www.tar.pub
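
The "4+5=9 child nodes" remark in the code comments can be checked in isolation. The sketch below counts the child nodes of one indented `user` element and, separately, how many of them are elements. The fragment is a hypothetical reconstruction built from the values in the output above, and the class name `ChildNodeCountSketch` is made up.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ChildNodeCountSketch {
    // Hypothetical fragment, indented the way a hand-written file usually is
    static final String XML =
            "<user id=\"1\">\n"
          + "    <name>zifangsky</name>\n"
          + "    <age>10</age>\n"
          + "    <sex>male</sex>\n"
          + "    <contact>https://www.zifangsky.cn</contact>\n"
          + "</user>";

    /** Returns {total child nodes, element child nodes} of the root element. */
    static int[] count(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList children = doc.getDocumentElement().getChildNodes();
        int elements = 0;
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == Node.ELEMENT_NODE) {
                elements++;
            }
        }
        return new int[] { children.getLength(), elements };
    }

    public static void main(String[] args) throws Exception {
        int[] counts = count(XML);
        // prints "9 child nodes, 4 of them elements"
        System.out.println(counts[0] + " child nodes, " + counts[1] + " of them elements");
    }
}
```

The five extra nodes are the whitespace runs (line break plus indentation) before, between and after the four elements, which is why the article's loop filters on `Node.ELEMENT_NODE`.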

3 Parsing XML files with SAX




package cn.zifangsky.xml;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class SAXParseHandler extends DefaultHandler {
    /**
     * Called for each start tag in the XML file
     * */
    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException {
        super.startElement(uri, localName, qName, attributes);
//      if(qName.equals("user"))
//          System.out.println("'user'元素的id属性值是:" + attributes.getValue("id"));
        int length = attributes.getLength();
        if(length > 0){
            System.out.println("元素'" + qName + "'的属性是:");
            for(int i=0;i<length;i++){
                System.out.println("    属性名:" + attributes.getQName(i) + ",属性值: " + attributes.getValue(i));
            }
        }
        System.out.print("<" + qName + ">");
    }
    /**
     * Called for each end tag in the XML file
     * */
    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        super.endElement(uri, localName, qName);
        System.out.println("</" + qName + ">");
    }
    /**
     * Called for text content
     * */
    @Override
    public void characters(char[] ch, int start, int length)
            throws SAXException {
        super.characters(ch, start, length);
        String value = new String(ch, start, length).trim();
        // Skip whitespace-only text such as line breaks between tags
        if(!value.isEmpty()){
            System.out.print(value);
        }
    }
    /**
     * Marks the start of parsing
     * */
    @Override
    public void startDocument() throws SAXException {
        super.startDocument();
    }
    /**
     * Marks the end of parsing
     * */
    @Override
    public void endDocument() throws SAXException {
        super.endDocument();
    }
}




package cn.zifangsky.xml;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
public class SAXParseTest {
    public static void main(String[] args) {
        SAXParserFactory sFactory = SAXParserFactory.newInstance();
        try {
            SAXParser saxParser = sFactory.newSAXParser();
            SAXParseHandler saxParseHandler = new SAXParseHandler();
            saxParser.parse("src/cn/zifangsky/xml/demo1.xml", saxParseHandler);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}



    属性名:id,属性值: 1
    属性名:id,属性值: 2





4 Parsing XML files with JDOM

At the time of writing, the latest version is JDOM 2.0.6





package cn.zifangsky.xml;
import java.util.List;
import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.input.SAXBuilder;
public class JDOMTest {
    public static void main(String[] args) {
        SAXBuilder saxBuilder = new SAXBuilder();
        try {
            Document document = saxBuilder.build("src/cn/zifangsky/xml/demo1.xml");
            Element rootElement = document.getRootElement();
//          System.out.println(rootElement.getName());
            List<Element> usersList = rootElement.getChildren();  // Get the child nodes
            for(Element u : usersList){
//              List<Attribute> attributes = u.getAttributes();
//              for(Attribute attribute : attributes){
//                  System.out.println("属性名:" + attribute.getName() + ",属性值:" + attribute.getValue());
//              }
                System.out.println("'id'的值是: " + u.getAttributeValue("id"));
            }
        }catch (Exception e) {
            e.printStackTrace();
        }
    }
}










'id'的值是: 1
'id'的值是: 2

5 Parsing XML files with DOM4J





package cn.zifangsky.xml;
import java.io.File;
import java.util.Iterator;
import java.util.List;
import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;
public class DOM4JTest {
    public static void main(String[] args) {
        SAXReader reader = new SAXReader();
        try {
            Document document = reader.read(new File("src/cn/zifangsky/xml/demo1.xml"));
            Element rootElement = document.getRootElement();
            // Iterate over the children of the root element
            Iterator<Element> iterator = rootElement.elementIterator();
            while(iterator.hasNext()){
                Element user = iterator.next();
                // Print every attribute of the current element
                List<Attribute> aList = user.attributes();
                for(Attribute attribute : aList){
                    System.out.println("属性名:" + attribute.getName() + ",属性值:" + attribute.getValue());
                }
                // Print the name and text of each child element
                Iterator<Element> childList = user.elementIterator();
                while(childList.hasNext()){
                    Element child = childList.next();
//                  System.out.println(child.getName() + " : " + child.getTextTrim());
                    System.out.println(child.getName() + " : " + child.getStringValue());
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}



name : zifangsky
age : 10
sex : male
contact : https://www.zifangsky.cn
name : admin
age : 20
sex : male
contact : https://www.tar.pub




<?xml version="1.0" encoding="UTF-8" ?>
<user id="2">
    <name>zifangsky</name>
    <sex>男</sex>
    <age>100</age>
    <contact>https://www.zifangsky.cn</contact>
    <ownPet id="1">旺财</ownPet>
    <ownPet id="2">九头猫妖</ownPet>
</user>



package cn.zifangsky.xml;
import java.util.List;
public class User {
    private String name;
    private String sex;
    private int age;
    private String contact;
    private List<String> ownPet;
    public String getName() {
        return name;
    }
    public void setName(String name) {
        this.name = name;
    }
    public String getSex() {
        return sex;
    }
    public void setSex(String sex) {
        this.sex = sex;
    }
    public int getAge() {
        return age;
    }
    public void setAge(int age) {
        this.age = age;
    }
    public String getContact() {
        return contact;
    }
    public void setContact(String contact) {
        this.contact = contact;
    }
    protected List<String> getOwnPet() {
        return ownPet;
    }
    protected void setOwnPet(List<String> ownPet) {
        this.ownPet = ownPet;
    }
    @Override
    public String toString() {
        return "User [name=" + name + ", sex=" + sex + ", age=" + age
                + ", contact=" + contact + ", ownPet=" + ownPet + "]";
    }
}



package cn.zifangsky.xml;
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import org.dom4j.Document;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;
public class XMLtoJava {
    public User parseXMLtoJava(String xmlPath){
        User user = new User();
        List<String> ownPet = new ArrayList<String>();
        SAXReader saxReader = new SAXReader();
        try {
            Document document = saxReader.read(new File(xmlPath));
            Element rootElement = document.getRootElement();  // Get the root node
            List<Element> children = rootElement.elements();  // Get the children of the root node
            for(Element child : children){
                String elementName = child.getName();  // Node name
                String elementValue = child.getStringValue();  // Node value
                // Copy each value into the matching field of the User object
                switch (elementName) {
                case "name":
                    user.setName(elementValue);
                    break;
                case "sex":
                    user.setSex(elementValue);
                    break;
                case "age":
                    user.setAge(Integer.parseInt(elementValue));
                    break;
                case "contact":
                    user.setContact(elementValue);
                    break;
                case "ownPet":
                    ownPet.add(elementValue);
                    break;
                default:
                    break;
                }
            }
            user.setOwnPet(ownPet);
        } catch (Exception e) {
            e.printStackTrace();
        }
        return user;
    }
    public static void main(String[] args) {
        XMLtoJava demo = new XMLtoJava();
        User user = demo.parseXMLtoJava("src/cn/zifangsky/xml/demo2.xml");
        System.out.println(user);
    }
}



User [name=zifangsky, sex=男, age=100, contact=https://www.zifangsky.cn, ownPet=[旺财, 九头猫妖]]



package cn.zifangsky.xml;
import java.io.File;
import java.util.List;
import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;
public class DOM4JTest2 {
    /**
     * Parse an XML file and print it as close to the original as possible
     * @param xmlPath
     *            path of the XML file to parse
     * */
    public void parse(String xmlPath) {
        SAXReader saxReader = new SAXReader();
        try {
            Document document = saxReader.read(new File(xmlPath));
            Element rootElement = document.getRootElement();
            print(rootElement, 0);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    /**
     * Print the details of one XML node
     * @param element
     *            an XML node
     * @param level
     *            indentation depth of the node; each extra level adds 4 spaces
     * */
    public void print(Element element, int level) {
        List<Element> elementList = element.elements(); // Child nodes of the current node
        // Indentation
        StringBuilder spaceBuilder = new StringBuilder();
        for (int i = 0; i < level; i++)
            spaceBuilder.append("    ");
        String space = spaceBuilder.toString();
        // Print the start tag and its attributes
        System.out.print(space + "<" + element.getName());
        List<Attribute> attributes = element.attributes();
        for (Attribute attribute : attributes)
            System.out.print(" " + attribute.getName() + "=\""
                    + attribute.getText() + "\"");
        // The node has child nodes
        if (elementList.size() > 0) {
            System.out.println(">");
            // Iterate and recurse
            for (Element child : elementList) {
                print(child, level + 1);
            }
            // Print the end tag
            System.out.println(space + "</" + element.getName() + ">");
        } else {
            // Use the short form if the node has no text
            if (element.getStringValue().trim().equals(""))
                System.out.println(" />");
            else
                System.out.println(">" + element.getStringValue() + "</"
                        + element.getName() + ">");
        }
    }
    public static void main(String[] args) {
        DOM4JTest2 test2 = new DOM4JTest2();
        // Path of the XML file to print, supplied as the first command-line argument
        test2.parse(args[0]);
    }
}



<?xml version="1.0" encoding="UTF-8"?>
<application xmlns="http://wadl.dev.java.net/2009/02"
    <grammars />
    <resources base="http://localhost:9080/Demo/services/json/checkCode">
        <resource path="/">
            <resource path="addCheckCode">
                <method name="POST">
                        <representation mediaType="application/octet-stream" />
                        <representation mediaType="application/xml">
                            <param name="result" style="plain" type="xs:int" />
                        <representation mediaType="application/json">
                            <param name="result" style="plain" type="xs:int" />
            <resource path="findCheckCodeByProfileId">
                <method name="POST">
                        <representation mediaType="application/octet-stream">
                            <param name="request" style="plain" type="xs:long" />
                        <representation mediaType="application/xml" />
                        <representation mediaType="application/json" />