Four ways to parse XML
Jun 23, 2017 09:24 AM
XML has become a universal data exchange format. Its platform, language, and system independence make data integration and interaction much more convenient. For the syntax and technical details of XML itself, consult the relevant technical literature, which covers DOM (Document Object Model), DTD (Document Type Definition), SAX (Simple API for XML), XSD (XML Schema Definition), and XSLT (Extensible Stylesheet Language Transformations); see the W3C documentation for more information.
XML is parsed in the same way in every language; only the syntax of the implementation differs. There are two basic parsing approaches: SAX, which parses the document as a stream of events, and DOM, which parses it into a document tree. Assume the XML has the following content and structure:
<?xml version="1.0" encoding="UTF-8"?>
<employees>
    <employee>
        <name>ddviplinux</name>
        <sex>m</sex>
        <age>30</age>
    </employee>
</employees>
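As a rough illustration of the event-stream idea, the sketch below feeds the sample document to the JDK's SAX parser and records each leaf element as its end event arrives. The names SaxEventSketch, SAMPLE, and readEmployee are hypothetical and exist only for this example; the document is held in a string rather than a file to keep the sketch self-contained.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the article's listings.
final class SaxEventSketch {
    // The sample document from above, kept in memory for convenience.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<employees><employee><name>ddviplinux</name>"
        + "<sex>m</sex><age>30</age></employee></employees>";

    // Collects the leaf fields of the employee as parser events arrive.
    static Map<String, String> readEmployee(String xml) {
        Map<String, String> fields = new LinkedHashMap<>();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                text.setLength(0); // a new element starts: drop earlier text
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called more than once per element, so accumulate.
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                // Only the leaf elements carry data in this document.
                if (!qName.equals("employee") && !qName.equals("employees")) {
                    fields.put(qName, text.toString().trim());
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the XML", e);
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(readEmployee(SAMPLE));
    }
}
```

Nothing is kept in memory here except what the handler chooses to store, which is the main contrast with the tree that a DOM parser builds.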
This article uses Java to generate and parse XML documents with DOM and SAX, and with the DOM4J and JDOM libraries.
First, define an interface for working with XML documents, XmlDocument. It declares the methods for creating and parsing an XML document.
package com.beyond.framework.bean;

/**
 * @author zhengwei
 * Defines the interface for creating and parsing XML documents.
 */
public interface XmlDocument {
    /**
     * Creates an XML document.
     * @param fileName full path of the file
     */
    public void createXml(String fileName);

    /**
     * Parses an XML document.
     * @param fileName full path of the file
     */
    public void parserXml(String fileName);
}
1. Generating and parsing XML documents with DOM
DOM defines a set of interfaces for the parsed representation of an XML document. The parser reads in the entire document and builds a memory-resident tree structure that code can then manipulate through the DOM interfaces.
Advantages: the entire document tree is in memory and easy to work with; deleting, modifying, and rearranging nodes are all supported.
Disadvantages: the entire document is loaded into memory, including nodes that are never used, which wastes time and space.
When to use it: the data must be accessed many times once the document has been parsed, and hardware resources (memory, CPU) are sufficient.
The DomDemo class implements XmlDocument with the DOM API. createXml builds the employees, employee, name, sex, and age nodes with document.createElement and document.createTextNode (name "丁宏亮", sex "m", age "30"), writes the tree to the file through a PrintWriter with the encoding set to gb2312 and indentation set to yes, and prints a success message. parserXml walks the employees, each employee's child nodes, and each child's nodes in three nested loops, printing every node name and value separated by ":", and then reports that parsing is complete.
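A minimal sketch of the DOM approach, using only the JAXP classes bundled with the JDK. It is not the article's DomDemo listing: the class name DomSketch and the helpers createXml, parseXml, and appendLeaf are illustrative assumptions, and the sketch works on strings rather than files and writes UTF-8 so that it stays self-contained.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the article's listings.
final class DomSketch {
    // Builds the employees tree in memory and serializes it to a string.
    static String createXml() {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document document = builder.newDocument();
            Element employees = document.createElement("employees");
            document.appendChild(employees);
            Element employee = document.createElement("employee");
            employees.appendChild(employee);
            appendLeaf(document, employee, "name", "ddviplinux");
            appendLeaf(document, employee, "sex", "m");
            appendLeaf(document, employee, "age", "30");

            Transformer transformer =
                TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(document), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not create the XML", e);
        }
    }

    // Adds <name>value</name> under the given parent element.
    private static void appendLeaf(Document document, Element parent,
                                   String name, String value) {
        Element leaf = document.createElement(name);
        leaf.appendChild(document.createTextNode(value));
        parent.appendChild(leaf);
    }

    // Parses the XML into a tree and returns "name:value" for every
    // child element of every employee.
    static List<String> parseXml(String xml) {
        List<String> lines = new ArrayList<>();
        try {
            Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList employeeNodes = document.getElementsByTagName("employee");
            for (int i = 0; i < employeeNodes.getLength(); i++) {
                NodeList info = employeeNodes.item(i).getChildNodes();
                for (int j = 0; j < info.getLength(); j++) {
                    Node node = info.item(j);
                    // Skip the whitespace text nodes that indentation adds.
                    if (node.getNodeType() == Node.ELEMENT_NODE) {
                        lines.add(node.getNodeName() + ":" + node.getTextContent());
                    }
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the XML", e);
        }
        return lines;
    }

    public static void main(String[] args) {
        String xml = createXml();
        System.out.println(xml);
        parseXml(xml).forEach(System.out::println);
    }
}
```

The element-type check matters because an indented document contains whitespace text nodes between elements, and those nodes are part of the DOM tree as well.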
2. Generating and parsing XML documents with SAX
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

/**
 * @author zhengwei
 * Parsing an XML document with SAX
 */
public class SaxDemo implements XmlDocument {
    // SAX only reads documents, so this method just echoes the file name.
    public void createXml(String fileName) {
        System.out.println("<<" + fileName + ">>");
    }

    public void parserXml(String fileName) {
        SAXParserFactory saxfac = SAXParserFactory.newInstance();
        // try-with-resources closes the stream even if parsing fails
        try (InputStream is = new FileInputStream(fileName)) {
            SAXParser saxparser = saxfac.newSAXParser();
            saxparser.parse(is, new MySAXHandler());
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

class MySAXHandler extends DefaultHandler {
    boolean hasAttribute = false;
    Attributes attributes = null;

    public void startDocument() throws SAXException {
        System.out.println("Document printing has started");
    }

    public void endDocument() throws SAXException {
        System.out.println("Document printing has finished");
    }

    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException {
        if (qName.equals("employees")) {
            return;
        }
        if (qName.equals("employee")) {
            System.out.println(qName);
        }
        if (attributes.getLength() > 0) {
            // The parser may reuse its Attributes object, so keep a copy.
            this.attributes = new AttributesImpl(attributes);
            this.hasAttribute = true;
        }
    }

    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        if (hasAttribute && (attributes != null)) {
            for (int i = 0; i < attributes.getLength(); i++) {
                // Index with i; the original always read attribute 0.
                System.out.println(attributes.getQName(i) + "=" + attributes.getValue(i));
            }
            // Reset so the same attributes are not printed again.
            hasAttribute = false;
            attributes = null;
        }
    }

    public void characters(char[] ch, int start, int length) throws SAXException {
        System.out.println(new String(ch, start, length));
    }
}
3. Generating and parsing XML documents with DOM4J
DOM4J offers excellent performance, powerful features, and is very easy to use; it is also open source software. More and more Java software uses DOM4J to read and write XML. Notably, even Sun's JAXM uses DOM4J.
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;
import java.util.Iterator;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;
import org.dom4j.io.XMLWriter;

/**
 * @author zhengwei
 * Generating and parsing XML documents with Dom4j
 */
public class Dom4jDemo implements XmlDocument {
    public void createXml(String fileName) {
        Document document = DocumentHelper.createDocument();
        Element employees = document.addElement("employees");
        Element employee = employees.addElement("employee");
        Element name = employee.addElement("name");
        name.setText("ddvip");
        Element sex = employee.addElement("sex");
        sex.setText("m");
        Element age = employee.addElement("age");
        age.setText("29");
        try {
            Writer fileWriter = new FileWriter(fileName);
            XMLWriter xmlWriter = new XMLWriter(fileWriter);
            xmlWriter.write(document);
            xmlWriter.close();
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }

    public void parserXml(String fileName) {
        File inputXml = new File(fileName);
        SAXReader saxReader = new SAXReader();
        try {
            Document document = saxReader.read(inputXml);
            Element employees = document.getRootElement();
            // Outer loop: each employee; inner loop: its child elements.
            for (Iterator i = employees.elementIterator(); i.hasNext();) {
                Element employee = (Element) i.next();
                for (Iterator j = employee.elementIterator(); j.hasNext();) {
                    Element node = (Element) j.next();
                    System.out.println(node.getName() + ":" + node.getText());
                }
            }
        } catch (DocumentException e) {
            System.out.println(e.getMessage());
        }
        System.out.println("dom4j parserXml");
    }
}
4. Generating and parsing XML documents with JDOM
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.List;
import org.jdom.Document;
import org.jdom.Element;
import org.jdom.JDOMException;
import org.jdom.input.SAXBuilder;
import org.jdom.output.XMLOutputter;

/**
 * @author zhengwei
 * Generating and parsing XML documents with JDOM
 */
public class JDomDemo implements XmlDocument {
    public void createXml(String fileName) {
        Element root = new Element("employees");
        Document document = new Document(root);
        Element employee = new Element("employee");
        root.addContent(employee);
        Element name = new Element("name");
        name.setText("ddvip");
        employee.addContent(name);
        Element sex = new Element("sex");
        sex.setText("m");
        employee.addContent(sex);
        Element age = new Element("age");
        age.setText("23");
        employee.addContent(age);
        XMLOutputter xmlOut = new XMLOutputter();
        // try-with-resources closes the output stream after writing
        try (FileOutputStream out = new FileOutputStream(fileName)) {
            xmlOut.output(document, out);
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public void parserXml(String fileName) {
        SAXBuilder builder = new SAXBuilder(false); // false: no validation
        try {
            Document document = builder.build(new File(fileName));
            Element employees = document.getRootElement();
            List employeeList = employees.getChildren("employee");
            for (int i = 0; i < employeeList.size(); i++) {
                Element employee = (Element) employeeList.get(i);
                List employeeInfo = employee.getChildren();
                for (int j = 0; j < employeeInfo.size(); j++) {
                    Element info = (Element) employeeInfo.get(j);
                    System.out.println(info.getName() + ":" + info.getValue());
                }
            }
        } catch (JDOMException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
5. Use dom4j to parse XML
Listing 1. Example XML Document (catalog.xml)
<?xml version="1.0" encoding="UTF-8"?>
<catalog>
    <!--An XML Catalog-->
    <?target instruction?>
    <journal title="XML Zone" publisher="IBM developerWorks">
        <article level="Intermediate" date="December-2001">
            <title>Java configuration with XML Schema</title>
            <author>
                <firstname>Marcello</firstname>
                <lastname>Vitaletti</lastname>
            </author>
        </article>
    </journal>
</catalog>
Listing 2. Modified XML Document (catalog-modified.xml)
<?xml version="1.0" encoding="UTF-8"?>
<catalog>
    <!--An XML catalog-->
    <?target instruction?>
    <journal title="XML Zone" publisher="IBM developerWorks">
        <article level="Introductory" date="October-2002">
            <title>Create flexible and extensible XML schemas</title>
            <author>
                <firstname>Ayesha</firstname>
                <lastname>Malik</lastname>
            </author>
        </article>
    </journal>
</catalog>
- Creating a document
- Modifying a document
After obtaining the dom4j parser, make dom4j-1.4/dom4j-full.jar accessible on the classpath; it includes the dom4j classes, the XPath engine, and the SAX and DOM interfaces. If you are already using the SAX and DOM interfaces included in a JAXP parser, add dom4j-1.4/dom4j.jar to the classpath instead. dom4j.jar includes the dom4j classes and the XPath engine, but not the SAX and DOM interfaces.
Creating a document
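Import the dom4j API classes used to build the document:
import org.dom4j.Document; import org.dom4j.DocumentHelper; import org.dom4j.Element;
Use the DocumentHelper class to create a Document instance. DocumentHelper is the dom4j API factory class that generates XML document nodes.
Document document = DocumentHelper.createDocument();
Use the addElement() method to create the root element catalog. addElement() adds an element to an XML document.
Element catalogElement = document.addElement("catalog");
Use the addElement() method on the catalog element to add the journal element.
Element journalElement = catalogElement.addElement("journal");
Use the addAttribute() method to add the title and publisher attributes to the journal element.
journalElement.addAttribute("title", "XML Zone");
journalElement.addAttribute("publisher", "IBM developerWorks");
Add an article element to the journal element.
Element articleElement = journalElement.addElement("article");
Add the level and date attributes to the article element.
articleElement.addAttribute("level", "Intermediate");
articleElement.addAttribute("date", "December-2001");
Add the title element to the article element.
Element titleElement = articleElement.addElement("title");
Add the author element to the article element in the same way, and then the firstname and lastname elements to the author element.
Element lastNameElement = authorElement.addElement("lastname");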
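A document type declaration can be added with the addDocType() method.
document.addDocType("catalog", null, "file://c:/Dtds/catalog.dtd");
This adds the following document type declaration to the XML document:
<!DOCTYPE catalog SYSTEM "file://c:/Dtds/catalog.dtd">
A Doctype is required if the document is to be validated against a document type definition (DTD).
The XML declaration <?xml version="1.0" encoding="UTF-8"?> is added to the XML document automatically.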
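For comparison, the same two effects (a SYSTEM document type declaration and an automatic XML declaration) can be sketched with the JDK's own JAXP classes in place of dom4j. This is a swapped-in technique, not the dom4j API described above; DoctypeSketch and serializeWithDoctype are hypothetical names used only for this illustration.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

// Hypothetical helper class; uses JAXP from the JDK, not dom4j.
final class DoctypeSketch {
    // Serializes an empty root element with a SYSTEM document type
    // declaration that points at the given DTD location.
    static String serializeWithDoctype(String rootName, String systemId) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            document.appendChild(document.createElement(rootName));

            Transformer transformer =
                TransformerFactory.newInstance().newTransformer();
            // Asks the serializer to emit <!DOCTYPE root SYSTEM "...">.
            transformer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, systemId);
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(document), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not serialize the document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(
            serializeWithDoctype("catalog", "file://c:/Dtds/catalog.dtd"));
    }
}
```

The XML declaration appears here without being requested, because the serializer writes it by default.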
The example program in Listing 3 creates the XML document catalog.xml.
Listing 3. Program that generates the XML document catalog.xml
import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.io.XMLWriter;
import java.io.*;

public class XmlDom4J {
    public void generateDocument() {
        Document document = DocumentHelper.createDocument();
        Element catalogElement = document.addElement("catalog");
        catalogElement.addComment("An XML Catalog");
        catalogElement.addProcessingInstruction("target", "text");
        Element journalElement = catalogElement.addElement("journal");
        journalElement.addAttribute("title", "XML Zone");
        journalElement.addAttribute("publisher", "IBM developerWorks");
        Element articleElement = journalElement.addElement("article");
        articleElement.addAttribute("level", "Intermediate");
        articleElement.addAttribute("date", "December-2001");
        Element titleElement = articleElement.addElement("title");
        titleElement.setText("Java configuration with XML Schema");
        Element authorElement = articleElement.addElement("author");
        Element firstNameElement = authorElement.addElement("firstname");
        firstNameElement.setText("Marcello");
        Element lastNameElement = authorElement.addElement("lastname");
        lastNameElement.setText("Vitaletti");
        document.addDocType("catalog", null, "file://c:/Dtds/catalog.dtd");
        try {
            // Write the finished tree to c:/catalog/catalog.xml
            XMLWriter output = new XMLWriter(
                new FileWriter(new File("c:/catalog/catalog.xml")));
            output.write(document);
            output.close();
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }

    public static void main(String[] argv) {
        XmlDom4J dom4j = new XmlDom4J();
        dom4j.generateDocument();
    }
}
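This section covered creating an XML document; the next section describes using the dom4j API to modify the XML document created here.
Modifying a document
This section shows how to use the dom4j API to modify the sample XML document catalog.xml.
Use SAXReader to parse the XML document catalog.xml:
SAXReader saxReader = new SAXReader(); Document document = saxReader.read(inputXml);
SAXReader is part of the org.dom4j.io package.
inputXml is a java.io.File created from c:/catalog/catalog.xml. Use an XPath expression to get the list of level nodes from the article elements. If the level attribute value is "Intermediate", change it to "Introductory".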
List list = document.selectNodes("//article/@level");
Iterator iter = list.iterator();
while (iter.hasNext()) {
    Attribute attribute = (Attribute) iter.next();
    if (attribute.getValue().equals("Intermediate"))
        attribute.setValue("Introductory");
}
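The same select-then-update pattern can be tried without dom4j on the classpath. The sketch below is an assumption-laden equivalent built on the JDK's javax.xml.xpath package rather than dom4j's selectNodes(); XPathUpdateSketch, CATALOG, and updateLevels are hypothetical names, and the catalog is a trimmed copy of Listing 1 held in a string.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; uses the JDK XPath API, not dom4j.
final class XPathUpdateSketch {
    // A trimmed copy of Listing 1, kept in memory for convenience.
    static final String CATALOG =
        "<catalog><journal title=\"XML Zone\" publisher=\"IBM developerWorks\">"
        + "<article level=\"Intermediate\" date=\"December-2001\">"
        + "<title>Java configuration with XML Schema</title>"
        + "</article></journal></catalog>";

    // Changes every level="Intermediate" to "Introductory" and returns
    // the level values found in the document after the update.
    static List<String> updateLevels(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            // Select the attribute nodes themselves, as in the dom4j code.
            NodeList levels = (NodeList) xpath.evaluate(
                "//article/@level", document, XPathConstants.NODESET);
            for (int i = 0; i < levels.getLength(); i++) {
                Node level = levels.item(i);
                if ("Intermediate".equals(level.getNodeValue())) {
                    level.setNodeValue("Introductory");
                }
            }

            // Query again to confirm the tree was modified in place.
            NodeList updated = (NodeList) xpath.evaluate(
                "//article/@level", document, XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < updated.getLength(); i++) {
                values.add(updated.item(i).getNodeValue());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException("could not update the catalog", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(updateLevels(CATALOG));
    }
}
```

Because the XPath expression returns the attribute nodes of the in-memory tree, setting a node's value changes the document itself; no separate write-back step is needed before serializing.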
Get the list of article elements, obtain an iterator over the title elements inside each article element, and modify the text of the title element.
list = document.selectNodes("//article");
iter = list.iterator();
while (iter.hasNext()) {
    Element element = (Element) iter.next();
    Iterator iterator = element.elementIterator("title");
    while (iterator.hasNext()) {
        Element titleElement = (Element) iterator.next();
        if (titleElement.getText().equals("Java configuration with XML Schema"))
            titleElement.setText("Create flexible and extensible XML schemas");
    }
}
The author element is modified through a process similar to the one used for the title element.
The example program in Listing 4 turns the catalog.xml document into the catalog-modified.xml document.
Listing 4. Program that modifies catalog.xml
import org.dom4j.Document;
import org.dom4j.Element;
import org.dom4j.Attribute;
import java.util.List;
import java.util.Iterator;
import org.dom4j.io.XMLWriter;
import java.io.*;
import org.dom4j.DocumentException;
import org.dom4j.io.SAXReader;

public class Dom4JParser {
    public void modifyDocument(File inputXml) {
        try {
            SAXReader saxReader = new SAXReader();
            Document document = saxReader.read(inputXml);

            // level: Intermediate -> Introductory
            List list = document.selectNodes("//article/@level");
            Iterator iter = list.iterator();
            while (iter.hasNext()) {
                Attribute attribute = (Attribute) iter.next();
                if (attribute.getValue().equals("Intermediate"))
                    attribute.setValue("Introductory");
            }

            // date: December-2001 -> October-2002
            list = document.selectNodes("//article/@date");
            iter = list.iterator();
            while (iter.hasNext()) {
                Attribute attribute = (Attribute) iter.next();
                if (attribute.getValue().equals("December-2001"))
                    attribute.setValue("October-2002");
            }

            // title text
            list = document.selectNodes("//article");
            iter = list.iterator();
            while (iter.hasNext()) {
                Element element = (Element) iter.next();
                Iterator iterator = element.elementIterator("title");
                while (iterator.hasNext()) {
                    Element titleElement = (Element) iterator.next();
                    if (titleElement.getText().equals("Java configuration with XML Schema"))
                        titleElement.setText("Create flexible and extensible XML schemas");
                }
            }

            // author first name
            list = document.selectNodes("//article/author");
            iter = list.iterator();
            while (iter.hasNext()) {
                Element element = (Element) iter.next();
                Iterator iterator = element.elementIterator("firstname");
                while (iterator.hasNext()) {
                    Element firstNameElement = (Element) iterator.next();
                    if (firstNameElement.getText().equals("Marcello"))
                        firstNameElement.setText("Ayesha");
                }
            }

            // author last name
            list = document.selectNodes("//article/author");
            iter = list.iterator();
            while (iter.hasNext()) {
                Element element = (Element) iter.next();
                Iterator iterator = element.elementIterator("lastname");
                while (iterator.hasNext()) {
                    Element lastNameElement = (Element) iterator.next();
                    if (lastNameElement.getText().equals("Vitaletti"))
                        lastNameElement.setText("Malik");
                }
            }

            // Write the modified tree to a new file
            XMLWriter output = new XMLWriter(
                new FileWriter(new File("c:/catalog/catalog-modified.xml")));
            output.write(document);
            output.close();
        } catch (DocumentException e) {
            System.out.println(e.getMessage());
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }

    public static void main(String[] argv) {
        Dom4JParser dom4jParser = new Dom4JParser();
        dom4jParser.modifyDocument(new File("c:/catalog/catalog.xml"));
    }
}
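Conclusion: The parser included in dom4j is a non-validating tool for parsing XML documents and can be integrated with JAXP, Crimson, or Xerces. This article showed how to use that parser to create and modify XML documents.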