Web Programming-Detailed Explanation of XML Grammar Analysis

黄舟

Mar 24, 2017 pm 04:47 PM

Before performing XML grammatical analysis, it is first necessary to understand the basic rules of XML syntax:

Lexical features: 1) XML is case-sensitive: the element name in an opening tag and in its closing tag must use the same case …, and XML reserved word strings must be written in the required case ….

2) The XML reserved characters are < > &. Reserved characters may not appear literally in element names, element text, attribute names, or attribute values. < opens a tag, > closes a tag, and & introduces an escape. The common escapes are: &lt; produces <, &gt; produces >, &amp; produces &, &apos; produces ', and &quot; produces ".

3) An element name starts with an underscore or a letter and may contain letters, digits, periods, hyphens, underscores, colons, and extended characters of other languages. Element names may not contain whitespace (spaces, tabs, line feeds, carriage returns). An element name may carry a namespace prefix. Element text may be any sequence of characters other than the XML reserved characters, such as: my money is $2000

4) Attribute names follow the same rule as element names. An attribute value is enclosed in single or double quotes and may consist of any characters other than the XML reserved characters. An attribute name with the xmlns prefix indicates that the attribute defines a namespace.
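
The escaping rule in item 2) can be tried out with Python's standard library. This is only a minimal sketch: `escape_text` and `unescape_text` are hypothetical helper names, and the extra mapping for quotes is an assumption added here so that all five escapes listed above are covered.

```python
from xml.sax.saxutils import escape, unescape

# escape() handles &, < and > by default; quotes are added explicitly.
EXTRA = {"'": "&apos;", '"': "&quot;"}
REVERSE = {entity: char for char, entity in EXTRA.items()}


def escape_text(s):
    """Replace the XML reserved characters (and quotes) with escapes."""
    return escape(s, EXTRA)


def unescape_text(s):
    """Turn the escapes back into the characters they stand for."""
    return unescape(s, REVERSE)


sample = "say \"hi\" & 'bye' <now>"
escaped = escape_text(sample)
restored = unescape_text(escaped)
```

After escaping, the text no longer contains any literal reserved character, so it can be placed safely in element text or an attribute value.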

Syntactic features: 1) An XML document consists of an XML declaration, optional document type declarations, optional XML processing instructions, optional XML comments, and a data body rooted at a single root element. CDATA sections may also be embedded in the body, such as:

<?xml …?> /* XML declaration */
<!DOCTYPE …> /* XML document type declaration */
<!-- … --> /* XML comment */
<?xml-stylesheet …?> /* XML processing instruction */
<root> /* root data element */
  <child>
    …<![CDATA[…]]>
  </child>
</root>


2) The XML declaration is opened by <?xml and closed by ?>, and contains optional attributes such as version and encoding.
3) A document type declaration is opened by <! and closed by >.
4) An XML processing instruction is opened by <? and closed by ?>.
5) An XML comment is opened by <!-- and closed by -->.
6) An XML element is opened by < and closed either by /> (an empty element) or by a matching closing tag that starts with </. The opening and closing tags of an element must match each other. Elements may be nested, and the nesting must also match level by level.
7) A CDATA section is opened by <![CDATA[ and closed by ]]>. It exempts the text inside it from the XML parsing rules.
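
These syntactic rules can be observed with an existing parser. The sketch below uses Python's standard `xml.etree.ElementTree` module, not the automaton developed in this article; the sample document, the `style.xsl` name, and the helper `is_well_formed` are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# A small document with a declaration, a comment, a processing
# instruction, a CDATA section and an empty element.
DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<!-- a comment -->
<?xml-stylesheet type="text/xsl" href="style.xsl"?>
<root>
  <child><![CDATA[1 < 2 & 3 > 2]]></child>
  <empty/>
</root>"""


def is_well_formed(data):
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(data)
    except ET.ParseError:
        return False
    return True


root = ET.fromstring(DOC)
# Reserved characters inside CDATA come through unchanged.
cdata_text = root.find("child").text
```

A document whose tags differ only in case, such as `<root><Child></child></root>`, or whose tags cross, such as `<a><b></a></b>`, is rejected by `is_well_formed`.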
Based on the XML grammar features above, we can construct regular expressions for lexical analysis and a pushdown automaton for syntactic analysis.
XML lexical regular expression:
#define digit [0,1,…,9] /* digit characters */
#define letter [a,b,…,z,A,B,…,Z] /* alphabetic characters */
#define signs [~, !, @, #, %, ^, &, *, (, ), ?, :, ;, ", ', ,, ., /, -, _, +, =, |, \] /* symbol characters */
#define ascii2 [0x80,…,0xFF] /* extended ASCII characters */
#define space [0x20, \t, \r, \n] /* space, tab, carriage return, line feed */
#define reserve [<, >, &] /* XML reserved characters */
1) The regular expression of the element name:

　　element_name -> (_ | letter | ascii2) (ε| _ | - | : | . | digit | letter | signs | ascii2)*


2) The regular expression of the element text:

　　element_text -> (ε| not reserve)*


3) The regular expression of the attribute name:

　　proper_name -> (_ | letter | ascii2) (ε| _ | - | : | . | digit | letter | signs | ascii2)*


4) The regular expression of the attribute value:

　　proper_value -> (ε| not reserve)*
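
As a rough illustration, the name and text rules can be written with Python's `re` module. This sketch follows the prose rule for names (letters, digits, period, hyphen, underscore, colon, extended characters) and leaves out the broader `signs` class; the names `NAME_RE`, `TEXT_RE`, `is_name`, and `is_text` are hypothetical.

```python
import re

# First character: underscore, letter, or extended character (0x80-0xFF).
# Following characters may also be digits, '-', ':' and '.'.
NAME_RE = re.compile(r"[_A-Za-z\x80-\xff][-_:.0-9A-Za-z\x80-\xff]*")

# Element text and attribute values: anything except the reserved characters.
TEXT_RE = re.compile(r"[^<>&]*")


def is_name(s):
    """Check a string against the element/attribute name rule."""
    return NAME_RE.fullmatch(s) is not None


def is_text(s):
    """Check a string against the element text / attribute value rule."""
    return TEXT_RE.fullmatch(s) is not None
```

For instance, `is_name("ns:item")` and `is_text("my money is $2000")` hold, while `is_name("1abc")` and `is_text("a < b")` do not.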


XML syntax structure:

xml_document -> xml_header (ε | xml_declare | xml_instruct | xml_comments)* xml_element
xml_header -> [<?xml](space)*(proper_token)*(space)*[?>]
xml_declare -> [<!]reserve_word(space)*(token)*(space)*[>]
xml_instruct -> [<?]reserve_word(space)*(proper_token)*(space)*[?>]
xml_comments -> [<!--](ε | digit | letter | signs | ascii2 | space)*[-->]
xml_element -> [<]element_name(space)*(ε | proper_token)*(space)*[/>] |
               [<]element_name(space)*(ε | proper_token)*(space)*[>]
               [ε | <![CDATA[ ]element_text[ε | ]]>]
               (ε | xml_element)*(space)*[</]element_name[>]
proper_token -> proper_name(space)*[=](space)*[ε | <![CDATA[ ][' | "]proper_value[' | "][ε | ]]>]
reserve_word -> [DOCTYPE | ELEMENT | NOTATION | …]
token -> (ε | not reserve)*
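
To make the opening-tag part of xml_element and the proper_token production more concrete, here is a small regular-expression sketch in Python. It is an approximation under stated assumptions: it requires whitespace before each attribute (stricter than the (space)* in the production), it ignores the optional CDATA wrapper around attribute values, and `parse_start_tag` is a hypothetical helper name.

```python
import re

NAME = r"[_A-Za-z\x80-\xff][-_:.0-9A-Za-z\x80-\xff]*"

# One attribute: name = "value" or name = 'value', no reserved characters.
ATTR_RE = re.compile(rf"""\s+({NAME})\s*=\s*(?:"([^<>&"]*)"|'([^<>&']*)')""")
# Same pattern without capture groups, for use inside the tag pattern.
ATTR_NC = rf"""\s+{NAME}\s*=\s*(?:"[^<>&"]*"|'[^<>&']*')"""
# Opening tag: '<' name attributes* optional '/' '>'.
TAG_RE = re.compile(rf"<({NAME})((?:{ATTR_NC})*)\s*(/?)>")


def parse_start_tag(tag):
    """Return (name, attributes, is_empty_element), or None if no match."""
    m = TAG_RE.fullmatch(tag)
    if m is None:
        return None
    name, attr_text, slash = m.groups()
    # findall yields (name, double-quoted value, single-quoted value);
    # the alternative that did not match is an empty string.
    attrs = {key: dq or sq for key, dq, sq in ATTR_RE.findall(attr_text)}
    return name, attrs, slash == "/"
```

A tag whose attribute value contains a reserved character, such as `<a b="x<y">`, does not match and yields `None`.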


Parsing the XML grammar requires a pushdown automaton, whose structure is defined as follows:

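1) STACK_DFA mata_xml_doc =

Q: {…} /* see the state set below */
Σ: /* points to the XML token string to be parsed */
σ: Q×Σ->Q /* state transition function; see the state transition list */
q: {NIL_SKIP} /* initial state */
Γ: {NIL_FAILED, NIL_SUCCEED} /* set of final states */
S: {<Q /* state */, N /* DOM node */>, <…>} /* pushdown stack */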


2) The stack top symbol set reflects the type of the node currently being analyzed:

T: {NIL /* empty */, TG /* tag */, NS /* element */, IS /* instruction */, DS /* declaration */, CD /* CDATA section */, CM /* comment */}


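3) The state set reflects the characteristics of each stage of analysis; each group of states corresponds to a stack top symbol:

NIL: NIL_FAILED /* failure */
     NIL_SKIP /* skip */
     NIL_SUCCEED /* success */
CM:  CM_BEGIN /* comment begins */
     CM_END /* comment ends */
TG:  TG_OPEN /* tag opened */
     TG_INT_CLOSE /* tag interrupted */
     TG_PRE_CLOSE /* tag about to close */
     TG_CLOSE /* tag closed */
NS:  NS_NAME_BEGIN /* element name begins */
     NS_NAME_END /* element name ends */
     NS_KEY_BEGIN /* attribute name begins */
     NS_KEY_END /* attribute name ends */
     NS_ASIGN /* attribute assignment */
     NS_VAL_BEGIN /* attribute value begins */
     NS_VAL_END /* attribute value ends */
     NS_TEXT_BEGIN /* element text begins */
     NS_TEXT_END /* element text ends */
IS:  IS_OPEN /* instruction opened */
     IS_NAME_BEGIN /* instruction name begins */
     IS_NAME_END /* instruction name ends */
     IS_KEY_BEGIN /* instruction key begins */
     IS_KEY_END /* instruction key ends */
     IS_ASIGN /* assignment operator */
     IS_VAL_BEGIN /* instruction value begins */
     IS_VAL_END /* instruction value ends */
     IS_CLOSE /* instruction closed */
DS:  DS_OPEN /* declaration opened */
     DS_SKIP /* skip over the declaration section */
     DS_CLOSE /* declaration closed */
CD:  CD_BEGIN /* CDATA section begins */
     CD_END /* CDATA section ends */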
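
The role of the pushdown stack can be sketched in a few lines of Python. This is a heavily simplified illustration, not the automaton defined above: it tracks only element names on the stack, skips comments, instructions, declarations, and CDATA sections as whole tokens, and borrows just the two final state names NIL_SUCCEED and NIL_FAILED. The function name `check_nesting` and the token pattern are assumptions.

```python
import re

NIL_SUCCEED = "NIL_SUCCEED"
NIL_FAILED = "NIL_FAILED"

TOKEN_RE = re.compile(
    r"<!--.*?-->"                      # comment (CM)
    r"|<!\[CDATA\[.*?\]\]>"            # CDATA section (CD)
    r"|<\?.*?\?>"                      # instruction (IS)
    r"|<![^>]*>"                       # declaration (DS)
    r"|</(?P<close>[^\s<>/]+)\s*>"     # closing tag
    r"|<(?P<open>[^\s<>/!?]+)[^<>]*?(?P<empty>/?)>",  # opening or empty tag
    re.DOTALL,
)


def check_nesting(doc):
    """Push element names on open tags, pop and compare on close tags."""
    stack = []
    seen_root = False
    pos = 0
    for m in TOKEN_RE.finditer(doc):
        text = doc[pos:m.start()]
        # A stray '<' or text outside the root element is an error.
        if "<" in text or (not stack and text.strip()):
            return NIL_FAILED
        pos = m.end()
        if m.group("open"):
            if not stack and seen_root:
                return NIL_FAILED      # a second root element
            seen_root = True
            if not m.group("empty"):
                stack.append(m.group("open"))
        elif m.group("close"):
            # Names are compared exactly, so matching is case-sensitive.
            if not stack or stack.pop() != m.group("close"):
                return NIL_FAILED
    if doc[pos:].strip() or stack or not seen_root:
        return NIL_FAILED
    return NIL_SUCCEED
```

The stack is what lets the checker reject crossed tags such as `<a><b></a></b>`, which a plain finite automaton without a stack cannot detect for arbitrary nesting depth.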
