Detailed introduction to SOAP-related XML knowledge

1. XML Overview
When the authors of the SOAP standard were looking for a way to express SOAP messages, they had many options. They ultimately chose to reuse existing technology as much as possible and to minimize the number of new terms needed to describe the content of a SOAP message, so they chose XML, whose full name is "Extensible Markup Language".
XML offers many more features than SOAP needs. For example, Section 3 of the SOAP 1.1 standard, "Relation to XML", states that a SOAP message must not contain a Document Type Declaration (DTD) and must not contain Processing Instructions. Looking at which parts of the XML standard SOAP adopts, the wisdom of this decision quickly becomes clear: because developers do not need a full-featured XML parser to use SOAP, a SOAP-based solution is easier to implement. To understand SOAP, we must first understand the following concepts:
1. Uniform Resource Identifiers (URIs).
2. XML basics.
3. XML schemas.
4. XML namespaces.
5. XML attributes.
1.1 Uniform Resource Identifier
To access a particular resource on the Internet, you must be able to tell it apart from many other objects. A Uniform Resource Identifier (URI) is the name given to each resource. The format of a URI is:

<scheme>:<scheme-specific-part>

The scheme-specific-part often contains slash characters (/), which represent the hierarchical structure of a path.
1.1.1 Uniform Resource Locator (URL)
The most familiar kind of URI is the Uniform Resource Locator (URL), which is what we usually call a web address. A URL must also follow the general URI syntax. The format of the scheme-specific-part of a URL is as follows:

//<user>:<password>@<host>:<port>/<url-path>

The meaning of each component in the above syntax is as follows:
1. user: The user name at the target address (optional).
2. password: The password assigned to that user (optional).
3. host: The Internet Protocol address or fully qualified domain name of a network host (required).
4. port: The port number used to establish a connection. Most protocols have a default port number; for example, HTTP uses port 80 by default (optional).
5. url-path: The detailed path to a specific resource. The slash character (/) immediately following the host name or port number is not part of the url-path.
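As a rough illustration of the components listed above, the following sketch splits a URL with Python's standard urllib.parse module. The URL, user name, and password are made up for the example, and note that urllib reports the path with its leading slash, unlike the url-path definition above.

```python
from urllib.parse import urlsplit

# Hypothetical URL used only for illustration.
url = "http://alice:secret@www.example.com:8080/docs/soap/index.html"

parts = urlsplit(url)

print(parts.scheme)    # the scheme, i.e. the text before the first ":"
print(parts.username)  # <user>
print(parts.password)  # <password>
print(parts.hostname)  # <host>
print(parts.port)      # <port>, converted to an integer
print(parts.path)      # the path as urllib reports it, including the leading "/"

# When no port is written, urlsplit reports None rather than
# substituting the protocol default (80 for http).
default_port = urlsplit("http://www.example.com/").port
print(default_port)
```

Code that needs the effective port therefore has to supply the protocol default itself when the port is absent.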

1.1.2 Uniform Resource Name (URN)
Compared with the ubiquitous URL, most web users are much less familiar with the Uniform Resource Name (URN). The difference between a URN and a URL is that a URN does not resolve to a specific physical location. A URN is a persistent resource identifier (unlike a URL, which carries path information that may change). URNs allow identifiers from other namespaces to be mapped into the URN space, so the URN syntax provides a way to encode character data in a form that existing protocols can carry. URNs also follow the general URI rules. The common format is as follows:

<URN> ::= "urn:" <NID> ":" <NSS>

The string "urn:" at the start of the name indicates that this is a URN. NID is the abbreviation of "Namespace ID", and NSS is the abbreviation of "Namespace-Specific String", the string that is interpreted within the namespace identified by the NID. When we encounter a URN, we decide how to interpret the NSS based on the value of the NID. When reading or using URNs, remember that under RFC 2141 the leading string "urn:" and the NID are case-insensitive, while the NSS should be treated as case-sensitive unless its namespace says otherwise. URLs and URNs are the two most common kinds of URI. We will see another use of URIs later: XML namespaces.
1.2 Basics of XML
XML came to public attention between 1996 and 1997. The following is an example of XML code:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
    <parent>
        <artifactId>framework</artifactId>
        <groupId>com.**.framework</groupId>
        <version>1.0-SNAPSHOT</version>
    </parent>
    <modelVersion>4.0.0</modelVersion>
    <packaging>war</packaging>
</project>

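Even if you have never read XML code, this example is not hard to follow. It illustrates several rules for writing an XML document.
(1) The first line is the XML declaration, which has the form of a processing instruction and states the XML version and character encoding the document uses. The declaration is not strictly required, but it is best to include it.
(2) An XML document must have exactly one enclosing (root) element; the version information does not count as one. In the example, project encloses the entire document between its start tag and end tag. It has several child elements, and child elements can themselves contain nested elements.
(3) None of the words used as element names in this code are XML keywords.
(4) Take care not to misspell tags. As long as the start tag and end tag still match, the XML parser will accept a misspelled document, but the document will no longer carry the meaning you intended. If we want the parser to do some checking for us and read only correctly structured data, we can add a Document Type Declaration (DTD) or an XML schema. DTDs are covered only briefly here, because Section 3 of the SOAP specification explicitly states that a SOAP message must not contain a Document Type Declaration.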
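The misspelling point in rule (4) can be tried out with a short sketch using Python's standard xml.etree.ElementTree module. The element name "versoin" is a deliberate typo invented for this example; the parser is a non-validating one, so it checks well-formedness only.

```python
import xml.etree.ElementTree as ET

# A tag misspelled consistently ("versoin") is still well-formed XML.
misspelled = "<parent><versoin>1.0-SNAPSHOT</versoin></parent>"
root = ET.fromstring(misspelled)
print(root.find("version"))       # None: the element we meant is not there
print(root.find("versoin").text)  # the data sits under the misspelled name

# A start tag and end tag that disagree make the document ill-formed,
# and the parser rejects it.
broken = "<parent><version>1.0-SNAPSHOT</versoin></parent>"
try:
    ET.fromstring(broken)
    well_formed = True
except ET.ParseError as err:
    well_formed = False
    print("rejected:", err)
```

The first document parses without complaint even though a consumer looking for version finds nothing, which is the gap that a DTD or schema is meant to close.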

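1.3 XML Schema
An XML schema is more expressive than a DTD. Both provide a way to define the structure of an XML element, but only a schema lets you attach data type information to the definition. XML data is text-based: it represents the number 4 with the character "4" rather than with a binary form such as "0100". (XML does allow binary data to be encoded inside a message, which lets us send content such as image data embedded in an XML message.)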
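A minimal sketch of what "text-based" means in practice, using Python's standard library. The element names count and image are invented for the example, and base64 is assumed here as the binary-to-text encoding; the four bytes merely stand in for real image data.

```python
import base64
import xml.etree.ElementTree as ET

# Numbers travel as text and are converted by the receiver.
count = ET.fromstring("<count>4</count>")
value = int(count.text)

# Arbitrary bytes (a stand-in for image data) are encoded as base64 text.
raw = bytes([0x89, 0x50, 0x4E, 0x47])
image = ET.Element("image")
image.text = base64.b64encode(raw).decode("ascii")
message = ET.tostring(image, encoding="unicode")
print(message)

# The receiver reverses the encoding to get the original bytes back.
decoded = base64.b64decode(ET.fromstring(message).text)
```

Without type information, nothing in the document itself tells the receiver that count should be read as an integer; that knowledge has to come from a schema or from the application.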

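The following examples show how a schema improves on a DTD.
A simple DTD consists of elements that contain other elements or character data. The simplest element declaration gives the element's name and defines its content as character data, as follows:

<!ELEMENT element-name (#PCDATA)>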

An element can be composed of other elements. If an element contains exactly one instance of a given element, we define it like this:

<!ELEMENT parentElement (childElement)>
<!ELEMENT childElement (#PCDATA)>

If parentElement may contain zero or more childElement elements, we mark the child with an asterisk (*), as follows:

<!ELEMENT parentElement (childElement*)>
<!ELEMENT childElement (#PCDATA)>

A DTD can also describe a combination of elements, for example a parentElement that contains two different pieces of data, as follows:

<!ELEMENT parentElement (childElement1, childElement2)>
<!ELEMENT childElement1 (#PCDATA)>
<!ELEMENT childElement2 (#PCDATA)>

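As a concrete example, suppose a library holds many books, and each book has properties such as a title, authors, and a copyright year. The DTD file Library.dtd could then be:

<!ELEMENT Library (Book*)>
<!ELEMENT Book (Title, Author*, Copyright)>
<!ELEMENT Title (#PCDATA)>
<!ELEMENT Author (#PCDATA)>
<!ELEMENT Copyright (#PCDATA)>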

Library consists of zero or more Book elements. Each Book consists of one Title element, zero or more Author elements, and one Copyright element. Title, Author, and Copyright all contain character data. To check XML content against this DTD, we can write an XML document like this:

<?xml version="1.0"?>
<!DOCTYPE Library PUBLIC "." "Library.dtd">
<Library>
<Book>
    <Title>Green Eggs</Title>
    <Author>Dr.Seuss</Author>
    <Copyright>1957</Copyright>
</Book>
<Book>
    <Title>Windows</Title>
    <Author>Scott</Author>
    <Copyright>2000</Copyright>
</Book>
</Library>
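
Python's standard xml.etree.ElementTree parser does not validate against a DTD, so the following is only a hand-written sketch of what the Library.dtd content models require. It rewrites each content model as a regular expression over the sequence of child element names; the CONTENT_MODELS table and the conforms function are hypothetical helpers invented for this illustration, not part of any library.

```python
import re
import xml.etree.ElementTree as ET

# Content models from Library.dtd, rewritten as regular expressions over
# a space-terminated sequence of child element names.
CONTENT_MODELS = {
    "Library": re.compile(r"(Book )*"),
    "Book": re.compile(r"Title (Author )*Copyright "),
}

def conforms(element: ET.Element) -> bool:
    """Check an element and its descendants against CONTENT_MODELS."""
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        # Elements without a listed model are treated as (#PCDATA):
        # text only, no child elements.
        return len(element) == 0
    children = "".join(child.tag + " " for child in element)
    return bool(model.fullmatch(children)) and all(conforms(c) for c in element)

good = ET.fromstring(
    "<Library><Book><Title>Green Eggs</Title><Author>Dr.Seuss</Author>"
    "<Copyright>1957</Copyright></Book></Library>"
)
# Same children, but Author comes before Title, which the DTD forbids.
bad = ET.fromstring(
    "<Library><Book><Author>Scott</Author><Title>Windows</Title>"
    "<Copyright>2000</Copyright></Book></Library>"
)
print(conforms(good), conforms(bad))
```

The check covers element structure only. Like the DTD itself, it has no way to say that Copyright must hold a number.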

When the parser needs to check the data, it automatically loads Library.dtd and checks the content of the document against it. The benefit is obvious, but would it not be better if we could say more than "this element contains character data"? For this and other reasons, the W3C eventually published its recommendation on XML Schema. The DTD above can be rewritten as an XML schema as follows:

<Schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <complexType name="Book" content="mixed">
        <element type="Title"></element>
        <element type="Author"></element>
        <element type="Copyright"></element>
    </complexType>
    <simpleType name="Title" content="textOnly" xsi:type="string"></simpleType>
    <simpleType name="Author" content="textOnly" xsi:type="string"></simpleType>
    <simpleType name="Copyright" content="textOnly" xsi:type="integer"></simpleType>
</Schema>

Save the code above as an XML file (usually with the .xsd extension). To use the schema, simply reference it in the document, like this:

<myLibrary:Library xmlns:myLibrary="x-schema:http://www.scottseely.com/LibrarySchema.xml">
    <myLibrary:Book>
        <myLibrary:Title>Green Eggs</myLibrary:Title>
        <myLibrary:Author>Dr.Seuss</myLibrary:Author>
        <myLibrary:Copyright>1957</myLibrary:Copyright>
    </myLibrary:Book>
    …
</myLibrary:Library>

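Both the schema and the document contain an xmlns declaration. This string tells the parser that the names used in the document belong to the namespace identified by the given URI. If the URI begins with x-schema, the parser must load the corresponding schema file from the specified address. Unless stated otherwise, all elements between the start tag that carries the xmlns declaration and its matching end tag are part of the specified namespace.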

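1.3.1 Facets
To define data and check types more precisely, XML Schema uses "facets" to describe the properties of a particular data type. Every property of a value space must be defined by a facet, where a "value space" is the set of all valid values of a given data type. Different data types are distinguished by their facets. The facets defined by XML Schema fall into two categories: fundamental facets and non-fundamental facets.
A fundamental facet is an abstract property of the values in a value space that describes a basic characteristic of those values. The facets are listed below; items 1 to 5 are the fundamental facets, and the remaining items are non-fundamental facets that constrain the value space:
1. Equality
2. Order
3. Bounds
4. Cardinality. This is a concept from set theory: some value spaces contain a finite number of values, while others are infinite.
5. Numeric
6. length, minLength, maxLength
7. pattern
8. enumeration
9. maxInclusive, maxExclusive, minInclusive, minExclusive
10. precision
11. scale
12. encoding
13. duration
14. period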

1.3.2 Data types
XML Schema combines data types with facets, which gives each data item defined in a schema a precise meaning. Many data types are defined in www.w3.org/2001/XMLSchema.
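To make the idea of facets restricting a value space more concrete, here is a small sketch in Python. The StringFacets class and the in_range function are hypothetical helpers written for this illustration; they mimic a few facets (length, pattern, enumeration, minInclusive, maxExclusive) and are not an XML Schema implementation. Python's regular expression syntax is also only an approximation of the schema pattern language.

```python
import re
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass
class StringFacets:
    """A few constraining facets for a string-based type (hypothetical helper)."""
    min_length: int = 0
    max_length: Optional[int] = None
    pattern: Optional[str] = None
    enumeration: Optional[Sequence[str]] = None

    def accepts(self, value: str) -> bool:
        if len(value) < self.min_length:
            return False
        if self.max_length is not None and len(value) > self.max_length:
            return False
        # A pattern must match the whole value, hence fullmatch.
        if self.pattern is not None and not re.fullmatch(self.pattern, value):
            return False
        if self.enumeration is not None and value not in self.enumeration:
            return False
        return True

def in_range(text: str, min_inclusive: int, max_exclusive: int) -> bool:
    """Check an integer written as text against minInclusive and maxExclusive."""
    try:
        number = int(text)
    except ValueError:
        return False
    return min_inclusive <= number < max_exclusive

# A four-digit year expressed with length and pattern facets.
year = StringFacets(min_length=4, max_length=4, pattern=r"[0-9]{4}")
print(year.accepts("1957"), year.accepts("57"), year.accepts("19x7"))
print(in_range("1957", 1900, 2100), in_range("2100", 1900, 2100))
```

Each facet removes values from the value space of the base type, so a type with more facets accepts a smaller set of values.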

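1.4 XML namespaces
We have already seen XML namespaces in use above. Put simply, a namespace defines and declares a set of names used within a particular context. A namespace can be identified by any URI (a URN, for example), provided that the identifier is unique.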
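The following sketch shows why a unique namespace identifier matters, using Python's standard xml.etree.ElementTree module. The two URNs and the prefixes lib, hr, and books are made up for the example; they stand for two vocabularies that both happen to define a Title element.

```python
import xml.etree.ElementTree as ET

# Two hypothetical vocabularies that both define a <Title> element.
doc = """
<catalog xmlns:lib="urn:example:library" xmlns:hr="urn:example:staff">
    <lib:Title>Green Eggs</lib:Title>
    <hr:Title>Head Librarian</hr:Title>
</catalog>
"""
root = ET.fromstring(doc)

# ElementTree expands each prefixed name to "{namespace-uri}local-name",
# so the two Title elements stay distinct.
tags = [child.tag for child in root]
print(tags)

# Lookups go through the URI; the prefix used here need not match the
# prefix used in the document.
ns = {"books": "urn:example:library"}
book_title = root.find("books:Title", ns).text
print(book_title)
```

The prefix is only a local shorthand. What identifies the namespace is the URI, which is why that URI has to be unique.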

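1.5 XML attributes
All the XML documents that appear in this book use elements to represent data, but you should know that XML also supports attributes. An element must have a start tag and an end tag. Attributes are different: they need no tags of their own and are simply placed inside the start tag of an element.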
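As a brief illustration of the difference, the sketch below represents the same book once with child elements and once with attributes, then reads both with Python's standard xml.etree.ElementTree module. The attribute names title and copyright are chosen for this example only.

```python
import xml.etree.ElementTree as ET

# The same book, once with child elements and once with attributes.
as_elements = ET.fromstring(
    "<Book><Title>Green Eggs</Title><Copyright>1957</Copyright></Book>"
)
as_attributes = ET.fromstring('<Book title="Green Eggs" copyright="1957" />')

# Element content is reached through child lookups ...
print(as_elements.find("Title").text)

# ... while attributes live in a mapping attached to the element's start tag.
print(as_attributes.get("title"))
print(as_attributes.attrib)

# Asking for an absent attribute returns None, or a default if one is supplied.
print(as_attributes.get("author", "unknown"))
```

An attribute name can appear only once per element, so a repeatable value such as Author fits a child element better than an attribute.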
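There are three different ways to work with attribute declarations:
1. Well-formed XML that uses no DTD or schema.
2. Well-formed XML that uses a DTD.
3. Well-formed XML that uses one or more schemas.
The first approach works but does not hold up well in practice, and SOAP forbids DTDs, so we will look at the third approach. When creating attributes in a schema, you use the two keywords attributeType and attribute. These two keywords are meaningful only within the schema's namespace. attributeType defines the characteristics of an attribute type, while attribute indicates which element that type definition applies to. (The attributeType keyword belongs to the schema dialect used in these examples; the final W3C XML Schema Recommendation declares attributes with the attribute element alone.)
To describe a book using attributes, the corresponding schema would look like this:

<Schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <attributeType name="title" xsi:type="string" />
    <attributeType name="name" xsi:type="string" />
    <complexType name="Author" content="empty">
        <attribute type="name" />
        <element type="Author" />
    </complexType>
</Schema>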

The full syntax of attributeType is as follows:

<attributeType default="default value"
    xsi:type="type"
    xsi:values="enumerated values"
    name="idref"
    required="{yes|no}" >

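default: The default value of the attribute.
xsi:type: The data type of the attribute. If an enumerated type is chosen, xsi:values must also be supplied.
name: The name of the attribute type. A name is required so that the attributeType can be referenced for type checking.
required: Indicates whether an element that contains this attributeType must carry the attribute defined here.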
