XML Easy Learning Manual
ajie
;
, the relationship and difference between SGML;
3. Simple application of XML.
Congratulations! You are no longer a stranger to XML, and you are already at the forefront of network technology. The learning process so far has not been too difficult, has it? :)
If you are interested in XML and want to know more about its details and practical applications, please continue to the next chapter: XML Concepts.
Chapter 2 XML Concepts
Introduction
After the quick start in Chapter 1, you already know that XML is a language that lets you create your own tags. It separates data from the formatting of a web page, and it can also store and share data, which makes XML useful in a very wide range of applications. If you want to study XML in depth and master it systematically, we must first return to the concept of XML. XML stands for Extensible Markup Language. Each of the three words, "extensible", "markup" and "language", points to an important feature of XML. Let's analyze them carefully:
1. Extensibility
2. Markup
3. Language
4. Structure
5. Metadata
6. Display
7. DOM
1. Extensibility---Using XML, you can create your own tags for your documents.
The first word in XML is "extensible", and this is the source of XML's power and flexibility.
HTML has many fixed tags that we must remember and use, and you cannot use tags that are not in the HTML specification. In XML, you can create any tag you need. You can give free rein to your imagination and give your documents memorable tag names. For example, if your document contains game guides, you can create one tag for games and then create further tags under it, one for each game category. You can create any number of tags, as long as they are clear and easy to understand.
You may not be comfortable with this at first. When we learn HTML, there are fixed tags that can be learned and used directly (many people, including me, built their own web pages while studying other people's code and tags), whereas XML has no ready-made tags to learn, and few documents use exactly the same tags. What should we do? Simple: if a tag does not exist, create it yourself. Once you actually start writing XML documents, you will find that creating new tags as you please is fun. You can create your own unique markup and even create your own version of HTML.
Extensibility gives you more choices and powerful capabilities, but it also creates a problem: you must learn to plan. You need to understand your own document: what parts it consists of, how they relate to one another, and how to mark them up.
One thing to note when creating tags is that a tag describes the type or characteristics of the data, such as width, age or name, rather than the content of the data. A tag named after one particular value, such as a specific width or a specific person, is useless. If you have studied databases, you can think of it this way: a tag is a field name.
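As a small illustration of creating your own tags, here is a minimal Python sketch using the standard library's xml.etree.ElementTree module. The tag names person, name and age and the sample values are assumptions made up for this example; they are not part of any standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical tags: each one names the TYPE of data it holds, like a field name.
person = ET.Element("person")
ET.SubElement(person, "name").text = "ajie"
ET.SubElement(person, "age").text = "25"   # sample value, invented for the sketch

# Serialize the tree to a string of XML markup.
xml_text = ET.tostring(person, encoding="unicode")
print(xml_text)

# Reading the data back works much like reading fields from a database record.
record = {child.tag: child.text for child in person}
print(record)
```

Because the tags describe the kind of data rather than one particular value, the same two tags could be reused for every other person in the document.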
2. Markup---Using XML, you can mark up the elements in a document.
The second word in XML is "markup", which shows that the purpose of XML is to mark up the elements in a document.
Whether you are using HTML or XML, the purpose of tags is to make a document easier to understand. Without tags, your document looks to the computer like one long string in which every word is the same and nothing is emphasized.
With tags, your document becomes easier to read and understand. You can divide it into paragraphs and give it headings. In XML, you can take advantage of extensibility to create tags that fit your documents more closely.
However, one reminder: a tag only marks up information; it does not convey information itself. For example, this HTML code:
<b>first step</b>
Here <b> means bold. It only indicates that the text "first step" is displayed in bold. The <b> tag itself contains no information that you can see on the page; it is "first step" that actually conveys the information.
3. Language---When using XML, you must follow a specific syntax to mark up your documents.
The third word in XML is "language". XML, as a language, must follow certain rules. Although XML's extensibility lets you create new tags, a document must still follow a specific structure, a specific syntax and clear definitions.
In computing, "language" often means a programming language, which is used to write programs that implement functions and applications. But not every language is used for programming: XML is simply a language for defining tags and describing information.
Next, let's look more closely at the basic principles behind XML applications. This may be dry, but an overall understanding is important. You can read through it quickly at first to form a rough picture; the details are best learned gradually through practice.
4. Structure---XML promotes structured documents, in which all information is arranged according to defined relationships.
"Structure" sounds abstract, so think of it this way: structure is a framework for your document, like the outline you write before writing an article. Structure keeps your document from looking disorganized and connects each part closely to the others so that they form a whole.
There are two principles of structuring:
1. Each part (each element) is related to other elements. These chains of relationships form the structure.
2. The meaning of a tag is separate from the information it describes.
Let’s look at a simple example to help understand:
XML Easy Learning Manual
<chapter>Quick Start with XML
die in in XML > ;Extensibility
Identity
Clear:
...
A structure like the one above is also called a "document tree": the trunk corresponds to a parent element, and the branches and leaves correspond to its child elements.
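To make the tree idea concrete, here is a short Python sketch that parses a small document and prints its elements as an indented outline. The element names (book, title, chapter, section) and the chapter content are assumptions chosen for illustration; they are not taken from the listing above.

```python
import xml.etree.ElementTree as ET

# Illustrative document; the element names are assumptions for this sketch.
DOC = """
<book>
  <title>XML Easy Learning Manual</title>
  <chapter>
    <title>XML Concepts</title>
    <section>Extensibility</section>
    <section>Markup</section>
  </chapter>
</book>
"""

def outline(element, depth=0):
    """Return one indented line per element, parents before their children."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

tree_lines = outline(ET.fromstring(DOC))
print("\n".join(tree_lines))
```

Each level of indentation in the output is one level of the tree: book is the trunk, and the elements nested inside it are its branches and leaves.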
5. Metadata---Professional XML users work with metadata.
In HTML, we can use meta tags to define a page's keywords, description and so on. These tags are not displayed on the page, but search engines can read them, and they can affect the ranking of search results.
XML deepens and extends this principle. With XML, you can describe where your information is located, and you can use metadata to verify information, run searches, control display, or process other data.
The following are some uses of XML metadata in practical applications:
1. Digital signatures can be verified, so that business submitted online is valid.
2. Documents can be indexed easily and searched more effectively.
3. Data can be transferred between different languages.
The W3C is working on a metadata processing method called RDF (Resource Description Framework), which allows information to be exchanged automatically. The W3C claims that RDF combined with digital signatures will make "authentic and trustworthy" e-commerce possible on the network.
6. Display
XML alone cannot display a page. We use formatting technologies such as CSS or XSL to display documents written with XML tags.
As mentioned in Chapter 1, XML separates data from format. An XML document does not itself know how it should be displayed; auxiliary files are needed for that. (XML drops all style-definition tags, including font, color and p, so XML defines document styles with a method similar to CSS in DHTML.) The file types used to set the display style of XML are:
1. XSL
XSL (Extensible Stylesheet Language) is itself based on XML. Using XSL, you can set the display style of a document flexibly, and the document will adapt automatically to any browser or PDA (handheld computer). XSL can also convert XML into HTML, so that older browsers can display XML documents too.
2. CSS
We are all familiar with CSS. Its full name is Cascading Style Sheets, and it is currently the main method used to display XML documents in browsers.
3. Behaviors
Behaviors have not yet become a standard. They are a feature unique to Microsoft's IE browser, and you can use them to attach some interesting actions to XML tags.
7. DOM
DOM stands for Document Object Model. What is the DOM for? If you treat your document as an object, the DOM is the standard that defines how to operate on and control that object, for both HTML and XML documents.
Object-oriented thinking has become very popular, and programming languages such as Java and JavaScript use object-oriented ideas. With XML, a web page is operated on and controlled as an object, and we can create our own objects and templates. To communicate with objects and give them commands, you need an API. API stands for Application Programming Interface, the set of rules for accessing and operating on objects. The DOM is such an API: it describes HTML/XML document objects in detail and specifies their naming conventions, programming model, communication rules and so on. In an XML document, we can think of each element as an object with its own name and attributes.
XML creates the tags, and the role of the DOM is to tell scripts how to operate on these tags and display them in the browser window.
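The idea of treating each element as an object can be tried outside the browser as well. The sketch below uses Python's standard xml.dom.minidom module, which implements a subset of the W3C DOM API. The document itself (a play element containing a title with a status attribute) is made up for illustration.

```python
from xml.dom.minidom import parseString

# The document and its names (play, title, status) are invented for this sketch.
dom = parseString('<play><title status="out of stock">King Lear</title></play>')

title = dom.getElementsByTagName("title")[0]   # look the element up by tag name
node_name = title.tagName                      # the object's name
status = title.getAttribute("status")          # one of the object's attributes
text = title.firstChild.data                   # the text inside the element

# The DOM also lets a script change the object it has found.
title.setAttribute("status", "in stock")
print(node_name, status, text, title.getAttribute("status"))
```

In a browser, a script would use the same kind of calls (getElementsByTagName, getAttribute, setAttribute) to find an element and change how it is shown.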
Above we have briefly covered some basic principles of XML. Now let's look at how they relate to one another and how they work together. First look at this picture:
1. XML describes the type of the data. For example, "King Lear" is a title element.
2. CSS stores and controls the display style of the element. For example, the title is displayed in an 18pt font.
3. Script controls how the element behaves. For example, when a title element is "out of stock", it is displayed in red.
4. The DOM provides a common platform through which scripts and objects communicate, and the results are displayed in the browser window.
If any part contains an error, you will not get the correct result.
Okay, at this point we have an overall idea of how XML works. After studying this chapter, you may feel that XML leans towards data processing and is easier for programmers to learn. That is indeed the case: XML was designed to make it convenient to share and exchange data. In the next chapter, we will systematically go through the terminology of XML. You are welcome to continue reading.
Chapter 3 XML Terminology
Outline:
Introduction
1. Terms related to XML documents
2. Terms related to DTD
Introduction
The most troublesome thing for beginners learning XML is the large number of new terms to understand. XML is itself a new technology that is still developing and changing, and standards organizations and the major network companies (Microsoft, IBM, SUN, etc.) keep introducing their own ideas and standards, so it is not surprising that new concepts are flying everywhere. In China there is no authoritative institution or organization that officially names these terms, so most Chinese textbooks on XML translate them according to the author's own understanding. Some translations are correct and some are wrong, which further hinders our understanding and learning of these concepts.
The explanations of XML terms below are likewise the author's own understanding and translation. Ajie has based them on the XML 1.0 specification released by the W3C and on related official documentation, so these explanations should be basically correct, or at least not wrong. If you want to read further, I have listed sources and links to relevant resources at the end of this article. Okay, let's get to the point:
1. Terms related to XML documents
What is an XML document? You know what an HTML source file is; an XML document is a source file written with XML tags. XML documents are also plain text files that you can create and modify with Notepad. The file extension of an XML document is .xml, for example myfile.xml. You can also open an .xml file directly in IE 5.0 or a later browser, but what you see is the XML source code; the page content is not rendered. You can try saving the following code as myfile.xml:
XML Easy Learning Manual
ajie
ajie@aolhoo.com
20010115
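As a rough sketch of what such a file could look like and how a program reads it, the Python snippet below wraps the four values above in tags and parses the result with the standard xml.etree.ElementTree module. The element names (myfile, title, author, email, date) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# The element names below are assumptions made for illustration only.
MYFILE_XML = """<?xml version="1.0" encoding="UTF-8"?>
<myfile>
  <title>XML Easy Learning Manual</title>
  <author>ajie</author>
  <email>ajie@aolhoo.com</email>
  <date>20010115</date>
</myfile>
"""

# Encode to bytes so the parser honours the encoding named in the declaration.
root = ET.fromstring(MYFILE_XML.encode("utf-8"))
fields = {child.tag: child.text for child in root}
print(root.tag, fields)
```

Saving the text in MYFILE_XML as myfile.xml and opening it in a browser shows the same source tree that the parser sees here.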
An XML document contains three parts:
1. An XML document declaration;
2. A document type definition;
3. Content created with XML tags.
Example:
<title>QUICK START OF XML</title>
ajie
……
The first line is the XML document declaration. The second line indicates that this document uses filelist.dtd to define the document type. The third line and everything below it is the main content.
Let’s learn about the relevant terms in XML documents:
1. Element
We already know elements from HTML: they are the smallest units that make up an HTML document, and the same is true in XML. An element is defined by tags and includes the start tag, the end tag and the content between them, like this: <author>ajie</author>
The only difference is that in HTML the tags are fixed, while in XML you create the tags yourself.
2. Tag
Tags are used to define elements. In XML, tags must appear in pairs that surround the data. The name of the tag is the same as the name of the element. For example, in an element like this:
<author>ajie</author>
<author> is the tag.
3. Attribute
What is an attribute? Look at this HTML code: <font color="red">word</font>. Here, color is one of the attributes of font.
Attributes further describe and explain a tag. A tag can have several attributes; font, for example, also has a size attribute. Attributes in XML work like attributes in HTML: each attribute has its own name and value, and the attribute is part of the tag. Example:
ajie
Attributes in XML are also defined by you. We recommend that you avoid attributes where you can and turn them into child elements instead. For example, the code above can be changed to this:
ajie
female
The reason is that attributes are hard to extend and hard for programs to manipulate.
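The two styles can be compared side by side in a short Python sketch. The names author, name and sex are hypothetical, chosen only to show the same piece of information stored once as an attribute and once as a child element.

```python
import xml.etree.ElementTree as ET

# "author", "name" and "sex" are hypothetical names chosen for this sketch.
with_attribute = ET.fromstring('<author sex="female">ajie</author>')
with_children = ET.fromstring(
    "<author><name>ajie</name><sex>female</sex></author>"
)

sex_from_attribute = with_attribute.get("sex")     # attribute lookup
sex_from_child = with_children.findtext("sex")     # child element lookup
print(sex_from_attribute, sex_from_child)
```

Both forms carry the same value, but only the child-element form can later be given children or attributes of its own, which is one way to read the advice about extensibility.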
4. Declaration
The first line of an XML document is normally an XML declaration. The declaration states that the document is an XML document and which version of the XML specification it follows. An XML declaration looks like this:
<?xml version="1.0"?>
5.DTD (Document Type Definition)
DTD is used to define elements, attributes and relationships between elements in XML documents.
The DTD file can be used to check whether the structure of the XML document is correct. But creating an XML document does not necessarily require a DTD file. Detailed descriptions of DTD files will be listed separately below.
6. Well-formed XML
A document that obeys the XML syntax rules and the XML specification is called "well-formed". If all of your markup strictly follows the XML specification, your XML document does not need a DTD file to define it.
A well-formed document should start with an XML declaration, for example:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
First, the declaration states the version of the XML specification that the document follows, which is currently 1.0. Second, it states that the document is "standalone": it does not need a DTD file to verify that its tags are valid. Third, it states the character encoding used in the document. The default is UTF-8; if your document is saved in a Chinese encoding such as GB2312, declare that encoding instead.
A well-formed XML document must have a root element, which is the first element after the declaration. All other elements are descendants of this root element and together form a single group.
The content of a well-formed XML document must comply with XML syntax. (We will explain XML syntax in detail in the next chapter.)
7. Valid XML
An XML document that obeys the XML syntax rules and also complies with its DTD is called a valid XML document. Compare "well-formed XML" with "valid XML": the biggest difference is that the former only complies with the XML specification, while the latter also has its own document type definition (DTD) and complies with it.
The process of comparing an XML document with its DTD to see whether it complies with the DTD's rules is called validation. This process is usually handled by software called a parser.
A valid XML document should also start with an XML declaration, for example:
<?xml version="1.0" standalone="no"?>
Unlike the example above, the standalone attribute is set to "no" here, because the document must be used together with its DTD. The DTD is declared as follows:
<!DOCTYPE type-of-doc SYSTEM/PUBLIC "dtd-name">
Where:
"!DOCTYPE" means that you are defining a DOCTYPE;
"type-of-doc" is the name of the document type, which you define yourself and which is usually the same as the DTD file name;
"SYSTEM/PUBLIC": only one of the two keywords is used. SYSTEM refers to the URL of a private DTD file used by the document, while PUBLIC refers to the URL of a public DTD file used by the document.
"dtd-name" is the URL and name of the DTD file. All DTD files have the extension ".dtd".
Using the example above again, it should be written like this:
<!DOCTYPE filelist SYSTEM "filelist.dtd">
2. DTD related terms
We have briefly mentioned what a DTD is. A DTD is an effective way to ensure that an XML document is correctly structured: by comparing the XML document with the DTD file, you can see whether the document conforms to the specification and whether its elements and tags are used correctly. A DTD contains the rules that define elements, the rules that define relationships between elements, the attributes that elements may use, and the entities or symbols that may be used.
A DTD file is also a plain text file, with the extension .dtd, for example myfile.dtd.
Why use DTD files? My understanding is that they serve network sharing and data exchange. The biggest benefit of a DTD is that it can be shared (this is the PUBLIC keyword in the DTD declaration above). For example, if two people in the same industry but in different regions use the same DTD file as the specification for creating documents, their data can easily be exchanged and shared. If other people on the Internet want to contribute data, they only need to create documents that follow the public DTD, and they can join immediately.
A large number of ready-made DTD files are already available. They target different industries and applications and establish common rules for elements and tags. You do not need to recreate them yourself; just add the new tags you need on top of them.
Of course, if you like, you can create your own DTD, which may fit your documents better. Creating your own DTD is also simple; generally you only need to define four or five elements.
There are two ways to call a DTD file:
1. DTD directly included in the XML document
You only need to insert some special instructions in the DOCTYPE declaration, like this:
We have an XML document:
We just insert the following code after the first line:
]>
2. Call an independent DTD file
Save the DTD as a .dtd file, and then call it in the DOCTYPE declaration line. For example, save the following code as myfile.dtd:
Then call it in the XML document by inserting the following after the first line:
As you can see, calling a DTD file from an XML document is similar to calling a js file from HTML. As for how to write a DTD, we will introduce it together with the syntax of XML documents in the next chapter.
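For the first method, a DTD included directly in the document, a minimal sketch might look like the following. The element names and the DTD rules are assumptions invented for this example; Python's standard parser is used only to show that the document, internal DTD included, is read as ordinary XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the element names and the DTD rules are assumptions.
DOC_WITH_DTD = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE myfile [
  <!ELEMENT myfile (title, author)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT author (#PCDATA)>
]>
<myfile>
  <title>XML Easy Learning Manual</title>
  <author>ajie</author>
</myfile>
"""

# The DOCTYPE block sits between the XML declaration and the root element.
root = ET.fromstring(DOC_WITH_DTD.encode("utf-8"))
child_tags = [child.tag for child in root]
print(child_tags)
```

For the second method, the three ELEMENT lines would move into a separate .dtd file, and the DOCTYPE line would name that file with the SYSTEM keyword instead of carrying the rules in square brackets.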
Let’s learn about the terms related to DTD:
1. Schema
Schema is the description of data rules. Schema does two things:
a. It defines the element data type and the relationship between elements;
b. It defines the content type that the element can contain.
DTD is a schema for XML documents.
2. Document Tree
We already mentioned the "document tree" in Chapter 2. It is a visual representation of the hierarchy of the elements in a document. A document tree contains a root element, which is the top-level element (that is, the first element after the XML declaration). Look at the example:
author>
The example above is arranged as a three-level "tree", and its top-level element is the root element. In XML and DTD files, the first element defined is the root element.
3. Parent Element/Child Element
A parent element is an element that contains other elements, and the elements it contains are called its child elements. In the tree above, an element that contains other elements is the parent of those elements, and a child can in turn be the parent of elements one level further down. Elements at the last level, which contain no child elements, are also called "leaf elements".
4. Parser
A parser is a software tool that checks whether XML documents are well-formed and, optionally, whether they comply with a DTD.
XML parsers come in two types. A non-validating parser only checks whether the document obeys the XML syntax rules and whether its tags form a document tree. A validating parser checks the syntax and the tree and, in addition, checks whether the tags you use comply with the corresponding DTD.
A parser can be used on its own or as part of an editor or browser. In the list of related resources below, I have listed some currently popular parsers.
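The difference between the two kinds of parser can be observed with Python's standard library, whose XML parser (built on expat) is non-validating. In the sketch below, the document and its internal DTD are invented for illustration: the DTD allows only a title inside myfile, but the body contains an author instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: its internal DTD allows only a <title> inside <myfile>,
# yet the body contains an <author> element instead.
DOC = """<!DOCTYPE myfile [
  <!ELEMENT myfile (title)>
  <!ELEMENT title (#PCDATA)>
]>
<myfile><author>ajie</author></myfile>"""

# ElementTree's parser is non-validating: it checks the syntax and builds the
# tree, but it does not compare the elements against the DTD.
root = ET.fromstring(DOC)
accepted_tags = [child.tag for child in root]
print(accepted_tags)
```

A validating parser would report this document as invalid, while the non-validating parser accepts it because it is still well-formed.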
Okay, in Chapter 3 we have learned some basic XML and DTD terms, but we still do not know how to write these files or what syntax they must follow. In the next chapter, we will focus on the syntax for writing XML and DTD documents. Please continue reading, thank you!
Chapter 4 XML Syntax
Outline:
1. XML syntax rules
2. Element syntax
3. Comment syntax
4. CDATA syntax
5. Namespaces syntax
6. Entity syntax
7. DTD syntax
Through the study of the previous three chapters, we already have an understanding of what XML is, its implementation principles and related terminology. Next, we will start to learn the syntax specifications of XML and write our own XML documents.
1. XML syntax rules
XML documents are similar to HTML source code and also use tags to mark up content. The following important rules must be followed when creating XML documents:
Rule 1: There must be an XML declaration statement
We already mentioned this in the previous chapter. The declaration is the first statement of the XML document, and its format is as follows:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
The declaration tells the browser or other handler that this document is an XML document. In the declaration, version gives the version of the XML specification that the document complies with; standalone indicates whether the document relies on a DTD file (if it does, the value is no); and encoding gives the character encoding used in the document, which defaults to UTF-8.
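To see how a handler actually receives these three values, here is a small Python sketch using the standard xml.parsers.expat module, whose XmlDeclHandler callback is invoked when the declaration is parsed. The one-element document and the name on_declaration are made up for the example.

```python
from xml.parsers import expat

# A tiny document whose only purpose is to carry an XML declaration.
DOC = b'<?xml version="1.0" encoding="UTF-8" standalone="no"?><myfile/>'

declaration = {}

def on_declaration(version, encoding, standalone):
    # expat reports standalone as 1 ("yes"), 0 ("no") or -1 (not given).
    declaration.update(version=version, encoding=encoding, standalone=standalone)

parser = expat.ParserCreate()
parser.XmlDeclHandler = on_declaration
parser.Parse(DOC, True)   # True marks this as the final chunk of input
print(declaration)
```

Here standalone="no" arrives as 0, which signals that the document expects an accompanying DTD.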
Rule 2: Whether there is a DTD file
If the document is a "valid XML document" (see the previous chapter), it must have a corresponding DTD file and strictly comply with the rules set by that DTD. The DTD declaration follows the XML declaration, in the following format:
<!DOCTYPE type-of-doc SYSTEM/PUBLIC "dtd-name">
Where:
"!DOCTYPE" means that you are defining a DOCTYPE;
"type-of-doc" is the name of the document type, which you define yourself and which is usually the same as the DTD file name;
"SYSTEM/PUBLIC": only one of the two keywords is used. SYSTEM refers to the URL of a private DTD file used by the document, while PUBLIC refers to the URL of a public DTD file used by the document.
"dtd-name" is the URL and name of the DTD file. All DTD files have the extension ".dtd".
Using the example above again, it should be written like this:
<!DOCTYPE filelist SYSTEM "filelist.dtd">
Rule 3: Pay attention to capitalization
XML documents are case-sensitive: tag names that differ only in upper and lower case are different tags. Note that when you write an element, the start tag and the end tag must use the same case. For example, an element opened with <Author> must be closed with </Author>; closing it with </author> is an error.
It is best to adopt one habit: all upper case, all lower case, or capitalize the first letter. This reduces document errors caused by case mismatches.
Rule 4: Add quotes to attribute values
In HTML code, attribute values may be written with or without quotes. For example, <font color=red>word</font> and <font color="red">word</font> are both interpreted correctly by the browser.
In XML, however, all attribute values must be quoted (with either single or double quotes); otherwise it is an error.
Rule 5: All tags must have corresponding closing tags
In HTML, tags need not appear in pairs; the line-break tag <br> is an example. In XML, all tags must appear in pairs: if there is a start tag, there must be an end tag. Otherwise it is an error.
Rule 6: All empty tags must also be closed
An empty tag is a tag with no content between its start and end tags. In XML, all tags must be closed. For empty tags, XML's solution is to add / at the end of the original tag. For example, the HTML tag <br> should be written as <br/> in XML.
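Rules 3 to 6 can be checked mechanically by handing small fragments to a parser. The sketch below uses Python's standard xml.etree.ElementTree; the helper name is_well_formed and the sample fragments are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as a complete XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

checks = {
    "case mismatch": is_well_formed("<Author>ajie</author>"),            # Rule 3
    "unquoted attribute": is_well_formed("<font color=red>word</font>"),  # Rule 4
    "quoted attribute": is_well_formed('<font color="red">word</font>'),
    "missing end tag": is_well_formed("<p>text<br></p>"),                # Rule 5
    "closed empty tag": is_well_formed("<p>text<br/></p>"),              # Rule 6
}
for name, ok in checks.items():
    print(name, "->", "accepted" if ok else "rejected")
```

Only the fragments that follow the rules are accepted; each of the others makes the parser stop with an error instead of guessing what was meant, which is the main practical difference from HTML.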
2. Syntax of elements
An element consists of a pair of tags and their content, like this: <author>ajie</author>. The name of the element is the same as the name of its tags. Tags can be further described with attributes.
XML has no reserved words, so you can use almost any word as an element name. However, the following rules must be followed:
1. The name can contain letters, digits and other characters;
2. The name cannot begin with a digit, "-" or "."; it must begin with a letter or "_" (underscore);
3. The name cannot begin with the letters xml (or XML, Xml, ...);
4. The name cannot contain spaces;
5. The name should not contain ":" (colon), which is reserved for namespaces.
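One simple way to try out candidate names is to ask a parser whether it accepts an empty element with that name. This is only a rough sketch: the helper name usable_as_element_name and the sample names are hypothetical, and the parser does not enforce every convention (it will not, for example, object to names starting with xml).

```python
import xml.etree.ElementTree as ET

def usable_as_element_name(name):
    """Ask the parser whether <name/> is accepted as a complete document."""
    try:
        ET.fromstring(f"<{name}/>")
        return True
    except ET.ParseError:
        return False

results = {
    "author1": usable_as_element_name("author1"),      # digits allowed after the start
    "1author": usable_as_element_name("1author"),      # starts with a digit
    "my author": usable_as_element_name("my author"),  # contains a space
    # The colon turns "game" into a namespace prefix that was never declared.
    "game:rpg": usable_as_element_name("game:rpg"),
}
print(results)
```

Only the first name passes. The last one fails because this parser is namespace-aware and treats the part before the colon as an undeclared namespace prefix.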
To make elements easier to read, understand and process, we have some suggestions:
1. Do not use "." in names, because in many programming languages "." is used to access an attribute of an object, for example font.color. For the same reason, it is best not to use "-"; if you need a separator, use "_" instead;
2. Keep names as short as possible;
3. Use a consistent convention for capitalization in names;
4. Names can use non-English characters, such as Chinese, but some software may not support them. (IE5 currently supports Chinese element names.)
In addition, a note about attributes. In HTML, attributes can be used to define how an element is displayed; for example, <font color="red">word</font> displays word in red. In XML, attributes only describe the tag and have nothing to do with how the element's content is displayed. The same line, <font color="red">word</font>, will not display word in red in XML. (Some readers will ask: how do I display text in red in XML? That requires CSS or XSL, which we will describe in detail below.)
3. Syntax of comments
Comments are additional information added to an XML document to make it easier to read and understand. They are not interpreted by programs or displayed by the browser.
The syntax of comments is as follows:
<!-- comment text goes here -->
As you can see, it is the same as the comment syntax in HTML, which makes it very easy. Good commenting habits make your documents easier to maintain and share, and make them look more professional.
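A quick way to confirm that comments are skipped by programs is to parse a document that contains one. In this Python sketch the document, its element names and the comment text are made up for illustration; the default parser of xml.etree.ElementTree drops comments when it builds the tree.

```python
import xml.etree.ElementTree as ET

# Illustrative document with a comment between the tags.
DOC = "<myfile><!-- remember to update the date --><author>ajie</author></myfile>"

root = ET.fromstring(DOC)
tags = [child.tag for child in root]   # the comment does not appear as an element
print(tags)
```

The comment is visible to anyone reading the source file, but the program only sees the author element.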
4. Syntax of CDATA
CDATA stands for character data. When we write XML documents, we sometimes need to display letters, digits and other symbols literally, such as "
For example:
ajie