我当前正在使用 lxml
并希望验证 xml 内容。
我从 tei = etree.element("tei", nsmap={none: 'http://www.tei-c.org/ns/1.0'}
完全用 python 编写,包含许多子元素。 p>
现在,我想使用以下代码使用特定的 .xsd
文件检查结构是否正确:
xmlschema_doc = etree.parse(xsd_file_path) xmlschema = etree.xmlschema(xmlschema_doc) # run check status = xmlschema.validate(xml_tree)
它返回 false,并显示错误 element 'tei':没有可用于验证 root.
的匹配全局声明
我观察到一件非常奇怪的事情,如果我使用 编写 xml
et = etree.elementtree(xmldata) et.write('test.xml', pretty_print=true, xml_declaration=true, encoding='utf-8')
如果我用 b= etree.parse('test.xml')
重新打开它,我最终没有错误,并且由于 xmlschema.validate(b)
的结果,xml 结构是有效的
知道我需要在 xml 结构中添加什么吗?
编辑: 无效 xml 中的第一项
有效 xml 文件中的第一项
编辑:
<?xml version='1.0' encoding='UTF-8'?> <TEI xmlns="http://www.tei-c.org/ns/1.0"> <text> <body> <listBibl> <biblFull> <titleStmt> <title xml:lang="en">article</title> <title xml:lang="fr">article</title> <title type="sub" xml:lang="en">A subtitle</title> <author role="aut"> <persName> <forename type="first">John</forename> <surname>Doe</surname> </persName> <email>email</email> <idno type="http://orcid.org/">orcid</idno> <affiliation ref="#localStruct-affiliation"/> <affiliation ref="#struct-affiliation"/> </author> <author role="aut"> <persName> <forename type="first">Jane</forename> <forename type="middle">Middle</forename> <surname>Doe</surname> </persName> <email>email</email> <idno type="http://orcid.org/">orcid</idno> <affiliation ref="#localStruct-affiliationA"/> <affiliation ref="#localStruct-affiliationB"/> </author> </titleStmt> <editionStmt> <edition> <ref type="file" subtype="author" n="1" target="upload.pdf"/> </edition> </editionStmt> <publicationStmt> <availability> <licence target="https://creativecommons.org/licenses//cc-by/"/> </availability> </publicationStmt> <notesStmt> <note type="audience" n="2"/> <note type="invited" n="1"/> <note type="popular" n="0"/> <note type="peer" n="1"/> <note type="proceedings" n="0"/> <note type="commentary">small comment</note> <note type="description">small description</note> </notesStmt> <sourceDesc> <biblStruct> <analytic> <title xml:lang="en">article</title> <title xml:lang="fr">article</title> <title type="sub" xml:lang="en">A subtitle</title> <author role="aut"> <persName> <forename type="first">John</forename> <surname>Doe</surname> </persName> <email>email</email> <idno type="http://orcid.org/">orcid</idno> <affiliation ref="#localStruct-affiliation"/> <affiliation ref="#struct-affiliation"/> </author> <author role="aut"> <persName> <forename type="first">Jane</forename> <forename type="middle">Middle</forename> <surname>Doe</surname> </persName> <email>email</email> <idno type="http://orcid.org/">orcid</idno> <affiliation ref="#localStruct-affiliationA"/> <affiliation ref="#localStruct-affiliationB"/> </author> </analytic> <monogr> <idno type="isbn">978-1725183483</idno> <idno type="halJournalId">117751</idno> <idno type="issn">xxx</idno> <imprint> <publisher>springer</publisher> <biblScope unit="serie">a special collection</biblScope> <biblScope unit="volume">20</biblScope> <biblScope unit="issue">1</biblScope> <biblScope unit="pp">10-25</biblScope> <date type="datePub">2024-01-01</date> </imprint> </monogr> <series/> <idno type="doi">reg</idno> <idno type="arxiv">ger</idno> <idno type="bibcode">erg</idno> <idno type="ird">greger</idno> <idno type="pubmed">greger</idno> <idno type="ads">gaergezg</idno> <idno type="pubmedcentral">gegzefdv</idno> <idno type="irstea">vvxc</idno> <idno type="sciencespo">gderg</idno> <idno type="oatao">gev</idno> <idno type="ensam">xcvcxv</idno> <idno type="prodinra">vxcv</idno> <ref type="publisher">https://publisher.com/ID</ref> <ref type="seeAlso">https://link1.com/ID</ref> <ref type="seeAlso">https://link2.com/ID</ref> <ref type="seeAlso">https://link3.com/ID</ref> </biblStruct> </sourceDesc> <profileDesc> <textClass> <keywords scheme="author"> <term xml:lang="en">keyword1</term> <term xml:lang="en">keyword2</term> <term xml:lang="fr">mot-clé1</term> <term xml:lang="fr">mot-clé2</term> </keywords> <classCode scheme="halDomain" n="physics"/> <classCode scheme="halDomain" n="halDomain2"/> <classCode scheme="halTypology" n="ART"/> </textClass> </profileDesc> </biblFull> </listBibl> </body> <back> <listOrg type="structures"> <org type="institution" xml:id="localStruct-affiliation"> <orgName>laboratory for MC, university of Yeah</orgName> <orgName type="acronym">LMC</orgName> <desc> <address> <addrLine>Blue street 155, 552501 Olso, Norway</addrLine> <country key="LS">Lesotho</country> </address> <ref type="url" target="https://lmc.univ-yeah.com"/> </desc> </org> <org type="institution" xml:id="localStruct-affiliationB"> <orgName>laboratory for MCL, university of Yeah</orgName> <orgName type="acronym">LMCL</orgName> <desc> <address> <addrLine>Blue street 155, 552501 Olso, Norway</addrLine> <country key="NO">Norway</country> </address> <ref type="url" target="https://lmcl.univ-yeah.com"/> </desc> </org> </listOrg> </back> </text> </TEI>
看看https://www.php.cn/link/e1ff36b97044a1c7c73c73e4d27aeba4,你基本上应该使用
tei_namespace = "http://www.tei-c.org/ns/1.0" tei = "{%s}" % tei_namespace nsmap = {none : tei_namespace} # the default namespace (no prefix) root = etree.element(tei + "tei", nsmap=nsmap) # lxml only! text = etree.subelement(root, tei + "text")
对所有元素依此类推,以确保它们是在 tei 命名空间中创建的。
在内存中创建的 elementtree 对架构有效(在我将其与导入的 w3c xml.xsd 一起下载后)是例如
from lxml import etree TEI_NAMESPACE = "http://www.tei-c.org/ns/1.0" TEI = "{%s}" % TEI_NAMESPACE NSMAP = {None : TEI_NAMESPACE} # the default namespace (no prefix) root = etree.Element(TEI + "TEI", nsmap=NSMAP) # lxml only! text = etree.SubElement(root, TEI + "text") body = etree.SubElement(text, TEI + "body") listBibl = etree.SubElement(body, TEI + "listBibl") biblFull = etree.SubElement(listBibl, TEI + "biblFull") sourceDesc = etree.SubElement(biblFull, TEI + "sourceDesc") profileDesc = etree.SubElement(biblFull, TEI + "profileDesc") xmlschema_doc = etree.parse("aofr.xsd") xmlschema = etree.XMLSchema(xmlschema_doc) # run check status = xmlschema.validate(root) print(status) print(xmlschema.error_log)
以上是无法使用架构验证 XML,但可以通过从中读取写入的文件来工作的详细内容。更多信息请关注PHP中文网其他相关文章!