Publishing Journal (出版科学) ›› 2016, Vol. 24 ›› Issue (4): 18-.

• Feature Articles: Invited Contributions •

The Semantics of XML Markup

Allen Renear, David Dubin, C. M. Sperberg-McQueen, Claus Huitfeldt

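  1. Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign; World Wide Web Consortium; University of Bergen
  • Publication date: 2016-07-15 Release date: 2016-07-15
  • About the authors: Wang Xiaoguang, professor and doctoral supervisor, School of Information Management, Wuhan University; Wang Junfang, master's student (2014 cohort), School of Information Management, Wuhan University.
  • Funding:

    This article is one of the outputs funded by the "Young Top-Notch Talent" Support Program of the Organization Department of the CPC Central Committee and the "New Century Excellent Talents" Support Program of the Ministry of Education.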

Towards a Semantics for XML Markup

  • Online:2016-07-15 Published:2016-07-15

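Abstract (translated from the Chinese):

Although XML Document Type Definitions provide a machine-readable mechanism for specifying the syntax of an
XML language, there is currently no comparable mechanism for specifying the semantics of an XML vocabulary.
This means there is no way to state what XML markup means, and the facts and relationships expressed in XML
form cannot be defined clearly, comprehensively, and rigorously. This has serious consequences in both practice
and theory. On the positive side, XML constructs can be assigned arbitrary semantics and used in areas the
original designers could not foresee. On the less positive side, content developers and software engineers must
rely on prose documentation or, worse, on guesses about the markup language designer's intent. This process is
time-consuming, laborious, error-prone, and impossible to verify, and problems arise even when the designer
documented the language thoroughly. In addition, the scarcity of research into the nature of markup semantics
means that digital document processing, as an engineering application area, has almost no theory behind it.
Although some ongoing projects (XML Schema, RDF, the Semantic Web) have produced relevant results, none of
them directly and comprehensively addresses the core problems of XML markup semantics. This paper reviews the
history of the concept of markup meaning, sets out the motivation for a formal semantics of XML, and introduces
a research project that studies such a semantics: the BECHAMEL Markup Semantics Project.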

Keywords: SGML, XML, markup, semantics, knowledge representation

Abstract:

 Although XML Document Type Definitions provide a mechanism for specifying, in machine-readable
form, the syntax of an XML markup language, there is no comparable mechanism for specifying the semantics
of an XML vocabulary. That is, there is no way to characterize the meaning of XML markup so that the facts and
relationships represented by the occurrence of XML constructs can be explicitly, comprehensively, and mechanically
identified. This has serious practical and theoretical consequences. On the positive side, XML constructs can be
assigned arbitrary semantics and used in application areas not foreseen by the original designers. On the less positive
side, both content developers and application engineers must rely upon prose documentation, or, worse, conjectures
about the intention of the markup language designer — a process that is time-consuming, error-prone, incomplete,
and unverifiable, even when the language designer properly documents the language. In addition, the lack of a
substantial body of research in markup semantics means that digital document processing is undertheorized as an
engineering application area. Although there are some related projects underway (XML Schema, RDF, the Semantic
Web) which provide relevant results, none of these projects directly and comprehensively addresses the core problems
of XML markup semantics. This paper (i) summarizes the history of the concept of markup meaning, (ii) characterizes
the specific problems that motivate the need for a formal semantics for XML, and (iii) describes an ongoing research
project, the BECHAMEL Markup Semantics Project, that is attempting to develop such a semantics.