出版科学 ›› 2014, Vol. 22 ›› Issue (1): 79-.

• Multimedia, Digital Publishing •

数字出版中的词汇和难句抽取研究

Sun Jilan (孙继兰)

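  1. School of Computer and Information, Beijing Technology and Business University
  • Publication date: 2014-01-15 Release date: 2014-01-15
  • About the author: Sun Jilan is a lecturer at the School of Computer and Information, Beijing Technology and Business University.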

Vocabulary and Difficult Sentences Extraction in Digital Publishing

  • Online: 2014-01-15 Published: 2014-01-15

摘要:

指出通过在数字出版平台应用自然语言处理技术,提供词汇及难句抽取服务,能减少外文原著阅读中的困难,提高电子书和纸质书的阅读效率;在讨论数字出版平台提供词汇抽取服务的相关问题后,进一步提出难句抽取服务的相关建议,分析其可行性,给出参考抽取策略。

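Keywords: digital publishing, foreign-language original works, natural language processing, cloud platform, vocabulary extraction, difficult sentence extraction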

Abstract:

Applying natural language processing (NLP) on publishers' platforms to provide vocabulary and difficult-sentence extraction services would help readers read foreign-language publications more efficiently, in both electronic and print formats. This paper first discusses issues related to offering vocabulary extraction services on digital publishing platforms, then puts forward suggestions for a difficult-sentence extraction service, and finally analyzes its feasibility and gives reference extraction strategies.