Tibetan language entity knowledge information extraction method

An information extraction technique applied to Tibetan entity knowledge extraction. It addresses the fact that existing approaches cannot be applied to Tibetan and cannot support large-scale data processing and knowledge acquisition, and it promotes knowledge sharing.

Active Publication Date: 2014-11-05
MINZU UNIVERSITY OF CHINA

AI Technical Summary

Problems solved by technology

Relatively mature methods for entity attribute and relation extraction in English and Chinese cannot be applied directly to Tibetan.
As a result, the acquisition of Tibetan entity knowledge information relies largely on manual methods, which cannot support large-scale data processing and knowledge acquisition.

Examples

Embodiment Construction

[0012] The technical solutions of the present invention will be described in further detail below with reference to the accompanying drawings and embodiments.

[0013] Figure 1 is a flow chart of the method for extracting Tibetan entity knowledge information provided by this embodiment. As shown in Figure 1, the Tibetan entity knowledge information extraction method of the present invention comprises:

[0014] Step S101, extracting Tibetan-Chinese comparable corpus information.

[0015] Different methods are adopted depending on the form that Tibetan and Chinese text corpora take in different network environments.

[0016] Specifically, for a large number of Tibetan-Chinese text corpora in the network environment that are only parallel at the page level, or parallel across the network without direct cross-language internal links, a multi-feature Tibetan-Chinese comparable prediction acquisition model based on bilingual web pages is constructed. Since the tit...

Abstract

The invention relates to a method for extracting Tibetan-language entity knowledge information, comprising the following steps: Tibetan-Chinese comparable corpus information is extracted from Tibetan and Chinese text corpus information; entity equivalence pairs are extracted from the Tibetan-Chinese comparable corpus information; Tibetan-Chinese cross-language entity relations are extracted from the entity equivalence pairs; Tibetan "entity-attribute-value" triples are extracted from the Tibetan-Chinese cross-language entity relations; and the triples are stored in a Tibetan entity knowledge semantic resource library. The method alleviates, to a certain degree, the shortage of Tibetan training corpora, promotes knowledge sharing among different languages, and provides support for research in fields such as Tibetan-Chinese cross-language knowledge question answering, information retrieval, and machine translation.
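The last two steps of the abstract (deriving Tibetan "entity-attribute-value" triples and storing them in a resource library) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Triple` and `TripleStore` names, the `project_triples` helper, the `zh:`/`bo:` identifier prefixes, and the idea of projecting Chinese-side triples onto Tibetan through equivalence pairs are not taken from the patent text, which does not specify how these steps are implemented.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One "entity-attribute-value" record (hypothetical layout)."""
    entity: str
    attribute: str
    value: str


class TripleStore:
    """In-memory stand-in for the semantic resource library (assumption)."""

    def __init__(self):
        self._by_entity = defaultdict(set)

    def add(self, triple):
        self._by_entity[triple.entity].add(triple)

    def attributes_of(self, entity):
        # .get avoids creating empty entries for unknown entities.
        return {t.attribute: t.value for t in self._by_entity.get(entity, ())}


def project_triples(zh_triples, entity_pairs, attribute_pairs):
    """Map Chinese-side triples to Tibetan-side triples.

    A triple is kept only when both its entity and its attribute have a
    known Tibetan equivalent; a value is mapped when it is itself an
    entity with an equivalent, and is copied unchanged otherwise.
    """
    projected = []
    for t in zh_triples:
        bo_entity = entity_pairs.get(t.entity)
        bo_attribute = attribute_pairs.get(t.attribute)
        if bo_entity is None or bo_attribute is None:
            continue
        bo_value = entity_pairs.get(t.value, t.value)
        projected.append(Triple(bo_entity, bo_attribute, bo_value))
    return projected


# Placeholder identifiers; real Tibetan strings would replace the "bo:" labels.
entity_pairs = {"zh:拉萨": "bo:Lhasa", "zh:西藏": "bo:Tibet"}
attribute_pairs = {"zh:located_in": "bo:located_in"}
zh_triples = [
    Triple("zh:拉萨", "zh:located_in", "zh:西藏"),
    Triple("zh:unmapped", "zh:located_in", "zh:西藏"),  # dropped: no equivalent
]

store = TripleStore()
for triple in project_triples(zh_triples, entity_pairs, attribute_pairs):
    store.add(triple)
```

Dropping triples that lack an equivalent entity or attribute is one conservative design choice; an actual system might instead queue them for manual review.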

Description

Technical field

[0001] The invention relates to a method for extracting Tibetan-language entity knowledge information, in particular to a method for extracting Tibetan-Chinese cross-language entity knowledge information based on natural annotation.

Background technique

[0002] With the explosive growth of Web content, research on Web social networks is no longer limited to analyzing Web structure and has turned to analyzing Web content. Within this shift, the knowledge graph has become a research hotspot in natural language processing in the era of big data. A knowledge graph uses nodes to represent entities or concepts and edges to represent the semantic relationships between them, and the extraction of entity knowledge information is one of its main research topics.

[0003] In the extraction of entity knowledge information, the key problem to be solved is the extraction of entities and their attribute relationsh...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
CPC: G06F16/367, G06F40/194
Inventor: 孙媛
Owner: MINZU UNIVERSITY OF CHINA