Entity Recognition and Linking System and Method Based on CN-DBpedia

An entity recognition and entity linking technology, applied in fields such as instrumentation, computing, and semantic analysis, that addresses the scarcity of contextual information in short texts and achieves better word segmentation and entity recognition.

Active Publication Date: 2022-04-12
FUDAN UNIV

AI Technical Summary

Problems solved by technology

The invention addresses entity linking in short texts, where little contextual information is available.



Examples


Embodiment 1

[0056] The invention proposes a CN-DBpedia-based system and method for short-text entity recognition and linking. As shown in Figure 1, the framework of the proposed technical solution includes an entity linking module and an entity recognition module. The entity linking module includes a synonym matching unit and an entity linking unit; the entity recognition module includes a tokenizer, a word probability calculation unit, and an entity discrimination unit. The synonym matching unit first uses the CN-DBpedia thesaurus to identify candidate entities in the input text sequence, that is, all possible entity synonyms in the sequence. The probability of each entity corresponding to an entity synonym is then calculated. Finally, the text sequence, the identified candidate entities, and their probabilities are input to the entity recognition module, and the tokenizer of the entity recognition module wil...
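The candidate identification and probability steps described above might be sketched as follows. This is a minimal illustration, not the patented implementation: the toy `THESAURUS`, the entity identifiers, the link counts, and the count-normalization formula are all assumptions, since the visible text does not say how the probabilities are computed.

```python
# Hypothetical toy thesaurus: surface form (synonym) -> {entity id: link count}.
# Real CN-DBpedia data and the patent's actual probability model are not shown here.
THESAURUS = {
    "苹果": {"Apple_(fruit)": 30, "Apple_Inc": 70},
    "苹果手机": {"iPhone": 10},
    "手机": {"Mobile_phone": 5},
}


def find_candidates(text, thesaurus, max_len=8):
    """Return every (start, end, mention) span of text that is a thesaurus key."""
    spans = []
    for start in range(len(text)):
        # Try every substring starting here, up to max_len characters long.
        for end in range(start + 1, min(len(text), start + max_len) + 1):
            mention = text[start:end]
            if mention in thesaurus:
                spans.append((start, end, mention))
    return spans


def entity_probabilities(mention, thesaurus):
    """Assumed model: P(entity | mention) is the entity's share of the link counts."""
    counts = thesaurus[mention]
    total = sum(counts.values())
    return {entity: count / total for entity, count in counts.items()}


if __name__ == "__main__":
    for start, end, mention in find_candidates("我买了苹果手机", THESAURUS):
        print(start, end, mention, entity_probabilities(mention, THESAURUS))
```

Overlapping spans such as "苹果" and "苹果手机" are all kept at this stage; choosing among them is left to the later recognition step.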



Abstract

The invention discloses a CN-DBpedia-based entity recognition and linking system and method. The system includes an entity linking module and an entity recognition module; the entity linking module includes a synonym matching unit and an entity linking unit; the entity recognition module includes a word segmenter, a word probability calculation unit, and an entity discrimination unit. The invention constructs semantic relationships between entities and words, so that a text's relationship to entities can be found even with very little context. It integrates a machine-learning-based entity recognition algorithm with an unsupervised word segmentation algorithm, which allows it to assess the plausibility of entity name boundaries from a global point of view, expand the vocabulary space of word segmentation, and calculate the word-forming probability of entity words with a more reasonable algorithm. Because the invention links first and then recognizes, the semantic information of the text is fully used during entity recognition, yielding better word segmentation and entity recognition.
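To illustrate how expanding the segmentation vocabulary with entity words can change the globally best segmentation, here is a small sketch using a standard unigram dynamic-programming (Viterbi-style) segmenter. This is a swapped-in textbook technique, not the algorithm claimed by the patent; the function `segment`, the vocabularies, and all probability values are hypothetical toy assumptions.

```python
import math


def segment(text, word_prob, unknown_prob=1e-8, max_len=8):
    """Split text into the word sequence with the highest product of word probabilities.

    Single characters missing from word_prob fall back to unknown_prob, so a
    segmentation always exists.
    """
    n = len(text)
    best = [-math.inf] * (n + 1)  # best[i]: best log-probability of text[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last word in that split
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            if word in word_prob:
                p = word_prob[word]
            elif len(word) == 1:
                p = unknown_prob
            else:
                continue  # unknown multi-character strings are not allowed as words
            score = best[start] + math.log(p)
            if score > best[end]:
                best[end] = score
                back[end] = start
    # Walk the back-pointers from the end of the text to recover the words.
    words = []
    i = n
    while i > 0:
        words.append(text[back[i]:i])
        i = back[i]
    return words[::-1]


# Toy word probabilities (assumed values, not normalized).
base_vocab = {"我": 0.05, "买": 0.02, "了": 0.05, "苹果": 0.01, "手机": 0.01}
# Hypothetical word-forming probability for an entity word found by linking.
entity_vocab = {"苹果手机": 0.005}

if __name__ == "__main__":
    print(segment("我买了苹果手机", base_vocab))
    print(segment("我买了苹果手机", {**base_vocab, **entity_vocab}))
```

With the base vocabulary alone, "苹果" and "手机" are separate words; once the entity word "苹果手机" is added with a higher probability than the product of its parts, the segmenter keeps it whole. Scoring the full sequence rather than each word in isolation is one simple way to read the "global point of view" mentioned above.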

Description

Technical field

[0001] The invention belongs to the technical field of data services, and in particular relates to a CN-DBpedia-based entity recognition and linking system and method.

Background

[0002] The advent of the big data era has brought unprecedented data dividends to the rapid development of artificial intelligence. "Fed" by big data, artificial intelligence technology has made unprecedented progress, most prominently in knowledge engineering, represented by knowledge graphs, and in machine learning, represented by deep learning. As the dividends that big data offers deep learning are exhausted, deep learning models are increasingly approaching their performance ceiling. Meanwhile, large numbers of knowledge graphs continue to emerge, but these treasure houses of human prior knowledge have not been effectively used by deep learning. Integrating knowledge graph...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/295, G06F40/30, G06F40/247
CPC: G06F40/247, G06F40/295, G06F40/30
Inventor: 梁家卿, 陈砺寒, 肖仰华
Owner: FUDAN UNIV