Text processing method and device based on ambiguous entity words

A text processing and entity word technology, applied in neural learning methods, electrical digital data processing, natural language data processing, and related fields, that can solve problems such as poor disambiguation.

Active Publication Date: 2018-07-13
BEIJING BAIDU NETCOM SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

The two disambiguation approaches used in related technologies are not very effective.

Embodiment Construction

[0036] Embodiments of the present invention are described in detail below, examples of which are shown in the drawings, wherein the same or similar reference numerals designate the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary and are intended to explain the present invention and should not be construed as limiting the present invention.

[0037] The text processing method and device based on ambiguous entity words according to the embodiments of the present invention will be described below with reference to the accompanying drawings.

[0038] Figure 1 is a schematic flowchart of a text processing method based on ambiguous entity words provided by an embodiment of the present invention.

[0039] As shown in Figure 1, the method includes the following steps:

[0040] Step 101, acquire the context of the text to be disambiguated and at least two candidate...

Abstract

The invention provides a text processing method and device based on ambiguous entity words. The method comprises: obtaining the context of a text to be disambiguated and at least two candidate entities that the text may represent; generating a semantic vector of the context through a trained word vector model; generating a first entity vector for each of the at least two candidate entities through a trained unsupervised neural network model; calculating the similarity between the context and each candidate entity; and determining the target entity that the text to be disambiguated represents in the context. Because the unsupervised neural network model has learned both the text semantics of each entity and the relationships between entities, the first entity vectors it generates for the candidate entities contain both kinds of information and describe the entity information of the text to be disambiguated completely. Calculating the similarity between these vectors and the context semantic vector to determine the target entity improves the accuracy of disambiguation.
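The final two steps of the abstract (similarity calculation and target selection) can be sketched as follows. This is a minimal illustration, not the patented implementation: the abstract does not name a similarity measure, so cosine similarity is an assumption here, and the function names `cosine_similarity` and `pick_target_entity` are hypothetical. The vectors are assumed to have already been produced by the trained word vector model and the trained unsupervised neural network model, which are not shown.

```python
import math
from typing import Dict, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction; treat as dissimilar
    return dot / (norm_a * norm_b)


def pick_target_entity(
    context_vector: Sequence[float],
    entity_vectors: Dict[str, Sequence[float]],
) -> Tuple[str, Dict[str, float]]:
    """Score each candidate against the context and return the best one.

    context_vector: semantic vector of the context (from the word vector model).
    entity_vectors: first entity vector per candidate entity
                    (from the unsupervised neural network model).
    """
    if len(entity_vectors) < 2:
        raise ValueError("the method expects at least two candidate entities")
    scores = {
        name: cosine_similarity(context_vector, vector)
        for name, vector in entity_vectors.items()
    }
    target = max(scores, key=scores.get)
    return target, scores


# Toy values for illustration only; real vectors come from the trained models.
context = [0.9, 0.1, 0.2]
candidates = {
    "Apple (company)": [0.8, 0.2, 0.1],
    "apple (fruit)": [0.1, 0.9, 0.3],
}
target, scores = pick_target_entity(context, candidates)
print(target, scores)
```

With these toy vectors the context lies closer to the company candidate, so that candidate is selected as the target entity.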

Description

Technical field

[0001] The invention relates to the technical field of natural language processing, and in particular to a text processing method and device based on ambiguous entity words.

Background

[0002] With the popularization of the mobile Internet, Weibo, Tieba, and major news websites have greatly facilitated people's lives. However, most of the data on these platforms exists in unstructured or semi-structured form, so the data in these knowledge bases contains a large number of ambiguous entity words. Disambiguating these words makes it possible to identify which thing an entity word actually refers to in different contexts, which provides convenience for subsequent applications.

[0003] However, in related technologies, one approach uses existing knowledge base data to calculate text overlap and relevance for disambiguation; the other approach uses existing knowledge base data to perform unsupervised or ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27, G06N3/08
CPC: G06N3/088, G06F40/30, G06N5/022, G06F40/295, G06N3/045, G06N20/00, G06N5/02
Inventor: 冯知凡, 陆超, 朱勇, 李莹
Owner BEIJING BAIDU NETCOM SCI & TECH CO LTD