Text entity detection method and system and related components

A text entity detection technology, applied in the field of machine learning, that addresses problems such as ineffective entity extraction caused by scarce training resources.

Active Publication Date: 2019-10-18
SUZHOU UNIV

AI Technical Summary

Problems solved by technology

[0003] In related technologies, entity mining is usually implemented with a deep-learning neural sequence labeling model. This approach requires a large amount of high-quality manually labeled data to train the model, but named entity recognition training resources for open-domain categories are still quite scarce, so effective entity extraction cannot be performed.


Examples


Embodiment Construction

[0043] To make the purpose, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of this application, not all of them. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments in this application, without creative effort, fall within the scope of protection of this application.

[0044] Refer to Figure 1, which is a flow chart of a text entity detection method provided by an embodiment of this application.

[0045] Specific steps can include:

[0046] S101: Use the seed entity set to match each sentence instance in the target sentence to obtain a matching result, and generate label data corresponding to the target s...
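Step S101 can be illustrated with a minimal sketch. The source does not specify the matching rule or the label scheme, so everything below is an assumption made for illustration: the function name `label_with_seeds`, the longest-match-first strategy, and the `B-ENT` / `I-ENT` / `O` tags are all hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple


def label_with_seeds(
    tokens: Sequence[str],
    seeds: Iterable[Tuple[str, ...]],
) -> List[str]:
    """Tag tokens by matching them against a seed entity set.

    Assumption: seeds are tuples of tokens, and the longest seed that
    matches at a position wins. Unmatched tokens are tagged "O".
    """
    seed_set = {tuple(s) for s in seeds}
    longest = max((len(s) for s in seed_set), default=0)
    labels = ["O"] * len(tokens)

    i = 0
    while i < len(tokens):
        matched = 0
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(longest, len(tokens) - i), 0, -1):
            if tuple(tokens[i:i + n]) in seed_set:
                matched = n
                break
        if matched:
            labels[i] = "B-ENT"
            for j in range(i + 1, i + matched):
                labels[j] = "I-ENT"
            i += matched
        else:
            i += 1
    return labels


if __name__ == "__main__":
    sentence = ["I", "visited", "New", "Zealand", "and", "China"]
    seed_entities = [("New", "Zealand"), ("China",)]
    print(list(zip(sentence, label_with_seeds(sentence, seed_entities))))
```

Exact string matching like this yields precise but incomplete labels: any entity missing from the seed set is tagged `O`, which is the kind of noise the later steps of the method have to deal with.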


Abstract

The invention discloses a text entity detection method. The method comprises: matching each sentence instance in a target sentence against a seed entity set to obtain a matching result, and generating annotation data corresponding to the target sentence according to the matching result; querying the sentence instances in the target sentence that match a word frequency table of the unlabeled corpus, and modifying the annotation data according to the query result to obtain partial annotation data; training a neural sequence labeling model with the partial annotation data; and performing sequence labeling on the unlabeled corpus in the target sentence with the trained model to obtain an entity set of the target sentence. The method can achieve high-quality entity mining without being limited by the quality and scale of the unlabeled corpus. The invention further discloses a text entity detection system, a computer-readable storage medium, and an electronic device, which have the same beneficial effects.
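The abstract does not say how the word frequency table changes the labels. The sketch below shows one plausible reading, stated purely as an assumption: a token tagged `O` is kept as a confident non-entity only if it is frequent in the unlabeled corpus, and is otherwise relaxed to an unknown tag so that the annotation becomes partial. The names `build_frequency_table`, `relax_labels`, `min_count`, and the `UNK` tag are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Sequence


def build_frequency_table(corpus: Iterable[Sequence[str]]) -> Counter:
    """Count how often each token occurs in the unlabeled corpus."""
    table: Counter = Counter()
    for sentence in corpus:
        table.update(sentence)
    return table


def relax_labels(
    tokens: Sequence[str],
    labels: Sequence[str],
    freq: Counter,
    min_count: int = 2,
) -> List[str]:
    """Turn full labels into partial labels.

    Assumption: rare tokens tagged "O" may be unseen entities, so they
    are relabeled "UNK" (unknown). Frequent "O" tokens and all entity
    tags are left unchanged.
    """
    relaxed = []
    for token, label in zip(tokens, labels):
        if label == "O" and freq[token] < min_count:
            relaxed.append("UNK")
        else:
            relaxed.append(label)
    return relaxed


if __name__ == "__main__":
    corpus = [["I", "visited", "China"], ["I", "visited", "Japan"]]
    freq = build_frequency_table(corpus)
    print(relax_labels(["I", "visited", "Japan"], ["O", "O", "O"], freq))
```

A sequence labeling model trained on such data would need a loss that tolerates unknown positions, for example by marginalizing over all tags at `UNK` tokens. That training detail is also not given in the excerpt and is mentioned here only as a possibility.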

Description

Technical field

[0001] The present invention relates to the technical field of machine learning, in particular to a text entity detection method and system, a computer-readable storage medium, and an electronic device.

Background technique

[0002] New similar entity mining is an open-domain entity extraction technique. Unlike traditional named entity recognition, which targets only certain specific entity types, new similar entity mining starts from a seed entity set composed of given entities of any open category and uses open-domain entity extraction techniques to mine, from the unlabeled corpus in the domain, more new entities that belong to the same category as the entities in the set. For example, given a seed entity set containing country names such as {China, Germany}, the extraction system can mine other entities such as {Japan, France}. New similar entit...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27, G06F16/33, G06N3/08
CPC: G06F16/334, G06N3/08, G06F2216/03, G06F40/289
Inventor: 陈文亮, 郁圣卫, 杨耀晟, 张民
Owner: SUZHOU UNIV