Recognition method, device and electronic equipment for nested entity data

A method for identifying nested entity data, applied in the field of data identification. It addresses two problems that reduce the efficiency and accuracy of nested entity data identification: the difficulty of dividing entities into coarse and fine granularity, and the high time and labor costs of labeling.

Active Publication Date: 2021-04-23
BEIJING PERFECT WORLD SOFTWARE TECH DEV CO LTD

AI Technical Summary

Problems solved by technology

[0004] However, this type of entity data labeling method makes it difficult to divide the granularity of entities, and multi-level BIO labeling requires considerable time and labor. It is difficult to produce a large amount of effective labeled data in a short period of time, which reduces the efficiency and accuracy of nested entity data identification.
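
To make the cost of multi-level BIO labeling concrete, the following is a minimal Python sketch, not part of the application. The function name spans_to_bio_layers, the token-level span format (start token, exclusive end token, label) and the example sentence are all assumptions chosen for illustration. It shows that each extra level of nesting forces the annotator to maintain one more full tag sequence.

```python
def spans_to_bio_layers(tokens, spans):
    """Convert token-level spans into as many BIO tag layers as nesting requires.

    spans: list of (start_token, end_token_exclusive, label) tuples (assumed format).
    """
    layers = []
    # Place the longest spans first so outer entities land on the first layer.
    for start, end, label in sorted(spans, key=lambda s: s[0] - s[1]):
        for layer in layers:
            # Reuse a layer only if the span's positions are still unlabeled.
            if all(tag == "O" for tag in layer[start:end]):
                break
        else:
            # No existing layer has room, so a whole new tag sequence is needed.
            layer = ["O"] * len(tokens)
            layers.append(layer)
        layer[start] = f"B-{label}"
        for i in range(start + 1, end):
            layer[i] = f"I-{label}"
    return layers


# Hypothetical example: a location nested inside an organization name.
tokens = ["Beijing", "University", "of", "Technology"]
spans = [(0, 4, "ORG"), (0, 1, "LOC")]
layers = spans_to_bio_layers(tokens, spans)
```

Here two complete tag sequences are needed for a four-token phrase: one for the organization and one for the nested location.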

Embodiment Construction

[0028] Hereinafter, the present application will be described in detail with reference to the drawings and embodiments. It should be noted that, in the case of no conflict, the embodiments in the present application and the features in the embodiments can be combined with each other.

[0029] With the current BIO entity data labeling method, it is difficult to divide the granularity of entities, multi-level BIO labeling requires considerable time and labor, and it is difficult to produce a large amount of effective labeled data in a short period of time, all of which reduce the efficiency and accuracy of nested entity data identification. To address these technical problems, this embodiment provides a method for identifying nested entity data. As shown in Figure 1, the method includes:

[0030] 101. Arranging and combining seed entity vocabulary of different entity categories to generate a short text data set.

[0031] First determine the entity category. This embodiment ca...
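
Step 101 can be illustrated with a minimal Python sketch. The seed vocabulary, the category names, the space separator and the function name generate_short_texts are assumptions made for illustration; the source does not specify them. The sketch also records character offsets with an exclusive end index, which is likewise an assumption.

```python
from itertools import permutations, product

# Hypothetical seed vocabulary keyed by entity category (illustrative only).
SEED_VOCAB = {
    "PERSON": ["Alice", "Bob"],
    "CITY": ["Paris", "Tokyo"],
}


def generate_short_texts(seed_vocab, sep=" "):
    """Permute the categories, then combine one seed word from each category."""
    samples = []
    for order in permutations(seed_vocab):  # every ordering of the categories
        for words in product(*(seed_vocab[c] for c in order)):  # one word each
            text = ""
            spans = []
            for category, word in zip(order, words):
                if text:
                    text += sep
                start = len(text)
                text += word
                # (start, exclusive end, category) of the seed word in the text
                spans.append((start, len(text), category))
            samples.append({"text": text, "entities": spans})
    return samples


samples = generate_short_texts(SEED_VOCAB)
```

With the two-category vocabulary above, the function returns 8 samples. The first is "Alice Paris" with the spans (0, 5, "PERSON") and (6, 11, "CITY").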

Abstract

The application discloses a method, device and electronic equipment for identifying nested entity data, and relates to the technical field of data identification. The method includes: arranging and combining the seed entity vocabulary of different entity categories to generate a short text data set; defining, for each short text in the data set, at least one entity category label together with the start and end index information of the subtext that corresponds to each label; training a deep learning recognition model with the labeled short text data set as the training set; and using the trained recognition model to recognize nested entity data. By defining entity labeling information as start and end indexes plus entity category labels, the application makes multi-level nested entity labeling easier, streamlines the labeling process and reduces its workload, and saves time and labor costs, which in turn can improve the efficiency and accuracy of nested entity data recognition.
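
The "start and end index + entity category label" annotation can be sketched in Python as follows. This is an illustrative reading of the abstract rather than the application's actual data format: the EntitySpan class, the helper names, the exclusive end index and the example sentence are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntitySpan:
    start: int  # index of the first character (inclusive; assumption)
    end: int    # index one past the last character (exclusive; assumption)
    label: str  # entity category label


def make_span(text, subtext, label):
    """Locate the first occurrence of subtext in text and label it."""
    start = text.find(subtext)
    if start < 0:
        raise ValueError(f"{subtext!r} does not occur in {text!r}")
    return EntitySpan(start, start + len(subtext), label)


def nested_pairs(spans):
    """Return (outer, inner) pairs where inner lies entirely inside outer."""
    return [
        (outer, inner)
        for outer in spans
        for inner in spans
        if outer != inner and outer.start <= inner.start and inner.end <= outer.end
    ]


# Hypothetical short text with a location nested inside an organization.
text = "Beijing University of Technology"
spans = [
    make_span(text, "Beijing University of Technology", "ORG"),
    make_span(text, "Beijing", "LOC"),
]
pairs = nested_pairs(spans)
```

Because each entity is an independent (start, end, label) record, adding a nested entity means appending one more record; overlapping spans coexist without any extra tag layers.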

Description

Technical field

[0001] The present application relates to the technical field of data identification, in particular to an identification method, device and electronic equipment for nested entity data.

Background technique

[0002] Named Entity Recognition (NER) is an important research direction in the field of natural language processing. It refers to the recognition of entities with specific meaning in text, mainly including names of people, places, institutions, and proper nouns. With the development of deep learning technology and the needs of actual production applications, the requirements for named entity recognition are also increasing. When using entities for search support, fine-grained and nested entity information is required to ensure the accuracy and coverage of searches. At this stage, the main technology used for named entity recognition is deep learning technology.

[0003] When using deep learning technology for entity recognition, a large amount of labele...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/242, G06F40/295, G06F16/31, G06F16/33, G06F16/35, G06N3/04, G06N3/08
CPC: G06F40/295, G06F40/242, G06F16/316, G06F16/3344, G06F16/35, G06N3/049, G06N3/08, G06N3/045
Inventor: 于淼, 刘炎, 覃建策, 陈邦忠
Owner: BEIJING PERFECT WORLD SOFTWARE TECH DEV CO LTD