
Entity recognition method and device for text, electronic equipment and storage medium

An entity recognition technology for text, applied in the field of data processing, that addresses problems such as incomplete recognition of input text data and inaccurate recognition results.

Active Publication Date: 2021-03-09
PHARMCUBE (BEIJING) CO LTD

AI Technical Summary

Problems solved by technology

[0005] The first purpose of this application is therefore to propose a text entity recognition method that realizes entity recognition for multilingual mixed texts and improves the accuracy of entity recognition for overly long texts. It addresses a problem in the prior art: for ultra-long sentences, the input text data is truncated and cannot be fully recognized, so the recognition results are not accurate enough.



Examples


Embodiment Construction

[0045] Embodiments of the present application are described in detail below, examples of which are shown in the drawings, wherein the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary, and are intended to explain the present application, and should not be construed as limiting the present application.

[0046] The text entity recognition method, device, electronic device, and storage medium in the embodiments of the present application are described below with reference to the accompanying drawings.

[0047] The text entity recognition method of the embodiment of the present application solves the problem that super-long sentences are truncated by BERT and cannot be recognized, and allows the recognition of disease targets in scenarios such as mixed Chinese and English texts when there is no labeled training data, using ...


Abstract

The invention provides a text entity recognition method and device, electronic equipment and a storage medium, and relates to the technical field of data processing. The method comprises the following steps: obtaining a to-be-processed text, where the to-be-processed text is a mixed text of at least two languages; obtaining a sentence-splitting tool according to the language category, and splitting the to-be-processed text with that tool to obtain a plurality of to-be-processed sentences; performing word segmentation on the to-be-processed sentences to obtain a plurality of to-be-processed segmented words, and splicing the segmented words into a character string with a target length; and, when the target length is greater than a preset length threshold, performing matching annotation on the segmented words based on the entries of a dictionary to obtain an entity recognition result. In this way, entity recognition of multi-language mixed text is realized, and the accuracy of entity recognition for overlong text can be improved.
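The steps in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the regular expressions, the threshold `MAX_LEN`, the `DICTIONARY` entries, the forward-maximum-matching strategy, and the optional `model_fn` hook for short sentences are all assumptions made for illustration. In particular, a single regex stands in for the language-specific sentence-splitting tools, and the abstract does not say what happens to sentences below the threshold.

```python
import re

# Hypothetical values for illustration; none of these come from the patent.
MAX_LEN = 20  # assumed length threshold, in characters
DICTIONARY = {
    "EGFR": "TARGET",
    "non-small cell lung cancer": "DISEASE",
    "肺癌": "DISEASE",
}


def split_sentences(text):
    """Split after Chinese or English sentence terminators.

    A single regex stands in for the per-language splitting tools.
    """
    parts = re.split(r"(?<=[。！？.!?])\s*", text)
    return [p for p in parts if p]


def tokenize(sentence):
    """Latin/digit runs (hyphens allowed) become one token; CJK is split per character."""
    return re.findall(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*|[\u4e00-\u9fff]|\S", sentence)


def dictionary_match(tokens, dictionary):
    """Forward maximum matching of dictionary entries over the token sequence."""
    entries = {tuple(tokenize(term)): (term, label) for term, label in dictionary.items()}
    longest = max((len(key) for key in entries), default=0)
    found, i = [], 0
    while i < len(tokens):
        for n in range(min(longest, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in entries:
                found.append(entries[key])
                i += n
                break
        else:
            i += 1  # no entry starts at this token
    return found


def recognize(text, dictionary=DICTIONARY, max_len=MAX_LEN, model_fn=None):
    """Route each sentence by the length of its spliced token string."""
    results = []
    for sentence in split_sentences(text):
        tokens = tokenize(sentence)
        spliced = "".join(tokens)  # the "character string with a target length"
        if len(spliced) > max_len:
            route, entities = "dictionary", dictionary_match(tokens, dictionary)
        else:
            # Assumption: short sentences go to some other recognizer, if one is given.
            route, entities = "model", (model_fn(tokens) if model_fn else [])
        results.append({"sentence": sentence, "route": route, "entities": entities})
    return results


if __name__ == "__main__":
    sample = (
        "EGFR突变常见于non-small cell lung cancer患者，尤其是亚洲人群中的肺癌患者。"
        " EGFR is a target."
    )
    for item in recognize(sample):
        print(item["route"], item["entities"])
```

Matching on token sequences rather than raw substrings keeps an entry such as "EGFR" from matching inside a longer Latin word, at the cost of requiring dictionary entries to tokenize the same way as the text.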

Description

Technical field

[0001] The present application relates to the technical field of data processing, and in particular to a text entity recognition method and device, electronic equipment, and a storage medium.

Background technique

[0002] At present, with the continuous development of the medical and health field, data from different sources and in different formats continue to emerge, and a large amount of information that can be identified and mined is hidden in these big data. As the most important step in medical data analysis, medical entity recognition (especially disease entity recognition) can extract the medical terms present in related texts, which plays an important role in subsequent research. Medical texts from different sources pose different problems, for example: Chinese-language medical literature is often mixed with English descriptions of disease words, target words, etc.; medical patent texts often have long description sentences,...


Application Information

IPC(8): G06F40/295, G06F40/284, G06F40/211, G06F40/242
CPC: G06F40/211, G06F40/242, G06F40/284, G06F40/295
Inventor 郭韦良阳晓文张荣驰何小莲邓奕
Owner PHARMCUBE (BEIJING) CO LTD