A multi-label intelligent marking method and system

A multi-label marking technology applied in the computer field. It addresses the time-consuming and labor-intensive manual labeling of very large data sets, and achieves an improved recall rate, high flexibility, and flexible settings.

Active Publication Date: 2022-01-25
上海暖哇科技有限公司

AI Technical Summary

Problems solved by technology

The difficulty of this type of technology is that both the data feature space and the label space are extremely high-dimensional and sparse, so a large amount of manual data labeling is required, which is time-consuming and labor-intensive.
For example, the label dimension L of Wikipedia is in the millions, so there are 2^L possible label subsets. With data at this scale, traditional labeling methods are clearly no longer applicable.
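
To give a sense of the scale of the label-subset count, the following minimal sketch computes how many decimal digits 2^L has. The value L = 1,000,000 is an assumption chosen for illustration (the text only says "millions"), and the function name is hypothetical.

```python
import math


def label_subset_digits(num_labels: int) -> int:
    """Return the number of decimal digits in 2**num_labels,
    i.e. in the count of possible label subsets."""
    return math.floor(num_labels * math.log10(2)) + 1


# Small sanity case: 10 labels give 2**10 = 1024 subsets (4 digits).
print(label_subset_digits(10))

# Assumed illustrative value: one million labels.
print(label_subset_digits(1_000_000))
```

Even with the assumed one million labels, the number of possible subsets has hundreds of thousands of digits, so enumerating subsets is out of the question.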

Examples

Embodiment 1

[0050] This embodiment provides a multi-label intelligent marking method in the field of computer technology. It is applicable to various multi-label intelligent marking business scenarios and is especially suitable for the medical field.

[0051] Figure 1 is a flow chart of the multi-label intelligent marking method provided in Embodiment 1. As shown in Figure 1, the multi-label intelligent marking method specifically includes:

[0052] S1. Perform preliminary screening by searching the self-built standard thesaurus to obtain m candidate standard words to be matched that are associated with any given label, where m is an integer not less than 1.

[0053] Specifically, step S1 includes at least the following sub-steps:

[0054] S11. Store the standard words of the self-built standard thesaurus in the ES (Elasticsearch) system in batches;

[0055] S12. Create an index for the standard words stored in the ES system;

[0056] S13. Calculate the degree of association between the standard word ...
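
The sub-steps of S1 can be illustrated with a minimal sketch. It does not use Elasticsearch: `ToyIndex` is a hypothetical in-memory inverted index standing in for the ES system, the character-bigram Jaccard score is an assumed stand-in for the degree of association in S13 (the source text is truncated there), and the sample words are invented for illustration.

```python
from collections import defaultdict


def bigrams(text: str) -> set[str]:
    """Character bigrams of a lowercased string; a single character is its own token."""
    text = text.lower()
    if len(text) < 2:
        return {text} if text else set()
    return {text[i:i + 2] for i in range(len(text) - 1)}


class ToyIndex:
    """Hypothetical in-memory inverted index standing in for the ES system."""

    def __init__(self) -> None:
        self.postings: dict[str, set[int]] = defaultdict(set)
        self.words: list[str] = []

    def bulk_store(self, standard_words: list[str]) -> None:
        # S11 + S12: store the words as a batch and index each one by its bigrams.
        for word in standard_words:
            word_id = len(self.words)
            self.words.append(word)
            for gram in bigrams(word):
                self.postings[gram].add(word_id)

    def search(self, label: str, m: int) -> list[tuple[str, float]]:
        # S13 (assumed scoring): Jaccard overlap between bigram sets.
        label_grams = bigrams(label)
        candidate_ids: set[int] = set()
        for gram in label_grams:
            candidate_ids |= self.postings.get(gram, set())
        scored = []
        for word_id in candidate_ids:
            word_grams = bigrams(self.words[word_id])
            score = len(label_grams & word_grams) / len(label_grams | word_grams)
            scored.append((self.words[word_id], score))
        # Highest score first; ties broken alphabetically for stable output.
        scored.sort(key=lambda pair: (-pair[1], pair[0]))
        return scored[:m]


index = ToyIndex()
index.bulk_store([
    "type 2 diabetes",
    "type 1 diabetes",
    "hypertension",
    "gestational diabetes",
])
candidates = index.search("diabetes", m=2)
print(candidates)
```

The point of the preliminary screening is that only the m best-scoring candidates are passed on to the more expensive similarity calculation, rather than every word in the thesaurus.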

Embodiment 2

[0096] In order to implement the multi-label intelligent marking method in the first embodiment above, this embodiment provides a multi-label intelligent marking system.

[0097] Figure 4 is a schematic structural diagram of the multi-label intelligent marking system provided by Embodiment 2 of the present invention. As shown in Figure 4, the multi-label intelligent marking system 100 includes at least:

[0098] Preliminary screening module 1: used to perform preliminary screening by searching the self-built standard thesaurus to obtain m candidate standard words to be matched that are associated with any given label, where m is an integer not less than 1;

[0099] Similarity calculation module 2: used to calculate, one by one, the similarity between the label and each of the m candidate standard words to be matched;

[0100] Matching result determination module 3: used to set the similarity threshold, accord...
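
The roles of modules 2 and 3 can be sketched as two small functions. The source does not state which similarity measure is used, so `difflib.SequenceMatcher.ratio()` from the Python standard library is an assumption here, as are the function names, the sample words, and the threshold value of 0.6.

```python
from difflib import SequenceMatcher


def similarity(label: str, word: str) -> float:
    """Assumed similarity measure: difflib ratio in [0, 1], case-insensitive."""
    return SequenceMatcher(None, label.lower(), word.lower()).ratio()


def match_standard_words(
    label: str, candidates: list[str], threshold: float
) -> list[tuple[str, float]]:
    """Score every candidate against the label and keep those at or above the threshold."""
    scored = [(word, similarity(label, word)) for word in candidates]
    kept = [(word, score) for word, score in scored if score >= threshold]
    # Most similar first.
    kept.sort(key=lambda pair: -pair[1])
    return kept


# Hypothetical candidates, e.g. the output of the preliminary screening module.
matches = match_standard_words(
    "diabetes", ["type 2 diabetes", "hypertension"], threshold=0.6
)
print(matches)
```

Raising the threshold trades recall for precision: fewer of the m candidates survive as the n matched standard words, and setting it to 0 keeps every candidate.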

Abstract

The invention discloses a multi-label intelligent marking method and system in the field of computer technology. Preliminary screening is performed by searching a self-built standard lexicon to obtain m candidate standard words to be matched that are associated with any given label. The similarity between the label and each of the m candidate standard words is then calculated one by one. A similarity threshold is set, and the n standard words to be matched that are related to the label are determined from the candidates according to that threshold. By manually constructing a label dictionary, the method and system let as many standard words as possible in the self-built standard lexicon find their corresponding labels, achieving large-scale multi-label marking and improving the recall rate while maintaining accuracy. Flexibility is high, and labels can be added at any time. Combining an Elasticsearch search for preliminary screening with similarity calculation yields the standard words that match any given label, which can meet the needs of high-concurrency commercial systems and improve the recall rate.

Description

Technical field

[0001] The invention relates to the field of computer technology, in particular to a multi-label intelligent marking method and system.

Background art

[0002] When applying medical data, the words in the standard lexicon need to be marked with multiple labels. The traditional manual approach is inefficient and can no longer meet normal production needs. Large-scale multi-label learning is widely used in practical applications such as document annotation, search ranking, and product recommendation, and is an important research problem in current computer technology.

[0003] One existing large-scale multi-label learning technique is to construct a classifier that automatically selects, from a very large label collection, the subset of labels most relevant to a standard word and uses it to mark that word. The difficulty of this type of technology is that both the data feature space and the label space have extremely high...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06K9/62, G06F16/901
CPC: G06F16/901, G06F18/213, G06F18/22, G06F18/253, G06F18/214
Inventor: 顾玲玲, 毛顺亿, 曹羽, 段艳婷, 孙铭权, 郑天龙, 龚快快, 朱亮
Owner: 上海暖哇科技有限公司