Bidding data named entity recognition method based on pre-training model

A named entity recognition technology for bidding data, applied in data processing applications, electrical digital data processing, natural language data processing, and related fields. It addresses the lack of suitable model frameworks, high computational cost, and the difficulty of making use of the available data.

Active Publication Date: 2021-08-20
湖南达德曼宁信息技术有限公司
Cites: 7; Cited by: 4

AI Technical Summary

Problems solved by technology

Because a CRF scores the tag sequence of the whole text globally, decoding has to compare the scores of all possible tag paths and then select the best one, which leads to a large computational overhead. This kind of extraction is slow and prone to entity boundary errors.
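To make the cost mentioned above concrete, here is a minimal sketch of Viterbi decoding, the dynamic-programming procedure commonly used to find the best path in a linear-chain CRF. It is not taken from the patent: the function name `viterbi_decode` and the `emissions` and `transitions` inputs are assumptions for illustration. Even with dynamic programming, every pair of adjacent tags is scored at every token, so the work grows with the text length times the square of the number of tags, which matters for long bidding texts with many fine-grained entity types.

```python
from typing import List, Sequence


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
) -> List[int]:
    """Return the highest-scoring tag sequence for a linear-chain CRF.

    emissions[t][j]   : score of tag j at position t (hypothetical model output)
    transitions[i][j] : score of moving from tag i to tag j (hypothetical)
    """
    n_steps = len(emissions)
    if n_steps == 0:
        return []
    n_tags = len(emissions[0])

    # best[j] = best score of any path that ends in tag j at the current position
    best = list(emissions[0])
    backptr: List[List[int]] = []

    for t in range(1, n_steps):
        new_best = []
        pointers = []
        for j in range(n_tags):
            # Every (previous tag, current tag) pair is scored here:
            # O(n_tags^2) work per token.
            prev = max(range(n_tags), key=lambda i: best[i] + transitions[i][j])
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
            pointers.append(prev)
        best = new_best
        backptr.append(pointers)

    # Trace the best path backwards from the best final tag.
    last = max(range(n_tags), key=lambda j: best[j])
    path = [last]
    for pointers in reversed(backptr):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

With two tags and all transition scores set to zero, the decoder simply follows the strongest emission at each position; adding a penalty for switching tags makes it prefer a path that stays on one tag.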
[0009] On the one hand, model frameworks for named entity extraction from bidding data are still scarce, and high-quality labeled bidding data is also very scarce.
On the other hand, bidding data is generally public information that is relatively easy to obtain, and a large amount of new data is generated every day. However, the data obtained is usually raw plain text, and supervised models have difficulty making use of this unlabeled data.
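The abstract of this patent answers the unlabeled-data problem with pseudo-labeling (steps S4 to S6): train a reference model M on labeled data, let it label the unlabeled text, then retrain on both sets to obtain M'. The following is a minimal sketch of that control flow only. The per-token counting "model", the tag names, and the confidence threshold are all assumptions for illustration; the source does not say whether or how pseudo-labels are filtered, and a real system would fine-tune a pre-trained network instead.

```python
from collections import Counter
from typing import Dict, List, Tuple

Example = Tuple[str, str]      # (token, tag), hypothetical data format
Model = Dict[str, Counter]     # toy model: tag counts per token


def train(data: List[Example]) -> Model:
    """Toy stand-in for supervised training: count tags per token."""
    model: Model = {}
    for token, tag in data:
        model.setdefault(token, Counter())[tag] += 1
    return model


def predict(model: Model, token: str) -> Tuple[str, float]:
    """Return (most frequent tag, its relative frequency); unseen tokens get ("O", 0.0)."""
    counts = model.get(token)
    if not counts:
        return "O", 0.0
    tag, n = counts.most_common(1)[0]
    return tag, n / sum(counts.values())


def self_train(
    labeled: List[Example],
    unlabeled: List[str],
    threshold: float = 0.9,
) -> Tuple[Model, List[Example]]:
    """Pseudo-labeling loop mirroring steps S4 to S6 of the abstract."""
    reference = train(labeled)                      # S4: reference model M
    pseudo: List[Example] = []
    for token in unlabeled:                         # S5: predict unlabeled data
        tag, confidence = predict(reference, token)
        if confidence >= threshold:                 # assumption: keep confident labels only
            pseudo.append((token, tag))
    final_model = train(labeled + pseudo)           # S6: joint training gives M'
    return final_model, pseudo
```

Calling `self_train(labeled, unlabeled)` returns the retrained model together with the pseudo-labeled examples it accepted, which makes it easy to inspect what the reference model contributed.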


Examples

Embodiment Construction

[0048] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without creative efforts fall within the protection scope of the present invention.

[0049] In the description of the present invention, it should be noted that the orientations or positional relationships indicated by terms such as "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", and "outer" are based on the orientations or positional relationships shown in the drawings. They are used only for convenience in describing the present invention and simplifying the description, rather than indicating or implying that the referred device...

Abstract

The invention relates to a bidding data named entity recognition method based on a pre-trained model. The method comprises the following steps:
S1, obtaining an open-source pre-trained model;
S2, acquiring an unlabeled corpus and performing data preprocessing;
S3, training the pre-trained model obtained in S1;
S4, performing supervised training with the labeled data to obtain a reference model M;
S5, using the reference model M to predict the unlabeled data and obtain pseudo-label data;
S6, adding both the pseudo-label data and the real labeled data to the training set and training jointly to obtain a model M';
S7, constructing a fragment decoding network;
S8, inputting the text into the model M' to encode it;
S9, inputting the text encoding into the fragment decoding network; and
S10, extracting entity fragments and their categories.
According to the invention, after the model is pre-trained, decoding is carried out by fragment identification, which predicts the start and end positions of each entity. This increases decoding speed and yields entity results with good precision.
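The fragment decoding in steps S7 to S10 can be illustrated with a small sketch. It assumes that the decoding network outputs, for every token and every entity category, one probability that an entity starts there and one that it ends there; the pairing rule (nearest end at or after each start), the threshold, the `max_len` limit, and the category names are all assumptions, since the source only states that start and end positions are predicted.

```python
from typing import List, Sequence, Tuple

Span = Tuple[int, int, str]  # (start index, end index inclusive, category)


def decode_spans(
    start_probs: Sequence[Sequence[float]],
    end_probs: Sequence[Sequence[float]],
    categories: Sequence[str],
    threshold: float = 0.5,
    max_len: int = 50,
) -> List[Span]:
    """Pair each confident start with the nearest confident end of the same category.

    start_probs[t][c] / end_probs[t][c]: hypothetical network outputs giving the
    probability that an entity of category c starts / ends at token t.
    """
    spans: List[Span] = []
    n_tokens = len(start_probs)
    for c, name in enumerate(categories):
        for i in range(n_tokens):
            if start_probs[i][c] < threshold:
                continue
            # Look ahead at most max_len tokens for the first confident end.
            for j in range(i, min(i + max_len, n_tokens)):
                if end_probs[j][c] >= threshold:
                    spans.append((i, j, name))
                    break
    # Sort by position so the result follows the reading order of the text.
    return sorted(spans)
```

Unlike path-based decoding, this makes a single pass over the start positions of each category plus a bounded look-ahead, with no transition scores between neighbouring tags, which is one plausible reading of why fragment decoding is described as faster.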

Description

Technical field

[0001] The invention relates to the field of bidding text processing, in particular to a named entity recognition method for bidding data based on a pre-trained model.

Background

[0002] Bidding data refers to the bidding announcement information, or the bidding and tendering announcement information, disclosed by the tenderer. Bidding texts are often long (the average length of a whole document is more than 1500 words), and the entity types in the text are fine-grained (for example, time entities can be further divided into bidding start time, bidding deadline, and bid opening time). Extracting entities from bidding data plays a vital role in analyzing bidding demand or winning-bid information for a given region and period, and is an emerging business requirement. If you want to extract the named entity information in the bidding data, the most direct idea is to use the technology related to named ent...

Application Information

IPC(8): G06F40/295, G06F40/30, G06F40/126, G06Q30/08
CPC: G06Q30/08, G06F40/126, G06F40/295, G06F40/30
Inventor: 刘洋
Owner: 湖南达德曼宁信息技术有限公司