A Named Entity Recognition Method for Bidding Data Based on Pre-trained Model

A named entity recognition technology for bidding data, applied in data processing applications, electronic digital data processing, natural language data processing, etc. It addresses the lack of model frameworks, the difficulty of utilizing data, and high computational overhead, in order to enhance semantic understanding, avoid manual annotation, and improve model performance.

Active Publication Date: 2021-10-01
湖南达德曼宁信息技术有限公司
7 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

Because a CRF scores the tag path over the whole text, it needs to calculate the scores of all possible paths and then select the best path according to those scores, which leads to a large computational overhead. This kind of extraction method is slow and prone to entity boundary errors.
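To make the cost of CRF decoding concrete, the following is a minimal sketch of Viterbi decoding, the dynamic-programming procedure usually used to find the best path in a linear-chain CRF. It is not taken from the patent; the function name `viterbi` and the `emissions`/`transitions` score matrices are illustrative assumptions. For n tokens and k tags the nested loops do on the order of n·k² work, and every token has to wait for the previous one.

```python
from typing import List


def viterbi(emissions: List[List[float]], transitions: List[List[float]]) -> List[int]:
    """Return the highest-scoring tag sequence for a linear-chain CRF.

    emissions[t][j]   : score of tag j at token t (assumed given by an encoder)
    transitions[i][j] : score of moving from tag i to tag j
    """
    n_tokens, n_tags = len(emissions), len(transitions)
    score = list(emissions[0])      # best score of any path ending in each tag so far
    backptr: List[List[int]] = []   # backptr[t-1][j] = best previous tag for tag j at token t

    for t in range(1, n_tokens):
        new_score, pointers = [], []
        for j in range(n_tags):
            # Compare every previous tag i: this is the k^2 work done per token.
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
            pointers.append(best_i)
        score = new_score
        backptr.append(pointers)

    # Trace back from the best final tag to recover the whole path.
    tag = max(range(n_tags), key=lambda j: score[j])
    path = [tag]
    for pointers in reversed(backptr):
        tag = pointers[tag]
        path.append(tag)
    return path[::-1]
```

With all-zero transition scores each token simply takes its best tag; a strong penalty on switching tags keeps the path on a single tag even where the emission scores disagree, which is the global behaviour the paragraph above refers to.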
[0009] On the one hand, model frameworks for named entity extraction from bidding data are still scarce, and high-quality labeled bidding data is very scarce.
On the other hand, bidding data is generally public information that is relatively easy to obtain, and a large amount of new data is generated every day. However, the data obtained is usually raw plain text, and it is difficult for supervised models to make use of this unlabeled data.
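Steps S4 to S6 of the abstract respond to this with pseudo-labels. The sketch below shows only the data flow of that idea, under loud assumptions: the `train` function is a hypothetical toy that memorises the most frequent label per token and stands in for real supervised training, and `self_train`, `Example`, and `Model` are illustrative names that do not come from the patent.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str]      # (token, label); a stand-in for a labeled sentence
Model = Callable[[str], str]   # maps a token to a predicted label


def train(examples: List[Example]) -> Model:
    """Toy trainer (hypothetical): memorise the most frequent label per token."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for token, label in examples:
        counts[token][label] += 1
    # Fallback for unseen tokens: the most frequent label overall.
    fallback = Counter(label for _, label in examples).most_common(1)[0][0]
    table = {tok: c.most_common(1)[0][0] for tok, c in counts.items()}
    return lambda token: table.get(token, fallback)


def self_train(labeled: List[Example], unlabeled: List[str]) -> Tuple[Model, List[Example]]:
    """Pseudo-labeling data flow corresponding to steps S4 to S6."""
    baseline = train(labeled)                              # S4: benchmark model M
    pseudo = [(tok, baseline(tok)) for tok in unlabeled]   # S5: M labels the unlabeled data
    model_prime = train(labeled + pseudo)                  # S6: retrain on real + pseudo labels
    return model_prime, pseudo
```

In practice `train` would be fine-tuning of the pre-trained encoder, and pseudo-labels are often filtered by model confidence before being added; the source does not say whether such a filter is used, so none is shown here.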



Examples


Embodiment Construction

[0048] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without creative efforts fall within the protection scope of the present invention.

[0049] In the description of the present invention, it should be noted that terms such as "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", and "outer" indicate orientations or positional relationships based on those shown in the drawings. They are used only for the convenience of describing the present invention and simplifying the description, rather than indicating or implying that the referred device...


Abstract

The present invention relates to a named entity recognition method for bidding data based on a pre-trained model, which comprises the following steps:
S1: obtain an open-source pre-trained model;
S2: obtain an unlabeled corpus and preprocess the data;
S3: further train the pre-trained model from S1;
S4: perform supervised training on labeled data to obtain a benchmark model M;
S5: have benchmark model M predict on unlabeled data to obtain pseudo-labeled data;
S6: add the pseudo-labeled data to the training set together with the real labeled data and train on both to obtain model M';
S7: construct a segment decoding network;
S8: input the text into model M' for encoding;
S9: input the text encoding into the segment decoding network;
S10: extract the entity segments and their categories.
After encoding with the pre-trained model, the invention decodes by segment recognition, predicting the start and end positions of entities, which speeds up decoding and yields more precise entity results.
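The extraction in step S10 can be sketched as follows, assuming the segment decoding network outputs, for every token and category, a probability that an entity of that category starts there and a probability that one ends there. The function name `decode_spans`, the 0.5 threshold, the `max_len` cap, and the rule of pairing each start with the nearest following end are all illustrative assumptions; the source states only that start and end positions are predicted.

```python
from typing import List, Sequence, Tuple


def decode_spans(
    start_probs: Sequence[Sequence[float]],
    end_probs: Sequence[Sequence[float]],
    categories: Sequence[str],
    threshold: float = 0.5,   # assumed cut-off, not given in the source
    max_len: int = 50,        # assumed cap on entity length in tokens
) -> List[Tuple[int, int, str]]:
    """Pair each confident start with the nearest confident end of the same category.

    start_probs[t][c] / end_probs[t][c]: probability that token t starts / ends
    an entity of category c. Returns (start, end, category) with inclusive indices.
    """
    spans: List[Tuple[int, int, str]] = []
    n_tokens = len(start_probs)
    for c, name in enumerate(categories):
        for s in range(n_tokens):
            if start_probs[s][c] < threshold:
                continue
            # Scan forward for the first end position above the threshold.
            for e in range(s, min(n_tokens, s + max_len)):
                if end_probs[e][c] >= threshold:
                    spans.append((s, e, name))
                    break
    return spans
```

Unlike path-based decoding, this step involves no tag-to-tag transition scores, so each category and start position can be handled independently, and entity boundaries come directly from the start and end predictions.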

Description

Technical field

[0001] The invention relates to the field of bidding text processing, in particular to a named entity recognition method for bidding data based on a pre-trained model.

Background technique

[0002] Bidding data refers to the tender announcement information or bid-award announcement information disclosed by the tenderer. In bidding data, the text is often long (the average length of a whole document is more than 1500 words), and the entity types in the text are finer-grained (for example, time entities can be further divided into bidding start time, bidding deadline, and bid opening time). Extracting bidding entity data plays a vital role in analyzing bidding demand or winning-bid information for a given region and period, and is an emerging business requirement. If you want to extract the named entity information in the bidding data, the most direct idea is to use the technology related to named ent...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/295, G06F40/30, G06F40/126, G06Q30/08
CPC: G06Q30/08, G06F40/126, G06F40/295, G06F40/30
Inventor: 刘洋
Owner: 湖南达德曼宁信息技术有限公司