BILSTM-CRF product name identification method based on self-attention

A product name identification technology, applied in neural learning methods, natural language data processing, metadata text retrieval, etc., with the effects of stickiness, labor cost reduction, and workload reduction.

Active Publication Date: 2019-04-12
FOCUS TECH

AI Technical Summary

Problems solved by technology

The disadvantage of the dictionary-based method is that it is difficult to build a complete product name dictionary and corresponding product attribute library, and the construction process is time-consuming and laborious. It also cannot adequately handle nested product names: in "flowers delivered directly in the same city", for example, "flowers" is itself a product.

Examples

Embodiment Construction

[0045] The present invention is further described below with reference to the accompanying drawings and exemplary embodiments:

[0046] As shown in Figure 1, the self-attention-based BiLSTM-CRF product name recognition method provided in this example includes the following steps:

[0047] Step 101: Establish a dictionary of some product names and a corresponding attribute library, perform keyword matching on product titles against the constructed product name dictionary to find candidate product names, and form a product name candidate set;
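
The keyword matching in Step 101 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `find_candidates`, the toy dictionary, and the use of case-insensitive substring matching are all assumptions made for the example.

```python
def find_candidates(title, name_dictionary):
    """Return every dictionary entry that occurs in the title.

    Matching is a plain case-insensitive substring test (an assumption);
    results are ordered longest first, then alphabetically.
    """
    lowered = title.lower()
    hits = {name for name in name_dictionary if name.lower() in lowered}
    return sorted(hits, key=lambda name: (-len(name), name))


# Hypothetical toy dictionary; a real one would be far larger.
product_names = {"acaricide", "flowers", "knitted fabric"}

title = ("acaricide factory direct sales Yuexiu plant protection "
         "acaricide acaricide wholesale purchase")
candidates = find_candidates(title, product_names)
print(candidates)
```

Because every dictionary entry found in the title is kept, a title containing several dictionary entries yields several candidates, which is why a second step is needed to choose among them.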

[0048] Step 102: Use the product name attribute library to find the product name most similar to the attributes of the product title as a preliminary label. The similarity is calculated between the attribute library corresponding to each product name in the candidate set and the attribute library of the product title; the attribute library here is the sum of the product's other information, such as product keywords, prod...
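
One way to picture the selection in Step 102 is shown below. The visible text does not say which similarity measure is used, so Jaccard similarity over attribute sets is an assumption here, as are the function names and the toy attribute data.

```python
def jaccard(a, b):
    """Jaccard similarity of two attribute collections (assumed measure)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def preliminary_label(title_attributes, candidate_attributes):
    """Pick the candidate whose attribute set is most similar to the title's.

    candidate_attributes maps each candidate product name to its attributes.
    Ties are broken alphabetically; returns None if there are no candidates.
    """
    if not candidate_attributes:
        return None
    return max(
        sorted(candidate_attributes),
        key=lambda name: jaccard(title_attributes, candidate_attributes[name]),
    )


# Hypothetical attribute data for illustration only.
title_attrs = {"pesticide", "mite control", "agriculture"}
candidate_attrs = {
    "acaricide": {"pesticide", "mite control", "spray"},
    "plant protection": {"agriculture", "service"},
}
label = preliminary_label(title_attrs, candidate_attrs)
print(label)
```

Any set or vector similarity could be substituted for `jaccard` without changing the surrounding logic.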

Abstract

The invention discloses a self-attention-based BiLSTM-CRF product name identification method. The method comprises three parts: semi-automatic labeling of product title data, model construction and training, and model use. The semi-automatic labeling part establishes an iterative process of preliminary labeling, learning, label prediction, manual correction, and further learning and label prediction. The model construction and training part encodes each word as an N-dimensional dense vector, inputs the vectors into a BiLSTM layer to obtain text sequence features, obtains a label probability for each word with a Softmax classification layer, extracts local text features with the CRF layer, and trains the model. The model use part extracts text features, obtains the probabilities of all labels with the classification layer, and obtains the corresponding labels with the Viterbi algorithm, thereby identifying the product name. The method greatly reduces labor cost and improves model accuracy and robustness.
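
The final decoding step named in the abstract, Viterbi decoding over per-word label scores, can be sketched in plain Python. The B/I/O label set, the hand-written scores, and the single penalized transition are assumptions for illustration; in a trained BiLSTM-CRF the emission scores come from the network and the transition scores are learned by the CRF layer.

```python
def viterbi(emissions, transitions, labels):
    """Return the highest-scoring label sequence.

    emissions: one dict per word mapping label -> score (log scale).
    transitions: dict mapping (previous_label, label) -> score.
    """
    if not emissions:
        return []
    # best[t][y] is the score of the best path that ends in label y at word t.
    best = [{y: emissions[0][y] for y in labels}]
    back = []
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in labels:
            prev = max(labels, key=lambda p: best[-1][p] + transitions[(p, cur)])
            scores[cur] = best[-1][prev] + transitions[(prev, cur)] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


labels = ["B", "I", "O"]  # assumed tag set: Begin / Inside / Outside a product name
# Assumed transition scores: everything is neutral except O -> I, which is penalized.
transitions = {(p, c): 0.0 for p in labels for c in labels}
transitions[("O", "I")] = -10.0

tokens = ["factory", "acaricide", "wholesale"]
emissions = [  # made-up scores, one dict per token
    {"B": -2.0, "I": -3.0, "O": -0.1},
    {"B": -1.0, "I": -0.8, "O": -2.5},
    {"B": -3.0, "I": -2.0, "O": -0.2},
]
tags = viterbi(emissions, transitions, labels)
product_name = " ".join(tok for tok, tag in zip(tokens, tags) if tag in ("B", "I"))
print(tags, product_name)
```

Taking the best label for each word independently would tag the middle word `I` directly after an `O`; the transition penalty steers the decoded sequence to a consistent one instead.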

Description

Technical field

[0001] The invention relates to the field of electronic commerce, in particular to a self-attention-based BiLSTM-CRF product name recognition method.

Background technique

[0002] In the e-commerce field, the product titles filled in by merchants contain a large number of descriptive words and redundant information. An example is "acaricide factory direct sales Yuexiu plant protection acaricide acaricide wholesale purchase", in which the product name should be acaricide and the redundant information is factory direct sales, Yuexiu plant protection, wholesale procurement, etc. These modifiers and redundant information make product name identification very difficult. At present, the main identification method is to establish a product name dictionary, find candidate product names by keyword matching, and select the most suitable product according to the product attributes provided by merchants. The disadvantage...

Application Information

IPC(8): G06F17/27, G06F16/33, G06F16/38, G06N3/08
CPC: G06N3/08, G06F40/242, G06F40/295
Inventor: 房海朔, 殷亚云
Owner: FOCUS TECH