
Method and device for predicting similar items and training model

A technology for items and models, applied in special data processing applications, instruments, commerce, and related fields. It addresses problems such as weak correlation, lost training time, and uninformative judgment results, and improves speed and accuracy.

Status: Inactive. Publication Date: 2018-12-04
上海宏原信息科技有限公司

AI Technical Summary

Problems solved by technology

[0003] Traditional natural language processing often uses the bag-of-words model to quantify a piece of text describing a thing. With this method, however, it is difficult for a computer to accurately calculate the semantic and grammatical similarity between two words or articles, and the data sparsity and curse of dimensionality of the bag-of-words model can lead to poor model performance.
In recent years, shallow neural networks have increasingly been used to learn low-dimensional, continuous word vectors directly from large amounts of text data. Word vectors can effectively express the semantics and grammar of a word, but they still have limitations when used to classify items and determine whether two items are similar. First, the item word vector still needs high dimensionality, yet most of the dimensions are useless and the truly useful features are hidden in a few dimensions, which costs a great deal of training time and makes the judgment results uninformative. Second, the correlation between item pairs is relatively weak: for example, the two product description texts "long skirt" and "short skirt" would be judged to be similar products.



Examples


Embodiment Construction

[0047] As shown in Figure 1, the method for training a model that predicts similar items in an embodiment of the present invention includes the following steps:

[0048] Step S1: Obtain the bag-of-words representation and the vector representation of each item, where the vector representation of an item is related to the item's attribute dimension values and their word vectors. For example, the word vectors are trained on a Chinese corpus with a language model, where the language model is at least one of the GloVe, Word2Vec, SENNA, and HLBL models. A hash table stores the attribute dimension values of items and their corresponding word vectors. The vector representation of an item can be expressed through the vector of the dimension value of each attribute of the item and the vector of the dimension values of all attributes of the item.
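Step S1 can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and not taken from the patent: the toy `WORD_VECTORS` stand in for vectors trained with Word2Vec or GloVe, the attribute names `name` and `color` are hypothetical, averaging token vectors per attribute value and concatenating the per-attribute vectors is only one plausible way to build the item vector, and a Python `dict` plays the role of the hash table.

```python
from collections import Counter

# Hypothetical toy word vectors (assumption). In practice they would be
# trained on a corpus with a model such as Word2Vec or GloVe.
WORD_VECTORS = {
    "long": [0.9, 0.1, 0.0],
    "short": [-0.8, 0.2, 0.0],
    "skirt": [0.1, 0.7, 0.3],
    "red": [0.0, 0.1, 0.9],
}
DIM = 3


def embed_value(value):
    """Average the word vectors of the known tokens in one attribute value."""
    tokens = [t for t in value.split() if t in WORD_VECTORS]
    if not tokens:
        return [0.0] * DIM
    return [sum(WORD_VECTORS[t][k] for t in tokens) / len(tokens)
            for k in range(DIM)]


def build_value_table(items):
    """Hash table mapping (attribute, dimension value) to its word vector."""
    table = {}
    for item in items:
        for attr, value in item.items():
            if (attr, value) not in table:
                table[(attr, value)] = embed_value(value)
    return table


def item_vector(item, table, attributes):
    """Concatenate per-attribute vectors in a fixed attribute order."""
    vec = []
    for attr in attributes:
        vec.extend(table.get((attr, item.get(attr)), [0.0] * DIM))
    return vec


def bag_of_words(item):
    """Count tokens over all attribute values of the item."""
    return Counter(tok for value in item.values() for tok in value.split())


items = [
    {"name": "long skirt", "color": "red"},
    {"name": "short skirt", "color": "red"},
]
table = build_value_table(items)
vectors = [item_vector(it, table, ["name", "color"]) for it in items]
bows = [bag_of_words(it) for it in items]
```

Because the table is keyed by attribute and value, the shared value `("color", "red")` is embedded once and reused by both items.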

[0049] Step S2: Obtain the vector representation of the item pair based on the distance feature of the item pair...
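The text of Step S2 is truncated in this excerpt, so the exact distance features are not known. The following is a minimal sketch that assumes three common choices: Jaccard distance over the bag-of-words representations, plus cosine and Euclidean distance over the item vectors. The function names and the choice of distances are illustrative assumptions, not the patent's definitions.

```python
import math
from collections import Counter


def jaccard_distance(bow_a, bow_b):
    """1 minus the overlap ratio of the two token sets."""
    keys_a, keys_b = set(bow_a), set(bow_b)
    union = keys_a | keys_b
    if not union:
        return 0.0
    return 1.0 - len(keys_a & keys_b) / len(union)


def cosine_distance(u, v):
    """1 minus cosine similarity; defined as 1.0 if either vector is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0.0:
        return 1.0
    return 1.0 - dot / norm


def euclidean_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pair_features(bow_a, vec_a, bow_b, vec_b):
    """Item-pair vector built from distance features (assumed feature set)."""
    return [
        jaccard_distance(bow_a, bow_b),
        cosine_distance(vec_a, vec_b),
        euclidean_distance(vec_a, vec_b),
    ]


features = pair_features(
    Counter(["long", "skirt"]), [1.0, 0.0],
    Counter(["short", "skirt"]), [0.0, 1.0],
)
```

Whatever the original item vector's size, the pair is reduced to a handful of distance values, which is consistent with the abstract's remark that the classifier's input data is compressed.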


Abstract

The invention discloses a method and a device for predicting similar items and training a model. The method for training a model that predicts similar items comprises the following steps: acquiring the bag-of-words representation and the vector representation of items, wherein the vector representation of an item is related to the attribute dimension values of the item and their word vectors; obtaining the vector representation of an item pair based on the distance features of the item pair, wherein the distance features of the item pair are related to the bag-of-words representation and the vector representation of the items; and training a classification model on the combined vector representations of similar item pairs and of randomly sampled item pairs. In the invention, the vector representation of the item pair is obtained from the vector representations of the items and is used to train the classification model. The input data of the classification model is thereby compressed severalfold, which improves training speed and the accuracy of the training result. The method is suitable for various items, such as commodities, cargo, and products.
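The training step in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the abstract only says "classification model", so the small logistic regression here is a stand-in, the helper names are hypothetical, the randomly sampled pairs are treated as presumed-dissimilar examples, and the feature values are made-up distance features.

```python
import math
import random


def sample_random_pairs(n_items, n_pairs, known_similar, seed=0):
    """Draw distinct random index pairs (i < j) outside the known similar set."""
    available = n_items * (n_items - 1) // 2 - len(known_similar)
    if n_pairs > available:
        raise ValueError("not enough candidate pairs to sample from")
    rng = random.Random(seed)
    pairs = set()
    while len(pairs) < n_pairs:
        i, j = sorted(rng.sample(range(n_items), 2))
        if (i, j) not in known_similar:
            pairs.add((i, j))
    return sorted(pairs)


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to keep math.exp well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, lr=0.5, epochs=500):
    """Plain stochastic gradient descent on the logistic loss."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def predict_proba(x, weights, bias):
    """Estimated probability that the pair is similar."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Made-up distance features: similar pairs (label 1) have small distances,
# randomly sampled pairs (label 0) have large ones.
similar = [[0.1, 0.05], [0.2, 0.1], [0.0, 0.0]]
sampled = [[0.9, 1.0], [0.8, 0.7], [1.0, 0.9]]
weights, bias = train_logistic(similar + sampled, [1, 1, 1, 0, 0, 0])
random_pairs = sample_random_pairs(5, 3, {(0, 1)})
```

Random pairs are cheap negative examples, but a few of them may in fact be similar, so the labels carry some noise.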

Description

Technical field

[0001] The present invention relates to the field of computers, in particular to machine learning, and specifically to a method and device for predicting similar items and training the corresponding model.

Background technique

[0002] A similar item pair consists of two items that are similar across dozens of dimensions such as appearance, price, and use. Different dimensions affect similarity to different degrees, and different categories within the same dimension also affect similarity to different degrees.

[0003] At present, traditional natural language processing often uses a bag-of-words model to quantify a piece of text that describes a thing. With this method, however, it is difficult for a computer to accurately calculate the semantic and grammatical similarity of two words or articles, and the data sparsity and curse of dimensionality of the bag-of-words model degrade model performance. In recent years, shallow neural networks have been gr...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06Q30/06, G06F17/30
CPC: G06Q30/0629
Inventors: 杨骏, 史建明, 李杰
Owner: 上海宏原信息科技有限公司