Cross-platform commodity matching method and system based on natural language processing

A cross-platform commodity matching method and system based on natural language processing, applied in the field of cross-platform product matching. It addresses problems in existing methods such as the loss of semantic and syntactic information, conversion errors, and large errors in the accuracy and data integrity of matched products.

Pending Publication Date: 2021-06-04
时代涌现信息科技(南京)有限公司

AI Technical Summary

Problems solved by technology

[0004] The first method has the following disadvantage: a product title is a short text, and converting it into structured data loses some semantic and syntactic information; the conversion methods also perform relatively poorly and introduce further errors. The second method matches product titles directly on the platform, or matches product titles and product attributes together, but most such approaches obtain the final result with various similarity calculations and do not take semantic information into account. As a result, the products matched by the existing methods show large errors in accuracy and data integrity.



Examples


Embodiment

[0093] Step 1. Obtain the product data of each platform to be matched, and perform data preprocessing;

[0094] In each platform's data storage module, product information is stored in different databases and in different forms. First, the data stored on a platform in unstructured form is converted into structured data, from which product title information and product attribute information are obtained;

[0095] Then the original commodity data obtained from each platform is unified in format, field, and form; duplicate commodities within the same platform are removed; incomplete information (commodity category, discount information) is filled in; noisy data is discarded; and so on.
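The preprocessing in Step 1 could be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the field aliases, the `"unknown"` placeholder, and the rule that a record without a title counts as noise are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical mapping from platform-specific field names to one unified schema.
FIELD_ALIASES = {
    "name": "title",
    "product_title": "title",
    "cat": "category",
    "promo": "discount",
}


def normalize(record: Dict, platform: str) -> Dict:
    """Rename fields to the unified schema and collapse stray whitespace."""
    out = {"platform": platform}
    for key, value in record.items():
        field = FIELD_ALIASES.get(key, key)
        out[field] = " ".join(value.split()) if isinstance(value, str) else value
    return out


def preprocess(records: List[Dict], platform: str) -> List[Dict]:
    """Unify, deduplicate, fill, and filter the records of one platform."""
    seen = set()
    cleaned = []
    for raw in records:
        rec = normalize(raw, platform)
        title = rec.get("title")
        if not title:
            continue  # assumption: a record without a title is noise
        key = title.lower()
        if key in seen:
            continue  # duplicate commodity within the same platform
        seen.add(key)
        # Assumption: missing fields are filled with simple placeholders.
        if not rec.get("category"):
            rec["category"] = "unknown"
        if not rec.get("discount"):
            rec["discount"] = ""
        cleaned.append(rec)
    return cleaned
```

In practice the fill-in step would more likely infer a category from other fields than use a fixed placeholder; the source does not specify how.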

[0096] Step 2. Obtain the title feature vector and attribute feature vector of each commodity according to the preprocessed data;

[0097] In order to achieve product matching on different platforms, it is necessary to use the title information and other attribute information...
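To make Step 2 concrete, the sketch below turns a title and an attribute list into fixed-length vectors. The excerpt above is truncated and does not say which model produces these vectors, so the hashed bag-of-tokens encoding here is only a stand-in: it carries no semantic information, and a trained text encoder would take its place in a semantic matcher. The names `DIM`, `title_vector`, and `attribute_vector` are hypothetical.

```python
import hashlib
import re
from typing import Dict, List

DIM = 64  # hypothetical vector length


def _bucket(token: str, dim: int = DIM) -> int:
    """Map a token to a stable index (md5 is used so runs are reproducible)."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % dim


def title_vector(title: str, dim: int = DIM) -> List[float]:
    """Stand-in title encoder: count hashed word tokens of the title."""
    vec = [0.0] * dim
    for token in re.findall(r"\w+", title.lower()):
        vec[_bucket(token, dim)] += 1.0
    return vec


def attribute_vector(attributes: Dict[str, str], dim: int = DIM) -> List[float]:
    """Stand-in attribute encoder: count hashed name=value pairs."""
    vec = [0.0] * dim
    for name, value in attributes.items():
        vec[_bucket(f"{name}={str(value).lower()}", dim)] += 1.0
    return vec
```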


Abstract

The invention provides a cross-platform commodity matching method based on natural language processing and relates to the technical field of cross-platform commodity matching. The method comprises the steps of: obtaining the commodity data of each platform to be matched and preprocessing the data; obtaining a title feature vector and an attribute feature vector for each commodity from the preprocessed data; integrating the title feature vector and the attribute feature vector of each commodity to obtain a total feature vector for that commodity; grouping all commodities by platform and dividing the commodities of each platform into several subsets according to a unified classification rule, yielding several subsets of total feature vectors per platform; and calculating the similarity between commodities in the corresponding subsets of different platforms, and obtaining the matched commodities across platforms from the calculation result. The method can achieve low-cost, high-precision matching of large-scale, multi-channel commodities.
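The last three steps of the abstract (integrating the vectors, partitioning by a unified classification rule, and comparing corresponding subsets) could look roughly like the sketch below. It rests on several assumptions not stated in the source: integration is done by plain concatenation, the classification rule is a shared `category` key, similarity is cosine similarity, and a match needs a score of at least a hypothetical `threshold`.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple


def total_vector(title_vec: List[float], attr_vec: List[float]) -> List[float]:
    """Assumption: integrate the two vectors by concatenation."""
    return list(title_vec) + list(attr_vec)


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def partition(items: List[Dict]) -> Dict[str, List[Dict]]:
    """Split one platform's commodities into subsets by category."""
    subsets = defaultdict(list)
    for item in items:
        subsets[item["category"]].append(item)
    return subsets


def match(platform_a: List[Dict], platform_b: List[Dict],
          threshold: float = 0.8) -> List[Tuple[str, str, float]]:
    """Pair each commodity in A with its best match in B's same-category subset."""
    subsets_b = partition(platform_b)
    matches = []
    for category, items_a in partition(platform_a).items():
        for a in items_a:
            best, best_score = None, threshold
            for b in subsets_b.get(category, []):
                score = cosine(a["vector"], b["vector"])
                if score >= best_score:
                    best, best_score = b, score
            if best is not None:
                matches.append((a["id"], best["id"], best_score))
    return matches
```

Comparing only within corresponding subsets keeps the number of pairwise similarity computations far below a full cross-platform comparison, which is consistent with the low-cost, large-scale matching the abstract describes.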

Description

Technical field

[0001] The invention relates to the technical field of cross-platform commodity matching, in particular to a cross-platform commodity matching method and system based on natural language processing.

Background

[0002] Currently, there are many methods for product matching, and different forms of product description information generally require different matching methods. A product description includes the product title and a list of detailed information (attributes) about the product. Product descriptions are also stored in a variety of forms: they may be stored in the platform's database as structured data, or exist on the platform as unstructured data (such as text or web pages). For commodities stored in a database, the matching problem is mainly one of matching individual commodities within the database, but because of the different design patterns adopted by different databases, the matching problem...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/295, G06F40/242, G06F16/25, G06K9/62, G06N3/04, G06N3/08, G06Q30/06
CPC: G06F40/295, G06F40/242, G06F16/254, G06Q30/0601, G06N3/04, G06N3/08, G06F18/2411, G06F18/22
Inventor: 蒋哲宇, 考文鹏
Owner: 时代涌现信息科技(南京)有限公司