Commodity query keyword automatic generation method based on OCR

An automatic keyword generation technology, applied in the field of information retrieval, that addresses problems affecting the user experience

Active Publication Date: 2016-11-09
WUHAN UNIV
Cites: 8; Cited by: 12

AI Technical Summary

Problems solved by technology

However, the text recognized by OCR contains considerable noise and some useless information.
If this text is not analyzed further, the results are likely to degrade the user's experience.


Examples

Embodiment Construction

[0057] To help those of ordinary skill in the art understand and implement the present invention, it is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the embodiments described here serve only to illustrate and explain the present invention and are not intended to limit it.

[0058] Referring to Figure 1, the present invention provides an OCR-based method for automatically generating commodity query keywords, characterized in that: first, a product name list, word list, word co-occurrence table and brand scoring table are built for all commodities, all word lists are combined to form a commodity category scoring table, and all tables are stored in a database; then, product query keywords are generated automatically based on the commodity category scoring table.
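The table-building stage described in [0058] can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the toy catalogue, the `build_tables` helper, and the use of raw frequency counts as "scores" are all assumptions, and in-memory dictionaries stand in for the database.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy catalogue of (product name, category, brand); not from the patent.
CATALOGUE = [
    ("acme green tea 500ml", "beverage", "acme"),
    ("acme black tea 500ml", "beverage", "acme"),
    ("zenith oolong tea 330ml", "beverage", "zenith"),
    ("acme mint shampoo 400ml", "personal care", "acme"),
]

def build_tables(catalogue):
    """Build per-category word, word co-occurrence and brand tables.

    Assumption: plain frequency counts serve as scores here; the patent's
    actual scoring rules are not given in this excerpt.
    """
    word_table = defaultdict(Counter)   # category -> word -> frequency
    cooc_table = defaultdict(Counter)   # category -> (word_a, word_b) -> frequency
    brand_table = defaultdict(Counter)  # category -> brand -> frequency
    for name, category, brand in catalogue:
        # Sort the distinct words so each pair has one canonical key order.
        words = sorted(set(name.split()))
        word_table[category].update(words)
        cooc_table[category].update(combinations(words, 2))
        brand_table[category][brand] += 1
    return word_table, cooc_table, brand_table

word_table, cooc_table, brand_table = build_tables(CATALOGUE)
```

Keeping one set of tables per category means a new product only touches the counters of its own category, which is one plausible reading of why the database is described as convenient to update.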

[0059] Construct the product name list, word list, word co-occurrence table and br...


Abstract

The invention discloses an OCR (Optical Character Recognition) based method for automatically generating commodity query keywords, comprising the following steps: establishing a commodity information database; extracting text from a product packaging image using OCR to obtain a word group containing product information; correcting wrongly recognized characters by calculating the similarity between the word group and the words in the database, thereby standardizing the word group; taking the commodity category with the highest score under the scoring rules as the category of the product represented by the word group; selecting the word co-occurrence table for that commodity category and calculating the co-occurrence score of each word in the word group to filter out useless words; and finally, selecting the brand with the highest score, according to the brand score table of the commodity category and the scoring rules, as the brand name of the product, and taking the brand name together with the filtered word group as the commodity query keywords for users to retrieve. The method is computationally efficient, the database is easy to update, and the accuracy of users' commodity information queries is greatly improved.
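The steps listed in the abstract can be sketched end to end as below. This is a hedged illustration under several assumptions: the OCR output is taken as an already-extracted token list, `difflib` string similarity stands in for the patent's unspecified similarity measure, frequency sums stand in for its scoring rules, and the tables, thresholds and function names are hypothetical.

```python
import difflib
from collections import Counter

# Hypothetical in-memory tables; the patent stores such tables in a database.
WORD_TABLE = {
    "beverage": Counter({"tea": 3, "green": 1, "black": 1, "500ml": 2, "acme": 2, "zenith": 1}),
    "personal care": Counter({"shampoo": 1, "mint": 1, "400ml": 1, "acme": 1}),
}
COOC_TABLE = {
    "beverage": Counter({("green", "tea"): 1, ("acme", "tea"): 2, ("acme", "green"): 1}),
    "personal care": Counter({("mint", "shampoo"): 1}),
}
BRAND_TABLE = {
    "beverage": Counter({"acme": 2, "zenith": 1}),
    "personal care": Counter({"acme": 1}),
}

def normalize(tokens, vocabulary, cutoff=0.75):
    """Replace each OCR token with its closest vocabulary word, if similar enough."""
    fixed = []
    for tok in tokens:
        match = difflib.get_close_matches(tok.lower(), vocabulary, n=1, cutoff=cutoff)
        fixed.append(match[0] if match else tok.lower())
    return fixed

def best_category(words):
    """Pick the category whose word table gives the word group the highest total."""
    return max(WORD_TABLE, key=lambda c: sum(WORD_TABLE[c][w] for w in words))

def filter_words(words, category, min_score=1):
    """Keep words that co-occur with at least one other word of the group."""
    cooc = COOC_TABLE[category]
    kept = []
    for w in words:
        score = sum(cooc[tuple(sorted((w, o)))] for o in words if o != w)
        if score >= min_score:
            kept.append(w)
    return kept

def best_brand(words, category):
    """Among words that are known brands in the category, pick the highest scoring."""
    brands = BRAND_TABLE[category]
    candidates = [w for w in words if w in brands]
    return max(candidates, key=lambda b: brands[b]) if candidates else None

def generate_keywords(ocr_tokens):
    vocabulary = sorted({w for table in WORD_TABLE.values() for w in table})
    words = normalize(ocr_tokens, vocabulary)
    category = best_category(words)
    kept = filter_words(words, category)
    brand = best_brand(words, category)
    keywords = ([brand] if brand else []) + [w for w in kept if w != brand]
    return category, keywords

print(generate_keywords(["Acme", "Gre3n", "Tea", "recycle"]))
# → ('beverage', ['acme', 'green', 'tea'])
```

In this toy run the misread token "Gre3n" is corrected to "green", and "recycle" is dropped because it never co-occurs with the other words in the chosen category.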

Description

Technical field

[0001] The invention belongs to the technical field of information retrieval, and in particular relates to an OCR-based method for automatically generating commodity keywords.

Background technique

[0002] The Internet and handheld smart terminals have developed explosively over the past 10 years, greatly enriching people's access to information and changing their lifestyles. More and more people choose to shop through e-commerce. With the detailed product information on e-commerce websites and other buyers' reviews, people can make better shopping choices. However, when shoppers are in shopping malls, bookstores and other physical places, looking up specific information about a commodity becomes more difficult. The usual practice is to read the product packaging, manually pick out likely keywords, and then enter them into a search engine. But t...

Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06K9/32, G06K9/34, G06K9/72, G06Q30/06
CPC: G06Q30/0625, G06V20/62, G06V30/153, G06V10/768, G06V30/10
Inventor: 黄浩钟林杌李宗鹏颜钱
Owner: WUHAN UNIV