
A purchase word clustering method and device

A clustering method and device for purchase words, applied in the field of computer technology. It addresses the problems that purchase words are short texts containing few words, and that existing short text clustering methods yield only a limited improvement in clustering accuracy.

Active Publication Date: 2015-11-11
TENCENT TECH (SHENZHEN) CO LTD
Cites: 3. Cited by: 2.

AI Technical Summary

Problems solved by technology

However, this common clustering method is not very effective on short texts, because short texts suffer from sparsity: each text is short and contains few words, one word can have multiple meanings, and one meaning can be expressed by several different words.
[0008] However, even with these existing short text clustering methods, the improvement in clustering accuracy is still limited.
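
The sparsity problem described above can be illustrated with a minimal sketch. It assumes a plain bag-of-words representation and cosine similarity, neither of which is named in this excerpt; the example phrases and the function names `bag_of_words` and `cosine` are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Count the words in a lowercased, whitespace-tokenized text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical purchase words (assumption, not taken from the patent).
# Same meaning, different words: no shared token, so the similarity is zero.
sim_synonyms = cosine(bag_of_words("cheap flights"),
                      bag_of_words("low-cost airfare"))

# Shared word, possibly different meanings (brand vs. fruit): one shared
# token out of two gives a similarity of about 0.5.
sim_polysemy = cosine(bag_of_words("apple price"),
                      bag_of_words("apple juice"))

print(sim_synonyms, sim_polysemy)
```

With only two or three words per text, a single shared or missing word dominates the score, which is why word-based similarity is unreliable for purchase words.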



Embodiment Construction

[0023] Figure 1 is a flow chart of the purchase word clustering method provided by the present invention.

[0024] As shown in Figure 1, the method includes:

[0025] Step 101: for each purchased word, an advertiser vector is established according to which advertisers have purchased the word and how many times each advertiser has purchased it.

[0026] Specifically, according to the purchased words bought by each advertiser and the number of times each purchased word was bought, an advertiser vector is established for each purchased word. Each advertiser feature in the advertiser vector corresponds to an advertiser that has purchased the word, and the weight of each advertiser feature is determined according to the number of purchases.
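
Step 101 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patent's implementation: the input is assumed to be a list of (advertiser, purchased word) records, the raw purchase count is used directly as the feature weight (the text only says the weight is determined according to the count), and the name `build_advertiser_vectors` and all sample data are hypothetical.

```python
from collections import defaultdict


def build_advertiser_vectors(purchase_records):
    """Map each purchased word to a sparse vector {advertiser: weight}.

    Assumption: the weight is the raw number of times the advertiser
    purchased the word.
    """
    vectors = defaultdict(lambda: defaultdict(int))
    for advertiser, word in purchase_records:
        vectors[word][advertiser] += 1
    # Convert nested defaultdicts to plain dicts for a stable result.
    return {word: dict(counts) for word, counts in vectors.items()}


# Hypothetical purchase log: one (advertiser, purchased word) pair per purchase.
records = [
    ("adv_a", "running shoes"),
    ("adv_a", "running shoes"),
    ("adv_a", "sneakers"),
    ("adv_b", "sneakers"),
    ("adv_c", "car insurance"),
]

vectors = build_advertiser_vectors(records)
print(vectors)
```

Only advertisers that actually purchased a word appear in its vector, so the representation stays sparse even when the total number of advertisers is large.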

[0027] Step 102: the purchased words are clustered according to the advertiser vector of each purchased word.

[0028] If two purchased words are purchased by the same one or more adve...
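
Step 102 might look like the sketch below. The excerpt does not name a similarity measure or a clustering algorithm, so both choices here are assumptions: cosine similarity over advertiser vectors, and a simple single-link grouping in which any two words whose similarity reaches a threshold end up in the same cluster. The names `cosine` and `cluster_words`, the threshold value, and the sample vectors are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse {advertiser: weight} vectors."""
    dot = sum(w * b[k] for k, w in a.items() if k in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_words(vectors, threshold=0.5):
    """Single-link grouping (an assumed algorithm, not from the patent)."""
    words = sorted(vectors)
    parent = {w: w for w in words}

    def find(w):
        # Follow parent links to the cluster representative.
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    # Merge every pair of words that is similar enough.
    for i, u in enumerate(words):
        for v in words[i + 1:]:
            if cosine(vectors[u], vectors[v]) >= threshold:
                parent[find(u)] = find(v)

    clusters = {}
    for w in words:
        clusters.setdefault(find(w), []).append(w)
    return sorted(clusters.values())


# Hypothetical advertiser vectors: {purchased word: {advertiser: count}}.
vectors = {
    "running shoes": {"adv_a": 2},
    "sneakers": {"adv_a": 1, "adv_b": 1},
    "car insurance": {"adv_c": 1},
}

clusters = cluster_words(vectors, threshold=0.5)
print(clusters)
```

In this toy data, "running shoes" and "sneakers" share an advertiser and are grouped together even though they share no word, while "car insurance" stays on its own.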


Abstract

The invention discloses a clustering method and device for purchase words. In the method, an advertiser vector is established for each purchase word according to the purchase words bought by each advertiser and the number of times each purchase word was bought. Each advertiser feature in the advertiser vector corresponds to an advertiser that has purchased the purchase word, and the weight of each advertiser feature is determined according to the number of purchases. The purchase words are then clustered according to the advertiser vector of each purchase word. The method and device can improve the accuracy of purchase word clustering.

Description

Technical field

[0001] The present invention relates to the field of computer technology, in particular to a purchase word clustering method and device.

Background technique

[0002] Text clustering is a technique that divides a group of texts into several subsets according to the relationships between the texts, such that texts within a subset are close to one another and the subsets are relatively far apart. In essence, it finds the different data patterns hidden in the data and thereby achieves a blind (unsupervised) classification of the sample space.

[0003] Purchase words are the text content that users submit for bidding in systems such as bidding advertising. They have an average length of 3-5 words and can therefore be regarded as short texts, so purchase word clustering can be abstracted as the process of clustering a collection of short texts.

[0004] The existing text clustering methods are mainly ba...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/30, G06F17/27
Inventor: 杨俊丽王迪赫南
Owner: TENCENT TECH (SHENZHEN) CO LTD