Addressing class query word mining method and system

A technique for mining addressing-class query words, applied in the Internet field. It targets problems such as increased burden on search engines, false official website addresses, and wasted manpower, with the aim of improving search results and user satisfaction and reducing frequent repeated operations.

Active Publication Date: 2014-06-18
BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

AI Technical Summary

Problems solved by technology

However, in practice the official website address may not be ranked first and so fails to meet users' search needs, for example in the following cases:
[0003] 1. The official website address does not appear in first place in the search results;
[0004] 2. The official website address does not appear on the first page of results;
[0005] 3. The official website address is not indexed by search engines;


Examples


Embodiment

[0139] After filtering and classifying the clicked URLs according to the preset main domain URL format, the following main domain URLs are obtained:

[0140] http://www.mogujie.com

[0141] http://www.mogujie.com/

[0142] http://www.mogujie.com/index.html

[0143] http://www.mogujie.com/index.php

[0144] http://www.mogujie.com/default.html

[0145] http://www.mogujie.com/default.htm

[0146] Following the main domain URL format "http://domain name/", the main domain URLs above are normalized, yielding the main domain name www.mogujie.com.
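
The normalization step in this embodiment can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the function name `normalize_main_domain` and the `ROOT_PATHS` set are hypothetical, and the set of index-page paths is inferred only from the example URLs listed above.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical set of paths treated as equivalent to the site root.
# Inferred from the example URLs in the embodiment; the patent's actual
# "preset main domain URL format" may differ.
ROOT_PATHS = {"", "/", "/index.html", "/index.php", "/default.html", "/default.htm"}


def normalize_main_domain(url: str) -> Optional[str]:
    """Return the main domain name if url is a main domain URL, else None."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return None
    # Deep links and URLs with query strings are filtered out.
    if parts.path.lower() not in ROOT_PATHS or parts.query:
        return None
    return parts.hostname


clicked = [
    "http://www.mogujie.com",
    "http://www.mogujie.com/",
    "http://www.mogujie.com/index.html",
    "http://www.mogujie.com/index.php",
    "http://www.mogujie.com/default.html",
    "http://www.mogujie.com/default.htm",
]
domains = {normalize_main_domain(u) for u in clicked}
```

Under these assumptions all six example URLs collapse to the single key `www.mogujie.com`, while a deep link such as a product page would be discarded.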

[0147] Using the main domain name www.mogujie.com as the key, extract the key's query word set and count the number of times each query word in the set was queried. This yields the following 5 query words and their query counts: Mogujie (100), Mogujie official website (40), Mogujie official website (30), Mogujie address (10), Mogujie website (20), where 100, 40, 30, 10 and 20 correspond to th...
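
Grouping query words under a main domain name and counting how often each was queried might look like the sketch below. The record layout `(query word, main domain name)` and the sample rows are assumptions made for illustration; the actual click log format is not given in this excerpt.

```python
from collections import Counter, defaultdict

# Hypothetical click-log records: (query word, normalized main domain name).
click_log = [
    ("Mogujie", "www.mogujie.com"),
    ("Mogujie", "www.mogujie.com"),
    ("Mogujie address", "www.mogujie.com"),
    ("Taobao", "www.taobao.com"),
]


def build_query_sets(records):
    """Map each main domain name (the key) to a Counter of its query words."""
    query_sets = defaultdict(Counter)
    for query, domain in records:
        query_sets[domain][query] += 1
    return query_sets


query_sets = build_query_sets(click_log)
```

Each value in `query_sets` plays the role of the "query word set" for one key, with the query counts attached.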


Abstract

The invention provides a method and system for mining addressing-class query words. The method comprises: normalizing the primary domain URLs that share the same domain name in a user click log to generate the corresponding primary domain name, and generating a query word set for the primary domain name from the query words associated with those primary domain URLs; performing word segmentation on the query words in the query word set, counting the occurrence frequency of the resulting word segments, and determining the longest of the word segments with the highest occurrence frequency to be the core word of the primary domain name; and determining that, among the query words in the set that contain the core word, the one with the highest query frequency is the addressing-class query word of the primary domain name. With this technical scheme, a set of addressing-class query words can be mined and generated automatically, and the recall rate of addressing-class Bad Case mining is improved.
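
The core word and addressing query word selection described in the abstract can be illustrated with a short sketch. Several points are assumptions rather than details from the patent: whitespace splitting stands in for a real word segmenter, each distinct query word contributes once to a segment's occurrence frequency, and the function name and sample counts are hypothetical (loosely modeled on the Mogujie embodiment).

```python
from collections import Counter


def mine_addressing_query(query_counts):
    """Return (core word, addressing query word) for one primary domain name.

    query_counts maps each query word to its query frequency and must be
    non-empty. Whitespace splitting is a stand-in for word segmentation.
    """
    # Count in how many distinct query words each segment occurs (assumption).
    seg_freq = Counter()
    for query in query_counts:
        seg_freq.update(set(query.split()))

    # Core word: the longest segment among those with the highest frequency.
    top = max(seg_freq.values())
    core = max((s for s, f in seg_freq.items() if f == top), key=len)

    # Addressing query word: the most frequently queried query word
    # among those that contain the core word.
    candidates = {q: n for q, n in query_counts.items() if core in q.split()}
    best = max(candidates, key=candidates.get)
    return core, best


# Hypothetical counts for the key www.mogujie.com.
counts = {
    "Mogujie": 100,
    "Mogujie official website": 40,
    "Mogujie address": 10,
    "Mogujie website": 20,
}
core_word, addressing_query = mine_addressing_query(counts)
```

With these sample counts, the segment "Mogujie" occurs in all four query words, so it becomes the core word, and "Mogujie" is also the most frequently queried candidate.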

Description

[Technical field]

[0001] The invention relates to search technology in the Internet field, and in particular to a method and system for mining addressing-class query words.

[Background art]

[0002] Search engine query words can be divided into addressing query words, information query words and transaction query words. According to Andrei Broder's research, the proportions of these three types are 12.3%, 62% and 25.7%, respectively. Addressing query words are the query words a user enters when looking for the address of a particular website, for example Taobao.com, Mogujie, Ping An of China official website, etc. The user's search need for such query words is very clear: the user hopes to find the corresponding official website address. The search engine therefore needs to place that official website address near the top of the search results, for example in the first three positions. However, in actual situations, the following official website addresses may no...


Application Information

IPC(8): H04L29/12, G06F17/30
Inventor 阮星华
Owner BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD