Method and device for mining search logs, and page search method and device

A search log mining and page search technology, applied in the Internet field, that addresses the problem that search engines cannot identify the timeliness requirements of users' queries, so that users cannot quickly find the pages they want.

Active Publication Date: 2011-05-25
BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

AI Technical Summary

Problems solved by technology

[0003] However, existing search technology cannot identify the timeliness requirement of a query entered by a user. For example, a user may want information about an event that has just happened, but the search engine does not recognize this timeliness requirement. The returned search results are based only on the previous search history and are sorted according to the preset weight of each attribute, so the user may not be able to find the desired page quickly and accurately.
For example, a user who wants network information about a recent explosion in Hebei enters the query "Hebei explosion". Because the event has just occurred and few network resources cover it yet, pages about the recent explosion may be buried among the large number of pages about historical events related to Hebei explosions, and the user cannot quickly and accurately find the desired page in the search results.

Examples

Embodiment 1

[0075] Figure 1 is a flow chart of the search log mining method provided by the present invention. As shown in Figure 1, the method may include the following steps:

[0076] Step 101: Perform word segmentation processing on the query captured from the search log.

[0077] When capturing queries from the search log, one or any combination of the following capture strategies may be used:

[0078] Capture strategy 1: Capture the queries for which, among all pages the user clicked in the corresponding search results, the proportion of pages published within the most recent first time period exceeds a preset first proportion threshold. For example, assume that the most recent first time period is the past 2 days and the preset first proportion threshold is 50%; if the pages that the user clicked in the search results of a certain query and that were published within the past 2 days make up a proportion of the total clicked pages...
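
The first capture strategy can be illustrated with a minimal Python sketch. It assumes the search log has already been reduced to click records that carry the query and the publication time of the clicked page; the `Click` record, the function name `select_timely_queries`, and the default values (2 days, 50%, taken from the example above) are illustrative and are not defined by the patent text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Click:
    """One clicked search result (hypothetical log record layout)."""
    query: str
    published: datetime  # publication time of the clicked page


def select_timely_queries(clicks, now, window=timedelta(days=2), ratio_threshold=0.5):
    """Return the queries whose share of recently published clicked pages
    exceeds ratio_threshold (strictly greater, as in 'exceeds')."""
    total = {}   # query -> number of clicked pages
    recent = {}  # query -> number of clicked pages published within the window
    for click in clicks:
        total[click.query] = total.get(click.query, 0) + 1
        if timedelta(0) <= now - click.published <= window:
            recent[click.query] = recent.get(click.query, 0) + 1
    return {q for q, n in total.items() if recent.get(q, 0) / n > ratio_threshold}


now = datetime(2011, 5, 25)
log = [
    Click("Hebei explosion", now - timedelta(hours=5)),
    Click("Hebei explosion", now - timedelta(days=1)),
    Click("Hebei explosion", now - timedelta(days=400)),
    Click("weather", now - timedelta(hours=1)),
    Click("weather", now - timedelta(days=30)),
]
selected = select_timely_queries(log, now)
```

In this toy log, two of the three clicked pages for "Hebei explosion" are recent, so the query is captured; "weather" sits exactly at 50% and is not, because the sketch reads "exceeds" as a strict comparison.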

Embodiment 2

[0109] Figure 2 is a flow chart of the page search method provided by the present invention. As shown in Figure 2, the method may include the following steps:

[0110] Step 201: Perform word segmentation processing on the query input by the user.

[0111] Step 202: Using the combinations of the words and/or word attributes obtained after word segmentation, together with the distribution probability of each combination, derive the type corresponding to the query.

[0112] The processing of the user's input query in steps 201 to 202 is the same as the processing of the captured query in steps 101 to 102 and is not repeated here.

[0113] Step 203: Look up the timeliness probability table and determine the timeliness probability corresponding to the type derived in step 202.

[0114] Step 204: If the highest value of the determined timeliness probability exceeds the preset timeliness probability threshold, it is determined that the query meets the timeliness r...
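
Steps 203 and 204 can be sketched as a table lookup followed by a threshold comparison. The sketch below assumes that the timeliness probability table maps a type label to a single probability and that a query may be assigned several candidate types; the type labels, the function names, and the threshold value 0.6 are hypothetical and do not come from the patent text.

```python
def max_timeliness_probability(query_types, timeliness_table):
    """Step 203 (sketch): look up each candidate type and keep the highest value.
    Types missing from the table contribute nothing."""
    probabilities = [timeliness_table[t] for t in query_types if t in timeliness_table]
    return max(probabilities, default=0.0)


def has_timeliness_requirement(query_types, timeliness_table, threshold=0.6):
    """Step 204 (sketch): the query is treated as timely when the highest
    timeliness probability exceeds the preset threshold."""
    return max_timeliness_probability(query_types, timeliness_table) > threshold


# Hypothetical table produced by the log mining method of Embodiment 1.
table = {"place+event": 0.8, "person": 0.2}

timely = has_timeliness_requirement(["place+event", "person"], table)
not_timely = has_timeliness_requirement(["person"], table)
```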

Embodiment 3

[0127] Figure 3 is a structural diagram of the search log mining device provided by the embodiment of the present invention. As shown in Figure 3, the mining device may include: a capture unit 300, a first word segmentation unit 310, a first type determination unit 320, a screening unit 330 and a probability calculation unit 340.

[0128] The capture unit 300 is configured to capture queries from search logs.

[0129] The first word segmentation unit 310 is configured to perform word segmentation processing on the queries captured by the capture unit 300.

[0130] The word segmentation method adopted by the first word segmentation unit 310 may include, but is not limited to: a string-matching word segmentation method, a word-meaning (semantic) word segmentation method, and a statistical word segmentation method.
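
The text does not say which string-matching algorithm is used. As one common example of that family, the following Python sketch implements forward maximum matching against a word dictionary; the function name and the tiny dictionary are illustrative assumptions, not part of the patent.

```python
def forward_max_match(text, dictionary, max_len=None):
    """Segment text by repeatedly taking the longest dictionary word that
    starts at the current position; unknown characters become single tokens."""
    if max_len is None:
        max_len = max((len(word) for word in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a match is found.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical dictionary for the "Hebei explosion" query from the background section.
words = {"河北", "河北省", "爆炸"}
tokens = forward_max_match("河北爆炸", words)
```

A production segmenter would normally combine a much larger dictionary with statistical disambiguation; this sketch only shows the matching loop.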

[0131] The first type determination unit 320 is configured to use the combinations of the words and/or word attributes obtained after ...

Abstract

The invention provides a method and device for mining a search log, and a page search method and device. The search log mining method computes the timeliness probability of each type that queries correspond to, and these probabilities reflect the timeliness requirements of the queries. The page search method uses them to identify whether a query input by a user has a timeliness requirement and, when it does, optimizes the search results for that query by increasing the sorting weight of the time attribute. The user can therefore find the required page quickly and accurately, and the user's timeliness requirement for the search results is met.
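
How a larger time weight changes the ordering can be shown with a small Python sketch. It assumes, purely for illustration, that each page has a relevance score and a freshness score in the range 0 to 1 and that the final score is a linear blend of the two; the field names, the weights, and the function `rank_pages` are hypothetical, and the abstract does not specify the actual ranking formula.

```python
def rank_pages(pages, timely, time_weight=0.2, boosted_time_weight=0.6):
    """Sort pages by a blend of relevance and freshness; the time attribute
    gets a larger weight when the query has a timeliness requirement."""
    w = boosted_time_weight if timely else time_weight

    def score(page):
        return (1 - w) * page["relevance"] + w * page["freshness"]

    return sorted(pages, key=score, reverse=True)


pages = [
    {"url": "historical-report", "relevance": 0.9, "freshness": 0.1},
    {"url": "breaking-news", "relevance": 0.6, "freshness": 0.9},
]

default_order = [p["url"] for p in rank_pages(pages, timely=False)]
timely_order = [p["url"] for p in rank_pages(pages, timely=True)]
```

With the default weight the older but more relevant page ranks first; with the boosted weight the recently published page moves to the top.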

Description

【Technical field】

[0001] The invention belongs to the technical field of the Internet, and in particular relates to a search log mining method and device and a page search method and device.

【Background technique】

[0002] With the continuous development of Internet technology and the continuous growth of information, people's demand for network information keeps increasing, and search engines have become an important tool for obtaining it. After the user inputs a search term (query), the search engine usually returns the pages containing the search term to the user as search results.

[0003] However, existing search technology cannot identify the timeliness requirement of the query input by the user. For example, the user wants to obtain relevant information about an event that just happened, but the search engine does not recognize the user's timeliness requirement. The returned search results are...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 辜斯缪
Owner: BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD