Search engine method and system sensitive to geographical position

A technology in the field of geographic location and search engines, applied to search engine methods and systems and to web page retrieval that considers both the geographic location information and the link relationships of web pages. It addresses problems such as the lack of consideration of network structure and the inability to filter out spam web pages effectively.

Inactive Publication Date: 2014-03-26
PEKING UNIV
5 Cites, 19 Cited by

AI Technical Summary

Problems solved by technology

[0003] The traditional web page ranking method mainly adopts the PageRank algorithm (Page L, Brin S, Motwani R, et al. The PageRank citation ranking: bringing order to the web [J]. 1999), which calculates a ranking score for each web page based on the link relationships between web pages. Weighted by topic, these scores return satisfactory results for general topic-related queries, but they cannot rank web pages according to the geographic relevance between search terms and web pages. Bruno Martins et al., in research on geographic information retrieval (Martins B, Calado P. Learning to rank for geographic information retrieval[C]//Proceedings of the 6th Workshop on Geographic Inform
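To make the link-based scoring mentioned above concrete, the following is a minimal sketch of PageRank by power iteration. The four-page link graph, the damping factor of 0.85, and the iteration count are illustrative assumptions; they are not taken from the patent or from Figure 5.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Return a dict of PageRank scores for a graph given as {page: [outgoing pages]}."""
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives the same base share from the random-jump term.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page splits its current rank evenly across its out-links.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no out-links spreads its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical link structure, chosen only for illustration.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
scores = pagerank(links)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 4))
```

The scores depend only on the link graph, which is why a purely link-based ranking cannot distinguish pages by how close their content is to the place named in a query.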



Examples


Embodiment 1

[0085] Assume that a network contains four web pages A, B, C, and D, linked to one another as shown in Figure 5, where the direction of each arrow indicates the direction of the link. Each web page contains a different amount of geographic information. The user's query is "near X University". The algorithm retrieves and ranks the four web pages and returns the one that best matches the query.

[0086] Before the online query, the four web pages are processed to calculate their relevance to the geographical hotspots. The steps are as follows:

[0087] 1. Select some geographical hotspots. Since the number of experimental web pages is small, two geographical hotspots are selected: ip_1 (134, 229) and ip_2 (818, 551);

[0088] 2. Determine the geographical scope of each of the four web pages. The point set of each web page and the frequency with which each point occurs are:

[0089] A point set: {(448,117),(6...
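The offline step above can be sketched as follows. The available text does not give the relevance formula, so this sketch assumes a frequency-weighted inverse-distance measure. The function name `hotspot_relevance`, the `scale` parameter, and the example point set are hypothetical. Only the two hotspot coordinates come from step 1.

```python
import math

# Hotspot coordinates from step 1 of the embodiment.
HOTSPOTS = {"ip_1": (134, 229), "ip_2": (818, 551)}


def hotspot_relevance(points, hotspot, scale=100.0):
    """Assumed measure: frequency-weighted inverse distance, in the range (0, 1].

    points maps each (x, y) location mentioned on a page to its occurrence count.
    """
    total = sum(points.values())
    if total == 0:
        return 0.0
    score = 0.0
    for (x, y), freq in points.items():
        distance = math.hypot(x - hotspot[0], y - hotspot[1])
        # Points that occur more often and lie closer to the hotspot count more.
        score += (freq / total) / (1.0 + distance / scale)
    return score


# Hypothetical point set with frequencies; not the data of web page A.
page_points = {(150, 240): 3, (800, 500): 1}
relevance = {name: hotspot_relevance(page_points, xy) for name, xy in HOTSPOTS.items()}
print(relevance)
```

With this assumed measure, a page whose locations cluster around a hotspot scores close to 1 for that hotspot, and a page with distant locations scores close to 0.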



Abstract

The invention provides a web page retrieval method sensitive to geographical position, together with a search engine method and a search engine system. First, in an offline stage, a cloud server calculates the geographical relevance of each web page to a set of selected geographical hotspots and, combining this with the network link structure obtained by a grid crawling unit, calculates an importance score of each web page for each geographical hotspot. The scores are recorded as fields in the metadata of the corresponding web page, and the metadata of every web page is stored in a spatial database on the server. When a user submits a query online, the server determines the geographical range of the query statement through natural language processing, calculates the geographical relevance of the query statement to each geographical hotspot according to the distance between them, retrieves the web pages' scores for the corresponding hotspots from the spatial database, calculates each web page's score for that specific query online, sorts the results in descending order, and outputs the retrieval result on the user side.
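The online stage described in the abstract can be illustrated with a short sketch. Several parts are assumptions: the query is taken to be already resolved to a single coordinate (the natural language processing step is not modeled), the query-to-hotspot relevance is an inverse-distance function, and the final page score is a relevance-weighted sum of the stored per-hotspot scores. The metadata values are hypothetical. Only the hotspot coordinates come from Embodiment 1.

```python
import math

# Hotspot coordinates from Embodiment 1.
HOTSPOTS = {"ip_1": (134, 229), "ip_2": (818, 551)}

# Hypothetical per-hotspot scores, standing in for the metadata fields
# that the offline stage would store in the spatial database.
PAGE_METADATA = {
    "A": {"ip_1": 0.40, "ip_2": 0.05},
    "B": {"ip_1": 0.10, "ip_2": 0.30},
    "C": {"ip_1": 0.25, "ip_2": 0.25},
    "D": {"ip_1": 0.05, "ip_2": 0.10},
}


def query_relevance(query_xy, scale=100.0):
    """Assumed relevance of the query location to each hotspot (inverse distance)."""
    return {
        name: 1.0 / (1.0 + math.dist(query_xy, xy) / scale)
        for name, xy in HOTSPOTS.items()
    }


def rank_pages(query_xy, metadata):
    """Score each page as a weighted sum over hotspots, then sort in descending order."""
    weights = query_relevance(query_xy)
    scores = {
        page: sum(weights[hotspot] * value for hotspot, value in fields.items())
        for page, fields in metadata.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# A query location close to ip_1 favors pages with a high stored ip_1 score.
for page, score in rank_pages((140, 230), PAGE_METADATA):
    print(page, round(score, 4))
```

Because the per-hotspot scores are precomputed, the online work in this sketch is limited to a few distance calculations and one weighted sum per page, which matches the offline and online split described above.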

Description

Technical field

[0001] The invention provides a search engine method and system, in particular a web page retrieval method that considers both the geographic location information and the link relationships of web pages, together with a corresponding search engine system. It belongs to the field of geographic information retrieval.

Background

[0002] With the development of information technology, the Internet has become an important source of data. In recent years, the popularization of cloud technology has not only solved the problem of data sharing, but also brought severe challenges to information mining and knowledge discovery. In the era of big data, how to effectively mine highly relevant and highly reliable data is particularly important. According to the research of Mark Sanderson et al. (Sanderson M, Kohler J. Analyzing geographic queries[C]//SIGIR Workshop on Geographic Information Retrieval. 2004, 2), 15%-19% of web search queries are geographically related, bas...


Application Information

IPC(8): G06F17/30
CPC: G06F16/29, G06F16/9537
Inventor 姜丹高勇李浩然刘家骏郭潇程静
Owner PEKING UNIV