Method and device for searching webpages according to sentence serial numbers

A web search technique based on sentence serial numbers, applied in electrical digital data processing, special data processing applications, instruments, etc. It addresses the problem that relevant webpages are not ranked near the top, and aims to improve response speed, reduce time complexity, and increase search satisfaction.

Inactive Publication Date: 2010-12-22
SHANGHAI LAISEEK CO LTD

AI Technical Summary

Problems solved by technology

However, the sentence distance between the key characters, keywords, or punctuation marks segmented from a search item, that is, the absolute value of the difference between the serial numbers of the sentences that contain them in a given webpage, cannot be obtained directly.
It can be seen that the existing search engines cannot guarant...


Examples

Embodiment Construction

[0040] The concept, specific structure, and technical effects of the present invention are further described below with reference to the accompanying drawings, so that its purpose, features, and effects can be fully understood.

[0041] As shown in Figure 1, the present invention discloses a method for searching webpages according to sentence serial numbers, comprising the following steps:

[0042] Step 101, obtaining several webpages and downloading them to the webpage database;

[0043] The search engine company uses a webpage fetcher to obtain several webpages from the Internet and downloads them to its own computers, that is, the webpage database.

[0044] Step 102, performing sentence segmentation on several webpages, and assigning serial numbers to the sentences of each webpage respectively;

[0045] First, the indexer scans each web page, segments words for each web page, and records the...
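Step 102 can be illustrated with a minimal Python sketch. It is not the patent's implementation: the set of sentence-ending punctuation marks, the choice to start serial numbers at 1, and the function name `number_sentences` are all assumptions made for illustration.

```python
import re

# Assumption: a sentence is a run of text ending in one of these marks.
# The source does not list the terminators it uses.
_SENTENCE = re.compile(r"[^.!?。！？]+[.!?。！？]*")


def number_sentences(page_text):
    """Split page text into sentences and pair each with a serial number.

    Serial numbers start at 1 here; the starting value is an assumption.
    """
    stripped = (m.strip() for m in _SENTENCE.findall(page_text))
    sentences = [s for s in stripped if s]  # drop whitespace-only fragments
    return list(enumerate(sentences, start=1))


if __name__ == "__main__":
    text = "Search engines index pages. Do they index sentences? Yes!"
    for serial, sentence in number_sentences(text):
        print(serial, sentence)
```

Each webpage is numbered independently, so serial numbers are only comparable within a single page.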

Abstract

The invention discloses a method and device for searching webpages according to sentence serial numbers. The method comprises the following steps: A, obtaining a plurality of webpages and downloading them to a webpage database; B, segmenting the webpages into sentences and assigning serial numbers to the sentences of each webpage; C, building a forward index table that includes the sentence serial numbers; D, building an inverted index table that includes the sentence serial numbers; E, receiving a search item and segmenting it into at least one key character, keyword, or punctuation mark; and F, calculating, according to the inverted index table, a ranking weight for each webpage containing the key character, keyword, or punctuation mark, and outputting the search results. With this method and device, the ranking weight of webpages in which the sentences containing the key characters, keywords, or punctuation marks are at zero or small distance from one another is increased, so that those webpages rank higher and user search satisfaction improves.
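Steps D to F can be sketched in Python as follows. This is a hedged illustration, not the patented method. The weight formula `1 / (1 + distance)` is hypothetical and chosen only because it grows as the sentence distance shrinks. For more than two query terms, the sketch uses the spread between the largest and smallest serial number, which is also an assumption. All function names and the input layout are invented for the example.

```python
from collections import defaultdict
from itertools import product


def build_inverted_index(pages):
    """Map term -> page_id -> ascending sentence serial numbers.

    `pages` maps a page id to a list of sentences, each sentence being a
    list of already-segmented terms (hypothetical input layout).
    """
    index = defaultdict(lambda: defaultdict(list))
    for page_id, sentences in pages.items():
        for serial, terms in enumerate(sentences, start=1):
            for term in set(terms):
                index[term][page_id].append(serial)
    return index


def min_sentence_distance(index, page_id, terms):
    """Smallest sentence distance over one occurrence of each term.

    For two terms this is the absolute difference of serial numbers.
    Returns None if any term is missing from the page.
    """
    postings = [index.get(t, {}).get(page_id) for t in terms]
    if not postings or any(not p for p in postings):
        return None
    return min(max(combo) - min(combo) for combo in product(*postings))


def rank(index, terms):
    """Order pages containing every term by a hypothetical weight."""
    if not terms:
        return []
    candidates = set.intersection(*(set(index.get(t, {})) for t in terms))
    scored = []
    for page_id in candidates:
        distance = min_sentence_distance(index, page_id, terms)
        scored.append((1.0 / (1 + distance), page_id))  # assumed formula
    return [p for _, p in sorted(scored, key=lambda x: (-x[0], x[1]))]


if __name__ == "__main__":
    pages = {
        "a": [["web", "search"], ["index"]],
        "b": [["web"], ["other"], ["search"]],
        "c": [["web"]],
    }
    idx = build_inverted_index(pages)
    print(rank(idx, ["web", "search"]))
```

In the sample data, page `a` has both terms in the same sentence (distance 0) and page `b` has them two sentences apart, so `a` is ranked first. Page `c` lacks one term and is excluded. The exhaustive search over occurrence combinations keeps the sketch short but would be too slow for long posting lists.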

Description

Technical field

[0001] The invention relates to the fields of information retrieval and natural language processing, in particular to a method and device for searching webpages according to sentence serial numbers.

Background

[0002] Existing mainstream search engines, such as Google, Yahoo, and Baidu, all search by key characters or keywords. The index structures of these search engines all necessarily include key characters or keywords.

[0003] At the Seventh World Wide Web Conference in 1998, Sergey Brin and Lawrence Page published a paper titled "The Anatomy of a Large-Scale Hypertextual Web Search Engine", which disclosed the index structure of the Google search engine. Both the forward index table and the inverted index table of the Google search engine include the position information of the first 4K characters, words, or punctuation marks in each webpage downloaded by the search engine.

[0004] The patent number is ZL01109132.0, and the invention patent titled "Metho...

Application Information

IPC(8): G06F17/30
Inventor 杜一华
Owner SHANGHAI LAISEEK CO LTD