Method and device for establishing index

An index-building technique in the field of information retrieval. It addresses two problems: too many index segments reduce retrieval efficiency, while overly large index segments reduce index update efficiency. Its effects are a larger number of documents per index segment, improved retrieval efficiency, and a reduced frequency of writes to the second storage area.

Active Publication Date: 2011-06-08
NEW FOUNDER HLDG DEV LLC +2

AI Technical Summary

Problems solved by technology

If an index contains too many index segments, more index files must be opened, read, and processed during retrieval, which reduces retrieval efficiency.
Conversely, if an index contains too few index segments, each segment is large, so an index update requires multiple write operations to disk, which reduces index update efficiency.
[0008] In summary, the number of index segments directly affects the efficiency of both retrieval and index updates: too many segments slow retrieval, and too few segments slow index updates.

Embodiment Construction

[0035] To improve retrieval efficiency without reducing index update efficiency, the embodiments of the present invention provide an index establishment method and device. The preferred embodiments are described below with reference to the accompanying drawings. It should be understood that the preferred embodiments described here serve only to illustrate and explain the present invention, not to limit it. Where no conflict arises, the embodiments in the present application and the features of those embodiments may be combined with one another.

[0036] In the embodiments of the present invention, adding one level of merging in the first storage area reduces the frequency of writes to the second storage area, increases the number of documents in the final index segment, and reduces the number of index segments. On the premise of affecting the retrieval per...
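The effect of writing larger segments less often can be illustrated with simple arithmetic. The sketch below is only an illustration: the workload of 1200 documents and the flush sizes of 10 and 100 documents are hypothetical numbers that do not come from the patent, and the function name `disk_writes` is invented here. It assumes each write to the second storage area produces exactly one index segment.

```python
def disk_writes(total_docs, docs_per_written_segment):
    """Count full segments written to the second storage area.

    Assumption (not from the patent text): every write produces one
    segment, so this is also the number of segments on disk.
    """
    return total_docs // docs_per_written_segment


total = 1200  # hypothetical workload

# Hypothetical baseline: a segment is written after every 10 documents.
writes_small_segments = disk_writes(total, 10)

# Hypothetical variant with an extra in-memory merge level:
# a segment is written only after 100 documents have accumulated.
writes_large_segments = disk_writes(total, 100)

print(writes_small_segments, writes_large_segments)
```

Under these assumed numbers, the larger flush size yields a tenth as many disk writes and a tenth as many segments to open at retrieval time.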

Abstract

The invention discloses a method and a device for establishing an index. The method comprises the following steps: monitoring the number of index segments in a first storage area that each contain a single document; when that number reaches a first threshold, merging those single-document index segments into one index segment containing a first-threshold number of documents; monitoring the number of index segments in the first storage area that each contain a first-threshold number of documents; when that number reaches a second threshold, merging those index segments into one index segment whose document count is the first threshold multiplied by the second threshold; and, when the total number of documents in all index segments in the first storage area reaches a set maximum merge threshold, merging all index segments in the first storage area into one index segment containing a maximum-merge-threshold number of documents and writing that index segment to a second storage area. This scheme reduces the number of disk writes, preserves index update efficiency, and improves retrieval efficiency.
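The steps in the abstract can be sketched as a small in-memory simulation. This is a minimal sketch under stated assumptions, not the patented implementation: the class and method names (`Segment`, `TieredIndexBuilder`, `add_document`) are hypothetical, a plain Python list stands in for each storage area, documents are bare strings, and both merge checks run after every added document.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    docs: List[str]  # documents held by this index segment


@dataclass
class TieredIndexBuilder:
    first_threshold: int       # single-document segments needed for a merge
    second_threshold: int      # first-threshold-sized segments needed for a merge
    max_merge_threshold: int   # total documents that trigger a write-out
    first_area: List[Segment] = field(default_factory=list)   # stands in for memory
    second_area: List[Segment] = field(default_factory=list)  # stands in for disk

    def add_document(self, doc: str) -> None:
        # Each new document starts as its own single-document segment.
        self.first_area.append(Segment([doc]))
        # Level 1: merge single-document segments.
        self._merge_same_size(1, self.first_threshold)
        # Level 2: merge segments holding first_threshold documents.
        self._merge_same_size(self.first_threshold, self.second_threshold)
        # Final step: write everything out once enough documents accumulate.
        total = sum(len(s.docs) for s in self.first_area)
        if total >= self.max_merge_threshold:
            self._flush()

    def _merge_same_size(self, size: int, trigger: int) -> None:
        matching = [s for s in self.first_area if len(s.docs) == size]
        if len(matching) < trigger:
            return
        merged = Segment([d for s in matching for d in s.docs])
        self.first_area = [s for s in self.first_area if len(s.docs) != size]
        self.first_area.append(merged)

    def _flush(self) -> None:
        # Merge all in-memory segments into one and move it to the second area.
        merged = Segment([d for s in self.first_area for d in s.docs])
        self.second_area.append(merged)
        self.first_area = []


# Hypothetical thresholds chosen only for illustration.
builder = TieredIndexBuilder(first_threshold=3, second_threshold=2,
                             max_merge_threshold=12)
for i in range(24):
    builder.add_document(f"doc{i}")
print(len(builder.second_area), [len(s.docs) for s in builder.second_area])
```

With these illustrative thresholds, 24 documents result in two writes to the second area, each carrying one 12-document segment, rather than one write per small segment.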

Description

Technical field

[0001] The invention relates to the field of information retrieval, and in particular to an index establishment method and device.

Background

[0002] The core technology of a retrieval engine is the index, which is a sequence of documents. Because information content changes rapidly, the sequence of documents in the index must be updated continuously. The current problem with index updates is that updating even a small number of documents requires rewriting the entire index, although the vast majority of documents in the index are unaffected by the update. To reduce unnecessary updates, a segmentation mechanism is commonly used when indexing: the number of documents included in the index is preset, and the index is divided into multiple sub-indexes, each of which is called an index segment (segment). The segmentation mechanism solves the problem of incremental upda...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 徐剑波, 童征宇, 赵东岩, 李晓蕊
Owner: NEW FOUNDER HLDG DEV LLC