Real-time index creating and real-time searching method and device

An index building and real-time search technology, applied in the field of information search, addresses problems such as poor real-time performance, poor portability, and high overhead, and achieves the effects of reducing system overhead and improving search accuracy

Inactive Publication Date: 2013-09-11
ALIBABA GRP HLDG LTD

AI Technical Summary

Problems solved by technology

[0004] 1. Real-time search implemented in C/C++: this approach relies on the ability of C/C++ to operate on memory directly, so it runs efficiently. However, C/C++ has low development efficiency and poor portability, places high resource demands on the equipment, and incurs a large overhead;
[0005] 2. Lucene/Solr near-real-time search (NRT, Near Real-time Search): Lucene provides a unified application programming interface (API), in which calling the getReader method can, depending on the characteristics of the application, achieve near-real-time search. This approach only suits search applications whose real-time requirements are not strict, so its real-time performance is poor.



Examples


Embodiment 1

[0053] Figure 1 is a schematic flowchart of the method for establishing a real-time index in Embodiment 1. The method includes:

[0054] S101. Obtain source data;

[0055] In the embodiment of this application, the source data consists of two parts: the full index source data required to establish the full index, and the memory index source data required to establish the memory index. They are introduced separately below:

[0056] The first part, full index source data

[0057] Full index source data can come from various storage systems, such as databases, key-value storage systems, and file systems, or from network data; for example, data can be crawled from the Internet. In a specific implementation, after receiving a full dump request, the search server acquires the full index source data according to the full index source data and the search service characteristics. The search server pro...

Embodiment 2

[0073] As the scale of source data continues to increase, building a memory index for all real-time data after the start time point takes up a lot of memory and increases system overhead. Therefore, the established memory index can be merged into the full index and then deleted, and a new memory index can be created for the data after the start time point of the merge. This reduces the space occupied by the memory index and thereby the system overhead. Accordingly, the real-time index building method may further include the following steps:

[0074] When the next index start time point arrives, merge the memory index into the full index, and delete the memory index from memory; and

[0075] Create a new memory index for the data in the source data after the next start time point.

[0076] Here, the next index start time point is defined relative to the index start time point. Preferably, the full index and memory index...
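The merge step in [0074] and [0075] can be sketched as follows. This is a minimal illustration in Python, not the implementation described in the application: the class name RealTimeIndex, its methods, and the use of plain dictionaries for both indexes (including the one standing in for the on-disk full index) are assumptions made for the example.

```python
from collections import defaultdict


class RealTimeIndex:
    """Toy inverted indexes: term -> set of record ids (hypothetical layout)."""

    def __init__(self):
        # Stands in for the full index; a real system would keep it on disk.
        self.full_index = defaultdict(set)
        self.memory_index = defaultdict(set)

    def add_to_memory(self, rec_id, text):
        """Index a record that arrived after the current start time point."""
        for term in text.lower().split():
            self.memory_index[term].add(rec_id)

    def merge(self):
        """Run when the next index start time point arrives."""
        # Merge the memory index into the full index ...
        for term, ids in self.memory_index.items():
            self.full_index[term] |= ids
        # ... then drop it and start a new, empty memory index.
        self.memory_index = defaultdict(set)


idx = RealTimeIndex()
idx.add_to_memory(1, "red shoes")
idx.merge()                        # record 1 now lives in the full index
idx.add_to_memory(2, "red hat")    # goes into the new memory index
```

After each merge the memory index holds only the records that arrived since the latest start time point, which is what keeps its footprint bounded.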

Embodiment 3

[0078] In a specific implementation, after the new memory index is established, the user can continue to submit search requests, data addition requests, data update requests, and data deletion requests to the search server. For a search request, the search server only needs to search the full index and the memory index according to the search conditions submitted by the user and return the results that meet those conditions. A search request does not modify data, so the memory index does not need to be updated. The other three requests (add data, delete data, and update data) all modify data, so the memory index must be updated accordingly in order to further improve the accuracy of real-time search.
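The request handling described in [0078] might look like the sketch below, written in Python for illustration. The SearchServer class and its method names are hypothetical. In particular, the "stale" set used to hide superseded full-index entries is an assumption of this example; the visible text does not say how deletions or updates of records already in the full index are handled.

```python
class SearchServer:
    """Toy request handler; indexes map term -> set of record ids."""

    def __init__(self, full_index):
        self.full_index = full_index   # treated as read-only here
        self.memory_index = {}
        # Assumption: ids whose full-index entries are superseded or deleted.
        self.stale = set()

    def _index(self, rec_id, text):
        for term in text.lower().split():
            self.memory_index.setdefault(term, set()).add(rec_id)

    def add(self, rec_id, text):
        """Add request: only the memory index changes."""
        self._index(rec_id, text)

    def delete(self, rec_id):
        """Delete request: drop from memory, hide any full-index copy."""
        for ids in self.memory_index.values():
            ids.discard(rec_id)
        self.stale.add(rec_id)

    def update(self, rec_id, text):
        """Update request: remove the old version, index the new one."""
        self.delete(rec_id)
        self._index(rec_id, text)

    def search(self, term):
        """Search request: reads both indexes and modifies neither."""
        term = term.lower()
        from_full = self.full_index.get(term, set()) - self.stale
        from_memory = self.memory_index.get(term, set())
        return sorted(from_full | from_memory)


server = SearchServer({"red": {1}, "shoes": {1, 2}, "blue": {2}})
server.add(3, "red hat")
server.update(1, "green shoes")
server.delete(2)
print(server.search("red"))    # [3]
print(server.search("shoes"))  # [1]
```

Only the three modifying requests touch the memory index, while search combines the two indexes at query time, which matches the division of work described above.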

[0079] In order to better understand the embodiment of the present application, the implementation process of updating the m...


Abstract

The invention discloses a real-time index creating method, a real-time searching method, and corresponding devices, which are used to increase the accuracy of real-time searching and reduce the consumption of system resources. The real-time index creating method comprises the following steps: obtaining source data; when the index creation start point arrives, creating a full index for the data in the source data prior to the start point, and storing the full index in disk storage; and creating a memory index for the data in the source data after the start point, and storing the memory index in memory. The real-time searching method comprises the following steps: receiving a search request that contains search conditions; looking up records that satisfy the search conditions in the full index and in the memory index respectively; and combining and returning the records found in the full index and the memory index.
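As a rough illustration of the two methods summarized above, the following Python sketch splits source data at a start point, builds one index per part, and answers a query from both. The record layout (id, timestamp, text), the function names, and the choice to place records stamped exactly at the start point in the memory index are assumptions of this example; both indexes are plain in-memory dictionaries here, whereas the full index would be stored on disk.

```python
from collections import defaultdict


def build_index(records):
    """Build a toy inverted index: term -> set of record ids."""
    index = defaultdict(set)
    for rec_id, _timestamp, text in records:
        for term in text.lower().split():
            index[term].add(rec_id)
    return index


def create_realtime_indexes(source, start_time):
    """Split the source data at start_time and index each part separately."""
    before = [r for r in source if r[1] < start_time]
    after = [r for r in source if r[1] >= start_time]  # boundary: assumption
    full_index = build_index(before)    # would be written to disk storage
    memory_index = build_index(after)   # stays in memory
    return full_index, memory_index


def search(term, full_index, memory_index):
    """Look the term up in both indexes and combine the matching ids."""
    term = term.lower()
    hits = full_index.get(term, set()) | memory_index.get(term, set())
    return sorted(hits)


source = [
    (1, 100, "red shoes"),
    (2, 150, "blue shoes"),
    (3, 220, "red hat"),
]
full_index, memory_index = create_realtime_indexes(source, start_time=200)
print(search("red", full_index, memory_index))  # [1, 3]
```

Record 3 arrives after the start point, so it is found through the memory index, while record 1 comes from the full index; the caller sees one combined result.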

Description

Technical field

[0001] The present application relates to the technical field of information search, in particular to a real-time index establishment and real-time search method and device.

Background

[0002] In short, real-time search is the immediate and fast search of information, so that results are available as soon as the information appears. Real-time search makes the network environment more real-time, convenient, and simple. Through a real-time search service, users can quickly obtain the latest information and quickly find and follow the events they care about. As the web grows, real-time search becomes more and more important. Traditional search engines suffer from long delays and therefore cannot meet the requirements of real-time search well.

[0003] At present, search engines mainly use two methods for real-time search:

[0004] 1. Use c / c++ language to realize real-time search: the feature of this implementation me...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventor: 龙毅, 傅巍玮
Owner: ALIBABA GRP HLDG LTD