A file indexing system and method based on Elasticsearch full-text retrieval

A file indexing technology applied to file systems, file metadata retrieval, and file access structures. It can address wasted storage resources, the inability of the file system management program to receive file path information from search results, and slow index queries.

Active Publication Date: 2021-07-02
武汉华讯国蓉科技有限公司

AI Technical Summary

Problems solved by technology

[0002] At present, the two mainstream full-text search tools are Solr and Elasticsearch. Both use the Lucene framework as the core of the search engine, but they target different scenarios. Solr is mainly used where data formats are diverse and content is updated infrequently, while Elasticsearch is used where the data format is uniform and content is updated frequently. Elasticsearch is somewhat slower than Solr at querying the index, but it creates indexes significantly faster. For a file system that is updated frequently, Elasticsearch can therefore be used to update the index quickly. However, because Elasticsearch only accepts data input in JSON format, and index fields may not be modified after the index is created, using it as the index system for a file system raises some difficulties. The current technology still has several deficiencies:
[0003] 1. When Elasticsearch is used to index and query a file system, it can only index and display file contents. It cannot feed the path of a file back to the file system management program, so the management program cannot perform management operations on the files to which the search results belong;
[0004] 2. When Elasticsearch is used to index and query a file system, the index is quasi-real-time or non-real-time rather than real-time. That is, when a file in the file system is updated, the search results do not immediately reflect the update, so the two are not always kept synchronized and consistent;
[0005] 3. When Elasticsearch is used to index and query a file system, multiple copies of the data exist, which wastes storage resources.
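The JSON-only input constraint and the missing path feedback described above can be illustrated with a minimal sketch. It assumes a document layout that is not given in the source: the field names `path`, `content`, and `modified`, and the helper names `build_index_document` and `document_id`, are all hypothetical. The idea is only that a file can be turned into one JSON document that carries its own path, and that an ID derived from the path lets a re-indexed file overwrite its earlier document instead of adding another copy.

```python
import hashlib
import json
import os
import tempfile


def build_index_document(path: str) -> dict:
    """Build a JSON-serializable document for one file.

    The field names are hypothetical; the source does not specify a mapping.
    """
    abs_path = os.path.abspath(path)
    with open(abs_path, "r", encoding="utf-8", errors="replace") as fh:
        content = fh.read()
    return {
        "path": abs_path,                        # lets a search hit be traced back to the file
        "content": content,                      # text to be indexed for full-text search
        "modified": os.path.getmtime(abs_path),  # last-modified time, seconds since the epoch
    }


def document_id(path: str) -> str:
    """Derive a stable ID from the absolute path (an assumption, not the patent's scheme)."""
    return hashlib.sha1(os.path.abspath(path).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "notes.txt")
        with open(sample, "w", encoding="utf-8") as fh:
            fh.write("full-text retrieval example")
        doc = build_index_document(sample)
        # The serialized string is what would be sent as the request body.
        print(document_id(sample))
        print(json.dumps(doc))
```

Because the fields of an index are fixed once it is created, as the text notes, a field such as `path` would have to be part of the document layout from the start.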



Embodiment Construction

[0038] The present invention is further described below with reference to the accompanying drawings and embodiments:

[0039] A file indexing system based on Elasticsearch full-text retrieval according to the present invention includes: a user operation management module, which receives user retrieval requests, sends them to the file system management module, and receives the URL values of the retrieved files from the file system management module;

[0040] A file system, which stores and manages files and directories;

[0041] An Elasticsearch cluster module, which performs the retrieval for the keywords in the user retrieval request and returns the retrieval results to the Elasticsearch client module;

[0042] A file system management module, which is connected to the user operation management module and the file system, and is used to process the user retrieval request, transmit the keywords in the user retrieval request to the Elasticsearch client module, and detect the files in the file syst...
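The request path through the modules listed above can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the class names `InMemoryCluster`, `SearchClient`, and `FileSystemManager`, the `url` and `content` field names, and the substring matching are all hypothetical stand-ins. A real Elasticsearch cluster matches analyzed tokens rather than substrings. The database system and the truncated part of the module description are not modeled. `build_match_query` shows the kind of Query DSL body a client could send, restricted to returning the `url` field.

```python
from dataclasses import dataclass, field


def build_match_query(keyword: str) -> dict:
    """Query DSL body for a keyword search that returns only the stored url field."""
    return {"query": {"match": {"content": keyword}}, "_source": ["url"]}


@dataclass
class InMemoryCluster:
    """Stand-in for the Elasticsearch cluster module; keeps documents in a dict."""
    docs: dict = field(default_factory=dict)

    def index(self, doc_id: str, doc: dict) -> None:
        # Indexing under an existing ID replaces the earlier document.
        self.docs[doc_id] = doc

    def search(self, keyword: str) -> list:
        # Simplification: case-insensitive substring match instead of analyzed tokens.
        needle = keyword.lower()
        return [d for d in self.docs.values() if needle in d["content"].lower()]


class SearchClient:
    """Stand-in for the Elasticsearch client module: forwards keywords, returns hits."""

    def __init__(self, cluster: InMemoryCluster) -> None:
        self.cluster = cluster

    def query(self, keyword: str) -> list:
        return self.cluster.search(keyword)


class FileSystemManager:
    """Stand-in for the file system management module."""

    def __init__(self, client: SearchClient) -> None:
        self.client = client

    def handle_request(self, keyword: str) -> list:
        # Pass the keyword to the client and hand back the URLs of the hit files.
        hits = self.client.query(keyword)
        return sorted(hit["url"] for hit in hits)


if __name__ == "__main__":
    cluster = InMemoryCluster()
    cluster.index("1", {"url": "file:///docs/a.txt", "content": "Lucene index basics"})
    cluster.index("2", {"url": "file:///docs/b.txt", "content": "quarterly report"})
    manager = FileSystemManager(SearchClient(cluster))
    print(manager.handle_request("lucene"))
    print(build_match_query("lucene"))
```

The user operation management module would sit in front of `FileSystemManager.handle_request`, sending the request and receiving the returned URL values.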


Abstract

The invention discloses a file indexing system and method based on Elasticsearch full-text retrieval. The system includes a user operation management module, a file system management module, a file system, an Elasticsearch cluster module, a database system, and an Elasticsearch client module. Retrieval is fully real-time, so the retrieval results are never outdated or wrong at any time, and the waste of resources caused by Elasticsearch when performing full-text retrieval over a file system is effectively reduced. The method also enables bidirectional exchange between files and Elasticsearch, supports obtaining the path of the file in which a search hit is located, and allows further operations on the hit file.
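Keeping the index synchronized with the file system requires knowing which files were added, changed, or removed. The source does not say how the system detects changes, so the following is only a hypothetical sketch: `snapshot` and `diff_snapshots` are invented names, and comparing two directory snapshots is a polling approach that would need to be driven by file system change notifications to approach the real-time behavior the abstract describes.

```python
import os
import tempfile


def snapshot(root: str) -> dict:
    """Map each file path under root to a (mtime_ns, size) signature."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            state[path] = (st.st_mtime_ns, st.st_size)
    return state


def diff_snapshots(old: dict, new: dict) -> dict:
    """Return the paths to (re)index and the paths whose documents should be deleted."""
    return {
        "index": sorted(p for p in new if p not in old or old[p] != new[p]),
        "delete": sorted(p for p in old if p not in new),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        kept = os.path.join(root, "kept.txt")
        with open(kept, "w", encoding="utf-8") as fh:
            fh.write("first version")
        before = snapshot(root)
        with open(kept, "w", encoding="utf-8") as fh:
            fh.write("second, longer version")
        print(diff_snapshots(before, snapshot(root)))
```

Each path under `index` would be converted to a document and sent to the index, and each path under `delete` would have its document removed, so that search results stop referring to files that no longer exist.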

Description

Technical field

[0001] The invention relates to a software retrieval system and method, in particular to a file indexing system and method based on Elasticsearch full-text retrieval.


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/13, G06F16/14
Inventor: 袁东万修远陶毅昊冯骏
Owner: 武汉华讯国蓉科技有限公司