Solr technology based distributed searching method and system

A distributed search technology, applied in the field of information retrieval, that addresses problems such as incomplete crawling, the difficulty of judging search intent, and information noise, with the effect of enhancing system stability, improving classification accuracy, and focusing the search.

Inactive Publication Date: 2014-11-12
SOUTHEAST UNIV
2 Cites, 13 Cited by

AI Technical Summary

Problems solved by technology

At present, it is very difficult to find the information you need accurately and quickly on the Internet.
There are three reasons for this: first, the information on the Internet is complex and disorderly, and different websites may carry duplicate information, so the results returned by search engines contain information noise; second, it is very difficult to judge the user's real search intention from the query words alone; third, it is impossible for the crawler program of a search engine to crawl all the information on the Internet, or to crawl online information in real time.

Examples

Embodiment Construction

[0042] The preferred embodiments of the present invention will be described in detail below in conjunction with the accompanying drawings, so that the advantages and features of the present invention can be more easily understood by those skilled in the art, so as to define the protection scope of the present invention more clearly.

[0043] Referring to Figures 1 to 9, the embodiment of the present invention includes:

[0044] A distributed search system, said system comprising:

[0045] 1) An automatic classifier, used to automatically classify electronic files;

[0046] When the ERMS offline client system registers and archives electronic files, it must automatically classify them to facilitate subsequent distributed indexing. Because the documents in an electronic file may be inconsistent with the subject described by the file's metadata, the final type of the electronic file cannot be determined solely from the electronic file type defined in th...
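The abstract names a naive Bayesian algorithm as the basis for this automatic classification. The following is a minimal sketch of that general technique, a multinomial naive Bayes classifier with add-one smoothing, and not the patent's own implementation. The class name, the whitespace tokenizer, the category labels, and the training texts are all hypothetical; how the described system weighs document content against file metadata is not stated in the visible text, so the sketch classifies on text alone.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing (illustrative only)."""

    def __init__(self):
        self.class_doc_counts = Counter()        # number of training documents per class
        self.word_counts = defaultdict(Counter)  # class -> word -> occurrence count
        self.vocabulary = set()                  # all words seen during training

    def train(self, labeled_docs):
        """labeled_docs: iterable of (text, label) pairs."""
        for text, label in labeled_docs:
            self.class_doc_counts[label] += 1
            for word in text.lower().split():
                self.word_counts[label][word] += 1
                self.vocabulary.add(word)

    def classify(self, text):
        """Return the label with the highest log posterior for the given text."""
        total_docs = sum(self.class_doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_doc_counts.items():
            score = math.log(doc_count / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                if word not in self.vocabulary:
                    continue  # words never seen in training carry no evidence
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data: (document text, category) pairs.
training = [
    ("annual budget report finance audit", "finance"),
    ("invoice payment budget ledger", "finance"),
    ("employee contract hiring personnel", "personnel"),
    ("personnel leave record employee", "personnel"),
]
clf = NaiveBayesClassifier()
clf.train(training)
label = clf.classify("budget audit for the payment ledger")
print(label)
```

Working in log space avoids numeric underflow on long documents, and the add-one smoothing keeps a single unseen word from zeroing out an otherwise likely category.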

Abstract

The invention discloses a distributed search method and system based on Solr technology. The method comprises the following steps: 1) when an offline client system registers and files electronic documents, the electronic documents are first classified automatically using a naive Bayesian algorithm; 2) after classification, the electronic documents are indexed in a distributed manner using a consistent hashing algorithm, according to their classification; and 3) after the index is built, a user enters a query statement to search the electronic documents. The system adopts the distributed mode of the open source search tool Solr and distributes each query request to the distributed nodes; each node responds to the search request, and the results are then merged, deduplicated, sorted, and returned to the user, so that distributed vertical search is realized. In this manner, the accuracy of automatic classification of the electronic documents can be improved, and the stability of the system is improved.
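Steps 2) and 3) of the abstract can be illustrated with a short sketch, under several assumptions. The abstract says documents are distributed by consistent hashing according to their classification, so the sketch hashes a category name onto a ring of nodes; the node names, the use of MD5, the number of virtual nodes, and the `(doc_id, score)` hit format are hypothetical, and the code does not call Solr itself. The merge step also assumes scores from different nodes are directly comparable, which the visible text does not confirm.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a ring position (MD5 is used only to spread keys evenly)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class ConsistentHashRing:
    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # virtual nodes per physical node (assumed value)
        self._points = []         # sorted ring positions
        self._owner = {}          # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.replicas):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove_node(self, node):
        for i in range(self.replicas):
            point = _hash(f"{node}#{i}")
            if self._owner.get(point) == node:
                del self._owner[point]
                self._points.remove(point)

    def node_for(self, key):
        """Return the first node clockwise from the key's ring position."""
        if not self._points:
            raise ValueError("ring has no nodes")
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


def merge_results(per_node_hits, limit=10):
    """Merge hit lists from several nodes, keep the best score per document id,
    and return the top hits sorted by descending score."""
    best = {}
    for hits in per_node_hits:
        for doc_id, score in hits:
            if doc_id not in best or score > best[doc_id]:
                best[doc_id] = score
    ranked = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]


# Hypothetical node and category names.
ring = ConsistentHashRing(["solr-node-1", "solr-node-2", "solr-node-3"])
categories = ["finance", "personnel", "contracts", "engineering"]
before = {c: ring.node_for(c) for c in categories}
ring.remove_node("solr-node-2")
after = {c: ring.node_for(c) for c in categories}
moved = [c for c in categories if before[c] != after[c]]

merged = merge_results([
    [("doc-1", 0.9), ("doc-2", 0.4)],
    [("doc-2", 0.7), ("doc-3", 0.5)],
])
print(before, after, moved, merged)
```

The point of the ring is that removing a node reassigns only the categories that node owned; the rest keep their placement, which is one plausible reading of the stability benefit the abstract claims.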

Description

Technical field

[0001] The invention relates to the field of information retrieval, in particular to a distributed search method and system based on Solr technology.

Background

[0002] With the rapid development of Internet technology, the amount of data on the Internet has grown rapidly, and this mass of data has had a huge impact on the search quality of general search engines. At present, it is very difficult to find the information you need accurately and quickly on the Internet. There are three reasons for this: first, the information on the Internet is complex and disorderly, and different websites may carry duplicate information, so the results returned by search engines contain information noise; second, it is very difficult to judge the user's real search intention from specific query words alone; third, it is impossible for the crawler program of a search engine to crawl all the information on th...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/951
Inventors: 吴含前, 姚莉, 王存哲, 李露
Owner: SOUTHEAST UNIV