Method for solving the problem of inaccurate Apache Solr phrase search

A phrase search technique, applied in the field of network search, that addresses inaccurate search results caused by inconsistent word segmentation.

Active Publication Date: 2017-07-07
HUNAN ANTVISION SOFTWARE

AI Technical Summary

Problems solved by technology

[0004] The technical problem to be solved by the present invention is that, when Apache Solr searches for phrases, the results are inaccurate because the word segmentation produced in index mode is inconsistent with the word segmentation produced in search mode.

Examples

Specific Embodiment 1

[0024] As can be seen from Figure 1 and Figure 2, the present invention is a method for solving inaccurate Apache Solr phrase search, characterized in that the method comprises the following steps. Step 1, data reception: QParserPlugin receives the search statement parameter that the client transmits over the HTTP protocol. Step 2, phrase search: regular expressions are used in QParserPlugin to match the phrases in the search statement parameter, yielding a set of phrases; in the parse method, the getString method is first called to obtain the search statement, and a regular expression that matches quoted text is then used to find the phrase search statements within it. Step 3, data word segmentation and replacement: index-mode word segmentation is performed on the phrases in the phrase set obtained in Step 2, and the phrases in the original search statement are replaced with the segmented phrases; the word segmenter is called in index mode on the matched phrase ...
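The quoted-phrase matching in Step 2 can be illustrated with a minimal sketch in plain Java. It does not use the Solr API: the class name `QuotedPhraseExtractor`, the method `extractPhrases`, and the exact regular expression are assumptions made for illustration, since the source does not give the pattern. In a real plugin the input string would come from the `getString` call mentioned above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr: collects the quoted phrases in a query string.
class QuotedPhraseExtractor {
    // A double quote, one or more non-quote characters (captured), then a closing quote.
    // Assumption: escaped quotes inside a phrase are not handled.
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]+)\"");

    static List<String> extractPhrases(String searchStatement) {
        List<String> phrases = new ArrayList<>();
        Matcher m = QUOTED.matcher(searchStatement);
        while (m.find()) {
            phrases.add(m.group(1)); // phrase text without the surrounding quotes
        }
        return phrases;
    }

    public static void main(String[] args) {
        System.out.println(extractPhrases("title:\"apache solr\" AND body:\"phrase search\""));
    }
}
```

A statement with no quoted text yields an empty list, so such a statement would pass through the later steps unchanged.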

Specific Embodiment 2

[0026] 1. QParserPlugin receives the search statement parameters transmitted by the client over the HTTP protocol.

[0027] 2. Regular expressions are used in QParserPlugin to match the phrases in the search statement parameters, yielding a set of phrases.

[0028] 3. The phrases in the phrase set are traversed, and each is segmented into words in index mode.

[0029] 4. The phrases in the original search statement are replaced with the segmented phrases.

[0030] 5. Apache Solr's parser converts the rewritten search statement into a Query.

[0031] 6. The normal Apache Solr search process is entered.
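Steps 2 to 4 above can be sketched as a single string rewrite. This is a sketch under stated assumptions, not the patented implementation: `PhraseRewriter` and `rewrite` are hypothetical names, the index-mode word segmenter is passed in as an ordinary function rather than a real Solr analyzer, and joining the segmented words with single spaces inside the quotes is an assumption because the source does not say how the replacement text is formatted. It needs Java 9 or later for the `StringBuilder` overloads of `appendReplacement` and `appendTail`.

```java
import java.util.List;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr: rewrites each quoted phrase in a search statement.
class PhraseRewriter {
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]+)\"");

    static String rewrite(String searchStatement,
                          Function<String, List<String>> indexModeSegmenter) {
        Matcher m = QUOTED.matcher(searchStatement);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            // Step 3: segment the matched phrase with the index-mode segmenter.
            List<String> words = indexModeSegmenter.apply(m.group(1));
            // Step 4: put the segmented phrase back in place of the original one.
            // Assumption: words are joined by single spaces and re-quoted.
            String replacement = "\"" + String.join(" ", words) + "\"";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out); // copy whatever follows the last phrase
        return out.toString();
    }

    public static void main(String[] args) {
        // Toy stand-in for an index-mode segmenter: splits one known compound word.
        Function<String, List<String>> toySegmenter =
            phrase -> List.of(phrase.replace("searchengine", "search engine").split("\\s+"));
        System.out.println(rewrite("title:\"searchengine plugin\" AND year:2017", toySegmenter));
    }
}
```

Only the quoted spans are touched, so field names, operators, and unquoted terms reach Solr's parser in step 5 exactly as the client sent them.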

Abstract

The invention discloses a method for solving the problem of inaccurate Apache Solr phrase search. The method comprises the following steps. Data reception: QParserPlugin receives the search statement parameters transmitted by the client over the HTTP protocol. Phrase search: a regular expression is used in QParserPlugin to match the phrases in the search statement parameters, yielding a phrase set. Data word segmentation and replacement: index-mode word segmentation is performed on the phrases in the phrase set obtained in the phrase search step, and the phrases in the original search statement are replaced with the segmented phrases. Data conversion: Apache Solr's syntax parser converts the rewritten search statement into a Query. Data processing and output: the Apache Solr search process is entered, and the data is output on completion. The method extends Apache Solr's syntax parser as a plug-in and rewrites the parsing rules, which solves the problem of inaccurate phrase search; the parser extension is pluggable, and phrases are segmented in index mode before they are searched.

Description

Technical field

[0001] The invention relates to the technical field of network search, in particular to a method for solving inaccurate Apache Solr phrase search.

Background art

[0002] Apache Solr has a search syntax called "phrase search", implemented as PhraseQuery. A phrase search is written by enclosing the keywords in quotation marks, and it matches when the distance between the quoted keywords is within the specified slop parameter. However, word segmentation of a document at index time produces more results than word segmentation of the query at search time, so the index mode and the search mode do not match and "phrase search" becomes inaccurate.

[0003] The invention provides a method. Before entering the Apache Solr search operation, the keywords in the phrase search syntax are segmented into words according to the index mode, and then the original phrase search statement is replaced, and finally enters...
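The mismatch described in [0002] can be reproduced with a toy model in plain Java. Everything here is an assumption made for illustration: `indexMode` and `searchMode` are hypothetical stand-ins for a fine-grained and a coarse word segmenter, the single compound word is invented, and phrase matching is simplified to "the query words appear consecutively in the document words", which corresponds to a slop of 0 and ignores Lucene's real position handling.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model, not Solr or Lucene code: shows why differing segmentation breaks phrase matching.
class SegmentationMismatchDemo {
    // Hypothetical index-mode segmenter: fine-grained, splits the compound "searchengine".
    static List<String> indexMode(String text) {
        List<String> words = new ArrayList<>();
        for (String w : text.trim().split("\\s+")) {
            if (w.equals("searchengine")) {
                words.add("search");
                words.add("engine");
            } else {
                words.add(w);
            }
        }
        return words;
    }

    // Hypothetical search-mode segmenter: coarse, keeps the compound as one word.
    static List<String> searchMode(String text) {
        return List.of(text.trim().split("\\s+"));
    }

    // Simplified phrase match with slop 0: query words must occur consecutively in the document.
    static boolean phraseMatches(List<String> docWords, List<String> queryWords) {
        return Collections.indexOfSubList(docWords, queryWords) >= 0;
    }

    public static void main(String[] args) {
        List<String> indexed = indexMode("fast searchengine plugin");
        // Query segmented in search mode: "searchengine" was never indexed as one word.
        System.out.println(phraseMatches(indexed, searchMode("searchengine plugin")));
        // Query segmented in index mode, as the method proposes.
        System.out.println(phraseMatches(indexed, indexMode("searchengine plugin")));
    }
}
```

In this toy setup the search-mode query looks for a word the index never stored, while the index-mode query lines up with the indexed words, which is the effect the method relies on.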

Application Information

IPC(8): G06F17/30
CPC: G06F16/3325, G06F16/3344
Inventors: 何小成, 黄三伟
Owner: HUNAN ANTVISION SOFTWARE