A method for solving inaccurate Apache Solr phrase search

Applied in the field of network search, this phrase search technique addresses inaccurate search results caused by inconsistent word segmentation.

Active Publication Date: 2021-03-02
HUNAN ANTVISION SOFTWARE
2 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0004] The technical problem to be solved by the present invention is that Apache Solr phrase searches are inaccurate because the word segmentation results differ between index mode and search mode.

Examples

Specific Embodiment 1

[0024] As can be seen from Figure 1 and Figure 2, the present invention is a method for solving inaccurate Apache Solr phrase search, characterized in that the method comprises the following steps. Step 1, data reception: the QParserPlugin receives the search statement parameter that the client transmits over HTTP. Step 2, phrase search: regular expressions in the QParserPlugin match the phrases in the search statement parameter to obtain a phrase set; in the parse method, the getString method is called first to get the search statement, and a regular expression that matches quoted text is then used to extract the "phrase search" parts of the search statement. Step 3, data word segmentation and replacement: the phrases in the phrase set obtained in step 2 are segmented in index mode, and the phrases in the original search statement are replaced with the segmented phrases; the word segmenter is called to segment each matched phrase according to the index mode...
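The quoted-text matching described in step 2 can be illustrated with a minimal Java sketch. It uses only the standard library and is not taken from the patent: the class name `QuotedPhraseExtractor`, the method name `extractPhrases`, and the exact pattern are assumptions made for illustration, and escaped quotation marks inside a phrase are not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr: collects the quoted phrases
// from a raw search statement.
final class QuotedPhraseExtractor {
    // Assumed pattern: an opening double quote, one or more characters
    // that are not double quotes, and a closing double quote.
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]+)\"");

    // Returns the phrases in order of appearance, without the quotes.
    static List<String> extractPhrases(String searchStatement) {
        List<String> phrases = new ArrayList<>();
        Matcher m = QUOTED.matcher(searchStatement);
        while (m.find()) {
            phrases.add(m.group(1));
        }
        return phrases;
    }
}
```

In an actual plug-in, the search statement would come from the parser's query string rather than from a plain method argument.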

Specific Embodiment 2

[0026] 1. QParserPlugin receives the search statement parameters transmitted by the client over HTTP.

[0027] 2. Regular expressions in QParserPlugin match the phrases in the search statement parameters to obtain a phrase set.

[0028] 3. The phrases in the phrase set are traversed and segmented in index mode.

[0029] 4. The phrases in the original search statement are replaced with the segmented phrases.

[0030] 5. Apache Solr's parser converts the replaced search statement into a Query.

[0031] 6. The statement enters Apache Solr's search process.
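Steps 2 through 4 above can be sketched in plain Java as follows. This is a sketch under stated assumptions, not the patented implementation: `PhraseRewriter` and `rewrite` are hypothetical names, the index-mode word segmenter is modeled as a plain `Function` supplied by the caller (in Solr it would be the field's index-time analyzer), and the segmented tokens are joined with single spaces. Steps 1, 5, and 6 (receiving the request and handing the rewritten statement to Solr's parser and search process) are outside the sketch.

```java
import java.util.List;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr: rewrites each quoted phrase in a
// search statement using a caller-supplied index-mode segmenter.
final class PhraseRewriter {
    // Assumed pattern for a quoted phrase (no escaped quotes inside).
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]+)\"");

    static String rewrite(String searchStatement,
                          Function<String, List<String>> indexModeSegmenter) {
        Matcher m = QUOTED.matcher(searchStatement);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Step 3: segment the matched phrase in index mode.
            List<String> tokens = indexModeSegmenter.apply(m.group(1));
            // Step 4: put the segmented phrase back in place of the original.
            String replacement = "\"" + String.join(" ", tokens) + "\"";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out); // copy the text after the last phrase unchanged
        return out.toString();
    }
}
```

With a toy segmenter that splits on hyphens and whitespace, `rewrite("subject:\"e-mail server\" AND spam", seg)` yields `subject:"e mail server" AND spam`; text outside the quotation marks is left untouched.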

Abstract

The invention discloses a method for solving inaccurate Apache Solr phrase search, characterized in that the method includes the following steps: (1) data reception: the QParserPlugin receives the search statement parameters transmitted by the client over HTTP; (2) phrase search: regular expressions in the QParserPlugin match the phrases in the search statement parameters to obtain a phrase set; (3) data word segmentation and replacement: the phrases in the phrase set obtained in step 2 are segmented in index mode, and the phrases in the original search statement are replaced with the segmented phrases; (4) data conversion: the replaced search statement is converted into a Query by Apache Solr's syntax parser; (5) data processing and output: the statement enters Apache Solr's search process, and the data is output on completion. The invention extends Apache Solr's syntax parser as a plug-in and rewrites its parsing rules, which solves the problem of inaccurate phrase search. The extension is a pluggable syntax parser plug-in, and phrases are segmented in index mode before the search runs.

Description

Technical field

[0001] The invention relates to the technical field of network search, and in particular to a method for solving inaccurate Apache Solr phrase search.

Background

[0002] Apache Solr has a search syntax called "phrase search", implemented as PhraseQuery. A phrase search is written by enclosing the keywords in quotation marks, and it matches when the distance between the quoted keywords is within the specified slop parameter. However, word segmentation of a document at index time can produce more tokens than word segmentation of the query at search time. This mismatch between index mode and search mode makes "phrase search" inaccurate.

[0003] The invention provides a method in which, before the Apache Solr search operation begins, the keywords in the phrase search syntax are segmented according to the index mode, the original phrase search statement is then replaced, and the statement finally enters...
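The mismatch described in [0002] can be made concrete with a toy Java example. It is only an illustration under simplifying assumptions: the token lists are invented, `PhraseMismatchDemo` and `exactPhraseMatch` are hypothetical names, and the check models a phrase match with a slop of 0 as "the query tokens appear consecutively among the indexed tokens". Lucene's PhraseQuery works on term positions and supports non-zero slop, which this sketch does not attempt to reproduce.

```java
import java.util.Collections;
import java.util.List;

// Toy model, not Solr code: a slop-0 phrase match succeeds only if the
// query tokens occur as a contiguous run in the indexed token list.
final class PhraseMismatchDemo {
    static boolean exactPhraseMatch(List<String> indexedTokens,
                                    List<String> queryTokens) {
        return Collections.indexOfSubList(indexedTokens, queryTokens) >= 0;
    }
}
```

Suppose index mode split a document into `new, e, mail, server`. A query segmented in search mode as `e-mail, server` does not match, because the token `e-mail` was never indexed. The same query segmented in index mode as `e, mail, server` does match, which is the effect the method aims for.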

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/332, G06F16/33
CPC: G06F16/3325, G06F16/3344
Inventors: 何小成, 黄三伟
Owner: HUNAN ANTVISION SOFTWARE