Integrated retrieval method for multi-language information retrieval
An integrated retrieval method for multi-language information retrieval, applied in the information field. It addresses noisy retrieval results and the loss of source-language information, thereby improving retrieval accuracy and reducing noise.
Examples
Specific Embodiment 1
[0021] Specific Embodiment 1. The integrated retrieval method for multi-language information retrieval proceeds as follows:
[0022] Step 1: translate the source-language query keyword $q_i$ input by the user into target-language keywords $t_{ij}$, where $t_{ij}$ denotes the $j$-th reasonable translation of the source-language query keyword $q_i$;
[0023] Step 2: according to the word order, the modification and collocation relationships among the words, and the distance between the words, divide the target-language keywords $t_{ij}$ obtained in step 1 into three relationship patterns: the exact match pattern, the co-occurrence pattern, and the independent pattern. In the exact match pattern, the words of the phrase must appear adjacent to one another and in order. In the co-occurrence pattern, the words that form the phrase co-occurring within a preset window counts as an occurrence of the phrase. The independent pattern is that in ph...
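The exact match and co-occurrence patterns described in step 2 can be illustrated with a minimal sketch. It assumes the document and the translated phrase are already tokenized into lists of words. The function names and the choice to count sliding windows by start position are illustrative assumptions, not taken from the patent. The independent pattern is left out because its definition is cut off in the text above.

```python
from typing import List


def exact_match_count(doc: List[str], phrase: List[str]) -> int:
    """Count positions where the phrase words appear adjacent and in order."""
    n = len(phrase)
    if n == 0:
        return 0
    return sum(1 for i in range(len(doc) - n + 1) if doc[i:i + n] == phrase)


def co_occurrence_count(doc: List[str], phrase: List[str], window: int) -> int:
    """Count sliding windows (by start position) that contain every phrase word.

    Word order inside the window is ignored. Counting by window start
    position is an assumption made for this sketch.
    """
    needed = set(phrase)
    if not needed or window <= 0:
        return 0
    last_start = max(len(doc) - window, 0)
    return sum(
        1 for i in range(last_start + 1) if needed <= set(doc[i:i + window])
    )


if __name__ == "__main__":
    doc = "cross language information retrieval improves retrieval of information".split()
    print(exact_match_count(doc, ["information", "retrieval"]))
    print(co_occurrence_count(doc, ["retrieval", "information"], window=3))
```

With this toy document, the phrase "information retrieval" matches exactly once, while the looser co-occurrence test also accepts windows in which the two words appear in reverse order or with another word between them.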
Specific Embodiment 2
[0042] Specific Embodiment 2. This embodiment differs from Specific Embodiment 1 in how step 3 obtains the conditional probability $P(t_{ij} \mid D, \theta_{Ex})$ of the exact match pattern in the query document $D$. In the exact match pattern, the entire phrase can be treated as a single independent term and estimated by maximum likelihood. The calculation is expressed as:
[0043] $P(t_{ij} \mid D, \theta_{Ex}) = \mathrm{Len}(t_{ij}) \times$ ...
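As a rough illustration of treating the whole phrase as a single term, the sketch below computes a plain maximum likelihood estimate: the number of exact phrase occurrences divided by the document length in tokens. This is a generic estimate written under stated assumptions, not the patent's formula. The $\mathrm{Len}(t_{ij})$ factor and whatever follows it are truncated in the text above and are deliberately not reproduced. The function name `phrase_mle` is hypothetical.

```python
from typing import List


def phrase_mle(doc: List[str], phrase: List[str]) -> float:
    """Plain maximum likelihood estimate with the phrase treated as one term.

    Assumption: estimate = (exact phrase occurrences) / (document length in
    tokens). The Len(t_ij) weighting of the original formula is omitted
    because the rest of that formula is not available.
    """
    n = len(phrase)
    if n == 0 or len(doc) < n:
        return 0.0
    count = sum(1 for i in range(len(doc) - n + 1) if doc[i:i + n] == phrase)
    return count / len(doc)


if __name__ == "__main__":
    doc = "cross language information retrieval improves retrieval of information".split()
    print(phrase_mle(doc, ["information", "retrieval"]))
```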
Specific Embodiment 3
[0045] Specific Embodiment 3. This embodiment differs from Specific Embodiment 1 or 2 in how step 3 obtains the conditional probability $P(t_{ij} \mid D, \theta_{Co})$ of the co-occurrence pattern in the query document $D$. By counting the number of co-occurrences of words within the preset window range in the document, and taking into account the characteristics of phrases in cross-language retrieval, the following co-occurrence pattern is obtained:
[0046] $P(t_{ij} \mid D, \theta_{Co}) = \sum_{s=1}^{n-1} \sum_{t = s + {}}$ ...
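The counting step behind the co-occurrence pattern can be sketched as follows. The code enumerates word pairs $(s, t)$ with $s < t$ and counts how often the two words fall within the preset window of each other. Starting the inner index at $s + 1$ is an assumption, since the lower limit of the inner sum is truncated above. The summand itself is also unavailable, so the sketch stops at raw pair counts and does not compute a probability. The function name and the distance test (`abs(i - j) < window`) are illustrative choices.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def pair_cooccurrences(
    doc: List[str], phrase: List[str], window: int
) -> Dict[Tuple[int, int], int]:
    """Count windowed co-occurrences for every word pair (s, t), s < t.

    Pairs are keyed by 1-based word indices to mirror the s and t of the
    double sum. Two tokens are taken to co-occur when their positions differ
    by less than `window` (an assumption made for this sketch).
    """
    positions = {
        w: [i for i, tok in enumerate(doc) if tok == w] for w in set(phrase)
    }
    counts: Dict[Tuple[int, int], int] = {}
    for s, t in combinations(range(len(phrase)), 2):
        counts[(s + 1, t + 1)] = sum(
            1
            for i in positions[phrase[s]]
            for j in positions[phrase[t]]
            if i != j and abs(i - j) < window
        )
    return counts


if __name__ == "__main__":
    doc = "cross language information retrieval improves retrieval of information".split()
    print(pair_cooccurrences(doc, ["information", "retrieval"], window=3))
```

For a phrase of $n$ words this produces one count per unordered pair, which is the enumeration that the outer limits $s = 1, \ldots, n-1$ suggest.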