A method and device for determining semantic redundancy, and a corresponding search method and device
A technique for determining semantic redundancy, applied in the field of natural language processing. It addresses problems such as results that cannot be recalled or cannot be ranked properly, with the effect of increasing the recall rate and improving search results.
Examples
Embodiment 1
[0055] Figure 1 is a flow chart of the method for determining semantic redundancy provided by Embodiment 1 of the present invention. As shown in Figure 1, the method may include:
[0056] Step 101: Determine word A in semantic redundancy mining.
[0057] In semantically redundant expressions the central word is usually a noun, so this step determines word A mainly from nouns. Statistics are gathered over a large-scale corpus, and nouns whose frequency of occurrence is greater than a preset first frequency threshold are used as word A. The first frequency threshold can be set according to actual needs; for example, a noun that occurs more than 10 times in the corpus is used as word A.
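The selection in Step 101 can be sketched as a simple frequency count. This is a minimal illustration, not the patent's implementation: it assumes the corpus has already been segmented and part-of-speech tagged, and the function name `select_word_a_candidates`, the tag label `"NOUN"`, and the `(word, tag)` token layout are hypothetical choices made for this example.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple

# Assumed token layout: (word, part-of-speech tag). The tag set is hypothetical.
TaggedToken = Tuple[str, str]


def select_word_a_candidates(
    corpus: Iterable[List[TaggedToken]],
    first_frequency_threshold: int = 10,
    noun_tag: str = "NOUN",
) -> Set[str]:
    """Return the nouns whose corpus frequency is strictly greater than the threshold."""
    noun_counts = Counter(
        word
        for sentence in corpus
        for word, tag in sentence
        if tag == noun_tag
    )
    return {
        word
        for word, count in noun_counts.items()
        if count > first_frequency_threshold
    }
```

The comparison is strict (`>`), matching the text's "greater than" wording, so a noun that occurs exactly 10 times is not selected when the threshold is 10.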
[0058] Step 102: Determine the collocation word B of the word A.
[0059] The collocation word B determined in this step is used for subsequent mining of redundant words. In view of forming semantic redundancy with word A...
Embodiment 2
[0080] Figure 2 is a flow chart of the search method provided by Embodiment 2 of the present invention. As shown in Figure 2, the search method includes:
[0081] Step 201: Perform word segmentation processing on the query input by the user.
[0082] Step 202: Determine the collocation word pairs formed by pairs of each word obtained after the word segmentation process.
[0083] Collocation word pairs can be determined in this step in a manner similar to step 102 of Embodiment 1: among the words obtained after word segmentation, two words that co-occur within a preset window range, and whose co-occurrence matches a preset first template, form a collocation word pair. The first template may include, but is not limited to: adjective+noun, noun+noun, noun+verb, verb+noun, and so on.
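The pairing rule in Step 202 can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the patent's method: the query is assumed to be segmented and part-of-speech tagged already, the window is interpreted as "at most `window` token positions apart", and the names `FIRST_TEMPLATE` and `collocation_pairs` as well as the tag labels are hypothetical.

```python
from typing import List, Set, Tuple

# Assumed token layout: (word, part-of-speech tag). The tag set is hypothetical.
TaggedToken = Tuple[str, str]

# The four patterns named in the text; the source notes the list is not exhaustive.
FIRST_TEMPLATE: Set[Tuple[str, str]] = {
    ("ADJ", "NOUN"),
    ("NOUN", "NOUN"),
    ("NOUN", "VERB"),
    ("VERB", "NOUN"),
}


def collocation_pairs(
    tokens: List[TaggedToken],
    window: int = 3,
    template: Set[Tuple[str, str]] = FIRST_TEMPLATE,
) -> List[Tuple[str, str]]:
    """Pair each word with later words at most `window` positions away
    when the ordered pair of tags matches one of the template patterns."""
    pairs: List[Tuple[str, str]] = []
    for i, (first_word, first_tag) in enumerate(tokens):
        last = min(i + window, len(tokens) - 1)
        for j in range(i + 1, last + 1):
            second_word, second_tag = tokens[j]
            if (first_tag, second_tag) in template:
                pairs.append((first_word, second_word))
    return pairs
```

Each resulting pair is what Step 203 would then look up in the semantic redundancy pair database. The window size and tag inventory are tuning choices that the source text leaves open.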
[0084] Step 203: Use the determined collocation word pair to search the semantic redundancy pair database, and if a semantic ...
Embodiment 3
[0093] Figure 3 is a structural diagram of the device for determining semantic redundancy provided by Embodiment 3 of the present invention. As shown in Figure 3, the apparatus may include: a collocation word pair determination unit 300, a context vector determination unit 310 and a redundant pair determination unit 320.
[0094] The collocation word pair determination unit 300 determines the word A and its collocation word B.
[0095] The collocation word pair determination unit 300 may specifically include a candidate word determination subunit 301, configured to determine, as the word A, a noun whose occurrence frequency in the corpus is greater than a preset first frequency threshold.
[0096] In semantically redundant expressions the central word is usually a noun, so the candidate word determination subunit 301 mainly determines the word A based on nouns, and at the same time performs statistics in a large-scale corpus, and nouns with a frequen...