Determination method and determination device for semantic redundancy and corresponding search method and device
A technology for determining semantic redundancy, applied in the field of natural language processing. It addresses problems such as results that cannot be recalled or cannot be ranked, and achieves the effect of improving recall rate and search effect.
Example Embodiment
[0054] Embodiment one
[0055] Figure 1 is a flowchart of a method for determining semantic redundancy provided by Embodiment 1 of the present invention. As shown in Figure 1, the method can include:
[0056] Step 101: Determine the word A in semantic redundancy mining.
[0057] In most cases of semantic redundancy, a noun serves as the central word. Therefore, this step determines word A mainly from nouns: statistics are gathered over a large-scale corpus, and nouns whose frequency is greater than a preset first frequency threshold are used as word A. The first frequency threshold can be set according to actual needs; for example, nouns with a frequency greater than 10 in the corpus are used as word A.
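The frequency filter described in step 101 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the corpus has already been segmented and part-of-speech tagged into `(word, tag)` pairs, and the function name `select_candidate_words`, the parameter names, and the `"NOUN"` tag label are all hypothetical choices made for illustration.

```python
from collections import Counter


def select_candidate_words(tagged_sentences, first_freq_threshold=10, noun_tag="NOUN"):
    """Return the nouns whose corpus frequency is greater than the threshold.

    tagged_sentences: iterable of sentences, each a list of (word, tag) pairs.
    The tag label "NOUN" is an assumption; use whatever the tagger emits.
    """
    # Count only tokens tagged as nouns across the whole corpus.
    noun_counts = Counter(
        word
        for sentence in tagged_sentences
        for word, tag in sentence
        if tag == noun_tag
    )
    # "Greater than" the threshold, matching the wording of step 101.
    return {word for word, count in noun_counts.items() if count > first_freq_threshold}


# Toy corpus: "price" occurs 11 times as a noun, "cost" exactly 10 times.
toy_corpus = [[("price", "NOUN"), ("rise", "VERB")]] * 11 + [[("cost", "NOUN")]] * 10
candidates = select_candidate_words(toy_corpus, first_freq_threshold=10)
```

With a threshold of 10, only `price` is kept in this toy corpus, because `cost` occurs exactly 10 times and the comparison is strict.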
[0058] Step 102: Determine the collocation word B of word A.
[0059] The collocation word B determined in this step is used for subsequent mining of redundant words. In view of the semantic redundancy with word A, it usually me...
Example Embodiment
[0079] Embodiment two
[0080] Figure 2 is a flowchart of the search method provided in the second embodiment of the present invention. As shown in Figure 2, the search method includes:
[0081] Step 201: Perform word segmentation processing on the query input by the user.
[0082] Step 202: Determine the collocation word pair formed by each word after word segmentation processing.
[0083] The collocation word pairs in this step can be determined in a manner similar to step 102 in the first embodiment: among the words obtained after word segmentation processing, two words that co-occur within a preset window range, and whose co-occurrence satisfies a preset first template, form a collocation word pair. The first template may include, but is not limited to: adjective + noun, noun + noun, noun + verb, verb + noun, and so on.
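The pairing rule in step 202 can be sketched in Python as follows. The sketch rests on several assumptions: the query has already been segmented and part-of-speech tagged, the "preset window range" is read as the number of following tokens to consider, and the tag labels (`"ADJ"`, `"NOUN"`, `"VERB"`) as well as the names `FIRST_TEMPLATE` and `collocation_pairs` are hypothetical. The source does not specify any of these details.

```python
# Part-of-speech patterns named in the first template (tag labels are assumed).
FIRST_TEMPLATE = {
    ("ADJ", "NOUN"),
    ("NOUN", "NOUN"),
    ("NOUN", "VERB"),
    ("VERB", "NOUN"),
}


def collocation_pairs(tagged_tokens, window=2, template=FIRST_TEMPLATE):
    """Return ordered word pairs that co-occur within `window` tokens
    and whose tag sequence matches one of the template patterns.

    tagged_tokens: list of (word, tag) pairs for one segmented query.
    window: how many following tokens each word may pair with (an assumption
            about what the preset window range means).
    """
    pairs = []
    for i, (left_word, left_tag) in enumerate(tagged_tokens):
        # Look ahead at most `window` tokens from position i.
        for j in range(i + 1, min(i + 1 + window, len(tagged_tokens))):
            right_word, right_tag = tagged_tokens[j]
            if (left_tag, right_tag) in template:
                pairs.append((left_word, right_word))
    return pairs


# Toy query, already segmented and tagged.
query = [("free", "ADJ"), ("gift", "NOUN"), ("download", "VERB")]
pairs = collocation_pairs(query, window=2)
```

For this toy query, `free` + `gift` matches adjective + noun and `gift` + `download` matches noun + verb, while `free` + `download` falls inside the window but matches no pattern and is dropped.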
[0084] Step 203: Use the determined collocation word pair to search the semantic redundancy pair database, and if the semant...
Example Embodiment
[0092] Embodiment three
[0093] Figure 3 is a structural diagram of the device for determining semantic redundancy provided in the third embodiment of the present invention. As shown in Figure 3, the device may include: a collocation word pair determining unit 300, a context vector determining unit 310, and a redundant pair determining unit 320.
[0094] The collocation word pair determining unit 300 determines the word A and its collocation word B.
[0095] The collocation word pair determining unit 300 may specifically include: a candidate word determining subunit 301, configured to determine a noun whose appearance frequency is greater than a preset first frequency threshold in the corpus as the word A.
[0096] Since nouns are used as the central word in most cases of semantic redundancy, the candidate word determination subunit 301 uses nouns as the main word to determine the word A, and at the same time, performs statistics in a large-scale corpus to find nouns with a ...