
A method and device for extracting core words

The technology uses a core word thesaurus and a non-core word thesaurus and is applied in the field of core word extraction. It addresses inaccurate or wrong query results, raises the probability that core words are correctly identified, and improves query accuracy.

Active Publication Date: 2017-09-15
ALIBABA (CHINA) CO LTD
4 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

The existing segmentation-based query method has the following technical defect: segmenting a query word yields multiple segments, but some of them are not core words of the query word (a core word is the smallest complete word unit that can accurately express the meaning of the query word). If the query results obtained from these non-core segments have the highest frequency, then taking the highest-frequency result as the query result may not return what the user actually needs, resulting in inaccurate or wrong query results.
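The defect described above can be illustrated with a minimal sketch. The index contents, the POI names, and the function `naive_query` are hypothetical and chosen only to show how results from non-core segments can outvote the result the user wants.

```python
from collections import Counter

# Hypothetical POI index: segment -> POI names it matches (illustrative data only).
POI_INDEX = {
    "where": ["Where Is Cafe"],
    "is": ["Where Is Cafe"],
    "hospital": ["City Hospital"],
}


def naive_query(segments):
    """Baseline approach: query with every segment and return the most frequent hit."""
    hits = Counter()
    for seg in segments:
        hits.update(POI_INDEX.get(seg, []))
    return hits.most_common(1)[0][0] if hits else None


# "where" and "is" are non-core segments, yet together they outvote "hospital".
print(naive_query(["where", "is", "hospital"]))  # Where Is Cafe
print(naive_query(["hospital"]))                 # City Hospital
```

Querying with only the core word returns the intended result, which is the motivation for extracting core words before querying.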



Examples


Embodiment Construction

[0023] To make the above objects, features, and advantages of the present invention easier to understand, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings and specific implementations.

[0024] Figure 1 is a flowchart of a method for extracting core words provided by an embodiment of the present invention. The method can be applied to any scenario that requires a query word to be entered, such as map search and nearby search. A core word thesaurus storing known core words and a non-core word thesaurus storing known non-core words can be configured in advance. The method includes:

[0025] S110. Segment the query word using a preset word segmentation method to obtain the segments that make up the query word;

[0026] Wherein, the preset word segmentation methods may include basic word segmentation methods, mixed word segmentation me...
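The visible text does not say how the preset segmentation methods work. As a stand-in for step S110, the following minimal sketch uses dictionary-based forward maximum matching, a common technique for segmenting Chinese text. The function name `segment`, the vocabulary, and the `max_len` parameter are assumptions for illustration, not the patent's method.

```python
# Hypothetical vocabulary (glosses: Beijing, university, Peking University, nearby, hotel).
VOCABULARY = {"北京", "大学", "北京大学", "附近", "酒店"}


def segment(query, vocabulary, max_len=4):
    """Forward maximum matching: at each position take the longest vocabulary word.

    Characters that start no vocabulary word are emitted as single-character segments.
    """
    segments = []
    i = 0
    while i < len(query):
        # Try the longest candidate first and shrink until a hit or a single character.
        for size in range(min(max_len, len(query) - i), 0, -1):
            candidate = query[i:i + size]
            if size == 1 or candidate in vocabulary:
                segments.append(candidate)
                i += size
                break
    return segments


print(segment("北京大学附近酒店", VOCABULARY))  # ['北京大学', '附近', '酒店']
```

Any segmenter that returns an ordered list of segments could be substituted here; the later matching steps only depend on that list.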


Abstract

An embodiment of the present invention discloses a method and device for extracting core words, which can extract relatively accurate core words from query words input by users and thereby improve query accuracy. The method includes: segmenting the query word using a preset word segmentation method to obtain the segments of the query word; matching each segment against the words in a core word thesaurus and a non-core word thesaurus; and, if there are segments matching the core word thesaurus and/or segments matching the non-core word thesaurus, and there are also unknown segments, then: determining the segments that match the core word thesaurus as core words of the query word; and taking unknown segments, or splices of unknown segments, that satisfy a preset core word length standard as core words of the query word. An unknown segment is a segment that matches no word in either the core word thesaurus or the non-core word thesaurus.
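The matching and selection steps in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the length standard is modeled as a `min_len`/`max_len` range, splicing joins adjacent unknown segments, and a splice is tried only when no single segment in the run qualifies. None of these specifics are given in the abstract, and the function name and sample data are hypothetical.

```python
def extract_core_words(segments, core_lexicon, non_core_lexicon, min_len=2, max_len=6):
    """Pick core words from an ordered list of segments.

    Segments found in core_lexicon are core words; segments found in
    non_core_lexicon are dropped; the rest are "unknown" and are kept only
    if they (or a splice of adjacent unknown segments) meet the assumed
    length standard.
    """
    def meets_length(word):
        return min_len <= len(word) <= max_len

    core_words = []
    run = []  # consecutive unknown segments not yet decided

    def flush_run():
        if not run:
            return
        qualifying = [s for s in run if meets_length(s)]
        if qualifying:
            core_words.extend(qualifying)
        else:
            spliced = "".join(run)  # assumption: splice adjacent unknown segments
            if meets_length(spliced):
                core_words.append(spliced)
        run.clear()

    for seg in segments:
        if seg in core_lexicon:
            flush_run()
            core_words.append(seg)
        elif seg in non_core_lexicon:
            flush_run()
        else:
            run.append(seg)
    flush_run()
    return core_words


# Glosses: 望 + 京 splice to the place name 望京; 附近 = nearby; 酒店 = hotel.
print(extract_core_words(["望", "京", "附近", "酒店"], {"酒店"}, {"附近"}))  # ['望京', '酒店']
```

The abstract only describes the case where known and unknown segments coexist; the sketch applies the same rules to any input for simplicity.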

Description

Technical field

[0001] The invention relates to the field of word processing, and in particular to a method and device for extracting core words.

Background technique

[0002] In electronic map query applications, when a POI query is performed according to the query word input by the user, the usual practice is to first segment the query word and then match each segment against the POI database to obtain multiple query results. The query result with the highest frequency among them is used as the result of the query. However, this query method has the following technical defect: segmenting a query word yields multiple segments, but some of them are not core words of the query word (a core word is the smallest complete word unit that can accurately express the meaning of the query word). If the query results obtained based on these non-core word querie...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/27, G06F17/30
Inventor: 彭松
Owner: ALIBABA (CHINA) CO LTD