Short text query expansion and indexing method based on word vector
A short text query expansion technology, applied in the field of word-vector-based short text query expansion and retrieval. It addresses problems such as reduced retrieval accuracy, topic drift, and noise. It avoids having to specify the number of clusters and to run an iterative process, which reduces time complexity while still meeting the clustering requirements.
Embodiment
[0068] To illustrate in detail how the system works, its procedure is described below with a specific example.
[0069] A. Preprocessing of the short text corpus
[0070] Short texts and forwarded texts of fewer than 20 characters are deleted directly. The remaining texts in the corpus are segmented into words. A corpus dictionary is built that records the number of occurrences of each word, and words that occur too infrequently are removed. An inverted index is then created for the remaining short texts.
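The preprocessing steps in [0070] can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `tokenize` and `preprocess`, the whitespace tokenizer, and the frequency cutoff `MIN_COUNT = 2` are assumptions (the source does not state the cutoff, the segmenter, or how forwarded texts are detected). Only the 20-character threshold comes from the text.

```python
from collections import Counter, defaultdict

MIN_CHARS = 20  # length threshold stated in [0070]
MIN_COUNT = 2   # assumed cutoff for "too infrequent"; the source gives no value


def tokenize(text):
    # Placeholder segmenter (assumption): lowercase and split on whitespace.
    # A Chinese corpus would need a dedicated word segmenter instead.
    return text.lower().split()


def preprocess(corpus, min_chars=MIN_CHARS, min_count=MIN_COUNT):
    # 1. Delete texts shorter than the character threshold.
    kept = [t for t in corpus if len(t) >= min_chars]
    # 2. Segment the remaining texts into words.
    docs = [tokenize(t) for t in kept]
    # 3. Build the dictionary with occurrence counts and drop rare words.
    counts = Counter(w for words in docs for w in words)
    vocab = {w: c for w, c in counts.items() if c >= min_count}
    # 4. Build the inverted index: word -> ids of the texts containing it.
    index = defaultdict(set)
    for doc_id, words in enumerate(docs):
        for w in words:
            if w in vocab:
                index[w].add(doc_id)
    return kept, vocab, dict(index)


corpus = [
    "word vectors help short text retrieval",
    "too short",
    "short text retrieval needs query expansion",
]
kept, vocab, index = preprocess(corpus)
print(sorted(vocab))   # words that survive the frequency cutoff
print(index["short"])  # ids of the kept texts that contain "short"
```

The index maps each retained word to a set of document ids, so a lookup for one query term runs in expected constant time, and multi-term queries reduce to set intersection or union.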
[0071] B. Training a model to represent each word in the corpus dictionary as a word vector
[0072] As shown in Figure 2, each word is encoded, and a logistic regression model is trained as a classifier on the word's context information, which yields a vector representation of each word.
[0073] For convenience of illustration, assume the input data is X = [0.2, -0.1, 0.3, -0.2]^T, training to...
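The worked example in [0073] is truncated in the source and is not reconstructed here. As a hedged sketch of what a single logistic-regression training step on such an input could look like, the code below applies one gradient update to the input vector X and a parameter vector. The parameter vector `theta`, the learning rate, the label, and the function names are all hypothetical; the source does not give them, and the patent's actual encoding and training scheme may differ.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def logistic_step(word_vec, param_vec, label, lr=0.1):
    """One gradient step on the log loss for a single (input, label) pair.

    label is 1 if the classifier should output "yes" for this input, else 0.
    Both the input vector and the parameter vector are updated.
    """
    p = sigmoid(dot(word_vec, param_vec))  # predicted probability
    g = lr * (label - p)                   # scaled negative gradient of the log loss
    new_word = [w + g * t for w, t in zip(word_vec, param_vec)]
    new_param = [t + g * w for w, t in zip(word_vec, param_vec)]
    return new_word, new_param, p


x = [0.2, -0.1, 0.3, -0.2]      # input vector X from [0073]
theta = [0.1, 0.4, -0.3, 0.2]   # hypothetical parameter vector (not in the source)

p_before = sigmoid(dot(x, theta))
x_new, theta_new, _ = logistic_step(x, theta, label=1)
p_after = sigmoid(dot(x_new, theta_new))
print(p_before, p_after)        # the probability moves toward the label
```

Repeating such steps over many (word, context) pairs is what turns the initially arbitrary vectors into representations that reflect context, which is the role the description assigns to the logistic regression model.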