Method and device for searching documents
A technique for documents and document sets, applied in the field of retrieval, that addresses the problem that existing ranking methods cannot be applied well to heterogeneous academic networks.
Examples
Embodiment 1
[0076] In a preferred embodiment, the method for ranking document importance proposed by the present invention comprises:
[0077] Step 1: Topic Modeling
[0078] The purpose of step one is to discover topics from document collections using a probabilistic topic model. Probabilistic topic models can efficiently mine topics in document collections. In these methods, documents are usually assumed to be generated from a mixture of |T| probabilistic models. Latent Dirichlet Allocation (LDA) is a widely used topic model. In this model, the likelihood of a document set D is defined as:
[0079] $P(z, w \mid \Theta, \Phi) = \prod_{d \in D} \prod_{z \in T} \theta_{dz}^{n_{dz}} \cdots$
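The visible part of the likelihood can be evaluated in log space as a minimal sketch. It assumes the usual LDA reading in which θ_dz is the probability of topic z in document d and n_dz is the number of words in d assigned to z. The function name and the toy values are hypothetical, and the factors elided in the formula above are not reproduced.

```python
import math


def doc_topic_log_factor(theta, n):
    """Return log of prod_d prod_z theta[d][z] ** n[d][z].

    theta[d][z]: assumed probability of topic z in document d.
    n[d][z]:     assumed count of words in document d assigned to topic z.
    Only the document-topic factor shown in the formula is computed.
    """
    total = 0.0
    for theta_d, n_d in zip(theta, n):
        for prob, count in zip(theta_d, n_d):
            if count > 0:  # a zero count contributes a factor of 1
                total += count * math.log(prob)
    return total


# Toy data (hypothetical): two documents, two topics.
theta = [[0.5, 0.5], [0.25, 0.75]]
n = [[1, 1], [0, 2]]
log_factor = doc_topic_log_factor(theta, n)
```

Working in log space avoids numerical underflow when the counts are large, which is why the product is accumulated as a sum of logarithms.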
Embodiment 2
[0115] In a preferred embodiment, the document retrieval device proposed by the present invention comprises:
[0116] A topic identification module, which uses a probabilistic topic model to identify topics in the document set and obtains the topic distribution of each document from the identified topics;
[0117] A random walk module, which calculates a random walk ranking score for each document according to its topic distribution;
[0118] A retrieval module, which calculates a relevance score between each document and the query keywords, and combines the random walk ranking score with the relevance score to obtain the retrieval result.
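How the random walk module and the retrieval module might fit together can be sketched as follows. The text above does not give the walk or the combination rule, so everything here is an assumption: the walk is a PageRank-style power iteration over a link graph whose restart distribution is proportional to each document's weight for one topic, and the two scores are combined by multiplication. The names `topic_biased_random_walk`, `combine_scores`, `damping`, and the toy data are hypothetical.

```python
def topic_biased_random_walk(links, topic_weight, damping=0.85, iters=50):
    """Random walk score per document for a single topic (assumed form).

    links[i]:        list of document indices that document i links to.
    topic_weight[i]: weight of the topic in document i, used as the
                     (unnormalized) restart distribution.
    """
    n = len(links)
    total = sum(topic_weight)
    restart = [w / total for w in topic_weight]
    rank = restart[:]
    for _ in range(iters):
        new_rank = [(1.0 - damping) * r for r in restart]
        for src, targets in enumerate(links):
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
            else:
                # Documents without outgoing links jump back to the restart distribution.
                for j in range(n):
                    new_rank[j] += damping * rank[src] * restart[j]
        rank = new_rank
    return rank


def combine_scores(rank, relevance):
    """Combine ranking and relevance scores by product (an assumption)."""
    return [r * s for r, s in zip(rank, relevance)]


# Toy data (hypothetical): four documents, one topic, one query.
links = [[1], [2], [0], [2]]
topic_weight = [0.1, 0.2, 0.6, 0.1]
relevance = [0.3, 0.0, 0.9, 0.5]
rank = topic_biased_random_walk(links, topic_weight)
final = combine_scores(rank, relevance)
order = sorted(range(len(final)), key=lambda i: final[i], reverse=True)
```

A product makes a document with zero relevance to the query drop out regardless of its ranking score; a weighted sum would be an equally plausible choice where the source leaves the rule open.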
[0119] Wherein the topic identification module includes:
[0120] A parameter calculation submodule, which calculates the posterior probability distribution over topic z using Gibbs sampling: ...
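For orientation, a collapsed Gibbs sampler for LDA can be sketched as below. The formula in the text above is elided, so this uses the standard textbook conditional, in which the probability of assigning a word to topic k is proportional to (n_dk + α)(n_kv + β)/(n_k + Vβ) with the current word excluded from the counts. It may differ from the submodule's actual formula. The hyperparameters `alpha` and `beta`, the function name, and the toy corpus are assumptions.

```python
import random


def gibbs_lda(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, sweeps=200, seed=0):
    """Estimate per-document topic distributions with collapsed Gibbs sampling.

    docs: list of documents, each a list of integer word ids in [0, vocab_size).
    Returns theta, where theta[d][k] estimates the share of topic k in document d.
    """
    rng = random.Random(seed)
    n_dz = [[0] * num_topics for _ in docs]                 # document-topic counts
    n_zv = [[0] * vocab_size for _ in range(num_topics)]    # topic-word counts
    n_z = [0] * num_topics                                  # words per topic
    assign = []
    for d, doc in enumerate(docs):                          # random initial topics
        zs = []
        for v in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            n_dz[d][z] += 1
            n_zv[z][v] += 1
            n_z[z] += 1
        assign.append(zs)
    for _ in range(sweeps):
        for d, doc in enumerate(docs):
            for i, v in enumerate(doc):
                z = assign[d][i]
                # Remove the current word from the counts before resampling it.
                n_dz[d][z] -= 1
                n_zv[z][v] -= 1
                n_z[z] -= 1
                weights = [
                    (n_dz[d][k] + alpha) * (n_zv[k][v] + beta) / (n_z[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                n_dz[d][z] += 1
                n_zv[z][v] += 1
                n_z[z] += 1
    return [
        [(n_dz[d][k] + alpha) / (len(doc) + num_topics * alpha) for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]


# Toy corpus (hypothetical): two documents over a four-word vocabulary.
theta = gibbs_lda([[0, 0, 1], [2, 3, 3]], num_topics=2, vocab_size=4, sweeps=20)
```

Each row of `theta` is a smoothed, normalized count vector, so it can be passed on as the topic distribution of a document.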