Chinese word relevancy calculation method and device based on Wikipedia concept vectors
A concept-vector calculation method, applied to computing Chinese word relevance based on Wikipedia concept vectors, that addresses problems such as the inability to accurately distinguish concepts.
Embodiment Construction
[0105] To help those skilled in the art better understand the solutions of the embodiments of the present invention, the embodiments are described in further detail below with reference to the drawings and implementations.
[0106] The flow chart of the Chinese word relevance calculation method based on Wikipedia concept vectors, according to an embodiment of the present invention, is shown in Figure 1 and includes the following steps.
[0107] Step 101, constructing a basic corpus of Wikipedia.
[0108] Obtain the raw Dump corpus from the Wikipedia Dump service site and normalize it, keeping only the Wikipedia concept documents whose namespace attribute is 0. For each concept document, keep only its main body text and concept annotation information. The processed concept documents are collected as the basic corpus of Wikipedia, specifically:
[0109] Step 1-1) Visit the Wikipedia Dump service site and download the latest zhwiki...
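The filtering described in Step 101 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `iter_concept_documents` and `extract_concept_links` are hypothetical, the input is assumed to be a MediaWiki XML export in which each `<page>` carries `<title>`, `<ns>` and `<text>` elements, and "concept annotation information" is assumed here to mean internal links of the form `[[target|anchor]]`. The inline sample stands in for a real dump file.

```python
import io
import re
import xml.etree.ElementTree as ET

# Assumption: concept annotations are internal wiki links, [[target]] or
# [[target|anchor]]; an optional "#section" suffix on the target is dropped.
LINK_RE = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|([^\[\]]*))?\]\]")


def local_name(tag):
    """Strip the XML namespace prefix, e.g. '{uri}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def iter_concept_documents(xml_stream):
    """Yield (title, wikitext) for every page whose <ns> value is 0."""
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        fields = {}
        for child in elem.iter():
            name = local_name(child.tag)
            # Keep the first occurrence of each field of interest.
            if name in ("title", "ns", "text") and name not in fields:
                fields[name] = child.text or ""
        if fields.get("ns", "").strip() == "0":
            yield fields.get("title", ""), fields.get("text", "")
        elem.clear()  # release the page's subtree to limit memory use


def extract_concept_links(wikitext):
    """Return (target concept, anchor text) pairs found in the wikitext."""
    links = []
    for match in LINK_RE.finditer(wikitext):
        target = match.group(1).strip()
        anchor = (match.group(2) or target).strip()
        links.append((target, anchor))
    return links


# Tiny hand-written sample in place of a downloaded dump (illustrative only).
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page><title>数学</title><ns>0</ns><revision><text>数学是研究[[数量]]与[[结构 (数学)|结构]]的学科。</text></revision></page>
  <page><title>Talk:数学</title><ns>1</ns><revision><text>讨论页</text></revision></page>
</mediawiki>"""

docs = list(iter_concept_documents(io.BytesIO(SAMPLE.encode("utf-8"))))
for title, text in docs:
    print(title, extract_concept_links(text))
```

Only the page with namespace 0 survives the filter; the talk page is discarded. The link pattern is deliberately simple and would also match file and category links, so a fuller normalization step would need additional rules that the excerpt above does not specify.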