Method and system for calculating new words in text based on word frequency matrix feature vectors
A method and system for calculating new words in text based on the eigenvectors of a word frequency matrix. It addresses the low accuracy, low efficiency, and high cost of existing approaches, and achieves high accuracy and computational efficiency.
Embodiment Construction
[0051] The technical solutions of the present invention will be further described below in conjunction with the accompanying drawings and embodiments.
[0052] The method of the present invention for calculating new words in a text based on the eigenvectors of the word frequency matrix can be distributed and parallelized on a large scale, so new words in more than 1 million documents can be mined within one hour. The following takes one of the documents as an example to show how the invention is implemented.
[0053] S1: Calculation of the word frequency dictionary of the text set
[0054] Figure 2 shows a screenshot of a piece of network news, in which some network buzzwords (new words) are marked by boxes.
[0055] First, preprocess the article by uniformly replacing every punctuation mark in it with "|", as shown in Figure 3.
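The preprocessing step can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `replace_punctuation` is hypothetical, and treating "punctuation" as any character in a Unicode `P*` category, replaced one-for-one, is an assumption of this sketch.

```python
import unicodedata


def replace_punctuation(text: str, sep: str = "|") -> str:
    """Replace every punctuation character in `text` with `sep`.

    Assumption: a character counts as punctuation when its Unicode general
    category starts with "P" (this covers both ASCII and full-width CJK
    marks). Each mark is replaced one-for-one; runs are not collapsed.
    """
    return "".join(
        sep if unicodedata.category(ch).startswith("P") else ch
        for ch in text
    )


if __name__ == "__main__":
    print(replace_punctuation("今天天气真好，给力！"))  # 今天天气真好|给力|
```

The "|" markers keep sentence and clause boundaries visible, so a later step could avoid joining characters that were separated by punctuation in the original article.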
[0056] Use a conventional word segmentation method to perform segmentation only on the text, and count the f...
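The paragraph above is truncated in the source, so exactly what is counted is not shown. Going only by the title of step S1 (a word frequency dictionary), one possible sketch is given below. It assumes the segmenter has already produced whitespace-delimited tokens; the function name `word_frequencies` and this input format are hypothetical, and no particular segmentation library is implied.

```python
from collections import Counter


def word_frequencies(segmented_text: str, sep: str = "|") -> Counter:
    """Count tokens in whitespace-delimited segmenter output.

    Assumption: `segmented_text` is the preprocessed article after word
    segmentation, with tokens separated by whitespace. The "|" markers that
    replaced punctuation are boundaries, not words, so they are skipped.
    """
    counts = Counter()
    for token in segmented_text.split():
        # A separator may be glued to a token (e.g. "好|"), so split on it
        # and keep only the non-empty pieces.
        for piece in token.split(sep):
            if piece:
                counts[piece] += 1
    return counts


if __name__ == "__main__":
    freq = word_frequencies("今天 天气 真 好 | 今天 给力 |")
    print(freq.most_common(1))  # [('今天', 2)]
```

Because per-document counters can be summed (`Counter` supports `+` and `update`), this kind of counting could be run independently on each document and merged afterwards, which fits the distributed processing mentioned in [0052].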