Method and device for adding tags to short texts automatically
An automatic text-tagging technology, applied in the Internet field, that addresses problems such as low accuracy, inappropriate keywords, and additional user operations, thereby improving tagging accuracy.
Examples
Embodiment 1
[0024] An embodiment of the present invention provides a method for automatically adding tags to short texts. As shown in Figure 1, the method includes:
[0025] Step 101, calculating the inverse document frequency of each tag word in the tag word set;
[0026] Optionally, a tag word set and a corpus associated with the tag word set are preset. In statistical natural language processing, large-scale language instances generally cannot be observed directly, so text is used as a substitute, and the context within the text stands in for the context of language in the real world. A collection of texts is called a corpus. Optionally, relevant texts are collected from the Internet; for example, the question-and-answer content in Tencent's "Wanwen" product can be used as the corpus.
[0027] Segment the corpus, that is, split each sentence into individual words; for example, segmenting the sentence "This is a method for automatically adding tags to short texts",...
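The inverse document frequency statistic of Step 101 can be illustrated with a minimal sketch. It assumes the corpus has already been segmented into lists of words, and it assumes the common smoothed form idf = log(N / (1 + df)), where N is the number of documents and df is the number of documents containing the tag word; the excerpt does not give the exact formula used by the embodiment. The function name, the sample corpus, and the sample tag words are hypothetical.

```python
import math
from collections import Counter


def inverse_document_frequency(tag_words, segmented_docs):
    """Return {tag_word: idf}, assuming idf = log(N / (1 + df))."""
    n_docs = len(segmented_docs)
    doc_freq = Counter()
    for tokens in segmented_docs:
        # Count each tag word at most once per document.
        for word in set(tokens) & tag_words:
            doc_freq[word] += 1
    return {w: math.log(n_docs / (1 + doc_freq[w])) for w in tag_words}


# Hypothetical corpus, already segmented into words.
corpus = [
    ["comedy", "movie", "recommend"],
    ["comedy", "actor", "new", "movie"],
    ["thriller", "movie", "tonight"],
    ["which", "actor", "stars", "in", "this", "comedy"],
]
# Hypothetical tag word set (must be a set for the intersection above).
tag_words = {"comedy", "thriller", "actor"}

idf = inverse_document_frequency(tag_words, corpus)
for word in sorted(idf, key=idf.get, reverse=True):
    print(word, round(idf[word], 4))
```

Under this assumed formula, a tag word that appears in few documents ("thriller") receives a higher value than one that appears in most of them ("comedy"), which is the usual reason for weighting by inverse document frequency.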
Embodiment 2
[0057] An embodiment of the present invention provides a method for automatically adding tags to short texts. As shown in Figure 3, the method includes:
[0058] Step 301, presetting the tag word set and the corpus associated with the tag word set;
[0059] Optionally, a tag word set is obtained according to requirements. For example, to add tags to film and television content, a set of commonly used film and television tags is collected, including film and television genres, stars, and so on.
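As an illustration only, a preset tag word set for film and television content might be kept grouped by category and flattened into a single set for the later steps. The category names and entries below are hypothetical examples, not taken from the embodiment.

```python
# Hypothetical tag categories for film and television content.
tag_categories = {
    "genre": {"comedy", "thriller", "documentary"},
    "star": {"Jackie Chan", "Gong Li"},
}

# Flatten the categories into one tag word set.
tag_word_set = set().union(*tag_categories.values())

print(sorted(tag_word_set))
```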
[0060] In statistical natural language processing, large-scale language instances generally cannot be observed directly, so text is used as a substitute, and the context within the text stands in for the context of language in the real world. A collection of texts is called a corpus. Optionally, relevant texts are collected from the Internet; for example, the question-and-answer content in Tencent's "Wanwen" product can be used as the cor...