
Short text classification method and system

A short text classification method, applied in special data processing applications, instruments, electrical digital data processing, etc., that addresses the loss of semantics and the resulting inaccurate classification results, ensuring accuracy while improving computational efficiency.

Inactive Publication Date: 2018-11-06
XIAMEN KUAISHANGTONG INFORMATION TECH CO LTD
Cites: 7; Cited by: 24

AI Technical Summary

Problems solved by technology

However, existing methods such as word2vec, sentence2vec, tf-idf and LSI either ignore the semantic and position information of the words in a sentence or ignore the relationship between adjacent words, so the learned representations lose part of the semantics, which makes the classification results inaccurate.



Examples


Detailed Description of the Embodiments

[0034] To make the technical problems to be solved, the technical solutions, and the beneficial effects of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and do not limit it.

[0035] With the advent of the mobile Internet era, short text data such as Weibo posts, comments, and WeChat messages have grown explosively, which places higher demands on text processing. On this basis, the present invention provides a short text classification method which, as shown in Figure 1, includes the following steps:

[0036] a. Perform word segmentation on the short text to obtain the segmented words;

[0037] b. Perform part-of-speech tagging on the words to obtain the part-of-speech vector of the w...
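Steps a and b can be illustrated with a minimal sketch. The source does not say which segmenter or tagger is used, nor how a part-of-speech vector is encoded, so everything below is an assumption made for illustration: whitespace splitting stands in for real word segmentation, `TOY_LEXICON` and `POS_TAGS` are hypothetical, and the part-of-speech vector is assumed to be one-hot.

```python
# Toy stand-ins for steps a and b. A real system would use a proper
# segmenter and POS tagger; the tag set and lexicon here are hypothetical.
POS_TAGS = ["noun", "verb", "adj", "other"]  # assumed tag inventory
TOY_LEXICON = {"phone": "noun", "screen": "noun", "cracked": "verb", "new": "adj"}


def segment(text: str) -> list[str]:
    """Step a (simplified): lowercase and split on whitespace."""
    return text.lower().split()


def pos_vector(word: str) -> list[float]:
    """Step b (assumed encoding): one-hot vector over POS_TAGS."""
    tag = TOY_LEXICON.get(word, "other")
    return [1.0 if tag == t else 0.0 for t in POS_TAGS]


words = segment("New phone screen cracked")
pos_vectors = {w: pos_vector(w) for w in words}
```

Any encoding that gives each tag a fixed-length vector would fit the later steps equally well; one-hot is simply the easiest to inspect.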


Abstract

The invention discloses a short text classification method and system. The method comprises the following steps: performing word segmentation on a short text, performing part-of-speech tagging on the resulting words, and obtaining a part-of-speech vector for each word; multiplying each part-of-speech vector by its corresponding part-of-speech weight to obtain a new part-of-speech vector; multiplying the word vector of each word by its corresponding TF-IDF weight to obtain a weighted word vector; splicing the weighted word vector with the new part-of-speech vector to obtain a weighted spliced word vector; superimposing the weighted spliced word vector of each word with the weighted spliced word vectors of its adjacent words to obtain an adjacency word vector; and classifying the short text according to the adjacency word vectors of its words, which yields a short text classification result with relatively high accuracy. In addition, the method classifies the short text by calculating its similarity to candidate documents from the adjacency word vectors, which further improves computational efficiency.
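The vector steps in the abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patented implementation: "splicing" is taken to mean concatenation, "superimposing" to mean element-wise addition over one neighbour on each side, the text-level vector is assumed to be the mean of the adjacency word vectors, and similarity is assumed to be cosine similarity. All function names, weights, and class labels are hypothetical.

```python
import math


def scale(vec, weight):
    """Multiply every component of vec by a scalar weight."""
    return [weight * x for x in vec]


def spliced_vector(word_vec, tfidf, pos_vec, pos_weight):
    """TF-IDF-weighted word vector concatenated with the weighted POS vector."""
    return scale(word_vec, tfidf) + scale(pos_vec, pos_weight)


def adjacency_vectors(spliced):
    """Add each word's spliced vector to those of its immediate neighbours
    (assumed window: one word on each side)."""
    result = []
    for i, vec in enumerate(spliced):
        total = list(vec)
        for j in (i - 1, i + 1):
            if 0 <= j < len(spliced):
                total = [a + b for a, b in zip(total, spliced[j])]
        result.append(total)
    return result


def text_vector(adjacency):
    """Assumed aggregation: component-wise mean of the adjacency vectors."""
    return [sum(column) / len(adjacency) for column in zip(*adjacency)]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(vec, candidates):
    """Return the label of the candidate vector most similar to vec."""
    return max(candidates, key=lambda label: cosine(vec, candidates[label]))


# Two-word toy text with 2-d word vectors and 2-d POS vectors (made-up values).
spliced = [
    spliced_vector([1.0, 0.0], 2.0, [1.0, 0.0], 0.5),
    spliced_vector([0.0, 1.0], 1.0, [0.0, 1.0], 1.0),
]
adjacency = adjacency_vectors(spliced)
doc_vec = text_vector(adjacency)
candidates = {"complaint": [1.0, 0.5, 0.25, 0.5], "praise": [0.0, 0.0, 1.0, 0.0]}
label = classify(doc_vec, candidates)
```

Because each adjacency vector already folds in its neighbours, word order influences the text-level vector even under a simple mean, which is the property the abstract attributes to the adjacency step.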

Description

Technical field

[0001] The invention relates to the technical field of computer natural language processing, and in particular to a short text classification method and a system applying the method.

Background

[0002] Short text classification is a branch of shallow natural language processing whose processing objects are short text corpora in various forms. In natural language processing, how to represent a word or a sentence has always been a difficult problem. Although existing techniques such as word embedding (Word Embedding) make word representations increasingly expressive, the ability to represent sentences or short texts still needs improvement.

[0003] At present, the better-performing methods include word2vec, sentence2vec, tf-idf, LSI and so on. However, these methods either ignore the semantic information and position information of the words in the sentence, or ignore the association relationship between...


Application Information

IPC(8): G06F17/30, G06F17/27
CPC: G06F40/289
Inventor: 邹辉, 肖龙源, 蔡振华, 李稀敏, 刘晓葳, 谭玉坤
Owner: XIAMEN KUAISHANGTONG INFORMATION TECH CO LTD