Class center vector text classification method based on dependency, word class and semantic dictionary

A technique based on dependency relationships and a semantic dictionary, applied in the field of class center vector text classification. It addresses problems such as large vector dimension, sparse vector weights, and low classification accuracy, and achieves the effects of reduced vector weights, high classification efficiency, and high classification accuracy.
CN108763402A · Active · Publication Date: 2018-11-06 · 深圳占领信息技术有限公司 +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
深圳占领信息技术有限公司
Publication Date
2018-11-06


Abstract

The invention relates to text classification in natural language processing, and specifically to a class center vector text classification method based on dependency, word class and a semantic dictionary. To overcome the semantic shortcomings of statistics-based feature selection algorithms, the invention introduces dependency relationships, a semantic dictionary and word class to optimize and cluster text features, provides an improved weight calculation formula, and further provides an improved class center vector text classification method. The text classification method of the invention combines the high classification efficiency of the traditional class center vector method with the high classification precision of the K-nearest neighbor algorithm, and can be widely used in various classification systems.
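
For orientation, the traditional class center vector baseline that the abstract refers to can be sketched as follows. This is a minimal illustration only, not the improved method of the invention: the plain term-frequency weighting, the cosine similarity measure, the function names and the toy training data are all assumptions made for the example, and the dependency, word class and semantic dictionary steps are not modelled.

```python
import math
from collections import Counter, defaultdict


def tf_vector(tokens):
    """Sparse term-frequency vector (assumed weighting, for illustration)."""
    return Counter(tokens)


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def class_centers(labeled_docs):
    """Average the document vectors of each class into one center vector."""
    sums = defaultdict(Counter)
    counts = Counter()
    for label, tokens in labeled_docs:
        sums[label].update(tf_vector(tokens))  # adds term counts
        counts[label] += 1
    return {
        label: {t: w / counts[label] for t, w in vec.items()}
        for label, vec in sums.items()
    }


def classify(tokens, centers):
    """Assign the class whose center vector is most similar to the document."""
    vec = tf_vector(tokens)
    return max(centers, key=lambda label: cosine(vec, centers[label]))


# Hypothetical toy corpus: (class label, tokenized document).
train = [
    ("sports", ["match", "team", "goal", "team"]),
    ("sports", ["team", "coach", "goal"]),
    ("finance", ["stock", "market", "bank"]),
    ("finance", ["bank", "loan", "market", "rate"]),
]
centers = class_centers(train)
predicted = classify(["goal", "team", "bank"], centers)
print(predicted)
```

A new document is compared against one vector per class rather than against every training document, which is where the efficiency advantage over a K-nearest neighbor classifier comes from.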

Description

Technical field

[0001] The invention relates to text classification in natural language processing, in particular to a class center vector text classification method based on dependency relationship, part of speech and semantic dictionary.

Background technique

[0002] With the rapid development of computer technology, especially in the "Internet +" era, network information such as documents, pictures, audio and video has grown explosively, and a large amount of the information encountered in daily life now exists in the form of electronic files. How to obtain the desired information from massive data is a hot and difficult topic in current research, and text classification is one of the important research directions.

[0003] Text classification is an important research direction in text processing technology, which began in the 1950s. It is a comprehensive technology integrating linguistics, mathematics, computer science and cognitive science. In the late 1950s, H.P. ...

Claims
