A News Classification Method Based on Text Context Structure and Attribute Information Overlay Network

A technique that combines attribute information with text information, applied in the field of information processing. It addresses the problem of missing and neglected features and achieves more effective and more diverse feature construction.

Active Publication Date: 2022-05-03
UNIV OF ELECTRONICS SCI & TECH OF CHINA

AI Technical Summary

Problems solved by technology

Finally, for newspaper and periodical news, the main factor in whether an item becomes front-page news is the amount of information carried in the context of the news text. However, because of the constraints of front-page layout and later typesetting, news that is too long or too short is unlikely to make the front page.
Most earlier techniques considered only text context structure information and ignored text attribute information such as "title length" and "text length", which left these features missing.


Examples

Embodiment Construction

[0036] Detailed description of the embodiments

[0037] In order to make the purpose of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings.

[0038] Figure 1 shows the steps of the news classification method proposed by the present invention: data acquisition, generation of text feature vectors, generation of attribute feature vectors, data set division, weighted random sampling, compound model training, and final classification prediction.
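
The weighted random sampling step can be illustrated with a minimal Python sketch. The source states only that weights adjust the probability of a sample being selected. Weighting each sample by the inverse frequency of its class, as well as the names `weighted_sample`, `samples`, and `labels` and the toy data, are assumptions made for illustration and are not taken from the patent.

```python
import random
from collections import Counter


def weighted_sample(samples, labels, k, seed=0):
    """Draw k samples with replacement, weighted by inverse class frequency.

    The inverse-frequency weighting is an assumed scheme; the source only
    says that weights adjust how likely each sample is to be selected.
    """
    counts = Counter(labels)
    # Samples from rarer classes receive proportionally larger weights.
    weights = [1.0 / counts[y] for y in labels]
    rng = random.Random(seed)
    idx = rng.choices(range(len(samples)), weights=weights, k=k)
    return [samples[i] for i in idx], [labels[i] for i in idx]


# Hypothetical toy data: label 1 = front-page news, label 0 = other news.
labels = [1] * 10 + [0] * 90
samples = [f"news_{i}" for i in range(100)]
drawn, drawn_labels = weighted_sample(samples, labels, k=2000)
```

With these weights the two classes are drawn with roughly equal probability even though the original data is imbalanced 1 to 9, which is one way such sampling can change the composition of the training set.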

[0039] Figure 2 shows how the present invention converts text into vectors. The principle is as follows:

[0040] Word vectors and text vectors are trained simultaneously. Let p_i be the encoding vector of text d_i, and let w_t be the encoding vector of word t in the text. The vector of word t at its j-th occurrence in text d_i can then be constructed as follows:

[0041]...


Abstract

The invention discloses a newspaper news classification method based on a superposition network of text context structure and attribute information, belonging to the field of information processing. The invention uses a text vector representation method to convert variable-length text into fixed-length vectors, which avoids the loss and redundancy of text information and improves the extraction of text features. From the perspective of training data, weighted random sampling adjusts the probability that each sample is selected through its weight, optimizing the composition of the training samples. From the perspective of feature extraction, the invention considers not only context structure information but also text attribute information, so that attribute features of the news are incorporated into feature construction and the sources of features are enriched.
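
As a rough illustration of adding attribute features to a fixed-length text vector, the following Python sketch joins the two by concatenation. The source names "title length" and "text length" as attribute information but does not state how the two feature sources are combined. Concatenation, measuring length in characters, the function names `attribute_features` and `overlay`, and the placeholder vector values are all assumptions.

```python
def attribute_features(title, body):
    """Return the two attribute features named in the source.

    Lengths are measured in characters here, which is an assumption.
    """
    return [float(len(title)), float(len(body))]


def overlay(text_vector, attributes):
    """Combine both feature sources by concatenation (an assumed operation)."""
    return list(text_vector) + list(attributes)


# Placeholder fixed-length text vector; real values would come from the
# trained text representation model, which is not reproduced here.
text_vector = [0.12, -0.40, 0.88, 0.05]
features = overlay(
    text_vector,
    attribute_features("City opens new bridge", "The bridge opened on Monday."),
)
```

The resulting vector has the same length for every news item regardless of how long its text is, so it can be passed to a downstream classifier.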

Description

Technical field

[0001] The invention belongs to the field of information processing and relates to a news classification method and system based on a superimposed network of text context structure information and attribute information.

Background

[0002] Key term definitions:

[0003] Neural network: a mathematical or computational model that imitates the structure and function of biological neural networks and is used to estimate or approximate functions. A neural network computes through a large number of connected artificial neurons. In most cases an artificial neural network can change its internal structure on the basis of external information, which makes it an adaptive system.

[0004] Text representation: a machine learning technique in the field of natural language processing that maps text, a high-level cognitive abstract entity, to a vector over the real numbers for subsequent computer processing.

[0005] Weighted random sampling: ...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/35, G06F16/9532, G06K9/62
CPC: G06F16/9532, G06F16/35, G06F18/2411
Inventor: 蔡世民, 陈明仁, 戴礼灿
Owner: UNIV OF ELECTRONICS SCI & TECH OF CHINA