A Text Summarization Method Based on Advanced Semantics

A text summarization technique based on high-level semantics, applied in the field of natural language processing. It addresses the inability of existing methods to handle the loss of low-frequency vocabulary information, reducing information loss and improving accuracy.

Active Publication Date: 2021-01-12
ZHEJIANG UNIV
Cites: 5 | Cited by: 0

AI Technical Summary

Problems solved by technology

Existing text summarization methods do not adequately handle the loss of low-frequency vocabulary information.

Examples

Embodiment Construction

[0052] The present invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be noted that the following embodiments are intended to facilitate the understanding of the present invention, but do not limit it in any way.

[0053] As shown in Figure 1, a text summarization method based on advanced semantics includes the following steps:

[0054] S01: As shown in part S01 of Figure 2, use a text segmentation tool such as CoreNLP or Jieba to segment the text corpus and convert it into sequences of semantic tags (such as part-of-speech sequences and named-entity sequences) that correspond one-to-one with the words. Because the model relies on the high-level semantic information of the vocabulary, the original text data must first be processed with such a tool. On the one hand, the text (especially Chinese) needs to be segmented first, and the smallest unit of the corpus i...
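As a rough illustration of the one-to-one correspondence described in S01, the sketch below produces a word sequence and a tag sequence of equal length. It is not the patent's implementation: `TOY_LEXICON`, `segment_and_tag`, and the tag names are hypothetical stand-ins for a real tool such as CoreNLP or Jieba, and whitespace splitting stands in for real word segmentation.

```python
# Toy stand-in for a real segmenter/tagger such as CoreNLP or Jieba.
# TOY_LEXICON and its tag names are hypothetical, for illustration only.
TOY_LEXICON = {
    "zhejiang": "NNP",
    "university": "NNP",
    "publishes": "VBZ",
    "a": "DT",
    "new": "JJ",
    "summarization": "NN",
    "method": "NN",
}


def segment_and_tag(text, lexicon=TOY_LEXICON, unknown_tag="UNK"):
    """Split text into words and return (words, tags) of equal length."""
    # Whitespace splitting is an assumption; Chinese text needs a real segmenter.
    words = text.lower().split()
    # One tag per word, so the two sequences stay aligned position by position.
    tags = [lexicon.get(word, unknown_tag) for word in words]
    return words, tags


words, tags = segment_and_tag(
    "Zhejiang University publishes a new summarization method"
)
for word, tag in zip(words, tags):
    print(f"{word}\t{tag}")
```

Keeping the two sequences aligned by position is what would let an encoder read a word and its tag at the same time step.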

Abstract

The invention discloses a text summary generation method based on high-level semantics, including: (1) performing word segmentation on the text corpus and converting it into a sequence of semantic tags that correspond one-to-one with the words; (2) using a bidirectional recurrent network in the text summarization model as an encoder to encode the word sequence and the semantic tag sequence, obtaining an abstract representation over the vocabulary and an abstract representation over the semantics; (3) merging the abstract representation over the vocabulary with the abstract representation over the semantics; (4) sending the merged final abstract representation to the decoder, which calculates the lexical attention weights and the semantic attention weights separately and predicts, at each step of the sequence, a probability distribution over the vocabulary; (5) combining the attention weight distribution and the vocabulary probability distribution to obtain the final output probability distribution, converting that distribution into readable words, and concatenating them into sentences for output. The invention can improve the accuracy of the model in predicting low-frequency words and in performing text summarization on unlabeled data.
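Step (5) can be pictured with a small sketch. The exact combination formula is not shown in this excerpt, so the code below assumes a pointer-generator-style mixture, which is one common way to combine an attention distribution with a vocabulary distribution. The function `final_distribution`, the gate `p_gen`, and all numbers are hypothetical, and the lexical and semantic attention weights are assumed to have already been merged into a single distribution over source positions.

```python
def final_distribution(p_vocab, attention, source_words, p_gen):
    """Mix a vocabulary distribution with attention-based copy mass.

    p_vocab: dict mapping word -> decoder probability (sums to 1)
    attention: attention weights over source positions (sums to 1)
    source_words: source words aligned with `attention`
    p_gen: hypothetical gate in [0, 1] weighting p_vocab
    """
    if len(attention) != len(source_words):
        raise ValueError("attention and source_words must be aligned")
    # Scale the generation distribution by the gate.
    final = {word: p_gen * prob for word, prob in p_vocab.items()}
    # Add the remaining mass to the source words that the attention points at.
    for word, weight in zip(source_words, attention):
        final[word] = final.get(word, 0.0) + (1.0 - p_gen) * weight
    return final


# "hangzhou" is outside the toy vocabulary but present in the source text.
p_vocab = {"the": 0.5, "method": 0.3, "<unk>": 0.2}
source_words = ["the", "hangzhou", "method"]
attention = [0.1, 0.7, 0.2]

final = final_distribution(p_vocab, attention, source_words, p_gen=0.4)
best = max(final, key=final.get)
print(best, round(final[best], 3))
```

In this toy case the out-of-vocabulary source word receives the largest share of the final distribution, which shows how attention mass can let a model emit a low-frequency word that the vocabulary distribution alone would map to `<unk>`.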

Description

Technical field

[0001] The invention belongs to the field of natural language processing, and in particular relates to a method for generating text summaries based on high-level semantics.

Background

[0002] Text summarization in natural language processing is a technique that uses a computer to automatically compress a long text into a short one while retaining the key information of the original text. This technology is currently used by all major media websites. It compresses long text content into short text containing the key information, thereby saving screen space and displaying more content to users. On media interfaces where space is at a premium, displaying more content brings more traffic to vendors, directly increases the exposure of advertisements and other information, increases user activity, and brings direct benefits to vendors.

[0003] Early text summarization techniques were based on textual rules, which are usua...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/284, G06F40/30, G06N3/04
CPC: G06F40/284, G06F40/30, G06N3/044
Inventor: 李昊, 蔡登, 潘博远, 雷陈奕, 王国鑫, 何晓飞
Owner: ZHEJIANG UNIV