
A bilingual news aggregation method and system

A bilingual news aggregation method, applied in the field of information processing, that addresses the high cost of corpus acquisition and the heavy computing-resource consumption of existing approaches, achieving low acquisition cost and low computing-resource usage.

Active Publication Date: 2021-02-12
无码科技(杭州)有限公司
6 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0005] To address the shortcomings of existing bilingual news aggregation methods, the present invention provides a simple, effective, and low-cost bilingual news aggregation method and system. It aims to solve the prior-art problems of costly corpus acquisition and heavy consumption of computing resources.



Examples


Embodiment Construction

[0048] The above and other technical features and advantages of the present invention are described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them.

[0049] The general principle of the present invention is shown in Figure 1. First, training data for the bilingual topic model is constructed from a bilingual corpus aligned at the chapter (document) level, and the bilingual topic model is trained. Next, the bilingual topic model is used to produce a topic representation of each bilingual news item, and the similarity between bilingual news items is calculated by combining this representation with other news features. Finally, the calculated similarity is compared with a preset threshold to achieve clustering and matching aggregation of bilingual news.
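The similarity and threshold steps of this principle can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the topic vectors are assumed to come from an already trained bilingual topic model, the "other news feature" is assumed to be overlap of named entities normalized to a common form, and the names `news_similarity`, `topic_weight`, and `threshold`, as well as the single-pass clustering strategy, are hypothetical choices made for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length topic vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def news_similarity(a, b, topic_weight=0.7):
    """Blend topic similarity with one extra feature (entity overlap).

    The weighted blend and the value of topic_weight are assumptions.
    """
    topic_sim = cosine(a["topics"], b["topics"])
    ents_a, ents_b = set(a["entities"]), set(b["entities"])
    union = ents_a | ents_b
    entity_sim = len(ents_a & ents_b) / len(union) if union else 0.0
    return topic_weight * topic_sim + (1 - topic_weight) * entity_sim

def cluster(news, threshold=0.6):
    """Single-pass clustering: join the first cluster whose seed item is
    similar enough, otherwise start a new cluster."""
    clusters = []
    for item in news:
        for members in clusters:
            if news_similarity(item, members[0]) >= threshold:
                members.append(item)
                break
        else:
            clusters.append([item])
    return clusters

# Toy input: topic vectors are made up; entities are assumed to be
# normalized to one canonical form across languages.
news = [
    {"id": "en-1", "topics": [0.9, 0.1, 0.0], "entities": ["WHO", "Geneva"]},
    {"id": "zh-1", "topics": [0.8, 0.2, 0.0], "entities": ["WHO", "Geneva"]},
    {"id": "en-2", "topics": [0.0, 0.1, 0.9], "entities": ["NBA"]},
]
groups = [[n["id"] for n in members] for members in cluster(news)]
print(groups)
```

With this toy data the English and Chinese items about the same story fall into one cluster and the unrelated item forms its own. Comparing only against the cluster seed keeps the sketch short; comparing against a cluster centroid would be an equally plausible choice.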

[0050] Referring to Figure 2, Embodiment 1 of the present invention provides a bilingual news aggregation method, compri...


Abstract

The invention discloses a bilingual news aggregation method and system. The method comprises the following steps: training data for a bilingual topic model is constructed from bilingual corpora on the same topic; the bilingual topic model is obtained from the training data based on the LDA model; the bilingual topic model is used to produce topic representations of the bilingual news to be aggregated, and the similarity between bilingual news items is calculated according to the features of the bilingual news; bilingual news items whose similarity meets a preset threshold are clustered to form news clusters. The invention constructs training data by splicing bilingual documents and trains a bilingual topic model, thereby mapping news in different languages into the same semantic space. News clustering is then performed based on the similarity computed in that space. The training data is cheap to acquire, and the method consumes few computing resources.
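The splicing step mentioned in the abstract can be illustrated with a short sketch. It assumes each aligned pair has already been tokenized (word segmentation for Chinese is outside the scope of this example), and the helper names `splice_pairs`, `build_vocab`, and `to_bow` are hypothetical. The resulting bag-of-words corpus could then be passed to any LDA implementation; because words from both languages co-occur in the same pseudo-document, the learned topics would contain vocabulary from both languages.

```python
def splice_pairs(aligned_pairs):
    """Concatenate the tokens of each aligned document pair into one
    pseudo-document that mixes both languages."""
    return [list(src) + list(tgt) for src, tgt in aligned_pairs]

def build_vocab(docs):
    """Assign an integer id to every distinct token, in first-seen order."""
    vocab = {}
    for doc in docs:
        for tok in doc:
            vocab.setdefault(tok, len(vocab))
    return vocab

def to_bow(doc, vocab):
    """Convert a token list into sorted (token_id, count) pairs."""
    counts = {}
    for tok in doc:
        idx = vocab[tok]
        counts[idx] = counts.get(idx, 0) + 1
    return sorted(counts.items())

# Toy aligned corpus: each tuple holds an English document and a Chinese
# document on the same topic, both pre-tokenized (an assumption).
pairs = [
    (["economy", "growth", "bank"], ["经济", "增长", "银行"]),
    (["match", "team", "goal"], ["比赛", "球队", "进球"]),
]
docs = splice_pairs(pairs)
vocab = build_vocab(docs)
corpus = [to_bow(d, vocab) for d in docs]
print(docs[0])
print(corpus[0])
```

Only document-level alignment is needed for this construction, which is consistent with the abstract's claim that training data is cheap to acquire compared with sentence-aligned translation corpora.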

Description

Technical field

[0001] The invention relates to the technical field of information processing, in particular to a bilingual news aggregation method and system.

Background

[0002] With the rapid development of information technology, a great deal of news is produced every moment. How to disseminate news quickly and effectively has become a very important issue in today's society. In addition to traditional media such as newspapers and television, the Internet is also an important medium for news dissemination. At present, disseminating news over the network requires collecting and analyzing news from various news websites, classifying and aggregating it by category, and then presenting it to users.

[0003] At present, bilingual news aggregation usually adopts a "translate first, then cluster" approach, that is, first translate news in different languages into the same language through machine translat...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/35, G06F40/289, G06F40/216, G06F40/58
CPC: G06F40/216, G06F40/289, G06F40/58
Inventor: 余忠庆, 冯大辉
Owner: 无码科技(杭州)有限公司