A Bayesian network-based method for analyzing the relationship between characters in web news data

A technique for analyzing person relations in web news data, applied in the field of knowledge discovery, that addresses the lack of generality and semantic accuracy in existing methods

Active Publication Date: 2019-03-22
YUNNAN UNIV
Cites: 13; Cited by: 7

AI Technical Summary

Problems solved by technology

[0007] Purpose of the present invention: Known person relationship analysis methods can only identify a small number of predefined relationship types and lack generality and semantic accuracy. To address these problems, the present invention starts from web news data, introduces a person-entity knowledge base that is publicly available on the Internet, builds a Bayesian network (BN) that describes the dependencies among the entities involved in web news data, and focuses on BN-based modeling and analysis of associations between person entities.

Examples

Embodiment

[0082] Example: Analysis of person relationships in web news data from "Today's Headlines".

[0083] Preprocessing

[0084] Following step 1.1, the public knowledge graph $G_k$ is obtained from http://openkg.cn/dataset/rdf. $G_k$ contains 109,332 entities in total, including person entities and non-person entities. For all entities of $G_k$, self-organizing mapping is performed based on their adjacent entities, and $50 \times 50$ neuron vectors $W_j$ ($j = 0, 1, \ldots, 50^2 - 1$) are output. Each output neuron vector $W_j$ can be regarded as a class in the clustering result, and once the self-organizing map finishes, every entity has been assigned to the class represented by some output neuron vector $W_j$. In addition, each output neuron vector $W_j$ has two-dimensional coordinates $(W_{j,x}, W_{j,y})$, with $W_{j,x} \in \{0, 1, \ldots, 49\}$ and $W_{j,y} \in \{0, 1, \ldots, 49\}$, and each entity assigned to $W_j$ inherits the two-dimensional coordinates $(W_{j,x}, W_{j,y})$ of $W_j$. The na...
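
The clustering step described in [0084] can be illustrated with a minimal self-organizing map sketch. This is not the patent's implementation: the function names (`train_som`, `best_matching_unit`, `assign`), the learning-rate and neighborhood schedules, and the encoding of each entity as a binary row of its adjacent entities are all assumptions made for illustration. The default grid size matches the 50 × 50 grid in the text; the toy run uses a 3 × 3 grid so that it finishes quickly.

```python
import math
import random


def best_matching_unit(x, weights):
    """Index j of the neuron whose weight vector W_j is closest to x."""
    return min(range(len(weights)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(x, weights[j])))


def train_som(data, rows=50, cols=50, epochs=5, lr0=0.5, seed=0):
    """Train a rows x cols SOM; returns (weights, coords).

    coords[j] is the grid position (W_{j,x}, W_{j,y}) of neuron j.
    The linear decay schedules below are an assumption, not from the source.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    coords = [(j // cols, j % cols) for j in range(rows * cols)]
    sigma0 = max(rows, cols) / 2.0
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            frac = step / total
            lr = lr0 * (1.0 - frac)                # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3   # shrinking neighborhood
            x = data[i]
            bx, by = coords[best_matching_unit(x, weights)]
            for j, w in enumerate(weights):
                d2 = (coords[j][0] - bx) ** 2 + (coords[j][1] - by) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights, coords


def assign(data, weights, coords):
    """Each entity inherits the grid coordinates of its winning neuron."""
    return [coords[best_matching_unit(x, weights)] for x in data]


if __name__ == "__main__":
    # Hypothetical adjacency rows: entity i has a 1 where it neighbors entity k.
    toy = [[1, 0, 0, 1], [1, 0, 0, 1], [0, 1, 1, 0], [0, 1, 1, 0]]
    w, c = train_som(toy, rows=3, cols=3, epochs=20)
    print(assign(toy, w, c))
```

Entities with identical neighborhoods always land on the same neuron, which is how the map turns sparse adjacency information into compact two-dimensional coordinates.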

Abstract

Extracting relationships and analyzing dependencies between persons and entities can provide strong support for fields such as public opinion dissemination, recommendation systems, and precision marketing. In reality, many persons are not directly related to each other but are indirectly related through other entities. Existing methods can only determine a small number of predefined relationship types and cannot quantitatively analyze the relationships between persons in the network. The invention introduces a publicly available person-entity knowledge base and uses the self-organizing mapping method to process high-dimensional, sparse web news data into complete training data. A Bayesian network (BN) is constructed to describe the dependencies among the entities involved in web news data. The invention focuses on BN-based modeling and analysis of person-entity associations, and uses both the knowledge in historical web pages and the information in new web pages to quantitatively analyze and infer person relationships. This makes full use of information resources and effectively improves the accuracy and efficiency of person relationship analysis.
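
To make the idea of quantitative inference over indirect relations concrete, the following is a minimal sketch of exact inference by enumeration on a three-node Bayesian network. The network structure (`person_a` → `org` → `person_b`), the node names, and every probability value are hypothetical and chosen only for illustration; the patent's own structure learning and parameter learning procedures are not reproduced here.

```python
from itertools import product

# Hypothetical network: node -> (parents, CPT). Each CPT maps a tuple of
# parent values to P(node = 1 | parents). A value of 1 means the entity
# appears in a news page. All numbers are made up for illustration.
NETWORK = {
    "person_a": ((), {(): 0.3}),
    "org": (("person_a",), {(1,): 0.8, (0,): 0.1}),
    "person_b": (("org",), {(1,): 0.7, (0,): 0.05}),
}


def joint(assignment):
    """Joint probability of a full assignment via the BN chain rule."""
    p = 1.0
    for node, (parents, cpt) in NETWORK.items():
        p1 = cpt[tuple(assignment[q] for q in parents)]
        p *= p1 if assignment[node] == 1 else 1.0 - p1
    return p


def query(target, evidence):
    """P(target = 1 | evidence), summing the joint over all assignments."""
    nodes = list(NETWORK)
    num = den = 0.0
    for values in product((0, 1), repeat=len(nodes)):
        a = dict(zip(nodes, values))
        if any(a[k] != v for k, v in evidence.items()):
            continue  # skip assignments that contradict the evidence
        p = joint(a)
        den += p
        if a[target] == 1:
            num += p
    return num / den


if __name__ == "__main__":
    print(query("person_b", {}))               # prior belief about person_b
    print(query("person_b", {"person_a": 1}))  # belief after observing person_a
```

In this toy network the two persons share no direct edge, yet observing `person_a` raises the probability of `person_b` through the intermediate `org` node, which mirrors the indirect relations described in the abstract. Enumeration is exponential in the number of nodes, so it is only suitable for small illustrations like this one.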

Description

Technical field

[0001] The invention discloses a character relationship analysis method in webpage news data, relates to Bayesian network structure learning and parameter learning from webpage news data, and probabilistic reasoning supporting character relationship analysis, belonging to the field of knowledge discovery.

Background technique

[0002] Extracting useful knowledge that meets people's specific needs from text information to generate economic and social benefits is an important goal and task of information extraction technology. Relation Extraction based on text information is an important topic of information extraction, and its task is to identify and obtain the relationship between entities from text information. Text information can come from various sources, such as online communities, blogs, microblogs, web news, etc. In recent years, various traditional news media have shifted their focus to the Internet platform, publishing news through web carriers. We...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/36, G06N5/04, G06N7/00
CPC: G06N5/04, G06N7/01, Y02D10/00
Inventor: 岳昆, 李磊, 李维华, 王笑一, 郭建斌
Owner: YUNNAN UNIV