Topic information acquisition method based on network topology

A technology of subject information and collection methods, applied in the field of network security, can solve problems such as the inability to reduce the impact

Inactive Publication Date: 2009-05-27
BEIJING JIAOTONG UNIV
0 Cites, 34 Cited by

AI Technical Summary

Problems solved by technology

[0005] The general topic information collection system treats the URLs extracted from the same web page i

Embodiment Construction

[0060] The performance of the topic information collection system is measured by its precision, which indicates how accurately the system obtains topic-relevant web pages. If M is the total number of captured web pages and T is the number of topic-relevant pages among them, then precision = T / M. Using the keyword "Beijing Olympics" as the topic, multiple topic web pages were captured from hundreds of websites. The experiments compared the efficiency of the topic information collection system with that of a general collection system in collecting relevant topics, and examined the influence of different system parameters on topic information collection performance (the weight parameter t is set to 0.5). Figure 1 shows the workflow of the topic information collection method based on network topology. The results shown in Figures 2 to 6 are averages over multiple sets of simulation data. Due to the fast update of ...
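The precision measure defined in [0060] can be illustrated with a minimal sketch. The function name and the page counts below are hypothetical and are not taken from the patent's experiments.

```python
def precision(relevant_pages: int, total_pages: int) -> float:
    """Return T / M: the fraction of captured pages that are topic-relevant."""
    if total_pages <= 0:
        raise ValueError("total_pages (M) must be positive")
    if not 0 <= relevant_pages <= total_pages:
        raise ValueError("relevant_pages (T) must lie between 0 and M")
    return relevant_pages / total_pages


# Hypothetical counts, for illustration only: T = 75 relevant pages out of M = 100.
print(precision(75, 100))  # 0.75
```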

Abstract

The invention relates to a topic information acquisition method based on network topology. An initial set of web pages is obtained from a search engine. After cleaning, word segmentation, and stop-word removal, the pages are represented as a set of vectors, and a vector space model is used to calculate text similarity. The network structure is used to perform link analysis on the extracted URLs: the links are first filtered according to the directory hierarchy of the URLs, and the URL weights are then adjusted according to the scale-free property of the network so that preferential attachment selection can be performed. At the same time, topic-irrelevant areas are fed back, and the length of the buffer for irrelevant URLs is set according to the distance between the URLs and the seed set. The heat (popularity) of the acquired topics is calculated in order to select a topic for which new replies are obtained.
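The text-similarity step in the abstract can be sketched as follows. This is an illustrative example, not the patented implementation: the abstract does not specify a term-weighting scheme, so raw term frequency and cosine similarity are assumed here. The stop-word list, the regular-expression tokenizer (which stands in for a real word segmenter), and all function names are hypothetical.

```python
import math
import re
from collections import Counter

# Illustrative stop-word list (assumption); a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "in", "and", "is", "to", "for"}


def to_vector(text: str) -> Counter:
    """Tokenize, drop stop words, and return a term-frequency vector."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)


def cosine_similarity(u: Counter, v: Counter) -> float:
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector is treated as unrelated to everything
    return dot / (norm_u * norm_v)


topic = to_vector("Beijing Olympics opening ceremony")
page = to_vector("The opening ceremony of the Beijing Olympics")
other = to_vector("Weather forecast for tomorrow")

print(cosine_similarity(topic, page))   # close to 1: same terms after stop-word removal
print(cosine_similarity(topic, other))  # 0.0: no shared terms
```

For Chinese-language pages, the tokenizer would need to be replaced by a dedicated word segmenter. The vector representation and the similarity function could stay the same.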

Description

Technical field

[0001] The invention relates to a method for collecting topic information based on network topology and belongs to the field of network security.

Background technique

[0002] With the increasing popularity of information networking, the information on the Internet is increasing day by day, and huge potential value is contained in these massive heterogeneous Web information resources. The Internet's convenient and quick way of publishing information and its communication platform for audience interaction have made the Internet surpass traditional media and become the main way to obtain real-time information. News events are often the first to appear on the Internet and spark discussion online.

[0003] How to effectively extract and utilize network information has become a huge challenge. Search engines provide users with fast and effective ways to obtain information by means of queries. The network information collection system downloads web pages from ...

Application Information

IPC(8): G06F17/30
Inventor: 刘云, 熊菲, 李勇, 沈波, 张振江, 贾凡, 程辉, 张立, 张彦超, 司夏萌
Owner BEIJING JIAOTONG UNIV