Web information extraction method based on a minimum-weight connected dominating set in a multi-view graph

A multi-view information extraction technology, applied in special data processing applications, instruments, electrical digital data processing, etc. It addresses problems such as tedious text-only summaries and the loss of comprehensive information about how an event evolves, so as to enrich the information presented, improve semantic analysis, and achieve continuity in time.

Status: Inactive
Publication date: 2016-03-30
CHANGSHU RES INSTITUE OF NANJING UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

With the explosive growth of Internet information, people searching the Internet often face the same problem: they must browse a very large set of Web documents and extract the meaningful information from it.
Although timeline systems present a sequence of events in chronological order, a linearly structured event axis usually loses comprehensive information about how the events evolve.
These systems also usually present their summaries as text, which readers may find tedious and uninteresting.

Examples


Embodiment

[0046] The problem of generating a graphical chronological storyline can be defined as follows:

[0047] Input: a query topic and a collection of objects, where each object is an image accompanied by a text description (for example, a short paragraph or sentence) and a timestamp.

[0048] Output: A graphical chronological storyline consisting of the most representative objects that outline the query's relevant topics.
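
The input and output described above can be pictured with a small data model. This is only an illustrative sketch: the class name `StoryObject`, its field names, and the sample values are assumptions made for the example and do not come from the patent text.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class StoryObject:
    """One input object: an image with its text description and timestamp.

    All field names are hypothetical; the patent excerpt does not name them.
    """
    image_path: str
    text: str
    timestamp: datetime


# Made-up sample objects for a single query topic.
objects = [
    StoryObject("img/003.jpg", "Recovery work begins.", datetime(2011, 3, 20)),
    StoryObject("img/001.jpg", "The event is first reported.", datetime(2011, 3, 11)),
    StoryObject("img/002.jpg", "Officials respond.", datetime(2011, 3, 14)),
]

# A storyline presents its selected objects in chronological order.
timeline = sorted(objects, key=lambda o: o.timestamp)
```

Sorting by timestamp only orders the objects; choosing which objects are representative is the optimization problem the following paragraphs set up.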

[0049] Below, this problem is transformed into a minimum-weight connected dominating set problem on a multi-view graph, which can be decomposed into two optimization problems: 1) find a minimum-weight dominating set; 2) connect the elements of the dominating set using a directed Steiner tree.
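
The two-step decomposition can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's algorithm: step 1 uses a generic greedy cost-per-coverage heuristic for the weighted dominating set, and step 2 replaces the directed Steiner tree with a much simpler stand-in that links time-ordered dominators by shortest directed paths. The function names, the toy graph, the weights, and the timestamps are all hypothetical.

```python
from collections import deque


def greedy_dominating_set(neighbors, weight):
    """Greedy heuristic: repeatedly pick the cheapest vertex per newly covered vertex.

    neighbors: dict mapping each vertex to the set of its undirected neighbors.
    weight: dict mapping each vertex to a positive cost.
    """
    uncovered = set(neighbors)
    chosen = []
    while uncovered:
        def cost_per_gain(v):
            gain = len(({v} | neighbors[v]) & uncovered)
            return weight[v] / gain if gain else float("inf")
        # sorted() makes tie-breaking deterministic.
        best = min(sorted(neighbors), key=cost_per_gain)
        chosen.append(best)
        uncovered -= {best} | neighbors[best]
    return chosen


def shortest_path(successors, source, target):
    """Breadth-first search along directed edges; returns None if unreachable."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in successors.get(node, ()):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


def connect_in_time_order(dominators, successors, timestamp):
    """Stand-in for the Steiner step: chain consecutive dominators by shortest paths."""
    ordered = sorted(dominators, key=lambda v: timestamp[v])
    edges = set()
    for a, b in zip(ordered, ordered[1:]):
        path = shortest_path(successors, a, b)
        if path:
            edges.update(zip(path, path[1:]))
    return edges


# Toy data: a path a-b-c-d-e in the undirected view, time-ordered directed edges.
neighbors = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "e"}, "e": {"d"}}
weight = {v: 1.0 for v in neighbors}
successors = {"a": ["b"], "b": ["c"], "c": ["d"], "d": ["e"]}
timestamp = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}

dominators = greedy_dominating_set(neighbors, weight)
story_edges = connect_in_time_order(dominators, successors, timestamp)
```

On this toy graph the greedy step selects `b` and `d`, and the connection step joins them through `c`. A real directed Steiner tree approximation would optimize the total connection cost rather than chaining neighbors in time order.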

[0050] 1. Multi-view object graph construction

[0051] Definition: A multi-view graph is a triple consisting of a set of vertices, a set of undirected edges, and a set of directed edges.
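
One way to hold such a triple in code is sketched below. The class name `MultiViewGraph`, its method names, and the example vertices are assumptions for illustration. The excerpt does not show how the two kinds of edges are derived from the objects, so the edges added here are arbitrary.

```python
from dataclasses import dataclass, field


@dataclass
class MultiViewGraph:
    """A vertex set plus two edge sets: one undirected, one directed."""
    vertices: set = field(default_factory=set)
    undirected: set = field(default_factory=set)  # frozenset({u, v}) pairs
    directed: set = field(default_factory=set)    # (u, v) tuples meaning u -> v

    def add_undirected(self, u, v):
        self.vertices.update((u, v))
        # A frozenset ignores order, so (u, v) and (v, u) are the same edge.
        self.undirected.add(frozenset((u, v)))

    def add_directed(self, u, v):
        self.vertices.update((u, v))
        self.directed.add((u, v))

    def neighbors(self, v):
        """Vertices joined to v by an undirected edge."""
        return {w for edge in self.undirected if v in edge for w in edge if w != v}

    def successors(self, v):
        """Vertices reached from v along a single directed edge."""
        return {w for (u, w) in self.directed if u == v}


graph = MultiViewGraph()
graph.add_undirected("o1", "o2")
graph.add_undirected("o2", "o1")  # same undirected edge, stored once
graph.add_directed("o1", "o3")
```

Keeping the two edge sets separate lets a dominating-set step read only the undirected view while a connection step follows only the directed view.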

[0052] Given a collection of images and textual descriptions w...


Abstract

The invention discloses a Web information extraction method based on a minimum-weight connected dominating set in a multi-view graph. The method integrates text, image, and time information; by transforming the task into a graph-based optimization problem and solving it, it generates a storyline-based summary that reflects how the events of a given topic evolve. The method has the following advantages: (1) it combines image processing and text processing to improve semantic analysis and give the reader a vivid graphical summary; (2) it transforms the task into a graph-based optimization problem and solves it with an effective heuristic method; and (3) the generated storyline achieves both continuity in time and coherence of content, improves retrieval speed, and gives the reader richer information and better results.

Description

Technical field

[0001] The invention relates to a new topic-oriented Web information extraction method, in particular to a Web information extraction method that generates a graphical storyline through a minimum-weight connected dominating set in a multi-view graph.

Background art

[0002] With the rapid development of information technology, the Internet has become the most popular medium for publishing information, and both publishing and reading information have become extremely convenient. However, with the explosive growth of Internet information, people searching the Internet often face the same problem: they must browse a very large set of Web documents and extract the meaningful information from it. In recent years, various types of Web document understanding systems have been proposed to solve this problem. For example, a query-based multi-document automatic summarization system aims to extract summary sentences from a doc...


Application Information

IPC(8): G06F17/30
Inventors: 李涛, 李千目, 王鹏飞
Owner: CHANGSHU RES INSTITUE OF NANJING UNIV OF SCI & TECH