Identifying sources of media content having a high likelihood of producing on-topic content

Status: Inactive
Publication Date: 2008-05-15
COLLECTIVE INTELLECT

AI Technical Summary

Benefits of technology

[0011] According to one aspect of an embodiment of the present invention, a quality centrality measure may be created for each of the topic areas of interest.
[0012] According to another aspect of an embodiment of the present invention, the quality centrality measure may be based upon latent semantic analysis.
[0013] According to one aspect of an embodiment of the present invention, the initial graph scores may be based upon one or more of a topic density score, a maven density score and a relevancy score.
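
One way to picture paragraph [0013] is as a weighted blend of the three named measures. The excerpt does not define the measures or say how they are combined, so everything in the sketch below is an assumption made for illustration: the `PostStats` field meanings, the 0-to-1 ranges, the weighted average, and the default weights.

```python
from dataclasses import dataclass


@dataclass
class PostStats:
    """Hypothetical per-node inputs; the definitions below are assumptions."""
    topic_density: float  # assumed: share of a site's posts that are on-topic, 0..1
    maven_density: float  # assumed: share of citations coming from known experts, 0..1
    relevancy: float      # assumed: normalized relevancy score, 0..1


def initial_graph_score(stats, weights=(0.4, 0.3, 0.3)):
    """Blend the three measures into one score with a weighted average.

    The weights are placeholders, not values taken from the source.
    A weight of 0 drops a measure, matching "one or more of".
    """
    w_topic, w_maven, w_rel = weights
    total = w_topic + w_maven + w_rel
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    blended = (
        w_topic * stats.topic_density
        + w_maven * stats.maven_density
        + w_rel * stats.relevancy
    )
    return blended / total
```

Dividing by the weight total keeps the result in the same 0-to-1 range as the inputs, which makes scores comparable across different weight choices.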

Problems solved by technology

Search engines, such as Google, are adequate for generalized ad hoc searches, but are not very good at keeping the information consumer up to date regarding the best content on a subject from the user's perspective.

Embodiment Construction

[0029] Methods and systems are described for proactively and programmatically identifying sources of media content having a high likelihood of producing on-topic content in relation to a specific topic of interest. According to one embodiment, in the context of blog sites, the approach uses an initial set of seed blog sites to build a graph as a result of deep crawling of the web. The graph is then analyzed from both a link perspective and a content perspective to identify target blog sites with a high likelihood of producing the desired on-topic content. In one embodiment, the nodes of the graph represent posts and the edges represent inbound/outbound citations among posts. As part of the analysis of the graph, various scores with different weights may be assigned to each node based on measures of on-topic posts generated by the associated blog sites. According to one embodiment, subsequent execution of a topic net involves monitoring the health of the topic net to determine whether...
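
The graph described in paragraph [0029], with posts as nodes and citations as edges, can be sketched as a breadth-first crawl outward from the seeds. This is a minimal illustration rather than the patented method: `build_citation_graph`, the `fetch_outbound` callback, the `max_depth` limit, and the `TOY_WEB` data are all hypothetical, and the sketch follows outbound citations only, whereas the source mentions both inbound and outbound ones.

```python
from collections import deque


def build_citation_graph(seed_posts, fetch_outbound, max_depth=2):
    """Breadth-first crawl starting from the seed posts.

    fetch_outbound(post) is a caller-supplied function returning the posts
    that `post` cites. Returns (nodes, edges): a set of post identifiers and
    a set of (citing_post, cited_post) pairs.
    """
    nodes = set(seed_posts)
    edges = set()
    queue = deque((post, 0) for post in nodes)
    while queue:
        post, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not expand posts at the depth limit
        for cited in fetch_outbound(post):
            edges.add((post, cited))
            if cited not in nodes:
                nodes.add(cited)
                queue.append((cited, depth + 1))
    return nodes, edges


# Toy stand-in for the web: post -> posts it cites (made-up identifiers).
TOY_WEB = {
    "a.example/1": ["b.example/1", "c.example/1"],
    "b.example/1": ["c.example/1"],
    "c.example/1": ["a.example/1"],
}

nodes, edges = build_citation_graph(
    ["a.example/1"], lambda post: TOY_WEB.get(post, [])
)
```

Passing the fetcher in as a function keeps the traversal separate from network access, so the same loop can run against a real crawler or a fixture like `TOY_WEB`.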

Abstract

Methods and systems are provided for identifying on-topic sources of media content. According to one embodiment, candidate seed sites are identified from which current seeds are selected for deep crawling. The current seeds are identified by correlating relevancy scores or keyword search results from multiple search engines and selecting the current seeds based on on-topic scores of the candidate seeds. Periodically, a topic net associated with the topic area of interest is executed to locate relevant sources of media content by (i) building a graph in which nodes represent pages and edges represent links among pages by performing an iterative 360 crawl starting from the seeds; (ii) assigning initial node graph scores; (iii) computing final node graph scores by performing link analysis; (iv) computing site graph scores by aggregating and averaging corresponding node graph scores; and (v) configuring sites with the highest site graph scores to be scraped.
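
Steps (ii) through (v) of the abstract can be sketched as follows. The abstract does not name a specific link-analysis algorithm, so this sketch substitutes a PageRank-style power iteration biased toward the initial scores; the damping factor, the iteration count, grouping pages into sites by URL host, and all function names and sample URLs are assumptions made for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def link_analysis(edges, initial, damping=0.85, iterations=50):
    """PageRank-style power iteration biased toward the initial node scores."""
    nodes = list(initial)
    total = sum(initial.values())
    if total <= 0:
        raise ValueError("initial scores must sum to a positive value")
    prior = {n: initial[n] / total for n in nodes}
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        if src in prior and dst in prior:  # ignore edges to unscored pages
            out_links[src].append(dst)
    scores = dict(prior)
    for _ in range(iterations):
        # Score held by pages without outbound links is spread via the prior.
        dangling = sum(scores[n] for n in nodes if not out_links[n])
        nxt = {n: (1 - damping + damping * dangling) * prior[n] for n in nodes}
        for src, targets in out_links.items():
            if not targets:
                continue
            share = damping * scores[src] / len(targets)
            for dst in targets:
                nxt[dst] += share
        scores = nxt
    return scores


def site_scores(node_scores):
    """Average the node scores of all pages that share a URL host."""
    grouped = defaultdict(list)
    for url, score in node_scores.items():
        grouped[urlsplit(url).netloc].append(score)
    return {site: sum(vals) / len(vals) for site, vals in grouped.items()}


def top_sites(scores, k):
    """Return the k sites with the highest site score."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Made-up initial scores and links for three sites.
initial = {
    "https://a.example/1": 0.9,
    "https://a.example/2": 0.7,
    "https://b.example/1": 0.2,
    "https://c.example/1": 0.1,
}
links = [
    ("https://b.example/1", "https://a.example/1"),
    ("https://c.example/1", "https://a.example/1"),
    ("https://c.example/1", "https://a.example/2"),
    ("https://a.example/1", "https://a.example/2"),
]
final = link_analysis(links, initial)
to_scrape = top_sites(site_scores(final), k=1)
```

Averaging rather than summing, as the abstract specifies, keeps a site with many weak pages from outranking a site with a few strong ones.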

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the benefit of priority to U.S. Provisional Patent Application No. 60/969,950 filed on Sep. 5, 2007 and U.S. Provisional Patent Application No. 60/866,064 filed on Nov. 15, 2006, both of which are hereby incorporated by reference in their entirety for all purposes.

COPYRIGHT NOTICE

[0002] Contained herein is material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction of the patent disclosure by any person as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all rights to the copyright whatsoever. Copyright © 2006-2007, Collective Intellect, Inc.

BACKGROUND

[0003] 1. Field

[0004] Embodiments of the present invention generally relate to filters, ranking mechanisms and/or readers of news, messages, Really Simple Syndication, Rich Site Summary or RDF Site Summary (collectively, RSS) feeds, message board postings, pod casts, inst...

Application Information

Patent Type & Authority: Applications (United States)
IPC: IPC(8): G06F17/30
CPC: G06F17/30873, G06F16/954
Inventor: WOLTERS, TIMOTHY J.; SETAYESH, MEHRSHAD
Owner: COLLECTIVE INTELLECT