Digital television interaction service page information extraction method and device

An information extraction technology for service pages, applied in the fields of electronic digital data processing, special data processing applications, instruments, etc. It addresses the problems that interactive service pages are irregular and numerous and that processing them wastes resources, with the effect of reducing the data processing load and improving acquisition speed.

Status: Inactive. Publication Date: 2011-11-09
GUANGDONG XINGHAI DIGITAL HOME IND TECH RES INST +1
5 Cites, 8 Cited by
AI Technical Summary

Problems solved by technology

[0003] In the prior art, interactive service pages are irregular, large in number, and contain a lot of data. Retrieval therefore has to process a large amount of data, which wastes resources and makes it impossible to quickly find the key data on the interactive service pages.


Embodiment Construction

[0047] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without creative efforts fall within the protection scope of the present invention.

[0048] In view of the characteristics of interactive service pages, the present invention proposes a method and device for extracting digital TV interactive service page information based on the Document Object Model (DOM). DOM is a standard specification provided by the W3C for building, in memory, a tree structure of an Extensible Markup Language (XML) document, and elements in an XML document can be represented ...
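
As a minimal illustration of the in-memory tree that DOM provides, the sketch below parses a small XHTML string with Python's standard `xml.dom.minidom` and prints one indented line per element. The sample markup and the helper name `outline` are hypothetical and are not taken from the patent.

```python
from xml.dom import minidom

# Hypothetical XHTML fragment standing in for an interactive service page.
XHTML = (
    "<html><body><div id=\"list\">"
    "<p>News A</p><p>News B</p>"
    "</div></body></html>"
)


def outline(node, depth=0):
    """Return the element tree as indented lines, one line per element node."""
    lines = []
    if node.nodeType == node.ELEMENT_NODE:
        lines.append("  " * depth + node.tagName)
        for child in node.childNodes:
            # Text nodes are skipped by the nodeType check in the recursive call.
            lines.extend(outline(child, depth + 1))
    return lines


doc = minidom.parseString(XHTML)          # builds the DOM tree in memory
tree_lines = outline(doc.documentElement)  # walk it from the root element
print("\n".join(tree_lines))
```

Each element of the document becomes a node whose children can be reached through `childNodes`, which is what makes structural comparison of pages possible.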

Abstract

The embodiment of the invention discloses a method and a device for extracting information from digital television interactive service pages. The method comprises the following steps: acquiring web pages and reformatting them to obtain Extensible HyperText Markup Language (XHTML) documents; establishing a Document Object Model (DOM) tree from the XHTML documents; clustering the acquired web pages according to the DOM tree; acquiring the web page template that corresponds to the web pages of the same cluster; and performing information extraction according to the web page template to obtain the extracted detailed information. The method and device provided by the embodiment of the invention can increase the speed at which key information is acquired from digital television interactive service pages and can also reduce the data processing load for such pages.
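
The steps listed in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented algorithm: the text above does not specify the similarity measure, the clustering procedure, or how a template is derived, so the sketch assumes Jaccard similarity over root-to-element tag paths, a greedy single-pass clustering with a hypothetical threshold of 0.8, and a template defined as the tag paths shared by every page of a cluster. Python's `xml.etree.ElementTree` stands in for a full DOM implementation, and the sample pages are invented.

```python
import xml.etree.ElementTree as ET


def tag_paths(xhtml):
    """Parse an XHTML string and return the set of root-to-element tag paths."""
    paths = set()

    def walk(elem, prefix):
        path = prefix + "/" + elem.tag
        paths.add(path)
        for child in elem:
            walk(child, path)

    walk(ET.fromstring(xhtml), "")
    return paths


def jaccard(a, b):
    """Assumed structural similarity: shared paths divided by all paths."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster(pages, threshold=0.8):
    """Greedy clustering: a page joins the first cluster whose representative
    (the first page of that cluster) is at least `threshold` similar."""
    clusters = []  # list of (representative_paths, member_indices)
    for index, page in enumerate(pages):
        paths = tag_paths(page)
        for representative, members in clusters:
            if jaccard(representative, paths) >= threshold:
                members.append(index)
                break
        else:
            clusters.append((paths, [index]))
    return [members for _, members in clusters]


def template(pages):
    """Assumed template: the tag paths present in every page of one cluster."""
    return set.intersection(*(tag_paths(page) for page in pages))


def extract(page, tmpl):
    """Collect the non-empty text found at paths that belong to the template."""
    found = {}

    def walk(elem, prefix):
        path = prefix + "/" + elem.tag
        text = (elem.text or "").strip()
        if path in tmpl and text:
            found.setdefault(path, []).append(text)
        for child in elem:
            walk(child, path)

    walk(ET.fromstring(page), "")
    return found


# Hypothetical pages: two generated from one layout, one from another.
PAGES = [
    "<html><body><h1>Film A</h1><p>Drama, 2010</p></body></html>",
    "<html><body><h1>Film B</h1><p>Comedy, 2011</p></body></html>",
    "<html><body><ul><li>Channel 1</li><li>Channel 2</li></ul></body></html>",
]

groups = cluster(PAGES)
film_template = template([PAGES[i] for i in groups[0]])
details = extract(PAGES[1], film_template)
```

Because pages in the same cluster share one template, only the template paths need to be visited during extraction, which is one plausible reading of how the approach reduces the amount of data processed per page.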

Description

Technical field

[0001] The invention relates to the technical field of digital television, in particular to an information extraction method and device for digital television interactive service pages.

Background

[0002] With the rapid development of the Internet and digital television, interactive service pages have become a huge and complex information warehouse. How to quickly extract information from massive numbers of interactive service pages and improve the efficiency with which people access information is becoming more and more important. At present, the vast majority of interactive service pages are dynamic web pages, which are usually generated from the website's back-end database through a common template and have very similar page structures. Search results returned by search engines and product information pages in online stores are typical dynamic web pages. Such web pages are often huge in number and rich in content, so the ext...

Application Information

IPC(8): G06F17/30
Inventor: 林格张洁颜权
Owner: GUANGDONG XINGHAI DIGITAL HOME IND TECH RES INST