
113 results about "Page analysis" patented technology

System and method for information access

A system and method convert the structure of a Web page into content that a user can easily listen to, permitting the user to access and obtain information without having to perform any navigation, in a manner similar to listening to the radio. A designated target page is obtained from a Web server, and the target page's list of links is analyzed to obtain a linked destination page. A transcoding module inserts the main content of the linked destination page at the link's location on the target page, and converts the page into a structure appropriate for oral reading. Then, a VoiceXML generation module converts the target page into a VoiceXML document, and a VoiceXML browser performs a speech response process for the VoiceXML document. A command is input to the VoiceXML browser orally or by dial key entry on a telephone, and speech from the VoiceXML browser is output to the telephone.
Owner:IBM CORP
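
The transcoding flow in the abstract above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `PAGES` dictionary is a hypothetical in-memory stand-in for the Web server, "main content" is approximated as all visible text of the linked page, and the names `LinkAndTextParser`, `transcode`, and `to_voicexml` are invented for illustration. Only the `vxml`, `form`, `block`, and `prompt` elements come from VoiceXML 2.0.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape

# Hypothetical stand-in for a Web server: URL -> HTML source.
PAGES = {
    "/index": '<h1>News</h1><a href="/a">Story A</a><a href="/b">Story B</a>',
    "/a": "<p>Rain is expected today.</p>",
    "/b": "<p>The match ended in a draw.</p>",
}


class LinkAndTextParser(HTMLParser):
    """Collects visible text and links in document order."""

    def __init__(self):
        super().__init__()
        self.items = []      # ("text", str) or ("link", href, anchor_text)
        self._href = None    # href of the <a> currently open, if any
        self._anchor = []    # text pieces seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._anchor = []

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.items.append(("link", self._href, "".join(self._anchor).strip()))
            self._href = None

    def handle_data(self, data):
        if self._href is not None:
            self._anchor.append(data)
        elif data.strip():
            self.items.append(("text", data.strip()))


def parse(html):
    parser = LinkAndTextParser()
    parser.feed(html)
    parser.close()
    return parser.items


def transcode(url):
    """Return the lines to read aloud, with linked content inlined at each link."""
    lines = []
    for item in parse(PAGES[url]):
        if item[0] == "text":
            lines.append(item[1])
        else:
            _, href, anchor = item
            lines.append(anchor)
            # Insert the linked page's text at the link's location.
            lines.extend(sub[1] for sub in parse(PAGES.get(href, "")) if sub[0] == "text")
    return lines


def to_voicexml(lines):
    """Wrap each line in a VoiceXML <prompt> inside a single form block."""
    prompts = "\n".join(f"      <prompt>{escape(line)}</prompt>" for line in lines)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">\n'
        "  <form>\n    <block>\n" + prompts + "\n    </block>\n  </form>\n</vxml>\n"
    )


if __name__ == "__main__":
    print(to_voicexml(transcode("/index")))
```

Because the linked stories are inlined, the resulting document can be read from top to bottom, which matches the "listening to the radio" goal of requiring no navigation.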

Printing performance enhancements for variable data publishing

An apparatus for printing pages of a print job includes a page analyzer, a converting apparatus, an identifying apparatus, an optimizer apparatus, a storage apparatus, and a merging apparatus. The page analyzer is operative to identify static page aspects and variable page aspects from page data within a print job. The converting apparatus communicates with the page analyzer and is operative to convert the static page aspects into static page layout objects and the variable page aspects into variable print data. The identifying apparatus communicates with the converting apparatus and is operative to identify the static page layout objects in a manner that allows an optimized form to be created and allows appropriate merging with the variable print data. The optimizer apparatus communicates with the identifying apparatus and is operative to convert the static page layout objects to an optimized form. The storage apparatus communicates with the optimizer apparatus and is operative to store at least one instantiation of the static page layout objects in the optimized form. The merging apparatus communicates with the storage apparatus and is operative to merge the static page layout objects with the variable print data to create merged print data. A method is also provided.
Owner:HEWLETT PACKARD DEV CO LP
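
The pipeline in the abstract above (analyze, convert, optimize once, store, merge) can be sketched in Python under strong simplifying assumptions. Here a page is a plain dictionary of field names to values, a "static aspect" is any field with the same value on every page, and the "optimized form" is just a pre-rendered string cached with `functools.lru_cache`. All function names are hypothetical and none of this reflects the actual apparatus.

```python
from functools import lru_cache


def split_aspects(pages):
    """Separate static aspects (identical on every page) from variable ones."""
    first = pages[0]
    static = {
        key: value
        for key, value in first.items()
        if all(key in page and page[key] == value for page in pages)
    }
    variable = [
        {key: value for key, value in page.items() if key not in static}
        for page in pages
    ]
    return static, variable


@lru_cache(maxsize=None)
def optimize(static_items):
    """Stand-in for converting static layout objects to an optimized form.

    The cache plays the role of the storage apparatus: the conversion runs
    once per distinct set of static objects and is reused afterwards.
    """
    return " | ".join(f"{key}={value}" for key, value in static_items)


def print_job(pages):
    """Merge the cached static form with each page's variable print data."""
    static, variable = split_aspects(pages)
    static_key = tuple(sorted(static.items()))  # hashable key for the cache
    merged = []
    for page_vars in variable:
        var_text = ", ".join(f"{k}={v}" for k, v in sorted(page_vars.items()))
        merged.append(optimize(static_key) + " || " + var_text)
    return merged


if __name__ == "__main__":
    job = [{"logo": "ACME", "name": "Ann"}, {"logo": "ACME", "name": "Bob"}]
    for line in print_job(job):
        print(line)
```

The point of the design is that the expensive conversion of static content happens once per job rather than once per page, while only the variable data is processed for every page.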

Mobile terminal cross-platform application development framework and method based on front-end framework

The invention discloses a mobile terminal cross-platform application development framework based on a front-end framework. The mobile terminal cross-platform application development framework comprises an application performance and service logic layer, an HTML rendering layer, a mobile device API and a mobile terminal operating system platform. The application performance and service logic layer is used for conducting basic page layout, network request data processing, data loading, page navigation development and service logic processing through the front-end framework and compiling a page file; the HTML rendering layer is used for conducting calculation, layout and mobile device interface calling on a page analysis result through a browser and rendering the page onto a user interface; the mobile device API is used for achieving data interaction between the HTML rendering layer and the hardware functions of a mobile terminal operating system and is exposed for access; the mobile terminal operating system platform is used for packaging the completed project and generating an installation file capable of running under the corresponding system according to the recognized mobile terminal operating system. By means of the mobile terminal cross-platform application development framework, mobile software development efficiency can be improved, the mobile software development cycle can be shortened, and the cost can be reduced.
Owner:SUZHOU INST FOR ADVANCED STUDY USTC

Webpage information extraction method and device based on http protocol

Inactive · CN104050281A · Conducive to value mining · Easy to analyze · Special data processing applications · Database · HTML
The invention relates to a webpage information extraction method and device based on the HTTP protocol. The method comprises the steps of template generation, webpage address analysis, information extraction, information checking and information storage, wherein in the template generation step, a corresponding page analysis template is customized according to the target page from which information is to be extracted, and target fields and checking rules are predefined in the page analysis template; in the webpage address analysis step, the webpage address of the target page is analyzed to obtain an HTML source file of the target page; in the information extraction step, the HTML source file of the target page is read and analyzed, and page information matching the target fields predefined in the page analysis template is extracted from the HTML source file of the target page; in the information checking step, whether the extracted page information meets requirements is checked according to the predefined checking rules; in the information storage step, the page information that has passed information checking is stored. According to the webpage information extraction method and device, page information in a network is effectively filtered, acquired and collected through the open HTTP protocol, templates are customized according to different target pages, and customized information extraction is achieved.
Owner:北京思特奇信息技术股份有限公司
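
The extraction, checking, and storage steps in the abstract above can be sketched as follows. This is a minimal illustration, not the patented method: the template format (a regular expression plus a checking function per target field), the field names, and the functions `extract`, `check`, and `process` are all assumptions. The HTML source is passed in as a string; in practice it could be fetched with something like `urllib.request.urlopen`, which is omitted here so the example runs offline.

```python
import re

# Hypothetical page analysis template: each target field has an extraction
# pattern and a checking rule. A real template would be written per target page.
TEMPLATE = {
    "title": {
        "pattern": r"<h1[^>]*>(.*?)</h1>",
        "check": lambda value: 0 < len(value) <= 200,
    },
    "price": {
        "pattern": r'<span class="price">\s*(\d+(?:\.\d+)?)\s*</span>',
        "check": lambda value: float(value) > 0,
    },
}


def extract(html, template):
    """Information extraction: pull each target field out of the HTML source."""
    record = {}
    for field, spec in template.items():
        match = re.search(spec["pattern"], html, re.S | re.I)
        if match:
            record[field] = match.group(1).strip()
    return record


def check(record, template):
    """Information checking: every field must be present and pass its rule."""
    return all(
        field in record and spec["check"](record[field])
        for field, spec in template.items()
    )


def process(html, template, store):
    """Extract, check, and store one page. Returns True if the record was stored."""
    record = extract(html, template)
    if check(record, template):
        store.append(record)  # information storage (a list stands in for a database)
        return True
    return False


if __name__ == "__main__":
    store = []
    page = '<html><body><h1>Widget</h1><span class="price"> 12.50 </span></body></html>'
    print(process(page, TEMPLATE, store), store)
```

Keeping the patterns and rules in a template, rather than in the extraction code, is what lets the same pipeline be reused across different target pages.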

System and method for identifying and acquiring related web page information by using mobile terminal

The invention discloses a system and a method for identifying and acquiring related web page information by using a mobile terminal. The system comprises a data source, a web site information collecting terminal, a web site information identifying processing module, a data index and page analysis module and an information real-time distribution module, wherein the web site information collecting terminal is used for acquiring still images or dynamic images and transforming the still images or the dynamic images into digital images; the web site information identifying processing module is used for receiving the digital images and acquiring the web page address data contained in the digital image information through an image identification operation; the data index and page analysis module is used for receiving the web page address data, performing indexing and page analysis processing according to the web page address data, and sending the obtained web page related information to the information real-time distribution module; and the information real-time distribution module is used for feeding back the related web page information in the database corresponding to the page addresses to the mobile terminal of a user. By means of the method and the system, web pages and content containing web address information can be identified quickly through the mobile terminal, and related content information of the web pages can be acquired.
Owner:高凌

An abstract extraction method combining a page analysis rule and NLP text vectorization

The invention discloses an abstract extraction method combining a page analysis rule and NLP text vectorization. The abstract extraction method comprises the following steps: S1, extracting the HTML-format text data inside the 'body' tag of a webpage by using a Readability package; S2, obtaining the text length of the text corpus and eliminating unqualified text corpora; S3, judging whether the number of sentences in the text corpus is greater than a threshold value; S4, judging whether paragraph subtitle phrases can be obtained; S5, defining regular-expression matching keywords and removing the text matched by them to obtain a filtered text corpus; S6, judging the compliance of the language segments; and S7, training a Word2Vec model, splitting the text corpus into sentences, splitting the sentences into words, performing a vectorization operation, computing sentence similarity by using EMD, assigning weights based on the sentence similarity by using a TextRank algorithm, and determining the sentence with the highest weight as the text abstract sentence. According to the method, relatively core sentences can be obtained for long blogs and news articles, so that their subjects can be quickly understood.
Owner:重庆电信系统集成有限公司 +1
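
The final ranking step (S7) in the abstract above can be illustrated with a small pure-Python sketch of TextRank over sentences. One substitution is made deliberately: instead of Word2Vec vectors compared with EMD, the sketch uses the simpler word-overlap similarity commonly paired with TextRank, so that it runs without a trained model. The function names, the damping factor of 0.85, and the fixed iteration count are assumptions for illustration.

```python
import math
import re


def split_sentences(text):
    """Naive sentence splitter on ., ! and ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def similarity(a, b):
    """Word-overlap similarity (a stand-in for Word2Vec + EMD)."""
    words_a = set(re.findall(r"\w+", a.lower()))
    words_b = set(re.findall(r"\w+", b.lower()))
    if len(words_a) < 2 or len(words_b) < 2:
        return 0.0  # avoids log(1) = 0 in the denominator
    return len(words_a & words_b) / (math.log(len(words_a)) + math.log(len(words_b)))


def textrank(sentences, damping=0.85, iterations=50):
    """Weighted PageRank over the sentence similarity graph."""
    n = len(sentences)
    weights = [
        [similarity(sentences[i], sentences[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping)
            + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(text):
    """Return the highest-weighted sentence as the abstract sentence."""
    sentences = split_sentences(text)
    if not sentences:
        return ""
    scores = textrank(sentences)
    best = max(range(len(sentences)), key=scores.__getitem__)
    return sentences[best]


if __name__ == "__main__":
    sample = ("Cats chase mice in the barn. Dogs chase cats in the yard. "
              "The weather was pleasant.")
    print(summarize(sample))
```

Swapping `similarity` for an embedding-based measure would bring the sketch closer to the described method without changing the ranking loop.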