Webpage text parsing method and device and mobile terminal

A web page text parsing technology, applied in program control devices, network data retrieval, website content management, etc. It addresses problems such as delayed page display and achieves the effect of reducing parsing time and speeding up processing.

Active Publication Date: 2016-06-01
ALIBABA (CHINA) CO LTD
Cites: 3. Cited by: 10.

AI Technical Summary

Problems solved by technology

[0012] As the above sequence diagram shows, while an ordinary javascript script is being loaded and executed, parsing of the HTML document is suspended, which delays display of the page.

Examples

Detailed Description of Embodiments

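[0059] Specific embodiments of the present invention are described in detail below with reference to the accompanying drawings.

[0060] In the web page text parsing method and device of the present invention, after a web page element is parsed and found to be an ordinary javascript script, the ordinary javascript script is loaded and executed while, at the same time, the DOM tree node corresponding to the ordinary javascript script is constructed and the next web page element is parsed. While the javascript script is being loaded and executed, construction of the DOM tree node corresponding to the ordinary javascript script and parsing of the next web page element do not stop. This speeds up web page text processing and brings rendering and display forward, which in turn reduces the parsing, loading, and rendering display time of the whole web page.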
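The time saving described in [0060] can be illustrated with a simplified timing model. This is only a sketch under stated assumptions: the function names and the millisecond figures are hypothetical, and the model assumes a single ordinary script and that parsing of later elements is fully independent of the script.

```python
def blocking_finish_time(parse_before, load, execute, parse_after):
    """Finish time when parsing is suspended while the script loads and runs."""
    return parse_before + load + execute + parse_after


def overlapped_finish_time(parse_before, load, execute, parse_after):
    """Finish time when parsing of later elements continues during load/execute.

    The two activities run side by side, so the page is done when the
    slower of the two finishes.
    """
    return parse_before + max(load + execute, parse_after)


# Hypothetical durations in milliseconds, chosen only for illustration.
durations = dict(parse_before=20, load=100, execute=30, parse_after=50)

blocking = blocking_finish_time(**durations)
overlapped = overlapped_finish_time(**durations)
saved = blocking - overlapped
```

With these illustrative numbers the blocking timeline takes 200 ms and the overlapped timeline 150 ms, so the saving equals the time spent parsing the elements after the script.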

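[0061] Figure 2 shows a flowchart of one embodiment of the web page text parsing method of the present invention.

[0062] As shown in Figure 2, the web page text parsing method of the present invention includes:

[0063] S200: parse the web page elements of the web page text.

[0064] Before rendering a web page, the browser first fetches the web page text, that is, the source file of the web page, from the target website according to the user's request. After obtaining the web page text, it parses the web page text into a DOM tree and then lays out and renders the web page according to the DOM tree structure. A web page contains many web page elements, such as web page text, images, and javascript script files. If an element is a javascript script file, it is processed according to the type of the javascript script file.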
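As a rough illustration of parsing web page text into a DOM-like tree (S200), the following sketch uses Python's standard `html.parser` module. The `Node`, `TreeBuilder`, `build_tree`, and `tags` names and the `VOID_TAGS` set are hypothetical helpers for this example, and the tree is far simpler than a real browser DOM.

```python
from html.parser import HTMLParser

# Elements that never have children (a small, non-exhaustive assumption).
VOID_TAGS = {"img", "br", "hr", "meta", "link", "input"}


class Node:
    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.children = []
        self.parent = parent


class TreeBuilder(HTMLParser):
    """Builds a minimal tree of Node objects while the text is parsed."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node  # later elements become children

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag; ignore stray end tags.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.current.children.append(
                Node("#text", [("data", text)], self.current)
            )


def build_tree(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def tags(node):
    """Element tags in document order, skipping #document and #text nodes."""
    found = [] if node.tag.startswith("#") else [node.tag]
    for child in node.children:
        found.extend(tags(child))
    return found


page = (
    '<html><body><p>Hi</p><img src="a.png">'
    '<script src="a.js"></script><p>After</p></body></html>'
)
root = build_tree(page)
element_tags = tags(root)
```

The resulting tree contains text, image, and script elements side by side, which is the point at which a parser would decide how to treat each script element.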

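[0065] S210: determine that the currently parsed web page element is an ordinary javascript script.

[0066] When the browser parses a web page element of the web page text, it first parses the element's HTML tag information. When the parsed web page element has a <script> tag, it is considered an ordinary javascript script.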
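The check in S210 can be sketched as a small classifier over the parsed tag information. The `classify` function and `ScriptClassifier` class are hypothetical names; the sketch assumes that a script is "ordinary" when it carries neither the defer nor the async attribute, and that async wins when both are present.

```python
from html.parser import HTMLParser


def classify(tag, attrs):
    """Return 'ordinary', 'defer', 'async', or 'not-script' for an element."""
    if tag != "script":
        return "not-script"
    names = {name for name, _ in attrs}
    if "async" in names:  # assumption: async takes precedence over defer
        return "async"
    if "defer" in names:
        return "defer"
    return "ordinary"


class ScriptClassifier(HTMLParser):
    """Records (src, kind) for every script element encountered."""

    def __init__(self):
        super().__init__()
        self.kinds = []

    def handle_starttag(self, tag, attrs):
        kind = classify(tag, attrs)
        if kind != "not-script":
            self.kinds.append((dict(attrs).get("src"), kind))


parser = ScriptClassifier()
parser.feed(
    '<script src="a.js"></script>'
    '<script defer src="b.js"></script>'
    '<script async src="c.js"></script>'
    "<p>text</p>"
)
parser.close()
kinds = parser.kinds
```

Only elements classified as ordinary would go on to the simultaneous S220 and S230 steps.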

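[0067] After it is confirmed that the current web page element is an ordinary javascript script, S220 and S230 are executed simultaneously.

[0068] S220: load the ordinary javascript script to obtain the javascript executable file of the ordinary javascript script. Loading the ordinary javascript script here means fetching its javascript executable file from the web server.

[0069] S230: construct the DOM tree node corresponding to the ordinary javascript script.

[0070] After S220 completes, proceed to S240 to execute the javascript executable file of the ordinary javascript script.

[0071] After the javascript file of the ordinary javascript script has been obtained, the javascript file is executed. Here, execution of the javascript file includes the execution of certain operations or those related to the current DO...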
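The ordering of S220, S230, and S240 can be sketched with two cooperating `asyncio` tasks. This is an illustrative model, not the patent's implementation: all function names are hypothetical, network latency is simulated by a fixed number of event-loop yields (`latency_ticks`), and "executing" a script only appends to a log.

```python
import asyncio


async def load_script(src, log, latency_ticks=10):
    """S220: fetch the script; latency is simulated by yielding to the loop."""
    log.append(f"load-start {src}")
    for _ in range(latency_ticks):
        await asyncio.sleep(0)
    log.append(f"load-done {src}")
    return f"/* body of {src} */"


async def build_dom_node(tag, log):
    """S230 and later parsing: build one DOM tree node."""
    await asyncio.sleep(0)  # give the loading task a chance to progress
    log.append(f"dom-node {tag}")


def execute_script(src, code, log):
    """S240: run the loaded script (only recorded here)."""
    log.append(f"execute {src}")


async def load_then_execute(src, log):
    code = await load_script(src, log)  # S220
    execute_script(src, code, log)      # S240 starts only after S220 completes


async def build_then_continue(next_tags, log):
    await build_dom_node("script", log)  # S230
    for tag in next_tags:                # parsing of later elements goes on
        await build_dom_node(tag, log)


async def handle_ordinary_script(src, next_tags):
    log = []
    # S220 and S230 start together; neither chain waits for the other.
    await asyncio.gather(
        load_then_execute(src, log),
        build_then_continue(next_tags, log),
    )
    return log


log = asyncio.run(handle_ordinary_script("a.js", ["p", "img"]))
```

In the resulting log the DOM nodes for the script and the following elements are built while the load is still in flight, and execution appears only after the load has finished.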

Abstract

The invention discloses a web page text parsing method and device and a mobile terminal. After a web page element is parsed and found to be an ordinary javascript script, the script is loaded and, at the same time, the DOM (Document Object Model) tree node corresponding to the script is constructed. Once loading finishes, the script is executed; once the DOM tree node has been constructed, the next web page element is parsed. While the ordinary javascript script is being loaded and executed, construction of its DOM tree node and parsing of the next web page element do not stop, which speeds up web page text processing. The parsing, loading, and rendering display time of the whole web page is therefore shortened, and the elements that follow the ordinary javascript element are rendered and displayed earlier.

Description

Technical field

[0001] The present invention relates to the technical field of mobile communication and, more specifically, to a web page text parsing method and device.

Background technique

[0002] When a browser renders a web page, it first parses the web page text into a DOM tree and then renders the page according to the DOM tree. The web page resources that affect when rendering can start are mainly external css style files and javascript script files. A css style file affects the rendering result of the page, so mainstream browsers wait for css style files to finish loading before starting the rendering process. For javascript script files, there are currently three types: <script> elements with the defer attribute, <script> elements with the async attribute, and ordinary <script> elements. The standard timing relationships between parsing, loading, and executing scripts in current browsers are shown in Figures 1A, 1B, and 1C, and differ for each type:

[0003] Figure 1A shows the processing sequence diagram of an ordinary javascript script <script> in the prior art.

[0004] In the figure, line 1 represents the web page text parsing timeline, and line 2 represents an ordinary <scrip...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/445, G06F17/30, G06F40/143, H04M1/72445
CPC: G06F16/986, G06F40/221, H04M1/72445, G06F40/143, G06F9/4488
Inventor: 周超, 贺永明, 胡立琼
Owner: ALIBABA (CHINA) CO LTD