The invention discloses a system for identifying a text floor of a webpage. The system comprises a webpage analysis and layout module, a node identifying module, a floor dividing module and a mobile terminal page generation module. The webpage analysis and layout module is suitable for parsing the source code of the webpage and carrying out layout calculation on the parsing result to generate a DOM (Document Object Model) tree; the node identifying module is suitable for traversing from the root node of the DOM tree to identify text nodes and garbage word nodes in the DOM tree; the floor dividing module is suitable for dividing the identified text nodes according to the floors of the webpage; and the mobile terminal page generation module is suitable for generating a mobile terminal page. According to the system and the method for identifying the text floor of a webpage, once the conventional content of an Internet webpage has been identified and extracted, BBS text, news text and comments can be effectively extracted, and the floor-by-floor presentation of the text in the original webpage can be restored. The presentation retains the original multi-floor characteristics, so as to provide an excellent reading experience for users.
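
The four modules described above can be illustrated with a minimal Python sketch. It is not the disclosed implementation, and several details are assumptions made only for illustration: the layout calculation is omitted, a floor is assumed to be any element carrying a hypothetical CSS class `floor`, and garbage word nodes are detected with a hypothetical keyword list (`GARBAGE_WORDS`). All class and function names are invented for this example.

```python
from dataclasses import dataclass, field
from html import escape
from html.parser import HTMLParser

VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}
# Hypothetical garbage-word list; the abstract does not say how such nodes are detected.
GARBAGE_WORDS = ("advertisement", "sign in", "share")


@dataclass
class Node:
    tag: str
    attrs: dict = field(default_factory=dict)
    text: str = ""
    children: list = field(default_factory=list)


class TreeBuilder(HTMLParser):
    """Analysis module stand-in: builds a minimal DOM-like tree (no layout step)."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag, dict(attrs))
        self.stack[-1].children.append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating sloppy markup.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        if data.strip():
            self.stack[-1].children.append(Node("#text", text=data.strip()))


def is_garbage(text):
    lowered = text.lower()
    return any(word in lowered for word in GARBAGE_WORDS)


def collect_text(node, out):
    """Node identifying step: depth-first walk keeping non-garbage text nodes."""
    if node.tag == "#text":
        if not is_garbage(node.text):
            out.append(node.text)
        return
    for child in node.children:
        collect_text(child, out)


def divide_floors(root, floor_class="floor"):
    """Floor dividing step: one list of text strings per floor, in page order."""
    floors = []

    def walk(node):
        if floor_class in (node.attrs.get("class") or "").split():
            texts = []
            collect_text(node, texts)
            floors.append(texts)
            return
        for child in node.children:
            walk(child)

    walk(root)
    return floors


def identify_floors(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return divide_floors(builder.root)


def render_mobile_page(floors):
    """Mobile page generation step: emit one numbered section per floor."""
    sections = []
    for number, texts in enumerate(floors, start=1):
        body = "".join(f"<p>{escape(t)}</p>" for t in texts)
        sections.append(f'<section data-floor="{number}">{body}</section>')
    return "<!DOCTYPE html><html><body>" + "".join(sections) + "</body></html>"
```

Calling `render_mobile_page(identify_floors(source))` on a forum page would yield a simplified page that keeps each post in its own numbered section. A real system would presumably use the layout results (position, size, visibility) rather than a class name to decide where floors begin and end.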