Method and device for identifying web page information

A web page information identification technology, applied in the fields of instrumentation, computing, and electrical digital data processing.

Active Publication Date: 2018-08-24
Alibaba (China) Network Technology Co., Ltd. (阿里巴巴(中国)网络技术有限公司)

AI Technical Summary

Problems solved by technology

However, seller users publish a very large number of webpages. Because human resources are limited, only a small number of webpages can be covered by spot checks. Manual spot checking is therefore difficult to apply widely, and its efficiency is very low.

Examples

Embodiment 1

[0045] See Figure 1, which is a flow chart of the web page information identification method disclosed in Embodiment 1 of the present application. The method includes the following steps:

[0046] Step 101: Obtain web page log information from the database. The web page log information includes any one or more of the following: the feature information of the description object in the publishing log, in the exposure log, in the click log, and in the transaction log;

[0047] As a new type of information carrier, a webpage of a website carries the information of a specific object for website users to browse; that specific object is the description object of the webpage. The webpages of different websites have different description objects. For example, for shopping websites such as Taobao, Jingdong, Amazon, and Dangdang, the description objects of their webpages can be products (ie, ...
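As a rough illustration of Step 101, the sketch below gathers feature information for each description object from the four log types. The record layout, the field names (`log_type`, `object_id`, `category`, `price`), and the helper `collect_features` are hypothetical; the excerpt does not specify a schema, and price is used only as an example feature.

```python
from collections import defaultdict
from dataclasses import dataclass

# The four log types named in Step 101 (labels are illustrative).
LOG_TYPES = ("publish", "exposure", "click", "transaction")


@dataclass(frozen=True)
class LogRecord:
    log_type: str   # one of LOG_TYPES
    object_id: str  # hypothetical ID of the description object (e.g. a product)
    category: str   # category the description object belongs to
    price: float    # example feature; other features could be used instead


def collect_features(records, log_types=LOG_TYPES):
    """Return {object_id: [feature values]} using only the selected log types."""
    features = defaultdict(list)
    for record in records:
        if record.log_type in log_types:
            features[record.object_id].append(record.price)
    return dict(features)


records = [
    LogRecord("publish", "p1", "phones", 2999.0),
    LogRecord("click", "p1", "phones", 2999.0),
    LogRecord("transaction", "p1", "phones", 2899.0),
    LogRecord("publish", "p2", "phones", 1.0),
]
all_features = collect_features(records)
transaction_only = collect_features(records, log_types=("transaction",))
```

Passing a subset of log types mirrors the "any one or more" wording of the step: the same function can draw on all four logs or on the transaction log alone.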

Embodiment 2

[0080] Because each category contains a wide variety of description objects whose feature information differs greatly, the accuracy of a category-level judgment is not high. Embodiment 2 of the present invention therefore provides an information identification method that further identifies, for each subcategory within each category, whether the information of a description object is false. See Figure 2, which is a flow chart of another web page information identification method disclosed in Embodiment 2 of the present application, including the following steps:

[0081] Step 201: Obtain web page log information from the database. The web page log information includes any one or more of the following: the feature information of the description object in the publishing log, in the exposure log, in the click log, and in the transaction log;

[0082] Wherein, the feature information describing the...
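The refinement in Embodiment 2 can be pictured as keying the log statistics by (category, subcategory) instead of by category alone. The following is a minimal sketch under that reading; the tuple layout and the function name `group_by_subcategory` are assumptions made for illustration and do not come from the patent text.

```python
from collections import defaultdict


def group_by_subcategory(rows):
    """Group feature values by (category, subcategory).

    rows: iterable of (category, subcategory, feature_value) tuples
    (a hypothetical layout chosen for this sketch).
    """
    groups = defaultdict(list)
    for category, subcategory, value in rows:
        groups[(category, subcategory)].append(value)
    return dict(groups)


rows = [
    ("electronics", "phones", 2999.0),
    ("electronics", "phones", 3199.0),
    ("electronics", "chargers", 59.0),
    ("clothing", "t-shirts", 79.0),
]
groups = group_by_subcategory(rows)
```

Grouping phones and chargers separately keeps a 59.0 charger from looking abnormal merely because it shares the "electronics" category with much more expensive products.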

Embodiment 3

[0102] In the following, a method for identifying web page information is described in more detail, taking as an example the case where the statistical model is a Gaussian mixture model, the description object is a product, and the feature information includes price information and title information. Product categories are divided according to the industry to which a product belongs, and product subcategories according to the type of product. See Figure 3, which is a flow chart of the information identification method disclosed in Embodiment 3 of the present application, including the following steps:

[0103] Step 301: Extract web page log information from the database. The web page log information includes any one or more of the following: the price information of the product in the publishing log, in the exposure log, in the click log, and in the transaction log;

[0104] Step 302: The webpage log information obtained according ...
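To make the Gaussian mixture model of Embodiment 3 more concrete, the sketch below fits a one-dimensional mixture to the prices of a single product subcategory with plain expectation-maximization and flags prices whose mixture density is very low. The excerpt does not show how the normal range is derived from the model, so the density threshold `min_density`, the initialization scheme, and all function names are assumptions made for illustration only.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(xs, k=2, iters=50, min_var=1e-6):
    """Fit a k-component 1-D Gaussian mixture with plain EM."""
    xs = sorted(xs)
    n = len(xs)
    # Deterministic initialization: spread the means over the sorted data.
    means = [xs[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    overall_mean = sum(xs) / n
    overall_var = max(sum((x - overall_mean) ** 2 for x in xs) / n, min_var)
    variances = [overall_var] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each price.
        resp = []
        for x in xs:
            p = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300  # guard against underflow to zero
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, min_var)
    return weights, means, variances


def mixture_density(x, model):
    weights, means, variances = model
    return sum(w * gaussian_pdf(x, m, v)
               for w, m, v in zip(weights, means, variances))


def is_abnormal(price, model, min_density=1e-6):
    """Assumed rule: a price is outside the normal range if its density is tiny."""
    return mixture_density(price, model) < min_density


# Illustrative prices for one subcategory with a budget and a premium cluster.
prices = [95, 98, 100, 102, 105, 99, 101, 97,
          980, 1000, 1020, 990, 1010, 1005, 995, 1000]
model = fit_gmm_1d(prices, k=2)
```

A mixture is a natural fit here because prices within one subcategory often cluster around several typical price points; a single mean and standard deviation would treat the gap between the clusters as normal, whereas the mixture assigns it a very low density.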

Abstract

An embodiment of the invention discloses a method and a device for identifying webpage information. The identification method comprises the following steps: acquiring webpage log information from a database; dividing the acquired webpage log information according to the categories to which the described objects belong, and compiling statistics on the webpage log information in each category; establishing a statistical model for each category from the compiled webpage log information of that category, and determining, according to the statistical model, the feature information distribution of the described objects of each category; judging whether the feature information of the described object in the webpage information to be identified falls within the normal range of the feature information distribution of the category to which the described object belongs; and if so, determining that the webpage information is real information, otherwise determining that it is false information. According to the embodiment, false product information can be identified automatically, which improves identification efficiency.
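The steps of the abstract can be sketched end to end as follows. This is a simplified illustration, not the patented implementation: the statistical model is reduced to a per-category mean and standard deviation, the normal range is assumed to be mean ± k·std, and the names `build_models`, `identify`, and the parameter `k` are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def build_models(log_rows):
    """Divide log rows by category and summarize each category.

    log_rows: iterable of (category, feature_value) tuples.
    Returns {category: (mean, standard deviation)}.
    """
    by_category = defaultdict(list)
    for category, value in log_rows:
        by_category[category].append(value)
    return {c: (mean(v), pstdev(v)) for c, v in by_category.items()}


def identify(category, value, models, k=3.0):
    """Return 'real' if value is within mean ± k*std of its category, else 'false'."""
    mu, sigma = models[category]
    return "real" if abs(value - mu) <= k * sigma else "false"


# Illustrative log data: (category, price) pairs.
log_rows = [("phones", p) for p in (2800.0, 2900.0, 3000.0, 3100.0, 3200.0)]
log_rows += [("chargers", p) for p in (49.0, 55.0, 59.0, 62.0, 65.0)]

models = build_models(log_rows)
typical = identify("phones", 2950.0, models)
suspicious = identify("phones", 1.0, models)
```

With this data, a phone listed at 2950.0 falls inside the assumed normal range, while a phone listed at 1.0 falls far outside it and is labeled false.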

Description

Technical field

[0001] The invention relates to the field of computer application technology, and in particular to a method and device for identifying web page information.

Background

[0002] On a third-party shopping platform, seller users publish product webpages through the platform, and buyer users use the platform's search engine to find, among the webpages published by sellers, those that meet specific search conditions. The search engine displays these webpages to buyer users as a list of search results, and buyer users browse the results to decide whether to click on a result and view the product in detail. In addition, when a buyer user searches for product webpages that satisfy a specific search condition, the search engine also ranks the resulting webpages based on their webpage information. Therefore, in or...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/30
CPC: G06F16/9535
Inventor: Feng Jinghua (冯景华), Chen Chao (陈超), Yang Baochun (杨宝春)
Owner: Alibaba (China) Network Technology Co., Ltd. (阿里巴巴(中国)网络技术有限公司)