
A network crawler system and method based on the scientific research subject of a party school

A web crawling technology for party school scientific research, applied in the field of Internet search engines. It addresses the low relevance and low accuracy of search results and improves search accuracy.

Inactive Publication Date: 2018-12-07
合肥明高软件技术有限公司
0 Cites, 2 Cited by

AI Technical Summary

Problems solved by technology

[0004] The object of the present invention is to provide a web crawler system and method based on the scientific research work theme of a party school. The Shark-Search algorithm is adapted to the characteristics of party school scientific research work in order to build a search engine for that theme. The theme is established from keywords, each with its own specified weight, and a topic correlation analysis module is used to optimize the theme and filter web pages. This addresses the low search relevance and low accuracy of search information in existing web searches on party school scientific research themes.
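The weighted-keyword idea in [0004] can be illustrated with a minimal sketch. The excerpt does not give the patent's scoring formula, so the scoring rule below (share of total keyword weight matched), the keyword weights, the threshold, and the names `TOPIC`, `relevance`, and `keep_page` are all assumptions made for illustration.

```python
# Hypothetical topic definition: keyword -> weight.
# The keywords and weights are illustrative; the patent excerpt gives no values.
TOPIC = {
    "party school": 3.0,
    "scientific research": 2.0,
    "theory": 1.0,
}


def relevance(text, topic):
    """Return a score in [0, 1]: the share of total keyword weight
    whose keyword occurs at least once in the text.

    This is an assumed scoring rule, not the formula used in the patent.
    """
    total = sum(topic.values())
    if total == 0:
        return 0.0
    lowered = text.lower()
    matched = sum(weight for keyword, weight in topic.items() if keyword in lowered)
    return matched / total


def keep_page(text, topic, threshold=0.5):
    """Filter step: retain a page only if its score reaches the threshold."""
    return relevance(text, topic) >= threshold
```

With these assumed weights, a page that mentions the two heaviest keywords is retained, while a page that mentions none of them scores zero and is filtered out.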



Examples


Embodiment Construction

[0046] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without creative efforts fall within the protection scope of the present invention.

[0047] As shown in Figure 1, a web crawler system based on the party school scientific research work theme of the present invention comprises an HTML document, an initial seed module, a crawling module, a database, a topic correlation analysis module, a sorting module, and a topic establishment module.

[0048] The topic establishment module is used to establish the topic that the crawler targets;

[0049] The topic correlation analysis module is used to calculate the topic correla...


Abstract

The invention discloses a web crawler system and method based on the scientific research work theme of a party school, relating to the technical field of Internet search engines. The web crawler system comprises an initial seed module, a crawling module, a database, a topic correlation analysis module, a sorting module, and a topic establishment module. The crawler's working method comprises the following steps: (1) the crawling module retrieves a web page; (2) the correlation analysis module is called to analyze the relevance of that web page; (3) the crawling module discards or retains the web page according to the analysis result; (4) the crawling module extracts the URLs to be processed from the database; (5) the sorting module ranks the web pages by importance; (6) the crawling module checks whether there is a new URL in the database. By establishing a search engine for party school scientific research work and using the topic correlation analysis module to optimize the topic and filter web pages, the invention improves the relevance and precision of search information for web pages on party school scientific research work.
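The six steps in the abstract can be sketched as a small focused-crawl loop. This is a simplified illustration under stated assumptions, not the patented method: an in-memory priority queue stands in for the database, `fetch` and `score` are hypothetical callables supplied by the caller, and ranking new URLs by their parent page's score is only loosely modeled on the inherited-score idea in Shark-Search.

```python
import heapq


def crawl(seeds, fetch, score, threshold=0.5, max_pages=100):
    """Focused crawl over an abstract page source.

    fetch(url) -> (text, links) or None   (hypothetical page loader)
    score(text) -> float in [0, 1]        (hypothetical relevance function)
    Returns the retained pages as a list of (url, score) tuples.
    """
    # The frontier stands in for the database of URLs to be processed.
    # heapq is a min-heap, so priorities are negated to pop the best first.
    frontier = [(-1.0, url) for url in seeds]
    heapq.heapify(frontier)
    seen = set(seeds)
    kept = []

    # Step 6: continue only while unprocessed URLs remain.
    while frontier and len(kept) < max_pages:
        # Steps 4-5: take the pending URL with the highest priority.
        _, url = heapq.heappop(frontier)

        # Step 1: retrieve the page.
        page = fetch(url)
        if page is None:
            continue
        text, links = page

        # Step 2: relevance analysis.
        page_score = score(text)

        # Step 3: discard irrelevant pages and do not follow their links.
        if page_score < threshold:
            continue
        kept.append((url, page_score))

        # New URLs inherit the parent's score as their priority (an assumption).
        for link in links:
            if link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-page_score, link))

    return kept
```

Because links on discarded pages are never queued, the crawl stays close to relevant pages; a fuller implementation would also persist the frontier and score anchor text, neither of which is attempted here.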

Description

Technical field

[0001] The invention belongs to the technical field of Internet search engines, and in particular relates to a web crawler system and method based on the scientific research work theme of a party school.

Background technique

[0002] Traditional general-purpose search engines face enormous challenges. First, web information resources are growing exponentially, and search engines cannot index all pages. Second, users in different fields have different search needs, and "broad and extensive" general-purpose search engines cannot meet the "specialized and refined" search needs of professional users. In response to these challenges, various "theme search engines" aimed at specific groups of users have emerged.

[0003] At the same time, with the continuous development of the scientific research work of the party school in our country, the scientific research resources of the party school have exceeded the terabyte level, but there is no effective way to retrieve infor...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 徐玉红
Owner: 合肥明高软件技术有限公司