Method and apparatus for quickly searching for contents required to be queried

A technology for quickly searching content, applied in the field of search engines. It addresses the problems of wide-ranging data sources, limited search features, and heavily duplicated content, and aims for efficient and accurate queries, high matching efficiency, and a good user experience.

Inactive Publication Date: 2017-01-04
GUANGDONG IDATATECH CO LTD

AI Technical Summary

Problems solved by technology

[0003] However, general-purpose crawler search has the following shortcomings. Because the goal of crawling is to cover as much of the network as possible, the crawl results inevitably contain a large number of web pages that users do not need. For data with a certain structure, general search engines rely mostly on keyword retrieval, which makes it difficult to support semantic queries and intelligent indexing.
The seemingly powerful search engine actually has many disadvantages. For example, because the data sources are so wide-ranging, duplicated content is plentiful and complex; the rate of dead links is high, and link information is incomplete.
[0007] With the popularity of AJAX and Web 2.0, capturing dynamic pages such as AJAX pages has become an urgent problem for search engines. If search engines keep using the conventional "crawling" mechanism, they will not be able to capture valid data on AJAX pages.


Examples

Embodiment Construction

[0047] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0048] In this embodiment of the method and apparatus for quickly searching for contents required to be queried, the flow chart of the method is shown in Figure 1. As shown in Figure 1, the method includes the following steps:

[0049] Step S01 uses the web crawler system to collect various data from the Internet, and associates the collected data wi...

Abstract

The invention discloses a method and an apparatus for quickly searching for contents required to be queried. The method comprises the steps of: acquiring various data from the Internet and storing the data in association with the corresponding nodes of a graph structure in a graph database; converting unstructured data into structured data that analysis applications can use; cleaning the data and building a unified data model; establishing a data warehouse using HBase and loading the cleaned data into it; associating dispersed data through company names, abbreviations, or stock codes, and storing the data in the corresponding nodes according to the nodes and relationships of the graph structure; extracting the data stored in each node of the graph structure and building a Chinese-language index; and accepting an input query statement, searching for related graph structures with a traversal algorithm, and ranking the results by correlation value. According to the method and the apparatus, retrieval is fast, queries are efficient and accurate, matching efficiency is relatively high, and users get a relatively good experience.
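The association, indexing, and traversal steps in the abstract can be illustrated with a small in-memory sketch. This is not the patented implementation: the class `CompanyGraph`, its method names, the whitespace tokenizer (a stand-in for a real Chinese-language index), the breadth-first traversal, and the `decay` scoring rule are all assumptions made for illustration, and plain dictionaries stand in for the graph database and the HBase warehouse.

```python
from collections import deque


class CompanyGraph:
    """Toy graph store: nodes hold records, aliases resolve to nodes."""

    def __init__(self):
        self.nodes = {}   # node id -> list of stored text records
        self.edges = {}   # node id -> set of neighbouring node ids
        self.alias = {}   # lowercased name / abbreviation / stock code -> node id
        self.index = {}   # token -> set of node ids (inverted index)

    def add_company(self, node_id, aliases):
        self.nodes.setdefault(node_id, [])
        self.edges.setdefault(node_id, set())
        for name in aliases:
            self.alias[name.lower()] = node_id

    def relate(self, a, b):
        # Undirected relationship between two company nodes.
        self.edges[a].add(b)
        self.edges[b].add(a)

    def store(self, key, text):
        """Attach a cleaned record to the node that `key` resolves to."""
        node_id = self.alias[key.lower()]
        self.nodes[node_id].append(text)
        for token in text.lower().split():
            self.index.setdefault(token, set()).add(node_id)
        return node_id

    def search(self, query, max_depth=1, decay=0.5):
        """Score directly matching nodes, then spread scores to neighbours."""
        scores = {}
        for token in query.lower().split():
            hits = set(self.index.get(token, ()))
            if token in self.alias:
                hits.add(self.alias[token])
            for node_id in hits:
                scores[node_id] = scores.get(node_id, 0.0) + 1.0
        # Breadth-first traversal outward from every matching node.
        queue = deque((n, s, 0) for n, s in scores.items())
        while queue:
            node_id, score, depth = queue.popleft()
            if depth == max_depth:
                continue
            for neighbour in self.edges[node_id]:
                candidate = score * decay
                if candidate > scores.get(neighbour, 0.0):
                    scores[neighbour] = candidate
                    queue.append((neighbour, candidate, depth + 1))
        # Highest score first; ties broken by node id for determinism.
        return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


g = CompanyGraph()
g.add_company("acme", ["Acme Corp", "ACME", "600001"])
g.add_company("globex", ["Globex", "600002"])
g.relate("acme", "globex")
g.store("600001", "annual report revenue growth")
g.store("Globex", "supplier contract signed")
results = g.search("revenue 600001")
```

Here records arriving under a stock code and under a company name both land on the right node, and a query that matches one company also surfaces its related company with a lower score. A multi-word alias such as "Acme Corp" would not match under this simple tokenizer; a real system would need proper entity matching and Chinese word segmentation.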

Description

Technical field

[0001] The invention relates to the field of search engines, in particular to a method and apparatus for quickly searching for contents required to be queried.

Background

[0002] Search engines rely on web crawlers to collect relevant web page information from the massive volume of Internet data. How to improve the search efficiency of web crawlers is a research hotspot in this field. A traditional web crawler that processes URLs (a URL, or uniform resource locator, also known as a web address, is the address of a standard resource on the Internet) consists of two parts: a protocol module and a detection module. The protocol module provides the network protocols the crawler needs in order to fetch web pages; the detection module sorts the collected URL information and handles duplicate content on the network to improve the crawler's search efficiency.

[0003] However, gene...

Application Information

IPC(8): G06F17/30
CPC: G06F16/951
Inventor: 陈乐华, 涂继来, 黄晓晖
Owner: GUANGDONG IDATATECH CO LTD