Method and apparatus for quickly searching for contents required to be queried

A fast content-search technique, applied in the field of search engines, that addresses the problems of widely scattered data sources, limited search features, and heavily duplicated content, and achieves efficient and accurate queries, high matching efficiency, and a good user experience.

Inactive Publication Date: 2017-01-04
GUANGDONG IDATATECH CO LTD
8 Cites, 9 Cited by

AI Technical Summary

Problems solved by technology

[0003] However, general crawler search has the following shortcomings. Because the goal of crawling is to cover as much of the network as possible, the crawl results inevitably contain a large number of web pages that users do not need. For data with a certain structure, general search engines rely mostly on keyword retrieval, which makes it difficult to meet the requirements of querying semantic information and of an intelligent indexing engine.
The s

Method used


Examples


Example Embodiment

[0047] The technical solutions in the embodiments of the present invention will be clearly and completely described below in conjunction with the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, rather than all the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.

[0048] In the embodiments of the method and device for quickly searching for the content to be queried, the flow chart of the method is shown in Figure 1. As shown in Figure 1, the method for quickly searching for the content to be queried includes the following steps:

[0049] Step S01: Use a web crawler system to collect various data from the Internet, and associate the collec...



Abstract

The invention discloses a method and an apparatus for quickly searching for content to be queried. The method comprises the steps of: collecting various data from the Internet and storing the data in association with the corresponding nodes of a graph structure in a graph database; converting unstructured data into structured data suitable for analysis applications; cleaning the data and building a unified data model; establishing a data warehouse using HBase and loading the cleaned data into the data warehouse; associating dispersed data through company names, abbreviations, or stock codes, and storing the dispersed data in the corresponding nodes according to the nodes and relationships of the graph structure; extracting the data stored in each node of the graph structure and establishing a Chinese index; and, when a statement to be queried is input, searching for related graph structures using a traversal algorithm and ranking the graph structures found by their correlation values. With the method and the apparatus, retrieval is fast, queries are efficient and accurate, matching efficiency is relatively high, and users get a relatively good experience.
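The association, traversal, and ranking steps in the abstract can be illustrated with a small in-memory sketch. It is only an illustration under stated assumptions, not the patented implementation: the class `CompanyGraph`, its method names, the whitespace tokenization (standing in for the Chinese index), and the hit-count-over-distance score (standing in for the "correlation value") are all hypothetical, and a plain dictionary replaces the graph database and HBase warehouse.

```python
from collections import defaultdict, deque


class CompanyGraph:
    """Toy graph: each node is a company, holding the records linked to it."""

    def __init__(self):
        self.alias_to_node = {}            # name / abbreviation / stock code -> node id
        self.records = defaultdict(list)   # node id -> list of text records
        self.edges = defaultdict(set)      # node id -> neighbouring node ids

    def add_company(self, node_id, aliases):
        """Register a node together with every key that may refer to it."""
        for alias in aliases:
            self.alias_to_node[alias.lower()] = node_id
        self.records[node_id]  # make sure the node exists even with no records

    def attach(self, key, text):
        """Store a dispersed record under the node its key resolves to."""
        node_id = self.alias_to_node.get(key.lower())
        if node_id is None:
            return False                   # unknown company: record is not linked
        self.records[node_id].append(text)
        return True

    def relate(self, a, b):
        """Add an undirected relationship between two nodes."""
        self.edges[a].add(b)
        self.edges[b].add(a)

    def search(self, query, max_depth=1):
        """Breadth-first traversal from nodes named in the query, then rank."""
        terms = query.lower().split()      # assumption: whitespace tokenization
        seeds = {nid for alias, nid in self.alias_to_node.items() if alias in terms}
        depth = {nid: 0 for nid in seeds}
        queue = deque(seeds)
        while queue:
            node = queue.popleft()
            if depth[node] == max_depth:
                continue
            for neighbour in self.edges[node]:
                if neighbour not in depth:
                    depth[neighbour] = depth[node] + 1
                    queue.append(neighbour)
        scored = []
        for nid, dist in depth.items():
            hits = sum(text.lower().count(t) for text in self.records[nid] for t in terms)
            scored.append((nid, hits / (1 + dist)))   # hypothetical correlation value
        scored.sort(key=lambda pair: (-pair[1], pair[0]))
        return scored
```

In this sketch, records that arrive under a stock code and records that arrive under an abbreviation end up on the same node, which is the point of the association step; nodes reached through a relationship are still returned but are discounted by their distance from the node named in the query.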

Description

Technical field

[0001] The invention relates to the field of search engines, in particular to a method and device for quickly searching for content to be queried.

Background

[0002] Search engines rely on web crawlers to effectively collect relevant web page information from the massive volume of Internet data. How to improve the search efficiency of web crawlers is a research hotspot in this field. A traditional web crawler consists of two parts: a protocol processing module and a URL detection module (a URL, or uniform resource locator, also known as a web address, is the address of a standard resource on the Internet). The protocol module provides the network protocols the web crawler needs in order to fetch web pages; the detection module is responsible for sorting the collected URL information and handling duplicate content on the network, which improves the search efficiency of the web crawler.

[0003] However, gene...
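The duplicate handling attributed to the URL detection module in paragraph [0002] can be sketched as follows. This is a minimal illustration, assuming that two URLs count as duplicates when they match after lowercasing the scheme and host, defaulting an empty path to "/", and dropping the fragment; the names `normalize_url` and `UrlFrontier` are hypothetical, and the source does not describe the module at this level of detail.

```python
from collections import deque
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url):
    """Reduce a URL to a canonical form used as the deduplication key."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = parts.netloc.lower()
    path = parts.path or "/"               # treat an empty path as the root
    # The fragment never reaches the server, so it is dropped.
    return urlunsplit((scheme, host, path, parts.query, ""))


class UrlFrontier:
    """Queue of URLs to crawl that silently rejects already-seen URLs."""

    def __init__(self):
        self._seen = set()
        self._queue = deque()

    def add(self, url):
        """Enqueue the URL; return False if an equivalent URL was seen before."""
        key = normalize_url(url)
        if key in self._seen:
            return False
        self._seen.add(key)
        self._queue.append(key)
        return True

    def next(self):
        """Return the next URL to fetch, or None when the queue is empty."""
        return self._queue.popleft() if self._queue else None
```

A set lookup keeps the duplicate check at constant time per URL, which matters because the check runs once for every link the crawler discovers.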

Claims


Application Information

IPC(8): G06F17/30
CPC: G06F16/951
Inventors: 陈乐华, 涂继来, 黄晓晖
Owner: GUANGDONG IDATATECH CO LTD