Professional intelligence information acquisition method, apparatus and device, and storage medium

An intelligence information collection technology, applied in fields such as digital data information retrieval, web data retrieval using information identifiers, and instruments. It addresses incomplete or incorrect processing of websites, the collection of blank or garbage data, and the high difficulty of data acquisition, with the effects of reducing repeated collection, improving collection efficiency, and avoiding duplicate storage.

Pending Publication Date: 2022-05-03
深圳市易海聚信息技术有限公司

AI Technical Summary

Problems solved by technology

[0003] However, information on the Internet is becoming increasingly complex, and obtaining material of various kinds is becoming increasingly difficult. Most monitoring systems based on open-source crawler technology cannot comprehensively and correctly handle websites of differing styles and technologies, especially AJAX websites, from which they easily collect blank data, garbage data, or non-target data. As a result, when users of such a system add a target website themselves, the addition fails as soon as the site is even slightly complicated.



Examples


Embodiment Construction

[0037] In order to make the above-mentioned features and advantages of the present invention more comprehensible, the following specific embodiments are described in detail with reference to the accompanying drawings, but the present invention is not limited thereto.

[0038] As shown in Figure 1, the present invention provides a professional intelligence information collection method comprising the following steps:

[0039] S01: Read the task plan, which is used to manage tasks in batches; the task plan includes the scheduling start time, the scheduling frequency, and the number of processes for collection from the target site;

[0040] S02: Collect data according to the task plan, with the collection performed according to the collection configuration;

[0041] The collection configuration includes URL, API, title, publication time, author, content, and collection time; for websites with restricted permissions, the collection configuration also...
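The task plan and collection configuration described in steps S01 and S02 could be modelled roughly as follows. This is only a minimal sketch: the class names `TaskPlan` and `CollectionConfig`, all field names, and the `next_run` helper are hypothetical and do not come from the patent, and the idea that the field entries hold extraction rules is likewise an assumption. The collection time is left out of the configuration class on the assumption that it would be recorded when a run actually happens.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class CollectionConfig:
    """Hypothetical per-site collection configuration (names are assumptions)."""
    url: str                       # URL of the target page or listing
    api: Optional[str] = None      # API endpoint, if the site exposes one
    # Extraction rules for the fields named in the text; the rule syntax
    # (for example a CSS selector) is an assumption of this sketch.
    title: str = ""
    publication_time: str = ""
    author: str = ""
    content: str = ""


@dataclass
class TaskPlan:
    """Hypothetical task plan used to manage collection tasks in batches."""
    start_time: datetime           # when scheduling starts
    frequency: timedelta           # interval between scheduled runs
    processes: int                 # number of processes for the target site
    configs: List[CollectionConfig] = field(default_factory=list)

    def next_run(self, now: datetime) -> datetime:
        """Return the first scheduled run time at or after `now`."""
        if now <= self.start_time:
            return self.start_time
        elapsed = now - self.start_time
        # Ceiling division: number of whole intervals needed to reach `now`.
        intervals = -(-elapsed // self.frequency)
        return self.start_time + intervals * self.frequency
```

A scheduler built on this sketch would call `next_run` with the current time, wait until the returned moment, and then start `processes` workers over `configs`.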



Abstract

The invention provides a professional intelligence information acquisition method comprising the following steps: S01, reading a task plan; S02, performing data acquisition according to the task plan; S03, judging whether the data acquisition succeeded. The invention also provides a professional intelligence information acquisition apparatus, professional intelligence information acquisition equipment, and a computer-readable storage medium. With this method, the required professional information is collected automatically according to the task plan, the collected information is converted to a standard format, and the file content is checked so that duplicate content is not stored twice, which facilitates subsequent analysis. The method also saves the time and labor cost of manually collecting, analyzing, and sorting professional intelligence data. URL deduplication reduces repeated collection of the same information and further improves collection efficiency, and targeted collection configuration improves the success rate and accuracy of collection.
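The two deduplication ideas in the abstract, skipping URLs that were already collected and not storing content that was already stored, might be sketched as below. The abstract does not say how either check is implemented, so everything here is an assumption: the `Deduplicator` class, the `normalize_url` helper, the in-memory sets, and the choice of a SHA-256 digest over whitespace-normalized text.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Normalize a URL so trivially different spellings compare equal."""
    parts = urlsplit(url.strip())
    # Lower-case scheme and host, drop the fragment, default empty path to "/".
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       parts.path or "/", parts.query, ""))


class Deduplicator:
    """Hypothetical in-memory duplicate filter for URLs and content."""

    def __init__(self):
        self._seen_urls = set()      # normalized URLs already scheduled
        self._seen_content = set()   # digests of content already stored

    def should_fetch(self, url: str) -> bool:
        """Return True only the first time a (normalized) URL is seen."""
        key = normalize_url(url)
        if key in self._seen_urls:
            return False
        self._seen_urls.add(key)
        return True

    def should_store(self, content: str) -> bool:
        """Return True only the first time this content is seen."""
        # Collapse runs of whitespace so layout-only differences are ignored.
        canonical = " ".join(content.split())
        digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
        if digest in self._seen_content:
            return False
        self._seen_content.add(digest)
        return True
```

In a long-running collector the two sets would more likely live in a database or key-value store than in memory; the sketch only shows where each check sits relative to fetching and storing.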

Description

Technical field

[0001] The present application relates to the technical field of professional intelligence, in particular to a method for collecting professional intelligence information, and also relates to a device, equipment, and computer-readable storage medium for collecting professional intelligence information.

Background technique

[0002] With the development of the times and technology, the Internet has become an important channel for public information collection among newspapers, books, maps, audio-visual materials and many other sources of public information. The computer network has spread all over the world, and the Internet has been widely used in various fields such as politics, economy, and military.

[0003] However, with the Internet, information is becoming more and more complex, and at the same time, it is becoming more and more difficult to obtain various materials. Most monitoring based on open source crawler technology on the Internet cannot com...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/951, G06F16/955, G06F16/215, G06F16/25
CPC: G06F16/951, G06F16/9566, G06F16/215, G06F16/258
Inventor: 雷关勇
Owner: 深圳市易海聚信息技术有限公司