A method of crawling and screening financial warehouse receipt risk control information based on stream computing

A stream-computing-based crawling and screening method, applicable to computing, digital data information retrieval, and web data retrieval using information identifiers. It addresses the low real-time performance of traditional parallel crawlers and the high real-time requirements that financial warehouse receipt risk control places on goods valuation, with the aims of maximizing processing efficiency, preventing local hot spots, and reducing performance impact.

Active Publication Date: 2020-01-17
BEIJING UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0005] The technical problem to be solved by the present invention is to provide a method for crawling and screening financial warehouse receipt risk control information based on stream computing. It targets two problems: traditional parallel crawlers have low real-time performance, and financial warehouse receipt risk control places high real-time requirements on goods valuation.

Embodiment Construction

[0033] The present invention will be further described below in conjunction with the accompanying drawings and specific embodiments.

[0034] As shown in Figure 1, the method of the present invention for crawling and screening financial warehouse receipt risk control information based on stream computing includes the following steps:

[0035] Step 1: Decompose the web crawler task under the streaming computing framework.

[0036] Following the streaming computing framework, the overall web crawling process is decoupled into six sub-processes: URL screening, page analysis, keyword filtering, numerical filtering, feature vector matching filtering, and resource updating. These six sub-processes are encapsulated as six types of logical subtasks (Bolt-type tasks) in the streaming computing framework. In addition, the framework requires a data source logical task (Spout-type task).
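The decomposition in paragraph [0036] can be illustrated with a minimal, single-process sketch in Python. It is not the patented implementation: the text does not specify what each filter checks, so the keyword ("steel"), the price rule, the vocabulary, the reference vector, the cosine threshold, and the in-memory FAKE_WEB table are all assumptions made for illustration. A real streaming framework would run many parallel instances of each Bolt; here the stages are simply chained in order.

```python
import math
from typing import Callable, Dict, Iterable, Iterator, List, Optional

Record = Dict[str, object]
Bolt = Callable[[Record], Optional[Record]]  # returns None to drop a record

# Hypothetical stand-in for the web: URL -> page text and a quoted price.
FAKE_WEB = {
    "http://example.com/a": {"text": "steel coil spot price", "price": 4200.0},
    "http://example.com/b": {"text": "holiday travel deals", "price": 99.0},
    "http://example.com/c": {"text": "steel rebar quote", "price": -1.0},
}
VOCAB = ["steel", "price", "quote"]   # assumed feature vocabulary
REFERENCE = [1.0, 1.0, 0.0]           # assumed reference feature vector


def spout(urls: Iterable[str]) -> Iterator[Record]:
    """Data source task: emits one record per seed URL."""
    for url in urls:
        yield {"url": url}


def make_url_screening() -> Bolt:
    """URL screening: drop unknown URLs and URLs already seen."""
    seen = set()

    def bolt(rec: Record) -> Optional[Record]:
        if rec["url"] in seen or rec["url"] not in FAKE_WEB:
            return None
        seen.add(rec["url"])
        return rec
    return bolt


def page_analysis(rec: Record) -> Optional[Record]:
    """Page analysis: look up the page and extract text and price."""
    page = FAKE_WEB[rec["url"]]
    return {**rec, "text": page["text"], "price": page["price"]}


def keyword_filter(rec: Record) -> Optional[Record]:
    """Keyword filtering: keep pages that mention the assumed keyword."""
    return rec if "steel" in rec["text"] else None


def numerical_filter(rec: Record) -> Optional[Record]:
    """Numerical filtering: keep records with a positive price (assumed rule)."""
    return rec if rec["price"] > 0 else None


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def feature_vector_filter(rec: Record) -> Optional[Record]:
    """Feature vector matching: compare a word-count vector to REFERENCE."""
    words = rec["text"].split()
    vec = [float(words.count(w)) for w in VOCAB]
    return rec if cosine(vec, REFERENCE) >= 0.5 else None


def make_resource_update(store: Dict[str, float]) -> Bolt:
    """Resource updating: write the surviving price into a store."""
    def bolt(rec: Record) -> Optional[Record]:
        store[rec["url"]] = rec["price"]
        return rec
    return bolt


def run_topology(urls: Iterable[str], bolts: List[Bolt]) -> List[Record]:
    """Pass each spout record through the bolts in order; stop when dropped."""
    out = []
    for rec in spout(urls):
        for bolt in bolts:
            rec = bolt(rec)
            if rec is None:
                break
        else:
            out.append(rec)
    return out
```

Because each stage only consumes a record and either forwards or drops it, the stages share no state apart from the de-duplication set and the final store, which is what allows a streaming framework to scale each stage independently.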

[0037] There can be multiple logical subtasks of th...

Abstract

The invention discloses a method for crawling and screening financial warehouse receipt risk control information based on stream computing. Using stream computing technology, the crawling process is decoupled into six sub-processes: URL screening, page analysis, keyword filtering, numerical filtering, feature vector matching filtering, and resource updating. This technical scheme addresses two problems: conventional parallel crawlers have relatively low real-time performance, and financial warehouse receipt risk control places high real-time requirements on goods valuation.

Description

Technical field

[0001] The invention belongs to the fields of web crawlers and stream computing, and in particular relates to a method for crawling and screening financial warehouse receipt risk control information based on stream computing.

Background

[0002] As a new type of storage transaction and mortgage method, financial warehouse receipts have come into wide use by banks and storage companies with the popularization of Internet applications. To avoid the associated risks, however, banks often choose goods with small price changes, strong liquidity, and good resilience as financing objects, such as fixed assets and heavy metal goods. The mortgaged goods of small, medium, and micro enterprises, by contrast, are relatively small in scale, are usually bulk products of many types, and have prices closely tied to current market prices. Owing to technical limitations, it is difficult for banks to make statistics on t...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/951, G06F16/955, G06F16/9535
CPC: G06F16/9535
Inventor: 李浩
Owner: BEIJING UNIV OF TECH