Visual modeling analysis method and platform for electromagnetic data
By building an SOA-based visual modeling and analysis platform, and combining Kafka, HBase, Flink, and Spark technologies, the problem that traditional electromagnetic data analysis tools cannot process massive amounts of data in real time has been solved, achieving efficient and secure electromagnetic data analysis and monitoring.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- INSTITUTE OF INFORMATION ENGINEERING CHINESE ACADEMY OF SCIENCES
- Filing Date
- 2023-02-27
- Publication Date
- 2026-06-12
Figure CN116401728B_ABST
Description
Technical Field
[0001] This invention relates to the field of data processing technology, and in particular to a visualization modeling and analysis method and platform for electromagnetic data.
Background Technology
[0002] With the rapid development of the information age, electromagnetic waves have become the main medium and important carrier for humans to obtain information. Most of the work on the Internet, such as transmitting information, interfering with signals, and receiving information, is accomplished through electromagnetic waves.
[0003] The electromagnetic space is becoming increasingly complex and information-dense, containing a wealth of valuable information, leading to intensified competition among various parties for control of the electromagnetic space.
[0004] However, traditional electromagnetic data analysis tools are inadequate for real-time analysis and processing of massive amounts of electromagnetic data.
Summary of the Invention
[0005] The present invention provides a visualization modeling and analysis method and platform for electromagnetic data to address a shortcoming of the prior art: traditional electromagnetic data analysis tools cannot analyze and process massive electromagnetic data in real time. The method constructs or calls analysis models as analysis tools according to the processing requirements of the electromagnetic data, and uses these models to analyze and process the electromagnetic data in parallel and in real time, thereby achieving real-time monitoring of massive amounts of electromagnetic data with different processing requirements.
[0006] This invention provides a visualization modeling and analysis method for electromagnetic data, applied to a visualization modeling and analysis platform for electromagnetic data, comprising:
[0007] Acquire electromagnetic data of at least one target;
[0008] Based on the processing requirements for each target electromagnetic data, a target model for each target electromagnetic data is determined; each target model is generated based on visual modeling.
[0009] Each target electromagnetic data is input into the target model of each target electromagnetic data, and the analysis results output by the target model that meet the processing requirements of each target electromagnetic data are obtained.
[0010] According to the present invention, a visualization modeling and analysis method for electromagnetic data is provided, wherein determining the target model for each target electromagnetic data according to the processing requirements of each target electromagnetic data includes:
[0011] If, based on the processing requirements of the target electromagnetic data, it is determined that the target model exists in the model library, the target model is retrieved from the model library; the model library is used to store models generated based on visual modeling.
[0012] If, based on the processing requirements of the target electromagnetic data, it is determined that the target model does not exist in the model library, multiple operators corresponding to the processing requirements are retrieved from the operator library; the operator library is used to store all operators in each electromagnetic data analysis process.
[0013] Drag the multiple operators onto the canvas to determine the directed acyclic graph of the processing requirements;
[0014] Based on the directed acyclic graph, the target model is generated and stored in the model library.
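The lookup-or-build logic of paragraphs [0011] to [0014] can be sketched as follows. This is a minimal illustration, not the patented implementation: the in-memory `model_library` dictionary, the function names, and the representation of canvas connections as `(upstream, downstream)` pairs are all assumptions made for the example. The standard-library `graphlib.TopologicalSorter` is used to derive an execution order and to reject graphs that contain a cycle.

```python
from graphlib import TopologicalSorter

# Hypothetical in-memory model library: processing requirement -> operator order.
model_library = {}


def build_model(edges):
    """Turn canvas connections (upstream, downstream) into an execution order.

    graphlib raises CycleError when the connections do not form a DAG.
    """
    sorter = TopologicalSorter()
    for upstream, downstream in edges:
        sorter.add(downstream, upstream)  # downstream depends on upstream
    return list(sorter.static_order())


def get_target_model(requirement, edges):
    """Return the stored model if it exists; otherwise build and store it."""
    if requirement not in model_library:
        model_library[requirement] = build_model(edges)
    return model_library[requirement]


# Assumed operator names, following the modulation recognition example.
edges = [
    ("input", "preprocess"),
    ("preprocess", "feature_extraction"),
    ("feature_extraction", "modulation_recognition"),
    ("modulation_recognition", "output"),
]
model = get_target_model("modulation_recognition", edges)
```

A second call with the same requirement returns the stored model without rebuilding it, which mirrors the "retrieve from the model library" branch.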
[0015] According to the present invention, a visualization modeling and analysis method for electromagnetic data is provided, wherein the operator library is constructed based on the following steps:
[0016] Decouple multiple electromagnetic data analysis processes to generate multiple services;
[0017] Determine the operators for each service;
[0018] Based on all the operators, the operator library is constructed.
[0019] According to the visualization modeling and analysis method for electromagnetic data provided by the present invention, before acquiring at least one target electromagnetic data, the method further includes:
[0020] Multiple pieces of initial electromagnetic data are collected using long-term electromagnetic data acquisition devices;
[0021] The multiple pieces of initial electromagnetic data are transmitted to the Kafka cluster via VPN;
[0022] The Kafka cluster performs peak shaving on the initial electromagnetic data, and the initial electromagnetic data is stored in the HBase distributed storage system;
[0023] The target electromagnetic data is real-time data determined from the peak-shaved electromagnetic data output by the Kafka cluster; or,
[0024] The target electromagnetic data is offline data determined from the peak-shaved electromagnetic data stored in the HBase distributed storage system.
[0025] According to the present invention, a visualization modeling and analysis method for electromagnetic data is provided, wherein the HBase distributed storage system is used to store the initial electromagnetic data sent by the Kafka cluster in its entirety.
[0026] The intermediate results calculated for each target model are stored in a PostgreSQL database;
[0027] After inputting each target electromagnetic data into the target model of each target electromagnetic data and obtaining the analysis results output by the target model that meet the processing requirements of each target electromagnetic data, the method further includes: storing the analysis results in a MySQL database.
[0028] According to the present invention, a visualization modeling and analysis method for electromagnetic data is provided. When the target electromagnetic data is determined to be the real-time data, the target model performs stream processing on the target electromagnetic data based on the Flink computing framework.
[0029] When the target electromagnetic data is determined to be the offline data and an operator in the target model involves iterative processing, the target model performs batch processing on the target electromagnetic data based on the Flink computing framework.
[0030] When the target electromagnetic data is determined to be the offline data and the operators in the target model involve non-iterative processing, the target model performs batch processing on the target electromagnetic data based on the Spark computing framework.
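The engine selection rule of paragraphs [0028] to [0030] reduces to a small decision function. The sketch below only encodes the rule as stated; the `Engine` enum and the `select_engine` name are hypothetical, and no Flink or Spark API is invoked.

```python
from enum import Enum


class Engine(Enum):
    FLINK_STREAM = "flink-stream"
    FLINK_BATCH = "flink-batch"
    SPARK_BATCH = "spark-batch"


def select_engine(is_real_time, has_iterative_operator):
    """Pick a computing framework for a target model.

    Real-time data        -> Flink stream processing
    Offline, iterative    -> Flink batch processing
    Offline, non-iterative -> Spark batch processing
    """
    if is_real_time:
        return Engine.FLINK_STREAM
    if has_iterative_operator:
        return Engine.FLINK_BATCH
    return Engine.SPARK_BATCH
```

For real-time data the iterative flag is ignored, because the text assigns all real-time processing to Flink streaming.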
[0031] This invention also provides a visualization modeling and analysis platform for electromagnetic data, comprising: a modeling and analysis module, wherein the modeling and analysis module is specifically used for:
[0032] Acquire target electromagnetic data;
[0033] Based on the processing requirements of the target electromagnetic data, a target model for the target electromagnetic data is determined; the target model is constructed based on visual modeling.
[0034] The target electromagnetic data is input into the target model to obtain the analysis results output by the target model that meet the processing requirements.
[0035] According to the present invention, a visualization modeling and analysis platform for electromagnetic data is provided, wherein the modeling and analysis module includes: an operator library, a model library, a scheduling submodule, and a modeling submodule;
[0036] The operator library is used to store all operators in each electromagnetic data analysis process;
[0037] The model library is used to store the target model;
[0038] The modeling submodule is specifically used for:
[0039] If, based on the processing requirements of the target electromagnetic data, it is determined that the target model exists in the model library, the target model is retrieved from the model library.
[0040] If, based on the processing requirements of the target electromagnetic data, it is determined that the target model does not exist in the model library, multiple operators corresponding to the processing requirements are retrieved from the operator library.
[0041] Drag the multiple operators onto the canvas to determine the directed acyclic graph of the processing requirements;
[0042] The target model is generated based on the directed acyclic graph.
[0043] Store the target model in the model library;
[0044] The scheduling submodule is used to schedule the target model in order to analyze the target electromagnetic data.
[0045] According to the present invention, a visualization modeling and analysis platform for electromagnetic data is provided, the platform further includes: a data acquisition module, a data transmission module, and a data storage module;
[0046] The data acquisition module is used to acquire initial electromagnetic data;
[0047] The data transmission module is specifically used for:
[0048] Perform peak shaving on the initial electromagnetic data, and send the initial electromagnetic data to the data storage module or the modeling and analysis module;
[0049] The data storage module is used to store the initial electromagnetic data and the analysis results;
[0050] The data transmission module includes: a VPN device and a Kafka cluster;
[0051] The VPN device is used to transmit the initial electromagnetic data to the Kafka cluster;
[0052] The Kafka cluster is used to perform peak shaving on the initial electromagnetic data so that the initial electromagnetic data can be sent to the data storage module or the modeling and analysis module.
[0053] According to the present invention, a visualization modeling and analysis platform for electromagnetic data is provided, wherein the modeling and analysis module further includes: a computation submodule, wherein the computation submodule integrates the Flink computing framework and the Spark computing framework; the target electromagnetic data is real-time data determined from the peak-shaved electromagnetic data output by the Kafka cluster; or,
[0054] The target electromagnetic data is offline data determined from the peak-shaved electromagnetic data stored in the HBase distributed storage system.
[0055] The computational submodule is specifically used for:
[0056] If the target electromagnetic data is determined to be the real-time data, the Flink computing framework is invoked to perform stream processing on the target electromagnetic data based on the target model.
[0057] If the target electromagnetic data is determined to be offline data, and if the operator in the target model is iterative processing, then based on the target model, the Flink computing framework is invoked to batch process the target electromagnetic data to generate the analysis results.
[0058] If the operator in the target model is a non-iterative process, then based on the target model, the Spark computing framework is invoked to batch process the target electromagnetic data to generate the analysis results.
[0059] The present invention also provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the visualization modeling and analysis method for electromagnetic data as described above.
[0060] The present invention also provides a non-transitory computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the visualization modeling and analysis method for electromagnetic data as described above.
[0061] The present invention also provides a computer program product, including a computer program that, when executed by a processor, implements the visualization modeling and analysis method for electromagnetic data as described above.
[0062] The present invention provides a visualization modeling and analysis method and platform for electromagnetic data. Based on the processing requirements of electromagnetic data, an analysis model is constructed or called as an analysis tool, and the analysis model is used to perform real-time parallel analysis and processing of electromagnetic data, so as to realize real-time monitoring of massive electromagnetic data with different processing requirements.
Attached Figure Description
[0063] To more clearly illustrate the technical solutions in this invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of this invention. For those skilled in the art, other drawings can be obtained from these drawings without creative effort.
[0064] Figure 1 is one of the flowcharts of the visualization modeling and analysis method for electromagnetic data provided by the present invention;
[0065] Figure 2 is a schematic diagram of the structure of the Azkaban framework for system model scheduling provided by the present invention;
[0066] Figure 3 is a schematic diagram of the multi-engine computing framework of the electromagnetic visualization modeling and analysis system provided by the present invention;
[0067] Figure 4 is the second flowchart of the visualization modeling and analysis method for electromagnetic data provided by the present invention;
[0068] Figure 5 is a schematic diagram of the structure of the visualization modeling and analysis platform for electromagnetic data provided by the present invention;
[0069] Figure 6 is a schematic diagram of the structure of the electronic device provided by the present invention.
Detailed Implementation
[0070] To make the objectives, technical solutions, and advantages of this invention clearer, the technical solutions of this invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this invention. All other embodiments obtained by those skilled in the art based on the embodiments of this invention without creative effort are within the scope of protection of this invention.
[0071] In the new context of joint operations, the electromagnetic space has become the fifth dimension of combat space, in addition to land, sea, air, and space. Competition for control of the electromagnetic space is becoming increasingly complex and intense. The informatization level of battlefield equipment will also rapidly improve with the high-speed development of the information age. Electromagnetic space warfare is shifting from traditional electronic reconnaissance, electronic offense, and electronic defense to electromagnetic spectrum control and electronic warfare management. Multi-level applications of electromagnetic space situational awareness and decision-making, analyzing and processing electromagnetic data in specific scenarios, and intuitively viewing the electromagnetic situational awareness map and electromagnetic compatibility analysis results of radiation sources in complex electromagnetic environments are crucial for obtaining significant potential value. More accurate battlefield perception and understanding of the operational status of electromagnetic radiation sources are decisive factors in achieving victory in the final electromagnetic space war.
[0072] There are many methods for electromagnetic data analysis, mainly including the following:
[0073] The first method involves using a combination of various signal analysis instruments, such as oscilloscopes, spectrum analyzers, vector signal analyzers, signal analyzers, and network analyzers, to analyze electromagnetic data.
[0074] The second method is to use the commercial software Matlab for signal analysis. Matlab integrates a large number of analysis methods, which can perform electromagnetic wave simulation, frequency domain analysis, cosine signal spectrum analysis, signal processing, and electromagnetic field visualization.
[0075] Currently, machine learning is widely used for electromagnetic data analysis. While machine learning algorithms were initially applied only to computer science fields such as image analysis and pattern recognition, their superior performance has led to their emergence as a new approach offering more options for solving complex electromagnetic problems. Machine learning extracts features from electromagnetic data, converts feature vectors into spectral forms, and then classifies signals and identifies anomalies. Researchers are also using deep learning models for mining and correlation analysis of electromagnetic environment data.
[0076] Relying on signal analyzer hardware for signal analysis is costly. Firstly, the instruments themselves are expensive, and completing electromagnetic data analysis often requires a suite of supporting products. Secondly, users need to invest significant time learning how to use the equipment, and incorrect operation can severely impact experimental results.
[0077] Electromagnetic data analysis and processing technology is still in its infancy, and there are few tools for systematic modeling and analysis. Matlab, which is used for signal analysis, is commercial software, expensive, and not domestically produced. It also poses certain security risks when analyzing important data.
[0078] In machine learning-based electromagnetic data analysis, the primary approach is to use internal libraries for programming analysis. Similar to traditional electromagnetic analysis methods, this approach is mainly procedural or object-oriented, resulting in strong dependencies between algorithms and overly tight coupling of models. Even for a simple analysis task, electromagnetic analysts need to do a lot of preliminary work, consuming a significant amount of time to research deeper domains, and a single method is only applicable to specific analysis tasks.
[0079] The massive amounts of data in the electromagnetic space possess the 4V characteristics of big data: volume, velocity, variety, and value. This makes traditional database management technologies insufficient for storing such massive amounts of data in the electromagnetic space, and traditional electromagnetic processing systems, which adopt a single-node deployment approach, are no longer adequate for the real-time analysis and processing of such massive amounts of data. Big data technologies must be used to address these shortcomings.
[0080] To address the problems existing in the prior art, this invention provides a visualization modeling and analysis method and platform for electromagnetic data.
[0081] The visualization modeling and analysis method and platform for electromagnetic data provided by embodiments of the present invention are described below with reference to Figures 1-6.
[0082] The visualization modeling and analysis method for electromagnetic data provided in this embodiment of the invention can be executed by an electronic device or an electromagnetic data visualization modeling and analysis platform within an electronic device capable of implementing this method. In this embodiment, the electronic device includes, but is not limited to, a server. It should be noted that the aforementioned execution entity does not constitute a limitation of this invention.
[0083] Figure 1 is one of the flowcharts of the visualization modeling and analysis method for electromagnetic data provided by this invention. As shown in Figure 1, the method includes, but is not limited to, the following steps:
[0084] First, in step S1, at least one target electromagnetic data is acquired.
[0085] Target electromagnetic data refers to electromagnetic data that needs to be analyzed and monitored. It can be directly collected or obtained after preprocessing such as noise reduction and filtering.
[0086] Optionally, before acquiring at least one target electromagnetic data, the method further includes:
[0087] Multiple pieces of initial electromagnetic data are collected using long-term electromagnetic data acquisition devices;
[0088] The multiple pieces of initial electromagnetic data are transmitted to the Kafka cluster via VPN;
[0089] The Kafka cluster performs peak shaving on the initial electromagnetic data, and the initial electromagnetic data is stored in the HBase distributed storage system;
[0090] The target electromagnetic data is real-time data determined from the peak-shaved electromagnetic data output by the Kafka cluster; or,
[0091] The target electromagnetic data is offline data determined from the peak-shaved electromagnetic data stored in the HBase distributed storage system.
[0092] The electromagnetic long-term acquisition device transmits the acquired in-phase / quadrature (I / Q) data to the Kafka cluster via sockets.
[0093] Because there are a large number of electromagnetic long-term acquisition devices and the electromagnetic data is generated at a high speed, traditional data receiving technologies are prone to data accumulation, affecting data timeliness and clearly failing to meet the requirements. Using a Kafka cluster as a data pipeline allows for short-term storage of electromagnetic data without power interruption, enabling efficient real-time processing of large volumes of streaming electromagnetic data and achieving data peak reduction. The short-term stored electromagnetic data is then sent to the data storage module or modeling and analysis module, preventing excessive data accumulation that could overwhelm the analysis module.
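The peak-shaving effect described in paragraph [0093] can be illustrated with a toy simulation of a buffer placed between a bursty producer and a steady consumer. This sketch does not use Kafka or any Kafka API; the function name, the per-tick arrival counts, and the fixed drain rate are assumptions chosen only to show how buffering caps the load seen downstream.

```python
def shave_peaks(arrivals_per_tick, drain_rate):
    """Simulate a buffer that absorbs bursts and releases records steadily.

    arrivals_per_tick: number of records arriving at each time step.
    drain_rate: maximum number of records forwarded downstream per step.
    Returns the number of records delivered downstream at each step.
    """
    backlog = 0
    delivered = []
    for arrivals in arrivals_per_tick:
        backlog += arrivals
        sent = min(backlog, drain_rate)
        backlog -= sent
        delivered.append(sent)
    # Keep draining after the last arrival until the buffer is empty.
    while backlog > 0:
        sent = min(backlog, drain_rate)
        backlog -= sent
        delivered.append(sent)
    return delivered


# A burst of 100 records followed by a quiet period, drained at 30 per tick.
downstream_load = shave_peaks([100, 0, 0, 5], drain_rate=30)
```

No record is dropped in this model: the burst is spread over later ticks, so the analysis side never receives more than `drain_rate` records per step.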
[0094] Electromagnetic long-term acquisition equipment is often deployed in special scenario private network environments. Electromagnetic data carries a large amount of private data, and the confidentiality, integrity and availability of the data must be strictly guaranteed during transmission. A Virtual Private Network (VPN) device is deployed between the acquisition front-end equipment and the data processing cluster. Secure access to the data is achieved through tunnel encryption, enabling reliable transmission over the Internet and ensuring data security.
[0095] The present invention provides a visualization modeling and analysis method for electromagnetic data. By constructing or calling an analysis model as an analysis tool based on the processing requirements of electromagnetic data, the method utilizes the analysis model to perform real-time parallel analysis and processing of electromagnetic data, thereby achieving real-time monitoring of massive amounts of electromagnetic data with different processing requirements.
[0096] Furthermore, in step S2, the target model for each target electromagnetic data is determined according to the processing requirements for each target electromagnetic data.
[0097] The processing requirements are determined based on the application scenario of the target electromagnetic data. For example, if the processing requirement for the target electromagnetic data is modulation identification, then the target model should provide data input, preprocessing, modulation identification, and output functions.
[0098] Optionally, the HBase distributed storage system is used to store the initial electromagnetic data sent by the Kafka cluster in its entirety.
[0099] The intermediate results calculated for each target model are stored in a PostgreSQL database;
[0100] After inputting each target electromagnetic data into the target model of each target electromagnetic data and obtaining the analysis results output by the target model that meet the processing requirements of each target electromagnetic data, the method further includes: storing the analysis results in a MySQL database.
[0101] On the one hand, the initial electromagnetic data is fully stored using the HBase distributed data system, which facilitates subsequent offline analysis. The initial electromagnetic data is large in volume, and the HBase distributed storage system is a distributed system built on top of HDFS. A single table can have billions of rows and millions of columns, providing a high-performance, column-oriented, and scalable data storage method. Using the HBase distributed storage system to store the electromagnetic data in its entirety improves storage efficiency and facilitates subsequent offline analysis.
[0102] On the other hand, the initial electromagnetic data sent by the Kafka cluster can be directly used as the target electromagnetic data for real-time analysis.
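The storage roles described in paragraphs [0098] to [0102] can be summarized as a routing table. The sketch below is only a restatement of that mapping; the `DataKind` enum and `store_for` function are hypothetical names, and no database client is involved.

```python
from enum import Enum


class DataKind(Enum):
    INITIAL = "initial electromagnetic data"
    INTERMEDIATE = "intermediate model result"
    ANALYSIS_RESULT = "analysis result"


# Storage system assigned to each kind of data in the description.
STORE_FOR = {
    DataKind.INITIAL: "HBase",
    DataKind.INTERMEDIATE: "PostgreSQL",
    DataKind.ANALYSIS_RESULT: "MySQL",
}


def store_for(kind):
    """Return the name of the storage system responsible for this data kind."""
    return STORE_FOR[kind]
```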
[0103] Specifically, long-term electromagnetic data acquisition devices are deployed at multiple locations to ensure the integrity of signal acquisition. The main fields of the acquired electromagnetic data are: location device identifier (ID), time, frequency, bandwidth, intensity, modulation method, gain, etc. The acquired data is delivered to the data center in structured form over a Transmission Control Protocol (TCP) connection and a dedicated VPN channel. The reliable TCP connection is implemented programmatically to ensure reliable data delivery. The dedicated VPN channel requires a client supporting the VPN service to be installed in both the data center and the monitoring LAN. In addition, because many long-term electromagnetic acquisition devices are deployed at the venue, a large amount of data is collected in a short period of time. The data center passes the incoming data through a Kafka message pipeline, both to smooth out data spikes and to perform preliminary grouping and aggregation of the data.
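The record fields listed in paragraph [0103], and the idea of a preliminary grouping and aggregation step, can be sketched as below. The field names, the units (seconds, Hz, dBm, dB), and the choice of mean intensity per device as the aggregate are assumptions for illustration; the source names the fields but does not specify their types, units, or the aggregation performed.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EmRecord:
    device_id: str        # identifier of the acquisition device
    timestamp: float      # assumed unit: seconds
    frequency_hz: float   # assumed unit: Hz
    bandwidth_hz: float   # assumed unit: Hz
    intensity_dbm: float  # assumed unit: dBm
    modulation: str
    gain_db: float        # assumed unit: dB


def mean_intensity_by_device(records):
    """Group records by device and average their intensity."""
    totals = defaultdict(lambda: [0.0, 0])  # device_id -> [sum, count]
    for record in records:
        totals[record.device_id][0] += record.intensity_dbm
        totals[record.device_id][1] += 1
    return {device: total / count for device, (total, count) in totals.items()}


records = [
    EmRecord("dev-A", 0.0, 100e6, 200e3, -60.0, "FM", 10.0),
    EmRecord("dev-A", 1.0, 100e6, 200e3, -40.0, "FM", 10.0),
    EmRecord("dev-B", 0.0, 433e6, 25e3, -70.0, "FSK", 20.0),
]
summary = mean_intensity_by_device(records)
```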
[0104] According to the visualization modeling and analysis method for electromagnetic data provided by this invention, a Kafka cluster is used as a data pipeline to efficiently process real-time electromagnetic data streams and achieve the purpose of data peak reduction.
[0105] Optionally, the operator library is constructed based on the following steps:
[0106] Decouple multiple electromagnetic data analysis processes to generate multiple services;
[0107] Determine the operators for each service;
[0108] Based on all the operators, the operator library is constructed.
[0109] Based on the concept of Service-Oriented Architecture (SOA), the traditional electromagnetic data analysis process is first decomposed into multiple general services. These services are implemented as operators for each service, and all operators are integrated into a visualization modeling and analysis platform to form an operator library.
[0110] This visualization modeling and analysis platform is designed for loosely coupled services. It collects electromagnetic data from the environment through long-term electromagnetic data acquisition devices and builds analysis models through drag-and-drop visualization. It is applicable to a variety of electromagnetic analysis scenarios and has significant theoretical and practical value.
[0111] SOA defines all functionalities as independent services, which interact and collaborate to complete the overall business logic. It separates the presentation layer of each service from the logic layer, adding an interface layer to expose the service to the outside world. Through standardized descriptions of service interfaces, services can be provided to any heterogeneous platform and any user interface. It allows and supports service-based algorithm models to be loosely coupled, component-oriented, and cross-technology. Service requesters may not even know where the service runs, what language it's written in, or the message transmission path; they only need to submit a service request to receive the service result.
[0112] Each service corresponds to a different functional unit. SOA is a component model that connects different services of an application through well-defined interfaces and contracts between these services. The interfaces are defined in a neutral manner, independent of the hardware platform, operating system, and programming language used to implement the service. This allows services built on various such algorithmic models to interact in a unified and universal way. This characteristic of having neutral interface definitions (not forcibly bound to a specific implementation) is called loose coupling between services.
[0113] The goal of SOA is to decouple complex, tightly coupled relationships into business-oriented, fine-grained, loosely coupled, and stateless services. Besides flexibility, loose coupling allows a service to persist even as the internal structure and implementation of each service within the application gradually changes. Conversely, tight coupling means that the interfaces between different components of an application are tightly linked to their functionality and structure, making them very vulnerable to changes that could lead to the refactoring of parts or the entire algorithm model. The need for loosely coupled algorithm models stems from the need for business applications to become more flexible in response to changing environments, such as frequently shifting policies, business levels, business priorities, partnerships, industry position, and other business-related factors that can even influence the nature of the business. Businesses that can flexibly adapt to environmental changes are called on-demand businesses, where the way tasks are completed or executed can be changed as needed.
[0114] In the field of electromagnetic analysis, there are few models that can be loosely coupled for model building and analysis. This invention has a complete theoretical and practical foundation, is adaptable to the visualization modeling and rapid calculation of electromagnetic data in various scenarios, and has strong reference value for the field of electromagnetic analysis.
[0115] The operator library integrates a large number of operators for electromagnetic data preprocessing, feature analysis, classification, clustering, statistics, and visualization. It mainly includes operators for statistical analysis of electromagnetic properties in specific scenarios, as well as operators for statistical analysis of electromagnetic data using machine learning algorithms. It can complete the main analysis tasks of electromagnetic spatial data.
[0116] Statistics on electromagnetic characteristics in specific scenarios include: electromagnetic trace occupancy, interference number, noise floor estimation, Fourier transform, wavelet transform, trace signal extraction, electromagnetic detection in specific scenarios, large-scale electromagnetic data analysis, electromagnetic cache, etc.
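Two of the statistics named above, noise floor estimation and occupancy, can be illustrated with a deliberately simple sketch. The source does not say how these operators are computed, so the median-based noise floor, the 6 dB margin, and the definition of occupancy as the fraction of samples above the threshold are all assumptions, not the method of the invention.

```python
from statistics import median


def estimate_noise_floor(power_dbm):
    """Estimate the noise floor as the median power (illustrative assumption)."""
    return median(power_dbm)


def occupancy(power_dbm, margin_db=6.0):
    """Fraction of samples exceeding the noise floor by more than margin_db."""
    threshold = estimate_noise_floor(power_dbm) + margin_db
    occupied = sum(1 for p in power_dbm if p > threshold)
    return occupied / len(power_dbm)


# Mostly noise near -100 dBm with two strong samples.
samples = [-100.0, -99.0, -101.0, -60.0, -100.0, -98.0, -55.0, -100.0]
floor = estimate_noise_floor(samples)
ratio = occupancy(samples)
```

The median is used here because it is robust to a small number of strong signals; a production operator would likely need a more careful estimator.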
[0117] Machine learning algorithms are used to perform statistical analysis on electromagnetic data, including: statistical analysis (e.g., correlation analysis, one-sample t-test, normality analysis, principal component analysis, chi-square test); classification (e.g., CART classification tree, ID3, nearest neighbor, Naive Bayes, logistic regression, support vector machine); clustering (e.g., K-means, Density-Based Spatial Clustering of Applications with Noise (DBSCAN), hierarchical clustering); regression (e.g., CART regression tree, support vector regression, LASSO regression, linear regression, nearest neighbor regression); time series models (e.g., GM, differencing, ARIMA); and evaluation (e.g., accuracy, precision-recall, RMSE, cross-validation).
[0118] For example, an algorithm for modulation recognition of electromagnetic data includes five segments: data input, data preprocessing, feature extraction, modulation recognition, and output results. This algorithm is decomposed into multiple services, each corresponding to an operator: input operator, preprocessing operator, feature extraction operator, modulation recognition operator, and data output operator. These operators are implemented using Python scripts and integrated into a unified operator library.
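The five-operator decomposition in paragraph [0118] can be sketched as a registry of independent functions that share one calling convention: each takes the previous operator's output and returns its own. The registry, the decorator, and every operator body below are hypothetical placeholders; in particular, the threshold rule in `classify` is a stand-in and not a real modulation recognizer.

```python
# Hypothetical operator library: operator name -> callable.
OPERATOR_LIBRARY = {}


def operator(name):
    """Register a function in the operator library under the given name."""
    def register(func):
        OPERATOR_LIBRARY[name] = func
        return func
    return register


@operator("input")
def load(samples):
    return list(samples)


@operator("preprocess")
def remove_mean(samples):
    mean = sum(samples) / len(samples)
    return [s - mean for s in samples]


@operator("feature_extraction")
def peak_amplitude(samples):
    return {"peak": max(abs(s) for s in samples)}


@operator("modulation_recognition")
def classify(features):
    # Placeholder rule for illustration only.
    return "label_a" if features["peak"] > 1.0 else "label_b"


@operator("output")
def emit(label):
    return {"modulation": label}


def run_model(operator_names, data):
    """Run the named operators in order, feeding each result to the next."""
    for name in operator_names:
        data = OPERATOR_LIBRARY[name](data)
    return data


result = run_model(
    ["input", "preprocess", "feature_extraction", "modulation_recognition", "output"],
    [1.0, 3.0, 5.0],
)
```

Because each operator is looked up by name, a single operator can be replaced in the library without touching the others, which is the loose coupling the text describes.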
[0119] Based on the service-oriented architecture (SOA) concept, traditional electromagnetic data analysis algorithms are decoupled. The various general electromagnetic data analysis and processing services are separated and implemented as general operators. The operators have almost no dependencies on one another, and a unified interface is provided, so that an individual operator can be modified without affecting the others and models can be assembled from operators. This solves the problem of tight coupling between traditional electromagnetic analysis algorithms and models, improves modeling efficiency, and reduces modeling cost.
[0120] To meet various analysis tasks of electromagnetic data, common machine learning algorithms and data processes can be encapsulated and implemented as general operators, which can then be integrated into the operator library. When using them, users only need to drag and drop them onto the canvas to call them, making the operation more flexible. This reduces the repetitive coding work for electromagnetic analysts, while also lowering the cost of learning general algorithms and machine learning algorithms, allowing them to focus more on model optimization and improving the efficiency of professionals in modeling and analyzing electromagnetic data.
[0121] The visualization modeling and analysis method for electromagnetic data provided by this invention is based on the SOA (Service-Oriented Architecture) decoupling concept. It decouples the basic process of traditional electromagnetic data analysis into multiple services, implements each service in Python, and encapsulates it into an independent operator, thereby making the algorithm model composed of operators loosely coupled.
[0122] Optionally, determining the target model for each target electromagnetic data based on the processing requirements for each target electromagnetic data includes:
[0123] If, based on the processing requirements of the target electromagnetic data, it is determined that the target model exists in the model library, the target model is retrieved from the model library; the model library is used to store models generated based on visual modeling.
[0124] If, based on the processing requirements of the target electromagnetic data, it is determined that the target model does not exist in the model library, multiple operators corresponding to the processing requirements are retrieved from the operator library; the operator library is used to store all operators in each electromagnetic data analysis process.
[0125] Drag the multiple operators onto the canvas to determine the directed acyclic graph of the processing requirements;
[0126] Based on the directed acyclic graph, the target model is generated and stored in the model library.
[0127] If a target model that meets the processing requirements exists in the model library, the target model is retrieved from the model library. If no target model that meets the processing requirements exists in the model library, multiple operators corresponding to the processing requirements are retrieved from the operator library. For example, if the processing requirement for the target electromagnetic data is modulation identification, the corresponding operators should include: data input operator, preprocessing operator, modulation recognition operator, and output operator.
[0128] To achieve efficient and convenient visual modeling, this invention uses front-end tools and libraries such as Vue, Element UI, the Sass preprocessor, Lodash, and Koa to build a drag-and-drop graphical configuration interface for visual modeling. This includes canvas generation: the corresponding operators are dragged and dropped onto the canvas to form a Directed Acyclic Graph (DAG) data structure that the backend can execute. The DAG is what the subsequent scheduling and computation steps operate on, and it completes the construction of the electromagnetic data analysis model as the target model. The target model can then be scheduled and executed.
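As a rough illustration of how a canvas could be turned into an executable DAG, the sketch below derives an execution order with the Python standard library's `graphlib.TopologicalSorter`. The canvas representation (a list of node names plus a list of edges) and the function name `execution_order` are assumptions made for this example; the source does not specify the backend data structure.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set, Tuple


def execution_order(nodes: List[str], edges: List[Tuple[str, str]]) -> List[str]:
    """Return an order in which every operator runs after its upstream operators.

    nodes: operator names placed on the canvas (hypothetical representation).
    edges: (upstream, downstream) connections drawn between operators.
    Raises graphlib.CycleError if the connections do not form a DAG.
    """
    predecessors: Dict[str, Set[str]] = {name: set() for name in nodes}
    for upstream, downstream in edges:
        predecessors.setdefault(upstream, set())
        predecessors.setdefault(downstream, set()).add(upstream)
    return list(TopologicalSorter(predecessors).static_order())
```

Rejecting cyclic canvases at this point means the scheduler only ever receives graphs it can execute to completion.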
[0129] The visualization modeling and analysis method for electromagnetic data provided by this invention addresses the problem of tight coupling between algorithms and models in existing technologies. Based on big data technology, it utilizes operators with independent functions for visualization modeling, forming loosely coupled algorithm models. This method offers greater flexibility, allowing business personnel to model more flexibly and efficiently. Furthermore, it enables the simultaneous formation of multiple models, achieving parallel processing of electromagnetic data with different processing requirements.
[0130] Optionally, after dragging and dropping the plurality of operators onto the canvas to generate the target model, the method further includes:
[0131] The target model is stored in the model library.
[0132] After the target model is generated from operators, it needs to be published to the model library for subsequent scheduling. A published model can then be executed on a schedule or on demand in an emergency to serve upper-layer electromagnetic business, which avoids regenerating the model and improves the processing speed of electromagnetic data.
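The retrieve-or-build behavior of the model library can be sketched as follows. The class `ModelLibrary`, the use of the processing requirement as the lookup key, and the representation of a model as a list of operator names are all assumptions for illustration.

```python
from typing import Callable, Dict, List


class ModelLibrary:
    """Illustrative in-memory model library keyed by processing requirement."""

    def __init__(self) -> None:
        self._models: Dict[str, List[str]] = {}

    def get_or_build(self, requirement: str,
                     build: Callable[[str], List[str]]) -> List[str]:
        # Reuse a published model if one exists; otherwise build and publish it.
        if requirement not in self._models:
            self._models[requirement] = build(requirement)
        return self._models[requirement]
```

With this shape, the cost of visual modeling is paid once per requirement, and later scheduling runs only perform a lookup.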
[0133] Distributed scheduling is based on the Azkaban lightweight task scheduling framework, which has three core units: the Azkaban Web Server, the Azkaban Executor Server, and the database (DB).
[0134] The Azkaban Web Server is the core of the entire scheduling cluster and is responsible for the management and scheduling of all jobs. The Azkaban Executor Server is the node in the scheduling cluster that actually runs the jobs. Such a node may act only as a client that submits jobs, or it may carry part of the computation itself. For example, when Spark is deployed on YARN, the node serves only as a submitting client in cluster mode, whereas in client mode it also executes some of the computational logic. Likewise, if a job is a regular Java program that processes a small amount of data locally, the Executor Server node itself may carry a heavy workload and consume more memory, CPU, and other node resources. The DB is the data store shared by all nodes in the cluster and contains job information, various scheduling metadata, etc.
[0135] The Azkaban Web Server selects a suitable Executor Server to run a workflow based on the runtime status information of the Executor Servers, and then dispatches the workflows submitted to the queue to the selected Executor Server for execution. From a scheduling perspective, the interaction between the Azkaban Web Server and the Executor Server is very simple and is conducted through a Representational State Transfer Application Programming Interface (REST API). The basic pattern is that the Azkaban Web Server actively calls the REST API exposed by the Executor Server, both to obtain resource information such as the Executor Server's status and, according to scheduling needs, to assign a workflow to the specified Executor Server for execution.
[0136] The implementation of selecting an Executor Server and dispatching can be seen in the `QueueProcessorThread.selectExecutorAndDispatchFlow()` method. `QueueProcessorThread` is a thread running on the Azkaban Web Server; it is defined in `ExecutorManager` and is the core thread for internal scheduling. The `selectExecutor()` method handles the choice of a suitable Executor Server, and the `dispatch()` method then schedules the workflow to run on that Executor Server.
[0137] The Azkaban Web Server selects an Executor Server by calling the `selectExecutor()` method. First, it checks the configuration of the current executable flow (exflow) to see whether the exflow must be scheduled to a specific Executor Server. If so, it returns the information for that Executor Server, and the flow is dispatched directly to it. Otherwise, it selects an Executor Server according to a set of calculation rules. When the `ExecutorSelector` is created, the parameter `ExecutorManager.this.filterList` is passed in. This filter list is read from the `azkaban.executorselector.filters` setting in the `azkaban.properties` file and is used to create an `ExecutorFilter` object, which contains a set of `FactorFilter` objects. The `ExecutorSelector` is then used to select an Executor Server. For details on the selection logic, refer to the `ExecutorSelector.getBest()` method.
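The two-branch selection described above (a pinned executor first, otherwise filter and rank) can be sketched in Python. This is not Azkaban's code: the `ExecutorInfo` fields, the filter signature, and the ranking by fewest running flows and then lowest CPU usage are assumptions chosen only to make the control flow concrete.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ExecutorInfo:
    # Hypothetical status fields reported by an executor node.
    host: str
    free_memory_mb: int
    running_flows: int
    cpu_usage: float


ExecutorCheck = Callable[[ExecutorInfo], bool]


def select_executor(executors: List[ExecutorInfo],
                    filters: List[ExecutorCheck],
                    pinned_host: Optional[str] = None) -> Optional[ExecutorInfo]:
    """Pick an executor for a flow, or None if no executor qualifies."""
    # Branch 1: the flow is configured to run on a specific executor.
    if pinned_host is not None:
        return next((e for e in executors if e.host == pinned_host), None)
    # Branch 2: keep executors that pass every filter, then rank them.
    candidates = [e for e in executors if all(check(e) for check in filters)]
    if not candidates:
        return None
    # Assumed ranking rule: least loaded first.
    return min(candidates, key=lambda e: (e.running_flows, e.cpu_usage))
```

A caller would pass filters such as `lambda e: e.free_memory_mb >= 2048`; the actual filters and comparison logic are those configured in Azkaban and are not reproduced here.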
[0138] Due to the inherent properties of the electromagnetic field, electromagnetic data contains a significant amount of sensitive information, so in many scenarios the data cannot be disclosed directly. At the same time, remote environments are needed to meet current data analysis requirements, so remote scheduling support is necessary. Technically, Remote Procedure Call (RPC) is employed. RPC requests a service from a remote computer over a network in the same way as calling a local method, without requiring the caller to understand the underlying network protocols. RPC spans the transport and application layers, facilitating the development of distributed applications.
[0139] A complete RPC architecture comprises four core components: the client, the client stub, the server, and the server stub. The client is the service caller; the client stub stores the server's address information, packages the client's request parameters into a network message, and then sends it remotely to the service provider; the server is the actual service provider; the server stub receives messages from the client, unpacks them, and calls local methods.
[0140] The specific calling process is as follows:
[0141] Step 101: The Client invokes the service using a local call method (i.e., via an interface).
[0142] Step 102: After receiving the call, the Client Stub is responsible for assembling the method, parameters, etc. into a message body that can be transmitted over the network (serializing the message body object into binary).
[0143] Step 103: The Client sends the message to the Server via sockets;
[0144] Step 104: After receiving the message, the Server Stub decodes it (deserializes the message object);
[0145] Step 105: The Server Stub calls the local service based on the decoding result;
[0146] Step 106: The local service executes and returns the result to the Server Stub;
[0147] Step 107: The Server Stub packages the returned result into a message (serializing the result message object);
[0148] Step 108: The server sends the message to the client via sockets;
[0149] Step 109: The Client Stub receives the result message and decodes it (deserializes the result message);
[0150] Step 110: The Client obtains the final result.
[0151] RPC encapsulates steps 102-104 and steps 107-109, so a remote scheduling scheme built on RPC hides these steps from the caller. The scheme integrates various scheduling algorithms, such as time-series scheduling and preemptive scheduling, which facilitates flexible scheduling of the directed acyclic graph (DAG) of the analysis model. It can also schedule hundreds of analysis tasks concurrently in a distributed manner, which allows the upper-layer electromagnetic business subsystem to schedule published models flexibly.
[0152] Regardless of the data type, it ultimately needs to be converted into a binary stream for transmission over the network. The sender of the data needs to convert the object into a binary stream, while the receiver of the data needs to restore the binary stream back into an object.
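The calling process in steps 101-110 can be traced in a small in-process sketch. It assumes JSON as the serialization format and replaces the socket with a plain function call so that the example is self-contained; the class names `ClientStub` and `ServerStub` and the example service `mean_power` are illustrative, not part of the invention.

```python
import json
from typing import Any, Callable, Dict


class ServerStub:
    """Decodes requests, calls the local service, and encodes the result."""

    def __init__(self, services: Dict[str, Callable[..., Any]]) -> None:
        self._services = services

    def handle(self, request: bytes) -> bytes:
        message = json.loads(request.decode("utf-8"))                  # step 104
        result = self._services[message["method"]](*message["params"])  # steps 105-106
        return json.dumps({"result": result}).encode("utf-8")          # step 107


class ClientStub:
    """Packs a call into bytes, sends it, and unpacks the reply."""

    def __init__(self, transport: Callable[[bytes], bytes]) -> None:
        # transport stands in for the socket used in steps 103 and 108.
        self._transport = transport

    def call(self, method: str, *params: Any) -> Any:
        request = json.dumps({"method": method, "params": list(params)})  # step 102
        response = self._transport(request.encode("utf-8"))               # steps 103, 108
        return json.loads(response.decode("utf-8"))["result"]             # step 109


def mean_power(samples):
    # Example local service: mean of squared sample values.
    return sum(s * s for s in samples) / len(samples)


server = ServerStub({"mean_power": mean_power})
client = ClientStub(server.handle)
```

The caller only writes `client.call("mean_power", [1.0, 2.0, 3.0])` (steps 101 and 110); packing, transmission, and unpacking stay inside the stubs.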
[0153] Figure 2 is a schematic diagram of the structure of the Azkaban framework for system model scheduling provided by the present invention. As shown in Figure 2, it includes:
[0154] Developers or programs submit messages to the Azkaban web server. The Azkaban web server unpacks the messages and distributes the corresponding schedules to the respective Azkaban executor servers. The Azkaban web server and the Azkaban executor servers then read / write data from the database.
[0155] According to the electromagnetic data visualization modeling and analysis platform provided by the present invention, based on the characteristics of electromagnetic data and operators, the visualization modeling and analysis platform integrates multiple big data computing engines and adopts a distributed cluster to solve the problems of computational difficulties and low efficiency of massive electromagnetic data, thus supporting the efficient analysis of massive electromagnetic data.
[0156] Furthermore, in step S3, the target electromagnetic data is input into the target model to obtain the analysis results output by the target model that meet the processing requirements, so as to monitor the target electromagnetic data.
[0157] For example, if the processing requirement for the target electromagnetic data is modulation identification, then the target model receives the input target electromagnetic data, preprocesses it, identifies its modulation, and outputs the identification result as the analysis result, thereby achieving monitoring of the electromagnetic data.
[0158] This invention can establish corresponding target models for each target electromagnetic data in parallel, enabling real-time analysis and processing of massive electromagnetic data with different processing requirements.
[0159] Optionally, if the target electromagnetic signal is determined to be real-time data, the target model performs stream processing on the target electromagnetic signal based on the Flink computing framework;
[0160] If the target electromagnetic signal is determined to be offline data and the operators in the target model are iterative, the target model performs batch processing on the target electromagnetic signal based on the Flink computing framework;
[0161] If the target electromagnetic signal is determined to be offline data and the operators in the target model are non-iterative, the target model performs batch processing on the target electromagnetic signal based on the Spark computing framework.
[0162] Since electromagnetic data includes a large amount of offline data stored in the HBase distributed storage system and real-time data being collected in real time, the visualization modeling and analysis platform needs to integrate multiple computing engines to support both stream processing and batch processing of the data.
[0163] Due to the high requirements for processing speed of real-time data, the Flink computing framework was chosen as the computing engine to perform stream processing on real-time data.
[0164] Batch processing offers higher computational precision. Therefore, when the operators in the target model are iterative, the Flink computing framework is chosen as the computing engine for batch processing of offline data. When the operators in the target model are non-iterative, the Spark computing framework is chosen as the computing engine for batch processing of offline data. This enables efficient integration of Kafka clusters with Flink, Spark computing frameworks, and HBase distributed storage systems, allowing for cluster deployment on multiple servers.
[0165] Many frameworks support both stream processing and batch processing. To improve the computational efficiency of the target model, this invention improves the way operators are computed. Many electromagnetic statistical operators, such as noise floor estimation, occupancy, and interference number of electromagnetic traces, require a large number of iterative calculations. The commonly used computing engines are therefore assigned along two dimensions, iterative and non-iterative processing, for different data volumes. The proposed solution uses the Flink computing framework for stream processing and for iterative calculations in batch processing, and the Spark computing framework for non-iterative calculations in batch processing.
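The engine selection rule reduces to a two-question decision, sketched below. The enum and function names are illustrative; only the mapping itself (real-time data to Flink streaming, iterative offline work to Flink batch, non-iterative offline work to Spark batch) comes from the description.

```python
from enum import Enum


class Engine(Enum):
    FLINK_STREAM = "flink-stream"
    FLINK_BATCH = "flink-batch"
    SPARK_BATCH = "spark-batch"


def choose_engine(is_realtime: bool, has_iterative_operator: bool) -> Engine:
    """Map data type and operator type to a computing engine."""
    if is_realtime:
        # Real-time data is always stream processed on Flink.
        return Engine.FLINK_STREAM
    if has_iterative_operator:
        # Offline data with iterative operators: Flink batch.
        return Engine.FLINK_BATCH
    # Offline data with non-iterative operators: Spark batch.
    return Engine.SPARK_BATCH
```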
[0166] The core of Flink is a streaming dataflow execution engine that provides data distribution, data communication, and fault tolerance mechanisms for distributed computation over data streams.
[0167] Based on its stream execution engine, Flink provides several higher-level APIs, such as the DataSet API, DataStream API, and Table API, to enable users to write distributed tasks.
[0168] The DataSet API is used for batch processing of static data, abstracting static data into a distributed dataset. Users can easily use various operators provided by Flink to process the distributed dataset, supporting Java, Scala, and Python.
[0169] The DataStream API is used to perform stream processing operations on data streams, abstracting streaming data into distributed data streams. Users can easily perform various operations on distributed data streams, and it supports Java and Scala.
[0170] The Table API is used to perform query operations on structured data. It abstracts structured data into relational tables and performs various query operations on the relational tables through a SQL-like DSL. It supports Java and Scala.
[0171] In addition, Flink provides domain libraries for specific application areas. For example, Flink ML, Flink's machine learning library, provides a machine learning Pipelines API and implements various machine learning algorithms. Gelly, Flink's graph computation library, provides related APIs for graph computation and implementations of various graph computation algorithms.
[0172] Apache Spark is a distributed, open-source processing system for big data workloads. It uses in-memory caching and optimized query execution to run fast analytical queries on data of any size. It provides development APIs in Java, Scala, Python, and R, and supports code reuse across multiple workloads: batch processing, interactive queries, real-time analytics, machine learning, and graph processing. It is used by organizations across many industries, including FINRA, Yelp, Zillow, DataXu, the Urban Institute, and CrowdStrike, and has become one of the most popular distributed big data processing frameworks. Spark speeds up execution by reading data into memory, performing operations, and writing back results in a single step. Through in-memory caching, Spark can also significantly accelerate machine learning algorithms that repeatedly call functions on the same dataset, because the data is reused. Data reuse is achieved by creating DataFrames, a data abstraction built on Resilient Distributed Datasets (RDDs), which are collections of objects cached in memory and reused across multiple Spark operations. This significantly reduces latency and makes Spark several times faster than MapReduce, which is especially noticeable in machine learning and interactive analytics.
[0173] The Spark execution flow is as follows:
[0174] Step 201: The Driver first builds the Application into a DAG and decomposes it into Stages;
[0175] Step 202: The Driver requests resources from the Cluster Manager;
[0176] Step 203: The Cluster Manager notifies the selected Worker Nodes;
[0177] Step 204: Each notified Worker Node starts its Executor process in response and requests tasks from the Driver;
[0178] Step 205: The Driver assigns Tasks to the Worker Nodes;
[0179] Step 206: The Executors execute Tasks Stage by Stage, during which the Driver monitors the process;
[0180] Step 207: After receiving the signal that the Executors' tasks are completed, the Driver sends a deregistration signal to the Cluster Manager;
[0181] Step 208: The Cluster Manager sends a resource release signal to the Worker Nodes;
[0182] Step 209: The Executors on the Worker Nodes stop running.
[0183] By employing a layered design with a Batch Layer and a Speed Layer, a single system can support real-time and batch computation at the same time. The Serving Layer (the service layer) logically unifies the interfaces of the two data sources, so that applications can be developed, deployed, queried externally, and displayed against a unified data view. This integrates data and applications and significantly improves the computational efficiency of the electromagnetic analysis model.
[0184] Figure 3 is a schematic diagram of the multi-engine computing framework of the electromagnetic visualization modeling and analysis system provided by the present invention. As shown in Figure 3, it includes:
[0185] Based on the Lambda architecture, a multi-engine computing framework adapted to electromagnetic data analysis operators was designed. The multi-engine computing framework mainly completes the distributed computing of each node of the DAG of the scheduling model, thereby improving the model's computing performance and efficiency.
[0186] The framework includes: a batch processing layer, a service layer, and a stream processing layer;
[0187] In the service layer, offline data is sent to the batch processing layer, and real-time data is sent to the stream processing layer;
[0188] In the batch processing layer, all offline data is stored. In the batch processing framework, if the offline data is processed iteratively, the Flink computing framework is called for batch processing; if the offline data is processed non-iteratively, the Spark computing framework is called for batch processing, and the batch processing results are sent to the batch processing view in the service layer.
[0189] In the stream processing layer, the Flink computing framework is invoked to perform stream processing on real-time data, and the stream processing results are sent to the stream processing view of the service layer.
[0190] Batch processing results in the batch processing view and stream processing results in the stream processing view can be queried and displayed through a unified interface. Based on the characteristics of electromagnetic data and operators, the visualization modeling and analysis platform integrates multiple big data computing engines, implementing a computing framework integrating Spark and Flink based on the Lambda architecture. Furthermore, the use of distributed cluster deployment further improves the computational efficiency of the model.
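The unified query interface over the batch processing view and the stream processing view can be sketched as a small class. The class name `ServingLayer`, the string keys, and the choice to return both views side by side are assumptions; the description does not state how the two views are combined for display.

```python
from typing import Any, Dict, Optional


class ServingLayer:
    """Illustrative service layer holding a batch view and a stream view."""

    def __init__(self) -> None:
        self._batch_view: Dict[str, Any] = {}
        self._stream_view: Dict[str, Any] = {}

    def update_batch_view(self, key: str, result: Any) -> None:
        # Called by the batch processing layer (Flink or Spark batch jobs).
        self._batch_view[key] = result

    def update_stream_view(self, key: str, result: Any) -> None:
        # Called by the stream processing layer (Flink streaming jobs).
        self._stream_view[key] = result

    def query(self, key: str) -> Dict[str, Optional[Any]]:
        # One entry point returns whatever each view holds for the key.
        return {
            "batch": self._batch_view.get(key),
            "stream": self._stream_view.get(key),
        }
```

A front end then needs only `query`, regardless of which engine produced the result.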
[0191] Intermediate results from model analysis calculations are stored in a PostgreSQL database, while final results are stored in a MySQL database. A distributed cluster is employed to address the computational difficulties and low efficiency associated with massive amounts of electromagnetic data, enabling efficient analysis of such data.
[0192] Performance tests of data insertion and query operations on MySQL and PostgreSQL showed that PostgreSQL outperforms MySQL in data insertion, while MySQL performs better in data querying. While the target model is being computed, intermediate results are inserted repeatedly as the calculation iterates, and PostgreSQL performs better for this workload. After the entire model converges, a small number of analysis results are generated. These results are stored for the front-end display interface and are mainly queried, a workload for which MySQL shows better performance and stability. This invention therefore combines the two relational databases, PostgreSQL and MySQL, to ensure efficient data storage and retrieval and thereby improve throughput.
[0193] The present invention provides a visualization modeling and analysis method for electromagnetic data. Based on the processing requirements of electromagnetic data, an analysis model is constructed or called as an analysis tool, and the analysis model is used to perform real-time parallel analysis and processing of electromagnetic data, so as to realize real-time monitoring of massive electromagnetic data with different processing requirements.
[0194] Figure 4 is the second flowchart of the visualization modeling and analysis method for electromagnetic data provided by this invention. As shown in Figure 4, it includes:
[0195] First, the initial electromagnetic data is monitored over a long period of time;
[0196] Secondly, the collected initial electromagnetic data is transmitted to the Kafka cluster via a VPN device based on Secure Sockets Layer (SSL);
[0197] Next, the Kafka cluster performs peak smoothing on the initial electromagnetic data in order to store the entire initial electromagnetic data in the HBase distributed storage system.
[0198] Subsequently, visualization modeling and analysis are performed on real-time data generated by the Kafka cluster or offline data in the HBase distributed storage system, as detailed below:
[0199] Based on the processing requirements of electromagnetic data, the corresponding operators are retrieved from the operator library or written in the form of online coding. The operators are then combined to generate the target model.
[0200] Finally, on the one hand, the target model is stored in the model library; on the other hand, distributed model scheduling is performed based on the target model, and model computation is performed based on Apache Flink or Apache Spark to generate analysis results, which are then stored in a MySQL database.
[0201] The present invention provides a visualization modeling and analysis method for electromagnetic data, which is designed for massive electromagnetic space data. It employs big data technology and visualization modeling technology to address the difficulties of traditional technologies, such as insufficient storage efficiency, high learning cost, high analysis difficulty, tight model coupling, and low computational efficiency.
[0202] The visualization modeling and analysis platform for electromagnetic data provided by this invention is described below. The visualization modeling and analysis platform for electromagnetic data described below can be referred to in correspondence with the visualization modeling and analysis method for electromagnetic data described above.
[0203] Figure 5 is a schematic diagram of the structure of the visualization modeling and analysis platform for electromagnetic data provided by the present invention. As shown in Figure 5, the visualization modeling and analysis platform 500 includes: a modeling and analysis module 510;
[0204] The modeling and analysis module 510 is specifically used for:
[0205] Acquire target electromagnetic data;
[0206] Based on the processing requirements of the target electromagnetic data, a target model for the target electromagnetic data is determined; the target model is constructed based on visual modeling.
[0207] The target electromagnetic data is input into the target model to obtain the analysis results output by the target model that meet the processing requirements.
[0208] Target electromagnetic data refers to electromagnetic data that needs to be analyzed and monitored. It can be directly collected or obtained after preprocessing such as noise reduction and filtering.
[0209] The processing requirements are determined based on the application scenario of the target electromagnetic data. For example, if the processing requirement for the target electromagnetic data is modulation identification, then the target model should have data input, preprocessing, modulation identification, and output functions.
[0210] For example, if the processing requirement for the target electromagnetic data is modulation identification, then the target model receives the input target electromagnetic data, preprocesses it, identifies its modulation, and outputs the identification result as the analysis result, thereby achieving monitoring of the electromagnetic data.
[0211] Optionally, the platform further includes: a data acquisition module, a data transmission module, and a data storage module;
[0212] The data acquisition module is used to acquire initial electromagnetic data;
[0213] The data transmission module is specifically used for:
[0214] The initial electromagnetic data is subjected to peak reduction processing in order to send the initial electromagnetic data to the data storage module or the modeling and analysis module;
[0215] The data storage module is used to store the initial electromagnetic data and the analysis results;
[0216] The data transmission module includes: a VPN device and a Kafka cluster;
[0217] The VPN device is used to transmit the initial electromagnetic data to the Kafka cluster;
[0218] The Kafka cluster is used to perform peak smoothing on the initial electromagnetic data so that the initial electromagnetic data can be sent to the data storage module or the modeling and analysis module.
[0219] The data acquisition module includes at least one electromagnetic long-term acquisition device, each of which is used to acquire electromagnetic data.
[0220] The electromagnetic long-term acquisition device transmits the acquired in-phase / quadrature (I / Q) data to the Kafka cluster via sockets.
[0221] Because there are a large number of electromagnetic long-term acquisition devices and the electromagnetic data is generated at a high speed, traditional data receiving technologies are prone to data accumulation, affecting data timeliness and clearly failing to meet the requirements. Using a Kafka cluster as a data pipeline allows for short-term storage of electromagnetic data without power interruption, enabling efficient real-time processing of large volumes of streaming electromagnetic data and achieving data peak reduction. The short-term stored electromagnetic data is then sent to the data storage module or modeling and analysis module, preventing excessive data accumulation that could overwhelm the analysis module.
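The peak reduction effect can be illustrated with a plain in-memory buffer standing in for the Kafka cluster. The function name `drain`, the tick-based timing, and the fixed consumption rate are assumptions for this sketch; it shows only how a buffer turns a burst of arrivals into a steady downstream load.

```python
from collections import deque
from typing import Deque, List, Tuple


def drain(arrivals: List[int], consume_per_tick: int) -> Tuple[List[int], int]:
    """Simulate a buffer between bursty producers and a fixed-rate consumer.

    arrivals: number of messages arriving in each tick.
    consume_per_tick: messages the downstream module handles per tick.
    Returns (messages processed per tick, largest backlog left in the buffer).
    """
    if consume_per_tick <= 0:
        raise ValueError("consume_per_tick must be positive")
    buffer: Deque[int] = deque()
    processed: List[int] = []
    max_backlog = 0
    next_id = 0
    tick = 0
    while tick < len(arrivals) or buffer:
        # Producers append whatever arrives in this tick.
        if tick < len(arrivals):
            for _ in range(arrivals[tick]):
                buffer.append(next_id)
                next_id += 1
        # The consumer takes at most its fixed capacity.
        taken = min(consume_per_tick, len(buffer))
        for _ in range(taken):
            buffer.popleft()
        processed.append(taken)
        max_backlog = max(max_backlog, len(buffer))
        tick += 1
    return processed, max_backlog
```

For a burst of 10 messages in one tick and a consumer that handles 3 per tick, the downstream module never sees more than 3 messages at a time while the buffer absorbs the rest.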
[0222] Electromagnetic long-term acquisition equipment is often deployed in special scenario private network environments. Electromagnetic data carries a large amount of private data, and the confidentiality, integrity and availability of the data must be strictly guaranteed during transmission. A Virtual Private Network (VPN) device is deployed between the acquisition front-end equipment and the data processing cluster. Secure access to the data is achieved through tunnel encryption, enabling reliable transmission over the Internet and ensuring data security.
[0223] The present invention provides a visualization modeling and analysis platform for electromagnetic data. Based on the processing needs of electromagnetic data, it constructs or calls analysis models as analysis tools, and uses the analysis models to perform real-time parallel analysis and processing of electromagnetic data, thereby realizing real-time monitoring of massive amounts of electromagnetic data with different processing needs.
[0224] Optionally, the data storage module includes: an HBase distributed data system, a PostgreSQL database, and a MySQL database;
[0225] The HBase distributed data system is used to store all the initial electromagnetic data sent by the Kafka cluster.
[0226] The PostgreSQL database is used to store the intermediate results of the calculation of the target model;
[0227] The MySQL database is used to store the analysis results;
[0228] The target electromagnetic data is real-time data determined from the initial electromagnetic data sent by the Kafka cluster, or offline data determined from the initial electromagnetic data stored in the HBase distributed data system.
[0229] On the one hand, the initial electromagnetic data is stored in full in the HBase distributed data system. The initial electromagnetic data is large in volume, and HBase is a distributed storage system built on top of HDFS in which a single table can have billions of rows and millions of columns, providing high-performance, column-oriented, and scalable data storage. Storing the electromagnetic data in its entirety in HBase improves storage efficiency and facilitates subsequent offline analysis.
[0230] On the other hand, the initial electromagnetic data sent by the Kafka cluster can be directly used as the target electromagnetic data for real-time analysis.
[0231] Specifically, long-duration electromagnetic data acquisition devices are deployed at multiple locations to ensure the integrity of signal acquisition. The main fields of the acquired electromagnetic data are: location device identifier (ID), time, frequency, bandwidth, intensity, modulation method, gain, etc. The acquired data is delivered to the data center in structured form over a Transmission Control Protocol (TCP) connection and a dedicated VPN channel. The reliable TCP connection is implemented programmatically to ensure reliable data delivery. The dedicated VPN channel requires a client supporting the VPN service to be installed in both the data center and the monitoring LAN. In addition, because many long-duration electromagnetic acquisition devices are deployed at the venue, a large amount of data is collected in a short period of time. The data center therefore passes the incoming data through a Kafka message pipeline, both to smooth out data spikes and to perform preliminary grouping and aggregation of the data.
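The structured record described above can be illustrated with a minimal Python sketch. The class name, field names, units, and the newline-delimited JSON encoding are all assumptions made for illustration; the source lists the fields but does not specify a wire format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class EMRecord:
    """One structured electromagnetic sample; names and units are illustrative."""
    device_id: str        # identifier of the acquisition device at a location
    timestamp: float      # acquisition time (assumed: seconds since the Unix epoch)
    frequency_hz: float   # frequency (assumed unit: Hz)
    bandwidth_hz: float   # bandwidth (assumed unit: Hz)
    intensity_dbm: float  # signal intensity (assumed unit: dBm)
    modulation: str       # modulation method label
    gain_db: float        # gain (assumed unit: dB)


def encode(record: EMRecord) -> bytes:
    """Serialize a record as one newline-terminated JSON line (assumed format)."""
    return (json.dumps(asdict(record), sort_keys=True) + "\n").encode("utf-8")


def decode(line: bytes) -> EMRecord:
    """Parse one JSON line back into a record."""
    return EMRecord(**json.loads(line.decode("utf-8")))


sample = EMRecord("dev-001", 1677456000.0, 433.92e6, 200e3, -72.5, "FSK", 20.0)
assert decode(encode(sample)) == sample  # round trip preserves every field
```

A fixed, typed record of this kind makes the later grouping and aggregation steps simpler, since every consumer can rely on the same field set.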
[0232] The visualization modeling and analysis platform for electromagnetic data provided by this invention uses a Kafka cluster as a data pipeline, which can efficiently process real-time electromagnetic data streams and achieve the purpose of data peak reduction.
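The peak-reduction effect of a message pipeline can be sketched without a real Kafka cluster. The following is a simplified simulation, not Kafka code: a buffer absorbs bursty arrivals and a consumer drains it at a bounded rate. The function name and the per-tick model are assumptions made for illustration.

```python
from collections import deque


def drain_at_fixed_rate(arrivals, max_per_tick):
    """Buffer bursty arrivals and release at most max_per_tick items per tick.

    arrivals: per-tick arrival counts (the bursty input).
    Returns the per-tick counts actually delivered downstream.
    """
    if max_per_tick <= 0:
        raise ValueError("max_per_tick must be positive")
    backlog = deque()
    delivered = []
    tick = 0
    # Keep ticking until all input has arrived and the backlog is empty.
    while tick < len(arrivals) or backlog:
        if tick < len(arrivals):
            backlog.extend(range(arrivals[tick]))  # enqueue this tick's burst
        sent = min(max_per_tick, len(backlog))
        for _ in range(sent):
            backlog.popleft()  # hand one item to the downstream consumer
        delivered.append(sent)
        tick += 1
    return delivered


# A burst of 10 records is spread over several ticks instead of hitting
# the downstream consumer all at once.
print(drain_at_fixed_rate([10, 0, 0, 2], max_per_tick=3))  # [3, 3, 3, 3]
```

The total number of records is unchanged; only the delivery rate seen by storage and computation is capped.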
[0233] Optionally, the platform further includes: a decomposition module;
[0234] The decomposition module is used to decouple multiple electromagnetic data analysis processes and generate multiple services;
[0235] Determine the operators for each service;
[0236] All operators are stored in the operator library.
[0237] Based on the concept of Service-Oriented Architecture (SOA), the traditional electromagnetic data analysis process is first decomposed into multiple general services. Each service is implemented as an operator, and all operators are integrated into the visualization modeling and analysis platform to form an operator library.
[0238] In the field of electromagnetic analysis, there are few models that can be loosely coupled for model building and analysis. This invention has a complete theoretical and practical foundation, is adaptable to the visualization modeling and rapid calculation of electromagnetic data in various scenarios, and has strong reference value for the field of electromagnetic analysis.
[0239] The operator library integrates a large number of operators for electromagnetic data preprocessing, feature analysis, classification, clustering, statistics, and visualization. It mainly includes operators for statistical analysis of electromagnetic properties in specific scenarios, as well as operators for statistical analysis of electromagnetic data using machine learning algorithms. It can complete the main analysis tasks of electromagnetic spatial data.
[0240] Statistics on electromagnetic characteristics in specific scenarios include: electromagnetic trace occupancy, interference number, noise floor estimation, Fourier transform, wavelet transform, trace signal extraction, electromagnetic detection in specific scenarios, large-scale electromagnetic data analysis, electromagnetic cache, etc.
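As a rough illustration of two of the statistics named above, the sketch below estimates a noise floor and an occupancy ratio from a list of power samples. The estimators are assumptions chosen for simplicity (median power as the noise floor, a fixed margin above it as the occupancy threshold); the source does not state which estimators the platform uses.

```python
from statistics import median


def estimate_noise_floor(power_dbm):
    """Estimate the noise floor as the median power level (an assumed estimator)."""
    return median(power_dbm)


def occupancy(power_dbm, margin_db=6.0):
    """Fraction of samples exceeding the noise floor by more than margin_db."""
    if not power_dbm:
        raise ValueError("power_dbm must not be empty")
    threshold = estimate_noise_floor(power_dbm) + margin_db
    occupied = sum(1 for p in power_dbm if p > threshold)
    return occupied / len(power_dbm)


# Hypothetical power samples in dBm: mostly noise with two strong readings.
power = [-100, -99, -101, -60, -100, -58, -100, -99]
print(estimate_noise_floor(power))  # -99.5
print(occupancy(power))             # 0.25
```

The median is used here because it is not pulled upward by a small number of strong signals, which a mean would be.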
[0241] Machine learning algorithms are used to perform statistical analysis on electromagnetic data, including: statistical analysis (e.g., correlation analysis, one-sample t-test, normality analysis, principal component analysis, chi-square test); classification (e.g., CART classification tree, ID3, nearest neighbor, Naive Bayes, logistic regression, support vector machine); clustering (e.g., K-means, Density-Based Spatial Clustering of Applications with Noise (DBSCAN), hierarchical clustering); regression (e.g., CART regression tree, support vector regression, LASSO regression, linear regression, nearest neighbor regression); time series models (e.g., GM, differencing, ARIMA); and evaluation (e.g., accuracy, precision-recall, RMSE, cross-validation).
[0242] For example, an algorithm for modulation recognition of electromagnetic data includes five segments: data input, data preprocessing, feature extraction, modulation recognition, and output results. This algorithm is decomposed into multiple services, each corresponding to an operator: input operator, preprocessing operator, feature extraction operator, modulation recognition operator, and data output operator. These operators are implemented using Python scripts and integrated into a unified operator library.
[0243] Based on the service-oriented architecture (SOA) concept, traditional electromagnetic data analysis algorithms are made loosely coupled. The various general electromagnetic data analysis and processing services are separated and implemented as general operators. Operators have almost no dependencies on one another, and a unified interface is provided to facilitate both the modification of individual operators and the construction of models. This solves the problem of tight coupling between traditional electromagnetic analysis algorithms and models, improves modeling efficiency, and reduces modeling cost.
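One way to picture an operator library with a unified interface is a registry of Python callables that all accept and return the same kind of object. The sketch below is illustrative only: the registry, the decorator, the dictionary-based interface, and the two toy operators are assumptions, not the platform's actual code.

```python
from typing import Any, Callable, Dict

# Unified interface (assumed): every operator maps a data dict to a data dict.
Operator = Callable[[Dict[str, Any]], Dict[str, Any]]
OPERATOR_LIBRARY: Dict[str, Operator] = {}


def operator(name: str) -> Callable[[Operator], Operator]:
    """Register a function in the operator library under the given name."""
    def register(fn: Operator) -> Operator:
        OPERATOR_LIBRARY[name] = fn
        return fn
    return register


@operator("preprocess")
def remove_mean(data: Dict[str, Any]) -> Dict[str, Any]:
    """Toy preprocessing operator: subtract the mean from the samples."""
    samples = data["samples"]
    mean = sum(samples) / len(samples)
    return {**data, "samples": [s - mean for s in samples]}


@operator("feature_extraction")
def peak_amplitude(data: Dict[str, Any]) -> Dict[str, Any]:
    """Toy feature operator: record the largest absolute sample value."""
    return {**data, "peak": max(abs(s) for s in data["samples"])}


def run_chain(names, data):
    """Apply the named operators in order; each one is looked up by name."""
    for name in names:
        data = OPERATOR_LIBRARY[name](data)
    return data


result = run_chain(["preprocess", "feature_extraction"], {"samples": [1.0, 2.0, 3.0]})
print(result["peak"])  # 1.0
```

Because each operator only depends on the shared interface, one operator can be replaced or modified without touching the others.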
[0244] To meet various analysis tasks of electromagnetic data, common machine learning algorithms and data processes can be encapsulated and implemented as general operators, which can then be integrated into the operator library. When using them, users only need to drag and drop them onto the canvas to call them, making the operation more flexible. This reduces the repetitive coding work for electromagnetic analysts, while also lowering the cost of learning general algorithms and machine learning algorithms, allowing them to focus more on model optimization and improving the efficiency of professionals in modeling and analyzing electromagnetic data.
[0245] The visualization modeling and analysis platform for electromagnetic data provided by this invention, based on the SOA (Service-Oriented Architecture) decoupling concept, decouples the basic process of traditional electromagnetic data analysis into multiple services. Each service is implemented in Python and encapsulated into an independent operator, thereby making the algorithm model composed of operators loosely coupled.
[0246] Optionally, the modeling and analysis module 110 includes: an operator library, a model library, a scheduling submodule, and a modeling submodule;
[0247] The operator library is used to store all operators in each electromagnetic data analysis process;
[0248] The model library is used to store the target model;
[0249] The modeling submodule is specifically used for:
[0250] If, based on the processing requirements of the target electromagnetic data, it is determined that the target model exists in the model library, the target model is retrieved from the model library.
[0251] If, based on the processing requirements of the target electromagnetic data, it is determined that the target model does not exist in the model library, multiple operators corresponding to the processing requirements are retrieved from the operator library.
[0252] Drag the multiple operators onto the canvas to determine the directed acyclic graph of the processing requirements;
[0253] The target model is generated based on the directed acyclic graph.
[0254] Store the target model in the model library;
[0255] The scheduling submodule is used to schedule the target model in order to analyze the target electromagnetic data.
[0256] If a target model that meets the processing requirements exists in the model library, the target model is retrieved from the model library. If no target model that meets the processing requirements exists in the model library, multiple operators corresponding to the processing requirements are retrieved from the operator library. For example, if the processing requirement for the target electromagnetic data is modulation recognition, the corresponding operators should include: data input operator, preprocessing operator, modulation recognition operator, and output operator.
[0257] To achieve efficient and convenient visual modeling, this invention employs front-end tools such as Vue, Element UI, the Sass preprocessor, Lodash, and Koa to establish a drag-and-drop graphical configuration interface for visual modeling. A canvas is generated, and the corresponding operators are dragged and dropped onto it to form a Directed Acyclic Graph (DAG) data structure that the backend can execute. This facilitates the subsequent scheduling and computation of the model and completes the construction of the electromagnetic data analysis model as the target model, which can then be scheduled and executed.
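The retrieve-or-build logic and the DAG produced by the canvas can be sketched on the backend side as follows. This is a minimal sketch under stated assumptions: the in-memory model library, the edge-list representation of the canvas, and all operator names are hypothetical. Only the standard-library graphlib module (Python 3.9 or later) is used.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical in-memory model library: processing requirement -> execution order.
MODEL_LIBRARY = {}


def build_execution_order(edges):
    """Turn canvas edges (upstream, downstream) into a valid execution order.

    Raises CycleError if the edges do not form a directed acyclic graph.
    """
    predecessors = {}
    for upstream, downstream in edges:
        predecessors.setdefault(upstream, set())
        predecessors.setdefault(downstream, set()).add(upstream)
    return list(TopologicalSorter(predecessors).static_order())


def get_or_build_model(requirement, edges):
    """Reuse a stored model if one exists; otherwise build it and store it."""
    if requirement not in MODEL_LIBRARY:
        MODEL_LIBRARY[requirement] = build_execution_order(edges)
    return MODEL_LIBRARY[requirement]


# Edges as they might be drawn on the canvas for modulation recognition.
canvas_edges = [
    ("input", "preprocess"),
    ("preprocess", "feature_extraction"),
    ("feature_extraction", "modulation_recognition"),
    ("modulation_recognition", "output"),
]
model = get_or_build_model("modulation_recognition", canvas_edges)
print(model)
```

Rejecting cyclic graphs at build time means the scheduler never receives a model that it cannot order.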
[0258] The visualization modeling and analysis platform for electromagnetic data provided by this invention addresses the problem of tight coupling between algorithms and models in existing technologies. Based on big data technology, it utilizes operators with independent functions for visualization modeling, forming loosely coupled algorithm models. This makes modeling more flexible and efficient for business personnel. Furthermore, multiple models can be formed at the same time, enabling parallel processing of electromagnetic data with different processing requirements.
[0259] Optionally, after dragging and dropping the plurality of operators onto the canvas to generate the target model, the method further includes:
[0260] The target model is stored in the model library.
[0261] After the target model is generated from operators, it needs to be published to the model library for subsequent scheduling. The model can then be executed on a schedule, or on demand in urgent cases, to support upper-layer electromagnetic services. This reduces the time required to generate models and improves the processing speed of electromagnetic data.
[0262] In the electromagnetic data visualization modeling and analysis platform provided by the present invention, based on the characteristics of electromagnetic data and operators, the platform integrates multiple big data computing engines and adopts a distributed cluster. This solves the problem that computation over massive electromagnetic data is difficult and inefficient, and thus supports the efficient analysis of massive electromagnetic data.
[0263] Optionally, the modeling and analysis module further includes: a computation submodule, which integrates the Flink and Spark computing frameworks; the target electromagnetic signal is real-time data determined from the peak-shaving electromagnetic signal generated by the Kafka cluster; or,
[0264] The target electromagnetic signal is offline data determined from the peak-shaving electromagnetic signals stored in the HBase distributed storage system.
[0265] The computational submodule is specifically used for:
[0266] If the target electromagnetic data is determined to be the real-time data, the Flink computing framework is invoked to perform stream processing on the target electromagnetic data based on the target model.
[0267] If the target electromagnetic data is determined to be offline data, and if the operator in the target model is iterative processing, then based on the target model, the Flink computing framework is invoked to batch process the target electromagnetic data to generate the analysis results.
[0268] If the operator in the target model is a non-iterative process, then based on the target model, the Spark computing framework is invoked to batch process the target electromagnetic data to generate the analysis results.
[0269] Since electromagnetic data includes a large amount of offline data stored in the HBase distributed storage system and real-time data being collected in real time, the visualization modeling and analysis platform needs to integrate multiple computing engines to support both stream processing and batch processing of the data.
[0270] Due to the high requirements for processing speed of real-time data, the Flink computing framework was chosen as the computing engine to perform stream processing on real-time data.
[0271] Batch processing offers higher computational precision. Therefore, when the operators in the target model are iterative, the Flink computing framework is chosen as the computing engine for batch processing of offline data. When the operators in the target model are non-iterative, the Spark computing framework is chosen as the computing engine for batch processing of offline data. This enables efficient integration of the Kafka cluster with the Flink and Spark computing frameworks and the HBase distributed storage system, allowing for cluster deployment on multiple servers.
[0272] Many frameworks support both stream processing and batch processing. To improve the computational efficiency of the target model, this invention improves how operators are computed. Many electromagnetic statistical operators, such as noise floor estimation, occupancy, and interference number of electromagnetic traces, require a large number of iterative calculations. The workloads handled by commonly used computing engines are therefore divided along two dimensions, iterative processing and non-iterative processing, for different data volumes. The proposed solution uses the Flink computing framework for stream processing and for iterative calculations in batch processing, and the Spark computing framework for non-iterative calculations in batch processing.
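The engine selection rule described in the preceding paragraphs can be written as a small decision function. The enum and function names below are hypothetical, and the sketch only returns a label for the chosen engine; it does not call Flink or Spark.

```python
from enum import Enum


class Engine(Enum):
    """Labels for the three processing paths described in the text."""
    FLINK_STREAM = "flink-stream"
    FLINK_BATCH = "flink-batch"
    SPARK_BATCH = "spark-batch"


def select_engine(is_realtime: bool, has_iterative_operator: bool) -> Engine:
    """Pick a computing engine for a target model and its input data.

    Real-time data always goes to Flink stream processing. Offline data
    goes to Flink batch processing when the model contains iterative
    operators, and to Spark batch processing otherwise.
    """
    if is_realtime:
        return Engine.FLINK_STREAM
    if has_iterative_operator:
        return Engine.FLINK_BATCH
    return Engine.SPARK_BATCH


print(select_engine(is_realtime=False, has_iterative_operator=True))  # Engine.FLINK_BATCH
```

Keeping the rule in one function means the scheduling submodule only needs two facts about a job: whether its data is real-time and whether its model contains iterative operators.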
[0273] The visualization modeling and analysis platform for electromagnetic data provided by this invention constructs or calls analysis models as analysis tools based on the processing needs of electromagnetic data, and uses the analysis models to perform real-time parallel analysis and processing of electromagnetic data, thereby realizing real-time monitoring of massive amounts of electromagnetic data with different processing needs.
[0274] Figure 6 is a schematic diagram of the structure of the electronic device provided by the present invention. As shown in Figure 6, the electronic device may include a processor 610, a communications interface 620, a memory 630, and a communication bus 640, wherein the processor 610, communications interface 620, and memory 630 communicate with each other via the communication bus 640. The processor 610 can call logical instructions in the memory 630 to execute a visualization modeling and analysis method for electromagnetic data. This method includes: acquiring at least one target electromagnetic data; determining a target model for each target electromagnetic data according to the processing requirements of each target electromagnetic data; each target model being generated based on visualization modeling; inputting each target electromagnetic data into the target model of each target electromagnetic data; and obtaining analysis results output by the target model that meet the processing requirements of each target electromagnetic data.
[0275] Furthermore, the logical instructions in the aforementioned memory 630 can be implemented as software functional units and, when sold or used as independent products, can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, essentially, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0276] On the other hand, the present invention also provides a computer program product, which includes a computer program that can be stored on a non-transitory computer-readable storage medium. When the computer program is executed by a processor, the computer is able to execute the visualization modeling and analysis method for electromagnetic data provided by the above methods. The method includes: acquiring at least one target electromagnetic data; determining a target model for each target electromagnetic data according to the processing requirements of each target electromagnetic data; each target model being generated based on visualization modeling; inputting each target electromagnetic data into the target model of each target electromagnetic data; and obtaining an analysis result output by the target model that meets the processing requirements of each target electromagnetic data.
[0277] In another aspect, the present invention also provides a non-transitory computer-readable storage medium storing a computer program thereon, which, when executed by a processor, implements a visualization modeling and analysis method for electromagnetic data provided by the methods described above. This method includes: acquiring at least one target electromagnetic data; determining a target model for each target electromagnetic data according to processing requirements for each target electromagnetic data; each target model being generated based on visualization modeling; inputting each target electromagnetic data into the target model for each target electromagnetic data; and obtaining analysis results output by the target model that conform to the processing requirements of each target electromagnetic data.
[0278] The system embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs. Those skilled in the art can understand and implement this without any creative effort.
[0279] Through the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by means of software plus necessary general-purpose hardware platforms, and of course, it can also be implemented by hardware. Based on this understanding, the above technical solutions, in essence or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product can be stored in a computer-readable storage medium, such as ROM / RAM, magnetic disk, optical disk, etc., and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute the methods described in the various embodiments or some parts of the embodiments.
[0280] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims
1. A visualization modeling and analysis method for electromagnetic data, characterized in that the method includes: acquiring at least one target electromagnetic data; determining a target model for each target electromagnetic data based on the processing requirements of each target electromagnetic data, each target model being generated based on visual modeling; inputting each target electromagnetic data into the target model of each target electromagnetic data, and obtaining the analysis results output by the target model that meet the processing requirements of each target electromagnetic data; wherein the step of determining the target model of each target electromagnetic data according to the processing requirements of each target electromagnetic data includes: if, based on the processing requirements of the target electromagnetic data, it is determined that the target model exists in the model library, retrieving the target model from the model library, the model library being used to store models generated based on visual modeling; if, based on the processing requirements of the target electromagnetic data, it is determined that the target model does not exist in the model library, retrieving multiple operators corresponding to the processing requirements from the operator library, the operator library being used to store all operators in each electromagnetic data analysis process; dragging the multiple operators onto the canvas to determine the directed acyclic graph of the processing requirements; and generating the target model based on the directed acyclic graph and storing it in the model library.
2. The visualization modeling and analysis method for electromagnetic data according to claim 1, characterized in that, The operator library is built based on the following steps: Decouple multiple electromagnetic data analysis processes to generate multiple services; Determine the operators for each service; Based on all the operators, the operator library is constructed.
3. The visualization modeling and analysis method for electromagnetic data according to claim 1, characterized in that, Before acquiring at least one target electromagnetic data, the method further includes: Multiple initial electromagnetic data were collected using a long-term electromagnetic data acquisition device. The multiple initial electromagnetic signals are transmitted to the Kafka cluster via VPN; The initial electromagnetic data is processed by the Kafka cluster to remove peaks, and the initial electromagnetic data is stored in the HBase distributed storage system. The target electromagnetic signal is real-time data determined from the peak-shaving electromagnetic signal generated by the Kafka cluster; or, The target electromagnetic signal is offline data determined from the peak-shaving electromagnetic signals stored in the HBase distributed storage system.
4. The visualization modeling and analysis method for electromagnetic data according to claim 3, characterized in that, The HBase distributed storage system is used to store all the initial electromagnetic data sent by the Kafka cluster. The intermediate results calculated for each target model are stored in a PostgreSQL database; After inputting each target electromagnetic data into the target model of each target electromagnetic data and obtaining the analysis results output by the target model that meet the processing requirements of each target electromagnetic data, the method further includes: storing the analysis results in a MySQL database.
5. The visualization modeling and analysis method for electromagnetic data according to claim 4, characterized in that, When the target electromagnetic signal is determined to be the real-time data, the target model is based on the Flink computing framework to perform stream processing on the target electromagnetic signal; If the target electromagnetic signal is determined to be offline data, and if the operator in the target model is iterative processing, then the target model is a batch processing of the target electromagnetic signal based on the Flink computing framework. If the operator in the target model is a non-iterative process, then the target model is a batch processing of the target electromagnetic signal based on the Spark computing framework.
6. A visualization modeling and analysis system for electromagnetic data, characterized in that the system includes a modeling and analysis module, which is specifically used for: acquiring target electromagnetic data; determining a target model for the target electromagnetic data based on the processing requirements of the target electromagnetic data, the target model being constructed based on visual modeling; and inputting the target electromagnetic data into the target model to obtain the analysis results output by the target model that meet the processing requirements; the modeling and analysis module includes: an operator library, a model library, a scheduling submodule, and a modeling submodule; the operator library is used to store all operators in each electromagnetic data analysis process; the model library is used to store the target model; the modeling submodule is specifically used for: if, based on the processing requirements of the target electromagnetic data, it is determined that the target model exists in the model library, retrieving the target model from the model library; if, based on the processing requirements of the target electromagnetic data, it is determined that the target model does not exist in the model library, retrieving multiple operators corresponding to the processing requirements from the operator library; dragging the multiple operators onto the canvas to determine the directed acyclic graph of the processing requirements; generating the target model based on the directed acyclic graph; and storing the target model in the model library; the scheduling submodule is used to schedule the target model in order to analyze the target electromagnetic data.
7. The visualization modeling and analysis system for electromagnetic data according to claim 6, characterized in that, The system further includes: a data acquisition module, a data transmission module, and a data storage module; The data acquisition module is used to acquire initial electromagnetic data; The data transmission module is specifically used for: The initial electromagnetic data is subjected to peak reduction processing in order to send the initial electromagnetic data to the data storage module or the modeling and analysis module; The data storage module is used to store the initial electromagnetic data and the analysis results; The data transmission module includes: a VPN device and a Kafka cluster; The VPN device is used to transmit the initial electromagnetic data to the Kafka cluster; The Kafka cluster is used to perform peak smoothing on the initial electromagnetic data so that the initial electromagnetic data can be sent to the data storage module or the modeling and analysis module.
8. The visualization modeling and analysis system for electromagnetic data according to claim 7, characterized in that, The modeling and analysis module further includes: a computation submodule, which integrates the Flink and Spark computing frameworks; the target electromagnetic signal is real-time data determined from the peak-shaving electromagnetic signal generated by the Kafka cluster; or, The target electromagnetic signal is offline data determined from the peak-shaving electromagnetic signals stored in the HBase distributed storage system. The computational submodule is specifically used for: If the target electromagnetic data is determined to be the real-time data, the Flink computing framework is invoked to perform stream processing on the target electromagnetic data based on the target model. If the target electromagnetic data is determined to be offline data, and if the operator in the target model is iterative processing, then based on the target model, the Flink computing framework is invoked to batch process the target electromagnetic data to generate the analysis results. If the operator in the target model is a non-iterative process, then based on the target model, the Spark computing framework is invoked to batch process the target electromagnetic data to generate the analysis results.