Method, device and equipment for monitoring quality of filter rod forming process chain and storage medium

By collecting and organizing real-time data from the filter rod forming process chain, and applying multivariate time-series correlation anomaly detection and deep feature extraction algorithms, the method makes the entire filter rod production process traceable and supports accurate quality determination. This addresses the difficulty of tracing quality issues in existing technologies and provides complete data support for closed-loop management.

CN122196779A · Pending · Publication Date: 2026-06-12 · HONGYUN HONGHE TOBACCO (GRP) CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: HONGYUN HONGHE TOBACCO (GRP) CO LTD
Filing Date: 2026-03-04
Publication Date: 2026-06-12

AI Technical Summary

Technical Problem

In existing technologies, it is difficult to quickly and accurately trace back to specific batches, machines, process parameters, and even raw material batches in the filter rod production process. This makes it difficult to analyze the root causes of quality problems and results in long processing cycles.

Method used

Real-time operating parameters and quality inspection data of the filter rod forming process chain are collected and organized into a standardized multivariate time-series dataset. A multivariate time-series correlation anomaly detection algorithm then identifies anomalies, after which deep feature extraction and time-series alignment and matching are performed. Full-process feature fusion and quality level determination follow, and a full-process data correlation mapping system is constructed.

Benefits of technology

The method achieves standardized integration of data across all process stages and precise correlation and matching of cross-process time-series data, shortens the root cause localization cycle for quality problems, and provides accurate determination and closed-loop control of the quality status of the entire process.

✦ Generated by Eureka AI based on patent content.

Figure CN122196779A_ABST

Abstract

The application discloses a quality monitoring method, device, equipment and storage medium for a filter rod forming process chain, and relates to the field of tobacco technology. The method connects the data link across the entire filter rod forming process chain, eliminating information silos among physically dispersed processes such as forming, curing, emission and receiving. It realizes standardized integration of operating and quality data across all process steps, accommodates the time delays between steps, and completes accurate correlation and matching of cross-process time-series data. It makes up for the deficiency that existing production management and monitoring systems can only record and visualize data, by realizing deep analysis and feature mining of full-process data. It constructs a correlation mapping system for full-process data, supports accurate determination of the quality status of the whole filter rod production process, and binds quality data to process parameters across the full link. This realizes closed-loop management and control of the quality status of the entire filter rod forming process chain and provides a complete data foundation and analytical basis for quality optimization and anomaly control in the production process.

Description

Technical Field

[0001] This application relates to the field of tobacco technology, and in particular to a method, apparatus, equipment and storage medium for quality monitoring of a filter rod forming process chain.

Background Technology

[0002] Filter rod production involves multiple physically dispersed and time-delayed stages, including forming, curing, emission, and receiving. Currently, data from each stage is isolated, forming "information silos." When quality problems arise in the final cigarette product, it is difficult to quickly and accurately trace them back to the specific batch, machine, process parameters, and even raw material batch involved in filter rod production. This makes root cause analysis of quality problems difficult and results in long processing cycles.

[0003] Existing manufacturing execution systems (MES) and supervisory control and data acquisition systems (SCADA) primarily focus on data recording and visualization, lacking the ability to dynamically map and deeply analyze physical entities in virtual space. This prevents accurate traceability across processes and in specific time sequences.

Summary of the Invention

[0004] The main objective of this application is to provide a quality monitoring method, device, equipment, and storage medium for the filter rod forming process chain, in order to solve the problems in the prior art where it is difficult to quickly and accurately trace back to the specific batch, machine, process parameters, and even raw material batch of the filter rod production process, making it difficult to analyze the root causes of quality problems and resulting in long processing cycles.

[0005] To achieve the above objectives, this application provides the following technical solution: a quality monitoring method for a filter rod forming process chain, the quality monitoring method comprising:
Step S1: Collect real-time operating parameters and quality inspection data of each process step in the filter rod forming process chain, and organize them into a standardized multivariate time-series dataset;
Step S2: Identify data anomalies in the standardized multivariate time-series dataset using a multivariate time-series correlation anomaly detection algorithm and mark the anomalies to obtain a multivariate time-series dataset with marked anomalies;
Step S3: Obtain the deep representation features of the multivariate time-series dataset through a time-series deep feature extraction algorithm to obtain a deep time-series feature dataset;
Step S4: Perform cross-process time-series association matching on the deep time-series feature dataset using a time-series alignment and matching algorithm to obtain a full-process associated dataset;
Step S5: Perform full-process feature fusion and quality level determination on the full-process associated dataset using a full-dimensional feature fusion classification algorithm to obtain the quality level result of each batch of filter rods;
Step S6: Bind the quality level results of all batches of filter rods with the operating parameters of the corresponding process steps to obtain the full-process quality monitoring data of the filter rod forming process chain.

[0006] Beneficial effects of steps S1 to S6: Connecting the data link across the entire filter rod forming process chain eliminates information silos between physically dispersed processes such as forming, curing, emission, and receiving. This enables standardized integration of operating and quality data across all process steps, accommodates the time delays between stages, and achieves precise correlation and matching of cross-process time-series data. It overcomes the limitation of existing production management and monitoring systems, which can only record and visualize data, by enabling in-depth analysis and feature mining of full-process data. A correlation mapping system for full-process data is constructed to support accurate determination of the quality status of the entire filter rod production process. Quality data are bound to process parameters across the full link, which provides complete data support for cross-process, full-process traceability of filter rod quality issues, shortens the root cause localization cycle for quality problems, and realizes closed-loop control of the quality status of the entire filter rod forming process chain. This provides a complete data foundation and analytical basis for quality optimization and anomaly control in the production process.

[0007] As a further improvement to this application, step S1, collecting real-time operating parameters and quality inspection data for each process step in the filter rod forming process chain and organizing them into a standardized multivariate time-series dataset, includes:
Step S1.1: Traverse all process steps in the filter rod forming process chain, extract the process parameter collection items and quality inspection items corresponding to each process step, and obtain a list of data collection items for the entire process;
Step S1.2: Based on the full-process data collection item list, synchronously collect the real-time operating parameters and quality inspection data of the corresponding nodes of each process step to obtain the original full-process dataset;
Step S1.3: Clean the original full-process dataset to obtain the cleaned original full-process dataset;
Step S1.4: Perform dimensional unification and numerical normalization on the cleaned original full-process dataset to obtain a normalized full-process dataset;
Step S1.5: Add a unified time-sequence stamp and process identifier to the normalized full-process dataset to obtain a full-process dataset with time-sequence identifiers;
Step S1.6: Perform dimensional integration and format unification on the full-process dataset with time-sequence identifiers to obtain the standardized multivariate time-series dataset.
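The cleaning, normalization, and tagging in steps S1.3 to S1.5 can be illustrated with a minimal sketch. It assumes min-max scaling as the normalization and dropping incomplete rows as the cleaning rule; the source names neither. The `Sample` type, the item names, and the values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    timestamp: float   # unified time stamp in seconds (hypothetical unit)
    process_id: str    # process identifier, e.g. "forming"
    values: dict       # normalized readings keyed by collection item name


def min_max_normalize(column):
    """Scale a list of readings to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def standardize(raw_rows, process_id):
    """raw_rows: list of (timestamp, {item: value}) for one process step."""
    # Cleaning (assumed rule): drop rows that contain a missing reading.
    clean = [(t, r) for t, r in raw_rows if all(v is not None for v in r.values())]
    items = sorted(clean[0][1]) if clean else []
    # Normalization: put every collection item on a common [0, 1] scale.
    norm = {k: min_max_normalize([r[k] for _, r in clean]) for k in items}
    # Tagging: attach the time stamp and process identifier to each sample.
    return [Sample(t, process_id, {k: norm[k][i] for k in items})
            for i, (t, _) in enumerate(clean)]


rows = [(0.0, {"rod_diameter_mm": 7.70, "plasticizer_pct": 8.0}),
        (1.0, {"rod_diameter_mm": None, "plasticizer_pct": 8.2}),
        (2.0, {"rod_diameter_mm": 7.80, "plasticizer_pct": 9.0})]
samples = standardize(rows, "forming")
```

In a real deployment the scaling bounds would more plausibly come from each item's declared measurement range than from the observed data, so that values stay comparable across collection runs.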

[0008] The beneficial effects of steps S1.1 to S1.6 are as follows: These sub-steps address the data collection needs of the physically dispersed stages in the filter rod forming process chain, achieving unified organization and standardization of data collection dimensions across the entire process. They adapt to the differing data collection characteristics of stages such as forming, curing, emission, and receiving, ensuring the integrity and consistency of collection dimensions and avoiding inconsistent collection standards between processes. They also connect the data acquisition links of each dispersed stage, achieving synchronous collection of operating and quality data across the entire process and eliminating data barriers between stages. Noise removal and standardization of the raw data eliminate dimensional differences and numerical deviations, ensuring that the data are standardized and comparable. Matching time-sequence stamps and process identifiers binds each data item precisely to its process and time dimension, accommodating the time delays between stages. The result is a unified and standardized data system that provides complete, unified, and traceable basic data support for full-process quality control of the filter rod forming process chain, and that compensates for the shortcomings of existing production management systems, such as scattered data collection, inconsistent standards, and the inability to support cross-process collaborative analysis.

[0009] As a further improvement to this application, step S2, identifying data anomalies in the standardized multivariate time-series dataset using a multivariate time-series correlation anomaly detection algorithm and marking the anomalies to obtain a multivariate time-series dataset with marked anomalies, includes:
Step S2.1: Segment the standardized multivariate time-series dataset using a sliding window with a fixed step size to obtain a multivariate time-series variable matrix;
Step S2.2: Construct graph-structured data representing the correlation relationships between variables based on the multivariate time-series variable matrix;
Step S2.3: Aggregate the temporal information of neighboring variables in the graph-structured data through the graph attention mechanism of the MTAD-GAT algorithm to obtain the temporal feature matrix after variable dependency fusion;
Step S2.4: Encode and reconstruct the temporal dimension of the temporal feature matrix to obtain the time-series reconstruction residual dataset;
Step S2.5: Calculate the anomaly score for each time-series sampling point based on the time-series reconstruction residual dataset, and generate a time-series dataset with anomaly scores;
Step S2.6: Perform adaptive anomaly determination based on extreme value theory, identifying anomalous data points in the time-series dataset with anomaly scores, and generate a time-series dataset with anomaly labels;
Step S2.7: Align the time-series dataset with anomaly labels with the standardized multivariate time-series dataset in terms of dimensions and merge the attributes to obtain the multivariate time-series dataset with marked anomalies.
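The overall shape of steps S2.1 and S2.4 to S2.6 (window, reconstruct, score, threshold) can be sketched with deliberately simple stand-ins. This is not MTAD-GAT: a moving average of preceding points replaces the learned graph-attention reconstruction, and a mean-plus-three-standard-deviations rule replaces the extreme-value-theory threshold. All function names and data are hypothetical.

```python
from statistics import mean, pstdev


def sliding_windows(series, window, step):
    """Split a list of multivariate samples into fixed-size windows."""
    return [series[i:i + window]
            for i in range(0, len(series) - window + 1, step)]


def residual_scores(series, window):
    """Score each point by its squared error against the mean of the
    preceding `window` points (stand-in for a learned reconstruction)."""
    scores = []
    for t, point in enumerate(series):
        past = series[max(0, t - window):t]
        if not past:
            scores.append(0.0)  # no history yet for the first point
            continue
        n_vars = len(point)
        center = [sum(p[k] for p in past) / len(past) for k in range(n_vars)]
        scores.append(sum((point[k] - center[k]) ** 2 for k in range(n_vars)))
    return scores


def mark_anomalies(scores, k=3.0):
    """Flag scores above mean + k * std (stand-in for an EVT threshold)."""
    threshold = mean(scores) + k * pstdev(scores)
    return [s > threshold for s in scores]


# Two normalized variables, steady except for one disturbed sample.
series = [[0.5, 0.5] for _ in range(20)]
series[12] = [0.9, 0.1]
windows = sliding_windows(series, window=4, step=2)
flags = mark_anomalies(residual_scores(series, window=4))
```

The flags keep the same length and order as the input series, which is what allows the labels to be merged back onto the original dataset as in step S2.7.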

[0010] Beneficial effects of steps S2.1 to S2.7: These sub-steps are adapted to the time-series data characteristics of the physically dispersed stages in the filter rod forming process chain. They capture the correlation and dependence of process parameters between processes such as forming, curing, emission, and receiving, making up for the shortcoming that existing production management and monitoring systems can only record single-dimensional data and cannot identify cross-variable correlation anomalies. They accommodate the time delays between processes, enabling accurate identification of abnormal states in the multivariate time-series data of the entire process, and avoid the poor fit of traditional anomaly detection methods to linked industrial production data. They complete standardized marking of abnormal states and dimensional alignment with the original data, ensuring the integrity and reliability of full-process time-series data. This resolves the inability to identify data anomalies in the dispersed stages, provides accurate abnormal-state data for quality anomaly control in filter rod production, and fills the gap left by existing systems, which lack in-depth analysis of linked anomalies across the process chain.

[0011] As a further improvement of this application, step S3, obtaining the deep representation features of the multivariate time-series dataset through a time-series deep feature extraction algorithm to obtain a deep time-series feature dataset, includes:
Step S3.1: Perform dual-view augmentation on the multivariate time-series dataset with marked anomalies to obtain a time-series dataset containing paired original views and context views;
Step S3.2: Input the time-series dataset into the temporal encoding network of the TS2Vec algorithm, and perform feature mapping through hierarchical causal convolution to obtain the hierarchical temporal feature set corresponding to the dual views;
Step S3.3: Perform cross-view contrastive learning optimization on the hierarchical temporal feature set to obtain a dual-view temporal feature set after feature space alignment;
Step S3.4: Perform multi-scale hierarchical feature aggregation on the dual-view temporal feature set to obtain a unified representation feature set with full temporal coverage;
Step S3.5: Map and match the unified representation feature set with the original temporal dimension to obtain the deep time-series feature dataset.
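The cross-view contrastive objective in step S3.3 can be illustrated with a simplified InfoNCE-style loss. This is a sketch of the general idea only, not the full TS2Vec objective, which combines temporal and instance-wise contrastive terms across several scales. The function names and the toy feature vectors are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_view_contrastive_loss(view_a, view_b):
    """InfoNCE-style loss: the i-th vector of view_a should be more similar
    to the i-th vector of view_b than to any other vector of view_b."""
    total = 0.0
    for i, anchor in enumerate(view_a):
        logits = [dot(anchor, other) for other in view_b]
        m = max(logits)  # subtract the max for numerical stability
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]  # -log softmax of the positive
    return total / len(view_a)


# Features of three time steps as seen from two views.
view_a = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
aligned = [[0.9, 0.1], [0.1, 0.9], [-0.9, -0.1]]
shuffled = [aligned[1], aligned[2], aligned[0]]

loss_aligned = cross_view_contrastive_loss(view_a, aligned)
loss_shuffled = cross_view_contrastive_loss(view_a, shuffled)
```

Minimizing such a loss pulls the two views of the same time step together in feature space, which is the sense in which the dual-view feature sets become aligned.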

[0012] Beneficial effects of steps S3.1 to S3.5: This module's sub-steps are adapted to the time-series data characteristics of physically dispersed stages in the filter rod forming process chain, such as forming, curing, emission, and receiving. It matches the time delay attributes between each process, overcoming the limitations of existing production management and monitoring systems that can only record and visualize surface-level data and cannot uncover deep data correlations. It breaks through the limitations of traditional feature processing methods in capturing the global dependence of long-cycle industrial time-series data, effectively extracting potential linkage patterns and deep correlation information from the time-series data of the entire process. It retains the full-cycle dimensional attributes of the time-series data and the implicit correlation characteristics between processes, avoiding subjective bias and loss of effective information caused by manual feature processing. This forms a unified feature system with strong representational capabilities, providing feature support with deep correlation information for the quality status analysis of the filter rod production process. It fills the gap in existing systems' lack of in-depth analysis capabilities for multi-stage data in the process chain, and resolves the pain point of the inability to achieve deep linkage analysis of data from various dispersed stages.

[0013] As a further improvement to this application, step S4, performing cross-process time-series association matching on the deep time-series feature dataset using a time-series alignment and matching algorithm to obtain a full-process associated dataset, includes:
Step S4.1: Split the deep time-series feature dataset according to the process nodes of the filter rod forming process chain to obtain an independent subset of time-series feature sequences for each process step;
Step S4.2: Perform time-sequence constraint matching between process steps on the subsets of time-series feature sequences to obtain a set of constrained feature sequence pairs;
Step S4.3: Calculate the warping path and alignment cost matrix of the sequence pairs in the set of constrained feature sequence pairs using the Soft-DTW algorithm to obtain the optimal alignment path for the set of feature sequence pairs;
Step S4.4: According to the optimal alignment path, perform cross-process alignment mapping of the feature sequences in the temporal dimension to obtain the feature sequence set after full-process temporal alignment;
Step S4.5: Perform dimensional concatenation and association binding on the temporally aligned full-process feature sequence set according to the batch identifier to obtain the batch-level full-process associated feature set;
Step S4.6: Perform dimensional matching and merging of the batch-level full-process associated feature set with the original data attributes to obtain the full-process associated dataset.
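The Soft-DTW computation named in step S4.3 can be sketched for one-dimensional sequences. The recursion below is the standard Soft-DTW forward pass with a squared-difference cost; it returns only the alignment discrepancy and does not recover an alignment path. The sequence values and the smoothing parameter `gamma` are hypothetical.

```python
import math


def soft_min(values, gamma):
    """Smooth minimum: -gamma * log(sum(exp(-v / gamma)))."""
    m = min(values)
    return m - gamma * math.log(sum(math.exp(-(v - m) / gamma) for v in values))


def soft_dtw(a, b, gamma=0.1):
    """Soft-DTW discrepancy between two 1-D feature sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # r[i][j] holds the soft alignment cost of a[:i] against b[:j].
    r = [[inf] * (m + 1) for _ in range(n + 1)]
    r[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            r[i][j] = cost + soft_min(
                [r[i - 1][j], r[i][j - 1], r[i - 1][j - 1]], gamma)
    return r[n][m]


# A forming-stage feature, the same pattern arriving one step later
# at the curing stage, and an unrelated pattern.
forming = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
curing_delayed = [0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
unrelated = [2.0, 2.0, 0.0, 0.0, 2.0, 2.0]

d_delayed = soft_dtw(forming, curing_delayed)
d_unrelated = soft_dtw(forming, unrelated)
```

Because the warping absorbs the one-step delay, the delayed copy scores far lower than the unrelated sequence, which is the property that makes this family of methods suitable for matching stages separated by a time lag. Note that Soft-DTW values can be negative, since the smooth minimum is never larger than the true minimum.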

[0014] Beneficial effects of steps S4.1 to S4.6: These sub-steps are adapted to the physically dispersed and time-delayed nature of the filter rod forming, curing, emission, and receiving processes. They compensate for the inability of existing production management and monitoring systems to achieve precise cross-process temporal correlation and matching, addressing temporal misalignment and broken data links between dispersed processes. They achieve precise temporal alignment of full-process feature sequences, connect the feature correlation links between dispersed stages, eliminate information silos between processes, and complete full-process data correlation and binding at the batch level. This ensures the temporal continuity and batch consistency of data across the process chain, accommodates the time delays of each stage, and avoids the temporal misalignment and insufficient correlation accuracy of traditional cross-process data matching methods. It provides complete correlated data for quality traceability and cross-process collaborative analysis throughout filter rod production, fills the gap left by existing systems that cannot effectively correlate data from dispersed, time-delayed processes, and meets the core requirements of quality control across the entire filter rod forming process chain.

[0015] As a further improvement to this application, step S5, performing full-process feature fusion and quality level determination on the full-process associated dataset using a full-dimensional feature fusion classification algorithm to obtain the quality level result for each batch of filter rods, includes:
Step S5.1: Generate an independent feature token vector for each feature dimension in the full-process associated dataset to obtain a tokenized feature sequence set;
Step S5.2: Add classification tokens and temporal positional encodings to the tokenized feature sequence set to obtain a complete token sequence set with positional information;
Step S5.3: Input the complete token sequence set into the multi-head self-attention layer of the FT-Transformer algorithm, and perform full-dimensional feature interaction and fusion through the global attention mechanism to obtain the feature representation set after attention fusion;
Step S5.4: Input the feature representation set into the feedforward neural network layer of the FT-Transformer algorithm, and generate a high-dimensional feature set through nonlinear feature transformation and dimensionality optimization;
Step S5.5: Through iterative processing across multiple network layers, extract the global feature output of the classification token corresponding to the high-dimensional feature set to obtain the batch-level global fusion feature set;
Step S5.6: Input the batch-level global fusion feature set into the fully connected classification layer for quality level mapping and determination, and obtain the quality level result of each batch of filter rods.
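Steps S5.1 to S5.3 can be illustrated with a minimal sketch of feature tokenization followed by attention from the classification token. It assumes numeric features and uses a single attention head with no learned query, key, or value projections, so it is a simplification of an FT-Transformer layer. The weights are random placeholders and all names are hypothetical.

```python
import math
import random


def tokenize(features, weights, biases):
    """Feature tokenizer: token_j = x_j * W_j + b_j, one vector per feature."""
    return [[x * w + b for w, b in zip(w_row, b_row)]
            for x, w_row, b_row in zip(features, weights, biases)]


def cls_attention(cls_token, tokens):
    """Scaled dot-product attention with the [CLS] token as the query over
    the sequence [CLS, token_1, ..., token_n]."""
    sequence = [cls_token] + tokens
    d = len(cls_token)
    scores = [sum(q * k for q, k in zip(cls_token, tok)) / math.sqrt(d)
              for tok in sequence]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]  # softmax attention weights
    fused = [sum(a * tok[i] for a, tok in zip(attn, sequence))
             for i in range(d)]
    return fused, attn


rng = random.Random(0)
d_token, n_features = 4, 3
W = [[rng.uniform(-1, 1) for _ in range(d_token)] for _ in range(n_features)]
B = [[rng.uniform(-1, 1) for _ in range(d_token)] for _ in range(n_features)]
cls_token = [rng.uniform(-1, 1) for _ in range(d_token)]

batch_features = [0.2, 0.7, 0.5]  # hypothetical normalized feature values
tokens = tokenize(batch_features, W, B)
fused, attn = cls_attention(cls_token, tokens)
```

The fused vector plays the role of the classification token's output, which a fully connected layer would then map to a quality level as in step S5.6.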

[0016] Beneficial effects of steps S5.1 to S5.6: This module's sub-steps are adapted to the full-process quality control needs of the filter rod forming process chain, including forming, curing, emission, and receiving. It compensates for the shortcomings of existing production management and monitoring systems, which can only achieve data recording and visualization but cannot achieve deep integration of full-process features and accurate quality judgment. It breaks through the limitations of traditional quality judgment methods in terms of insufficient adaptability to cross-process related data, realizes global interaction and deep integration of all process-related features, fully retains the feature correlation information between each dispersed link, adapts to the time delay characteristics between processes, and completes accurate judgment of filter rod batch quality level. It avoids the problem that traditional quality judgment methods rely only on single-process data and cannot cover the quality influencing factors of the entire process chain. It provides accurate batch quality judgment results for the quality control of the filter rod production process, and at the same time provides core quality judgment basis for root cause analysis of quality problems and full-process traceability. It fills the gap in the existing system's lack of systematic intelligent judgment capability for the entire filter rod process chain quality, and adapts to the core needs of closed-loop quality control of the entire filter rod forming process chain.

[0017] As a further improvement to this application, step S6, binding the quality level results of all batches of filter rods with the operating parameters of the corresponding process steps to obtain full-process quality monitoring data for the filter rod forming process chain, includes:
Step S6.1: Extract the unique batch identifier and standardize the format of the quality level results for each batch of filter rods to generate a quality level result set with unique batch identifiers;
Step S6.2: Perform exact batch identifier matching between the quality level result set and the standardized multivariate time-series dataset to obtain a batch-matched dataset of quality levels and process parameters;
Step S6.3: Perform attribute association binding along the process sequence dimension of the batch-matched dataset of quality levels and process parameters to obtain a batch quality association dataset with full process-sequence link attributes;
Step S6.4: Merge all batches and unify the data dimensions of the batch quality association dataset to obtain a quality and process association dataset covering all batches and processes;
Step S6.5: Standardize the storage structure and complete the full-link metadata of the quality and process association dataset to obtain the full-process quality monitoring data of the filter rod forming process chain.
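The batch-level binding in steps S6.2 and S6.3 amounts to a keyed join followed by ordering along the process sequence. The sketch below assumes records carry a `batch_id` and a `seq` ordering field; those field names, the grades, and the parameter values are all hypothetical.

```python
def bind_quality_to_parameters(grades, process_records):
    """Join quality grades to process records by batch identifier.

    grades: {batch_id: grade}
    process_records: list of dicts with keys batch_id, process, seq, params
    """
    bound = {}
    for rec in process_records:
        batch = rec["batch_id"]
        if batch not in grades:
            continue  # no quality result for this batch; leave it unbound
        entry = bound.setdefault(batch, {"grade": grades[batch], "steps": []})
        entry["steps"].append(rec)
    # Order each batch's records along the process sequence.
    for entry in bound.values():
        entry["steps"].sort(key=lambda r: r["seq"])
    return bound


grades = {"B001": "A", "B002": "C"}
records = [
    {"batch_id": "B001", "process": "curing", "seq": 2,
     "params": {"storage_hours": 4.0}},
    {"batch_id": "B001", "process": "forming", "seq": 1,
     "params": {"plasticizer_pct": 8.0}},
    {"batch_id": "B003", "process": "forming", "seq": 1,
     "params": {"plasticizer_pct": 8.4}},
]
bound = bind_quality_to_parameters(grades, records)
```

Looking up a batch in the result gives its grade together with the parameters of every process step it passed through, which is the lookup that reverse tracing from a quality problem relies on.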

[0018] Beneficial effects of steps S6.1 to S6.5: These sub-steps are adapted to the physically dispersed and time-delayed characteristics of the filter rod forming process chain, including forming, curing, emission, and receiving. They compensate for the shortcomings of existing production management and monitoring systems, such as broken data links and the inability to support accurate traceability across the entire process. They achieve full-link association and binding between batch quality results and the corresponding process operating data, eliminating information silos between processes and constructing a complete closed-loop data system for quality control throughout the filter rod forming process chain. This meets the core needs of forward tracking and reverse tracing throughout filter rod production and provides complete full-link data support for root cause localization and analysis of quality problems. It addresses the difficult traceability and long processing cycles for filter rod quality issues in existing technologies, and fills the gap left by existing production management systems, which lack full-link closed-loop integration of quality and process data for the filter rod process chain. It provides a complete, standardized, and reusable closed-loop data foundation for quality optimization, anomaly control, and full-process traceability management in filter rod production, meeting the practical needs of intelligent quality control in the filter rod forming process chain.

[0019] To achieve the above objectives, this application also provides the following technical solution: a quality monitoring device for a filter rod forming process chain, the quality monitoring device being applied to the quality monitoring method described above, the quality monitoring device comprising:
The process data acquisition module, used to collect real-time operating parameters and quality inspection data of each process step in the filter rod forming process chain, and organize them into a standardized multivariate time-series dataset;
The process data anomaly marking module, used to identify data anomalies in the standardized multivariate time-series dataset and mark the anomalies using a multivariate time-series correlation anomaly detection algorithm, thereby obtaining a multivariate time-series dataset with marked anomalies;
The deep representation feature extraction module, used to obtain the deep representation features of the multivariate time-series dataset through a time-series deep feature extraction algorithm, thereby obtaining a deep time-series feature dataset;
The cross-process data association and matching module, used to perform cross-process time-series association matching on the deep time-series feature dataset through a time-series alignment and matching algorithm to obtain a full-process associated dataset;
The data quality level determination module, used to perform full-process feature fusion and quality level determination on the full-process associated dataset through a full-dimensional feature fusion classification algorithm to obtain the quality level result of each batch of filter rods;
The quality level result binding module, used to bind the quality level results of all batches of filter rods with the operating parameters of the corresponding process steps to obtain the full-process quality monitoring data of the filter rod forming process chain.

[0020] To achieve the above objectives, this application also provides the following technical solutions: An electronic device includes a processor and a memory coupled to the processor, the memory storing program instructions executable by the processor; when the processor executes the program instructions stored in the memory, it implements the quality monitoring method of the filter rod forming process chain as described above.

[0021] To achieve the above objectives, this application also provides the following technical solution: a computer-readable storage medium storing program instructions which, when executed by a processor, implement the quality monitoring method for the filter rod forming process chain described above.

Attached Figure Description

[0022] Figure 1 is a schematic flowchart of one embodiment of a quality monitoring method for a filter rod forming process chain according to this application;
Figure 2 is a schematic diagram of the functional modules of a quality monitoring device for a filter rod forming process chain according to an embodiment of this application;
Figure 3 is a schematic diagram of the structure of an embodiment of the electronic device of this application;
Figure 4 is a schematic diagram of the structure of one embodiment of the storage medium of this application.

Detailed Implementation

[0023] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of this application, and not all of the embodiments. Based on the embodiments of this application, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the scope of protection of this application.

[0024] The terms "first," "second," and "third" in this application are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the number of technical features indicated. Therefore, a feature defined as "first," "second," or "third" may explicitly or implicitly include at least one of that feature. In the description of this application, "multiple" means at least two, such as two, three, etc., unless otherwise explicitly specified. All directional indications (such as up, down, left, right, front, back, etc.) in the embodiments of this application are only used to explain the relative positional relationships and movements between components in a specific orientation (e.g., as shown in the figures). If the specific orientation changes, the directional indications also change accordingly. Furthermore, the terms "comprising" and "having," and any variations thereof, are intended to cover non-exclusive inclusion. For example, a process, method, apparatus, product, or device that includes a series of steps or units is not limited to the listed steps or units, but may optionally include steps or units not listed, or may optionally include other steps or units inherent to these processes, methods, products, or devices.

[0025] In this document, the term "embodiment" means that a particular feature, structure, or characteristic described in connection with an embodiment may be included in at least one embodiment of this application. The appearance of this phrase in various places throughout the specification does not necessarily refer to the same embodiment, nor is it a mutually exclusive, independent, or alternative embodiment. It will be explicitly and implicitly understood by those skilled in the art that the embodiments described herein can be combined with other embodiments.

[0026] It should be noted that, due to the limited number of symbols or letters that can represent specific meanings, in embodiments with many formulas or codes, there may be situations where symbols or letters cannot meet the requirements. Therefore, the interpretation of formula symbols in the steps or sub-steps of the embodiments is only valid for the current step or sub-step.

[0027] If the same symbol has different interpretations in different steps or sub-steps, the interpretation in the current step or sub-step shall prevail; if the same symbol appears in different steps or sub-steps, but no interpretation is given in subsequent steps or sub-steps after its first appearance, the interpretation in the first step or sub-step shall be used.

[0028] As shown in Figure 1, this embodiment provides an example of a quality monitoring method for the filter rod forming process chain. In this embodiment, the quality monitoring method includes the following steps: Step S1: Collect real-time operating parameters and quality inspection data for each process step in the filter rod forming process chain, and organize them into a standardized multivariate time-series dataset.

[0029] Furthermore, step S1 specifically includes the following steps: Step S1.1: Traverse all process steps in the filter rod forming process chain, extract the process parameter collection items and quality inspection items corresponding to each process step, and obtain a list of data collection items for the entire process.

[0030] Preferably, this sub-step standardizes the data collection dimensions for the entire filter rod forming process. The core processes of the filter rod forming process chain fall into four categories: filament tow opening and forming, filter rod curing and storage, filter rod air-powered launching, and winding machine receiving and forming. These categories correspond to the physically dispersed production nodes in the process chain.

[0031] Preferably, during implementation, the data collection items can first be divided into two categories according to the process dimension: process operation parameter data collection items and quality inspection items. Secondly, standardized attributes are defined for each category of data collection items, including six core attributes: unique ID of the data collection item, process code, data type, sampling frequency, measurement range, and unit of measurement. Then, the validity of the data collection items is screened, and non-core parameters (such as auxiliary parameters such as equipment lighting status and cabinet door opening / closing status) that are not directly causally related to the quality of the finished filter rod are removed. Finally, a fixed-structure list of data collection items for the entire process is formed.

[0032] For example, the sampling frequency of dynamic process parameters (spindle speed, opening pressure, and filament tension in the molding stage, and air pressure and conveying speed in the launching stage) is set to 100Hz; the sampling frequency of slowly changing process parameters (temperature and humidity of the curing chamber, storage time, etc.) is set to 1Hz; and the sampling frequency of online quality inspection parameters (circumference, hardness, suction resistance, and weight of filter rods, etc.) is set to 10Hz. The final list of valid data collection items includes no less than 42 core process parameters and no less than 8 quality inspection items.
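As an illustration of the collection item list described in step S1.1, the following minimal sketch models the six core attributes and the three sampling frequencies given above. The class name, field names, item IDs, the process code 1201, and all ranges other than the 0-200cN tension range are hypothetical and are not taken from this application.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CollectionItem:
    item_id: str        # unique ID of the data collection item
    process_code: str   # process code of the owning process node
    data_type: str      # data type of the collected value
    sampling_hz: int    # sampling frequency in Hz
    value_range: tuple  # (minimum, maximum) of the measurement range
    unit: str           # unit of measurement

# Sampling frequency per parameter class, as stated in the example above.
SAMPLING_HZ = {"dynamic": 100, "slow": 1, "quality": 10}

def make_item(item_id, process_code, kind, value_range, unit):
    """Build a collection item whose sampling frequency follows its class."""
    return CollectionItem(item_id, process_code, "float",
                          SAMPLING_HZ[kind], value_range, unit)

# Hypothetical entries for illustration only.
items = [
    make_item("P001", "1101", "dynamic", (0.0, 200.0), "cN"),  # filament tension
    make_item("P017", "1201", "slow", (0.0, 100.0), "%RH"),    # curing humidity
    make_item("Q003", "1102", "quality", (0.0, 30.0), "mm"),   # rod circumference
]
```

A frozen dataclass is used here so that the list keeps the fixed structure the sub-step calls for once it has been built.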

[0033] Step S1.2: Based on the full process data collection item list, synchronously collect the real-time operating parameters and quality inspection data of the corresponding nodes of each process link to obtain the original dataset of the entire process.

[0034] Preferably, this sub-step can adopt the IEEE 1588 PTP precision time synchronization protocol to control the time synchronization error of all acquisition nodes in the entire process within ±1ms, ensuring the timing consistency of data across processes.

[0035] Preferably, the data sources cover three categories: real-time register data of PLCs from molding machines, curing warehouses, launching stations, and winding machines; offline / online detection data from online quality testing instruments; and analog data acquired by field sensors. The acquisition triggering mechanism adopts batch-triggered acquisition, using the batch change level signal of the filament bundle as the start and end trigger signals of the acquisition cycle. The acquisition cycle of a single batch is completely matched with the entire production cycle of the corresponding filter rod batch, avoiding cross-batch data mixing.

[0036] Preferably, the collected raw dataset of the entire process adopts a time-series database-specific row storage format. The fixed structure of a single data record is: unique record ID, batch UUID identifier, process code, millisecond-level timestamp, collection item ID, original collection value, and data status bit, ensuring the traceability of each data record.

[0037] Step S1.3: Clean the original dataset of the entire process to obtain the cleaned original dataset of the entire process.

[0038] Preferably, data cleaning can be achieved through missing value filling, duplicate data removal, outlier removal, and invalid data removal.

[0039] Missing values can be filled by linear interpolation over a sliding window. The sliding window length is set to 5 consecutive sampling points, and the missing-rate threshold is set to 30%. If the proportion of missing data for a single collection item in a single batch exceeds 30%, the item is marked as an invalid collection item and removed as a whole. When the missing rate is ≤30%, the missing values can be filled by mode filling or linear interpolation.
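The missing-value rule can be sketched as follows. This is a minimal illustration under stated assumptions: missing samples are represented as None, the 5-point window is centred on the missing sample, and a sample with a valid neighbour on only one side takes that neighbour's value. None of these details is specified in this application.

```python
def fill_missing(values, window=5, max_missing_rate=0.3):
    """Return a filled copy of `values`, or None if the item must be dropped."""
    n = len(values)
    missing = sum(v is None for v in values)
    if n == 0 or missing / n > max_missing_rate:
        return None  # invalid collection item: removed as a whole
    half = window // 2
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        # Nearest valid neighbours on each side, restricted to the window.
        left = next((j for j in range(i - 1, max(i - half, 0) - 1, -1)
                     if values[j] is not None), None)
        right = next((j for j in range(i + 1, min(i + half, n - 1) + 1)
                      if values[j] is not None), None)
        if left is not None and right is not None:
            w = (i - left) / (right - left)
            filled[i] = values[left] + w * (values[right] - values[left])
        elif left is not None:
            filled[i] = values[left]
        elif right is not None:
            filled[i] = values[right]
        # If no valid neighbour lies inside the window, the gap is left as None.
    return filled
```

For instance, `fill_missing([1.0, None, 3.0, 4.0, 5.0])` fills the gap with 2.0, while a series with half of its samples missing is rejected.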

[0040] Duplicate data removal can be based on three dimensions: unique record ID, batch UUID, and timestamp. For duplicate records with the same batch, the same collection item, and the same timestamp, only the first valid record is retained, and all others are removed.
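A minimal sketch of this de-duplication rule is given below. The dictionary keys (`batch_uuid`, `item_id`, `timestamp_ms`, `status_ok`) are hypothetical names for the record fields, and treating the data status bit as the validity check is an assumption.

```python
def drop_duplicates(records):
    """Keep the first valid record per (batch, collection item, timestamp)."""
    seen = set()
    kept = []
    for rec in records:
        if not rec.get("status_ok", True):
            continue  # invalid records never count as the retained record
        key = (rec["batch_uuid"], rec["item_id"], rec["timestamp_ms"])
        if key in seen:
            continue  # a valid record with this key was already retained
        seen.add(key)
        kept.append(rec)
    return kept

records = [
    {"batch_uuid": "b1", "item_id": "P001", "timestamp_ms": 10, "value": 1.0, "status_ok": False},
    {"batch_uuid": "b1", "item_id": "P001", "timestamp_ms": 10, "value": 1.1},
    {"batch_uuid": "b1", "item_id": "P001", "timestamp_ms": 10, "value": 1.2},
    {"batch_uuid": "b1", "item_id": "P001", "timestamp_ms": 20, "value": 1.3},
]
unique = drop_duplicates(records)
```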

[0041] Out-of-range values can be removed based on the measurement range set in the collection item list in step S1.1. Sampled values outside [minimum range, maximum range] are marked as invalid values and removed. For example, the range of the filament tension in the forming process is set to 0-200cN, and values outside this range are removed directly.

[0042] For example, step S1.3 can be implemented using the following pseudocode:

    def clean_original_dataset(original_dataset, item_list, window_size=5,
                               missing_rate_threshold=0.3, valid_rate_threshold=0.95):
        cleaned_dataset = []
        # Split dataset by batch
        batch_datasets = split_dataset_by_batch(original_dataset)
        for batch_data in batch_datasets:
            valid_batch_data = []
            # Split single batch data by collection item
            item_datasets = split_dataset_by_item(batch_data)
            for item_id, item_data in item_datasets.items():
                # Get the measurement range of the current data collection item
                item_range = item_list[item_id]["range"]
                # Remove values outside the measurement range
                item_data = filter_out_of_range(item_data, item_range)
                # Calculate the missing rate
                missing_rate = calculate_missing_rate(item_data)
                if missing_rate > missing_rate_threshold:
                    continue  # Remove invalid data collection items
                # Sliding window linear interpolation to complete missing values
                item_data = sliding_window_linear_interpolation(item_data, window_size)
                valid_batch_data.extend(item_data)
            # Remove duplicate data
            valid_batch_data = drop_duplicate_data(valid_batch_data)
            # Verify the percentage of valid data
            valid_rate = calculate_valid_rate(valid_batch_data, batch_data)
            if valid_rate >= valid_rate_threshold:
                cleaned_dataset.extend(valid_batch_data)
        return cleaned_dataset

Step S1.4: Perform dimensional unification and numerical normalization on the original dataset of the entire process after cleaning to obtain a normalized dataset of the entire process.

[0043] Preferably, the dimensional unification can adopt the International System of Units (SI) for all parameters, wherein the pressure unit is unified to Pa, the temperature unit is unified to K, the length unit is unified to m, the speed unit is unified to m / s, the mass unit is unified to kg, and the tension unit is unified to N.

[0044] Preferably, the numerical normalization process can use the Min-Max normalization algorithm, and the normalized numerical range is set to [0,1].

[0045] It is worth noting that for constant value parameters (such as the rated speed of equipment) with no fluctuation within a single batch, in order to avoid calculation anomalies with a denominator of 0, the normalized value can be uniformly assigned to 0.5.
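The normalization rule, including the constant-value case, can be sketched as follows. The function name is illustrative only.

```python
def min_max_normalize(values):
    """Scale values to [0, 1]; a constant series is mapped to 0.5 throughout."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # No fluctuation within the batch: avoid dividing by zero.
        return [0.5] * len(values)
    span = hi - lo
    return [(v - lo) / span for v in values]
```

For example, `min_max_normalize([10.0, 15.0, 20.0])` gives `[0.0, 0.5, 1.0]`, and a rated speed recorded as `[3000.0, 3000.0, 3000.0]` gives `[0.5, 0.5, 0.5]`.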

[0046] Step S1.5: Add a unified time sequence stamp and process identifier to the standardized full-process dataset to obtain a full-process dataset with time sequence identifier.

[0047] Preferably, the time stamp addition can adopt a millisecond-level Unix timestamp format, which is fully compatible with the PTP time synchronization protocol in step S1.2, adding a globally unique time stamp to each data record to ensure that all data in the entire process can be sorted globally by timestamp without any sorting conflicts.

[0048] Preferably, the process identifier can be added using a 4-digit numerical coding rule. The first digit is the process chain category code (fixed to 1, representing the filter rod forming process chain), the second digit is the process sequence code (1=filament tow opening and forming stage, 2=filter rod curing and storage stage, 3=filter rod air-powered launching stage, 4=winding machine receiving and forming stage), and the third and fourth digits are the node codes within the process. For example, the opening node code for the forming stage is 1101, and the forming machine head node code is 1102, ensuring that each piece of data can be accurately located to the corresponding process node.
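The 4-digit coding rule can be illustrated with a small encoder and decoder. This is a sketch: the function names and the validation behaviour are assumptions, while the digit layout follows the rule described above.

```python
PROCESS_STAGES = {
    1: "filament tow opening and forming",
    2: "filter rod curing and storage",
    3: "filter rod air-powered launching",
    4: "winding machine receiving and forming",
}

def encode_process_id(stage, node, chain=1):
    """Build the 4-digit code: chain digit, stage digit, two-digit node code."""
    if stage not in PROCESS_STAGES:
        raise ValueError(f"unknown process stage: {stage}")
    if not 0 <= node <= 99:
        raise ValueError(f"node code out of range: {node}")
    return f"{chain}{stage}{node:02d}"

def decode_process_id(code):
    """Split a 4-digit code back into its chain, stage and node parts."""
    if len(code) != 4 or not code.isdigit():
        raise ValueError(f"not a 4-digit process code: {code!r}")
    return {"chain": int(code[0]), "stage": int(code[1]), "node": int(code[2:])}
```

With this layout, the opening node of the forming stage encodes to "1101" and the forming machine head node to "1102", matching the examples in the text.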

[0049] Preferably, batch identifier binding can achieve global association of all process data in the same batch by binding a unified UUID format batch unique identifier to all data records in a single batch, and provide a primary key for subsequent cross-process data matching.

[0050] Step S1.6: Perform dimensional integration and format unification on the full process dataset with time sequence labels to obtain a standardized multivariate time series dataset.

[0051] Preferably, the dimension integration can adopt a batch-time series two-dimensional aggregation rule, using the batch UUID identifier as the row primary key and the millisecond-level time sequence stamp as the column sorting basis, to integrate all the collected item values under the same batch and the same timestamp into a multi-dimensional time series data record. This finally forms a three-dimensional tensor dataset of "batch number × time step × feature dimension", wherein the feature dimension is consistent with the number of valid collected items generated in step S1.1, and the time step matches the number of sampling points in a single batch.

[0052] Preferably, the format unification can adopt a common time series tensor format that is compatible with the input requirements of all subsequent time series analysis algorithms. At the same time, the batch identifier, process identifier, and full collection item attributes are retained in the metadata dimension of the tensor to ensure the traceability of the dataset.

[0053] The beneficial effects of steps S1.1 to S1.6 are as follows: This section addresses the data collection needs of various physically dispersed stages in the filter rod molding process chain, achieving unified organization and standardization of data collection dimensions across the entire process. It adapts to the differentiated data collection characteristics of multiple stages such as molding, curing, emission, and reception, ensuring the integrity and consistency of collection dimensions. It avoids the problem of inconsistent collection standards between different processes, simultaneously connecting the data acquisition links of each dispersed stage, achieving synchronous collection of operational and quality data across the entire process, eliminating data barriers between stages, and completing noise removal and standardization processing of raw data. This eliminates dimensional differences and numerical deviations between data, ensuring data standardization and comparability. Through the matching of time sequence and process identifiers, it achieves precise binding of data with corresponding processes and time sequence dimensions, adapting to the time delay characteristics between stages. Ultimately, it forms a unified and standardized data system, providing complete, unified, and traceable basic data support for the entire process quality control of the filter rod molding process chain, and compensating for the shortcomings of existing production management systems such as scattered data collection, inconsistent standards, and inability to support cross-process collaborative analysis.

[0054] Step S2: Identify data anomalies in the standardized multivariate time series dataset using a multivariate time series correlation anomaly detection algorithm and mark the anomalies to obtain a multivariate time series dataset with marked anomalies.

[0055] Furthermore, step S2 specifically includes the following steps: Step S2.1: Divide the standardized multivariate time series dataset into a sliding window with a fixed step size to obtain a multivariate time series variable matrix.

[0056] Preferably, the only input to step S2.1 is the standardized multivariate time-series dataset output by S1. First, the dataset can be uniformly resampled with a resampling frequency of 10Hz and resampling can be completed using linear interpolation to ensure that the sampling step size of all feature dimensions is completely consistent, so as to eliminate the time-series misalignment problem caused by the difference in sampling frequency of different acquisition items in step S1.

[0057] Preferably, the fixed-step sliding window generally has a window length L of 1200 time steps, corresponding to a time span of 120s at a sampling frequency of 10Hz, which completely covers the minimum process flow delay of the filter rod forming-curing stage, ensuring that the parameter correlation between processes can be completely covered within a single window; the sliding step size S is set to 100 time steps, corresponding to a sliding interval of 10s, and a sliding strategy with a non-overlapping area ratio of 8.3% is adopted to balance the accuracy of abnormal positioning and the calculation efficiency.

[0058] Preferably, the window filling rule is to fill the sequence at the end of the batch that is not long enough with zero padding to a fixed window length, ensuring that all window dimensions are completely consistent.

[0059] Preferably, during the partitioning process, all standardized time-series data are traversed according to the batch dimension, and sliding partitioning is performed with the set window length and sliding step size. Each window outputs a two-dimensional matrix with dimension [L×D], where L is the window length and D is the feature dimension (completely consistent with the number of valid collection items in step S1.1). All windows are concatenated in batch and time sequence to finally generate a multivariate time-series variable matrix with dimension [N×L×D], where N is the total number of sliding windows in a single batch.
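The window partitioning of a single batch can be sketched as follows. This is a minimal illustration: the function name is hypothetical, and the stopping rule (the last window is the first one that reaches the end of the batch) is an assumption, since the text only states that the tail is zero-padded to the fixed window length.

```python
def sliding_windows(series, length=1200, step=100):
    """Split one batch into fixed-length windows.

    `series` is a list of time steps, each a list of D feature values.
    Returns a list of N windows, each of shape [length x D].
    """
    if not series:
        return []
    dim = len(series[0])
    windows = []
    start = 0
    while True:
        chunk = [list(row) for row in series[start:start + length]]
        # Zero-pad the tail of the batch up to the fixed window length.
        chunk += [[0.0] * dim for _ in range(length - len(chunk))]
        windows.append(chunk)
        if start + length >= len(series):
            break
        start += step
    return windows
```

With the defaults (L = 1200, S = 100), consecutive windows share 1100 time steps, so the non-overlapping share is 100 / 1200 ≈ 8.3%, matching the ratio given for the sliding strategy.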

[0060] Step S2.2: Construct graph-structured data of the correlation between corresponding variables based on the multivariate time series variable matrix.

[0061] Preferably, each feature dimension of the multivariate time-series variable matrix (i.e., each acquisition item defined in step S1.1) can be defined as a node of the graph, with the total number of nodes equal to the feature dimension D. Each node corresponds to a unique node ID, which is one-to-one with the acquisition item ID. At the same time, the nodes are divided into forming node group, curing node group, transmitting node group, and receiving node group according to their respective processes, which are completely matched with the physical flow structure of the process chain.

[0062] Preferably, the directed adjacency matrix $A \in \mathbb{R}^{D \times D}$ can be constructed using a dual-constraint approach of "process prior + data-driven", which avoids the spurious associations caused by purely data-driven approaches. Specifically, the process prior constraint follows the physical flow logic of filter rod forming: only upstream process nodes may point to downstream process nodes (forming → curing → launching → receiving). Nodes within the same process can be bidirectionally connected, but downstream nodes cannot point back to upstream nodes. The data-driven constraint calculates the linear dependence between variables based on the Pearson correlation coefficient, with the absolute value threshold of the correlation coefficient set to 0.3. Only when the absolute value of the correlation coefficient between two nodes is ≥0.3 and the process prior constraint is met is the corresponding position in the adjacency matrix assigned a value of 1; otherwise, it is assigned a value of 0, ultimately generating a sparse directed adjacency matrix.
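The dual-constraint construction can be sketched in plain Python as follows. Several details are assumptions made for illustration: `adj[i][j] = 1` denotes an edge from node i to node j, self-loops are excluded, a constant series is treated as uncorrelated, and the stage of each node is given as an integer 1 to 4 in process order.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def build_adjacency(series_by_node, stage_by_node, threshold=0.3):
    """Directed adjacency matrix under the process prior and the |r| >= threshold rule."""
    d = len(series_by_node)
    adj = [[0] * d for _ in range(d)]
    for i in range(d):
        for j in range(d):
            if i == j:
                continue  # assumption: no self-loops
            if stage_by_node[i] > stage_by_node[j]:
                continue  # process prior: downstream may not point upstream
            r = pearson(series_by_node[i], series_by_node[j])
            if abs(r) >= threshold:
                adj[i][j] = 1
    return adj

# Node 0 is in stage 1; nodes 1 and 2 are in stage 2.
series = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], [1.0, -1.0, -1.0, 1.0]]
adj = build_adjacency(series, stage_by_node=[1, 2, 2])
```

In this toy case nodes 0 and 1 are perfectly correlated, but only the upstream-to-downstream edge 0 → 1 is kept. Node 2 is uncorrelated with both and stays isolated.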

[0063] Preferably, the graph structure data can be generated by using the multivariate time series variable matrix from step S2.1 as the temporal feature matrix $X \in \mathbb{R}^{N \times L \times D}$ of the graph nodes. Combined with the directed adjacency matrix A, it forms graph structure data G=(X,A) that meets the input requirements of graph neural networks, where X is the temporal feature of the nodes and A is the adjacency structure of the graph.

[0064] Step S2.3: Aggregate the temporal information of neighborhood variables of graph structure data through the graph attention mechanism of the MTAD-GAT algorithm to obtain the temporal feature matrix after variable dependency fusion.

[0065] Preferably, the core of the MTAD-GAT algorithm used in step S2.3 consists of three parts: a multi-head graph attention layer, a gated recurrent unit (GRU), and a Seq2Seq encoding / decoding structure. This step performs the feature aggregation of the graph attention layer and the temporal feature extraction.

[0066] The first step is single-head attention weight calculation: for the temporal features within each window of the graph structure data G=(X,A), the attention weights between nodes are calculated. The formula for single-head attention calculation is as follows:

$$e_{ij} = \mathrm{LeakyReLU}\left(a^{T}\left[W h_i \,\|\, W h_j\right]\right), \qquad \alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k \in \mathcal{N}_i} \exp(e_{ik})}$$

[0067] Where $h_i$ and $h_j$ are the temporal feature vectors of nodes i and j; $W$ is the learnable linear transformation weight matrix; $a^{T}$ is the transpose of the learnable attention vector $a$; $\|$ is the vector concatenation operation; $\mathcal{N}_i$ is the set of neighboring nodes of node i as defined by the adjacency matrix A; and $\alpha_{ij}$ is the normalized attention weight of node j to node i. The negative slope of LeakyReLU is set to 0.2.
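The single-head attention weights can be illustrated with a small pure-Python sketch. It assumes the standard graph attention form, score = LeakyReLU(aᵀ[W hᵢ ‖ W hⱼ]) followed by a softmax over the neighbours of node i, and it assumes that node i attends to the nodes j with A[i][j] = 1. The function names and toy values are illustrative only.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x >= 0 else slope * x

def matvec(W, h):
    return [sum(w * v for w, v in zip(row, h)) for row in W]

def attention_weights(H, A, W, a, slope=0.2):
    """Return the D x D matrix of normalized attention weights alpha[i][j].

    H: list of D node feature vectors; A: D x D adjacency (row i lists the
    neighbours of node i); W: linear map; a: attention vector of length 2 * len(W).
    """
    Z = [matvec(W, h) for h in H]  # transformed features W h_i
    alpha = []
    for i, zi in enumerate(Z):
        nbrs = [j for j in range(len(Z)) if A[i][j]]
        scores = [leaky_relu(sum(p * q for p, q in zip(a, zi + Z[j])), slope)
                  for j in nbrs]
        row = [0.0] * len(Z)
        if scores:
            m = max(scores)  # subtract the maximum for numerical stability
            exps = [math.exp(s - m) for s in scores]
            total = sum(exps)
            for j, e in zip(nbrs, exps):
                row[j] = e / total
        alpha.append(row)
    return alpha

H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
A = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]
W = [[1.0, 0.0], [0.0, 1.0]]
a = [1.0, 0.0, 0.0, 1.0]
alpha = attention_weights(H, A, W, a)
```

Non-neighbours receive exactly zero weight, and each row with at least one neighbour sums to 1.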

[0068] The second step is multi-head attention feature aggregation. A multi-head attention mechanism can be used, with four attention heads. The aggregation results from each head are concatenated to obtain the final node aggregation features, as shown in the following formula:

$$h_i' = \Big\Vert_{k=1}^{K} \sigma\left(\sum_{j \in \mathcal{N}_i} \alpha_{ij}^{k} W^{k} h_j\right)$$

[0069] Where K is the number of attention heads, K=4; $\alpha_{ij}^{k}$ is the attention weight of the k-th attention head; $W^{k}$ is the linear transformation matrix of the k-th attention head; σ is the Sigmoid activation function; and $h_i'$ is the output feature vector of node i after aggregating neighborhood information.

[0070] The third step is to fuse temporal features. The node features output by the graph attention layer can be input into the GRU layer for temporal dependency extraction. The GRU hidden layer dimension is set to 128 and the number of layers is set to 2. Finally, a temporal feature matrix after variable dependency fusion with dimension [N×L×H] is generated, where H is the GRU hidden layer dimension and H=128.

[0071] Fourth, regarding the model pre-training rules, the MTAD-GAT model adopts an unsupervised pre-training method. The pre-training dataset uses 30 consecutive days of normal production history data from the filter rod forming process chain, with a batch size of 32, an Adam optimizer, a learning rate of 0.001, 100 training rounds, and an early stopping policy patience of 10 to avoid model overfitting.

[0072] For example, step S2.3 can be implemented using the following pseudocode:

    def gat_feature_aggregation(graph_data, head_num=4, hidden_dim=128):
        # Input graph structure data G=(X,A)
        X, A = graph_data
        # Multi-head graph attention layer feature aggregation
        multi_head_features = []
        for _ in range(head_num):
            # Linear transformation and attention weight calculation
            W = linear_layer(input_dim=X.shape[-1], output_dim=hidden_dim)
            a = attention_vector(input_dim=2*hidden_dim)
            e = leaky_relu(torch.matmul(torch.cat([W@X, W@X], dim=-1), a))
            # Adjacency matrix mask and normalization
            e_masked = e.masked_fill(A == 0, -1e9)
            alpha = softmax(e_masked, dim=-1)
            # Neighborhood feature aggregation
            head_feature = sigma(torch.matmul(alpha, W@X))
            multi_head_features.append(head_feature)
        # Multi-head feature concatenation
        concat_feature = torch.cat(multi_head_features, dim=-1)
        # GRU temporal feature fusion
        gru_layer = GRU(input_dim=concat_feature.shape[-1], hidden_dim=hidden_dim, num_layers=2)
        fused_feature_matrix, _ = gru_layer(concat_feature)
        return fused_feature_matrix

Step S2.4: Encode and reconstruct the temporal dimension of the temporal feature matrix to obtain the temporal reconstruction residual dataset.

[0073] Preferably, a GRU-based Seq2Seq encoding / decoding structure can be adopted, with both the encoder and decoder having two GRU layers and a hidden layer dimension of 128, identical to the GRU layer dimension in S2.3. The encoder input is the temporal feature matrix generated in S2.3, and the decoder output is a reconstructed temporal feature matrix with the same dimension as the input. During encoding, the encoder reads the input temporal feature matrix sequentially, compressing the temporal information step by step, and finally outputs a context vector C of dimension [N×H], where H is the hidden layer dimension, and the context vector C contains the global feature information of the temporal sequence within the corresponding window. During reconstruction and decoding, the decoder uses the context vector C as the initial hidden state and reconstructs the sequence in reverse temporal order, outputting a reconstructed temporal feature matrix $\hat{X}$ with the same length and dimension as the input sequence.

[0074] The next step is to calculate the reconstruction residuals. For each window, the input feature matrix $X$ is compared with the reconstructed feature matrix $\hat{X}$, and the reconstruction residual is calculated element by element: $R = X - \hat{X}$, where $R$ is the reconstruction residual matrix of a single window, with the same dimensions as $X$. The residual matrices of all windows are concatenated in temporal order to finally generate a temporal reconstruction residual dataset, with dimensions completely aligned with the multivariate temporal variable matrix in step S2.1.

[0075] Preferably, the temporal reconstruction residual dataset can also be Z-score standardized to eliminate the differences in residual magnitudes across different feature dimensions.

[0076] Step S2.5: Calculate the anomaly score for each time series sampling point based on the time series reconstruction residual dataset, and generate a time series dataset with anomaly scores.

[0077] The first step is to calculate the window-level anomaly score. For each sliding window, the mean squared error (MSE) of the residuals within the window is calculated as the window-level anomaly score, as shown in the following formula:

$$S_{win} = \frac{1}{L \cdot D} \sum_{t=1}^{L} \sum_{d=1}^{D} R_{norm}(t,d)^{2}$$

[0078] Where $S_{win}$ is the anomaly score of a single window, L is the window length, D is the feature dimension, and $R_{norm}(t,d)$ is the standardized residual value at the t-th time step and the d-th feature dimension within the window.

[0079] The second step is to set the sampling point weights. Since a single sampling point will be covered by multiple sliding windows, a Gaussian distribution weight function is used to calculate the weight of the sampling point within the window. Sampling points at the center of the window have higher weights, while those at the edges have lower weights, to avoid the influence of window edge noise on abnormal scores. The weight calculation formula is as follows: .

[0080] Where $w_t$ is the weight of the t-th time step within the window, and L is the window length.

[0081] The third step is point-by-point anomaly score aggregation. For each time-series sampling point, the window-level anomaly scores of all windows covering that sampling point are multiplied by the weights of the corresponding positions, summed, and then divided by the total weight to obtain the final anomaly score $S_t$ for that sampling point. The formula is as follows:

$$S_t = \frac{\sum_{win \in \Omega_t} w_{t,win} \cdot S_{win}}{\sum_{win \in \Omega_t} w_{t,win}}$$

[0082] Where $\Omega_t$ is the set of all sliding windows that cover the t-th sampling point, and $w_{t,win}$ is the weight of the t-th sampling point within the corresponding window.
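The Gaussian weighting and the point-by-point aggregation can be sketched together as follows. The spread of the Gaussian is not recoverable from this text, so the sketch assumes a standard deviation of L/6 centred on the window midpoint. It also assumes that window k starts at sampling point k × step. Both function names are hypothetical.

```python
import math

def gaussian_weights(length, sigma=None):
    """Weights peaking at the window centre; sigma = length / 6 is an assumption."""
    sigma = sigma if sigma is not None else length / 6.0
    centre = (length - 1) / 2.0
    return [math.exp(-((t - centre) ** 2) / (2 * sigma ** 2))
            for t in range(length)]

def point_scores(window_scores, n_points, length, step):
    """Weighted average of window-level scores over all windows covering each point."""
    w = gaussian_weights(length)
    num = [0.0] * n_points
    den = [0.0] * n_points
    for k, s_win in enumerate(window_scores):
        start = k * step
        for pos in range(length):
            t = start + pos
            if t >= n_points:
                break  # zero-padded tail carries no real sampling point
            num[t] += w[pos] * s_win
            den[t] += w[pos]
    return [n / d if d > 0 else 0.0 for n, d in zip(num, den)]

# Two windows of length 4 with step 2 over 6 sampling points.
scores = point_scores([1.0, 3.0], n_points=6, length=4, step=2)
```

A point covered by a single window inherits that window's score. A point in the overlap is pulled toward the window in which it sits closer to the centre.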

[0083] The fourth step is to generate anomaly score datasets. This involves aligning the anomaly score of each sampling point with the standardized multivariate time-series dataset output in step S1, adding a corresponding anomaly score value to each data record, and finally generating a time-series dataset with anomaly scores. The dataset dimensions are completely consistent with the original standardized multivariate time-series dataset, with only the anomaly score dimension added.

[0084] Step S2.6: Adaptive outlier determination based on extreme value theory, identifying data outliers from the time series dataset with outlier scores, and generating a time series dataset with outlier labels.

[0085] The first step is extreme value theory fitting. The peaks-over-threshold (POT) method from extreme value theory can be used to fit the anomaly score sequence to a generalized Pareto distribution (GPD), thereby calculating the adaptive anomaly threshold. The cumulative distribution function of the GPD is as follows:

$$G(x) = 1 - \left(1 + \frac{\xi (x - u)}{\beta}\right)^{-1/\xi}$$

[0086] Where x is an anomaly score exceeding u, u is the initial threshold, ξ is the shape parameter, and β is the scale parameter.

[0087] The second step is to set the initial threshold. The initial threshold u can be determined by the mean residual life plot method: the anomaly score sequence is sorted in ascending order, and the mean excess of the samples exceeding u is calculated for each candidate u. The u value at which the mean residual life curve begins to grow linearly is taken as the initial threshold. The initial search range of u is set to the 80% to 99% quantiles of the anomaly score sequence.

[0088] The third step is to estimate and verify the GPD distribution parameters. For abnormal score excess samples that exceed the initial threshold u, the maximum likelihood estimation method is used to fit the shape parameter ξ and scale parameter β of the GPD distribution. After the fitting is completed, the KS test is used to verify the goodness of fit, with the significance level set at 0.05. If the verification fails, the initial threshold u is readjusted until the goodness of fit meets the requirements.

[0089] The fourth step is to calculate the anomaly detection threshold. Based on the fitted GPD, the upper limit of the false alarm rate for anomalies is set to 0.1%, corresponding to a confidence level of 99.9%. The final anomaly detection threshold $S_{th}$ is then calculated as follows:

$$S_{th} = u + \frac{\beta}{\xi}\left[\left(\frac{\alpha N}{n_u}\right)^{-\xi} - 1\right]$$

[0090] Where N is the total number of samples in the anomaly score sequence, $n_u$ is the number of excess samples above the initial threshold u, and α is the false alarm rate, set to α=0.001.
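The threshold calculation can be sketched as follows, assuming the standard peaks-over-threshold quantile formula S_th = u + (β/ξ)·[(αN/n_u)^(−ξ) − 1] with its ξ → 0 limit S_th = u − β·ln(αN/n_u). The GPD parameters are taken as given here. Fitting them by maximum likelihood is outside the scope of this sketch, and the numbers in the example are made up.

```python
import math

def pot_threshold(u, xi, beta, n_total, n_excess, alpha=0.001):
    """Anomaly threshold from fitted GPD parameters (standard POT quantile)."""
    ratio = alpha * n_total / n_excess
    if abs(xi) < 1e-12:
        # Limit of the formula as the shape parameter tends to zero.
        return u - beta * math.log(ratio)
    return u + (beta / xi) * (ratio ** (-xi) - 1.0)

# Illustrative values only: 10,000 scores, 100 of them above u = 2.0.
s_th_exponential = pot_threshold(u=2.0, xi=0.0, beta=1.0, n_total=10000, n_excess=100)
s_th_heavy_tail = pot_threshold(u=2.0, xi=0.5, beta=1.0, n_total=10000, n_excess=100)
```

A heavier tail (larger ξ) pushes the threshold higher for the same false alarm rate, which is why the shape parameter is fitted rather than fixed.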

[0091] The fifth step is to binarize and label the outliers. All time-series sampling points are traversed. If the anomaly score of a sampling point satisfies $S_t \ge S_{th}$, the point is marked as an anomaly and assigned a value of 1; if $S_t < S_{th}$, it is marked as a normal point and assigned a value of 0. Finally, an abnormal state flag bit is added to each data record to generate a time series dataset with outlier labels.

[0092] Preferably, to avoid noise misjudgment at a single sampling point, an anomaly point duration threshold filtering is adopted. The anomaly mark is retained only when 5 or more consecutive sampling points are marked as anomalies; otherwise, the corresponding sampling point is corrected to a normal point. The duration threshold is set to 5 time steps, corresponding to a time span of 0.5s at a sampling frequency of 10Hz, to filter out misjudgments caused by instantaneous noise.
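The duration threshold filter can be sketched as a run-length pass over the binary labels. The function name is illustrative.

```python
def filter_short_runs(flags, min_run=5):
    """Keep anomaly labels only for runs of at least `min_run` consecutive 1s."""
    out = [0] * len(flags)
    i = 0
    while i < len(flags):
        if flags[i] != 1:
            i += 1
            continue
        j = i
        while j < len(flags) and flags[j] == 1:
            j += 1  # extend to the end of the current run of anomalies
        if j - i >= min_run:
            out[i:j] = [1] * (j - i)
        i = j
    return out
```

For example, `filter_short_runs([0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1])` keeps only the run of five and returns `[0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0]`.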

[0093] Step S2.7: Align the dimensions and merge the attributes of the time series dataset with outlier markers with the standardized multivariate time series dataset to obtain a multivariate time series dataset with marked outlier states.

[0094] The first step is to accurately align the time series dimensions. Using the millisecond-level timestamps of the standardized multivariate time series dataset output in step S1 as the primary key, the time series dataset with outlier markers generated in step S2.6 is matched one-to-one with the timestamps to ensure that the outlier status markers of each sampling point correspond completely with the timestamps of the original data, with an alignment error of 0 and no misalignment or omission.

[0095] The second step is attribute dimension merging. The aligned abnormal state marker and abnormal score are added as two new feature dimensions and merged into the original standardized multivariate time series dataset. The merged dataset retains all metadata information of the original dataset, including batch UUID, process code, collection item attributes, timestamp, etc., and adds two new dimensions: abnormal state marker and abnormal score.

[0096] The third step is batch-level anomaly labeling: For a single batch of datasets, if the proportion of outliers in the batch exceeds 0.5%, the batch is marked as an abnormal batch, and a new batch-level anomaly label is added to provide a basis for batch-level anomaly early warning for subsequent quality control.
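The batch-level rule reduces to a ratio check, sketched below. The function name is hypothetical, and an empty batch is treated as normal by assumption.

```python
def batch_is_abnormal(flags, max_ratio=0.005):
    """True if the share of anomalous sampling points in the batch exceeds 0.5%."""
    if not flags:
        return False  # assumption: an empty batch is not flagged
    return sum(flags) / len(flags) > max_ratio
```

With 1000 sampling points, 5 anomalies (exactly 0.5%) do not flag the batch, while 6 anomalies do, because the rule uses a strict "exceeds" comparison.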

[0097] The fourth step is to standardize the dataset format: the merged dataset adopts a tensor format that is completely consistent with the standardized multivariate time series dataset output in step S1, with the dimensions being [number of batches × time step × (feature dimension + 2)]. The two newly added dimensions are anomaly score and anomaly state label, ensuring that the dataset format is compatible with the algorithm input requirements of the subsequent step S3.

[0098] Beneficial effects of steps S2.1 to S2.7: This module's sub-steps are adapted to the time-series data characteristics of the multi-physically dispersed links in the filter rod molding process chain. It captures the correlation and dependence of process parameters between processes such as molding, curing, emission, and receiving, making up for the shortcomings of existing production management and monitoring systems that can only achieve single-dimensional data recording and cannot identify cross-variable correlation anomalies. It adapts to the time delay characteristics between processes, enabling accurate identification of abnormal states in multi-dimensional time-series data of the entire process. It avoids the problem of insufficient adaptability of traditional anomaly detection methods to industrial linkage production data, completes the standardized marking of abnormal states and aligns the dimensions of the original data, ensures the integrity and reliability of time-series data of the entire process, eliminates the pain point of the inability to identify data anomalies in various dispersed links, provides accurate abnormal state data support for quality anomaly control in the filter rod production process, and fills the gap in the existing system's lack of in-depth analysis capabilities for process chain linkage anomalies.

[0099] Step S3: Obtain the deep representation features of the multivariate time series dataset through the time series deep feature extraction algorithm to obtain the deep time series feature dataset.

[0100] Furthermore, step S3 specifically includes the following steps: Step S3.1: Perform dual-view enhancement processing on the multivariate time-series dataset with labeled abnormal states to obtain a time-series dataset with a pair of original views and context views.

[0101] The first step is input data preprocessing. The only input in step S3.1 is the multivariate time series dataset with labeled abnormal states output in step S2. First, the mask preprocessing of abnormal samples is completed.

[0102] The masking rules are as follows: a sampling point whose abnormal-status label is 1 is filled with the mean of the normal samples of the same feature dimension within the same batch; a run of five or more consecutive abnormal points is replaced by a matching normal time segment from the same process under the same working conditions, so that abnormal values do not interfere with feature learning. The preprocessed data retain the original time length, feature dimensions and full metadata.
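A minimal sketch of the two masking rules, assuming one batch is stored as a `[T, D]` NumPy array with a `[T]` array of 0/1 labels. The helper names are hypothetical, and the lookup of a matching normal segment for long runs is left out because the source does not specify it; the second helper only locates the runs that would need replacement.

```python
import numpy as np


def mask_abnormal_points(x: np.ndarray, abnormal: np.ndarray) -> np.ndarray:
    """x: [T, D] features of one batch; abnormal: [T] 0/1 labels.
    Abnormal rows are replaced by the per-feature mean of the normal rows."""
    normal_mean = x[abnormal == 0].mean(axis=0)
    filled = x.copy()
    filled[abnormal == 1] = normal_mean
    return filled


def long_abnormal_runs(abnormal: np.ndarray, min_len: int = 5):
    """Return (start, end) pairs, end exclusive, of abnormal runs with at least min_len points."""
    padded = np.concatenate(([0], abnormal, [0]))
    edges = np.flatnonzero(np.diff(padded))  # alternating run starts and run ends
    starts, ends = edges[::2], edges[1::2]
    return [(int(s), int(e)) for s, e in zip(starts, ends) if e - s >= min_len]
```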

[0103] The second step is the dual-view construction rules. The TS2Vec standard dual-view enhancement paradigm can be adopted, strictly adhering to the temporal constraints of the filter rod forming process, and not performing any enhancement operations that disrupt the temporal order. The definitions and construction rules for the two types of views are as follows: ① Original View (Instance View): Based on a complete time sequence of a single batch, the fixed window length is set to 2400 time steps, corresponding to a time span of 240s at a unified 10Hz sampling frequency for steps S1 and S2. This fully covers the minimum process flow cycle of the entire process from filter rod forming to curing to emission, ensuring that the parameter correlation characteristics across processes can be fully captured within a single view.

[0104] ② Context view: For each original view, the context view is generated by random contiguous cropping. The crop length is set to 60% to 80% of the original view length, and the crop start position is drawn at random, so that the cropped subsequence is a continuous time segment sharing the same core temporal context as the original view. Each original view corresponds to exactly one context view, forming one-to-one sample pairs.
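The context-view construction can be illustrated with a short sketch. It assumes the original view is a `[L, D]` NumPy array with L = 2400; the function name and the choice to return the start index alongside the segment are illustrative assumptions.

```python
import numpy as np

WINDOW_LEN = 2400  # 240 s at 10 Hz


def make_context_view(original_view: np.ndarray, rng: np.random.Generator):
    """Take one contiguous sub-sequence covering 60-80 % of the original view.
    Returns the sub-sequence and its start index inside the original view."""
    length = original_view.shape[0]
    min_len, max_len = length * 6 // 10, length * 8 // 10
    crop_len = int(rng.integers(min_len, max_len + 1))
    start = int(rng.integers(0, length - crop_len + 1))
    return original_view[start:start + crop_len], start


rng = np.random.default_rng(0)
view = np.zeros((WINDOW_LEN, 3))
context, start = make_context_view(view, rng)
```

Keeping the start index makes it possible to line up each context time step with the same time step of the original view later on.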

[0105] The third step is timing micro-offset augmentation. To further improve the generalization ability of the model, the paired dual views are shifted by a timing micro-offset of ±5 time steps, corresponding to a time span of ±0.5s. The offset does not exceed 0.5% of the window length, so the feature bias caused by sampling synchronization errors is eliminated without breaking the process timing logic.

[0106] The fourth step is to generate paired datasets. By iterating through all the preprocessed time-series data by batch dimension, all sample pairs are constructed, and finally, a time-series dataset with paired original view and context view is generated. The tensor dimension of the dataset is [N×2×L×D], where N is the total number of valid view samples in a single batch, 2 corresponds to the two dimensions of the dual view, L is the fixed view length of 2400, and D is the feature dimension of the original data, which is completely consistent with the number of valid collection items + anomaly label dimension in step S1.1.

[0107] Step S3.2: Input the time series dataset into the time series coding network of the TS2Vec algorithm, and perform feature mapping through hierarchical causal convolution to obtain the hierarchical time series feature set corresponding to the dual views.

[0108] The first step is the basic structure of the temporal coding network. Step S3.2 uses the TS2Vec temporal coding network, whose core is a temporal convolutional network (TCN) composed of multiple layers of dilated causal convolutions. A weight-sharing coding network processes both views of each pair, ensuring consistent feature mapping rules. The defining property of causal convolution is that the output at each time step depends only on the current and earlier time steps; future temporal information is fully masked, which matches the irreversible flow logic of the filter rod forming process.

[0109] The second step is to set the core parameters of the network structure: ① The number of convolutional layers is set to 10, and the expansion coefficient d of each layer increases exponentially by 2, i.e. d=1, 2, 4, 8, 16, 32, 64, 128, 256, 512. Finally, the global receptive field of the 10 convolutional layers reaches 2400 time steps, which is completely matched with the fixed length of the original view and can fully cover the full-cycle temporal information within a single view.

[0110] ② The kernel size of each convolutional layer is set to 3, the number of output channels is fixed at 64, and causal padding is used to ensure that the temporal length remains unchanged before and after the convolution operation, without dimensional compression.

[0111] ③ After each convolutional layer, a normalization layer and a GELU activation function are added to avoid the neuron death problem caused by the ReLU activation function and improve the model training stability.
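The receptive field implied by these parameters can be checked with the usual formula for a stack of dilated convolutions, RF = 1 + (K − 1)·Σ d. The sketch below assumes exactly one convolution per layer, which is an assumption about the layer layout rather than something stated in the source. Under that assumption the ten layers cover 2047 time steps, somewhat short of the 2400-step view length; an eleventh layer with d = 1024, or two convolutions per layer, would exceed 2400.

```python
def receptive_field(kernel_size: int, dilations: list) -> int:
    """Receptive field of stacked dilated convolutions with one convolution per layer."""
    return 1 + (kernel_size - 1) * sum(dilations)


dilations = [2 ** l for l in range(10)]  # 1, 2, 4, ..., 512
rf_10_layers = receptive_field(3, dilations)
rf_11_layers = receptive_field(3, dilations + [1024])
print(rf_10_layers, rf_11_layers)
```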

[0112] The third step is to define the hierarchical causal convolution calculation rules. The output of a single dilated causal convolutional layer is calculated as follows:

$$h_t^{(l)} = \mathrm{GELU}\left(\mathrm{LN}\left(\sum_{k=0}^{K-1} W_k^{(l)}\, h_{t-k\cdot d}^{(l-1)} + b^{(l)}\right)\right)$$

[0113] where $h_t^{(l)}$ is the output feature of the l-th convolutional layer at the t-th time step; K is the kernel size, with K=3; $W^{(l)}$ and $b^{(l)}$ are the learnable weights and bias of the l-th convolutional layer; d is the dilation coefficient of the l-th layer; $h_{t-k\cdot d}^{(l-1)}$ represents the historical input features of the (l-1)-th layer; LN is the layer normalization operation; and GELU is the activation function applied after normalization.

[0114] The fourth step is the implementation of dual-view feature mapping. The paired original view and context view are input into a temporal coding network with fully shared weights. Each convolutional layer outputs temporal features at the corresponding scale. The lower-level convolutions (layers 1-5) capture the local parameter fluctuation features within a single process, while the higher-level convolutions (layers 6-10) capture the cross-process global correlation features of the entire process from molding to curing to emission to reception. Finally, each view generates 10 levels of temporal features, which correspond one-to-one with the number of convolutional layers.

[0115] The fifth step is to generate the hierarchical temporal feature set. The hierarchical features of the original view and the context view are spliced into tensors according to view type, layer order and temporal order, yielding the hierarchical temporal feature set corresponding to the two views. The tensor dimension is [N×2×L×C×K], where C is the number of output channels of the convolutional layer (64) and K here denotes the number of convolutional layers (10). This ensures that the hierarchical features of the two views correspond one-to-one without dimensional misalignment.

[0116] For example, step S3.2 can be implemented using the following pseudocode:

```python
import torch
import torch.nn as nn


def dilated_causal_conv_encoder(dual_view_dataset, layer_num=10, out_channels=64, kernel_size=3):
    # Input: paired dual-view dataset [N, 2, L, D]
    N, view_num, L, D = dual_view_dataset.shape
    # Weight-sharing temporal coding network
    conv_layers, norm_layers = nn.ModuleList(), nn.ModuleList()
    for l in range(layer_num):
        dilation = 2 ** l
        # Dilated causal convolutional layer, followed by layer normalization and activation
        conv_layers.append(nn.Conv1d(
            in_channels=out_channels if l > 0 else D,
            out_channels=out_channels,
            kernel_size=kernel_size,
            padding=dilation * (kernel_size - 1),  # causal padding; the right-hand excess is trimmed below
            dilation=dilation,
        ))
        norm_layers.append(nn.LayerNorm([out_channels, L]))
    act = nn.GELU()
    # Dual-view feature mapping with shared weights
    hierarchical_features = []
    for view_idx in range(view_num):
        x = dual_view_dataset[:, view_idx, :, :].permute(0, 2, 1)  # [N, D, L]
        layer_features = []
        for conv, norm in zip(conv_layers, norm_layers):
            # Conv1d pads both sides; keeping the first L outputs makes each step depend only on past inputs
            x = act(norm(conv(x)[..., :L]))
            layer_features.append(x.permute(0, 2, 1))  # [N, L, C]
        hierarchical_features.append(torch.stack(layer_features, dim=-1))  # [N, L, C, K]
    # Stack the dual-view hierarchical features into the hierarchical temporal feature set
    hierarchical_feature_set = torch.stack(hierarchical_features, dim=1)  # [N, 2, L, C, K]
    return hierarchical_feature_set
```

Step S3.3: Perform cross-view contrastive learning optimization on the hierarchical temporal feature set to obtain a dual-view temporal feature set after feature space alignment.

[0117] The first step is the core logic of contrastive learning: this step adopts a dual contrastive learning paradigm at the time-series level and the instance level. Unlike traditional global instance-level contrast, time-series-level contrast ensures that the context features of the same time step are accurately aligned, fully preserves the temporal details of the filter rod process, and meets the accuracy requirements of the subsequent cross-process time sequence alignment.

[0118] The second step is to set rules for positive and negative samples, adhering to the batch consistency and timing consistency constraints of the filter rod process, and to complete the standardized definition of positive and negative samples: Positive samples: Original view features and context view features from the same batch and time step are strongly correlated paired samples.

[0119] Negative samples are divided into two categories: one is features from the same batch but at different time steps (time-series negative samples), and the other is features from different batches but at any time step (batch negative samples). The two types of negative samples together constitute the negative sample set, ensuring that the model can learn quality-related features with batch-specificity and time-series-specificity.

[0120] The third step is to design the loss function and set its parameters. The normalized temperature-scaled cross-entropy loss (NT-Xent) used as standard with TS2Vec can be adopted. The loss is calculated separately for the features of each convolutional layer and then summed with layer weights. The loss function is calculated as follows:

$$\mathcal{L} = -\sum_{l=1}^{10} w_l \cdot \frac{1}{N T}\sum_{i=1}^{N}\sum_{t=1}^{T} \log \frac{\exp\left(\mathrm{sim}\left(z_{i,t}^{(l)}, z_{i,t}^{\prime(l)}\right)/\tau\right)}{\exp\left(\mathrm{sim}\left(z_{i,t}^{(l)}, z_{i,t}^{\prime(l)}\right)/\tau\right) + \sum_{(j,s)\in \mathrm{Neg}(i,t)} \exp\left(\mathrm{sim}\left(z_{i,t}^{(l)}, z_{j,s}^{\prime(l)}\right)/\tau\right)}$$

[0121] where $z_{i,t}^{(l)}$ and $z_{i,t}^{\prime(l)}$ are the original view and context view features of the i-th sample at the t-th time step and the l-th layer; N and T are the numbers of samples and time steps over which the loss is averaged; sim() is the cosine similarity function; τ is the temperature coefficient, set to 0.07; Neg(i,t) is the negative sample set corresponding to the positive sample; and $w_l$ is the weight of the loss of layer l, set to 0.7 for the higher-level convolutions (layers 6-10) and 0.3 for the lower-level convolutions (layers 1-5), prioritizing the learning of global correlation features across processes.
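A minimal NumPy sketch of a layer-weighted NT-Xent loss of this kind is given below. It assumes that the two views have already been restricted to the same time steps, that every other (sample, time step) entry of the context view serves as a negative, and that the loss is averaged over all anchors; these choices and the function names are illustrative assumptions, not the source's exact formulation.

```python
import numpy as np

TAU = 0.07
LAYER_WEIGHTS = [0.3] * 5 + [0.7] * 5  # layers 1-5, then layers 6-10


def nt_xent(z_orig: np.ndarray, z_ctx: np.ndarray, tau: float = TAU) -> float:
    """z_orig, z_ctx: [B, T, C] features of one layer for the two views."""
    a = z_orig.reshape(-1, z_orig.shape[-1])
    b = z_ctx.reshape(-1, z_ctx.shape[-1])
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = a @ b.T / tau                       # cosine similarity divided by temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))    # positives sit on the diagonal


def layered_loss(feats_orig, feats_ctx, weights=LAYER_WEIGHTS) -> float:
    """Weighted sum of the per-layer losses; feats_* are lists with one [B, T, C] array per layer."""
    return sum(w * nt_xent(zo, zc) for w, zo, zc in zip(weights, feats_orig, feats_ctx))
```

With orthogonal, perfectly matched features the loss is close to zero, and it grows when the positive pair is less similar than a negative pair.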

[0122] The fourth step involves setting the model training parameters. Unsupervised training is adopted, eliminating the need for filter rod quality labels. The training dataset consists of 30 consecutive days of normal production history data from the filter rod forming process chain, which is completely consistent with the model training dataset in step S2, ensuring the adaptability of the model to the working conditions throughout the entire process. The optimizer is AdamW, with weight decay set to 0.0001, initial learning rate set to 0.001, batch size set to 32, training epochs set to 50, and early stopping policy patience set to 5. Training is terminated when the validation set loss does not decrease for 5 consecutive epochs to avoid model overfitting.
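The early stopping policy described above can be sketched in plain Python. The function below only models the stopping rule (at most 50 epochs, patience 5) on a list of validation losses; the function name and the list-based interface are hypothetical, and the optimizer itself is not modelled.

```python
def stopping_epoch(val_losses, max_epochs=50, patience=5):
    """Return the 1-based epoch at which training stops for the given validation losses."""
    best = float("inf")
    stale = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best:
            best, stale = loss, 0  # improvement: reset the patience counter
        else:
            stale += 1
            if stale >= patience:
                return epoch       # no improvement for `patience` consecutive epochs
    return min(len(val_losses), max_epochs)
```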

[0123] The fifth step is feature space alignment. After training, the dual-view hierarchical temporal feature set generated in step S3.2 is input into the optimized temporal coding network to complete the remapping of the feature space. This ensures that the cosine similarity of positive sample pairs is ≥0.9 and the cosine similarity of negative sample pairs is ≤0.3, achieving accurate alignment of the feature space. Finally, a dual-view temporal feature set with aligned feature space is generated, and the tensor dimension is completely consistent with the hierarchical temporal feature set output in S3.2.

[0124] Step S3.4: Perform multi-scale hierarchical feature aggregation on the dual-view temporal feature set to obtain a unified representation feature set with full temporal coverage.

[0125] The first step is multi-scale feature pooling. For each level of temporal features, three pooling operations are performed to complete multi-scale feature extraction. The pooling process maintains the temporal length and ensures full temporal coverage. ① Global Temporal Average Pooling: Captures the global temporal trend of each feature dimension.

[0126] ② Global temporal max pooling: captures temporal abrupt change information for each feature dimension.

[0127] ③ Sliding window mean pooling: The window length is set to 100 time steps to capture the local temporal fluctuation features of each feature dimension. The three pooling results are concatenated according to the feature dimensions to generate multi-scale features at each level, with dimensions of [N×2×L×3C×K], where C is the number of single channels, 64.
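The three pooling operations can be sketched for one layer and one view as follows. The sketch assumes that the two global statistics are repeated along the time axis so that the temporal length L is kept, and that the sliding window uses edge padding; both are assumptions made to satisfy the length requirement stated above.

```python
import numpy as np


def multi_scale_pool(feat: np.ndarray, window: int = 100) -> np.ndarray:
    """feat: [L, C] features of one layer and one view. Returns [L, 3C]."""
    gap = np.broadcast_to(feat.mean(axis=0), feat.shape)  # global average, repeated over time
    gmp = np.broadcast_to(feat.max(axis=0), feat.shape)   # global maximum, repeated over time
    # Sliding-window mean with edge padding so the output keeps length L
    left = window // 2
    right = window - 1 - left
    padded = np.pad(feat, ((left, right), (0, 0)), mode="edge")
    csum = np.cumsum(np.vstack([np.zeros((1, feat.shape[1])), padded]), axis=0)
    local = (csum[window:] - csum[:-window]) / window
    return np.concatenate([gap, gmp, local], axis=1)
```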

[0128] The second step is the adaptive calculation of hierarchical attention weights: a hierarchical self-attention mechanism adaptively learns the weights of features at different levels, eliminating manually set fixed weights and avoiding biases introduced by prior process knowledge. The weights are calculated as follows:

$$\alpha_l = \frac{\exp\left(\mathrm{MLP}\left(\mathrm{GAP}(F_l)\right)\right)}{\sum_{k=1}^{10}\exp\left(\mathrm{MLP}\left(\mathrm{GAP}(F_k)\right)\right)}$$

[0129] In the formula, $F_l$ is the multi-scale feature of the l-th layer, GAP is global average pooling, MLP is a two-layer perceptron, and $\alpha_l$ is the normalized weight of the features of the l-th layer; all weights sum to 1.

[0130] The third step is hierarchical feature weighted aggregation. The multi-scale features of each level are multiplied by the corresponding adaptive attention weight, and element-wise summation is performed along the hierarchical dimension to complete the hierarchical feature aggregation. Then, the aggregated original view and context view features are fused by element-wise mean fusion to eliminate the differences between the two views, retain the core temporal representation information, and finally generate a unified representation feature set with full temporal coverage. The tensor dimension is [N×L×C], where C is the number of feature channels (64), the temporal length L is exactly the same as the original view, and each time step corresponds to a unique aggregated feature.
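The weight calculation and the weighted aggregation can be sketched together. The MLP parameters `w1`, `b1`, `w2`, `b2`, the ReLU between the two MLP layers, and the sharing of one weight vector across both views are assumptions made for illustration only.

```python
import numpy as np


def layer_attention_weights(feats: np.ndarray, w1, b1, w2, b2) -> np.ndarray:
    """feats: [L, C, K] multi-scale features of K layers. Returns K softmax weights."""
    pooled = feats.mean(axis=0).T             # GAP over time -> [K, C]
    hidden = np.maximum(pooled @ w1 + b1, 0)  # first MLP layer (ReLU assumed)
    scores = (hidden @ w2 + b2).ravel()       # one scalar score per layer -> [K]
    scores = scores - scores.max()            # numerical stability
    exp = np.exp(scores)
    return exp / exp.sum()


def aggregate(feats_orig: np.ndarray, feats_ctx: np.ndarray, alpha: np.ndarray) -> np.ndarray:
    """Weighted sum over the layer axis, then element-wise mean of the two views -> [L, C]."""
    agg_orig = (feats_orig * alpha).sum(axis=-1)
    agg_ctx = (feats_ctx * alpha).sum(axis=-1)
    return 0.5 * (agg_orig + agg_ctx)
```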

[0131] Step S3.5: Map and match the unified representation feature set with the original temporal dimension to obtain the deep temporal feature dataset.

[0132] The first step is to accurately map the temporal dimension using the millisecond-level timestamps of the standardized multivariate temporal dataset output in step S1 as the unique primary key. Each window feature of the unified representation feature set is mapped back to its corresponding position in the original temporal sequence according to the timestamp. For the features in the overlapping areas of the sliding window, a Gaussian weighted average is used to complete the fusion. The weight function is completely consistent with the sampling point weight function in step S2.5, with higher weights at the center of the window and lower weights at the edges to avoid noise interference from the window edge features and ensure that each time step corresponds to a unique fused deep feature.
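The fusion of overlapping window features can be sketched as below. The weight function of step S2.5 is not reproduced in this section, so a Gaussian with a standard deviation of 0.25 times the window length is assumed; integer start indices stand in for the millisecond timestamps, and the windows are assumed to cover every time step.

```python
import numpy as np


def fuse_windows(window_feats: np.ndarray, starts, total_len: int, sigma_ratio: float = 0.25):
    """window_feats: [W, L, C] per-window features; starts: start index of each window.
    Overlaps are merged by a Gaussian-weighted average centred on each window."""
    _, win_len, channels = window_feats.shape
    pos = np.arange(win_len) - (win_len - 1) / 2
    weight = np.exp(-0.5 * (pos / (sigma_ratio * win_len)) ** 2)  # high at centre, low at edges
    num = np.zeros((total_len, channels))
    den = np.zeros(total_len)
    for feats, s in zip(window_feats, starts):
        num[s:s + win_len] += feats * weight[:, None]
        den[s:s + win_len] += weight
    return num / den[:, None]
```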

[0133] The second step is to bind features to metadata. The deep features after mapping and fusion are bound one-to-one with all metadata of the original dataset, such as batch UUID, process code, timestamp, and collection item attributes. At the same time, the two dimensions of abnormal status mark and abnormal score generated in step S2 are concatenated with the deep features to ensure that the full-link association between features and original data and abnormal information is traceable.

[0134] The third step is to standardize the dataset format. The final deep temporal feature dataset adopts a three-dimensional tensor format that is completely consistent with steps S1 and S2. The tensor dimension is [batch size × time step × (64+2)], where 64 is the deep representation feature dimension and 2 is the abnormal state label and abnormal score dimension, which is fully compatible with the input requirements of the Soft-DTW algorithm in the subsequent step S4.

[0135] Beneficial effects of steps S3.1 to S3.5: This module's sub-steps are adapted to the time-series data characteristics of physically dispersed stages in the filter rod forming process chain, such as forming, curing, emission, and receiving. It matches the time delay attributes between each process, overcoming the limitations of existing production management and monitoring systems that can only record and visualize surface-level data and cannot uncover deep data correlations. It breaks through the limitations of traditional feature processing methods in capturing the global dependence of long-cycle industrial time-series data, effectively extracting potential linkage patterns and deep correlation information from the time-series data of the entire process. It retains the full-cycle dimensional attributes of the time-series data and the implicit correlation characteristics between processes, avoiding subjective bias and loss of effective information caused by manual feature processing. This forms a unified feature system with strong representational capabilities, providing feature support with deep correlation information for the quality status analysis of the filter rod production process. It fills the gap in existing systems' lack of in-depth analysis capabilities for multi-stage data in the process chain, and resolves the pain point of the inability to achieve deep linkage analysis of data from various dispersed stages.

[0136] Step S4: Use a time sequence alignment and matching algorithm to perform cross-process time sequence association matching on the deep time sequence feature dataset to obtain the full-process associated dataset.

[0137] Furthermore, step S4 specifically includes the following steps: Step S4.1: The deep time-series feature dataset is split according to the process nodes of the filter rod forming process chain to obtain an independent time-series feature sequence subset for each process step.

[0138] In the first step, the sole input of step S4.1 is the deep time-series feature dataset output from step S3. The dataset is already bound to full metadata such as batch UUID, 4-digit standardized process code, millisecond-level timestamp, and abnormal status marker. The final output of step S4.1 is an independent subset of time-series feature sequences for each process step.

[0139] The second step is to match the process nodes according to the rules. The splitting criteria are completely one-to-one with the 4-digit process codes defined in step S1.5. The four core process steps are fixed and split as follows: 11XX code segment corresponds to the filament opening and forming step, 12XX code segment corresponds to the filter rod curing and storage step, 13XX code segment corresponds to the filter rod wind-powered emission step, and 14XX code segment corresponds to the winding machine receiving and forming step. There are no additional or missing process nodes.

[0140] The third step is the implementation process of dimensional decomposition: ① First, the input dataset is grouped by batch UUID to ensure that all process data in the same batch are collected into the same data group, thus avoiding data mixing across batches.

[0141] ② Secondly, within the data of a single batch, a second-level split is performed according to the process code mask, retaining only the feature sequence and metadata of the corresponding code segment and stripping away data from other processes. Each process retains only its own three core data types: deep representation features, anomaly markers, and timestamps.

[0142] ③ Finally, each split process sequence is supplemented with process-specific metadata, including the production start timestamp, production end timestamp, machine number, and number of valid sampling points for that process, thus completing the independent encapsulation of the sequence.
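The two-level split (first by batch UUID, then by process code segment) can be sketched in plain Python. The record layout and the short process names are hypothetical; only the 11XX to 14XX code segments come from the text.

```python
from collections import defaultdict

PROCESS_BY_PREFIX = {
    "11": "forming",
    "12": "curing_storage",
    "13": "pneumatic_emission",
    "14": "receiving",
}


def split_by_process(records):
    """records: iterable of dicts with 'batch_uuid', 'process_code' and 'timestamp_ms'.
    Returns {batch_uuid: {process_name: [records sorted by timestamp]}}."""
    grouped = defaultdict(lambda: defaultdict(list))
    for rec in records:
        process = PROCESS_BY_PREFIX.get(rec["process_code"][:2])
        if process is None:
            continue  # codes outside 11XX-14XX are not among the four core steps
        grouped[rec["batch_uuid"]][process].append(rec)
    for per_batch in grouped.values():
        for seq in per_batch.values():
            seq.sort(key=lambda r: r["timestamp_ms"])
    return grouped
```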

[0143] Preferably, for the minimum effective sequence length threshold of a single process, the molding stage is set to 1200 time steps (corresponding to a minimum production cycle of 120s at a sampling frequency of 10Hz), the curing and storage stage is set to 3600 time steps (corresponding to a minimum curing cycle of 6 minutes), the wind power transmission stage is set to 1200 time steps, and the winding machine receiving stage is set to 1200 time steps.

[0144] Preferably, the threshold for the percentage of valid data in a single process sequence is set to 95%. Sequences below this threshold are marked as invalid sequences, and the corresponding batches are marked as invalid batches and directly discarded.
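The two validity checks (minimum sequence length per process and the 95% valid-data ratio) can be combined in a small helper. The process names and the function signature are illustrative assumptions; the thresholds are the ones given in the text.

```python
MIN_STEPS = {
    "forming": 1200,             # 120 s at 10 Hz
    "curing_storage": 3600,      # 6 min at 10 Hz
    "pneumatic_emission": 1200,
    "receiving": 1200,
}
MIN_VALID_RATIO = 0.95


def is_valid_sequence(process: str, n_steps: int, n_valid: int) -> bool:
    """A sequence is valid if it is long enough and at least 95 % of its points are valid."""
    if n_steps < MIN_STEPS[process]:
        return False
    return n_valid / n_steps >= MIN_VALID_RATIO
```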

[0145] Preferably, each process sequence after splitting must have a unique batch UUID, process code, continuous time stamp, no cross-process data mixing, dimensional consistency error of 0, and a verification pass rate of 100%. The subset of independent time-series feature sequences of each process that passes the verification is used as the sole input of step S4.2.

[0146] Step S4.2: Perform time sequence constraint matching between process steps on the subset of time-series feature sequences to obtain a set of constrained feature sequence pairs.

[0147] The first step, step S4.2, has only one input: the independent temporal feature sequence subsets of each process step output from step S4.1; the final output of step S4.2 is a set of feature sequence pairs with process temporal constraints.

[0148] The second step is to establish the time sequence constraint rule: strictly following the physical flow path of filter rod production, the unique upstream-downstream pairing logic is fixed as filament opening and forming → filter rod curing and storage → filter rod wind-powered emission → winding machine receiving and forming. An upstream process may only be paired with its direct downstream process; reverse pairing, cross-level pairing, and pairing within the same process are prohibited, so that the pairing relationship fully conforms to actual production.

[0149] The third step involves sequence pair matching: ① First, use the batch UUID as the unique matching primary key to ensure that the paired upstream and downstream sequences belong to the same filter rod production batch, and prevent cross-batch pairing.

[0150] ② Secondly, process flow time window constraints are set. Based on the general process specifications for filter rod production in the tobacco industry, the maximum allowable flow interval thresholds for upstream and downstream processes are set: the maximum interval from the end time of the molding stage to the start time of the curing stage is set to 300s, the maximum interval from the end time of the curing stage to the start time of the emission stage is set to 1800s, and the maximum interval from the end time of the emission stage to the start time of the receiving stage is set to 60s. Sequence pairs that exceed the thresholds are judged as flow abnormalities and are not paired.

[0151] ③ Finally, process constraint tags are added to each compliant sequence pair, including upstream process ID, downstream process ID, batch UUID, transfer time interval, and constraint compliance tag, forming a standardized feature sequence pair. All compliant sequence pairs together constitute a set of constrained feature sequence pairs.
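The pairing rules can be sketched as follows for one batch. The dictionary layout, field names and process names are hypothetical; the chain order and the 300 s, 1800 s and 60 s limits come from the text. The sketch only enforces the maximum interval, since no minimum is stated.

```python
CHAIN = ["forming", "curing_storage", "pneumatic_emission", "receiving"]
MAX_GAP_S = {
    ("forming", "curing_storage"): 300,
    ("curing_storage", "pneumatic_emission"): 1800,
    ("pneumatic_emission", "receiving"): 60,
}


def build_constrained_pairs(batch_uuid, sequences):
    """sequences: {process: {'start_ms': int, 'end_ms': int}} for one batch.
    Only direct upstream -> downstream neighbours within the allowed gap are paired."""
    pairs = []
    for upstream, downstream in zip(CHAIN, CHAIN[1:]):
        if upstream not in sequences or downstream not in sequences:
            continue
        gap_s = (sequences[downstream]["start_ms"] - sequences[upstream]["end_ms"]) / 1000.0
        if gap_s > MAX_GAP_S[(upstream, downstream)]:
            continue  # transfer anomaly: the pair is not formed
        pairs.append({
            "batch_uuid": batch_uuid,
            "upstream": upstream,
            "downstream": downstream,
            "transfer_interval_s": gap_s,
            "constraint_ok": True,
        })
    return pairs
```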

[0152] Step S4.3: Calculate the warping path and alignment cost matrix of the sequence pairs in the constrained feature sequence pair set using the Soft-DTW algorithm to obtain the optimal alignment path of the feature sequence pair set.

[0153] The first step, step S4.3, has only one input: the set of constrained feature sequence pairs output from step S4.2. The final output of step S4.3 is the set of optimal alignment paths and the set of alignment cost matrices corresponding to the set of feature sequence pairs.

[0154] The second step covers the core principles and parameter settings of the Soft-DTW algorithm. ① Core features of the algorithm: Soft-DTW is a smoothed variant of the traditional dynamic time warping algorithm. By replacing the hard minimum with a soft-minimum operation, it makes the alignment distance differentiable. It is also robust to noise and local fluctuations in industrial time series data and is well suited to aligning the non-fixed time delays between filter rod production processes.

[0155] ② Core hyperparameter settings: The smoothing coefficient γ is set to 0.1. This parameter balances the smoothness and accuracy of alignment and adapts to the noise level of industrial time series data; the global constraint adopts Sakoe-Chiba band constraint, and the constraint window radius R is set to 200 time steps (corresponding to a time span of 20s at a sampling frequency of 10Hz), which strictly limits the maximum time offset of the alignment path and avoids over-alignment that does not conform to the process logic; the distance metric adopts Euclidean distance as the distance calculation benchmark for point-by-point features between sequence pairs.

[0156] The third step is the implementation of the algorithm calculation: ① For each feature sequence pair, define the upstream sequence as $X=[x_1,x_2,\dots,x_M]\in\mathbb{R}^{M\times D}$ and the downstream sequence as $Y=[y_1,y_2,\dots,y_N]\in\mathbb{R}^{N\times D}$, where M and N are the time step lengths of the upstream and downstream sequences, respectively, and D is the feature dimension (completely consistent with the 64-dimensional deep feature dimension output by S3).

[0157] ② Calculate the cost matrix $C\in\mathbb{R}^{M\times N}$ for each sequence pair, where each element $C_{i,j}$ is the Euclidean distance between the feature vector of the upstream sequence at time step i and the feature vector of the downstream sequence at time step j: $C_{i,j}=\lVert x_i-y_j\rVert_2$.

[0158] ③ Based on the soft minimum operation of Soft-DTW, calculate the cumulative cost matrix $A\in\mathbb{R}^{M\times N}$ of the warping path. The recursive formula for the cumulative cost is: $A_{i,j}=C_{i,j}+\mathrm{softmin}(A_{i-1,j},A_{i-1,j-1},A_{i,j-1})$.

[0159] Where $\mathrm{softmin}(a,b,c)=-\gamma\log\left(e^{-a/\gamma}+e^{-b/\gamma}+e^{-c/\gamma}\right)$, the boundary condition is $A_{0,0}=0$, and the initial values of the other boundaries are $+\infty$.

[0160] ④ Based on the cumulative cost matrix, the globally optimal warping path π∗ is solved by backtracking. The path must satisfy the monotonicity and continuity constraints and stay within the Sakoe-Chiba band, so that it contains no temporal reversal and no skipped steps.
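Steps ① to ④ can be sketched in NumPy as follows. This is an unoptimized illustration, not the source's implementation: the band is taken on raw indices (|i − j| ≤ R, which requires |M − N| ≤ R), and the path is recovered by a hard argmin backtrack over the soft cumulative matrix. Both choices, and all function names, are assumptions.

```python
import numpy as np


def softmin(values, gamma):
    """-gamma * log(sum(exp(-v / gamma))), ignoring +inf entries."""
    values = np.asarray(values, dtype=float)
    finite = values[np.isfinite(values)]
    if finite.size == 0:
        return np.inf
    m = finite.min()
    return m - gamma * np.log(np.exp(-(finite - m) / gamma).sum())


def soft_dtw(x, y, gamma=0.1, radius=200):
    """x: [M, D], y: [N, D]. Returns (soft-DTW value, cumulative matrix A, warping path)."""
    m, n = len(x), len(y)
    cost = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)  # C[i, j] = ||x_i - y_j||_2
    acc = np.full((m + 1, n + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if abs(i - j) > radius:
                continue  # outside the Sakoe-Chiba band
            acc[i, j] = cost[i - 1, j - 1] + softmin(
                [acc[i - 1, j], acc[i - 1, j - 1], acc[i, j - 1]], gamma)
    if not np.isfinite(acc[m, n]):
        raise ValueError("no admissible path inside the band")
    # Backtrack from the last cell, always moving to the cheapest predecessor
    path, i, j = [(m - 1, n - 1)], m, n
    while (i, j) != (1, 1):
        candidates = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min(candidates, key=lambda p: acc[p])
        path.append((i - 1, j - 1))
    return float(acc[m, n]), acc, path[::-1]
```

The double loop is quadratic in the sequence length, so a production version would vectorize or compile the recursion.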

[0161] For example, step S4.3 can be implemented using the following pseudocode:

```python
import torch
from soft_dtw_cuda import SoftDTW


def calculate_soft_dtw_alignment(constrained_sequence_pairs, gamma=0.1, sakoe_chiba_radius=200):
    # Initialize the Soft-DTW algorithm and set the core parameters.
    # Pseudocode: the sakoe_chiba_radius and return_matrix arguments and the
    # backtrack_optimal_path method are interfaces assumed by this example and
    # may not match a given soft_dtw_cuda release.
    sdtw = SoftDTW(use_cuda=True, gamma=gamma, sakoe_chiba_radius=sakoe_chiba_radius)
    optimal_path_set = []
    cost_matrix_set = []
    # Traverse all constrained sequence pairs
    for seq_pair in constrained_sequence_pairs:
        upstream_seq, downstream_seq = seq_pair["upstream"], seq_pair["downstream"]
        # Dimension adaptation: [batch, length, feature_dim]
        upstream_tensor = upstream_seq.unsqueeze(0)
        downstream_tensor = downstream_seq.unsqueeze(0)
        # Calculate the alignment cost matrix and cumulative cost
        loss, cost_matrix, accum_matrix = sdtw(upstream_tensor, downstream_tensor, return_matrix=True)
        # Backtrack to find the optimal warping path
        optimal_path = sdtw.backtrack_optimal_path(accum_matrix.squeeze())
        # Store the results and bind the sequence pair metadata
        cost_matrix_set.append({"seq_pair_id": seq_pair["id"], "cost_matrix": cost_matrix.squeeze()})
        optimal_path_set.append({"seq_pair_id": seq_pair["id"], "optimal_path": optimal_path})
    return optimal_path_set, cost_matrix_set
```

Step S4.4: Based on the optimal alignment path, perform cross-process feature sequence temporal dimension alignment mapping on the feature sequence set to obtain the full-process temporal alignment feature sequence set.

[0162] In the first step, the only inputs of step S4.4 are the optimal alignment path set and alignment cost matrix set output from step S4.3, and the independent temporal feature sequence subsets of each process output from step S4.1. The final output of this step is the feature sequence set after temporal alignment of all processes.

[0163] The second step is to construct a global timing reference axis. The production start time stamp of the forming stage within a single batch is taken as the zero point of the global reference axis, and the production end time stamp of the receiving stage is taken as the end point of the global reference axis. The sampling step size of the reference axis is fixed at 100ms (which is completely matched with the 10Hz sampling frequency uniform throughout the entire process) to ensure that the timing resolution of the reference axis is completely consistent with the original data without any loss of accuracy.

[0164] The third step is to align and map across processes: ① For the optimal alignment path of each upstream and downstream sequence pair, establish a one-to-one mapping relationship between the upstream time step and the downstream time step, clarify the specific production time point of the downstream process corresponding to each production time point of the upstream process, and accurately match the flow delay between processes.

[0165] ② Using the global time-series reference axis as a unified scale, the feature sequence of each process is mapped to the corresponding time position of the reference axis through linear interpolation according to the mapping relationship of the alignment path. The interpolation process preserves the original numerical distribution of the feature vectors without feature distortion.

[0166] ③ For blank time steps without corresponding mapping values on the reference axis, fill them with the feature mean of adjacent valid time steps in the same process to ensure that the mapped sequence has no missing values; for time steps that exceed the Sakoe-Chiba constraint range, fill them with zero values and mark them as invalid feature bits to avoid interference from invalid data.
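The mapping onto the global reference axis can be sketched with `numpy.interp`. The sketch assumes that the alignment path has already been converted into one increasing reference timestamp per feature step (`aligned_ms`, a hypothetical name). Gaps inside the covered range are filled by linear interpolation between neighbouring valid steps, and positions outside the covered range stand in for steps beyond the constraint range: they are zero-filled and flagged invalid.

```python
import numpy as np

STEP_MS = 100  # 10 Hz reference axis


def map_to_reference_axis(feat, aligned_ms, axis_start_ms, axis_end_ms):
    """feat: [T, D] features of one process; aligned_ms: [T] increasing timestamps.
    Returns (axis_ms, mapped [T_ref, D], valid mask [T_ref])."""
    axis_ms = np.arange(axis_start_ms, axis_end_ms + 1, STEP_MS)
    mapped = np.stack(
        [np.interp(axis_ms, aligned_ms, feat[:, d]) for d in range(feat.shape[1])],
        axis=1,
    )
    valid = (axis_ms >= aligned_ms[0]) & (axis_ms <= aligned_ms[-1])
    mapped[~valid] = 0.0  # outside the covered range: zero-fill and flag as invalid
    return axis_ms, mapped, valid
```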

[0167] The fourth step is to align and integrate the entire process sequence. The mapped sequences of the four processes (molding, curing, transmitting, and receiving) within the same batch are integrated according to the time step order of the global reference axis to ensure that the sequence time length of the four processes is completely consistent. Each time step corresponds to the feature vector of the four processes. Finally, a feature sequence set after the time sequence alignment of the entire process of a single batch is generated, with a tensor dimension of [T×D×4], where T is the total number of time steps of the global reference axis, D is the feature dimension of a single process, and 4 corresponds to the four core processes.

[0168] Step S4.5: After aligning the feature sequence set of the entire process time sequence, perform dimensional splicing and association binding according to the batch identifier to obtain the batch-level full-process associated feature set.

[0169] First, the sole input of step S4.5 is the feature sequence set after full-process time alignment output from step S4.4; the final output of step S4.5 is the batch-level full-process associated feature set.

[0170] The second step is batch-level dimension stitching: ① Group the full process alignment feature sequence set by batch UUID to ensure that the four process feature sequences in the same batch are grouped into the same group, and there is no cross-batch data mixing.

[0171] ② Within a single batch group, features are spliced time step by time step. The feature vectors of the four processes corresponding to the same time step (forming, curing, transmitting, and receiving) are spliced together end to end along the feature dimension, generating a global feature vector that integrates information from all processes for each time step.

[0172] ③ For each time step feature after splicing, bind the full metadata of the corresponding batch, including batch UUID, raw material batch number, machine number of each process, production start and end time, and abnormal status mark, to achieve strong association between features and information of the entire production chain.

[0173] The third step is to define the key parameters and dimensions. The feature dimension of a single process is fixed at 66 dimensions (64-dimensional deep representation features + 1-dimensional abnormal state marker + 1-dimensional abnormal score). After the four processes are spliced together, the global feature dimension of a single time step is fixed at 264 dimensions (4 × 66). The tensor dimension of a single batch feature set is [T×264], where T is the total number of time steps of the global reference axis, which is fully compatible with the input format requirements of the subsequent S5 algorithm.
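The batch-level splicing of step S4.5 can be illustrated with a short sketch. It assumes each batch's aligned features are held as a dictionary from process name to a list of T feature vectors; the names `PROCESSES` and `splice_batch` are hypothetical and chosen only for this example.

```python
# Fixed upstream-to-downstream order of the four processes.
PROCESSES = ("forming", "curing", "transmitting", "receiving")


def splice_batch(aligned, process_order=PROCESSES):
    """Concatenate the per-process feature vectors of each time step.

    aligned : dict mapping process name -> list of T vectors of length D
    Returns a list of T vectors of length len(process_order) * D.
    """
    lengths = {len(aligned[p]) for p in process_order}
    if len(lengths) != 1:
        raise ValueError("all processes must share the reference-axis length T")
    total_steps = lengths.pop()
    return [
        [value for p in process_order for value in aligned[p][t]]
        for t in range(total_steps)
    ]


# Example with D = 66 and T = 3: every spliced vector has 4 * 66 = 264 entries.
example = {p: [[float(i)] * 66 for _ in range(3)] for i, p in enumerate(PROCESSES)}
spliced = splice_batch(example)
```

Keeping the process order fixed means a given index in the 264-dimensional vector always refers to the same process and feature, which later steps rely on when assigning one token per feature dimension.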

[0174] Step S4.6: Perform dimensional matching and merging of the batch-level full-process associated feature set with the original data attributes to obtain the full-process associated dataset.

[0175] First, step S4.6 takes two inputs: the batch-level full-process associated feature set output from step S4.5 and the standardized multivariate time-series dataset output from step S1; the final output of this step is the full-process associated dataset.

[0176] The second step is to implement dimension matching and merging: ① Using batch UUID + millisecond-level timestamp as the joint primary key, the batch-level full-process associated feature set is matched one-to-one with the standardized multivariate time-series dataset output by S1 to ensure that the global features of each time step are completely aligned with the corresponding original process parameters and quality inspection data, with a matching error of 0.

[0177] ② The matched original data attributes are merged into the global feature vector along the feature dimension. The merged feature dimension is "264-dimensional full-process fusion feature + S1 original feature dimension", which retains all the collected information of the original data without information loss.

[0178] ③ Supplement the merged dataset with full-link metadata, including alignment path information for each process, flow time delay information, and anomaly point distribution information, to improve the full-process traceability attributes of the dataset.
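The matching and merging in ① and ② amount to a strict join on the composite key. A minimal sketch follows, assuming both datasets are available as dictionaries keyed by (batch UUID, millisecond timestamp); the function name `merge_with_raw` and the dictionary layout are illustrative assumptions.

```python
def merge_with_raw(fused_rows, raw_rows):
    """Join fused features with the original S1 attributes on the composite key.

    fused_rows : dict (batch_uuid, timestamp_ms) -> list of fused features
    raw_rows   : dict (batch_uuid, timestamp_ms) -> list of original attributes
    Raises KeyError if any key exists on only one side, so that the
    one-to-one matching requirement is enforced rather than assumed.
    """
    unmatched = fused_rows.keys() ^ raw_rows.keys()
    if unmatched:
        raise KeyError(f"unmatched keys: {sorted(unmatched)}")
    # Append the original attributes after the fused features.
    return {key: fused_rows[key] + raw_rows[key] for key in fused_rows}


fused = {("batch-a", 1000): [0.1] * 264, ("batch-a", 1010): [0.2] * 264}
raw = {("batch-a", 1000): [1.0] * 50, ("batch-a", 1010): [2.0] * 50}
merged = merge_with_raw(fused, raw)
```

With 264 fused dimensions and 50 original dimensions, each merged row has 314 entries. Failing loudly on an unmatched key is one way to make the "matching error of 0" requirement checkable.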

[0179] The third step is to standardize the dataset format: The final generated full-process associated dataset adopts a unified three-dimensional tensor format with tensor dimensions of [number of batches × time step × total feature dimension]. The number of batches is the total number of effective production batches, the time step is consistent with the global baseline length of a single batch, and the total feature dimension is a fixed value, which fully adapts to the input requirements of the subsequent S5 FT-Transformer algorithm.

[0180] Beneficial effects of steps S4.1 to S4.6: The sub-steps of this module are adapted to the physically dispersed and time-delayed nature of the filter rod forming, curing, transmitting, and receiving processes. They compensate for the inability of existing production management and monitoring systems to achieve precise cross-process temporal correlation and matching, addressing the temporal misalignment and broken data links between multiple dispersed processes. They achieve precise temporal alignment of the full-process feature sequences, connect the feature correlation links between dispersed stages, eliminate information silos between processes, and complete full-process data correlation and binding at the batch level. This ensures the temporal continuity and batch consistency of the data across the entire process chain, accommodates the time delay of each stage in the production process, and avoids the temporal misalignment and insufficient correlation accuracy of traditional cross-process data matching methods. It provides complete correlation data support for quality traceability and cross-process collaborative analysis throughout filter rod production, fills the gap left by existing systems, which lack effective data correlation capabilities for dispersed processes with time delays, and meets the core requirements of quality control across the entire filter rod forming process chain.

[0181] Step S5: Perform full-process feature fusion and quality level determination on the full-process associated dataset using a full-dimensional feature fusion classification algorithm to obtain the quality level result of each batch of filter rods.

[0182] Furthermore, step S5 specifically includes the following steps: Step S5.1: Generate an independent feature token vector based on each feature dimension in the full-process associated dataset to obtain a tokenized feature sequence set.

[0183] First, the sole input of step S5.1 is the full-process associated dataset output from step S4. The dataset tensor format is [number of batches × time step × total feature dimension], where the total feature dimension is fixed at 314 (264-dimensional deep features fused across all processes + 50-dimensional original standardized features from S1, including 42 core process parameters and 8 online quality inspection items). The dataset is bound to full traceability information such as batch UUIDs, timestamps, and process metadata. The final output of step S5.1 is a tokenized feature sequence set.

[0184] The second step is feature preprocessing and dimensionality splitting: ① First, the dimensions are split according to the feature type, and the feature subsets are divided into three categories: continuous feature subset (including process operation parameters, deep characterization features, and quality inspection data, totaling 312 dimensions), categorical feature subset (including process code and machine number, totaling 1 dimension), and binary feature subset (including abnormal status markers, totaling 1 dimension).

[0185] ② For continuous feature subsets, apply a second pass of Z-score standardization, using the mean and standard deviation of the entire training set as the benchmark, to eliminate numerical distribution shifts caused by fluctuations in operating conditions between batches.

[0186] ③ For categorical feature subsets, one-hot encoding is used to complete discrete value mapping. The encoding dimension is fully matched with the total number of process nodes and the total number of machines, avoiding ordinal bias caused by numerical encoding. For binary feature subsets, the original 0 / 1 encoding is directly retained without additional mapping.
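The preprocessing rules in ② and ③ can be sketched as follows. This is an illustrative example only: the helper names are hypothetical, and the fallback to a standard deviation of 1.0 for constant features is an added assumption to avoid division by zero.

```python
import math


def fit_zscore(train_values):
    """Return (mean, std) computed on the training set only."""
    mean = sum(train_values) / len(train_values)
    variance = sum((v - mean) ** 2 for v in train_values) / len(train_values)
    std = math.sqrt(variance)
    if std == 0.0:
        std = 1.0  # assumption: leave constant features unscaled
    return mean, std


def zscore(value, mean, std):
    """Standardize one continuous value with the training-set statistics."""
    return (value - mean) / std


def one_hot(index, num_classes):
    """Encode a categorical index (for example a machine number) as one-hot."""
    if not 0 <= index < num_classes:
        raise ValueError("category index out of range")
    vec = [0.0] * num_classes
    vec[index] = 1.0
    return vec


mean, std = fit_zscore([2.0, 4.0, 6.0])
standardized = zscore(6.0, mean, std)
machine_code = one_hot(2, 4)
```

The statistics are fitted once on the training set and reused for validation, test, and inference data, so that all batches are expressed on the same scale.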

[0187] Step 3: Core implementation and parameter settings for feature tokenization: ① The feature-wise linear projection tokenization mechanism of the FT-Transformer standard is adopted, and a dedicated linear projection layer is set for each independent feature dimension to ensure that the tokenization process of each feature is independent and there is no cross-interference of information.

[0188] ② Core parameter settings: The output dimension $d_{token}$ of a single feature token is fixed at 64 dimensions, which is consistent with the deep feature dimension output by S3 to ensure the uniformity of the feature space; the input dimension of the linear projection layer is the encoding dimension of the corresponding feature (1 dimension for continuous and binary features, and the one-hot encoding dimension for categorical features), and the output dimension is fixed at 64 dimensions.

[0189] ③ The tokenization calculation formula is as follows: $t_i = W_i \cdot x_i + b_i$, where $t_i \in \mathbb{R}^{d_{token}}$ is the feature token vector corresponding to the i-th feature dimension, $W_i \in \mathbb{R}^{d_{token} \times d_{in}}$ is the linear projection weight matrix specific to the i-th feature, $d_{in}$ is the input dimension of the corresponding feature, $x_i$ is the preprocessed feature value, and $b_i$ is the bias vector of the corresponding linear layer.

[0190] ④ Complete token sequence concatenation according to time step dimension. For each time step's 314 feature dimensions, concatenate all the token vectors corresponding to the features in the order of the feature dimensions to generate a single time step token sequence. The sequence length is equal to the total feature dimension of 314, and each sequence element is a 64-dimensional feature token vector.

[0191] For example, step S5.1 can be implemented using the following pseudocode:

    import torch
    import torch.nn as nn


    class FeatureTokenizer(nn.Module):
        def __init__(self, total_feature_num, continuous_feature_num,
                     categorical_feature_dims, token_dim=64):
            super().__init__()
            self.token_dim = token_dim
            self.total_feature_num = total_feature_num
            # Dedicated linear projection layer for each continuous feature
            # (input dimension 1, output dimension token_dim)
            self.continuous_proj = nn.ModuleList([
                nn.Linear(1, token_dim) for _ in range(continuous_feature_num)
            ])
            # Dedicated embedding layer for each categorical feature; an embedding
            # lookup equals a bias-free linear projection of the one-hot vector
            self.categorical_emb = nn.ModuleList([
                nn.Embedding(dim, token_dim) for dim in categorical_feature_dims
            ])

        def forward(self, x, continuous_mask, categorical_mask):
            # x: [batch_size, time_step, total_feature_num]
            batch_size, time_step, _ = x.shape
            token_sequence = torch.zeros(
                batch_size, time_step, self.total_feature_num, self.token_dim,
                device=x.device)
            # Tokenization of continuous features
            cont_idx = 0
            for i in range(self.total_feature_num):
                if continuous_mask[i]:
                    feat = x[:, :, i].unsqueeze(-1)  # [batch, time_step, 1]
                    token_sequence[:, :, i, :] = self.continuous_proj[cont_idx](feat)
                    cont_idx += 1
            # Tokenization of categorical features (values are category indices)
            cat_idx = 0
            for i in range(self.total_feature_num):
                if categorical_mask[i]:
                    feat = x[:, :, i].long()  # [batch, time_step]
                    token_sequence[:, :, i, :] = self.categorical_emb[cat_idx](feat)
                    cat_idx += 1
            # Output: [batch_size, time_step, seq_len, token_dim],
            # where seq_len = total number of features
            return token_sequence


    def generate_tokenized_sequence(full_process_dataset, tokenizer,
                                    continuous_mask, categorical_mask):
        tokenized_feature_set = tokenizer(
            full_process_dataset, continuous_mask, categorical_mask)
        return tokenized_feature_set

Step S5.2: Add classification tokens and temporal position codes to the tokenized feature sequence set to obtain a complete token sequence set with position information.

[0192] The first step's sole input is the tokenized feature sequence set output from step S5.1, with tensor dimensions of [batch size × time step × sequence length × token dimension]. The final output of this step is a complete token sequence set with location information.

[0193] The second step is to define the rules for adding the global classification token (CLS token): ① The CLS token is a global anchor token used by the FT-Transformer algorithm for the final classification task. It can aggregate feature information of the entire sequence and time series through the attention mechanism to achieve batch-level global feature extraction without additional pooling operations.

[0194] ② Parameter settings: The dimension of the CLS token is exactly the same as that of the feature token, fixed at 64 dimensions. Learnable parameters are used for initialization, and the initial values follow a normal distribution with a mean of 0 and a standard deviation of 0.02. The parameters are trained and optimized together with the overall parameters of the model.

[0195] ③ Addition method: At the beginning of the token sequence at each time step, a separate CLS token is concatenated. The length of the single time step token sequence after concatenation is expanded from 314 to 315, where position 0 is the CLS token and positions 1 to 314 are the feature tokens of the corresponding feature dimensions, ensuring that the CLS token can complete attention interaction with all feature tokens.

[0196] The third step is to add rules for timing position encoding: ① To address the irreversible temporal characteristics of the filter rod forming process, a learnable temporal position encoding is adopted. Unlike fixed sine and cosine encoding, it can adaptively learn the periodic characteristics and flow delay patterns of industrial time series data. At the same time, it strictly follows causal constraints, injecting only the temporal sequence information without destroying the original correlation of features.

[0197] ② Parameter settings: The dimension of the position encoding is completely consistent with the dimension of the token, which is fixed at 64 dimensions. The encoding length is completely matched with the total number of time steps in a single batch. Each time step corresponds to a unique position encoding vector, which is initialized using learnable parameters.

[0198] ③ Addition method: For each complete token sequence (including CLS tokens) at each time step, perform element-wise addition of the position encoding vector at the corresponding time step with all token vectors in the sequence, as shown in the following formula: $\tilde{t}_{i,t} = t_{i,t} + p_t$, where $\tilde{t}_{i,t}$ is the encoded vector of the i-th token at time step t, $t_{i,t}$ is the original token vector, and $p_t$ is the position encoding vector for the t-th time step.

[0199] ④ Supplementing causal constraints by adding process sequence masks to position encoding ensures that the position encoding of upstream process features precedes that of downstream processes, thus avoiding attention interactions that result in temporal logic reversal.

[0200] The fourth step is to generate a complete token sequence set. In the order of batch and time step, all token sequences after adding CLS tokens and position encoding are tensor concatenated to finally generate a complete token sequence set with position information. The tensor dimension is [number of batches × time step × 315 × 64], where 315 is the sequence length (1 CLS token + 314 feature tokens) and 64 is the token dimension, which is fully adapted to the input format requirements of the FT-Transformer algorithm.
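Step S5.2 can be sketched with nested Python lists so that the shape bookkeeping is easy to follow. In the model the CLS token and the position encodings are learnable parameters; in this illustration they are passed in as plain lists, and the function name `add_cls_and_position` is hypothetical.

```python
def add_cls_and_position(tokens, cls_token, pos_enc):
    """Prepend the CLS token and add the per-time-step position encoding.

    tokens    : [T][F][d] feature tokens of one batch
    cls_token : [d] classification token shared by all time steps
    pos_enc   : [T][d] one position vector per time step
    Returns a [T][F + 1][d] nested list with the CLS token at position 0.
    """
    encoded = []
    for t, step_tokens in enumerate(tokens):
        sequence = [cls_token] + step_tokens
        # The same position vector p_t is added to every token of time step t.
        encoded.append([
            [x + p for x, p in zip(token, pos_enc[t])] for token in sequence
        ])
    return encoded


# Tiny example: T = 2 time steps, F = 3 feature tokens, d = 2.
tokens = [[[1.0, 1.0]] * 3, [[2.0, 2.0]] * 3]
encoded = add_cls_and_position(
    tokens, cls_token=[0.5, 0.5], pos_enc=[[0.0, 0.25], [1.0, 1.25]])
```

With F = 314 and d = 64 the result has the [T × 315 × 64] layout described for a single batch.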

[0201] Step S5.3: Input the complete token sequence set into the multi-head self-attention layer of the FT-Transformer algorithm, and perform full-dimensional feature interaction fusion through the global attention mechanism to obtain the feature representation set after attention fusion.

[0202] First, the sole input of step S5.3 is the complete token sequence set with position information output from step S5.2; the final output of this step is the feature representation set after attention fusion.

[0203] The second step is to configure the core structure and parameters of the FT-Transformer multi-head self-attention system. ① It adopts the standard multi-head self-attention (MHSA) structure. The core is to split the query, key and value vectors into multiple heads, perform attention calculations in parallel, and capture feature association information of different subspaces at the same time, adapting to the complex feature association characteristics of multi-dimensional and multi-process filter rods.

[0204] ② Core parameter settings: The number of attention heads $h$ is fixed at 8, the dimension per head $d_{head}$ is fixed at 64, and the total attention dimension is $d_{model} = h \times d_{head} = 512$, which is adapted to the 64-dimensional token dimension via linear projection; attention is calculated using scaled dot-product attention with a scaling factor of $\sqrt{d_{head}}$ (with $d_{head} = 64$), to avoid softmax gradient vanishing due to excessively large dot product values.

[0205] ③ Process causality mask setting: Add a process causality mask to the attention weight matrix. This allows upstream process features to influence downstream process features only, and prohibits downstream features from influencing upstream features. This fully conforms to the irreversible flow logic of the filter rod forming process, avoiding the learning of false associations. The positions in the mask matrix where interaction is prohibited are assigned a value of $-10^{9}$, which ensures that the corresponding weights approach 0 after softmax.
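One possible way to build such a process causality mask is sketched below. The text does not specify how token positions are assigned to processes or how the CLS token is treated, so everything here is an assumption: `token_process` is a hypothetical list giving the process index of each feature token (0 = most upstream), the CLS query may attend to every key, and feature-token queries may not attend to the CLS key so that aggregated downstream information cannot flow back upstream. Entries are 1 for permitted and 0 for prohibited, matching a `masked_fill(causal_mask == 0, -1e9)` style of use.

```python
def build_process_causal_mask(token_process):
    """Build an (n + 1) x (n + 1) 0/1 attention mask with CLS at position 0.

    token_process : process index of each feature token, upstream = 0
    mask[q][k] == 1 means query position q may attend to key position k.
    """
    n = len(token_process) + 1
    mask = [[0] * n for _ in range(n)]
    for q in range(n):
        for k in range(n):
            if q == 0:
                mask[q][k] = 1  # the CLS query aggregates all tokens
            elif k == 0:
                mask[q][k] = 0  # assumption: feature tokens do not read CLS
            elif token_process[k - 1] <= token_process[q - 1]:
                mask[q][k] = 1  # same or upstream process: permitted
    return mask


# Four feature tokens: one forming, two curing, one receiving.
mask = build_process_causal_mask([0, 1, 1, 3])
```

Every row keeps at least one permitted key (the token itself), so the softmax never operates on a fully masked row.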

[0206] Step 3, Attention Calculation: ① Generate query, key, and value vectors through a linear projection layer: $Q = W_Q \cdot X_{in}$, $K = W_K \cdot X_{in}$, $V = W_V \cdot X_{in}$, where $X_{in}$ is the complete set of input token sequences, $W_Q$, $W_K$, and $W_V$ are learnable linear projection weight matrices, and $Q$, $K$, and $V$ are the query, key, and value vectors, respectively, with dimensions [batch × time step × sequence length × d_model].

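[0207] ② Multi-head splitting and scaled dot-product attention calculation: $head_h = \mathrm{softmax}\left(\frac{Q_h K_h^{T}}{\sqrt{d_{head}}} + M\right) V_h$.

[0208] Where $Q_h$, $K_h$, and $V_h$ are the query, key, and value vectors corresponding to the h-th attention head, and $M$ is the process causality mask matrix (0 at permitted positions and $-10^{9}$ at prohibited positions). The softmax operation is performed along the key dimension.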

[0209] ③ Multi-head result stitching and linear projection: $X_{att} = \mathrm{Concat}(head_1, head_2, \ldots, head_h) \cdot W_O$.

[0210] Where $head_h$ is the output of the h-th attention head, $W_O$ is the output linear projection weight matrix, and $X_{att}$ is the feature representation set after attention fusion.

[0211] The fourth step is residual connection and layer normalization. A Pre-LN (pre-layer normalization) structure is adopted. Compared with post-normalization, it has stronger training stability and is more suitable for industrial small-sample scenarios. The calculation formula is as follows: $X_{out} = X_{in} + \mathrm{MHSA}(\mathrm{LN}(X_{in}))$, where LN is the layer normalization operation and the normalization dimension is the token dimension, to ensure the stability of the feature distribution and accelerate model convergence.

[0212] For example, step S5.3 can be implemented using the following pseudocode:

    import math

    import torch
    import torch.nn as nn


    class MultiHeadSelfAttention(nn.Module):
        def __init__(self, token_dim=64, head_num=8, d_model=512):
            super().__init__()
            self.head_num = head_num
            self.d_head = d_model // head_num
            self.d_model = d_model
            # Linear projection layers
            self.q_proj = nn.Linear(token_dim, d_model)
            self.k_proj = nn.Linear(token_dim, d_model)
            self.v_proj = nn.Linear(token_dim, d_model)
            self.out_proj = nn.Linear(d_model, token_dim)
            # Scaling factor sqrt(d_head)
            self.scale = math.sqrt(self.d_head)

        def forward(self, x, causal_mask=None):
            # x: [batch_size, time_step, seq_len, token_dim]
            batch_size, time_step, seq_len, _ = x.shape
            head_shape = (batch_size, time_step, seq_len, self.head_num, self.d_head)
            # Linear projection to generate Q, K, V:
            # [batch_size, time_step, head_num, seq_len, d_head]
            Q = self.q_proj(x).view(head_shape).transpose(2, 3)
            K = self.k_proj(x).view(head_shape).transpose(2, 3)
            V = self.v_proj(x).view(head_shape).transpose(2, 3)
            # Scaled dot-product attention calculation
            attn_scores = torch.matmul(Q, K.transpose(-1, -2)) / self.scale
            # Apply the process causal mask (0 = interaction prohibited)
            if causal_mask is not None:
                attn_scores = attn_scores.masked_fill(causal_mask == 0, -1e9)
            attn_weights = torch.softmax(attn_scores, dim=-1)
            # Attention-weighted summation, then merge the heads
            attn_output = torch.matmul(attn_weights, V).transpose(2, 3).contiguous()
            attn_output = attn_output.view(batch_size, time_step, seq_len, self.d_model)
            # Output projection back to the token dimension
            output = self.out_proj(attn_output)
            return output, attn_weights


    def feature_interaction_fusion(full_token_sequence, causal_mask):
        # Layers are created here only for illustration; in a trained model
        # they are persistent modules with learned weights
        ln = nn.LayerNorm(full_token_sequence.shape[-1])
        mhsa_layer = MultiHeadSelfAttention()
        # Pre-layer normalization followed by the residual connection
        ln_x = ln(full_token_sequence)
        attn_output, _ = mhsa_layer(ln_x, causal_mask)
        fused_feature_set = full_token_sequence + attn_output
        return fused_feature_set

Step S5.4: Input the attention-fused feature representation set into the feedforward neural network layer of the FT-Transformer algorithm, and generate a high-dimensional feature set after nonlinear mapping through nonlinear feature transformation and dimensionality optimization.

[0213] First, the sole input of step S5.4 is the attention-fused feature representation set output from step S5.3; the final output of step S5.4 is the high-dimensional feature set after nonlinear mapping.

[0214] Step 2: FFN layer core structure and parameter settings: ① The two-layer linear transformation structure of the FT-Transformer standard is adopted, with the addition of GELU activation function and Dropout regularization, while retaining residual connections and pre-layer normalization, which is completely consistent with the Pre-LN structure in step S5.3, ensuring the stability of model training.

[0215] ② Core parameter settings: The input dimension of the first linear layer is 64-dimensional (token dimension), and the output dimension (intermediate hidden layer dimension) is fixed at 2048-dimensional, which is 32 times the input dimension, ensuring sufficient non-linear expressive power; the input dimension of the second linear layer is 2048-dimensional, and the output dimension is restored to 64-dimensional, consistent with the input feature dimension, to adapt to the residual connection requirements; the Dropout rate is set to 0.1 to address the overfitting risk of industrial data and suppress the model's overlearning of noisy features; the GELU activation function replaces the traditional ReLU to avoid the neuron death problem and improve the model's generalization ability.

[0216] Step 3, FFN layer calculation: ① Pre-layer normalization and nonlinear transformation: $X_{ffn} = X_{in} + \mathrm{FFN}(\mathrm{LN}(X_{in}))$, where $X_{in}$ is the attention-fused feature representation set output from step S5.3, LN is the layer normalization operation, and FFN() is the feedforward network transformation function.

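[0217] ② Core transformation formula for the feedforward network: $\mathrm{FFN}(X) = \mathrm{Dropout}(W_2 \cdot \mathrm{GELU}(W_1 \cdot X + b_1) + b_2)$.

[0218] Where $X$ is the layer-normalized input $\mathrm{LN}(X_{in})$ from ① (normalization is applied once, outside FFN), $W_1$ and $b_1$ are the weights and biases of the first linear layer, $W_2$ and $b_2$ are the weights and biases of the second linear layer, and Dropout is the random deactivation regularization operation.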

[0219] Step 4, Feature Dimension Optimization Rules: ① For features that are sensitive to filter rod quality, the activation function is used to amplify their numerical differences; for irrelevant noise features, Dropout is used to randomly deactivate them and suppress their interference.

[0220] ② During the transformation process, the temporal dimension and sequence length dimension are kept unchanged. Nonlinear mapping is only performed on the feature dimension to ensure a one-to-one correspondence between the features and the original temporal sequence and feature dimension, with no information misalignment.

[0221] For example, step S5.4 can be implemented using the following pseudocode:

    import torch.nn as nn


    class FeedForwardNetwork(nn.Module):
        def __init__(self, token_dim=64, hidden_dim=2048, dropout_rate=0.1):
            super().__init__()
            self.ffn = nn.Sequential(
                nn.Linear(token_dim, hidden_dim),
                nn.GELU(),
                nn.Dropout(dropout_rate),
                nn.Linear(hidden_dim, token_dim),
                nn.Dropout(dropout_rate),
            )
            self.ln = nn.LayerNorm(token_dim)

        def forward(self, x):
            # Pre-layer normalization + residual connection
            ln_x = self.ln(x)
            ffn_output = self.ffn(ln_x)
            output = x + ffn_output
            return output


    def nonlinear_feature_transform(fused_feature_set):
        ffn_layer = FeedForwardNetwork()
        high_dim_feature_set = ffn_layer(fused_feature_set)
        return high_dim_feature_set

Step S5.5: Perform iterative processing through a multi-layer network and extract the global feature output of the classification token corresponding to the high-dimensional feature set, thereby obtaining a batch-level global fusion feature set.

[0222] Preferably, the FT-Transformer encoder stacking structure and parameter settings are as follows: ① A single encoder layer is composed of the multi-head self-attention layer in step S5.3 and the feedforward neural network layer in step S5.4 connected in series, and adopts the Pre-LN pre-normalization structure to fully preserve the residual connections.

[0223] ② Core parameter settings: The number of encoder stacking layers is fixed at 6, which is consistent with the general optimal structure of Transformer for industrial structured data, taking into account both model expressiveness and training efficiency; all encoder layers share the same structural parameters, but the weights are not shared, and feature association information of different levels is learned layer by layer. The lower-level encoder captures the direct association between features, and the higher-level encoder captures the long-term quality influence rules across processes and time series.

[0224] ③ Core parameters for model training: Supervised training is adopted, and the training label is the actual quality inspection grade of the filter rod batch (in accordance with the GB/T 22838-2009 cigarette filter rod standard); the dataset consists of valid production batch data from the filter rod forming process chain over 6 consecutive months, divided into training, validation, and test sets in a 7:2:1 ratio; the optimizer is AdamW, with weight decay set to 0.0001, initial learning rate set to 0.0001, and a cosine annealing learning rate schedule; the batch size is set to 16, the number of training epochs to 30, and the early stopping patience to 5. Training is terminated when the classification accuracy on the validation set does not improve for 5 consecutive epochs, to avoid overfitting.
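The training schedule in ③ can be illustrated with a few small helpers. This is a sketch under assumptions the text does not fix: the cosine schedule decays over the full 30 epochs to a minimum learning rate of 0, the 7:2:1 split is taken in the order the batch identifiers are given, and all function and class names are hypothetical.

```python
import math


def cosine_lr(epoch, total_epochs=30, base_lr=1e-4, min_lr=0.0):
    """Cosine-annealed learning rate for a given epoch index."""
    cosine = math.cos(math.pi * epoch / total_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + cosine)


class EarlyStopping:
    """Stop when validation accuracy has not improved for `patience` epochs."""

    def __init__(self, patience=5):
        self.patience = patience
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, val_accuracy):
        if val_accuracy > self.best:
            self.best = val_accuracy
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def split_7_2_1(batch_ids):
    """Split batch identifiers into train / validation / test at 7:2:1."""
    n = len(batch_ids)
    n_train = n * 7 // 10
    n_val = n * 2 // 10
    return (batch_ids[:n_train],
            batch_ids[n_train:n_train + n_val],
            batch_ids[n_train + n_val:])


train_ids, val_ids, test_ids = split_7_2_1(list(range(100)))
stopper = EarlyStopping(patience=5)
stop_flags = [stopper.step(acc) for acc in
              [0.80, 0.82, 0.81, 0.82, 0.80, 0.79, 0.81]]
```

In this example the accuracy peaks at the second epoch, and the stop signal is raised after five further epochs without a strict improvement.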

[0225] Preferably, the implementation process of multi-layer network iterative processing is as follows: ① Input the high-dimensional feature set output from step S5.4 into the first encoder layer to complete the first round of attention fusion and nonlinear transformation, and output the first encoder features.

[0226] ② The output features of the previous encoder are used as the input features of the next encoder. The iterative processing of the 6 encoders is completed in sequence, and finally the high-level feature set of the 6th encoder is output. The tensor dimension is completely consistent with the input features, which is [batch number × time step × 315 × 64].

[0227] ③ Extract the CLS token feature at the start position of each time step sequence from the output feature set of the 6th layer encoder. The CLS token feature has aggregated the full-dimensional feature information of the corresponding time step, with the dimension being [batch number × time step length × 64].

[0228] Preferably, for the CLS token features of all time steps in a single batch, a Global Average Pooling (GAP) operation is performed to eliminate the differences in the time-series dimension and generate a unique global fusion feature vector for each batch; the global fusion feature vectors of all batches are tensor-concatenated in batch order to finally generate a batch-level global fusion feature set with a tensor dimension of [number of valid batches × 64]. Each row vector corresponds to the unique global fusion feature of one filter rod batch and is bound to the corresponding batch UUID to ensure traceability.
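The CLS extraction and global average pooling can be sketched for a single batch as follows. The nested-list layout and the function name `pool_cls_tokens` are assumptions made for illustration.

```python
def pool_cls_tokens(encoded):
    """Extract the CLS token of every time step and average over time.

    encoded : [T][S][d] encoder output of one batch, CLS token at position 0
    Returns one global fusion feature vector of length d.
    """
    cls_per_step = [step[0] for step in encoded]  # [T][d]
    num_steps = len(cls_per_step)
    dim = len(cls_per_step[0])
    return [sum(vec[j] for vec in cls_per_step) / num_steps for j in range(dim)]


# Two time steps, two tokens per step (CLS + one feature token), d = 2.
encoded = [
    [[1.0, 2.0], [9.0, 9.0]],
    [[3.0, 4.0], [9.0, 9.0]],
]
batch_vector = pool_cls_tokens(encoded)
```

Only position 0 of each time step contributes to the pooled vector; the feature tokens influence it indirectly through the attention layers.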

[0229] Step S5.6: Input the batch-level global fusion feature set into the fully connected classification layer for quality level mapping and determination, and obtain the quality level result of each batch of filter rods.

[0230] The first step, step S5.6, takes the batch-level global fusion feature set output from S5.5 as its sole input; the final output of step S5.6 is the quality grade result for each batch of filter rods.

[0231] The second step is to define the quality grades according to the national industry standard GB/T 22838-2009 Cigarette Filter Rods, combined with the internal control standards of filter rod manufacturers. The quality grades of filter rods are divided into four fixed categories: superior grade, first grade, qualified grade, and unqualified grade. The classification task therefore has four categories, and the classification labels use one-hot encoding with an encoding dimension of 4.

[0232] Step 3: Fully connected classification layer structure and parameter settings: ① The classification layer adopts a two-layer fully connected network structure. The first layer is a feature mapping layer with an input dimension of 64 dimensions (global fusion feature dimension) and an output dimension of 32 dimensions, with an additional GELU activation function and a Dropout layer (Dropout rate of 0.1). The second layer is a classification output layer with an input dimension of 32 dimensions and an output dimension of 4 dimensions (number of quality level categories), with an additional Softmax activation function, outputting the confidence probability of each batch corresponding to the 4 quality levels.

[0233] ② The loss function adopts the standard multi-class cross-entropy loss function. To address the class imbalance problem in industrial scenarios where the proportion of defective samples is low, class weights are introduced. The weight of the defective product category is set to 3.0, and the weights of superior, first-class, and qualified products are set to 1.0, thereby improving the model's accuracy in identifying defective products. The loss function formula is as follows: $L = -\frac{1}{N} \sum_{i=1}^{N} \sum_{c=1}^{4} w_c \, y_{i,c} \log(\hat{y}_{i,c})$.

[0234] Where $N$ is the total number of batches, $w_c$ is the weight of the c-th category, $y_{i,c}$ is the one-hot encoding of the true label of the i-th batch, and $\hat{y}_{i,c}$ is the confidence probability of the corresponding category output by the model.
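The effect of the class weights can be checked with a small worked example. The sketch assumes the loss is averaged over the N batches and uses integer class labels in place of one-hot vectors; the clamp `eps` is an added assumption to avoid taking the logarithm of zero. Note that PyTorch's built-in weighted cross-entropy with mean reduction divides by the sum of the applied weights rather than by N, so its values would differ by a constant factor per mini-batch.

```python
import math

# Class order: superior, first grade, qualified, unqualified (weight 3.0).
CLASS_WEIGHTS = [1.0, 1.0, 1.0, 3.0]


def weighted_cross_entropy(probs, labels, weights=CLASS_WEIGHTS, eps=1e-12):
    """Class-weighted cross-entropy averaged over the batches.

    probs  : list of per-batch probability vectors (each sums to 1)
    labels : list of true class indices, one per batch
    """
    total = 0.0
    for prob, label in zip(probs, labels):
        total += -weights[label] * math.log(max(prob[label], eps))
    return total / len(labels)


# Same confidence (0.7) on the true class, once for a superior batch and
# once for an unqualified batch: the second term counts three times as much.
loss = weighted_cross_entropy(
    [[0.7, 0.1, 0.1, 0.1], [0.1, 0.1, 0.1, 0.7]], [0, 3])
```

Here the loss equals $(1 + 3) \cdot (-\ln 0.7) / 2$, which shows how an equally uncertain prediction is penalized more heavily when the true class is the unqualified grade.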

[0235] Step 4, Quality Grade Determination: ① During the inference phase, the batch-level global fusion feature set is input into the trained fully connected classification layer, and the confidence probability of each batch corresponding to the four quality levels is output through the Softmax activation function, with the sum of the probabilities being 1.

[0236] ② The maximum confidence level determination rule is adopted, and the category with the highest confidence level is taken as the final quality level result of the batch. At the same time, a confidence level threshold is set to 0.6. If the highest confidence level is lower than 0.6, it is marked as a batch to be re-inspected and supplemented by manual testing for confirmation.

[0237] ③ Bind a unique batch UUID, production time, machine number, raw material batch number and other full metadata to the quality grade result of each batch to ensure the traceability of the results throughout the entire chain. Finally, a standardized quality grade result set is generated, which includes four core contents: unique batch identifier, quality grade, classification confidence level and re-inspection mark.

[0238] For example, step S5.6 can be implemented using the following pseudocode:

    import torch
    import torch.nn as nn


    class QualityClassificationHead(nn.Module):
        def __init__(self, feature_dim=64, hidden_dim=32, class_num=4,
                     dropout_rate=0.1):
            super().__init__()
            self.classifier = nn.Sequential(
                nn.Linear(feature_dim, hidden_dim),
                nn.GELU(),
                nn.Dropout(dropout_rate),
                nn.Linear(hidden_dim, class_num),
                nn.Softmax(dim=-1),
            )

        def forward(self, batch_feature_set):
            # batch_feature_set: [batch_num, feature_dim]
            confidence_probs = self.classifier(batch_feature_set)
            return confidence_probs


    def quality_level_determination(batch_global_feature_set, batch_uuids,
                                    confidence_threshold=0.6):
        # Classification head; in practice the trained weights are loaded here
        classifier = QualityClassificationHead()
        classifier.eval()
        with torch.no_grad():
            confidence_probs = classifier(batch_global_feature_set)
        # Quality grade determination by maximum confidence
        max_probs, pred_labels = torch.max(confidence_probs, dim=-1)
        # Grade mapping: 0 = Superior, 1 = First, 2 = Qualified, 3 = Unqualified
        level_mapping = {0: "Superior Grade", 1: "First Grade",
                         2: "Qualified Grade", 3: "Unqualified Grade"}
        quality_result_set = []
        # batch_uuids[i] is the UUID bound to row i of the feature tensor
        for batch_idx, batch_uuid in enumerate(batch_uuids):
            max_prob = max_probs[batch_idx].item()
            pred_level = level_mapping[pred_labels[batch_idx].item()]
            recheck_flag = max_prob < confidence_threshold
            quality_result_set.append({
                "batch_uuid": batch_uuid,
                "quality_level": pred_level,
                "confidence": max_prob,
                "recheck_flag": recheck_flag,
            })
        return quality_result_set

Beneficial effects of steps S5.1 to S5.6: This module's sub-steps are adapted to the full-process quality control needs of the filter rod forming process chain, including forming, curing, transmitting, and receiving. It compensates for the shortcomings of existing production management and monitoring systems, which can only achieve data recording and visualization but cannot achieve deep integration of full-process features and accurate quality judgment.
It breaks through the limitations of traditional quality judgment methods in terms of insufficient adaptability to cross-process related data, realizes global interaction and deep integration of all process-related features, fully retains the feature correlation information between each dispersed link, adapts to the time delay characteristics between processes, and completes accurate judgment of filter rod batch quality level. It avoids the problem that traditional quality judgment methods rely only on single-process data and cannot cover the quality influencing factors of the entire process chain. It provides accurate batch quality judgment results for the quality control of the filter rod production process, and at the same time provides core quality judgment basis for root cause analysis of quality problems and full-process traceability. It fills the gap in the existing system's lack of systematic intelligent judgment capability for the entire filter rod process chain quality, and adapts to the core needs of closed-loop quality control of the entire filter rod forming process chain.

[0239] Step S6: Bind the quality grade results of all batches of filter rods with the operating parameters of the corresponding process links to obtain the full-process quality monitoring data of the filter rod forming process chain.

[0240] Furthermore, step S6 specifically includes the following steps: Step S6.1: Extract the batch unique identifier and standardize the format of the quality grade results for each batch of filter rods to generate a quality grade result set with a unique batch identifier.

[0241] Preferably, the batch unique identifier in UUID format, which is consistent throughout the entire process from step S1 to step S5, is used as the core primary key. This primary key is a unique identification code for a single batch of filter rods throughout its entire life cycle, without duplication or tampering, ensuring the uniqueness of the entire process data link. At the same time, auxiliary traceability identifiers are extracted, including six auxiliary joint primary keys: batch production serial number, forming machine code, transmitter code, winding machine code, tow raw material batch number, and production team code. These serve as redundant backups of the core primary key to avoid traceability failures caused by primary key anomalies.

[0242] Preferably, batch records missing the core primary key UUID are removed outright; records whose quality level falls outside the fixed enumeration range are marked as invalid; confidence values outside the [0, 1] interval are clamped to the nearest boundary value; and for records with duplicate batch UUIDs, only the valid record with the highest classification confidence is retained and all other duplicates are removed, ensuring that each batch UUID corresponds to exactly one quality level record.
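The four cleaning rules of step S6.1 can be illustrated with a minimal Python sketch. The record layout (dicts with `batch_uuid`, `quality_level`, `confidence`), the function name `clean_quality_results`, and the `valid` flag are assumptions made for illustration. The text does not say what happens when every duplicate of a UUID is invalid, so the sketch assumes that a valid record is preferred over an invalid one and that confidence breaks ties.

```python
VALID_LEVELS = {"Superior Grade", "First Grade", "Qualified Grade", "Unqualified Grade"}


def clean_quality_results(records):
    """Apply the S6.1 cleaning rules to a list of quality result dicts."""
    best_by_uuid = {}
    for rec in records:
        uuid = rec.get("batch_uuid")
        if not uuid:
            continue  # rule 1: drop records without the core primary key
        rec = dict(rec)  # copy so the caller's data is left untouched
        # rule 2: mark levels outside the fixed enumeration as invalid
        rec["valid"] = rec.get("quality_level") in VALID_LEVELS
        # rule 3: clamp confidence to the [0, 1] interval
        rec["confidence"] = min(1.0, max(0.0, float(rec.get("confidence", 0.0))))
        # rule 4: keep one record per UUID (assumption: valid first, then confidence)
        current = best_by_uuid.get(uuid)
        rank = (rec["valid"], rec["confidence"])
        if current is None or rank > (current["valid"], current["confidence"]):
            best_by_uuid[uuid] = rec
    return list(best_by_uuid.values())
```

Keeping invalid records in the output with a flag, rather than deleting them, follows the wording "marked as invalid"; a downstream step could filter on `valid` if only usable grades are wanted.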

[0243] Step S6.2: Perform precise batch identification matching between the quality grade result set and the standardized multivariate time series dataset to obtain a batch-to-batch matching dataset of quality grade and process parameters.

[0244] Preferably, the core matching rules are as follows: ① An inner join exact matching mode is adopted, which only retains batches that exist in both the quality grade result set and the standardized multivariate time series dataset, ensuring that each valid quality result has corresponding full process data and each valid process data batch has corresponding quality grade result, thus eliminating one-sided data loss.

[0245] ② The sole matching key is the batch UUID, and the UUID strings must be completely identical. Fuzzy matching and fault-tolerant matching are not allowed. A matching accuracy of 100% is required, which eliminates at the source any traceability distortion caused by batch mismatches.

[0246] ③ The matching dimension is batch-level full matching. After a single batch is successfully matched, the full quality level field of the batch will be bound to each time-series sampling record of the corresponding batch, ensuring that each process sampling point can be directly associated with the final quality result of the batch.

[0247] Preferably, the matching implementation process is as follows: ① Hash index construction: For both the quality grade result set and the standardized multivariate time series dataset, an in-memory hash index is constructed with the batch UUID as the unique key. This reduces the matching time complexity from O(n²) to O(n) and meets the efficiency requirements of industrial scenarios with more than 100,000 batches.

[0248] ② One-to-one exact match: Based on the hash index, iterate through all valid batches in the quality grade result set, and retrieve the batch data with the corresponding UUID in the index of the standardized multivariate time series dataset. If the retrieval is successful, it is considered a successful match; otherwise, it is considered a failed match, and the batch record is directly removed.

[0249] ③ Dimension merging: For batches that are successfully matched, the 14 standardized quality grade fields of the batch are added as new feature dimensions and merged into the standardized multivariate time series dataset of the corresponding batch. The merged single sampling record contains the original full process parameters, time series stamps, process identifiers, and full quality grade information without any information loss.
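The three matching steps above (hash index construction, exact one-to-one matching, dimension merging) can be sketched in Python as follows. This is an illustration under stated assumptions: Python dicts stand in for the in-memory hash indexes, the function name `match_and_merge` and the record keys are hypothetical, and quality field names are assumed not to collide with process parameter names.

```python
def match_and_merge(quality_results, process_records):
    """Inner-join quality results with time-series samples on exact batch UUID."""
    # Hash index 1: batch UUID -> quality record (one per UUID after S6.1).
    quality_index = {q["batch_uuid"]: q for q in quality_results}
    # Hash index 2: batch UUID -> list of time-series sampling records.
    process_index = {}
    for sample in process_records:
        process_index.setdefault(sample["batch_uuid"], []).append(sample)

    matched = []
    for uuid, quality in quality_index.items():
        samples = process_index.get(uuid)  # exact string equality, no fuzzy match
        if samples is None:
            continue  # failed match: the batch record is removed
        quality_fields = {k: v for k, v in quality.items() if k != "batch_uuid"}
        for sample in samples:
            # Dimension merging: every sampling record gains the quality fields.
            matched.append({**sample, **quality_fields})
    return matched
```

Because the loop runs over the quality index and looks up the process index, batches present on only one side drop out, which gives the inner-join behavior described in the matching rules. Each lookup is a constant-time dict access on average, so the total work grows linearly with the number of records.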

[0250] Step S6.3: Perform attribute association binding on the process sequence dimension of the quality grade and process parameter matching dataset to obtain a batch quality association dataset with full process sequence link attributes.

[0251] Preferably, the process sequence link attributes correspond one-to-one with the 4-digit standardized process code defined in step S1.5, covering 4 core process links. The link attributes of each process are fixed at 9 items, specifically including: unique process code, process name, process production start timestamp, process production end timestamp, process machine number, number of valid sampling points in the process, number of abnormal points in the process, process abnormality ratio, and time interval between upstream and downstream processes, ensuring that the attributes are completely matched with the actual process production.

[0252] Preferably, the association binding implementation process is as follows: ① Process dimension splitting and attribute aggregation: Based on the 4-digit process code mask, the single-batch matching dataset is split by process, and the full data of the four processes are aggregated separately: tow opening and forming (11XX codes), filter rod curing and storage (12XX codes), filter rod pneumatic emission (13XX codes), and winding machine receiving and forming (14XX codes). The nine link attributes of each process are then computed. The time interval between upstream and downstream processes is the difference between the production end timestamp of the upstream process and the production start timestamp of the downstream process, in milliseconds.

[0253] ② Construction of temporal link relationships: Strictly following the irreversible flow logic of filter rod production, each process is given a unique upstream process code and a unique downstream process code to construct a parent-child relationship: the forming stage is the starting point of the link and has only a downstream process, the curing stage; the curing stage has the forming stage upstream and the emission stage downstream; the emission stage has the curing stage upstream and the receiving stage downstream; the receiving stage is the end point of the link and has only an upstream process, the emission stage. This forms a complete closed-loop process temporal link chain with no reverse associations and no cross-level associations.

[0254] ③ Full-link attribute binding: The nine link attributes and upstream and downstream association identifiers of each process are added as new dimensions and bound to all sampling records of the corresponding process; at the same time, a full-process link overview attribute is added to a single batch dataset, including the total production time of the whole process, the total number of anomalies in the whole process, and the total percentage of anomalies in the whole process, to ensure that each sampling record has complete process attribution, upstream and downstream links, and full-process overview attributes, which can directly locate the process link and upstream and downstream related links without secondary retrieval.
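As an illustration of the splitting, aggregation, and link construction described above, the following Python sketch computes the per-process link attributes for one batch. The record keys (`process_code`, `ts_ms`, `machine_no`, `is_anomaly`) and the function name `build_process_links` are assumptions, every sample is assumed to count as a valid sampling point, and the interval is taken as downstream start minus upstream end so that it is positive for a normal flow.

```python
PROCESS_CHAIN = ["11", "12", "13", "14"]  # forming, curing, emission, receiving
PROCESS_NAMES = {"11": "forming", "12": "curing", "13": "emission", "14": "receiving"}


def build_process_links(batch_samples):
    """Aggregate per-process link attributes for one batch of sampling records."""
    grouped = {}
    for s in batch_samples:
        # The first two digits of the 4-digit code identify the process (11XX..14XX).
        grouped.setdefault(s["process_code"][:2], []).append(s)

    links = {}
    for i, prefix in enumerate(PROCESS_CHAIN):
        samples = grouped.get(prefix, [])
        if not samples:
            continue
        anomalies = sum(1 for s in samples if s["is_anomaly"])
        links[prefix] = {
            "process_code": samples[0]["process_code"],
            "process_name": PROCESS_NAMES[prefix],
            "start_ts": min(s["ts_ms"] for s in samples),
            "end_ts": max(s["ts_ms"] for s in samples),
            "machine_no": samples[0]["machine_no"],
            "sample_count": len(samples),
            "anomaly_count": anomalies,
            "anomaly_ratio": anomalies / len(samples),
            # Strictly linear chain: one upstream and one downstream neighbor at most.
            "upstream": PROCESS_CHAIN[i - 1] if i > 0 else None,
            "downstream": PROCESS_CHAIN[i + 1] if i < len(PROCESS_CHAIN) - 1 else None,
        }
    for attrs in links.values():
        up = links.get(attrs["upstream"])
        # Flow interval in milliseconds; None for the first stage or a missing upstream.
        attrs["interval_from_upstream_ms"] = attrs["start_ts"] - up["end_ts"] if up else None
    return links
```

The returned attributes could then be copied onto each sampling record of the corresponding process, as item ③ describes, together with batch-level totals such as overall production time and overall anomaly count.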

[0255] Step S6.4: Merge all batch dimensions and unify data dimensions of the batch quality correlation dataset to obtain a quality and process correlation dataset covering all batches and processes.

[0256] Preferably, the data dimension unification processing flow is as follows: ① Establish a unified dataset field whitelist, fixing the name, data type, length, and order of all fields. The whitelist includes four categories: S1 original process parameter field, S2 anomaly flag field, S6.1 quality grade field, and S6.3 process link attribute field, totaling 86 fixed fields with no omissions and no non-standard extensions.

[0257] ② Traverse all single-batch datasets and align their dimensions against the field whitelist: whitelist fields that are missing from a dataset are filled with compliant default values according to their data type (0 for numeric types, an empty string for string types, and 0 for binary types), and all non-standard fields outside the whitelist are removed. This ensures that the number, order, and data types of the fields are completely consistent across all single-batch datasets, with a dimension error of 0.
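The whitelist alignment can be sketched in Python as below. The six fields shown are a hypothetical excerpt, not the actual 86-field whitelist, and casting present values to the whitelist type is an assumption about how the fixed data types would be enforced.

```python
# Hypothetical excerpt of the whitelist: (field name, type) in fixed order.
FIELD_WHITELIST = [
    ("batch_uuid", str),
    ("process_code", str),
    ("ts_ms", int),
    ("rod_diameter", float),
    ("is_anomaly", int),
    ("quality_level", str),
]
# Default fill values by data type, as described in the text.
DEFAULTS = {str: "", int: 0, float: 0.0}


def align_to_whitelist(record):
    """Return a record holding exactly the whitelist fields, in whitelist order."""
    aligned = {}
    for name, ftype in FIELD_WHITELIST:
        value = record.get(name)
        # Missing fields get the type default; present values are cast to the fixed type.
        aligned[name] = DEFAULTS[ftype] if value is None else ftype(value)
    return aligned  # non-whitelist fields are dropped because they are never copied
```

Python dicts preserve insertion order, so iterating the whitelist in order also fixes the field order of every aligned record.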

[0258] Preferably, the full batch merging and index building process is as follows: ① According to the order of batch production start time, the single batch datasets after unifying all dimensions are vertically merged to form a complete full batch dataset. The merged dataset uses "batch UUID + millisecond-level timestamp" as the joint unique primary key to ensure that each sampled record is globally unique, without duplication or conflict.

[0259] ② Construct a three-level distributed index for the merged full batch dataset: the first-level index is the batch UUID, the second-level index is the process code, and the third-level index is the timestamp. The index covers all high-frequency traceability query dimensions, controlling the response time of single batch full-link traceability within 1 second, meeting the needs of rapid traceability in industrial sites.
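The merging and indexing steps can be illustrated with a small in-memory Python sketch. A nested dict stands in for the three-level index here; the distributed index and the 1-second response target of an actual deployment are outside the scope of this example. The function names and record keys are assumptions, and each single-batch dataset is assumed to be non-empty.

```python
def merge_batches(batch_datasets):
    """Vertically merge single-batch datasets ordered by batch production start time."""
    ordered = sorted(batch_datasets, key=lambda ds: min(r["ts_ms"] for r in ds))
    merged, seen = [], set()
    for dataset in ordered:
        for rec in dataset:
            key = (rec["batch_uuid"], rec["ts_ms"])  # joint unique primary key
            if key in seen:
                raise ValueError(f"duplicate primary key: {key}")
            seen.add(key)
            merged.append(rec)
    return merged


def build_index(merged):
    """Nested index: batch UUID -> process code -> timestamp -> record."""
    index = {}
    for rec in merged:
        by_process = index.setdefault(rec["batch_uuid"], {})
        by_process.setdefault(rec["process_code"], {})[rec["ts_ms"]] = rec
    return index
```

With such an index, all records of one process of one batch are reachable with two key lookups, for example `index[batch_uuid][process_code]`.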

[0260] Step S6.5: Standardize the storage structure of the quality and process-related dataset and complete the full-link metadata to obtain the full-process quality monitoring data of the filter rod forming process chain.

[0261] Preferably, the standardized storage structure mapping follows the general storage specifications for industrial time-series databases, is compatible with mainstream industrial time-series databases such as InfluxDB and TimescaleDB, and adopts a standardized storage structure of "Measurement + Tag + Field + Timestamp" to achieve high-throughput writes and millisecond-level query response.

[0262] Preferably, the field classification mapping rules are fixed as follows: ① Tag fields: dimension fields used for indexed queries, including batch UUID, process code, process name, machine number, raw material batch number, quality grade, and production team. An inverted index is built on every tag field to improve traceability query efficiency.

[0263] ② Numerical Fields: Indicator fields used for numerical analysis, including all process operation parameters, quality inspection data, anomaly scores, anomaly status markers, classification confidence levels, process anomaly percentages, and flow intervals. All numerical fields are uniformly set to floating-point or binary integer types to ensure compatibility of numerical calculations.

[0264] ③ Timestamp: A globally unique time key, uniformly using millisecond-level Unix timestamps, completely consistent with the time base of the entire process from step S1 to step S5, with no time offset.
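The field classification above can be illustrated by splitting one merged record into the "Measurement + Tag + Field + Timestamp" structure. The sketch below produces a plain Python dict and does not call any database client; the measurement name `filter_rod_quality`, the key names, and the function `to_point` are hypothetical, and all non-tag values are assumed to be numeric or boolean.

```python
TAG_KEYS = ("batch_uuid", "process_code", "process_name", "machine_no",
            "raw_material_batch", "quality_level", "production_team")


def to_point(record, measurement="filter_rod_quality"):
    """Split one merged record into measurement, tags, fields and timestamp."""
    # Tags: dimension values used for indexed queries, stored as strings.
    tags = {k: str(record[k]) for k in TAG_KEYS if k in record}
    fields = {}
    for key, value in record.items():
        if key in TAG_KEYS or key == "ts_ms":
            continue
        # Binary markers and counts stay integers; other metrics become floats.
        fields[key] = int(value) if isinstance(value, (bool, int)) else float(value)
    return {"measurement": measurement, "tags": tags,
            "fields": fields, "time_ms": int(record["ts_ms"])}
```

A point in this shape would still need to be translated into the write format of the chosen database, with the timestamp precision set to milliseconds on the write path.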

[0265] Preferably, the completion of full-link metadata follows the "Industrial Data Governance Metadata Management Specification" and the data security management requirements of the tobacco industry, and completes the metadata of the entire data lifecycle, specifically including: data collection source and full-process link record of data processing.

[0266] Beneficial effects of steps S6.1 to S6.5: The sub-steps of this module are adapted to the physically dispersed and time-delayed characteristics of the filter rod forming process chain, which comprises forming, curing, emission, and receiving. They compensate for the shortcomings of existing production management and monitoring systems, such as broken data links and the inability to support accurate traceability across the entire process. They achieve full-link association and binding between batch quality results and the corresponding process operation data, eliminating information silos between processes and constructing a complete closed-loop data system for quality control throughout the filter rod forming process chain. This meets the core needs of forward tracking and reverse tracing throughout filter rod production and provides complete full-link data support for locating and analyzing the root causes of quality problems. It addresses the difficult traceability and long processing cycles of filter rod quality issues in existing technologies, and fills the gap left by existing production management systems, which lack full-link closed-loop integration of quality and process data for the filter rod process chain. It provides a complete, standardized, and reusable closed-loop data foundation for quality optimization, anomaly control, and full-process traceability management in filter rod production, meeting the practical needs of intelligent quality control in the filter rod forming process chain.

[0267] Overall, the beneficial effects of steps S1 to S6 are as follows: By establishing a data link across the entire filter rod forming process chain, information silos between physically dispersed processes such as forming, curing, emission, and receiving are eliminated. This enables standardized integration of operational and quality data across all processes, adapts to the time delays between stages, and achieves precise correlation and matching of cross-process time-series data. It overcomes the limitation of existing production management and monitoring systems, which can only record and visualize data, by enabling in-depth analysis and feature mining of the full-process data. A correlation mapping system for the full-process data is constructed to support accurate determination of the quality status of the entire filter rod production process. This achieves full-link binding of quality data and process parameters, provides complete data support for cross-process, full-process traceability of filter rod quality issues, shortens the cycle for locating the root causes of quality problems, and realizes closed-loop control of the quality status of the entire filter rod forming process chain. It thereby provides a complete data foundation and analytical basis for quality optimization and anomaly control in the production process.

[0268] As shown in Figure 2, this embodiment provides an example of a quality monitoring device for the filter rod forming process chain. In this embodiment, the quality monitoring device is applied to the quality monitoring method described in the above embodiment.

[0269] Specifically, the quality monitoring device includes a process data acquisition module 1, a process data anomaly marking module 2, a deep representation feature extraction module 3, a cross-process data association and matching module 4, a data quality level determination module 5, and a quality level result binding module 6, which are connected in sequence electrically or through communication.

[0270] The system comprises the following modules: Process data acquisition module 1 collects real-time operating parameters and quality inspection data for each process step in the filter rod forming process chain and organizes them into a standardized multivariate time-series dataset; Process data anomaly marking module 2 identifies data anomalies in the standardized multivariate time-series dataset using a multivariate time-series correlation anomaly detection algorithm and marks the anomalies, resulting in a multivariate time-series dataset with marked anomalies; Deep representation feature extraction module 3 extracts deep representation features from the multivariate time-series dataset using a time-series deep feature extraction algorithm, resulting in a deep time-series feature dataset; Cross-process data association matching module 4 performs cross-process time-series correlation matching on the deep time-series feature dataset using a time-series sequence alignment matching algorithm, resulting in a full-process correlated dataset; Data quality level determination module 5 performs full-process feature fusion and quality level determination on the full-process correlated dataset using a full-dimensional feature fusion classification algorithm, resulting in the quality level result for each batch of filter rods; and Quality level result binding module 6 binds the quality level results of all batches of filter rods with the operating parameters of the corresponding process steps, resulting in full-process quality monitoring data for the filter rod forming process chain. It should be noted that this embodiment is a functional module embodiment based on the above method embodiment. For additional content such as extensions, optimizations, limitations, examples, principle explanations, and beneficial effects of this embodiment, please refer to the above embodiments. This embodiment will not repeat them here.

[0271] Figure 3 is a schematic diagram of the structure of an electronic device according to an embodiment of this application. As shown in Figure 3, the electronic device 7 includes a processor 71 and a memory 72 coupled to the processor 71.

[0272] The memory 72 stores program instructions for implementing the quality monitoring method of the filter rod forming process chain of any of the above embodiments.

[0273] The processor 71 is used to execute program instructions stored in the memory 72 for quality monitoring of the filter rod forming process chain.

[0274] The processor 71 can also be referred to as a CPU (Central Processing Unit). The processor 71 may be an integrated circuit chip with signal processing capabilities. The processor 71 can also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components. A general-purpose processor can be a microprocessor or any conventional processor.

[0275] Furthermore, Figure 4 is a schematic diagram of the structure of a storage medium according to an embodiment of this application. As shown in Figure 4, the storage medium 8 in this embodiment stores program instructions 81 capable of implementing all the above methods. These program instructions 81 can be stored in the storage medium as a software product, including several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) or processor to execute all or part of the steps of the methods in each embodiment of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks, or terminal devices such as computers, servers, mobile phones, and tablets.

[0276] In the several embodiments provided in this application, it should be understood that the disclosed apparatus, devices, and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another device, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between devices or units may be electrical, mechanical, signal, or other forms.

[0277] Furthermore, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated units described above can be implemented in hardware or as software functional units. The above are merely embodiments of this application and do not limit the patent scope of this application. Any equivalent structural or procedural transformations made based on the description and drawings of this application, or direct or indirect applications in other related technical fields, are similarly included within the patent protection scope of this application.

Claims

1. A quality monitoring method for a filter rod forming process chain, characterized in that the quality monitoring method includes: Step S1: Collect real-time operating parameters and quality inspection data of each process step in the filter rod forming process chain, and organize them into a standardized multivariate time-series dataset; Step S2: Identify data anomalies in the standardized multivariate time-series dataset using a multivariate time-series correlation anomaly detection algorithm and mark the anomalies to obtain a multivariate time-series dataset with marked anomalies; Step S3: Obtain the deep representation features of the multivariate time-series dataset through a time-series deep feature extraction algorithm to obtain a deep time-series feature dataset; Step S4: Perform cross-process time-series association matching on the deep time-series feature dataset using a time-series alignment and matching algorithm to obtain a full-process associated dataset; Step S5: Perform full-process feature fusion and quality level determination on the full-process associated dataset using a full-dimensional feature fusion classification algorithm to obtain the quality level result of each batch of filter rods; Step S6: Bind the quality level results of all batches of filter rods with the operating parameters of the corresponding process steps to obtain the full-process quality monitoring data of the filter rod forming process chain.

2. The quality monitoring method according to claim 1, characterized in that, Step S1: Collect real-time operating parameters and quality inspection data for each process step in the filter rod forming process chain, and organize them into a standardized multivariate time-series dataset, including: Step S1.1: Traverse all process steps in the filter rod forming process chain, extract the process parameter collection items and quality inspection items corresponding to each process step, and obtain a list of data collection items for the entire process. Step S1.2: Based on the full process data collection item list, synchronously collect the real-time operating parameters and quality inspection data of the corresponding nodes of each process link to obtain the original dataset of the full process; Step S1.3: Clean the original dataset of the entire process to obtain the cleaned original dataset of the entire process; Step S1.4: Perform dimensional unification and numerical normalization on the original dataset of the entire process after cleaning to obtain a normalized dataset of the entire process. Step S1.5: Add a unified time sequence stamp and process identifier to the standardized full-process dataset to obtain a full-process dataset with time sequence identifier; Step S1.6: Perform dimensional integration and format unification on the full process dataset with time sequence labels to obtain a standardized multivariate time series dataset.

3. The quality monitoring method according to claim 1, characterized in that, Step S2: Identify data anomalies in the standardized multivariate time-series dataset using a multivariate temporal correlation anomaly detection algorithm and complete the anomaly state labeling to obtain a multivariate time-series dataset with labeled anomaly states, including: Step S2.1: Divide the standardized multivariate time series dataset into a sliding window with a fixed step size to obtain a multivariate time series variable matrix; Step S2.2: Construct graph structure data of the correlation relationships between the corresponding variables based on the multivariate time series variable matrix; Step S2.3: Aggregate the temporal information of the neighborhood variables of the graph structure data through the graph attention mechanism of the MTAD-GAT algorithm to obtain the temporal feature matrix after variable dependency fusion; Step S2.4: Encode and reconstruct the temporal dimension of the temporal feature matrix to obtain the temporal reconstruction residual dataset; Step S2.5: Calculate the anomaly score for each time series sampling point based on the time series reconstruction residual dataset, and generate a time series dataset with anomaly scores; Step S2.6: Adaptive outlier determination based on extreme value theory, identifying data outliers from the time series dataset with outlier scores, and generating a time series dataset with outlier labels; Step S2.7: Align the time series dataset with outlier markers with the standardized multivariate time series dataset in terms of dimensions and merge the attributes to obtain a multivariate time series dataset with outlier markers.

4. The quality monitoring method according to claim 1, characterized in that, Step S3: Obtain the deep representation features of the multivariate time-series dataset using a time-series deep feature extraction algorithm to obtain a deep time-series feature dataset, including: Step S3.1: Perform dual-view enhancement processing on the multivariate time-series dataset with labeled abnormal states to obtain a time-series dataset with a pair of original views and context views. Step S3.2: Input the time series dataset into the temporal coding network of the TS2Vec algorithm, and perform feature mapping through hierarchical causal convolution to obtain the hierarchical temporal feature set corresponding to the dual views; Step S3.3: Perform cross-view comparative learning optimization on the hierarchical temporal feature set to obtain a dual-view temporal feature set after feature space alignment; Step S3.4: Perform multi-scale hierarchical feature aggregation on the dual-view temporal feature set to obtain a unified representation feature set with full temporal coverage; Step S3.5: Map and match the unified representation feature set with the original temporal dimension to obtain the deep temporal feature dataset.

5. The quality monitoring method according to claim 1, characterized in that, Step S4: Perform cross-process time series correlation matching on the deep time series feature dataset using a time series alignment and matching algorithm to obtain a full-process correlation dataset, including: Step S4.1: The deep time-series feature dataset is split according to the process nodes of the filter rod forming process chain to obtain an independent time-series feature sequence subset for each process step; Step S4.2: Perform time sequence constraint matching between process steps on the subset of time-series feature sequences to obtain a set of constrained feature sequence pairs; Step S4.3: Calculate the normalization path and alignment cost matrix of the sequence pairs in the constrained feature sequence pair set using the Soft-DTW algorithm to obtain the optimal alignment path of the feature sequence pair set. Step S4.4: According to the optimal alignment path, perform cross-process feature sequence temporal dimension alignment mapping on the feature sequence pair set to obtain the feature sequence set after full-process temporal alignment; Step S4.5: After aligning the feature sequence set of the entire process time sequence, perform dimensional splicing and association binding according to the batch identifier to obtain the batch-level full-process associated feature set; Step S4.6: Perform dimensional matching and merging of the batch-level full-process associated feature set with the original data attributes to obtain the full-process associated dataset.

6. The quality monitoring method according to claim 1, characterized in that, Step S5: Perform full-process feature fusion and quality level determination on the full-process associated dataset using a full-dimensional feature fusion classification algorithm to obtain the quality level results for each batch of filter rods, including: Step S5.1: Generate an independent feature token vector based on each feature dimension in the full-process associated dataset to obtain a tokenized feature sequence set; Step S5.2: Add classification tokens and temporal location codes to the tokenized feature sequence set to obtain a complete token sequence set with location information; Step S5.3: Input the complete token sequence set into the multi-head self-attention layer of the FT-Transformer algorithm, and perform full-dimensional feature interaction fusion through the global attention mechanism to obtain the feature representation set after attention fusion; Step S5.4: Input the feature representation set into the feedforward neural network layer of the FT-Transformer algorithm, and generate a high-dimensional feature set after nonlinear mapping through nonlinear feature transformation and dimensionality optimization; Step S5.5: Through multi-layer network iterative processing, the global feature output of the classification token corresponding to the high-dimensional feature set is extracted to obtain the batch-level global fusion feature set; Step S5.6: Input the batch-level global fusion feature set into the fully connected classification layer for quality level mapping and determination, and obtain the quality level result of each batch of filter rods.

7. The quality monitoring method according to claim 1, characterized in that Step S6: Bind the quality level results of all batches of filter rods with the operating parameters of the corresponding process steps to obtain the full-process quality monitoring data of the filter rod forming process chain, including: Step S6.1: Extract the batch unique identifier and standardize the format of the quality level results for each batch of filter rods to generate a quality level result set with a unique batch identifier; Step S6.2: Perform precise batch identification matching between the quality level result set and the standardized multivariate time-series dataset to obtain a batch-to-batch matching dataset of quality level and process parameters; Step S6.3: Perform attribute association binding on the process sequence dimension of the quality level and process parameter matching dataset to obtain a batch quality association dataset with full process sequence link attributes; Step S6.4: Merge all batch dimensions and unify the data dimensions of the batch quality association dataset to obtain a quality and process association dataset covering all batches and processes; Step S6.5: Standardize the storage structure and complete the full-link metadata of the quality and process association dataset to obtain the full-process quality monitoring data of the filter rod forming process chain.

8. A quality monitoring device for a filter rod forming process chain, wherein the quality monitoring device is used to carry out the quality monitoring method according to any one of claims 1 to 7, characterized in that the quality monitoring device includes: a process data acquisition module, used to collect real-time operating parameters and quality inspection data of each process step in the filter rod forming process chain and organize them into a standardized multivariate time-series dataset; a process data anomaly marking module, used to identify data anomaly points in the standardized multivariate time-series dataset and mark their anomaly status using a multivariate time-series correlation anomaly detection algorithm, thereby obtaining a multivariate time-series dataset with marked anomaly status; a deep representation feature extraction module, used to obtain the deep representation features of the multivariate time-series dataset through a time-series deep feature extraction algorithm, thereby obtaining a deep time-series feature dataset; a cross-process data association and matching module, used to perform cross-process time-series association matching on the deep time-series feature dataset through a time-series alignment and matching algorithm to obtain a full-process associated dataset; a data quality level determination module, used to perform full-process feature fusion and quality level determination on the full-process associated dataset through a full-dimensional feature fusion classification algorithm to obtain the quality level result for each batch of filter rods; and a quality grade result binding module, used to bind the quality grade results of all batches of filter rods with the operating parameters of the corresponding process steps to obtain the full-process quality monitoring data of the filter rod forming process chain.
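The six modules of the device can be viewed as stages that read from and write to a shared context, which lets the final binding module access both the quality results and the earlier standardized dataset. The sketch below only shows this wiring: `run_device`, `stub`, and the context keys are assumptions, and each stage is a placeholder rather than the algorithm named in the claim.

```python
from typing import Any, Callable, Dict, List, Tuple

Context = Dict[str, Any]
Module = Tuple[str, Callable[[Context], Context]]


def stub(output_key: str) -> Callable[[Context], Context]:
    """Placeholder module: records that its output would be produced."""
    def module(context: Context) -> Context:
        updated = dict(context)
        updated[output_key] = f"<{output_key} placeholder>"
        return updated
    return module


# Module order follows the claim; the context keys are hypothetical names.
MODULES: List[Module] = [
    ("process data acquisition", stub("standardized_dataset")),
    ("process data anomaly marking", stub("anomaly_marked_dataset")),
    ("deep representation feature extraction", stub("deep_feature_dataset")),
    ("cross-process data association and matching", stub("associated_dataset")),
    ("data quality level determination", stub("quality_results")),
    ("quality grade result binding", stub("monitoring_data")),
]


def run_device(modules: List[Module], raw_data: Any) -> Context:
    """Run each module in order, keeping every intermediate dataset in the context."""
    context: Context = {"raw": raw_data, "trace": []}
    for name, module in modules:
        context = module(context)
        context["trace"].append(name)
    return context


result = run_device(MODULES, raw_data=[])
```

Keeping intermediate datasets in the context, rather than passing only the previous stage's output forward, is one possible design; it mirrors the way the binding module needs data from the acquisition stage as well as from the determination stage.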

9. An electronic device, characterized in that the electronic device includes a processor and a memory coupled to the processor, the memory storing program instructions executable by the processor; when the processor executes the program instructions stored in the memory, it implements the quality monitoring method according to any one of claims 1 to 7.

10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores program instructions that, when executed by a processor, implement the quality monitoring method according to any one of claims 1 to 7.