EMMC state inference method and system based on FTL mapping fluctuation
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- 广东全芯半导体有限公司
- Filing Date
- 2026-02-25
- Publication Date
- 2026-06-19
AI Technical Summary
Traditional eMMC state inference methods are sensitive to differences in load and controller implementation, as well as to vendor-specific firmware policies. This results in poor portability, frequent confusion and misjudgment of states, and unstable confidence levels, so they cannot provide a reliable basis for fault prevention and cannot guarantee the operational stability of the built-in storage chip.
Host-side I/O observation data and active perturbation probe event data are jointly processed in time and address space, and probe spatiotemporal annotation is used to align delay responses to probe events and construct mapping fluctuation fingerprints. State inference then combines blind source separation with a parameterized FTL digital twin model to identify internal working states such as garbage collection and wear leveling, improving the robustness and interpretability of state inference.
Without relying on vendor-specific health interfaces, it achieves stable and repeatable quantitative characterization of eMMC status, reduces false positives and false negatives, provides reliable early warning and fault prevention, and improves the operational stability of built-in memory chips.
Figure CN122240416A_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of state analysis technology, and in particular to an eMMC state inference method and system based on FTL mapping fluctuations.
Background Technology
[0002] In traditional technologies, eMMC state inference typically collects JEDEC eMMC standard command/response and error information (such as busy duration, timeout/CRC errors, retries, and R1/R1b status bits), combines it with statistical characteristics of read/write latency and throughput (long-tail latency, periodic jitter, write blocking windows, etc.), and applies threshold rules to output inferences such as "ready / busy / programming or erasing / abnormal risk or nearing end of endurance." However, these approaches are sensitive to differences in load and controller implementation, as well as to vendor-specific firmware policies, which results in poor portability, susceptibility to confusion and misjudgment, and unstable confidence levels. They therefore cannot provide a reliable basis for fault prevention or guarantee the operational stability of the built-in memory chip.
Summary of the Invention
[0003] In view of the above technical problems, it is necessary to provide an eMMC state inference method and system based on FTL mapping fluctuations that can effectively improve the operational stability of built-in memory chips.
[0004] Firstly, this application provides an eMMC state inference method based on FTL mapping fluctuations, including: performing joint spatiotemporal processing on host-side I/O observation data and active perturbation probe event data corresponding to the built-in storage chip to obtain delay base sequence data and probe event spatiotemporal annotation data; performing mapping fluctuation fingerprint analysis on the delay base sequence data based on the probe event spatiotemporal annotation data to obtain mapping fluctuation fingerprint (MVF) data; performing blind source separation on the MVF data to obtain common-mode performance factor data and mapping fluctuation factor data; performing parameter inversion on the mapping fluctuation factor data based on a parameterized FTL digital twin model to obtain mechanism parameter vector data; and performing state-space inference analysis on the internal working state of the built-in memory chip based on the common-mode performance factor data and the mechanism parameter vector data to obtain chip working state data.
[0005] Secondly, this application also provides an eMMC state inference system based on FTL mapping fluctuations, the system including a visualization device and a computer device. The computer device is used to perform joint spatiotemporal processing on host-side I/O observation data and active perturbation probe event data corresponding to the built-in storage chip to obtain delay base sequence data and probe event spatiotemporal annotation data. The computer device is used to perform mapping fluctuation fingerprint analysis on the delay base sequence data based on the probe event spatiotemporal annotation data to obtain mapping fluctuation fingerprint (MVF) data. The computer device is used to perform blind source separation on the MVF data to obtain common-mode performance factor data and mapping fluctuation factor data. The computer device is used to perform parameter inversion on the mapping fluctuation factor data based on a parameterized FTL digital twin model to obtain mechanism parameter vector data. The computer device is used to perform state-space inference analysis on the internal working state of the built-in memory chip based on the common-mode performance factor data and the mechanism parameter vector data to obtain chip working state data, and the chip working state data is displayed on the visualization device.
[0006] The aforementioned eMMC state inference method and system based on FTL mapping fluctuations obtain a load-normalized delay base sequence by jointly processing host-side I/O observation data and active perturbation probe event data in time and address space. Probe spatiotemporal annotation is then used to align the delay response to probe events and construct the mapping fluctuation fingerprint, enabling stable and repeatable quantification of the non-stationary fluctuations caused by changes in the flash translation layer (FTL) mapping. Blind source separation on the mapping fluctuation fingerprint decouples global common-mode performance changes, such as thermal throttling and power-supply current limiting, from the non-common-mode components related to mapping fluctuations, significantly reducing misjudgments and missed detections caused by environmental and system load disturbances. Parameter inversion of the mapping fluctuation factor against the parameterized FTL digital twin model then yields an interpretable mechanism parameter vector that is used for state-space inference. This enables probabilistic and continuous online identification and early warning of internal working states such as garbage collection, wear leveling, bad block replacement or remapping, cache exhaustion and write-back, thermal throttling, and read disturbance sensitivity, without reading the internal mapping table or relying on the vendor's proprietary health interface. This improves the robustness, portability, and interpretability of state inference, thereby providing a reliable basis for write scheduling, data migration, lifetime management, and fault prevention, and effectively improving the operational stability of the built-in storage chip.
Attached Figure Description
[0007] To more clearly illustrate the technical solutions in the embodiments or related technologies of this application, the accompanying drawings used in the description of the embodiments or related technologies will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0008] Figure 1 is an application environment diagram of an eMMC state inference method based on FTL mapping fluctuations in one embodiment; Figure 2 is a flowchart of an eMMC state inference method based on FTL mapping fluctuations in one embodiment; Figure 3 is an internal structural diagram of a computer device in one embodiment.
Detailed Implementation
[0009] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the scope of this application.
[0010] This application provides an eMMC state inference method based on FTL mapping fluctuations, which can be applied in, for example, the application environment shown in Figure 1. The visualization device 102 communicates with the server 104 via a network. A data storage system can store the data that the server 104 needs to process. The data storage system can be integrated onto the server 104 or located in the cloud or on other network servers. The visualization device 102 can be, but is not limited to, various personal computers, laptops, smartphones, tablets, IoT devices, and portable wearable devices. IoT devices can include smart speakers, smart TVs, smart air conditioners, smart in-vehicle devices, etc. Portable wearable devices can include smartwatches, smart bracelets, head-mounted devices, etc. The server 104 can be implemented using a standalone server or a server cluster consisting of multiple servers.
[0011] In one exemplary embodiment, as shown in Figure 2, an eMMC state inference method based on FTL mapping fluctuations is provided. Taking the application of this method to the server in Figure 1 as an example, the method includes the following steps 202 to 210:
[0012] Step 202: Perform joint spatiotemporal processing on the host-side I/O observation data and active perturbation probe event data corresponding to the built-in storage chip to obtain delay base sequence data and probe event spatiotemporal annotation data.
[0013] Step 204: Based on the probe event spatiotemporal annotation data, perform mapping fluctuation fingerprint analysis on the delay base sequence data to obtain mapping fluctuation fingerprint (MVF) data.
[0014] Step 206: Perform blind source separation on the MVF data to obtain common-mode performance factor data and mapping fluctuation factor data.
[0015] Step 208: Perform parameter inversion on the mapping fluctuation factor data based on the parameterized FTL digital twin model to obtain mechanism parameter vector data.
[0016] Step 210: Based on the common-mode performance factor data and mechanism parameter vector data, perform state-space inference analysis on the internal working state of the built-in memory chip to obtain chip working state data.
[0017] The built-in storage chip is an eMMC that is integrated inside the terminal device, accessed by the host through a standard storage interface, and managed by a flash translation layer (FTL).
[0018] The host-side I/O observation data is the set of request attributes and performance measurements that the host can record while initiating and completing storage read/write requests, including at least the completion latency and its corresponding address, size, and concurrency.
[0019] The active perturbation probe event data covers the lightweight probe read and write operations that the host actively performs while meeting quality-of-service constraints, together with their metadata records, which include at least the probe type, the time of occurrence, and the probe address range.
[0020] Joint spatiotemporal processing is the process of aligning, cleaning, and normalizing the I/O observation data and probe event data on a unified time axis and address space so that they can be analyzed jointly.
[0021] The delay base sequence data is the delay time series obtained after correcting the original completion delay for queuing and load factors, so that it reflects the intrinsic changes in storage service time as closely as possible.
[0022] The probe event spatiotemporal annotation data is spatiotemporal index data formed by structured annotation of the type, occurrence time, and corresponding logical address range of each probe event.
[0023] Mapping fluctuation fingerprint analysis is an analytical process that, conditioned on the probe annotations, aligns the delay base sequence with probe events and extracts multi-scale, long-tail, and correlation features to characterize the fluctuation structure of mapping changes.
[0024] The mapping fluctuation fingerprint (MVF) data is a feature representation that characterizes the mapping fluctuation structure of the flash translation layer, formed by fusing multi-scale fluctuation spectrum features, long-tail delay statistical features, and cross-address correlation features.
[0025] Blind source separation is the process of decomposing multiple latent components in mixed observations into mutually distinguishable factor components according to their statistical structure, in the absence of internal labels.
[0026] The common-mode performance factor data is factor data characterizing the global performance change components that appear synchronously across multiple address groups due to thermal throttling, power-supply current limiting, etc.
[0027] The mapping fluctuation factor data is factor data characterizing the non-common-mode, structural fluctuation components caused by flash translation layer mapping updates, garbage collection, wear leveling, etc.
[0028] Parameter inversion of the parameterized FTL digital twin model is the process of estimating the values of the model's internal mechanism parameters by matching the output of the digital twin model to the observed mapping fluctuation factor.
[0029] The mechanism parameter vector data is the multi-dimensional parameter set obtained by parameter inversion, which quantifies the level of internal mechanisms such as garbage collection intensity, wear leveling migration, remapping rate, and cache write-back intensity.
[0030] The internal operating state is the set of operating modes formed by the internal mechanisms of the built-in storage chip during a given period, such as active garbage collection, wear leveling migration, bad block replacement, cache exhaustion write-back, and thermal throttling.
[0031] State-space inference analysis is a sequential inference process that treats internal working states as latent variables and combines observation factors and mechanism parameters to recursively estimate their posterior probability or intensity over time.
[0032] The chip operating state data is the result data output by state-space inference, representing the current and time-varying internal operating states of the built-in memory chip and their probability or intensity.
[0033] Specifically, the host side observes and collects I/O requests to the built-in storage chip within a preset sliding time window, forming the host-side I/O observation data. This data includes at least the completion timestamp, completion delay, queue depth or concurrency, request size, logical address range identifier, and synchronization semantic identifier of each request. Under preset quality-of-service constraints, active perturbation probe events are also triggered, and the probe type, probe occurrence time, and probe logical address range are recorded, forming the active perturbation probe event data. The timestamps of the host-side I/O observation data and the active perturbation probe event data are then aligned to a unified time base, obvious abnormal sampling points are removed, and the completion delay is load-normalized using queue depth or concurrency, request size, and synchronization semantics to eliminate queuing effects, resulting in delay base sequence data that characterizes storage service time. Probe events are also annotated according to their occurrence time and logical address range, resulting in the probe event spatiotemporal annotation data.
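The load normalization described in this paragraph can be illustrated with a minimal sketch. The source does not specify the normalization formula, so the linear model below (latency divided by queue depth and by a request-size scale) is an assumption, as are the names `IoRecord`, `delay_base_sequence`, `ref_size`, and `outlier_factor`. The synchronization semantic identifier is omitted for brevity.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class IoRecord:
    t_complete: float   # completion timestamp on the unified time base (s)
    latency_us: float   # raw completion latency
    queue_depth: int    # outstanding requests when this one was issued
    size_bytes: int     # request size
    lba_group: int      # logical address range identifier


def delay_base_sequence(records, ref_size=4096, outlier_factor=20.0):
    """Return [(timestamp, normalized latency)] ordered by completion time.

    Assumed toy load model: latency grows linearly with queue depth and
    with request size above ref_size. A real device needs a fitted model.
    """
    if not records:
        return []
    ordered = sorted(records, key=lambda r: r.t_complete)
    normalized = []
    for r in ordered:
        depth = max(r.queue_depth, 1)
        size_scale = max(r.size_bytes / ref_size, 1.0)
        normalized.append((r.t_complete, r.latency_us / (depth * size_scale)))
    # Drop obvious abnormal samples relative to the window median.
    med = median(v for _, v in normalized)
    return [(t, v) for t, v in normalized if v <= outlier_factor * med]
```

Dividing by the median-based bound rather than a fixed threshold keeps the outlier rule independent of the absolute latency scale of a particular chip, which is one plausible way to preserve portability.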
[0034] Driven by the probe event spatiotemporal annotation data, the delay base sequence data is aligned to the probe occurrence times (using relative time or phase folding). Delay response fragments within fixed time windows before and after each probe event are extracted and then aggregated separately for the probe logical address range and the non-probe range. Multi-perspective features reflecting the mapping fluctuation structure are then calculated on the aligned delay response fragments: fluctuation spectrum signatures across time scales (reflecting fluctuation energy at different scales), long-tail statistical features of the delay distribution (e.g., high quantiles and over-threshold statistics), and inter-group synchronous fluctuation correlation features based on logical address grouping (reflecting global versus local fluctuation differences). These features are fused and shaped according to a unified feature structure to form the mapping fluctuation fingerprint (MVF) data. The MVF data may also include event-aligned response summaries to reflect the excitation effect of probes on internal mapping changes.
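A rough sketch of the three feature families named above follows. The concrete estimators are assumptions chosen for simplicity: block-mean variance stands in for the multi-scale fluctuation spectrum, a 99th percentile and an over-threshold fraction stand in for the long-tail statistics, and a Pearson correlation between probe-range and non-probe-range fragments stands in for the inter-group correlation feature. All function names are hypothetical.

```python
from statistics import mean, pvariance, quantiles


def aligned_segments(series, probe_times, half_window):
    """Cut (t - t_probe, value) fragments around each probe time."""
    return [
        [(t - tp, v) for t, v in series if abs(t - tp) <= half_window]
        for tp in probe_times
    ]


def scale_energy(values, scale):
    """Variance of non-overlapping block means: energy at one time scale."""
    blocks = [mean(values[i:i + scale])
              for i in range(0, len(values) - scale + 1, scale)]
    return pvariance(blocks) if len(blocks) > 1 else 0.0


def tail_features(values, threshold):
    """99th percentile and fraction of samples above a latency threshold."""
    p99 = quantiles(values, n=100)[98]
    over = sum(v > threshold for v in values) / len(values)
    return p99, over


def pearson(a, b):
    """Pearson correlation of equal-length lists; 0.0 if either is constant."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5 if va > 0 and vb > 0 else 0.0


def mvf(probe_values, other_values, threshold, scales=(1, 2, 4)):
    """Fuse the three feature families into one flat fingerprint dict."""
    p99, over = tail_features(probe_values, threshold)
    fp = {f"energy_s{s}": scale_energy(probe_values, s) for s in scales}
    fp.update(p99=p99, over_threshold=over,
              probe_vs_other_corr=pearson(probe_values, other_values))
    return fp
```

In this sketch `probe_values` and `other_values` are the latency values of the aligned fragments aggregated over the probe address range and the non-probe range respectively.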
[0035] The mapping fluctuation fingerprint (MVF) data is treated as a mixed observation generated by the superposition of "common-mode performance variation components" and "mapping fluctuation structure components." A common-mode discrimination mask is constructed from the cross-logical-address-group correlation features in the MVF data to identify fluctuation components that appear synchronously in most logical address groups. A subspace leakage constraint for blind source separation is then constructed from the common-mode discrimination mask to suppress the leakage of common-mode components into the mapping fluctuation factors. Under this leakage constraint, constrained blind source separation is performed on the MVF data to obtain a set of candidate factors, and each candidate factor is assigned an identity. Factors that meet the criteria of "cross-group synchronization, low-rank / low-frequency dominance" are determined to be common-mode performance factor data, and factors that meet the criteria of "local differences, non-common-mode fluctuation spectrum and tail structure changes" are determined to be mapping fluctuation factor data.
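The idea of separating a synchronous common-mode component from group-local residuals, with no leakage of the former into the latter, can be illustrated with a much simpler stand-in than constrained blind source separation. The sketch below estimates the common-mode series as the cross-group median and removes its least-squares projection from each group, so each residual is exactly orthogonal to the common-mode estimate. This is an assumed simplification, not the separation algorithm of the source, and `split_common_mode` is a hypothetical name.

```python
from statistics import mean, median


def split_common_mode(groups):
    """Split per-group series into a common-mode series and residuals.

    groups: dict mapping a logical address group name to a list of
    equal-length fluctuation values. The residual of each group is its
    centered series minus its least-squares projection on the centered
    common-mode series, so no common-mode energy leaks into it.
    """
    names = list(groups)
    n = len(groups[names[0]])
    # Cross-group median: what most groups do synchronously.
    common = [median(groups[g][i] for g in names) for i in range(n)]
    c_mean = mean(common)
    c_centered = [c - c_mean for c in common]
    c_energy = sum(c * c for c in c_centered)
    residuals = {}
    for g in names:
        x_mean = mean(groups[g])
        x_centered = [x - x_mean for x in groups[g]]
        gain = (sum(x * c for x, c in zip(x_centered, c_centered)) / c_energy
                if c_energy > 0 else 0.0)
        residuals[g] = [x - gain * c for x, c in zip(x_centered, c_centered)]
    return common, residuals
```

A group whose residual stays near zero is fully explained by the common-mode channel, while a group with a large residual carries local, mapping-related fluctuation.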
[0036] A parameterized FTL digital twin model is established in advance from the historical physical parameters of the built-in memory chip. The mapping fluctuation factor data is supplied to this model as the target observation, and the model's internal mechanism parameters are estimated through parameter inversion, that is, by minimizing the difference between the model output and the mapping fluctuation factor data (through iterative optimization or recursive estimation). This yields the mechanism parameter vector data, which includes one or more of the following: garbage collection intensity parameters, wear leveling migration parameters, bad block replacement or remapping rate parameters, and cache exhaustion and write-back intensity parameters. These parameters can be updated over successive time windows to reflect the dynamic evolution of the mechanism intensity.
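Parameter inversion in this sense can be sketched with a deliberately tiny twin. The two-parameter linear model, the driver quantities (`write_pressure`, `erase_skew`), and the grid search below are all assumptions made for illustration. The source only states that the mismatch between twin output and the mapping fluctuation factor is minimized by iterative optimization or recursive estimation.

```python
from itertools import product


def twin_output(params, drivers):
    """Toy FTL twin: fluctuation = gc * write_pressure + wl * erase_skew.

    params: (gc_intensity, wl_migration), both hypothetical mechanism
    parameters. drivers: list of (write_pressure, erase_skew) pairs.
    """
    gc, wl = params
    return [gc * w + wl * e for w, e in drivers]


def invert_parameters(observed, drivers, steps=10):
    """Grid-search (gc, wl) in [0, 1]^2 minimizing the squared mismatch."""
    grid = [i / steps for i in range(steps + 1)]

    def loss(params):
        predicted = twin_output(params, drivers)
        return sum((o - m) ** 2 for o, m in zip(observed, predicted))

    return min(product(grid, grid), key=loss)


# Usage: recover the parameters that generated a synthetic observation.
drivers = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
observed = twin_output((0.6, 0.2), drivers)
estimated = invert_parameters(observed, drivers)
```

A grid search is only practical for a handful of parameters. A gradient-based or recursive estimator would replace it in a realistic twin, and rerunning the inversion per time window gives the time-varying parameter vector described above.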
[0037] Similarly, a state-space inference model for the internal operating states of the built-in memory chip is constructed from historical operating state data. The state variables represent the probability or intensity of at least one internal operating state (e.g., garbage collection intensity state, wear leveling state, bad block replacement or remapping state, cache exhaustion and write-back state, thermal throttling state, read disturbance sensitive state, etc.). The common-mode performance factor data and mechanism parameter vector data are input into the state-space inference model, in which the common-mode performance factor data characterizes the global performance change channel and the mechanism parameter vector data characterizes the internal mechanism channel of the flash translation layer. A state-space inference algorithm estimates the state variables online (e.g., constraining state transitions with the mechanism parameter vector, and constraining observation noise or the common-mode channel with the common-mode performance factors) and outputs chip operating state data updated over time. The chip operating state data includes at least the probability distribution or intensity value of each internal operating state, and can also include state switching times or change point information to characterize the transition of the internal mechanism from one mode to another.
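As a minimal illustration of recursive posterior estimation with change point output, the sketch below runs a discrete Bayes (hidden Markov model forward) filter over a fixed set of states. It assumes the per-step observation likelihoods have already been computed from the common-mode factors and mechanism parameters. How to compute them is not specified in the source, and the function names are hypothetical.

```python
def forward_step(prior, transition, likelihood):
    """One predict/update cycle of a discrete Bayes filter."""
    n = len(prior)
    predicted = [sum(prior[i] * transition[i][j] for i in range(n))
                 for j in range(n)]
    unnorm = [predicted[j] * likelihood[j] for j in range(n)]
    total = sum(unnorm)
    # If the observation is impossible under every state, keep the prediction.
    return [u / total for u in unnorm] if total > 0 else predicted


def run_filter(initial, transition, likelihoods, states):
    """Return posterior history and (step, from_state, to_state) change points."""
    posterior, history, change_points = list(initial), [], []
    previous = None
    for k, lik in enumerate(likelihoods):
        posterior = forward_step(posterior, transition, lik)
        best = max(range(len(states)), key=posterior.__getitem__)
        current = states[best]
        if previous is not None and current != previous:
            change_points.append((k, previous, current))
        previous = current
        history.append(posterior)
    return history, change_points


# Usage with two hypothetical states and hand-written likelihoods.
states = ["idle", "gc_active"]
transition = [[0.9, 0.1], [0.1, 0.9]]
likelihoods = [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]]
history, change_points = run_filter([0.5, 0.5], transition, likelihoods, states)
```

Each entry of `history` is a probability distribution over the internal states at that step, and `change_points` records when the most probable state switches.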
[0038] In the aforementioned eMMC state inference method based on FTL mapping fluctuations, a load-normalized delay base sequence is obtained by jointly processing host-side I/O observation data and active perturbation probe event data in time and address space. Probe spatiotemporal annotation is used to align the delay response to probe events and construct the mapping fluctuation fingerprint, so that the non-stationary fluctuations caused by flash translation layer mapping changes can be quantified stably and repeatably. Blind source separation on the mapping fluctuation fingerprint decouples global common-mode performance changes, such as thermal throttling and power-supply current limiting, from the non-common-mode components related to mapping fluctuations, significantly reducing misjudgments and omissions caused by environmental and system load disturbances. Parameter inversion of the mapping fluctuation factor against the parameterized FTL digital twin model yields an interpretable mechanism parameter vector that is used for state-space inference. This enables probabilistic and continuous online identification and early warning of internal working states such as garbage collection, wear leveling, bad block replacement or remapping, cache exhaustion and write-back, thermal throttling, and read disturbance sensitivity, without reading the internal mapping table or relying on the vendor's proprietary health interface. This improves the robustness, portability, and interpretability of state inference, thereby providing a reliable basis for write scheduling, data migration, lifetime management, and fault prevention, and effectively improving the operational stability of the built-in storage chip.
[0039] In one exemplary embodiment, as shown in Figure 3, performing state-space inference analysis on the internal operating state of the built-in memory chip based on the common-mode performance factor data and mechanism parameter vector data to obtain chip operating state data includes steps 302 to 306:
[0040] Step 302: Perform state dictionary compilation on the mechanism parameter vector data to obtain dictionary constraint model data.
[0041] Step 304: Based on the common-mode performance factor data and dictionary constraint model data, perform interactive multi-model filtering inference on the internal working state to obtain preliminary state posterior distribution data.
[0042] Step 306: Based on the common-mode performance factor data, perform posterior reweighting calibration on the preliminary state posterior distribution data by common-mode subspace stripping to obtain chip operating state data.
[0043] State dictionary compilation is the process of mapping the mechanism parameter vector to a set of internal working state entries, together with their decision boundaries and transition relationships, and generating a constrained state model that can be used for inference.
[0044] The dictionary constraint model data is structured model data composed of a set of state entries, the boundary constraints for each state decision, and the adjacency relationships and transition strengths for state transitions.
[0045] Interactive multi-model filtering inference (commonly known as interacting multiple model, or IMM, filtering) is a sequential inference method that performs prior mixing and filtering updates on multiple state models according to transition probabilities, and then fuses them to obtain the overall state posterior.
[0046] The preliminary state posterior distribution data is the intermediate posterior result output by the interactive multi-model filter, which quantifies the probability or intensity of each internal working state.
[0047] Posterior reweighting calibration by common-mode subspace stripping is a process that strips the common-mode influence from the preliminary posterior using the common-mode direction characterized by the common-mode performance factor, and redistributes the probability mass conservatively to correct the state weights.
[0048] More precisely, the common-mode direction component is stripped from the preliminary state posterior using the common-mode subspace constructed from the common-mode performance factor, and the remaining posterior weights are redistributed under the constraints of non-negativity, normalization, and probability mass conservation to correct the chip operating state estimate.
[0049] Specifically, the mechanism parameter vector data is used as "state generation parameters." Parameter dimensions related to the internal working state (such as garbage collection intensity, wear leveling migration intensity, bad block replacement or remapping rate, cache exhaustion and write-back intensity, etc.) are mapped to a preset state set, forming a parameter subset and decision boundary for each state. Consistency processing is performed on the decision boundaries of each state (such as handling mutual exclusion / inclusion relationships, eliminating conflicting threshold intervals, and merging equivalent states). Based on the plausible evolution relationships between state mechanisms, state transition constraints (such as allowed / prohibited adjacency relationships, jump amplitude limits, transition costs, or prior probabilities) are generated. Finally, the "set of state entries (including decision boundaries) + state transition constraint relationships (adjacency structure)" is assembled into the dictionary constraint model data.
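One way to picture the compiled dictionary is as a list of state entries with decision boundaries plus a transition prior matrix in which prohibited adjacencies get zero probability. The sketch below assumes a single lower-bound threshold per state and a fixed self-transition weight. The consistency processing step (conflict elimination, merging equivalent states) is omitted, and all names and threshold values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StateEntry:
    name: str        # internal working state label
    parameter: str   # mechanism parameter that drives this state
    lower: float     # decision boundary: active when value >= lower


def compile_dictionary(entries, allowed, stay_weight=4.0):
    """Build row-normalized transition priors from allowed adjacencies.

    allowed: set of (source, destination) state name pairs. Any pair not
    listed is treated as a prohibited transition (prior probability 0).
    """
    names = [e.name for e in entries]
    matrix = []
    for src in names:
        row = [stay_weight if dst == src
               else (1.0 if (src, dst) in allowed else 0.0)
               for dst in names]
        total = sum(row)
        matrix.append([w / total for w in row])
    return {"entries": entries, "names": names, "transition": matrix}


def active_states(dictionary, params):
    """Names of states whose decision boundary is met by the parameter vector."""
    return [e.name for e in dictionary["entries"]
            if params.get(e.parameter, 0.0) >= e.lower]


# Usage with three hypothetical states.
entries = [
    StateEntry("gc_active", "gc_intensity", 0.5),
    StateEntry("wear_leveling", "wl_migration", 0.3),
    StateEntry("cache_writeback", "cache_writeback", 0.4),
]
allowed = {("gc_active", "wear_leveling"), ("wear_leveling", "gc_active"),
           ("cache_writeback", "gc_active")}
dictionary = compile_dictionary(entries, allowed)
```

The resulting `transition` matrix can serve directly as the switching prior of a multi-model filter, which is how the dictionary constrains which state jumps are reachable.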
[0050] The dictionary-constrained model data uses "state entries" as multiple candidate sub-models (each sub-model corresponds to an internal working state or state combination), and the state transition constraints in the dictionary-constrained model data serve as the prior structure for switching between models. Common-mode performance factor data is used as the common-mode channel input on the observation side to explain the impact of global performance fluctuations on observations and to adjust the observation noise or likelihood calculation of each sub-model. An interactive multi-model filtering inference process is employed. At each time step, prediction updates are performed on each sub-model, the observation likelihood under the common-mode factor condition is calculated, and the model probability is updated. Then, the model probabilities are interactively mixed and normalized according to the transition constraints, outputting preliminary posterior state distribution data containing "probabilities of each state model + corresponding state estimates".
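The per-step cycle described above (mixing, per-model prediction and update, likelihood evaluation, model probability update, fusion) can be sketched for scalar sub-models as follows. The random-walk state model, the per-model process noise values, and the rule that the common-mode factor inflates observation noise as `r0 * (1 + |common_mode|)` are assumptions for illustration only.

```python
import math


def imm_step(mu, xs, ps, z, trans, qs, r0, common_mode):
    """One interacting-multiple-model cycle for scalar random-walk sub-models.

    mu: model probabilities; xs, ps: per-model state mean and variance;
    z: scalar observation; trans[i][j]: prior probability of switching
    from model i to model j; qs: per-model process noise variance.
    """
    n = len(mu)
    r = r0 * (1.0 + abs(common_mode))  # assumed common-mode noise inflation
    # Interaction: predicted model probabilities under the transition prior.
    c = [sum(trans[i][j] * mu[i] for i in range(n)) for j in range(n)]
    new_x, new_p, like = [], [], []
    for j in range(n):
        if c[j] == 0.0:  # model unreachable under the transition constraints
            new_x.append(xs[j])
            new_p.append(ps[j])
            like.append(0.0)
            continue
        w = [trans[i][j] * mu[i] / c[j] for i in range(n)]
        x0 = sum(w[i] * xs[i] for i in range(n))
        p0 = sum(w[i] * (ps[i] + (xs[i] - x0) ** 2) for i in range(n))
        # Scalar Kalman predict and update for sub-model j.
        p_pred = p0 + qs[j]
        s = p_pred + r
        gain = p_pred / s
        resid = z - x0
        new_x.append(x0 + gain * resid)
        new_p.append((1.0 - gain) * p_pred)
        like.append(math.exp(-0.5 * resid ** 2 / s)
                    / math.sqrt(2.0 * math.pi * s))
    total = sum(c[j] * like[j] for j in range(n))
    new_mu = [c[j] * like[j] / total for j in range(n)] if total > 0 else c
    x_fused = sum(new_mu[j] * new_x[j] for j in range(n))
    return new_mu, new_x, new_p, x_fused
```

In this sketch a quiet observation favors the low-noise sub-model and a large jump favors the high-noise one. A larger common-mode factor widens the innovation variance of every sub-model, which makes the model probabilities react less sharply to globally disturbed observations.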
[0051] The preliminary state posterior distribution data is represented as a weighted sample set or a discrete state probability vector. To reduce the interference of common-mode performance changes such as thermal throttling and power-supply current limiting on state discrimination, the common-mode performance factor data is used to determine the direction of change in the posterior that is dominated by the common-mode channel (the common-mode subspace). Based on this, posterior reweighting with common-mode subspace stripping is applied to the preliminary state posterior distribution. That is, state probabilities whose projection onto the common-mode subspace is too large, and which may have been "spuriously boosted" by common-mode disturbances such as thermal throttling or power-supply current limiting, are suppressed and reweighted, while state probabilities that are related to the mapping mechanism and have stronger explanatory power in non-common-mode directions are compensated and reweighted. The reweighted posterior is then normalized and corrected for mass conservation to keep the total probability mass consistent. The result is chip operating state data (such as state probability or state intensity over time) that is more sensitive to internal mechanism states and free of the influence of common-mode performance changes.
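A minimal sketch of the stripping and reweighting step, for the discrete state probability vector case, is given below. It interprets "stripping" as subtracting the posterior's projection onto a single common-mode direction, clipping negative weights, and renormalizing. This interpretation, the single-direction subspace, and the `strength` parameter are assumptions, since the source does not give the projection formula.

```python
def strip_and_reweight(posterior, common_dir, strength=1.0):
    """Remove the common-mode component of a posterior and renormalize.

    posterior: state probabilities summing to 1.
    common_dir: direction in state space dominated by the common-mode
    channel (need not be unit length).
    """
    norm = sum(u * u for u in common_dir) ** 0.5
    if norm == 0:
        return list(posterior)
    u = [x / norm for x in common_dir]
    proj = sum(p * ui for p, ui in zip(posterior, u))
    # Suppress the common-mode component, keeping weights non-negative.
    stripped = [max(p - strength * proj * ui, 0.0)
                for p, ui in zip(posterior, u)]
    total = sum(stripped)
    if total == 0:
        return list(posterior)  # nothing left: fall back to the input
    # Renormalize so the total probability mass is conserved.
    return [w / total for w in stripped]


# Usage: the first state lies entirely along the common-mode direction.
calibrated = strip_and_reweight([0.5, 0.3, 0.2], [1.0, 0.0, 0.0])
```

Renormalization moves the mass removed from common-mode-aligned states onto the remaining states in proportion to their existing weights, which is the simplest redistribution rule that satisfies non-negativity and normalization.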
[0052] In this embodiment, a dictionary constraint model is formed by compiling the mechanism parameter vector into a state dictionary. This makes the candidate space and state transition relationship of the internal working state subject to explicit constraints of consistency and reachability, thereby reducing unreasonable state jumps and improving the inference convergence speed. Furthermore, interactive multi-model filtering inference is introduced under the constraints of the dictionary constraint model. This can obtain a more robust preliminary state posterior distribution when multiple internal mechanisms coexist or alternately dominate, reducing misjudgments caused by single-model assumptions. Finally, the posterior reweighting calibration of the preliminary posterior by stripping the common-mode subspace using the common-mode performance factor can effectively suppress the interference of global common-mode performance changes such as thermal throttling and power supply current limiting on state discrimination, improve the accuracy, robustness and interpretability of state recognition, and thus output more reliable chip working state data for life management and fault early warning.
[0053] In an exemplary embodiment, based on the preliminary state posterior distribution data and common-mode performance factor data, the preliminary state posterior distribution data undergoes posterior reweighting calibration by common-mode subspace stripping to obtain chip operating state data, including steps 402 to 408. Wherein:
[0054] Step 402: Perform identifiable subspace basis processing on the common mode performance factor data to obtain common mode subspace basis data.
[0055] Step 404: Based on the preliminary state posterior distribution data and the common mode subspace basis data, perform a constrained projection of the preliminary state posterior distribution data by stripping the common mode direction to obtain the projected posterior weight data.
[0056] Step 406: Resample the projected posterior weight data to obtain the posterior sample set data.
[0057] Step 408: Perform posterior probability mass conservation backfilling on the posterior sample set data to obtain chip operating state data.
[0058] Identifiable subspace basis processing is the process of extracting a set of reproducible common-mode action directions from the common-mode performance factor data, based on the synchronization consistency of the common-mode factor across multiple address groups or multiple time slices and the energy ratio of the main direction.
[0059] The common-mode subspace basis data is a set of basis vectors, or equivalent projection operator data, that characterizes the main directions of global common-mode disturbances such as thermal throttling and power-supply current limiting.
[0060] The constrained projection with common-mode direction stripping is a projection process that removes the component of the preliminary posterior along the common-mode subspace direction to suppress common-mode interference, under the constraints of non-negativity and normalization of the posterior weights.
[0061] The projected posterior weight data is the updated weight data representing the posterior weight of each state or each sample after the preliminary state posterior distribution has been stripped and projected along the common-mode direction and the constraints have been satisfied.
[0062] The posterior sample set data is the sample set, with its corresponding weights, obtained by resampling according to the projected posterior weights so that the high posterior probability region is covered more uniformly.
[0063] Posterior probability mass conservation backfilling is a backfilling process that compensates for the probability mass lost through projection and resampling and redistributes it to samples or states according to the reachable neighborhood rule, while keeping the overall posterior probability mass conserved.
[0064] Specifically, the common-mode performance factor data (e.g., the global performance change factor sequence obtained from blind source separation) is mean-centered and scale-normalized to eliminate dimensional differences. Then, a "synchronicity / low-rank" identifiability criterion is constructed over multiple logical address groups or multiple time slices (e.g., by examining the synchronization consistency of the common-mode factors in each group and the energy proportion of the main direction). Based on this criterion, identifiable subspace basis processing is performed on the common-mode performance factor data: the set of main directions that best explains the cross-group synchronous changes is extracted, and directions that fail the synchronization criterion or carry too little energy are removed. Finally, the common-mode subspace basis data is output. The common-mode subspace basis data can be represented as a set of basis vectors or equivalent projection operators that characterize the main action directions, in the posterior space, of common-mode disturbances such as thermal throttling and power supply current limiting.
[0065] The preliminary state posterior distribution data is represented as a state weight vector or in sample-weight form (e.g., the posterior weights of each state model, or the weights of the particle ensemble), and is mapped to a representation space that can be aligned with the common-mode subspace basis data (e.g., treating the temporal components of weight changes, or a vector composed of multi-state weights, as the projection object). Then, a projection operator is constructed based on the common-mode subspace basis data to remove the common-mode direction components from the preliminary state posterior distribution data, yielding preliminary state posterior distribution data free from common-mode influence. To avoid posterior distortion caused by projection, this projection is constrained. The constraints include at least the non-negativity and normalization of the posterior weights, together with either minimizing the posterior difference before and after projection or an observation consistency constraint. The result is output as the projected posterior weight data, which suppresses the "global synchronization weight drift" caused by common-mode disturbances while preserving the differential weight structure related to the true internal mechanism.
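As a non-limiting illustration, the constrained projection of the preceding paragraph may be sketched in Python as follows. The sketch assumes that the posterior is a plain state weight vector, that the common-mode subspace basis is orthonormal, and that "minimizing the posterior difference before and after projection" is realized as a Euclidean projection onto the probability simplex. The function names `strip_common_mode` and `project_to_simplex` are hypothetical and are not taken from the embodiment.

```python
def strip_common_mode(weights, basis):
    """Remove the component of `weights` along each orthonormal basis vector."""
    residual = list(weights)
    for b in basis:
        coeff = sum(r * bi for r, bi in zip(residual, b))
        residual = [r - coeff * bi for r, bi in zip(residual, b)]
    return residual


def project_to_simplex(v):
    """Closest point (Euclidean) to `v` with non-negative entries summing to 1."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, uk in enumerate(u, start=1):
        cumulative += uk
        t = (cumulative - 1.0) / k
        if uk - t > 0:
            theta = t  # the last k that passes the test fixes the shift
    return [max(x - theta, 0.0) for x in v]


def projected_posterior(weights, basis):
    """Strip the common-mode direction, then restore the weight constraints."""
    return project_to_simplex(strip_common_mode(weights, basis))
```

The second step can reintroduce a small component along the common-mode direction, so an implementation that needs both properties exactly might iterate the two steps; that refinement is outside this sketch.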
[0066] When the projected posterior weight data is represented in "sample-weight" form, the effective number of samples in the projected posterior weight data is calculated, and the resampling trigger condition is determined accordingly. Then, a discrete cumulative distribution is constructed using the projected weights, and a preset number of sample indices are extracted according to this distribution. This process replicates high-weight samples and eliminates low-weight samples, generating a new sample set. Simultaneously, the weights in the new sample set are uniformly reset to equal weight and normalized to eliminate weight degradation and make the sample set more evenly cover the high posterior probability region, thus outputting the posterior sample set data. When the projected posterior weight data is represented in "discrete state weight vector" form, states are sampled according to the weight ratio of each state, and the sampled states are instantiated as samples to obtain an equal-weighted state sample set.
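As a non-limiting illustration, the effective sample count and the cumulative-distribution resampling described above may be sketched as follows. Systematic resampling is used here as one possible way of drawing indices from the discrete cumulative distribution; the embodiment does not prescribe a particular scheme. The names `effective_sample_size` and `resample` are hypothetical, and the caller is assumed to compare the effective sample count against its own trigger threshold.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def effective_sample_size(weights):
    """1 / sum of squared normalized weights; small values signal degeneracy."""
    total = sum(weights)
    return 1.0 / sum((w / total) ** 2 for w in weights)


def resample(samples, weights, n_out, rng):
    """Draw n_out samples from the weight CDF and reset them to equal weight."""
    total = sum(weights)
    cdf = list(accumulate(w / total for w in weights))
    cdf[-1] = 1.0  # guard against floating-point shortfall at the end
    start = rng.random() / n_out  # one random offset, then evenly spaced points
    picked = []
    for i in range(n_out):
        u = start + i / n_out
        picked.append(samples[bisect_right(cdf, u)])
    return picked, [1.0 / n_out] * n_out
```

High-weight samples span a wide interval of the cumulative distribution and are therefore picked several times, while low-weight samples tend to be skipped.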
[0067] To address potential probabilistic quality loss or distribution shifts caused by projection stripping and resampling, the "missing probabilistic quality" is estimated in the posterior sample set data (e.g., comparing the changes in total weights and tail quality before and after projection and resampling). Then, a quality backfilling route is determined based on state neighborhood or state transition constraints (e.g., backfilling is only allowed to adjacent reachable states or states within the same mechanism cluster). Missing quality is then allocated back to the sample set under the constraint of probabilistic quality conservation, including conserved transport-style compensation and redistribution of sample weights. This ensures that the overall posterior after backfilling re-satisfies normalization and quality conservation, while maintaining consistency with the projected structure as much as possible. Finally, expectation estimation or maximum a posteriori estimation is performed on the backfilled sample set to output chip operating state data (including the probability or strength of each internal operating state and its confidence level).
[0068] In this embodiment, by constructing an identifiable common-mode subspace basis for the common-mode performance factor and performing a constrained projection of the preliminary state posterior by stripping the common-mode direction, the synchronous bias of state weights caused by global common-mode disturbances such as thermal throttling and power supply current limiting can be effectively suppressed under constraints such as maintaining posterior non-negativity and normalization. This improves the sensitivity of the posterior distribution to differences in the actual internal mechanisms. Furthermore, resampling the projected weights can alleviate weight degradation and enhance the coverage of high-probability state regions. Posterior probability quality conservation backfilling can conserve and compensate for the probability quality loss caused by projection and resampling and reasonably redistribute it within the reachable neighborhood. Ultimately, more stable, robust, and reliable chip operating state data is obtained, reducing false positives and false negatives and improving the interpretability of state switching identification.
[0069] In an exemplary embodiment, identifiable subspace basis processing is performed on the common-mode performance factor data to obtain common-mode subspace basis data, including steps 502 to 510. Wherein:
[0070] Step 502: Construct identifiability criteria for common mode performance factor data to obtain common mode identifiability criteria data.
[0071] Step 504: Based on the common mode identifiable criterion data, perform synchronous alignment of the common mode performance factor data across logical address groups to obtain aligned common mode factor matrix data.
[0072] Step 506: Perform adversarial subspace decoupling on the aligned common mode factor matrix data, subject to the common mode identifiable criterion data, to obtain candidate common mode subspace data.
[0073] Step 508: Perform monotonically expanded subspace homotopy convergence processing on the candidate common mode subspace data to obtain stable common mode subspace data.
[0074] Step 510: Perform orthogonal decomposition on the stable common mode subspace data to obtain the common mode subspace basis data.
[0075] Among them, the construction of identifiability criteria involves extracting quantifiable evaluation rules such as synchronization consistency and principal direction concentration from the common-mode performance factors to determine which components can be stably identified as common modes.
[0076] Among them, the common mode identifiable criterion data is structured data that carries the evaluation rules, including synchronicity indicators, low-rank indicators and their thresholds or weights and other parameters.
[0077] Among them, synchronization alignment across logical address groups: by estimating and compensating for time shifts or phase deviations between common mode factor sequences of different logical address groups, they are aligned on the same relative time base.
[0078] Among them, the aligned common mode factor matrix data is a matrix representation of the common mode factor sequence after aligning each logical address group, stacked according to time and group index.
[0079] Among them, adversarial subspace decoupling is the process of decomposing the aligned common mode factor matrix into common-mode subspace components and residual components under the joint objective of maximizing common-mode explanatory power and minimizing non-common-mode leakage.
[0080] Among them, candidate common mode subspace data: subspace representation data (such as basis vector set or projection operator) obtained by decoupling adversarial subspace and may contain common mode principal directions.
[0081] Among them, the monotonically expanded subspace homotopy convergence processing is a process of gradually expanding the candidate subspace by increasing its dimension under monotonic constraints, so as to improve the common mode explanatory power and reduce leakage until convergence.
[0082] Among them, stable common-mode subspace data: the final common-mode subspace representation data that is more robust to noise and window changes, determined after monotonic expansion and convergence determination.
[0083] Specifically, common-mode performance factor data (typically a time-sampled factor sequence, potentially including observations from multiple logical address groups or sampling channels) is aggregated within a preset sliding time window. The common-mode performance factor data undergoes mean removal, scale normalization, and outlier suppression to eliminate dimensional differences and impulse noise. Then, identifiability criteria are constructed based on the physical meaning of "common mode." One type of criterion characterizes synchronization consistency across logical address groups (e.g., the proportion of same-direction changes across multiple address groups, synchronization peak intensity, or a consistency score), while another type characterizes low-rank / dominant-direction concentration (e.g., the proportion of energy in the dominant direction, the steepness of the energy spectrum, or the explained variance ratio). These criteria are then organized into common-mode identifiability criterion data (including thresholds, weights, and evaluation function forms).
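As a non-limiting illustration, the two types of criteria may be computed as sketched below: the proportion of time steps at which all logical address groups change in the same direction, and the share of total energy carried by the dominant direction. The dominant direction is estimated here by power iteration on the group-by-group Gram matrix, which is an assumption of this sketch. The names `same_direction_ratio` and `dominant_energy_ratio` are hypothetical.

```python
import math


def same_direction_ratio(series_by_group):
    """Fraction of time steps where every group moves up, or every group moves down."""
    diffs = [[b - a for a, b in zip(s, s[1:])] for s in series_by_group]
    steps = list(zip(*diffs))
    if not steps:
        return 0.0
    agree = sum(
        1 for step in steps
        if all(d > 0 for d in step) or all(d < 0 for d in step)
    )
    return agree / len(steps)


def dominant_energy_ratio(series_by_group, iters=200):
    """Largest eigenvalue of the centered Gram matrix divided by its trace."""
    centered = []
    for s in series_by_group:
        mean = sum(s) / len(s)
        centered.append([x - mean for x in s])
    g = len(centered)
    gram = [[sum(a * b for a, b in zip(centered[i], centered[j]))
             for j in range(g)] for i in range(g)]
    trace = sum(gram[i][i] for i in range(g))
    if trace == 0:
        return 0.0
    # Slightly asymmetric start so the vector is unlikely to be orthogonal
    # to the dominant direction.
    v = [1.0 + 0.1 * i for i in range(g)]
    for _ in range(iters):
        w = [sum(gram[i][j] * v[j] for j in range(g)) for i in range(g)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(gram[i][j] * v[j] for j in range(g)) for i in range(g))
    return lam / trace
```

Both scores lie between 0 and 1, so they can be compared against the thresholds carried in the common-mode identifiability criterion data.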
[0084] The time series of common-mode performance factors are extracted for each logical address group. A reference sequence (e.g., the logical address group sequence with the highest synchronization consistency score, or a weighted average of multiple sequences) is determined using the "maximum synchronization consistency" evaluation function from the common-mode identifiable criterion data. For each non-reference logical address group sequence, its optimal alignment parameters relative to the reference sequence (including a relative time shift and an optional phase offset) are calculated within a preset alignment search range. The optimal alignment parameters are those that maximize the synchronization consistency score between the sequence and the reference sequence. The sequence is then shifted along the time axis, with interpolation and resampling where necessary, according to the optimal alignment parameters, so that all logical address group sequences are aligned at the same relative time positions. Finally, the aligned logical address group sequences are stacked into a "time index × logical address group index" matrix to form the aligned common-mode factor matrix. Each column corresponds to the aligned common-mode factor sequence of one logical address group, and each row corresponds to the cross-group common-mode factor values at the same relative time position.
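As a non-limiting illustration, the alignment and stacking may be sketched as follows, under several simplifying assumptions: the sequences are already mean-removed and normalized, only integer time shifts are searched (no phase offset or interpolation), the synchronization consistency score is the mean product over the overlapping samples, and the first group serves as the reference sequence. The names `best_shift`, `shift_series`, and `build_aligned_matrix` are hypothetical.

```python
def best_shift(reference, series, max_shift):
    """Integer lag in [-max_shift, max_shift] with the highest overlap score."""
    def score(lag):
        pairs = [
            (reference[t], series[t + lag])
            for t in range(len(reference))
            if 0 <= t + lag < len(series)
        ]
        if not pairs:
            return float("-inf")
        return sum(a * b for a, b in pairs) / len(pairs)

    return max(range(-max_shift, max_shift + 1), key=score)


def shift_series(series, lag, length, fill=0.0):
    """Re-index `series` so that sample t + lag lands at position t."""
    return [
        series[t + lag] if 0 <= t + lag < len(series) else fill
        for t in range(length)
    ]


def build_aligned_matrix(groups, max_shift):
    """Rows are relative time positions; columns are logical address groups."""
    reference = groups[0]
    aligned = [
        shift_series(g, best_shift(reference, g, max_shift), len(reference))
        for g in groups
    ]
    return [list(row) for row in zip(*aligned)]
```

Positions that fall outside a shifted sequence are filled with zero here; an implementation could instead trim the matrix to the common overlap.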
[0085] The aligned common-mode factor matrix data is represented as the observation matrix to be decomposed. Under the constraint of the common-mode identifiable criterion data, the "common-mode subspace" and the "non-common-mode residual subspace" are solved simultaneously. The objective of solving the common-mode subspace is to maximize the projected reconstructed components in terms of synchronization consistency across logical address groups and the low-rank / main direction energy proportion. The constraint objective of the non-common-mode residual subspace is to minimize the projected components in terms of synchronization consistency, thereby suppressing the leakage of common-mode components into the residuals. In implementation, an alternating iterative optimization is used. In each iteration, the residual subspace is first fixed, and the common-mode subspace is updated to increase the common-mode criterion score and decrease the reconstruction error. Then, the common-mode subspace is fixed again, the residual subspace is updated, and a penalty is imposed on its synchronization consistency score to reduce leakage. When the common mode criterion score reaches the preset threshold and the residual synchronization consistency is lower than the preset upper limit, or when the iteration converges, the process stops and outputs candidate common mode subspace data (which can be a common mode projection operator or a common mode basis vector set). The corresponding reconstructed components are the interpretable structures in the aligned common mode factor matrix that are judged as "common mode".
[0086] Using the candidate common-mode subspace data as the initial subspace, and setting a growth step size and a maximum dimension limit for the subspace dimension, the common-mode explanatory power index (e.g., the proportion of energy in the projection reconstruction or the proportion of energy in the main direction) and the non-common-mode leakage index (e.g., the synchronization consistency score in the residual or the degree of common-mode criterion violation) are calculated for the aligned common-mode factor matrix under the current dimension. Then, a new directional component is added to the candidate subspace one dimension at a time (e.g., selecting from the current residual the direction that best improves the common-mode criterion score as the candidate expansion direction), and the common-mode explanatory power index and the non-common-mode leakage index are calculated again after the addition. If, after expansion, the common-mode explanatory power increases and the non-common-mode leakage decreases relative to the previous dimension, the expansion is accepted and the next round of dimension increment is entered; otherwise, the expansion is rejected and an alternative expansion direction is attempted. If the monotonic constraint cannot be satisfied within a preset number of attempts, the expansion is terminated. Finally, the subspace that satisfies the monotonic constraint and reaches the convergence condition (e.g., the index gain falls below a threshold or the maximum dimension limit is reached) is determined as the stable common-mode subspace data.
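As a non-limiting illustration, the accept / reject logic of the monotonic expansion may be sketched as follows. The sketch is deliberately generic: `propose`, `explain`, and `leak` are hypothetical caller-supplied callables standing for the candidate direction selection, the common-mode explanatory power index, and the non-common-mode leakage index, none of which are fixed here.

```python
def expand_monotonically(initial_basis, propose, explain, leak,
                         max_dim, max_tries=3, min_gain=1e-3):
    """Grow the basis one direction at a time under the monotonic constraint."""
    basis = list(initial_basis)
    best_explain, best_leak = explain(basis), leak(basis)
    while len(basis) < max_dim:
        gain = None
        for attempt in range(max_tries):
            direction = propose(basis, attempt)
            if direction is None:  # no further candidate directions
                break
            trial = basis + [direction]
            e, l = explain(trial), leak(trial)
            # Accept only if explanatory power rises AND leakage falls.
            if e > best_explain and l < best_leak:
                gain = e - best_explain
                basis, best_explain, best_leak = trial, e, l
                break
        # Stop when every attempt was rejected or the gain has converged.
        if gain is None or gain < min_gain:
            break
    return basis
```

A rejected direction leaves the basis unchanged, so the returned subspace always satisfies the monotonic constraint relative to each of its lower-dimensional predecessors.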
[0087] When the stable common-mode subspace data is represented by a set of basis vectors or direction components that are not necessarily orthogonal, the vector set is first arranged column-wise into a matrix and orthogonalized so that the resulting vectors are pairwise orthogonal and have unit norm, thus obtaining an orthonormal basis. When the stable common-mode subspace data is represented by a subspace projection operator or an equivalent matrix, the matrix is subjected to eigenvalue decomposition or an equivalent orthogonal decomposition. The eigenvectors corresponding to the largest eigenvalues (or the main energy components) are selected as the subspace directions, and the selected vectors are normalized to unit norm. Finally, the resulting set of orthogonal vectors (or the orthogonal projection operator formed by them) is output as the common-mode subspace basis data.
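As a non-limiting illustration, the basis-vector case may be sketched with modified Gram-Schmidt orthogonalization, which is one possible orthogonal decomposition and is chosen here as an assumption. The names `orthonormalize` and `projector` are hypothetical.

```python
import math


def orthonormalize(vectors, tol=1e-10):
    """Return an orthonormal basis spanning `vectors`; dependent vectors are dropped."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            coeff = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm > tol:
            basis.append([x / norm for x in w])
    return basis


def projector(basis):
    """Orthogonal projection operator P = sum of b b^T over the basis vectors."""
    n = len(basis[0])
    return [[sum(b[i] * b[j] for b in basis) for j in range(n)] for i in range(n)]
```

Either output form, the vector set or the projection operator, can then be used to remove the common-mode direction in the subsequent projection step.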
[0088] In this embodiment, by constructing a common-mode identifiable criterion and synchronizing the common-mode performance factor across logical address groups, the verifiability and consistency of common-mode components can be improved. By using adversarial subspace decoupling and monotonically expanding subspace homotopy convergence, a more stable and purer common-mode subspace can be obtained while suppressing non-common-mode leakage. Furthermore, by outputting a standard common-mode subspace basis through orthogonal decomposition, it is easier to remove the common-mode direction in the subsequent process, thereby improving the anti-interference capability and the accuracy of state inference.
[0089] In an exemplary embodiment, posterior probability quality conservation backfilling is performed on the posterior sample set data to obtain chip operating state data, including steps 602 to 610. Wherein:
[0090] Step 602: Perform probability quality loss estimation on the sample weight distribution data in the posterior sample set data to obtain missing probability quality data.
Step 604: Perform quality bucketing routing analysis based on state neighborhood on the posterior sample set data to obtain quality backfill routing data.
Step 606: Based on the missing probability quality data and the quality backfill routing data, perform conserved transport backfilling on the posterior probability quality to obtain the backfilled sample weight data.
Step 608: Perform posterior uncertainty structuring on the backfilled sample weight data to obtain state posterior data with confidence.
Step 610: Perform expectation estimation on the state posterior data with confidence to obtain chip operating state data.
[0091] Among them, the sample weight distribution data includes the weight values of each sample in the posterior sample set and their distribution pattern on the sample set (such as weight set, normalized sum and concentration).
[0092] Among them, probability quality loss estimation is a process of calculating the difference between the total quality of the current sample weights and the total quality of the reference to evaluate the posterior probability quality loss.
[0093] Among them, missing probability quality data: structured data that characterizes the overall or bucketed posterior probability quality loss amount and its distribution location.
[0094] Among them, the quality bucketing routing analysis based on state neighborhood is an analysis process that buckets samples according to their state and adjacency relationship and generates allowed quality redistribution paths and proportions.
[0095] Among them, the quality backfill routing data is routing rule data that describes from which state buckets to which state buckets probability quality is backfilled, together with the backfilling ratios and path constraints.
[0096] Among them, the posterior probability quality (i.e., the probability mass of the posterior) is the total probability of the posterior distribution over the samples or states and its distribution across the state regions (usually normalized to sum to 1).
[0097] Among them, the conservation transport backfill is a backfilling process that compensates for missing quality through transport and redistributes it to samples or states while maintaining the conservation of total probability quality and satisfying neighborhood constraints.
[0098] Among them, the backfilled sample weight data is the updated set of sample weights obtained after conservation transportation backfill compensation and normalization.
[0099] Among them, the posterior uncertainty structured processing is the process of explicitly expressing the uncertainty of the posterior distribution using computable indicators and fields (such as entropy, effective number of samples, and backfilling ratio).
[0100] Among them, the state posterior data with confidence includes the posterior probability (or strength) of each state and its corresponding confidence level / uncertainty index.
[0101] Among them, expectation estimation is an estimation method that uses the posterior probability of the state as a weight to sum the state categories or state strengths to obtain the output result under the expected meaning.
[0102] Specifically, the sample weight distribution data (including the weight of each sample and its sum before and after normalization) is read from the posterior sample set data, and a reference quantity for comparison is set (e.g., the total posterior quality before projection and resampling, or the theoretically required normalized total quality of 1). Then, the difference between the current total weight quality and the reference quantity is calculated to obtain the overall missing probability quality. Simultaneously, samples can be further aggregated by state category or by state neighborhood, and the missing quality component within each aggregation bucket is calculated to distinguish which state regions the missing data occurs in. Finally, the overall missing quality and its bucket decomposition results are summarized to form the missing probability quality data.
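As a non-limiting illustration, the overall and per-bucket estimation may be sketched as follows, assuming each sample carries a discrete state label and that a per-state reference quantity is available. The name `missing_quality` and the dictionary layout are hypothetical.

```python
from collections import defaultdict


def missing_quality(states, weights, reference_by_state):
    """Return (overall shortfall, per-state shortfall) against the reference.

    A positive per-state value means that state bucket lost probability quality.
    """
    current = defaultdict(float)
    for state, weight in zip(states, weights):
        current[state] += weight
    per_bucket = {
        state: ref - current.get(state, 0.0)
        for state, ref in reference_by_state.items()
    }
    overall = sum(reference_by_state.values()) - sum(weights)
    return overall, per_bucket
```

The per-bucket values are signed, so they add up to the overall shortfall whenever every sample state appears in the reference.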
[0103] Based on the state label or state coordinates (e.g., the internal working state or discrete index in the state space) corresponding to each sample in the posterior sample set data, and combined with the state neighborhood relationship (which can be determined by state transition constraints, adjacency relationships, or reachability relationships), the samples are divided into "state buckets / neighborhood buckets". Within each bucket, the number of samples, total weight, weight density, and neighborhood connectivity information are statistically analyzed, and routing rules from the "supply bucket" to the "demand bucket" are generated accordingly. For example, backfilling is prioritized within the same state bucket; if the bucket is insufficient, backfilling is carried out along the allowed adjacency edges to adjacent buckets, and routing weights are assigned to each routing edge (which can be determined by adjacency strength, transition probability, or distance decay). The final output is quality backfilling routing data, which is essentially a quality transport network constrained by state neighborhood, clearly defining "where to backfill from and to where, allowed paths, and allocation ratios".
[0104] Using missing probability quality data as the "quality requirement" to be compensated, and quality backfilling routing data as the allowed transportation path constraint, the quality amount to be compensated for each requirement bucket is determined. Then, the compensation quality is distributed along the routing network according to the routing weights to the samples in the corresponding buckets. When backfilling at the sample level, the quality can be proportionally allocated within the bucket according to the relative weight or confidence level of the samples, allowing high-confidence samples to receive a larger backfill increment. The entire process must satisfy the probability quality conservation constraint (the sum of the weights of all samples after backfilling restores to the reference total quality, and the inflow and outflow of each routing edge are conserved), and simultaneously satisfy the weight non-negativity and normalization constraints, ultimately yielding the backfilled sample weight data.
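As a non-limiting illustration, the backfilling may be sketched as follows. The sketch assumes that the routing data is a mapping from each demand bucket to weighted target buckets, and that the quality is shared within a bucket in proportion to the existing sample weights. The name `backfill` and the data layout are hypothetical.

```python
from collections import defaultdict


def backfill(states, weights, deficits, routes):
    """Distribute each bucket's missing quality along its allowed routes.

    deficits: state -> amount of quality to compensate.
    routes:   state -> {target state: routing weight} (allowed paths only).
    """
    bucket_total = defaultdict(float)
    bucket_count = defaultdict(int)
    for state, weight in zip(states, weights):
        bucket_total[state] += weight
        bucket_count[state] += 1

    inflow = defaultdict(float)
    for bucket, amount in deficits.items():
        # Only buckets that actually hold samples can receive quality.
        edges = {t: r for t, r in routes.get(bucket, {}).items()
                 if t in bucket_count}
        edge_sum = sum(edges.values())
        if edge_sum <= 0:
            continue  # no usable route; the final normalization absorbs it
        for target, routing_weight in edges.items():
            inflow[target] += amount * routing_weight / edge_sum

    updated = []
    for state, weight in zip(states, weights):
        if bucket_total[state] > 0:
            share = weight / bucket_total[state]
        else:
            share = 1.0 / bucket_count[state]
        updated.append(weight + inflow.get(state, 0.0) * share)

    total = sum(updated)
    return [w / total for w in updated]  # non-negative, sums to 1
```

Because every routed amount is split by normalized routing weights, the quality leaving a demand bucket equals the quality arriving at its targets, which is the per-edge conservation described above.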
[0105] The weights of backfilled samples are aggregated according to state labels or state buckets to obtain the posterior probability vector for each internal working state. Then, uncertainty measures (such as posterior entropy, the difference between the maximum and second-largest posterior probabilities, and the number of effective samples inversely derived from the sum of squared weights) are calculated based on this posterior probability vector. Weight concentration and multimodality are statistically analyzed at the sample level to determine whether a "dispersion / bimodal" phenomenon exists in the posterior. Simultaneously, the "degree of backfill introduction" is included as a source of uncertainty, i.e., the proportion of backfill quality in each state bucket or the magnitude of probability change before and after backfilling is calculated to distinguish between high probabilities supported by real observations and high probabilities mainly relying on conservation backfill compensation. Finally, the posterior probability vector, along with the aforementioned uncertainty measures and backfill introduction indicators, are structurally encapsulated to form state posterior data with confidence levels, where the confidence level can be obtained by jointly mapping the uncertainty measures and backfill introduction indicators.
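As a non-limiting illustration, the uncertainty measures may be computed as sketched below. The formula that combines them into a single confidence value is purely an assumption of this sketch, since the embodiment only states that the confidence level is obtained by a joint mapping. The name `uncertainty_summary` is hypothetical, and the sample weights are assumed to be normalized.

```python
import math


def uncertainty_summary(state_probs, sample_weights, backfilled_by_state):
    """Package entropy, top-two margin, effective samples and backfill ratios."""
    probs = sorted(state_probs.values(), reverse=True)
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    margin = probs[0] - (probs[1] if len(probs) > 1 else 0.0)
    effective_samples = 1.0 / sum(w * w for w in sample_weights)
    backfill_ratio = {
        s: (backfilled_by_state.get(s, 0.0) / p if p > 0 else 0.0)
        for s, p in state_probs.items()
    }
    # Assumed joint mapping: low entropy and little backfill -> high confidence.
    max_entropy = math.log(len(probs)) if len(probs) > 1 else 1.0
    raw = (1.0 - entropy / max_entropy) * (1.0 - sum(backfilled_by_state.values()))
    confidence = max(0.0, min(1.0, raw))
    return {
        "posterior": dict(state_probs),
        "entropy": entropy,
        "margin": margin,
        "effective_samples": effective_samples,
        "backfill_ratio": backfill_ratio,
        "confidence": confidence,
    }
```

Keeping the backfill ratio per state lets a consumer tell a high probability supported by observations from one that mostly comes from backfill compensation.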
[0106] The posterior probabilities of each internal working state and their corresponding state representations are read from the state posterior data with confidence levels. These state representations can be discrete state numbers or continuous intensity parameters associated with that state (e.g., intensity values obtained by mapping mechanism parameter vectors to garbage collection intensity, thermal throttling intensity, etc.). When the output is a discrete state, the posterior probabilities of each working state are used as weights to calculate the weighted expectation of the state number, and the state with the highest posterior probability is taken as the main state output, yielding a combination of "expected state + most likely state". When the output is a continuous intensity, the intensity values of each working state are weighted and summed according to their posterior probabilities to obtain the expected intensity. The variance or confidence interval of the intensity can be calculated at the same time as a stability measure. Finally, the above expectation estimation results are encapsulated, together with the confidence indicators (e.g., effective number of samples, posterior entropy, backfill ratio) from the state posterior data with confidence levels, into chip working state data, so that the final output includes both the state value and its confidence level.
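As a non-limiting illustration, the continuous-intensity case may be sketched as follows, assuming each state has a single scalar intensity value. The name `estimate_state` and the output field names are hypothetical.

```python
def estimate_state(state_probs, intensities):
    """Posterior-weighted intensity, its variance, and the most likely state."""
    most_likely = max(state_probs, key=state_probs.get)
    expected = sum(p * intensities[s] for s, p in state_probs.items())
    variance = sum(
        p * (intensities[s] - expected) ** 2 for s, p in state_probs.items()
    )
    return {
        "most_likely_state": most_likely,
        "expected_intensity": expected,
        "intensity_variance": variance,
    }
```

A large variance indicates that the posterior is spread over states with very different intensities, so the expected value alone would be a weak summary.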
[0107] In this embodiment, by estimating the missing probability quality and performing routing analysis based on state neighborhood, the compensation direction of the posterior probability quality can be accurately located and constrained; by conserving transport backfilling, the posterior normalization is restored and the bias caused by weight degradation is suppressed; and by structuring uncertainty and estimating expectation, a smooth state result with confidence is output, thereby improving the robustness and interpretability of chip operating state determination.
[0108] In an exemplary embodiment, the mechanism parameter vector data is compiled into a state dictionary to obtain dictionary constraint model data, including steps 702 to 708. Wherein:
[0109] Step 702: Perform parameter topology mapping on the mechanism parameter vector data to obtain state generation parameter domain data.
Step 704: Based on the state generation parameter domain data, perform conflict resolution analysis on the decision boundary constraints of the internal working states to obtain dissipation constraint set data.
Step 706: Perform reachability analysis pruning on the dissipation constraint set data to obtain the reachable state set data.
Step 708: Based on the reachable state set data, assemble and compile the state dictionary and state transition constraints of the internal working states to obtain dictionary constraint model data.
[0110] Among them, parameter topology mapping: partition the mechanism parameter vector according to the key dimensions required for state determination and establish adjacency relationships, so that the parameter space has a topological structure representation that can be used for state inference.
[0111] Among them, the state generation parameter domain data is structured data consisting of parameter partitions, the value range of each partition, and their adjacency relationships, which is used to describe the transitionability of mechanism parameters between different state regions.
[0112] Among them, the decision boundary constraint is a set of parameter threshold ranges, logical conditions, or combination rules used to determine whether a certain internal working state is valid.
[0113] Among them, conflict resolution analysis is a process of detecting and eliminating contradictions, such as overlaps, mutual exclusions, gaps, or circular dependencies, among the decision boundary constraints of different states.
[0114] Among them, the dissipation constraint set data is a data representation of the consistent and simultaneously applicable decision boundary constraint set obtained after conflict resolution.
[0115] Among them, reachability analysis pruning is a process of searching for accessible states and eliminating unreachable states and their transitions under given parameter evolution constraints and allowed transition relationships.
[0116] Among them, the reachable state set data is the set of state entries that are reasonably accessible under the current parameter domain and evolution conditions, as determined by reachability analysis.
[0117] Among them, the state dictionary is a set of entries that define the candidate internal working states. Each entry is associated with a state identifier, a corresponding subset of parameters, and decision boundary constraints.
[0118] Among them, state transition constraints are a set of rules that restrict the allowed / prohibited transition relationships between states and their transition conditions or transition strengths.
[0119] Among them, assembly and compilation is the process of structurally integrating and numbering the reachable state set, state dictionary entries, and state transition constraints into a dictionary constraint model that can be directly used for filtering inference.
[0120] Specifically, within each sliding time window, the mechanism parameter vector data (e.g., garbage collection intensity, wear leveling migration intensity, bad block replacement or remapping rate, cache write-back intensity, etc.) are normalized and dimensionally unified, and necessary derived quantities (e.g., parameter change rate, short-term mean, and fluctuation amplitude) are calculated to reflect the mechanism evolution trend. Then, the mechanism parameter vector data is mapped to a preset "state generation parameter domain," that is, the multidimensional parameter space is organized according to the key dimensions required for state determination. For example, each internal working state is associated with a subset of parameters and its effective value range, and the adjacency relationships between parameter domains (e.g., adjacent intervals, continuous transition intervals) are explicitly represented as a topological structure (e.g., interval graphs or partitioning relationships), thus forming the state generation parameter domain data.
[0121] Based on the parameter subsets and value ranges corresponding to each state in the state generation parameter domain data, a consistency check is performed on the decision boundary constraints of the internal working states. This involves checking for overlaps, mutual exclusions, gaps, or circular dependencies in the boundary constraints of different states on the same parameter dimension (e.g., multiple states simultaneously satisfying the same parameter interval, certain intervals failing to match any state, or boundary conditions between states mutually negating each other). When a conflict is detected, a conflict resolution strategy is executed to obtain a "dissipated" (that is, conflict-resolved) constraint set. This includes, for example, pruning overlapping intervals according to priority rules, merging adjacent intervals and introducing buffer zones, introducing arbitration conditions for mutually exclusive constraints (which can be assisted by parameter change rate or common mode strength), or establishing catch-all / transitional state constraints for gap intervals. Finally, the dissipated constraint set data is output, representing a decision boundary constraint set in which conflicts have been resolved, all constraints can be satisfied simultaneously, and reasonable coverage is achieved.
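The overlap pruning and gap filling described above can be illustrated on a single parameter dimension. The following is only a minimal sketch: the names `Boundary` and `resolve_conflicts`, the numeric priority rule, and the catch-all state name `transitional` are assumptions introduced for this example and are not specified by the application.

```python
from dataclasses import dataclass


@dataclass
class Boundary:
    state: str      # state identifier
    lo: float       # lower bound on one parameter dimension
    hi: float       # upper bound on the same dimension
    priority: int   # assumed rule: the higher priority wins an overlap


def resolve_conflicts(boundaries, domain_lo, domain_hi, catch_all="transitional"):
    """Return non-overlapping, gap-free (state, lo, hi) intervals over the domain."""
    # Every boundary end point is a cut, so each elementary segment
    # between two cuts has a fixed set of claiming states.
    cuts = {domain_lo, domain_hi}
    for b in boundaries:
        cuts.update(x for x in (b.lo, b.hi) if domain_lo < x < domain_hi)
    cuts = sorted(cuts)

    resolved = []
    for lo, hi in zip(cuts, cuts[1:]):
        claimants = [b for b in boundaries if b.lo <= lo and hi <= b.hi]
        if claimants:
            # Overlap: prune in favour of the highest-priority state.
            owner = max(claimants, key=lambda b: b.priority).state
        else:
            # Gap: assign the catch-all / transitional state.
            owner = catch_all
        if resolved and resolved[-1][0] == owner:
            # Merge adjacent segments owned by the same state.
            resolved[-1] = (owner, resolved[-1][1], hi)
        else:
            resolved.append((owner, lo, hi))
    return resolved
```

For two states whose intervals overlap on [0.3, 0.4] and leave [0.7, 1.0] uncovered, the sketch gives the overlap to the higher-priority state and fills the uncovered interval with the catch-all state.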
[0122] Based on the dissipation constraint set data, a "satisfiable region" is established for each internal working state. This region is the feasible parameter interval, within the state generation parameter domain, of the decision boundary constraint corresponding to that state. Then, taking the position of the mechanism parameter vector data in the parameter domain within the current time window as the starting point, the initial set of currently satisfiable states is determined. Next, a reachability search is performed under a preset "allowed transition relationship," which is jointly given by the topological adjacency relationship in the state generation parameter domain and the transition restrictions in the dissipation constraint set. During the search, two checks are applied to each candidate transition: first, whether the satisfiable region of the target state intersects the "parameter single-step change boundary" (i.e., whether the target state region can be entered within the allowed parameter change range); and second, whether the transition violates the mutual exclusion / prohibition conditions in the dissipation constraint set. Any state or transition edge that fails either check is pruned directly. States that pass both checks are marked as reachable and expanded further until a preset search depth is reached or coverage converges. Finally, all states marked as reachable are aggregated into the reachable state set data.
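A breadth-first search is one simple way to realize the reachability pruning described above. The sketch below works on a single parameter dimension and approximates the parameter values attainable inside a state by the interval through which the state was entered. The function name, the argument layout, and this approximation are assumptions made for illustration only.

```python
from collections import deque


def reachable_states(regions, allowed, forbidden, current, max_step, max_depth):
    """Breadth-first reachability search with pruning.

    regions   : state -> (lo, hi) satisfiable interval on one parameter
    allowed   : state -> iterable of candidate successor states
    forbidden : set of (src, dst) transitions prohibited by the constraint set
    current   : parameter value observed in the current time window
    max_step  : bound on the parameter change per transition
    """
    start = [s for s, (lo, hi) in regions.items() if lo <= current <= hi]
    # For each reached state, remember the parameter interval it was entered with.
    reach = {s: (current, current) for s in start}
    edges = set()
    queue = deque((s, 0) for s in start)
    while queue:
        state, depth = queue.popleft()
        if depth >= max_depth:
            continue
        lo, hi = reach[state]
        step_lo, step_hi = lo - max_step, hi + max_step
        for nxt in allowed.get(state, ()):
            if (state, nxt) in forbidden:
                continue  # check 2 failed: prohibited transition
            r_lo, r_hi = regions[nxt]
            i_lo, i_hi = max(step_lo, r_lo), min(step_hi, r_hi)
            if i_lo > i_hi:
                continue  # check 1 failed: target region out of single-step range
            edges.add((state, nxt))
            if nxt not in reach:
                reach[nxt] = (i_lo, i_hi)
                queue.append((nxt, depth + 1))
    return set(reach), edges
```

The returned edge set is the pruned transition relationship that a later compilation step could turn into an adjacency list.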
[0123] Using the reachable state set data as compilation input, a state dictionary entry is generated for each reachable state and assigned a unique index. Each state dictionary entry contains at least the state's identifier, its corresponding decision boundary constraint reference in the dissipation constraint set data, the index of the mechanism parameter subset associated with that state, and the definition of the state output field used for inference. The set of allowed successor states for each state is extracted from the transition relationships preserved from reachability analysis and solidified into a state transition constraint structure (e.g., adjacency list or adjacency matrix). Simultaneously, a transition strength parameter or penalty parameter is assigned to each allowed transition edge. These parameters are generated from the rate of change of the mechanism parameter vector data, the neighborhood distance in the state generation parameter domain, or a stability index and written into the transition edge attributes. Finally, the state dictionary entry set and the state transition constraint structure are checked for consistency (ensuring that all edge-referenced nodes are within the reachable set and that there are no prohibited transitions), and both are encapsulated into a unified data object and output as dictionary constraint model data.
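The compilation step can be pictured as follows. This is a hedged sketch only: the dictionary layout, the `compile_model` name, and the caller-supplied `strength` function are hypothetical, and a real state dictionary entry would also carry the boundary constraint reference, parameter subset index, and output field definition listed above.

```python
def compile_model(reachable, edges, forbidden, strength):
    """Assemble a state dictionary and an adjacency-list transition structure.

    reachable : set of reachable state identifiers
    edges     : set of (src, dst) transitions kept by the reachability analysis
    forbidden : set of (src, dst) prohibited transitions
    strength  : function (src, dst) -> transition strength or penalty
    """
    # Assign a unique, deterministic index to every reachable state.
    index = {state: i for i, state in enumerate(sorted(reachable))}
    entries = [{"index": i, "state_id": s} for s, i in index.items()]

    transitions = {i: {} for i in index.values()}
    for src, dst in edges:
        # Consistency check: edges may only reference reachable states
        # and must not be prohibited transitions.
        if src not in index or dst not in index:
            raise ValueError(f"edge {src}->{dst} references an unreachable state")
        if (src, dst) in forbidden:
            raise ValueError(f"edge {src}->{dst} is a prohibited transition")
        transitions[index[src]][index[dst]] = strength(src, dst)

    return {"entries": entries, "transitions": transitions}
```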
[0124] In this embodiment, by mapping the mechanism parameter vector to a parameter topology and forming a state generation parameter domain, a computable correspondence can be established between internal mechanism changes and state space structure. By resolving conflicts in the decision boundary constraints to obtain a set of dissipated constraints, ambiguity caused by overlaps, mutual exclusions, and gaps in state boundaries can be avoided. By pruning through reachability analysis to obtain a set of reachable states, states and transitions that are impossible under the current parameter evolution conditions can be eliminated, thereby reducing inference complexity and unreasonable jumps. By assembling and compiling the state dictionary and state transition constraints to generate a dictionary constraint model, a consistent, constrained, and interpretable set of models and transition structures can be provided for subsequent filtering inference, thereby improving the convergence speed, stability, and accuracy of state inference.
[0125] In an exemplary embodiment, based on the probe event spatiotemporal annotation data, mapped fluctuation fingerprint analysis is performed on the delayed base sequence data to obtain mapped fluctuation fingerprint MVF data, including steps 802 to 808. Wherein:
[0126] Step 802: Based on the probe occurrence time data in the probe event spatiotemporal annotation data, perform probe-driven phase folding and alignment on the delayed base sequence data to obtain phase response trajectory data.
[0127] Step 804: Extract the wave spectrum signature across time scales from the phase response trajectory data to obtain multi-scale wave spectrum signature data.
[0128] Step 806: Perform joint embedding modeling of long-tail statistical features and correlation features based on logical address grouping on the delay data in the phase response trajectory data to obtain joint embedding feature data.
[0129] Step 808: Perform feature fusion on the multi-scale wave spectrum signature data and the joint embedding feature data to obtain the mapped fluctuation fingerprint MVF data.
[0130] Among them, the probe occurrence time data is a timestamp used to identify the trigger time of each active perturbation probe event.
[0131] Among them, probe-driven phase folding alignment is an alignment process in which, based on the probe occurrence time, the delayed sequence is converted to relative time and multiple probe response windows are superimposed and normalized so that event phases are consistent.
[0132] Among them, phase response trajectory data is the delayed response sequence or statistical trajectory data obtained by aggregation in a relative time (or phase position) coordinate system, used to characterize the typical response morphology after probe excitation.
[0133] Among them, the wave spectrum signature extraction across time scales is a process that extracts the wave energy distribution and spectral morphology of the phase response trajectory at multiple time scales to form a scale-related structural signature.
[0134] Among them, multi-scale wave spectrum signature data consists of structured spectrum signature data composed of energy spectrum characteristics, peak characteristics, and energy ratios between scales at different time scales.
[0135] Among them, long-tail statistical features are statistics extracted from the tail of the delay distribution (high quantile or the part exceeding the threshold), which are used to characterize the frequency and severity of extreme delays.
[0136] Among them, the correlation feature based on logical address grouping is a set of correlation indicators obtained by measuring the synchronicity and dependency between latency fluctuations of different groups after grouping logical address ranges according to rules.
[0137] Among them, joint embedding modeling is a process that couples and models long-tail statistical features and group correlation features in the same representation space and maps them to a unified vector representation.
[0138] Among them, joint embedding feature data is the low-dimensional or fixed-form vector feature output by joint embedding modeling, used to simultaneously express tail risk intensity and cross-group synchronization structure.
[0139] Specifically, based on the probe occurrence time data in the probe event spatiotemporal annotation data, the delayed base sequence data is mapped to a relative time coordinate with the "probe occurrence time" as the reference. For each probe event corresponding to the probe occurrence time data, a delay segment within a preset alignment window is extracted, and the timestamp of each sampling point within the segment is converted to a relative time relative to the probe time. The relative time segments of probe events corresponding to multiple probe occurrence time data are superimposed and folded according to their relative time positions (which can be grouped and folded separately according to probe type or probe logical address range), and multiple observations at the same relative time position are aggregated (e.g., by calculating the mean, quantiles, or weighted statistics), thereby compressing the "multiple event responses" into one or more comparable phase response trajectories. Finally, the phase response trajectory data is output, which includes the relative time (or phase position) index and the corresponding delayed response statistics.
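The folding and aggregation described above can be sketched as follows. The function name `phase_fold`, the use of the mean as the aggregate, and the restriction of the alignment window to non-negative relative times are assumptions chosen to keep the example short.

```python
from collections import defaultdict
from statistics import mean


def phase_fold(samples, probe_times, window, bin_width):
    """Fold latency samples onto a relative-time axis anchored at each probe.

    samples     : list of (timestamp, latency)
    probe_times : list of probe occurrence timestamps
    window      : length of the alignment window after each probe
    bin_width   : width of one relative-time (phase) bin
    Returns a list of (bin_start, mean_latency, sample_count) sorted by bin.
    """
    bins = defaultdict(list)
    for t0 in probe_times:
        for t, latency in samples:
            rel = t - t0  # convert to time relative to the probe
            if 0 <= rel < window:
                bins[int(rel // bin_width)].append(latency)
    # Aggregate the observations that share a relative-time position.
    return [(k * bin_width, mean(v), len(v)) for k, v in sorted(bins.items())]


trajectory = phase_fold(
    samples=[(10, 1.0), (11, 3.0), (12, 1.0), (20, 2.0), (21, 5.0), (22, 1.0)],
    probe_times=[10, 20],
    window=3,
    bin_width=1,
)
```

Grouping by probe type or by probe logical address range would amount to calling the same function once per group.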
[0140] Detrending and scaling of the phase response trajectory data are performed to eliminate slow-varying drift and dimensional differences. Then, the trajectory is decomposed into multiple scales under different scale parameters (e.g., energy distribution is calculated according to different window lengths or scale resolutions), and spectral signatures that characterize the wave structure are extracted from each scale (e.g., energy proportion vector, peak position and intensity, energy ratio between scales, etc.). Finally, the spectral signatures of each scale are spliced or stacked in a unified order to form multi-scale wave spectral signature data, thereby characterizing the structural differences of the mapped wave at different scales such as "short-term impact - medium-term decline - long-term tail".
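One possible reading of the multi-scale decomposition is a cascade of moving averages, where each band holds the detail removed by the next coarser smoothing. The sketch below follows that reading. The moving-average cascade, the window widths, and the normalization by the sum of band energies are assumptions, since the application does not fix a particular transform.

```python
def moving_average(x, width):
    """Centered moving average; the window is clipped at both ends."""
    half = width // 2
    out = []
    for i in range(len(x)):
        seg = x[max(0, i - half): i + half + 1]
        out.append(sum(seg) / len(seg))
    return out


def multiscale_signature(trajectory, widths=(3, 9)):
    """Energy share, peak position, and inter-scale energy ratio per band."""
    mean = sum(trajectory) / len(trajectory)
    current = [v - mean for v in trajectory]  # remove the constant offset
    bands = []
    for w in widths:
        smooth = moving_average(current, w)
        # Detail lost when smoothing at this scale.
        bands.append([c - s for c, s in zip(current, smooth)])
        current = smooth
    bands.append(current)  # coarsest residual ("long-term tail")

    energies = [sum(v * v for v in band) for band in bands]
    total = sum(energies)
    shares = [e / total if total > 0 else 0.0 for e in energies]
    peaks = [max(range(len(b)), key=lambda i: abs(b[i])) for b in bands]
    ratios = [
        energies[k + 1] / energies[k] if energies[k] > 0 else 0.0
        for k in range(len(energies) - 1)
    ]
    return {"energy_share": shares, "peak_index": peaks, "scale_ratio": ratios}
```

The three returned lists correspond to the energy proportion vector, the peak positions, and the energy ratios between scales mentioned above; concatenating them in a fixed order gives one signature vector.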
[0141] Tail-related statistics (such as high quantile delay, over-threshold ratio, and over-threshold mean) are calculated from the delay data of the phase response trajectory data to form long-tail statistical features. Then, logical address range identifiers are used to group logical addresses according to preset rules, and corresponding phase response trajectories or delay response statistical sequences are formed for each logical address group. Subsequently, correlation indicators of synchronization fluctuations between different logical address groups (such as the inter-group correlation coefficient, the cross-correlation peak, and the corresponding lag) are calculated to form correlation features. The long-tail statistical features and correlation features are jointly modeled in the same embedding space (for example, the two types of features are coupled and represented so that the tail strength and inter-group synchronization structure are expressed in the same vector), and the joint embedding feature data is output to enhance the distinguishability of "global common-mode fluctuations" and "local mapping fluctuations".
[0142] The multi-scale wave spectrum signature data and the joint embedding feature data are dimensionally aligned and scaled (e.g., differences in scale between features are eliminated according to preset weights or standardization rules). Then, they are spliced or weighted and combined into a unified feature vector according to a preset fusion strategy. Where necessary, shaping processing (e.g., dimensionality reduction, redundancy removal, or stabilization mapping) can be performed to improve consistency across windows and devices. Finally, the mapped fluctuation fingerprint MVF data is output, which contains both multi-scale wave spectrum structure information and "tail-correlation" coupling structure information.
[0143] In this embodiment, by performing probe-driven phase folding alignment on the delayed base sequence based on the probe occurrence time, the delayed responses under multiple probe excitations can be compressed into a repeatable phase response trajectory, thereby significantly reducing the impact of load time shift and random jitter on the consistency of feature extraction. Furthermore, by extracting the wave spectrum signature across time scales and jointly embedding and modeling the long-tail statistical features and the correlation features based on logical address grouping, the multi-scale structure, tail risk, and global / local synchronization differences of the mapped fluctuation can be characterized simultaneously. Finally, the two types of features are fused to form the mapped fluctuation fingerprint MVF data, which makes the non-stationary fluctuations caused by mapping changes in the flash translation layer (FTL) more identifiable and more resistant to interference, providing more stable and sensitive input features for subsequent common mode decoupling and internal working state inference.
[0144] In an exemplary embodiment, joint embedding modeling of long-tail statistical features and correlation features based on logical address grouping is performed on the delay data in the phase response trajectory data to obtain joint embedding feature data, including steps 902 to 910. Wherein:
[0145] Step 902: Perform phase domain tail gate extraction on the delayed data in the phase response trajectory data to obtain the over-threshold delayed subsequence data.
[0146] Step 904: Group the logical address range identifier data corresponding to the phase response trajectory data to obtain the data of each logical address group.
[0147] Step 906: Based on the over-threshold delay subsequence data, perform coherence graph analysis on the synchronization fluctuation relationship between different logical address groups to obtain inter-group coherence graph data.
[0148] Step 908: Based on the over-threshold delayed subsequence data and the inter-group coherence graph data, perform coupled embedding modeling on the long-tail statistical features and the inter-group coherence features between different logical address groups to obtain coupled embedding representation data.
[0149] Step 910: Dimensionality reduction and shaping of the coupled embedding representation data to obtain joint embedding feature data.
[0150] Among them, phase domain tail gate extraction is the process of filtering delay tail samples and forming a tail event sequence according to preset phase interval and threshold rules in the probe phase alignment coordinate system.
[0151] Among them, the over-threshold delayed subsequence data is the tail delay sample sequence composed of samples whose delay values exceed the threshold, together with their phase indices, event identifiers, etc.
[0152] Among them, the logical address range identifier data is used to identify the logical address range (starting logical address and length or segment number) corresponding to each request or probe response.
[0153] Among them, the data of each logical address group is the set of address groups, together with their associated sample index sets, obtained by dividing the logical address range identifier data according to preset grouping rules.
[0154] Among them, the synchronous fluctuation relationship is the dependency between the tail delay events or intensity sequences of different logical address groups after phase alignment, exhibited as simultaneous occurrence or change in the same direction.
[0155] Among them, coherence graph analysis is a graph analysis process that constructs and evaluates the inter-group connection structure using logical address groups as nodes and inter-group synchronization fluctuation intensity as edge weights.
[0156] Among them, inter-group coherence graph data is graph data consisting of nodes (logical address groups) and weighted edges (carrying synchronization strength, lag, and other attributes), used to represent the inter-group synchronization structure.
[0157] Among them, inter-group coherence features are structural and weight statistics extracted from the inter-group coherence graph, such as the proportion of strong edges, average synchronization strength, connectivity and clustering.
[0158] Among them, coupled embedding modeling is a process that jointly models long-tail statistical features and inter-group coherence features in the same representation space and generates a unified vector representation.
[0159] Among them, coupled embedding representation data is the vector or group of vectors output by coupled embedding modeling, used to simultaneously express the tail risk intensity and cross-group synchronization topology.
[0160] Among them, dimensionality reduction and shaping is the process of de-redundancy compression of coupled embedded representations and mapping them to fixed-dimensional, numerically stable representations.
[0161] Specifically, the phase range and threshold strategy for tail gate control are determined in the phase response trajectory data. The phase range can be selected from the phase interval where extreme delays are most likely to occur after probe triggering (e.g., the interval where the relative time is positive and falls within a preset window). The threshold strategy can be determined by the high quantile of the delay data in this phase interval (e.g., using the delay corresponding to the 95th or 99th percentile as the threshold), or a fixed threshold or an adaptive threshold (updated with the window) can be used. Then, the delay samples in the phase response trajectory data are gated point by point. When the sample phase position falls within the phase range and the delay value exceeds the threshold, the sample, its phase index, and the corresponding probe type identifier (if any) are written into the tail sample set. They can be merged according to continuous over-threshold segments, and the start and end phases, duration, and peak value of each segment are recorded to preserve the tail event pattern. Finally, the selected over-threshold samples are organized into over-threshold delay subsequence data according to phase order.
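The gating and segment merging described above might look like the sketch below. It assumes integer phase indices, a nearest-rank percentile computed over the gated phase range, and a strict "greater than" comparison; the names `tail_gate` and `pct` are introduced for this example only.

```python
def tail_gate(trajectory, phase_lo, phase_hi, pct=95):
    """Phase-domain tail gating with a percentile threshold.

    trajectory : list of (phase_index, latency) with integer phase indices
    phase_lo, phase_hi : inclusive phase range in which gating is applied
    pct : integer percentile used as the threshold (nearest-rank rule)
    Returns (threshold, over_threshold_samples, merged_segments).
    """
    gated = [(ph, lat) for ph, lat in sorted(trajectory) if phase_lo <= ph <= phase_hi]
    if not gated:
        return None, [], []

    ordered = sorted(lat for _, lat in gated)
    rank = max(1, -(-pct * len(ordered) // 100))  # integer ceiling of pct*n/100
    threshold = ordered[rank - 1]

    # Keep only samples inside the phase range whose latency exceeds the threshold.
    over = [(ph, lat) for ph, lat in gated if lat > threshold]

    # Merge consecutive over-threshold phases into tail event segments.
    segments = []
    for ph, lat in over:
        if segments and ph == segments[-1]["end"] + 1:
            segments[-1]["end"] = ph
            segments[-1]["peak"] = max(segments[-1]["peak"], lat)
        else:
            segments.append({"start": ph, "end": ph, "peak": lat})
    for seg in segments:
        seg["duration"] = seg["end"] - seg["start"] + 1
    return threshold, over, segments
```

With very few samples a high percentile can coincide with the maximum, in which case nothing exceeds it; a fixed or adaptive threshold, as the paragraph above allows, avoids that case.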
[0162] Extract the logical address range identifier data corresponding one-to-one with the phase response trajectory data (e.g., the starting logical address and length corresponding to each input / output request or each probe response segment, or the merged logical segment number), and group the logical address ranges according to preset grouping rules. These grouping rules can be: bucketing by logical address intervals (dividing intervals with a fixed step size), grouping by access frequency (dividing into hot, warm, cold, or multi-quantile groups based on access / write frequency within a window), or grouping by probe coverage (grouping within and outside the probe coverage area, or further subdividing by probe type). For each group, establish a "group-sample mapping relationship," that is, aggregate the delay sample index, phase index, and corresponding over-threshold sample index belonging to that group into the same group structure, and count basic quantities such as the number of samples within the group and the number of over-threshold samples within the group. Finally, output the data for each logical address group, including the address range definition of each group and its associated delay / over-threshold sample index set.
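As a small illustration of the first grouping rule (bucketing by fixed-step logical address intervals) and of the "group-sample mapping relationship", consider the sketch below. It assigns each request to a bucket by its starting logical address only, which is a simplification; the function and field names are hypothetical.

```python
def group_by_address(requests, step, over_threshold):
    """Bucket samples by logical address interval of fixed step size.

    requests       : list of (sample_index, start_logical_address)
    step           : bucket width in logical addresses
    over_threshold : set of sample indices flagged by tail gating
    Returns group_id -> {"range", "samples", "over"}.
    """
    groups = {}
    for idx, start_lba in requests:
        gid = start_lba // step
        group = groups.setdefault(
            gid,
            {"range": (gid * step, (gid + 1) * step - 1), "samples": [], "over": []},
        )
        group["samples"].append(idx)
        if idx in over_threshold:
            group["over"].append(idx)
    return groups
```

The per-group sample counts mentioned above are then simply `len(group["samples"])` and `len(group["over"])`.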
[0163] Using the over-threshold delay subsequence data as the core reference and leveraging the "group-sample mapping relationship" provided by the data of each logical address group, an over-threshold event sequence or over-threshold intensity sequence is constructed within each logical address group. For example, the number of over-threshold occurrences, or the mean or peak of the over-threshold delays, is computed at each phase position on a unified phase grid to obtain comparable "tail response sequences" for each group. Then, the synchronization fluctuation relationship between any two groups is calculated: either the correlation coefficient, cross-correlation peak, and corresponding phase lag of the tail response sequences of the two groups, or the co-occurrence rate and consistency score of tail events within the same phase interval. The resulting synchronization intensity is used as the inter-group edge weight. When the synchronization intensity exceeds a preset threshold, an edge is established between the two groups; otherwise, no edge is established or a weak connection is assigned. Finally, with all logical address groups as nodes and synchronization intensity as edge weight, the inter-group coherence graph data is formed, which can include attributes such as lag parameters and significance scores for each edge.
[0164] Long-tail statistical features are extracted from the over-threshold delay subsequence data: the high-quantile delay (e.g., 95th / 99th percentile), over-threshold proportion, over-threshold mean and variance, tail conditional expectation (mean of the samples exceeding a given quantile), and the distributions of tail event duration and peak are calculated at the overall level or separately for each logical address group, forming a long-tail statistical feature vector. Then, inter-group coherence features are extracted from the inter-group coherence graph data, such as edge weight statistics (average synchronization strength, maximum synchronization strength, strong edge proportion), graph structure statistics (node degree distribution, number of connected components, clustering coefficient), and the proportion of edges with lag, forming an inter-group coherence feature vector. Coupled embedding modeling is then performed on the two feature vectors so that both types of features are coupled in the same representation space. During the coupled embedding modeling process, a joint representation can be generated by using "long-tail features as node attributes and coherence graph as structural constraints" (for example, aggregating node representations on the coherence graph and then gating and fusing them with the global long-tail vector). Alternatively, the tail severity can be injected into the graph structure by "long-tail intensity modulating coherence edge weights," and a global embedding can then be generated from the weighted graph. Finally, coupled embedding representation data is output, which is a unified vector or vector group that can simultaneously express "tail risk intensity" and "cross-group synchronization structure".
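The construction of the inter-group coherence graph can be sketched with a zero-lag Pearson correlation as the synchronization intensity. This is only one of the measures the paragraph above permits; lag search and significance scores are omitted, and the names and the edge threshold value are assumptions.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def coherence_graph(tail_sequences, edge_threshold=0.6):
    """Build the inter-group coherence graph.

    tail_sequences : group_id -> tail response sequence on a shared phase grid
    edge_threshold : minimum synchronization intensity for an edge
    """
    edges = {}
    for g, h in combinations(sorted(tail_sequences), 2):
        weight = pearson(tail_sequences[g], tail_sequences[h])
        if weight >= edge_threshold:
            edges[(g, h)] = weight
    return {"nodes": sorted(tail_sequences), "edges": edges}


graph = coherence_graph({0: [0, 2, 4, 0], 1: [0, 1, 2, 0], 2: [3, 0, 0, 3]})
```

In this toy input, groups 0 and 1 spike at the same phase positions and are joined by an edge, while group 2 moves in the opposite direction and stays isolated.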
[0165] Before dimensionality reduction and shaping of the coupled embedding representation data, the features of each dimension are first standardized or range-scaled to eliminate dimensional differences. Highly redundant dimensions can be removed by using variance or correlation thresholds. Only then is dimensionality reduction and shaping performed, compressing the coupled embedding representation into fixed-dimensional, cross-window stable joint embedding features. In practice, linear dimensionality reduction is generally used (e.g., retaining several dimensions with the largest explained variance along the principal direction) or a constrained shaping mapping is used (e.g., maintaining the adjacency relationship of the coherence graph while preserving the embedding distance structure as much as possible). After dimensionality reduction, the output vector is normalized to improve comparability between different devices / windows, ultimately obtaining the joint embedding feature data.
[0166] In this embodiment, by performing phase domain tail gate extraction on the delay data in the phase response trajectory to obtain the over-threshold delay subsequence, extreme delay events related to mapping fluctuations can be highlighted against background fluctuations, improving the sensitivity to tail risks. Then, by grouping the logical address range identifier data and performing inter-group coherence graph analysis based on the over-threshold subsequence, the synchronization structure of tail fluctuations between different logical address groups can be characterized, thereby distinguishing global common-mode synchronization from local mapping anomalies. Furthermore, long-tail statistical features are extracted from the over-threshold subsequence and coupled with inter-group coherence features for embedding modeling, which can simultaneously express tail severity and cross-group synchronization topology in the same representation, enhancing the identifiability of the mapping fluctuation structure. Finally, stable joint embedding feature data is obtained through dimensionality reduction and shaping, which reduces feature redundancy and noise sensitivity and improves comparability across windows and devices, thus providing more robust and discriminative input features for subsequent blind source separation and internal working state inference.
[0167] In an exemplary embodiment, blind source separation is performed on the mapped fluctuation fingerprint MVF data to obtain common mode performance factor data and mapped fluctuation factor data, including steps 1002 to 1008. Wherein:
[0168] Step 1002: Perform common mode discrimination mask aggregation on the cross-logical address group correlation features in the mapped fluctuation fingerprint MVF data to obtain common mode discrimination mask data.
[0169] Step 1004: Based on the common-mode discrimination mask data, construct the subspace leakage constraint for blind source separation to obtain leakage constraint data.
[0170] Step 1006: Based on the leakage constraint data, perform constrained blind source separation on the mapped fluctuation fingerprint MVF data to obtain candidate factor set data.
[0171] Step 1008: Assign factor identities to the candidate factor set data to obtain common mode performance factor data and mapped fluctuation factor data.
[0172] Among them, cross-logical address group correlation features are a set of correlation indicators that measure the synchronicity and dependency between latency or fingerprint feature fluctuations of different logical address groups.
[0173] Among them, common mode discrimination mask aggregation is a process that thresholds and summarizes cross-group correlation structures according to the criterion of "synchronization of the majority address group" to form a common mode labeling mask.
[0174] Among them, common-mode discrimination mask data is a binary or soft-weighted mask data used to indicate which cross-group synchronization structures belong to the common-mode channel and which belong to the non-common-mode channel.
[0175] Among them, the subspace leakage constraint is a constraint rule or penalty term that restricts common mode components from entering the mapped fluctuation factor subspace during the blind source separation process.
[0176] Among them, leakage constraint data is a structured representation of subspace leakage constraints, including mask index, leakage metric form and its threshold / weight parameters, etc.
[0177] Among them, constrained blind source separation is a process of decomposing the mapped fluctuation fingerprint under conditions such as leakage constraints to separate mutually distinguishable latent factor components.
[0178] Among them, the candidate factor set data is a data set composed of the multiple candidate factors output by constrained blind source separation, their activation / weight sequences, and related discriminant statistics.
[0179] Among them, factor identity assignment is the process of classifying candidate factors as common-mode performance factors or mapped fluctuation factors based on the common-mode and mapped fluctuation criteria.
[0180] Specifically, cross-logical address group correlation features (such as correlation coefficient matrices, cross-correlation peak matrices, coherence graph edge weight matrices, or their low-dimensional representations) are extracted from the mapped fluctuation fingerprint MVF data. These correlation features are then symmetrized, denoised, and scaled (e.g., the correlation values are mapped to a unified interval and obvious outliers are removed). The correlation structure is then aggregated according to the discrimination principle of "common mode = synchronization of most address groups." Specifically, for each time window or feature block, the average correlation strength between each address group and the remaining address groups, the proportion of strongly correlated neighbors, and the proportion of global synchronization components are calculated. Based on these statistics, the correlation structure is binarized or soft-thresholded into a mask (e.g., when the correlation strength exceeds the threshold and the proportion of strongly correlated neighbors exceeds the threshold, it is marked as common mode correlation; otherwise, it is marked as non-common mode correlation). The mask can also be smoothed or constrained along the time axis (to avoid frequent mask flipping), ultimately yielding common mode discrimination mask data that indicates "which inter-group synchronization structures belong to the common mode channel and which are more likely to belong to local mapping fluctuations."
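The binarized variant of the mask aggregation can be sketched for a single time window as follows. The threshold values, the function name, and the requirement that both end points of an edge have a majority of strongly correlated neighbors are assumptions made for this example; temporal smoothing is not shown.

```python
def common_mode_mask(corr, strength_thr=0.6, neighbor_ratio_thr=0.5):
    """Binary common-mode mask from a symmetric inter-group correlation matrix.

    corr : n x n list of lists; the diagonal is ignored
    An edge (i, j) is marked common-mode (1) when its correlation is strong
    and both groups are strongly correlated with most other groups.
    """
    n = len(corr)
    strong_ratio = []
    for i in range(n):
        others = [corr[i][j] for j in range(n) if j != i]
        strong = sum(1 for c in others if c >= strength_thr)
        strong_ratio.append(strong / len(others) if others else 0.0)

    mask = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if (
                i != j
                and corr[i][j] >= strength_thr
                and strong_ratio[i] > neighbor_ratio_thr
                and strong_ratio[j] > neighbor_ratio_thr
            ):
                mask[i][j] = 1
    return mask


# Groups 0, 1, 2 fluctuate together; group 3 behaves independently.
corr = [
    [1.0, 0.9, 0.9, 0.1],
    [0.9, 1.0, 0.9, 0.1],
    [0.9, 0.9, 1.0, 0.1],
    [0.1, 0.1, 0.1, 1.0],
]
mask = common_mode_mask(corr)
```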
[0181] The target structure of the common-mode subspace is determined based on the common-mode discrimination mask data (e.g., requiring common-mode factors to exhibit a consistent direction on most address groups and maintain high consistency on the strongly synchronized edges marked by the mask). Simultaneously, a leakage metric is defined (e.g., the synchronization consistency score induced by the mapped volatility factor on the strongly synchronized edges of the mask, or the energy proportion of the mapped volatility factor projected onto the common-mode direction). This leakage metric is then written into the penalty term in the separation optimization objective or as an explicit inequality constraint, ensuring that any common-mode structure of the mapped volatility factor is penalized or suppressed by projection during the optimization process. A constraint strength parameter is set (which can be adaptively adjusted according to the mask strength), ultimately forming leakage constraint data, including: mask index, leakage metric form, penalty weight / threshold, and optional common-mode structure priors (such as low-rank / synchronization consistency constraints).
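Taking the second example metric above (the energy proportion of a factor projected onto the common-mode direction), the leakage metric, the projection-based suppression, and the penalized objective can be written in a few lines. The sketch assumes a single common-mode direction represented as a plain vector; all function names and the `weight` parameter are hypothetical.

```python
def leakage(factor, common_dir):
    """Share of the factor's energy that lies along the common-mode direction."""
    dot = sum(f * c for f, c in zip(factor, common_dir))
    cc = sum(c * c for c in common_dir)
    ff = sum(f * f for f in factor)
    if cc == 0 or ff == 0:
        return 0.0
    return dot * dot / (cc * ff)


def strip_common(factor, common_dir):
    """Remove the component of the factor along the common-mode direction."""
    cc = sum(c * c for c in common_dir)
    if cc == 0:
        return list(factor)
    scale = sum(f * c for f, c in zip(factor, common_dir)) / cc
    return [f - scale * c for f, c in zip(factor, common_dir)]


def penalized_objective(reconstruction_error, factor, common_dir, weight):
    """Separation objective with the leakage metric added as a penalty term."""
    return reconstruction_error + weight * leakage(factor, common_dir)


common_dir = [1.0, 1.0, 1.0, 1.0]   # all address groups move together
factor = [2.0, 0.0, 0.0, 0.0]       # mostly local, but with some common-mode content
before = leakage(factor, common_dir)
after = leakage(strip_common(factor, common_dir), common_dir)
```

In an alternating-minimization scheme, `strip_common` would be applied to each mapped fluctuation factor after every update, whereas `penalized_objective` corresponds to the soft-penalty alternative.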
[0182] The mapped wave fingerprint MVF data is organized into observation matrices or tensors to be decomposed (e.g., stacking eigenvectors by time windows to form a matrix, or forming a tensor by "time × feature block × address group"). After introducing leakage constraint data, constrained blind source separation is performed. During the decomposition process, the factor matrix and mixing coefficients are solved simultaneously to minimize the reconstruction error and satisfy the subspace leakage constraint (e.g., by alternating minimization: fixing the mixing coefficients to update the factors, then fixing the factors to update the mixing coefficients, and applying leakage penalty or projection stripping to the mapped wave factors after each update to suppress their consistency on the mask's strong synchronization edges). To ensure interpretability, common constraints such as nonnegativity, sparsity, or smoothness can be added, but the core is that the leakage constraint must participate in the optimization, so that the separation result forms a separable factor set between the "common-mode synchronization structure" and the "non-common-mode local structure". After iteration to convergence, candidate factor set data is output. The set contains multiple candidate factors and their temporal weights / activation sequences, and includes the synchronization score of each factor on the mask edge and the local difference score on the non-mask edge.
[0183] For each factor in the candidate factor set, a "common-mode score" and a "mapped fluctuation score" are calculated. The common-mode score combines terms such as the consistency score on the mask's strongly synchronized edges, the proportion of same-direction changes across address groups, the low-rank energy concentration, and the proportion of slowly varying trend over time. The mapped fluctuation score combines terms such as the intensity of local differences on non-mask edges, the structural selectivity associated with logical address grouping, and the intensity of the change in tail-correlation coupling after phase alignment. Then, according to the scoring rules, the one or more factors with the highest common-mode score that meet the threshold condition are assigned as the common-mode performance factor data, and the remaining factors that meet the mapped fluctuation criterion are assigned as the mapped fluctuation factor data. If there are boundary factors (factors whose two scores are close), the decision is made according to the "minimum leakage" principle or by giving priority to consistency with the common-mode discrimination mask. Finally, the two types of factors are rescaled and corrected for sign consistency.
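The assignment rule above can be sketched as a small decision function. This is an illustrative reading only: the thresholds, the margin that defines a "boundary" factor, the field names, and the particular interpretation of the "minimum leakage" principle (a boundary factor with high common-mode energy is kept out of the mapped fluctuation set) are assumptions, and the two scores are taken as already combined into single values.

```python
from dataclasses import dataclass

@dataclass
class FactorScores:
    common_mode: float         # combined common-mode score, assumed in [0, 1]
    mapped_fluctuation: float  # combined mapped fluctuation score, assumed in [0, 1]
    leakage: float             # common-mode leakage metric, assumed in [0, 1]

def assign_identities(scores, cm_threshold=0.6, mf_threshold=0.6,
                      margin=0.1, leakage_cut=0.5):
    """Label each candidate factor as common-mode, mapped fluctuation,
    or unassigned (hypothetical rule set)."""
    labels = []
    for s in scores:
        cm, mf = s.common_mode, s.mapped_fluctuation
        if cm < cm_threshold and mf < mf_threshold:
            labels.append("unassigned")
        elif abs(cm - mf) < margin:
            # Boundary factor: choose the label that leaks less common-mode
            # structure into the mapped fluctuation set.
            labels.append("common_mode" if s.leakage >= leakage_cut
                          else "mapped_fluctuation")
        elif cm > mf and cm >= cm_threshold:
            labels.append("common_mode")
        elif mf >= mf_threshold:
            labels.append("mapped_fluctuation")
        else:
            labels.append("unassigned")
    return labels
```

Rescaling and sign-consistency correction of the assigned factors would follow as a separate step and are not shown.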
[0184] In this embodiment, common-mode discrimination mask aggregation is performed on the cross-logical-address-group correlation features in the mapped fluctuation fingerprint, so that the common-mode structure of "synchronous changes across most address groups" is explicitly marked and provides a verifiable prior for subsequent decoupling. A subspace leakage constraint is constructed from this mask; it directly suppresses the leakage of common-mode components into the mapped fluctuation factor during the separation stage and reduces the contamination of the mapped fluctuation characterization by common-mode disturbances. Constrained blind source separation is then performed and a candidate factor set is output, after which factor identity assignment distinguishes the common-mode performance factor from the mapped fluctuation factor. This improves the purity and stability of both types of factors and makes global performance changes, such as thermal throttling and power supply current limiting, easier to distinguish from the mapping fluctuations of the flash translation layer (FTL).
[0185] It should be understood that although the steps in the flowcharts of the embodiments described above are shown sequentially according to the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some steps in the flowcharts of the embodiments described above may include multiple steps or multiple stages. These steps or stages are not necessarily completed at the same time, but can be executed at different times. The execution order of these steps or stages is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the steps or stages of other steps.
[0186] In one exemplary embodiment, a computer device is provided, which may be a server, and its internal structure may be as shown in Figure 3. This computer device includes a processor, memory, input / output interfaces (I / O), and communication interfaces.
[0187] Those skilled in the art will understand that the structure shown in Figure 3 is merely a block diagram of the portion of the structure related to the present application and does not constitute a limitation on the computer device to which the present application is applied. Specific computer devices may include more or fewer components than those shown in the figure, or combine certain components, or have different component arrangements.
[0188] In one embodiment, an eMMC state inference system based on FTL mapping fluctuations is also provided, comprising a visualization device and a computer device. The computer device is used to perform joint spatiotemporal processing on host-side I / O observation data and active perturbation probe event data corresponding to the built-in storage chip to obtain delayed base sequence data and probe event spatiotemporal annotation data. The computer device is used to perform mapped fluctuation fingerprint analysis on the delayed base sequence data based on the probe event spatiotemporal annotation data to obtain mapped fluctuation fingerprint (MVF) data. The computer device is used to perform blind source separation on the mapped fluctuation fingerprint (MVF) data to obtain common-mode performance factor data and mapped fluctuation factor data. The computer device is used to perform parameter inversion on the mapped fluctuation factor data based on a parameterized FTL digital twin model to obtain mechanism parameter vector data. The computer device is used to perform state-space inference analysis on the internal working state of the built-in storage chip based on the common-mode performance factor data and the mechanism parameter vector data to obtain chip working state data. The chip working state data is then displayed on the visualization device.
[0189] In one embodiment, a computer device is also provided, including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to implement the steps in the above method embodiments.
[0190] In one embodiment, a computer-readable storage medium is provided storing a computer program that, when executed by a processor, implements the steps in the above method embodiments.
[0191] In one embodiment, a computer program product or computer program is provided, the computer program product or computer program including computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium, and executes the computer instructions, causing the computer device to perform the steps in the above method embodiments.
[0192] It should be noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data used for analysis, data stored, data displayed, etc.) involved in this application are all information and data authorized by the user or fully authorized by all parties, and the collection, use and processing of the relevant data must comply with relevant regulations.
[0193] Those skilled in the art will understand that all or part of the processes in the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium. When the computer program is executed, it can include the processes of the embodiments of the above methods.
[0194] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.
[0195] The embodiments described above are merely illustrative of several implementation methods of this application, and while the descriptions are specific and detailed, they should not be construed as limiting the scope of this patent application. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this application should be determined by the appended claims.
Claims
1. An eMMC state inference method based on FTL mapping fluctuation, characterized in that the method includes: performing joint spatiotemporal processing on host-side I / O observation data and active perturbation probe event data corresponding to a built-in storage chip to obtain delayed base sequence data and probe event spatiotemporal annotation data; performing, based on the probe event spatiotemporal annotation data, mapped fluctuation fingerprint analysis on the delayed base sequence data to obtain mapped fluctuation fingerprint MVF data; performing blind source separation on the mapped fluctuation fingerprint MVF data to obtain common-mode performance factor data and mapped fluctuation factor data; performing, based on a parameterized FTL digital twin model, parameter inversion on the mapped fluctuation factor data to obtain mechanism parameter vector data; and performing, based on the common-mode performance factor data and the mechanism parameter vector data, state-space inference analysis on the internal working state of the built-in storage chip to obtain chip working state data.
2. The method of claim 1, wherein the step of performing state-space inference analysis on the internal working state of the built-in storage chip based on the common-mode performance factor data and the mechanism parameter vector data to obtain chip working state data includes: compiling the mechanism parameter vector data into a state dictionary to obtain dictionary constraint model data; performing, based on the common-mode performance factor data and the dictionary constraint model data, interacting multiple model filtering inference on the internal working state to obtain preliminary state posterior distribution data; and performing, based on the common-mode performance factor data, posterior reweighting calibration by common-mode subspace stripping on the preliminary state posterior distribution data to obtain the chip working state data.
3. The method of claim 2, wherein the step of performing, based on the common-mode performance factor data, posterior reweighting calibration by common-mode subspace stripping on the preliminary state posterior distribution data to obtain the chip working state data includes: performing identifiable subspace basis processing on the common-mode performance factor data to obtain common-mode subspace basis data; performing, based on the preliminary state posterior distribution data and the common-mode subspace basis data, constrained projection with common-mode direction stripping on the preliminary state posterior distribution data to obtain projected posterior weight data; resampling the projected posterior weight data to obtain posterior sample set data; and performing posterior probability mass conservation backfilling on the posterior sample set data to obtain the chip working state data.
4. The method of claim 3, wherein the step of performing identifiable subspace basis processing on the common-mode performance factor data to obtain common-mode subspace basis data includes: constructing a common-mode identifiability criterion for the common-mode performance factor data to obtain common-mode identifiability criterion data; performing, based on the common-mode identifiability criterion data, cross-logical-address-group synchronization on the common-mode performance factor data to obtain aligned common-mode factor matrix data; performing, on the aligned common-mode factor matrix data, adversarial subspace decoupling that satisfies the common-mode identifiability criterion data to obtain candidate common-mode subspace data; performing monotonically expanded subspace homotopy convergence processing on the candidate common-mode subspace data to obtain stable common-mode subspace data; and orthogonally decomposing the stable common-mode subspace data to obtain the common-mode subspace basis data.
5. The method of claim 3, wherein the step of performing posterior probability mass conservation backfilling on the posterior sample set data to obtain the chip working state data includes: performing probability mass deficit estimation on the sample weight distribution data in the posterior sample set data to obtain probability mass missing data; performing state-neighborhood-based mass bucket routing analysis on the posterior sample set data to obtain mass backfill routing data; performing, based on the probability mass missing data and the mass backfill routing data, posterior probability mass backfilling by conserved transport to obtain backfilled sample weight data; performing posterior uncertainty structuring processing on the backfilled sample weight data to obtain state posterior data with confidence; and performing expectation estimation on the state posterior data with confidence to obtain the chip working state data.
6. The method of claim 2, wherein the step of compiling the mechanism parameter vector data into a state dictionary to obtain dictionary constraint model data includes: performing parameter topology mapping on the mechanism parameter vector data to obtain state generation parameter domain data; performing, based on the state generation parameter domain data, conflict resolution analysis on the boundary constraints used to determine the internal working state to obtain dissipated constraint set data; pruning the dissipated constraint set data by reachability analysis to obtain reachable state set data; and assembling and compiling, based on the reachable state set data, the state dictionary and the state transition constraints of the internal working state to obtain the dictionary constraint model data.
7. The method according to claim 1, characterized in that the step of performing mapped fluctuation fingerprint analysis on the delayed base sequence data based on the probe event spatiotemporal annotation data to obtain mapped fluctuation fingerprint MVF data includes: performing, based on the probe occurrence time data in the probe event spatiotemporal annotation data, probe-driven phase folding and alignment on the delayed base sequence data to obtain phase response trajectory data; performing cross-timescale wave spectrum signature extraction on the phase response trajectory data to obtain multi-scale wave spectrum signature data; performing, on the delay data in the phase response trajectory data, joint embedding modeling of long-tail statistical features and correlation features based on logical address grouping to obtain joint embedding feature data; and fusing the multi-scale wave spectrum signature data and the joint embedding feature data to obtain the mapped fluctuation fingerprint MVF data.
8. The method according to claim 7, characterized in that the step of performing, on the delay data in the phase response trajectory data, joint embedding modeling of long-tail statistical features and correlation features based on logical address grouping to obtain joint embedding feature data includes: performing phase-domain tail gate extraction on the delay data in the phase response trajectory data to obtain over-threshold delay subsequence data; grouping the logical address range identifier data corresponding to the phase response trajectory data to obtain data for each logical address group; performing, based on the over-threshold delay subsequence data, coherence graph analysis on the synchronous fluctuation relationships between different logical address groups to obtain inter-group coherence graph data; performing, based on the over-threshold delay subsequence data and the inter-group coherence graph data, coupled embedding modeling of the long-tail statistical features and the inter-group coherence features between different logical address groups to obtain coupled embedding representation data, wherein the long-tail statistical features are extracted from the over-threshold delay subsequence data; and performing dimensionality reduction and shaping on the coupled embedding representation data to obtain the joint embedding feature data.
9. The method according to claim 1, characterized in that the step of performing blind source separation on the mapped fluctuation fingerprint MVF data to obtain common-mode performance factor data and mapped fluctuation factor data includes: performing common-mode discrimination mask aggregation on the cross-logical-address-group correlation features in the mapped fluctuation fingerprint MVF data to obtain common-mode discrimination mask data; constructing, based on the common-mode discrimination mask data, a subspace leakage constraint for blind source separation to obtain leakage constraint data; performing, based on the leakage constraint data, constrained blind source separation on the mapped fluctuation fingerprint MVF data to obtain candidate factor set data; and performing factor identity assignment on the candidate factor set data to obtain the common-mode performance factor data and the mapped fluctuation factor data.
10. An eMMC state inference system based on FTL mapping fluctuations, characterized in that the system includes a visualization device and a computer device; the computer device is used to perform joint spatiotemporal processing on host-side I / O observation data and active perturbation probe event data corresponding to a built-in storage chip to obtain delayed base sequence data and probe event spatiotemporal annotation data; the computer device is used to perform mapped fluctuation fingerprint analysis on the delayed base sequence data based on the probe event spatiotemporal annotation data to obtain mapped fluctuation fingerprint MVF data; the computer device is used to perform blind source separation on the mapped fluctuation fingerprint MVF data to obtain common-mode performance factor data and mapped fluctuation factor data; the computer device is used to perform parameter inversion on the mapped fluctuation factor data based on a parameterized FTL digital twin model to obtain mechanism parameter vector data; the computer device is used to perform state-space inference analysis on the internal working state of the built-in storage chip based on the common-mode performance factor data and the mechanism parameter vector data to obtain chip working state data; and the chip working state data is displayed on the visualization device.