Frequency-domain decoupled superlens inverse design method, apparatus, and device

A frequency-domain decoupled superlens inverse design method with a multi-module collaborative architecture achieves efficient compression and accurate fusion of multimodal information. This addresses two shortcomings of existing superlens design methods, namely their restriction to single-task design and their difficulty in fusing multimodal information, thereby improving the flexibility and accuracy of the design.

CN122194464A · Pending · Publication Date: 2026-06-12 · 浙江优众新材料科技有限公司

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: 浙江优众新材料科技有限公司
Filing Date: 2026-03-13
Publication Date: 2026-06-12

AI Technical Summary

Technical Problem

Existing superlens inverse design methods are mostly limited to single-task scenarios, which makes them hard to adapt to multiple application scenarios. Furthermore, multimodal information fusion is prone to information distortion or redundancy conflicts, which reduces accuracy and efficiency.

Method used

A frequency-domain decoupled superlens inverse design method is adopted. Within a unified, text-driven, multi-module architecture, a semantic analysis module, a continuous wavelet network, a progressive multi-detail feature encoding module, a joint diffusion self-attention module, and a decoupled generative fusion module together achieve efficient compression, decoupling, and accurate fusion of multimodal information and output the design parameters of the target superlens.

Benefits of technology

It achieves flexible adaptability of superlens design, eliminating the need to retrain models for different scenarios, improving accuracy and efficiency, avoiding information distortion and redundant conflicts, and reducing iterative trial and error.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN122194464A_ABST
Patent Text Reader

Abstract

This disclosure provides a frequency-domain decoupled superlens inverse design method, apparatus, and device, belonging to the field of optical equipment. The method includes: acquiring textual information about the superlens to be designed, the textual information including target spectral information and/or target structural information of the superlens; determining the target type of the superlens based on the textual information; combining the textual information, a superlens spectral response sample set, and a superlens structural sample set into multimodal input data; inputting the multimodal input data into a pre-trained first model to obtain inverse design latent features corresponding to the target type; decoding and outputting target superlens design parameters based on the inverse design latent features; and determining the target superlens based on the target superlens design parameters. In summary, the technical solution provided by this disclosure can be applied flexibly, improves accuracy and efficiency, and adapts to various application scenarios.

Description

Technical Field

[0001] This disclosure relates to the field of optical equipment, and more particularly to a method, apparatus, and device for inverse design of a frequency-domain decoupled superlens.

Background Technology

[0002] Existing superlens inverse design methods are mostly limited to single-task scenarios and apply only to specific types of superlenses (such as achromatic or polarization lenses), which makes them inflexible and unable to support multiple tasks. Furthermore, at the multimodal information fusion level, forcibly aligning structural and spectral modal features leads to information distortion, while simply superimposing modal data causes redundancy conflicts; either way, fusion is difficult and the results are unstable. As a result, the design process relies on repeated trial and error, with model parameters adjusted and training datasets changed again and again to meet different design requirements, so a single modeling effort cannot cover multiple scenarios. Consequently, existing superlens inverse design methods suffer from fixed patterns, reduced accuracy and efficiency, and difficulty in adapting to diverse application scenarios.

Summary of the Invention

[0003] This disclosure provides a frequency-domain decoupled superlens inverse design method, apparatus, and device that, to some extent, address the problems of existing superlens inverse design methods, such as fixed patterns, reduced accuracy and efficiency, and difficulty in adapting to various application scenarios.

[0004] According to one aspect of this disclosure, a frequency-domain decoupled superlens inverse design method is provided. The method includes: acquiring textual information of the superlens to be designed; the textual information includes: target spectral information and / or target structural information of the superlens; determining the target type of the superlens based on the textual information of the superlens to be designed; combining the textual information of the superlens to be designed, a superlens spectral response sample set, and a superlens structural sample set into multimodal input data; inputting the multimodal input data into a pre-trained first model to obtain inverse design latent features corresponding to the target type; the first model is determined based on a semantic analysis module, a continuous wavelet network CWNet feature decomposition module, a progressive multidetail feature encoding module, a joint diffusion self-attention J-DiT generation module, a conditional flow matching loss CFM module, a decoupled generative DGFM fusion module, and an Euler inverse integral module; decoding and outputting target superlens design parameters based on the inverse design latent features, and determining the target superlens based on the target superlens design parameters; the target superlens design parameters include: the arrangement of the unit array, the topology of the unit, and the spectral refractive index.

[0005] Furthermore, according to one aspect of the method disclosed herein, the method further includes: training a first model; training the first model includes: acquiring a training dataset; the training dataset includes: training text information, training structural information, and training spectral response information; performing semantic parsing on the training dataset using a semantic analysis module to obtain semantic features, structural features, and spectral features; performing multi-scale feature decomposition on the semantic features, structural features, and spectral features using a CWNet feature decomposition module to obtain multiple feature decomposition results; compressing and encoding the multiple feature decomposition results using a progressive multi-detail feature encoding module to obtain encoded features; performing generative modeling and loss constraints on the encoded features using a J-DiT generation module and a CFM module to obtain candidate latent features; and decoupling and inversely integrating the candidate latent features using a DGFM fusion module and an Euler inverse integral module to obtain inverse latent features corresponding to the training dataset.

[0006] Furthermore, according to one aspect of the method of this disclosure, the CWNet feature decomposition module is used to perform multi-scale feature decomposition on semantic features, structural features and spectral features to obtain multiple feature decomposition results, including: performing multi-scale decomposition on semantic features using continuous wavelets to obtain the decomposition results corresponding to the semantic features; performing high-frequency or low-frequency separation on structural features using wavelet decomposition to obtain the decomposition results corresponding to the structural features; and performing wavelet packet decomposition on spectral features to obtain the decomposition results corresponding to the spectral features.
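The high-frequency / low-frequency separation described above can be illustrated with a single-level Haar wavelet transform. This is only a minimal sketch: the disclosure does not name a wavelet family, so the Haar basis, the function names, and the one-dimensional input are all assumptions made for illustration, and the continuous-wavelet and wavelet-packet branches are not shown.

```python
import math

def haar_dwt(signal):
    """One level of the Haar discrete wavelet transform (assumed basis).

    Returns (low, high): scaled pairwise sums (coarse trend) and scaled
    pairwise differences (fine detail). The input length must be even.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    low = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return low, high

def haar_idwt(low, high):
    """Invert haar_dwt, interleaving the reconstructed sample pairs."""
    s = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(low, high):
        out.append((a + d) * s)
        out.append((a - d) * s)
    return out

def multiscale_decompose(signal, levels):
    """Repeatedly split the low band; returns [high_1, ..., high_L, low_L]."""
    bands = []
    current = list(signal)
    for _ in range(levels):
        current, high = haar_dwt(current)
        bands.append(high)
    bands.append(current)
    return bands
```

Because the Haar transform is orthogonal, `haar_idwt(*haar_dwt(x))` recovers `x`, so the split into bands loses no information before any later compression step.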

[0007] Furthermore, according to one aspect of the method disclosed herein, using the progressive multi-detail feature encoding module to compress and encode the multiple feature decomposition results to obtain encoded features includes: inputting the multiple feature decomposition results into the progressive multi-detail feature encoding module and learning the spatiotemporal correlation among them through a spatiotemporal convolutional layer; and outputting the compressed encoded features based on the spatiotemporal correlation and probability distribution constraints.
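The text does not say what form the "probability distribution constraints" take. In variational autoencoders a common choice is a Kullback-Leibler penalty that pulls the encoder's diagonal Gaussian toward a standard normal; the sketch below assumes that form purely for illustration, and the function name and inputs are hypothetical.

```python
import math

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over dimensions.

    Assumption: the encoder outputs a mean and a log-variance per latent
    dimension, as in a standard variational autoencoder.
    """
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var)
    )
```

The penalty is zero only when every mean is 0 and every variance is 1, and it grows as the encoded distribution drifts from that reference.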

[0008] Furthermore, according to one aspect of the method of this disclosure, generative modeling and loss constraints are applied to encoded features using a J-DiT generation module and a CFM module to obtain candidate latent features, including: inputting encoded features into the J-DiT generation module and determining the dependencies of encoded features through a self-attention mechanism; using the CFM module to determine the linear interpolation path between the noisy latent representation and the conditional latent representation of the encoded features, and determining the mean square error based on the linear interpolation path; and outputting candidate latent features based on the dependencies and the mean square error.
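The linear interpolation path and mean square error mentioned above match the usual conditional flow matching objective, which can be sketched as follows. The names `x0` (noisy latent), `x1` (conditional latent), and `velocity_model` are illustrative assumptions, and the use of the constant velocity `x1 - x0` as regression target is the standard formulation rather than something the text spells out.

```python
def interpolate(x0, x1, t):
    """Point on the straight path from x0 (t = 0) to x1 (t = 1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]

def target_velocity(x0, x1):
    """Velocity of the straight path; constant in t."""
    return [b - a for a, b in zip(x0, x1)]

def cfm_loss(velocity_model, x0, x1, t):
    """Mean square error between predicted and target velocity at time t.

    velocity_model is any callable (x_t, t) -> list of floats; in practice
    it would be the generation network, here it is a stand-in.
    """
    x_t = interpolate(x0, x1, t)
    v_pred = velocity_model(x_t, t)
    v_true = target_velocity(x0, x1)
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_true)) / len(v_true)
```

A model that always returns `x1 - x0` attains zero loss, which is the fixed point that training would drive the network toward.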

[0009] Furthermore, according to one aspect of the method of this disclosure, determining the target type of a superlens based on the text information of the superlens to be designed includes: obtaining a first mapping table; the first mapping table is used to indicate the correspondence between superlens function labels and target types; obtaining superlens function labels from the text information of the superlens to be designed; matching the superlens function labels with the first mapping table, and determining the matching result as the target type of the superlens.
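The mapping-table lookup described above can be sketched as a dictionary match. The table keys, the keyword-based label extraction, and the function names are hypothetical; the disclosure does not state how function labels are extracted from the text or what the table contains.

```python
# Hypothetical first mapping table: function label -> target type.
FIRST_MAPPING_TABLE = {
    "achromatic": "achromatic superlens",
    "polarization": "polarization-modulated superlens",
    "broadband imaging": "wideband imaging superlens",
    "high efficiency": "high transmission efficiency superlens",
}

def extract_function_labels(text, table):
    """Return the table labels that occur in the text (case-insensitive).

    Plain substring matching is an assumption; a real system might use a
    semantic parser instead.
    """
    lowered = text.lower()
    return [label for label in table if label in lowered]

def determine_target_type(text, table=FIRST_MAPPING_TABLE):
    """Match extracted labels against the table; None when nothing matches."""
    labels = extract_function_labels(text, table)
    if not labels:
        return None
    return table[labels[0]]
```

For example, `determine_target_type("Design an achromatic lens for 450-650 nm")` returns `"achromatic superlens"` under this hypothetical table.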

[0010] Furthermore, according to one aspect of the method of this disclosure, the superlens spectral response sample set includes: spectral response curves, transmission efficiency, incident angle, and polarization information; the superlens structure sample set includes: superlens unit topology images, unit array arrangement information, and unit size information.

[0011] Furthermore, according to one aspect of the method disclosed herein, decoding and outputting the target superlens design parameters based on the inverse design latent features, and determining the target superlens based on the target superlens design parameters, includes: inputting the inverse design latent features into a decoder to obtain a structural feature image and spectral response information, the decoder including at least one of an ST-VAE decoder and a generator decoder; determining the arrangement of the unit array and the topology of the units based on the structural feature image; determining the spectral refractive index based on the spectral response information; and determining the target superlens based on the arrangement of the unit array, the topology of the units, and the spectral refractive index.

[0012] According to another aspect of this disclosure, a frequency-domain decoupled superlens inverse design apparatus is provided. The apparatus includes: an acquisition unit for acquiring textual information of a superlens to be designed, the textual information including superlens target spectral information and/or superlens target structural information; a determination unit for determining the target type of the superlens based on the textual information of the superlens to be designed; a combination unit for combining the textual information of the superlens to be designed, a superlens spectral response sample set, and a superlens structural sample set into multimodal input data; an inverse design unit for inputting the multimodal input data into a pre-trained first model to obtain inverse design latent features corresponding to the target type, the first model being determined based on a semantic analysis module, a continuous wavelet network (CWNet) feature decomposition module, a spatiotemporal variational autoencoder progressive multi-detail feature encoding module, a joint diffusion self-attention (J-DiT) generation module, a conditional flow matching loss (CFM) module, a decoupled generative DGFM fusion module, and an Euler inverse integral module; and a decoding unit for decoding and outputting the target superlens design parameters based on the inverse design latent features and determining the target superlens based on the target superlens design parameters; the target superlens design parameters include: the arrangement of the unit array, the topology of the unit, and the spectral refractive index.

[0013] According to another aspect of this disclosure, an electronic device is provided, comprising: a memory for storing computer-readable instructions; and a processor for executing the computer-readable instructions, causing the electronic device to perform the method described in any embodiment of the method aspect above.

[0014] According to another aspect of this disclosure, a non-transitory computer-readable storage medium is provided for storing computer-readable instructions that, when executed by a processor, cause the processor to perform the method described in any embodiment of the method aspect above.

[0015] According to another aspect of this disclosure, a computer program product is provided, including a computer program that, when executed by a processor, implements the method described in any embodiment of the method aspect above.

[0016] This disclosure provides a frequency-domain decoupled superlens inverse design method, apparatus, and device. The disclosure involves acquiring textual information of the superlens to be designed; this textual information includes: target spectral information and / or target structural information of the superlens; determining the target type of the superlens based on the textual information; combining the textual information, the superlens spectral response sample set, and the superlens structural sample set into multimodal input data; inputting the multimodal input data into a pre-trained first model to obtain the inverse design latent features corresponding to the target type; the first model is determined based on a semantic analysis module, a continuous wavelet network (CWNet) feature decomposition module, a progressive multi-detail feature encoding module, a joint diffusion self-attention (J-DiT) generation module, a conditional flow matching loss (CFM) module, a decoupled generative DGFM fusion module, and an Euler inverse integral module; decoding and outputting the target superlens design parameters based on the inverse design latent features, and determining the target superlens based on these parameters; the target superlens design parameters include: the arrangement of the unit array, the topology of the units, and the spectral refractive index. In contrast to existing superlens inverse design methods, which are bound to a single task, suffer from distortion and redundancy in multimodal information fusion, and rely on repeated iterative trial and error leading to low efficiency and accuracy, this disclosure achieves targeted improvements throughout the entire process through a unified architecture driven by text and with multi-module collaboration. 
Specifically, by combining textual information with the design requirements and matching the corresponding target type, the method overcomes the single-task limitation and adapts flexibly to various types of superlens design without retraining models for different scenarios. Meanwhile, the spatiotemporal variational autoencoder progressive multi-detail feature encoding module of the first model efficiently compresses and structures the fused multimodal features. The joint diffusion self-attention J-DiT generation module, together with the conditional flow matching loss CFM module, generates and optimizes latent features under target type constraints, reducing design bias. Furthermore, the decoupled generative DGFM fusion module and the continuous wavelet network CWNet feature decomposition module of the first model perform hierarchical decoupling and accurate fusion of multimodal information, avoiding the information distortion caused by forced alignment and eliminating the redundancy conflicts caused by simple superposition, which preserves the integrity of the fused features. The Euler inverse integral module enables efficient decoding from latent features to superlens design parameters, significantly reducing iterative trial and error. In summary, the technical solution provided in this disclosure can be applied flexibly, improves accuracy and efficiency, and adapts to various application scenarios.

[0017] It should be understood that both the foregoing general description and the following detailed description are exemplary and intended to provide further illustration of the claimed technology.

Description of the Drawings

[0018] The above and other objects, features, and advantages of this disclosure will become more apparent from the more detailed description of the embodiments thereof in conjunction with the accompanying drawings. The drawings are provided to further illustrate the embodiments of this disclosure and form part of the specification. They are used together with the embodiments of this disclosure to explain the disclosure and do not constitute a limitation thereof. In the drawings, the same reference numerals generally represent the same components or steps.

[0019] Figure 1 is a flowchart of a frequency-domain decoupled superlens inverse design method provided in an embodiment of this disclosure;
Figure 2 is a complete flowchart of the inverse design of a frequency-domain decoupled superlens provided in an embodiment of this disclosure;
Figure 3 is a structural block diagram of a frequency-domain decoupled superlens inverse design apparatus provided in an embodiment of this disclosure;
Figure 4 is a hardware block diagram of an electronic device provided in an embodiment of this disclosure.

Detailed Implementation

[0020] To make the objectives, technical solutions, and advantages of this disclosure more apparent, exemplary embodiments according to this disclosure will now be described in detail with reference to the accompanying drawings. Obviously, the described embodiments are merely some embodiments of this disclosure, and not all embodiments of this disclosure. It should be understood that this disclosure is not limited to the exemplary embodiments described herein.

[0021] Existing superlens inverse design methods are mostly limited to single-task scenarios and apply only to specific types of superlenses (such as achromatic or polarization lenses), which makes them inflexible and unable to support multiple tasks. Furthermore, at the multimodal information fusion level, forcibly aligning structural and spectral modal features leads to information distortion, while simply superimposing modal data causes redundancy conflicts; either way, fusion is difficult and the results are unstable. As a result, the design process relies on repeated trial and error, with model parameters adjusted and training datasets changed again and again to meet different design requirements, so a single modeling effort cannot cover multiple scenarios. Consequently, existing superlens inverse design methods suffer from fixed patterns, reduced accuracy and efficiency, and difficulty in adapting to diverse application scenarios.

[0022] Therefore, to address the aforementioned problems, this disclosure provides a frequency-domain decoupled superlens inverse design method. Existing superlens inverse design methods are tied to a single task, are prone to distortion and redundancy in multimodal information fusion, and rely on repeated trial and error, which lowers efficiency and accuracy; in contrast, this disclosure makes targeted improvements throughout the entire process through a unified, text-driven architecture with multi-module collaboration. Specifically, by combining textual information with the design requirements and matching the corresponding target type, the method overcomes the single-task limitation and adapts flexibly to various types of superlens design without retraining the model for different scenarios. Meanwhile, the first model's progressive multi-detail feature encoding module efficiently compresses and structures the fused multimodal features. The joint diffusion self-attention J-DiT generation module, combined with the conditional flow matching loss (CFM) module, generates and optimizes latent features under target type constraints, reducing design bias. Furthermore, the decoupled generative DGFM fusion module of the first model and the continuous wavelet network (CWNet) feature decomposition module perform hierarchical decoupling and accurate fusion of multimodal information, avoiding the information distortion caused by forced alignment and eliminating the redundancy conflicts caused by simple superposition, which preserves the integrity of the fused features. The Euler inverse integral module enables efficient decoding from latent features to superlens design parameters, significantly reducing iterative trial and error.

[0023] This disclosure provides a frequency-domain decoupled superlens inverse design method. Referring to Figure 1, which is a flowchart of the method provided in an embodiment of this disclosure, the method includes:
In step S101, the text information of the superlens to be designed is obtained; the text information includes the target spectral information of the superlens and/or the target structural information of the superlens;
In step S102, the target type of the superlens is determined based on the text information of the superlens to be designed;
In step S103, the text information of the superlens to be designed, the superlens spectral response sample set, and the superlens structure sample set are combined into multimodal input data;
In step S104, the multimodal input data is input into the pre-trained first model to obtain the inverse design latent features corresponding to the target type; the first model is determined based on the semantic analysis module, the continuous wavelet network CWNet feature decomposition module, the progressive multi-detail feature encoding module, the joint diffusion self-attention J-DiT generation module, the conditional flow matching loss CFM module, the decoupled generative DGFM fusion module, and the Euler inverse integral module;
In step S105, the target superlens design parameters are decoded and output based on the inverse design latent features, and the target superlens is determined based on the target superlens design parameters. The target superlens design parameters include: the arrangement of the unit array, the topology of the unit, and the spectral refractive index.

[0024] In this disclosure, the text information of the superlens to be designed can be understood as natural language or structured text data describing the design requirements of the superlens. The target spectral information of the superlens can be the optical spectral performance indicators that the design requirements specify for the superlens, such as at least one of the target spectral response range and the spectral transmittance. The target structural information of the superlens can be understood as the physical structural constraints and target shape of the superlens defined in the design requirements, including at least one of the following: overall superlens size, unit array density, shape restrictions of the basic units, unit size range, and material compatibility requirements. This text information conveys the design objectives and can be a single-dimensional requirement (such as specifying only the target spectral response range) or a multi-dimensional combination of requirements (such as specifying both structural dimensions and spectral transmittance characteristics); no specific limitation is imposed here.

[0025] In this disclosure, the target type of the superlens can be understood as a superlens category classified according to at least one of the design functions, application scenarios and performance indicators, such as achromatic superlens, polarization-modulated superlens, wideband imaging superlens, high transmission efficiency superlens, etc. Different types correspond to different feature extraction and generation logics, and the first model can call the corresponding module parameters based on the target type.

[0026] In this disclosure, the superlens spectral response sample set can be understood as a collection containing a large amount of data related to the spectral performance of superlenses. The superlens spectral response sample set of this disclosure includes: spectral response curves, transmission efficiency, incident angle, and polarization information. The spectral response curves reflect the reflection / transmission characteristics of the superlens for different wavelengths of light, the transmission efficiency characterizes the ability of light signals to pass through, and the incident angle and polarization information correspond to the performance of the superlens under different incident conditions. This sample set provides a spectral dimension reference for multimodal data fusion, supporting the model in accurately learning the relationship between spectrum and structure.

[0027] In this disclosure, the superlens structure sample set can be understood as a collection encompassing the physical structural feature data of various superlenses. The superlens structure sample set of this disclosure includes: superlens unit topology images, unit array arrangement information, and unit size information. Specifically, the unit topology images record the shape (e.g., cylindrical, cross-shaped, ring-shaped, etc.) and material distribution of the basic superlens units; the unit array arrangement information reflects the arrangement pattern of the units in a two-dimensional plane (e.g., periodic arrangement, non-periodic arrangement); and the unit size information includes parameters such as unit height, width, and period.

[0028] In this disclosure, multimodal input data can be understood as a comprehensive input that integrates three types of heterogeneous data: text, spectrum, and structure. By associating and aligning textual requirements with two types of sample data sets, the information limitations of single-modal data are broken, providing a comprehensive design reference dimension for the first model. This ensures that the latent features generated by the model not only meet the textual requirements but also conform to the physical characteristics and performance laws of the superlens.
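The three modalities described in the preceding paragraphs can be pictured as one container type. This is a sketch only: the field names, the units (nm, degrees), and the representation of a topology image as a nested list of 0/1 values are assumptions, since the disclosure lists the kinds of information but not a data layout.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class SpectralSample:
    wavelengths_nm: List[float]        # sampling points (assumed unit: nm)
    response: List[float]              # spectral response curve values
    transmission_efficiency: float
    incident_angle_deg: float
    polarization: str

@dataclass
class StructureSample:
    topology_image: List[List[int]]    # assumed binary mask of one unit
    array_arrangement: str             # e.g. "periodic" or "non-periodic"
    unit_height_nm: float
    unit_width_nm: float
    period_nm: float

@dataclass
class MultimodalInput:
    text: str
    spectral_samples: List[SpectralSample] = field(default_factory=list)
    structure_samples: List[StructureSample] = field(default_factory=list)

def combine(text, spectral_samples, structure_samples):
    """Bundle the text requirement with both sample sets (step S103)."""
    return MultimodalInput(
        text=text,
        spectral_samples=list(spectral_samples),
        structure_samples=list(structure_samples),
    )
```

Keeping the modalities in separate typed fields, rather than concatenating them into one array, leaves any later alignment or decoupling step free to treat each modality on its own terms.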

[0029] In this disclosure, the first model can be understood as an end-to-end deep learning model for inverse design of superlenses, constructed collaboratively by multiple modules. Through the division of labor and cooperation among various functional modules, it realizes the transformation from multimodal input to latent features for inverse design. It has the capabilities of semantic parsing, feature decomposition, fusion encoding, accurate generation, loss optimization, and efficient decoding. It does not require adjustment of the model architecture for different design tasks and can adapt to the design needs of multiple types of superlenses. The specific details of the first model are described below.

[0030] In this disclosure, the inverse design latent feature can be understood as a high-dimensional feature vector with strong representational ability, obtained after the model processes the multimodal input data. This feature vector condenses information on text requirements, spectral characteristics, and structural patterns, and is matched to the target type of the superlens.

[0031] In this disclosure, the target superlens design parameters can be understood as a set of parameters that uniquely determine the physical structure and optical performance of the superlens. The target superlens design parameters of this disclosure include: the arrangement of the unit array, the topology of the units, and the spectral refractive index. The unit array arrangement determines the overall structural distribution of the superlens, the unit topology affects the modulation effect of the optical signal, and the spectral refractive index relates to the superlens's ability to interact with light of different wavelengths. These three types of parameters work together to ensure that the target superlens meets the preset text design requirements.

[0032] In this disclosure, the target superlens can be understood as a functional superlens device that, once fabricated according to the output design parameters, meets the preset spectral and structural requirements. Its optical performance and physical structure closely match the textual requirements, and it can be adapted to the corresponding application scenarios.

[0033] Specifically, the inverse design of a target superlens may include the following steps. First, the acquired textual information of the superlens to be designed is preprocessed to extract the core design requirements (such as target spectral range, transmission efficiency threshold, and structural size limitations), which are transformed into digital semantic features that the model can recognize. Second, the superlens spectral response sample set and structural sample set are retrieved, and their data is filtered and associated using the digital semantic features to remove redundant and invalid data and construct standardized multimodal input data. Subsequently, the multimodal input data is input into the pre-trained first model, and the spectral and structural modal features are extracted sequentially by the CWNet feature decomposition module. The semantic, spectral, and structural features are decoupled and fused by the DGFM fusion module, and the progressive multi-detail feature encoding module then generates structured latent features. The latent features are optimized by the J-DiT generation module combined with the CFM module and the Euler inverse integral module to obtain the inverse design latent features that match the target type. Finally, the inverse design latent features are decoded to output the target superlens design parameters, such as unit array arrangement, unit topology, and spectral refractive index, and the final target superlens is determined based on those parameters.
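The first step above, extracting core requirements from free text, might look like the following for a very simple input format. The regular expressions, the expected phrasing ("450-650 nm", "efficiency above 80%"), and the output keys are all hypothetical; the disclosure's semantic analysis module is not specified at this level.

```python
import re

def parse_requirements(text):
    """Extract a wavelength range and an efficiency threshold from text.

    Assumed phrasing: '<low>-<high> nm' or '<low> to <high> nm' for the
    range, and 'efficiency ... <number>%' for the threshold.
    """
    req = {}
    m = re.search(r"(\d+(?:\.\d+)?)\s*(?:-|to)\s*(\d+(?:\.\d+)?)\s*nm", text)
    if m:
        req["wavelength_range_nm"] = (float(m.group(1)), float(m.group(2)))
    m = re.search(r"efficiency\D*?(\d+(?:\.\d+)?)\s*%", text, flags=re.IGNORECASE)
    if m:
        req["min_transmission_efficiency"] = float(m.group(1)) / 100.0
    return req
```

The resulting dictionary is one possible stand-in for the "digital semantic features" mentioned in the text; a learned text encoder would produce a vector instead.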

[0034] The following elaborates on how the first model works. The method also includes training the first model, which includes:
Obtaining the training dataset; the training dataset includes: training text information, training structure information, and training spectral response information;
Performing semantic parsing on the training dataset using the semantic analysis module to obtain semantic features, structural features, and spectral features;
Performing multi-scale feature decomposition on the semantic features, structural features, and spectral features using the CWNet feature decomposition module to obtain multiple feature decomposition results;
Compressing and encoding the multiple feature decomposition results using the progressive multi-detail feature encoding module to obtain encoded features;
Applying generative modeling and loss constraints to the encoded features using the J-DiT generation module and the CFM module to obtain candidate latent features;
Decoupling and inversely integrating the candidate latent features using the DGFM fusion module and the Euler inverse integral module to obtain the inverse latent features corresponding to the training dataset.
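The disclosure does not define the internals of the Euler inverse integral module. One plausible reading, assumed here, is explicit Euler integration of an ordinary differential equation driven by a velocity field (as used when sampling from flow-matching models), with "inverse" meaning that the integration can also be run backward in time. The function below is a sketch under that assumption; `velocity` is a stand-in for a trained network.

```python
def euler_integrate(velocity, x_start, t_start, t_end, steps):
    """Explicit Euler integration of dx/dt = velocity(x, t).

    Runs forward when t_end > t_start and backward when t_end < t_start,
    because the step size dt takes the sign of (t_end - t_start).
    """
    x = list(x_start)
    dt = (t_end - t_start) / steps
    t = t_start
    for _ in range(steps):
        v = velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        t += dt
    return x
```

With a constant velocity field the Euler steps are exact, so integrating from t = 0 to t = 1 and then back from t = 1 to t = 0 returns to the starting point; for a learned, non-constant field the round trip is only approximate and improves with more steps.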

[0035] In this disclosure, the training dataset can be understood as a set of labeled datasets covering multi-dimensional information of the superlens used to train the first model. It consists of a massive number of validated superlens design samples, supporting the model's learning of the intrinsic relationship between textual requirements, structural morphology, and spectral performance. Specifically, the training textual information can be understood as the superlens design requirement text data corresponding to the training samples, consistent with the format of the superlens text information to be designed, including the sample's target spectral indicators, structural constraints, etc., used to train the model's semantic parsing capabilities. The training structural information can be understood as the superlens physical structure data corresponding to the training samples, including unit topology, array arrangement, size parameters, etc., consistent with the data dimensions of the superlens structure sample set, providing a basis for the model to learn structural features. The training spectral response information can be understood as the actual spectral performance data corresponding to the training samples, including spectral response curves, transmission efficiency, polarization characteristics, etc., which can verify the matching degree between the model-generated features and actual performance, supporting loss calculation and parameter optimization.

[0036] In this disclosure, the semantic analysis module can be understood as a functional module with natural language processing and structured information parsing capabilities. Its core function is to extract key requirements from training text information, while simultaneously associating training structural information and training spectral response information for feature mapping. The semantic features obtained from the semantic analysis module can accurately represent the design requirements in the training text, providing semantic anchors for cross-modal feature association; structural features can quantitatively represent the physical structural properties of training samples; and spectral features can accurately characterize the spectral performance patterns of training samples.

[0037] In this disclosure, the Continuous Wavelet Network Feature Decomposition Module (CWNet) can be understood as a multi-scale feature extraction module built based on the principle of continuous wavelet transform. It possesses adaptive hierarchical decomposition capabilities, selecting appropriate wavelet basis functions for different modal features to achieve multi-scale feature splitting and noise filtering. The feature decomposition results obtained from the CWNet module can be understood as a set of multi-modal features at different scales. This retains key information of each modality (such as wavelength peaks in spectral features and topological contours in structural features) while also extracting fine-grained feature details, providing richer feature dimensions for subsequent encoding and fusion.

[0038] In this disclosure, the progressive multi-detail feature encoding module can be understood as a compression encoding module combining spatiotemporal feature modeling and variational autoencoder mechanisms. Its core strength lies in its ability to efficiently compress multi-scale decomposed features while simultaneously generating a structured feature space through variational inference, avoiding feature redundancy and information loss. The encoded features obtained from the progressive multi-detail feature encoding module can be understood as high-dimensional structured feature vectors that condense multimodal and multi-scale features, significantly reducing feature dimensionality while retaining core correlation information, providing efficient input for subsequent generative modeling.

[0039] In this disclosure, the Joint Diffusion Transformer Generation Module (J-DiT Generation Module) can be understood as a generative modeling module that integrates diffusion model and self-attention mechanism. Based on encoded features, it generates latent features that conform to physical laws through a stepwise denoising process. At the same time, it captures the global correlation between features with the help of self-attention mechanism to ensure the consistency and rationality of generated features and adapt to the design requirements of various types of superlenses.

[0040] In this disclosure, the Conditional Flow Matching Module (CFM module) can be understood as a loss constraint and optimization module based on flow matching theory. It uses the real features of the training samples as the constraint target, constructs a flow matching loss function between generated features and real features, dynamically corrects the generation direction of the J-DiT generation module, reduces generation bias, and improves the matching degree between candidate latent features and real samples.

[0041] In this disclosure, the candidate latent features obtained by the J-DiT generation module and the CFM module can be understood as latent feature vectors that initially conform to the characteristics of superlenses after generative modeling and loss constraints. They already possess the correlation characteristics of text, structure, and spectrum, but still have problems such as modal feature coupling and local detail deviation, which need to be further decoupled and optimized.

[0042] In this disclosure, the Disentangled Generative Fusion Module (DGFM) can be understood as a module with the ability to decouple cross-modal features and perform semantic-level fusion. Its core function is to decompose the coupled semantic, structural, and spectral modal information in candidate latent features, and then perform precise fusion based on semantic association to eliminate modal conflicts and redundancy and enhance the interpretability of features.

[0043] In this disclosure, the Euler inverse integration module can be understood as a feature optimization module built based on the Euler numerical integration algorithm. It is used to perform inverse integration on the decoupled candidate latent features to improve the convergence speed and numerical stability of feature optimization, while correcting local biases in the features to ensure that the latent features conform to the physical modeling laws of the superlens.
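As an illustration only, the following minimal sketch shows explicit Euler integration of a velocity field over latent features. The disclosure does not spell out the update rule, so the function name `euler_integrate`, the step count, and the reading of "inverse integration" as integration backward in time (from t = 1 to t = 0) are assumptions.

```python
def euler_integrate(velocity, z_start, steps=100, t0=0.0, t1=1.0):
    """Explicit Euler: z <- z + h * velocity(z, t), repeated `steps` times.

    Passing t0 > t1 gives a negative step size, i.e. integration backward in time.
    """
    h = (t1 - t0) / steps
    z = list(z_start)
    t = t0
    for _ in range(steps):
        v = velocity(z, t)
        z = [zi + h * vi for zi, vi in zip(z, v)]
        t += h
    return z


# Toy latent vectors (hypothetical values).
z_noise = [0.0, 1.0]
z_cond = [2.0, -1.0]


def constant_velocity(z, t):
    # Velocity of a straight-line path between the two toy vectors.
    return [b - a for a, b in zip(z_noise, z_cond)]


forward = euler_integrate(constant_velocity, z_noise)                # t: 0 -> 1
backward = euler_integrate(constant_velocity, z_cond, t0=1.0, t1=0.0)  # t: 1 -> 0
```

With a constant velocity field the forward pass lands on `z_cond` and the backward pass returns to `z_noise`, up to floating-point rounding; a learned velocity field would only approximate this.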

[0044] In this disclosure, the inverse latent features obtained from the DGFM fusion module and the Euler inverse integral module can be understood as the final latent feature vector that is precisely matched with the training samples after decoupling fusion and inverse integral optimization. It condenses the core correlation information of the text requirements, structural morphology and spectral performance of the training samples, and has strong representation ability and physical rationality. It can be used as the target feature for model training for parameter iteration.

[0045] Specifically, training the first model may include the following steps: Step 1: Collect a large number of superlens design samples, organize the design requirement text, physical structure data, and actual spectral performance data corresponding to each sample, and construct the initial training dataset; clean the initial data and remove samples with missing data, contradictory physical parameters, etc.
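The cleaning rule in Step 1 could be sketched as follows. The field names (`text`, `structure`, `spectrum`, `transmission_efficiency`, `unit_size_nm`) and the specific checks are hypothetical; the disclosure only states that samples with missing data or contradictory physical parameters are removed.

```python
REQUIRED_FIELDS = ("text", "structure", "spectrum")  # hypothetical field names


def is_valid_sample(sample):
    """Return True if the sample has all modalities and plausible physical values."""
    # Reject samples with any missing or empty modality.
    if any(not sample.get(field) for field in REQUIRED_FIELDS):
        return False
    # Reject physically contradictory values (hypothetical keys and rules).
    efficiency = sample["spectrum"].get("transmission_efficiency")
    if efficiency is None or not 0.0 <= efficiency <= 1.0:
        return False
    unit_size_nm = sample["structure"].get("unit_size_nm")
    return unit_size_nm is not None and unit_size_nm > 0


def clean_dataset(samples):
    """Keep only the samples that pass is_valid_sample."""
    return [s for s in samples if is_valid_sample(s)]


raw_samples = [
    {"text": "achromatic lens", "structure": {"unit_size_nm": 300},
     "spectrum": {"transmission_efficiency": 0.85}},
    {"text": "", "structure": {"unit_size_nm": 300},
     "spectrum": {"transmission_efficiency": 0.85}},   # missing text
    {"text": "wide-angle lens", "structure": {"unit_size_nm": 250},
     "spectrum": {"transmission_efficiency": 1.4}},    # efficiency above 1
]
cleaned = clean_dataset(raw_samples)
```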

[0046] Step 2: Input the training dataset into the semantic analysis module. The module performs semantic decomposition on the training text information, extracts information such as target spectral indicators and structural constraints, and transforms them into semantic features. At the same time, it performs quantization mapping on the training structural information and training spectral response information to generate structural features and spectral features respectively, ensuring the semantic correlation and dimensional consistency of the three types of features.

[0047] Step 3: Input the semantic features, structural features, and spectral features into the CWNet feature decomposition module. The module uses adaptive wavelet basis functions to perform multi-scale hierarchical decomposition on the three types of features, separating feature components of different granularities, filtering noise interference, and outputting feature decomposition results in multiple dimensions to provide fine-grained feature support for subsequent encoding.

[0048] Step 4: Input all feature decomposition results into the progressive multi-detail feature encoding module. The module compresses multi-scale features through the encoder, removes redundant information, and constructs a structured latent feature space with the help of variational inference. This integrates multi-modal and multi-scale features into a unified dimension of encoded features, achieving efficient feature representation.

[0049] Step 5: Input the encoded features into the J-DiT generation module. The module generates preliminary latent features based on the progressive denoising mechanism of the diffusion model and combined with the self-attention global association capability. At the same time, the CFM module is started. Taking the real features of the training samples as the target, the flow matching loss between the generated latent features and the real features is calculated. The loss signal is fed back to the J-DiT generation module to dynamically adjust the generation parameters and iteratively optimize to obtain candidate latent features, ensuring that the candidate features fit the real rules of the samples.

[0050] Step 6: Input the candidate latent features into the DGFM fusion module. The module decouples the coupled modal features and performs accurate fusion again based on semantic association to eliminate modal conflicts. Then, input the decoupled and fused features into the Euler inverse integral module. Perform inverse operation through the Euler numerical integral algorithm to optimize feature convergence and stability, correct local feature deviations, and obtain the final inverse latent features.

[0051] The following section explains in detail how the CWNet feature decomposition module in the first model performs feature decomposition, including: performing continuous wavelet multi-scale decomposition on the semantic features to obtain the decomposition results corresponding to the semantic features; performing high-frequency or low-frequency separation on the structural features using wavelet decomposition to obtain the decomposition results corresponding to the structural features; and applying wavelet packet decomposition to the spectral features to obtain the decomposition results corresponding to the spectral features.

[0052] In this disclosure, continuous wavelet multi-scale decomposition can be understood as an operation that decomposes semantic features layer by layer at different scales (resolutions) based on continuous wavelet basis functions. It can simultaneously preserve the global outline and local details of semantic information and adapt to the multi-dimensional expression characteristics of text requirements.
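A minimal sketch of a continuous wavelet multi-scale decomposition is given below, assuming a "Mexican hat" wavelet (the second derivative of a Gaussian) as the Gaussian-family basis. The toy feature vector and the scale list are hypothetical, and the direct O(n²) correlation is chosen for readability rather than speed.

```python
import math


def mexican_hat(x):
    """Unnormalised second derivative of a Gaussian ("Mexican hat" wavelet)."""
    return (1.0 - x * x) * math.exp(-0.5 * x * x)


def cwt(signal, scales):
    """Return one row of coefficients per scale, computed by direct correlation."""
    n = len(signal)
    rows = []
    for s in scales:
        row = []
        for b in range(n):  # b is the translation (position) of the wavelet
            acc = 0.0
            for k in range(n):
                acc += signal[k] * mexican_hat((k - b) / s)
            row.append(acc / math.sqrt(s))  # 1/sqrt(s) scale normalisation
        rows.append(row)
    return rows


semantic_feature = [0.0, 0.2, 0.9, 0.4, 0.1, 0.0, 0.3, 0.8]  # toy vector
coeffs = cwt(semantic_feature, scales=[1, 2, 4, 8])
```

Small scales respond to local detail in the vector, while large scales respond to its overall outline, which is the global-versus-local split the paragraph describes.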

[0053] In this disclosure, the high-frequency or low-frequency separation of wavelet decomposition is understood as the operation of splitting structural features into high-frequency components (corresponding to the detailed features of the structure, such as the edges and textures of the unit topology) and low-frequency components (corresponding to the overall contour features of the structure, such as the arrangement rules of the unit array) through wavelet transform, thereby realizing the hierarchical representation of structural features.
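The high-frequency or low-frequency separation can be illustrated with a one-level Haar transform. This is a sketch on a toy one-dimensional vector; the function names are hypothetical, and real structural features would typically be two-dimensional images.

```python
import math


def haar_split(values):
    """One-level Haar transform: returns (low_frequency, high_frequency)."""
    if len(values) % 2:
        raise ValueError("length must be even")
    r = math.sqrt(2.0)
    # Scaled pairwise sums capture the contour (low-frequency) component.
    low = [(values[i] + values[i + 1]) / r for i in range(0, len(values), 2)]
    # Scaled pairwise differences capture the detail (high-frequency) component.
    high = [(values[i] - values[i + 1]) / r for i in range(0, len(values), 2)]
    return low, high


def haar_merge(low, high):
    """Inverse of haar_split: rebuild the original vector."""
    r = math.sqrt(2.0)
    out = []
    for a, d in zip(low, high):
        out.extend([(a + d) / r, (a - d) / r])
    return out


structure_row = [4.0, 4.0, 2.0, 6.0]  # toy structural feature row
low_part, high_part = haar_split(structure_row)
restored = haar_merge(low_part, high_part)
```

The flat pair (4, 4) yields a zero detail coefficient, while the step (2, 6) yields a non-zero one, which is how edges end up in the high-frequency component.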

[0054] In this disclosure, wavelet packet decomposition can be understood as an operation of multi-scale, multi-resolution decomposition of spectral features across the entire frequency band. Compared with traditional wavelet decomposition, it can cover a finer frequency range and accurately capture key details such as wavelength peak and bandwidth of the spectral response curve.
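The difference from ordinary wavelet decomposition is that wavelet packet decomposition splits both the low-frequency and the high-frequency branch at every level. The sketch below shows this tree structure using Haar filters as a stand-in, which is an assumption made to keep the example dependency-free; the toy spectrum and the function names are hypothetical.

```python
import math


def _split(values):
    """One Haar analysis step: (low-frequency half, high-frequency half)."""
    r = math.sqrt(2.0)
    low = [(values[i] + values[i + 1]) / r for i in range(0, len(values), 2)]
    high = [(values[i] - values[i + 1]) / r for i in range(0, len(values), 2)]
    return low, high


def wavelet_packet(values, levels):
    """Full packet tree: split BOTH branches at every level.

    Returns 2**levels sub-bands in tree order (not sorted by frequency).
    """
    if len(values) % (2 ** levels):
        raise ValueError("length must be divisible by 2**levels")
    bands = [list(values)]
    for _ in range(levels):
        next_bands = []
        for band in bands:
            low, high = _split(band)
            next_bands.extend([low, high])
        bands = next_bands
    return bands


spectrum = [0.1, 0.3, 0.9, 0.7, 0.2, 0.1, 0.4, 0.6]  # toy spectral response
sub_bands = wavelet_packet(spectrum, levels=2)
```

Because the Haar step is orthonormal, the total energy (sum of squares) of the sub-bands equals that of the input, so no spectral information is discarded by the split itself.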

[0055] Specifically, feature decomposition can include the following steps: First, semantic features, structural features, and spectral features are input to the CWNet feature decomposition module. For semantic features, a Gaussian wavelet basis function can be selected to perform continuous wavelet multi-scale decomposition, outputting semantic sub-features at different scales. For structural features, a Haar wavelet basis function can be selected to separate high-frequency detail sub-features and low-frequency contour sub-features. For spectral features, a Daubechies 4-Wavelet basis function (db4 wavelet basis function) is selected to perform wavelet packet decomposition, outputting spectral sub-features corresponding to each frequency band. Finally, all sub-features are integrated to obtain the decomposition results corresponding to the three types of features.

[0056] The following section elaborates on how the progressive multi-detail feature encoding module in the first model performs compression encoding, including: inputting the multiple feature decomposition results into the progressive multi-detail feature encoding module and learning the spatiotemporal correlation between the multiple feature decomposition results through the spatiotemporal convolutional layer; and outputting the compressed encoded features based on the spatiotemporal correlation and the probability distribution constraint.

[0057] In this disclosure, the spatiotemporal convolutional layer can be understood as a convolutional layer that simultaneously captures the dependencies of features in the spatial dimension (positional associations of features of different modalities) and the temporal dimension (scale-level associations of feature decomposition), enabling the modeling of associations of multimodal and multiscale features.

[0058] In this disclosure, spatiotemporal correlation can be understood as the inherent dependency between different feature decomposition results in terms of spatial distribution and scale level (such as the correspondence between the "achromatic requirement" of semantic features and the "wideband component" of spectral features).

[0059] In this disclosure, the probability distribution constraint can be understood as the progressive multi-detail feature encoding module (a spatiotemporal variational autoencoder, ST-VAE) using variational inference to constrain the encoded features to latent variables that conform to a normal distribution, thereby ensuring that the encoded features are structured and generalize well.

[0060] Specifically, compression encoding can include the following steps: Step 1: Concatenate all feature decomposition results output by CWNet into a multi-channel feature tensor and input it into the encoding module of ST-VAE; Step 2: Learn the spatiotemporal correlation of the feature tensor through two spatiotemporal convolutional layers (3×3 kernel size, stride 1) to obtain associated features; Step 3: Input the associated features into the fully connected layer and output the mean vector $\mu$ and the variance vector $\sigma^2$; based on variational inference, map the features to latent variables that conform to the normal distribution $\mathcal{N}(\mu, \sigma^2)$; Step 4: Perform dimensionality compression on the latent variables and output the final encoded features.
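Step 3 can be illustrated with the standard variational autoencoder reparameterization, sketched below. Two points are assumptions rather than statements from the disclosure: the variance is parameterized by its logarithm for numerical stability, and the regularizer is the KL divergence to a standard normal prior.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), element-wise."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, log_var))


rng = random.Random(0)          # fixed seed so the sketch is repeatable
mu = [0.3, -1.2]                # toy mean vector
log_var = [-0.5, 0.1]           # toy log-variance vector
z = reparameterize(mu, log_var, rng)
penalty = kl_to_standard_normal(mu, log_var)
```

The penalty is zero only when the mean is zero and the variance is one, which is what pulls the encoded features toward a structured latent space.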

[0061] The following section elaborates on how the J-DiT generation module and the CFM module in the first model obtain candidate latent features, including: inputting the encoded features into the J-DiT generation module and determining the dependencies of the encoded features through a self-attention mechanism; using the CFM module to determine the linear interpolation path between the noise latent representation and the conditional latent representation of the encoded features, and determining the mean square error based on the linear interpolation path; and outputting candidate latent features based on the dependencies and the mean square error.

[0062] In this disclosure, the self-attention mechanism can be understood as a mechanism that captures the global dependencies of encoded features by calculating the attention weights of each element within a feature. The dependencies obtained from the self-attention mechanism can be understood as the correlation strength between different dimensions (such as semantics, structure, and spectrum) in the encoded features.
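A minimal sketch of scaled dot-product self-attention follows. For brevity it uses the input vectors directly as queries, keys, and values (no learned projections and a single head), which is a simplification and not the J-DiT architecture itself; the toy tokens are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Return (outputs, weights) for single-head attention with Q = K = V = tokens."""
    d = len(tokens[0])
    weights = []
    for q in tokens:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights.append(softmax(scores))
    # Each output is the attention-weighted average of all tokens.
    outputs = [[sum(w * v[j] for w, v in zip(ws, tokens)) for j in range(d)]
               for ws in weights]
    return outputs, weights


# Toy tokens standing in for semantic, structural, and spectral features.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)
```

Row `i` of `weights` plays the role of the "correlation strength" between feature `i` and every other feature.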

[0063] In this disclosure, the noise latent representation can be understood as a random latent vector sampled from the standard normal distribution $\mathcal{N}(0, I)$, used to simulate the initial noise input of the diffusion model.

[0064] In this disclosure, the conditional latent representation can be understood as a latent vector obtained by mapping the encoded features; it carries the design requirements of the superlens and serves as the task constraint for the generation process.

[0065] In this disclosure, the linear interpolation path can be understood as a continuous transition path from the noise latent representation to the conditional latent representation, i.e., $z_t = (1 - t)\,z_0 + t\,z_1$, where $z_1$ is the conditional latent representation, $z_0$ is the noise latent representation, and $t \in [0, 1]$; it is used to simulate the denoising process of the diffusion model.

[0066] In this disclosure, the mean square error can be understood as the squared error, computed in the CFM module, between the velocity field $v_\theta$ predicted by J-DiT and the target velocity field $u_t$; it is used to constrain deviations in the generation process.
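The flow matching loss can be sketched as follows, assuming the linear path takes the form z_t = (1 - t)·z_0 + t·z_1 with z_0 the noise latent representation and z_1 the conditional latent representation. Under that assumption the target velocity is the time derivative of the path, z_1 - z_0. The function names and toy vectors are hypothetical, and the "predicted" velocity is a hand-written placeholder rather than a network output.

```python
def interpolate(z0, z1, t):
    """Point on the straight path from z0 (t = 0) to z1 (t = 1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(z0, z1)]


def target_velocity(z0, z1):
    """Time derivative of the linear path; constant in t."""
    return [b - a for a, b in zip(z0, z1)]


def cfm_loss(predicted, target):
    """Mean squared error between predicted and target velocity."""
    return sum((p - q) ** 2 for p, q in zip(predicted, target)) / len(target)


z0 = [0.5, -0.5, 1.0]   # toy noise latent representation
z1 = [1.5, 0.5, 0.0]    # toy conditional latent representation
z_half = interpolate(z0, z1, 0.5)
u = target_velocity(z0, z1)
loss_perfect = cfm_loss(u, u)                    # ideal prediction
loss_off = cfm_loss([0.0, 0.0, 0.0], u)          # placeholder prediction
```

In training, the placeholder would be replaced by the velocity predicted at `z_half` (or another sampled point on the path), and the loss would be minimized over the generator parameters.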

[0067] Specifically, the process of determining candidate latent features may include the following steps: First, the encoded features are mapped to the conditional latent representation, and the noise latent representation is obtained through sampling. The latent representation is then input into the J-DiT generation module, which calculates the global dependencies of the encoded features through a multi-layer self-attention mechanism and initializes the generation network parameters. The CFM module defines the linear interpolation path and determines the target velocity field. The J-DiT generation module predicts the velocity field based on the dependencies. The CFM module calculates the mean squared error between the predicted velocity field and the target velocity field as the loss. The J-DiT parameters are updated with the loss as a constraint, and the candidate latent features that meet the path constraints are output after iterative optimization.

[0068] For example, this disclosure also provides specific CWNet feature decomposition parameter configuration examples: The semantic features were decomposed using a Gaussian wavelet basis with a scale parameter of 4, resulting in four sets of semantic sub-features with different granularities. The structural features were decomposed using a Haar wavelet basis, separating high-frequency detail sub-features (corresponding to the topological edges of the unit) and low-frequency contour sub-features (corresponding to the array arrangement). The spectral features were decomposed by wavelet packet decomposition with a db4 wavelet basis into six frequency-band sub-features, covering the full spectral range of 400-2500 nm.

[0069] Example of ST-VAE compression encoding effect: The input multimodal feature decomposition result (dimension 1024×32×32) is processed through the spatiotemporal convolutional layers and variational encoding to output an encoded feature with a dimension of 256, achieving a compression rate of 97.6% while retaining more than 95% of the core correlation information.

[0070] Example of J-DiT and CFM module generation: The encoded features are mapped to the conditional latent representation, and the noise latent representation is sampled. Through a 100-step linear interpolation path and velocity field prediction, the matching accuracy between the final generated candidate latent features and the conditional latent representation reached 98%, and the mean square error was controlled within 0.01.

[0071] The following explains in detail how to determine the target type of a superlens, including: obtaining the first mapping table, which indicates the correspondence between superlens function labels and target types; obtaining the superlens function label from the text information of the superlens to be designed; and matching the superlens function label against the first mapping table and determining the matching result as the target type of the superlens.

[0072] In this disclosure, the first mapping table can be understood as a pre-built structured association table that stores the mapping relationship between typical functional requirements in the field of superlenses and corresponding target types. Its content can be formulated based on at least one industry standard such as the application scenarios and performance indicators of superlenses.

[0073] In this disclosure, the superlens functional label can be understood as a keyword identifier extracted from the text information of the superlens to be designed, which characterizes the core function or performance requirements of the superlens. For example, from the text "design an achromatic superlens covering the 400-760 nm band", functional labels such as "achromatic" can be extracted; the label is the functionalized and standardized form of the text requirement.

[0074] Specifically, determining the target type of a superlens can include the following steps: First, the first mapping table is retrieved from the pre-stored database of the model, loading the correspondence between functional labels and target types. Next, the semantic analysis module extracts and classifies keywords from the text information of the superlens to be designed, obtaining the superlens functional labels. Then, the extracted functional labels are matched one by one with the label items in the first mapping table: if a single match exists, the corresponding type is directly taken as the target type; if multiple matches exist, the final target type can be selected according to a superlens functional priority rule, which can be set according to system or user preferences and is not specifically restricted. Finally, the matched superlens target type is output.
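The matching steps above can be sketched as a dictionary lookup with a priority list for the multi-match case. The table entries, target type names, priority order, and the substring-based label extraction are all hypothetical placeholders; the disclosure leaves these to the system or user.

```python
# Hypothetical first mapping table: function label -> target type.
FIRST_MAPPING_TABLE = {
    "achromatic": "achromatic superlens",
    "wide-angle": "wide-field superlens",
    "high-efficiency": "high-transmission superlens",
}

# Hypothetical priority rule: earlier labels win when several match.
PRIORITY = ["wide-angle", "achromatic", "high-efficiency"]


def extract_labels(text, table):
    """Naive keyword extraction: keep table labels that occur in the text."""
    lowered = text.lower()
    return [label for label in table if label in lowered]


def determine_target_type(text, table, priority):
    """Return the target type, or None when no label matches."""
    labels = extract_labels(text, table)
    if not labels:
        return None
    if len(labels) == 1:
        return table[labels[0]]
    # Multiple matches: pick the label ranked first in the priority list.
    return table[min(labels, key=priority.index)]


single = determine_target_type(
    "Design an achromatic superlens covering the 400-760nm band",
    FIRST_MAPPING_TABLE, PRIORITY)
multi = determine_target_type(
    "Design a wide-angle achromatic superlens",
    FIRST_MAPPING_TABLE, PRIORITY)
```

A production system would likely replace the substring test with the semantic analysis module's keyword classifier; the lookup and priority logic would stay the same.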

[0075] The following explains in detail how to determine the target superlens using the target superlens design parameters, including: inputting the inverse design latent features into the decoder to obtain a structural feature image and spectral response information, where the decoder includes at least one of the following: an ST-VAE decoder and a generator decoder; determining the arrangement of the unit array and the topology of the units based on the structural feature image; determining the spectral refractive index based on the spectral response information; and determining the target superlens based on the arrangement of the unit array, the topology of the units, and the spectral refractive index.

[0076] In the present disclosure, the decoder can be understood as a model component for decoding high-dimensional inverse design latent features into directly resolvable superlens feature data, and its core function is to restore the structural and spectral information condensed in the latent features. Among them, the ST-VAE decoder can be understood as a decoding component paired with the ST-VAE feature encoding module, which restores the compressed encoded features into structured structural and spectral data based on the probabilistic decoding mechanism of the variational autoencoder; the generator decoder can be understood as a decoding component constructed based on a generative model (such as a diffusion generator), which can generate high-fidelity structural feature images and spectral response curves that conform to physical laws.

[0077] In the present disclosure, the structural feature image can be understood as two-dimensional or three-dimensional image data output by the decoder that visually presents the physical structure of the superlens, including details such as the shape and size of the superlens units and the arrangement of the units in the plane; it is an intuitive carrier of the structural parameters.

[0078] In the present disclosure, the spectral response information can be understood as a data set output by the decoder that quantitatively characterizes the optical performance of the superlens, including indicators such as transmittance or reflectance and polarization response at different wavelengths; it is the core basis for deriving the spectral refractive index.

[0079] Specifically, determining the target superlens can include the following steps. First, the inverse design latent features are input into the decoder (for example, using the ST-VAE decoder and the generator decoder in combination), and the decoder outputs the corresponding structural feature image and spectral response information based on the pre-trained parameters. Second, parameter parsing is performed on the structural feature image: the shape of the unit (such as annular or cross-shaped) is extracted through an image recognition algorithm to determine the unit topology, and the arrangement rule of the units (such as square periodic arrangement or hexagonal aperiodic arrangement) is identified to determine the arrangement of the unit array. Third, parameter derivation is performed on the spectral response information: combined with the preset material optical model, the spectral refractive index of the superlens (that is, the refractive index values at different wavelengths) is calculated from the transmittance corresponding to different wavelengths in the spectral response curve. Fourth, the unit array arrangement, unit topology, and spectral refractive index are integrated to form a complete set of target superlens design parameters. Finally, physical verification is performed on the design parameter set: whether the unit size is suitable for the micro-nano fabrication process, and whether the spectral refractive index matches the characteristics of the selected optical material (such as silicon or silicon nitride). After the verification is passed, the target superlens that meets the text requirements can be determined.
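The final physical verification step might be sketched as a simple rule check. The parameter keys (`unit_size_nm`, `spectral_refractive_index`), the fabrication limit, and the refractive index range are hypothetical inputs supplied by the caller; the disclosure does not give concrete thresholds.

```python
def verify_design(params, min_feature_nm, index_range):
    """Return a list of problems; an empty list means the design passed."""
    problems = []
    # Check 1: the unit must be large enough for the fabrication process.
    if params["unit_size_nm"] < min_feature_nm:
        problems.append("unit size below fabrication limit")
    # Check 2: every refractive index must fit the chosen material's range.
    low, high = index_range
    for wavelength_nm, n in params["spectral_refractive_index"].items():
        if not low <= n <= high:
            problems.append(
                f"refractive index {n} at {wavelength_nm} nm outside material range")
    return problems


# Toy design parameter set (all values are placeholders).
design = {
    "unit_array_arrangement": "square periodic",
    "unit_topology": "cross-shaped",
    "unit_size_nm": 320.0,
    "spectral_refractive_index": {450: 2.05, 550: 2.02, 650: 2.00},
}
issues = verify_design(design, min_feature_nm=100.0, index_range=(1.9, 2.1))
```

Returning a list of problems rather than a single boolean makes it easier to report why a candidate design was rejected before another generation attempt.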

[0080] Exemplarily, Figure 2 is a complete flowchart of the inverse design of a frequency-domain decoupled superlens provided by an embodiment of the present disclosure. As can be seen from Figure 2, the process begins with a multimodal input layer, which integrates three types of input (text semantic description, reference image data, and spectral response information) to achieve unified input of design requirements and basic data. Subsequently, a semantically driven generation layer completes text parsing, multimodal feature extraction, and cross-modal alignment, providing standardized features for subsequent processing. Next, a multimodal feature encoding layer, using CWNet feature decomposition, global or local feature processing, and an ST-VAE encoder, completes multi-scale decomposition and compressed encoding of the features. Then, the generation core engine takes over, using the DGTalker decoupling module to achieve feature decoupling and fusion, combined with CFM conditional matching, J-DiT generation, and inverse integral generation to obtain accurate latent features. Finally, the output reconstruction layer, through the ST-VAE decoder, superlens structure generation, and spectral response prediction, completes the transformation from latent features to superlens design results, forming an end-to-end inverse design chain from input through feature processing and generation to output.

[0081] This disclosure also provides a frequency-domain decoupled superlens reverse design apparatus. Figure 3 is a structural block diagram of a frequency-domain decoupled superlens reverse design device provided in an embodiment of this disclosure. As shown in Figure 3, the frequency-domain decoupled superlens reverse design device 300 includes: an acquisition unit 301, used to acquire text information of the superlens to be designed, the text information including the spectral information of the superlens target and/or the structural information of the superlens target; a determining unit 302, used to determine the target type of the superlens based on the text information of the superlens to be designed; a combination unit 303, used to combine the text information of the superlens to be designed, the spectral response sample set of the superlens, and the structural sample set of the superlens into multimodal input data; a reverse design unit 304, used to input the multimodal input data into the pre-trained first model to obtain the reverse design latent features corresponding to the target type, where the first model is determined based on the semantic analysis module, the continuous wavelet network CWNet feature decomposition module, the progressive multi-detail feature encoding module, the joint diffusion self-attention J-DiT generation module, the conditional flow matching loss CFM module, the decoupled generative DGFM fusion module, and the Euler inverse integral module; and a decoding unit 305, used to decode the target superlens design parameters based on the reverse design latent features and to determine the target superlens based on the target superlens design parameters, where the target superlens design parameters include the arrangement of the unit array, the topology of the unit, and the spectral refractive index.

[0082] In one exemplary embodiment, the reverse design unit 304 is further used to train the first model, where training the first model includes: acquiring a training dataset, the training dataset including training text information, training structural information, and training spectral response information; using the semantic analysis module to perform semantic parsing on the training dataset to obtain semantic features, structural features, and spectral features; using the CWNet feature decomposition module to perform multi-scale feature decomposition on the semantic features, structural features, and spectral features to obtain multiple feature decomposition results; using the progressive multi-detail feature encoding module to compress and encode the multiple feature decomposition results to obtain encoded features; using the J-DiT generation module and the CFM module to perform generative modeling and loss constraints on the encoded features to obtain candidate latent features; and using the DGFM fusion module and the Euler inverse integral module to decouple and inversely integrate the candidate latent features to obtain the inverse latent features corresponding to the training dataset.

[0083] In one exemplary embodiment, the reverse design unit 304 is specifically used to: perform multi-scale feature decomposition on semantic features, structural features and spectral features using the CWNet feature decomposition module to obtain multiple feature decomposition results, including: performing multi-scale decomposition on semantic features using continuous wavelets to obtain the decomposition results corresponding to the semantic features; performing high-frequency or low-frequency separation on structural features using wavelet decomposition to obtain the decomposition results corresponding to the structural features; and performing wavelet packet decomposition on spectral features to obtain the decomposition results corresponding to the spectral features.

[0084] In one exemplary embodiment, the reverse design unit 304 is specifically used to: compress and encode multiple feature decomposition results using a progressive multi-detail feature encoding module to obtain encoded features, including: inputting multiple feature decomposition results into the progressive multi-detail feature encoding module, learning the spatiotemporal correlation between multiple feature decomposition results through a spatiotemporal convolutional layer; and outputting the compressed encoded features based on the spatiotemporal correlation and probability distribution constraints.

[0085] In one exemplary embodiment, the reverse design unit 304 is specifically used to: perform generative modeling and loss constraints on the encoded features using the J-DiT generation module and the CFM module to obtain candidate latent features, including: inputting the encoded features into the J-DiT generation module and determining the dependencies of the encoded features through a self-attention mechanism; using the CFM module to determine the linear interpolation path between the noisy latent representation and the conditional latent representation of the encoded features, and determining the mean square error based on the linear interpolation path; and outputting candidate latent features based on the dependencies and the mean square error.
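The linear interpolation path and mean square error described above correspond to the usual form of a conditional flow matching objective. As a minimal sketch, assuming the path is x_t = (1 - t) x_noise + t x_cond and the regression target is the constant velocity x_cond - x_noise, the loss for one sample can be written as follows. The predictor passed to `cfm_loss` is a hypothetical stand-in for the J-DiT generation module, whose architecture is not reproduced here.

```python
from typing import Callable, List

Vector = List[float]


def interpolate(x_noise: Vector, x_cond: Vector, t: float) -> Vector:
    """Point on the straight path from the noise latent to the conditional latent."""
    return [(1.0 - t) * a + t * b for a, b in zip(x_noise, x_cond)]


def target_velocity(x_noise: Vector, x_cond: Vector) -> Vector:
    """Time derivative of the linear path, which is constant in t."""
    return [b - a for a, b in zip(x_noise, x_cond)]


def cfm_loss(
    predict_velocity: Callable[[Vector, float], Vector],
    x_noise: Vector,
    x_cond: Vector,
    t: float,
) -> float:
    """Mean square error between predicted and target velocity at time t."""
    x_t = interpolate(x_noise, x_cond, t)
    v_pred = predict_velocity(x_t, t)
    v_true = target_velocity(x_noise, x_cond)
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_true)) / len(v_true)


if __name__ == "__main__":
    noise, cond = [0.0, 0.0], [2.0, 4.0]

    def zero_model(x: Vector, t: float) -> Vector:
        # Untrained placeholder that always predicts zero velocity.
        return [0.0 for _ in x]

    print(cfm_loss(zero_model, noise, cond, 0.5))
```

In training, t would normally be sampled at random for each example and the loss averaged over a batch; that loop is omitted to keep the sketch short.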

[0086] In one exemplary embodiment, the determining unit 302 is specifically used to: obtain a first mapping table; the first mapping table is used to indicate the correspondence between the superlens function tags and the target type; obtain the superlens function tags of the text information of the superlens to be designed; match the superlens function tags with the first mapping table, and determine the matching result as the target type of the superlens.
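The lookup described above can be sketched as a dictionary match. The table entries below are hypothetical examples rather than values taken from this disclosure, and the substring-based tag extraction is a simple stand-in for the semantic analysis that would produce the function tags in practice.

```python
from typing import Dict, List, Optional

# Hypothetical first mapping table: function tag -> target type.
FIRST_MAPPING_TABLE: Dict[str, str] = {
    "achromatic": "achromatic superlens",
    "wide field of view": "wide-angle superlens",
    "high numerical aperture": "high-NA superlens",
}


def extract_function_tags(text: str, table: Dict[str, str]) -> List[str]:
    """Return the known tags that occur in the text (case-insensitive)."""
    lowered = text.lower()
    return [tag for tag in table if tag in lowered]


def determine_target_type(text: str, table: Dict[str, str]) -> Optional[str]:
    """Match the extracted tags against the table; None if nothing matches."""
    tags = extract_function_tags(text, table)
    if not tags:
        return None
    # Assumption: the first matching tag, in table order, decides the type.
    return table[tags[0]]


if __name__ == "__main__":
    request = "Design an achromatic lens covering 450-650 nm."
    print(determine_target_type(request, FIRST_MAPPING_TABLE))
```

How conflicts between several matching tags should be resolved is not stated in the disclosure; taking the first match is only one possible policy.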

[0087] In one exemplary embodiment, for the combination unit 303, the superlens spectral response sample set includes: spectral response curves, transmission efficiency, incident angle, and polarization information; and the superlens structure sample set includes: superlens unit topology images, unit array arrangement information, and unit size information.
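One way to picture the two sample sets and their combination with the text information is as plain data records. The sketch below is illustrative only: the field names, units, and types are assumptions, since the disclosure lists the kinds of information but not their encoding.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SpectralResponseSample:
    wavelengths_nm: List[float]        # sampling points of the curve (assumed unit)
    response_curve: List[float]        # response value at each wavelength
    transmission_efficiency: float     # assumed to be a fraction in [0, 1]
    incident_angle_deg: float
    polarization: str                  # e.g. "TE", "TM" (assumed labels)

    def __post_init__(self) -> None:
        if len(self.wavelengths_nm) != len(self.response_curve):
            raise ValueError("curve and wavelength grid must have equal length")


@dataclass
class StructureSample:
    unit_topology_image: List[List[int]]  # binary image of one unit (assumed)
    array_arrangement: str                # e.g. "square", "hexagonal" (assumed)
    unit_size_nm: Tuple[float, float, float]


@dataclass
class MultimodalInput:
    text: str
    spectral_samples: List[SpectralResponseSample]
    structure_samples: List[StructureSample]


def combine(
    text: str,
    spectral_samples: List[SpectralResponseSample],
    structure_samples: List[StructureSample],
) -> MultimodalInput:
    """Bundle the three modalities into one input record."""
    return MultimodalInput(text, list(spectral_samples), list(structure_samples))


if __name__ == "__main__":
    spec = SpectralResponseSample([450.0, 550.0, 650.0], [0.8, 0.9, 0.85], 0.9, 0.0, "TE")
    struct = StructureSample([[0, 1], [1, 0]], "square", (300.0, 300.0, 600.0))
    print(combine("achromatic lens, 450-650 nm", [spec], [struct]))
```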

[0088] In one exemplary embodiment, the decoding unit 305 is specifically used to: input the reverse design latent features into the decoder to obtain a structural feature image and spectral response information; the decoder includes at least one of the following: an ST-VAE decoder and a generator decoder; based on the structural feature image, determine the arrangement of the unit array and the topology of the units; based on the spectral response information, determine the spectral refractive index; based on the arrangement of the unit array, the topology of the units, and the spectral refractive index, determine the target superlens.

[0089] Figure 4 is a hardware block diagram of an electronic device provided according to an embodiment of the present disclosure. The electronic device 400 according to an embodiment of the present disclosure includes at least a processor and a memory for storing computer-readable instructions. When the computer-readable instructions are loaded and executed by the processor, the processor performs the frequency-domain decoupled superlens inverse design method described in any of the preceding embodiments of the present disclosure.

[0090] The electronic device 400 illustrated in Figure 4 specifically includes a central processing unit (CPU) 401, a graphics processing unit (GPU) 402, and a memory 403. These units are interconnected via a bus 404. The CPU 401 and / or GPU 402 can function as the aforementioned processor, and the memory 403 can function as the aforementioned memory storing computer-readable instructions. Furthermore, the electronic device 400 may also include a communication unit 405, a storage unit 406, an output unit 407, an input unit 408, and an external device 409, all of which are also connected to the bus 404.

[0091] In summary, this disclosure provides a frequency-domain decoupled superlens inverse design method and apparatus. This disclosure involves acquiring textual information about the superlens to be designed; this textual information includes: target spectral information and / or target structural information of the superlens; based on the textual information, determining the target type of the superlens; combining the textual information, the superlens spectral response sample set, and the superlens structural sample set into multimodal input data; inputting the multimodal input data into a pre-trained first model to obtain the inverse design latent features corresponding to the target type; the first model is determined based on a semantic analysis module, a continuous wavelet network (CWNet) feature decomposition module, a progressive multi-detail feature encoding module, a joint diffusion self-attention (J-DiT) generation module, a conditional flow matching loss (CFM) module, a decoupled generative DGFM fusion module, and an Euler inverse integral module; based on the inverse design latent features, decoding and outputting the target superlens design parameters, and determining the target superlens based on these parameters; the target superlens design parameters include: the arrangement of the unit array, the topology of the units, and the spectral refractive index. In contrast to existing superlens inverse design methods, which are bound to a single task, suffer from distortion and redundancy in multimodal information fusion, and rely on repeated iterative trial and error leading to low efficiency and accuracy, this disclosure achieves targeted improvements throughout the entire process through a unified architecture driven by text and with multi-module collaboration. 
Specifically, it removes the single-task limitation by combining the text information with the design requirements and matching the corresponding target type, allowing flexible adaptation to various types of superlens designs without retraining models for different scenarios. At the same time, the progressive multi-detail feature encoding module of the first model can efficiently compress and structure the fused multimodal features. The joint diffusion self-attention (J-DiT) generation module and the conditional flow matching loss (CFM) module can accurately generate and optimize latent features under the target type constraints, reducing design bias. Furthermore, the decoupled generative DGFM fusion module and the continuous wavelet network (CWNet) feature decomposition module of the first model perform hierarchical decoupling and accurate fusion of the multimodal information, avoiding the information distortion caused by forced alignment and eliminating the redundant conflicts caused by simple superposition, thereby preserving the integrity of the fused features. The Euler inverse integral module enables efficient decoding from latent features to superlens design parameters, reducing iterative trial and error. In summary, the technical solution provided in this disclosure can be applied flexibly, improves accuracy and efficiency, and can adapt to various application scenarios.

[0092] Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this disclosure.

[0093] The basic principles of this disclosure have been described above with reference to specific embodiments. However, it should be noted that the advantages, benefits, and effects mentioned in this disclosure are merely examples and not limitations, and should not be considered as essential features of each embodiment of this disclosure. Furthermore, the specific details disclosed above are provided only for purposes of illustration and ease of understanding, and are not limitations. These details do not require that this disclosure be implemented using the aforementioned specific details.

[0094] The block diagrams of components, apparatuses, devices, and systems disclosed herein are merely illustrative examples and are not intended to require or imply that they must be connected, arranged, or configured in the manner shown in the block diagrams. As those skilled in the art will recognize, these components, apparatuses, devices, and systems can be connected, arranged, and configured in any manner. Words such as “comprising,” “including,” and “having” are open-ended terms meaning “including but not limited to,” and are used interchangeably with that phrase. The terms “or” and “and” as used herein mean “and / or,” and are used interchangeably with it unless the context clearly indicates otherwise. The term “such as” as used herein means “such as but not limited to,” and is used interchangeably with it.

[0095] Additionally, as used herein, the "or" used in a list of items beginning with "at least one" indicates a disjunctive list, such that a list of, for example, "at least one of A, B, or C" means A or B or C, or AB or AC or BC, or ABC (i.e., A and B and C). Furthermore, the word "exemplary" does not imply that the described example is preferred or better than other examples.

[0096] It should also be noted that in the systems and methods of this disclosure, the components or steps can be decomposed and / or recombined. These decompositions and / or recombinations should be considered as equivalent solutions to this disclosure.

[0097] Various changes, substitutions, and modifications can be made to the technology described herein without departing from the teachings defined by the appended claims. Furthermore, the scope of the claims of this disclosure is not limited to the specific aspects of the processes, machines, manufactures, events, means, methods, and actions described above. Currently existing or later-developed processes, machines, manufactures, events, means, methods, or actions that perform substantially the same function or achieve substantially the same result as the corresponding aspects described herein can be utilized. Therefore, the appended claims include such processes, machines, manufactures, events, means, methods, or actions within their scope.

[0098] The above description of the disclosed aspects is provided to enable any person skilled in the art to make or use this disclosure. Various modifications to these aspects will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other aspects without departing from the scope of this disclosure. Therefore, this disclosure is not intended to be limited to the aspects shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

[0099] The above description has been given for purposes of illustration and description. Furthermore, this description is not intended to limit the embodiments of this disclosure to the forms disclosed herein. Although numerous exemplary aspects and embodiments have been discussed above, those skilled in the art will recognize certain variations, modifications, alterations, additions, and sub-combinations therein.

Claims

1. A frequency-domain decoupled superlens inverse design method, characterized in that the method includes: obtaining text information of a superlens to be designed, the text information including target spectral information of the superlens and / or target structural information of the superlens; determining a target type of the superlens based on the text information of the superlens to be designed; combining the text information of the superlens to be designed, a superlens spectral response sample set, and a superlens structure sample set into multimodal input data; inputting the multimodal input data into a pre-trained first model to obtain inverse design latent features corresponding to the target type, the first model being determined based on a semantic analysis module, a continuous wavelet network (CWNet) feature decomposition module, a progressive multi-detail feature encoding module, a joint diffusion self-attention (J-DiT) generation module, a conditional flow matching loss (CFM) module, a decoupled generative DGFM fusion module, and an Euler inverse integral module; and decoding and outputting target superlens design parameters based on the inverse design latent features, and determining a target superlens based on the target superlens design parameters, the target superlens design parameters including: the arrangement of the unit array, the topology of the units, and the spectral refractive index.

2. The method according to claim 1, characterized in that the method further includes training the first model, wherein training the first model includes: obtaining a training dataset, the training dataset including training text information, training structural information, and training spectral response information; using the semantic analysis module to perform semantic parsing on the training dataset to obtain semantic features, structural features, and spectral features; using the CWNet feature decomposition module to perform multi-scale feature decomposition on the semantic features, structural features, and spectral features to obtain multiple feature decomposition results; using the progressive multi-detail feature encoding module to compress and encode the multiple feature decomposition results to obtain encoded features; using the J-DiT generation module and the CFM module to perform generative modeling and loss constraints on the encoded features to obtain candidate latent features; and using the DGFM fusion module and the Euler inverse integral module to decouple and inversely integrate the candidate latent features to obtain the inverse latent features corresponding to the training dataset.

3. The method according to claim 2, characterized in that using the CWNet feature decomposition module to perform multi-scale feature decomposition on the semantic features, structural features, and spectral features to obtain multiple feature decomposition results includes: performing multi-scale decomposition on the semantic features using continuous wavelets to obtain the decomposition results corresponding to the semantic features; separating the structural features into high-frequency and low-frequency components using wavelet decomposition to obtain the decomposition results corresponding to the structural features; and performing wavelet packet decomposition on the spectral features to obtain the decomposition results corresponding to the spectral features.

4. The method according to claim 2, characterized in that using the progressive multi-detail feature encoding module to compress and encode the multiple feature decomposition results to obtain encoded features includes: inputting the multiple feature decomposition results into the progressive multi-detail feature encoding module, and learning the spatiotemporal correlation between the multiple feature decomposition results through a spatiotemporal convolutional layer; and outputting the compressed encoded features based on the spatiotemporal correlation and probability distribution constraints.

5. The method according to claim 2, characterized in that using the J-DiT generation module and the CFM module to perform generative modeling and loss constraints on the encoded features to obtain candidate latent features includes: inputting the encoded features into the J-DiT generation module, and determining the dependencies of the encoded features through a self-attention mechanism; using the CFM module to determine the linear interpolation path between the noise latent representation and the conditional latent representation of the encoded features, and determining the mean square error based on the linear interpolation path; and outputting the candidate latent features based on the dependencies and the mean square error.

6. The method according to claim 1, characterized in that determining the target type of the superlens based on the text information of the superlens to be designed includes: obtaining a first mapping table, the first mapping table being used to indicate the correspondence between superlens function tags and target types; obtaining the superlens function tag from the text information of the superlens to be designed; and matching the superlens function tag against the first mapping table, and determining the matching result as the target type of the superlens.

7. The method according to claim 1, characterized in that the superlens spectral response sample set includes: spectral response curves, transmission efficiency, incident angle, and polarization information; and the superlens structure sample set includes: superlens unit topology images, unit array arrangement information, and unit size information.

8. The method according to claim 1, characterized in that decoding and outputting the target superlens design parameters based on the inverse design latent features, and determining the target superlens based on the target superlens design parameters, includes: inputting the inverse design latent features into a decoder to obtain a structural feature image and spectral response information, the decoder including at least one of the following: a spatiotemporal variational autoencoder (ST-VAE) decoder and a generator decoder; determining the arrangement of the unit array and the topology of the units based on the structural feature image; determining the spectral refractive index based on the spectral response information; and determining the target superlens based on the arrangement of the unit array, the topology of the units, and the spectral refractive index.

9. A frequency-domain decoupled superlens inverse design device, characterized in that the device includes: an acquisition unit, used to acquire text information of a superlens to be designed, the text information including target spectral information of the superlens and / or target structural information of the superlens; a determining unit, used to determine a target type of the superlens based on the text information of the superlens to be designed; a combination unit, used to combine the text information of the superlens to be designed, a superlens spectral response sample set, and a superlens structure sample set into multimodal input data; a reverse design unit, used to input the multimodal input data into a pre-trained first model to obtain inverse design latent features corresponding to the target type, the first model being determined based on a semantic analysis module, a continuous wavelet network (CWNet) feature decomposition module, a spatiotemporal variational autoencoder progressive multi-detail feature encoding module, a joint diffusion self-attention (J-DiT) generation module, a conditional flow matching loss (CFM) module, a decoupled generative DGFM fusion module, and an Euler inverse integral module; and a decoding unit, used to decode and output target superlens design parameters based on the inverse design latent features, and to determine a target superlens based on the target superlens design parameters, the target superlens design parameters including: the arrangement of the unit array, the topology of the units, and the spectral refractive index.

10. An electronic device, characterized in that it includes: a memory, used to store computer-readable instructions; and a processor for executing the computer-readable instructions, causing the electronic device to perform the method according to any one of claims 1-8.