System and method for improving the measurement performance of characterization systems

By training multiple machine learning models with uncertainty estimators, the system enhances characterization system performance by selecting the best model for accurate measurements, addressing the limitations of existing technologies.

JP2026518917A · Pending · Publication Date: 2026-06-11 · KLA CORP

Patent Information

Authority / Receiving Office
JP · JP
Patent Type
Applications
Current Assignee / Owner
KLA CORP
Filing Date
2024-05-07
Publication Date
2026-06-11

AI Technical Summary

Technical Problem

Characterization systems face deteriorating measurement performance when data exceeds the range of the data distribution used to create and optimize the measurement recipe, leading to poor performance during runtime.

Method used

A system and method that trains multiple machine learning models with uncertainty estimators, selecting the best model based on the lowest uncertainty estimator to generate measurement outputs, using empirical or simulated data labeled with known information.

Benefits of technology

Improves measurement performance by reducing errors and uncertainty in characterization systems, especially for samples beyond the training data range.

✦ Generated by Eureka AI based on patent content.

Patent Text Reader

Abstract

A method for improving the measurement performance of a characterization system is disclosed. The method may include training a plurality of machine learning models on a set of training data, each machine learning model being capable of generating an uncertainty estimator, where a first machine learning model differs from one or more additional machine learning models in at least one of a set of hyperparameters or a dataset. The method may further include receiving a plurality of sample measurement datasets from one or more test samples. For each sample measurement dataset, the method may further include applying each trained machine learning model to determine a measurement and an uncertainty estimator for that model, and generating a measurement output based on the N trained machine learning models having the lowest uncertainty estimators.

Description

【Technical Field】 【0001】 The present disclosure generally relates to characterization systems, and more particularly to a system and method for improving the measurement performance of a characterization system. 【Background Art】 【0002】 A characterization system typically evaluates (e.g., inspects or measures) various characteristics of a sample. For example, a metrology system often measures various characteristics of a sample. As sample sizes decrease and sample density increases, the demands placed on the characterization systems used to evaluate the sample increase. Various techniques have been developed to obtain sample measurement data. The collected measurement data can be analyzed by a number of data fitting and optimization techniques (e.g., algorithms, models, or the like). The measurement performance of such a system often deteriorates when the measured data falls outside the data distribution used to create and optimize the measurement recipe. For example, the measurement recipe is often tuned on data within the distribution, and the same recipe is then used at runtime on new data outside the distribution, in which case the runtime measurement performance of the recipe is typically poor. 【Prior Art Documents】 【Patent Documents】 【0003】 【Patent Document 1】 U.S. Patent Application Publication No. 2021/0357402 【Patent Document 2】 U.S. Patent Application Publication No. 2023/0129132 【Summary of the Invention】 【Problems to be Solved by the Invention】 【0004】 Therefore, it is desirable to provide a system and method that remedy the shortcomings described above. 【Means for Solving the Problems】 【0005】 A characterization system according to one or more embodiments of the present disclosure is disclosed. In embodiments, the system includes one or more controllers, each including one or more processors configured to execute a set of program instructions stored in memory.
In embodiments, the set of program instructions is configured to cause the one or more processors to train a plurality of machine learning models based on a set of training data, which includes empirical data or simulated data labeled based on known information. In embodiments, each of the plurality of machine learning models is capable of generating an uncertainty estimator. In embodiments, a first machine learning model among the plurality of machine learning models differs from one or more additional machine learning models based on at least one of a set of hyperparameters or a dataset. In embodiments, the set of program instructions is configured to cause the one or more processors to receive a plurality of sample measurement datasets from one or more test samples. In embodiments, for each of the plurality of sample measurement datasets, the set of program instructions is configured to cause the one or more processors to apply each trained machine learning model to determine a measurement and an uncertainty estimator for each trained machine learning model. In embodiments, for each of the plurality of sample measurement datasets, the set of program instructions is configured to cause the one or more processors to generate a measurement output based on the N trained machine learning models having the lowest uncertainty estimators, wherein the N trained machine learning models are a subset of the plurality of trained machine learning models. 【0006】 A characterization system according to one or more embodiments of the present disclosure is disclosed. In embodiments, the system includes a characterization subsystem. In embodiments, the system includes one or more controllers, each including one or more processors that are communicatively coupled to the characterization subsystem and configured to execute a set of program instructions stored in memory.
In an embodiment, the set of program instructions is configured to cause one or more processors to train a plurality of machine learning models based on a set of training data, which includes empirical data or simulated data labeled based on known information. In an embodiment, each of the plurality of machine learning models has the ability to generate an uncertainty estimator. In an embodiment, a first machine learning model among the plurality of machine learning models is different from one or more additional machine learning models based on at least one of a set of hyperparameters or a dataset. In an embodiment, the set of program instructions is configured to cause one or more processors to receive a plurality of sample measurement datasets from one or more test samples. In an embodiment, for each of the plurality of sample measurement datasets, the set of program instructions is configured to cause one or more processors to apply each trained machine learning model to determine the measurement and uncertainty estimator for each trained machine learning model. In this embodiment, for each of a plurality of sample measurement datasets, the set of program instructions is configured to cause one or more processors to generate measurement outputs based on N trained machine learning models having the lowest uncertainty estimators, wherein the N trained machine learning models are subsets of a plurality of trained machine learning models. 【0007】 Methods according to one or more embodiments of the present disclosure are disclosed. In embodiments, the method includes training a plurality of machine learning models on a set of training data including empirical data or simulated data labeled based on known information. In embodiments, each of the plurality of machine learning models has the ability to generate uncertainty estimators. 
In embodiments, a first machine learning model among the plurality of machine learning models differs from one or more additional machine learning models based on at least one of a set of hyperparameters or a dataset. In embodiments, the method includes receiving a plurality of sample measurement datasets from one or more test samples. In embodiments, for each of the plurality of sample measurement datasets, the method includes applying each trained machine learning model to determine the measurement and uncertainty estimator for each trained machine learning model. In embodiments, for each of the plurality of sample measurement datasets, the method includes generating measurement outputs based on N trained machine learning models having the lowest uncertainty estimators, wherein the N trained machine learning models are a subset of the plurality of trained machine learning models. 【0008】 It should be understood that both the above general description and the following detailed description are illustrative and descriptive, and do not necessarily constitute limitations of the invention as claimed. The accompanying drawings incorporated herein and considered part of this specification illustrate embodiments of the invention and help illustrate the principles of the invention together with the general description. 【0009】 Many of the advantages of this disclosure can be better understood by those skilled in the art by referring to the accompanying figures. [Brief explanation of the drawing] 【0010】 [Figure 1A] This is a simplified schematic block diagram of a characterization system according to one or more embodiments of the present disclosure. [Figure 1B]This is a simplified schematic diagram of a characterization subsystem suitable for optical measurement according to one or more embodiments of the present disclosure. 
[Figure 1C] This is a simplified schematic diagram of a characterization subsystem configured as an X-ray subsystem according to one or more embodiments of the present disclosure. [Figure 1D] This is a simplified schematic diagram of a characterization subsystem configured as a particle beam characterization subsystem according to one or more embodiments of the present disclosure. [Figure 2] This flowchart illustrates a method for improving measurement performance according to one or more embodiments of the present disclosure. [Figure 3] This is a simplified block diagram of a process flow diagram illustrating a method for improving measurement performance according to one or more embodiments of the present disclosure. [Figure 4] This plot shows the data distribution within and outside the training space. [Figure 5A] This plot shows the distribution of absolute errors from an ensemble of models that do not use uncertainty estimators. [Figure 5B] This plot shows the distribution of absolute errors from an ensemble of models when the best model is selected using an uncertainty estimator according to one or more embodiments of the present disclosure. [Figure 6] This bar plot shows the reduction of errors by using an uncertainty estimator to select the best model from an ensemble according to one or more embodiments of the present disclosure. [Modes for carrying out the invention] 【0011】 References to the subject matter of the disclosure illustrated in the accompanying drawings are made in detail hereby. This disclosure has been shown and described in particular with respect to specific embodiments and their particular features. The embodiments described herein are to be understood as illustrations, not limitations. It will be readily apparent to those skilled in the art that various changes and modifications of form and detail can be made without departing from the spirit and scope of this disclosure. 
【0012】 Embodiments of the present disclosure are directed to systems and methods for improving the measurement performance of a characterization system. In particular, embodiments of the present disclosure are directed to systems and methods for improving the measurement performance of a characterization system using an uncertainty estimator. For example, the system may be configured to train a plurality of machine learning models based on a set of training data, and the machine learning models may each differ at least partially from one another (e.g., by being trained on different training data or by having different model hyperparameters). Further, each machine learning model may be capable of generating an uncertainty estimator, such that at runtime the lowest uncertainty estimator among the trained machine learning models can be used to generate one or more measurement outputs. In this regard, the one or more measurements associated with the one or more machine learning models having the lowest uncertainty estimator may be used to generate the measurement output. 【0013】 Referring now to FIGS. 1A-6, systems and methods for improving the measurement performance of a characterization system according to one or more embodiments of the present disclosure are described in more detail. 【0014】 FIG. 1A is a simplified schematic block diagram of a characterization system 100 according to one or more embodiments of the present disclosure. 【0015】 In an embodiment, the characterization system 100 includes a characterization subsystem 102 that can be configured according to a characterization recipe (e.g., an inspection recipe, a metrology recipe, or the like) for generating characterization data associated with a characterization target on a sample 104. 【0016】 In an embodiment, the characterization subsystem 102 includes one or more metrology subsystems.
For example, the metrology subsystem 102 may include an optical metrology subsystem configured to generate optical metrology data associated with the sample 104. As another example, the metrology subsystem 102 may include an X-ray metrology subsystem configured to generate X-ray metrology data associated with the sample 104. As another example, the metrology subsystem 102 may include a particle beam metrology subsystem configured to generate e-beam metrology data associated with the sample 104, such as, but not limited to, a scanning electron microscope (SEM) metrology subsystem, a transmission electron microscope (TEM) metrology subsystem, or the like. 【0017】 In an embodiment, the characterization subsystem 102 includes one or more inspection subsystems. For example, the inspection subsystem 102 may include an optical inspection subsystem configured to generate optical inspection data associated with the sample 104. As another example, the inspection subsystem 102 may include a particle beam inspection subsystem configured to generate e-beam inspection data associated with the sample 104. As another example, the inspection subsystem 102 may include an X-ray inspection subsystem configured to generate X-ray inspection data associated with the sample 104. 【0018】 In an embodiment, the sample 104 is placed on the sample stage 106. The sample stage 106 may include any device suitable for positioning and/or scanning the sample 104 within the characterization subsystem 102. For example, the sample stage 106 may include any combination of a linear translation stage, a rotation stage, a tip/tilt stage, or the like. Thus, the sample stage 106 can align a selected target within the measurement field of view of the characterization subsystem 102 for measurement. 【0019】 Sample 104 may include a substrate formed from a semiconductor or non-semiconductor material (e.g., a wafer or similar).
For example, the semiconductor or non-semiconductor material may include, but is not limited to, single-crystal silicon, gallium arsenide, and indium phosphide. The sample may further include a mask, lens (e.g., a metalens), reticle, or similar formed from the semiconductor or non-semiconductor material. Sample 104 may further include one or more layers disposed on the substrate. For example, such layers may include, but are not limited to, resists, dielectric materials, conductive materials, and semiconducting materials. Many different types of such layers are known in the art, and the term "sample" as used herein is intended to encompass samples on which all types of such layers may be formed. One or more layers formed on the sample may be patterned or unpatterned. For example, the sample may include multiple dies, each having repeatable patterned features. The formation and processing of such layers of material ultimately results in a completed device. Many different types of devices may be formed on a sample, and the term "sample" as used herein is intended to encompass a sample on which any type of device known in the art is fabricated. 【0020】 From here, various configurations of the characterization subsystem 102 according to one or more embodiments of the present disclosure will be described in more detail with reference to Figures 1B to 1D. 【0021】 In a general sense, the characterization subsystem 102 can illuminate the sample 104 with at least one illumination beam and collect at least one measurement signal from the sample 104 depending on the illumination beam. The illumination beam may include, but is not limited to, an optical beam (e.g., a beam of light), an X-ray beam, an electron beam, or an ion beam at any wavelength or wavelength range. 
Thus, the characterization subsystem 102 can operate as an optical characterization subsystem, an X-ray characterization subsystem, an electron beam (e.g., an e-beam) characterization subsystem, or an ion beam characterization subsystem. 【0022】 Figure 1B is a simplified schematic diagram of a characterization subsystem 102 suitable for optical measurements according to one or more embodiments of the present disclosure. For example, Figure 1B may represent a variety of configurations including, but not limited to, a spectroscopic ellipsometer (SE), an SE with multiple illumination angles, an SE that measures Mueller matrix elements (e.g., using a rotating compensator), a single-wavelength ellipsometer, a beam profile ellipsometer (angle-resolved ellipsometer), a beam profile reflectometer (angle-resolved reflectometer), a broadband reflectance spectrometer (spectroreflectometer), a single-wavelength reflectometer, an angle-resolved reflectometer, an imaging system, or a scatterometer (e.g., a speckle analyzer). The wavelength of the optical system can range from about 120 nm to 3 microns. In the case of a non-ellipsometer system, the collected signal can be polarization-resolved or unpolarized. 【0023】 In embodiments, the characterization subsystem 102 includes an illumination source 109 for generating an optical illumination beam 111. The illumination beam 111 may include one or more selected wavelengths of light, including but not limited to ultraviolet (UV) radiation, visible radiation, or infrared (IR) radiation. 【0024】 The illumination source 109 may be any type of illumination source known in the art that is suitable for generating the optical illumination beam 111. In embodiments, the illumination source 109 includes a broadband plasma (BBP) illumination source. In this regard, the illumination beam 111 may include radiation emitted by the plasma.
For example, the BBP illumination source 109 may include, but is not required to include, one or more pump sources (e.g., one or more lasers) configured to focus into a volume of gas such that energy is absorbed by the gas in order to generate or sustain a plasma suitable for emitting radiation. Furthermore, at least a portion of the plasma radiation may be utilized as the illumination beam 111. 【0025】 In embodiments, the illumination source 109 may include one or more lasers. For example, the illumination source 109 may include any laser system known in the art that is capable of emitting radiation in the infrared, visible, or ultraviolet portion of the electromagnetic spectrum. 【0026】 The illumination source 109 can further generate an illumination beam 111 having any temporal profile. For example, the illumination source 109 can generate a continuous illumination beam 111, a pulsed illumination beam 111, or a modulated illumination beam 111. Additionally, the illumination beam 111 may be delivered from the illumination source 109 via free-space propagation or guided light (e.g., an optical fiber, a light pipe, or similar). 【0027】 In embodiments, the illumination source 109 directs the illumination beam 111 towards the sample 104 via an illumination path 113. The illumination path 113 may include one or more illumination path lenses 116 or additional optical components 115 suitable for modifying and/or conditioning the illumination beam 111. For example, the one or more optical components 115 may include, but are not limited to, one or more polarizers, one or more filters, one or more beam splitters, one or more diffusers, one or more homogenizers, one or more apodizers, or one or more beam shapers. 【0028】 In an embodiment, the characterization subsystem 102 includes a detector 118 configured to capture photon or particle emissions (e.g., collected signal 120) from the sample 104 through a collection path 122.
The collection path 122 may include, but is not limited to, one or more collection path lenses 124 for directing at least a portion of the collected signal 120 towards the detector 118. For example, the detector 118 may receive collected, reflected, or scattered light from the sample 104 (e.g., via specular reflection, diffuse reflection, and the like) through the one or more collection path lenses 124. As another example, the detector 118 may receive radiation of one or more diffraction orders (e.g., zero-order diffraction, ±1st-order diffraction, ±2nd-order diffraction, and the like) from the sample 104. As yet another example, the detector 118 may receive radiation generated by the sample 104 (e.g., luminescence associated with the absorption of the illumination beam 111, or the like). 【0029】 In some embodiments, the illumination beam 111 and the collected signal 120 may travel through the same objective lens. For example, the illumination path 113 and the collection path 122 may share the same objective lens. 【0030】 The detector 118 may include any type of detector known in the art that is suitable for measuring the illumination received from the sample 104. For example, the detector 118 may include, but is not limited to, a charge-coupled device (CCD) detector, a time-delay integration (TDI) detector, a photomultiplier tube (PMT), an avalanche photodiode (APD), or similar. In embodiments, the detector 118 may include a spectroscopic detector suitable for identifying the wavelength of light emanating from the sample 104. 【0031】 The collection path 122 may further include any number of collection path lenses 132 or collection optical elements 126 for directing and/or modifying the illumination collected from the sample 104, including but not limited to one or more filters, one or more polarizers, one or more apodizers, or one or more beam blocks.
【0032】 Figure 1C is a simplified schematic diagram of a characterization subsystem 102 configured as an X-ray subsystem according to one or more embodiments of the present disclosure. The characterization subsystem 102 may include any type of X-ray subsystem known in the art that is suitable for providing an X-ray illumination beam 111 and capturing an associated collected signal 120. The collected signal 120 may include, but is not limited to, X-ray emission, optical emission, or particle emission. Examples of X-ray configurations include, but are not limited to, a small-angle X-ray scatterometer (SAXS) or a soft X-ray reflectometer (SXR). 【0033】 In embodiments, the characterization subsystem 102 includes an X-ray illumination path lens 116 suitable for collimating or focusing the X-ray illumination beam 111, and a collection path lens (not shown) suitable for collecting, collimating, and/or focusing X-rays from the sample 104. For example, the characterization subsystem 102 may include, but is not limited to, X-ray collimating mirrors, specular X-ray optics such as grazing-incidence ellipsoidal mirrors, polycapillary optics such as hollow capillary X-ray waveguides, multilayer optics or systems, or any combination thereof. In embodiments, the characterization subsystem 102 includes an X-ray detector 118 such as, but not limited to, an X-ray monochromator (e.g., a crystal monochromator such as a Loxley-Tanner-Bowen monochromator, or similar), an X-ray aperture, an X-ray beam stop, or diffractive optics (e.g., a zone plate). 【0034】 Figure 1D is a simplified schematic diagram of a characterization subsystem 102 configured as a particle beam metrology subsystem (e.g., an e-beam metrology subsystem) according to one or more embodiments of the present disclosure.
【0035】 In embodiments, the characterization subsystem 102 includes one or more particle focusing elements (e.g., illumination path lens 116, collection path lens 124 (not shown), or similar). For example, the one or more particle focusing elements may include, but are not limited to, a single particle focusing element or one or more particle focusing elements forming a compound system. Furthermore, the one or more particle focusing elements may include, but are not limited to, any type of electron lens known in the art, including electrostatic, magnetic, unipotential, or double-potential lenses. It is noted herein that the description of the voltage contrast imaging inspection system as depicted in Figure 1C and the associated descriptions above are provided for illustrative purposes only and should not be construed as limitations. For example, the characterization subsystem 102 may include any excitation source known in the art that is suitable for generating inspection data on the sample 104. In embodiments, the characterization subsystem 102 includes two or more particle beam sources (e.g., electron beam sources or ion beam sources) for generating two or more particle beams. In one embodiment, the characterization subsystem 102 includes one or more components (e.g., one or more electrodes) configured to apply one or more voltages to one or more locations on the sample 104. In this regard, the characterization subsystem 102 can generate voltage contrast imaging data. 【0036】 In embodiments, the characterization subsystem 102 includes one or more particle detectors 118 for imaging or otherwise detecting particles originating from the sample 104. In embodiments, the one or more particle detectors 118 include electron collectors (e.g., secondary electron collectors, backscattered electron detectors, or similar).
In embodiments, the one or more particle detectors 118 include photon detectors (e.g., photodetectors, X-ray detectors, scintillation elements coupled to photomultiplier tube (PMT) detectors, or similar) for detecting electrons and/or photons from the sample surface. 【0037】 Referring collectively to Figures 1A to 1D, the various hardware configurations may be separated into individual operational systems or integrated within a single subsystem. For example, a metrology subsystem may combine multiple hardware configurations within a single subsystem, as generally described in U.S. Patent No. 7,933,026, which is incorporated herein by reference in its entirety. As another example, multiple metrology subsystems may be used for measurements on one or more metrology targets, as generally described in U.S. Patent No. 7,478,019, which is incorporated herein by reference in its entirety. Various hardware configurations are generally described in U.S. Patent Nos. 5,608,526, 5,859,424, and 6,429,943, all of which are incorporated herein by reference in their entirety. 【0038】 The characterization subsystem 102 may be further configured with various hardware configurations to measure various structural and/or material properties of one or more layers of the sample 104 after one or more fabrication steps, including but not limited to overlay, tilt, critical dimensions (CD) of one or more structures, film thickness, or film composition. 【0039】 Referring now to Figures 2 and 3, various method steps for improving the measurement performance of a characterization system according to one or more embodiments of the present disclosure will be described in more detail. The applicant notes that the embodiments and enabling techniques described herein in the context of the characterization system 100 should be interpreted as extending to the following steps. Nevertheless, it should be further noted that the following steps are not limited to the architecture of the characterization system 100.
【0040】 Figure 2 is a flowchart illustrating a method 200 for improving the measurement performance of a characterization system according to one or more embodiments of the present disclosure. 【0041】 In step 202, multiple machine learning models may be trained based on a set of training data. For the purposes of this disclosure, the term “training data” refers to data used as input for training the machine learning models. 【0042】 In embodiments, the training data set may include empirical data obtained from one or more samples. For example, the characterization subsystem 102 may be configured to obtain empirical data from one or more samples and provide the empirical data to the controller 108. 【0043】 In embodiments, the training data set may include simulated data obtained from one or more geometric models. For example, the controller 108 (or an external controller) may be configured to generate one or more geometric models for generating the simulated data. 【0044】 Simulated and/or empirical data may include, but are not limited to, one or more spectra, one or more images, or similar. 【0045】 In the context of supervised learning, the training data set is labeled based on known information. In this regard, the controller 108 can receive reference data associated with the training data set. Thus, the training data set (e.g., empirical data and/or simulated data) as well as the reference data can be used as input for training the multiple machine learning models. The controller 108 may be further configured to store the reference data and the multiple trained machine learning models in memory 112. 【0046】 In embodiments, the multiple machine learning models may differ at least partially from one another. For example, a first machine learning model among the multiple machine learning models may differ from one or more additional machine learning models based on at least one of a set of hyperparameters (i.e., parameters of the machine learning model) or a dataset.
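As a rough illustration of step 202, the sketch below builds a small pool of models that differ either in a hyperparameter or in the data they are trained on. It is only a minimal sketch under stated assumptions: `RidgeLine`, `train_models`, `alphas`, and `n_subsets` are hypothetical names, a one-dimensional ridge regression stands in for the neural-network models named in the disclosure, and bootstrap resampling is just one possible way to give each model a different dataset.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class RidgeLine:
    """Toy 1-D ridge regression y ~ w*x + b (stand-in for a real model).

    `alpha` is the regularization strength, i.e. the hyperparameter.
    """
    alpha: float
    w: float = 0.0
    b: float = 0.0

    def __post_init__(self) -> None:
        if self.alpha <= 0:
            raise ValueError("alpha must be positive")

    def fit(self, xs: Sequence[float], ys: Sequence[float]) -> "RidgeLine":
        n = len(xs)
        mean_x, mean_y = sum(xs) / n, sum(ys) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        self.w = sxy / (sxx + self.alpha)  # slope shrunk by the penalty
        self.b = mean_y - self.w * mean_x
        return self

    def predict(self, x: float) -> float:
        return self.w * x + self.b


def train_models(xs: Sequence[float], ys: Sequence[float],
                 alphas: Sequence[float], n_subsets: int,
                 seed: int = 0) -> List[RidgeLine]:
    """Train models that differ by hyperparameter or by dataset.

    `ys` plays the role of the reference labels (known information).
    """
    rng = random.Random(seed)
    # Same data, different hyperparameters.
    models = [RidgeLine(alpha=a).fit(xs, ys) for a in alphas]
    # Same hyperparameter, different (bootstrap-resampled) datasets.
    for _ in range(n_subsets):
        idx = [rng.randrange(len(xs)) for _ in xs]
        models.append(RidgeLine(alpha=alphas[0]).fit(
            [xs[i] for i in idx], [ys[i] for i in idx]))
    return models
```

Calling `train_models(xs, ys, alphas=[0.1, 1.0, 10.0], n_subsets=2)` would return five fitted models: three that differ only in `alpha` and two that differ only in their training subset.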
【0047】 In one example, if machine learning models differ based on the dataset, each machine learning model may be trained using a different dataset. In this regard, the first machine learning model may include the first dataset, one or more additional machine learning models may include one or more additional datasets, and one or more additional datasets for one or more additional machine learning models may differ from the first dataset of the first machine learning model. 【0048】 In another example, if machine learning models differ based on a set of hyperparameters, each machine learning model may differ based on the hyperparameters of the model itself. In this regard, the first machine learning model may have a first set of hyperparameters, one or more additional machine learning models may have one or more additional sets of hyperparameters, and one or more additional sets of hyperparameters of one or more additional machine learning models may differ from the first set of hyperparameters of the first machine learning model. The set of hyperparameters may include, but is not limited to, neural network layers, neurons, regularization, dropout layers, Monte Carlo dropout, Bayesian neural networks, and similar. 【0049】 In the embodiment, each of the multiple machine learning models has the ability to generate uncertainty estimators. For example, during execution time, the trained machine learning model can be applied to sample measurement data to determine the measurement and uncertainty estimator, as will be discussed further herein. 【0050】 It should be noted that multiple machine learning models may include any type of machine learning algorithm and / or deep learning technique, including but not limited to deep learning regression models, ensemble learning algorithms, artificial neural networks (ANNs), convolutional neural networks (CNNs), residual neural networks, and similar ones. 
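To make the idea of a model that reports an uncertainty estimator more concrete, the following sketch applies Monte Carlo dropout to the inputs of a single linear layer: dropout stays active at inference, several stochastic passes are run, and the spread of the outputs serves as the uncertainty estimator. This is a simplified illustration, not the disclosed implementation; `mc_dropout_predict`, `weights`, `p_drop`, and `passes` are hypothetical names, and a real system would more likely use a trained neural network.

```python
import random
import statistics
from typing import Sequence, Tuple


def mc_dropout_predict(weights: Sequence[float], features: Sequence[float],
                       p_drop: float = 0.2, passes: int = 200,
                       seed: int = 0) -> Tuple[float, float]:
    """Return (measurement, uncertainty) from repeated stochastic passes.

    Each pass randomly drops inputs with probability `p_drop` and rescales
    the kept ones (inverted dropout). The mean over passes is used as the
    measurement and the sample standard deviation as the uncertainty.
    """
    if not 0.0 <= p_drop < 1.0:
        raise ValueError("p_drop must be in [0, 1)")
    if passes < 2:
        raise ValueError("need at least two passes to estimate a spread")
    rng = random.Random(seed)
    keep = 1.0 - p_drop
    outputs = []
    for _ in range(passes):
        total = 0.0
        for w, x in zip(weights, features):
            if rng.random() < keep:  # this input survives the dropout mask
                total += w * x / keep
        outputs.append(total)
    return statistics.fmean(outputs), statistics.stdev(outputs)
```

With `p_drop=0.0` every pass is identical, so the returned uncertainty is zero; with a non-zero dropout rate, the spread grows with the magnitude of the inputs.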
【0051】 In step 204, multiple sample measurement datasets from one or more test samples may be received at runtime. For example, the controller 108 may be configured to receive multiple sample measurement datasets for one or more test samples from the characterization subsystem 102. For example, the characterization subsystem 102 may be configured to generate one or more characterization measurements using the multiple trained machine learning models, depending on the characterization recipe. 【0052】 In step 206, for each of the multiple received sample measurement datasets, each trained machine learning model may be applied to that dataset to determine a measurement and an uncertainty estimator for each of the multiple trained machine learning models. For example, as shown in Figure 3, the controller 108 may be configured to determine the measurement (e.g., a critical parameter (CP)) and the uncertainty estimator (UE) for each trained machine learning model. For example, as shown in Figure 3, a first machine learning model M1 may be applied to the test sample measurement dataset to determine a first CP value and a first UE value, a second machine learning model M2 may be applied to the test sample measurement dataset to determine a second CP value and a second UE value, and so on, up to an Nth machine learning model MN, which may be applied to determine an Nth CP value and an Nth UE value. 【0053】 In embodiments, the uncertainty estimator for each trained machine learning model indicates the measurement uncertainty of that trained machine learning model. For example, multiple trained machine learning models with different hyperparameters may be applied during runtime, and an uncertainty estimator for each model may be provided based on the difference in outputs. 
Alternatively, multiple trained machine learning models trained on different datasets may be applied during runtime, and an uncertainty estimator for each model may be provided based on the difference in outputs. 【0054】 Uncertainty estimators may include, but are not limited to, Bayesian neural networks, Monte Carlo dropout, deep ensembles, and similar techniques. 【0055】 For example, the lower the uncertainty estimator, the less uncertainty there is in the associated measurement. In this regard, the measurement performance of the characterization system can be improved by selecting the N measurements associated with the N lowest uncertainty estimators. 【0056】 In step 208, a measurement output may be generated for each of the multiple received sample measurement datasets. For example, the controller 108 may be configured to generate a measurement output based on the N measurements associated with the N lowest uncertainty estimators, that is, the measurements of the N trained machine learning models having the lowest uncertainty estimators. 【0057】 In embodiments, N may be an integer equal to 1. For example, the controller 108 may be configured to select the one trained machine learning model whose uncertainty estimator has substantially the lowest (or minimum) value. Furthermore, as shown in Figure 3, the controller 108 may be configured to provide the associated measurement of the selected trained machine learning model having the lowest uncertainty estimator as a single measurement output. For example, as shown in Figure 3, the controller 108 may select, from the multiple machine learning models M1 to MN, the trained machine learning model with the lowest UE, and the associated measurement CP may be chosen as the measurement output. 【0058】 In embodiments, N may be an integer greater than or equal to 2. 
For example, the controller 108 may be configured to select two or more uncertainty estimators, where the selected uncertainty estimators have the two or more lowest (or minimum) values. Furthermore, the controller 108 may be configured to generate a measurement output by averaging the associated measurements of the two or more selected trained machine learning models having the two or more lowest uncertainty estimators. In a non-limiting example, if N is 3, the controller 108 may be configured to select the three measurements associated with the three lowest uncertainty estimators and to generate a measurement output by averaging the three selected measurements. In this regard, measurements associated with the three trained machine learning models having the three lowest uncertainty estimators may be used, thus reducing the variability of the uncertainty estimators. It should be noted that the mean may include a sample mean, a lot mean, or a mean based on multiple samples. 【0059】 As discussed earlier in this specification, using uncertainty estimators to select the best machine learning model for a measurement makes the composite model robust in extrapolation (i.e., for samples beyond the training range). For example, as shown in plot 400 in Figure 4, a machine learning model may perform consistently well (i.e., with small errors) within the training distribution (i.e., in interpolation) but poorly outside the distribution (i.e., in extrapolation). In extrapolation, the machine learning model may therefore overestimate or underestimate. Furthermore, some machine learning models extrapolate well (i.e., with small errors) in response to specific changes in the data distribution outside the training range. Selecting the best machine learning model based on the lowest uncertainty estimator reduces errors. 
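The selection of steps 206 and 208 can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function name `measurement_output` and the CP and UE values are hypothetical, and a plain arithmetic mean is used for the averaging. With `n=1` the measurement of the single model with the lowest UE is returned. With `n=3` the three measurements with the lowest UEs are averaged.

```python
from statistics import fmean


def measurement_output(measurements, uncertainties, n=1):
    """Average the n measurements whose uncertainty estimators are lowest.

    Returns the measurement output and the indices of the selected models.
    """
    if len(measurements) != len(uncertainties):
        raise ValueError("need one uncertainty estimator per measurement")
    if not 1 <= n <= len(measurements):
        raise ValueError("n must be between 1 and the number of models")
    # Rank the model indices from lowest to highest uncertainty estimator.
    ranked = sorted(range(len(uncertainties)), key=lambda i: uncertainties[i])
    chosen = ranked[:n]
    return fmean([measurements[i] for i in chosen]), chosen


# Hypothetical CP and UE values from four trained models M1..M4 for one test dataset.
cp = [10.2, 9.8, 12.5, 10.0]
ue = [0.30, 0.10, 0.90, 0.20]

single, picked_one = measurement_output(cp, ue, n=1)      # model with the lowest UE
averaged, picked_three = measurement_output(cp, ue, n=3)  # mean over the three lowest UEs
```

In this made-up example the third model, which has the highest UE and a CP value far from the others, is excluded from both outputs.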
For example, the frequency of errors when using uncertainty estimators is reduced (as depicted in plot 500 in Figure 5A) compared to ensemble models that do not use uncertainty estimators (as depicted in plot 502 in Figure 5B). Furthermore, as shown in Figure 6, the mean absolute error of systems utilizing uncertainty estimators is lower than that of systems without uncertainty estimators. 【0060】 Figures 4-6 are provided for illustrative purposes only and should not be construed as limiting the scope of this disclosure. 【0061】 Referring again to Figure 1A, additional components of the characterization system 100 according to one or more embodiments of the present disclosure will be described in more detail. 【0062】 In embodiments, the characterization system 100 includes a controller 108 that is communicatively coupled to the characterization subsystem 102 and / or any of its components. In embodiments, the controller 108 includes one or more processors 110. For example, the one or more processors 110 may be configured to execute a set of program instructions maintained in a memory device 112 (or memory). The one or more processors 110 of the controller 108 may include any processing elements known in the art. In this sense, the one or more processors 110 may include any microprocessor-type device configured to execute algorithms and / or instructions. 【0063】 The one or more processors 110 of the controller 108 may include any processor or processing element known in the art. For the purposes of this disclosure, the terms “processor” or “processing element” may be broadly defined to include any device having one or more processing or logic elements (e.g., one or more microprocessor devices, one or more application-specific integrated circuit (ASIC) devices, one or more field-programmable gate arrays (FPGAs), or one or more digital signal processors (DSPs)). 
In this sense, the one or more processors 110 may include any device configured to execute algorithms and / or instructions (e.g., program instructions stored in memory). In embodiments, the one or more processors 110 may be embodied as a desktop computer, mainframe computer system, workstation, image computer, parallel processor, network computer, or any other computer system configured to operate the characterization system 100 as described throughout this disclosure, or to execute a program configured to operate with the characterization system 100. Furthermore, different subsystems of the characterization system 100 may include processors or logic elements suitable for performing at least a portion of the steps described herein. Therefore, the above description should be interpreted merely as an illustration and not as a limitation on embodiments of the disclosure. Moreover, the steps described throughout the disclosure may be performed by a single controller, or alternatively, by a number of controllers. Additionally, the controller 108 may include one or more controllers housed in a common housing or in multiple housings. Thus, any controller or combination of controllers may be separately packaged as modules suitable for integration into the characterization system 100. 【0064】 The memory device 112 may include any storage medium known in the art that is suitable for storing program instructions executable by the one or more associated processors 110. For example, the memory device 112 may include a non-transitory memory medium. As another example, the memory device 112 may include, but is not limited to, read-only memory (ROM), random access memory (RAM), magnetic or optical memory devices (e.g., disks), magnetic tapes, solid-state drives, and the like. It is further noted that the memory device 112 may be housed in a common controller housing together with the one or more processors 110. 
In embodiments, the memory device 112 may be located remotely from the physical locations of the one or more processors 110 and the controller 108. For example, the one or more processors 110 of the controller 108 may have access to remote memory (e.g., a server) accessible over a network (e.g., the Internet, an intranet, and the like). 【0065】 The controller 108 can give instructions to the characterization subsystem 102 or any of its components (e.g., via control signals) or receive data from the characterization subsystem 102 or any of its components. The controller 108 may be further configured to perform any of the various process steps described throughout this disclosure. 【0066】 In embodiments, the memory device 112 includes a data server. For example, the data server may collect data from the characterization subsystem 102 or other external subsystems associated with the characterization object in one or more processing steps (e.g., an ADI step, an AEI step, an ACI step, or similar). The data server may also store training data associated with training or otherwise used to generate characterization recipes. The controller 108 can then use any such data to create, update, retrain, or modify characterization recipes used to generate characterization measurements using characterization data from the device object. 【0067】 In embodiments, the characterization system 100 includes a user interface 114 that is communicatively coupled to the controller 108. In embodiments, the user interface 114 may include, but is not limited to, one or more desktops, laptops, tablets, and similar devices. In embodiments, the user interface 114 includes a display used to display data from the characterization system 100 to a user. The display of the user interface 114 may include any display known in the art. For example, the display may include, but is not limited to, a liquid crystal display (LCD), an organic light-emitting diode (OLED) based display, or a CRT display. 
Those skilled in the art will recognize that any display device capable of integrating with the user interface 114 is suitable for the implementations in this disclosure. In embodiments, a user may input selections and / or commands in response to data displayed to the user via a user input device of the user interface 114. 【0068】 All of the methods described herein may include storing the results of one or more steps of an embodiment of the method in memory. The results may include any of the results described herein and may be stored in any form known in the art. The memory may include any of the memories described herein or any other suitable storage medium known in the art. After the results are stored, they may be accessed in memory and used by any of the method or system embodiments described herein, formatted for display to a user, used by another software module, method, or system, and so on. Furthermore, the results may be stored “permanently,” “semi-permanently,” “temporarily,” or for a period of time. For example, the memory may be random-access memory (RAM), and the results may not necessarily persist in memory indefinitely. 【0069】 It is further assumed that each embodiment of the methods described above may include any other step of any other method described herein. In addition, each embodiment of the methods described above may be carried out by any of the systems described herein. 【0070】 Those skilled in the art will recognize that the components, operations, devices, objects, and accompanying discussions described herein are used as examples for conceptual clarity and that various configuration variations are assumed. Accordingly, as used herein, the specific typical examples and accompanying discussions described are intended to be representative of their more general class. 
In general, the use of any particular typical example is intended to be representative of its class, and the exclusion of any specific components, operations, devices, and objects should not be taken as limiting. 【0071】 As used herein, directional terms such as “top,” “bottom,” “over,” “under,” “upper,” “upward,” “lower,” “down,” and “downward” are intended to provide relative positions for illustrative purposes and not to designate an absolute frame of reference. Various modifications to the embodiments described will be apparent to those skilled in the art, and the general principles defined herein may be applicable to other embodiments. 【0072】 With regard to the use of substantially any plural and / or singular terms herein, those skilled in the art can convert from plural to singular and / or singular to plural as appropriate for the context and / or use. Various singular / plural reinterpretations are not explicitly described herein for clarity. 【0073】 The subject matter described herein may illustrate different components that are contained within or connected to other components. Such described architectures are illustrative only, and it should be understood that many other architectures are indeed possible and achieve the same functionality. Conceptually, any arrangement of components to achieve the same functionality is effectively “associated” in such a way that the desired functionality is achieved. Therefore, any two components combined herein to achieve a particular functionality can be considered “associated” with each other, regardless of the architecture or intermediate components, in such a way that the desired functionality is achieved. Similarly, any two such associated components can be considered “connected” or “linked” with each other to achieve the desired functionality, and any two components capable of being associated in such a way can be considered “linkable” with each other to achieve the desired functionality. 
Specific examples of linkability include, but are not limited to, physically interlocking and / or physically interacting components, and / or wirelessly interactable and / or wirelessly interacting components, and / or logically interacting and / or logically interactable components. 【0074】 Furthermore, it should be understood that the present invention is defined in the appended claims. Generally, it will be understood by those skilled in the art that the terms used herein, and in particular in the appended claims (e.g., the body of the appended claims), are intended as "open" terms (for example, the term "including" should be interpreted as "including but not limited to," the term "having" should be interpreted as "having at least," and the term "includes" should be interpreted as "includes but is not limited to," etc.). It will be further understood by those skilled in the art that if a specific number of an introduced claim recitation is intended, such an intent will be explicitly recited in the claim, and in the absence of such a recitation no such intent is present. For example, to aid understanding, the following appended claims may include the use of the introductory phrases "at least one" and "one or more" to introduce claim recitations. Nevertheless, the use of such phrases should not be interpreted as implying that the introduction of a claim recitation by the indefinite article “a” or “an” limits any particular claim containing such an introduced claim recitation to only one such recitation, even when the same claim contains the introductory phrase “one or more” or “at least one” and an indefinite article such as “a” or “an” (for example, “a” and / or “an” should typically be interpreted as meaning “at least one” or “one or more”), and the same is true of the use of definite articles used to introduce claim recitations. 
In addition, even when a specific number of an introduced claim recitation is explicitly recited, a person skilled in the art will recognize that such a recitation should typically be interpreted as meaning at least the recited number (for example, the bare recitation of “two recitations,” without other modifiers, typically means at least two recitations, or two or more recitations). Furthermore, in instances where a convention analogous to "at least one of A, B, and C, and the like" is used, such a construction is generally intended in the sense in which those skilled in the art would understand the convention (for example, "a system having at least one of A, B, and C" should include, but not be limited to, systems having A alone, B alone, C alone, A and B together, A and C together, B and C together, and / or A, B, and C together). It will be further understood by those skilled in the art that any disjunctive word and / or phrase presenting two or more alternative terms, whether in the description, claims, or drawings, should be understood to contemplate the possibilities of including one of the terms, either of the terms, or both terms. For example, the phrase "A or B" would be understood to include the possibilities of "A" or "B" or "A and B". 【0075】 Many of the advantages of this disclosure will be understood from the foregoing description, and it will be clear that various modifications to the form, structure, and arrangement of the components can be made without deviating from the subject matter of the disclosure or sacrificing all of its material advantages. The forms described are for illustrative purposes only, and it is the intent of the following claims to encompass and include such modifications. Furthermore, it should be understood that the present invention is defined in the appended claims.

Claims

[Claim 1] A characterization system comprising: one or more controllers, each including one or more processors configured to execute a set of program instructions stored in memory, wherein the set of program instructions is configured to cause the one or more processors to: train multiple machine learning models based on a set of training data, wherein the set of training data includes empirical data labeled based on known information or simulated data labeled based on known information, each of the multiple machine learning models has the ability to generate uncertainty estimators, and a first machine learning model among the multiple machine learning models differs from one or more additional machine learning models based on at least one of a set of hyperparameters or a dataset; receive multiple sample measurement datasets from one or more test samples; and, for each of the multiple sample measurement datasets: apply each trained machine learning model to determine a measurement and an uncertainty estimator for each trained machine learning model; and generate a measurement output based on N trained machine learning models having the lowest uncertainty estimators, wherein the N trained machine learning models are a subset of the multiple trained machine learning models.
[Claim 2] The system according to claim 1, wherein N is an integer equal to 1.
[Claim 3] The system according to claim 2, wherein generating the measurement output based on the N trained machine learning models having the lowest uncertainty estimators, wherein the N trained machine learning models are a subset of the multiple trained machine learning models, includes: selecting one trained machine learning model having the lowest uncertainty estimator from the multiple trained machine learning models; and providing the associated measurement of the selected trained machine learning model having the lowest uncertainty estimator as the measurement output.
[Claim 4] The system according to claim 1, wherein N is an integer equal to or greater than 2.
[Claim 5] The system according to claim 4, wherein generating the measurement output based on the N trained machine learning models having the lowest uncertainty estimators, wherein the N trained machine learning models are a subset of the multiple trained machine learning models, includes: selecting, from the multiple trained machine learning models, two or more trained machine learning models having the lowest uncertainty estimators; and generating the measurement output by averaging the associated measurements of the two or more selected trained machine learning models having the lowest uncertainty estimators.
[Claim 6] The system according to claim 1, wherein the first machine learning model includes a first set of hyperparameters, the one or more additional machine learning models include one or more additional sets of hyperparameters, and the one or more additional sets of hyperparameters of the one or more additional machine learning models are different from the first set of hyperparameters of the first machine learning model.
[Claim 7] The system according to claim 1, wherein the first machine learning model includes a first dataset, the one or more additional machine learning models include one or more additional datasets, and the one or more additional datasets of the one or more additional machine learning models are different from the first dataset of the first machine learning model.
[Claim 8] The system according to claim 1, wherein the set of hyperparameters comprises at least one of neural network layers, neurons, regularization, dropout layers, Monte Carlo dropout, or Bayesian neural networks.
[Claim 9] The system according to claim 1, wherein the multiple machine learning models comprise at least one of deep learning regression models, ensemble learning algorithms, artificial neural networks, convolutional neural networks, or residual neural networks.
[Claim 10] The system according to claim 1, wherein the uncertainty estimator includes at least one of Bayesian neural networks, Monte Carlo dropout, or deep ensembles.
[Claim 11] The system according to claim 1, further comprising a metrology subsystem connected to the one or more controllers.
[Claim 12] The system according to claim 11, wherein the metrology subsystem comprises at least one of a spectroscopic ellipsometer, a reflectometer, a small-angle X-ray scatterometer, a scanning electron microscope, a transmission electron microscope, or an optical subsystem.
[Claim 13] The system according to claim 1, wherein the sample comprises a substrate.
[Claim 14] The system according to claim 13, wherein the substrate comprises a wafer.
[Claim 15] A characterization system comprising: a characterization subsystem; and one or more controllers connected to the characterization subsystem, the one or more controllers including one or more processors configured to execute a set of program instructions stored in memory, wherein the set of program instructions is configured to cause the one or more processors to: train multiple machine learning models based on a set of training data, wherein the set of training data includes empirical data obtained from samples and labeled based on known information, or simulated data obtained from geometric models of the samples and labeled based on known information, each of the multiple machine learning models has the ability to generate uncertainty estimators, and a first machine learning model among the multiple machine learning models differs from one or more additional machine learning models based on at least one of a set of hyperparameters or a dataset; receive multiple sample measurement datasets from one or more test samples; and, for each of the multiple sample measurement datasets: apply each trained machine learning model to determine a measurement and an uncertainty estimator for each trained machine learning model; and generate a measurement output based on N trained machine learning models having the lowest uncertainty estimators, wherein the N trained machine learning models are a subset of the multiple trained machine learning models.
[Claim 16] The system according to claim 15, wherein N is an integer equal to 1.
[Claim 17] The system according to claim 16, wherein generating the measurement output based on the N trained machine learning models having the lowest uncertainty estimators, wherein the N trained machine learning models are a subset of the multiple trained machine learning models, includes: selecting one trained machine learning model having the lowest uncertainty estimator from the multiple trained machine learning models; and providing the associated measurement of the selected trained machine learning model having the lowest uncertainty estimator as the measurement output.
[Claim 18] The system according to claim 15, wherein N is an integer equal to or greater than 2.
[Claim 19] The system according to claim 18, wherein generating the measurement output based on the N trained machine learning models having the lowest uncertainty estimators, wherein the N trained machine learning models are a subset of the multiple trained machine learning models, includes: selecting, from the multiple trained machine learning models, two or more trained machine learning models having the lowest uncertainty estimators; and generating the measurement output by averaging the associated measurements of the two or more selected trained machine learning models having the lowest uncertainty estimators.
[Claim 20] The system according to claim 15, wherein the first machine learning model includes a first set of hyperparameters, the one or more additional machine learning models include one or more additional sets of hyperparameters, and the one or more additional sets of hyperparameters of the one or more additional machine learning models are different from the first set of hyperparameters of the first machine learning model.
[Claim 21] The system according to claim 15, wherein the first machine learning model includes a first dataset, the one or more additional machine learning models include one or more additional datasets, and the one or more additional datasets of the one or more additional machine learning models are different from the first dataset of the first machine learning model.
[Claim 22] The system according to claim 15, wherein the set of hyperparameters comprises at least one of neural network layers, neurons, regularization, dropout layers, Monte Carlo dropout, or Bayesian neural networks.
[Claim 23] The system according to claim 15, wherein the multiple machine learning models comprise at least one of deep learning regression models, ensemble learning algorithms, artificial neural networks, convolutional neural networks, or residual neural networks.
[Claim 24] The system according to claim 15, wherein the uncertainty estimator includes at least one of Bayesian neural networks, Monte Carlo dropout, or deep ensembles.
[Claim 25] The system according to claim 15, wherein the characterization subsystem comprises a metrology subsystem.
[Claim 26] The system according to claim 25, wherein the metrology subsystem comprises at least one of a spectroscopic ellipsometer, a reflectometer, a small-angle X-ray scatterometer, a scanning electron microscope, a transmission electron microscope, or an optical metrology subsystem.
[Claim 27] A method comprising: training multiple machine learning models based on a set of training data, wherein the set of training data includes empirical data labeled based on known information or simulated data labeled based on known information, each of the multiple machine learning models has the ability to generate uncertainty estimators, and a first machine learning model among the multiple machine learning models differs from one or more additional machine learning models based on at least one of a set of hyperparameters or a dataset; receiving multiple sample measurement datasets from one or more test samples; and, for each of the multiple sample measurement datasets: applying each trained machine learning model to determine a measurement and an uncertainty estimator for each trained machine learning model; and generating a measurement output based on N trained machine learning models having the lowest uncertainty estimators, wherein the N trained machine learning models are a subset of the multiple trained machine learning models.