Mask 3D modeling method for varied illumination incident angles in photolithography

A universal mask simulation model using linear and machine learning components addresses the challenges of predicting IC defects in sub-wavelength lithography, enhancing simulation efficiency and reducing costs and turnaround times.

WO2026131005A1 · PCT designated stage · Publication Date: 2026-06-25 · ASML NETHERLANDS BV


Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: ASML NETHERLANDS BV
Filing Date: 2025-11-25
Publication Date: 2026-06-25

Application Information

Patent Timeline

25 Nov 2025

Application

25 Jun 2026

Publication

WO2026131005A1

IPC: G03F7/20; G03F1/36

CPC: G03F7/705; G03F1/36

AI Tagging

Application Domain

Photomechanical exposure apparatus; Originals for photomechanical treatment

Technology Topics

Angle of incidence; Lithography process


AI Technical Summary

Technical Problem

Current lithography simulation methods struggle with accurately predicting defect formation in ICs due to increasing complexity and sub-wavelength lithography, leading to high costs and long turn-around times, especially when dealing with varied illumination conditions and mask 3D effects.

Method used

A universal mask simulation model is developed using a first formulation for linear effects and a machine learning model for non-linear effects to predict mask 3D images under varied sampling conditions, reducing simulation runtime and enabling accurate full-chip modeling.

Benefits of technology

This approach allows for efficient and accurate simulation of mask 3D images under different conditions, reducing computational resources and time, thereby improving process yield and throughput in IC manufacturing.



Abstract

A method for simulating a lithography process is disclosed. More particularly, a method for simulating a lithography model to generate a mask image, in which the lithography model is universal and can be applied to different modeling conditions is disclosed. The disclosed method provides a lithography model comprising a linear submodel configured to generate a linear component mask image representation and a higher order submodel configured to generate a non-linear component mask image representation. The linear and non-linear components may be combined to generate a resulting simulated M3D image, and the disclosed method does not require multiple models per modeling condition. Thus, the disclosed method may significantly reduce M3D modeling runtime and enable simulation of an M3D image by using a single, universal simulation model.


Description

MASK 3D MODELING METHOD FOR VARIED ILLUMINATION INCIDENT ANGLES IN PHOTOLITHOGRAPHY

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims priority of US application 63/737,516, which was filed on 20 December 2024 and which is incorporated herein in its entirety by reference.

FIELD

[0002] The description herein relates to a method for simulating a lithography process. More particularly, a method for simulating a lithography model to generate an image, in which the lithography model is universal to different modeling conditions, is disclosed.

BACKGROUND

[0003] A lithographic apparatus is a machine that applies a desired pattern onto a target portion of a substrate. The lithographic apparatus can be used, for example, in the manufacture of integrated circuits (ICs). An IC chip in a smart phone can be as small as a person’s thumbnail and may include over 2 billion transistors. Making an IC is a complex and time-consuming process, with circuit components in different layers and including hundreds of individual steps. Errors in even one step may potentially result in problems with the final IC and may cause device failure. Therefore, in manufacturing processes of ICs, unfinished or finished circuit components are inspected to ensure that they are manufactured according to design and are free of defects. Inspection systems utilizing optical microscopes or charged particle (e.g., electron) beam microscopes, such as a scanning electron microscope (SEM), can be employed. As the physical sizes of IC components continue to shrink, accuracy and yield in IC inspection become increasingly important. High process yield and high wafer throughput can be impacted by the presence of defects, especially if operator intervention is required for reviewing the defects. Therefore, modeling lithographic conditions that may accurately predict the formation of defects in ICs before fabrication occurs is desired.

SUMMARY

[0004] Embodiments of the present disclosure provide a method for simulating a lithography model to generate an image, in which the lithography model is universal to different modeling conditions.

[0005] In some embodiments, the present disclosure provides a method for simulating a lithography process, the method comprising obtaining a mask pattern representation and a modeling condition, and generating an output mask pattern representation for the modeling condition, wherein generating the output mask pattern representation comprises using a mask simulation model that is universal to different modeling conditions.

[0006] In some embodiments, the present disclosure provides a non-transitory computer readable medium comprising a set of instructions that is executable by one or more processors of a computing device to cause the computing device to perform operations for simulating a lithography process. The operations comprise obtaining a mask pattern representation and a modeling condition, and generating an output mask pattern representation for the modeling condition, wherein generating the output mask pattern representation comprises using a mask simulation model that is universal to different modeling conditions.

[0007] Other advantages of the present disclosure will become apparent from the following description taken in conjunction with the accompanying drawings wherein are set forth, by way of illustration and example, certain embodiments of the present disclosure.

BRIEF DESCRIPTION OF FIGURES

[0008] The above and other aspects of the present disclosure will become more apparent from the description of exemplary embodiments, taken in conjunction with the accompanying drawings.

[0009] FIG. 1 is a schematic illustrating an example lithographic apparatus, consistent with embodiments of the present disclosure.

[0010] FIG. 2 is a flowchart of an example method for simulating lithography in a lithographic apparatus, consistent with embodiments of the present disclosure.

[0011] FIG. 3 is a flowchart of an example method for source or mask optimization of a patterning process, consistent with embodiments of the present disclosure.

[0012] FIG. 4 is an example illustration of using a universal mask model to determine a mask image, consistent with some embodiments of the present disclosure.

[0013] FIG. 5 is an example schematic of a lithographic apparatus, consistent with some embodiments of the present disclosure.

[0014] FIG. 6 is an example workflow illustrating a calibration method, consistent with embodiments of the present disclosure.

[0015] FIG. 7 is an example illustration of using a universal mask model to determine a mask image, consistent with some embodiments of the present disclosure.

[0016] FIG. 8 is a schematic diagram illustrating an example artificial intelligence architecture, consistent with some embodiments of the present disclosure.

[0017] FIG. 9 is an illustration of an example universal mask model comprising a machine learning formulation, consistent with some embodiments of the present disclosure.

[0018] FIG. 10 is an example illustration of using a universal mask model to determine a mask image, consistent with embodiments of the present disclosure.

[0019] FIG. 11 is an example schematic of an offline server or system, consistent with embodiments of the present disclosure.

[0020] FIG. 12 is an example workflow illustrating a method, consistent with embodiments of the present disclosure.

DETAILED DESCRIPTION

[0021] Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the drawings. The following description refers to the accompanying drawings in which the same numbers in different drawings represent the same or similar elements unless otherwise represented. The implementations set forth in the following description of exemplary embodiments do not represent all implementations consistent with the invention. Instead, they are merely examples of apparatuses, systems, and methods consistent with aspects related to subject matter that may be recited in the appended claims.

[0022] The enhanced computing power of electronic devices, while reducing the physical size of the devices, can be accomplished by significantly increasing the packing density of circuit components such as transistors, capacitors, diodes, etc. on an IC chip. Semiconductor IC manufacturing is a complex and time-consuming process, with hundreds of individual steps. Errors in even one step have the potential to dramatically affect the functioning of the final product. Even one “killer defect” can cause device failure. The goal of the manufacturing process is to improve the overall yield of the process.

[0023] ICs may be manufactured using lithography, which is a fabrication process involving creating complex circuit patterns drawn on a mask deposited onto a substrate. The lithography process involves creating a master image on a mask or reticle (mask and reticle are used interchangeably herein), then projecting an image from the mask onto a resist-covered substrate in order to create a pattern that matches the design intent of defining functional elements, such as transistor gates, contacts, etc., on the device wafer. The more times a master pattern is successfully replicated within the design specifications, the lower the cost per finished device or “chip” will be. Until recently, the mask pattern has been an almost exact duplicate of the desired pattern at the wafer level, with the exception that the mask level pattern may be several times larger than the wafer level pattern, due to an imaging reduction ratio of the exposure tool. The mask pattern is typically formed by depositing and patterning a light-absorbing material on quartz or another transparent substrate. The mask is then placed in an exposure tool known as a “stepper” or “scanner” where light of a specific exposure wavelength is directed through the mask onto the wafers. The light is transmitted through clear areas of the mask, but is attenuated by a desired amount, typically between 90 and 100%, in the areas covered by the absorbing layer. The light that passes through some regions of the mask may also be phase shifted by a desired phase angle, typically an integer multiple of 180 degrees. After being collected by the projection optics of the exposure tool, the resulting aerial image pattern is then focused onto the wafers. 
A light-sensitive material (photoresist or resist) deposited on the wafer surface interacts with the light to form the desired pattern on the wafer, and the pattern is then transferred into the underlying layers on the wafer to form functional electrical circuits according to well-known processes.

[0024] Feature sizes being patterned have become significantly smaller than the wavelength of light used to transfer the pattern. This trend towards “subwavelength lithography” has resulted in increasing difficulty in maintaining adequate process margins in the lithography process. The aerial images created by the mask and exposure tool lose contrast and sharpness as the ratio of feature size to wavelength decreases. This ratio is quantified by the k1 factor, defined as the numerical aperture of the exposure tool times the minimum feature size divided by the wavelength. There is limited practical flexibility in choosing the exposure wavelength, while the numerical aperture of exposure tools is approaching physical limits. Consequently, the continuous reduction in device feature sizes requires more and more aggressive reduction of the k1 factor in lithographic processes, i.e., imaging at or below the classical resolution limits of an optical imaging system.
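The definition of the k1 factor given above can be illustrated with a short calculation. The following is a minimal sketch only; the function name and the numerical values are hypothetical and are not taken from the disclosure.

```python
def k1_factor(numerical_aperture: float, feature_size_nm: float, wavelength_nm: float) -> float:
    """Return k1 = NA * minimum feature size / wavelength (dimensionless)."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return numerical_aperture * feature_size_nm / wavelength_nm


# Hypothetical values chosen only for illustration.
k1 = k1_factor(numerical_aperture=1.35, feature_size_nm=40.0, wavelength_nm=193.0)
print(f"k1 = {k1:.3f}")
```

With the wavelength and numerical aperture held fixed, a smaller feature size gives a proportionally smaller k1, which is the trend the paragraph describes.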

[0025] Methods to enable low-k1 lithography have used master patterns on the mask that are no longer exact copies of the final wafer level pattern. The mask pattern is often adjusted in terms of the size and placement of features as a function of pattern density or pitch. Other techniques involve the addition or subtraction of extra corners on the mask features (“serifs,” “hammerheads,” and other patterns) and the addition of other geometries that are not intended to be printed on the wafer at all. These non-printing “assist features,” the sole purpose of which is to enhance the printability of the “main features,” may include scattering bars, holes, rings, checkerboards or “zebra stripes” to change the background light intensity (“gray scaling”), and other structures that are well documented in the literature. All of these methods are often referred to collectively as “Optical Proximity Correction” or OPC. With decreasing k1, the magnitude of proximity effects increases dramatically. In current high-end designs, more and more device layers require OPC, and almost every feature edge requires some amount of adjustment in order to ensure that the printed pattern will reasonably resemble the design intent. The implementation and verification of such extensive OPC application is only made possible by detailed full-chip computational lithography process modeling, and the process is generally referred to as model-based OPC. (See “Full-Chip Lithography Simulation and Design Analysis: How OPC Is Changing IC Design,” C. Spence, Proc. SPIE, Vol. 5751, pp. 1-14 (2005) and “Exploring New High Speed, Mask Aware RET Verification Flows,” P. Martin et al., Proc. SPIE 5853, pp. 114-123, (2005)).

[0026] The mask may also be altered by the addition of phase-shifting regions which may or may not be replicated on the wafer. A large variety of phase-shifting techniques has been described at length in the literature, including alternating aperture shifters, double exposure masking processes, multiple phase transitions, and attenuating phase shifting masks. Masks formed by these methods are known as “Phase-Shifting Masks,” or PSMs. All of these techniques to increase the normalized image log slope (NILS) at low k1, including OPC, PSM and others, are referred to collectively as “Resolution Enhancement Technologies,” or RET. The result of all of these RETs, which are often applied to the mask in various combinations, is that the final pattern formed at the wafer level is no longer a simple replica of the mask level pattern. In fact, it is becoming impossible to simply look at the mask pattern and determine what the final wafer pattern is supposed to look like. This greatly increases the difficulty in verifying that the design data is correct before the mask is made and wafers exposed, as well as verifying that the RETs have been applied correctly and the mask meets its target specifications.

[0027] The cost of manufacturing advanced mask sets is steadily increasing. Currently, the cost has already exceeded one million dollars per mask set for an advanced device. In addition, the turn-around time is always a critical concern. As a result, computer simulations of the lithography process, which assist in reducing both the cost and turn-around time, have become an integral part of semiconductor manufacturing. A lithography simulation process typically consists of several functional steps. First, a design layout that describes the shapes and sizes of patterns that correspond to functional elements of a semiconductor device, such as diffusion layers, metal traces, contacts, and gates of field-effect transistors, is created. These patterns represent the “design intent” of physical shapes and sizes that need be reproduced on a wafer by the lithography process in order to achieve certain electrical functionality and specifications of the final device. Numerous modifications to this design layout may be required to create the patterns on the mask or reticle used to print the desired structures. A variety of RET methods are applied to the design layout in order to approximate the design intent in the actually printed patterns. The resulting “post-RET” mask layout differs significantly from the “pre-RET” design layout. Both the pre- and post-RET layouts may be provided to the simulation system in a polygon-based hierarchical data file in, e.g., the GDS or the OASIS format.

[0028] The actual mask will further differ from the geometrical, idealized, and polygon-based mask layout because of fundamental physical limitations as well as imperfections of the mask manufacturing process. These limitations and imperfections include, e.g., corner rounding due to finite spatial resolution of the mask writing tool, possible line-width biases or offsets, and proximity effects similar to the effects experienced in projection onto the wafer substrate. The true physical properties of the mask may be approximated in a mask model to various degrees of complexity as described in U.S. Pat. No. 7,587,704. Mask-type specific properties, such as attenuation, phase-shifting design, etc., need be captured by the mask model. The lithography simulation system described in U.S. Pat. No. 7,003,758 may, e.g., utilize an image / pixel-based grayscale representation to describe the actual mask properties.

[0029] An important input to any lithography simulation system is the model for the interaction between the illuminating electric field and the mask. The thin-mask approximation is widely used in most lithography simulation systems. The thin-mask approximation, also called the Kirchhoff boundary condition, assumes that the thickness of the structures on the mask is very small compared with the wavelength and that the widths of the structures on the mask are very large compared with the wavelength. Therefore, the thin-mask approximation assumes the electro-magnetic field after mask is the multiplication of the incident field with the mask transmission function. That is, the mask transmits light in an ideal way, different regions on the mask transmit the electric field with the ideal transmittance and phase, and the transition region between different types of structures is a step function. The advantages of the thin-mask model are simple, fast, and reasonably accurate calculations for feature sizes much larger than the source wavelength.
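The multiplication described by the thin-mask approximation can be sketched on a one-dimensional cut through a mask. This is an illustrative sketch only; the function name, the 6% absorber intensity transmission, and the 180 degree phase shift are assumed example values, not values from the disclosure.

```python
import cmath


def thin_mask_near_field(incident_field, transmission):
    """Thin-mask (Kirchhoff) near field: element-wise product of the incident
    field and the complex mask transmission function."""
    if len(incident_field) != len(transmission):
        raise ValueError("field and transmission must have the same length")
    return [e * t for e, t in zip(incident_field, transmission)]


# Hypothetical absorber: 6% intensity transmission with a 180 degree phase shift.
absorber = (0.06 ** 0.5) * cmath.exp(1j * cmath.pi)

# Transmission is a step function between clear (1.0) and absorber regions.
transmission = [1.0, 1.0, absorber, absorber, 1.0, 1.0]
incident = [1.0 + 0j] * len(transmission)  # unit-amplitude incident field

near_field = thin_mask_near_field(incident, transmission)
for value in near_field:
    print(f"amplitude={abs(value):.3f}, phase={cmath.phase(value):.3f} rad")
```

Because the transition between regions is an ideal step, this sketch contains none of the edge scattering or polarization effects that a thick-mask (M3D) model has to capture.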

[0030] A central part of lithography simulation is the optical model, which simulates the projection and image forming process in the exposure tool. The optical model incorporates parameters of the illumination and projection system: numerical aperture and partial coherence settings, illumination wavelength, illuminator source shape, and possibly imperfections of the system such as aberrations or flare. The projection system and various optical effects, e.g., high-NA diffraction, scalar or vector, polarization, and thin-film multiple reflection, may be modeled by transmission cross coefficients (TCCs). The TCCs may be decomposed into convolution kernels, using an eigen-series expansion. For computation speed, the series is usually truncated based on the ranking of eigen-values, resulting in a finite set of kernels. The more kernels are kept, the less error is introduced by the truncation. The lithography simulation system described in U.S. Pat. No. 7,003,758 allows for optical simulations using a very large number of convolution kernels without negative impact on computation time and therefore enables highly accurate optical modeling. (See also “Optimized Hardware and Software for Fast, Full Chip Simulation,” Y. Cao et al., Proc. SPIE Vol. 5754, 407 (2005)).
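The trade-off between the number of retained kernels and the truncation error can be illustrated with a toy calculation. The sketch below assumes non-negative eigenvalues and uses the discarded fraction of the eigenvalue sum as a simple error proxy; both the proxy and the eigenvalues are hypothetical, and no actual TCC is computed.

```python
def truncation_error(eigenvalues, kept):
    """Fraction of the eigenvalue sum discarded when only the `kept` largest
    eigenvalues (one per convolution kernel) are retained."""
    ranked = sorted(eigenvalues, reverse=True)  # rank by eigenvalue
    total = sum(ranked)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    return sum(ranked[kept:]) / total


# Hypothetical eigenvalue spectrum, for illustration only.
eigs = [0.52, 0.21, 0.11, 0.07, 0.04, 0.03, 0.02]
errors = [truncation_error(eigs, k) for k in range(1, len(eigs) + 1)]
for k, err in enumerate(errors, start=1):
    print(f"{k} kernel(s) kept: discarded fraction = {err:.3f}")
```

The discarded fraction never increases as more kernels are kept, which mirrors the statement that keeping more kernels introduces less truncation error.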

[0031] Further, to predict shapes and sizes of structures formed on a substrate, a resist model is used to simulate the effect of projected light interacting with the photosensitive resist layer and the subsequent post-exposure bake (PEB) and development process. A distinction can be made between first-principle simulation approaches that attempt to predict three-dimensional resist structures by evaluating the three-dimensional light distribution in resist, as well as microscopic, physical, or chemical effects such as molecular diffusion and reaction within that layer. On the other hand, all “fast” simulation approaches that may allow full-chip simulation currently restrict themselves to more empirical resist models that employ as an input a two-dimensional aerial image provided by the optical model part of the simulation system. This separation between the optical model and the resist model being coupled by an aerial image is schematically indicated in FIG. 1. For simplicity, optional modeling of further processes, e.g., etch, ion implantation, or similar steps, is omitted.

[0032] Finally, the output of the simulation process will provide information on the predicted shapes and sizes of printed features on the wafer, such as predicted critical dimensions (CDs) and contours. Such predictions allow a quantitative evaluation of the lithographic printing process and on whether the process will produce the intended results.

[0033] As lithography processes entered below the 65 nm node, 4x reticles for leading-edge chip designs have minimum feature sizes smaller than the wavelength of light used in advanced exposure tools. The thin-mask approximation, however, can be inaccurate at sub-wavelength dimensions where topographic effects (also called thick-mask effects) arising from the vector nature of light become noticeable. These effects include polarization dependence due to the different boundary conditions for the electric and magnetic fields, transmission and phase error in small openings, edge diffraction (or scattering) effects or electromagnetic coupling. (See “Limitation of the Kirchhoff boundary conditions for aerial image simulation in 157 nm optical lithography,” M. S. Yeung and E. Barouch, IEEE Electron Device Letters, Vol. 21, No. 9, pp. 433-435, (2000) and “Mask topography effects in projection printing of phase-shifting masks,” A. K. Wong and A. R. Neureuther, IEEE Trans. on Electron Devices, Vol. 41, No. 6, pp. 895-902, (1994)). Consequently, resource-consuming rigorous 3D electromagnetic field simulation has become necessary in aerial image formation of a thick mask, e.g., a PSM mask. However, software that implements such rigorous 3D electromagnetic field simulation often runs extremely slowly and hence is limited to extremely small areas of a chip design layout (on the order of a few square microns) and is not viable for full-chip lithography modeling. Some efforts have been made to address mask 3D effects recently for full-chip lithography modeling. Two major approaches in the literature are the domain decomposition method (DDM) and the boundary layer model (BLM). (See “Simplified Models for Edge Transitions in Rigorous Mask Modeling,” K. Adam, A. R. Neureuther, Proc. of SPIE, Vol. 4346, pp. 331-344, (2001) and “Boundary Layer Model to Account for Thick Mask Effects in Photolithography,” J. Tirapu-Azpiroz, P. Burchard, and E. Yablonovitch, Optical Microlithography XVI, Anthony Yen, Ed., Proc. of SPIE, Vol. 5040, pp. 1611-1619, (2003)).

[0034] In existing mask 3D (M3D) modeling approaches, a separate M3D model is constructed for each sampled condition of the illumination source (e.g., incident angle, source point, slit position, mask focus depth, corner rounding, etc.), and each separate M3D model has a unique formula. The mask near-field image (e.g., an M3D image) may be generated by superimposing and interpolating the images predicted by the individual M3D models. The M3D image may play an important role in source optimization, and because the illumination source may contain many different sampling conditions (e.g., hundreds of different incident angles), fast simulation of the M3D image under varied parameters is desired. The existing method for simulating an M3D image under different sampling conditions is to first sample the real illumination source with a few sampling points and then build an M3D image prediction model independently for each sampling point. An M3D image is then calculated for any other sampling point by interpolation.
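The existing per-condition workflow can be sketched as follows. The sketch assumes that each sampled incident angle already has a predicted image (here a flat list of pixel values) and that linear interpolation between the two nearest samples is used; the disclosure does not specify the interpolation scheme, and all names and values are hypothetical.

```python
from bisect import bisect_right


def interpolate_image(sampled_angles, sampled_images, angle):
    """Pixel-wise linear interpolation between the images predicted at the two
    sampled angles that bracket `angle` (angles must be sorted ascending)."""
    if len(sampled_angles) < 2:
        raise ValueError("at least two sampled conditions are required")
    if not sampled_angles[0] <= angle <= sampled_angles[-1]:
        raise ValueError("angle lies outside the sampled range")
    hi = min(bisect_right(sampled_angles, angle), len(sampled_angles) - 1)
    lo = hi - 1
    weight = (angle - sampled_angles[lo]) / (sampled_angles[hi] - sampled_angles[lo])
    return [(1 - weight) * a + weight * b
            for a, b in zip(sampled_images[lo], sampled_images[hi])]


# One independently built prediction per sampled incident angle (hypothetical).
sampled_angles_deg = [0.0, 5.0, 10.0]
sampled_images = [[1.0, 0.0], [0.8, 0.2], [0.4, 0.6]]

print(interpolate_image(sampled_angles_deg, sampled_images, 2.5))
```

Each entry of `sampled_images` stands for the output of its own model, so the number of models grows with the number of sampled conditions. This is the cost that a single universal model is intended to avoid.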

[0035] Embodiments of the present disclosure provide a method for simulating an M3D image under varied sampling conditions by using a universal model. The disclosed method uses a simulation model that comprises a first formulation or a second formulation, and where the simulation model can predict an M3D image for all varied sampling conditions. In some embodiments, the simulation model is configured to calculate a base condition result and a correction caused by a condition shift. In some embodiments, the first formulation is configured to generate a linear component of the M3D image corresponding to a linear effect resulting from a photolithography process. In some embodiments, the second formulation is a machine learning model configured to predict a higher order component of the M3D image corresponding to a non-linear effect resulting from a lithography process such as, but not limited to, optical proximity effects, diffraction effects, polarization effects, or variations in light distribution. In some embodiments, the machine learning model may be configured to generate a nonlinear component of the M3D image by appending constant channels of a condition (e.g., scalar values for a source point coordinate vector) in underlying, hidden layers of the neural network. The appended channels may enable the machine learning model to account for the variations of higher order effects resulting from varied sampling conditions. The disclosed method may significantly reduce M3D modeling runtime and enable simulation of an M3D image by using a single, universal simulation model.
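The idea of appending constant condition channels to hidden-layer feature maps can be sketched with plain nested lists. This is a structural illustration only: no neural network is trained or evaluated, the pixel-wise summation used to combine the linear and nonlinear components is an assumption, and all names and values are hypothetical.

```python
def append_condition_channels(feature_maps, condition):
    """Append one constant H x W channel per condition scalar.

    feature_maps: list of channels, each a list of H rows of W floats.
    condition: sequence of scalars, e.g. a source point coordinate (sx, sy).
    """
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    constant_channels = [
        [[float(value)] * width for _ in range(height)] for value in condition
    ]
    return list(feature_maps) + constant_channels


def combine_components(linear_image, nonlinear_image):
    """Pixel-wise sum of the two components (summation is an assumption)."""
    return [
        [a + b for a, b in zip(row_lin, row_non)]
        for row_lin, row_non in zip(linear_image, nonlinear_image)
    ]


# Hypothetical hidden-layer activations: 2 channels of size 2 x 3.
hidden = [
    [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
    [[0.0, 0.0, 0.1], [0.1, 0.0, 0.0]],
]
source_point = (0.3, -0.1)  # hypothetical source point coordinate (sx, sy)
conditioned = append_condition_channels(hidden, source_point)
print(len(conditioned))  # 2 original channels + 2 constant condition channels

# Hypothetical 1 x 2 linear and nonlinear components combined into one image.
m3d_image = combine_components([[1.0, 0.5]], [[0.02, -0.01]])
print(m3d_image)
```

Because the condition enters as extra input channels, the same set of weights can in principle serve every sampled condition, which is what allows one model to replace the per-condition models.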

[0036] Objects and advantages of the disclosure can be realized by the elements and combinations as set forth in embodiments described herein. However, embodiments of the present disclosure are not necessarily required to achieve such exemplary objects or advantages. Some embodiments can achieve a different feature or enhancement without necessarily achieving any expressly stated object or advantage.

[0037] As used herein, unless specifically stated otherwise, the term “or” encompasses all possible combinations, except where infeasible. For example, if it is stated that a component can comprise A or B, then, unless specifically stated otherwise or infeasible, the component can comprise A, or B, or A and B. As a second example, if it is stated that a component can comprise A, B, or C, then, unless specifically stated otherwise or infeasible, the component can comprise A, or B, or C, or A and B, or A and C, or B and C, or A and B and C.

[0038] Relative dimensions of components in drawings may be exaggerated for clarity. Within the following description of drawings, the same or like reference numbers refer to the same or like components or entities, and only the differences with respect to the individual embodiments are described.

[0039] The term “patterning device” may be considered synonymous with similar terms of art, such as “reticle” or “mask.” The term “patterning device” used herein should be broadly interpreted as referring to any device that can be used to impart a pattern on a cross section of a radiation beam. The radiation beam then can recreate the pattern in a target portion of a substrate.

[0040] The term “projection system” used herein should be broadly interpreted as encompassing any type of projection system, including refractive, reflective, catadioptric, magnetic, electromagnetic, or electrostatic optical systems, or any combination thereof, as appropriate for the exposure radiation being used, or for other factors such as the use of an immersion liquid or the use of a vacuum. Any use of the term “projection lens” herein may be considered as synonymous with the more general term “projection system.”

[0041] Illumination can be understood to be a form of radiation. The terms “radiation” and “illumination” can be used herein interchangeably. Embodiments described in the context of illumination are also applicable in the context of radiation in general. Furthermore, the terms “radiation” and “beam” can encompass all types of electromagnetic radiation, including ultraviolet radiation (e.g., with a wavelength of 365, 248, 193, 157 or 126 nm) and EUV (extreme ultra-violet radiation, e.g., having a wavelength in the range 5-20 nm). The term “source” and “illumination source” as used herein may include illumination optics.

[0042] The term “optimizing” and “optimization” as used herein can indicate adjusting a lithographic apparatus such that results or processes of lithography have more desirable characteristics, such as higher accuracy of projection of design layouts on a substrate, larger process windows, etc.

[0043] FIG. 1 illustrates an exemplary lithographic apparatus 100, consistent with embodiments of the present disclosure. In some embodiments, lithographic apparatus 100 comprises a radiation source 102, which can be a deep-ultraviolet excimer laser source or other type of source including an extreme ultraviolet (EUV) source (the lithographic apparatus itself need not have the radiation source); illumination optics which define the partial coherence (denoted as sigma) and which can include optic components 104, 106a, and 106b that shape radiation from source 102; a patterning device 108; and transmission optics 106c that project an image of the patterning device pattern onto a substrate plane 109. An adjustable filter or aperture 107 disposed among the optics can restrict the range of beam angles that impinge on the substrate plane 109. A largest possible angle Θmax can define the numerical aperture NA of the projection optics as NA = n sin(Θmax), where n is the index of refraction of the medium in which the final lens element is working (e.g., a lens closest to the substrate).
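The numerical aperture relation stated above, NA = n sin(Θmax), can be evaluated directly. The following is a minimal sketch; the function name and the example values are hypothetical.

```python
import math


def numerical_aperture(refractive_index: float, theta_max_deg: float) -> float:
    """Return NA = n * sin(theta_max), with theta_max given in degrees."""
    if not 0.0 <= theta_max_deg <= 90.0:
        raise ValueError("theta_max must lie between 0 and 90 degrees")
    return refractive_index * math.sin(math.radians(theta_max_deg))


# Hypothetical example: n = 1.0 and a maximum half-angle of 30 degrees.
na = numerical_aperture(refractive_index=1.0, theta_max_deg=30.0)
print(f"NA = {na:.3f}")
```

For a fixed maximum angle, a medium with a larger n gives a proportionally larger NA.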

[0044] In an optimization process of a lithographic projection system, a figure of merit of the system can be represented as a cost function. The optimization process can determine a set of parameters (design variables) of the system that minimizes the cost function. The cost function can have any suitable form depending on the goal of the optimization. For example, the cost function can be a weighted root mean square (RMS) of deviations of certain characteristics (evaluation points) of the system with respect to the intended values (e.g., ideal values) of these characteristics. The cost function can be the maximum of these deviations (e.g., worst deviation). The term “evaluation points” herein should be interpreted broadly to include any characteristics of the system. The design variables of the system can be confined to finite ranges or be interdependent due to practicalities of implementations of the system. In case of a lithographic apparatus, the constraints are often associated with physical properties and characteristics of the hardware such as tunable ranges or patterning device manufacturability design rules, and the evaluation points can include physical points on a resist image on a substrate, as well as non-physical characteristics such as dose and focus of the illumination used.
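The two example cost functions mentioned above, a weighted RMS of deviations and the worst deviation, can be sketched as follows. Normalizing by the sum of the weights is an assumption, and the evaluation point values, targets, and weights are hypothetical.

```python
import math


def weighted_rms_cost(values, targets, weights):
    """Weighted root mean square of deviations from the intended values."""
    weight_sum = sum(weights)
    if weight_sum <= 0:
        raise ValueError("weights must have a positive sum")
    squared = sum(w * (v - t) ** 2 for v, t, w in zip(values, targets, weights))
    return math.sqrt(squared / weight_sum)


def worst_case_cost(values, targets):
    """Maximum absolute deviation from the intended values."""
    return max(abs(v - t) for v, t in zip(values, targets))


# Hypothetical evaluation points with their intended values and weights.
measured = [10.2, 9.9, 10.4]
intended = [10.0, 10.0, 10.0]
weights = [1.0, 2.0, 1.0]

print(f"weighted RMS cost = {weighted_rms_cost(measured, intended, weights):.4f}")
print(f"worst-case cost   = {worst_case_cost(measured, intended):.4f}")
```

An optimizer would adjust the design variables, within their constraints, to reduce whichever of these cost values is chosen as the figure of merit.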

[0045] In a lithographic apparatus, a source can provide illumination (e.g., light). Projection optics can direct and shape the illumination via a patterning device and onto a substrate. The term “projection optics” as used herein should be broadly interpreted as encompassing various types of optical systems, including refractive optics, reflective optics, apertures and catadioptric optics, for example. The term “projection optics” may also include components operating according to any of these design types for directing, shaping or controlling the projection beam of radiation, collectively or singularly. The term “projection optics” may include any optical component in the lithographic projection apparatus, no matter where the optical component is located on an optical path of the lithographic projection apparatus. Projection optics may include optical components for shaping, adjusting or projecting radiation from the source before the radiation passes the patterning device, or optical components for shaping, adjusting or projecting the radiation after the radiation passes the patterning device. The term “projection optics” is broadly defined to include any optical component that can alter the wavefront of the radiation beam. For example, projection optics can include at least some of components 104, 106a,106b, and 106c. An aerial image is the radiation intensity distribution at substrate level. A resist layer on the substrate is exposed and the aerial image is transferred to the resist layer as a latent “resist image” therein. The resist image can be defined as a spatial distribution of solubility of the resist in the resist layer. A resist model can be used to calculate the resist image from the aerial image. An example of a resist model can be found in U.S. Patent No. 8,200,468, the contents of which are incorporated herein by reference in their entirety. 
The resist model is related to properties of the resist layer (e.g., effects of chemical processes which occur during exposure, post-exposure bake (PEB), and development). Optical properties of the lithographic apparatus (e.g., properties of the source, the patterning device, and the projection optics) dictate the aerial image. Since the patterning device used in the lithographic apparatus can be changed, it is desirable to separate the optical properties of the patterning device from the optical properties of the rest of the lithographic apparatus including at least the source and the projection optics.

[0046] FIG. 2 illustrates a flowchart of an exemplary method 200 for simulating lithography in a lithographic apparatus, consistent with embodiments of the present disclosure. In some embodiments, a source model 202 represents optical characteristics of the source (e.g., including radiation intensity distribution, phase distribution, or the like). A projection optics model 204 can represent optical characteristics of the projection optics (e.g., including changes to radiation intensity / phase distribution caused by the projection optics). A design layout model 206 can represent optical characteristics of a design layout (e.g., including changes to radiation intensity / phase distribution caused by a given design layout), which is the representation of an arrangement of features on, or formed by, a patterning device. An aerial image 208 can be simulated from source model 202, projection optics model 204, and design layout model 206. A resist image 212 can be simulated from aerial image 208 using a resist model 210. Simulation of lithography can, for example, predict lithographic pattern transfer results, which can include feature contours, edge placement errors (EPE), critical dimensions (CDs), or the like, in the resist image.

[0047] It is noted that the source model 202 can represent optical characteristics of the source that include, but are not limited to, NA-sigma (σ) settings as well as any particular illumination source shape (e.g., off-axis radiation sources such as annular, quadrupole, and dipole, etc.). Projection optics model 204 can represent the optical characteristics of the projection optics that include, but are not limited to, aberration, distortion, refractive indexes, physical sizes, physical dimensions, or the like. Design layout model 206 can represent physical properties of a physical patterning device. An example of a design layout model can be found in U.S. Patent No. 7,587,704, the contents of which are incorporated herein by reference in their entirety. A goal of the simulation is to accurately predict feature contours, edge placement errors (EPE), critical dimensions (CDs), or the like, which can then be compared against an intended design for a device (e.g., a simulation to determine whether a mass fabrication of a new CPU architecture is feasible). The intended design is generally defined as a pre-optical proximity correction (OPC) design layout (OPC is sometimes also referred to as “optical and process correction”), which can be provided in a standardized digital file format. The layout file can be in a Graphic Database System (GDS) format, Graphic Database System II (GDS II) format, an Open Artwork System Interchange Standard (OASIS) format, a Caltech Intermediate Format (CIF), or the like. The intended design layout can include patterns or structures for transferring onto a wafer. The patterns or structures can be mask patterns used to transfer features from photolithography masks or reticles to a wafer. In some embodiments, a layout in GDS or OASIS format, among others, can include feature information stored in a binary file format representing planar geometric shapes, text, and other information related to the wafer design.

[0048] From the design layout, one or more portions can be identified, which are referred to as “clips.” In some embodiments, a set of clips is extracted, which represents the complicated patterns in the design layout (typically about 50 to 1000 clips, although any number of clips can be used). It is to be appreciated that these patterns or clips represent small portions (e.g., circuits, cells, or patterns) of the design and especially the clips represent small portions for which particular attention or verification is desirable. In other words, clips can be the portions of the design layout or can be similar or have a similar behavior of portions of the design layout where critical features are identified either by experience (including clips provided by a customer), by trial and error, or by running a full-chip simulation. Clips can contain one or more test patterns or gauge patterns.

[0049] An initial larger set of clips can be provided a priori by a customer based on known critical feature areas in a design layout that could benefit from image optimization. Alternatively, in some embodiments, the initial larger set of clips is extracted from the entire design layout by using some kind of automated algorithm (e.g., machine vision) or manual algorithm that identifies the critical feature areas.

[0050] In some embodiments, an optimization process (e.g., source mask optimization (SMO)) relates to one or more of a patterning process that employs process models (e.g., an optics model, a mask model, a resist model, etc. of FIG. 2). The optimization process can involve execution of the one or more process models and computing a cost function that can be reduced by modifying one or more characteristics (e.g., source, mask pattern, etc.) of the patterning process. In some embodiments, the one or more characteristics is described by design variables. Hence, an optimized characteristic can also be referred to as an optimized design variable, where a design variable is optimized based on a cost function.

[0051] In some embodiments, modifying the one or more characteristics is based on a gradient of the cost function that guides how the characteristic should be modified to reduce the cost function. A cost function can be a function of a certain continuous metric such as an edge placement error (e.g., a difference between contours of printed pattern and a target pattern). Using a continuous metric or a cost function of a continuous nature allows use of gradient-based optimizing algorithms that have acceptable runtime performance of an optimization process.
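
A minimal sketch of gradient-based reduction of a continuous cost function is given below. The linear edge placement error model, its coefficients, the learning rate, and all function names are hypothetical choices made only so that the example runs; a real process model would replace `edge_placement_error`.

```python
from typing import Callable


def edge_placement_error(bias_nm: float) -> float:
    """Hypothetical EPE (nm) as a linear function of a mask edge bias."""
    return 0.8 * bias_nm + 2.0


def cost(bias_nm: float) -> float:
    """Continuous cost: squared edge placement error."""
    return edge_placement_error(bias_nm) ** 2


def numeric_gradient(f: Callable[[float], float], x: float,
                     h: float = 1e-6) -> float:
    """Central-difference estimate of df/dx."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def gradient_descent(f: Callable[[float], float], x0: float,
                     learning_rate: float = 0.1, steps: int = 200) -> float:
    """Repeatedly move the design variable against the cost gradient."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * numeric_gradient(f, x)
    return x


optimized_bias = gradient_descent(cost, x0=0.0)
```

In this toy model the cost vanishes at a bias of -2.5 nm, which is the value the iteration approaches. A discontinuous metric would not provide a usable gradient for such an update.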

[0052] Details of example techniques and models used to transform a patterning device pattern into various lithographic images (e.g., an aerial image, a resist image, an etch image, etc.), apply OPC (e.g., using models), and evaluate performance (e.g., in terms of process window) can be found in U.S. Patent Nos. 7,695,876; 7,707,538; 7,747,978; 7,882,480; 8,413,081; 8,438,508; and 9,360,766, the contents of which are incorporated herein by reference in their entirety.

[0053] FIG. 3 illustrates a flowchart of an exemplary method 300 of source or mask optimization of a patterning process, consistent with embodiments of the present disclosure. In a typical high-end design, almost every feature edge can benefit from some modification to achieve printed patterns that come sufficiently close to the target design. These modifications can include shifting or biasing of edge positions or line widths as well as application of “assist” features that are not intended to print themselves, but can affect the properties of an associated primary feature. Furthermore, optimization techniques applied to the source of illumination can have different effects on different edges and features. Optimization of illumination sources can include the use of pupils to restrict source illumination to a selected pattern of light. Embodiments of the present disclosure provide optimization methods that can be applied to both source and mask configurations.

[0054] A method of performing source and mask optimization (SMO) can allow full chip pattern coverage while lowering the computation cost by intelligently selecting a small set of critical design patterns from the full set of clips to be used in SMO. SMO can be performed on these selected patterns to obtain an optimized source. The optimized source can then be used to optimize the mask (e.g., using OPC and a lithography manufacturability check (LMC)) for the full chip, and the results can be compared. Various methods are provided for iteratively converging on an optimal result. Method 300 is an example SMO method.

[0055] A target design 301 (e.g., comprising a layout in a standard digital format such as OASIS, GDSII, etc.) for which a lithographic process is to be optimized can include memory, test patterns, and logic. From this design, a full set of clips 302 can be extracted, which represents complex patterns in design 301 (e.g., about 50 to 1000 clips). It is to be appreciated that these clips represent small portions (i.e., circuits, cells, or patterns) of the design for which particular attention or verification is of interest. At operation 304, a small subset of clips 306 (e.g., 15 to 50 clips) can be selected from full set of clips 302. As will be explained in more detail below, the selection of clips can be performed such that the process window of the selected patterns matches the process window for the full set of critical patterns as closely as possible. The effectiveness of the selection can be measured by the reduction in total run time (pattern selection and SMO).

[0056] At operation 308, SMO can be performed with the selected patterns (15 to 50 patterns) of subset of clips 306. In particular, an illumination source can be optimized for the selected patterns of subset of clips 306. Examples of other source optimization methods can be found in, for example, U.S. Patent Application Publication No. 2004/0265707, the contents of which are incorporated herein by reference in their entirety.

[0057] At operation 310, manufacturability verification of the selected patterns of subset of clips 306 can be performed with the source obtained in operation 308. In particular, verification can include performing an aerial image simulation of the selected patterns of subset of clips 306 and the optimized source and verifying that the patterns will print across a sufficiently wide process window. An example verification process can be found in U.S. Patent No. 7,342,646, the contents of which are incorporated herein by reference in their entirety. If the verification at operation 310 is satisfactory, as determined in operation 312, then processing can advance to full chip optimization (e.g., to operations using optimized source 314). Otherwise, processing can return to operation 308, where SMO is performed again but with a different source or set of patterns. For example, the process performance as estimated by the verification tool can be compared against thresholds for certain process window parameters such as exposure latitude and depth of focus. These thresholds can be predetermined or set by a user.

[0058] After the selected patterns meet lithography performance specification as determined in step 312, the optimized source 314 can be used for optimization of the full set of clips 316 (e.g., originating from full set of clips 302).

[0059] At operation 318, model-based sub-resolution assist feature placement (MB-SRAF) and optical proximity correction (OPC) for all the patterns in the full set of clips 316 can be performed. Examples of MB-SRAF and OPC can be found in U.S. Patent Nos. 5,663,893; 5,821,014; 6,541,167; and 6,670,081, the contents of which are incorporated herein by reference in their entirety.

[0060] At operation 320, using processes similar to step 310, full pattern simulation based manufacturability verification can be performed with the optimized source 314 and the full set of clips 316 as corrected in step 318.

[0061] At operation 322, the performance (e.g., process window parameters such as exposure latitude and depth of focus) of the full set of clips 316 can be compared against that of subset of clips 306. For example, the pattern selection can be considered complete, or the source fully qualified for the full chip, when similar (<10% difference) lithography performance is obtained for both the selected patterns of subset of clips 306 and the critical patterns of full set of clips 316.

[0062] Otherwise, at operation 324, hotspots can be extracted. At operation 326, the hotspots can be added to subset of clips 306 and the process starts over. For example, hotspots (e.g., features among the full set of clips 316 that limit process window performance) identified during verification step 320 can be used for further source tuning or to run SMO of operation 308 again. The source can be considered fully converged when the process window of the full set of clips 316 is the same between the last run and the run before the last run of operation 322.
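
The outer loop of method 300 (optimize a source on a subset of clips, check the full set of clips, add any hotspots to the subset, and repeat) can be sketched as below. This is only a control-flow illustration: `optimize_source` and `find_hotspots` are hypothetical callables standing in for operations 308 and 318 to 324, and the inner verification loop of operations 310 and 312 is omitted for brevity.

```python
from typing import Callable, Hashable, Iterable, Set, Tuple


def run_smo_flow(
    full_clips: Iterable[Hashable],
    initial_subset: Iterable[Hashable],
    optimize_source: Callable[[frozenset], object],
    find_hotspots: Callable[[object, frozenset], Iterable[Hashable]],
    max_rounds: int = 10,
) -> Tuple[object, Set[Hashable], int]:
    """Iterate SMO until full-chip verification reports no new hotspots."""
    all_clips = frozenset(full_clips)
    subset = set(initial_subset)                      # result of operation 304
    for round_index in range(1, max_rounds + 1):
        source = optimize_source(frozenset(subset))   # operation 308
        # Operations 318 to 324: correct and verify the full set of clips,
        # then extract hotspots that limit process window performance.
        hotspots = set(find_hotspots(source, all_clips)) - subset
        if not hotspots:
            return source, subset, round_index        # source is qualified
        subset |= hotspots                            # operation 326
    raise RuntimeError("SMO flow did not converge within max_rounds")
```

The returned round count gives a simple handle on the run-time cost of a given initial clip selection.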

[0063] OPC calibration can be performed by modeling or simulation. For example, for the desired yield, the total number of features, and their respective probabilities of failure, simulation can be performed to optimize OPC for a lowest yielding feature. OPC addresses the fact that, in addition to any demagnification by the lithographic projection apparatus, the final size and placement of an image of the patterning device pattern projected on the substrate will not be identical to, or simply depend only on the size and placement of, the corresponding patterning device pattern features on the patterning device.

[0064] In some embodiments, the measurement data (e.g., stochastic variations) related to the printed pattern can be employed in optimizing the patterning process or adjusting parameters of the patterning process. For small feature sizes and high feature densities present on some design layouts, the position of a particular edge of a given feature can be influenced to a certain extent by the presence or absence of other adjacent features. These proximity effects arise from minute amounts of radiation coupled from one feature to another or non-geometrical optical effects such as diffraction and interference. Similarly, proximity effects can arise from diffusion and other chemical effects during post-exposure bake (PEB), resist development, and etching that generally follow lithography.

[0065] To ensure that the projected image of the patterning device pattern is in accordance with tolerances of a given target design, proximity effects should be predicted and compensated for using sophisticated numerical models, corrections, or pre-distortions of the patterning device pattern. The article “Full-Chip Lithography Simulation and Design Analysis — How OPC Is Changing IC Design,” C. Spence, Proc. SPIE, Vol. 5751, pp 1-14 (2005) provides an overview of “model-based” optical proximity correction processes, the contents of which are incorporated herein by reference in their entirety. In a typical high-end design, almost every feature of the patterning device pattern has some modification to achieve high fidelity of the projected image to the target design. These OPC modifications can include shifting or biasing of edge positions or line widths or application of “assist” features that are intended to assist projection of other features.

[0066] Application of model-based OPC to a target design can involve good process models and considerable computational resources, given the many millions of features typically present in a device design. However, applying OPC is generally an empirical, iterative process that does not always compensate for all possible proximity effects. Therefore, the effect of OPC, e.g., patterning device patterns after application of OPC and any other resolution enhancement technique (RET), should be verified by design inspection, e.g., intensive full-chip simulation using calibrated numerical process models, to reduce or minimize the possibility of design flaws being built into the patterning device pattern. This is driven by the enormous cost of making high-end patterning devices, as well as by the impact on turn-around time by reworking or repairing existing patterning devices once they have been manufactured. OPC and full-chip RET verification can be based on numerical modelling systems and methods. Examples of such methods can be found in U.S. Pat. No. 7,003,758 and an article titled “Optimized Hardware and Software For Fast, Full Chip Simulation”, by Y. Cao et al., Proc. SPIE, Vol. 5754, 405 (2005), the contents of which are incorporated herein by reference in their entirety.

[0067] The illumination source can also be optimized, either jointly with patterning device optimization or separately, to improve the overall lithography fidelity. The terms “illumination source” and “source” can be used interchangeably in this disclosure. Off-axis illumination (e.g., annular, quadrupole, dipole, or the like) can be used to resolve fine structures (e.g., target features) contained in the patterning device. However, when compared to a traditional illumination source, an off-axis illumination source usually provides less radiation intensity for the aerial image. Thus, it is desirable to optimize the illumination source to achieve balance between finer resolution (relevant to yield) and reduced radiation intensity (relevant to throughput).

[0068] In the manufacturing of integrated circuits, an illumination source is designed or optimized by computational lithography to achieve improved performance (e.g., improved image quality, larger process window, etc.).

[0069] Reference is now made to FIG. 4, which is an example illustration of determining a mask image (e.g., M3D image or near field image) using a universal mask model for all conditions, consistent with some embodiments of the present disclosure. In FIG. 4, a mask pattern representation 401 and a condition 402, which in some embodiments can be represented as a vector v, are provided to a universal mask model 403, which can predict a mask image 404 for varied conditions based on vector v. In some embodiments, mask pattern representation 401 may be layout properties of a mask feature such as a polygon area, edges, or corners. In some embodiments, mask pattern representation 401 may be a thin mask image. In some embodiments, mask pattern representation 401 may be a mask contour. In some embodiments, mask pattern representation 401 may be a mask contour discretized into a polygon. In some embodiments, mask pattern representation 401 may be a representation of line/space patterns with varying pitches (from isolated to dense) and of two-dimensional patterns such as line/space ends with varying gap sizes. The line/space patterns may span over a one-dimensional spatial frequency space while the line end patterns may cover two-dimensional effects, in particular line-end pull back, pinching, etc. Mask pattern representation 401 may be for any type of optical mask, for example a chrome-on-glass binary mask or an EPSM phase-shifting mask. In some embodiments, mask pattern representation 401 may be mask 3D topography data, including the thickness of films on the mask. The mask 3D topography data may be obtained from an individual mask error model and post-OPC layout data as described in U.S. Pat. No. 7,587,704, the subject matter of which is hereby incorporated by reference in its entirety. In some embodiments, mask pattern representation 401 may be metrology information of a mask. 
In some embodiments, condition 402 may be a vector v that represents a parameter of a lithography apparatus (e.g., lithography apparatus 100 in FIG. 1) that may change a resulting mask image. In some embodiments, condition 402 may represent an incident angle in an illumination source, a point source coordinate position in an illumination source, an illumination source position, a chief ray angle, a slit position, corner rounding of a mask feature, a mask focus depth, or any other lithographic parameter that may impact a resulting mask image.

[0070] Reference is now made to FIG. 5, which is a schematic of a lithographic apparatus 500, consistent with some embodiments of the present disclosure. FIG. 5 illustrates an illumination source 501 with a source point 502. Source point 502 may project a radiation beam 504 to projection optics 505 (e.g., 104, 106a, 106b, or 106c of FIG. 1), which may then project radiation beam 504 to mask 508. Mask 508 may have a multilayer component 506 and a plurality of features 507 on a surface of mask 508. Accordingly, mask 508 may have a three-dimensional structure. Radiation beam 504 may interact with mask 508 to form a mask near field image 509. In some embodiments, radiation beam 504 may interact with mask 508 by passing through mask 508 and diffracting into a diffracted radiation beam. In some embodiments, radiation beam 504 may interact with mask 508 by reflecting off mask 508 and diffracting into a diffracted radiation beam. Mask near field image 509 may also be referred to as a “mask 3D” (M3D) image. In embodiments of the present disclosure, mask 508 may be represented as a mask pattern representation (e.g., mask pattern representation 401 in FIG. 4) and provided to a universal mask model (e.g., universal mask model 403 in FIG. 4). In some embodiments, a condition for a universal mask model may be a coordinate position 510 of source point 502, where coordinate position 510 projects radiation beam 504. Coordinate position 510 may be a two-dimensional vector, which includes a p parameter and a q parameter of coordinate position 510 (e.g., [p,q]). It is appreciated that a condition in embodiments of the present disclosure is not so limited and may include any number of dimensions. For example, a condition may be an illumination source position. For example, FIG. 5 illustrates illumination source 501 centered about a chief ray angle 503. If illumination source 501 is shifted (e.g., left or right) about chief ray angle 503, then a resulting mask image from a point source of illumination source 501 may change. Accordingly, a condition (e.g., condition 402 in FIG. 4) may be represented as a three-dimensional vector including a p parameter, a q parameter, and an s parameter, where s represents an illumination source shift.

[0071] In some embodiments, universal mask model 403 may comprise a linear formulation, a machine learning formulation, or a combination thereof. Universal mask model 403 may calculate a baseline condition from condition 402 and then calculate a correction caused by a condition shift. In some embodiments, universal mask model 403 may incorporate variables associated with a condition (e.g., a source point coordinate position) into an analytical framework. In some embodiments, universal mask model 403 may be configured to predict and account for linear effects of a photolithography process. In some embodiments, the linear effects may correspond to distortions or deviations in a mask image feature resulting from, e.g., the illumination beam interacting with a mask. The deviations or distortions corresponding to linear effects may include, but are not limited to, feature rounding in a mask image, line width variation, standing waves, or degraded resolution. In some embodiments, universal mask model 403 may comprise filters to apply to input information (e.g., mask pattern representation 401) and to condition 402. In some embodiments, universal mask model 403 may comprise a linear formulation, which comprises a linear combination of a mask pattern representation (e.g., mask pattern representation 401) and a vector condition (e.g., condition 402) as represented by Equation 1:
$$\text{M3D model}(M \mid v) = k_a g_a + k_b g_b M + \cdots \qquad \text{(Eq. 1)}$$
where $k_i$ represents an ith vector component of vector v, M represents mask pattern representation 401, and $g_i$ represents a filter associated with the ith vector component. In some embodiments, because vector v may represent a plurality of conditions (e.g., each source coordinate position [p,q]), $k_i$ may represent a vector $[p_i, q_i]$. In some embodiments, $k_i$ may comprise any mathematical function, operator, or model representative of vector components. For example, k may be represented as, but is not limited to, a matrix, a polynomial equation, a Taylor series, a binomial series, a power series, a Newtonian series, or any other generalization of a polynomial.
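
To make the linear formulation concrete, the sketch below forms a weighted sum of filtered copies of a one-dimensional mask pattern, with the weights taken from the condition vector. It rests on stated assumptions: every filter is applied to the mask pattern by circular convolution (the operator is not explicit in Equation 1), the data are real-valued, and all names are hypothetical.

```python
from typing import List, Sequence


def circular_convolve(signal: Sequence[float],
                      kernel: Sequence[float]) -> List[float]:
    """Circular convolution: out[i] = sum_j kernel[j] * signal[i - j]."""
    n = len(signal)
    return [sum(kernel[j] * signal[(i - j) % n] for j in range(len(kernel)))
            for i in range(n)]


def linear_m3d(mask: Sequence[float],
               coefficients: Sequence[float],
               filters: Sequence[Sequence[float]]) -> List[float]:
    """Sum of k_i * (g_i convolved with M) over the condition components."""
    if len(coefficients) != len(filters):
        raise ValueError("one coefficient is required per filter")
    result = [0.0] * len(mask)
    for k_i, g_i in zip(coefficients, filters):
        filtered = circular_convolve(mask, g_i)
        result = [r + k_i * f for r, f in zip(result, filtered)]
    return result


# Assumed toy inputs: an identity filter and a one-pixel shift filter.
image = linear_m3d([0.0, 1.0, 1.0, 0.0], [1.0, 0.5], [[1.0], [0.0, 1.0]])
```

Because the model is linear in the coefficients, changing the condition only changes the weights, while the filters themselves can be reused.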

[0072] In some embodiments, universal mask model 403 may apply filters to mask pattern representation 401 and condition 402 by a Taylor series expansion, as represented by Equation 2: M3D model (M|v), where $p_0$, $q_0$ represent a baseline condition result, p represents a parameter p of condition 402, q represents a parameter q of condition 402, M represents mask pattern representation 401, $g_{xxx}$ represents a filter, and m and n are coefficient values for the p parameter and q parameter, respectively.

[0073] In some embodiments, universal mask model 403 may comprise applying condition 402 to a filter that is then applied to mask pattern representation 401. Accordingly, universal mask model 403 may comprise a universal filter basis to apply to mask pattern representation 401. In some embodiments, the universal filter basis may be represented by Equation 3, where k and v are as described above. The term $F_{m,n}(x, y)$ represents a filter and the terms x and y may represent a mask pattern representation (e.g., mask pattern representation 401).

[0074] In some embodiments, the universal filter basis may be represented by, for example, a Taylor series expansion, as represented by Equation 4, where $p_0$, $q_0$, p, q, m, n, $F_{m,n}(x, y)$, x, and y are as described above.

[0075] As a non-limiting example, the universal filter basis may be represented by Equation 5 for a vector with parameters [p,q], an m coefficient value of 1, and an n coefficient value of 2:
$$F(x, y \mid v) = F_{0,0}(x, y) + (p - p_0)^{1}(q - q_0)^{0}\,F_{1,0}(x, y) + \cdots + (p - p_0)^{1}(q - q_0)^{2}\,F_{1,2}(x, y) \qquad \text{(Eq. 5)}$$
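
A Taylor-type universal filter basis of the kind described in paragraphs [0074] and [0075] can be sketched as a sum of component filters weighted by powers of the condition shift. The sketch assumes that any factorial normalization is absorbed into the component filters and that the filters are small real-valued arrays; the function name and the dictionary layout are hypothetical.

```python
from typing import Dict, List, Tuple

Filter2D = List[List[float]]


def universal_filter(basis: Dict[Tuple[int, int], Filter2D],
                     p: float, q: float,
                     p0: float = 0.0, q0: float = 0.0) -> Filter2D:
    """Return the sum over (m, n) of (p - p0)**m * (q - q0)**n * F[m, n]."""
    first = next(iter(basis.values()))
    rows, cols = len(first), len(first[0])
    combined = [[0.0] * cols for _ in range(rows)]
    for (m, n), component in basis.items():
        weight = (p - p0) ** m * (q - q0) ** n
        for r in range(rows):
            for c in range(cols):
                combined[r][c] += weight * component[r][c]
    return combined


# Assumed toy basis with the terms F[0,0], F[1,0], and F[1,2].
toy_basis = {
    (0, 0): [[1.0, 2.0]],
    (1, 0): [[0.5, 0.0]],
    (1, 2): [[0.0, 4.0]],
}
filter_at_condition = universal_filter(toy_basis, p=0.2, q=0.5)
```

At the baseline condition (p = p0, q = q0) every higher-order weight vanishes and the result reduces to the baseline filter.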

[0076] In some embodiments, filters (e.g., $g_i$, $g_{xxx}$, or $F_{m,n}(x, y)$) may be constructed from a library of transmitted fields of representative mask features and incident angles, and pre-computed using a rigorous EMF solver for Maxwell’s equations to account for diffraction and scattering effects. Additionally, illumination shape, polarization, and mask defocus may be incorporated into the solver to account for mask defocus effects and edge-to-edge interaction effects.

[0077] Reference is now made to FIG. 6, which is an example flowchart of calibrating a M3D model, consistent with some embodiments of the present disclosure. In step 610, calibration test features are defined. The calibration test features can be imported from an existing design layout or can be specially generated for creating the 3D mask model. A test mask including the set of calibration test features is then manufactured. The calibration test features preferably cover a full range of different 3D mask topography profiles and different proximity interactions that are characteristic of the lithography process under consideration. A wide range of line / space patterns with varying pitches (from isolated to dense), and two-dimensional patterns such as line / space ends with varying gap sizes should be included. The line / space patterns span over a one-dimensional spatial frequency space while the line end patterns cover two-dimensional effects, in particular line-end pull back, pinching, etc. The test mask can be any type of optical mask, for example a chrome-on-glass binary mask or an EPSM phase-shifting mask. Each 3D mask model may be specific to a type of optical mask, although each 3D mask model is independent of the mask's layout.

[0078] In step 612, the test mask is inspected to obtain mask 3D topography data, including the thickness of films on the mask. A variety of metrology tools can be used to inspect the test mask. These metrology tools include, but are not limited to, conventional optical mask inspection tools, critical dimension scanning electron microscopes (CD-SEMs) or imaging SEMs, atomic force microscopes (AFMs) or scatterometry systems, or aerial image measurement system (AIMS) tools. The physical mask 3D topography data can also be obtained from an individual mask error model and post-OPC layout data as described in U.S. Pat. No. 7,587,704, the subject matter of which is hereby incorporated by reference in its entirety.

[0079] In step 614, the effect of light interacting with the test mask is rigorously simulated using the mask 3D topography data from the test mask and well-known equations describing the behavior of light (Maxwell's equations) to generate theoretical image data (or rigorously simulated data, or ground-truth data). In some embodiments, the mask 3D topography data are input into a rigorous 3D electromagnetic field (EMF) solver software program and rigorous simulations of the near-field complex field distribution are obtained. The EMF solver software can use any rigorous electromagnetic field algorithm, for example, a Finite-Difference Time-Domain (FDTD) algorithm or a Rigorous Coupled-Wave Analysis (RCWA) algorithm. The simulations typically assume that the light passing through the mask is a single plane wave. Different polarization conditions are applied to the rigorous simulations, for example, x-polarization and y-polarization or TE-polarization and TM-polarization. Any other polarization condition can be represented by a linear combination of x- and y-polarizations or TE and TM polarizations.
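
The statement that any other polarization condition is a linear combination of two base polarizations can be illustrated as follows. The sketch assumes that the solver has already produced complex near-field samples for x- and y-polarized illumination, and that the desired condition is linear polarization at a given angle from the x axis; the function name is hypothetical.

```python
import math
from typing import List, Sequence


def combine_polarizations(field_x: Sequence[complex],
                          field_y: Sequence[complex],
                          angle_deg: float) -> List[complex]:
    """Near field for linear polarization at angle_deg from the x axis.

    By linearity of Maxwell's equations, the response to the incident
    field a*x + b*y is a*field_x + b*field_y.
    """
    if len(field_x) != len(field_y):
        raise ValueError("both fields must be sampled on the same grid")
    a = math.cos(math.radians(angle_deg))
    b = math.sin(math.radians(angle_deg))
    return [a * ex + b * ey for ex, ey in zip(field_x, field_y)]


# Assumed toy samples of the two rigorously simulated fields.
diagonal = combine_polarizations([1 + 0j, 2j], [0j, 1 + 0j], 45.0)
```

Only two rigorous simulations per mask feature are therefore needed to cover all linear polarization angles in this sketch.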

[0080] In step 616, a form of a 3D mask model is selected. The form of the 3D mask model may be in the spatial domain or in the frequency domain. The selected form of the 3D mask model comprises a set of calibrated filters. In step 618, an initial filter for the 3D mask model is selected. In some embodiments, the filters may account for mask 3D edge scattering effects, corner scattering effects, feature-to-feature interactions, illumination source polarization, illumination source incident angle, or modify a thin-mask transmission function into a thick-mask transmission function according to a condition (e.g., condition 402 in FIG. 4).

[0081] In step 620, the mask near-field images are simulated by convolving the mask layout image, the corresponding condition (e.g., source point coordinate), and the derivative of the mask layout image with the filter to produce a filtered image. In step 622, a total difference between the filtered image and the theoretical image is calculated to calibrate the filter of the 3D mask model. In step 624, if the total difference between the filtered image and theoretical image is minimized or below a predetermined threshold, the method continues with step 626 (further explained below). If the total difference between the filtered image and the theoretical image is not minimized or below the predetermined threshold, the method continues with step 628.

[0082] In step 628, the filters are modified. The method then returns to step 620, and steps 620, 622, and 624, and if needed, step 628 are repeated until the total difference between the filtered image and the theoretical image is minimized or below the predetermined threshold. In step 626, the current filters are chosen as the final, calibrated filters for the 3D mask model.
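The calibration loop of steps 618 to 628 can be sketched as follows. This is a simplified one-dimensional illustration under stated assumptions, not the calibration procedure of the disclosure: the condition input is omitted, the ground-truth image is a synthetic stand-in rather than rigorous solver output, the filter update uses plain gradient descent on a sum-of-squares difference, and `conv_matrix`, `calibrate`, `layout`, and `true_filters` are hypothetical names.

```python
import numpy as np

def conv_matrix(signal, taps):
    """Matrix A such that A @ h equals np.convolve(signal, h, mode="same")
    for a filter h with an odd number of taps."""
    pad = taps // 2
    padded = np.pad(signal, pad)
    return np.array([padded[i:i + taps][::-1] for i in range(len(signal))])

def calibrate(design, theoretical, threshold=1e-8, max_iter=5000):
    h = np.zeros(design.shape[1])                  # step 618: initial filter
    step = 0.5 / np.linalg.norm(design, 2) ** 2    # conservative step size
    history = []
    for _ in range(max_iter):
        filtered = design @ h                      # step 620: filtered image
        residual = filtered - theoretical
        total_diff = float(residual @ residual)    # step 622: total difference
        history.append(total_diff)
        if total_diff < threshold:                 # step 624: stop criterion
            break
        h -= step * 2.0 * (design.T @ residual)    # step 628: modify filters
    return h, history                              # step 626: final filters

# Hypothetical 1D mask layout (1 = feature) and its derivative (edge signal).
layout = np.zeros(64)
layout[10:20] = layout[30:34] = layout[45:60] = 1.0
d_layout = np.gradient(layout)

taps = 5
design = np.hstack([conv_matrix(layout, taps), conv_matrix(d_layout, taps)])

# Synthetic stand-in for the rigorously simulated (ground-truth) image.
true_filters = np.array([0.05, 0.2, 0.5, 0.2, 0.05, 0.0, -0.3, 0.0, 0.3, 0.0])
theoretical = design @ true_filters

filters, history = calibrate(design, theoretical)
```

The loop stops either when the total difference falls below the threshold or when the iteration budget is exhausted; `history` records the total difference at each pass so that convergence can be inspected.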

[0083] Reference is now made to FIG. 7, which is an example illustration of determining a mask image (e.g., M3D image or near field mask image) using a universal mask model for all conditions, consistent with some embodiments of the present disclosure. In FIG. 7, mask pattern representation 701 (as described above) may be convolved with a universal filter basis 702 (e.g., Equations 1-5 above) such that a mask image 703 is determined. Mask pattern representation 701 may be as described above and may include separate mask layout parameters (e.g., parameters 704-707). As a non-limiting example, parameter 704 may be a polygon area, parameter 705 may be a horizontal edge, parameter 706 may be a vertical edge, and parameter 707 may be a corner. Each of parameters 704-707 may be convolved with universal filter basis 702 to generate mask image 703. Thus, separate filter bases for each of parameters 704-707 may be avoided and the computational time and cost required to determine mask image 703 may be reduced. In some embodiments, mask image 703 may represent a linear component of an M3D image and thus predict a linear effect resulting from a photolithography process.

[0084] In some embodiments, universal mask model 403 may be configured to predict and account for the variations of non-linear effects resulting from varied sampling conditions and generate a higher order component of an M3D image. In some embodiments, the non-linear effects may correspond to higher order distortions or deviations in a target mask image resulting from, e.g., the illumination beam interacting with a mask at larger incident angles. The non-linear effects may include, but are not limited to, variations in light intensity across a feature resulting in uneven feature sizes and pattern infidelity, swing curves, unequal light polarization, or optical proximity effects. In some embodiments, universal mask model 403 includes a machine learning formulation that comprises a neural network. Neural networks typically consist of multiple layers, and the signal path traverses from front to back. The goal of the neural network is to solve problems in the same way that the human brain would, although several neural networks are much more abstract. Modern neural network projects typically work with a few thousand to a few million neural units and millions of connections. The neural network may have any suitable architecture and/or configuration known in the art. A neural network, as used herein, may refer to a computing model for analyzing underlying relationships in a set of input data by way of mimicking human brains. Similar to a biological neural network, the neural network may include a set of connected units or nodes (referred to as “neurons”), structured as different layers, where each connection (also referred to as an “edge”) may obtain and send a signal between neurons of neighboring layers in a way similar to a synapse in a biological brain. The signal may be any type of data (e.g., a real number). Each neuron may obtain one or more signals as an input and output another signal by applying a non-linear function to the inputted signals. Neurons and edges may typically be weighted by corresponding weights to represent the knowledge the neural network has acquired.

[0085] Reference is now made to FIG. 8, which is a schematic diagram illustrating an example artificial intelligence architecture, consistent with some embodiments of the present disclosure. For example, FIG. 8 illustrates a neural network 800. As depicted in FIG. 8, neural network 800 may include an input layer 820 that receives inputs, including input 810-1, . . ., input 810-m (m being an integer). For example, an input of neural network 800 may include any structure or unstructured data (e.g., an image). In some embodiments, neural network 800 may obtain a plurality of inputs simultaneously. For example, in FIG. 8, neural network 800 may obtain m inputs simultaneously. In some embodiments, input layer 820 may obtain m inputs in succession such that input layer 820 receives input 810-1 in a first cycle (e.g., in a first inference) and pushes data from input 810-1 to a hidden layer (e.g., hidden layer 830-1), then receives a second input in a second cycle (e.g., in a second inference) and pushes data from the second input to the hidden layer, and so on. Input layer 820 may obtain any number of inputs in the simultaneous manner, the successive manner, or any manner of grouping the inputs.

[0086] Input layer 820 may include one or more nodes, including node 820-1, node 820-2, . . ., node 820-a (a being an integer). A node (also referred to as a “machine perceptron” or a “neuron”) may model the functioning of a biological neuron. Each node may apply an activation function to received inputs (e.g., one or more of input 810-1, . . ., input 810-m). An activation function may include a Heaviside step function, a Gaussian function, a multiquadratic function, an inverse multiquadratic function, a sigmoidal function, a rectified linear unit (ReLU) function (e.g., a ReLU5 function or a Leaky ReLU function), a hyperbolic tangent (“tanh”) function, or any non-linear function. The output of the activation function may be weighted by a weight associated with the node. A weight may include a positive value between 0 and 1, or any numerical value that may scale outputs of some nodes in a layer more or less than outputs of other nodes in the same layer.
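Several of the activation functions named above, and the weighting of a node's output, can be sketched as follows. This is an illustrative sketch only: the function names are introduced here, and summing the received inputs before applying the activation is an assumption, since the text does not specify how a node aggregates its inputs.

```python
import numpy as np

def heaviside_step(x):
    return np.where(x >= 0.0, 1.0, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, slope=0.01):
    return np.where(x > 0.0, x, slope * x)

def node_output(inputs, weight, activation):
    """Apply the activation to the summed inputs (assumed aggregation),
    then scale the result by the weight associated with the node."""
    return weight * activation(np.sum(inputs))

inputs = np.array([0.4, -1.0, 0.1])            # sums to about -0.5
out_relu = node_output(inputs, 0.8, relu)      # negative sum is clipped to zero
out_tanh = node_output(inputs, 0.8, np.tanh)   # negative sum passes through tanh
```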

[0087] As further depicted in FIG. 8, neural network 800 includes multiple hidden layers, including hidden layer 830-1, . . ., hidden layer 830-n (n being an integer). When neural network 800 includes more than one hidden layer, it may be referred to as an “artificial neural network” (ANN). An ANN encompasses a “deep neural network” (DNN) and a “recurrent neural network” (RNN). Each hidden layer may include one or more nodes. For example, in FIG. 8, hidden layer 830-1 includes node 830-1-1, node 830-1-2, node 830-1-3, . . ., node 830-1-b (b being an integer), and hidden layer 830-n includes node 830-n-1, node 830-n-2, node 830-n-3, . . ., node 830-n-c (c being an integer). Similar to nodes of input layer 820, nodes of the hidden layers may apply the same or different activation functions to outputs from connected nodes of a previous layer, and weight the outputs from the activation functions by weights associated with the nodes.

[0088] As further depicted in FIG. 8, neural network 800 may include an output layer 840 that finalizes outputs, including output 850-1, output 850-2, . . ., output 850-d (d being an integer). Output layer 840 may include one or more nodes, including node 840-1, node 840-2, . . ., node 840-d. Similar to nodes of input layer 820 and of the hidden layers, nodes of output layer 840 may apply activation functions to outputs from connected nodes of a previous layer and weight the outputs from the activation functions by weights associated with the nodes.
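The layered structure of input layer 820, hidden layers 830-1 to 830-n, and output layer 840 can be sketched as a fully connected forward pass. This is a minimal sketch using the common formulation in which weights sit on the connections between layers. The layer sizes, the random initialization, the tanh activation, and the linear output layer are illustrative choices, and `init_layers` and `forward` are hypothetical names.

```python
import numpy as np

def init_layers(sizes, seed=0):
    """One (weights, bias) pair per connection between consecutive layers."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(scale=0.5, size=(n_out, n_in)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(x, layers):
    """Propagate an input vector from the input layer to the output layer."""
    for weights, bias in layers[:-1]:
        x = np.tanh(weights @ x + bias)   # hidden layers: weighted sum, then activation
    weights, bias = layers[-1]
    return weights @ x + bias             # output layer kept linear in this sketch

# Three inputs, two hidden layers of five nodes each, two outputs.
layers = init_layers([3, 5, 5, 2])
outputs = forward(np.array([0.2, -0.1, 0.7]), layers)
```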

[0089] Although nodes of each hidden layer of neural network 800 are depicted in FIG. 8 to be connected to each node of its previous layer and next layer (referred to as “fully connected”), the layers of neural network 800 may use any connection scheme. For example, one or more layers (e.g., input layer 820, hidden layer 830-1, . . ., hidden layer 830-n, or output layer 840) of neural network 800 may be connected using a convolutional scheme, a sparsely connected scheme, or any connection scheme that uses fewer connections between one layer and a previous layer than the fully connected scheme as depicted in FIG. 8.

[0090] Moreover, although the inputs and outputs of the layers of neural network 800 are depicted as propagating in a forward direction (e.g., being fed from input layer 820 to output layer 840, referred to as a “feedforward network”) in FIG. 8, neural network 800 may additionally or alternatively use backpropagation (e.g., feeding data from output layer 840 towards input layer 820) for other purposes. In some embodiments, neural network 800 may use backpropagation to evaluate and minimize a loss function. Neural network 800 may evaluate and minimize a loss function to improve a weight or bias of neural network 800 to improve a feedforward network. In some embodiments, neural network 800 may minimize a loss function by applying a mean squared error loss function, a cross-entropy loss function, a mean absolute percentage error loss function, or a stability related loss function (e.g., a grid dependency loss or a trend loss). Accordingly, although neural network 800 is depicted similar to a deep neural network (DNN), neural network 800 may include a recurrent neural network (RNN) or any other neural network.

[0091] In some embodiments, neural network 800 may be or include deep neural networks (e.g., neural networks that have one or more intermediate or hidden layers between the input and output layers). As an example, neural network 800 may be based on a large collection of neural units (or artificial neurons). Neural network 800 may loosely mimic the manner in which a biological brain works (e.g., via large clusters of biological neurons connected by axons). Each neural unit of neural network 800 may be connected with many other neural units. Such connections can be enforcing or inhibitory in their effect on the activation state of connected neural units. In some embodiments, each individual neural unit may have a summation function that combines the values of all its inputs together. In some embodiments, each connection (or the neural unit itself) may have a threshold function such that a signal must surpass the threshold before it is allowed to propagate to other neural units. These neural network systems may be self-learning and trained, rather than explicitly programmed, and can perform significantly better in certain areas of problem solving, as compared to traditional computer programs. In some embodiments, neural network 800 may include multiple layers (e.g., where a signal path traverses from front layers to back layers). In some embodiments, back propagation techniques may be utilized by neural network 800, where forward stimulation is used to reset weights on the “front” neural units. In some embodiments, stimulation and inhibition for neural network 800 may be free flowing, with connections interacting in a more chaotic and complex fashion. In some embodiments, the intermediate layers of neural network 800 include one or more convolutional layers, one or more recurrent layers, or other layers.

[0092] Neural network 800 may be trained (e.g., parameters are determined) using a set of training information. The training information may include a set of training samples. Each sample may be a pair comprising an input object (typically a vector, which may be called a feature vector) and a desired output value (also called the supervisory signal). A training algorithm analyzes the training information and adjusts the behavior of neural network 800 by adjusting the parameters (e.g., weights of one or more layers) of the neural network based on the training information. For example, given a set of N training samples of the form {(x_1, y_1), (x_2, y_2), ..., (x_N, y_N)} such that x_i is the feature vector of the i-th example and y_i is its supervisory signal, a training algorithm seeks a neural network g: X -> Y, where X is the input space and Y is the output space. A feature vector is an n-dimensional vector of numerical features that represent some object (e.g., a simulated aerial image, a wafer design, a clip, etc.). The vector space associated with these vectors is often called the feature space. After training, neural network 800 may be used for making predictions using new samples.

[0093] In the context of determining a M3D image, the feature vector may include one or more characteristics (e.g., shape, arrangement, size, etc.) of the design layout comprised or formed by the patterning device, one or more characteristics (e.g., one or more physical properties such as a dimension, a refractive index, material composition, etc.) of the patterning device, and one or more characteristics (e.g., the wavelength) of the illumination used in the lithographic process. The supervisory signal may include one or more characteristics of the M3D (e.g., one or more parameters of the M3D mask transmission function).

[0094] A neural network, as used herein, may refer to a computing model for analyzing underlying relationships in a set of input data by way of mimicking human brains. Similar to a biological neural network, the neural network may include a set of connected units or nodes (referred to as “neurons”), structured as different layers, where each connection (also referred to as an “edge”) may obtain and send a signal between neurons of neighboring layers in a way similar to a synapse in a biological brain. The signal may be any type of data (e.g., a real number). Each neuron may obtain one or more signals as an input and output another signal by applying a non-linear function to the inputted signals. Neurons and edges may typically be weighted by corresponding weights to represent the knowledge the neural network has acquired. During a training process (similar to a learning process of a biological brain), the weights may be adjusted (e.g., by increasing or decreasing their values) to change the strengths of the signals between the neurons to improve the performance accuracy of the neural network. Neurons may apply a thresholding function (referred to as an “activation function”) to its output values of the nonlinear function such that a signal is outputted only when an aggregated value (e.g., a weighted sum) of the output values of the non-linear function exceeds a threshold determined by the thresholding function. Different layers of neurons may transform their input signals in different manners (e.g., by applying different non-linear functions or activation functions). The output of the last layer (referred to as an “output layer”) may output the analysis result of the neural network, such as, for example, a categorization of the set of input data (e.g., as in image recognition cases), a numerical result, or any type of output data for obtaining an analytical result from the input data.

[0095] During the training of a neural network, a loss function may be used to evaluate the output data. The loss function, as used herein, may map output data of a machine learning model (e.g., the neural network) onto a real number (referred to as a “loss”) that intuitively represents a loss or an error (e.g., representing a difference between the output data and target output data) associated with the output data. The training of the neural network may seek to maximize or minimize the loss function (e.g., by pushing the loss towards a local maximum or a local minimum in a loss curve). For example, one or more parameters of the neural network may be adjusted or updated purporting to maximize or minimize the loss function. After adjusting or updating the one or more parameters, the neural network may obtain new input data in a next iteration of its training. When the loss function is maximized or minimized, the training of the neural network may be terminated.
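Training on pairs of feature vectors and supervisory signals by minimizing a loss can be sketched with a small network and hand-written backpropagation. This is an illustrative sketch under stated assumptions: the data are synthetic, the loss is mean squared error, the update rule is plain gradient descent with a fixed learning rate, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic training samples (x_i, y_i): 64 two-dimensional feature vectors.
X = rng.uniform(-1.0, 1.0, size=(64, 2))
Y = np.sin(X[:, :1]) * X[:, 1:]               # supervisory signals, shape (64, 1)

# One hidden layer of 16 tanh units followed by a linear output.
W1 = rng.normal(scale=0.5, size=(2, 16)); b1 = np.zeros(16)
W2 = rng.normal(scale=0.5, size=(16, 1)); b2 = np.zeros(1)

def mse(pred, target):
    return float(np.mean((pred - target) ** 2))

learning_rate = 0.05
losses = []
for _ in range(500):
    # Forward pass.
    H = np.tanh(X @ W1 + b1)
    pred = H @ W2 + b2
    losses.append(mse(pred, Y))

    # Backward pass: gradients of the mean squared error.
    d_pred = 2.0 * (pred - Y) / len(X)
    dW2 = H.T @ d_pred
    db2 = d_pred.sum(axis=0)
    dZ = (d_pred @ W2.T) * (1.0 - H ** 2)     # derivative of tanh
    dW1 = X.T @ dZ
    db1 = dZ.sum(axis=0)

    # Parameter update that reduces the loss.
    W1 -= learning_rate * dW1; b1 -= learning_rate * db1
    W2 -= learning_rate * dW2; b2 -= learning_rate * db2
```

In practice training would stop on a convergence criterion rather than a fixed iteration count; the fixed count here keeps the sketch short.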

[0096] In some embodiments, universal mask model 403 may engage variables associated with a condition (e.g., a source point coordinate position) into a higher order framework (e.g., a neural network) to predict and account for non-linear effects resulting from a photolithography process. In some embodiments, universal mask model 403 may include a machine learning formulation that may be configured to generate a non-linear component of the M3D image by appending constant channels of a condition (e.g., scalar values for a source point coordinate vector) in underlying, hidden layers of the neural network. The appended channels may enable the machine learning model to account for the variations of higher order effects resulting from varied sampling conditions. In some embodiments, universal mask model 403 may append an additional layer comprising constant channels to a machine learning formulation. Reference is now made to FIG. 9, which is an illustration of an example universal mask model comprising a machine learning formulation, consistent with some embodiments of the present disclosure. Mask pattern representation 901 (as described above) and condition 902 (as described above) may be provided to a universal mask model 903 that comprises a machine learning formulation 903a. In some embodiments, universal mask model 903 may comprise an added layer 904 that comprises constant channels 905-908 to account for various conditions. Each of channels 905-908 may be determined by projecting a vector, v, comprising parameters of a condition (e.g., [p,q] as described above) into a constant scalar value. For example, channels 905-908 may be determined by a function represented by Equation 6, where p0, q0, p, q, m, and n are as described above.

[0097] For each of channels 905-908, the conditions may be different. For example, for channel 905, the m and n coefficient values may be 0 and for channel 906, the m coefficient may be 0 and the n coefficient may be 1. Accordingly, universal mask model 903 can build different values for each channel. It is appreciated that FIG. 9 serves as an illustrative example, that universal mask model 903 may include any number of layers, and is not limited to the illustration as seen in FIG. 9. Although not illustrated in FIG. 9, universal mask model 903 may output a mask image that represents a higher order component of a M3D image.
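Appending constant channels to a hidden-layer feature map can be sketched as follows. Equation 6 is not reproduced in this text, so the projection used here, (p - p0)^m (q - q0)^n, is only an assumed monomial form chosen to be consistent with the m and n coefficients described above. It should not be read as the equation of the disclosure. The names `condition_scalar` and `append_constant_channels` and the feature-map shape are likewise hypothetical.

```python
import numpy as np

def condition_scalar(p, q, m, n, p0=0.0, q0=0.0):
    """Assumed projection of condition [p, q] to one scalar (not Equation 6 itself)."""
    return (p - p0) ** m * (q - q0) ** n

def append_constant_channels(features, condition, exponents, baseline=(0.0, 0.0)):
    """Append one constant-valued channel per (m, n) pair to a (C, H, W) feature map."""
    p, q = condition
    p0, q0 = baseline
    _, height, width = features.shape
    channels = [np.full((1, height, width), condition_scalar(p, q, m, n, p0, q0))
                for m, n in exponents]
    return np.concatenate([features, *channels], axis=0)

# Hypothetical hidden-layer feature map with 3 channels on a 4 x 4 grid.
features = np.zeros((3, 4, 4))
exponents = [(0, 0), (0, 1), (1, 0), (1, 1)]   # e.g., channels 905-908
augmented = append_constant_channels(features, (0.3, -0.2), exponents)
```

Each appended channel holds a single value across the whole grid, so subsequent layers see the condition at every spatial location.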

[0098] Reference is now made to FIG. 10, which is an example illustration of determining a mask image (e.g., M3D image or near field image) comprising a linear and higher order component using a universal mask model for all conditions, consistent with some embodiments of the present disclosure. FIG. 10 illustrates that mask pattern representation 1001 and a condition 1002, which can be represented as a vector v in some embodiments, are provided to a universal mask model 1003. Mask pattern representation 1001 and condition 1002 are as described above. Universal mask model 1003 may comprise a first formulation 1003a and a second formulation 1003b, and mask pattern representation 1001 and condition 1002 are provided to both. In some embodiments, first formulation 1003a may be configured to generate a result 1004 that is representative of a linear component of an M3D image. Result 1004 may be an image, a contour, a pattern, a plot, an equation, a matrix, a vector, or a value. First formulation 1003a may comprise a linear formulation, which comprises a linear combination of mask pattern representation and a vector condition as described above (e.g., Equations 1-5). In some embodiments, second formulation 1003b may be configured to generate a result 1005 that is representative of a non-linear component (e.g., higher order component) of an M3D image. Result 1005 may be an image, a contour, a pattern, a plot, an equation, a matrix, a vector, or a value. Second formulation 1003b may comprise a neural network and may be configured to append constant channels of a condition (e.g., scalar values for a source point coordinate vector) in underlying, hidden layers of the neural network. In some embodiments, universal mask model 1003 may combine result 1004 and result 1005 to generate an M3D image 1006, which predicts both a linear component and a higher order component of an M3D image resulting from a photolithography process.
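The two-formulation structure of FIG. 10 can be sketched as follows. The text does not state how result 1004 and result 1005 are combined, so element-wise addition is an assumption here, and `toy_linear` and `toy_nonlinear` are hypothetical stand-ins for first formulation 1003a and second formulation 1003b rather than the models of the disclosure.

```python
import numpy as np

def universal_mask_model(layout, condition, linear_part, nonlinear_part):
    """Feed the same layout and condition to both formulations and combine them."""
    result_linear = linear_part(layout, condition)         # result 1004
    result_nonlinear = nonlinear_part(layout, condition)   # result 1005
    return result_linear + result_nonlinear                # assumed combination: sum

def toy_linear(layout, condition):
    # Stand-in for a linear formulation: scales the layout with the condition.
    p, q = condition
    return (1.0 + 0.1 * p + 0.1 * q) * layout

def toy_nonlinear(layout, condition):
    # Stand-in for a neural-network correction: small and non-linear in the condition.
    p, q = condition
    return 0.05 * np.tanh(p * q * layout)

layout = np.array([0.0, 1.0, 1.0, 0.0])
condition = (0.5, -0.5)
m3d_image = universal_mask_model(layout, condition, toy_linear, toy_nonlinear)
```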

[0099] In some embodiments, M3D image 1006 may be used in a SMO or OPC flow (e.g., used to generate an aerial image by an optical model) as described above in FIG. 3.

[0100] In some embodiments, an offline computing system (e.g., offline from high volume manufacturing) may be used to collect input data and calibrate or generate a universal mask model configured to determine a mask image (e.g., M3D image or near field image) for all modeling conditions, consistent with some embodiments of the present disclosure. Reference is now made to FIG. 11, which is a block diagram of an example computing system for determining a mask image (e.g., M3D image or near field image) for all modeling conditions, consistent with embodiments of the present disclosure. For example, system 1104 may be a preprocessor, an encoder, or a decoder. In some embodiments, system 1104 may comprise a memory 1106 storing a set of instructions and at least one processor 1105 configured to execute the set of instructions to cause system 1104 to perform operations comprising processing a mask pattern representation (e.g., metrology data), calibrating a simulation model using the mask pattern representation, and executing the simulation model on a mask pattern representation (e.g., a thin mask image) to predict an output mask pattern representation (e.g., an M3D image). As shown in FIG. 11, system 1104 can include processor 1105. When processor 1105 executes instructions described herein, system 1104 can become a specialized machine for preprocessing, encoding, or decoding image data. Processor 1105 can be any type of circuitry capable of manipulating or processing information. For example, processor 1105 can include any combination of any number of a central processing unit (or “CPU”), a graphics processing unit (or “GPU”), a neural processing unit (“NPU”), a microcontroller unit (“MCU”), an optical processor, a programmable logic controller, a microcontroller, a microprocessor, a digital signal processor, an intellectual property (IP) core, a Programmable Logic Array (PLA), a Programmable Array Logic (PAL), a Generic Array Logic (GAL), a Complex Programmable Logic Device (CPLD), a Field-Programmable Gate Array (FPGA), a System On Chip (SoC), an Application-Specific Integrated Circuit (ASIC), or the like. In some embodiments, processor 1105 can also be a set of processors grouped as a single logical component. For example, as shown in FIG. 11, processor 1105 can include multiple processors, including processor 1105a, processor 1105b, and processor 1105n.

[0101] System 1104 can also include memory 1106 configured to store data (e.g., a set of instructions, computer codes, intermediate data, or the like). For example, as shown in FIG. 11, the stored data can include program instructions (e.g., program instructions for building and calibrating a simulation model) and data for processing (e.g., metrology data). Processor 1105 can access the program instructions and data for processing (e.g., via bus 1107), and execute the program instructions to perform an operation or manipulation on the data for processing. Memory 1106 can include a high-speed random-access storage device or a non-volatile storage device. In some embodiments, memory 1106 can include any combination of any number of a random-access memory (RAM), a read-only memory (ROM), an optical disc, a magnetic disk, a hard drive, a solid-state drive, a flash drive, a secure digital (SD) card, a memory stick, a compact flash (CF) card, or the like. Memory 1106 can also be a group of memories (not shown in FIG. 11) grouped as a single logical component.

[0102] Bus 1107 can be a communication device that transfers data between components included in system 1104, such as an internal bus (e.g., a CPU-memory bus), an external bus (e.g., a universal serial bus port, a peripheral component interconnect express port), or the like.

[0103] For ease of explanation without causing ambiguity, processor 1105 and other data processing circuits are collectively referred to as a “data processing circuit” in this disclosure. The data processing circuit can be implemented entirely as hardware, or as a combination of software, hardware, or firmware. In addition, the data processing circuit can be a single independent module or can be combined entirely or partially into any other component of system 1104.

[0104] System 1104 can further include network interface 1108 to provide wired or wireless communication with a network (e.g., the Internet, an intranet, a local area network, a mobile communications network, or the like). In some embodiments, network interface 1108 can include any combination of any number of a network interface controller (NIC), a radio frequency (RF) module, a transponder, a transceiver, a modem, a router, a gateway, a wired network adapter, a wireless network adapter, a Bluetooth adapter, an infrared adapter, a near-field communication (“NFC”) adapter, a cellular network chip, or the like.

[0105] In some embodiments, optionally, system 1104 can further include peripheral interface 1109 to provide a connection to one or more peripheral devices. As shown in FIG. 11, the peripheral device can include, but is not limited to, a cursor control device (e.g., a mouse, a touchpad, or a touchscreen), a keyboard, a display (e.g., a cathode-ray tube display, a liquid crystal display, or a light-emitting diode display), an input device (e.g., a computing device), or the like.

[0106] Reference is now made to FIG. 12, which illustrates an example method 1200 for simulating a mask image, consistent with embodiments of the present disclosure.

[0107] In step 1201, mask pattern representation and a modeling condition are obtained. The mask pattern representation and modeling condition may be provided to a mask simulation model wherein the mask simulation model is universal to different modeling conditions. The mask pattern representation may be as described above and the modeling condition may be an incident angle in an illumination source, a point source coordinate position in an illumination source, an illumination source position, a chief ray angle, a slit position, a mask focus depth, corner rounding of a mask feature, or any other lithographic parameter that may impact a resulting mask image.

[0108] In step 1202, an output mask representation is generated for the modeling condition by using a mask simulation model that is universal to different modeling conditions. The mask simulation model calculates a result at a baseline modeling condition and a correction value by applying a series of basis functions. The basis functions may be a difference between the baseline condition and the modeling condition. The output image may be a mask image, a near field mask image, or an M3D image.
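The idea of a baseline result plus a correction driven by the difference between the modeling condition and the baseline condition can be sketched as a low-order expansion. This is a one-dimensional illustration under stated assumptions, not the model of the disclosure: the basis weights are taken to be powers of the condition offsets (p - p0) and (q - q0), each weight multiplies the layout convolved with its own filter, and the filter values and the name `simulate_mask_image` are hypothetical.

```python
import numpy as np

def simulate_mask_image(layout, filters, condition, baseline=(0.0, 0.0)):
    """Sum of filtered layouts, each weighted by (p - p0)^m * (q - q0)^n."""
    dp = condition[0] - baseline[0]
    dq = condition[1] - baseline[1]
    image = np.zeros(len(layout))
    for (m, n), kernel in filters.items():
        image += dp ** m * dq ** n * np.convolve(layout, kernel, mode="same")
    return image

# Hypothetical 1D layout with a single feature.
layout = np.zeros(32)
layout[8:16] = 1.0

# Hypothetical calibrated filters: (0, 0) gives the baseline result,
# (1, 0) and (0, 1) give first-order corrections in p and q.
filters = {
    (0, 0): np.array([0.25, 0.5, 0.25]),
    (1, 0): np.array([-0.1, 0.0, 0.1]),
    (0, 1): np.array([0.05, 0.0, 0.05]),
}

baseline_image = simulate_mask_image(layout, filters, (0.0, 0.0))
shifted_image = simulate_mask_image(layout, filters, (0.2, 0.0))
```

At the baseline condition only the (0, 0) term contributes, so the same set of filters serves every condition and only the scalar weights change.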

[0109] Benefits of the present disclosure include providing a method for simulating an M3D image under varied sampling conditions by using a single model universal to all sampling conditions. The disclosed method may significantly reduce M3D modeling runtime and enable simulation of an M3D image by using a single model previously unrecognized. In some embodiments, the present disclosure provides a method to improve simulation performance compared to current technologies.

[0110] A non-transitory computer-readable medium can be provided that stores instructions for a processor of a controller for simulating an M3D image according to the exemplary flowchart of FIG. 12, consistent with embodiments in the present disclosure. For example, the instructions stored in the non-transitory computer-readable medium can be executed by the circuitry of the controller for performing method 1200 in part or entirely. Common forms of non-transitory media include, for example, a floppy disk, a flexible disk, hard disk, solid-state drive, magnetic tape, or any other magnetic data storage medium, a Compact Disc Read-Only Memory (CD-ROM), any other optical data storage medium, any physical medium with patterns of holes, a Random Access Memory (RAM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), a FLASH-EPROM or any other flash memory, Non-Volatile Random Access Memory (NVRAM), a cache, a register, any other memory chip or cartridge, and networked versions of the same.

[0111] It will be appreciated that the embodiments of the present disclosure are not limited to the exact construction that has been described above and illustrated in the accompanying drawings and that various modifications and changes can be made without departing from the scope thereof.

[0112] Embodiments of the present disclosure can be further described by the following clauses.
1. A method for simulating a lithography process comprising: obtaining a mask pattern representation and a modeling condition; and generating an output mask pattern representation for the modeling condition, wherein generating the output mask pattern representation comprises using a mask simulation model that is universal to different modeling conditions.
2. The method of clause 1, wherein the mask simulation model comprises a series of basis functions configured to calculate a result at a baseline condition and a difference between the result at the baseline condition and the modeling condition to determine a correction value.
3. The method of clause 1 or 2, wherein the mask simulation model comprises a linear submodel.
4. The method of clause 2 or 3, wherein the baseline condition and the correction value are determined by using a Taylor expansion series.
5. The method of any one of clauses 2 to 4, wherein the baseline condition and the correction value are determined by using a model filter, wherein the model filter represents an electromagnetic scattering effect resulting from the mask pattern representation.
6. The method of any one of clauses 1 to 5, wherein the mask simulation model is configured to generate a linear component of the output mask pattern representation from the mask pattern representation and the modeling condition.
7. The method of any one of clauses 1 to 6, wherein the mask simulation model is calibrated prior to simulating the lithography process, wherein calibration of the mask simulation model comprises: an obtaining of a representation of a mask pattern; a determination of a theoretical mask image from the representation of the mask pattern; a selection of an initial filter for a basis function and the generation of a filtered mask image from the initial filter and the representation of the mask pattern; a determination of a calibrated filter from the calculation of a difference between the filtered mask image and the theoretical image; and a selection of the calibrated filter for a basis function.
8. The method of clause 7, wherein the representation of the mask pattern comprises topography data of the mask pattern.
9. The method of clause 8, wherein the topography data comprises metrology data.
10. The method of any one of clauses 7 to 9, wherein the determination of the theoretical mask image comprises a simulation of an effect of a radiation beam interacting with the mask pattern.
11. The method of any one of clauses 7 to 10, wherein the determination of the calibrated filter further comprises a minimization of a total difference between the filtered image and the theoretical image or the determination of a total difference below a threshold value.
12. The method of any one of clauses 7 to 11, wherein the mask simulation model is calibrated using an offline system or server.
13. The method of any one of clauses 1 to 12, wherein the mask simulation model further comprises a higher order submodel.
14. The method of clause 13, wherein the higher order submodel comprises a neural network framework with a constant channel appended in an underlying layer.
15. The method of clause 14, wherein the constant channel is appended by projecting a vector of the modeling condition into a scalar.
16. The method of any one of clauses 13 to 15, wherein the mask simulation model is configured to generate a higher order component of the output mask pattern representation from the mask pattern representation and the modeling condition.
17. The method of clause 16, wherein the mask simulation model is configured to generate the output mask pattern representation by combining the linear order component and the higher order component.
18. The method of any one of clauses 13 to 17, wherein the mask simulation model is calibrated prior to simulating the lithography process, wherein the calibration of the mask simulation model comprises: an obtaining of training data comprising a characteristic of a mask pattern as an input parameter and a target mask pattern representation; a generation of a predicted mask pattern representation from the training data by the mask simulation model; and the calibration of, based on the determination of a difference between the predicted mask pattern representation and the target mask pattern representation from a cost function, the mask simulation model configured to generate the higher order component of the output mask pattern representation.
19. The method of clause 18, wherein the calibration of the mask simulation model further comprises: an iterative modification of one or more parameters of the mask simulation model based on a gradient-based method such that the cost function is reduced.
20. The method of clause 19, wherein the cost function is minimized.
21. The method of any one of clauses 18 to 20, wherein the characteristic of the mask pattern comprises a thin mask image, a polygon, a polygon area, an edge, or a corner, and a modeling condition is represented as a vector.
22. The method of any one of clauses 18 to 21, wherein the target mask pattern representation comprises a higher order component mask pattern representation.
23. The method of clause 22, wherein the higher order component mask pattern representation comprises an M3D image.
24. The method of any one of clauses 18 to 23, wherein the mask simulation model is calibrated using an offline system or server.
25. The method of any one of clauses 1 to 24, wherein the modeling condition is represented as a vector in the mask simulation model.
26. The method of any one of clauses 1 to 25, wherein the mask pattern representation comprises a thin mask image, a polygon, a polygon area, an edge, or a corner.
27. The method of any one of clauses 1 to 26, wherein the modeling condition comprises an incident angle in an illumination source.
28. The method of any one of clauses 1 to 26, wherein the modeling condition comprises a point source coordinate position in an illumination source.
29. The method of any one of clauses 1 to 26, wherein the modeling condition comprises an illumination source position.
30. The method of any one of clauses 1 to 26, wherein the modeling condition comprises a chief ray angle.
31. The method of any one of clauses 1 to 26, wherein the modeling condition comprises a slit position.
32. The method of any one of clauses 1 to 26, wherein the modeling condition comprises a mask focus depth.
33. The method of any one of clauses 1 to 26, wherein the modeling condition comprises a corner rounding of a mask feature.
34. The method of any one of clauses 1 to 33, wherein the output mask pattern representation comprises a mask image, a near field mask image, or an M3D image.
35. The method of any one of clauses 1 to 34, wherein the output mask pattern representation is used in a source-mask-optimization (SMO) or optical proximity correction (OPC) flow.
36.
A non-transitory computer readable medium comprising a set of instructions that is executable by one or more processors of a computing device to cause the computing device to perform operations for simulating a lithography process, the operations comprising the method of any one of clauses 1-35.
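
As an illustration only, the linear submodel and filter calibration described in the clauses above might be sketched as follows. This is a minimal one-dimensional sketch, not the disclosed implementation. It assumes a scalar modeling condition (for example an incident angle), a first-order Taylor expansion about the baseline condition, circular convolution as the filtering operation, and plain gradient descent on a sum-of-squares difference for calibration. The names `convolve_circular`, `linear_submodel`, and `calibrate_filter`, and all numeric values, are hypothetical.

```python
def convolve_circular(signal, kernel):
    """Circular 1D convolution; the kernel may be shorter than the signal."""
    n = len(signal)
    return [
        sum(kernel[k] * signal[(i - k) % n] for k in range(len(kernel)))
        for i in range(n)
    ]


def linear_submodel(thin_mask, baseline_filter, slope_filter,
                    condition, baseline_condition):
    """First-order Taylor sketch (assumed form):
    result at the baseline condition plus a correction proportional to the
    difference between the modeling condition and the baseline condition."""
    base = convolve_circular(thin_mask, baseline_filter)
    slope = convolve_circular(thin_mask, slope_filter)
    delta = condition - baseline_condition
    return [b + delta * s for b, s in zip(base, slope)]


def calibrate_filter(thin_mask, theoretical_image, initial_filter,
                     steps=200, rate=0.1):
    """Adjust filter taps so the filtered image approaches the theoretical
    image. Gradient descent on 0.5 * sum(residual**2) is an assumption; the
    step count and rate suit this toy input and are not general defaults."""
    taps = list(initial_filter)
    n = len(thin_mask)
    for _ in range(steps):
        filtered = convolve_circular(thin_mask, taps)
        residual = [f - t for f, t in zip(filtered, theoretical_image)]
        # Gradient for tap k: sum_i residual[i] * thin_mask[(i - k) % n]
        gradient = [
            sum(residual[i] * thin_mask[(i - k) % n] for i in range(n))
            for k in range(len(taps))
        ]
        taps = [t - rate * g for t, g in zip(taps, gradient)]
    return taps


# Toy usage: a synthetic "theoretical" image stands in for a rigorous
# electromagnetic simulation, which this sketch does not perform.
mask = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
theoretical = convolve_circular(mask, [0.8, 0.3])
calibrated = calibrate_filter(mask, theoretical, initial_filter=[1.0, 0.0])
image = linear_submodel(mask, calibrated, [0.0, 0.5],
                        condition=5.0, baseline_condition=3.0)
```

In this sketch a single pair of filters serves any value of the modeling condition, which is one way to read "universal to different modeling conditions"; a model with several basis functions or a vector-valued condition would need one slope filter per condition component.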

Claims

1. A non-transitory computer readable medium comprising a set of instructions that is executable by one or more processors of a computing device to cause the computing device to perform operations for simulating a lithography process, the operations comprising a method comprising: obtaining a mask pattern representation and a modeling condition; and generating an output mask pattern representation for the modeling condition, wherein generating the output mask pattern representation comprises using a mask simulation model that is universal to different modeling conditions.

2. The medium of claim 1, wherein the mask simulation model comprises a series of basis functions configured to calculate a result at a baseline condition and a difference between the result at the baseline condition and the modeling condition to determine a correction value, wherein the baseline condition and the correction value are determined by using a model filter, wherein the model filter represents an electromagnetic scattering effect resulting from the mask pattern representation.

3. The medium of claim 1, wherein the mask simulation model comprises a linear submodel.

4. The medium of claim 1, wherein the mask simulation model is configured to generate a linear component of the output mask pattern representation from the mask pattern representation and the modeling condition.

5. The medium of claim 1, wherein the mask simulation model is calibrated prior to simulating the lithography process, wherein calibration of the mask simulation model comprises: an obtaining of a representation of a mask pattern; a determination of a theoretical mask image from the representation of the mask pattern, wherein the determination of the theoretical mask image comprises a simulation of an effect of a radiation beam interacting with the mask pattern; a selection of an initial filter for a basis function and the generation of a filtered mask image from the initial filter and the representation of the mask pattern; a determination of a calibrated filter from the calculation of a difference between the filtered mask image and the theoretical image; and a selection of the calibrated filter for a basis function.

6. The medium of claim 5, wherein the representation of the mask pattern comprises topography data of the mask pattern.

7. The medium of claim 5, wherein the determination of the calibrated filter further comprises a minimization of a total difference between the filtered image and the theoretical image or the determination of a total difference below a threshold value.

8. The medium of claim 1, wherein the mask simulation model further comprises a higher order submodel configured to generate a higher order component of the output mask pattern representation from the mask pattern representation and the modeling condition.

9. The medium of claim 8, wherein the higher order submodel comprises a neural network framework with a constant channel appended in an underlying layer.

10. The medium of claim 9, wherein the constant channel is appended by projecting a vector of the modeling condition into a scalar.

11. The medium of claim 8, wherein the mask simulation model is configured to generate the output mask pattern representation by combining the linear order component and the higher order component.

12. The medium of claim 9, wherein the mask simulation model is calibrated prior to simulating the lithography process, wherein the calibration of the mask simulation model comprises: an obtaining of training data comprising a characteristic of a mask pattern as an input parameter and a target mask pattern representation; a generation of a predicted mask pattern representation from the training data by the mask simulation model; and the calibration of, based on the determination of a difference between the predicted mask pattern representation and the target mask pattern representation from a cost function, the mask simulation model configured to generate the higher order component of the output mask pattern representation, wherein the target mask pattern representation comprises a higher order component mask pattern representation.

13. The medium of claim 12, wherein the characteristic of the mask pattern comprises a thin mask image, a polygon, a polygon area, an edge, or a corner, and a modeling condition is represented as a vector, and wherein the higher order component mask pattern representation comprises an M3D image.

14. The medium of claim 1, wherein the mask pattern representation comprises a thin mask image, a polygon, a polygon area, an edge, or a corner; wherein the output mask pattern representation comprises a mask image, a near field mask image, or an M3D image; and wherein the output mask pattern representation is used in a source-mask-optimization (SMO) or optical proximity correction (OPC) flow.

15. The medium of claim 1, wherein the modeling condition comprises one or more of: an incident angle in an illumination source; a point source coordinate position in an illumination source; an illumination source position; a chief ray angle; a slit position; a mask focus depth; and a corner rounding of a mask feature.
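
For illustration only, the constant channel of claims 9 and 10 and the combination recited in claim 11 might be sketched as below. This is a hedged sketch and not the claimed implementation. It assumes feature maps stored as nested lists indexed as channel, row, column; a projection that is a simple dot product with hypothetical learned weights; and element-wise addition as the way the two components are combined. The function names are invented for this example.

```python
def project_condition(condition, weights):
    """Project a modeling-condition vector into one scalar.
    A dot product with (hypothetical) learned weights is assumed here."""
    return sum(c * w for c, w in zip(condition, weights))


def append_constant_channel(feature_maps, condition, weights):
    """Return the feature maps with one extra channel appended.
    The new channel has the same height and width as the existing ones and
    holds the projected scalar at every pixel."""
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    scalar = project_condition(condition, weights)
    constant = [[scalar] * width for _ in range(height)]
    return feature_maps + [constant]  # new list; the input is not mutated


def combine_components(linear_component, higher_order_component):
    """Combine two images of equal shape by element-wise addition
    (an assumption; the claims only say the components are combined)."""
    return [
        [a + b for a, b in zip(row_lin, row_high)]
        for row_lin, row_high in zip(linear_component, higher_order_component)
    ]


# Toy usage: one 2x2 feature channel and a two-element modeling condition,
# e.g. a hypothetical (x, y) source-point coordinate.
features = [[[1.0, 2.0], [3.0, 4.0]]]
with_condition = append_constant_channel(features, [0.5, 0.25], [2.0, 4.0])
output_image = combine_components([[1.0, 2.0]], [[0.5, -0.5]])
```

Appending the condition as a constant channel lets the subsequent layers of a convolutional network see the same condition value at every spatial location, so one set of network weights can be shared across modeling conditions.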