Computational lithography simulation using a multi-channel physics-informed neural network for a 3D mask

A multi-channel PINN approach for simulating lithography processes addresses the computational inefficiencies of thick mask simulations, improving yield and throughput by enabling faster and accurate near-field image generation.

WO2026139198A1 · PCT designated stage · Publication Date: 2026-07-02 · ASML NETHERLANDS BV

Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: ASML NETHERLANDS BV
Filing Date: 2025-12-04
Publication Date: 2026-07-02

Application Information

IPC: G03F1/22; G03F7/20; G06F30/27

AI Tagging

Technology Topics

Lithography process Algorithm

Abstract

A non-transitory computer-readable medium stores a set of instructions that is executable by at least one processor of an apparatus to cause the apparatus to perform operations for simulating a lithography process. The operations include obtaining a physics-informed neural network (PINN) comprising multiple channels. A first channel of the multiple channels is associated with a first set of physical characteristics of an interaction between an electromagnetic (EM) field and a thick mask used in the lithography process. A second channel of the multiple channels is associated with a second set of physical characteristics of the interaction that is different from the first set of physical characteristics. The operations also include executing the PINN using the first channel to generate a near-field image representation of the EM field for the thick mask with the second channel disabled.

Description

COMPUTATIONAL LITHOGRAPHY SIMULATION USING A MULTI-CHANNEL PHYSICS-INFORMED NEURAL NETWORK FOR A 3D MASK

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims priority of US application 63/738,389, which was filed on 23 December 2024 and which is incorporated herein in its entirety by reference.

FIELD

[0002] The description herein relates to lithographic apparatuses and processes. More particularly, the description herein relates to computational lithography and simulations for thick mask modeling.

BACKGROUND

[0003] A lithographic apparatus is a machine that applies a desired pattern onto a substrate by illuminating a mask that includes the pattern. Lithographic apparatuses are used in the manufacture of integrated circuits (ICs) having very small nanoscale features. An IC chip (e.g., a processor) can be as small as a person’s thumbnail and yet include billions of transistors. Nanofabrication is possible using pattern transfer techniques of photolithography. Making an IC is a complex and time-consuming process, with circuit components in different layers and including hundreds of individual steps. An error in even one lithography step has the potential to result in problems with the final IC and can cause device failure. High process yield and high wafer throughput can be impacted by the presence of defects.

[0004] As feature sizes become significantly smaller than the wavelength of the illumination used to transfer the pattern, it is increasingly difficult to maintain adequate process margins in the lithography process. For example, an aerial image created by a mask and exposure tool loses contrast and sharpness as the ratio of feature size to wavelength decreases. Loss of sharpness and contrast decreases accuracy of pattern transfer. Computational lithography techniques can be used to improve accuracy by simulating a lithography process and optimizing lithography process parameters. Some process parameters can relate to patterns on the mask. For example, the simulated illumination (electromagnetic field) in the vicinity of a mask (near field) can carry the pattern information as a result of the light-matter interaction between illumination and mask pattern. The simulated near field image of the patterned illumination near the mask is represented as a mask image. The simulation then propagates the patterned illumination into the far field, which interacts with resist on a substrate. The simulated far field image is represented as an aerial image. The chemical alteration of the resist is simulated, and the simulated image that represents the altered resist is a resist image. An etch image can also be used, which represents an etch simulation result of the altered resist (or the etching of unaltered resist for a negative resist).

[0005] While images can be two-dimensional, three-dimensional physical models can also be used in computational lithography simulations. For example, to accurately simulate the light-matter interaction between source illumination and mask, simulations can take into account the three-dimensional nature of the mask. The three-dimensional mask model, or thick mask model, is a representation of the effects of the three-dimensional physical structure of the mask on the projected light. The three-dimensional nature of the mask is taken into account in simulating light diffraction by considering a base substrate (either transmissive or reflective) of finite thickness, which is coated with light-absorbing material of finite thickness.

[0006] Conventional software for rigorous three-dimensional electromagnetic field simulation for a thick mask often runs extremely slowly. Such software is made for small portions of a chip design layout (not full-chip) and is impractical to use for large full-chip layouts. Though computational simulations using Maxwell’s equations can provide high fidelity between computational simulation and physical reality, using realistic boundary conditions for Maxwell’s equations can be prohibitively cumbersome in terms of computational demand and processing time. It is desirable to develop faster and more accurate simulations of lithographic processes that address the above-noted issues.

SUMMARY

[0007] Embodiments of the present disclosure provide a system and method for a reliable and less computationally intensive simulation of illumination physics of a 3D mask. The simulation can be performed using a multi-channel physics-informed neural network.

[0008] In some embodiments, a non-transitory readable medium that stores a set of instructions for simulating a lithography process is provided. The instructions are executable by at least one processor of an apparatus to cause the apparatus to perform operations. The operations can comprise obtaining a physics-informed neural network (PINN) comprising multiple channels. A first channel of the multiple channels can be associated with a first set of physical characteristics of an interaction between an electromagnetic (EM) field and a thick mask used in the lithography process. A second channel of the multiple channels can be associated with a second set of physical characteristics of the interaction that is different from the first set of physical characteristics. The operations can also comprise executing the PINN using the first channel to generate a near-field image representation of the EM field for the thick mask with the second channel disabled.

[0009] In some embodiments, a method for a multi-channel physics-informed neural network is provided. The method can comprise training the PINN for simulating a lithography process. A first channel of the multiple channels can be associated with a first set of physical characteristics of an interaction between an electromagnetic (EM) field and a thick mask used in the lithography process. A second channel of the multiple channels can be associated with a second set of physical characteristics of the interaction that is different from the first set of physical characteristics. The method can also comprise configuring the multiple channels to be independently toggleable for inferencing. The first channel can generate a near-field image representation of the EM field for the thick mask with the second channel disabled.

[0010] In some embodiments, a system for simulating a lithography process is provided. The system can comprise one or more processors and one or more memory devices. The one or more memory devices can store a set of instructions that is executable by the one or more processors to cause the system to perform operations for simulating the lithography process. The operations can comprise obtaining a physics-informed neural network (PINN) comprising multiple channels. A first channel of the multiple channels is associated with a first set of physical characteristics of an interaction between an electromagnetic (EM) field and a thick mask used in the lithography process. A second channel of the multiple channels is associated with a second set of physical characteristics of the interaction that is different from the first set of physical characteristics. The operations can also comprise executing the PINN using the first channel to generate a near-field image representation of the EM field for the thick mask with the second channel disabled.

BRIEF DESCRIPTION OF FIGURES

[0011] The above and other aspects of the present disclosure will become more apparent from the description of example embodiments, taken in conjunction with the accompanying drawings.

[0012] FIG. 1 shows example subsystems of a lithographic apparatus, consistent with embodiments of the present disclosure.

[0013] FIG. 2 shows a flowchart of an example method for simulating lithography in a lithographic apparatus, consistent with embodiments of the present disclosure.

[0014] FIG. 3 shows a flowchart of an example method for source or mask optimization of a patterning process, consistent with embodiments of the present disclosure.

[0015] FIG. 4 shows an example mask that can be used in a lithographic apparatus, consistent with embodiments of the present disclosure.

[0016] FIG. 5 shows a flowchart of an example method for simulating illumination physics of a mask used in a lithographic apparatus, consistent with embodiments of the present disclosure.

[0017] FIG. 6 shows a flowchart of an example method for simulating illumination physics of a mask used in a lithographic apparatus, consistent with embodiments of the present disclosure.

[0018] FIG. 7 shows a flowchart of an example method for simulating illumination physics of a mask used in a lithographic apparatus, consistent with embodiments of the present disclosure.

[0019] FIG. 8 shows a flowchart of an example method for simulating illumination physics of a mask used in a lithographic apparatus, consistent with embodiments of the present disclosure.

[0020] FIG. 9 shows a table of a channel arrangement for a PINN, consistent with embodiments of the present disclosure.

[0021] FIG. 10 shows a flowchart of an example method for simulating a lithography process using a mask, consistent with embodiments of the present disclosure.

[0022] FIG. 11 shows a block diagram of an example server, consistent with some embodiments of the disclosure.

[0023] FIG. 12 shows a flowchart of an example method for training a PINN, consistent with embodiments of the present disclosure.

DETAILED DESCRIPTION

[0024] Reference will now be made in detail to example embodiments, examples of which are illustrated in the drawings. The following description refers to the accompanying drawings in which the same numbers in different drawings represent the same or similar elements unless otherwise represented. The implementations set forth in the following description of example embodiments do not represent all implementations consistent with the invention. Instead, they are merely examples of apparatuses, systems, and methods consistent with aspects related to subject matter that may be recited in the appended claims.

[0025] To print nano-accurate features on a substrate (e.g., nano-transistors), a lithographic apparatus can illuminate a mask. The mask acts like a stencil for the illumination and the patterned illumination is projected onto the substrate, thereby achieving a pattern transfer from mask to substrate. It is desirable for the illumination used in this process to be conditioned to a high degree of accuracy in terms of wavelength, dose, uniform intensity spread, uniform wavefront, or the like. Instability of the illumination can introduce defects and reduce yield. A goal of the manufacturing process is to avoid such defects to maximize the number / yield of functional ICs made in the process.

[0026] A lithography process can include creating a master image on a mask, then projecting an image from the mask onto a resist-coated substrate in order to create a pattern on the device wafer that matches the design intent (e.g., functional elements such as transistor gates, contacts, or the like). The projection of the patterned illumination can be achieved via precise and complex projection optical systems. Yield increases the more times a master pattern is successfully replicated within the design specifications. Yield is a metric that characterizes failure rate in device fabrication, which relates to cost and efficiency. Wafers or chips with lithography errors are a sunk cost and lost time for a fab.

[0027] The technology trend towards “subwavelength lithography” makes it increasingly difficult to maintain adequate process margins in the lithography process. There is limited practical flexibility in choosing the exposure wavelength, and the numerical aperture (NA) of exposure tools is near physical limits. Consequently, the continuous reduction in device feature sizes requires more and more aggressive reduction of the k1 factor in lithographic processes, conducting imaging at or below the classical resolution limits (the k1 factor quantifies the ratio of feature size to wavelength, and is defined as the NA of the exposure tool times the minimum feature size divided by the wavelength). Methods to enable low-k1 lithography can include adjusting the mask pattern to include non-printing “assist features,” which are shapes that are not meant to be printed, but rather instigate proximity effects to provide a correction, thereby generating main print features that are more faithful to the intended design pattern. These correction methods can be referred to as optical proximity correction (OPC) methods. The implementation and verification of full-chip OPC can be made possible by detailed full-chip computational lithography process modeling. The process is generally referred to as model-based OPC.
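The k1 definition given above (NA times minimum feature size divided by wavelength) can be checked with a short calculation. The following is a minimal sketch; the function name and the numeric values (a 193 nm exposure wavelength, NA of 1.35, 40 nm feature) are illustrative assumptions and are not taken from the disclosure.

```python
def k1_factor(numerical_aperture: float, min_feature_nm: float, wavelength_nm: float) -> float:
    """Return k1 = NA * (minimum feature size) / wavelength."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return numerical_aperture * min_feature_nm / wavelength_nm


# Illustrative values only (assumed, not from the source text).
k1 = k1_factor(numerical_aperture=1.35, min_feature_nm=40.0, wavelength_nm=193.0)
print(f"k1 = {k1:.3f}")
```

With the wavelength and NA held fixed, shrinking the minimum feature size lowers k1 proportionally, which is the trend the paragraph describes.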

[0028] Computational lithography simulation can rely on a plurality of physical models, which simulate light projection, image forming process, and light-matter interactions (e.g., optical model, mask model, resist model, etch model, stochastic edge placement error (SEPE) model, or the like). For example, physical properties of a mask and its interaction with incident illumination can be approximated in a mask model to various degrees of complexity. An important input to a lithography simulation system is the model for the interaction between the illuminating electric field and the mask. In the past, thin-mask approximations were satisfactory when device feature sizes were much larger than the wavelength of the exposure illumination. However, resource-intensive, rigorous three-dimensional (3D) electromagnetic field simulation is becoming more important in aerial image formation using a thick mask. An example of a thick mask model is described in U.S. Patent No. 11,461,532 (issued October 4, 2022), the content of which is incorporated herein by reference in its entirety. Nanoscopic topographic features of a mask are now relevant to computational lithography simulations and computational rigor can no longer be avoided. A challenge is that software to perform such rigorous 3D electromagnetic field simulation for a thick mask often runs extremely slowly and hence can be limited to extremely small areas of a chip design layout (not full-chip).

[0029] It is desirable to provide even faster, less computationally intensive, and more accurate thick mask simulation methods in order to increase yield and throughput of lithographic systems.

[0030] Embodiments of the present disclosure provide a system and method for training a physics-based neural network (PINN) with multiple processing channels that can be toggled on and off to reduce time-consuming computations while activating the minimum number of channels needed for simulating a near-field output (mask image) from a mask.
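The toggling behavior described above can be illustrated with a toy control-flow sketch. It is not a PINN and does not reflect how the disclosed channels are built or combined: the class names, the lambda placeholders standing in for trained channels, and the choice to sum channel outputs are all assumptions made only to show that a disabled channel is skipped and therefore costs no computation.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Field = List[float]  # toy 1D stand-in for a near-field image


@dataclass
class Channel:
    name: str
    transform: Callable[[Sequence[float]], Field]  # placeholder for a trained channel
    enabled: bool = True


class MultiChannelModel:
    """Hypothetical container whose channels can be toggled independently."""

    def __init__(self, channels: Sequence[Channel]) -> None:
        self.channels = {c.name: c for c in channels}

    def set_enabled(self, name: str, enabled: bool) -> None:
        self.channels[name].enabled = enabled

    def run(self, mask: Sequence[float]) -> Tuple[Field, List[str]]:
        output = [0.0] * len(mask)
        evaluated = []
        for channel in self.channels.values():
            if not channel.enabled:
                continue  # a disabled channel is never evaluated
            contribution = channel.transform(mask)
            # Assumption: channel outputs are combined by summation.
            output = [o + c for o, c in zip(output, contribution)]
            evaluated.append(channel.name)
        return output, evaluated


model = MultiChannelModel([
    # Placeholder "first" channel: passes the mask pattern through unchanged.
    Channel("first", lambda m: [float(x) for x in m]),
    # Placeholder "second" channel: small edge term (periodic boundary via m[i - 1]).
    Channel("second", lambda m: [0.1 * (m[i] - m[i - 1]) for i in range(len(m))]),
])

model.set_enabled("second", False)
near_field, used = model.run([0, 1, 1, 0])
print(used, near_field)
```

Re-enabling the second channel with `model.set_enabled("second", True)` adds its contribution back on the next call to `run` without any change to the first channel.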

[0031] Objects and advantages of the disclosure can be realized by the elements and combinations as set forth in embodiments described herein. However, embodiments of the present disclosure are not necessarily required to achieve such example objects or advantages. Some embodiments can achieve a different feature or enhancement without necessarily achieving any expressly stated object or advantage.

[0032] As used herein, unless specifically stated otherwise, the term “or” encompasses all possible combinations, except where infeasible. For example, if it is stated that a component can comprise A or B, then, unless specifically stated otherwise or infeasible, the component can comprise A, or B, or A and B. As a second example, if it is stated that a component can comprise A, B, or C, then, unless specifically stated otherwise or infeasible, the component can comprise A, or B, or C, or A and B, or A and C, or B and C, or A and B and C.

[0033] Relative dimensions of components in drawings may be exaggerated for clarity. Within the following description of drawings, the same or like reference numbers refer to the same or like components or entities, and only the differences with respect to the individual embodiments are described.

[0034] The term “patterning device” may be considered synonymous with similar terms of art, such as “reticle” or “mask” and such terms can be used herein interchangeably. The term “patterning device,” “reticle,” “mask,” or the like, used herein should be broadly interpreted as referring to any device that can be used to impart a pattern on a cross section of a radiation beam. The radiation beam then can recreate the pattern in a target portion of a substrate.

[0035] The term “projection system” used herein should be broadly interpreted as encompassing any type of projection system, including refractive, reflective, catadioptric, magnetic, electromagnetic, or electrostatic optical systems, or any combination thereof, as appropriate for the exposure radiation being used, or for other factors such as the use of an immersion liquid or the use of a vacuum. Any use of the term “projection lens” herein may be considered as synonymous with the more general term “projection system.”

[0036] Illumination can be understood to be a form of radiation. The terms “radiation” and “illumination” can be used herein interchangeably. Embodiments described in the context of illumination are also applicable in the context of radiation in general. Furthermore, the terms “radiation” and “beam” can encompass all types of electromagnetic radiation, including ultraviolet radiation (e.g., with a wavelength of 365, 248, 193, 157 or 126 nm) and EUV (extreme ultra-violet radiation, e.g., having a wavelength in the range 5-20 nm, such as 13.5 nm).

[0037] It is to be appreciated that image information can be represented in several forms. For example, a graphical representation of an image (e.g., displayed at a display device) can also be represented as digital data (e.g., saved as a file in memory). An analog representation can be in the form of electrical signals that convey the image information. The present disclosure can refer to generating, analyzing, or processing of images. It is to be appreciated that such operations directed to images can be performed with respect to any representation of an image (e.g., graphical, electrical signal, binary data, or the like). For example, generating an image can refer to generating an actual graphical representation of the image or generating a binary-data representation of the image that can be read and processed by a computing device.

[0038] FIG. 1 shows example subsystems of a lithographic apparatus 100, consistent with embodiments of the present disclosure. In some embodiments, lithographic apparatus 100 comprises a radiation source 102, which can be a deep-ultraviolet excimer laser source or other type of source including an extreme ultraviolet (EUV) source (the lithographic apparatus itself need not have the radiation source); illumination optics, which define the partial coherence (denoted as sigma) and which can include optic components 104, 106a, and 106b that shape radiation from source 102; a patterning device 108; and transmission optics 106c that project an image of the patterning device pattern onto a substrate plane 109. An adjustable filter or aperture 107 disposed among the optics can restrict the range of beam angles that impinge on the substrate plane 109. A largest possible angle θmax can define the numerical aperture NA of the projection optics as NA = n·sin(θmax), where n is the index of refraction of the medium in which the final lens element is working (e.g., a lens closest to the substrate). And while the example of FIG. 1 illustrates a transmissive lithographic apparatus (e.g., patterning device 108 is transmissive), embodiments described herein are not so limited. For example, embodiments described herein are also applicable to reflective lithographic apparatuses that use a reflective patterning device.
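The numerical aperture relation stated above, NA as the refractive index n times the sine of the largest beam angle, can be evaluated directly. This is a minimal sketch; the function name and the sample values (n = 1.0 and a 30 degree maximum angle) are assumptions chosen for illustration.

```python
import math


def numerical_aperture(refractive_index: float, theta_max_deg: float) -> float:
    """Return NA = n * sin(theta_max), with theta_max given in degrees."""
    return refractive_index * math.sin(math.radians(theta_max_deg))


# Illustrative values only (assumed): a dry system with a 30 degree half-angle.
na = numerical_aperture(refractive_index=1.0, theta_max_deg=30.0)
print(f"NA = {na:.3f}")
```

Because the sine cannot exceed 1, NA is bounded by n in this relation, so a medium with a higher refractive index raises the attainable NA.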

[0039] In an optimization process of a lithographic projection system, a figure of merit of the system can be represented as a cost function. The optimization process can determine a set of parameters (design variables) of the system that minimizes the cost function. The cost function can have any suitable form depending on the goal of the optimization. For example, the cost function can be a weighted root mean square (RMS) of deviations of certain characteristics (evaluation points) of the system with respect to the intended values (e.g., ideal values) of these characteristics. The cost function can be the maximum of these deviations (e.g., worst deviation). The term “evaluation points” herein should be interpreted broadly to include any characteristics of the system. The design variables of the system can be confined to finite ranges or be interdependent due to practicalities of implementations of the system. In case of a lithographic apparatus, the constraints are often associated with physical properties and characteristics of the hardware such as tunable ranges or patterning device manufacturability design rules, and the evaluation points can include physical points on a resist image on a substrate, as well as illumination characteristics such as dose and focus.
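The two cost function forms mentioned above, a weighted RMS of deviations and a worst-case deviation, can be sketched as follows. The normalization of the weighted RMS by the sum of the weights is an assumption (the text does not specify one), and the function names and sample numbers are hypothetical.

```python
import math
from typing import Sequence


def weighted_rms_cost(values: Sequence[float], targets: Sequence[float],
                      weights: Sequence[float]) -> float:
    """Weighted RMS of deviations at the evaluation points.

    Assumption: normalized by the sum of weights.
    """
    weighted_sq = sum(w * (v - t) ** 2 for v, t, w in zip(values, targets, weights))
    return math.sqrt(weighted_sq / sum(weights))


def worst_case_cost(values: Sequence[float], targets: Sequence[float]) -> float:
    """Maximum absolute deviation over the evaluation points."""
    return max(abs(v - t) for v, t in zip(values, targets))


# Hypothetical evaluation points (e.g., measured vs. intended values).
measured = [1.0, 2.0, 4.0]
intended = [1.0, 1.0, 1.0]
weights = [1.0, 1.0, 2.0]

rms = weighted_rms_cost(measured, intended, weights)
worst = worst_case_cost(measured, intended)
print(rms, worst)
```

The weighted RMS lets some evaluation points count more than others, while the worst-case form is driven entirely by the single largest deviation.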

[0040] In a lithographic apparatus, a source can provide illumination (e.g., light). Projection optics can direct and shape the illumination via a patterning device and onto a substrate. The term “projection optics” is broadly defined to include any optical component that can alter the wavefront of the radiation beam. For example, projection optics can include at least some of components 104, 106a, 106b, and 106c. An aerial image is the radiation intensity distribution at substrate level. A resist layer on the substrate is exposed and the aerial image is transferred to the resist layer as a latent “resist image” therein. The resist image can be defined as a spatial distribution of solubility of the resist in the resist layer.

[0041] Various aspects of lithographic apparatus 100 can be simulated using computational lithography. For example, a resist model is used to calculate the resist image from the aerial image. An example of a resist model can be found in U.S. Patent No. 8,200,468 (issued June 12, 2012), the contents of which are incorporated herein by reference in their entirety. The resist model is related to properties of the resist layer (e.g., effects of chemical processes which occur during exposure, post-exposure bake (PEB), and development). Optical properties of the lithographic apparatus (e.g., properties of the source, the patterning device, and the projection optics) dictate the aerial image. Since the patterning device used in the lithographic apparatus can be changed, it is desirable to separate the optical properties of the patterning device from the optical properties of the rest of the lithographic apparatus including at least the source and the projection optics. It is to be appreciated that, in the context of computational lithography, terms such as “model,” “physical model,” “physics-based model,” or the like can be considered a computer-based algorithmic approach specifically designed to solve optical and process proximity problems by taking into account physical properties (e.g., material properties and geometries) that influence the lithography process outcome.

[0042] FIG. 2 shows a flowchart of an example method 200 for simulating lithography in a lithographic apparatus (e.g., lithographic apparatus 100 (FIG. 1)), consistent with embodiments of the present disclosure. In some embodiments, a source model 202 represents optical characteristics of the source (e.g., including radiation intensity distribution, phase distribution, or the like). A projection optics model 204 can represent optical characteristics of the projection optics (e.g., including changes to radiation intensity / phase distribution caused by the projection optics). A design layout model 206 can represent optical characteristics of a design layout (e.g., including changes to radiation intensity / phase distribution caused by a given design layout), which is the representation of an arrangement of features on, or formed by, a patterning device (e.g., a mask). The mask can comprise a 3D structure of absorber layers and valleys in the absorber layers that alter the intensity / phase of illumination incident thereupon. An aerial image 208 can be simulated from source model 202, projection optics model 204, and design layout model 206. A resist image 212 can be simulated from aerial image 208 using a resist model 210. Simulation of lithography can, for example, predict lithographic pattern transfer results, which can include feature contours, edge placement errors (EPE), critical dimensions (CDs), or the like, in the resist image. It is to be appreciated that the simulation of resist image 212 can be provided in a two-tier process, for example, generating a pre-develop resist image and then generating a post-develop resist image (e.g., an etch image).
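The data flow of method 200 (models feeding an aerial image, then a resist model producing a resist image) can be sketched with a deliberately simplified 1D toy. Several stand-ins are assumed here and are not the disclosed models: a Gaussian blur replaces the combined source and projection optics models, a binary list replaces the design layout model, and a constant intensity threshold replaces the resist model. All names and values are hypothetical.

```python
import math
from typing import List, Sequence


def gaussian_kernel(sigma_px: float, radius: int) -> List[float]:
    """Normalized Gaussian weights; a stand-in for the source/projection optics models."""
    weights = [math.exp(-0.5 * (k / sigma_px) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def aerial_image(mask: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Blur the mask transmission (periodic boundary), then square to get intensity."""
    n, r = len(mask), len(kernel) // 2
    amplitude = [
        sum(kernel[j] * mask[(i + j - r) % n] for j in range(len(kernel)))
        for i in range(n)
    ]
    return [a * a for a in amplitude]


def resist_image(aerial: Sequence[float], threshold: float) -> List[int]:
    """Constant-threshold resist stand-in: 1 where intensity reaches the threshold."""
    return [1 if intensity >= threshold else 0 for intensity in aerial]


# Toy layout: one 8-pixel-wide line on a 24-pixel period.
mask = [0] * 8 + [1] * 8 + [0] * 8
kernel = gaussian_kernel(sigma_px=1.5, radius=4)
aerial = aerial_image(mask, kernel)
resist = resist_image(aerial, threshold=0.25)
print(resist)
```

Comparing `resist` with `mask` pixel by pixel is a crude analogue of the contour and edge placement comparisons that the simulation is meant to predict.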

[0043] It is noted that the source model 202 can represent optical characteristics of the source that include, but are not limited to, NA-sigma (σ) settings as well as any particular illumination source shape (e.g., off-axis radiation sources such as annular, quadrupole, and dipole, etc.). Projection optics model 204 can represent the optical characteristics of the projection optics that include, but are not limited to, aberration, distortion, refractive indexes, physical sizes, physical dimensions, or the like. Design layout model 206 can represent physical properties of a physical patterning device. An example of a design layout model can be found in U.S. Patent No. 7,587,704 (issued September 8, 2009), the contents of which are incorporated herein by reference in their entirety. An example of a thick mask model is described in the previously mentioned U.S. Patent No. 11,461,532. A goal of the simulation is to accurately predict feature contours, edge placement errors (EPE), critical dimensions (CDs), or the like, which can then be compared against an intended design for a device (e.g., a simulation to determine whether a mass fabrication of a new CPU architecture is feasible). The intended design is generally defined as a pre-OPC design layout (OPC is sometimes also referred to as “optical and process correction”), which can be provided in a standardized digital file format. The layout file can be in a Graphic Database System (GDS) format, Graphic Database System II (GDS II) format, an Open Artwork System Interchange Standard (OASIS) format, a Caltech Intermediate Format (CIF), or the like. The intended design layout can include patterns or structures for transferring onto a wafer. The patterns or structures can be mask patterns used to transfer features from photolithography masks or reticles to a wafer. In some embodiments, a layout in GDS or OASIS format, among others, can include feature information stored in a binary file format representing planar geometric shapes, text, and other information related to the wafer design.

[0044] From the design layout, one or more portions can be identified, which are referred to as “clips.” In some embodiments, a set of clips is extracted, which represents the complicated patterns in the design layout (typically about 50 to 1000 clips, although any number of clips can be used). It is to be appreciated that these patterns or clips represent small portions (e.g., circuits, cells, or patterns) of the design, and especially small portions for which particular attention or verification is desirable. In other words, clips can be portions of the design layout, or can be similar to or behave similarly to portions of the design layout, where critical features are identified either by experience (including clips provided by a customer), by trial and error, or by running a full-chip simulation. Clips can contain one or more test patterns or gauge patterns.

[0045] An initial larger set of clips can be provided a priori by a customer based on known critical feature areas in a design layout that could benefit from image optimization. Alternatively, in some embodiments, the initial larger set of clips is extracted from the entire design layout by using some kind of automated algorithm (e.g., machine vision) or manual algorithm that identifies the critical feature areas.

[0046] In some embodiments, an optimization process (e.g., source and mask optimization) relates to one or more aspects of a patterning process that employs process models (e.g., an optics model, a mask model, a resist model, or the like). The optimization process can involve execution of the one or more process models and computing a cost function that can be reduced by modifying one or more characteristics (e.g., source, mask pattern, etc.) of the patterning process. In some embodiments, the one or more characteristics are described by design variables. Hence, an optimized characteristic can also be referred to as an optimized design variable, where a design variable is optimized based on a cost function.

[0047] In some embodiments, modifying the one or more characteristics is based on a gradient of the cost function that guides how the characteristic should be modified to reduce the cost function. A cost function can be a function of a certain continuous metric such as an edge placement error (e.g., a difference between contours of printed pattern and a target pattern). Using a continuous metric or a cost function of a continuous nature allows use of gradient-based optimizing algorithms that have acceptable runtime performance of an optimization process.
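The gradient-guided update described above can be sketched with a single design variable. The process response below (a linear relation between a mask bias and the printed edge position), the target value, the step size, and the use of a central finite difference for the gradient are all assumptions for illustration; a real flow would evaluate the process models instead.

```python
from typing import Callable


def printed_edge_nm(bias_nm: float) -> float:
    """Hypothetical process response: printed edge position versus mask bias."""
    return 48.0 + 0.8 * bias_nm  # toy linear model, not from the source


def cost(bias_nm: float, target_nm: float = 50.0) -> float:
    """Squared edge placement error (printed edge minus target edge)."""
    epe = printed_edge_nm(bias_nm) - target_nm
    return epe ** 2


def gradient(f: Callable[[float], float], x: float, h: float = 1e-4) -> float:
    """Central finite-difference estimate of df/dx."""
    return (f(x + h) - f(x - h)) / (2 * h)


bias = 0.0   # initial design variable
step = 0.5   # assumed fixed step size
for _ in range(50):
    bias -= step * gradient(cost, bias)  # move against the gradient to reduce cost

print(f"bias = {bias:.4f} nm, cost = {cost(bias):.3e}")
```

Because the cost is a smooth function of the continuous edge placement error, each step has a well-defined descent direction, which is the property the paragraph attributes to continuous metrics.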

[0048] Details of example techniques and models used to transform a patterning device pattern into various lithographic images (e.g., an aerial image, a resist image, an etch image, etc.), apply OPC (e.g., using models) and evaluate performance (e.g., in terms of process window) can be found in U.S. Patent Nos. 7,695,876 (issued April 13, 2010); 7,707,538 (issued April 27, 2010); 7,747,978 (issued June 29, 2010); 7,882,480 (issued February 1, 2011); 8,413,081 (issued April 2, 2013); 8,438,508 (issued May 7, 2013); and 9,360,766 (issued June 7, 2016), the contents of which are incorporated herein by reference in their entirety.

[0049] FIG. 3 shows a flowchart of an example method 300 of source or mask optimization of a patterning process, consistent with embodiments of the present disclosure. In some embodiments, method 300 shows how the different images, models, clips, optimizations, and other concepts described in reference to FIG. 2 are related. In a typical high-end design, almost every feature edge can benefit from some modification to achieve printed patterns that come sufficiently close to the target design. These modifications can include shifting or biasing of edge positions or line widths as well as application of “assist” features that are not intended to print themselves (e.g., non-print features), but can affect the properties of an associated primary feature. Furthermore, optimization techniques applied to the source of illumination can have different effects on different edges and features. Optimization of illumination sources can include the use of pupils to restrict source illumination to a selected pattern of light. Embodiments of the present disclosure provide optimization methods that can be applied to both source and mask configurations.

[0050] A method of performing source and mask optimization (SMO) can allow full chip pattern coverage while lowering the computation cost by intelligently selecting a small set of critical design patterns from the full set of clips to be used in SMO. SMO can be performed on these selected patterns to obtain an optimized source. The optimized source can then be used to optimize the mask (e.g., using OPC and manufacturability verification) for the full chip, and the results can be compared. Various methods are provided for iteratively converging on an optimal result. Method 300 is an example SMO method.

[0051] A target design 301 (e.g., comprising a layout in a standard digital format such as OASIS, GDSII, etc.) for which a lithographic process is to be optimized can include memory, test patterns, and logic. From this design, a full set of clips 302 can be extracted, which represents complex patterns in design 301 (e.g., about 50 to 1000 clips). It is to be appreciated that these clips represent small portions (e.g., circuits, cells, or patterns) of the design for which particular attention and/or verification is of interest. At operation 304, a small subset of clips 306 (e.g., 15 to 50 clips) can be selected from full set of clips 302. As will be explained in more detail below, the selection of clips can be performed such that the process window of the selected patterns matches the process window for the full set of critical patterns as closely as possible. The effectiveness of the selection can be measured by the reduction in total run time (pattern selection and SMO).

[0052] At operation 308, SMO can be performed with the selected patterns (15 to 50 patterns) of subset of clips 306. In particular, an illumination source can be optimized for the selected patterns of subset of clips 306. Source optimization can be performed as described below with respect to FIGS. 4-9. Examples of other source optimization methods can be found in, for example, U.S. Patent Application Publication No. 2004/0265707 (published December 30, 2004), the contents of which are incorporated herein by reference in their entirety.

[0053] At operation 310, manufacturability verification of the selected patterns of subset of clips 306 can be performed with the source obtained in operation 308. In particular, verification can include performing an aerial image simulation of the selected patterns of subset of clips 306 with the optimized source and verifying that the patterns will print across a sufficiently wide process window. An example verification process can be found in U.S. Patent No. 7,342,646 (issued March 11, 2008), the contents of which are incorporated herein by reference in their entirety. If the verification at operation 310 is satisfactory, as determined in operation 312, then processing can advance to full chip optimization (e.g., advance to operations using optimized source 314). Otherwise, processing can return to operation 308, where SMO is performed again but with a different source or set of patterns. For example, the process performance as estimated by the verification tool can be compared against thresholds for certain process window parameters such as exposure latitude and depth of focus. These thresholds can be predetermined or set by a user.

[0054] After the selected patterns meet the lithography performance specification, as determined in operation 312, the optimized source 314 can be used for optimization of the full set of clips 316 (e.g., originating from full set of clips 302).

[0055] At operation 318, model-based sub-resolution assist feature placement (MB-SRAF) and optical proximity correction (OPC) for all the patterns in the full set of clips 316 can be performed. Examples of MB-SRAF and OPC can be found in U.S. Patent Nos. 5,663,893 (issued September 2, 1997); 5,821,014 (issued October 13, 1998); 6,541,167 (issued April 1, 2003); and 6,670,081 (issued December 30, 2003), the contents of which are incorporated herein by reference in their entirety.

[0056] At operation 320, using processes similar to step 310, full pattern simulation-based manufacturability verification can be performed with the optimized source 314 and the full set of clips 316 as corrected in step 318.

[0057] At operation 322, the performance (e.g., process window parameters such as exposure latitude and depth of focus) of the full set of clips 316 can be compared against that of subset of clips 306. For example, the pattern selection can be considered complete and/or the source can be considered fully qualified for the full chip when similar lithography performance (e.g., within 10%) is obtained for both the selected patterns of subset of clips 306 and the critical patterns of full set of clips 316.

[0058] Otherwise, at operation 324, hotspots can be extracted. At operation 326, the hotspots can be added to subset of clips 306 and the process starts over. For example, hotspots (e.g., features among the full set of clips 316 that limit process window performance) identified during verification step 320 can be used for further source tuning or to run SMO of operation 308 again. The source can be considered fully converged when the process window of the full set of clips 316 is the same between the last run and the run before the last run of operation 322.
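The iterative flow of operations 308 through 326 can be sketched in Python as follows. This is only a toy illustration: the clip names, the scalar "preferred source setting" per clip, the averaging `optimize_source` stand-in, and the `process_window` formula are all hypothetical placeholders for the real SMO, OPC, and verification tools.

```python
# Hypothetical toy data: each clip "prefers" one scalar source setting.
FULL_CLIPS = {"c1": 0.30, "c2": 0.32, "c3": 0.35, "c4": 0.60, "c5": 0.33}


def optimize_source(subset):
    """Stand-in for SMO (operation 308): average the preferred settings."""
    return sum(FULL_CLIPS[name] for name in subset) / len(subset)


def process_window(name, source):
    """Stand-in for verification: the window shrinks as the source departs
    from the setting the clip prefers."""
    return max(0.0, 1.0 - abs(FULL_CLIPS[name] - source))


def limiting_window(names, source):
    """The performance of a set of clips is limited by its worst clip."""
    return min(process_window(n, source) for n in names)


def run_smo(initial_subset, tolerance=0.10):
    """Repeat SMO on a growing subset until full-set performance is similar."""
    subset = list(initial_subset)
    while True:
        source = optimize_source(subset)
        subset_perf = limiting_window(subset, source)
        full_perf = limiting_window(FULL_CLIPS, source)
        # Operation 322: accept when the full set performs within the
        # tolerance (10% here) of the selected subset.
        if subset_perf - full_perf <= tolerance * subset_perf:
            return source, subset
        # Operations 324 and 326: extract the limiting hotspot and add it
        # to the subset before running SMO again.
        hotspot = min(FULL_CLIPS, key=lambda n: process_window(n, source))
        subset.append(hotspot)


source, selected = run_smo(["c1", "c2"])
print("selected clips:", selected, "source setting:", round(source, 4))
```

In this toy run the outlier clip is picked up as a hotspot after the first pass, and the second pass satisfies the similarity check. The loop always terminates because a hotspot is only ever a clip outside the current subset.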

[0059] OPC calibration can be performed by modeling or simulation. For example, for the desired yield, the total number of features, and their respective probabilities of failure, simulation can be performed to optimize OPC for a lowest yielding feature. OPC addresses the fact that, in addition to any demagnification by the lithographic projection apparatus, the final size and placement of an image of the patterning device pattern projected on the substrate will not be identical to, or simply depend only on the size and placement of, the corresponding patterning device pattern features on the patterning device.

[0060] In some embodiments, the measurement data (e.g., stochastic variations) related to the printed pattern can be employed in optimizing the patterning process or adjusting parameters of the patterning process. For small feature sizes and high feature densities present on some design layouts, the position of a particular edge of a given feature can be influenced to a certain extent by the presence or absence of other adjacent features. These proximity effects arise from minute amounts of radiation coupled from one feature to another or non-geometrical optical effects such as diffraction and interference. Similarly, proximity effects can arise from diffusion and other chemical effects during post-exposure bake (PEB), resist development, and etching that generally follow lithography.

[0061] To ensure that the projected image of the patterning device pattern is in accordance with tolerances of a given target design, proximity effects should be predicted and compensated for using sophisticated numerical models, corrections, or pre-distortions of the patterning device pattern. The article “Full-Chip Lithography Simulation and Design Analysis — How OPC Is Changing IC Design,” C. Spence, Proc. SPIE, Vol. 5751, pp 1-14 (2005) provides an overview of “model-based” optical proximity correction processes, the contents of which are incorporated herein by reference in their entirety. In a typical high-end design, almost every feature of the patterning device pattern has some modification to achieve high fidelity of the projected image to the target design. These OPC modifications can include shifting or biasing of edge positions or line widths and / or application of “assist” features that are intended to assist projection of other features.

[0062] Application of model-based OPC to a target design can involve good process models and considerable computational resources, given the many millions of features typically present in a device design. However, applying OPC is generally an empirical, iterative process that does not always compensate for all possible proximity effects. Therefore, the effect of OPC, e.g., patterning device patterns after application of OPC and any other resolution enhancement technique (RET), should be verified by design inspection, e.g., intensive full-chip simulation using calibrated numerical process models, to reduce or minimize the possibility of design flaws being built into the patterning device pattern. This is driven by the enormous cost of making high-end patterning devices, as well as by the impact on turn-around time by reworking or repairing existing patterning devices after they have been manufactured. OPC and full-chip RET verification can be based on numerical modelling systems and methods. Examples of such methods can be found in U.S. Pat. No. 7,003,758 (issued February 21, 2006) and an article titled “Optimized Hardware and Software For Fast, Full Chip Simulation,” by Y. Cao et al., Proc. SPIE, Vol. 5754, 405 (2005), the contents of which are incorporated herein by reference in their entirety.

[0063] Referring to design layout model 206 (FIG. 2), an accurate mask model can promote accurate simulation of a pattern transfer process from mask to resist. To better appreciate useful aspects of embodiments described herein, it is instructive to consider parameters of a physical mask and how the parameters influence the simulation process.

[0064] FIG. 4 shows an example mask 400 that can be used in lithographic apparatus 100 (FIG. 1), as well as simulated using methods 200 and 300 (FIGS. 2 and 3), consistent with embodiments of the present disclosure. In some embodiments, mask 400 comprises a support structure 402 and a patterned structure 404. Support structure 402 can comprise a plurality of layers 406, 408, and 410 (more or fewer layers can be used). Layer 408 can be a transparent substrate (e.g., quartz glass) that is transparent to illumination used in lithographic apparatus 100 (FIG. 1). Layers 406 and 410 can be coatings or films that serve one or more optical functions (e.g., anti-reflective (AR) coating, polarizer, or the like). Patterned structure 404 can comprise a layer of illumination-rejecting material (e.g., light absorber). The light absorber can be patterned using etched portions 412 (e.g., valleys, trenches, holes, or the like).

[0065] For brevity, embodiments described herein can be described with reference to transmission-type masks with light absorbers. However, it is appreciated that embodiments described herein are not so limited. For example, embodiments described herein are also applicable to reflective masks, as well as any suitable illumination-rejecting material.

[0066] A beam of illumination 414 can be provided by radiation source 102 (FIG. 1). Beam of illumination 414 (an electromagnetic (EM) field) can be incident on mask 400 from the side of support structure 402 (embodiments are not so limited, as simulations can be adjusted to simulate incidence from the side of patterned structure 404). Transparent portions of mask 400 can transmit a portion of the EM field. Patterned structure 404 can reject portions of the EM field that interact with the light absorber. The transmitted illumination is represented as patterned beam of illumination 416 (or patterned EM field). Non-limiting coordinate axes are provided for facilitating description of directional elements. In the example illustrated in FIG. 4, beam of illumination 414 propagates along the z-direction while a plane of mask 400 extends along an xy-plane (in the drawing sheet, x-direction is horizontal and y-direction is out of sheet).

[0067] In some embodiments, the illumination can interact with several optical components as it propagates from source to substrate/resist. An example light-mask interaction is illustrated in FIG. 4. Simulation of interactions like the one in FIG. 4 can be used to improve parameters of actual lithography processes.

[0068] The pupil profile can be represented as an image of the intensity distribution of beam of illumination 414 at a pupil plane of radiation source 102 (FIG. 1). Similarly, a mask image (which can also be referred to as a “near-field image”) can represent a distribution of the complex amplitude of patterned beam of illumination 416 at a near-field output plane 418 of mask 400. Any number of xy-planes can be defined along the z-axis. For disambiguation, terms such as a “given z-plane” can refer to an xy-plane at a given z-height or position. For example, near-field plane 418 (e.g., output plane) is a near-field z-plane defined proximal to patterned structure 404 at the output side of mask 400.

[0069] Embodiments described herein can be subject to near-field and far-field behavior. The near field and far field are regions of the EM field around an object, such as the result of radiation scattering off an object (e.g., scattering from mask 400). Near-field behaviors can dominate close to mask 400 (e.g., at near-field plane 418) while EM far-field behaviors can dominate at greater distances, such as when patterned beam of illumination 416 reaches the resist at the substrate. In this context, the far-field image can correspond to the aerial image created by patterned beam of illumination 416.

[0070] As alluded to above, pattern designs for next generation ICs use feature sizes close to the wavelength of beam of illumination 414 (e.g., patterned structure 404 having feature sizes close to 13.5 nm). In this size regime, it is important for simulations to take into account the impact of the thickness of the mask stack (e.g., consider the mask as a 3D object with non-negligible thickness). Computational simulations using Maxwell’s equations with suitable boundary conditions can provide high fidelity between computational simulation and physical reality. Cascading dependencies are important: a more realistic thick mask model yields a more realistic near-field image (or mask image), which in turn yields a more realistic aerial image. However, applying Maxwell’s equations to realistic boundary conditions (such as a complex pattern on a mask) can be prohibitively cumbersome in terms of computational demand and processing time. Hence, some embodiments of the present disclosure rely on a physics-informed neural network (PINN) based approach to predict or infer solutions to Maxwell’s equations with small-feature boundary conditions.

[0071] PINN-based approaches to solving Maxwell’s equations provide some desirable features, for example, avoiding inefficient numerical approximations of solutions to Maxwell’s equations and avoiding other deep learning approaches that rely on large amounts of expensive rigorously simulated or measured training data. Examples of PINN-based modeling of near-field and far-field outcomes of EUV absorbers are described in Medvedev et al., “Modeling of Near- and Far-Field Diffraction from EUV Absorbers Using Physics-Informed Neural Networks,” 2023 Photonics & Electromagnetics Research Symposium (PIERS) (pp. 297-305), IEEE (2023, July) (“Medvedev 1”); Medvedev et al., “3D mask simulation and lithographic imaging using physics-informed neural networks,” Optical and EUV Nanolithography XXXVII, 12953, 129530N (2024) (“Medvedev 2”); and Medvedev et al., “3D EUV mask simulator based on physics-informed neural networks: effects of polarization and illumination,” Computational Optics 2024 (Vol. 13023, pp. 19-36), SPIE (2024, June) (“Medvedev 3”); the contents of which are incorporated herein by reference in their entirety. An example PINN is also described below with respect to FIG. 12.

[0072] A problem with using neural networks in such a manner is the large amount of computing resources they can consume. To train a sufficiently useful PINN, a 3D volume of mask 400 (e.g., the entire 3D volume) is simulated to evaluate the physics loss, yet the piece that is eventually used for lithography simulation (e.g., aerial image simulation) can be just a 2D field on the output plane. Furthermore, evaluation of the physics loss in the PINN requires using a fine pixel grid (e.g., a pixel size of approximately λ/10, where λ is the illumination wavelength), whereas lithography simulation might be carried out using less computationally intensive coarse pixel grids (e.g., approximately λ/4/NA, where NA is the numerical aperture). Some embodiments of the present disclosure provide modified PINN-based approaches that can significantly reduce computational burden while maintaining the high accuracy aspects of PINNs.
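To give a rough sense of scale for the two grids, the following Python sketch counts pixels per z-plane, reading the fine pixel size as about λ/10 and the coarse pixel size as about λ/(4·NA). The numerical aperture of 0.33 and the 1 µm square window are assumptions chosen only for illustration; the 13.5 nm wavelength is the EUV value mentioned earlier.

```python
import math

WAVELENGTH_NM = 13.5   # EUV wavelength mentioned in the text
NA = 0.33              # assumed numerical aperture (illustrative only)
FIELD_NM = 1000.0      # assumed square simulation window, 1 um per side

fine_pixel = WAVELENGTH_NM / 10.0           # about lambda/10 (physics loss grid)
coarse_pixel = WAVELENGTH_NM / (4.0 * NA)   # about lambda/(4*NA) (imaging grid)

# Number of pixels needed to cover one z-plane of the window on each grid.
fine_count = math.ceil(FIELD_NM / fine_pixel) ** 2
coarse_count = math.ceil(FIELD_NM / coarse_pixel) ** 2
ratio = fine_count / coarse_count

print(f"fine grid:   {fine_pixel:.2f} nm per pixel, {fine_count} pixels per plane")
print(f"coarse grid: {coarse_pixel:.2f} nm per pixel, {coarse_count} pixels per plane")
print(f"fine-to-coarse pixel ratio per plane: {ratio:.1f}")
```

Under these assumptions a single plane on the fine grid holds dozens of times more pixels than the coarse grid, and the gap widens further once the full 3D volume is sampled along z.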

[0073] FIG. 5 shows a flowchart of an example method 500 for simulating illumination physics of a thick mask used in a lithographic apparatus (e.g., lithographic apparatus 100 (FIG. 1)), consistent with embodiments of the present disclosure. In some embodiments, method 500 will be described in the context of mask 400 of FIG. 4, which is reproduced in FIG. 5. It is to be appreciated that method 500 is not limited to aspects of mask 400 specifically. Method 500 can be performed with other mask types (e.g., a reflective mask).

[0074] Method 500 can include a PINN 502. The physics-based aspect of PINN 502 can be implemented using the Helmholtz wave equation form of Maxwell’s equations, as in equation 1.

$$\nabla^2 E(r) + n^2 k_0^2 E(r) = 0 \qquad \text{(Eq. 1)}$$

[0075] E(r) is the electric field as a function of geometry r, n (shorthand for n(r)) is the material property distribution (e.g., refractive index) as a function of geometry r, and k_0 is the vacuum wavenumber. Equation 1 is provided as a non-limiting example. Any suitable form of Maxwell’s equations can be used.

[0076] In training a PINN, the loss function may indicate the extent of deviation of a simulated solution from a physically real solution. In a related context, a “residual” can refer to the difference between an approximate solution to the Helmholtz wave equation obtained through an imprecise method (e.g., numerical approximation) and the exact solution to the equation. In some embodiments, the residual can be used as the loss function. Equation 2 is an example loss function (there are several possible formulations of a loss function for a PINN).

[0077] The index j goes from 1 to N, where N is the number of training points of PINN 502.
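As a hedged illustration of a residual-based physics loss, the following Python sketch evaluates the Helmholtz residual on a 1D grid with a uniform refractive index and averages its squared magnitude over the N evaluation points. The mean-squared form, the finite-difference Laplacian, and all numerical values are assumptions made for this example; a PINN would typically obtain derivatives by automatic differentiation, and the disclosure notes that several loss formulations are possible.

```python
import cmath
import math


def helmholtz_residuals(field, n, k0, dz):
    """Residual of d2E/dz2 + n^2 k0^2 E at interior grid points (1D, uniform n)."""
    residuals = []
    for j in range(1, len(field) - 1):
        # Central finite-difference approximation of the second derivative.
        laplacian = (field[j - 1] - 2.0 * field[j] + field[j + 1]) / dz ** 2
        residuals.append(laplacian + (n * k0) ** 2 * field[j])
    return residuals


def physics_loss(field, n, k0, dz):
    """Mean squared residual magnitude over the N evaluation points."""
    res = helmholtz_residuals(field, n, k0, dz)
    return sum(abs(r) ** 2 for r in res) / len(res)


wavelength = 13.5                   # nm, EUV wavelength from the text
k0 = 2.0 * math.pi / wavelength     # vacuum wavenumber
n = 1.0                             # assumed uniform refractive index
dz = wavelength / 100.0             # fine grid keeps discretization error small
z = [j * dz for j in range(201)]

# A plane wave with wavenumber n*k0 satisfies the Helmholtz equation.
exact = [cmath.exp(1j * n * k0 * zj) for zj in z]
# A plane wave with the wrong wavenumber does not.
wrong = [cmath.exp(1j * 1.2 * n * k0 * zj) for zj in z]

loss_exact = physics_loss(exact, n, k0, dz)
loss_wrong = physics_loss(wrong, n, k0, dz)
print("loss for exact solution:", loss_exact)
print("loss for wrong solution:", loss_wrong)
```

The loss is near zero for the field that satisfies the wave equation and orders of magnitude larger for the field that does not, which is the signal a PINN can minimize without labeled training data.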

[0078] Equations 1 and 2 can be incorporated into PINN 502 in more general forms to account for the vector nature of the EM field (e.g., to account for polarization). The programming of physics into PINN 502 can be as described below in reference to FIG. 12.

[0079] Instead of the electric field equation representation in equations 1 and 2, PINN 502 can consider both the electric field and the magnetic field independently.

[0080] According to embodiments of the present disclosure, different physical characteristics of the light-matter interaction can be assigned to a separate channel in PINN 502. Hence, in some embodiments, PINN 502 comprises a plurality of solution channels (e.g., channels 504, 506, 508, 510, and 512). The channels can be output channels of PINN 502. More or fewer channels can be implemented (e.g., two or more channels). PINN 502 can perform mask simulation processing on a multi-channeled thick mask model. Depending on the embodiments, each channel can represent a field type (e.g., electric or magnetic), a polarization, a pixel resolution (e.g., coarse or fine pixel resolution), a propagation progression of the EM field (e.g., at different z-planes), or the like.

[0081] The example in FIG. 5 maps each channel to a different stage of propagation progression of the EM field through mask 400. For example, channel 504 corresponds to a portion of PINN 502 that simulates the EM field at a z-plane 514, which is disposed in mask layer 406 (e.g., a coating or thin film). Channel 506 can correspond to a portion of PINN 502 that simulates the EM field at a z-plane 516, which is disposed in layer 408 (e.g., a substrate). Channel 508 can correspond to a portion of PINN 502 that simulates the evolved EM field at a z-plane 518, which is disposed in layer 410 (e.g., another coating or thin film). Channel 510 can correspond to a portion of PINN 502 that simulates the evolved EM field at a z-plane 520, which is disposed in patterned structure 404. Channel 512 can correspond to a portion of PINN 502 that simulates the near field at near-field plane 418. Channel 512 can be used to generate a thick mask model image (near-field image) of mask 400. Each channel of PINN 502 can correspond to an EM field at a different z-plane in relation to the mask stack. The different z-planes can be disposed at different mask layers or outside of the mask layers, and multiple z-plane channels can even be placed at different positions in a same mask layer.

[0082] In some embodiments, the channel-based approach of PINN 502 can allow for a substantial reduction of computational burden after training is completed.

[0083] According to embodiments of the present disclosure, the training of PINN 502 can be performed comprehensively across all channels to minimize the total physical loss (e.g., until a cumulative physical loss reaches a stable or threshold value for the whole mask stack). Neural network weights w and biases b can be trained using, as input, the material property distribution n(r) (n(r) describes the nanoscale features of patterned structure 404).

[0084] Neural networks can be generally defined as a computational approach that is based on a relatively large collection of neural units loosely modeling the way a biological brain solves problems with relatively large clusters of biological neurons connected by axons. Each neural unit is connected with many others, and links can be enforcing or inhibitory in their effect on the activation state of connected neural units. These systems are self-learning and trained rather than explicitly programmed and excel in areas where the solution or feature detection is difficult to express in a traditional computer program.

[0085] Neural networks can comprise multiple layers. The signal path can traverse from front to back (forward propagation). A goal of neural networks is to solve problems in the same way that the human brain would, although several neural networks are much more abstract. Modern neural network projects typically work with a few thousand to a few million neural units and millions of connections. The neural network can have any suitable architecture or configuration known in the art.

[0086] In a further example, a model may comprise a convolutional neural network. For example, the embodiments described herein can take advantage of learning concepts such as a convolutional neural network to solve the normally intractable representation conversion problem (e.g., rendering). The model may have any convolutional neural network configuration or architecture known in the art.

[0087] A neural network can include a set of connected units or nodes (referred to as “neurons”), structured as different layers, where each connection (also referred to as an “edge”) may obtain and send a signal between neurons of neighboring layers in a way similar to a synapse in a biological brain. The signal may be any type of data (e.g., a real number). Each neuron may obtain one or more signals as an input and output another signal by applying a non-linear function to the inputted signals. Neurons and edges may typically be weighted by corresponding weights to represent the knowledge the neural network has acquired. During a training process (similar to a learning process of a biological brain), the weights may be adjusted (e.g., by increasing or decreasing their values) to change the strengths of the signals between the neurons to improve the performance accuracy of the neural network. Neurons may apply a thresholding function (referred to as an “activation function”) to their output values of the non-linear function such that a signal is outputted only when an aggregated value (e.g., a weighted sum) of the output values of the non-linear function exceeds a threshold determined by the thresholding function. Different layers of neurons may transform their input signals in different manners (e.g., by applying different non-linear functions or activation functions). The output of the last layer (referred to as an “output layer”) may provide the analysis result of the neural network, such as, for example, a categorization of the set of input data (e.g., as in image recognition cases), a numerical result, or any type of output data for obtaining an analytical result from the input data.
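A minimal Python sketch of the weighted-sum-and-activation behavior described above is given below. The sigmoid activation, the layer sizes, and the weight values are arbitrary choices made for illustration and are not specified by the disclosure.

```python
import math


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs followed by a non-linear activation (sigmoid)."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weight_rows, biases):
    """A layer applies one neuron per row of weights to the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]


# Two inputs feed a hidden layer of two neurons, then a single output neuron.
hidden = layer([0.5, -1.0], [[1.0, 2.0], [-1.0, 0.5]], [0.0, 0.1])
output = neuron(hidden, [1.5, -2.0], 0.2)
print("hidden activations:", hidden, "output:", output)
```

Training would adjust the weight and bias values; the forward computation itself stays the same.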

[0088] A neural network can be structured according to channels. In an example of a convolutional network, two-dimensional convolutions can be applied on images. The image can be an MxN array of pixels. The image can have p properties per pixel (e.g., electric field, magnetic field, polarization, or the like). The image can then be represented as MxNxp, where p is the number of channels. A convolution layer can receive, as input, the image with the information MxNxp. As output, the convolution layer can output a map of dimensions M'xN'xp', with the number of output channels being p'. In some embodiments, the number of channels corresponds to a depth of the matrices used in the convolutions.
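The shape bookkeeping described above can be sketched in plain Python as follows. The "valid" (no padding, stride 1) convention, the nested-list layout, and the all-ones test data are assumptions for this example; practical implementations would use an array library.

```python
def conv2d(image, kernels):
    """Valid 2D convolution (cross-correlation form, stride 1, no padding).

    image:   M x N x p nested lists (p input channels per pixel)
    kernels: p' filters, each of size k x k x p
    returns: M' x N' x p' nested lists, with M' = M - k + 1 and N' = N - k + 1
    """
    m, n, p = len(image), len(image[0]), len(image[0][0])
    k = len(kernels[0])
    out = []
    for i in range(m - k + 1):
        row = []
        for j in range(n - k + 1):
            pixel = []
            for kern in kernels:          # one output channel per filter
                acc = 0.0
                for di in range(k):
                    for dj in range(k):
                        for c in range(p):   # filter depth equals input channels
                            acc += image[i + di][j + dj][c] * kern[di][dj][c]
                pixel.append(acc)
            row.append(pixel)
        out.append(row)
    return out


M, N, P, P_OUT, K = 5, 6, 2, 3, 3
image = [[[1.0] * P for _ in range(N)] for _ in range(M)]
kernels = [[[[1.0] * P for _ in range(K)] for _ in range(K)] for _ in range(P_OUT)]

feature_map = conv2d(image, kernels)
shape = (len(feature_map), len(feature_map[0]), len(feature_map[0][0]))
print("output shape:", shape)
```

Each filter spans all p input channels, which is the sense in which the number of channels corresponds to the depth of the matrices used in the convolution.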

[0089] Training of the neural network, as used herein, may refer to a process of improving the accuracy of the output of the neural network. Typically, the training may be categorized into three types: supervised training, unsupervised training, and reinforcement training. In the supervised training, a set of target output data (also referred to as “labels” or “ground truth”) may be generated based on a set of input data using a method other than the neural network. The neural network may then be fed with the set of input data to generate a set of output data that is typically different from the target output data. Based on the difference between the output data and the target output data, the weights of the neural network may be adjusted in accordance with a rule. If such adjustments are successful, the neural network may generate another set of output data more similar to the target output data in a next iteration using the same input data. If such adjustments are not successful, the weights of the neural network may be adjusted again. After a sufficient number of iterations, the training process may be terminated in accordance with one or more predetermined criteria (e.g., the difference between the final output data and the target output data is below a predetermined threshold, or the number of iterations reaches a predetermined threshold). The trained neural network may be applied to analyze other input data.
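The supervised training loop described above can be sketched in Python with a one-neuron linear model. The model, the data set, the learning rate, and both stopping thresholds are hypothetical values chosen so that the example runs quickly.

```python
def train(inputs, labels, lr=0.05, loss_threshold=1e-8, max_iterations=10000):
    """Fit y = w*x + b by gradient descent on the mean squared error.

    Training stops when the loss falls below loss_threshold or when the
    iteration count reaches max_iterations, whichever comes first.
    """
    w, b = 0.0, 0.0
    loss = float("inf")
    iteration = 0
    for iteration in range(1, max_iterations + 1):
        preds = [w * x + b for x in inputs]
        errors = [p - y for p, y in zip(preds, labels)]   # output minus target
        loss = sum(e * e for e in errors) / len(errors)
        if loss < loss_threshold:
            break
        # Adjust the weights according to a rule (here, the loss gradient).
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, inputs)) / len(errors)
        grad_b = 2.0 * sum(errors) / len(errors)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, loss, iteration


# Labels ("ground truth") generated by a method other than the network: y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * x + 1.0 for x in xs]
w, b, final_loss, iterations = train(xs, ys)
print(f"w={w:.4f}, b={b:.4f}, loss={final_loss:.2e}, iterations={iterations}")
```

A PINN differs from this supervised setting in that its loss can come from the governing physics rather than from labeled target outputs.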

[0090] In the unsupervised training, the neural network is trained without any external gauge (e.g., labels) to identify patterns in the input data rather than generating labels for them. Typically, the neural network may analyze shared attributes (e.g., similarities and differences) and relationships among the elements of the input data in accordance with one or more predetermined rules or algorithms (e.g., principal component analysis, clustering, anomaly detection, or latent variable identification). The trained neural network may extrapolate the identified relationships to other input data.

[0091] In the reinforcement training, the neural network is trained without any external gauge (e.g., labels) in a trial-and-error manner to maximize benefits in decision making. The input data sets of the neural network may be different in the reinforcement training. For example, a reward value or a penalty value may be determined for the output of the neural network in accordance with one or more rules during training, and the weights of the neural network may be adjusted to maximize the reward values (or to minimize the penalty values). The trained neural network may apply its learned decision-making knowledge to other input data.

[0092] During the training of a neural network, a loss function (also referred to as a “cost function”) may be used to evaluate the output data. The loss function, as used herein, may map output data of a machine learning model (e.g., the neural network) onto a real number (referred to as a “loss” or a “cost”) that represents a loss or an error (e.g., representing a difference between the output data and target output data) associated with the output data. The training of the neural network may seek to maximize or minimize the loss function (e.g., by pushing the loss towards a local maximum or a local minimum in a loss curve). For example, one or more parameters of the neural network may be adjusted or updated with the aim of maximizing or minimizing the loss function. After adjusting or updating the one or more parameters, the neural network may obtain new input data in a next iteration of its training. When the loss function is maximized or minimized, the training of the neural network may be terminated.

[0093] After training, PINN 502 can be used for inferencing near-field images. PINN 502 can be executed to generate a solution of the near field (e.g., a thick mask model image) using channel 512. During inference, the other channels can be disabled. The deactivation of specific channels during inferencing prevents the use of computing power on computations that are inconsequential to the specific solution of interest. In many situations, it is enough to generate an inferred solution for the near-field image to successfully complete computational simulation of the aerial image and resist image. Hence, computational resources and time can be saved when generating a near-field image by limiting channel usage to just channel 512, as opposed to an unmodified PINN that executes every inference computation within its purview, whether necessary or extraneous.
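The idea of disabling channels at inference can be sketched in Python as below. The architecture shown (one shared hidden layer with a separate linear output head per channel), the channel names, the random weights, and the three-element input are all assumptions made for illustration; the disclosure does not prescribe this structure.

```python
import math
import random


class MultiChannelNet:
    """Toy network: one shared hidden layer and one linear output head per channel."""

    def __init__(self, n_inputs, n_hidden, channel_names, seed=0):
        rng = random.Random(seed)
        self.hidden = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                       for _ in range(n_hidden)]
        self.heads = {name: [rng.uniform(-1, 1) for _ in range(n_hidden)]
                      for name in channel_names}
        self.head_evaluations = 0   # counts output-head computations performed

    def forward(self, x, enabled=None):
        """Evaluate only the enabled channels (all channels when enabled is None)."""
        names = list(self.heads) if enabled is None else list(enabled)
        features = [math.tanh(sum(w * xi for w, xi in zip(row, x)))
                    for row in self.hidden]
        outputs = {}
        for name in names:
            outputs[name] = sum(w * f for w, f in zip(self.heads[name], features))
            self.head_evaluations += 1
        return outputs


# Hypothetical channel names echoing the z-planes of FIG. 5.
CHANNELS = ["z514", "z516", "z518", "z520", "near_field_418"]
net = MultiChannelNet(n_inputs=3, n_hidden=8, channel_names=CHANNELS)
sample = [0.1, -0.4, 0.7]   # hypothetical encoded input point

all_outputs = net.forward(sample)                 # training-style pass: every channel
calls_all = net.head_evaluations
near_only = net.forward(sample, enabled=["near_field_418"])   # inference pass
calls_inference = net.head_evaluations - calls_all
print("heads evaluated:", calls_all, "versus", calls_inference)
```

The near-field value is identical in both passes, but the inference pass skips the heads for the intermediate z-planes, which is where the savings would come from in this toy setup.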

[0094] In some embodiments, the channel breakdown described for method 500 can be combined with any of the channel breakdowns described in reference to FIGS. 6-8.

[0095] FIG. 6 shows a flowchart of an example method 600 for simulating illumination physics of a thick mask used in a lithographic apparatus (e.g., lithographic apparatus 100 (FIG. 1)), consistent with embodiments of the present disclosure. In some embodiments, the channel breakdown described for method 600 is combinable with any of the channel breakdowns described in reference to FIGS. 5, 7, or 8 (an example of channel combinations is described in reference to FIG. 9). The physics information programmed into PINN 602 can be as described above in reference to PINN 502 (FIG. 5) (e.g., using equations 1 and 2 or Medvedev 1, 2, or 3).

[0096] In some embodiments, a PINN 602 comprises a plurality of solution channels. The channels can be output channels of PINN 602. A channel 604 can correspond to a portion of PINN 602 that simulates the electric field portion of illumination. A channel 606 can correspond to a portion of PINN 602 that simulates the magnetic field portion of illumination. Similar to PINN 502 (FIG. 5), PINN 602 can be trained across all channels to minimize total physical loss. After training, PINN 602 can be used for inferencing. PINN 602 can be executed to generate a solution of the electric field using channel 604 while channel 606 can be disabled, or vice versa. The deactivation of specific channels during inferencing prevents waste of computing resources on computations that are inconsequential to the specific solution of interest. This saves computational resources and processing time.

[0097] FIG. 7 shows a flowchart of an example method 700 for simulating illumination physics of a thick mask used in a lithographic apparatus (e.g., lithographic apparatus 100 (FIG. 1)), consistent with embodiments of the present disclosure. In some embodiments, the channel breakdown described for method 700 is combinable with any of the channel breakdowns described in reference to FIGS. 5, 6, or 8. The physics information programmed into PINN 702 can be as described above in reference to PINN 502 or 602 (FIGS. 5 and 6).

[0098] In some embodiments, a PINN 702 comprises a plurality of solution channels (e.g., channels 704, 706, 708, 710, 712, and 714). To better appreciate method 700, it is instructive to consider an example scenario of illumination incident on a mask 716. Mask 716 can comprise a support structure 718 and a patterned structure 720. It is to be appreciated that method 700 can be performed with any mask type (e.g., a transmissive mask or a reflective mask). A beam of illumination can be incident on an input side of mask 716. In an example, the beam of illumination can be represented by its electric field component E_in, which can be diagonally polarized such that it comprises both x- and y-polarization components, E_in,x (corresponds to inset 722) and E_in,y (corresponds to inset 724). Mask 716 can output a patterned beam of illumination, which can be represented by its electric field component E_out. After interaction with mask 716, the output electric field (of the patterned beam) can have x-, y-, and z-polarization components E_out,x, E_out,y, and E_out,z. In some cases, not all polarization information is needed for ascertaining a final image result (e.g., aerial image, resist image, etch image, or the like).

[0099] Hence, channels 704, 706, and 708 can correspond to portions of PINN 702 that simulate the E_in,x portion of the illumination. Channel 704 can correspond to a portion of PINN 702 that simulates the E_in,x and E_out,x portion of the illumination. Channel 706 can correspond to a portion of PINN 702 that simulates the E_in,x and E_out,y portion of the illumination. Channel 708 can correspond to a portion of PINN 702 that simulates the E_in,x and E_out,z portion of the illumination. Channels 710, 712, and 714 can correspond to portions of PINN 702 that simulate the E_in,y portion of the illumination. Channel 710 can correspond to a portion of PINN 702 that simulates the E_in,y and E_out,x portion of the illumination. Channel 712 can correspond to a portion of PINN 702 that simulates the E_in,y and E_out,y portion of the illumination. Channel 714 can correspond to a portion of PINN 702 that simulates the E_in,y and E_out,z portion of the illumination.

[0100] Similar to PINN 502 or 602 (FIGS. 5 and 6), PINN 702 can be trained across all channels to minimize total physical loss. The channels can be output channels of PINN 702. After training, PINN 702 can be used for inferencing. In an example scenario, it is of interest to ascertain values of E_out that have the same polarization as E_in. Since E_in was diagonally polarized (has E_in,x and E_in,y components), PINN 702 can be executed to generate a solution of the electric field using channel 704 (E_in,x and E_out,x solution; inset 722) and channel 712 (E_in,y and E_out,y solution; inset 724) while the other channels are disabled. The deactivation of specific channels during inferencing prevents waste of computing resources on computations that are inconsequential to the specific solution of interest. This saves computational resources and processing time.
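As a non-limiting illustration, the channel assignment described for PINN 702 can be written as a lookup from channel number to an (input polarization, output polarization) pair, from which the co-polarized channels of the example scenario are selected. The function name and data layout are assumptions made for this sketch.

```python
# Channel numbering as described for PINN 702:
# channel -> (input polarization component, output polarization component)
CHANNELS = {
    704: ("x", "x"),
    706: ("x", "y"),
    708: ("x", "z"),
    710: ("y", "x"),
    712: ("y", "y"),
    714: ("y", "z"),
}


def co_polarized_channels(input_components):
    """Return channels whose output polarization matches their input polarization."""
    return sorted(
        ch for ch, (pol_in, pol_out) in CHANNELS.items()
        if pol_in in input_components and pol_in == pol_out
    )


# A diagonally polarized input has both x and y components.
active = co_polarized_channels({"x", "y"})
disabled = sorted(set(CHANNELS) - set(active))
```

For the diagonally polarized example, the active list holds channels 704 and 712, and the remaining four channels are disabled.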

[0101] FIG. 8 shows a flowchart of an example method 800 for simulating illumination physics of a thick mask used in a lithographic apparatus (e.g., lithographic apparatus 100 (FIG. 1)), consistent with embodiments of the present disclosure. In some embodiments, the channel breakdown described for method 800 is combinable with any of the channel breakdowns described in reference to FIGS. 5-7. The physics information programmed into PINN 802 can be as described above in reference to PINN 502, 602, or 702 (FIGS. 5-7). The channels can be output channels of PINN 802.

[0102] In some embodiments, different z-planes of a mask (see FIG. 5) can be divided into smaller regions according to a given step size (e.g., pixels having a selected pixel size). In some instances, high precision can be important for a simulation to generate a practical and usable result. For high precision, some calculations can use fine pixel sizes (e.g., pixel_step_fine ~ 0.1 × λ, where λ is the wavelength of the source illumination). However, a wavelength of 13.5 nm (a non-limiting example) can be inconveniently small (e.g., pixel_step_fine ~ 1.35 nm (or 1.82 nm² in area)). An example full-chip field area can be on the order of 20 mm × 30 mm. Dividing such a full-chip area into fine pixels results in approximately 3.3 × 10¹⁴ pixels. The resulting computational burden can be prohibitive.

[0103] However, not all simulation computations benefit from the high accuracy afforded by a fine pixel grid. Some portions of simulation can be performed using coarser pixel sizes while also having minimal adverse impact to final simulation results (pixel_step_coarse ~ λ / (4 NA), where NA is the numerical aperture of the relevant optical device (e.g., NA of the projection optics)). In such scenarios, it is desirable to constrain a PINN to coarse pixel computations (e.g., via channel selection). In an example, a wavelength of λ = 13.5 nm can result in pixel_step_coarse ~ 8 nm (or 64 nm² in area). The much larger pixel step can reduce pixel count by approximately two orders of magnitude compared to the example of pixel_step_fine described above, thereby resulting in a proportional reduction of computational burden.
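As a non-limiting worked example, the pixel counts for the fine and coarse grids can be checked with a few lines of Python. The wavelength, field size, and coarse step are the example values given above; the variable names are assumptions made for this sketch.

```python
WAVELENGTH_NM = 13.5                 # example wavelength from the text
FIELD_AREA_NM2 = 20e6 * 30e6         # 20 mm x 30 mm, with 1 mm = 1e6 nm

fine_step_nm = 0.1 * WAVELENGTH_NM   # pixel_step_fine ~ 0.1 * lambda
coarse_step_nm = 8.0                 # pixel_step_coarse example value from the text

# Number of square pixels needed to tile the full-chip field.
fine_pixels = FIELD_AREA_NM2 / fine_step_nm ** 2
coarse_pixels = FIELD_AREA_NM2 / coarse_step_nm ** 2
reduction = fine_pixels / coarse_pixels

print(f"fine grid:   {fine_pixels:.2e} pixels")
print(f"coarse grid: {coarse_pixels:.2e} pixels")
print(f"reduction:   {reduction:.1f}x")
```

For these example values the fine grid needs about 3.3 × 10¹⁴ pixels and the coarse grid about 9.4 × 10¹² pixels, which is the area ratio 64 / 1.82, or a factor of about 35.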

[0104] Pixel_step_coarse and pixel_step_fine can relate to spatial frequencies that are characteristic of design layout parameters. For example, sharp corners and small CD are characterized by high spatial frequencies, while large smooth structures with little to no nanoscale variation are characterized by low spatial frequencies. Regions of a design layout that have high spatial frequency features can benefit from the higher accuracy of pixel_step_fine. However, there can be little to no benefit of using such fine pixels for regions of low spatial frequency; thus, such regions are better suited for pixel_step_coarse. Such regions or portions of a design layout can be clips, as described above in reference to FIG. 3.

[0105] A PINN 802 can comprise a plurality of solution channels. A channel 804 can correspond to a portion of PINN 802 that simulates illumination through a mask while using a coarse pixel size (e.g., for large structures with low spatial frequencies). A channel 806 can correspond to a portion of PINN 802 that simulates illumination through a mask while using a fine pixel size (e.g., for fine nanoscale structures or sharp corners with high spatial frequencies). In this manner, low and high spatial frequency components can be separated into separate channels.

[0106] An example of how channels can work is given in terms of spatial frequencies and frequency filters. The raw output from the PINN 802 can be a rectangular grid. In the coarse selection (channel 804), the output raw data (rectangular grid) can be projected onto a 2D array of pixels (e.g., using pixel_step_coarse). Then, a low-pass filter 808 can be applied to the raw output grid. The white portion of low-pass filter 808 indicates what is allowed to pass through while the black portion indicates rejected parts. Low-pass filter 808 rejects high-order frequencies while allowing the frequencies within the pass band. For high frequency components, the inverse process can be performed. After projecting the output raw data onto a 2D array of pixels (e.g., using pixel_step_fine), high frequencies can be passed through a high-pass filter 810. Low-pass filter 808 and high-pass filter 810 can be programmed such that their sum is unity. A result is that channel 804 can contribute to a low frequency portion of the final simulation result. Conversely, channel 806 can contribute to a high frequency portion of the final simulation result.
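As a non-limiting sketch of complementary filtering, the following Python example splits a short one-dimensional signal into low- and high-frequency parts using two frequency-domain masks that sum to unity, so that the two parts add back to the original. The use of a 1D signal instead of a 2D pixel array, the hard-edged masks, the naive DFT, and the cutoff bin are simplifying assumptions made for this sketch.

```python
import cmath


def dft(signal):
    """Naive discrete Fourier transform (adequate for short signals)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft()."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def split_bands(signal, cutoff_bin):
    """Split a signal into low- and high-frequency parts with complementary masks."""
    n = len(signal)
    spectrum = dft(signal)
    # Bins k and n - k carry the same |frequency|; the mask is 1 inside the pass band.
    low_mask = [1.0 if min(k, n - k) <= cutoff_bin else 0.0 for k in range(n)]
    high_mask = [1.0 - m for m in low_mask]   # the two masks sum to unity
    low = idft([s * m for s, m in zip(spectrum, low_mask)])
    high = idft([s * m for s, m in zip(spectrum, high_mask)])
    return [v.real for v in low], [v.real for v in high]


# Hypothetical raw output row: a slow ramp plus a pixel-scale alternation.
raw = [0.1 * t + (0.5 if t % 2 else -0.5) for t in range(16)]
low_part, high_part = split_bands(raw, cutoff_bin=3)
recombined = [lo + hi for lo, hi in zip(low_part, high_part)]
```

Because the masks sum to unity at every frequency bin, the recombined signal matches the raw signal up to floating-point error, which is the property that lets each channel contribute its own portion of the final result.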

[0107] Similar to PINN 502, 602, or 702 (FIGS. 5-7), PINN 802 can be trained across all channels to minimize total physical loss. After training, PINN 802 can be used for inferencing. PINN 802 can be executed to generate a solution of the EM field using channel 804 while channel 806 can be kept in an inactive state, or vice versa. The deactivation of specific channels during inferencing prevents waste of computing resources on computations that are inconsequential to the specific solution of interest. This saves computational resources and processing time.

[0108] Spatial frequencies are also related to the NA of an optical system (such as the projection optics of lithographic apparatus 100 (FIG. 1)). An NA-based cutoff frequency f_cut of an optical system is given by f_cut = 2NA / λ. In some embodiments, the determination of whether to use channel 804 (e.g., pixel_step_coarse) or channel 806 (e.g., pixel_step_fine) can be based on whether a portion of a design layout to be simulated has a spatial frequency range that is below f_cut (or above f_cut).
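As a non-limiting illustration, the cutoff relation f_cut = 2NA / λ and the resulting choice between channel 804 and channel 806 can be sketched as follows. The NA value of 0.33, the example pitches, and the approximation of a region's highest spatial frequency as 1 / pitch are assumptions made for this sketch and are not taken from the disclosure.

```python
def cutoff_frequency(na, wavelength_nm):
    """NA-based cutoff frequency f_cut = 2 * NA / lambda, in cycles per nm."""
    return 2.0 * na / wavelength_nm


def choose_channel(max_spatial_freq, f_cut):
    """Pick the coarse channel (804) below f_cut, otherwise the fine channel (806)."""
    return 804 if max_spatial_freq < f_cut else 806


# Assumed example values: NA = 0.33 and lambda = 13.5 nm.
f_cut = cutoff_frequency(na=0.33, wavelength_nm=13.5)

# Hypothetical layout regions, each characterized by its highest spatial
# frequency, approximated here as 1 / pitch.
smooth_region = choose_channel(1.0 / 200.0, f_cut)   # 200 nm pitch
dense_region = choose_channel(1.0 / 16.0, f_cut)     # 16 nm pitch
```

With these assumed values f_cut is roughly 0.049 cycles per nm, so the 200 nm pitch region maps to channel 804 and the 16 nm pitch region maps to channel 806.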

[0109] FIG. 9 shows a table of a channel arrangement for a PINN 900, consistent with embodiments of the present disclosure. In some embodiments, the physics information programmed into PINN 900 can be as described above in reference to PINN 502, 602, 702, or 802 (FIGS. 5-8). Each channel in PINN 900 can be associated with a set of physical characteristics of an interaction between an EM field (illumination) and a patterning device (e.g., patterning device 108 (FIG. 1), mask 400 (FIGS. 4 and 5), or mask 716 (FIG. 7)). A set of physical characteristics can comprise one or more physical characteristics. A first set of physical characteristics can be different from a second set of physical characteristics if a physical characteristic in the first set is not present in the second set. For example, channels 902 and 904 have different sets of physical characteristics since they differ in the field component characteristic even though they have other common constituents (e.g., z-plane and spatial frequency characteristics).

[0110] In FIG. 5, a physical characteristic of the interaction between an EM field and a mask is directed to a z-plane of interaction. In FIG. 6, a physical characteristic of the interaction between an EM field and a mask is directed to the electric field and magnetic field portions of the EM field. In FIG. 7, a physical characteristic of the interaction between an EM field and a mask is directed to polarization of the EM field. In FIG. 8, a physical characteristic of the interaction between an EM field and a mask is directed to a spatial frequency of a design layout. The present disclosure is not limited to the above-noted examples of physical characteristics of illumination-mask interaction. It is to be appreciated that other physical characteristics of illumination-mask interaction can be used in embodiments described herein.

[0111] The channel arrangement of PINN 900 can comprise channels 902, 904, 906, 908, 910, 912, 914, 916, or more. More or fewer channels can be implemented (e.g., two or more channels). Channel 902 can be associated with near-field plane 418 (FIGS. 4 and 5), a spatial frequency below f_cut, and E_x and E_y for the diagonal-polarization component of the electric field. The parameters of channels 904, 906, 908, 910, 912, 914, and 916 are as shown in FIG. 9. The example of FIG. 9 is directed to a scenario in which channel 902 is selected during inferencing using the PINN while channels 904, 906, 908, 910, 912, 914, and 916 are disabled during inferencing.

[0112] In some embodiments, executing PINN 900 for inferencing (e.g., uses fewer channels) can be faster than the training component of PINN 900 (e.g., uses more channels). After training, PINN 900 can be used multiple times for inferencing as part of simulations according to methods 200 or 300 (FIGS. 2 and 3). By disabling channels 904, 906, 908, 910, 912, 914, and 916 during inferencing, the savings in computational resources and processing time accumulate for every execution of PINN 900. Furthermore, selection of active channels can be performed in any permutation (e.g., two or more channels active during inference with the remaining channels disabled). Selection of a given channel can be based on whether a given physical characteristic helps to prevent an unusable simulation outcome. For example, if a clip involved in the simulation has a hotspot, then channel selection can be narrowed down to channels that use a spatial frequency above f_cut. Further narrowing of channel selection can be made by assessing which polarization, field component, or z-plane matters to the simulation outcome.

[0113] In some embodiments, the training of PINN 900 is performed with a first number of active channels for the training. Inference can then be performed using a second number of active channels while unselected channels are disabled during inference. The first number of active channels for training can be greater than the second number of channels for inference (or the second number of channels is a smaller subset of the first number of channels).

[0114] FIG. 10 shows a flowchart of an example method 1000 for simulating a lithography process using mask 400 (FIGS. 4 and 5), consistent with embodiments of the present disclosure. In some embodiments, method 1000 is performed using devices and functions described in reference to FIGS. 1-9, 11, and 12. At operation 1002, a PINN comprising multiple channels can be trained (e.g., PINNs 502, 602, 702, 802, or 900 (FIGS. 5-9)). A first channel of the multiple channels can be associated with a first set of physical characteristics of an interaction between an EM field and a thick mask used in the lithography process (e.g., channel 902 (FIG. 9) associated with near-field plane 418 (FIGS. 4 and 5)). A second channel of the multiple channels can be associated with a second set of physical characteristics of the interaction, the second set being different from the first set (e.g., channel 910 (FIG. 9) associated with z-plane 514 (FIG. 5) and not near-field plane 418 (FIGS. 4 and 5)).

[0115] At operation 1004, the PINN can be executed using the first channel to generate a near-field image representation of the EM field near the thick mask with the second channel disabled (e.g., channel 902 (FIG. 9) is <Active? = yes> during inference while channel 910 (FIG. 9) is <Active? = no> during inference).
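As a non-limiting sketch, a channel table with an "Active?" flag can be represented as a list of records, with one helper that marks the selected channels active and all others disabled. Only the attributes stated in the text for channels 902 and 910 are filled in; attributes that are only given in FIG. 9 are left as None, and the record layout and function names are assumptions made for this sketch.

```python
def make_entry(channel_id, z_plane=None, spatial_frequency=None, field_components=None):
    """Build one table row; attributes not stated in the text stay None."""
    return {
        "channel_id": channel_id,
        "z_plane": z_plane,
        "spatial_frequency": spatial_frequency,
        "field_components": field_components,
        "active": False,   # the "Active?" flag used during inference
    }


table = [
    make_entry(902, z_plane="near-field plane 418",
               spatial_frequency="below f_cut", field_components="E_x, E_y"),
    make_entry(910, z_plane="z-plane 514"),
]


def set_active(entries, selected_ids):
    """Mark selected channels active and every other channel disabled."""
    for entry in entries:
        entry["active"] = entry["channel_id"] in selected_ids
    return [e["channel_id"] for e in entries if e["active"]]


# Operation 1004 in the example: channel 902 active, channel 910 disabled.
active_ids = set_active(table, {902})
```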

[0116] Some embodiments of the present disclosure can comprise additional or alternative method operations based on the description of FIGS. 1-9. Example operations can comprise generating an aerial image representation based on the near-field image representation, training a PINN with all channels active, training a PINN using a first number of channels and executing the PINN using a second number of channels that is smaller than the first number, or the like.

[0117] FIG. 11 shows a block diagram of an example server 1100, consistent with some embodiments of the disclosure. In some embodiments, server 1100 can comprise processor 1102. When processor 1102 executes instructions described herein, server 1100 can become a specialized machine. It is appreciated that server 1100 can be involved with computationally intensive tasks, such as training a PINN.

[0118] Processor 1102 can be any type of circuitry capable of manipulating or processing information. For example, processor 1102 can comprise any combination of any number of a central processing unit (“CPU”), a graphics processing unit (“GPU”), a neural processing unit (“NPU”), a microcontroller unit (“MCU”), an optical processor, a programmable logic controller, a microcontroller, a microprocessor, a digital signal processor, an intellectual property (IP) core, a Programmable Logic Array (PLA), a Programmable Array Logic (PAL), a Generic Array Logic (GAL), a Complex Programmable Logic Device (CPLD), a Field-Programmable Gate Array (FPGA), a System On Chip (SoC), an Application-Specific Integrated Circuit (ASIC), or the like. In some embodiments, processor 1102 can also be a set of processors grouped as a single logical component. Processor 1102 can comprise multiple processors, including processor 1102a, processor 1102b, and so on up to processor 1102n.

[0119] Server 1100 can further comprise memory 1104 configured to store data (e.g., a set of instructions, computer codes, intermediate data, or the like). The stored data can comprise program instructions and data for processing. Processor 1102 can access the program instructions and data for processing (e.g., via bus 1110), and execute the program instructions to perform an operation or manipulation on the data for processing. Memory 1104 can comprise a high-speed random-access storage device or a non-volatile storage device. In some embodiments, memory 1104 comprises any combination of any number of non-transitory computer-readable media. Memory 1104 can also be a group of memories grouped as a single logical component.

[0120] Bus 1110 can be a communication device that transfers data between components inside server 1100, such as an internal bus (e.g., a CPU-memory bus), an external bus (e.g., a universal serial bus port, a peripheral component interconnect express port), or the like.

[0121] Processor 1102 and other data processing circuits can collectively be referred to as a “data processing circuit” in the present disclosure (intended to reduce ambiguity and not to limit). The data processing circuit can be implemented entirely as hardware, or as a combination of software, hardware, or firmware. In addition, the data processing circuit can be a single independent module or can be combined entirely or partially into any other component of server 1100.

[0122] Server 1100 can further comprise network interface 1106 to provide wired or wireless communication with a network (e.g., the Internet, an intranet, a local area network, a mobile communications network, or the like). In some embodiments, network interface 1106 can comprise any combination of any number of a network interface controller (NIC), a radio frequency (RF) module, a transponder, a transceiver, a modem, a router, a gateway, a wired network adapter, a wireless network adapter, a Bluetooth adapter, an infrared adapter, a near-field communication (“NFC”) adapter, a cellular network chip, or the like.

[0123] In some embodiments, optionally, server 1100 can further comprise a peripheral interface 1108 to provide a connection to one or more peripheral devices. The peripheral device(s) can include, but are not limited to, a cursor control device (e.g., a mouse, a touchpad, or a touchscreen), a keyboard, a display (e.g., a cathode-ray tube display, a liquid crystal display, or a light-emitting diode display), a video input device (e.g., a camera or an input interface coupled to a video archive), or the like.

[0124] Consistent with some embodiments of this disclosure, the computer-implemented method of using a PINN to computationally simulate a lithography process can comprise training the PINN using Maxwell’s equations. In some embodiments, the PINN can be trained using server 1100 in instances where server 1100 can provide higher computational capacity compared to a smaller computational system, such as a personal computer. For example, the PINN is trained at a first computing system (e.g., server 1100) and the execution of the PINN for inferencing can be performed at a second computing system (e.g., a personal computer).

[0125] A non-transitory computer-readable medium can store instructions for a processor of a controller for simulating a lithography process according to the exemplary flowcharts of FIGS. 2, 3, 5-8, and 10, consistent with embodiments in the present disclosure. For example, the instructions stored in the non-transitory computer-readable medium can be executed by the circuitry of the controller of, or corresponding to, a lithography system for performing methods 200, 300, 500, 600, 700, 800, or 1000 in part or entirely. For example, as stated above, server 1100 can be used to perform the computationally intensive portion of the methods (e.g., training a PINN), while a separate system (e.g., a system similar to server 1100) having a separate controller can be involved with the inferencing portion of the PINN. A non-exhaustive list of common forms of non-transitory media includes, for example, a floppy disk, a flexible disk, a hard disk, a solid-state drive, a flash drive, a secure digital (SD) card, a memory stick, a compact flash (CF) card, magnetic tape, or any other magnetic data storage medium, a Compact Disc Read-Only Memory (CD-ROM), any other optical data storage medium, any physical medium with patterns of holes, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), a Field-Programmable Gate Array (FPGA), an Erasable Programmable Read-Only Memory (EPROM), a FLASH-EPROM or any other flash memory, Non-Volatile Random Access Memory (NVRAM), a cache, a register, any other memory chip or cartridge, and networked versions of the same.

[0126] FIG. 12 shows a flowchart of an example method 1200 for training a PINN 1202, consistent with embodiments of the present disclosure. In some embodiments, PINNs 502, 602, 702, or 802 (FIGS. 5-8) can be trained as described in reference to PINN 1202. Method 1200 can use a convolutional neural network (CNN) 1204. CNN 1204 can be a suitable neural network for image segmentation (e.g., a U-Net or fully convolutional network). CNN 1204 can comprise convolutional and pooling layers 1206-1 to 1206-n (e.g., two or more convolutional and pooling layers). CNN 1204 can have adjustable weights that are updated based on computations using Maxwell’s equations.

[0127] Method 1200 is described in a non-limiting manner using electric fields for purposes of example. It is appreciated that method 1200 can be implemented similarly with magnetic fields.

[0128] Mask data 1208 can be used as input. Mask data 1208 can comprise detailed information of a mask, such as material properties of the layer stack, the shape and EM properties of absorber material, or the like. The map of absorber shapes is the mask pattern representation. The shape of the absorbers indicates which areas of the mask stop illumination from reaching the next step (the resist). Using an input E-field E_i, CNN 1204 can predict a solution of EM field scattering by the mask’s material distribution and properties. The predicted solution is represented as scattered EM field data 1214. Scattered EM field data 1214 can comprise one or more properties related to EM fields (e.g., magnitude, intensity, polarization, wave information, z-plane position dependence, electric field, magnetic field, grid coarseness / fineness, or the like). CNN 1204 can perform computations on a number of different channels 1212-1 to 1212-n. Each of the one or more properties of scattered EM field data 1214 can be assigned to a corresponding channel.

[0129] An EM solver 1210 can be applied to scattered E-field data 1214 (the scattered field is denoted as E_s). EM solver 1210 can comprise a boundary condition (BC) operation that is applicable to scattered EM field data 1214. The BC operation can generate boundary-defined scattered E-field data 1216. EM solver 1210 can also comprise a Laplacian (∇²) operation that is applicable to boundary-defined scattered EM field data 1216. The ∇² operation can generate derivative term(s) for equation implementation 1220. Equation implementation 1220 can be any form of equation that is suitable for solving scattering behavior (e.g., Maxwell’s equations). The total E-field E_T can represent the combined scattered E-field E_s and incident E-field E_i. Properties of matter (e.g., refractive index n, electric field constant k_0, permittivity, permeability, or the like) can be provided to equation implementation 1220 (e.g., taken from mask data 1208).
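As a non-limiting sketch of how a Laplacian term and material properties can be combined into an equation check, the following Python example evaluates a one-dimensional scalar Helmholtz equation, d²E/dz² + (k_0 n)² E = 0, on the total field E_T = E_s + E_i using finite differences. The scalar 1D Helmholtz form is a simplification substituted here for the full Maxwell's equations; the interpretation of k_0 as the free-space wavenumber, the grid, the homogeneous medium, and the two candidate scattered fields are all assumptions made for this sketch.

```python
import cmath
import math


def helmholtz_residual(e_total, n_index, k0, step):
    """Left-hand side of d2E/dz2 + (k0 * n)^2 * E = 0 at interior grid points."""
    residual = []
    for i in range(1, len(e_total) - 1):
        # Second-order central difference stands in for the Laplacian operation.
        laplacian = (e_total[i + 1] - 2.0 * e_total[i] + e_total[i - 1]) / step ** 2
        residual.append(laplacian + (k0 * n_index[i]) ** 2 * e_total[i])
    return residual


def mean_squared(values):
    """Scalar summary of how far the field is from satisfying the equation."""
    return sum(abs(v) ** 2 for v in values) / len(values)


wavelength = 13.5                     # nm, example wavelength
k0 = 2.0 * math.pi / wavelength       # assumed meaning of k_0: free-space wavenumber
step = 0.05                           # nm, grid spacing chosen for the demo
z = [i * step for i in range(400)]
n_index = [1.0] * len(z)              # homogeneous medium, purely for illustration

e_incident = [cmath.exp(1j * k0 * zi) for zi in z]
e_scattered_good = [0.0] * len(z)     # nothing scatters in a homogeneous medium
e_scattered_bad = [0.2 * cmath.exp(3j * k0 * zi) for zi in z]   # violates the equation

good = mean_squared(helmholtz_residual(
    [ei + es for ei, es in zip(e_incident, e_scattered_good)], n_index, k0, step))
bad = mean_squared(helmholtz_residual(
    [ei + es for ei, es in zip(e_incident, e_scattered_bad)], n_index, k0, step))
```

The physically consistent candidate leaves only a small discretization error, whereas the inconsistent candidate produces a value that is many orders of magnitude larger.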

[0130] An output of equation implementation 1220 can be a quantity that indicates a deviation from an expected result (e.g., a non-zero result is a deviation). The deviation can be designated as a residual 1222. The residual 1222 can form the basis of a loss function 1224 (e.g., equation 2). Using loss function 1224, residual 1222 can be backpropagated to update the network weights of CNN 1204. The process can be iterated until loss function 1224 is minimized. The backpropagation and loss function minimization process can produce PINN 1202. PINN 1202 can comprise channels 1212-1 to 1212-n, which are optimized according to method 1200. The channels can be output channels of PINN 1202. PINN 1202 can be used as described above in reference to FIGS. 5-11.

Embodiments of the present disclosure can be further described by the following clauses.

1. A non-transitory computer-readable medium that stores a set of instructions that is executable by at least one processor of an apparatus to cause the apparatus to perform operations for simulating a lithography process, the operations comprising: obtaining a physics-informed neural network (PINN) comprising multiple channels, wherein: a first channel of the multiple channels is associated with a first set of physical characteristics of an interaction between an electromagnetic (EM) field and a thick mask used in the lithography process; and a second channel of the multiple channels is associated with a second set of physical characteristics of the interaction that is different from the first set of physical characteristics; and executing the PINN using the first channel to generate a near-field image representation of the EM field for the thick mask with the second channel disabled.

2. The non-transitory computer-readable medium of clause 1, wherein the operations further comprise: generating an aerial image representation based on the near-field image representation.

3. The non-transitory computer-readable medium of clause 1, wherein the first set of physical characteristics comprises a position of a z-plane that is parallel to a near-field plane of the thick mask.

4. The non-transitory computer-readable medium of clause 3, wherein: the position is of the near-field plane; and executing the PINN comprises selecting the first channel to perform an inference associated with the position of the near-field plane.

5. The non-transitory computer-readable medium of clause 1, wherein: the first set of physical characteristics comprises an electric field portion of the EM field; and the second set of physical characteristics comprises a magnetic field portion of the EM field, a polarization of the EM field, or a spatial frequency.

6. The non-transitory computer-readable medium of clause 1, wherein: the first set of physical characteristics comprises a magnetic field portion of the EM field; and the second set of physical characteristics comprises an electric field portion of the EM field, a polarization of the EM field, or a spatial frequency.

7. The non-transitory computer-readable medium of clause 1, wherein: the first set of physical characteristics comprises a polarization of the EM field; and the second set of physical characteristics comprises an electric field portion of the EM field, a magnetic field portion of the EM field, or a spatial frequency.

8. The non-transitory computer-readable medium of clause 1, wherein: the first set of physical characteristics comprises a spatial frequency; and the second set of physical characteristics comprises an electric field portion of the EM field, a magnetic field portion of the EM field, or a polarization of the EM field.

9. The non-transitory computer-readable medium of clause 1, wherein: the first set of physical characteristics comprises a spatial frequency; and executing the PINN comprises: applying a low-pass filter to output data from the PINN to reject a portion of the output data associated with spatial frequencies above a numerical aperture (NA)-based cutoff frequency; or applying a high-pass filter to output data from the PINN to reject a portion of the output data associated with spatial frequencies below the NA-based cutoff frequency.

10. The non-transitory computer-readable medium of clause 1, wherein the PINN was trained using all channels of the multiple channels.

11. The non-transitory computer-readable medium of clause 1, wherein the multiple channels are multiple output channels.

12. The non-transitory computer-readable medium of clause 1, wherein: the PINN was trained by activating a first number of channels of the multiple channels; executing the PINN comprises performing inferencing using a second number of channels of the multiple channels; and the second number of channels for the inferencing is a smaller subset of the first number of channels.

13. The non-transitory computer-readable medium of clause 1, wherein the PINN was trained based on applying a first aspect of Maxwell’s equations to the first channel and a second aspect of Maxwell’s equations to the second channel.

14. The non-transitory computer-readable medium of clause 1, wherein the PINN was trained using a convolutional neural network using weight adjustments based on a residual from Maxwell’s equations.

15. The non-transitory computer-readable medium of clause 14, wherein the PINN was trained by minimizing a loss function, the loss function being based on the residual.

16. The non-transitory computer-readable medium of clause 1, wherein the PINN was trained using material distribution data of the thick mask as an input to Maxwell’s equations.

17. The non-transitory computer-readable medium of clause 1, wherein the operations further comprise training the PINN using all channels of the multiple channels.

18. The non-transitory computer-readable medium of clause 17, wherein: training the PINN comprises activating a first number of channels of the multiple channels in a backpropagation process; the multiple channels are configured to be used for inferencing using a second number of channels of the multiple channels; and the second number of channels for the inferencing is a smaller subset of the first number of channels for the training.

19. The non-transitory computer-readable medium of clause 17, wherein training the PINN is based on applying a first aspect of Maxwell’s equations to the first channel and a second aspect of Maxwell’s equations to the second channel.

20. The non-transitory computer-readable medium of clause 17, wherein training the PINN comprises adjusting weights of a convolutional neural network based on a residual from Maxwell’s equations.

21. The non-transitory computer-readable medium of clause 20, wherein training the PINN comprises minimizing a loss function, the loss function being based on the residual.

22. The non-transitory computer-readable medium of clause 17, wherein training the PINN comprises using material distribution data of the thick mask as an input to Maxwell’s equations.

23. A method comprising: training a physics-informed neural network (PINN) comprising multiple channels for simulating a lithography process, wherein: a first channel of the multiple channels is associated with a first set of physical characteristics of an interaction between an electromagnetic (EM) field and a thick mask used in the lithography process; and a second channel of the multiple channels is associated with a second set of physical characteristics of the interaction that is different from the first set of physical characteristics; and configuring the multiple channels to be independently toggleable for inferencing, wherein the first channel is configured to generate a near-field image representation of the EM field for the thick mask with the second channel disabled.

24. The method of clause 23, wherein training the PINN comprises using all channels of the multiple channels for a backpropagation process.

25. The method of clause 23, wherein: training the PINN comprises activating a first number of channels of the multiple channels in a backpropagation process; the multiple channels are configured to be used for inferencing using a second number of channels of the multiple channels; and the second number of channels for the inferencing is a smaller subset of the first number of channels for the training.

26. The method of clause 23, wherein training the PINN is based on applying a first aspect of Maxwell’s equations to the first channel and a second aspect of Maxwell’s equations to the second channel.

27. The method of clause 23, wherein training the PINN comprises adjusting weights of a convolutional neural network based on a residual from Maxwell’s equations.

28. The method of clause 27, wherein training the PINN comprises minimizing a loss function, the loss function being based on the residual.

29. The method of clause 23, wherein training the PINN comprises using material distribution data of the thick mask as an input to Maxwell’s equations.

30. The method of clause 23, wherein the first set of physical characteristics comprises a position of a z-plane that is parallel to a near-field plane of the thick mask.

31. The method of clause 30, wherein: the position is of the near-field plane; and the multiple channels are configured to be used for inferencing by disabling the second channel and selecting the first channel to perform an inference associated with the position of the near-field plane.

32. The method of clause 23, wherein the first set of physical characteristics comprises an electric field portion of the EM field.

33. The method of clause 23, wherein the first set of physical characteristics comprises a magnetic field portion of the EM field.

34. The method of clause 23, wherein the first set of physical characteristics comprises a polarization of the EM field.

35. The method of clause 23, wherein the first set of physical characteristics comprises a spatial frequency.

36. The method of clause 23, wherein: the first set of physical characteristics comprises a spatial frequency; and the multiple channels are configured to be used for inferencing by: applying a low-pass filter to output data from the PINN to reject a portion of the output data associated with spatial frequencies above a numerical aperture (NA)-based cutoff frequency; or applying a high-pass filter to output data from the PINN to reject a portion of the output data associated with spatial frequencies below the NA-based cutoff frequency.

37. The method of clause 23, wherein the first channel is further configured to generate the near-field image representation comprising information indicative of an aerial image of the photolithography process.

38. 
A system comprising:one or more processors;one or more memory devices configured to store a set of instructions that is executable by the one or more processors to cause the system to perform operations for simulating a lithography process, the operations comprising:obtaining a physics-informed neural network (PINN) comprising multiple channels, wherein:a first channel of the multiple channels is associated with a first set of physical characteristics of an interaction between an electromagnetic (EM) field and a thick mask used in the lithography process; anda second channel of the multiple channels is associated with a second set of physical characteristics of the interaction that is different from the first set of physical characteristics; and executing the PINN using the first channel to generate a near-field image representation of the EM field for the thick mask with the second channel disabled.39. The system of clause 38, wherein the operations further comprise:generating an aerial image representation based on the near-field image representation. 40. The system of clause 38, wherein the first set of physical characteristics comprises a position of a z -plane that is parallel to a near-field plane of the thick mask.41. The system of clause 40, wherein:the position is of the near-field plane; andexecuting the PINN comprises selecting the first channel to perform an inference associated with the position of the near-field plane.42. The system of clause 38, wherein the first set of physical characteristics comprises an electric field portion of the EM field.43. The system of clause 38, wherein the first set of physical characteristics comprises a magnetic field portion of the EM field.44. The system of clause 38, wherein the first set of physical characteristics comprises a polarization of the EM field.45. The system of clause 38, wherein the first set of physical characteristics comprises a spatial frequency.46. 
The system of clause 38, wherein:the first set of physical characteristics comprises a spatial frequency; andexecuting the PINN comprises:applying a low-pass filter to output data from the PINN to reject a portion of the output data associated with spatial frequencies above a numerical aperture (NA)-based cutoff frequency; orapplying a high-pass filter to output data from the PINN to reject a portion of the output data associated with spatial frequencies below the NA-based cutoff frequency.47. The system of clause 38, wherein the PINN was trained using all channels of the multiple channels.48. The system of clause 38, wherein:the PINN was trained by activating a first number of channels of the multiple channels; executing the PINN comprises performing inferencing using a second number of channels of the multiple channels; andthe second number of channels for the inferencing is a smaller subset of the first number of channels.49. The system of clause 38, wherein the PINN war trained based on applying a first aspect of Maxwell’ s equations to the first channel and a second aspect of Maxwell’ s equations to the second channel.50. The system of clause 38, wherein the PINN was trained using a convolutional neural network using weight adjustments based on a residual from Maxwell’s equations.51. The system of clause 50, wherein the PINN was trained by minimizing a loss function, the loss function being based on the residual.52. The system of clause 38, wherein the PINN was trained using material distribution data of the thick mask as an input to Maxwell’s equations.
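
By way of a non-limiting illustration, the low-pass and high-pass filtering recited in clauses 9, 36, and 46 may be sketched as a Fourier-domain mask applied to a two-dimensional field. The function name `na_filter`, the choice of cutoff as NA divided by wavelength, and the numerical values (10 nm pixel, 13.5 nm wavelength, NA of 0.33) are assumptions made only for this sketch and are not taken from the disclosure; the cutoff appropriate to a given system may differ, for example by a magnification factor.

```python
import numpy as np

def na_filter(field, pixel_nm, wavelength_nm, na, mode="low"):
    """Reject spatial frequencies above (mode="low") or below (mode="high") the cutoff."""
    ny, nx = field.shape
    fx = np.fft.fftfreq(nx, d=pixel_nm)  # cycles per nm along x
    fy = np.fft.fftfreq(ny, d=pixel_nm)  # cycles per nm along y
    radial = np.hypot(*np.meshgrid(fx, fy))  # radial spatial frequency per FFT bin
    cutoff = na / wavelength_nm  # ASSUMPTION: one possible NA-based cutoff
    keep = radial <= cutoff if mode == "low" else radial > cutoff
    return np.fft.ifft2(np.fft.fft2(field) * keep)

# Toy stand-in for PINN output: a constant plus one slow and one fast cosine along x.
x = np.arange(64)
slow = np.cos(2 * np.pi * 4 * x / 64)   # 0.00625 cycles/nm, below the cutoff
fast = np.cos(2 * np.pi * 20 * x / 64)  # 0.03125 cycles/nm, above the cutoff
field = np.tile(1.0 + slow + fast, (64, 1))

low = na_filter(field, pixel_nm=10.0, wavelength_nm=13.5, na=0.33, mode="low")
high = na_filter(field, pixel_nm=10.0, wavelength_nm=13.5, na=0.33, mode="high")
```

In this sketch the two masks are complementary, so the low-pass and high-pass results sum back to the unfiltered field; the low-pass result retains the constant and the slow cosine, and the high-pass result retains only the fast cosine.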

[0131] It will be appreciated that the embodiments of the present disclosure are not limited to the exact construction that has been described above and illustrated in the accompanying drawings and that various modifications and changes can be made without departing from the scope thereof.
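
As a further non-limiting illustration, inferencing with a smaller subset of the output channels used in training (clauses 11, 12, 18, 25, and 48) may be sketched as follows. The output head modeled here as a 1x1 convolution, the channel labels, the array sizes, and the helper `run_head` are all hypothetical and chosen only for this sketch; the random arrays stand in for trained weights and for the output of a shared network body.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical labels for four output channels (illustrative only).
CHANNELS = ["E_near_field", "H_near_field", "E_z1", "H_z1"]
F, H, W = 8, 16, 16  # feature maps, image height, image width

# ASSUMPTION: the output head is a 1x1 convolution, i.e. one weight row per channel.
weights = rng.standard_normal((len(CHANNELS), F))
features = rng.standard_normal((F, H, W))  # stand-in for the shared network body output

def run_head(features, weights, enabled):
    """Evaluate only the enabled output channels; disabled channels are never computed."""
    out = {}
    for name in enabled:
        row = weights[CHANNELS.index(name)]
        out[name] = np.tensordot(row, features, axes=1)  # (F,) x (F, H, W) -> (H, W)
    return out

full = run_head(features, weights, CHANNELS)            # all channels active, as in training
subset = run_head(features, weights, ["E_near_field"])  # inference with one channel enabled
```

Because each output channel in this sketch depends only on its own weight row and the shared features, the enabled channel produces the same image whether or not the other channels are evaluated, while the work for the disabled channels is skipped.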

Claims

1. A non-transitory computer-readable medium that stores a set of instructions that is executable by at least one processor of an apparatus to cause the apparatus to perform operations for simulating a lithography process, the operations comprising: obtaining a physics-informed neural network (PINN) comprising multiple channels, wherein: a first channel of the multiple channels is associated with a first set of physical characteristics of an interaction between an electromagnetic (EM) field and a thick mask used in the lithography process; and a second channel of the multiple channels is associated with a second set of physical characteristics of the interaction that is different from the first set of physical characteristics; and executing the PINN using the first channel to generate a near-field image representation of the EM field for the thick mask with the second channel disabled.

2. The non-transitory computer-readable medium of claim 1, wherein the operations further comprise: generating an aerial image representation based on the near-field image representation.

3. The non-transitory computer-readable medium of claim 1, wherein the first set of physical characteristics comprises a position of a z-plane that is parallel to a near-field plane of the thick mask.

4. The non-transitory computer-readable medium of claim 3, wherein: the position is of the near-field plane; and executing the PINN comprises selecting the first channel to perform an inference associated with the position of the near-field plane.

5. The non-transitory computer-readable medium of claim 1, wherein: the first set of physical characteristics comprises an electric field portion of the EM field; and the second set of physical characteristics comprises a magnetic field portion of the EM field, a polarization of the EM field, or a spatial frequency.

6. The non-transitory computer-readable medium of claim 1, wherein: the first set of physical characteristics comprises a magnetic field portion of the EM field; and the second set of physical characteristics comprises an electric field portion of the EM field, a polarization of the EM field, or a spatial frequency.

7. The non-transitory computer-readable medium of claim 1, wherein: the first set of physical characteristics comprises a polarization of the EM field; and the second set of physical characteristics comprises an electric field portion of the EM field, a magnetic field portion of the EM field, or a spatial frequency.

8. The non-transitory computer-readable medium of claim 1, wherein: the first set of physical characteristics comprises a spatial frequency; and the second set of physical characteristics comprises an electric field portion of the EM field, a magnetic field portion of the EM field, or a polarization of the EM field.

9. The non-transitory computer-readable medium of claim 1, wherein: the first set of physical characteristics comprises a spatial frequency; and executing the PINN comprises: applying a low-pass filter to output data from the PINN to reject a portion of the output data associated with spatial frequencies above a numerical aperture (NA)-based cutoff frequency; or applying a high-pass filter to output data from the PINN to reject a portion of the output data associated with spatial frequencies below the NA-based cutoff frequency.

10. The non-transitory computer-readable medium of claim 1, wherein the PINN was trained using all channels of the multiple channels.

11. The non-transitory computer-readable medium of claim 1, wherein: the PINN was trained by activating a first number of channels of the multiple channels; executing the PINN comprises performing inferencing using a second number of channels of the multiple channels; and the second number of channels for the inferencing is a smaller subset of the first number of channels, wherein the multiple channels are multiple output channels.

12. The non-transitory computer-readable medium of claim 1, wherein the PINN was trained based on applying a first aspect of Maxwell’s equations to the first channel and a second aspect of Maxwell’s equations to the second channel.

13. The non-transitory computer-readable medium of claim 1, wherein the PINN was trained using material distribution data of the thick mask as an input to Maxwell’s equations.

14. The non-transitory computer-readable medium of claim 1, wherein: training the PINN comprises activating a first number of channels of the multiple channels in a backpropagation process; the multiple channels are configured to be used for inferencing using a second number of channels of the multiple channels; and the second number of channels for the inferencing is a smaller subset of the first number of channels for the training.

15. The non-transitory computer-readable medium of claim 11, wherein training the PINN is based on applying a first aspect of Maxwell’s equations to the first channel and a second aspect of Maxwell’s equations to the second channel.