A multimodal fusion control system for semiconductor single crystal growth equipment

By integrating heterogeneous sensing networks and nonlinear multi-objective cooperative controllers, multimodal fusion control of semiconductor single crystal growth equipment was achieved, solving the time lag and model mismatch problems of traditional control systems and improving crystal quality and yield.

CN122279730A · Pending · Publication Date: 2026-06-26 · WUHAN INST OF TECH

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
WUHAN INST OF TECH
Filing Date
2026-05-09
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Traditional semiconductor single crystal growth equipment relies on feedback of a single physical quantity in its control system, so the time lag caused by thermal inertia leads to regulation oscillation. This makes real-time response of the growth rate difficult to achieve, and the turbulent flow field and oxygen concentration distribution inside the melt cannot be detected, so dislocations or crystallization defects form easily at the crystal growth end.

Method used

By employing a heterogeneous sensing network, a multimodal spatiotemporal alignment device, a digital twin inference server, and a nonlinear multi-objective collaborative controller, integrating visual, acoustic, and thermodynamic sensing units, and through multimodal data fusion and deep reinforcement learning, the system enables real-time monitoring and prediction of the internal state of the melt, and dynamically adjusts growth parameters to ensure crystal quality.

Benefits of technology

It enables in-depth monitoring of the internal state of the melt, reduces dislocation density, improves crystal quality and yield, solves the model mismatch problem of traditional control systems, and has robustness and intelligent evolution capabilities to adapt to the process requirements of different batches.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention belongs to the field of semiconductor single crystal growth control technology, specifically relating to a multimodal fusion control system for semiconductor single crystal growth equipment. The system includes a heterogeneous sensing network, a multimodal spatiotemporal alignment device, a digital twin inference server, and a nonlinear multi-objective cooperative controller. The heterogeneous sensing network collects visual, acoustic, and thermodynamic multidimensional information, which is then mapped to a unified feature space by the multimodal spatiotemporal alignment device. The digital twin inference server, based on finite volume analysis, predicts the crystal defect distribution and solid-liquid interface shape in real time for the next moment. The cooperative controller dynamically adjusts the coupling relationship between heating power, rotation speed, and pulling speed, aiming for constant diameter, flat interface, and minimum oxygen content. This invention achieves a leap from hysteresis feedback to predictive control, fills the blind spot in melt internal state monitoring, improves the accuracy of constant diameter control and single crystal yield, and enhances the system robustness under complex operating conditions.

Description

Technical Field

[0001] This invention belongs to the field of semiconductor single crystal growth control technology, specifically relating to a multimodal fusion control system for semiconductor single crystal growth equipment.

Background Technology

[0002] With the rapid development of the semiconductor industry, the preparation of high-quality single-crystal materials has become a core link in the integrated circuit industry chain. In the single-crystal silicon growth process, the Czochralski method, as the mainstream process, places high demands on the stability of the thermal field and the precise control of the growth environment. An efficient and precise crystal growth equipment control system is not only crucial for ensuring wafer yield, but also the technological cornerstone for achieving high-performance, low-defect growth of large-size semiconductor-grade single crystals.

[0003] The automated control system of semiconductor single crystal growth equipment mainly maintains the thermodynamic balance of the solid-liquid interface through closed-loop regulation of liquid surface diameter, thermal field temperature, and mechanical motion parameters. This process involves complex multiphase flow heat and mass transfer and phase transition dynamics. The control system needs to monitor production parameters in real time and dynamically adjust the actuators to ensure the uniformity of single crystal diameter and the stability of internal microstructure throughout the entire growth cycle.

[0004] Traditional control schemes typically rely on sensor feedback for a single physical quantity, leading to regulatory oscillations when dealing with time lags caused by thermal inertia, making it difficult to achieve real-time growth rate responses at the microscopic scale. Existing monitoring methods mostly focus on visible surface diameters and single-point temperatures, failing to detect implicit key variables such as turbulent flow fields and oxygen concentration distribution within the melt, creating a spatial sensing blind spot. Because single-crystal growth is accompanied by severe nonlinear dynamic drift, traditional linear control models are prone to model mismatch and robustness deficiencies under complex operating conditions, resulting in dislocations or crystallization defects easily forming at the crystal growth ends.

Summary of the Invention

[0005] The purpose of this invention is to provide a multimodal fusion control system for semiconductor single crystal growth equipment, which can solve the problems mentioned in the background art.

[0006] To achieve the above objective, the present invention adopts the following technical solution: a multimodal fusion control system for semiconductor single crystal growth equipment, including a heterogeneous sensing network, a multimodal spatiotemporal alignment device, a digital twin inference server, and a nonlinear multi-objective cooperative controller, as follows:

The heterogeneous sensing network includes a visual sensing unit, an acoustic sensing unit, and a thermodynamic sensing unit, which are used to collect multi-dimensional physical state information during the semiconductor single crystal growth process from all directions. The visual sensing unit uses the principle of high-speed infrared thermal imaging to capture the microscopic morphology of the solid-liquid interface and the distribution of the temperature gradient field in real time, and extracts the crystal diameter features and meniscus morphology features from them. The acoustic sensing unit is deployed on the crucible wall to collect acoustic signals, bubble bursting signals, and thermal noise signals during the melt convection process, and to determine the viscosity change and turbulence intensity inside the melt through acoustic feature inversion logic. The thermodynamic sensing unit is used to perform high-frequency sampling of the voltage parameters, current parameters, and magnetohydrodynamic parameters of the heater to construct coupled input data of the electromagnetic field and thermal field.

The multimodal spatiotemporal alignment device is connected to the heterogeneous sensing network and is used to semantically align the time-series signals acquired by the acoustic sensing unit, the spatial distribution images acquired by the visual sensing unit, and the point source data acquired by the thermodynamic sensing unit, and to map them into a unified feature space.

The digital twin inference server is connected to the multimodal spatiotemporal alignment device and hosts a digital twin of the single crystal growth process corresponding to the physical device. The digital twin inference server receives the aligned multimodal data in real time. Through inference logic based on finite volume analysis, it predicts the crystal defect distribution trend and the solid-liquid interface shape change at the next predetermined time scale, and outputs prediction results for the future state shift caused by thermal inertia.

The nonlinear multi-objective cooperative controller is connected to the digital twin inference server. Based on the prediction results output by the digital twin inference server, it dynamically adjusts the coupling relationship between the heating power parameters, crucible rotation speed parameters, and pulling speed parameters, with multiple objective functions including constant crystal diameter, optimal solid-liquid interface flatness, and lowest oxygen content.
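As an illustration of how the three stated objectives (constant diameter, interface flatness, low oxygen content) could be traded off, the following is a minimal sketch that assumes a simple weighted-sum scalarization. The patent does not specify the form of the objective functions; the class `PredictedState`, the field names, the units, and the weights are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PredictedState:
    """Hypothetical summary of one digital-twin prediction."""
    diameter_mm: float              # predicted crystal diameter
    interface_deflection_mm: float  # predicted deviation of the interface from flat
    oxygen_ppma: float              # predicted oxygen content


def multi_objective_cost(state, target_diameter_mm, weights=(1.0, 1.0, 1.0)):
    """Weighted-sum cost (an assumption, not the patent's formula)."""
    w_diameter, w_flatness, w_oxygen = weights
    diameter_term = (state.diameter_mm - target_diameter_mm) ** 2
    flatness_term = state.interface_deflection_mm ** 2
    oxygen_term = state.oxygen_ppma
    return (w_diameter * diameter_term
            + w_flatness * flatness_term
            + w_oxygen * oxygen_term)


def pick_best(candidates, target_diameter_mm, weights=(1.0, 1.0, 1.0)):
    """Choose the (label, predicted_state) pair with the lowest cost."""
    return min(
        candidates,
        key=lambda item: multi_objective_cost(item[1], target_diameter_mm, weights),
    )
```

In such a scheme, each candidate would pair a tentative actuator setting with the state the digital twin predicts for it, and the weights would express the relative priority of the three objectives.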

[0007] Preferably, the acoustic sensing unit includes multiple high-temperature resistant acoustic sensors, which are arranged in an array on the outer wall of the crucible to capture the rheological acoustic emission signal of the melt under high temperature conditions.

[0008] Furthermore, the acoustic sensing unit uses voiceprint recognition logic to match the collected acoustic features with a preset flow field modal database to determine the fluid dynamic state inside the melt in the invisible region.

[0009] Furthermore, the visual perception unit is equipped with a filter component to eliminate strong thermal radiation interference generated by the heating element, ensuring that the infrared thermal imager can accurately identify pixel-level brightness changes at the solid-liquid interface and calculate a high-precision diameter feedback value.

[0010] Furthermore, the magnetohydrodynamic parameters collected by the thermodynamic sensing unit include the intensity distribution data of the applied magnetic field and the fluctuation data of the induced current, which are used to analyze the suppression effect of the magnetic field on melt convection.

[0011] Preferably, the multimodal spatiotemporal alignment device adopts a heterogeneous data fusion strategy, which ensures that sensing signals of different frequencies and dimensions are logically in the same reference system by performing resampling processing on the time axis and coordinate transformation on the spatial dimension.

[0012] Furthermore, the digital twin inference server introduces a dual-drive fusion mechanism combining a physical mechanism model and a data-driven model during the inference process. The physical mechanism model performs deterministic calculations of the temperature field and concentration field inside the melt based on the heat and mass transfer equations, while the data-driven model performs probabilistic compensation for nonlinear random disturbances based on historical growth data.

[0013] Furthermore, the digital twin inference server is used to calculate the strain rate distribution at the solid-liquid interface and, in conjunction with a preset dislocation multiplication model, to evaluate the crystal integrity index at the current growth rate.

[0014] Furthermore, the prediction results output by the digital twin inference server include a prediction curve for the diameter change trend within a specific future time period, as well as a compensation coefficient for the change in thermal environment caused by the drop in the melt level.

[0015] Preferably, the nonlinear multi-objective cooperative controller employs a control algorithm based on deep reinforcement learning, which automatically obtains the optimal decision-making strategy through iterative training with a massive number of process parameter combinations in a simulated environment.

[0016] Furthermore, when the nonlinear multi-objective cooperative controller receives an abnormal turbulence warning for melt convection from the digital twin inference server, it triggers a fine-tuning program for heating power in advance and adjusts the crucible rotation speed to generate centrifugal force compensation, suppressing the impending interface flipping phenomenon.

[0017] Furthermore, the nonlinear multi-objective cooperative controller has an adaptive evolution function, which can adjust the weight parameters in the control function in real time according to the difference in impurity content of different batches of polysilicon raw materials, thereby realizing the intelligent migration of the process curve.

[0018] Furthermore, the nonlinear multi-objective cooperative controller invokes different control modes during the constant diameter stage, the tailing stage, and the necking stage of single crystal growth, respectively, to adapt to the differences in dynamic characteristics exhibited by the system at different growth stages.

[0019] Furthermore, the nonlinear multi-objective cooperative controller is also used to monitor the health status of the actuators. When the mechanical vibration of the lifting mechanism or the rotating mechanism is detected to exceed a preset threshold, the controller automatically brings in the feedback data provided by the acoustic sensing unit for cross-verification, to eliminate false diameter fluctuation signals caused by mechanical vibration.

[0020] Preferably, the multimodal fusion control system of the semiconductor single crystal growth equipment further includes a process optimization management platform for storing complete multimodal sensing data and control trajectories for each batch, and generating a correlation analysis report on the consistency of the electrical performance of the crystal rod.

[0021] Furthermore, the process optimization management platform is used to compare the deviation between the actual growth parameters and the parameters predicted by the digital twin, and to use the deviation value to perform online correction of the physical parameter model in the digital twin inference server, thereby improving the accuracy of subsequent predictions.

[0022] Furthermore, the thermodynamic sensing unit in the heterogeneous sensing network also includes real-time monitoring of argon flow parameters and furnace pressure parameters, and inputs these parameters as boundary conditions to the multimodal spatiotemporal alignment device to correct the measurement deviation caused by the refractive index change in the visual sensing unit.

[0023] Furthermore, the nonlinear multi-objective cooperative controller calculates the interaction factors between each actuator to achieve decoupled control of heating power, rotation speed and lifting rate, ensuring that adjusting a single physical quantity will not cause drastic fluctuations in other growth conditions.
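The patent does not say how the "interaction factors" between actuators are computed. One classical way to quantify such interactions is the relative gain array (RGA) of a steady-state gain matrix; the sketch below uses it purely as an assumed stand-in. The function names and any gain values passed in are hypothetical.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("gain matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def relative_gain_array(gain):
    """RGA element (i, j) = G[i][j] * (G^-1)[j][i].

    Rows are outputs (e.g. diameter, interface shape, oxygen content) and
    columns are actuators (e.g. heating power, rotation speed, pulling rate).
    """
    inverse = invert(gain)
    n = len(gain)
    return [[gain[i][j] * inverse[j][i] for j in range(n)] for i in range(n)]
```

An RGA close to the identity matrix suggests that each actuator mainly affects one output, so independent loops interact weakly. Entries far from 0 or 1 indicate strong coupling that a decoupling controller would need to compensate.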

[0024] Preferably, the acoustic sensing unit is used to calculate the trend of dissolved gas concentration change in the melt by analyzing the frequency of the acoustic pulses generated by bubble bursting.

[0025] Furthermore, the digital twin inference server updates the oxygen concentration distribution map at the crystal growth interface in real time by numerically simulating the oxygen atom transport path inside the melt, and provides it as a feedback quantity to the nonlinear multi-objective cooperative controller.

[0026] Furthermore, the nonlinear multi-objective cooperative controller is used to dynamically adjust the magnetic field strength parameters according to the oxygen concentration distribution map, and to change the boundary layer thickness at the melt edge through magnetohydrodynamic effects, thereby achieving precise control of the oxygen content of the single crystal.

[0027] Furthermore, the heterogeneous sensing network is also used to detect the real-time position parameters of the crucible. The multimodal spatiotemporal alignment device links and compensates the position parameters with the focal length parameters of the visual sensing unit to ensure that the visual monitoring system always focuses on the solid-liquid interface region during the liquid level drop.

[0028] Preferably, in the final stage of crystal growth, the nonlinear multi-objective cooperative controller applies a combined strategy of increasing the pulling speed while simultaneously reducing the heating power, to prevent crystallization caused by an uneven thermal field at the tail end.

[0029] Furthermore, the nonlinear multi-objective cooperative controller establishes a time delay model of the actuator response and performs advance phase compensation at the command output end to offset the control lag caused by thermal field transmission inertia.
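To make the idea of advance phase compensation concrete, here is a minimal sketch that assumes the actuator-to-thermal-field response can be approximated by a first-order lag with time constant `tau_s`. The patent does not give the form of its time delay model, so this approximation and all names are assumptions. The compensator adds a term proportional to the reference slope so that the lagged output follows the reference.

```python
def simulate_lag(commands, tau_s, dt_s, y0=0.0):
    """Explicit-Euler simulation of a first-order lag: dy/dt = (u - y) / tau."""
    y = y0
    outputs = []
    for u in commands:
        outputs.append(y)
        y += dt_s / tau_s * (u - y)
    return outputs


def lead_compensate(reference, tau_s, dt_s):
    """Command u = r + tau * dr/dt, the inverse of the assumed first-order lag.

    Uses the next reference sample, which is available when the setpoint
    trajectory is planned in advance.
    """
    commands = []
    for k, r in enumerate(reference):
        r_next = reference[k + 1] if k + 1 < len(reference) else r
        commands.append(r + tau_s * (r_next - r) / dt_s)
    return commands


# Usage: a ramp setpoint tracked with and without compensation.
dt_s, tau_s = 1.0, 60.0
ramp = [0.1 * k * dt_s for k in range(2000)]
uncompensated = simulate_lag(ramp, tau_s, dt_s)
compensated = simulate_lag(lead_compensate(ramp, tau_s, dt_s), tau_s, dt_s)
```

Without compensation, the output trails the ramp by roughly the slope times the time constant. With the lead term, the tracking error vanishes for this idealized model. A real thermal field would also include dead time and model error, which this sketch ignores.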

[0030] Furthermore, when a sudden vibration occurs in the external environment, the system uses the feature redundancy between the acoustic sensing unit and the visual sensing unit to check their consistency. If the deviation between the two exceeds a preset range, the system automatically enters an inertial holding mode and maintains the current operating parameters of the actuators until the interference disappears.
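The consistency check and inertial holding mode described in this paragraph can be sketched as a small gate in front of the actuators. This is an illustrative sketch only: it assumes both sensing units yield a comparable normalized estimate of the same quantity, and the class name, threshold, and command format are hypothetical.

```python
class InertialHoldGate:
    """Pass commands through while two redundant estimates agree; otherwise hold."""

    def __init__(self, max_deviation):
        self.max_deviation = max_deviation
        self.held_command = None
        self.holding = False

    def step(self, visual_estimate, acoustic_estimate, proposed_command):
        deviation = abs(visual_estimate - acoustic_estimate)
        self.holding = deviation > self.max_deviation
        if self.holding and self.held_command is not None:
            # Estimates disagree: keep the last command applied while they agreed.
            return self.held_command
        # Estimates agree (or nothing has been applied yet): accept the new command.
        self.held_command = proposed_command
        return proposed_command
```

The gate leaves hold mode as soon as the two estimates fall back within the preset range, which corresponds to "until the interference disappears" in the text.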

[0031] Compared with the prior art, the present invention has the following beneficial effects: 1. This invention overcomes the limitations of traditional control schemes that rely solely on feedback from a single physical quantity by constructing a heterogeneous sensing network incorporating vision, acoustics, and thermodynamics. In particular, the introduction of the acoustic sensing unit enables inverse sensing of convection, viscosity, and turbulence intensity within the melt, filling a gap in existing technologies for monitoring the internal state of the melt. This allows the control system to acquire deeper-level process state information, laying a data foundation for improving crystal quality.

[0032] 2. This invention, through the collaborative work of a multimodal spatiotemporal alignment device and a digital twin inference server, utilizes finite volume inference logic to enable the system to predict in advance the diameter offset and defect risks caused by thermal inertia. This allows the actuator to complete parameter adjustments before the deviation actually occurs, achieving extremely high-precision equal diameter control, reducing the dislocation density during single crystal growth, and improving the yield.

[0033] 3. The nonlinear multi-objective cooperative controller employed in this invention, combined with a deep reinforcement learning algorithm, solves the model mismatch problem under complex crystal growth conditions. The system no longer relies on fixed adjustment parameters but can adaptively adjust the control strategy based on the dynamic drift during the growth stage. Through deep coupling adjustment of multiple dimensions such as heating, rotation speed, and pulling, the system can balance multiple objectives such as diameter, interface shape, and impurity content, ensuring a high degree of uniformity between the microscopic quality and macroscopic dimensions of the crystal rod during the growth of large-size single crystals.

[0034] 4. This invention possesses robustness and intelligent evolution capabilities. The complementary verification mechanism of multimodal data can eliminate environmental noise interference, ensuring system stability in complex industrial environments. Through continuous accumulation of process data and digital twin calibration, the system can automatically evolve the optimal process curve for raw materials with different characteristics, solving the industry problem of traditional process formulations being difficult to transfer across batches and reducing reliance on human experience.

Attached Figure Description

[0035] Figure 1 is a schematic diagram of the overall technical solution architecture of the present invention; Figure 2 is a schematic diagram of the core principle framework of multimodal fusion predictive control based on digital twin inference in this invention; Figure 3 is a logical flowchart of the heterogeneous perception and multimodal spatiotemporal alignment of multidimensional physical state information in this invention; Figure 4 is a diagram of the inference logic framework of the digital twin inference server in this invention, which integrates the physical mechanism model and the data-driven model in a dual-drive arrangement; Figure 5 is a control logic framework diagram of the nonlinear multi-objective cooperative controller in this invention for the coupling relationship of multiple process parameters.

Detailed Implementation

[0036] Example 1: Referring to Figures 1 to 5, a multimodal fusion control system for semiconductor single crystal growth equipment includes a heterogeneous sensing network, a multimodal spatiotemporal alignment device, a digital twin inference server, a nonlinear multi-objective cooperative controller, and a process optimization management platform. The heterogeneous sensing network is used to collect multi-dimensional physical information about the thermal field, flow field, and mechanical motion states involved in the growth of semiconductor single crystals in a comprehensive and high-frequency manner, and to perform preliminary noise reduction and normalization processing on the collected raw physical signals. The heterogeneous sensing network is deployed on the inner wall of the vacuum chamber of the single crystal furnace, the quartz observation window, and the mechanical connection parts of the crucible support mechanism.

[0037] The heterogeneous sensing network is further subdivided into visual sensing units, acoustic sensing units, and thermodynamic sensing units.

[0038] The visual sensing unit is configured to capture the microscopic morphology and temperature gradient field distribution of the solid-liquid interface in real time using the principle of high-speed infrared thermal imaging. The visual sensing unit includes a high-resolution infrared thermal imaging sensor positioned at the observation window on top of the single crystal furnace. This sensor is sensitive within a specific infrared wavelength band, enabling it to penetrate the argon protective atmosphere and the accompanying thermal convection interference generated during single crystal growth. The visual sensing unit also includes a high-performance image processor, which uses a sub-pixel edge extraction algorithm to accurately identify the boundary line between the melt surface and the growing single crystal rod, i.e., the three-phase boundary line. Using the extracted geometric features of the three-phase boundary line, the processor calculates the real-time crystal diameter. The visual sensing unit is also configured to analyze the meniscus height and curvature near the solid-liquid interface as key geometric parameters for assessing interface stability. To ensure image clarity under extremely high temperatures, the visual sensing unit is also equipped with an actively cooled filter assembly. Through a precise optical interference coating, this assembly eliminates the strong background thermal radiation generated by the heater, ensuring that the infrared thermal imager can capture minute pixel-level brightness changes at the solid-liquid interface.

[0039] The acoustic sensing unit is configured to be deployed on the crucible wall to collect acoustic signals, bubble bursting signals, and thermal noise signals during the melt convection process. The acoustic sensing unit consists of a sensor array of multiple high-temperature resistant acoustic sensors, which are arrayed on the outer wall of the crucible and acoustically matched to the crucible wall using a specific high-temperature resistant ceramic coupling agent to receive acoustic emission signals from inside the melt. The acoustic sensing unit integrates a signal demodulation module based on a digital signal processor, which converts the collected time-domain acoustic signals into frequency-domain feature vectors. By applying acoustic fingerprint recognition logic, the acoustic sensing unit matches the real-time collected acoustic features with a flow field modal fingerprint database pre-stored in local memory. The fingerprint database contains characteristic acoustic spectra under different rotational speeds, temperature gradients, and impurity concentrations. Through this matching mechanism, the system can inversely determine the viscosity changes, turbulence intensity, and eddy current distribution characteristics caused by convection instability within the melt in areas inaccessible to visible light, solving the industry problem of traditional optical monitoring methods being unable to perceive the internal fluid dynamics of the melt.
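The fingerprint matching step can be illustrated with a minimal sketch. It assumes the frequency-domain feature vector is a plain magnitude spectrum and that matching uses cosine similarity; the patent specifies neither. The database keys and the synthetic tones are hypothetical, and a naive DFT is used only to keep the sketch dependency-free.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Naive DFT magnitude spectrum (first half of the bins) of a real signal."""
    n = len(samples)
    return [
        abs(sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2)
    ]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_flow_mode(samples, fingerprint_db):
    """Return the database key whose stored spectrum best matches the samples."""
    spectrum = magnitude_spectrum(samples)
    return max(
        fingerprint_db,
        key=lambda name: cosine_similarity(spectrum, fingerprint_db[name]),
    )


def synth_tone(bin_index, n=64, amplitude=1.0):
    """Synthetic test tone occupying a single DFT bin (illustration only)."""
    return [amplitude * math.sin(2 * math.pi * bin_index * t / n) for t in range(n)]
```

A real deployment at the stated 20 kHz sampling rate would use an FFT and probably band-energy or learned features rather than raw bins. The matching structure would remain the same.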

[0040] The thermodynamic sensing unit is used to perform high-frequency synchronous sampling of the heater's voltage, current, and magnetohydrodynamic parameters, constructing a coupled input data stream of electromagnetic and thermal fields. The thermodynamic sensing unit includes not only a high-precision power acquisition module for real-time monitoring of the graphite heater's power consumption fluctuations, but also a set of magnetic induction intensity monitoring sensors arranged around the magnetic field generator. These sensors are used to record in real-time magnetic induction intensity distribution data generated by superconducting or conventional magnetic fields, as well as induced current fluctuation signals generated by the melt's movement in the magnetic field. The thermodynamic sensing unit timestamps the collected electromagnetic parameters such as voltage, current, magnetic field strength, and induced current to generate a feature vector describing the intensity of the heat source and the electromagnetic suppression force within the furnace. The thermodynamic sensing unit also includes real-time flow monitoring of the argon supply system and furnace pressure monitoring, using argon flow parameters and furnace pressure parameters as environmental boundary conditions for subsequent measurement compensation.

[0041] The multimodal spatiotemporal alignment device, connected to the heterogeneous sensing network, is used to perform multi-dimensional semantic alignment of the time-series signals acquired by the acoustic sensing unit, the spatial distribution images acquired by the visual sensing unit, and the point source data acquired by the thermodynamic sensing unit.

[0042] The sampling frequency configuration of each sensing unit in the heterogeneous sensing network is as follows: the sampling frequency of the high-temperature resistant acoustic sensor of the acoustic sensing unit is set to 10kHz to 50kHz, preferably 20kHz in this embodiment, to achieve high-frequency capture of the acoustic emission signal of melt rheology; the sampling frequency of the power acquisition module and magnetic induction intensity monitoring sensor of the thermodynamic sensing unit is set to 1kHz to 10kHz, preferably 5kHz in this embodiment, and the sampling frequency of the argon flow rate and furnace pressure monitoring module is set to 100Hz to 1kHz, preferably 500Hz in this embodiment; the frame rate of the high-resolution infrared thermal imaging sensor of the visual sensing unit is set to 25fps to 100fps, preferably 50fps in this embodiment, corresponding to a single frame acquisition interval of 20ms.

[0043] Because different types of sensing units have different sampling frequencies, the multimodal spatiotemporal alignment device employs an adaptive heterogeneous data fusion strategy. For high-frequency acoustic and thermodynamic point source data, the device performs resampling processing based on linear interpolation or spline interpolation to keep it synchronized with the acquisition time of the video frame sequence on the time axis. The device maps the camera coordinate system where the visual sensing unit is located to the physical coordinate system where the crucible is located through a preset coordinate transformation algorithm, ensuring that all sensed physical quantities are within a unified reference space.
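The time-axis resampling described above can be sketched as follows, assuming plain linear interpolation onto the video frame timestamps. The function and variable names are hypothetical, and the rates in the usage lines follow the preferred values of this embodiment (5 kHz point source data, 50 fps video).

```python
from bisect import bisect_right


def resample_linear(src_times, src_values, target_times):
    """Linearly interpolate a sampled signal onto new timestamps.

    src_times must be strictly increasing. Targets outside the source range
    are clamped to the first or last sample.
    """
    resampled = []
    for t in target_times:
        if t <= src_times[0]:
            resampled.append(src_values[0])
            continue
        if t >= src_times[-1]:
            resampled.append(src_values[-1])
            continue
        hi = bisect_right(src_times, t)
        lo = hi - 1
        frac = (t - src_times[lo]) / (src_times[hi] - src_times[lo])
        resampled.append(src_values[lo] + frac * (src_values[hi] - src_values[lo]))
    return resampled


# Usage: align a 5 kHz heater-power trace with 50 fps frame timestamps.
power_times = [k / 5000 for k in range(501)]   # 0.0 s to 0.1 s
power_values = [2.0 * t for t in power_times]  # synthetic ramp signal
frame_times = [k / 50 for k in range(5)]       # one frame every 20 ms
power_at_frames = resample_linear(power_times, power_values, frame_times)
```

Interpolating a high-rate signal directly onto a much lower frame rate discards high-frequency content. A practical pipeline would likely summarize each 20 ms window (for example with a spectrum or an average) before alignment.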

[0044] The preset coordinate transformation algorithm uses the Zhang Zhengyou calibration method, which is well known in the art, to map the camera coordinate system to the physical coordinate system. The specific implementation steps are as follows: A calibration plate is placed at the center of the crucible inside the single crystal furnace. The size of the calibration plate checkerboard matches the size of the solid-liquid interface region during single crystal growth. Images of the calibration plate at different angles and heights are acquired using the infrared thermal imaging sensor. The checkerboard corner points are extracted from the calibration plate images, and the camera's intrinsic parameter matrix and distortion coefficients are solved, completing the camera intrinsic parameter calibration; A world coordinate system is established based on the central axis of the crucible, with the center point of the upper edge of the crucible as the origin, the vertical upward axis as the Z-axis, and the horizontal radial axes as the X and Y axes. The camera extrinsic parameters, namely the rotation matrix $R$ and the translation vector $t$, are solved by establishing the correspondence between the world coordinates of the calibration plate and the image pixel coordinates; The transformation between pixel coordinates and world coordinates is accomplished using the following formula:

$$Z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \begin{bmatrix} R & t \end{bmatrix} \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix}$$

where $Z_c$ is the depth value along the camera's optical axis; $[u, v, 1]^T$ are the pixel homogeneous coordinates, in which $u$ and $v$ represent the horizontal and vertical pixel coordinates of the measured point on the infrared image, respectively; $K$ is the camera intrinsic parameter matrix; and $[X_w, Y_w, Z_w, 1]^T$ are the homogeneous coordinates in the world coordinate system, in which $X_w$, $Y_w$, and $Z_w$ represent the actual three-dimensional coordinates of the measured point in the physical world coordinate system of the crucible. This ultimately achieves a unified mapping between the camera coordinate system of the visual sensing unit and the physical coordinate system of the crucible.
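Once the intrinsic and extrinsic parameters are known, a pixel with a known depth can be mapped back to crucible coordinates by inverting the pinhole projection. The sketch below assumes a zero-skew intrinsic matrix with focal lengths `fx`, `fy` and principal point `cx`, `cy`, and ignores lens distortion. All numeric values in the usage lines are hypothetical.

```python
def project(world, fx, fy, cx, cy, rotation, translation):
    """Project a world point to (u, v, depth) with a pinhole camera model."""
    cam = [
        sum(rotation[i][j] * world[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]
    u = fx * cam[0] / cam[2] + cx
    v = fy * cam[1] / cam[2] + cy
    return u, v, cam[2]


def back_project(u, v, depth, fx, fy, cx, cy, rotation, translation):
    """Recover world coordinates from a pixel and its depth along the optical axis."""
    # Camera-frame point: depth * K^-1 * [u, v, 1]^T for a zero-skew K.
    cam = [(u - cx) / fx * depth, (v - cy) / fy * depth, depth]
    # World point: R^T * (cam - t), since a rotation's inverse is its transpose.
    diff = [cam[i] - translation[i] for i in range(3)]
    return [sum(rotation[i][j] * diff[i] for i in range(3)) for j in range(3)]


# Usage: a camera 1.5 m above the origin, looking straight down the Z-axis.
R_down = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]
t_down = [0.0, 0.0, 1.5]
point = [0.10, -0.05, 0.0]
u, v, depth = project(point, 800.0, 800.0, 320.0, 240.0, R_down, t_down)
recovered = back_project(u, v, depth, 800.0, 800.0, 320.0, 240.0, R_down, t_down)
```

For points on the melt surface, the depth could come from the known melt level rather than from a separate measurement, because the plane of the solid-liquid interface fixes the depth along each viewing ray.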

[0045] The multimodal spatiotemporal alignment device also maps the features of each mode to a unified, high-dimensional feature space, and eliminates redundant information between different modes through feature extraction algorithms, retaining only the key feature combinations closely related to the single crystal growth state.

[0046] The feature extraction algorithm is an autoencoder-based multimodal feature fusion and extraction algorithm, implemented as follows. An acoustic modal encoder, a visual modal encoder, and a thermodynamic modal encoder are constructed. Each encoder is a 3-layer fully connected network: the input layer dimension matches the feature dimension of the corresponding modality, the hidden layer dimension is 256, and the output layer dimension is 128, so that single-modal features of different dimensions are mapped to feature vectors of the same dimension. A fusion layer is then constructed, which concatenates the 128-dimensional feature vectors output by the three single-modal encoders into a 384-dimensional concatenated feature. The weight coefficient of each single-modal feature is calculated by an attention mechanism module as:

$$\alpha_i = \frac{\exp(W h_i + b)}{\sum_{j=1}^{3} \exp(W h_j + b)}$$

where $h_i$ is the $i$-th single-modal feature vector, $W$ is the weight matrix, $b$ is the bias term, and $\alpha_i$ is the attention weight of the $i$-th modality. Based on the attention weights, the single-modal features are weighted and summed to obtain a fused 128-dimensional unified high-dimensional feature vector. In step S24, a decoder is constructed to reconstruct the original input features of each single modality from the fused feature vector. The training objective is to minimize the reconstruction error, which completes the training of the autoencoder. During training, the reconstruction error $L_{rec}$ is calculated as:

$$L_{rec} = \frac{1}{n}\sum_{k=1}^{n}\left(x_k - \hat{x}_k\right)^2$$

where $x$ is the original single-modal feature, $\hat{x}$ is the feature reconstructed by the decoder, and $n$ is the feature dimension. After training, the decoder is removed, and only the encoders and the fusion layer are retained as the feature extraction module. The output is a unified high-dimensional feature with redundant information eliminated, which achieves effective fusion of the multimodal features and extraction of the key information.
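The attention-weighted fusion step can be illustrated with a minimal sketch. It assumes a softmax over one scalar score per modality, a scoring vector `w` standing in for the weight matrix, and random placeholder features; the function names and all numeric values are hypothetical and are not taken from the patent.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of scalar scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(features, w, b):
    """Fuse equally sized single-modal feature vectors.

    features: list of 128-d vectors (acoustic, visual, thermodynamic).
    w, b: hypothetical scoring vector and bias (assumed, not from the source).
    Returns (attention weights, fused 128-d vector).
    """
    scores = [sum(wi * hi for wi, hi in zip(w, h)) + b for h in features]
    alphas = softmax(scores)
    dim = len(features[0])
    fused = [sum(a * h[k] for a, h in zip(alphas, features)) for k in range(dim)]
    return alphas, fused


random.seed(0)
DIM = 128  # encoder output dimension stated in the text
acoustic, visual, thermal = (
    [random.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(3)
)
w = [random.gauss(0.0, 0.1) for _ in range(DIM)]
alphas, fused = attention_fuse([acoustic, visual, thermal], w, b=0.0)
```

In a trained system the scoring parameters would be learned together with the encoders; here they are random only so that the sketch runs on its own.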

[0047] The digital twin simulation server is connected to the multimodal spatiotemporal alignment device, and internally constructs a digital twin corresponding to the physical single-crystal growth equipment and its process. The digital twin simulation server is used to receive multimodal feature data from the alignment device in real time and input it as initial boundary conditions into the fusion engine of the internally integrated physical model and data model.

[0048] During simulation, the digital twin simulation server employs a dual-drive fusion mechanism that combines a physical mechanism model with a data-driven model. The physical mechanism model, based on the heat and mass transfer equations, the Navier-Stokes equations, and the solute distribution equations, uses the finite volume method to numerically calculate the temperature field, velocity field, and solute concentration field within the melt. This first-principles calculation provides deterministic physical evolution trends. The data-driven model, trained on large volumes of historical growth data, uses recurrent neural networks or graph neural networks to probabilistically compensate for nonlinear random perturbations that the physical model has difficulty characterizing.

[0049] The digital twin simulation server is used to simulate in real time the distribution trend of crystal defects, the flatness change of the solid-liquid interface shape, and the predicted movement trajectory of the triple point on the melt surface at the next predetermined time scale (e.g., within the next 30 seconds to 5 minutes). The simulation logic is specifically modeled for the high inertia characteristics of thermal field transmission, calculates the hysteresis temperature response caused by the current adjustment of heating power, and outputs prediction results for future state shifts.

[0050] The high-inertia characteristics of thermal field transfer are modeled by coupling a first-order thermal inertia lag model, derived by the lumped parameter method, with a two-dimensional transient heat conduction equation. The specific implementation is as follows. The graphite heater, graphite heat shield, crucible, and silicon melt within the single crystal furnace are divided into multiple lumped parameter units. Based on the law of conservation of energy, a heat balance equation is constructed for each unit. The general form of the heat balance equation is:

$$c_i \rho_i V_i \frac{dT_i}{dt} = Q_{in,i} - Q_{out,i}$$

where $c_i$ is the specific heat capacity of the $i$-th unit, $\rho_i$ is its density, $V_i$ is its volume, $T_i$ is its temperature, $Q_{in,i}$ is its input heat, and $Q_{out,i}$ is its output heat. For the silicon melt-solid interface region, a two-dimensional transient heat conduction equation is constructed:

$$\rho c \frac{\partial T}{\partial t} = \frac{1}{r}\frac{\partial}{\partial r}\left(\lambda r \frac{\partial T}{\partial r}\right) + \frac{\partial}{\partial z}\left(\lambda \frac{\partial T}{\partial z}\right) + q$$

where $\rho$ is the density of the silicon material, $c$ is its specific heat capacity, $\lambda$ is its thermal conductivity, $q$ is the internal heat source, $r$ is the radial coordinate, $z$ is the axial coordinate, and $t$ is time. Based on the lumped-parameter heat balance equations, a first-order lag transfer function from the heater power adjustment to the melt temperature change is obtained by fitting:

$$G(s) = \frac{K e^{-\tau s}}{T_c s + 1}$$

where $K$ is the system static gain; $T_c$ is the thermal inertia time constant, which ranges from 10 s to 60 s in this embodiment; $\tau$ is the pure time delay, which ranges from 2 s to 10 s in this embodiment; and $s$ is the complex frequency variable. The obtained lag transfer function is coupled with the transient heat conduction equation. The real-time change of heater power is used as the input boundary condition to solve for the temperature field distribution inside the melt and at the solid-liquid interface at the future time scale, that is, the lagged temperature response caused by the adjustment of heating power. This completes the high-inertia characteristic modeling of thermal field transfer.
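To illustrate the lagged temperature response, the following sketch simulates a first-order-plus-dead-time model, G(s) = K·exp(-τs)/(T_c·s + 1), using a forward Euler step. The gain of 2.0, the 30 s time constant, and the 5 s delay are hypothetical values chosen inside the ranges stated above; the function name and the discretization are assumptions, not the patent's solver.

```python
from collections import deque


def simulate_fopdt(power_changes, gain, time_const, dead_time, dt=1.0):
    """Forward-Euler simulation of a first-order lag with pure time delay.

    power_changes: heater power change applied at each time step (input).
    Returns the melt temperature change at each time step (output).
    """
    delay_steps = int(round(dead_time / dt))
    pending = deque([0.0] * delay_steps)  # inputs still "in transit"
    temperature = 0.0
    response = []
    for u in power_changes:
        pending.append(u)
        delayed_u = pending.popleft()
        # dT/dt = (K * u(t - tau) - T) / T_c
        temperature += dt * (gain * delayed_u - temperature) / time_const
        response.append(temperature)
    return response


# Unit step in heater power held for 300 s (hypothetical scenario).
response = simulate_fopdt([1.0] * 300, gain=2.0, time_const=30.0, dead_time=5.0)
```

The output stays at zero for the first five steps (the pure delay) and then rises toward the static gain, which is the behavior a predictive controller has to anticipate.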

[0051] The server is also used to calculate the strain rate distribution at the solid-liquid interface and, in conjunction with a preset dislocation multiplication physical model, to assess in real time the probability that the crystal integrity indicators will fall out of specification under the current pulling speed and thermal gradient.

[0052] The preset dislocation multiplication physical model is the Alexander-Haasen dislocation multiplication model, which is commonly used in this field. The model describes the evolution of the dislocation density in terms of the following quantities: $N$, the dislocation density; $\sigma$, the shear stress, calculated from the strain rate distribution at the solid-liquid interface and the elastic modulus of the silicon single crystal; the stress exponent, whose value is 4.5 in this embodiment; the material-dependent rate constants; the activation energy for dislocation multiplication; the activation energy for dislocation annihilation; Boltzmann's constant; and $T$, the absolute temperature at the solid-liquid interface. The model is applied as follows. Based on the strain rate distribution at the solid-liquid interface obtained from the digital twin simulation, the distribution of the shear stress $\sigma$ at the interface is calculated. Substituting the shear stress and the interface temperature distribution into the Alexander-Haasen model gives the trend of the dislocation density over the future time scale. The preset dislocation density threshold is 5 × 10³ cm⁻². When the calculated dislocation density exceeds this threshold, the crystal integrity index is deemed substandard, and the corresponding risk probability is output. The risk probability is calculated from the calculated dislocation density, the dislocation density threshold, and the maximum allowable dislocation density during single-crystal growth. The maximum allowable dislocation density during single-crystal growth is set to 1×10⁻⁶ in this embodiment. 5 cm -2 .

[0053] The prediction results output by the server include not only scalar values, but also continuous prediction curves of diameter changes over a specific future period, as well as geometric compensation coefficients calculated to account for changes in the thermal environment caused by the drop in the melt level.

[0054] The geometric compensation coefficient is derived and calibrated as follows. Based on the law of conservation of mass in single crystal growth, the relationship between the melt level drop height and the crystal growth parameters is derived. The relationship is expressed in terms of $\Delta h$, the height by which the melt level drops; $\rho_s$, the density of solid silicon; $\rho_l$, the density of liquid silicon; $v$, the crystal pulling speed; $r$, the crystal radius; and $R$, the radius of the inner wall of the crucible. A thermal field geometric compensation model is then constructed, using the initial position of the melt surface as the reference. As the melt surface drops, the relative position of the solid-liquid interface and the heater changes, which changes the heat input at the interface. The geometric compensation coefficient $k_g$ is defined as the ratio of the actual heat flux density at the interface at the current liquid level to the reference heat flux density at the initial liquid level. The relationship between the geometric compensation coefficient and the liquid level drop height is determined through offline calibration experiments, as follows. With the single crystal furnace unloaded, a graphite block with the same heat capacity and emissivity as the silicon melt is placed in the crucible in place of the melt. The height of the graphite block is adjusted to simulate different liquid level drops. Under the same heater power, the heat flux density at the center of the upper surface of the graphite block is measured, and the functional relationship between the heat flux density ratio and the liquid level drop height is obtained by fitting the data. In this embodiment, the fitted function is linear:

$$k_g = a\,\Delta h + b$$

where $a$ and $b$ are the linear coefficients determined through the calibration experiments. During real-time simulation, the calculated real-time liquid level drop height is substituted into the calibrated function to obtain the real-time geometric compensation coefficient, which is used to correct the thermal field boundary conditions and offset the change in thermal environment caused by the drop in the melt level.
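A rough sketch of the two calculations described above follows. It assumes the simplest mass balance for a cylindrical crucible (mass solidified per unit time equals mass lost from the melt), which may differ from the exact relation used in the patent, and it fits the linear calibration function by ordinary least squares. The densities are approximate textbook values for silicon; the calibration points, function names, and geometry are hypothetical.

```python
def level_drop_rate(rho_s, rho_l, pull_speed, crystal_radius, crucible_radius):
    """Melt level drop rate under an assumed balance:
    rho_s * pi * r^2 * v = rho_l * pi * R^2 * d(dh)/dt.
    """
    return rho_s * crystal_radius ** 2 * pull_speed / (rho_l * crucible_radius ** 2)


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


# Approximate densities in kg/m^3; pull speed in mm/min; radii in mm.
rate = level_drop_rate(2330.0, 2570.0, pull_speed=1.0,
                       crystal_radius=100.0, crucible_radius=300.0)

# Hypothetical calibration data: level drop (mm) vs. heat flux density ratio.
drops = [0.0, 10.0, 20.0, 30.0, 40.0]
ratios = [1.00, 0.98, 0.96, 0.94, 0.92]
a, b = fit_line(drops, ratios)


def compensation_coefficient(drop_height):
    """Evaluate the calibrated linear function k_g = a * dh + b."""
    return a * drop_height + b
```

With these made-up calibration points the fit gives a slope of about -0.002 per millimetre, so a 25 mm level drop would scale the reference heat flux by roughly 0.95.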

[0055] The nonlinear multi-objective cooperative controller, connected to the digital twin simulation server, is used to execute complex process commands with multiple objective functions, including constant crystal diameter, optimal solid-liquid interface flatness, and minimum oxygen content. The controller no longer relies on traditional linear proportional-integral-derivative (PID) regulators, but instead employs a control algorithm architecture based on deep reinforcement learning.

[0056] The deep reinforcement learning-based control algorithm is the twin delayed deep deterministic policy gradient algorithm (TD3). Its network architecture includes one actor network, one target actor network, two critic networks, and two target critic networks. The structure of each network is as follows. Actor network: a 4-layer fully connected network. The input layer dimension matches the state space dimension and is set to 16; the first hidden layer has 512 units with the ReLU activation function; the second hidden layer has 256 units with the ReLU activation function; the output layer dimension matches the action space dimension and is set to 4, with the Tanh activation function, so that the output range is [-1, 1], corresponding to the normalized adjustment increment of each actuator. Target actor network: its structure is identical to that of the actor network, and its parameters are kept synchronized with the actor network through soft updates. Critic networks: there are two identical critic networks, each a 4-layer fully connected network. The input layer takes the concatenation of the state vector and the action vector, with a dimension of 20; the first hidden layer has 512 units with the ReLU activation function; the second hidden layer has 256 units with the ReLU activation function; the output layer has a dimension of 1 and outputs the Q-value of the corresponding state-action pair, which is the expected cumulative reward. Target critic networks: there are two networks, each with the same structure as the corresponding critic network, and their parameters are kept synchronized with the corresponding critic networks through soft updates.
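As a quick check on the size of these networks, the following sketch counts the trainable parameters implied by the stated layer dimensions, assuming every layer is a plain fully connected layer with a bias vector. The helper name is hypothetical.

```python
def mlp_param_count(layer_sizes):
    """Weights plus biases of a fully connected network with the given layer sizes."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


actor_params = mlp_param_count([16, 512, 256, 4])    # state -> action
critic_params = mlp_param_count([20, 512, 256, 1])   # (state, action) -> Q-value

# One actor + one target actor, two critics + two target critics.
total_params = 2 * actor_params + 4 * critic_params
```

Under these assumptions the actor has 141,060 parameters and each critic has 142,337, about 0.85 million across all six networks, which is small enough for the real-time inference the controller requires.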

[0057] The nonlinear multi-objective cooperative controller learns the optimal control path under different growth stages and environmental disturbances through millions of iterations of process parameter combinations in an offline simulation environment.

[0058] The specific iterative optimization training process of the algorithm is as follows:
1. Construct an offline simulation training environment. Based on the physical mechanism model of single crystal growth in the digital twin simulation server, a simulation environment is built that simulates the entire single crystal growth process, including necking, constant-diameter growth, and tailing. Different training scenarios are set, varying for example the raw material impurity content, the initial state of the thermal field, and the environmental disturbances, so as to cover the full range of working conditions for single crystal growth.
2. Initialize the network parameters. Randomly initialize the weight parameters of the actor network and the two critic networks, and synchronize the weight parameters of the target actor network and the two target critic networks with their respective main networks. Set the experience replay pool capacity to 1,000,000 and the discount factor $\gamma$ to 0.99, and set the soft update coefficient $\tau$. Set the policy update delay $d$ to 2, the exploration noise standard deviation to 0.2, the noise clipping value to 0.5, the batch sampling size to 256, and the maximum number of training iterations to 5 million.
3. Perform interactive sampling in the simulation environment. At each time step, the actor network outputs an action $a$ based on the current state $s$. After Gaussian exploration noise is added, the action is sent to the simulation environment, which returns the next state $s'$, the immediate reward $r$, and the termination flag $\mathit{done}$. The resulting quintuple $(s, a, r, s', \mathit{done})$ is stored in the experience replay pool.
4. When the number of samples in the experience replay pool reaches the batch sampling size, randomly sample 256 groups of samples from the pool to update the network parameters.
5. Update the critic network parameters. For each sampled sample, the target actor network outputs the next action based on the next state, and clipped Gaussian noise is superimposed on it. The two target critic networks each output a target Q-value, and the smaller of the two is used to calculate the target Q-value:

$$y_i = r_i + \gamma \min_{j=1,2} Q'_j\left(s'_i, \tilde{a}'_i\right)$$

where $y_i$ is the target Q-value of the $i$-th sampled sample, $r_i$ is the immediate reward of the $i$-th sample, $s'_i$ is the next state to which the environment transitions after the action $a_i$ of the $i$-th sample is executed, $\tilde{a}'_i$ is the noise-perturbed output of the target actor network for $s'_i$, and $Q'_1$ and $Q'_2$ are the outputs of the two target critic networks. The parameters of the two critic networks are updated separately using the mean squared error loss function:

$$L_j = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - Q_j(s_i, a_i)\right)^2$$

where $j = 1, 2$ indexes the two critic networks, $N$ is the batch sampling size, and $Q_j(s_i, a_i)$ is the value estimate output by the $j$-th main critic network for the current state $s_i$ and action $a_i$.
6. Once every two critic network updates, update the actor network and the target networks. The parameters of the actor network are updated using the deterministic policy gradient theorem:

$$\nabla_\phi J(\phi) = \frac{1}{N}\sum_{i=1}^{N} \nabla_a Q_1(s_i, a)\Big|_{a=\pi_\phi(s_i)} \nabla_\phi \pi_\phi(s_i)$$

where $\nabla_\phi J(\phi)$ is the policy gradient of the actor network, $J$ is the objective function, $\nabla_a Q_1$ is the gradient of the Q-value of the first main critic network with respect to the action $a$, $\nabla_\phi \pi_\phi$ is the gradient of the actor network's output action with respect to the network parameters $\phi$, and $\pi_\phi$ is the policy function corresponding to the actor network. The parameters of the target actor network and the target critic networks are updated using soft updates:

$$\theta' \leftarrow \tau\,\theta + (1 - \tau)\,\theta'$$

where $\theta'$ denotes the target network parameters and $\theta$ the main network parameters.
7. Repeat steps 3 to 6 until the maximum number of training iterations is reached or the accumulated reward converges to a stable value. Training is then complete, and the converged actor network is used as the final control policy network, which outputs the optimal actuator adjustment increments in real time.
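The critic target and the soft update in steps 5 and 6 can be sketched with toy stand-ins for the networks. The sketch follows the standard TD3 form (clipped noise on the target action, minimum of the two target critics, y = r + γ·min Q'); the target-noise standard deviation of 0.2 and the soft update coefficient of 0.005 are assumptions, since the text does not state them for this purpose, and the lambda "networks" are placeholders.

```python
import random


def clipped_noise(std, clip, rng):
    """Gaussian noise limited to [-clip, clip]."""
    return max(-clip, min(clip, rng.gauss(0.0, std)))


def td3_target(reward, next_state, target_actor, target_q1, target_q2,
               gamma=0.99, noise_std=0.2, noise_clip=0.5, rng=random):
    """Target Q-value: r + gamma * min(Q1', Q2') at the smoothed target action."""
    next_action = [
        max(-1.0, min(1.0, a + clipped_noise(noise_std, noise_clip, rng)))
        for a in target_actor(next_state)
    ]
    q_min = min(target_q1(next_state, next_action),
                target_q2(next_state, next_action))
    return reward + gamma * q_min


def soft_update(target_params, main_params, tau):
    """theta' <- tau * theta + (1 - tau) * theta', element by element."""
    return [tau * p + (1.0 - tau) * tp
            for tp, p in zip(target_params, main_params)]


# Toy stand-ins for the target networks (hypothetical constants).
toy_actor = lambda state: [0.0, 0.0, 0.0, 0.0]
toy_q1 = lambda state, action: 1.0
toy_q2 = lambda state, action: 2.0

y = td3_target(0.5, [0.0] * 16, toy_actor, toy_q1, toy_q2)
new_target = soft_update([0.0, 1.0], [1.0, 1.0], tau=0.005)
```

Taking the minimum of the two critics is what counters Q-value overestimation; with the toy critics above the target uses 1.0 rather than 2.0.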

[0059] When the controller receives a future state deviation warning from the digital twin simulation server, its internal multi-objective optimization engine automatically calculates the optimal adjustment increment for the actuators. The actuators include a heater, a lifting mechanism, and a rotating mechanism. The controller achieves predictive intervention in the growth environment by dynamically adjusting the nonlinear coupling relationship between heating power parameters, crucible rotation speed parameters, and lifting speed parameters.

[0060] For example, when the acoustic sensing unit detects a specific turbulence spectrum inside the melt, and the digital twin inference server predicts that this turbulence will cause a shape reversal at the solid-liquid interface within the next two minutes, the nonlinear multi-objective cooperative controller executes a proactive phase compensation strategy. This strategy includes: reducing the heating power in advance with a slight slope, increasing the crucible rotation speed to enhance the centrifugal force's suppression of convection, and performing small compensating oscillations at the lifting speed to counteract the hysteresis effect caused by thermal inertia. This proactive control mechanism eliminates diameter fluctuations caused by feedback lag.

[0061] The nonlinear multi-objective cooperative controller possesses adaptive evolution capabilities. Due to differences in the microscopic impurity content (such as oxygen and carbon concentration) and thermal conductivity between different batches of quartz crucibles, graphite hot zone components, and polycrystalline silicon raw materials, the controller can adjust the weight parameters in the control function in real time based on initial sensor feedback for each batch. This intelligent evolutionary capability enables the system to automatically evolve the optimal growth process curve for the specific operating conditions of the current batch.

[0062] During different stages of crystal growth, such as the necking stage, the constant diameter stage, and the finishing stage, the nonlinear multi-objective cooperative controller automatically switches between different control modes. In the constant diameter stage, the controller focuses on extremely high-precision diameter stability control; while in the finishing stage, the controller is configured to prevent crystallization caused by uneven thermal field at the tail end and control the drastic rebound of oxygen content through a specific combination strategy of pulling speed and temperature control.

[0063] The nonlinear multi-objective cooperative controller also integrates hardware health monitoring functionality. It monitors the encoder signals and motor current of the lifting and rotating mechanisms. When the detected mechanical vibration amplitude exceeds a preset safety threshold, the controller automatically calls upon data from the acoustic sensing unit for cross-validation to determine whether the vibration originates from a flow field burst within the melt or external mechanical interference. If the vibration is a spurious diameter fluctuation caused by external environmental vibration, the controller automatically executes filtering logic, enters inertial holding mode, and maintains stable operating parameters of the current actuator until the interference signal disappears.

[0064] The process optimization management platform, connected to the nonlinear multi-objective collaborative controller and the digital twin simulation server, is used to construct a full lifecycle process data asset library. The platform stores complete multimodal sensing raw data for each batch, the digital twin simulation trajectory, and the actual operation logs of the controller. The platform has a built-in correlation analysis engine used to correlate and model the electrical performance indicators of the final produced ingots (such as minority carrier lifetime and resistivity distribution) with the multimodal characteristics during the growth process. By analyzing the deviation between actual growth parameters and digital twin predicted parameters, the platform is used to perform offline or online correction of the physical model constants in the digital twin, continuously improving the simulation accuracy of subsequent batches.

[0065] Furthermore, the multimodal fusion control system also includes an emergency safety module. This module is used to automatically trigger a redundancy estimation strategy based on the remaining modes when any type of mode fails in the heterogeneous sensing network. For example, if the visual sensing unit cannot accurately extract the diameter due to fogging of the quartz window, the emergency module will use a digital twin to infer the current virtual diameter based on the interface acoustic emission intensity captured by the acoustic sensing unit and the power consumption trajectory recorded by the thermodynamic sensing unit, and guide the system to safely enter a controlled shutdown or maintenance state to reduce losses.

[0066] Example 2: Building upon Example 1, this example provides a multimodal fusion control system for semiconductor single crystal growth equipment based on a collaborative edge-cloud computing architecture. In this architecture, data processing for the heterogeneous sensing network is distributed between the edge processing node closest to the furnace and the remote central control center, further enhancing the system's real-time response capability and global process optimization capability.

[0067] In this embodiment, the visual perception unit in the heterogeneous sensing network further includes a set of arrayed infrared visual sensors deployed around the periphery of the single crystal furnace. These sensors are used to acquire multi-view images of the solid-liquid interface from multiple observation windows. The edge processing node is used to perform real-time three-dimensional reconstruction of these multi-view images, constructing a real-time three-dimensional topological morphology of the solid-liquid interface. Compared to single-view two-dimensional diameter measurement, this three-dimensional topological morphology can accurately reflect whether the crystal exhibits eccentricity or distortion under high-temperature conditions.

[0068] The acoustic sensing unit is further enhanced in this embodiment. The array-type high-temperature resistant acoustic sensor not only monitors the vibration of the crucible wall but also includes an ultrasonic transducer mounted on the top of the lifting rod. This transducer is used to emit ultrasonic pulses into the single crystal rod and receive the reflected echoes. By analyzing the propagation speed and attenuation characteristics of the pulses inside the crystal, the system can assess the stress distribution inside the single crystal and the presence of microscopic defect clusters in real time. The edge processing node extracts the physical state features inside the crystal in real time through complex deconvolution processing of the acoustic wave reflection path and inputs them as independent feature dimensions into the multimodal spatiotemporal alignment device.

[0069] In this embodiment, the thermodynamic sensing unit integrates a temperature sensor array based on fiber Bragg gratings. These fiber optic sensors are embedded in a graphite heat shield and heater support. Due to the strong electromagnetic interference resistance of fiber optic sensors, they can provide faster and more accurate point source temperature measurements than traditional thermocouples, and are unaffected by induced magnetic fields. By acquiring these discretely distributed, high-precision temperature points and combining them with magnetohydrodynamic parameters, the thermodynamic sensing unit provides more refined thermal field boundary constraints for the digital twin simulation server.

[0070] In this embodiment, the multimodal spatiotemporal alignment device employs a communication protocol based on deterministic delay compensation. Due to differences in the physical distance and processing complexity of different sensors, the arrival time of signals to the controller varies slightly. The alignment device is used to automatically add a high-precision timing tag to the header of the data packet based on the known processing delay of each mode. Using a global synchronization clock, the device performs phase calibration on all physical signals at the logic level, ensuring that the state vector received by the nonlinear multi-objective cooperative controller is an instantaneous process profile.
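A minimal sketch of deterministic delay compensation follows. It assumes each modality has a known, fixed processing delay that is subtracted from the arrival time, and that samples are then linearly interpolated onto a common instant of the global clock. The delays, sample values, and function names are hypothetical.

```python
def compensate(samples, processing_delay):
    """Shift (arrival_time, value) pairs back to their estimated physical time."""
    return [(t - processing_delay, v) for t, v in samples]


def value_at(samples, t):
    """Linearly interpolate time-sorted (time, value) samples at time t."""
    if t <= samples[0][0]:
        return samples[0][1]
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return samples[-1][1]  # t is past the last sample


# Hypothetical arrival-time streams: diameter (mm) and acoustic intensity.
visual = compensate([(0.04, 200.0), (0.14, 200.2)], processing_delay=0.04)
acoustic = compensate([(0.01, 0.30), (0.11, 0.50)], processing_delay=0.01)

# State snapshot at a common instant on the global clock.
snapshot_time = 0.05
snapshot = (value_at(visual, snapshot_time), value_at(acoustic, snapshot_time))
```

Both streams are evaluated at the same physical instant, so the controller sees one consistent process profile rather than values captured at slightly different times.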

[0071] In this embodiment, the digital twin simulation server employs a distributed computing architecture. Lightweight fluid dynamics simulations are executed on the edge processing nodes to provide real-time feedback on a timescale of seconds, while high-precision, fully coupled thermal field simulations are performed on a cloud server cluster. The cloud server cluster performs long-range, cross-scale simulations using the finite volume method, considering not only single growth cycles but also the aging and degradation of thermal field components over time. Long-range compensation curves generated in the cloud are periodically sent to the edge controller to correct the baseline deviation of the local control logic.

[0072] In this embodiment, the nonlinear multi-objective cooperative controller is configured with a two-layer control structure. The bottom-layer controller consists of a high-speed computing module based on a field-programmable gate array (FPGA) for performing millisecond-level closed-loop control of heater power consumption and mechanical position. The top-layer controller consists of a decision module based on a deep learning processor for performing global process optimization based on multimodal fusion.

[0073] When sudden environmental disturbances occur during system operation, such as power fluctuations or foundation vibrations in the workshop, the bottom-level controller can perform rapid clamping operations based on the instantaneous feedback from the thermodynamic sensing unit to prevent overshoot of the actuators. Meanwhile, the top-level controller analyzes the redundancy characteristics of acoustics and vision to determine the persistence and depth of the disturbance, dynamically reshaping the objective function to enable the system to recover to the target growth state through the optimal energy consumption path.

[0074] In this embodiment, the process optimization management platform possesses enhanced intelligent analysis capabilities. It not only stores historical data but also utilizes generative adversarial networks to construct numerous virtual abnormal operating conditions. By testing the robustness of the controller in this virtual environment, the platform can automatically identify potential risks in the control logic under specific extreme conditions and push optimized model patches online.

[0075] The heterogeneous sensing network in this embodiment also includes monitoring the composition of exhaust gas discharged from the single crystal furnace. By monitoring the concentration of trace amounts of carbon monoxide in the exhaust gas in real time, the system can infer the intensity of the chemical reaction between the graphite component and the quartz crucible at high temperatures, and the nonlinear multi-objective collaborative controller can fine-tune the argon flow rate and pressure accordingly, thereby indirectly optimizing the carbon and oxygen content inside the single crystal.

[0076] Example 3: Based on the above examples, this example describes in detail the specific decision-making logic and multi-objective balancing mechanism based on deep reinforcement learning in the nonlinear multi-objective cooperative controller.

[0077] The nonlinear multi-objective cooperative controller internally establishes a reinforcement learning model based on state space, action space, and reward function. The state space consists of a multi-dimensional feature vector output by the multimodal spatiotemporal alignment device, which includes diameter deviation, interface curvature, melt viscosity, turbulence index, real-time power consumption, current lifting speed, and crucible rotation speed.

[0078] The multidimensional feature vector of the state space is 16-dimensional. The composition and normalization method of each dimension are as follows. Dimension 1: crystal diameter deviation $\Delta D$, calculated as the difference between the actual measured diameter and the set diameter and normalized to the interval [-1, 1]. The normalization formula is:

$$\Delta D_{norm} = \frac{\Delta D}{\Delta D_{max}}$$

where $\Delta D_{max}$ is the maximum allowable diameter deviation, set to ±2 mm in this embodiment, and $\Delta D_{norm}$ is the normalized crystal diameter deviation.

[0079] Dimension 2: average curvature of the solid-liquid interface, obtained by fitting the solid-liquid interface contour extracted from infrared thermal images, and normalized to the [0, 1] interval;
Dimension 3: average melt viscosity, obtained by inversion from the acoustic signature features of the acoustic sensing unit, and normalized to the [0, 1] interval;
Dimension 4: melt turbulence index, calculated from the acoustic spectrum characteristics of the acoustic sensing unit as the ratio of the turbulent fluctuation velocity to the average flow velocity, and normalized to the [0, 1] interval;
Dimension 5: real-time power consumption of the heater, normalized to the [0, 1] interval, with the rated power of the heater as the normalization reference;
Dimension 6: real-time crystal pulling speed, normalized to the [0, 1] interval, with the maximum rated speed of the lifting mechanism as the normalization reference;
Dimension 7: real-time rotation speed of the crucible, normalized to the [0, 1] interval, with the maximum rated speed of the crucible rotating mechanism as the normalization reference;
Dimension 8: real-time external magnetic field strength, normalized to the [0, 1] interval, with the maximum rated magnetic field strength of the magnetic field generator as the normalization reference;
Dimension 9: average temperature gradient at the solid-liquid interface, obtained by the infrared thermal imaging sensor, and normalized to the [0, 1] interval;
Dimension 10: predicted value of the melt oxygen concentration, output by the digital twin inference server, and normalized to the [0, 1] interval;
Dimension 11: real-time argon gas flow rate, normalized to the [0, 1] interval, with the maximum rated flow rate of the argon system as the normalization reference;
Dimension 12: real-time pressure inside the furnace, normalized to the [0, 1] interval, with the rated working pressure inside the furnace as the normalization reference;
Dimension 13: real-time drop height of the melt surface, calculated by the mass conservation formula, and normalized to the [0, 1] interval;
Dimension 14: predicted diameter change in the next 30 seconds, output by the digital twin inference server, and normalized to the [-1, 1] interval;
Dimension 15: predicted interface curvature change in the next 30 seconds, output by the digital twin inference server, and normalized to the [-1, 1] interval;
Dimension 16: predicted oxygen concentration change in the next 30 seconds, output by the digital twin inference server, and normalized to the [-1, 1] interval.
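The two normalization patterns used for the state vector can be sketched as follows. Dividing by a reference value follows the text; clipping out-of-range values to the interval bounds is an added assumption, and the example readings (a 60 kW rated heater, a 0.5 mm diameter deviation) are hypothetical.

```python
def normalize_signed(value, max_abs):
    """Scale a signed quantity to [-1, 1]; clipping is an assumed safeguard."""
    return max(-1.0, min(1.0, value / max_abs))


def normalize_unit(value, reference):
    """Scale a non-negative quantity to [0, 1] relative to its rated reference."""
    return max(0.0, min(1.0, value / reference))


# Dimension 1: diameter deviation of 0.5 mm against the 2 mm maximum deviation.
diameter_dev = normalize_signed(0.5, 2.0)

# Dimension 5: heater drawing 45 kW against a hypothetical 60 kW rating.
heater_power = normalize_unit(45.0, 60.0)

# An out-of-range deviation saturates at the interval bound.
saturated = normalize_signed(-3.0, 2.0)
```

Keeping every dimension on a comparable scale prevents large-magnitude quantities such as heater power from dominating the inputs to the actor and critic networks.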

[0080] The action space is defined as the heater power increment, the lifting motor frequency increment, the crucible motor frequency increment, and the magnetic field strength adjustment increment.

[0081] The reward function is defined as a weighted sum of four sub-objectives:

$$R = w_1 R_1 + w_2 R_2 + w_3 R_3 + w_4 R_4$$

where $w_1$, $w_2$, $w_3$, and $w_4$ are the weight coefficients of the four sub-objectives, set within the ranges $w_1 \in [0.5, 0.7]$, $w_2 \in [0.15, 0.25]$, $w_3 \in [0.05, 0.15]$, and $w_4 \in [0.05, 0.15]$, and satisfying $w_1 + w_2 + w_3 + w_4 = 1$. In this embodiment, the preferred weights are $w_1 = 0.6$, $w_2 = 0.2$, $w_3 = 0.1$, and $w_4 = 0.1$. The specific forms of the sub-objectives are as follows.

The first sub-objective, $R_1$, is the diameter control reward term: $R_1 = -\sum (\Delta D)^2$, where $\Delta D$ is the deviation of the actual measured diameter from the set diameter. This sub-objective penalizes the diameter deviation through the negative of the sum of squared deviations. The larger the deviation, the lower the reward value, which keeps the crystal diameter constant.

The second sub-objective, $R_2$, is the interface flatness reward term: $R_2 = -|C - C_{ideal}|$, where $C$ is the curvature of the solid-liquid interface and $C_{ideal}$ is the curvature of the ideal flat interface, with a value of 0. This sub-objective penalizes the deviation of the interface curvature from ideal flatness in absolute-value form. The larger the absolute value of the curvature, the lower the reward value, which keeps the solid-liquid interface as flat as possible.

The third sub-objective, $R_3$, is the power consumption fluctuation penalty term. It is a function of $\Delta P$, the change in heater power per unit time. The greater the power fluctuation, the lower the reward value, which avoids drastic adjustments of the heating power and improves thermal field stability.

The fourth sub-objective, $R_4$, is the oxygen content control reward term. It is a function of the deviation between the predicted oxygen concentration and $O_{target}$, the target oxygen concentration for single crystal growth. The greater the deviation, the lower the reward value, which allows the oxygen content of the single crystal to be controlled precisely.
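The weighted-sum reward of paragraph [0081] can be sketched as follows. This is a minimal illustration, not the patented implementation: the class and function names are hypothetical, the squared diameter term is evaluated for a single sample, and the absolute-value forms used for the power and oxygen terms are assumptions, since the text states only that those terms decrease as the deviation grows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    """Weights of the four sub-objectives; defaults are the preferred values in [0081]."""
    diameter: float = 0.6
    flatness: float = 0.2
    power: float = 0.1
    oxygen: float = 0.1

    def __post_init__(self):
        total = self.diameter + self.flatness + self.power + self.oxygen
        if abs(total - 1.0) > 1e-9:
            raise ValueError("weights must sum to 1")


def reward(diameter_dev, curvature, power_change, oxygen_pred, oxygen_target,
           weights=RewardWeights(), curvature_ideal=0.0):
    """Return the scalar reward for one control step."""
    r1 = -diameter_dev ** 2                   # squared diameter deviation (stated in the text)
    r2 = -abs(curvature - curvature_ideal)    # absolute curvature deviation (stated in the text)
    r3 = -abs(power_change)                   # ASSUMED form of the power fluctuation penalty
    r4 = -abs(oxygen_pred - oxygen_target)    # ASSUMED form of the oxygen deviation penalty
    return (weights.diameter * r1 + weights.flatness * r2
            + weights.power * r3 + weights.oxygen * r4)
```

In practice the four inputs would need to be normalized to comparable scales before weighting; otherwise the term with the largest physical units dominates regardless of the weights.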

[0082] During the growth process, the nonlinear multi-objective cooperative controller continuously observes changes in the state vector and outputs the optimal action combination in real time. To meet the stringent stability requirements of semiconductor processes, the reinforcement learning model is wrapped in a layer of hard physical constraints that monitors the action output. If an action increment output by the reinforcement learning model exceeds the safe slope threshold calculated from physical thermodynamics (e.g., the instantaneous rate of change of the lifting speed exceeds the limit beyond which the neck may break), the control system automatically clips the action to the safe range.
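The hard-constraint layer described here amounts to clipping each action increment to a per-actuator bound before it reaches the hardware. A minimal sketch, assuming symmetric limits and hypothetical actuator names; the text derives the thresholds from thermodynamics but does not give values, so the numbers in the usage example are placeholders.

```python
def clip_action(action, limits):
    """Clip each increment to [-limit, +limit].

    action: dict mapping actuator name to the increment proposed by the policy
    limits: dict mapping actuator name to the maximum safe absolute increment
    Returns (safe_action, clipped_names).
    """
    safe, clipped = {}, []
    for name, value in action.items():
        limit = limits[name]
        bounded = max(-limit, min(limit, value))
        if bounded != value:
            clipped.append(name)  # record which outputs were overridden
        safe[name] = bounded
    return safe, clipped


# Usage with placeholder numbers (not from the source):
proposed = {"heater_power": 0.5, "lift_freq": 3.0, "crucible_freq": -0.2, "field": -5.0}
bounds = {"heater_power": 1.0, "lift_freq": 1.0, "crucible_freq": 1.0, "field": 2.0}
safe_action, overridden = clip_action(proposed, bounds)
```

Returning the list of clipped actuators lets the caller log how often the learned policy is overridden, which is one way to judge whether it needs retraining.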

[0083] In the mid-to-late stages of the constant-diameter phase, as the amount of melt in the crucible decreases, the heat transfer characteristics of the thermal field undergo nonlinear drift. The digital twin inference server provides a time-varying background field compensation coefficient to the nonlinear multi-objective cooperative controller. The controller incorporates this compensation coefficient into the dynamic calculation of the reward function, enabling the system to reduce the base heating power predictively and, in conjunction with adjusting the lift position of the crucible, keep the solid-liquid interface stable at the center of the thermal field.

[0084] The nonlinear multi-objective cooperative controller is also used to handle degradation strategies after sensor failure. When the system detects a decrease in the data reliability of the vision unit through consistency checks (such as low image contrast due to deposition occlusion), the controller dynamically adjusts the weights of the feature space. The system increases the feedback weights of the acoustic and thermodynamic sensing units, and uses the intensity of melt surface oscillations derived from acoustic signature features to calculate the virtual diameter. This multimodal redundancy strategy ensures that the single crystal growth system can maintain a controlled state and avoid process deviations even under harsh operating conditions.
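The weight adjustment in this degradation strategy can be illustrated by scaling each modality's feedback weight by a reliability score and renormalizing. This is a sketch under assumptions: the modality names, the base weights, and the idea of a reliability score in [0, 1] produced by the consistency check are all hypothetical, and the computation of the virtual diameter from acoustic features is not shown.

```python
def reweight(base_weights, reliability):
    """Scale modality weights by reliability in [0, 1] and renormalize to sum to 1."""
    scaled = {
        name: weight * max(0.0, min(1.0, reliability[name]))
        for name, weight in base_weights.items()
    }
    total = sum(scaled.values())
    if total == 0:
        raise ValueError("no reliable modality left")
    return {name: weight / total for name, weight in scaled.items()}


# Vision window occluded by deposition: its reliability drops to zero,
# so the acoustic and thermodynamic units take over the feedback.
weights = reweight(
    {"vision": 0.6, "acoustic": 0.2, "thermodynamic": 0.2},
    {"vision": 0.0, "acoustic": 1.0, "thermodynamic": 1.0},
)
```

Because the weights are renormalized, lowering one modality's reliability automatically raises the share of the others, which matches the behavior the paragraph describes.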

[0085] Example 4: Based on the above examples, this example describes in detail the deep integration and deduction logic of the physical mechanism model and the data-driven model of the digital twin inference server.

[0086] In this embodiment, the digital twin simulation server is first used to establish a high-fidelity geometric model, which describes the detailed physical dimensions of the heating system, thermal insulation components, quartz crucible, and single crystal rod within the single crystal furnace. The physical mechanism model integrated within the server employs a multi-field coupled solver based on the finite volume method. This solver is used to comprehensively analyze the heat transfer, mass transfer, and flow phenomena within the melt.

[0087] The solver takes the real-time power consumption, rotational speed, and magnetic field strength parameters obtained from the heterogeneous sensing network as inputs, and calculates the three-dimensional velocity vector distribution inside the melt according to the Navier-Stokes equations. The Boussinesq approximation is used to handle buoyancy-driven convection caused by temperature differences. The solver also handles thermal radiation exchange at the melt surface, applying the discrete ordinates method to calculate the radiative heat transfer between the graphite components and the melt. This physics-based calculation provides the local temperature gradient at the crystal growth interface, which determines the upper limit of the crystal growth rate.
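For reference, buoyancy-driven melt convection of this kind is conventionally written as the incompressible Navier-Stokes equations under the Boussinesq approximation. The form below is the standard textbook statement, not an equation taken from the source; the symbols are assumptions introduced here ($\mathbf{u}$ velocity, $p$ pressure with the hydrostatic part removed, $T$ temperature, $T_0$ reference temperature, $\rho_0$ reference density, $\mu$ dynamic viscosity, $\beta$ thermal expansion coefficient, $\alpha$ thermal diffusivity, $\mathbf{g}$ gravitational acceleration). The Lorentz force from the applied magnetic field is omitted for brevity.

$$\nabla \cdot \mathbf{u} = 0$$

$$\rho_0 \left( \frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u} \cdot \nabla)\mathbf{u} \right) = -\nabla p + \mu \nabla^2 \mathbf{u} - \rho_0 \beta (T - T_0)\,\mathbf{g}$$

$$\frac{\partial T}{\partial t} + \mathbf{u} \cdot \nabla T = \alpha \nabla^2 T$$

Density variations appear only in the buoyancy term $-\rho_0 \beta (T - T_0)\,\mathbf{g}$, which is what allows the flow to be treated as incompressible everywhere else.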

[0088] The data-driven model is based on deep learning architectures such as long short-term memory networks or variational autoencoders. It is used to extract nonlinear patterns from large volumes of historical process data. These patterns include the decay in thermal conductivity of the graphite thermal field components with use, the softening and deformation of the quartz crucible at high temperatures, and the influence on the diameter of random eddies formed by the argon flow within the furnace cavity.

[0089] The digital twin inference server dynamically weights and fuses the deterministic output of the physical mechanism model with the probabilistic deviation prediction of the data-driven model. The fusion logic is as follows: during the stable early and middle stages of operation (the constant-diameter stage), the weight of the physical mechanism model is increased to maintain high-precision inference of physical parameters; during the necking stage, the tailing stage, or periods of severe external interference, the weight of the data-driven model is increased to capture stochastic nonlinear phenomena that the physical mechanism model cannot readily describe.
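The stage-dependent fusion can be sketched as a convex combination whose weight depends on the growth stage. The weights below are hypothetical placeholders, since the source gives no numerical values, and the data-driven model is reduced here to the mean of its predicted deviation.

```python
# Hypothetical physics-model weights per growth stage (not from the source).
PHYSICS_WEIGHT = {"necking": 0.3, "constant_diameter": 0.8, "tailing": 0.3}
DISTURBED_PHYSICS_WEIGHT = 0.3  # assumed cap during severe external interference


def fuse_prediction(physics_value, deviation_mean, stage, disturbed=False):
    """Blend the physics output with the data-driven deviation estimate."""
    w_physics = PHYSICS_WEIGHT[stage]
    if disturbed:
        w_physics = min(w_physics, DISTURBED_PHYSICS_WEIGHT)
    w_data = 1.0 - w_physics
    # Equivalent to w_physics * physics + w_data * (physics + deviation).
    return physics_value + w_data * deviation_mean
```

With these placeholder weights, a physics prediction of 100.0 and a predicted deviation of 2.0 fuse to about 100.4 in the constant-diameter stage and about 101.4 during necking, so the data-driven correction counts for more when the process is less stable.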

[0090] The prediction results output by the digital twin simulation server include a three-dimensional stress field distribution map of the future growth interface. This stress field distribution map is used to assess the probability distribution of dislocation formation during the crystal's cooling process. When the predicted stress exceeds a preset safety limit, the nonlinear multi-objective cooperative controller executes a smooth deceleration strategy, that is, fine-tuning the ratio of pulling speed to rotational speed without changing the diameter, to optimize the thermal equilibrium state of the interface.

[0091] The server is also used to calculate the transport trajectory of oxygen atoms inside the melt. Since the oxygen content in monocrystalline silicon mainly depends on the rate at which the quartz crucible is eroded by the melt and the rate of oxygen volatilization from the melt surface, the physical model calculates the oxygen dissolution flux at the crucible wall in real time based on the near-wall flow velocity derived from the acoustic sensing unit. The oxygen evaporation rate at the liquid surface is calculated by combining argon pressure and flow parameters. The digital twin simulation server ultimately outputs a real-time oxygen concentration distribution map at the interface, which is used as a feedback reference for a nonlinear multi-objective cooperative controller. This controller uses the magnetic field strength to suppress or enhance melt convection, achieving millisecond-level closed-loop control of the monocrystalline oxygen content.

[0092] Example 5: Based on the above examples, this example describes how the process optimization management platform uses multimodal data from the edge side to perform full-process traceability and self-evolution.

[0093] The process optimization management platform is used to build a large-scale data center based on a time-series database. For each batch of single crystal growth, the platform not only records the controller's instructions and the sensor's measurements, but also records the intermediate state variables generated by the digital twin simulation server at each moment, such as the highest flow rate inside the melt and the average thermal gradient.

[0094] The platform incorporates a set of optimization algorithms based on transfer learning. When downstream processing detects abnormal fluctuations in the resistivity distribution of crystal rods produced from a given batch of raw material, process engineers can use the platform to trace back to any point in the growth process. The platform automatically analyzes the multimodal characteristics at that moment. For example, it may identify an abnormal low-frequency pulse captured by the acoustic sensing unit at the 1500 mm position of the constant-diameter section, and find that this pulse closely coincides in time with the interface instability predicted by the digital twin inference server.
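The time-coincidence check in this example can be illustrated with a nearest-timestamp match between two event lists. This is a sketch only: the function name, the use of plain numeric timestamps, and the fixed tolerance are assumptions, and the source does not describe how the platform actually performs the correlation.

```python
from bisect import bisect_left


def coincident_events(acoustic_times, predicted_times, tolerance):
    """Pair each acoustic anomaly with the nearest predicted instability
    whose timestamp differs by at most `tolerance` (same time unit)."""
    predicted = sorted(predicted_times)
    pairs = []
    for t in acoustic_times:
        i = bisect_left(predicted, t)
        # Only the neighbors on either side of the insertion point can be nearest.
        candidates = predicted[max(0, i - 1): i + 1]
        if not candidates:
            continue
        nearest = min(candidates, key=lambda p: abs(p - t))
        if abs(nearest - t) <= tolerance:
            pairs.append((t, nearest))
    return pairs
```

For instance, acoustic anomalies at times 10.0, 50.0, and 120.0 checked against predicted instabilities at 11.0, 90.0, and 121.5 with a tolerance of 2.0 yield two matched pairs, and the anomaly at 50.0 is left unmatched.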

[0095] The process optimization management platform uses these correlation analysis results to automatically update the reinforcement learning model of the nonlinear multi-objective cooperative controller. By introducing the abnormal operating condition as a negative sample in offline training, the system can evolve more robust control logic for avoiding such anomalies. This self-evolutionary capability allows the system to continuously accumulate process experience for specific raw materials and specific thermal field conditions.

[0096] The platform is also used for preventative maintenance of the single-crystal furnace actuators. By analyzing the current spectrum of the pulling motor recorded by the thermodynamic sensing unit and the bearing friction noise captured by the acoustic sensing unit, the platform can use fault prediction algorithms to provide early warnings of potential failure risks of the pulling mechanism several days in advance and recommend maintenance before the next furnace cycle. This integrated management model, combining process awareness and equipment health, improves the overall operating efficiency of the semiconductor single-crystal production line.

[0097] The specific embodiments described above further illustrate the purpose, technical solution, and beneficial effects of the present invention. It should be understood that the above descriptions are merely specific embodiments of the present invention and are not intended to limit the present invention. Any modifications, equivalent substitutions, or improvements made within the spirit and principles of the present invention should be included within the scope of protection of the present invention. Without departing from the spirit of the present invention, those skilled in the art can make various equivalent transformations, all of which fall within the scope of protection of the present invention. In the description of the present invention, the names of various components, modules, units, or devices are only for descriptive convenience and do not constitute a unique limitation on the hardware structure. In practical applications, the functions described in the present invention can be implemented through software, hardware, firmware, or a combination thereof. For those skilled in the art, after referring to the contents of this specification, they can implement the technical solutions described in the present invention according to specific engineering needs without creative effort; these all fall within the scope of protection of the present invention.

Claims

1. A multimodal fusion control system for a semiconductor single crystal growth apparatus, characterized in that it comprises:
a heterogeneous sensing network, used to collect multidimensional physical state information during the semiconductor single crystal growth process from all directions;
a multimodal spatiotemporal alignment device, connected to the heterogeneous sensing network, used to semantically align the time-series signals, spatial distribution images, and point source data acquired by the heterogeneous sensing network and to map them to a unified feature space;
a digital twin inference server, connected to the multimodal spatiotemporal alignment device and containing a digital twin of the single crystal growth process corresponding to the physical apparatus, used to receive the aligned multimodal data in real time, to predict through inference logic the crystal defect distribution trend and the change in solid-liquid interface shape at the next predetermined time scale, and to output prediction results for the future state shift caused by thermal inertia; and
a nonlinear multi-objective cooperative controller, connected to the digital twin inference server, used to dynamically adjust the coupling relationship between heating power parameters, crucible rotation speed parameters, and lifting speed parameters based on the prediction results, with multiple objective functions including constant crystal diameter, optimal solid-liquid interface flatness, and minimum oxygen content;
wherein the digital twin inference server is equipped with a dual-drive fusion engine that integrates a physical mechanism model and a data-driven model;
the physical mechanism model is based on the heat and mass transfer equations, the Navier-Stokes equations, and the Christaller distribution equation, uses the finite volume method to numerically calculate the temperature field, velocity field, and solute concentration field inside the melt, and uses the discrete ordinates method to calculate the radiative heat transfer between the graphite components and the melt;
the data-driven model is based on historical growth data and uses recurrent neural networks or graph neural networks to probabilistically compensate for nonlinear random disturbances;
the digital twin inference server is used to calculate the strain rate distribution at the solid-liquid interface and, in conjunction with a preset dislocation multiplication physics model, to evaluate the crystal integrity index at the current growth rate; and
the prediction results include a prediction curve of the diameter trend over a future period and a geometric compensation coefficient for the change in thermal environment caused by the drop in the melt level.

2. The multimodal fusion control system for a semiconductor single crystal growth apparatus according to claim 1, characterized in that the heterogeneous sensing network includes an acoustic sensing unit; the acoustic sensing unit includes multiple high-temperature-resistant acoustic sensors, which are arranged in an array on the outer wall of the crucible in the single crystal growth apparatus and are acoustically matched to the crucible wall through a high-temperature-resistant ceramic coupling agent; the acoustic sensing unit integrates a signal demodulation module, which is used to convert the acquired time-domain acoustic signal into a frequency-domain feature vector; the acoustic sensing unit is equipped with a voiceprint recognition module, which is used to match the frequency-domain feature vector against a preset flow field modal fingerprint database and to determine, through voiceprint feature inversion logic, the viscosity change, flow field turbulence intensity, and vortex distribution characteristics caused by convection instability in the invisible region inside the melt; and the flow field modal fingerprint database contains characteristic acoustic spectra under different rotational speeds, different temperature gradients, and different impurity concentrations.

3. The multimodal fusion control system for a semiconductor single crystal growth apparatus according to claim 2, characterized in that the heterogeneous sensing network includes a visual sensing unit; the visual sensing unit includes a high-resolution infrared thermal imaging sensor and an image processor, arranged at the observation window on top of the single crystal growth apparatus; the high-resolution infrared thermal imaging sensor is used to capture, in real time and by high-speed infrared thermal imaging, the microstructure of the solid-liquid interface and the distribution of the temperature gradient field; the visual sensing unit is also equipped with an actively cooled filter assembly, which eliminates the background thermal radiation interference generated by the heating element through an optical interference coating; the image processor is used to identify the three-phase boundary line between the melt surface and the growing single crystal rod through a sub-pixel edge extraction algorithm, and to calculate the real-time crystal diameter from the geometric characteristics of the three-phase boundary line; and the image processor is also used to analyze the meniscus height and curvature near the solid-liquid interface as geometric parameters for evaluating interface stability.

4. The multimodal fusion control system for a semiconductor single crystal growth apparatus according to claim 3, characterized in that, The heterogeneous sensing network includes thermodynamic sensing units; The thermodynamic sensing unit is used to perform high-frequency sampling of the voltage parameters, current parameters, and magnetohydrodynamic parameters of the heater to construct coupled input data of electromagnetic field and thermal field; The magnetohydrodynamic parameters include the intensity distribution data of the applied magnetic field and the induced current fluctuation data generated by the movement of the melt in the magnetic field; The thermodynamic sensing unit also includes a high-precision power acquisition module, a magnetic induction intensity monitoring sensor, an argon flow monitoring module, and a furnace pressure monitoring module. The thermodynamic sensing unit is used to timestamp the collected voltage, current, magnetic field strength, and induced current to generate a feature vector describing the intensity of the heat source and the electromagnetic suppression force in the furnace. The collected argon flow rate parameters and furnace pressure parameters are used as environmental boundary conditions to correct the measurement deviation caused by the refractive index change in the visual sensing unit.

5. The multimodal fusion control system for a semiconductor single crystal growth apparatus according to claim 4, characterized in that, The multimodal spatiotemporal alignment device is equipped with a heterogeneous data fusion module and a coordinate transformation module; The heterogeneous data fusion module is used to perform resampling processing based on linear interpolation or spline interpolation, so that the high-frequency acoustic signals and thermodynamic point source data are synchronized with the video frame sequence acquisition time of the visual perception unit on the time axis. The coordinate transformation module is used to map the camera coordinate system where the visual perception unit is located to the physical coordinate system where the crucible is located through a coordinate transformation algorithm, so as to ensure that the physical quantities are in a unified reference space. The multimodal spatiotemporal alignment device is also used to add a high-precision timing tag to the header of the data packet and to perform phase calibration on all physical signals using a global synchronization clock.

6. The multimodal fusion control system for a semiconductor single crystal growth apparatus according to claim 5, characterized in that, The nonlinear multi-objective cooperative controller employs a control algorithm based on deep reinforcement learning, which includes a state space, an action space, and a reward function. The state space consists of a multidimensional feature vector containing diameter deviation, interface curvature, melt viscosity, turbulence index, real-time power consumption, real-time lifting speed, and crucible rotation speed. The action space is defined as the heater power increment, the lifting motor frequency increment, the crucible motor frequency increment, and the magnetic field strength adjustment increment. The reward function is defined as a weighted sum of multiple sub-objectives, wherein the first sub-objective is defined as the negative of the sum of squares of the deviations between the actual diameter and the set diameter, the second sub-objective is defined as a function of the deviation between the curvature of the solid-liquid interface and the ideal flatness, the third sub-objective is defined as a penalty term for power consumption fluctuation per unit time, and the fourth sub-objective is defined as a function of the deviation between the predicted oxygen concentration and the target oxygen concentration.

7. The multimodal fusion control system for a semiconductor single crystal growth apparatus according to claim 6, characterized in that the nonlinear multi-objective cooperative controller is equipped with a phase compensation module, a mode switching module, and an adaptive evolution module; the phase compensation module is used to execute a phase-lead compensation strategy when an abnormal turbulence warning of melt convection is detected, to trigger the fine-tuning program of the heating power in advance, and to generate centrifugal force compensation by adjusting the crucible rotation speed so as to suppress interface flipping; the mode switching module is used to call different control modes during the necking stage, the constant-diameter stage, and the tailing stage of single crystal growth; the adaptive evolution module is used to adjust the weight parameters in the control function in real time according to the differences in impurity content between batches of polysilicon raw material; and the nonlinear multi-objective cooperative controller is also used to establish a time delay model of the actuator response and to apply phase-lead compensation at the command output to offset the control lag caused by the thermal inertia of the thermal field.

8. The multimodal fusion control system for a semiconductor single crystal growth apparatus according to claim 7, characterized in that, The system also includes a process optimization management platform, which is connected to the nonlinear multi-objective collaborative controller and the digital twin simulation server; The process optimization management platform is used to store the multimodal sensing raw data, digital twin simulation trajectory and controller operation logs for each batch, and to build a process data asset library with a full life cycle. The process optimization management platform has a built-in correlation analysis engine, which is used to model the correlation between the electrical performance indicators of the crystal rod and the multimodal characteristics in the growth process. The process optimization management platform is used to compare the deviation between the actual growth parameters and the predicted parameters of the digital twin, and to use the deviation value to perform online correction of the physical parameter model in the digital twin inference server. The process optimization management platform is also used to perform preventive maintenance warnings for the actuator by analyzing the current spectrum of the lifting motor recorded by the thermodynamic sensing unit and the bearing friction sound captured by the acoustic sensing unit.

9. The multimodal fusion control system for a semiconductor single crystal growth apparatus according to claim 8, characterized in that, The system also includes an emergency safety module; The nonlinear multi-objective cooperative controller is used to monitor the hardware health status of the actuator. When the mechanical vibration of the lifting mechanism or the rotating mechanism exceeds the preset threshold, the feedback data provided by the acoustic sensing unit is automatically used for cross-verification. The emergency safety module is used to trigger a redundancy estimation strategy based on the remaining modes when any type of mode fails in the heterogeneous sensing network. When the data reliability of the visual perception unit decreases, the nonlinear multi-objective cooperative controller is used to increase the feedback weight of the acoustic perception unit and the thermodynamic perception unit, and to calculate the virtual diameter by using the oscillation intensity of the melt surface inverted by the acoustic features. When a sudden vibration occurs in the external environment, the system uses the characteristic redundancy of the acoustic sensing unit and the visual sensing unit to make a consistency determination, and automatically enters the inertial holding mode when the deviation exceeds the preset range to maintain the current operating parameters of the actuator.