A visual guidance method and system for docking a semiconductor storage wafer pod body with limiting protrusions
A flip-up vision camera performs high-precision visual positioning in a semiconductor wafer pod storage and handling system. This addresses the degraded positioning accuracy and high interference risk of existing technologies and provides high-precision, low-cost docking with dynamic anti-interference protection and intelligent process control.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- WEISHI ADVANCED INTELLIGENT TECH (SUZHOU) CO LTD
- Filing Date
- 2026-03-02
- Publication Date
- 2026-06-23
AI Technical Summary
Existing technologies for automated handling in semiconductor wafer warehouses suffer from problems such as decreased positioning accuracy, mechanical hard alignment leading to jamming or wear, lack of dynamic correction and visual monitoring, resulting in high interference risk, high system cost, and high complexity.
A flip-up vision camera is time-division multiplexed to perform high-precision visual positioning of the limiting protrusions on the transfer frame and on the placement rack. Combined with wafer pod recognition, it provides precise visual guidance and real-time closed-loop control throughout the process. The fixed and flip-up vision cameras, the image processing unit, and the motion control unit work together to ensure docking accuracy and prevent interference.
It improves the success rate and reliability of docking, reduces system cost and complexity, and achieves high-precision dynamic anti-interference and intelligent process control.
Figure CN121752015B_ABST
Description
Technical Field
[0001] This invention relates to the fields of semiconductor manufacturing automation and machine vision, and specifically to a visual guidance method and system that guides the body of a semiconductor wafer pod (such as a FOUP) into high-precision, interference-free docking with the limiting protrusions on a fixed station during automated handling.
Background Technology
[0002] In the automated material handling systems (AMHS) of semiconductor production lines, wafer pods are frequently transferred between transport vehicles (OHTs / AGVs) and the load ports of equipment front-end modules (EFEMs). One of the core operations in this process is that the handling mechanism (such as a transfer frame with limiting protrusions) picks up the wafer pod and places it precisely onto a fixed station (such as a placement rack with limiting protrusions). Precise docking is a prerequisite for avoiding physical collisions, ensuring reliable pneumatic / electrical connections, and preventing wafer breakage.
[0003] Currently, common guidance methods rely mainly on the repeat positioning accuracy of the robotic arm and on simple photoelectric sensors. This approach has significant limitations. First, long-term operation of the robotic arm accumulates errors, which reduces absolute positioning accuracy. Second, wafer pods, transfer frames, and placement racks all have machining and installation tolerances, so relying solely on mechanical alignment can easily lead to jamming or wear. Third, existing technologies lack real-time visual monitoring and feedback for the critical handover during placement, in which the transfer frame protrusions withdraw while the placement rack protrusions are inserted; dynamic correction is therefore impossible and the risk of interference remains.
[0004] Some high-end systems use multiple fixed industrial cameras to monitor different stations, for example one camera for the wafer pod, one for the transfer frame, and one for the placement rack. While this approach provides visual information, it is costly, complex to lay out, and cumbersome to calibrate. Because each camera has an independent field of view, it is also difficult to achieve high-precision closed-loop control of relative position in a unified coordinate system.
[0005] Therefore, there is an urgent need for a vision solution that is cost-effective, highly accurate, and capable of providing dynamic closed-loop guidance for the key docking processes.
Summary of the Invention
[0006] This invention aims to overcome the shortcomings of existing technologies and provide a visual guidance method and system for docking a wafer pod with limiting protrusions. The method time-division multiplexes a single flip-up vision camera to perform high-precision visual positioning of the limiting protrusions on the transfer frame and on the placement rack in turn. Combined with visual recognition of the wafer pod, it achieves precise visual guidance and real-time closed-loop control throughout the entire process of "gripping alignment" and "placement alignment," effectively improving the docking success rate and reliability and preventing interference.
[0007] To achieve the above objectives, the present invention adopts the following technical solution:
[0008] In a first aspect, the present invention provides a visual guidance method for docking a wafer pod with limiting protrusions, applied to a semiconductor wafer pod handling system. The handling system includes a transfer frame with a first set of limiting protrusions, a placement rack with a second set of limiting protrusions, a fixed vision camera, and a flip-up vision camera. The core process of the method is as follows:
[0009] First, the fixed vision camera acquires an image of the bottom of the wafer pod at its initial position, the positioning slots distributed in an isosceles triangle pattern on the bottom are identified, and the precise spatial position and unique orientation of the wafer pod are calculated.
[0010] Next, the robotic arm drives the transfer frame to a preliminary position under the wafer pod. The flip-up vision camera is then set to its first shooting posture (e.g., lens facing upward or obliquely upward) to photograph and locate the first set of limiting protrusions on the transfer frame and obtain their precise three-dimensional coordinates. From the known position of the wafer pod and the measured position of the transfer frame protrusions, the system calculates the required motion offset and controls the robotic arm to make fine adjustments, so that the first set of protrusions is precisely aligned with, and engages, the positioning slots on the bottom of the wafer pod.
[0011] When the transfer frame carrying the wafer pod moves above the target placement rack to prepare for placement, the flip-up vision camera is flipped to a second shooting posture (e.g., lens facing downward or obliquely downward) to photograph and locate the second set of limiting protrusions on the placement rack. Based on the real-time pose of the wafer pod, the adjustment required to align the positioning slots of the wafer pod with the second set of protrusions is then calculated, and the robotic arm is guided to perform precise pre-alignment before placement.
[0012] Finally, while the transfer frame descends to place the wafer pod, the flip-up vision camera continuously images the docking area and monitors the relative positional deviation between the positioning slots and the second set of limiting protrusions in real time. This forms a visual closed-loop feedback that dynamically adjusts the descent trajectory, so that the placement rack protrusions are inserted smoothly and accurately while the transfer frame protrusions withdraw, giving a seamless, interference-free handover.
[0013] In a second aspect, the present invention provides a visual guidance system for implementing the above-described method. The system includes a fixed vision unit, a flip-up vision unit, an image processing unit, and a motion control unit.
[0014] The fixed vision unit is typically a fixed-mount industrial camera, responsible for initial positioning and orientation identification of the wafer pod.
[0015] The flip-up vision unit is the key component of this system. It includes a high-resolution camera module and a mechanism that drives its flipping (such as a servo motor-driven turntable). This unit is programmed to switch quickly and precisely between two main working postures: "observing the transfer frame protrusions" and "observing the placement rack protrusions".
[0016] The image processing unit is responsible for running the visual recognition algorithms, extracting feature points (such as protrusion tips and positioning slot edges) from the image, and calculating their position and orientation in the world coordinate system.
[0017] The motion control unit then drives the robotic arm to complete precise movements based on the guidance information provided by the image processing unit.
[0018] The beneficial effects of this invention are as follows:
[0019] High precision and closed-loop control: The positions of the key features (protrusions, positioning slots) are measured directly by the vision system rather than inferred from the absolute accuracy of the robotic arm. Alignment is closed-loop and based on relative position, which gives higher docking accuracy and greater tolerance of mechanical errors.
[0020] Dynamic anti-interference protection: During placement, the flip-up camera monitors the handover area in real time, so minor deviations caused by vibration, deformation, and similar effects can be detected and corrected promptly. The control logic thus dynamically guards against interference between the transfer frame protrusions and the placement rack protrusions.
[0021] System simplification and cost optimization: A single high-performance flip-up camera replaces multiple fixed cameras and covers multiple key observation points, simplifying the system structure and reducing hardware cost, installation complexity, and the workload of multi-camera calibration.
[0022] Orientation error prevention: By recognizing the asymmetric positioning slot pattern on the bottom of the wafer pod, the orientation of the wafer pod is confirmed at the initial stage and visually verified throughout the process, preventing 180-degree misplacement.
[0023] Intelligent process: Visual verification steps are embedded at the key nodes after gripping and before placement, providing process quality monitoring and improving the reliability and intelligence of the entire handling process.
Attached Figure Description
[0024] Figure 1 is a schematic diagram of the overall structure of the present invention.
[0025] Figure 2 is a schematic diagram of the overall structure of the present invention from another angle.
[0026] Figure 3 is a schematic diagram showing the state in which the transfer frame and the placement rack of the present invention are docked.
[0027] Figure 4 is a schematic diagram of the control flow of the present invention.
[0028] Explanation of the main component labels in the diagram:
[0029] 1-Fixed base, 2-Robotic arm, 3-Transfer frame, 31-First set of limiting protrusions, 4-Placement rack, 41-Second set of limiting protrusions, 5-Wafer pod, 13-Fixed vision camera (first vision system), 14-Flip-up vision camera (second vision system), 141-Gimbal drive mechanism, 15-Image processing unit.
Detailed Implementation
[0030] To make the objectives, technical solutions, and advantages of this invention clearer, the invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are for illustrative purposes only and are not intended to limit the invention.
[0031] Example 1: Visual Guidance Method Flow
[0032] Referring to Figures 1 to 4, this embodiment describes the steps of the visual guidance method in detail.
[0033] Step S101: Initial positioning and orientation of the wafer pod.
[0034] The handling system moves to the vicinity of the target wafer pod 5. The fixed vision camera 13 (such as a 5-megapixel fixed-focus camera) captures an image of the bottom of the stationary wafer pod 5. The image processing unit 15 receives the image and runs a pre-trained deep learning model or traditional image processing algorithms (such as edge detection and template matching) to identify the three positioning slots on the bottom support plate of the wafer pod. The three-dimensional coordinates (first spatial position) of these three positioning slots in the system's world coordinate system are then calculated, using stereo vision principles, monocular vision combined with the known slot dimensions, or conversion via a calibration plate. More importantly, because the three positioning slots are distributed in an isosceles triangle (one vertex slot and two symmetrical base slots), the system can tell which slot is the vertex slot and thus uniquely determine whether the wafer pod is in the 0-degree or the 180-degree orientation (first orientation). This information ensures correct orientation during the subsequent gripping and placement.
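As an illustration of the orientation check in step S101, the following minimal sketch classifies a wafer pod as 0-degree or 180-degree from the world-frame positions of the vertex slot and the two base slots. The function name `classify_orientation`, the `reference_heading_deg` parameter, and the 90-degree decision threshold are assumptions made for this example and are not specified in the patent.

```python
import math


def classify_orientation(apex, base_a, base_b, reference_heading_deg=0.0):
    """Return 0 or 180 depending on which way the vertex slot points.

    apex, base_a, base_b are (x, y) positions in the world frame.
    The heading is the direction from the midpoint of the two base
    slots to the vertex slot. reference_heading_deg is the heading
    that counts as the 0-degree orientation (an assumed convention).
    """
    mid_x = (base_a[0] + base_b[0]) / 2.0
    mid_y = (base_a[1] + base_b[1]) / 2.0
    heading = math.degrees(math.atan2(apex[1] - mid_y, apex[0] - mid_x))
    # Wrap the difference from the reference heading into [-180, 180).
    diff = (heading - reference_heading_deg + 180.0) % 360.0 - 180.0
    # Closer to the reference than to its opposite: 0 degrees, else 180.
    return 0 if abs(diff) <= 90.0 else 180
```

The sketch only distinguishes the two orientations named in the text; a real system would also reject headings that are far from both.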
[0035] Step S102: Initial movement and protrusion positioning of the transfer frame.
[0036] Based on the calculated position and orientation of the wafer pod, the motion control unit controls the robotic arm 2 to drive the transfer frame 3 to a rough gripping position below the wafer pod. It then controls the gimbal drive mechanism 141 of the flip-up vision camera 14 to flip the lens to a "first shooting posture" facing upward or obliquely upward. In this posture, the camera 14 captures images of the first set of limiting protrusions 31 at the bottom of the transfer frame 3. The image processing unit 15 identifies these protrusions, which are also distributed in an isosceles triangle pattern, and calculates the three-dimensional coordinates (second spatial position) of their center points.
[0037] Step S103: Gripping alignment and execution.
[0038] The image processing unit 15 compares the position of the positioning slot group of the wafer pod (first spatial position) with the position of the transfer frame protrusions (second spatial position) and calculates the deviation between the two in X, Y, Z, and horizontal rotation angle. The motion control unit generates a first fine-tuning control command from this deviation and drives the robotic arm 2 to perform millimeter-level or even sub-millimeter-level micro-movements until the deviation approaches zero. The transfer frame 3 is then raised so that the first set of limiting protrusions 31 is inserted precisely into the positioning slots of the wafer pod 5, completing the gripping process. Optionally, the flip-up vision camera 14 can take another image after insertion to verify that the protrusions are in place.
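The deviation calculation in step S103 can be sketched as follows, under the assumption that each feature group is summarized by the centroid of its three points and by the heading from the base midpoint to the vertex. The helper names `group_pose` and `alignment_deviation` are hypothetical, and the patent does not state how the Z deviation is consumed before the transfer frame rises.

```python
import math


def group_pose(apex, base_a, base_b):
    """Return (x, y, z, yaw_deg) for one group of three feature points.

    Each point is (x, y, z) in the world frame. Position is the
    centroid; yaw is the heading from the base midpoint to the vertex.
    """
    cx = (apex[0] + base_a[0] + base_b[0]) / 3.0
    cy = (apex[1] + base_a[1] + base_b[1]) / 3.0
    cz = (apex[2] + base_a[2] + base_b[2]) / 3.0
    mid_x = (base_a[0] + base_b[0]) / 2.0
    mid_y = (base_a[1] + base_b[1]) / 2.0
    yaw = math.degrees(math.atan2(apex[1] - mid_y, apex[0] - mid_x))
    return cx, cy, cz, yaw


def alignment_deviation(slot_pose, protrusion_pose):
    """Return (dx, dy, dz, dyaw_deg): slot group minus protrusion group."""
    dx = slot_pose[0] - protrusion_pose[0]
    dy = slot_pose[1] - protrusion_pose[1]
    dz = slot_pose[2] - protrusion_pose[2]
    # Wrap the yaw difference into [-180, 180).
    dyaw = (slot_pose[3] - protrusion_pose[3] + 180.0) % 360.0 - 180.0
    return dx, dy, dz, dyaw
```

In this convention the returned vector is the motion the transfer frame would need in order to bring the protrusion group onto the slot group.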
[0039] Step S104: Pre-alignment before placement.
[0040] The robotic arm 2 transports the gripped wafer pod 5 to a predetermined height above the target placement rack 4. The gimbal drive mechanism 141 then flips the flip-up vision camera 14 so that the lens faces downward or obliquely downward, in the "second shooting posture". In this posture, the camera 14 photographs and locates the second set of limiting protrusions 41 on the upper surface of the placement rack 4, and their three-dimensional coordinates (third spatial position) are calculated. At the same time, the system obtains the current pose of the wafer pod 5 from robotic arm encoder feedback or from the fixed vision camera 13.
[0041] Step S105: Dynamic closed-loop guidance during placement.
[0042] The system calculates the placement trajectory from the current pose of the wafer pod and the position of the second set of protrusions. While the transfer frame 3 descends with the wafer pod 5, the flip-up vision camera 14 continuously captures images at a high frame rate (e.g., 30 fps) of the area where the bottom of the wafer pod is about to contact the protrusions of the placement rack. The image processing unit 15 processes the images in real time and calculates the positional deviation between the positioning slots and the second set of protrusions. This deviation is fed back in real time to the motion control unit, which dynamically makes small corrections to the lateral position or angle of the robotic arm during descent, forming a visual servo closed loop. This keeps the wafer pod "aimed" at the second set of protrusions throughout its descent. When the second set of protrusions begins to enter the positioning slots while the first set of protrusions exits, the vision system confirms that the handover is normal.
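The visual servo loop of step S105 can be illustrated with a simple proportional controller. This is a sketch under stated assumptions, not the control law of the patent: the gain, the per-frame step limit, and the function names `servo_step` and `simulate_descent` are all hypothetical, and the measured deviation is simulated rather than taken from images.

```python
def servo_step(deviation_xy, gain=0.5, max_step_mm=0.2):
    """Return the lateral correction (mm) for one camera frame.

    deviation_xy is the vector from the slot group to the protrusion
    group, i.e. the lateral motion the wafer pod still needs.
    The correction is proportional and clamped to max_step_mm per axis.
    """
    def clamp(value):
        return max(-max_step_mm, min(max_step_mm, value))

    return clamp(gain * deviation_xy[0]), clamp(gain * deviation_xy[1])


def simulate_descent(initial_deviation_xy, frames=30, gain=0.5):
    """Apply servo_step once per frame and return the deviation history."""
    dev = list(initial_deviation_xy)
    history = []
    for _ in range(frames):
        corr_x, corr_y = servo_step(dev, gain=gain)
        # Moving the pod by the correction reduces the remaining deviation.
        dev[0] -= corr_x
        dev[1] -= corr_y
        history.append((dev[0], dev[1]))
    return history
```

Clamping the per-frame step is one way to keep corrections small during descent; whether the real system limits corrections this way is not stated in the source.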
[0043] Step S106: Placement completion and withdrawal.
[0044] Once the pressure sensor or the vision system confirms that the wafer pod is fully seated on the placement rack 4 and the transfer frame protrusions are completely disengaged, the motion control unit controls the robotic arm 2 to withdraw the transfer frame 3, and the process ends.
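The sequence S101 to S106 described above can be sketched as an ordered step runner. The `Step` enum descriptions paraphrase the step headings; the `run_flow` function, the handler dictionary, and the stop-on-first-failure policy are assumptions for illustration and are not taken from the patent.

```python
from enum import Enum


class Step(Enum):
    S101 = "initial positioning and orientation of the wafer pod"
    S102 = "initial movement and protrusion positioning of the transfer frame"
    S103 = "gripping alignment and execution"
    S104 = "pre-alignment before placement"
    S105 = "dynamic closed-loop guidance during placement"
    S106 = "placement completion and withdrawal"


def run_flow(handlers):
    """Run the steps in order and stop at the first one that fails.

    handlers maps a Step to a zero-argument callable returning True on
    success. Steps without a handler are treated as successful.
    Returns (completed_steps, failed_step_or_None).
    """
    completed = []
    for step in Step:  # Enum members iterate in definition order.
        handler = handlers.get(step, lambda: True)
        if not handler():
            return completed, step
        completed.append(step)
    return completed, None
```

For example, `run_flow({Step.S103: lambda: False})` reports S101 and S102 as completed and S103 as the failed step, which is where an operator alarm or retry could be attached.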
[0045] Example 2: Composition of the visual guidance system
[0046] Referring to Figures 1 to 4, this embodiment describes the system hardware configuration for implementing the above method.
[0047] The system hardware mainly includes a fixed vision camera 13, a flip-up vision camera 14, an image processing unit 15, and a control cabinet.
[0048] The fixed vision camera 13 is rigidly mounted on the fixed base 1 via a bracket, and its field of view covers the preset pick-up area of the wafer pod. It is equipped with a ring-shaped LED light source to ensure uniform lighting of the bottom of the wafer pod.
[0049] The flip-up vision camera 14 is the core component. It includes a high-resolution CMOS industrial camera (e.g., 12 megapixels) and a two-axis precision gimbal (tilt axis and horizontal rotation axis). The gimbal is driven by a servo motor and integrates a high-precision encoder. The camera module is fixed via a mounting arm to the base of the robotic arm 2, or to a stand independent of the robotic arm, so that it can clearly observe both the area below the transfer frame 3 and the area above the placement rack 4. The camera 14 is equipped with a zoom lens and a small integrated bar light source whose illumination angle can be controlled by software and linked to the camera's posture.
[0050] The image processing unit 15 and the control cabinet contain an industrial computer, a motion control card, and I / O modules. The industrial computer runs vision processing software (such as software developed based on OpenCV or Halcon) and motion control software. The vision processing software hosts the positioning slot recognition model and the protrusion recognition model and performs all coordinate calculations. The motion control software is responsible for issuing commands and for closed-loop control.
[0051] Example 3: Details of Image Recognition and Processing
[0052] This embodiment illustrates the key recognition process of the image processing unit.
[0053] Whether identifying positioning slots or limiting protrusions, the core task is to identify the vertices of an isosceles triangle.
[0054] For slot identification: Since each slot opening forms a distinct dark region under backlighting or side lighting, the algorithm first preprocesses the image (filtering, binarization) and then finds contours. From all contours, the three whose areas and shapes meet preset conditions are selected, and the center point of the minimum bounding rectangle of each is taken as a feature point. The distances between each pair of the three feature points are then calculated. The two sides of equal length are identified; the point they share is the "vertex" of the isosceles triangle, and the two endpoints of the remaining, unequal side are the "base" points. The "vertex" is the key feature point for determining the orientation of the wafer pod.
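The geometric part of this procedure, picking the vertex out of three unlabeled feature points, can be sketched in plain Python as below. Contour extraction is assumed to have been done already. The function name `find_apex` and the tolerance `tol` are hypothetical; the tolerance stands in for whatever measurement noise allowance a real system would use.

```python
import math


def find_apex(points, tol=0.5):
    """Return (apex_index, (base_index_1, base_index_2)).

    points is a list of three (x, y) feature points. The apex is the
    point shared by the two sides of equal length (within tol); the
    other two points are the base points. Raises ValueError when the
    points do not form a non-equilateral isosceles triangle.
    """
    if len(points) != 3:
        raise ValueError("exactly three feature points are required")
    # Length of the side opposite each point.
    opposite = {}
    for i in range(3):
        j, k = [n for n in range(3) if n != i]
        opposite[i] = math.dist(points[j], points[k])
    for i in range(3):
        j, k = [n for n in range(3) if n != i]
        # The sides touching point i are the ones opposite j and opposite k.
        legs_equal = abs(opposite[j] - opposite[k]) <= tol
        base_differs = abs(opposite[i] - opposite[j]) > tol
        if legs_equal and base_differs:
            return i, (j, k)
    raise ValueError("points do not form a non-equilateral isosceles triangle")
```

Rejecting the equilateral case matters here because an equilateral pattern would leave the orientation ambiguous.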
[0055] For limiting protrusion identification: Under specially designed lighting (such as low-angle lighting), each protrusion forms a bright region with a distinct shadow. A similar contour-finding and geometric analysis method locates the center points of the three protrusion contours, and the isosceles triangle relationship between them is checked against the predefined design model.
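One plausible form of the comparison with the design model is to compare the measured side lengths of the protrusion triangle with the designed ones. The sketch below assumes this interpretation; the function names, the example dimensions in the usage note, and the 0.3 mm tolerance are illustrative and do not come from the patent.

```python
import math


def side_lengths(points):
    """Return the three side lengths of a triangle, sorted ascending."""
    a, b, c = points
    return sorted([math.dist(a, b), math.dist(b, c), math.dist(a, c)])


def matches_design(points, design_sides_mm, tol_mm=0.3):
    """Check measured protrusion centers against designed side lengths.

    points: three (x, y) center points in millimeters.
    design_sides_mm: the three designed side lengths, in any order.
    """
    measured = side_lengths(points)
    expected = sorted(design_sides_mm)
    return all(abs(m - e) <= tol_mm for m, e in zip(measured, expected))
```

With centers at (0, 0), (60, 0), and (30, 80), the measured sides are 60 mm and two legs of about 85.44 mm, so `matches_design` accepts a design of (60, 85.44, 85.44) and would reject a set of points whose base had stretched to 70 mm.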
[0056] With hand-eye calibration of each camera performed in advance (for the flip-up camera, calibration is required in each commonly used posture), the coordinates of feature points in the image pixel coordinate system can be transformed into the robot base coordinate system or the world coordinate system, providing an absolute position reference for motion control.
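The idea of keeping one calibration per camera posture can be sketched as a lookup followed by a coordinate transform. The example below is a deliberately simplified planar stand-in: a real hand-eye calibration would yield camera intrinsics and a full six-degree-of-freedom transform, whereas here each posture has only a scale, an in-plane rotation, an origin, and a fixed working-plane height. All names and numeric values in `CALIBRATIONS` are hypothetical.

```python
import math

# Hypothetical per-posture calibration results (values are illustrative).
CALIBRATIONS = {
    "first_posture": {
        "mm_per_px": 0.05,
        "rot_deg": 0.0,
        "origin_mm": (200.0, 100.0),
        "plane_z_mm": 280.0,
    },
    "second_posture": {
        "mm_per_px": 0.08,
        "rot_deg": 90.0,
        "origin_mm": (650.0, -40.0),
        "plane_z_mm": 120.0,
    },
}


def pixel_to_base(u, v, posture):
    """Map pixel (u, v) to (x, y, z) in the robot base frame, in mm.

    Uses the calibration stored for the given camera posture and
    assumes the feature lies on that posture's working plane.
    """
    cal = CALIBRATIONS[posture]
    scale = cal["mm_per_px"]
    angle = math.radians(cal["rot_deg"])
    x = cal["origin_mm"][0] + scale * (math.cos(angle) * u - math.sin(angle) * v)
    y = cal["origin_mm"][1] + scale * (math.sin(angle) * u + math.cos(angle) * v)
    return x, y, cal["plane_z_mm"]
```

Selecting the calibration by posture name mirrors the requirement that the flip-up camera be calibrated separately in each commonly used posture.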
[0057] The technical scope of this invention is not limited to the content described above. Those skilled in the art can make various modifications and variations to the above embodiments without departing from the technical concept of this invention, and all such modifications and variations should fall within the protection scope of this invention.
Claims
1. A visual guidance method for docking a wafer pod with limiting protrusions, characterized in that the method is applied in a semiconductor wafer pod storage and handling system, the handling system comprising a transfer frame with a first set of limiting protrusions, a placement rack with a second set of limiting protrusions, a fixed vision camera, and a flip-up vision camera, the method comprising:
S1: Using the fixed vision camera, acquire an image of the bottom of the wafer pod at the initial position, and identify and calculate the first spatial position and first orientation of the positioning slot group at the bottom of the wafer pod;
S2: Based on the first spatial position and the first orientation, control the robotic arm to drive the transfer frame to move, so that the first set of limiting protrusions on the transfer frame is initially aligned with the positioning slot group at the bottom of the wafer pod;
S3: Using the flip-up vision camera in the first shooting posture, acquire an image of the first set of limiting protrusions on the transfer frame, and identify and calculate the second spatial position of the first set of limiting protrusions;
S4: Based on the first spatial position and the second spatial position, generate a first fine-tuning control command to control the robotic arm to make micro-movements and complete the precise alignment and engagement of the first set of limiting protrusions with the positioning slot group;
S5: After the transfer frame carries the wafer pod to the target placement rack, control the flip-up vision camera to flip to the second shooting posture;
S6: Using the flip-up vision camera in the second shooting posture, acquire an image of the second set of limiting protrusions on the placement rack, and identify and calculate the third spatial position of the second set of limiting protrusions;
S7: Based on the current pose of the wafer pod and the third spatial position, generate a second fine-tuning control command to control the robotic arm to drive the transfer frame and the wafer pod to make micro-movements, so that the positioning slot group at the bottom of the wafer pod is precisely pre-aligned with the second set of limiting protrusions;
S8: Control the transfer frame to descend. During the descent, compare the relative positions of the positioning slot group and the second set of limiting protrusions in real time, and dynamically adjust until the engagement is completed.
2. The method according to claim 1, characterized in that the fixed vision camera is a fixed-focal-length industrial camera that is fixedly installed on the base of the handling system, with its optical axis pointing vertically or obliquely at the bottom of the expected docking position of the wafer pod; the flip-up vision camera is a zoom industrial camera mounted on a two-axis or three-axis gimbal.
3. The method according to claim 1 or 2, characterized in that, in step S1, identifying the positioning slot group at the bottom of the wafer pod specifically includes: identifying three positioning slots distributed in an isosceles triangle, and uniquely determining the orientation of the wafer pod based on the asymmetric geometric features of the isosceles triangle, the orientation including the distinction between the two possible orientations of 0 degrees and 180 degrees.
4. The method according to claim 1, characterized in that, in steps S3 and S6, identifying the first set of limiting protrusions or the second set of limiting protrusions specifically includes: identifying protrusion contours distributed in an isosceles triangle, and calculating the three-dimensional coordinates of their center points using image processing algorithms.
5. The method according to claim 1, characterized in that, in step S8, the real-time comparison and dynamic adjustment specifically include: the flip-up vision camera, in the second shooting posture, continuously captures images of the area where the bottom of the wafer pod approaches the placement rack; the positional and angular deviations between the positioning slot group and the second set of limiting protrusions are calculated; and the deviations are fed back to the motion controller of the robotic arm to form a closed-loop control that corrects the descent trajectory in real time.
6. The method according to claim 1, characterized in that, after the engagement is completed in step S4 and before the engagement is completed in step S8, the method further includes a verification step: an image of the docking area is acquired by the flip-up vision camera to determine whether the limiting protrusions are fully and correctly inserted into the corresponding positioning slots, and a verification result signal is generated.
7. A visual guidance system for implementing the method according to any one of claims 1-6, characterized in that the system includes a fixed vision unit, a flip-up vision unit, an image processing unit, and a motion control unit; the fixed vision unit is used to acquire a global image of the bottom of the wafer pod; the flip-up vision unit includes a camera module that can be driven to flip and is used to acquire local high-precision images of the transfer frame limiting protrusions and the placement rack limiting protrusions in different shooting postures; the image processing unit is communicatively connected to the fixed vision unit and the flip-up vision unit and is used to process the received images, identify feature targets, and calculate their spatial position and pose information; the motion control unit is electrically connected to the image processing unit and to the robotic arm of the handling system, and is used to receive the position information and guidance instructions generated by the image processing unit and to control the robotic arm and the transfer frame to perform the corresponding movements.
8. The system according to claim 7, characterized in that the camera module of the flip-up vision unit is mounted via a flip drive mechanism, which is configured to enable the camera module to switch between at least two preset, stable shooting postures, wherein the optical axis in the first shooting posture points to the limiting protrusion area below the transfer frame, and the optical axis in the second shooting posture points to the limiting protrusion area on the placement rack.
9. The system according to claim 7 or 8, characterized in that the image processing unit integrates a first recognition model and a second recognition model; the first recognition model is used to identify the isosceles-triangle positioning slot group at the bottom of the wafer pod and analyze its orientation; the second recognition model is used to identify the isosceles-triangle limiting protrusions on the transfer frame and the placement rack.
10. The system according to claim 7, characterized in that the system also includes an illumination unit, which includes a first light source that cooperates with the fixed vision unit and a second light source that cooperates with the flip-up vision unit, the illumination direction of the second light source being adjusted synchronously as the flip-up vision unit is flipped.