Real-time recognition system for soft-stacking position of automobile based on machine vision and deep learning

By combining 3D point cloud reconstruction and dynamic simulation, high-precision adaptive identification and closed-loop control of the position of automotive soft strip stacks were achieved, solving the problem of dynamic morphological changes during soft strip stack installation and improving assembly accuracy and efficiency.

CN122265072A · Pending · Publication Date: 2026-06-23 · JIANGSU JINBULI PRECISION MANUFACTURING CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
JIANGSU JINBULI PRECISION MANUFACTURING CO LTD
Filing Date
2026-02-09
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

In automotive flexible strip stacking installation, existing technologies rely on static reference paths, which are insufficient to cope with the dynamic morphological changes of the flexible strip during the stacking process. This results in insufficient interlayer fit or interference, failing to meet the requirements of high-precision assembly.

Method used

A method combining high-fidelity 3D point cloud reconstruction, dynamic simulation strategy optimization, and real trajectory divergence correction is adopted. Point cloud data is collected by a structured light vision sensor, 3D surface reconstruction and collision mesh simulation are performed to generate a virtual stacked environment, and control commands are generated through Monte Carlo sampling and gradient ascent update to achieve high-precision adaptive identification and closed-loop control of the flexible soft stacked position.

Benefits of technology

This improved the effectiveness and robustness of the control strategy, shortened the process development cycle, reduced production costs, and ensured high-precision stack-up assembly results.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN122265072A_ABST
Patent Text Reader

Abstract

The application discloses a real-time identification system for the position of a soft stack of a vehicle based on machine vision and deep learning, belongs to the field of stack procedures and three-dimensional vision detection of vehicle manufacturing, and comprises the following steps: collecting a point cloud of a soft stack surface through a structured light vision sensor, performing curved surface fitting and Poisson reconstruction, converting a model into a collision grid, importing the collision grid into a dynamic simulation space to construct a virtual environment and extract a state vector, establishing a motion probability distribution table, performing Monte Carlo sampling to generate a virtual trajectory, calculating a weighted sum of the interlayer overlapping area of the virtual trajectory and performing gradient ascent update on the probability distribution table, analyzing the evolved distribution table to issue instructions, collecting a real trajectory, calculating a relative entropy divergence of the real trajectory and the virtual trajectory, and updating the evolved motion distribution table. The application combines a high-fidelity three-dimensional point cloud reconstruction, a dynamic simulation strategy optimization and a real trajectory divergence correction, and can realize high-precision adaptive identification and closed-loop control of the position of a flexible soft stack.

Description

Technical Field

[0001] This invention relates to the field of automotive manufacturing lamination processes and 3D vision inspection, and in particular to a real-time recognition system for automotive soft sheet stacking positions based on machine vision and deep learning.

Background Technology

[0002] Automotive flexible strips are widely used in automotive electronic and electrical architectures due to their small size, light weight, and good bending performance. During vehicle assembly or component assembly, multiple layers of flexible strips often need to be stacked and installed according to specific positional relationships. Because of their high flexibility and susceptibility to complex nonlinear deformations during grasping and movement, their posture in three-dimensional space is difficult to control precisely.

[0003] In related technologies, Chinese invention patent application CN120219298A discloses a machine vision-based real-time adjustment system and method for wiring harness installation paths. The system acquires image data of the target automotive wiring harness installation area to obtain a corresponding wiring harness installation image sequence; preprocesses the acquired wiring harness installation image sequence to obtain an initial wiring harness image sequence; enhances the initial wiring harness image sequence to obtain a complete wiring harness image sequence; inputs the complete wiring harness image sequence into a pre-constructed target detection model to obtain corresponding area installation information; and evaluates the corresponding wiring harness installation path in real time based on this information to obtain corresponding path adjustment instructions; and adjusts the automotive wiring harness installation path within the target automotive wiring harness installation area based on the obtained path adjustment instructions.

[0004] Regarding the aforementioned technologies, the inventors believe that this method primarily relies on geometric comparison using a static reference path model to correct deviations. However, in the scenario of flexible busbar stack-up installation, the shape of the flexible busbar changes dynamically with contact, gravity, and stacking order. A fixed reference path is insufficient to characterize the complex flexible behavior of the busbar during actual physical stacking. Adjusting solely on the basis of the current geometric position deviation easily overlooks the physical constraints of the flexible busbar during stacking contact, so the adjusted path may still fail to guarantee interlayer fit or may cause unexpected interference, and thus fails to meet the requirements for high-precision flexible busbar stack-up assembly.

Summary of the Invention

[0005] To address the aforementioned issues, this invention provides a real-time position recognition system for automotive flexible sheet stacks based on machine vision and deep learning. It employs a technical solution that combines high-fidelity 3D point cloud reconstruction, dynamic simulation strategy optimization, and real trajectory divergence correction, enabling high-precision adaptive recognition and closed-loop control of the flexible sheet stack position.

[0006] The above objectives can be achieved through the following approach: A real-time automotive flexible strip stack position recognition system based on machine vision and deep learning includes: a point cloud fitting and reconstruction module, used to acquire discrete point cloud coordinate data of the flexible strip surface by controlling a structured light vision sensor, perform least squares surface fitting and outlier removal, and perform Poisson surface reconstruction to fill holes and generate a 3D surface model; a collision mesh import simulation module, used to convert the 3D surface model into a simulation collision mesh, import it into the dynamic simulation space to construct a virtual stack environment, and extract a state vector sequence from the virtual stack environment; a state discretization and trajectory sampling module, used to discretize the state vector sequence into a state index sequence, establish an action probability distribution table for the state index sequence, and perform Monte Carlo sampling according to the action probability distribution table to generate a virtual trajectory set; an overlap reward and distribution evolution module, used to calculate the inter-layer overlap area sequence for each trajectory of the virtual trajectory set, aggregate it into a weighted sum of overlap areas, and perform a gradient ascent update on the action probability distribution table according to the weighted sum of overlap areas to generate an evolved action distribution table; and a control issuance and divergence correction module, used to parse the evolved action distribution table into a control command sequence and issue it for execution, acquire the point cloud after stacking to generate a real trajectory set, calculate the relative entropy divergence sequence between the real trajectory set and the virtual trajectory set, and update the evolved action distribution table.

[0007] Optionally, the point cloud fitting and reconstruction module includes: a point cloud acquisition unit, used to control the structured light vision sensor to acquire multi-angle grating image sequences of the soft-pack surface, perform pixel-level phase principal value calculation and unwrapping operation, map the two-dimensional pixel coordinates to three-dimensional spatial coordinates according to camera calibration, and construct a sparse point cloud dataset; a neighborhood tangent plane projection distance denoising unit, used to perform radius filtering on the sparse point cloud dataset and generate a neighborhood point set, iteratively fit a local tangent plane on the neighborhood point set and calculate the projection distance sequence from the point to the plane, take the upper quantile threshold on the projection distance sequence and delete the coordinates exceeding the threshold to obtain denoised point cloud data; and a point normal vector Poisson reconstruction mesh generation unit, used to calculate the point normal vector set on the denoised point cloud data, perform Poisson surface reconstruction based on the point normal vector set to generate a three-dimensional surface model, the three-dimensional surface model including a vertex coordinate set and a patch index set.

[0008] Optionally, the collision mesh import simulation module includes: a collision body generation unit, used to read the vertex coordinate set and the face index set, perform mesh simplification by merging edge lengths to generate a simplified mesh, and perform convex decomposition on the simplified mesh to construct the simulation collision mesh; a virtual stacked rigid body assembly unit, used to load the simulation collision mesh into the dynamic simulation space, establish the soft-pack rigid body and the fixture rigid body and write the mass parameters and inertia parameters to generate a virtual stacked environment; and a pose contact state vector sampling unit, used to read the pose, end pose and contact point coordinates of the soft-pack rigid body along the simulation time sequence based on the virtual stacked environment, splice them by dimension to generate a state vector and output the state vector sequence by time step.

[0009] Optionally, the collision body generation unit is configured to: calculate the edge set and edge length sequence of the patch index set, and take a quantile of the edge length sequence as the edge length threshold; select from the edge set the target edges whose lengths are less than the edge length threshold, and perform edge collapse and merging to rebuild the vertex coordinate set and the patch index set, generating a simplified mesh; and perform convex decomposition on the simplified mesh to generate a set of convex polyhedra, whose union forms the simulation collision mesh.

[0010] Optionally, the state discretization and trajectory sampling module includes: an equal-width binning state index generation unit, used to perform equal-width binning on the state vector sequence according to its dimensions, calculate the bin number to which each state vector falls and arrange them according to time steps to generate a state index sequence; an action frequency normalization probability table generation unit, used to count the frequency of action occurrence corresponding to each state index in the state index sequence and normalize it to generate an action probability distribution table; and a cumulative probability sampling virtual trajectory generation unit, used to extract random numbers from the action probability distribution table for each state index and select actions according to cumulative probability, iterate to obtain an action sequence, and combine it with time steps to generate a virtual trajectory set.
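
The three units described above can be illustrated with a minimal NumPy sketch. The function names are hypothetical, and two details are assumptions not stated in the text: per-dimension bin numbers are flattened into a single state index, and states that never occur fall back to a uniform action distribution.

```python
import numpy as np

def state_indices(states, n_bins):
    """Equal-width binning of a (T, D) state vector sequence into T state indices."""
    states = np.asarray(states, dtype=float)
    lo, hi = states.min(axis=0), states.max(axis=0)
    width = np.where(hi > lo, (hi - lo) / n_bins, 1.0)
    bins = np.minimum(((states - lo) / width).astype(int), n_bins - 1)
    # Assumption: flatten the D per-dimension bin numbers into one index.
    return np.ravel_multi_index(tuple(bins.T), (n_bins,) * states.shape[1])

def action_table(state_idx, actions, n_states, n_actions):
    """Count action occurrences per state index and normalize each row."""
    counts = np.zeros((n_states, n_actions))
    np.add.at(counts, (state_idx, actions), 1.0)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Assumption: unseen states get a uniform distribution.
    return np.where(row_sums > 0, counts / np.maximum(row_sums, 1.0), 1.0 / n_actions)

def sample_action(table, s, rng):
    """Draw a random number and select the action by cumulative probability."""
    u = rng.random()
    a = np.searchsorted(np.cumsum(table[s]), u, side="right")
    return int(min(a, table.shape[1] - 1))  # guard against round-off in the cumsum
```

Repeating `sample_action` along a state index sequence yields one action sequence; pairing it with time steps gives one virtual trajectory in the sense used above.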

[0011] Optionally, the system further includes: projecting time slices onto the denoised point cloud data using a state vector sequence, calculating and normalizing the point cloud spatial occupancy density at each time step, and generating a density weight sequence.

[0012] Optionally, the overlap reward and distribution evolution module includes: a projected area sequence generation unit, used to read the pose of the soft-pack rigid body at each time step of the virtual trajectory set, transform the poses into a common coordinate system, calculate the absolute value of the difference between the projected mesh areas at adjacent time steps, and arrange the results by time step to generate an inter-layer overlap area sequence; a time-step weighted product summation unit, used to perform a weighted summation of the inter-layer overlap area sequence with the density weight sequence to generate a weighted sum of overlap areas; and an evolved distribution table generation unit, used to multiplicatively scale each action probability in the action probability distribution table with the weighted sum of overlap areas as the scaling factor, and then normalize to generate the evolved action distribution table.

[0013] Optionally, the time-step weighted product summation unit includes: performing a term-by-term product operation based on the density weight sequence and the interlayer overlap area sequence to generate a weighted area sequence; and performing a summation operation on the weighted area sequence to generate a weighted sum of overlap areas.
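
As a rough illustration of the overlap reward and the table update, the sketch below computes the inter-layer overlap area sequence, the term-by-term weighted sum, and a multiplicative scaling followed by row normalization. The function names are hypothetical. One point is an assumption: only the (state, action) entries visited by the sampled trajectory are scaled, since scaling every entry by the same factor and then normalizing would leave the table unchanged. The reward is assumed to be positive.

```python
import numpy as np

def overlap_sequence(projected_areas):
    """Absolute difference of projected areas at adjacent time steps."""
    return np.abs(np.diff(np.asarray(projected_areas, dtype=float)))

def weighted_overlap_sum(density_weights, overlaps):
    """Term-by-term product of weights and overlap areas, then summation."""
    return float(np.sum(np.asarray(density_weights) * np.asarray(overlaps)))

def evolve_table(table, visited_states, visited_actions, reward):
    """Scale the visited entries by the reward, then renormalize each row.

    Assumption: only entries visited by the trajectory are scaled; reward > 0.
    """
    new = np.array(table, dtype=float)
    new[visited_states, visited_actions] *= reward
    return new / new.sum(axis=1, keepdims=True)
```

With a reward above 1, the visited actions gain probability mass relative to the other actions in the same state; with a reward below 1 they lose it.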

[0014] Optionally, the control issuance and divergence correction module includes: an instruction mapping and issuance unit, used to select, for each state index of the evolved action distribution table, the action index with the highest action probability, map the selection to a control instruction sequence, and issue it for execution; a point cloud centroid trajectory point generation unit, used to acquire a point cloud frame sequence of the stack after the instructions are executed, extract the three-dimensional coordinates from each frame and take their mean as the centroid point, and arrange the centroid points by time step to generate a real trajectory set; and a divergence calculation and distribution table update unit, used to align the real trajectory set and the virtual trajectory set by time step, calculate the trajectory point distance sequence and normalize it to generate a relative entropy divergence sequence, and perform probability scaling and normalization on the evolved action distribution table according to the relative entropy divergence sequence.
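
A minimal sketch of the three operations in this module follows, with hypothetical function names. It covers the greedy action selection, the centroid trajectory, and the normalized distance sequence that the text calls the relative entropy divergence sequence. Normalizing by the sum of distances is an assumption, and the final probability scaling is omitted because the text does not state its exact rule.

```python
import numpy as np

def greedy_actions(table):
    """For each state index, pick the action index with the highest probability."""
    return np.argmax(np.asarray(table), axis=1)

def centroid_trajectory(frames):
    """Mean 3D coordinate of each point cloud frame, ordered by time step."""
    return np.array([np.asarray(f, dtype=float).mean(axis=0) for f in frames])

def normalized_distance_sequence(real, virtual):
    """Per-time-step distance between aligned trajectories, normalized to sum to 1.

    Assumption: normalization divides by the total distance.
    """
    d = np.linalg.norm(np.asarray(real, dtype=float) - np.asarray(virtual, dtype=float), axis=1)
    total = d.sum()
    return d / total if total > 0 else np.zeros_like(d)
```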

[0015] Based on the same inventive concept, this invention also provides a real-time identification method for the stacked position of automotive flexible strips based on machine vision and deep learning. The method includes: acquiring discrete point cloud coordinate data of the flexible strip surface by controlling a structured light vision sensor, performing least squares surface fitting and outlier removal, and performing Poisson surface reconstruction to fill holes and generate a three-dimensional surface model; converting the three-dimensional surface model into a simulation collision mesh, importing it into a dynamic simulation space to construct a virtual stacked environment, and extracting a state vector sequence from the virtual stacked environment; discretizing the state vector sequence into a state index sequence, establishing an action probability distribution table for the state index sequence, and performing Monte Carlo sampling according to the action probability distribution table to generate a virtual trajectory set; calculating the inter-layer overlap area sequence for each trajectory of the virtual trajectory set, aggregating it into a weighted sum of overlap areas, and performing a gradient ascent update on the action probability distribution table according to the weighted sum of overlap areas to generate an evolved action distribution table; and parsing the evolved action distribution table into a control command sequence and issuing it for execution, acquiring the point cloud after stacking to generate a real trajectory set, calculating the relative entropy divergence sequence between the real trajectory set and the virtual trajectory set, and updating the evolved action distribution table.

[0016] Compared with the prior art, the present invention has the following advantages: 1. High-density point cloud data is acquired using a structured light vision sensor, and after denoising, reconstruction, and mesh optimization, a computational model suitable for dynamic simulation is generated. This high-precision 3D model provides a solid geometric and physical foundation and keeps the behavior of the virtual simulation environment consistent with the real physical world, thereby improving the effectiveness of the control strategy. 2. Instead of time-consuming and potentially damaging physical trial and error, virtual trajectories are generated through extensive Monte Carlo sampling in a digital twin environment, and the optimal motion strategy evolves with the physically interpretable overlap area as the reward function. This simulation-based learning paradigm shortens the process development cycle and reduces production costs. 3. While the optimal strategy is executed in the physical world, the system simultaneously acquires real point cloud trajectories and calculates their relative entropy divergence from the trajectories predicted in simulation. This divergence serves as a feedback signal to recalibrate the optimized action distribution table, so the control strategy adaptively compensates for model errors and environmental uncertainties, which supports robustness and accuracy in practical industrial applications.

[0017] Other features and advantages of the invention will be set forth in the description which follows, and will be apparent in part from the description, or may be learned by practicing the invention. The objects and other advantages of the invention may be realized and obtained by means of the structures pointed out in the description, claims and drawings.

Attached Figure Description

[0018] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0019] Figure 1 is a framework diagram of a real-time recognition system for the position of automotive soft sheet stacking based on machine vision and deep learning, according to an embodiment of the present invention.

[0020] Figure 2 is a schematic diagram of the structure of a real-time recognition system for the position of automotive soft sheet stacking based on machine vision and deep learning, according to an embodiment of the present invention.

[0021] Figure 3 is a comparison diagram of the projected area fluctuation distribution in an embodiment of the present invention.

[0022] Figure 4 is a joint density distribution diagram of the reward and the simulation-to-reality relative entropy divergence in an embodiment of the present invention.

Detailed Implementation

[0023] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0024] Referring to Figure 1, one embodiment of the present invention proposes a real-time recognition system for the position of automotive flexible strip stacks based on machine vision and deep learning. It adopts a technical solution that combines high-fidelity 3D point cloud reconstruction, dynamic simulation strategy optimization, and real trajectory divergence correction, which can achieve high-precision adaptive recognition and closed-loop control of the position of flexible strip stacks.

[0025] As shown in Figure 2, the system in this embodiment specifically includes the following modules. The point cloud fitting and reconstruction module is used to acquire discrete point cloud coordinate data of the flexible strip surface by controlling the structured light vision sensor, perform least squares surface fitting and outlier removal, and perform Poisson surface reconstruction to fill holes and generate a three-dimensional surface model. The collision mesh import simulation module is used to convert the 3D surface model into a simulation collision mesh, import it into the dynamic simulation space to construct a virtual stacked environment, and extract the state vector sequence from the virtual stacked environment. The state discretization and trajectory sampling module is used to discretize the state vector sequence into a state index sequence, establish an action probability distribution table for the state index sequence, and perform Monte Carlo sampling according to the action probability distribution table to generate a virtual trajectory set. The overlap reward and distribution evolution module is used to calculate the inter-layer overlap area sequence for each trajectory of the virtual trajectory set and aggregate it into a weighted sum of overlap areas; the action probability distribution table is then updated by gradient ascent according to the weighted sum of overlap areas to generate an evolved action distribution table. The control issuance and divergence correction module is used to parse the evolved action distribution table into a control command sequence and issue it for execution; after acquiring the point cloud of the completed stack, it generates a real trajectory set, calculates the relative entropy divergence sequence between the real trajectory set and the virtual trajectory set, and updates the evolved action distribution table.

[0026] Optionally, the point cloud fitting and reconstruction module includes: The point cloud acquisition unit, used to control the structured light vision sensor to acquire multi-angle grating image sequences of the flexible strip surface, perform pixel-level wrapped-phase calculation and phase unwrapping, and map the two-dimensional pixel coordinates to three-dimensional spatial coordinates according to the camera calibration to construct a sparse point cloud dataset.

To obtain high-precision geometric information of the flexible strip surface, the projector in the structured light vision sensor is first driven to project a set of phase-shift-encoded sinusoidal grating fringes onto the surface. Typically a multi-step phase-shifting method is used, in which the camera synchronously captures the deformed fringe images modulated by the height of the object surface. For any pixel $(x, y)$ in the image coordinate system, the wrapped (folded) phase $\varphi(x, y)$ follows an arctangent relationship:

$$\varphi(x, y) = \arctan\left(\frac{\sum_{n=0}^{N-1} I_n(x, y)\,\sin\delta_n}{\sum_{n=0}^{N-1} I_n(x, y)\,\cos\delta_n}\right)$$

where: $\varphi(x, y)$ is the wrapped phase value at pixel $(x, y)$, confined to a single $2\pi$ interval; $N$ is the number of phase-shift steps, taken from the configuration parameters and typically set to ... (more steps give stronger noise resistance but a longer acquisition time); $n = 0, 1, \dots, N-1$ is the current phase-shift index; $I_n(x, y)$ is the grayscale value at $(x, y)$ in the $n$-th image captured by the camera; and $\delta_n$ is the phase shift of the $n$-th grating image. This formula uses the orthogonality of sinusoids and solves for the phase from the intensity variation across the images in the least squares sense, which removes the effects of uneven ambient light and differences in surface reflectivity.

The calculated $\varphi(x, y)$ has a $2\pi$ periodic ambiguity, so a subsequent spatial or temporal phase unwrapping operation is needed to restore the discontinuous wrapped phase to a continuous absolute phase $\Phi(x, y)$. Finally, using the camera intrinsic parameter matrix $K$ and the extrinsic parameter matrix $[R \mid t]$, each pixel coordinate $(x, y)$ and its absolute phase $\Phi(x, y)$ are mapped by triangulation to three-dimensional coordinates $(X, Y, Z)$ in the world coordinate system. This forms the sparse point cloud dataset.

[0027] For example, at an inspection station the number of phase-shift steps is set to $N = 4$. The structured light projector sequentially projects sinusoidal fringes with phase shifts of ..., and the camera captures the four corresponding images. For a single pixel at the center of the image, the four grayscale values are read and substituted into the formula; the numerator is approximately ... and the denominator is approximately .... Taking the arctangent gives the wrapped phase of that point. Combined with the calibration data, the pixel is then converted into the coordinates of a point in three-dimensional space. The set of all such points constitutes the sparse point cloud dataset.
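
The wrapped-phase computation can be sketched in a few lines of NumPy. This is an illustration under a stated assumption, not the patent's implementation: the fringe model is taken to be $I_n = A + B\cos(\varphi - \delta_n)$ with equally spaced shifts $\delta_n = 2\pi n / N$, and the function name `wrapped_phase` is hypothetical. `arctan2` is used so the result covers the full $(-\pi, \pi]$ interval.

```python
import numpy as np

def wrapped_phase(images, shifts):
    """Least-squares wrapped phase from N phase-shifted fringe images.

    images: array of shape (N, H, W); shifts: array of shape (N,) in radians.
    Assumes the fringe model I_n = A + B*cos(phi - delta_n).
    """
    images = np.asarray(images, dtype=float)
    shifts = np.asarray(shifts, dtype=float)
    num = np.tensordot(np.sin(shifts), images, axes=(0, 0))  # sum_n I_n sin(delta_n)
    den = np.tensordot(np.cos(shifts), images, axes=(0, 0))  # sum_n I_n cos(delta_n)
    return np.arctan2(num, den)  # wrapped into (-pi, pi]

def synthetic_fringes(phase, n_steps, offset=120.0, amplitude=80.0):
    """Generate ideal fringe images for a known phase map (for checking only)."""
    shifts = 2 * np.pi * np.arange(n_steps) / n_steps
    images = np.stack([offset + amplitude * np.cos(phase - d) for d in shifts])
    return images, shifts
```

Because the offset $A$ and amplitude $B$ cancel in the ratio, the recovered phase does not depend on ambient light level or surface reflectivity, which matches the property described above.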

[0028] The neighborhood tangent plane projection distance denoising unit is used to perform radius filtering on the sparse point cloud dataset to generate neighborhood point sets, iteratively fit a local tangent plane to each neighborhood point set and calculate the sequence of point-to-plane projection distances, take an upper quantile threshold of the projection distance sequence, and delete the points exceeding the threshold to obtain the denoised point cloud data.

The raw point cloud often contains outliers caused by metallic reflections or dust, so denoising is based on local statistical features. First, radius filtering is performed for each target point $p_i$ in the sparse point cloud dataset: a sphere of radius $r$ centered on $p_i$ is defined, and all points falling inside it form the neighborhood set $\mathcal{N}_i$. The radius $r$ is set according to the point cloud density of the flexible strip surface, and it must be ensured that the neighborhood in a smooth region contains at least [number missing] points. Next, the least squares method or principal component analysis (PCA) is applied to $\mathcal{N}_i$ to fit a local tangent plane passing through the center of the neighborhood. The general equation of this tangent plane is:

$$A x + B y + C z + D = 0$$

where $(A, B, C)$ is the unit normal vector of the plane. The projection distance $d_i$ from the target point $p_i$ to the local tangent plane is then calculated as:

$$d_i = \frac{\left| A x_i + B y_i + C z_i + D \right|}{\sqrt{A^2 + B^2 + C^2}}$$

where: $d_i$ is the Euclidean distance from the target point $p_i$ to its locally fitted plane; $(x_i, y_i, z_i)$ are the three-dimensional coordinate components of $p_i$; and $A, B, C, D$ are the parameters of the fitted local tangent plane. Physically, this value represents the "roughness" or "degree of abrupt change" at that point. For a smooth, continuous flexible strip surface it should be close to 0; for noise points it is typically large. Traversing all points yields the projection distance sequence $\{d_i\}$. The sequence is analyzed statistically and its upper quantile threshold $T$ is calculated, for example the 85th percentile. Any point $p_i$ with $d_i > T$ is judged to be a noise point deviating from the surface and is removed from the dataset; the remaining points constitute the denoised point cloud data.

[0029] For example, suppose a point $p$ on the edge of the flexible strip stack is to be judged. The other points within its neighborhood radius $r$ are mainly distributed on the plane ..., and the fitted local tangent plane equation is approximately .... Substituting the plane parameters into the distance formula, the numerator is ... and the denominator is ..., which gives the projection distance $d$. Statistical analysis of the distance sequence over the entire point cloud shows that 95% of the points have projection distances less than [a certain value]. Because $d$ exceeds this value, the point is identified as noise and removed.
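
The denoising procedure can be sketched as follows. This is a minimal, brute-force NumPy illustration with hypothetical function names; it fits each tangent plane by PCA through the neighborhood centroid, and it treats points with fewer than three neighbors as isolated, which is an assumption not stated in the text. A production pipeline would normally use a spatial index for the radius search.

```python
import numpy as np

def tangent_plane_distances(points, radius):
    """Distance from each point to the PCA plane fitted to its radius neighborhood."""
    points = np.asarray(points, dtype=float)
    dists = np.zeros(len(points))
    for i, p in enumerate(points):
        nbrs = points[np.linalg.norm(points - p, axis=1) <= radius]
        if len(nbrs) < 3:
            dists[i] = np.inf  # assumption: too few neighbours, treat as isolated
            continue
        centroid = nbrs.mean(axis=0)
        cov = np.cov((nbrs - centroid).T)
        _, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
        normal = eigvecs[:, 0]                 # unit normal: smallest eigenvalue
        dists[i] = abs(np.dot(p - centroid, normal))
    return dists

def denoise(points, radius, quantile=0.85):
    """Remove points whose tangent-plane distance exceeds the upper quantile."""
    points = np.asarray(points, dtype=float)
    d = tangent_plane_distances(points, radius)
    threshold = np.quantile(d[np.isfinite(d)], quantile)
    return points[d <= threshold], threshold
```

Since the normal returned by `eigh` has unit length, the denominator of the distance formula equals 1 and only the absolute value of the numerator remains. Note that a fixed quantile always discards a fixed fraction of points, including some valid ones when the cloud is already clean.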

[0030] The point normal vector Poisson reconstruction mesh generation unit is used to calculate the set of point normal vectors on the denoised point cloud data, and perform Poisson surface reconstruction based on the set of point normal vectors to generate a three-dimensional surface model. The three-dimensional surface model includes a set of vertex coordinates and a set of patch indices.

[0031] To convert the discrete point cloud into a continuous mesh model, the normal vector of each point must be estimated first. For each point in the denoised point cloud data, its $k$ nearest neighbors (typically 10 to 30) are found and the covariance matrix of the neighborhood is analyzed; the eigenvector corresponding to the smallest eigenvalue is taken as the normal vector of that point. This yields the point normal vector set. Poisson surface reconstruction is then performed. The principle of this algorithm is to recast surface reconstruction as the solution of a Poisson equation: a scalar indicator function $\chi$ is sought whose gradient field $\nabla\chi$ approximates as closely as possible the vector field $\vec{V}$ formed by the point cloud normals. The Poisson equation is expressed as:

$$\Delta\chi = \nabla \cdot \vec{V}$$

where: $\Delta$ is the Laplace operator, i.e. the divergence of the gradient of a scalar field; $\chi$ is the implicit indicator function to be solved, equal to 1 inside the object and 0 outside, with an abrupt change at the surface; and $\nabla \cdot \vec{V}$ is the divergence of the given vector field, where $\vec{V}$ is generated by smooth interpolation of the point normal vector set. Discretizing and solving this equation on an octree structure gives the spatial distribution of the indicator function $\chi$. Finally, the marching cubes algorithm extracts an isosurface of $\chi$ to generate the vertex coordinate set and the patch index set, which together form the three-dimensional surface model. Because the method integrates globally, it automatically fills data gaps caused by occlusion during acquisition.

[0032] For example, the input contains 100,000 points with normal vectors, and an octree of depth 8 is built to solve the Poisson equation. Several data gaps of approximately 2 mm in diameter, originally caused by occlusion where the flexible strip stack bends, are smoothed over and filled automatically when the isosurface of the indicator function is extracted. The final 3D surface model contains 5000 vertices and 9800 triangular faces; its surface is continuous and closed, without holes or defects, and it can be used directly for the subsequent physical collision simulation.
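
The normal estimation step that precedes Poisson reconstruction can be sketched as below, using a brute-force nearest-neighbor search in NumPy. The function name is hypothetical. The sketch does not orient the normals consistently: each eigenvector has an arbitrary sign, and Poisson reconstruction needs consistently oriented normals, so a real pipeline would add an orientation pass before solving.

```python
import numpy as np

def estimate_normals(points, k=10):
    """Per-point normal as the smallest-eigenvalue eigenvector of the
    covariance matrix of the k nearest neighbours (sign is arbitrary)."""
    points = np.asarray(points, dtype=float)
    normals = np.empty_like(points)
    for i, p in enumerate(points):
        d = np.linalg.norm(points - p, axis=1)
        nbrs = points[np.argsort(d)[:k + 1]]   # k neighbours plus the point itself
        cov = np.cov((nbrs - nbrs.mean(axis=0)).T)
        _, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
        normals[i] = eigvecs[:, 0]
    return normals
```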

[0033] Optionally, the collision mesh import simulation module includes: a collision body generation unit, used to read the vertex coordinate set and the face index set, perform mesh simplification by merging edges according to edge length to generate a simplified mesh, and perform convex decomposition on the simplified mesh to construct the simulation collision mesh. First, the high-density 3D surface model is received. To meet the real-time requirements of the MuJoCo dynamics simulation engine, the mesh complexity must be reduced. Each edge $e_{ij}$ in the face index set, connecting vertices $v_i$ and $v_j$, is traversed and its Euclidean distance is calculated as the edge length $l_{ij}$: $l_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2 + (z_i - z_j)^2}$, where $(x_i, y_i, z_i)$ and $(x_j, y_j, z_j)$ are the three-dimensional coordinates of vertices $v_i$ and $v_j$. The lengths of all edges are collected into an edge length sequence, and the lower quantile threshold $\tau$ of this sequence is determined. For edges with $l_{ij}$ below $\tau$, an edge collapse operation is performed, merging vertices $v_i$ and $v_j$ into a new vertex $\bar{v}$. This reduces the number of faces and generates the simplified mesh. Subsequently, because the simplified mesh is often a non-convex polyhedron, which makes collision detection inefficient, an approximate convex decomposition is performed. This process fills the interior of the simplified mesh by voxelization and divides it into a series of $K$ non-overlapping convex polyhedra $C_1, \dots, C_K$, such that the volume error $\varepsilon$ between the volume of the union of these convex polyhedra, $V_{\mathrm{union}}$, and the volume of the simplified mesh, $V_{\mathrm{mesh}}$, is less than a preset threshold, for example 1%. Each convex polyhedron $C_k$ is composed of its own new set of vertices and faces, and the set of all $C_k$ constitutes the simulation collision mesh.

[0034] For example, the input flexible-strip 3D model contains 100,000 faces. The 20th percentile of the edge length sequence is calculated to be 0.15 mm. All edges shorter than 0.15 mm are collapsed, resulting in a simplified mesh containing 5,000 faces. This simplified mesh, which is U-shaped, is then decomposed into 10 convex hulls connected end to end. The total volume of the combined convex hulls is [value missing], and the volume error relative to the original mesh is only [value missing], which meets the requirements of high-precision collision detection.

[0035] A virtual stacked rigid body assembly unit, used to load the simulation collision mesh into the dynamic simulation space, establish the flexible-strip rigid body and the fixture rigid body, and write the mass parameters and inertia parameters to generate the virtual stacked environment. First, a dynamic simulation space is initialized and the gravitational acceleration vector is defined. In this space, the simulation collision mesh is instantiated as a flexible-strip rigid body object. To ensure realistic dynamics, the flexible-strip rigid body must be assigned accurate mass properties. The average density $\rho$ is read according to the physical material of the flexible strip, for example [value missing]. The volume $V$ of the simulation collision mesh is calculated, and then the total mass $m = \rho V$. More importantly, the inertia tensor must be calculated, since it is the core parameter that determines the rotational behavior of a rigid body. For a flexible-strip rigid body composed of $K$ convex polyhedra, its inertia tensor $\mathbf{I}$ relative to the center of mass is a $3 \times 3$ symmetric matrix: $\mathbf{I} = \begin{bmatrix} I_{xx} & I_{xy} & I_{xz} \\ I_{xy} & I_{yy} & I_{yz} \\ I_{xz} & I_{yz} & I_{zz} \end{bmatrix}$. Taking the main diagonal element $I_{xx}$ as an example, its physical meaning is the resistance of the rigid body to rotation about the X-axis, and it is calculated as the volume integral: $I_{xx} = \int_V \rho\,(y^2 + z^2)\,dV$. In numerical computation, this integral is approximated by summing over the discrete tetrahedra of the convex polyhedra. The calculated mass $m$ and inertia tensor $\mathbf{I}$ are written into the model as the physical properties of the flexible-strip rigid body. At the same time, the 3D model of the fixture is imported and set as a "static rigid body", that is, a body with infinite mass that does not move under force, thus constructing a complete virtual stacked environment that includes the flexible-strip rigid body, the fixture, and the contact friction coefficient.

[0036] For example, suppose the flexible-strip rigid body consists of 5 convex polyhedra, with a total calculated volume of [value missing] and a density of [value missing]; the mass parameter [value missing] is then written. Its moment of inertia about the X-axis, [value missing], is calculated by integration. This moment of inertia is extremely small, while the moments of inertia about the Y and Z axes are relatively large. These parameters ensure that, in the simulation, the virtual flexible strip flips and slides in a manner consistent with the real physical world when pushed by the robotic arm.
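
The mass and inertia calculation described above can be sketched as follows. This is a minimal illustration, assuming a closed triangle mesh with outward-facing triangles and uniform density; the function names, the unit-cube test mesh, and the density value 2.0 are hypothetical and do not come from the patent. Each triangle and the origin form a signed tetrahedron, and the exact tetrahedron integrals are summed to obtain the volume, the mass $m = \rho V$, and $I_{xx}$ about the center of mass.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]

def signed_tet_volume(a: Vec3, b: Vec3, c: Vec3) -> float:
    """Signed volume of the tetrahedron spanned by the origin and a, b, c."""
    return (
        a[0] * (b[1] * c[2] - b[2] * c[1])
        - a[1] * (b[0] * c[2] - b[2] * c[0])
        + a[2] * (b[0] * c[1] - b[1] * c[0])
    ) / 6.0

def mass_and_ixx(vertices: Sequence[Vec3], faces: Sequence[Tuple[int, int, int]],
                 density: float) -> Tuple[float, float]:
    """Return (mass, I_xx about the center of mass) of a closed, outward-oriented mesh."""
    volume = 0.0
    cy = cz = 0.0          # volume-weighted centroid accumulators (y and z only)
    second_moment = 0.0    # integral of (y^2 + z^2) about the origin
    for i, j, k in faces:
        a, b, c = vertices[i], vertices[j], vertices[k]
        v = signed_tet_volume(a, b, c)
        volume += v
        # Centroid of a tetrahedron with one vertex at the origin is (a + b + c) / 4.
        cy += v * (a[1] + b[1] + c[1]) / 4.0
        cz += v * (a[2] + b[2] + c[2]) / 4.0
        for axis in (1, 2):
            p, q, r = a[axis], b[axis], c[axis]
            # Exact integral of coordinate^2 over the tetrahedron (origin vertex contributes 0).
            second_moment += v / 10.0 * (p * p + q * q + r * r + p * q + p * r + q * r)
    mass = density * volume
    cy, cz = cy / volume, cz / volume
    # Parallel-axis theorem: shift from the origin to the center of mass.
    ixx = density * second_moment - mass * (cy * cy + cz * cz)
    return mass, ixx

# Hypothetical test mesh: unit cube with outward-oriented triangles.
CUBE_VERTICES: List[Vec3] = [
    (0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),
    (0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1),
]
CUBE_FACES = [
    (0, 2, 1), (0, 3, 2), (4, 5, 6), (4, 6, 7),
    (0, 1, 5), (0, 5, 4), (3, 7, 6), (3, 6, 2),
    (0, 4, 7), (0, 7, 3), (1, 2, 6), (1, 6, 5),
]

cube_mass, cube_ixx = mass_and_ixx(CUBE_VERTICES, CUBE_FACES, density=2.0)
```

For a body made of several convex polyhedra, the same sums can be accumulated over the faces of every polyhedron before the final shift to the center of mass.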

[0037] A pose contact state vector sampling unit, used to read, based on the virtual stacked environment and along the simulation time sequence, the pose of the flexible-strip rigid body, the end-effector pose, and the contact point coordinates, concatenate them by dimension to generate a state vector, and output the state vector sequence by time step.

[0038] The dynamics simulation engine advances the evolution of the physical world at a fixed time step $\Delta t$, for example [value missing]. At each simulation time step $t$, the current state data is read through the simulation engine's API. First, the three-dimensional position vector $\mathbf{p}_t$ of the center of mass of the flexible-strip rigid body and the quaternion $\mathbf{q}_t$ representing its three-dimensional attitude are read; the quaternion satisfies the normalization constraint $\lVert \mathbf{q}_t \rVert = 1$. Second, the position vector $\mathbf{p}^{e}_t$ and the attitude quaternion $\mathbf{q}^{e}_t$ of the end effector of the robotic arm performing the operation are read. Next, the simulation engine is queried for the set of all contact points between the flexible-strip rigid body and the fixture rigid body. Assume that $N_c$ contact points are detected at the current moment. The $K$ key contact points with the largest contact normal force are selected, where $K$ is a preset constant, for example [value missing], and their coordinates $\mathbf{c}_{t,1}, \dots, \mathbf{c}_{t,K}$ are extracted. If fewer than $K$ contact points actually exist, the remaining entries are padded with zero vectors. Finally, all of the above data are concatenated in dimensional order to generate the state vector $\mathbf{s}_t$ at time step $t$. The construction formula is: $\mathbf{s}_t = [\mathbf{p}_t^{\top}, \mathbf{q}_t^{\top}, (\mathbf{p}^{e}_t)^{\top}, (\mathbf{q}^{e}_t)^{\top}, \mathbf{c}_{t,1}^{\top}, \dots, \mathbf{c}_{t,K}^{\top}]^{\top}$, where $\top$ denotes the transpose of a vector. The dimension of this vector is $14 + 3K$. The vectors are output continuously in chronological order to form the state vector sequence $\{\mathbf{s}_t\}$, which serves as a digital fingerprint describing the entire stacking process.

[0039] For example, with the time step set to [value missing], at step 50 of the simulation, that is, at time [value missing], the flexible strip has just touched the bottom surface of the fixture. The flexible-strip position [value missing] and the attitude quaternion [value missing], which indicates a 90-degree rotation, are read. The end effector of the robotic arm is located at [value missing]. Two contact points, [value missing] and [value missing], are detected, and the remaining contact point entries are padded with zeros. The final generated state vector is a vector containing 26 floating-point numbers. The resulting sequence records the complete dynamic trajectory of the flexible strip falling from the air and colliding with the fixture.
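
The concatenation and zero-padding step can be sketched as below. This is an illustrative sketch only: the function name `build_state_vector`, the `(normal_force, coordinates)` contact format, and all numeric values are assumptions, and $K = 4$ is chosen here because it is the value that gives the 26-element vector mentioned in the example ($14 + 3K = 26$).

```python
from typing import List, Sequence, Tuple

Contact = Tuple[float, Tuple[float, float, float]]  # (normal force, contact coordinates)

def build_state_vector(
    strip_pos: Sequence[float],    # 3 values: center-of-mass position of the strip
    strip_quat: Sequence[float],   # 4 values: attitude quaternion of the strip
    tool_pos: Sequence[float],     # 3 values: end-effector position
    tool_quat: Sequence[float],    # 4 values: end-effector attitude quaternion
    contacts: Sequence[Contact],
    k: int,
) -> List[float]:
    """Concatenate the poses and the k strongest contact points into one flat state vector."""
    # Keep the k contacts with the largest normal force.
    strongest = sorted(contacts, key=lambda c: c[0], reverse=True)[:k]
    coords = [list(point) for _, point in strongest]
    # Pad with zero vectors when fewer than k contacts exist.
    coords += [[0.0, 0.0, 0.0] for _ in range(k - len(coords))]

    state = [float(x) for x in strip_pos] + [float(x) for x in strip_quat]
    state += [float(x) for x in tool_pos] + [float(x) for x in tool_quat]
    for point in coords:
        state.extend(float(x) for x in point)
    return state

# Hypothetical values: two detected contacts, k = 4 slots.
example_contacts = [(1.5, (0.1, 0.2, 0.0)), (3.0, (0.4, 0.5, 0.0))]
example_state = build_state_vector(
    (0.0, 0.0, 0.1), (1.0, 0.0, 0.0, 0.0),
    (0.0, 0.0, 0.3), (1.0, 0.0, 0.0, 0.0),
    example_contacts, k=4,
)
```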

[0040] Optionally, the collision body generation unit includes: calculating the edge set and the edge length sequence of the face index set, and taking a quantile of the edge length sequence to obtain the edge length threshold. To identify minute details or oversampled regions of the mesh at the geometric level, the input 3D surface model is first analyzed. This model consists of the vertex coordinate set $\mathcal{V}$ and the face index set $\mathcal{F}$. The face index set $\mathcal{F}$ is traversed, all unique connecting edges are extracted, and the edge set $E$ is constructed. For each edge $e_k$ in the edge set, its physical length $l_k$ is calculated from the two endpoints $v_i$ and $v_j$ that it connects: $l_k = \lVert \mathbf{p}_i - \mathbf{p}_j \rVert_2 = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2 + (z_i - z_j)^2}$, where: $l_k$ is the Euclidean length of the $k$-th edge, i.e., the spatial length of a triangular mesh edge on the flexible-strip surface; $\mathbf{p}_i = (x_i, y_i, z_i)$ and $\mathbf{p}_j = (x_j, y_j, z_j)$ are the 3D coordinates of the vertices in the world coordinate system. After the traversal, an edge length sequence $L$ containing all edge lengths is generated. Subsequently, to determine the simplification level adaptively rather than hard-coding the threshold, the statistical quantile method is used. The edge length sequence $L$ is sorted in ascending order and the lower quantile threshold is calculated: $\tau = Q_{\alpha}(L)$, where $Q_{\alpha}(\cdot)$ denotes the lower $\alpha$-quantile of the sorted sequence and $\alpha$ is the quantile proportion, for example $\alpha = 0.15$, that is, 15%. Edges whose lengths fall in the lower quantile usually correspond to extremely fine regions or unnecessary curvature details of the mesh. Collapsing these edges first reduces the mesh face count as far as possible while keeping geometric deformation to a minimum.

[0041] For example, suppose the input flexible-strip model contains 100,000 edges. Calculating the length of every edge shows that the lengths range from [value missing] to [value missing]. These 100,000 length values are sorted in ascending order. With the quantile proportion set to $\alpha = 0.2$, the value at the 20,000th position after sorting is read, say [value missing], and this value is taken as the edge length threshold $\tau$. All edges shorter than $\tau$ are then marked as target edges to be processed.
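
A minimal sketch of the edge extraction and quantile threshold follows. The function names and the tiny two-triangle mesh are hypothetical, and reading the threshold at the 1-based position $\lceil \alpha \cdot |E| \rceil$ of the sorted lengths is an assumption about how the quantile position is rounded.

```python
import math
from typing import List, Sequence, Tuple

Edge = Tuple[int, int]

def unique_edges(faces: Sequence[Tuple[int, int, int]]) -> List[Edge]:
    """Collect each undirected edge of the triangle list exactly once."""
    edges = set()
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            edges.add((min(u, v), max(u, v)))
    return sorted(edges)

def edge_length_threshold(
    vertices: Sequence[Sequence[float]],
    faces: Sequence[Tuple[int, int, int]],
    alpha: float,
) -> Tuple[float, List[Edge]]:
    """Return the lower-quantile edge length threshold and the edges shorter than it."""
    edges = unique_edges(faces)
    lengths = sorted(math.dist(vertices[u], vertices[v]) for u, v in edges)
    # 1-based position in the sorted sequence; the rounding rule is an assumption.
    rank = max(1, math.ceil(alpha * len(lengths)))
    tau = lengths[rank - 1]
    targets = [(u, v) for u, v in edges if math.dist(vertices[u], vertices[v]) < tau]
    return tau, targets

# Hypothetical mesh: two triangles sharing an edge, with one very short edge (0, 1).
demo_vertices = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 1.0, 0.0), (2.0, 2.0, 0.0)]
demo_faces = [(0, 1, 2), (1, 3, 2)]
demo_tau, demo_targets = edge_length_threshold(demo_vertices, demo_faces, alpha=0.4)
```

The returned target edges are the candidates for the subsequent edge collapse; the collapse itself is not shown here.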

[0042] Selecting target edges with lengths less than the edge length threshold from the edge set, and performing edge collapse and merging to reconstruct the vertex coordinate set and the face index set, generating the simplified mesh. All edges in the edge set $E$ whose length $l_k$ is below $\tau$ are taken as target edges, and an edge collapse algorithm is performed on each target edge. The operation removes the two endpoints $v_i$ and $v_j$ of the target edge, replaces the original edge with a new vertex $\bar{v}$, and deletes the two triangular faces adjacent to that edge. To ensure that the simplified geometry approximates the original model as closely as possible, the position of the new vertex $\bar{v}$ is not simply taken as the midpoint but is determined by minimizing a quadric error metric. The objective is to minimize the sum of squared distances from $\bar{v}$ to the set of planes associated with the original vertices: $\Delta(\bar{v}) = \bar{v}^{\top} Q\, \bar{v}$, where $\bar{v}$ is written in homogeneous coordinates and $Q$ is the sum of the fundamental error matrices of the associated planes. Solving $\partial \Delta / \partial \bar{v} = 0$ gives the optimal coordinates that preserve the local curvature characteristics. This process iterates until all target edges have been processed, after which the vertex coordinate set and the face index set are reconstructed and the simplified mesh is output.

[0043] For example, the original mesh contains 200,000 faces with a file size of 10 MB; importing it directly into the physics engine would cause the frame rate to drop to 5 FPS. After edge collapse based on the 20th percentile, the number of mesh faces is reduced to 10,000. Although the face count is greatly reduced, the rounded edges of the flexible strip and the overall surface undulations remain visually almost unchanged, while the geometric data size falls to 0.5 MB, meeting the performance requirements of real-time simulation.

[0044] The simplified mesh is decomposed into a set of convex polyhedra, and the set of convex polyhedra is then combined to obtain the simulation collision mesh.

[0045] Collision detection in physics simulation engines is highly efficient and numerically stable for convex polyhedra, but is computationally complex and prone to interpenetration (clipping) errors for concave polyhedra. Therefore, the simplified non-convex mesh must be transformed into a set of convex polyhedra. The approximate convex decomposition algorithm V-HACD is employed, with the mathematical goal of finding a set of convex polyhedra $\{C_1, \dots, C_K\}$ such that the Hausdorff distance or the volume error between their union $\bigcup_{k} C_k$ and the original simplified mesh $M_s$ is as small as possible. The volume error $\varepsilon$ is defined as: $\varepsilon = \dfrac{\left| \mathrm{Vol}\left(\bigcup_{k=1}^{K} C_k\right) - \mathrm{Vol}(M_s) \right|}{\mathrm{Vol}(M_s)}$, where $\mathrm{Vol}(\cdot)$ denotes the volume of a geometry. The generated set of convex polyhedra $\{C_k\}$ is defined as the simulation collision mesh.

[0046] For example, a flexible strip in its natural state exhibits a complex "S"-shaped curved structure, which is a typical concave geometry. If a single convex hull were used to enclose it, the gaps in the "S" curve would be filled, so that in the simulation the robotic arm would hit a non-existent "air wall" before touching the flexible strip itself. By performing convex decomposition on this simplified mesh with the maximum number of convex hulls set to 16 and the volume error tolerance set to 1%, the "S"-shaped flexible strip is automatically cut into 12 interconnected small convex polyhedra. These small segments closely follow the curved path of the flexible strip. In the simulation, the robotic arm can reach accurately into the gaps of the curve without triggering erroneous collision responses.

[0047] Optionally, the state discretization and trajectory sampling module includes: an equal-width binning state index generation unit, used to perform equal-width binning on the state vector sequence by dimension, calculate the bin number into which each state vector falls, and arrange the results by time step to generate a state index sequence. To transform the continuous high-dimensional state space into a discrete state space suitable for reinforcement learning or probabilistic inference, the state vector sequence is first read. For each dimension $d$ of the state vector $\mathbf{s}_t$, for example one coordinate of the center of mass of the flexible-strip rigid body, the optimal bin width must be determined. Instead of an empirical preset, the Freedman-Diaconis (FD) rule is used for adaptive calculation; it approximately minimizes the integrated mean squared error between the histogram and the true probability density function. The bin width $h_d$ is calculated as: $h_d = \dfrac{2 \cdot \mathrm{IQR}(X_d)}{\sqrt[3]{N}}$, where: $h_d$ is the optimal bin width for the data of the $d$-th dimension; $N$ is the total number of samples in the state vector sequence, i.e., the total number of time steps; $X_d$ is the set of values of the $d$-th dimension over all state vectors in the sequence; $\mathrm{IQR}(X_d)$ is the interquartile range, i.e., the difference between the upper quartile $Q_3$ (75th percentile) and the lower quartile $Q_1$ (25th percentile). The IQR is used instead of the standard deviation because it is more robust to outliers that may appear in the simulation data. Based on the calculated $h_d$, the number of bins $B_d$ of the $d$-th dimension is calculated: $B_d = \left\lceil \dfrac{\max(X_d) - \min(X_d)}{h_d} \right\rceil$. Subsequently, for each state vector $\mathbf{s}_t$ in the sequence, its binning index $b_{t,d}$ in the $d$-th dimension is calculated. The binning indices of all dimensions are combined to generate a unique integer that identifies the state; this is the state index. The state indices are arranged in chronological order to generate the state index sequence.

[0048] For example, suppose the state vector sequence contains data from [value missing] time steps. For the dimension "Z-axis height of the flexible-strip end", the minimum value of the data is [value missing], the maximum value is [value missing], the upper quartile is [value missing], and the lower quartile is [value missing]. First, the interquartile range is calculated: [value missing]. Substituting into the FD formula gives the bin width [value missing], and the number of bins is calculated as 25. For a data value [value missing] at a given moment, its binning index is [value missing]. The algorithm thus determines 25 bins automatically from the dispersion of the data, avoiding both the information loss caused by too few bins and the sparsity caused by too many.
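
The Freedman-Diaconis binning can be sketched for a single dimension as follows. This is a hedged illustration: the quartiles use linear interpolation, the bin index is taken as the floor of the offset divided by the bin width, and the mixed-radix encoding in `state_index` is only one possible way to combine per-dimension indices into a unique integer. None of these choices, nor the function names and sample data, are specified in the text, and the sketch assumes a non-zero interquartile range.

```python
import math
from typing import List, Sequence, Tuple

def quantile(sorted_vals: Sequence[float], q: float) -> float:
    """Linear-interpolation quantile of an ascending sequence (assumed convention)."""
    pos = q * (len(sorted_vals) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

def fd_bins(values: Sequence[float]) -> Tuple[float, int]:
    """Freedman-Diaconis bin width h = 2 * IQR / N^(1/3) and the resulting bin count."""
    xs = sorted(values)
    iqr = quantile(xs, 0.75) - quantile(xs, 0.25)
    h = 2.0 * iqr / len(xs) ** (1.0 / 3.0)   # assumes iqr > 0
    n_bins = max(1, math.ceil((xs[-1] - xs[0]) / h))
    return h, n_bins

def bin_index(x: float, x_min: float, h: float, n_bins: int) -> int:
    """Equal-width bin index; the maximum value is clamped into the last bin."""
    return min(int((x - x_min) // h), n_bins - 1)

def state_index(bin_indices: Sequence[int], bins_per_dim: Sequence[int]) -> int:
    """Combine per-dimension bin indices into one integer (mixed-radix, an assumption)."""
    idx = 0
    for b, n in zip(bin_indices, bins_per_dim):
        idx = idx * n + b
    return idx

# Hypothetical samples for one state dimension.
samples: List[float] = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 8.0]
width, count = fd_bins(samples)
```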

[0049] An action frequency normalization probability table generation unit, used to count the frequency of each action corresponding to each state index in the state index sequence and normalize the counts to generate an action probability distribution table. First, a hash table is initialized to store the counts, and the state index sequence $\{s_t\}$ and its corresponding action sequence $\{a_t\}$ are traversed. For each state index $s$, the number of times each action $a$ is taken in that state is counted and recorded as the frequency $N(s, a)$. After counting is complete, the conditional probability of each action is calculated for every state index $s$. To prevent the probability of a feasible action from becoming zero because of sample sparsity, Laplace smoothing, also known as add-one smoothing, can be introduced. The calculation formula is: $P(a \mid s) = \dfrac{N(s, a) + \lambda}{\sum_{a'} N(s, a') + \lambda\,|A|}$, where: $P(a \mid s)$ is the probability of selecting action $a$ in discrete state $s$; $N(s, a)$ is the observed frequency of action $a$ in state $s$; $|A|$ is the total number of actions in the action space; $\lambda$ is the smoothing coefficient, usually taken as $\lambda = 0$, in which case only frequency normalization is performed, or $\lambda = 1$, in which case smoothing is performed. Here, unsmoothed direct frequency normalization is used. The final action probability distribution table stores the action probability distribution vector for every non-empty state index.

[0050] For example, in the discrete state with state index "1024", historical data show that a total of 100 decisions were made: the "move left" action occurred 20 times, the "move right" action 50 times, and the "stop" action 30 times. The probabilities are calculated as: "move left" $20/100 = 0.2$; "move right" $50/100 = 0.5$; "stop" $30/100 = 0.3$. In the final action probability distribution table, the key is "1024" and the value is the vector $[0.2, 0.5, 0.3]$. This faithfully reflects the operating tendency to move right under this specific geometric configuration.
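
The counting and normalization step can be sketched as below, using the counts from the state "1024" example. The function name and the string action labels are illustrative assumptions; `smoothing` plays the role of the Laplace coefficient, with 0 giving plain frequency normalization.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, List, Sequence

def action_probability_table(
    state_indices: Sequence[int],
    actions: Sequence[Hashable],
    action_space: Sequence[Hashable],
    smoothing: float = 0.0,
) -> Dict[int, List[float]]:
    """Count (state, action) pairs and normalize them into per-state probability vectors."""
    counts: Dict[int, Counter] = defaultdict(Counter)
    for s, a in zip(state_indices, actions):
        counts[s][a] += 1
    table: Dict[int, List[float]] = {}
    for s, c in counts.items():
        total = sum(c.values()) + smoothing * len(action_space)
        # One probability per action, in the order given by action_space.
        table[s] = [(c[a] + smoothing) / total for a in action_space]
    return table

# Counts from the example: 20 x "left", 50 x "right", 30 x "stop" in state 1024.
history_states = [1024] * 100
history_actions = ["left"] * 20 + ["right"] * 50 + ["stop"] * 30
prob_table = action_probability_table(
    history_states, history_actions, action_space=["left", "right", "stop"]
)
```

Only states that actually occur in the history receive an entry, which matches the statement that the table stores vectors for non-empty state indices.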

[0051] A cumulative probability sampling virtual trajectory generation unit, used to draw a random number for each state index, select an action from the action probability distribution table according to the cumulative probability, iterate to obtain the action sequence, and combine it with the time steps to generate the virtual trajectory set.

[0052] For each time step, given the current state $s$, the corresponding action probability distribution $P(\cdot \mid s)$ is retrieved. First, the cumulative distribution function (CDF) sequence $C_0, C_1, \dots, C_{|A|}$ is constructed, where $C_0 = 0$ and $C_k = \sum_{j=1}^{k} P(a_j \mid s)$; evidently $C_{|A|} = 1$. Next, a random number $u$ uniformly distributed between 0 and 1 is generated. The cumulative distribution sequence is traversed to find the index $k$ that satisfies $C_{k-1} < u \le C_k$; the $k$-th action $a_k$ is then the action selected in this sampling. This process simulates a random walk based on the probability distribution. After an action is selected, the state index of the next time step is deduced from kinematics. The above sampling steps are repeated until the length of the generated trajectory reaches the preset number of steps $T$.

[0053] For example, suppose the action probabilities for the current state are: action A ([value missing]), action B ([value missing]), and action C ([value missing]). The cumulative probability sequence [value missing] is constructed, which divides the interval as follows: [value missing] corresponds to action A, [value missing] to action B, and [value missing] to action C. The system generates a random number [value missing]. Since it is greater than [value missing] and less than or equal to [value missing], it falls within the range of action B, so action B is selected for this sampling. This step is recorded and sampling continues in the next state, ultimately forming a virtual operation trajectory containing 50 steps.
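
Cumulative-probability sampling can be sketched as follows. The probabilities used here are hypothetical, since the numeric values of the example are not available, and the `next_state` callable is a placeholder for the kinematic state transition, which the text does not specify in detail.

```python
import random
from itertools import accumulate
from typing import Callable, Dict, List, Optional, Sequence, Tuple

def sample_action(probs: Sequence[float], u: Optional[float] = None, rng=random) -> int:
    """Return the first index k whose cumulative probability C_k satisfies u <= C_k."""
    if u is None:
        u = rng.random()
    for k, c in enumerate(accumulate(probs)):
        if u <= c:
            return k
    return len(probs) - 1  # guard against rounding in the last cumulative value

def sample_trajectory(
    table: Dict[int, Sequence[float]],
    start_state: int,
    next_state: Callable[[int, int], int],  # hypothetical kinematic transition
    steps: int,
    rng=random,
) -> List[Tuple[int, int]]:
    """Roll out (state, action) pairs; every visited state must have a table entry."""
    state, trajectory = start_state, []
    for _ in range(steps):
        action = sample_action(table[state], rng=rng)
        trajectory.append((state, action))
        state = next_state(state, action)
    return trajectory

# Hypothetical distribution over actions A, B, C.
demo_probs = [0.2, 0.5, 0.3]
picked = sample_action(demo_probs, u=0.6)  # 0.2 < 0.6 <= 0.7, so index 1 (action B)
demo_traj = sample_trajectory({0: demo_probs}, 0, lambda s, a: 0, 50, random.Random(0))
```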

[0054] Optionally, the system further includes: projecting the denoised point cloud data onto time slices according to the state vector sequence, calculating the spatial occupancy density of the point cloud at each time step, and normalizing it to generate a density weight sequence.

[0055] First, the denoised point cloud data, denoted $P$ and expressed in the local coordinate system of the flexible strip, and the state vector sequence are received. The state vector sequence is traversed, and for each time step $t$ the pose information of the flexible-strip rigid body is extracted, i.e., the position vector $\mathbf{p}_t$ and the attitude quaternion $\mathbf{q}_t$. To perform the time-slice projection, the pose is converted into a $4 \times 4$ homogeneous transformation matrix $\mathbf{T}_t$, constructed as: $\mathbf{T}_t = \begin{bmatrix} \mathbf{R}_t & \mathbf{t}_t \\ \mathbf{0}^{\top} & 1 \end{bmatrix}$, where $\mathbf{R}_t$ is the $3 \times 3$ rotation matrix converted from the quaternion $\mathbf{q}_t$, and $\mathbf{t}_t = \mathbf{p}_t$ is the $3 \times 1$ translation vector. Next, each point of the denoised point cloud data $P$ in the local coordinate system is extended to homogeneous coordinates and left-multiplied by $\mathbf{T}_t$, which yields the absolute-coordinate point cloud $P_t$ of that time step in the simulated world coordinate system. After the projected point cloud $P_t$ is obtained, voxel rasterization is used to calculate the spatial occupancy density. A three-dimensional bounding region enclosing the point cloud at that moment is determined and divided into small cubes (voxels) with side length $l_v$. The value of $l_v$ is set within the range [value missing] to [value missing] and must be less than the average width of the flexible strip to ensure resolution. The number of non-empty voxels, i.e., voxels containing at least one point, is counted and denoted $M_t$. The spatial occupancy density $D_t$ at that time step is then calculated as: $D_t = \dfrac{N_p}{M_t \cdot V_{\mathrm{vox}}}$, where: $D_t$ is the physical compactness density value at time $t$; $N_p$ is the total number of points in the denoised point cloud data, a constant throughout the entire sequence; $M_t$ is the number of non-empty voxels detected at time $t$; $V_{\mathrm{vox}} = l_v^3$ is the volume of a single voxel. When the flexible strip is curled, folded, or compressed, the point cloud is more concentrated in space and occupies fewer voxels, so the calculated density $D_t$ is higher; conversely, when the flexible strip swings violently or unfolds, the point cloud occupies a larger space and more voxels, so $D_t$ is lower. Finally, to eliminate dimensional differences and support the subsequent weighted calculation, max-min normalization is applied to the density sequence $\{D_t\}$ calculated over all time steps to generate the final density weight sequence $\{w_t\}$. The normalization formula is: $w_t = \dfrac{D_t - D_{\min}}{D_{\max} - D_{\min}}$, where $D_{\min}$ and $D_{\max}$ are the minimum and maximum densities over the entire sequence. As shown in Figure 3, the samples are grouped by quantile of $w_t$, and the sample distribution of the projected-area fluctuation is displayed as a raincloud plot comprising the violin distribution, box statistics, and jittered scatter details. The higher the quantile of $w_t$, the further the corresponding distribution shifts to the left and the smaller its dispersion, indicating that high density weights correspond to more stable projected-area fluctuations.

[0056] For example, assume that at simulation time step [value missing], the flexible strip has just been grasped by the robotic arm and is in a stable, slightly bent state. The total number of points in the denoised point cloud data is [value missing], and the voxel side length is set to [value missing]. After matrix transformation and voxelization statistics, the point cloud is found to occupy only [value missing] voxels. The raw density at this moment is calculated as [value missing]. This density value indicates that the flexible strip is compact and its physical state is relatively stable at this point.

[0057] For example, assume that at simulation time step [value missing], the flexible strip undergoes violent deformation and drifting and exhibits a loose, unfolded shape. The same [value missing] points are dispersed over a wider region, and the number of non-empty voxels surges to [value missing]. The raw density at that moment is calculated as [value missing]. Compared with the previous moment, the density decreases significantly. If, over the entire process, [value missing], then after normalization the weight at the first moment is [value missing] and the weight at the second moment is [value missing]. This allows subsequent algorithms to give greater attention to stable configurations automatically while down-weighting unstable, volatile ones.
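
The time-slice projection, voxel occupancy density, and max-min normalization can be sketched as below. All names and numbers are hypothetical. The quaternion is assumed to be ordered $(w, x, y, z)$ and normalized, and voxels are indexed directly from world coordinates rather than from an explicit bounding box, which gives the same count of non-empty voxels up to grid alignment.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]

def quat_to_rot(q: Sequence[float]) -> List[List[float]]:
    """Unit quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

def transform(points: Sequence[Point], q: Sequence[float], t: Sequence[float]) -> List[Point]:
    """Apply the rigid transform R p + t, equivalent to the homogeneous matrix product."""
    r = quat_to_rot(q)
    return [
        tuple(sum(r[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))
        for p in points
    ]

def occupancy_density(points: Sequence[Point], voxel: float) -> float:
    """Number of points divided by the total volume of the non-empty voxels."""
    occupied = {tuple(math.floor(c / voxel) for c in p) for p in points}
    return len(points) / (len(occupied) * voxel ** 3)

def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale values to [0, 1]; a constant sequence maps to all zeros (an assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical point sets: one compact, one spread out along X.
compact = [(0.1, 0.1, 0.1), (0.2, 0.2, 0.2), (0.3, 0.3, 0.3), (0.4, 0.4, 0.4)]
spread = [(0.5, 0.5, 0.5), (1.5, 0.5, 0.5), (2.5, 0.5, 0.5), (3.5, 0.5, 0.5)]
densities = [occupancy_density(compact, 1.0), occupancy_density(spread, 1.0)]
weights = min_max_normalize(densities)
rotated = transform([(1.0, 0.0, 0.0)], (math.sqrt(0.5), 0.0, 0.0, math.sqrt(0.5)), (0.0, 0.0, 1.0))
```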

[0058] Optionally, the overlapping rewards and distribution evolution module includes: a projected area sequence generation unit, used to read the pose of the flexible-strip rigid body by time step based on the virtual trajectory set, transform the poses into the same coordinate system, calculate the absolute value of the difference between the projected areas of the mesh at adjacent time steps, and arrange the results by time step to generate the interlayer overlap area sequence. First, a geometric analysis is performed on each virtual trajectory generated by Monte Carlo sampling. For each time step $t$ of the trajectory, the pose matrix of the flexible-strip rigid body is read, and all vertices of the simulation collision mesh that makes up the flexible strip are projected onto the two-dimensional reference plane on which the fixture lies. The area of the resulting projected polygon is then calculated. To ensure accuracy and generality, the polygon area (shoelace) formula is used: $A_t = \dfrac{1}{2} \left| \sum_{i=1}^{n} (x_i y_{i+1} - x_{i+1} y_i) \right|$, with $(x_{n+1}, y_{n+1}) = (x_1, y_1)$, where: $A_t$ is the projected area of the flexible strip on the reference plane at time $t$; $n$ is the total number of vertices of the projected polygon; $(x_i, y_i)$ are the two-dimensional coordinates of the $i$-th vertex on the reference plane, and the vertices must be ordered either counterclockwise or clockwise. After the area at each time step has been calculated, the absolute value of the difference between the projected areas of adjacent time steps is calculated: $S_t = |A_t - A_{t-1}|$. Physically, $S_t$ characterizes the dynamic instability of the flexible strip: when the strip flips, folds, or oscillates violently, the projected area fluctuates drastically and $S_t$ is relatively large; when the strip is laid down smoothly, $S_t$ approaches zero. The sequence $\{S_t\}$ is the interlayer overlap area sequence.

[0059] For example, at step 30 of a virtual trajectory, the projected area of the flexible strip is [value missing]. At step 31, because of an improper operation by the robotic arm, the flexible strip tips over and the projected area decreases sharply to [value missing]. The absolute value of the difference at this point is calculated as [value missing]. This relatively large value is recorded and used as the basis for penalizing the action sequence.
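
The projected-area calculation and the adjacent-step differences can be sketched as below. The function names and the sample polygons and areas are hypothetical; the input polygon is assumed to be the already ordered outline of the projected mesh.

```python
from typing import List, Sequence, Tuple

def polygon_area(vertices_2d: Sequence[Tuple[float, float]]) -> float:
    """Shoelace formula for a simple polygon whose vertices are listed in order."""
    n = len(vertices_2d)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices_2d[i]
        x2, y2 = vertices_2d[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0

def area_differences(areas: Sequence[float]) -> List[float]:
    """Absolute difference of projected areas between adjacent time steps."""
    return [abs(b - a) for a, b in zip(areas, areas[1:])]

# Hypothetical projected outlines and area sequence.
unit_square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
right_triangle = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
fluctuations = area_differences([4.0, 4.0, 1.5])  # stable step, then a sudden drop
```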

[0060] A time-step weighted product summation unit, used to perform weighted summation on the interlayer overlap area sequence based on the density weight sequence to generate the weighted sum of overlap areas. The physical-state compactness $w_t$ is combined with the stability index $S_t$ to calculate the total reward score $R$ of the entire trajectory. Because $S_t$ represents instability, it must be mapped inversely when the return is calculated. The weighted sum of overlap areas is calculated with the following weighted summation formula: [formula missing], where: $R$ is the weighted sum of overlap areas of the virtual trajectory, and a higher value indicates a better trajectory; $w_t$ is the density weight at time $t$, which ensures that the stability of key morphological changes has the greatest influence on the total score; the formula further contains a stability decay term with a sensitivity coefficient, which ensures that the reward diminishes rapidly as the absolute difference $S_t$ increases.

[0061] The evolutionary distribution table generation unit is used to perform multiplicative scaling on the action probability distribution table using the overlapping area weighted sum as a scaling factor, and normalize it to generate the evolutionary action distribution table.

[0062] Using the weighted sum of overlap areas $R$ as the reward signal, the action probability distribution table that generated the trajectory is modified. For each state index $s$ visited in the trajectory and the action $a$ chosen there, the new probability $P'(a \mid s)$ in the evolutionary action distribution table is updated according to the following rule: [formula missing], where: $\eta$ is the learning rate, for example [value missing]; $\propto$ denotes direct proportionality. After the update, the probabilities of all actions in each state are normalized so that $\sum_{a} P'(a \mid s) = 1$. Through this process, action sequences that produce a high $R$ value are reinforced, and the selection probability of the corresponding actions is increased in the evolutionary action distribution table.

[0063] For example, suppose that for state index "55" the probability of the action "press down" in the initial probability table is [value missing]. A generated trajectory performs "press down" at state "55" and subsequently maintains extremely high stability, and the final weighted sum of overlap areas is calculated as [value missing]. With the learning rate set to [value missing], the updated unnormalized probability weight is [value missing]. After normalization, the probability of this action rises from [value missing] to [value missing]. This means that in the next round of simulation or in actual control, "press down" is more likely to be selected in state "55", thereby reproducing the high-return stacking result.
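
A multiplicative reinforcement of the visited state-action pairs, followed by renormalization, can be sketched as below. The exact update rule is not recoverable from the text, so the scaling factor $1 + \eta R$ used here is purely an assumption chosen for illustration, as are the function name and all numeric values.

```python
from typing import Dict, List, Sequence, Tuple

def evolve_table(
    table: Dict[int, Sequence[float]],
    visited: Sequence[Tuple[int, int]],   # (state index, action index) pairs of one trajectory
    reward: float,
    eta: float,
) -> Dict[int, List[float]]:
    """Scale the probabilities of visited (state, action) pairs and renormalize each state."""
    new_table = {s: list(p) for s, p in table.items()}  # leave the input table untouched
    for s, a in visited:
        # Assumed scaling factor; the patent's actual formula is not available.
        # A pair visited several times is scaled once per visit.
        new_table[s][a] *= 1.0 + eta * reward
    for s, probs in new_table.items():
        total = sum(probs)
        new_table[s] = [p / total for p in probs]
    return new_table

# Hypothetical numbers: state 55, action 0 ("press down"), reward 10, learning rate 0.1.
initial_table = {55: [0.5, 0.5]}
evolved = evolve_table(initial_table, visited=[(55, 0)], reward=10.0, eta=0.1)
```

With these hypothetical inputs the visited action is doubled before normalization, so its probability rises from 0.5 to about 0.667 while the row still sums to 1.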

[0064] Optionally, the time-step weighted product summation unit includes: performing a term-by-term product operation on the density weight sequence and the interlayer overlap area sequence to generate a weighted area sequence. The term-by-term multiplication dynamically modulates the geometric overlap measure by the stability of the physical state. Two input sequences that are strictly aligned in the time dimension are received: the interlayer overlap area sequence $\{S_t\}$ and the density weight sequence $\{w_t\}$. The entire simulation timeline is traversed, and for each time step $t$ the interlayer overlap area $S_t$ at that moment is multiplied by the corresponding density weight $w_t$ to generate the weighted area value $\tilde{S}_t$ at that moment: $\tilde{S}_t = w_t \cdot S_t$, where: $\tilde{S}_t$ is the weighted area value at time $t$, in the same unit as area; $S_t$ is the interlayer overlap area at time $t$, which quantifies the projected geometry of the flexible strip at that time; $w_t$ is the dimensionless density weight at time $t$, with value range $[0, 1]$. If at some moment the flexible strip is in a loose, violently shaking state, i.e., $w_t$ is close to 0, then its contribution to the final evaluation is suppressed even if its projected area is large; conversely, when the flexible strip is in a compact and stable state, i.e., $w_t$ is close to 1, its geometric overlap effect is fully retained.

[0065] For example, assume the simulation trajectory is 100 steps long. At time step [value missing], the flexible strip has just begun to fall and its attitude is unstable; the interlayer overlap area is large, [value missing], but because the point cloud is dispersed the corresponding density weight is only [value missing]. Multiplying gives [value missing], so the contribution of that moment is significantly weakened. At time step [value missing], the flexible strip has been laid stably on the fixture, with an interlayer overlap area of [value missing]; at this point the point cloud density is extremely high and the density weight is [value missing]. Multiplying gives [value missing]. The comparison shows that although the raw area at the first moment is larger, after weighting the stable state at the second moment is correctly assigned the higher value.

[0066] Perform a summation operation on the weighted area sequence to generate a weighted sum of overlapping areas.

[0067] After the weighted area sequence is generated, a summation is performed to compress the temporal performance of the entire virtual trajectory into a single scalar metric, which facilitates gradient calculation in reinforcement learning algorithms. An accumulator is created and initialized to zero, the weighted area sequence $\{\tilde{S}_t\}$ is traversed, and the values of all time steps are added up. The final sum is the weighted sum of overlap areas $R$: $R = \sum_{t=1}^{T} \tilde{S}_t = \sum_{t=1}^{T} w_t \cdot S_t$, where: $R$ is the total reward value of this virtual trajectory and the core basis for policy updates; $T$ is the total number of time steps of this trajectory; $\sum_{t=1}^{T}$ denotes summation over all terms from $t = 1$ to $t = T$. The weighted sum of overlap areas $R$ comprehensively reflects the quality of the stacking operation: it requires not only a large final stacked area but also that the flexible strip maintain high physical stability throughout the movement. This value serves as a proportional coefficient in the gradient ascent algorithm to adjust the action probability distribution.

[0068] For example, the weighted area sequence containing 100 time steps is summed. The sequence data are [value missing], and summing gives a total of [value missing]. In contrast, another trajectory generated from a different action sequence also achieves stacking, but the flexible strip flips violently several times along the way, so its weighted area sequence sums to only [value missing]. In the evolution process, because the first total is larger than the second, the probability of generating the first trajectory is increased significantly, which guides the control strategy to converge toward operation that is both stable and accurate.
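
The term-by-term product and the final summation can be sketched in a few lines. The function name and the numeric values are hypothetical, since the values of the example are not available.

```python
from typing import List, Sequence, Tuple

def weighted_overlap_sum(
    weights: Sequence[float], areas: Sequence[float]
) -> Tuple[List[float], float]:
    """Multiply aligned density weights and overlap areas, then sum the products."""
    if len(weights) != len(areas):
        raise ValueError("weight and area sequences must be aligned in time")
    weighted = [w * s for w, s in zip(weights, areas)]
    return weighted, sum(weighted)

# Hypothetical two-step comparison: a large area with a low weight
# versus a smaller area with a high weight.
weighted_seq, total_reward = weighted_overlap_sum([0.5, 1.0], [10.0, 8.0])
```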

[0069] Optionally, the control distribution and divergence correction module includes: an instruction mapping and issuing unit, used to select, for each state index of the evolutionary action distribution table, the action index with the highest action probability, map the selected action indexes to a control instruction sequence, and issue the sequence for execution. First, the evolutionary action distribution table is received. To ensure execution efficiency, a greedy strategy is used for decision-making: for each discrete state index, the action probability vector in that state is traversed and the action index with the highest probability value is selected. Next, the instruction mapping library is consulted. This library defines the specific physical parameters corresponding to each abstract action index. For example, action index "ID_05" corresponds to "the end effector descends 5 mm along the Z-axis", and action index "ID_12" corresponds to "the end effector rotates 15 degrees". The selected action indexes are converted in order into a control instruction sequence that the robot controller can recognize, which is then sent to the robot actuator via industrial Ethernet to drive the robotic arm to complete the actual soft strip stacking operation.

[0070] For example, suppose that under state index "204" the evolutionary action distribution table gives "Action A" a probability of 0.1, "Action B" a probability of 0.8, and "Action C" a probability of 0.1. "Action B", which has the highest probability, is selected. Looking up the instruction mapping library, the control instruction corresponding to "Action B" is "MoveL Offs(pCurrent,0,0,-2),v100,fine,tool0", an ABB robot instruction that moves the tool linearly to a point 2 mm below the current position. This instruction is generated and sent to the robot for execution.
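The greedy selection and table lookup in this example can be sketched as follows. The data layout (nested dictionaries keyed by state index and action name) and the function names are assumptions made for illustration. Only the probabilities and the "Action B" instruction text come from the example above; the other library entries are placeholders.

```python
from typing import Dict, List


def select_greedy_action(action_probs: Dict[str, float]) -> str:
    """Return the action index with the highest probability."""
    return max(action_probs, key=action_probs.get)


def map_to_instructions(table: Dict[int, Dict[str, float]],
                        state_indices: List[int],
                        library: Dict[str, str]) -> List[str]:
    """Greedy action per state index, translated through the mapping library."""
    return [library[select_greedy_action(table[s])] for s in state_indices]


# Evolutionary action distribution table for state index 204 (from the example).
table = {204: {"A": 0.1, "B": 0.8, "C": 0.1}}

# Hypothetical instruction mapping library; only entry "B" is taken from the text.
library = {
    "A": "<instruction for Action A>",
    "B": "MoveL Offs(pCurrent,0,0,-2),v100,fine,tool0",
    "C": "<instruction for Action C>",
}

commands = map_to_instructions(table, [204], library)
```

Here `commands` holds the single instruction string for "Action B", ready to be handed to whatever transport layer sends it to the controller.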

[0071] The point cloud centroid trajectory point generation unit is used to collect a point cloud frame sequence of the stacking once the instructions are executed, extract the three-dimensional coordinates from each frame and calculate their mean as the centroid point, and arrange the centroid points by time step to generate a real trajectory set. While the robot executes the control commands, the structured light vision sensor continuously acquires images at a fixed frequency and generates a point cloud frame sequence. To efficiently characterize the overall motion trend of the soft strip and filter out local deformation noise, the centroid is used as the trajectory feature point. For the $t$-th frame of point cloud data, the set of three-dimensional coordinates of all valid points is extracted, and the centroid coordinates of the soft strip at that moment are calculated as follows:

$$\mathbf{c}_t = \frac{1}{N_t} \sum_{i=1}^{N_t} \mathbf{p}_{t,i}$$

where: $\mathbf{c}_t$ is the three-dimensional coordinate vector of the centroid of the soft strip at time $t$; $N_t$ is the total number of valid points in the current frame's point cloud, obtained from the sensor count; $\mathbf{p}_{t,i}$ is the three-dimensional coordinate vector of the $i$-th point in the $t$-th frame; and $\sum$ denotes vector summation. The centroid reflects the position change of the soft strip at a macroscopic level. All calculated centroids are arranged in chronological order to generate the real trajectory set.

[0072] For example, suppose the 10th collected point cloud frame contains 5000 valid points. The X, Y, and Z coordinates of these 5000 points are summed; assume the sum of the X coordinates is 500 mm, the sum of the Y coordinates is 1000 mm, and the sum of the Z coordinates is 250 mm. The centroid coordinates of this frame are then (500/5000, 1000/5000, 250/5000) = (0.1, 0.2, 0.05) mm. This represents the overall spatial position of the soft strip at that moment.
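The per-frame centroid and the resulting real trajectory can be sketched as below. This is a minimal pure-Python illustration that assumes each frame is already a list of valid (x, y, z) points; the names `frame_centroid` and `real_trajectory` and the sample frames are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def frame_centroid(points: Sequence[Point]) -> Point:
    """Mean of the 3D coordinates of all valid points in one frame."""
    n = len(points)
    if n == 0:
        raise ValueError("frame contains no valid points")
    sx = sum(p[0] for p in points)
    sy = sum(p[1] for p in points)
    sz = sum(p[2] for p in points)
    return (sx / n, sy / n, sz / n)


def real_trajectory(frames: Sequence[Sequence[Point]]) -> List[Point]:
    """Centroids arranged in frame (time step) order."""
    return [frame_centroid(frame) for frame in frames]


# Two tiny hypothetical frames with two points each.
frames = [
    [(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)],
    [(1.0, 1.0, 1.0), (3.0, 5.0, 7.0)],
]
trajectory = real_trajectory(frames)
```

For production-sized frames a vectorized mean over an array of points would be the more usual choice; the loop form is kept here only to match the formula term by term.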

[0073] The divergence calculation and distribution table update unit is used to align the real trajectory set and the virtual trajectory set by time step, calculate the trajectory point distance sequence and normalize it to generate a relative entropy divergence sequence, and perform probability scaling and normalization on the evolution action distribution table according to the relative entropy divergence sequence.

[0074] Closed-loop correction from simulation to reality is crucial because simulation model parameters, such as the friction coefficient and stiffness, inevitably deviate from the real physical environment, which leads to inconsistencies between the virtual and real trajectories. First, the real trajectory set and the virtual trajectory set generated in simulation using the same action sequence are aligned by time step, using linear interpolation if necessary. Next, the trajectory point distance at each time step is calculated. Subsequently, to quantify the impact of this deviation on the policy distribution, the relative entropy divergence sequence is calculated. The relative entropy divergence value of step $t$ is obtained from the trajectory point distance by a normalized exponential mapping, in which: $D_t$ is the relative entropy divergence value at time $t$, a normalized value for which a larger value indicates a greater difference between reality and simulation; $d_t$ is the Euclidean distance between the real centroid and the virtual centroid; and $\sigma$ is the allowable standard deviation, which is used to adjust the sensitivity to errors. Finally, the evolutionary action distribution table is updated based on this divergence sequence. As shown in Figure 4, the data characterize the consistency risk of high-return trajectories in actual execution. The bin density and quantile medians show that the divergence tends to be lower as the return increases, which provides a statistical basis for the probability scaling correction.
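The alignment, distance, and divergence steps can be sketched as follows. Two points are assumptions and should be read as such: the text does not give the exact exponential mapping, so the Gaussian-style form `1 - exp(-d^2 / (2 * sigma^2))` is used purely as one plausible normalized mapping, and the rule of scaling an action's probability by `1 - divergence` is likewise hypothetical. All function names are illustrative.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def resample(traj: Sequence[Point], n_steps: int) -> List[Point]:
    """Linearly interpolate a trajectory onto n_steps evenly spaced time steps."""
    if n_steps == 1 or len(traj) == 1:
        return [traj[0]] * n_steps
    out = []
    for k in range(n_steps):
        pos = k * (len(traj) - 1) / (n_steps - 1)
        i = min(int(pos), len(traj) - 2)
        frac = pos - i
        out.append(tuple(a + frac * (b - a) for a, b in zip(traj[i], traj[i + 1])))
    return out


def distance_sequence(real: Sequence[Point], virtual: Sequence[Point]) -> List[float]:
    """Euclidean distance between real and virtual centroids at each time step."""
    return [math.dist(r, v) for r, v in zip(real, virtual)]


def divergence_sequence(distances: Sequence[float], sigma: float) -> List[float]:
    """ASSUMED mapping of distance to [0, 1): 1 - exp(-d^2 / (2 * sigma^2))."""
    return [1.0 - math.exp(-(d * d) / (2.0 * sigma * sigma)) for d in distances]


def scale_by_divergence(probs: Sequence[float], action: int, divergence: float) -> List[float]:
    """ASSUMED correction: damp the executed action by (1 - divergence), renormalize."""
    scaled = list(probs)
    scaled[action] *= 1.0 - divergence
    total = sum(scaled)
    return [p / total for p in scaled]


# Hypothetical trajectories: the real one has fewer samples than the virtual one.
real = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
virtual = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 1.0)]
aligned_real = resample(real, len(virtual))
dists = distance_sequence(aligned_real, virtual)
divs = divergence_sequence(dists, sigma=1.0)
corrected = scale_by_divergence([0.2, 0.8], action=1, divergence=0.5)
```

A smaller `sigma` makes the mapping saturate faster, so the same physical deviation is penalized more strongly.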

[0075] Based on the same inventive concept, this invention also provides a real-time identification method for the position of automotive soft strip stacks based on machine vision and deep learning, the method comprising: controlling the structured light vision sensor to collect discrete point cloud coordinate data of the soft strip surface, performing least squares surface fitting and outlier removal, and performing Poisson surface reconstruction to fill holes and generate a three-dimensional surface model; converting the three-dimensional surface model into a simulation collision mesh and importing it into the dynamic simulation space to construct a virtual stacked environment, and extracting a state vector sequence from the virtual stacked environment; discretizing the state vector sequence into a state index sequence, establishing an action probability distribution table for the state index sequence, and performing Monte Carlo sampling according to the action probability distribution table to generate a virtual trajectory set; calculating the interlayer overlap area sequence for each trajectory in the virtual trajectory set and aggregating it into a weighted sum of overlap areas, then performing a gradient ascent update on the action probability distribution table according to the weighted sum of overlap areas to generate the evolutionary action distribution table; and parsing the evolutionary action distribution table into a control command sequence and issuing it for execution, collecting point clouds after stacking to generate a real trajectory set, calculating the relative entropy divergence sequence between the real trajectory set and the virtual trajectory set, and updating the evolutionary action distribution table.

[0076] It should be noted that the formulas mentioned above, through the principle of dimensional consistency and mathematical standardization methods (such as normalization, dimensionless parameter conversion, or unit system unification), convert physical quantities of different natures into unitless standard values or into parameters that can be superimposed in the same dimension. This eliminates the interference of different dimensions with the operational logic, so that the formulas retain the original data distribution characteristics while remaining mathematically sound and consistent with objective laws. These are conventional technical methods and will not be elaborated further. The electrical connections between the units described above are not limited to direct connections; any direct or indirect connection is applicable to the embodiments of this invention as long as it achieves the purpose of this invention. The above are merely exemplary embodiments of this invention and should not be construed as limiting the scope of this invention.

[0077] All equivalent changes and modifications made in accordance with the teachings of this invention remain within the scope of this invention. Those skilled in the art will readily conceive of other embodiments of this invention upon considering the specification and practicing the disclosure herein. This application is intended to cover any variations, uses, or adaptations of this invention that follow the general principles of this invention and include common knowledge or conventional techniques in the art not described herein.

Claims

1. A real-time recognition system for the position of automotive soft strip stacks based on machine vision and deep learning, characterized in that the system includes: a point cloud fitting and reconstruction module, used to control a structured light vision sensor to collect discrete point cloud coordinate data of the soft strip surface, perform least squares surface fitting and outlier removal, and perform Poisson surface reconstruction to fill holes and generate a three-dimensional surface model; a collision mesh import simulation module, used to convert the three-dimensional surface model into a simulation collision mesh and import it into the dynamic simulation space to construct a virtual stacked environment, and to extract a state vector sequence from the virtual stacked environment; a state discretization and trajectory sampling module, used to discretize the state vector sequence into a state index sequence, establish an action probability distribution table for the state index sequence, and perform Monte Carlo sampling according to the action probability distribution table to generate a virtual trajectory set; an overlap reward and distribution evolution module, used to calculate the interlayer overlap area sequence for each trajectory of the virtual trajectory set and aggregate it into a weighted sum of overlap areas, and to perform a gradient ascent update on the action probability distribution table according to the weighted sum of overlap areas to generate an evolutionary action distribution table; and a control distribution and divergence correction module, used to parse the evolutionary action distribution table into a control command sequence and issue it for execution, collect point clouds after stacking to generate a real trajectory set, calculate the relative entropy divergence sequence between the real trajectory set and the virtual trajectory set, and update the evolutionary action distribution table.

2. The real-time recognition system for the position of automotive soft strip stacks based on machine vision and deep learning according to claim 1, characterized in that the point cloud fitting and reconstruction module includes: a point cloud acquisition unit, used to control the structured light vision sensor to acquire multi-angle grating image sequences of the soft strip surface, perform pixel-level phase principal value calculation and phase unwrapping, and map the two-dimensional pixel coordinates to three-dimensional spatial coordinates according to the camera calibration to construct a sparse point cloud dataset; a neighborhood tangent plane projection distance denoising unit, used to perform radius filtering on the sparse point cloud dataset to generate a neighborhood point set, iteratively fit a local tangent plane to the neighborhood point set and calculate the sequence of point-to-plane projection distances, take an upper quantile of the projection distance sequence as a threshold, and delete the coordinate points that exceed the threshold to obtain denoised point cloud data; and a Poisson reconstruction mesh generation unit, used to calculate a set of point normal vectors for the denoised point cloud data and perform Poisson surface reconstruction based on the set of point normal vectors to generate the three-dimensional surface model, the three-dimensional surface model including a vertex coordinate set and a patch index set.

3. The real-time recognition system for the position of automotive soft strip stacks based on machine vision and deep learning according to claim 2, characterized in that the collision mesh import simulation module includes: a collision body generation unit, used to read the vertex coordinate set and the patch index set, perform mesh simplification by edge-length-based merging to generate a simplified mesh, and perform convex decomposition on the simplified mesh to construct the simulation collision mesh; a virtual stacked rigid body assembly unit, used to load the simulation collision mesh into the dynamic simulation space, establish the soft-pack rigid body and the fixture rigid body, and write the mass parameters and inertia parameters to generate the virtual stacked environment; and a pose contact state vector sampling unit, used to read, based on the virtual stacked environment and along the simulation time sequence, the pose of the soft-pack rigid body, the pose of the end effector, and the coordinates of the contact points, concatenate them by dimension to generate state vectors, and output the state vector sequence by time step.

4. The real-time recognition system for the position of automotive soft strip stacks based on machine vision and deep learning according to claim 3, characterized in that the collision body generation unit is specifically used to: calculate the edge set and the edge length sequence of the patch index set, and take a quantile of the edge length sequence as the edge length threshold; select from the edge set the target edges whose length is less than the edge length threshold, and perform edge collapse and merging to reconstruct the vertex coordinate set and the patch index set and generate the simplified mesh; and decompose the simplified mesh into a set of convex polyhedra and combine the set of convex polyhedra to obtain the simulation collision mesh.

5. The real-time recognition system for the position of automotive soft strip stacks based on machine vision and deep learning according to claim 1, characterized in that the state discretization and trajectory sampling module includes: an equal-width binning state index generation unit, used to perform equal-width binning on the state vector sequence dimension by dimension, calculate the bin number into which each state vector falls, and arrange the bin numbers by time step to generate the state index sequence; an action frequency normalization probability table generation unit, used to count the frequency of occurrence of the actions corresponding to each state index in the state index sequence and normalize the frequencies to generate the action probability distribution table; and a cumulative probability sampling virtual trajectory generation unit, used to draw a random number for each state index, select an action from the action probability distribution table according to the cumulative probability, iterate to obtain an action sequence, and combine it with the time steps to generate the virtual trajectory set.

6. The real-time recognition system for the position of automotive soft strip stacks based on machine vision and deep learning according to claim 2, characterized in that the system is further used to: project the denoised point cloud data onto time slices according to the state vector sequence, calculate and normalize the point cloud spatial occupancy density at each time step, and generate a density weight sequence.

7. The real-time recognition system for the position of automotive soft strip stacks based on machine vision and deep learning according to claim 6, characterized in that the overlap reward and distribution evolution module includes: a projected area sequence generation unit, used to read the pose of the soft-pack rigid body by time step based on the virtual trajectory set, transform the poses of the soft-pack rigid body into the same coordinate system, calculate the absolute value of the difference between the projected mesh areas of adjacent time steps, and arrange the results by time step to generate the interlayer overlap area sequence; a time-step weighted product summation unit, used to perform a weighted summation on the interlayer overlap area sequence based on the density weight sequence to generate the weighted sum of overlap areas; and an evolutionary distribution table generation unit, used to perform multiplicative scaling on each action probability in the action probability distribution table using the weighted sum of overlap areas as the scaling factor, and to normalize the result to generate the evolutionary action distribution table.

8. The real-time recognition system for the position of automotive soft strip stacks based on machine vision and deep learning according to claim 7, characterized in that the time-step weighted product summation unit is specifically used to: perform a term-by-term product of the density weight sequence and the interlayer overlap area sequence to generate a weighted area sequence; and perform a summation on the weighted area sequence to generate the weighted sum of overlap areas.

9. The real-time recognition system for the position of automotive soft strip stacks based on machine vision and deep learning according to claim 1, characterized in that the control distribution and divergence correction module includes: an instruction mapping and issuing unit, used to select, for each state index of the evolutionary action distribution table, the action index with the highest action probability, map the selected action indexes to a control instruction sequence, and issue the sequence for execution; a point cloud centroid trajectory point generation unit, used to collect point clouds of the stacking once the instructions are executed to obtain a point cloud frame sequence, extract the three-dimensional coordinates from the point cloud frame sequence and calculate the coordinate mean of each frame as the centroid point, and arrange the centroid points by time step to generate the real trajectory set; and a divergence calculation and distribution table update unit, used to align the real trajectory set and the virtual trajectory set by time step, calculate the trajectory point distance sequence and normalize it to generate the relative entropy divergence sequence, and perform probability scaling and normalization on the evolutionary action distribution table according to the relative entropy divergence sequence.

10. A real-time identification method for the position of automotive soft strip stacks based on machine vision and deep learning, applied to the real-time recognition system for the position of automotive soft strip stacks based on machine vision and deep learning according to any one of claims 1-9, characterized in that the method includes: controlling the structured light vision sensor to collect discrete point cloud coordinate data of the soft strip surface, performing least squares surface fitting and outlier removal, and performing Poisson surface reconstruction to fill holes and generate a three-dimensional surface model; converting the three-dimensional surface model into a simulation collision mesh and importing it into the dynamic simulation space to construct a virtual stacked environment, and extracting a state vector sequence from the virtual stacked environment; discretizing the state vector sequence into a state index sequence, establishing an action probability distribution table for the state index sequence, and performing Monte Carlo sampling according to the action probability distribution table to generate a virtual trajectory set; calculating the interlayer overlap area sequence for each trajectory in the virtual trajectory set and aggregating it into a weighted sum of overlap areas, then performing a gradient ascent update on the action probability distribution table according to the weighted sum of overlap areas to generate an evolutionary action distribution table; and parsing the evolutionary action distribution table into a control command sequence and issuing it for execution, collecting point clouds after stacking to generate a real trajectory set, calculating the relative entropy divergence sequence between the real trajectory set and the virtual trajectory set, and updating the evolutionary action distribution table.