A servo control method and system for pressing a hemostatic robot

By combining a flexible thin-film pressure sensor and a reinforcement learning model, adaptive control of the robotic arm is achieved, solving the problem of individual differences in the hemostasis device and improving the hemostasis effect and patient comfort.

CN117618053B · Active · Publication Date: 2026-06-23 · SHENZHEN INST OF ADVANCED TECH CHINESE ACAD OF SCI

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
SHENZHEN INST OF ADVANCED TECH CHINESE ACAD OF SCI
Filing Date
2023-11-17
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

Existing hemostatic devices cannot adapt to individual differences, making it difficult to control the pressure, increasing the risk of postoperative complications. In addition, traditional manual compression methods are time-consuming and laborious, affecting patient comfort and recovery speed.

Method used

By employing a flexible thin-film pressure sensor and a reinforcement learning model, the robot arm's pressing force and angle are controlled by detecting changes in pressure distribution, thus achieving adaptive servo control.

Benefits of technology

Precise control of pressure can shorten hemostasis time, reduce patient discomfort, improve comfort and safety, and reduce the workload of medical staff.

✦ Generated by Eureka AI based on patent content.


Abstract

The application discloses a follow-up control method and system for pressing a hemostasis robot. The method comprises the following steps: detecting a current pressure distribution state of a mechanical arm pressing position, comparing a difference between the current pressure distribution state and a standard state, and generating a follow-up control signal if the difference reaches a set change threshold, wherein the standard state is a pressure distribution state corresponding to an expected pressing effect; inputting the current pressure distribution state into a trained reinforcement learning model to obtain a corresponding execution action for indicating a change of a pressing angle of the mechanical arm if the follow-up control signal is detected; and converting the execution action into a mechanical arm control signal to control the pressing position of the mechanical arm to be fixed and the pressing angle to be changed. The application achieves a compression hemostasis effect in a manner of controlling the pressing of the mechanical arm, can accurately control the pressing strength, shortens the compression time length, and improves the comfort of patients.

Description

Technical Field

[0001] This invention relates to the field of automation control technology, and more specifically, to a follow-up control method and system for a pressure-based hemostasis robot.

Background Technology

[0002] Interventional cardiac surgery has become a common treatment method for diseases such as coronary heart disease, arrhythmia, congenital heart disease, and valvular heart disease. The radial and femoral arteries are the main surgical pathways for interventional cardiac surgery. The femoral artery, due to its larger diameter, facilitates puncture and catheter manipulation, offering unparalleled advantages over other interventional pathways when dealing with complex lesions. However, the large diameter and high blood flow velocity of the femoral artery also make immediate hemostasis at the puncture site difficult, and improper hemostasis can increase the risk of postoperative complications.

[0003] In recent years, various types of arterial puncture closure devices and femoral artery compression hemostatic devices have been applied clinically. However, due to individual physiological differences among patients (such as vascular conditions and body size), these devices cannot simulate the force of local compression applied by a human operator to different puncture sites, hindering the clinical adoption of hemostatic devices. Postoperatively, doctors still primarily rely on manual compression. Furthermore, improper hemostasis procedures can easily lead to complications such as local hematoma, pseudoaneurysm, and arteriovenous fistula. Under traditional manual compression methods, patients must remain immobile for extended periods while receiving sandbag pressure, exacerbating discomfort and pain. Therefore, shortening the hemostasis procedure time, improving patient comfort, and reducing postoperative complications after femoral artery intervention have become not only a challenge for interventional physicians but also a current focus of the field.

[0004] Following femoral artery interventional surgery, applying pressure to the femoral artery puncture site to promote closure of the vascular wound and achieve hemostasis is a common hemostatic method. However, it also carries risks of complications such as bleeding, hematoma, pseudoaneurysm, and arteriovenous fistula. Therefore, selecting an appropriate puncture site, using appropriate pressure and duration, and employing auxiliary hemostatic instruments or dressings are all important factors in improving hemostatic efficacy and safety. Currently, the main method is manual pressure hemostasis. After the femoral artery interventional surgery, before removing the sheath, the patient's systolic and diastolic blood pressure are measured, and the compression time is determined based on the blood pressure level and sheath size. Generally, the compression time is 15-20 minutes, but if the patient has hypertension, is on anticoagulation therapy, or has a larger sheath, the compression time needs to be extended. When removing the sheath, negative pressure aspiration should be maintained, and appropriate pressure should be applied to the puncture site with sterile cotton balls or dressings. The pressure should be sufficient to feel the dorsalis pedis artery pulse without bleeding from the puncture site. After applying pressure to stop the bleeding, apply a bandage or elastic stocking to the puncture site and have the patient rest in bed for 6-12 hours, avoiding movement or flexion. This manual pressure hemostasis method is time-consuming and laborious, increasing the workload of medical staff and the patient's pain, and may affect the patient's comfort and recovery speed.

[0005] In the prior art, patent application CN202310632482.1 discloses a pressure hemostat for arterial puncture sites based on flexible compression. This hemostat includes a bandage section, a base frame section, multiple fitting components, multiple pressure application components, a breathable dressing, and an expansion component. Such pressure hemostats are complex, expensive, and difficult to operate. Furthermore, the applicable population for pressure hemostats is limited, and they cannot adaptively adjust the compression force to suit individuals with different physiological states. Moreover, the compression force is difficult to control; excessive or insufficient pressure can affect the hemostatic effect and safety. Excessive pressure may cause damage or blockage of the puncture site and surrounding blood vessels, while insufficient pressure may cause bleeding or hematoma. In addition, existing pressure hemostats do not effectively reduce the compression duration, requiring patients to remain in bed for extended periods and avoid activity during compression, thus increasing patient discomfort.

Summary of the Invention

[0006] The purpose of this invention is to overcome the shortcomings of the prior art and provide a follow-up control method and system for a pressure-based hemostasis robot.

[0007] According to a first aspect of the present invention, a follow-up control method for a pressure-based hemostasis robot is provided. The method includes the following steps:

[0008] The current pressure distribution at the point where the robotic arm presses is detected and compared with the standard state. If the difference reaches a set change threshold, a follow-up control signal is generated, wherein the standard state is the pressure distribution state corresponding to the desired pressing effect.

[0009] Upon detecting the follow-up control signal, the current pressure distribution state is input into the trained reinforcement learning model to obtain the corresponding execution action, which is used to indicate the change of the robotic arm pressing angle.

[0010] The execution action is converted into a robotic arm control signal to control the robotic arm to fix the pressing position and change the pressing angle.
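The trigger condition in the first step can be illustrated with a minimal Python sketch. The text does not specify how the difference between two pressure distributions is measured, so the mean absolute per-cell difference used here, the function names, and the sample values are all assumptions made for illustration only.

```python
from typing import Sequence

Grid = Sequence[Sequence[float]]


def distribution_difference(current: Grid, standard: Grid) -> float:
    """Mean absolute per-cell difference between two pressure maps.

    Assumed metric: the source only says the two states are compared.
    """
    total = 0.0
    count = 0
    for row_c, row_s in zip(current, standard):
        for c, s in zip(row_c, row_s):
            total += abs(c - s)
            count += 1
    return total / count


def follow_up_signal(current: Grid, standard: Grid, change_threshold: float) -> bool:
    """Return True when the difference reaches the set change threshold."""
    return distribution_difference(current, standard) >= change_threshold


# Hypothetical 2x2 readings: an even standard state and a tilted press.
standard_state = [[1.0, 1.0], [1.0, 1.0]]
tilted_state = [[1.6, 1.4], [0.6, 0.4]]
signal = follow_up_signal(tilted_state, standard_state, change_threshold=0.2)
```

When `signal` is true, the method proceeds to query the reinforcement learning model; otherwise the arm simply holds its constant pressure.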

[0011] According to a second aspect of the present invention, a follow-up control system for a pressure-based hemostasis robot is provided. The system includes a robotic arm, a controller, and a host computer, wherein:

[0012] The host computer is used to: detect the current pressure distribution state at the point where the robotic arm is pressing, and compare the difference with the standard state. If the difference reaches a set change threshold, a follow-up control signal is generated, wherein the standard state is the pressure distribution state corresponding to the desired pressing effect; when the follow-up control signal is detected, the current pressure distribution state is input into a trained reinforcement learning model to obtain the corresponding execution action, which is then transmitted to the controller to indicate the change of the robotic arm pressing angle.

[0013] The controller is used to convert the execution action into a robotic arm control signal to control the robotic arm to fix the pressing position and change the pressing angle.

[0014] Compared with the prior art, the advantages of the present invention are that it provides an adaptive immediate pressure hemostasis robot control scheme, which achieves the effect of compression hemostasis by controlling the robotic arm to press, can accurately control the pressure intensity, shorten the compression time, and through the follow-up control of the robotic arm end, allows the patient to have a small range of movement during the compression process, thereby improving the patient's comfort and safety.

[0015] Other features and advantages of the invention will become clear from the following detailed description of exemplary embodiments of the invention with reference to the accompanying drawings.

Description of the Drawings

[0016] The accompanying drawings, which are incorporated in and form part of this specification, illustrate embodiments of the invention and, together with their description, serve to explain the principles of the invention.

[0017] Figure 1 is a flowchart of a follow-up control method for a pressure-based hemostasis robot according to an embodiment of the present invention;

[0018] Figure 2 is a schematic diagram of the overall process of a follow-up control method for a pressure-based hemostasis robot according to an embodiment of the present invention;

[0019] Figure 3 is a schematic diagram of a five-joint robotic arm structure according to an embodiment of the present invention;

[0020] Figure 4 is a schematic diagram illustrating the process of training a reinforcement learning model according to an embodiment of the present invention;

[0021] Figure 5 is a schematic diagram of a value table (Q-Table) according to an embodiment of the present invention.

Detailed Description

[0022] Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that, unless otherwise specifically stated, the relative arrangement, numerical expressions, and values of the components and steps set forth in these embodiments do not limit the scope of the invention.

[0023] The following description of at least one exemplary embodiment is merely illustrative and is in no way intended to limit the invention or its application or use.

[0024] Techniques, methods, and equipment known to those skilled in the art may not be discussed in detail, but where appropriate, such techniques, methods, and equipment should be considered part of the specification.

[0025] In all the examples shown and discussed herein, any specific values should be interpreted as merely exemplary and not as limitations. Therefore, other examples of exemplary embodiments may have different values.

[0026] It should be noted that similar labels and letters in the following figures indicate similar items; therefore, once an item is defined in one figure, it does not need to be discussed further in subsequent figures.

[0027] This invention provides a follow-up control scheme for a pressure-based hemostasis robot, particularly suitable for robotic arm control for immediate pressure-based hemostasis at the femoral artery puncture site. The scheme generally includes the following. Pressure data is collected and analyzed, and the pressure distribution is divided into different states by interval; these states form the state space of the reinforcement learning model. Each state corresponds to multiple angle-changing actions of the robotic arm, which form the action space of the reinforcement learning model. A reinforcement learning-based follow-up control algorithm for the robotic arm end effector is then built by training the model on these state and action spaces with the Q-learning algorithm, so that the trained model can calculate the corresponding follow-up control signal from real-time changes in the pressure distribution. Finally, the follow-up control signal is converted into position control commands through forward and inverse kinematics calculations and sent to the robotic arm controller to control the robotic arm's movement, achieving the follow-up effect.

[0028] Specifically, as shown in Figure 1 and Figure 2, the provided follow-up control method for the pressure-based hemostasis robot includes the following steps:

[0029] Step S110: Construct a reinforcement learning model to enable the robotic arm to perform follow-up control as the position of the pressing target changes.

[0030] Taking the robot platform with a five-joint robotic arm shown in Figure 3 as an example, the platform also includes a host computer and a robot controller (not shown). The end effector of the robotic arm is a compression hemostasis actuator, and a flexible thin-film pressure sensor in the form of a square matrix is installed on the end effector. The pressure magnitude and distribution on the flexible thin-film pressure sensor change with the compression force and compression angle. The purpose of this invention is to enable the robotic arm to follow position changes of the target object being compressed; that is, when the object being compressed rotates, the robotic arm should follow the movement so that the compression direction remains perpendicular to the compression point.
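To make concrete how a square-matrix sensor reading reflects the compression angle, the following Python sketch summarizes a pressure matrix by its total pressure and center of pressure. This particular summary is an assumption chosen for illustration; the text does not state which features of the distribution are used, and the function name and readings are hypothetical.

```python
def center_of_pressure(grid):
    """Return (total, row_cop, col_cop) for a square pressure matrix.

    The center of pressure is in index units, so (n - 1) / 2 is the
    geometric center of an n x n sensor.
    """
    total = sum(sum(row) for row in grid)
    if total == 0:
        raise ValueError("no contact pressure on the sensor")
    row_cop = sum(i * sum(row) for i, row in enumerate(grid)) / total
    col_cop = sum(j * v for row in grid for j, v in enumerate(row)) / total
    return total, row_cop, col_cop


# Hypothetical readings: an even press and a press tilted toward row 0.
even_press = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
tilted_press = [[2, 2, 2], [1, 1, 1], [0, 0, 0]]

even_summary = center_of_pressure(even_press)
tilted_summary = center_of_pressure(tilted_press)
```

In this toy case the total pressure is the same for both readings, but the center of pressure shifts from row index 1 toward row index 0 when the press is tilted, which is the kind of change the follow-up control reacts to.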

[0031] In one embodiment, the overall concept of using reinforcement learning to achieve follow-up control is as follows. First, movement of the pressed object changes the pressure distribution on the thin-film pressure sensor attached at the pressing point, generating a new state s1. The host computer compares s1 with the standard state s0. If the difference reaches the change threshold, a follow-up control signal is generated; otherwise, constant pressure is maintained. When the host computer detects the follow-up control signal, it inputs the current pressure distribution state s1 on the thin-film pressure sensor into the reinforcement learning model. The trained reinforcement learning model matches the optimal policy π(a1|s1) for the current state s1 and outputs an execution action a1. The host computer receives the execution action a1, converts it into a robotic arm control signal, and sends it to the controller, which controls the movement of the robotic arm.

[0032] Step S120: Train the reinforcement learning model, taking as the optimization target the pressure distribution under constant pressure when pressing achieves hemostasis.

[0033] In one embodiment, the training of the reinforcement learning model is implemented using the Q-learning algorithm, as shown in Figure 4.

[0034] First, the pressure distribution on the thin-film pressure sensor is divided into 50 states according to different ranges. Each state corresponds to, for example, two actions: the robotic arm rotates counter-clockwise around the pressing point by a set angle, or the robotic arm rotates clockwise around the pressing point by 1°. This defines the state space and the action space. Once both are defined, the Q-Table (value table) is built. As shown in Figure 5, the Q-Table has a size of 50×2. The rows represent the different states s of the pressure distribution on the thin-film pressure sensor, and the columns represent the actions a that change the robotic arm pressing angle. Each entry stores the maximum value Q obtained by taking the corresponding action in the corresponding state. The value Q is expressed as follows:

[0035] $Q^{\pi}(s_t, a_t) = E\left[R_{t+1} + \lambda R_{t+2} + \lambda^{2} R_{t+3} + \dots \mid s_t, a_t\right]$ (1)

[0036] where $a_t$ is the action at time t, $s_t$ is the state at time t, λ is a set coefficient, t is the time index, R is the reward corresponding to each state-action pair, and E denotes the expectation.
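The state space, action space, Q-Table, and formula (1) can be sketched in Python as follows. The 50×2 table size comes from the text. The scalar feature in [-1, 1] that is binned into states, the helper names, and the sample reward sequence are assumptions made for illustration, since the text does not specify how the ranges are defined.

```python
N_STATES = 50   # number of pressure-distribution states (from the text)
N_ACTIONS = 2   # 0: rotate counter-clockwise, 1: rotate clockwise


def to_state(feature, lo=-1.0, hi=1.0, n_states=N_STATES):
    """Map a scalar pressure feature in [lo, hi] to a state index.

    The feature itself is hypothetical, e.g. a normalized pressure offset.
    """
    clipped = min(max(feature, lo), hi)
    index = int((clipped - lo) / (hi - lo) * n_states)
    return min(index, n_states - 1)


def discounted_return(rewards, lam):
    """One sample of R_{t+1} + lam * R_{t+2} + lam^2 * R_{t+3} + ...

    Formula (1) is the expectation of this quantity under the policy.
    """
    return sum((lam ** k) * r for k, r in enumerate(rewards))


# Q-Table: one row per state, one column per action, all zeros initially.
q_table = [[0.0] * N_ACTIONS for _ in range(N_STATES)]

sample_return = discounted_return([1, 1, 1], lam=0.9)
```

With three rewards of 1 and a coefficient of 0.9, the sampled return is 1 + 0.9 + 0.81 = 2.71.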

[0037] The reinforcement learning process involves updating the values in the Q-Table to maximize value, which is the expected future reward. For example, the pressure distribution on the thin-film pressure sensor under the constant pressure that achieves hemostasis is taken as the standard state s0, which is the optimization objective of reinforcement learning. When the object being pressed moves, the pressure distribution changes and so does the state, which starts the reinforcement learning process. The new pressure distribution corresponds to a state s1 in the state space, and each state corresponds to two actions. Before the environment is explored, all entries of the Q-Table are given the same arbitrary initial value. As the environment is continuously explored, the Q-Table iteratively updates Q(s,a) using the Bellman equation (dynamic programming equation) to provide increasingly better approximations.

[0038] Specifically, as shown in Figure 4, training a reinforcement learning model using the Q-learning algorithm includes the following steps:

[0039] Step 1: Initialize Q-Table.

[0040] For example, all initial values are 0, and an exploration rate epsilon is specified, initially set to 1.

[0041] Step 2: Generate a random number. If this number is greater than epsilon, query the Q-Table and select the action with the highest value in the current state s1; otherwise, explore by taking a random action. After the action is executed, a new state s2 is obtained. If s2 is closer to the standard state s0 than s1 is, a reward of +1 is given; otherwise, a reward of -1 is given. The reward is denoted R(s,a).

[0042] Step 3: After receiving the reward, update the value of the corresponding action in that state.

[0043] For example, the update uses the Bellman equation; see formula (2). The learning rate α is set to 0.1, and the discount factor γ is set to 0.9. Formula (3) is the target value of Q, and formula (4) is the temporal difference error.

[0044] $\mathrm{New}Q(s, a) = Q(s, a) + \alpha\left[R(s, a) + \gamma \max_{a'} Q'(s', a') - Q(s, a)\right]$ (2)

[0045] $Q_{\mathrm{target}} = R(s, a) + \gamma \max_{a'} Q'(s', a')$ (3)

[0046] $\mathrm{TDerror} = Q_{\mathrm{target}} - Q(s, a)$ (4)
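Formulas (2) to (4) amount to a single table update, sketched below in Python with the α = 0.1 and γ = 0.9 given in the text. The function name and the three-state demonstration table are hypothetical and serve only as a worked example.

```python
ALPHA = 0.1  # learning rate (from the text)
GAMMA = 0.9  # discount factor (from the text)


def q_update(q_table, s, a, reward, s_next, alpha=ALPHA, gamma=GAMMA):
    """Apply formulas (2)-(4) in place and return the TD error."""
    q_target = reward + gamma * max(q_table[s_next])   # formula (3)
    td_error = q_target - q_table[s][a]                # formula (4)
    q_table[s][a] += alpha * td_error                  # formula (2)
    return td_error


# Worked example on a small hypothetical table with 3 states and 2 actions.
demo_table = [[0.0, 0.0], [0.0, 0.0], [0.5, 1.0]]
demo_td_error = q_update(demo_table, s=1, a=0, reward=1, s_next=2)
```

In this example the target is 1 + 0.9 × 1.0 = 1.9, the TD error is 1.9, and the entry Q(1, 0) moves from 0 to 0.19.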

[0047] Step 4: Update the current state to s2.

[0048] Step 5: Return to step 2 and repeat until the goal is achieved or learning stops.

[0049] It should be understood that epsilon is the exploration rate, that is, the probability of selecting a random action instead of the action with the highest Q-value. Initially, this rate should be at its maximum because the values in the Q-Table are only initial values and carry no information, so extensive exploration through random action selection is necessary when first training the Q-function. As the agent becomes more confident in its estimated Q-values, epsilon is gradually decreased. Furthermore, the number of states in the reinforcement learning model, the number of actions per state, etc., can be reasonably set according to the accuracy and efficiency requirements of the reinforcement learning model.
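Steps 1 to 5 and the decaying exploration rate can be put together in a small Python sketch. The environment below is a deliberately simplified stand-in, not the robot: states are indices 0 to 49, the standard state s0 is assumed to be index 25, and each action shifts the state by one index. The episode count, step limit, decay factor 0.995, and floor 0.05 are likewise assumptions; only α, γ, the 50×2 table, the initial epsilon of 1, and the ±1 reward rule come from the text.

```python
import random

N_STATES, N_ACTIONS, S0 = 50, 2, 25   # S0 index is an assumption
ALPHA, GAMMA = 0.1, 0.9               # values from the text


def step(s, a):
    """Toy transition: action 1 raises the state index, action 0 lowers it."""
    s2 = min(max(s + (1 if a == 1 else -1), 0), N_STATES - 1)
    reward = 1 if abs(s2 - S0) < abs(s - S0) else -1   # closer to s0: +1
    return s2, reward


def train(episodes=2000, max_steps=100, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]   # Step 1
    epsilon = 1.0
    for _episode in range(episodes):
        s = rng.randrange(N_STATES)
        for _step in range(max_steps):
            if s == S0:                                # goal reached
                break
            if rng.random() > epsilon:                 # Step 2: exploit
                a = max(range(N_ACTIONS), key=lambda k: q[s][k])
            else:                                      # Step 2: explore
                a = rng.randrange(N_ACTIONS)
            s2, r = step(s, a)
            q[s][a] += ALPHA * (r + GAMMA * max(q[s2]) - q[s][a])  # Step 3
            s = s2                                     # Step 4
        epsilon = max(0.05, epsilon * 0.995)           # assumed decay schedule
    return q


q_table = train()
greedy = [max(range(N_ACTIONS), key=lambda k: q_table[s][k])
          for s in range(N_STATES)]
```

After training on this toy problem, the greedy action in every state below index 25 is action 1 and in every state above it is action 0, so following the table always moves the state toward s0.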

[0050] Step S130: With the goal of changing the pressing angle while keeping the pressing position fixed, design the circular motion plan of the end effector of the robotic arm.

[0051] With the follow-up control signal in place, the movement of the robotic arm can be controlled. The goal is to change the pressing angle without altering the pressing position. Since the pressing direction is aligned with the direction of the end link, the challenge lies in changing the direction of the end link. In one embodiment, the movement of the robotic arm's end is controlled by planning a circular arc trajectory. For example, the original robotic arm end point is taken as the fixed center and the end link as the radius, and the joint connected to the end link becomes the new robotic arm end. The new end then moves in an arc around the center, so that the end point stays fixed while the direction of the end link changes.
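The geometric idea can be checked with a planar Python sketch: if the tip stays at the pressing point and the end link has a fixed length, the joint at the other end of the link must lie on a circle around the tip. The two-dimensional simplification, the function name, and the 0.10 m link length are assumptions made for illustration.

```python
import math


def joint_position(tip, link_length, press_angle_deg):
    """Position of the joint behind the end link, for a given pressing direction.

    The pressing direction points from the joint to the tip, so the joint sits
    one link length behind the tip along that direction.
    """
    a = math.radians(press_angle_deg)
    return (tip[0] - link_length * math.cos(a),
            tip[1] - link_length * math.sin(a))


tip = (0.0, 0.0)    # fixed pressing point
link_length = 0.10  # hypothetical end-link length in metres

# Pressing straight down (-90 degrees), then tilted by 1 degree either way.
joints = [joint_position(tip, link_length, angle) for angle in (-91, -90, -89)]
distances = [math.hypot(x - tip[0], y - tip[1]) for x, y in joints]
```

All three joint positions are exactly one link length from the tip, so moving the joint along this arc changes the pressing angle while the pressing point itself does not move.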

[0052] Specifically, circular arc planning includes the following steps:

[0053] First, determine the starting point, intermediate point, and ending point of the circular arc trajectory, as well as the posture of the robotic arm's end effector. Second, based on the coordinates of the three points, solve for the equation of the plane containing the circular arc trajectory and the coordinates of the center. The equation of the plane containing the circular arc trajectory is given by formula (5), where the coefficients A, B, C, and D can be obtained by formula (6). (x1, y1, z1), (x2, y2, z2), and (x3, y3, z3) are the coordinates of the starting point, intermediate point, and ending point of the circular arc trajectory, respectively.

[0054] Ax + By + Cz + D = 0 (5)

[0055]
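One standard way to obtain the coefficients of the plane in formula (5) from three points is through the cross product of two edge vectors. The Python sketch below uses that construction; it is offered as an assumption about the kind of expression formula (6) provides, not as a transcription of it, and the function name is hypothetical.

```python
def plane_through(p1, p2, p3):
    """Return (A, B, C, D) with A*x + B*y + C*z + D = 0 through three points."""
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = p1, p2, p3
    # (A, B, C) is the cross product (p2 - p1) x (p3 - p1), a plane normal.
    A = (y2 - y1) * (z3 - z1) - (z2 - z1) * (y3 - y1)
    B = (z2 - z1) * (x3 - x1) - (x2 - x1) * (z3 - z1)
    C = (x2 - x1) * (y3 - y1) - (y2 - y1) * (x3 - x1)
    if A == 0 and B == 0 and C == 0:
        raise ValueError("the three points are collinear")
    D = -(A * x1 + B * y1 + C * z1)
    return A, B, C, D


coefficients = plane_through((1, 0, 0), (0, 1, 0), (0, 0, 1))
```

For the three unit points on the axes this gives the plane x + y + z - 1 = 0. The coefficients are only defined up to a common scale factor, so another derivation may yield a multiple of these values.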

[0056] Next, based on the center coordinates and arc angle, the position interpolation point of the robotic arm's end is generated.

[0057] The coordinates of the center of the circle are given by formula (7). The arc angle is given by formula (8), and the arc radius r is the length of the end connecting rod.

[0058]

[0059]
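For reference, the center of the circle through three points and the angle subtended by the arc can be computed with a well-known vector construction, sketched here in Python. This is one common closed form and is assumed rather than taken from formulas (7) and (8); the helper names are hypothetical, and the angle function as written only covers arcs of at most 180°.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def circle_center(p1, p2, p3):
    """Center of the circle through three non-collinear 3D points."""
    a, b = sub(p1, p3), sub(p2, p3)
    n = cross(a, b)
    n2 = dot(n, n)
    if n2 == 0:
        raise ValueError("the three points are collinear")
    t = tuple(dot(a, a) * bi - dot(b, b) * ai for ai, bi in zip(a, b))
    offset = cross(t, n)
    return tuple(p + o / (2 * n2) for p, o in zip(p3, offset))


def arc_angle(center, start, end):
    """Angle in radians between start and end as seen from the center."""
    u, v = sub(start, center), sub(end, center)
    c = cross(u, v)
    return math.atan2(math.sqrt(dot(c, c)), dot(u, v))


h = math.sqrt(0.5)
start, middle, end = (1.0, 0.0, 0.0), (h, h, 0.0), (0.0, 1.0, 0.0)
center = circle_center(start, middle, end)
angle = arc_angle(center, start, end)
```

For a quarter of the unit circle in the xy-plane, the center comes out at the origin and the arc angle at π/2.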

[0060] The coordinates of the i-th point of the circular arc trajectory are $(x_i, y_i, z_i)$, as in formula (9).

[0061]

[0062] where $(x_s, y_s, z_s)$ are the coordinates of the starting point of the circular arc trajectory, ω is the unit vector of the rotation axis, $\theta_i$ is the arc angle between the i-th point and the starting point, and $R(\omega, \theta_i)$ is the rotation matrix for a rotation by the angle $\theta_i$ about the rotation axis ω, expressed as in formula (10).

[0063]

[0064] I is the identity matrix, and $[\omega]_{\times}$ is the antisymmetric (skew-symmetric) matrix of the unit vector ω of the rotation axis, expressed as in formula (11).

[0065]
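The interpolation described above can be sketched in Python using the Rodrigues rotation formula, $R(\omega, \theta) = I + \sin\theta\,[\omega]_{\times} + (1 - \cos\theta)\,[\omega]_{\times}^{2}$, which is the usual expression built from the identity matrix and the antisymmetric matrix of a unit axis. It is assumed here that formulas (9) to (11) take this standard form, with each point obtained by rotating the start point about the center; the function names and the equal-angle spacing are illustrative choices.

```python
import math


def skew(w):
    """Antisymmetric matrix [w]x of a 3-vector w."""
    wx, wy, wz = w
    return [[0.0, -wz, wy], [wz, 0.0, -wx], [-wy, wx, 0.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_matrix(w, theta):
    """Rodrigues formula for a rotation by theta about the unit axis w."""
    k = skew(w)
    k2 = matmul(k, k)
    s, c = math.sin(theta), 1.0 - math.cos(theta)
    return [[(1.0 if i == j else 0.0) + s * k[i][j] + c * k2[i][j]
             for j in range(3)] for i in range(3)]


def arc_points(center, start, w, total_angle, n_segments):
    """Interpolation points from start, rotating about center around axis w."""
    v = [start[i] - center[i] for i in range(3)]
    points = []
    for i in range(n_segments + 1):
        r = rotation_matrix(w, total_angle * i / n_segments)
        points.append(tuple(center[a] + sum(r[a][b] * v[b] for b in range(3))
                            for a in range(3)))
    return points


points = arc_points(center=(0.0, 0.0, 0.0), start=(1.0, 0.0, 0.0),
                    w=(0.0, 0.0, 1.0), total_angle=math.pi / 2, n_segments=2)
```

Here the start point is rotated about the z-axis through 0°, 45°, and 90°, so every interpolation point stays at the same radius from the center.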

[0066] Finally, once the position interpolation points $(x_i, y_i, z_i)$ of the robotic arm end are obtained, the joint angles corresponding to each interpolation point are calculated based on the forward and inverse kinematics model of the robotic arm and sent to the robotic arm controller to control the follow-up movement of the robotic arm.

[0067] Step S140: Using the trained reinforcement learning model and the designed circular motion plan, perform follow-up control on the pressing target.

[0068] The trained reinforcement learning model can be applied to the follow-up control of the robot. Specifically, before follow-up control starts, the target object's position remains unchanged. At this time, the robotic arm presses on the pressing point vertically with a constant force, and the pressure distribution on the thin-film pressure sensor at this point is taken as the standard distribution, denoted as state s0. A follow-up switch signal is set on the host computer, for example, represented by a Boolean variable. Turning on the switch starts follow-up control. The host computer then receives the pressure distribution state in real time. When the pressing point rotates, the pressure distribution on the thin-film pressure sensor changes, generating a new state s1. The host computer compares the difference between s1 and s0. If the difference reaches the change threshold, a follow-up control signal is generated; otherwise, the constant force is maintained. When the host computer detects the follow-up control signal, it inputs the current pressure distribution state s1 on the thin-film pressure sensor into the trained reinforcement learning model to obtain the optimal policy π(a1|s1) matching the current state s1, and outputs an action a1. The host computer receives the action a1, converts it into a robotic arm control signal, and sends it to the controller. The controller controls the robotic arm to move in an arc around the pressing point without changing the pressing position, thereby changing the pressing angle. The robotic arm movement causes the pressure distribution on the thin-film pressure sensor to change to a new state s2. The host computer continues to compare the difference between s2 and s0, and this process is repeated cyclically. The goal is to bring the pressure distribution closer to s0, keeping the difference within the change threshold. This process results in a state-action trajectory sequence τ = (s0, a0, s1, a1, ..., $s_t$, $a_t$), and the final state converges to s0.
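The cyclic compare-and-act process can be sketched as a short Python loop. As before, this is a toy stand-in: states are indices with s0 assumed at index 25, the difference is the index distance, the change threshold of 2 is hypothetical, and the Q-Table is filled by hand to mimic a trained model rather than produced by actual training.

```python
N_STATES, S0, THRESHOLD = 50, 25, 2   # all three values are assumptions

# Hand-built table standing in for a trained model:
# below s0 prefer action 1 (index up), above s0 prefer action 0 (index down).
q_table = [[0.0, 1.0] if s < S0 else [1.0, 0.0] for s in range(N_STATES)]


def follow_up(q, s1, max_steps=100):
    """Act until the state is back within the change threshold of s0."""
    trajectory = []
    s = s1
    for _ in range(max_steps):
        if abs(s - S0) < THRESHOLD:        # no follow-up signal: hold force
            break
        a = max(range(2), key=lambda k: q[s][k])   # greedy action from Q-Table
        trajectory.append((s, a))
        s = min(max(s + (1 if a == 1 else -1), 0), N_STATES - 1)
    return s, trajectory


final_state, trajectory = follow_up(q_table, s1=31)
```

Starting from a disturbed state at index 31, the loop applies action 0 five times and stops at index 26, where the difference from s0 is back under the threshold.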

[0069] In summary, compared with the prior art, the present invention has the following advantages:

[0070] 1) The flexible thin-film pressure sensor used in this invention is a square matrix. The pressure data is divided into different states according to the force distribution at different detection points on the matrix, which serves as the state space for reinforcement learning. This improves the correlation between the state space and the pressure distribution, and enables timely detection of changes in pressure distribution.

[0071] 2) This invention uses the pressure distribution on the flexible thin film pressure sensor as the state space and the pressing angle change of the robotic arm as the action space. By using the Q-learning algorithm to train the reinforcement learning model, a reinforcement learning model with end effector servo function can be obtained that enables the robotic arm to adapt to different individual conditions.

[0072] 3) The present invention uses the original robotic arm end as the fixed center and the end link as the radius. The joint connected to the end link is used as the new robotic arm end. The new end moves in an arc around the center, achieving the effect of changing the direction of the end link while the end point remains stationary. This improves the accuracy of pressure control and can adaptively change the pressing direction, realizing the movement control of the end following the patient's limb wound.

[0073] 4) The follow-up control technology used in this invention has low labor costs, reduces the physical burden on doctors, and allows for more precise control of the pressure applied. Furthermore, compared to manual pressure where the patient cannot move for extended periods, this invention allows the patient a small range of movement during the pressure application process, reducing the patient's pain.

[0074] 5) It has been verified that the present invention can be effectively applied to various robot platforms, such as a robot platform with a five-joint robotic arm. By controlling the robotic arm, the hemostasis effect can be achieved by applying pressure, and the patient's comfort and safety can be improved.

[0075] This invention can be a system, method, and / or computer program product. A computer program product may include a computer-readable storage medium having computer-readable program instructions loaded thereon for causing a processor to implement various aspects of the invention.

[0076] Computer-readable storage media can be tangible devices capable of holding and storing instructions for use by an instruction execution device. Computer-readable storage media can be, for example, but not limited to, electrical storage devices, magnetic storage devices, optical storage devices, electromagnetic storage devices, semiconductor storage devices, or any suitable combination thereof. More specific examples (a non-exhaustive list) of computer-readable storage media include: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital versatile disc (DVD), memory sticks, floppy disks, mechanically encoded devices, such as punch cards or raised structures in a groove with instructions stored thereon, and any suitable combination thereof. The computer-readable storage media used herein are not to be construed as transient signals themselves, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., light pulses through fiber optic cables), or electrical signals transmitted through wires.

[0077] The computer-readable program instructions described herein can be downloaded from computer-readable storage media to various computing / processing devices, or downloaded via a network, such as the Internet, local area network, wide area network, and / or wireless network, to an external computer or external storage device. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and / or edge servers. A network adapter card or network interface in each computing / processing device receives the computer-readable program instructions from the network and forwards them to the computer-readable storage media in the respective computing / processing device.

[0078] The computer program instructions used to perform the operations of this invention may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk, C++, and Python, and conventional procedural programming languages such as "C" or similar languages. The computer-readable program instructions may be executed entirely on the user's computer, partially on the user's computer, as a standalone software package, partially on the user's computer and partially on a remote computer, or entirely on a remote computer or server. In cases involving a remote computer, the remote computer may be connected to the user's computer via any type of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (e.g., via the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as programmable logic circuitry, field-programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), is personalized by utilizing state information from the computer-readable program instructions. This electronic circuitry can execute the computer-readable program instructions to implement various aspects of the invention.

[0079] Various aspects of the present invention are described herein with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It should be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer-readable program instructions.

[0080] These computer-readable program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/actions specified in one or more blocks of the flowchart and/or block diagram. These computer-readable program instructions can also be stored in a computer-readable storage medium and cause a computer, programmable data processing apparatus, and/or other device to operate in a particular manner; the computer-readable medium storing the instructions thus comprises an article of manufacture that includes instructions for implementing aspects of the functions/actions specified in one or more blocks of the flowchart and/or block diagram.

[0081] The computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus, or other device implement the functions/actions specified in one or more blocks of the flowchart and/or block diagram.

[0082] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of instructions containing one or more executable instructions for implementing a specified logical function. In some alternative implementations, the functions noted in the blocks may occur in a different order than that noted in the drawings. For example, two consecutive blocks may actually be executed substantially in parallel, or they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented using a dedicated hardware-based system that performs the specified function or action, or using a combination of dedicated hardware and computer instructions. Those skilled in the art will know that implementation in hardware, implementation in software, and implementation using a combination of software and hardware are equivalent.

[0083] The various embodiments of the present invention have been described above. These descriptions are exemplary, not exhaustive, and are not limited to the disclosed embodiments. Many modifications and variations will be apparent to those skilled in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen to best explain the principles of the embodiments, their practical application, or their technical improvement over technologies found in the marketplace, or to enable others skilled in the art to understand the embodiments disclosed herein. The scope of the invention is defined by the appended claims.

Claims

1. A follow-up control method for a pressure-based hemostasis robot, comprising the following steps:
detecting the current pressure distribution state at the pressing position of the robotic arm, comparing it with a standard state, and generating a follow-up control signal if the difference reaches a set change threshold, wherein the standard state is the pressure distribution state corresponding to the desired pressing effect;
upon detecting the follow-up control signal, inputting the current pressure distribution state into a trained reinforcement learning model to obtain a corresponding execution action, the execution action indicating a change in the pressing angle of the robotic arm;
converting the execution action into a robotic arm control signal to control the robotic arm to keep the pressing position fixed while changing the pressing angle;
wherein the reinforcement learning model is trained according to the following steps:
dividing the pressure distribution at the pressing position into multiple states according to different ranges to form a state space, each state corresponding to two actions that form an action space, and each action indicating the pressing direction and rotation angle of the robotic arm;
establishing a value table based on the correspondence between the state space and the action space, the value table storing the maximum value Q obtained by taking the corresponding action in the current state;
taking the pressure distribution under constant force when the desired pressing effect is achieved as the standard state and as the optimization target of the reinforcement learning model, and carrying out the reinforcement learning process to obtain a state-action trajectory sequence;
wherein converting the execution action into a robotic arm control signal to control the robotic arm to keep the pressing position fixed while changing the pressing angle comprises the following steps:
taking the end of the robotic arm as a fixed center and the end link as the radius, taking the joint connected to the end link as the new end of the robotic arm, and planning a circular arc trajectory along which the new end of the robotic arm moves around the center, the circular arc trajectory being determined from a starting point, an intermediate point, and an ending point;
generating position interpolation points of the robotic arm end based on the circular arc trajectory, thereby controlling the pressing position of the robotic arm to be fixed while changing the pressing angle;
wherein generating the position interpolation points of the robotic arm end based on the circular arc trajectory, and thereby controlling the pressing position of the robotic arm to be fixed while changing the pressing angle, comprises the following steps:
determining the center coordinates and the arc angle from the circular arc trajectory;
obtaining, based on the center coordinates and the arc angle, the coordinates of the $i$-th point on the circular arc trajectory, the position interpolation point $P_i$ of the robotic arm end being expressed as:
$$P_i = C + R(\mathbf{k}, \theta_i)\,(P_0 - C)$$
where $C$ is the coordinates of the center of the circle, $P_0$ is the coordinates of the starting point of the circular arc trajectory, $\mathbf{k}$ is the unit vector of the rotation axis, $\theta_i$ is the arc angle formed by the $i$-th point and the starting point, and $R(\mathbf{k}, \theta_i)$ is the rotation matrix for a rotation about the axis $\mathbf{k}$ by the angle $\theta_i$, expressed as:
$$R(\mathbf{k}, \theta_i) = I + \sin\theta_i\,K + (1 - \cos\theta_i)\,K^2$$
where $I$ is the identity matrix and $K$ is the antisymmetric matrix of the unit vector $\mathbf{k}$ of the rotation axis;
solving, based on the position interpolation points of the robotic arm end and the forward and inverse kinematics model of the robotic arm, the joint angles corresponding to each interpolation point, thereby controlling the pressing position of the robotic arm to be fixed while changing the pressing angle.
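
For illustration only, and not as part of the claims, the arc interpolation recited in claim 1 can be sketched as follows. The sketch assumes the center, starting point, rotation axis, and total arc angle are already known (the claim derives them from the starting, intermediate, and ending points), and it omits the inverse kinematics step. The names `skew`, `rodrigues`, and `arc_points` are hypothetical, and the uniform angular spacing is an assumption.

```python
import math

def skew(k):
    """Antisymmetric (cross-product) matrix of a 3-vector k."""
    kx, ky, kz = k
    return [[0.0, -kz, ky], [kz, 0.0, -kx], [-ky, kx, 0.0]]

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(3)) for j in range(3)]
            for i in range(3)]

def rodrigues(k, theta):
    """Rotation matrix I + sin(theta) K + (1 - cos(theta)) K^2 for unit axis k."""
    K = skew(k)
    K2 = matmul(K, K)
    s, c = math.sin(theta), 1.0 - math.cos(theta)
    return [[(1.0 if i == j else 0.0) + s * K[i][j] + c * K2[i][j]
             for j in range(3)] for i in range(3)]

def arc_points(center, start, axis, arc_angle, n):
    """Return n + 1 points on the arc, from the start point to arc_angle.

    Assumption: the points are evenly spaced in angle, and n >= 1.
    """
    norm = math.sqrt(sum(a * a for a in axis))
    k = [a / norm for a in axis]                      # normalize the axis
    v = [s - c for s, c in zip(start, center)]        # start relative to center
    points = []
    for i in range(n + 1):
        R = rodrigues(k, arc_angle * i / n)
        points.append([center[r] + sum(R[r][m] * v[m] for m in range(3))
                       for r in range(3)])
    return points
```

Every returned point stays at the same distance from the center, which is what keeps the pressing position fixed while the angle changes. Each point would then be passed to the robotic arm's inverse kinematics solver to obtain joint angles.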

2. The method according to claim 1, characterized in that the maximum value Q is expressed as:
$$Q(s_t, a_t) = \max \mathbb{E}\left[\, r_t + \gamma\, r_{t+1} + \gamma^2 r_{t+2} + \cdots \mid s_t, a_t \right]$$
where $a_t$ is the action at time $t$, $s_t$ is the state at time $t$, $\gamma$ is the discount coefficient, $t$ is the time index, $r$ is the reward for each state-action pair, and $\mathbb{E}$ denotes the expectation.
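
For illustration only, a value table with two actions per state could be maintained as sketched below. The update rule shown is standard tabular Q-learning, which is an assumption here; the claims specify the value table and the maximum value Q but not a particular update rule. The names `make_table`, `q_update`, and `greedy_action`, the action labels, and the values of `alpha` and `gamma` are hypothetical.

```python
# Two actions per state: counterclockwise and clockwise rotation (labels assumed).
ACTIONS = ("ccw", "cw")

def make_table(n_states):
    """Value table: one row per pressure-distribution state, one column per action."""
    return [[0.0 for _ in ACTIONS] for _ in range(n_states)]

def q_update(table, s, a, reward, s_next, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step (assumed rule, not taken from the patent)."""
    target = reward + gamma * max(table[s_next])
    table[s][a] += alpha * (target - table[s][a])
    return table[s][a]

def greedy_action(table, s):
    """Action with the largest stored value in state s (ties pick the first)."""
    row = table[s]
    return ACTIONS[row.index(max(row))]
```

After training, `greedy_action` plays the role of querying the trained model for the execution action in the current state.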

3. The method according to claim 1, characterized in that the two actions are, with the robotic arm pressing along the pressing direction, rotating counterclockwise around the pressing point and rotating clockwise around the pressing point.

4. The method according to claim 1, characterized in that the pressure distribution is acquired using a flexible thin-film pressure sensor mounted on the end effector of the robotic arm, the flexible thin-film pressure sensor being a square matrix containing multiple detection points.
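
For illustration only, the comparison between the current reading of the square sensor matrix and the standard state could be sketched as below. The claims do not say how the difference is measured, so the mean absolute difference used here is an assumption, as are the names `distribution_difference` and `follow_up_signal`.

```python
def distribution_difference(current, standard):
    """Mean absolute difference between two n x n pressure matrices (assumed metric)."""
    n = len(standard)
    total = sum(abs(current[r][c] - standard[r][c])
                for r in range(n) for c in range(n))
    return total / (n * n)

def follow_up_signal(current, standard, threshold):
    """True when the difference reaches the set change threshold."""
    return distribution_difference(current, standard) >= threshold
```

A `True` result corresponds to generating the follow-up control signal, after which the current pressure distribution state would be passed to the reinforcement learning model.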

5. A follow-up control system for a pressure-based hemostasis robot, comprising a robotic arm, a controller, and a host computer, wherein:
the host computer is used to: detect the current pressure distribution state at the pressing position of the robotic arm, compare it with a standard state, and generate a follow-up control signal if the difference reaches a set change threshold, wherein the standard state is the pressure distribution state corresponding to the desired pressing effect; and, when the follow-up control signal is detected, input the current pressure distribution state into a trained reinforcement learning model to obtain a corresponding execution action and transmit it to the controller, the execution action indicating a change in the pressing angle of the robotic arm;
the controller is used to convert the execution action into a robotic arm control signal to control the robotic arm to keep the pressing position fixed while changing the pressing angle;
wherein converting the execution action into a robotic arm control signal to control the robotic arm to keep the pressing position fixed while changing the pressing angle comprises the following steps:
taking the end of the robotic arm as a fixed center and the end link as the radius, taking the joint connected to the end link as the new end of the robotic arm, and planning a circular arc trajectory along which the new end of the robotic arm moves around the center, the circular arc trajectory being determined from a starting point, an intermediate point, and an ending point;
generating position interpolation points of the robotic arm end based on the circular arc trajectory, thereby controlling the pressing position of the robotic arm to be fixed while changing the pressing angle;
wherein generating the position interpolation points of the robotic arm end based on the circular arc trajectory, and thereby controlling the pressing position of the robotic arm to be fixed while changing the pressing angle, comprises the following steps:
determining the center coordinates and the arc angle from the circular arc trajectory;
obtaining, based on the center coordinates and the arc angle, the coordinates of the $i$-th point on the circular arc trajectory, the position interpolation point $P_i$ of the robotic arm end being expressed as:
$$P_i = C + R(\mathbf{k}, \theta_i)\,(P_0 - C)$$
where $C$ is the coordinates of the center of the circle, $P_0$ is the coordinates of the starting point of the circular arc trajectory, $\mathbf{k}$ is the unit vector of the rotation axis, $\theta_i$ is the arc angle formed by the $i$-th point and the starting point, and $R(\mathbf{k}, \theta_i)$ is the rotation matrix for a rotation about the axis $\mathbf{k}$ by the angle $\theta_i$, expressed as:
$$R(\mathbf{k}, \theta_i) = I + \sin\theta_i\,K + (1 - \cos\theta_i)\,K^2$$
where $I$ is the identity matrix and $K$ is the antisymmetric matrix of the unit vector $\mathbf{k}$ of the rotation axis;
solving, based on the position interpolation points of the robotic arm end and the forward and inverse kinematics model of the robotic arm, the joint angles corresponding to each interpolation point, thereby controlling the pressing position of the robotic arm to be fixed while changing the pressing angle.

6. The system according to claim 5, characterized in that the robotic arm includes an end cap, a link connected to the end cap, and a joint connected to the link; with the end cap as a fixed center and the link as the radius, the joint connected to the link is taken as a new end cap of the robotic arm; a circular arc trajectory along which the new end cap moves around the center is planned, the circular arc trajectory being determined from a starting point, an intermediate point, and an ending point; and position interpolation points of the end cap of the robotic arm are generated based on the circular arc trajectory to control the pressing position of the robotic arm to be fixed while changing the pressing angle.

7. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 4.