An operation ticket work compliance judgment system, method, and program product
By introducing an operation ticket compliance judgment system into substations, combined with voice, action and location recognition technologies, the safety risks and accuracy issues of substation operation compliance management have been resolved, achieving fully automated compliance judgment and improving operation safety and digital management capabilities.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- WUHAN XINDIAN ELECTRICAL TECH
- Filing Date
- 2026-05-21
- Publication Date
- 2026-06-19
AI Technical Summary
In existing technologies, compliance management of substation operations relies on subjective human judgment, which has problems such as high safety risks, low efficiency, insufficient accuracy, and lack of digital closed loop, and cannot meet the requirements of intelligent, real-time, and highly reliable safety management.
An operation ticket compliance judgment system is adopted, which combines speech recognition, action recognition and location positioning. It extracts operation ticket information through deep learning text recognition, completes step decomposition and sequence matching by combining action, location and object keyword databases, verifies steps by combining speech denoising model and cosine similarity algorithm, locates personnel by using image and pose dataset and hash encoding, performs precise positioning by combining Canny line feature extraction and 3D and 2D line feature matching, judges the accuracy of instruction actions by using hand joint recognition algorithm, and determines the operation execution status in real time by using switch status recognition model.
It enables automated, standardized, and real-time verification of substation operations, avoiding human error, reducing the risk of misoperation and equipment damage, providing full-process digital data recording, supporting safety analysis and accountability, and improving operational safety and refined management.
Figure CN122242978A_ABST
Description
Technical Field
[0001] This invention relates to the field of intelligent management and control technology for substation operation safety in power systems, specifically to a system, method, and program product for judging the compliance of operation ticket work.
Background Technology
[0002] As a key hub in the power system, substations must strictly adhere to the operation ticket system for on-site operations (such as switching operations, equipment maintenance, and overhaul testing). Operators must follow the steps, locations, objects, and action specifications specified in the operation ticket, and skipping steps, making mistakes, or operating incorrectly is strictly prohibited.
[0003] The current mainstream solution for compliance management of substation operations combines manual read-aloud of the operation ticket with on-site manual supervision and verification: before the operation, the operator reads the operation steps aloud to confirm them, and the on-site supervisor verifies the read-aloud content by ear, observes the operation location, operation actions, and equipment status by eye, and manually judges whether they are consistent with the requirements of the operation ticket to complete the compliance verification. This traditional manual management solution has the following technical defects that are difficult to overcome:
Reliance on subjective human judgment poses high safety risks: the entire process depends on the attention, sense of responsibility, and experience of the supervisor. Because of personnel fatigue, negligence, and distraction, the read-aloud content may go unverified, the work location may be misjudged, operation actions may not be checked, and equipment status may be misjudged, which can easily lead to misoperation, equipment damage, or even personal injury accidents.
[0004] Manual review is inefficient and lacks real-time performance: manual supervision can only achieve intermittent and localized verification, and cannot track and judge the ticket announcement voice, personnel location, operation actions, and equipment status continuously and in real time over the whole process. Compliance verification is delayed and violations cannot be stopped in time.
[0005] The verification dimensions are too limited and the accuracy is insufficient: basic verification is completed only through human hearing and vision, lacking multi-dimensional quantitative recognition and matching of voice content, spatial location, body movements and device status. The judgment criteria are subjective and highly differentiated, resulting in low accuracy of compliance verification.
[0006] Without a digital management and control closed loop, traceability is difficult: manual supervision lacks a complete digital record and automatic alarm mechanism, violations cannot be warned in real time, and it is difficult to accurately locate the violation and reconstruct the operation process afterward. Safety management lacks traceable and quantifiable digital support.
[0007] In summary, the traditional manual ticket read-aloud and supervision methods for operational compliance management can no longer meet the needs of intelligent, real-time, precise, and highly reliable safety management of substation operations. The industry urgently needs an automated operational compliance judgment method that integrates operation ticket parsing, voice recognition, action recognition, and location positioning.
Summary of the Invention
[0008] The purpose of this invention is to overcome the shortcomings of the aforementioned background technology and provide a system, method, and program product for judging the compliance of operation tickets by integrating operation ticket parsing, voice recognition, action recognition, and location positioning.
[0009] To achieve this objective, the operation ticket compliance judgment system designed in this invention includes:
The information infrastructure construction module, which is used to collect voice information from the ticket announcer, images of substation switchgear, and images of the work scene. Based on the voice information from the ticket announcer, a dedicated speech recognition model for substation scenes is trained and established. Based on the images of substation switchgear, a switch status recognition model is established. An operation ticket keyword library is constructed. A point cloud map of substation equipment locations is established. Based on the images of the work scene, an image and pose reference dataset is created.
The personnel operation positioning module, which is used to obtain the initial positioning information of personnel before the ticket announcement based on the image and pose reference dataset, extract the operation equipment number based on the substation-specific speech recognition model, and obtain the precise positioning information of the operator by combining the substation equipment location point cloud map with the initial positioning information of personnel before the ticket announcement. Based on the precise positioning information of the operator, the module constructs the correspondence between the operator and the surrounding equipment.
The ticket announcement compliance judgment module, which is used to parse the voice information of the ticket announcer based on the substation-specific speech recognition model and the operation ticket keyword library to obtain operation instruction keywords. Based on the operator's precise positioning information and the correspondence between the operator and the surrounding equipment, it verifies the operator's position compliance. When the operator's position is compliant, it determines the current operation step based on the operation instruction keywords; it also recognizes the operator's hand gestures, determines the corresponding operation target position based on the current operation step, and verifies the accuracy of the operator's hand gestures based on the operation target position.
[0010] Furthermore, the operation ticket compliance judgment system also includes: a personnel operation verification module, which is used to locate the center coordinates of the switch after the operation based on the switch status recognition model, the operation ticket keyword library and the current operation steps, and judge the operation execution status based on the center coordinates of the switch after the operation. When the operation execution status is judged to be completed, the module verifies whether the actual status of the current switch is consistent with the operation ticket requirements.
[0011] Furthermore, the method for creating an image and pose reference dataset based on work scene images includes: scaling all work scene images to a fixed size N×N and converting them to grayscale images, where N is the fixed side length of every scaled work scene image; calculating the pixel difference between adjacent columns of a single grayscale image using the following formula: $D(x,y) = I(x,y) - I(x,y+1)$, where $(x,y)$ are the coordinates of a pixel in a single grayscale image, $I(x,y)$ is the pixel value of the grayscale image at coordinates $(x,y)$, and $D(x,y)$ is the pixel difference between adjacent columns at coordinates $(x,y)$; applying this formula to every grayscale image to obtain the adjacent-column pixel differences of all grayscale images; traversing all pixel coordinates $(x,y)$ of a single grayscale image and binarizing the corresponding $D(x,y)$: if $D(x,y) > 0$, the hash value of that pixel coordinate is 1, and if $D(x,y) \le 0$, the hash value of that pixel coordinate is 0; arranging the binarized results of all pixel coordinates of a single grayscale image in pixel coordinate order to form a hash value encoding of length N×(N-1), i.e. $H = \{h_1, h_2, \dots, h_{N(N-1)}\}$, where $H$ is the hash value encoding of a single grayscale image; and traversing the adjacent-column pixel differences of all pixel coordinates of all grayscale images to obtain the hash value encodings of all grayscale images.
[0012] Furthermore, the method for obtaining the initial positioning information of personnel before the ticket announcement based on the image and pose reference dataset includes: before the ticket announcement, acquiring continuous images of the work scene in real time, scaling all acquired images to a fixed size n×n, and saving them, where n is the fixed side length of every scaled image; taking the continuous images of the work scene according to a time-series window to obtain a set of window images, i.e. $W_t = \{F_{t-w+1}, \dots, F_{t-1}, F_t\}$, where $W_t$ is the set of window images within the time-series window, $w$ is the length of the time-series window, $t$ is the current time, $F_t$ is the latest frame image at time $t$ (the current frame image), and $F_{t-w+1}$ is the earliest frame image within the time-series window; calculating the sum of the Hamming distances between the i-th frame image in the image and pose reference dataset and all images in the window image set, using the following formula: $S_i = \sum_{j=0}^{w-1} \sum_{k} \left( h_k^{(t-j)} \oplus H_k^{(i)} \right)$, where $S_i$ is the sum of Hamming distances between the i-th frame image in the image and pose reference dataset and the window image set; i and j are image frame numbers, with i > j; k is the index of the pixel hash value; $h_k^{(t-j)}$ is the k-th bit of the hash value of the (t-j)-th frame in the window image set; $H_k^{(i)}$ is the k-th bit of the hash value of the i-th frame in the image and pose reference dataset; and ⊕ is the binary XOR operation; selecting the i-th frame image with the smallest $S_i$ in the image and pose reference dataset as the optimal matching frame $F^*$; and taking the personnel pose $P^*$ corresponding to the optimal matching frame $F^*$ as the personnel positioning information $P_{init}$ of the current frame image $F_t$, i.e., the initial positioning information of personnel before the ticket announcement.
[0013] Furthermore, the method for obtaining the precise positioning information of the operator by combining the substation equipment location point cloud map and the initial positioning information of personnel before the ticket announcement includes: using the Canny operator to extract the two-dimensional line features of the current frame image $F_t$, obtaining a set of two-dimensional line features $L^{2D} = \{l_1^{2D}, l_2^{2D}, \dots\}$, where $L^{2D}$ is the set of two-dimensional line features and $l_i^{2D}$ is the point set of the i-th two-dimensional feature line; based on the initial positioning information $P_{init}$ of personnel before the ticket announcement, extracting the part of the substation equipment location point cloud map that lies within the camera's field of view to obtain the scene point cloud $C_s$; using a line feature extraction algorithm based on geometric analysis to extract the three-dimensional line features of the scene point cloud $C_s$, obtaining a set of three-dimensional line features $L^{3D} = \{l_1^{3D}, l_2^{3D}, \dots\}$, where $L^{3D}$ is the set of three-dimensional line features and $l_i^{3D}$ is the point set of the i-th three-dimensional feature line; based on the camera intrinsic parameters and the initial positioning information $P_{init}$, projecting the three-dimensional line features $L^{3D}$ onto the image plane to obtain the set of projected line features $L^{proj}$; for each projected line feature in $L^{proj}$, searching within a preset search radius for candidate two-dimensional line features and calculating the line feature similarity with the following formula: $s_{ij} = \dfrac{f_i^{proj} \cdot f_j^{2D}}{\lVert f_i^{proj} \rVert \, \lVert f_j^{2D} \rVert}$, where $s_{ij}$ is the similarity between the i-th projected three-dimensional line feature and the j-th two-dimensional line feature, $f_i^{proj}$ is the feature vector of the i-th projected three-dimensional line, $f_j^{2D}$ is the feature vector of the j-th two-dimensional line, $\lVert f_i^{proj} \rVert$ is the Euclidean norm of the feature vector of the i-th projected three-dimensional line, and $\lVert f_j^{2D} \rVert$ is the Euclidean norm of the feature vector of the j-th two-dimensional line. The projected three-dimensional line feature and the two-dimensional line feature with the highest similarity are selected to form the optimal feature pair.
The PnP (Perspective-n-Point) algorithm is then used, based on the optimal feature pairs, to solve the relative pose between the current frame image $F_t$ and the scene point cloud $C_s$, which gives the precise pose $P_{acc}$ of the operator. Based on the precise pose $P_{acc}$, the set of equipment numbers around the operator $E_{near}$ is obtained. The precise pose $P_{acc}$ together with the set of equipment numbers around the operator $E_{near}$ constitutes the precise positioning information of the operator.
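The matching step described above (search within a radius, score candidates by cosine similarity, keep the best pair) can be sketched in plain Python. This is only an illustration under stated assumptions: the patent text does not say what the line feature vector contains or how the search radius is measured, so the midpoint-based distance, the `LineFeature` tuple layout, and the function names below are hypothetical. The PnP pose solve that follows the matching is not shown.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Assumed layout: (midpoint in pixels, feature vector). Not specified by the source.
LineFeature = Tuple[Tuple[float, float], Sequence[float]]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_lines(projected: List[LineFeature],
                detected: List[LineFeature],
                radius: float) -> Dict[int, Tuple[int, float]]:
    """For each projected 3D line, pick the most similar 2D line within `radius`."""
    pairs: Dict[int, Tuple[int, float]] = {}
    for i, (p_mid, p_vec) in enumerate(projected):
        best_j, best_s = None, -math.inf
        for j, (d_mid, d_vec) in enumerate(detected):
            if math.dist(p_mid, d_mid) > radius:
                continue  # outside the preset search radius
            s = cosine_similarity(p_vec, d_vec)
            if s > best_s:
                best_j, best_s = j, s
        if best_j is not None:
            pairs[i] = (best_j, best_s)
    return pairs


projected = [((100.0, 100.0), (1.0, 0.0)), ((300.0, 50.0), (0.0, 1.0))]
detected = [((102.0, 101.0), (0.9, 0.1)),
            ((98.0, 130.0), (0.0, 1.0)),
            ((500.0, 500.0), (0.0, 1.0))]
pairs = match_lines(projected, detected, radius=40.0)
print(pairs)
```

In this toy input the first projected line finds a near-parallel candidate close by, while the second has no candidate inside the radius and is left unmatched, which is why only matched pairs would be handed to a pose solver.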
[0014] Furthermore, the method for parsing the voice information of the ticket announcer based on the substation-specific speech recognition model and the operation ticket keyword library to obtain operation instruction keywords, and verifying the operator's position compliance based on the operator's precise positioning information and the correspondence between the operator and surrounding equipment, includes: using the substation-specific speech recognition model to perform noise reduction and speech-to-text conversion on the real-time voice information of the ticket announcer, and extracting the equipment number $E_{voice}$ spoken by the ticket announcer; based on the operation ticket keyword library, extracting the keywords related to the operation instructions from the converted text using a keyword matching algorithm to obtain a set of operation instruction keywords $K_{voice}$; performing text recognition on the operation ticket to obtain the number of the equipment to be operated $E_{ticket}$; and verifying the number of the equipment to be operated $E_{ticket}$ against the set of equipment numbers around the operator $E_{near}$ and the equipment number $E_{voice}$ spoken by the ticket announcer: if $E_{ticket} \in E_{near}$ and $E_{ticket} = E_{voice}$, the operator's current work position is deemed compliant.
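The position check described above reduces to one membership test and one equality test. The sketch below shows that logic only; the function and argument names are hypothetical, and the equipment numbers are made-up examples.

```python
from typing import Iterable


def position_compliant(ticket_device: str,
                       voice_device: str,
                       nearby_devices: Iterable[str]) -> bool:
    """True when the ticket's device is near the operator and matches the spoken device."""
    return ticket_device in set(nearby_devices) and ticket_device == voice_device


# Ticket says 101, announcer says 101, operator stands next to 101 and 102.
print(position_compliant("101", "101", ["101", "102"]))  # True
# Operator is standing in the wrong bay: device 101 is not nearby.
print(position_compliant("101", "101", ["201", "202"]))  # False
```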
[0015] Furthermore, the method for determining the current operation step based on operation instruction keywords and recognizing the operator's hand gestures includes: establishing a mapping relationship between each step of the operation ticket and the corresponding operation instruction keyword group; performing similarity matching between the operation instruction keywords extracted in real time and the keyword group of each step to locate the operation step currently being performed by the operator; acquiring images of the work scene in real time; using a hand joint point recognition algorithm to detect the key points of the operator's hand; and locating and extracting the pixel coordinates of the tip of the operator's index finger as the actual indicated position of the operator's hand gesture $p_{hand}$.
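As an illustration of the step-matching part of this paragraph, the sketch below scores the spoken keywords against each step's keyword group and picks the best-scoring step. The text only says "similarity matching"; using cosine similarity over keyword counts is an assumption here (the beneficial-effects section mentions a cosine similarity algorithm), and the names `keyword_cosine` and `locate_step` as well as the sample keywords are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple


def keyword_cosine(a: Sequence[str], b: Sequence[str]) -> float:
    """Cosine similarity between two keyword lists treated as count vectors."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[k] * cb[k] for k in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def locate_step(spoken: Sequence[str],
                step_keywords: List[Sequence[str]]) -> Tuple[int, float]:
    """Return (index of best-matching step, its score). Raises ValueError if no steps."""
    scores = [keyword_cosine(spoken, kws) for kws in step_keywords]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


steps = [
    ["open", "breaker", "101"],
    ["check", "breaker", "101", "open position"],
    ["close", "isolator", "1011"],
]
step_index, score = locate_step(["check", "breaker", "101"], steps)
print(step_index, round(score, 3))
```

A real system would probably also reject a match whose score falls below some threshold, so that an unrelated utterance is not forced onto a step; that threshold is not described in the text and is left out here.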
[0016] Furthermore, the method for determining the corresponding operation target position based on the current operation step and verifying the accuracy of the operator's hand gestures based on the operation target position includes: retrieving the standard operating equipment information pre-associated with the current operation step, matching the operation target equipment corresponding to the current step, and reading the standard pixel coordinates of the operation target equipment in the work scene image acquired in real time as the reference operation target position $p_{target}$; calculating the pixel distance $d = \lVert p_{hand} - p_{target} \rVert_2$ between the actual indicated position of the operator's hand gesture $p_{hand}$ and the operation target position $p_{target}$; and, if $d$ does not exceed the preset pixel distance threshold, determining that the operator's hand gesture is accurate.
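The gesture check above is a Euclidean pixel distance compared with a threshold. A minimal sketch, with hypothetical names and made-up coordinates (the text does not give a threshold value):

```python
import math
from typing import Tuple

Pixel = Tuple[float, float]


def gesture_accurate(fingertip: Pixel, target: Pixel, max_pixels: float) -> bool:
    """True when the fingertip lies within `max_pixels` of the target position."""
    return math.dist(fingertip, target) <= max_pixels


# Fingertip 5 px from the target (3-4-5 triangle), threshold 20 px.
print(gesture_accurate((403.0, 304.0), (400.0, 300.0), max_pixels=20.0))  # True
# Fingertip pointing at a neighbouring switch 150 px away.
print(gesture_accurate((550.0, 300.0), (400.0, 300.0), max_pixels=20.0))  # False
```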
[0017] Furthermore, a method for judging the compliance of operation tickets based on the operation ticket compliance judgment system includes:
collecting voice information of the ticket announcer, images of substation switchgear, and images of the work scene; training and establishing a dedicated speech recognition model for the substation scene based on the voice information of the ticket announcer; establishing a switch status recognition model based on the images of the substation switchgear; constructing an operation ticket keyword library; establishing a point cloud map of substation equipment locations; and creating an image and pose reference dataset based on the work scene images;
obtaining the initial positioning information of personnel before the ticket announcement based on the image and pose reference dataset; extracting the operation equipment number based on the substation-specific speech recognition model; obtaining the precise positioning information of the operator by combining the substation equipment location point cloud map with the initial positioning information of personnel before the ticket announcement; and constructing the correspondence between the operator and the surrounding equipment based on the precise positioning information of the operator;
parsing the voice information of the ticket announcer based on the substation-specific speech recognition model and the operation ticket keyword library to obtain operation instruction keywords; verifying the operator's position compliance based on the operator's precise positioning information and the correspondence between the operator and the surrounding equipment; when the operator's position is compliant, determining the current operation step based on the operation instruction keywords; and recognizing the operator's hand gestures, determining the corresponding operation target position according to the current operation step, and verifying the accuracy of the operator's hand gestures based on the operation target position.
[0018] Furthermore, a computer program product includes a computer program / instructions that, when executed by a processor, implement the steps of the method.
[0019] The beneficial effects of this invention are as follows: First, this invention uses deep learning text recognition to extract operation ticket information and combines action, location, and object keyword libraries to complete step decomposition and sequence matching. This automatically replaces manual step-by-step interpretation and verification, avoids skipped or incorrect steps, and significantly improves the efficiency of operation execution and compliance management. Second, a dedicated voice denoising and recognition model is trained for substation noise, and a cosine similarity algorithm is used to quantitatively match the ticket announcement voice against the operation ticket steps and to verify the step order. This solves the omissions, misjudgments, and poor real-time performance of manual verification and realizes automated, standardized, and real-time verification of the ticket announcement content. Third, image and pose datasets, hash encoding, and Hamming distance are used to complete the initial personnel positioning, and Canny line feature extraction, 3D and 2D line feature matching, and the PnP (Perspective-n-Point) algorithm are used to achieve precise personnel positioning. Through triple matching of the equipment number recognized from the ticket image, the equipment number extracted from the voice, and the surrounding equipment numbers obtained from point cloud positioning, the problems of inaccurate manual positioning and entering the wrong bay are solved, eliminating misoperation in the spatial dimension. Fourth, a hand joint recognition algorithm is used to determine the accuracy of the indicating gesture, and a switch status recognition model is used to determine the operation execution and the final equipment status in real time, completing a triple verification of the ticket announcement gesture, the operation execution, and the equipment status. This fills the control gap left by manual supervision, which cannot track actions and results. 
Fifth, a fully automated compliance judgment system is constructed by integrating location, voice, action, and equipment status recognition with real-time alarms. This eliminates subjective misjudgments caused by personnel fatigue, negligence, and differences in experience, reduces the risk of personal injury, equipment damage, and misoperation accidents at the source, and significantly improves operational safety. Finally, the system automatically records all process data, including voice, location, action, equipment status, and verification results, forming a complete digital archive. This allows violations to be located precisely and the operation process to be reconstructed, providing reliable data support for safety analysis, accountability, and process optimization, and achieving refined and digital management of substation operations.
[0020] In summary, the operation ticket compliance judgment system designed in this invention has voice, image, and point cloud algorithms optimized specifically for the complex noise and multi-equipment interference environment of substations. The system is robust, has high recognition accuracy, can operate stably in actual engineering scenarios, and has excellent field adaptability and promotion value.
Attached Figure Description
[0021] To more clearly illustrate the technical solutions of the embodiments disclosed in this invention, the accompanying drawings of the embodiments will be briefly described below. These drawings are for illustrative purposes only and are not intended to limit the scope of protection of this invention.
[0022] Figure 1 is a schematic diagram of the module connections of the operation ticket compliance judgment system designed in this invention. Figure 2 is a flowchart of the operation ticket compliance judgment method designed in this invention.
Detailed Implementation
[0023] The technical solutions (including preferred technical solutions) of the present invention will be further described in detail below with reference to the accompanying drawings and by way of listing some optional embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort are within the scope of protection of the present invention.
[0024] Example 1
This invention provides a specific embodiment of an operation ticket compliance judgment system. As shown in Figure 1, the operation ticket compliance judgment system includes four modules, as detailed below:
Information infrastructure construction module: collects voice information of the ticket announcer, images of substation switchgear, and work scene images; trains and establishes a dedicated speech recognition model for the substation scene based on the voice information of the ticket announcer; establishes a switch status recognition model based on the images of substation switchgear; constructs an operation ticket keyword library; establishes a point cloud map of substation equipment locations; and creates an image and pose reference dataset based on the work scene images.
Personnel operation positioning module: based on the image and pose reference dataset, obtains the initial positioning information of personnel before the ticket announcement; extracts the operation equipment number based on the substation-specific speech recognition model; combines the substation equipment location point cloud map with the initial positioning information of personnel before the ticket announcement to obtain the precise positioning information of the operator; and constructs the correspondence between the operator and the surrounding equipment based on the precise positioning information of the operator.
Ticket announcement compliance judgment module: based on the substation-specific speech recognition model and the operation ticket keyword library, parses the voice information of the ticket announcer to obtain operation instruction keywords. Based on the operator's precise positioning information and the correspondence between the operator and the surrounding equipment, it verifies the operator's position compliance. When the operator's position is compliant, it determines the current operation step based on the operation instruction keywords; it also recognizes the operator's hand gestures, determines the corresponding operation target position based on the current operation step, and verifies the accuracy of the operator's hand gestures based on the operation target position.
[0025] Personnel operation verification module: Based on the switch status recognition model, operation ticket keyword library and current operation steps, locate the center coordinates of the switch after the operation, determine the operation execution status according to the center coordinates of the switch after the operation, and when the operation execution status is determined to be completed, verify whether the actual status of the current switch is consistent with the requirements of the operation ticket.
[0026] Example 2
As shown in Figure 2, this embodiment provides a method for judging the compliance of operation tickets based on Example 1. The specific steps include S1 to S20, of which S1 to S4 are not shown in Figure 2:
S1: Model training and basic data collection.
Speech model training: collect voice information of the ticket announcer in a substation environment, and train speech denoising and speech recognition models that account for the operating noise of substation equipment, thereby achieving accurate denoising and text conversion of speech in noisy environments.
[0027] Switch recognition model training: Collect images of switchgear in substations, train a target recognition model for the switches on the switchgear, and achieve accurate recognition of different states of the switch such as open and closed.
[0028] S2: Construct the operation ticket keyword library. A standardized operation ticket keyword library is constructed, which contains the action keywords, location keywords, and operation object keywords used in operation tickets, providing a benchmark for keyword extraction and matching in subsequent steps.
[0029] S3: Establish the point cloud map of substation equipment locations. The substation scene is scanned by LiDAR and camera to obtain three-dimensional spatial data of the entire station, forming a point cloud map. The location, equipment number, and equipment type of each piece of equipment are marked in the point cloud map, establishing the binding relationship between the spatial location and the identity information of the equipment.
[0030] S4: Generate the image and pose reference dataset and produce hash codes. Based on the collected images of the work scene, the image dataset IMG0 and the pose dataset POSE0 are created. All collected images of the work scene are then preprocessed by uniformly scaling them to a fixed size of N×N and converting them into grayscale images.
[0031] The formula for calculating the pixel difference between adjacent columns of a grayscale image is: $D(x,y) = I(x,y) - I(x,y+1)$, where $(x,y)$ are the coordinates of a pixel in a single grayscale image, $I(x,y)$ is the pixel value of the grayscale image at coordinates $(x,y)$, and $D(x,y)$ is the pixel difference between adjacent columns at coordinates $(x,y)$. Applying this formula to every grayscale image gives the adjacent-column pixel differences of all grayscale images.
[0032] All pixel coordinates $(x,y)$ of a single grayscale image are traversed and the corresponding $D(x,y)$ is binarized: if $D(x,y) > 0$, the hash value of that pixel coordinate is 1, and if $D(x,y) \le 0$, the hash value of that pixel coordinate is 0. The binarized results of all pixel coordinates of a single grayscale image are arranged in pixel coordinate order to form a hash value encoding of length N×(N-1), i.e. $H = \{h_1, h_2, \dots, h_{N(N-1)}\}$, where $H$ is the hash value encoding of a single grayscale image. Traversing the adjacent-column pixel differences of all pixel coordinates of all grayscale images gives the hash value encodings of all grayscale images.
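The hash encoding of S4 can be illustrated with a few lines of plain Python. This is a sketch under stated assumptions, not the patent's implementation: the image is assumed to be already scaled to N×N and converted to grayscale, it is represented as a list of rows, and the difference is taken as "left pixel minus right pixel", a sign convention the extracted text does not preserve. The function name `column_difference_hash` is hypothetical.

```python
from typing import List


def column_difference_hash(gray: List[List[int]]) -> List[int]:
    """Return the N*(N-1)-bit adjacent-column difference hash of an N x N image."""
    n = len(gray)
    if n == 0 or any(len(row) != n for row in gray):
        raise ValueError("expected a non-empty square N x N image")
    bits: List[int] = []
    for x in range(n):              # x: row index
        for y in range(n - 1):      # y: column index, compared with column y + 1
            diff = gray[x][y] - gray[x][y + 1]  # assumed sign convention
            bits.append(1 if diff > 0 else 0)   # > 0 -> 1, <= 0 -> 0
    return bits


image = [
    [10, 20, 15],
    [30, 30, 5],
    [7, 3, 9],
]
code = column_difference_hash(image)
print(code)       # [0, 1, 0, 1, 1, 0]
print(len(code))  # 6, i.e. N * (N - 1) for N = 3
```

Because each bit depends only on whether a pixel is brighter than its right-hand neighbour, the code changes little under uniform brightness shifts, which is presumably why a difference hash suits a site with varying lighting.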
[0033] S5: Initial positioning of personnel before the ticket announcement. During the initialization phase before the ticket announcement, images of the work scene are acquired in real time, the current frame image is denoted $F_t$, and the initial positioning of personnel is completed based on the current frame. The specific sub-steps are as follows:
S5.1 Generate the set of time-series window images. Real-time work scene images are captured by a camera, preprocessed, scaled to a fixed size of n×n, and saved. The images are taken according to a time-series window to obtain the window image set $W_t = \{F_{t-w+1}, \dots, F_{t-1}, F_t\}$, where $w$ is the time window length, $t$ is the current time, and $F_t$ is the latest frame image at the current time.
[0034] S5.2 Calculate the sum of Hamming distances. For the i-th frame image in IMG0, calculate the sum of its Hamming distances to the images within the window, using the following formula: $S_i = \sum_{j=0}^{w-1} \sum_{k} \left( h_k^{(t-j)} \oplus H_k^{(i)} \right)$, where $S_i$ is the sum of Hamming distances between the i-th frame image in the image and pose reference dataset and the window image set; i and j are image frame numbers, with i > j; k is the index of the pixel hash value; $h_k^{(t-j)}$ is the k-th bit of the hash value of the (t-j)-th frame in the window image set; $H_k^{(i)}$ is the k-th bit of the hash value of the i-th frame in the image and pose reference dataset; and ⊕ is the binary XOR operation. After traversing and calculating, the i-th frame image with the smallest $S_i$ in the image and pose reference dataset is selected as the optimal matching frame $F^*$.
[0035] S5.3 Determine the initial position of the personnel. The personnel pose $P^*$ corresponding to the optimal matching frame $F^*$ is taken as the personnel pose $P_{init}$ of the current frame image $F_t$, that is, the initial positioning information of personnel before the ticket announcement. This completes the initial positioning of personnel before the ticket announcement.
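Steps S5.2 and S5.3 amount to a nearest-neighbour search over hash codes. The sketch below sums the Hamming distance from each reference hash to every hash in the live window and returns the reference frame with the smallest sum; its stored pose would then serve as the initial position. Hashes are plain lists of 0/1 bits, all of the same length (which assumes the live and reference images were scaled to the same size), and the names and sample data are hypothetical.

```python
from typing import List, Sequence, Tuple


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of differing bits between two equal-length bit sequences."""
    if len(a) != len(b):
        raise ValueError("hash codes must have the same length")
    return sum(x ^ y for x, y in zip(a, b))


def best_reference_frame(reference_hashes: List[Sequence[int]],
                         window_hashes: List[Sequence[int]]) -> Tuple[int, int]:
    """Return (index, distance sum) of the reference frame closest to the window."""
    if not reference_hashes:
        raise ValueError("reference dataset is empty")
    sums = [sum(hamming(ref, frame) for frame in window_hashes)
            for ref in reference_hashes]
    best = min(range(len(sums)), key=sums.__getitem__)
    return best, sums[best]


reference = [[0, 1, 0, 1], [1, 1, 0, 0], [0, 0, 0, 0]]   # hashes of reference images
poses = ["pose_A", "pose_B", "pose_C"]                   # poses stored with them
window = [[1, 1, 0, 0], [1, 1, 0, 1]]                    # hashes of the last w = 2 frames

index, total = best_reference_frame(reference, window)
print(index, total, poses[index])  # 1 1 pose_B
```

Summing over a window rather than matching a single frame makes the choice less sensitive to one blurred or occluded frame, at the cost of assuming the operator barely moves within the window.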
[0036] S6: Extract the operation ticket equipment number. The operator takes an image of the operation ticket, and a deep learning text recognition algorithm is used to extract the equipment number $E_{ticket}$ from the operation ticket.
[0037] S7 Extract the operation ticket step sequence; All operation steps in the operation ticket image are extracted using a text recognition algorithm and saved, in execution order, to a step set $W = \{W_1, W_2, \dots\}$, where $W_i$ is the i-th operation step in the operation ticket.
[0038] S8 Extract the keywords of each step; A text matching method is used to extract the core keywords of each operation step in the operation ticket, generating a keyword set for each step, $P_i = \{p_{i1}, p_{i2}, \dots\}$, where $p_{ij}$ is the j-th keyword of the i-th operation step.
[0039] S9 Precise positioning of personnel during the ticket announcement phase; After the ticket announcement begins, the current location of each person is accurately determined using the point cloud map. The specific sub-steps are as follows: S9.1 Extract two-dimensional line features from the image; The Canny operator is used to extract the two-dimensional line features in the current frame image, as follows. The current color frame image is converted to a single-channel grayscale image to eliminate color interference and unify the input format for feature extraction. A Gaussian kernel is used to smooth the grayscale image, suppressing uneven lighting, equipment reflections, and image noise at the substation site so that noise is not misidentified as edge features. All pixels in the grayscale image are traversed and the horizontal and vertical gradients of each pixel are calculated; from these, the gradient magnitude and direction of each pixel are obtained, giving the edge strength and orientation. Non-maximum suppression is then applied: each pixel is compared with its neighbours along the gradient direction and only the pixel with the largest gradient magnitude is retained, which thins wide edges into edge lines of single-pixel width and removes redundant weak responses near the edges. High and low thresholds are set. Pixels with gradient magnitude not less than the high threshold are identified as strong edge points and retained directly; pixels with gradient magnitude not higher than the low threshold are identified as non-edge points and removed directly; pixels with gradient magnitude between the two thresholds are identified as weak edge points and retained only if connected to strong edge points, otherwise removed. The filtered single-pixel edge points are connected according to their topological relationships to form continuous two-dimensional line features, which are encapsulated in sequence into the two-dimensional line feature set $L_{2D} = \{l_1, l_2, \dots\}$, where $l_i$ is the point set of the i-th two-dimensional feature line.
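The double-threshold rule at the end of S9.1 can be illustrated in isolation. The sketch below assumes the gradient magnitudes have already been computed and thinned, and it assumes 8-connectivity for "connected to strong edge points", which the text does not specify. The function name `hysteresis` and the sample grid are hypothetical.

```python
def hysteresis(magnitude, low, high):
    """Return a binary edge map from a grid of gradient magnitudes."""
    rows, cols = len(magnitude), len(magnitude[0])
    edges = [[0] * cols for _ in range(rows)]
    # Strong points (magnitude >= high) are kept directly and seed the search.
    stack = [(r, c) for r in range(rows) for c in range(cols)
             if magnitude[r][c] >= high]
    for r, c in stack:
        edges[r][c] = 1
    # Grow outwards: a weak point (low < magnitude < high) is kept only if it
    # touches a point that is already an edge (8-connectivity, an assumption).
    while stack:
        r, c = stack.pop()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not edges[nr][nc]
                        and low < magnitude[nr][nc] < high):
                    edges[nr][nc] = 1
                    stack.append((nr, nc))
    return edges


# Hypothetical magnitudes: one strong point at (0, 0), three weak points.
magnitude = [
    [90, 40, 10, 40],
    [5, 5, 5, 5],
    [5, 5, 5, 40],
]
edges = hysteresis(magnitude, low=20, high=80)
```

Only the weak point adjacent to the strong point survives; the two isolated weak points are discarded, which is how the rule suppresses noise without breaking genuine edges.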
[0040] S9.2 Extract the scene point cloud within the camera's field of view; Based on the initial camera pose, the part of the point cloud map that lies within the camera's field of view is extracted as the scene point cloud.
[0041] S9.3 Extract 3D line features from the point cloud; A line feature extraction algorithm based on geometric analysis is used to extract the 3D line features in the scene point cloud, as follows. The point cloud is denoised and homogenized; the normal vector and curvature of each point are calculated to select candidate feature points belonging to straight-line structures; the candidate points are clustered into continuous line segment point sets according to spatial distance and normal vector angle constraints; straight lines are fitted to the line segment point sets to generate standard 3D line features; all fitted 3D line features are encapsulated in sequence to obtain the 3D line feature set $L_{3D} = \{L_1, L_2, \dots\}$, where $L_i$ is the point set of the i-th three-dimensional feature line.
[0042] S9.4 Project the 3D line features onto the image; Based on the camera intrinsics and the initial pose, the point clouds of the 3D feature lines in the 3D line feature set are projected onto the image plane to obtain the projected feature point set $P_{3D-2D}$.
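A minimal sketch of the projection in S9.4, assuming an ideal pinhole camera without lens distortion and a pose given as a world-to-camera rotation `R` and translation `t`. The text does not state the camera model, so these are assumptions, and `project_points` and the sample values are hypothetical.

```python
def project_points(points_world, intrinsics, R, t):
    """Project 3D points to pixel coordinates with a pinhole model."""
    fx, fy, cx, cy = intrinsics
    pixels = []
    for X in points_world:
        # World frame -> camera frame: Xc = R * X + t
        xc, yc, zc = (sum(R[i][j] * X[j] for j in range(3)) + t[i]
                      for i in range(3))
        if zc <= 0:
            continue  # point is behind the camera and cannot be observed
        # Perspective division followed by the intrinsic mapping.
        pixels.append((fx * xc / zc + cx, fy * yc / zc + cy))
    return pixels


# Hypothetical values: identity pose, 500 px focal length, 640x480 image centre.
intrinsics = (500.0, 500.0, 320.0, 240.0)
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 0.0]
line_points = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0), (0.0, 0.0, -1.0)]

projected = project_points(line_points, intrinsics, R, t)
```

Because the pose used here is only the initial estimate from S5, the projected lines will generally be offset from the observed 2D lines; the later matching and pose-solving steps correct that offset.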
[0043] S9.5 Calculate the similarity of line features; For the i-th projected line feature in $P_{3D-2D}$, the two-dimensional line features within a preset search radius are matched and their similarity is calculated using the following formula:

$$S_{ij} = \frac{\mathbf{F}_i \cdot \mathbf{f}_j}{\lVert \mathbf{F}_i \rVert \, \lVert \mathbf{f}_j \rVert}$$

where $S_{ij}$ is the similarity between the i-th projected 3D line feature and the j-th 2D line feature; $\mathbf{F}_i$ is the feature vector of the i-th projected 3D line; $\mathbf{f}_j$ is the feature vector of the j-th two-dimensional line; $\lVert \mathbf{F}_i \rVert$ is the Euclidean norm of the feature vector of the i-th projected 3D line; and $\lVert \mathbf{f}_j \rVert$ is the Euclidean norm of the feature vector of the j-th two-dimensional line. S9.6 Determine the optimal line feature pairs; The pair with the highest similarity is selected as the optimal line feature pair, and the calculation is repeated until all features in $P_{3D-2D}$ have been matched.
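S9.5 and S9.6 amount to a cosine-similarity search. The sketch below is illustrative only: the contents of the line feature vectors are not specified in the text, so two-component direction vectors are assumed, and the search-radius filter is omitted for brevity. `cosine` and `match_lines` are hypothetical names.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def match_lines(projected, observed):
    """For each projected 3D line, pick the 2D line with the highest similarity."""
    pairs = []
    for i, f3d in enumerate(projected):
        scores = [cosine(f3d, f2d) for f2d in observed]
        j = max(range(len(scores)), key=scores.__getitem__)
        pairs.append((i, j, scores[j]))
    return pairs


# Hypothetical feature vectors (assumed to be line directions in the image).
projected = [(1.0, 0.0), (0.0, 1.0)]
observed = [(0.0, 2.0), (3.0, 0.1)]

pairs = match_lines(projected, observed)
```

In a fuller version each candidate list would first be restricted to 2D lines whose midpoint lies within the preset search radius of the projected line, so that similar-looking but distant lines are never paired.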
[0044] S9.7 Solve for the precise pose of the personnel; Based on the optimal feature pairs, the PnP algorithm is used to solve for the relative pose between the current frame image and the scene point cloud, which gives the precise pose of the operator. Based on the operator's precise position, the set of equipment numbers around the operator is obtained.
[0045] S10 Ticket announcement audio analysis and keyword extraction; The system collects the voice of the ticket announcement in real time, denoises it using a speech denoising model, and converts it into text using a deep learning algorithm, from which the equipment number spoken in the announcement is extracted. The core keywords of the text are extracted using a keyword matching method to obtain the ticket announcement keyword set $V = \{v_1, v_2, \dots\}$, where $V$ is the set of keywords extracted from the voice of the ticket announcement and $v_k$ is the k-th keyword in the set.
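[0046] S11 Ticket announcement position compliance verification; A three-way number match is performed to determine position compliance. Judgment rule: let $N_{op}$ be the equipment number extracted from the operation ticket, $S_{eq}$ the set of equipment numbers around the operator, and $N_{voice}$ the equipment number extracted from the announcement voice. If $N_{op} \in S_{eq}$ and $N_{op} = N_{voice}$, the ticket announcement position is compliant and the process proceeds to the next step; if these conditions are not met, an alarm is issued indicating that the ticket announcement position is not compliant.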
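The three-way match in S11 reduces to one membership test and one equality test. A minimal sketch, with hypothetical equipment numbers and a hypothetical function name:

```python
def position_compliant(ticket_number, nearby_numbers, spoken_number):
    """True if the ticket's equipment is near the operator and was the one announced."""
    return ticket_number in nearby_numbers and ticket_number == spoken_number


# Hypothetical equipment numbers around the operator.
nearby = {"101", "102", "1011"}

ok = position_compliant("101", nearby, "101")
wrong_bay = position_compliant("205", nearby, "205")      # operator is not near 205
wrong_call = position_compliant("101", nearby, "102")     # announced a different number
```

Both failure cases would raise the position alarm described above, although they point to different mistakes: standing at the wrong equipment versus announcing the wrong equipment.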
[0047] S12 Similarity calculation between the content of the ticket announcement and the steps of the operation ticket; Once the position is compliant, the keyword set Pi of the i-th step of the operation ticket is retrieved and its cosine similarity with the ticket announcement keyword set is calculated. The specific sub-steps are as follows: S12.1 Generate the total keyword set and the frequency codes; The ticket announcement keyword set and Pi are merged and deduplicated to obtain the total keyword set Ws.
[0048] The number of occurrences of each keyword of Ws in the ticket announcement keyword set and in Pi is counted, generating the frequency codes $A = (a_1, a_2, \dots, a_C)$ and $B = (b_1, b_2, \dots, b_C)$, where $a_c$ is the number of times the c-th keyword appears in the ticket announcement keyword set; $b_c$ is the number of times the c-th keyword appears in Pi; and $C$ is the total number of keywords in Ws.
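[0049] S12.2 Calculate cosine similarity; The similarity between the ticket announcement keyword set and Pi is calculated using the cosine similarity algorithm. With $a_c$ and $b_c$ denoting the number of occurrences of the c-th keyword of the total keyword set Ws in the ticket announcement keyword set and in Pi respectively, and $C$ the total number of keywords, the formula is:

$$\mathrm{sim}_i = \frac{\sum_{c=1}^{C} a_c b_c}{\sqrt{\sum_{c=1}^{C} a_c^{2}} \, \sqrt{\sum_{c=1}^{C} b_c^{2}}}$$

S12.3 Determine the current execution step; All steps of the operation ticket are traversed and the similarity with each step is calculated. The step Wq with the highest similarity is selected as the operation ticket step currently being executed.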
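Steps S12.1 to S12.3 can be sketched with the standard library alone. The keyword lists below are hypothetical English stand-ins, and `keyword_similarity` and `current_step` are illustrative names, not part of the described system.

```python
import math
from collections import Counter


def keyword_similarity(announced, step_keywords):
    """Cosine similarity between the keyword-frequency codes of two keyword lists."""
    vocab = sorted(set(announced) | set(step_keywords))  # total keyword set
    a_counts, b_counts = Counter(announced), Counter(step_keywords)
    a = [a_counts[w] for w in vocab]  # frequency code of the announcement
    b = [b_counts[w] for w in vocab]  # frequency code of the ticket step
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def current_step(announced, steps):
    """Return (index of the most similar step, similarity of every step)."""
    scores = [keyword_similarity(announced, s) for s in steps]
    return max(range(len(scores)), key=scores.__getitem__), scores


# Hypothetical keyword sets for three ticket steps and one announcement.
steps = [
    ["open", "switch", "101"],
    ["close", "disconnector", "1011"],
    ["check", "switch", "101", "open"],
]
announced = ["close", "disconnector", "1011"]

step_index, scores = current_step(announced, steps)
```

Working on keyword counts rather than raw transcripts means small recognition differences in filler words do not affect which step is selected.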
[0050] S13 Verification of the step sequence of the ticket announcement; The current execution step Wq is compared with the preset step order on the operation ticket: if the order is consistent, the ticket announcement voice is compliant; if a step has been skipped or the wrong step has been announced, an alarm is issued indicating that the ticket announcement voice is not compliant.
[0051] S14 Recognize hand gestures during the ticket announcement; The system acquires real-time images of the ticket announcement scene and uses a hand joint recognition algorithm to determine the operator's hand posture. When the operator's hand is in a pointing gesture, the system calculates the pixel coordinates of the tip of the index finger.
[0052] S15 Locate the pixel position of the operation target; The text information of the operation ticket is extracted using a deep learning text recognition algorithm and matched with the current execution step Wq to determine the operation target (switch or disconnector) and obtain the pixel coordinates of the operation target.
[0053] S16 Compliance verification of the ticket announcement action; The pixel distance $D_p$ between the coordinates of the operator's index fingertip and the coordinates of the operation target is calculated. If $D_p$ is less than the preset threshold, the ticket announcement action is compliant. If $D_p$ is greater than or equal to the preset threshold, an alarm is issued indicating that the ticket announcement action is not compliant.
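A minimal sketch of the S16 check. The formula for the pixel distance did not survive in the text, so a Euclidean distance is assumed here; the function name, coordinates and thresholds are hypothetical.

```python
import math


def pointing_compliant(fingertip, target, threshold_px):
    """Return (compliant?, pixel distance) for a fingertip and a target position."""
    # Assumption: D_p is the Euclidean distance between the two pixel positions.
    distance = math.hypot(fingertip[0] - target[0], fingertip[1] - target[1])
    return distance < threshold_px, distance


# Hypothetical pixel coordinates of the index fingertip and the operation target.
compliant, d_p = pointing_compliant((412, 300), (400, 295), threshold_px=20)
too_far, _ = pointing_compliant((412, 300), (400, 295), threshold_px=10)
```

A fixed pixel threshold depends on how far the camera is from the switchgear, so in practice it would probably need to be calibrated per camera position.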
[0054] S17 Locate the center coordinates of the switch to be operated; Scene images are acquired in real time; the switches and switch numbers on the switchgear are identified; the switch to be operated is determined based on the operation ticket keywords; and the pixel coordinates $P_e$ of the center of that switch are obtained.
[0055] S18 Determine the operation execution status; The pixel coordinates of the operator's hand in the image are identified in real time. When the pixel distance between the hand coordinates and the switch center coordinates $P_e$ is less than the threshold $T_e$, it is determined that the operator has performed the operation. After the operation is executed, the actual status of the switches and disconnectors on the switchgear is identified again.
[0056] S19 Equipment Status Compliance Verification; Compare the actual status of the switch with the keyword requirements of the operation ticket: If the status is consistent, the operation is compliant; If the status is inconsistent, a status non-compliance alarm will be issued.
[0057] S20 executes the entire process in a loop; Repeat steps S10-S19. After each operation ticket step has completed compliance verification, check for alarms. If an alarm is detected, the system pauses subsequent steps and waits for manual confirmation or on-site handling. If no alarm is detected, proceed to the next operation ticket step. Complete compliance verification of the voice, location, action, and equipment status step by step until all steps in the operation ticket have been executed, at which point the system completes full-process compliance control.
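The control flow of S20 can be sketched as a loop that stops advancing whenever a step raises an alarm. How the system resumes after manual confirmation is not detailed in the text; the sketch assumes a callback that returns True once the alarm has been handled. `run_ticket`, `verify_step` and `wait_for_confirmation` are hypothetical names.

```python
def run_ticket(steps, verify_step, wait_for_confirmation):
    """Verify steps in order; pause on any alarm until it is confirmed as handled."""
    log = []
    for index, step in enumerate(steps):
        # verify_step stands in for S10-S19 and returns a list of alarm messages.
        alarms = verify_step(index, step)
        log.append((index, alarms))
        if alarms and not wait_for_confirmation(index, alarms):
            break  # alarm not resolved: subsequent steps stay paused
    return log


# Hypothetical ticket and a stub verifier that raises one alarm on the second step.
steps = ["open switch 101", "close disconnector 1011", "check switch 101"]


def verify_step(index, step):
    return ["pointing action not compliant"] if index == 1 else []


resumed = run_ticket(steps, verify_step, lambda index, alarms: True)
halted = run_ticket(steps, verify_step, lambda index, alarms: False)
```

The returned log doubles as the full-process record: every step appears with the alarms it raised, whether or not the run reached the end of the ticket.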
[0058] Embodiment 3: This invention provides a computer program product, including a computer program that, when executed by a processor, implements the steps of the operation ticket compliance judgment method described in Embodiment 2.
[0059] It should be noted that the above description of the technical solutions is exemplary, and this specification may be embodied in different forms and should not be construed as limiting it to the technical solutions set forth herein. Rather, providing these descriptions will ensure that the disclosure of this invention is thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art. Furthermore, the technical solutions of this invention are defined only by the scope of the claims.
[0060] Where the terms “comprising,” “having,” and “including” are used in this specification, one or more other parts may also be present, and terms used in the singular may also cover the plural.
[0061] Finally, it should be noted that the above embodiments are merely representative examples of the present invention. Obviously, the present invention is not limited to the above embodiments and many variations are possible. Any simple modifications, equivalent changes, and alterations made to the above embodiments based on the technical essence of the present invention should be considered within the protection scope of the present invention.
Claims
1. A system for judging the compliance of operation tickets, characterized in that it comprises:
an information infrastructure module, used to collect voice information of the ticket announcer, images of substation switchgear, and images of the work scene; to train and establish a substation scenario-specific speech recognition model based on the voice information of the ticket announcer; to establish a switch status recognition model based on the images of substation switchgear; to construct an operation ticket keyword library; to establish a point cloud map of substation equipment locations; and to create an image and pose reference dataset based on the images of the work scene;
a personnel operation positioning module, used to obtain the initial positioning information of personnel before the ticket announcement based on the image and pose reference dataset; to extract the operation equipment number based on the substation scenario-specific speech recognition model; to obtain the precise positioning information of the operator by combining the point cloud map of substation equipment locations with the initial positioning information of personnel before the ticket announcement; and to construct the correspondence between the operator and the surrounding equipment based on the precise positioning information of the operator;
a ticket announcement compliance judgment module, used to parse the voice information of the ticket announcer based on the substation scenario-specific speech recognition model and the operation ticket keyword library to obtain operation instruction keywords; to verify the operator's position compliance based on the precise positioning information of the operator and the correspondence between the operator and the surrounding equipment; when the operator's position is compliant, to determine the current operation step based on the operation instruction keywords; and to identify the operator's hand gestures, determine the corresponding operation target location based on the current operation step, and verify the accuracy of the operator's hand gestures based on the operation target location.
2. The operation ticket compliance judgment system as described in claim 1, characterized in that it further comprises: a personnel operation verification module, used to locate the center coordinates of the switch to be operated based on the switch status recognition model, the operation ticket keyword library and the current operation step; to determine the operation execution status based on the center coordinates of that switch; and, when the operation execution status is determined to be completed, to verify whether the actual status of the switch is consistent with the requirements of the operation ticket.
3. The operation ticket compliance judgment system as described in claim 1, characterized in that: the method for creating an image and pose reference dataset based on work scene images includes: scaling all work scene images to a fixed size N×N and converting them to grayscale, where N is the fixed side length of all work scene images after scaling; calculating the pixel difference between adjacent columns of a single grayscale image, $d(x, y)$, where $(x, y)$ are the coordinates of a pixel in the single grayscale image, $g(x, y)$ is the pixel value of the single grayscale image at coordinates $(x, y)$, and $d(x, y)$ is the difference between the pixel values of adjacent columns at coordinates $(x, y)$; calculating, in the same way, the pixel differences between adjacent columns of all grayscale images; traversing all pixel coordinates $(x, y)$ of a single grayscale image and binarizing the corresponding $d(x, y)$: if $d(x, y) > 0$, the hash value of the pixel coordinate is 1, and if $d(x, y) \le 0$, the hash value of the pixel coordinate is 0; arranging the binarized results of all pixel coordinates of a single grayscale image in pixel coordinate order to form a hash encoding of length N×(N−1), which is the hash encoding of that grayscale image; and traversing the adjacent-column pixel differences of all pixel coordinates of all grayscale images to obtain the hash encodings of all grayscale images.
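By way of illustration only, the hash encoding recited in claim 3 can be sketched as below. The sign convention of the adjacent-column difference is not recoverable from the text, so "right neighbour minus current pixel" is assumed; `difference_hash` and the 3×3 sample grid are hypothetical.

```python
def difference_hash(gray):
    """Encode an N x N grayscale grid as N*(N-1) bits from adjacent-column differences."""
    bits = []
    for row in gray:
        for current, right in zip(row, row[1:]):
            # Assumed convention: difference = right neighbour - current pixel.
            bits.append(1 if right - current > 0 else 0)
    return bits


# Hypothetical 3 x 3 grayscale image (values 0-255).
gray = [
    [10, 20, 15],
    [5, 5, 9],
    [30, 20, 10],
]
code = difference_hash(gray)
```

Each row of N pixels yields N−1 differences, which is where the encoding length N×(N−1) comes from.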
4. The operation ticket compliance judgment system as described in claim 3, characterized in that: the method for obtaining the initial positioning information of personnel before the ticket announcement based on the image and pose reference dataset includes: before the ticket announcement, acquiring continuous images of the work scene in real time, scaling all acquired continuous images to a fixed size n×n and saving them, where n is the fixed side length of the acquired continuous images after scaling; cropping the continuous images of the work scene with a time-sequence window to obtain the window image set $IMG_t = \{I_{t-w+1}, \dots, I_t\}$, where $IMG_t$ is the set of window images within the time-sequence window, $w$ is the length of the time-sequence window, $t$ is the current moment, $I_t$ is the latest frame image corresponding to the current moment $t$, that is, the current frame image, and $I_{t-w+1}$ is the earliest frame image within the time-sequence window; calculating the sum of the Hamming distances between the i-th frame image in the image and pose reference dataset and all images in the window image set, using the following formula:

$$D_i = \sum_{j=0}^{w-1} \sum_{k} \left( h_{t-j}^{k} \oplus H_{i}^{k} \right)$$

where $D_i$ is the sum of Hamming distances between the i-th frame image in the image and pose reference dataset and the window image set; i and j are image frame numbers, and i > j; k is the hash bit index; $h_{t-j}^{k}$ is the k-th bit hash value of the (t−j)-th frame in the window image set; $H_{i}^{k}$ is the k-th bit hash value of the i-th frame in the image and pose reference dataset; ⊕ is the binary XOR operation; selecting the i-th frame image with the smallest $D_i$ in the image and pose reference dataset as the optimal matching frame; and taking the personnel pose corresponding to the optimal matching frame as the personnel positioning information of the current frame image $I_t$, that is, the initial positioning information of personnel before the ticket announcement.
5. The operation ticket compliance judgment system as described in claim 4, characterized in that: the method for obtaining the precise positioning information of the operator by combining the point cloud map of substation equipment locations with the initial positioning information of personnel before the ticket announcement includes: extracting the two-dimensional line features of the current frame image using the Canny operator to obtain the two-dimensional line feature set $L_{2D} = \{l_1, l_2, \dots\}$, where $L_{2D}$ is the set of two-dimensional line features and $l_i$ is the point set of the i-th two-dimensional feature line; based on the initial positioning information of personnel before the ticket announcement, extracting the part of the point cloud map of substation equipment locations within the camera's field of view to obtain the scene point cloud; extracting the three-dimensional line features of the scene point cloud using a line feature extraction algorithm based on geometric analysis to obtain the three-dimensional line feature set $L_{3D} = \{L_1, L_2, \dots\}$, where $L_{3D}$ is the set of three-dimensional line features and $L_i$ is the point set of the i-th three-dimensional feature line; based on the camera intrinsic parameters and the initial positioning information of personnel before the ticket announcement, projecting the three-dimensional line features in $L_{3D}$ onto the image plane to obtain the projected feature point set $P_{3D-2D}$; within a preset search radius, matching each projected line feature in $P_{3D-2D}$ with the corresponding two-dimensional line features and calculating the line feature similarity as follows:

$$S_{ij} = \frac{\mathbf{F}_i \cdot \mathbf{f}_j}{\lVert \mathbf{F}_i \rVert \, \lVert \mathbf{f}_j \rVert}$$

where $S_{ij}$ is the similarity between the i-th projected 3D line feature and the j-th 2D line feature; $\mathbf{F}_i$ is the feature vector of the i-th projected 3D line; $\mathbf{f}_j$ is the feature vector of the j-th two-dimensional line; $\lVert \mathbf{F}_i \rVert$ is the Euclidean norm of the feature vector of the i-th projected 3D line; and $\lVert \mathbf{f}_j \rVert$ is the Euclidean norm of the feature vector of the j-th two-dimensional line; selecting the projected 3D line feature and the 2D line feature with the highest similarity to form the optimal feature pair; using the PnP algorithm to solve, based on the optimal feature pairs, for the relative pose between the current frame image and the scene point cloud, which gives the precise pose of the operator; obtaining, based on the operator's precise position, the set of equipment numbers around the operator; and combining the operator's precise position with the set of equipment numbers around the operator to obtain the precise positioning information of the operator.
6. The operation ticket compliance judgment system as described in claim 5, characterized in that: the method for parsing the voice information of the ticket announcer based on the substation scenario-specific speech recognition model and the operation ticket keyword library to obtain operation instruction keywords, and for verifying the operator's position compliance based on the precise positioning information of the operator and the correspondence between the operator and the surrounding equipment, includes: using the substation scenario-specific speech recognition model to perform noise reduction and speech-to-text conversion on the real-time voice information of the ticket announcer, and extracting the equipment number $N_{voice}$ spoken by the ticket announcer; based on the operation ticket keyword library, extracting the keywords related to the operation instructions from the converted text using a keyword matching algorithm to obtain the set of operation instruction keywords; performing text recognition on the operation ticket to obtain the number $N_{op}$ of the equipment to be operated; and matching the number $N_{op}$ of the equipment to be operated against the set $S_{eq}$ of equipment numbers around the operator and the equipment number $N_{voice}$ spoken by the ticket announcer: if $N_{op} \in S_{eq}$ and $N_{op} = N_{voice}$, the operator's current work position is deemed compliant.
7. The operation ticket compliance judgment system as described in claim 6, characterized in that: the method for determining the current operation step based on the operation instruction keywords and identifying the operator's hand gestures includes: establishing a mapping relationship between each step of the operation ticket and the corresponding operation instruction keyword group; performing similarity matching between the operation instruction keywords extracted in real time and the keyword group of each step to identify the operation step currently being performed by the operator; acquiring images of the work scene in real time; using a hand joint point recognition algorithm to detect the key points of the operator's hand; and locating and extracting the pixel coordinates of the tip of the operator's index finger as the actual pointing position of the operator's hand gesture.
8. The operation ticket compliance judgment system as described in claim 7, characterized in that: the method for determining the corresponding operation target location based on the current operation step and verifying the accuracy of the operator's hand gestures based on the operation target location includes: retrieving the standard operating equipment information pre-associated with the current operation step; matching the operation target equipment corresponding to the current step; reading the standard pixel coordinates of the operation target equipment in the work scene image acquired in real time as the reference operation target location; calculating the pixel distance between the actual pointing position of the operator's hand gesture and the operation target location; and, if the pixel distance does not exceed the preset pixel distance threshold, determining that the operator's hand gesture is accurate.
9. A method for judging the compliance of operation tickets based on the operation ticket compliance judgment system of claim 1, characterized in that it comprises:
collecting voice information of the ticket announcer, images of substation switchgear, and images of the work scene; training and establishing a substation scenario-specific speech recognition model based on the voice information of the ticket announcer; establishing a switch status recognition model based on the images of substation switchgear; constructing an operation ticket keyword library; establishing a point cloud map of substation equipment locations; and creating an image and pose reference dataset based on the images of the work scene;
obtaining the initial positioning information of personnel before the ticket announcement based on the image and pose reference dataset; extracting the operation equipment number based on the substation scenario-specific speech recognition model; obtaining the precise positioning information of the operator by combining the point cloud map of substation equipment locations with the initial positioning information of personnel before the ticket announcement; and constructing the correspondence between the operator and the surrounding equipment based on the precise positioning information of the operator;
parsing the voice information of the ticket announcer based on the substation scenario-specific speech recognition model and the operation ticket keyword library to obtain operation instruction keywords; verifying the operator's position compliance based on the precise positioning information of the operator and the correspondence between the operator and the surrounding equipment; when the operator's position is compliant, determining the current operation step based on the operation instruction keywords; and identifying the operator's hand gestures, determining the corresponding operation target location based on the current operation step, and verifying the accuracy of the operator's hand gestures based on the operation target location.
10. A computer program product comprising a computer program / instructions, characterized in that: When the computer program / instructions are executed by the processor, they implement the steps of the method of claim 9.