Posture detection device, posture detection method, and sleeping posture determination method
By processing frame images with an acquisition unit and a determination unit, and by using a learned model together with the difference between detected positions and the consistency between frame images to identify false detections, the problem of false detection between frame images in pose detection is solved and the accuracy of pose detection is improved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- SHIMADZU SEISAKUSHO LTD
- Filing Date
- 2021-07-07
- Publication Date
- 2026-06-23
AI Technical Summary
Existing machine learning-based pose detection techniques suffer from the problem of misdetecting human body part position information between frames, resulting in the detection of human movements that did not actually occur.
The acquisition unit acquires frame images in time-series order. The position detection unit inputs them into a learned model, which outputs position information for multiple parts of the human body. The determination unit outputs false detection information based on the difference between the detected position groups and the consistency between frame images.
It effectively removes falsely detected position information, prevents the detection of human movements that did not actually occur, and improves the accuracy of posture detection.
Figure CN116348909B_ABST
Description
Technical Field
[0001] This invention relates to a posture detection device, a posture detection method, and a sleeping posture determination method.
Background Technology
[0002] In recent years, machine learning-based pose detection (also known as pose estimation) technology has developed rapidly. This technology detects people within an image, identifies the positional information of multiple body parts, calculates the lines connecting these parts, and overlays the lines on the original image for display. The body parts include, for example, the knees, elbows, wrists, and ankles. Unlike traditional motion capture, this technology does not require markers to be attached to the body. One such technology uses the OpenPose algorithm (Non-patent literature 1). With OpenPose, real-time pose detection can be performed on video captured by a standard monocular camera. This technology has spurred the development of machine learning-based pose detection, which is now being applied in various fields such as sports, medicine, and security.
[0003] Existing technical documents
[0004] Non-patent literature
[0005] Non-patent literature 1: Zhe Cao et al. (2017). OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields, CVPR
Summary of the Invention
[0006] The technical problem that the invention aims to solve
[0007] However, machine learning-based pose detection techniques can erroneously detect the positional information of multiple parts of the human body within an input image. Most machine learning-based pose detection techniques, including OpenPose, detect positional information for each frame of a video without considering the relationships between frames, so the detection results for the positional information of multiple parts of the human body can differ between frames. As a result, movement can be detected even when the human body has not moved between frames. For the same reason, even when the human body has moved between frames, positional information containing movement that could not possibly have occurred can still be detected. Thus, due to the misdetection of the positional information of multiple parts of the human body, movements that did not actually occur can be detected, regardless of whether movement exists between frames.
[0008] This disclosure was made to solve this problem, and its purpose is to provide a technique for determining false detections of positional information of multiple parts of the human body.
[0009] Solution to the above technical problems
[0010] A posture detection apparatus according to one aspect of this disclosure includes an acquisition unit, a position detection unit, and a determination unit. The acquisition unit acquires frame images sequentially in a time series. The position detection unit inputs information obtained from the frame images into a learned model and outputs position information of multiple body parts from the learned model. The determination unit outputs information related to false detections by the position detection unit. The position detection unit outputs a first position group when the first frame image acquired by the acquisition unit is used as input, and outputs a second position group when the second frame image acquired by the acquisition unit, which is different from the first frame image, is used as input. The determination unit outputs information related to false detections by the position detection unit based on the difference between the first and second position groups and the consistency between the first and second frame images.
[0011] A posture detection method according to another aspect of this disclosure includes: a step of acquiring frame images in time-series order; a step of inputting information obtained from the frame images into a learned model and outputting position information of multiple parts of the human body from the learned model; and a step of outputting information related to false detection of the output position information of the multiple parts. The step of outputting the position information of the multiple parts includes: outputting a first position group when the acquired first frame image is used as input; and outputting a second position group when the acquired second frame image, which is different from the first frame image, is used as input. In the step of outputting information related to false detection, the information is output based on the difference between the first position group and the second position group and the consistency between the first frame image and the second frame image.
[0012] A sleeping posture determination method according to yet another aspect of this disclosure includes: the posture detection method described above; and a step of outputting sleeping posture information based on the output position information of the multiple body parts and the output information related to false detection.
[0013] Invention Effects
[0014] According to this disclosure, it is possible to determine false detections of positional information for multiple parts of the human body. Therefore, falsely detected positional information can be removed, thus preventing the detection of human movements that did not actually occur.
Attached Figure Description
[0015] Figure 1 is a schematic diagram illustrating an example of the hardware configuration of the posture detection device according to this embodiment.
[0016] Figure 2 is a block diagram illustrating an example of the functional configuration of the posture detection device.
[0017] Figure 3 is a flowchart illustrating an example of the detection process.
[0018] Figure 4 is a flowchart illustrating an example of the first group setting process.
[0019] Figure 5 is a flowchart illustrating an example of the second group setting process.
[0020] Figure 6 is a flowchart illustrating an example of the third group setting process.
[0021] Figure 7 is a flowchart illustrating an example of the error elimination process.
[0022] Figure 8 is a diagram illustrating an example of the input and output data of the detection process.
[0023] Figure 9 is a diagram showing an example of the test results.
[0024] Figure 10 is a diagram showing an example of the test results.
[0025] Figure 11 is a diagram showing an example of the test results.
[0026] Figure 12 is a diagram showing an example of the test results.
[0027] Figure 13 is a diagram showing an example of the test results.
[0028] Figure 14 is a diagram showing an example of the test results.
[0029] Figure 15 is a flowchart illustrating an example of the error elimination process in a variation of this embodiment.
Detailed Implementation
[0030] Hereinafter, embodiments of the present disclosure will be described in detail with reference to the accompanying drawings. Furthermore, the same or equivalent parts in the drawings will be labeled with the same reference numerals and their descriptions will not be repeated.
[0031] [Hardware Components of the Posture Detection Device]
[0032] Figure 1 is a schematic diagram illustrating an example of the hardware configuration of the posture detection device according to this embodiment. As shown in Figure 1, the posture detection device 100 of this embodiment includes a hard disk 101, a CPU (Central Processing Unit) 102, a memory 103, a display interface 104, a peripheral device interface 105, and a camera interface 106. These are connected to each other via a bus so that they can communicate. The posture detection device 100 includes, for example, a personal computer or a workstation.
[0033] Hard disk 101 is a non-volatile storage device. Hard disk 101 stores, for example, an OS (Operating System) 111, a learned model 112 for position detection, an error elimination program 113, and a sleeping posture detection program 114. In addition to the data shown in the figure, hard disk 101 also stores programs and data for various processing operations.
[0034] CPU 102 reads the program stored in hard disk 101 into memory 103 and executes it to realize various functions of posture detection device 100. Memory 103 is a volatile storage device, including, for example, DRAM (Dynamic Random Access Memory).
[0035] The learned model 112 for position detection is a neural network model that takes information obtained from frame images as input and outputs position information of multiple parts of the human body as the detection result (hereinafter also referred to as the "position detection result"). The input can be either pixel values obtained from the frame images or feature values extracted from the frame images. For example, OpenPose developed by CMU (Carnegie Mellon University), Human-pose-estimation developed by Microsoft, or PoseNet developed by Google can be used as the learned model 112 for position detection.
[0036] The error elimination program 113 is a program used to eliminate and correct position detection results that are judged to be erroneous. In this embodiment, the detection of position information of multiple parts of the human body is referred to as "pose detection". Position information consists of coordinates that identify the position of, for example, the shoulder, elbow, or wrist. Position information can also be coordinates that identify the position of another area, calculated from the coordinates of multiple parts. For example, coordinates identifying the position of the cheek area can be calculated from the coordinates that identify the positions of the eyes, nose, and mouth. In addition, position information is not limited to 2D and can also be 3D. 3D position information can be estimated from 2D position information, or it can be measured using a stereo camera or a TOF camera.
[0037] The sleeping posture detection program 114 is a program that takes position detection results as input and outputs sleeping posture information as the detection result (hereinafter also referred to as the "sleeping posture detection result" or "sleeping posture determination result"). Sleeping posture information includes, for example, supine, prone, facing right, and facing left. The position information of multiple parts of the human body detected as the position detection result is correlated with the sleeping posture information. The sleeping posture detection program 114 infers the sleeping posture based on a predefined model that establishes a correspondence between position detection results as input and sleeping posture information as output. Alternatively, a machine learning-based learned model can be used to output the sleeping posture information.
[0038] CPU 102 is connected to display 121 via display interface 104. CPU 102 is connected to mouse 122 and keyboard 123 via peripheral device interface 105. CPU 102 is connected to camera 124 via camera interface 106.
[0039] Display interface 104 is an interface for connecting display 121, enabling data input and output between posture detection device 100 and display 121. Display 121 is, for example, an LCD (Liquid Crystal Display) or an organic EL (Electroluminescence) display. Display 121 displays the video captured by camera 124 and the results output by the learned model 112 for position detection or the sleeping posture detection program 114.
[0040] The peripheral device interface 105 is used to connect peripheral devices such as mouse 122 and keyboard 123, so as to realize the input and output of data between the posture detection device 100 and the peripheral devices.
[0041] The camera interface 106 is used to connect the camera 124, enabling data input and output between the posture detection device 100 and the camera 124.
[0042] Alternatively, the CPU 102 may be configured to connect to a network such as a LAN (Local Area Network) via a communication interface. In this case, the CPU 102 can also be connected to the camera 124 or the display 121 via the network.
[0043] [Functional Configuration of the Posture Detection Device]
[0044] Figure 2 is a block diagram illustrating an example of the functional configuration of the posture detection device. As shown in Figure 2, the posture detection device 100 includes an acquisition unit 131, a position detection unit 132, a determination unit 133, and a sleeping posture detection unit 134 as functional units involved in the detection process. These functions are implemented by the CPU 102 of the posture detection device 100 executing the OS 111 and various programs.
[0045] The acquisition unit 131 acquires frame images in time-series order. For example, the frame images are images included in a video of a sleeping person. Specifically, the sleeping person is filmed by camera 124, and the acquisition unit 131 acquires the frame images included in the captured video in time-series order via camera interface 106. The video is not limited to one of a sleeping person; the subject can be a person who is awake, or something other than a person.
[0046] The position detection unit 132 inputs information obtained from frame images into a learned model (learned model 112 for position detection), and outputs position information of multiple parts of the human body as position detection results from the learned model. Here, the position information of multiple parts of the human body is also referred to as a "position group". In addition, the "position group" is also simply referred to as "position".
[0047] When the position detection unit 132 takes the frame image acquired by the acquisition unit 131 as input, it outputs a position group as the position detection result. For example, the position detection unit 132 outputs a first position group when the first frame image acquired by the acquisition unit 131 is taken as input, and outputs a second position group when the second frame image acquired by the acquisition unit 131 is taken as input. The determination unit 133 outputs information related to false detections by the position detection unit (also called "false detection information").
[0048] Furthermore, if the output position detection result is determined to be incorrect, the determination unit 133 uses the error elimination program 113 to eliminate that position detection result and correct it. Hereinafter, the position group output by the determination unit 133 is also referred to as the "position group (corrected)".
[0049] The sleeping posture detection unit 134 uses the sleeping posture detection program 114 to output sleeping posture information based on the position detection results.
[0050] The frame images acquired by the acquisition unit 131 are sent to the display interface 104. In addition, an image showing the position group (corrected) output from the determination unit 133, the difference image between the first and second frame images, an image showing the false detection information, and an image showing the sleeping posture information output from the sleeping posture detection unit 134 are sent to the display interface 104. The display 121 can display these images.
[0051] The posture detection device 100 does not have to include the position detection unit 132 itself; an external device may include the position detection unit 132, or another external device may provide some of the functions. The external device may be connected via a network. The posture detection device 100 may also omit the sleeping posture detection unit 134. A device with the sleeping posture detection unit 134 may be called a sleeping posture determination device, and a device without it may be called a posture detection device.
[0052] [Flowchart of the Detection Process]
[0053] Figure 3 is a flowchart illustrating an example of the detection process. As shown in Figure 3, the CPU 102 performs the detection process.
[0054] The detection process is initiated each time a frame image of the video is generated. Hereinafter, the frame image acquired by the acquisition unit 131 in this detection process will be referred to as the "frame image acquired this time" or "the second frame image." The frame image acquired by the acquisition unit 131 in the previous detection process will be referred to as the "frame image acquired last time" or "the first frame image." In this detection process, the position group output by the position detection unit 132 will be referred to as the "second position group," and the position detection result output by the position detection unit 132 will be referred to as the "position detection result this time." In the previous detection process, the position group output by the position detection unit 132 will be referred to as the "first position group," and the position detection result output by the position detection unit 132 will be referred to as the "position detection result last time."
[0055] Furthermore, the acquisition unit 131 does not need to acquire every frame image; it can acquire one every few frames or at predetermined time intervals. It is also not necessary to acquire the frame images of the video captured by the camera 124 in real time; video stored on the hard disk 101 can be used for the detection process instead.
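As an illustration of acquiring a frame image only every few frames, the following is a minimal Python sketch. The function name `select_frames` and the `step` parameter are hypothetical and do not appear in the disclosure; the frames are represented by placeholder strings rather than decoded images.

```python
def select_frames(frames, step):
    """Yield (index, frame) for every `step`-th frame, starting with the first.

    `frames` is any iterable of frame images in time-series order.
    `step` is a hypothetical sampling interval (1 means every frame).
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    for index, frame in enumerate(frames):
        if index % step == 0:
            yield index, frame


# Stand-in frames; in practice these would be images read from a camera
# or from a stored video file.
recorded = ["frame0", "frame1", "frame2", "frame3", "frame4"]
selected = list(select_frames(recorded, step=2))
```

Because the function only iterates over its input, the same code would apply to frames arriving from a camera and to frames read back from storage.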
[0056] The detection process is a series of processes performed by the acquisition unit 131, the position detection unit 132, the determination unit 133, and the sleeping posture detection unit 134. The detection process includes frame image acquisition processing, position detection processing, first group setting processing, second group setting processing, third group setting processing, error elimination processing, and sleeping posture detection processing. Alternatively, the detection process may omit the sleeping posture detection processing.
[0057] The frame image acquisition process is performed by the acquisition unit 131. The position detection process is performed by the position detection unit 132 using the learned model 112 for position detection.
[0058] The determination unit 133 outputs information related to false detections by the position detection unit 132. The first group setting process, the second group setting process, and the third group setting process are processes performed by the determination unit 133. Furthermore, the determination unit 133 performs error elimination processing using the error elimination program 113. The sleeping posture detection process is performed by the sleeping posture detection unit 134 using the sleeping posture detection program 114. Hereinafter, "step" is abbreviated as "S".
[0059] When the detection process begins, the CPU 102 performs the frame image acquisition process in S11 and advances the process to S12. In the frame image acquisition process, the acquisition unit 131 acquires frame images in time-series order. A frame image is, for example, an image included in a video of a sleeping person.
[0060] In S12, CPU 102 performs position detection processing and proceeds to S13. In the position detection processing, position detection unit 132 inputs information obtained from frame images into a learned model (learned model 112 for position detection), and outputs position information of multiple parts of the human body as position detection results from the learned model.
[0061] In S13, the CPU 102 executes the first group setting process (Figure 4) and advances the process to S14. In the first group setting process, if the condition based on the acquired frame image is met, the determination unit 133 sets the frame image acquired this time as the first group.
[0062] In S14, the CPU 102 executes the second group setting process (Figure 5) and advances the process to S15. In the second group setting process, if the condition based on the position detection result is met, the determination unit 133 sets the frame image acquired this time as the second group.
[0063] In S15, the CPU 102 executes the third group setting process (Figure 6) and advances the process to S16. In the third group setting process, if the conditions based on the first or second group settings are met, the determination unit 133 sets the frame image acquired this time as the third group.
[0064] In S16, the CPU 102 executes the error elimination process (Figure 7) and advances the process to S17. In the error elimination process, the determination unit 133 corrects the position detection result if the frame image acquired this time is set as the third group.
[0065] In S17, the CPU 102 performs the sleeping posture detection process and ends the detection process. The sleeping posture detection unit 134 takes the position detection result as input, determines the sleeping posture using the sleeping posture detection program 114, and outputs sleeping posture information (the sleeping posture detection result).
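The order of S12 to S16 can be sketched in Python as follows. This is only an illustrative outline under stated assumptions, not the disclosed implementation: the function and parameter names are hypothetical, the third-group rule used here (positions changed although the images did not) is inferred from the false detection condition described in this document, and the correction that reuses the previous position group is a placeholder for the error elimination process, whose details are not given in this part of the text. S17 is omitted.

```python
def detection_step(prev_frame, prev_positions, frame, detect,
                   image_difference, position_difference,
                   image_reference, position_reference):
    """Run one pass of S12 to S16 and return (positions, flagged)."""
    positions = detect(frame)                                   # S12
    image_diff = image_difference(prev_frame, frame)
    position_diff = position_difference(prev_positions, positions)
    in_first_group = image_diff > image_reference               # S13
    in_second_group = position_diff > position_reference        # S14
    # S15 (assumed rule): positions changed although the images did not.
    in_third_group = in_second_group and not in_first_group
    if in_third_group:
        # S16 (assumed correction): fall back to the previous positions.
        positions = prev_positions
    return positions, in_third_group


# Toy stand-ins: a frame is a number and a position group is a single number.
def toy_detect(frame):
    return {1: 100, 2: 180}[frame]


def toy_image_difference(frame_a, frame_b):
    return 0.0  # the two toy frames are treated as identical images


def toy_position_difference(positions_a, positions_b):
    return abs(positions_b - positions_a)


positions, flagged = detection_step(
    prev_frame=1, prev_positions=100, frame=2, detect=toy_detect,
    image_difference=toy_image_difference,
    position_difference=toy_position_difference,
    image_reference=0.01, position_reference=20,
)
```

In this toy run the detected position jumps from 100 to 180 while the images are unchanged, so the frame is flagged and the previous position is kept.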
[0066] The first to third group setting processes are described below with reference to Figures 4 to 6. In summary, the processing is as follows. The position detection unit 132 outputs a first position group when it receives the first frame image acquired by the acquisition unit 131 as input. Furthermore, the position detection unit 132 outputs a second position group when it receives a second frame image acquired by the acquisition unit 131 that is different from the first frame image as input. Here, the second frame image is an image acquired after the first frame image. Alternatively, the first frame image may also be an image acquired after the second frame image.
[0067] The determination unit 133 outputs information related to the false detection by the position detection unit 132 based on the difference between the first position group and the second position group, and the consistency between the first frame image and the second frame image. In this embodiment, when the determination unit 133 determines that the first position group and the second position group are different based on the difference between them, and determines that the first frame image and the second frame image are consistent based on the consistency between them, it outputs information that confirms the position detection unit 132 has falsely detected the second position group as information related to the false detection by the position detection unit 132. Furthermore, the determination unit 133 excludes the falsely detected second position group from the output of the position detection unit 132. The following explanation uses a flowchart.
[0068] Figure 4 is a flowchart illustrating an example of the first group setting process. The posture detection device 100 includes a first calculation unit and a second calculation unit. The first calculation unit calculates the difference between the first position group and the second position group. The second calculation unit calculates the consistency between the first frame image and the second frame image. For example, the consistency between the first frame image and the second frame image can be calculated as follows. The difference of each pixel is obtained between the first frame image and the second frame image. The number of pixels that produce a difference is used as the difference, and the number of pixels that do not produce a difference is used as the consistency. Alternatively, the ratio of the number of pixels that produce a difference to the total number of pixels can be used as the difference, and the ratio of the number of pixels that do not produce a difference to the total number of pixels can be used as the consistency.
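The pixel counting described above can be sketched in Python as follows. This is a minimal sketch assuming grayscale frames represented as lists of rows; the function name and the `tolerance` parameter (a noise margin below which two pixel values are treated as equal) are hypothetical additions and are not part of the disclosure.

```python
def frame_difference_and_consistency(frame_a, frame_b, tolerance=0):
    """Compare two equally sized grayscale frames given as lists of rows.

    Returns (diff_count, same_count, diff_ratio, same_ratio).
    `tolerance` is a hypothetical noise margin; 0 means exact comparison.
    """
    if len(frame_a) != len(frame_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(frame_a, frame_b)
    ):
        raise ValueError("frames must have the same shape")
    total = sum(len(row) for row in frame_a)
    if total == 0:
        raise ValueError("frames must not be empty")
    # Count the pixels whose values differ between the two frames.
    diff_count = sum(
        1
        for row_a, row_b in zip(frame_a, frame_b)
        for pixel_a, pixel_b in zip(row_a, row_b)
        if abs(pixel_a - pixel_b) > tolerance
    )
    same_count = total - diff_count
    return diff_count, same_count, diff_count / total, same_count / total


first = [[10, 10, 10], [20, 20, 20]]
second = [[10, 12, 10], [20, 20, 90]]
result = frame_difference_and_consistency(first, second)
```

Here two of the six pixels differ, so the count-based difference is 2 and the count-based consistency is 4; the ratio-based values are the same counts divided by 6.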
[0069] The determination unit 133 classifies the second frame image into the first group based on the difference between the first frame image and the second frame image. Specifically, when the first group setting process begins, in S21 the second calculation unit calculates the consistency between the previously acquired frame image (first frame image) and the currently acquired frame image (second frame image), and advances the process to S22. At that time, a difference image that shows the difference between the first frame image and the second frame image is also generated based on that difference.
[0070] In S22, when the difference between the first frame image and the second frame image exceeds a predetermined reference ("Yes" in S22), the determination unit 133 proceeds to S23. Alternatively, the process can proceed to S23 when the consistency between the first frame image and the second frame image is below a predetermined reference.
[0071] In S22, if the difference between the first frame image and the second frame image does not exceed the predetermined reference ("No" in S22), the determination unit 133 terminates the first group setting process. Alternatively, the first group setting process can be terminated if the consistency between the first frame image and the second frame image is not below a predetermined reference.
[0072] In step S23, the determination unit 133 sets the second frame image (the frame image acquired this time) as group 1 and ends the group 1 setting process. In this process, if it can be determined that there are almost no pixels that differ between the first and second frame images (the difference does not exceed a predetermined benchmark, or the consistency is not below a predetermined benchmark), the first and second frame images are determined to be consistent. Furthermore, if it is determined that the first and second frame images are inconsistent (different), the second frame image (the frame image acquired this time) is set as group 1.
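The branch in S22 and S23 can be sketched in Python as follows. This is a minimal sketch that assumes the image difference has already been calculated as a ratio in S21; the function name, the use of a set of frame identifiers to represent the first group, and the reference value 0.01 are hypothetical.

```python
def first_group_setting(frame_id, image_difference, first_group, reference):
    """S22/S23: add frame_id to first_group when the difference exceeds reference.

    Returns True when the frame was set as the first group.
    """
    if image_difference > reference:   # "Yes" in S22
        first_group.add(frame_id)      # S23
        return True
    return False                       # "No" in S22: nothing is set


first_group = set()
moved = first_group_setting(frame_id=7, image_difference=0.20,
                            first_group=first_group, reference=0.01)
still = first_group_setting(frame_id=8, image_difference=0.001,
                            first_group=first_group, reference=0.01)
```

Frame 7 differs from its predecessor in 20% of its pixels and is set as the first group; frame 8 is treated as consistent with its predecessor and is left out.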
[0073] Alternatively, the second calculation unit may not calculate the similarity or difference between the first frame image and the second frame image, and the determination unit 133 may not determine whether there is a difference between the first frame image and the second frame image. In this case, the first frame image and the second frame image may be checked by the human eye to determine whether there is a difference between the first frame image and the second frame image.
[0074] In other words, instead of the determination unit 133 performing the first group setting process, the frame image can be set to the first group manually by visual inspection. Alternatively, the difference image can be visually inspected, and the frame image can be set to the first group manually. In this case, the first group setting can be performed via external input from the mouse 122 or the keyboard 123.
[0075] Figure 5 is a flowchart illustrating an example of the second group setting process. Based on the difference between the first position group and the second position group, the determination unit 133 classifies the second frame image, which was input to the position detection unit 132 to output the second position group, into the second group.
[0076] For example, the difference between position group 1 and position group 2 can also be calculated as follows. For each part, calculate the amount of movement from position group 1 to position group 2. Given that the coordinates of position 1 are (X1, Y1) and the coordinates of position 2 are (X2, Y2) for a given part, calculate the amount of movement between position 1 and position 2. Then, calculate the sum of the movement amounts for all parts as the difference.
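The text does not state how the amount of movement between (X1, Y1) and (X2, Y2) is computed. The Python sketch below assumes the Euclidean distance, which is one natural choice but is an assumption, and sums it over all parts. The function names and the dictionary representation of a position group (part name mapped to coordinates) are hypothetical.

```python
import math


def movement_amount(position1, position2):
    """Euclidean distance between (X1, Y1) and (X2, Y2) (assumed metric)."""
    x1, y1 = position1
    x2, y2 = position2
    return math.hypot(x2 - x1, y2 - y1)


def position_group_difference(group1, group2):
    """Sum of the movement amounts over the parts present in both groups."""
    common_parts = group1.keys() & group2.keys()
    return sum(movement_amount(group1[p], group2[p]) for p in common_parts)


first_group = {"right_wrist": (0, 0), "right_elbow": (10, 10)}
second_group = {"right_wrist": (3, 4), "right_elbow": (10, 10)}
difference = position_group_difference(first_group, second_group)
```

In this example only the wrist moves, by 3 in X and 4 in Y, so the summed difference is 5.0.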
[0077] When the second group setting process begins, in S31 the first calculation unit calculates the difference between the previous position detection result (first position group) and the current position detection result (second position group) and advances the process to S32.
[0078] In S32, if the difference between the first position group and the second position group exceeds a predetermined benchmark ("Yes" in S32), the determination unit 133 advances the process to S33. If the difference does not exceed the predetermined benchmark ("No" in S32), the determination unit 133 ends the second group setting process.
[0079] In S33, the determination unit 133 classifies the second frame image (the frame image acquired this time), which was input to the position detection unit 132 to output the second position group, into the second group, and ends the second group setting process. In this process, if it can be determined that there is almost no movement between the first position group and the second position group (the difference does not exceed a predetermined reference), it is determined that the first position group and the second position group are consistent. If it is determined that the first position group and the second position group are inconsistent (different), the second frame image (the frame image acquired this time) is set as the second group.
[0080] Furthermore, instead of calculating the difference as the sum of the movements of all parts, the difference can be calculated for each part individually, and a judgment on whether there is a difference can be made for each part. In this case, by comparing with the difference images corresponding to each part, it is possible to determine whether the detection results for each part are incorrect. In addition, multiple locations such as the right elbow and right wrist can be considered, for example, to determine whether the detection results related to the right hand are incorrect.
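A per-part judgment of this kind can be sketched in Python as follows. This is a minimal sketch: the Euclidean distance as the movement amount, the function names, the `REGIONS` mapping that groups the right elbow and right wrist under "right_hand", and the reference value are all assumptions made for illustration.

```python
import math

# Hypothetical grouping of parts into a larger region.
REGIONS = {"right_hand": ("right_elbow", "right_wrist")}


def per_part_movement(group1, group2):
    """Movement amount of each part present in both position groups."""
    return {
        part: math.hypot(group2[part][0] - group1[part][0],
                         group2[part][1] - group1[part][1])
        for part in group1.keys() & group2.keys()
    }


def parts_exceeding(group1, group2, reference):
    """Names of the parts whose movement amount exceeds the reference."""
    movement = per_part_movement(group1, group2)
    return sorted(part for part, amount in movement.items() if amount > reference)


def regions_exceeding(group1, group2, reference, regions=REGIONS):
    """Names of the regions containing at least one part over the reference."""
    flagged = set(parts_exceeding(group1, group2, reference))
    return sorted(name for name, parts in regions.items()
                  if flagged.intersection(parts))


previous = {"right_elbow": (50, 50), "right_wrist": (60, 80), "left_knee": (20, 120)}
current = {"right_elbow": (50, 50), "right_wrist": (90, 120), "left_knee": (20, 121)}
parts = parts_exceeding(previous, current, reference=10)
regions = regions_exceeding(previous, current, reference=10)
```

Only the right wrist moves by more than the reference here (by 50), so it is the only part flagged, and the region containing it is flagged as well.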
[0081] Alternatively, the first calculation unit may not calculate the consistency or difference between the first position group and the second position group, and the determination unit 133 may not determine whether the first position group and the second position group are different based on the calculation result of the first calculation unit. In this case, the first position group and the second position group may also be confirmed by human eye to determine whether the first position group and the second position group are different.
[0082] In other words, instead of the determination unit 133 performing the second group setting process, the frame image can be manually set to the second group by visual inspection. Alternatively, the differential data can be visually inspected, and the frame image can be manually set to the second group. In this case, the second group setting can be performed via external input from the mouse 122 or the keyboard 123.
[0083] The determination unit 133 outputs information related to the false detection by the position detection unit 132 based on the difference between the first position group and the second position group, and the consistency between the first frame image and the second frame image. Specifically, when the determination unit 133 determines that the first position group and the second position group are different based on the difference between them, and determines that the first frame image and the second frame image are consistent based on the consistency (or difference) between them, it outputs information that confirms the position detection unit 132 has falsely detected the second position group as information related to the false detection by the position detection unit 132. This is because although there is no human movement between the frames, the positions of multiple parts of the human body change as a detection result, thus the detection result is determined to be incorrect.
[0084] Furthermore, when the difference between the first position group and the second position group in at least one of the multiple body parts exceeds a predetermined benchmark, the determination unit 133 also outputs information that confirms the position detection unit 132 has misdetected the second position group as information related to the misdetection by the position detection unit 132. This is because, relative to the time from when the first frame image is captured to when the second frame image is captured, the position of the body part changes at a speed that is impossible for human movement, thus the detection result is determined to be incorrect. For example, only the wrist may change at an impossible speed, or the entire body (all parts) may move instantaneously. The difference between the first position group and the second position group can also be the amount of movement of each part; as long as the amount of movement in one or more parts exceeds a predetermined benchmark, the misdetection of the second position group is determined.
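The speed-based criterion in this paragraph can be sketched as a comparison of each part's movement, divided by the time between the two frames, against an upper limit. The function name, the pixel-per-second unit, and the limit of 200 are assumptions for illustration; the source only states that a predetermined benchmark is used.

```python
def parts_moving_too_fast(
    movement_px: dict[str, float],
    interval_s: float,
    max_speed_px_per_s: float,
) -> set[str]:
    """Return the parts whose apparent speed between two frames exceeds the limit.

    movement_px: per-part movement between the first and second position group.
    interval_s: time from capturing the first frame image to the second.
    max_speed_px_per_s: assumed benchmark for plausible human movement.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return {
        part
        for part, distance in movement_px.items()
        if distance / interval_s > max_speed_px_per_s
    }


movement = {"left_wrist": 300.0, "left_elbow": 20.0}
too_fast = parts_moving_too_fast(movement, interval_s=0.5, max_speed_px_per_s=200.0)
# The second position group is treated as misdetected if any part is too fast.
second_group_misdetected = bool(too_fast)
```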
[0085] The determination unit 133 excludes the falsely detected second position group from the output of the position detection unit 132 (discards the position detection result). Then, the determination unit 133 changes the falsely detected second position group to the same position information as the first position group. These processes are performed in the third group setting process and error elimination process. The following uses a flowchart for detailed explanation.
[0086] Figure 6 is a flowchart illustrating an example of the third group setting process. The determination unit 133 classifies a second frame image that is included in the second group but not in the first group into the third group, and determines that it is a frame image for which the position detection unit 132 misdetected the second position group.
[0087] When the third group setting process starts, in S41 the determination unit 133 sets the second frame image (the frame image acquired this time) that is included in the second group but not in the first group to the third group, and advances the process to S42. A frame image set to the third group is a frame image for which the position detection unit 132 misdetected the second position group.
[0088] In S42, when the difference between the previous detection result (first position group) and the current detection result (second position group) for at least one of the multiple parts exceeds a predetermined benchmark, the determination unit 133 sets the second frame image (the frame image acquired this time) to the third group, and ends the third group setting process.
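Steps S41 and S42 amount to a set operation over frame images. The sketch below assumes that frames are identified by integer indices, that the first group holds frames whose image differs from the preceding frame, and that the second group holds frames whose position group differs from the preceding one; the function and variable names are hypothetical.

```python
def build_third_group(
    first_group: set[int],
    second_group: set[int],
    over_benchmark: set[int],
) -> set[int]:
    """Frames for which the second position group is treated as misdetected."""
    # S41: positions changed (second group) although the image did not (not in first group).
    third_group = set(second_group) - set(first_group)
    # S42: at least one part moved by more than the predetermined benchmark.
    third_group |= set(over_benchmark)
    return third_group


first_group = {3, 7}        # frame images that differ from the preceding frame image
second_group = {3, 5, 7}    # frame images whose position group differs from the preceding one
over_benchmark = {7}        # frame images with an implausibly large movement of some part

third_group = build_third_group(first_group, second_group, over_benchmark)
```

Frame 3 is kept because both the image and the positions changed, frame 5 is flagged because only the positions changed, and frame 7 is flagged because of the benchmark even though the image also changed.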
[0089] Figure 7 is a flowchart illustrating an example of the error elimination process. When the error elimination process starts, in S51, if the determination unit 133 has set the currently acquired frame image (the second frame image) to the third group ("Yes" in S51), the process advances to S52.
[0090] In S51, if the determination unit 133 has not set the currently acquired frame image to the third group ("No" in S51), the error elimination process ends. In S52, the determination unit 133 excludes the falsely detected second position group (the current position detection result) from the output result of the position detection unit 132 and proceeds to S53.
[0091] In step S53, the determination unit 133 changes the falsely detected second position group (the current position detection result) to the same position information as the first position group (the previous position detection result), and ends the error elimination process. In other words, since the current position detection result is incorrect, the previous position detection result is used as the current position detection result.
[0092] For example, if one or more of the multiple body parts in the position detection result are determined to be different, the entire current position detection result is excluded. In this case, the current position detection result is excluded together with its correctly detected positions, and the previous position detection result is used instead. Alternatively, it is also possible to exclude from the current position detection result only the positions determined to be different. In this case, correctly detected positions are kept in the current position detection result; only the falsely detected positions are excluded, and the previous position detection result is used for them.
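The two replacement strategies in this paragraph (discard the whole current result, or replace only the parts judged to be different) can be sketched as below. The dictionary representation and the function names are assumptions for the example.

```python
Position = tuple[float, float]


def replace_whole(
    previous: dict[str, Position],
    current: dict[str, Position],
    differing: set[str],
) -> dict[str, Position]:
    """If any part is judged different, use the previous result for every part."""
    return dict(previous) if differing else dict(current)


def replace_parts(
    previous: dict[str, Position],
    current: dict[str, Position],
    differing: set[str],
) -> dict[str, Position]:
    """Keep correctly detected parts; use the previous result only for differing parts."""
    corrected = dict(current)
    for part in differing:
        if part in previous:
            corrected[part] = previous[part]
        else:
            # No earlier value to fall back on, so the part is simply excluded.
            corrected.pop(part, None)
    return corrected


previous = {"nose": (80.0, 50.0), "left_wrist": (200.0, 300.0)}
current = {"nose": (81.0, 50.0), "left_wrist": (200.0, 90.0)}
differing = {"left_wrist"}

whole = replace_whole(previous, current, differing)
partial = replace_parts(previous, current, differing)
```

The part-wise variant preserves the small, genuine change in the nose position, whereas the whole-frame variant reverts it along with the misdetected wrist.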
[0093] Alternatively, instead of the determination unit 133 performing the third group setting process, the frame image can be manually set to the third group by visually verifying the change in the difference or position detection results of the frame image. In this case, the third group setting can be performed via external input from the mouse 122 or the keyboard 123.
[0094] Furthermore, it is not necessary to perform both S52 and S53; only one of them may be performed. If only the process (S52) of excluding the falsely detected second position group (the current position detection result) from the output of the position detection unit 132 is performed, the current position detection result no longer exists. Therefore, sleeping posture detection based on the current position detection result cannot be performed. In this case, sleeping posture detection can be performed based solely on the correct position detection results obtained before and after it.
[0095] The sleeping posture detection unit 134 takes the position detection result as input, performs sleeping posture determination through the sleeping posture detection program 114, and outputs sleeping posture information (sleeping posture detection result). In the detection processing, steps S11 to S16 output position information (position detection result) of multiple parts of the human body or information related to false detections (false detections of the second position group). Furthermore, in step S17, based on the output position information of multiple parts and the output information related to false detections, sleeping posture information (sleeping posture detection result) is output. Specifically, the position information of the multiple parts, in which the falsely detected second position group has been excluded or changed based on the information related to false detections, is taken as input, and sleeping posture information is output.
[0096] Figure 8 illustrates an example of the input and output data of the detection process. The acquisition unit 131 acquires frame images contained in an animation of a sleeping person. As shown in Figure 8, the position detection unit 132 inputs the information obtained from the frame image into the position detection learning model 112, and outputs the position information (position group) of multiple parts of the human body from the learning model as the position detection result.
[0097] The position detection unit 132 outputs the positions of n parts of the human body as the position detection result. In this embodiment, n=23. Hereinafter, the positions of the n parts are labeled position A1, position A2, position A3, ..., position An. These positions indicate the position of each joint or a specific position of the face. For example, the positions of each joint indicate the positions of joints such as the left and right knees or elbows. Specific positions of the face indicate the positions of the eyes, nose, or ears.
[0098] The positions of body parts are determined by coordinates (X, Y) that can be used to determine the positions for display on monitor 121. These coordinates correspond to the positions of body parts in the frame image. For example, coordinates that can determine the position of the right elbow are detected, corresponding to the right elbow in the frame image. An image that can determine these positions and an image of lines connecting the positions are generated as a position image. In monitor 121, the position image is displayed as an overlay on the front side of the image of the person in the frame image.
[0099] For example, the position detection unit 132 outputs the coordinates (Xa1, Ya1) of position A1 as a position detection result. The position detection unit 132 outputs the coordinates (Xa2, Ya2) of position A2 as a position detection result. The position detection unit 132 outputs the coordinates (Xa3, Ya3) of position A3 as a position detection result. The position detection unit 132 outputs the coordinates (Xan, Yan) of position An as a position detection result.
[0100] The determination unit 133 performs error elimination processing. During error elimination processing, if the position detection result is determined to be incorrect (it is determined that the second position group was mistakenly detected), the determination unit 133 corrects the position detection result. Thus, the position detection result is changed to the corrected position detection result. If the position detection result is not determined to be incorrect during error elimination processing, the determination unit 133 does not change the position detection result.
[0101] For example, the position detection unit 132 outputs the coordinates (Xb1, Yb1) of position A1 as a position detection result (correction). The position detection unit 132 outputs the coordinates (Xb2, Yb2) of position A2 as a position detection result (correction). The position detection unit 132 outputs the coordinates (Xb3, Yb3) of position A3 as a position detection result (correction). The position detection unit 132 outputs the coordinates (Xbn, Ybn) of position An as a position detection result (correction).
[0102] For example, in frame image A of Figure 12, described below, the determination unit 133 did not determine that the position detection result was incorrect, so the coordinates identifying positions A1 to An were not changed. In frame image B of Figure 12, the coordinates identifying all positions (positions A1 to An) were changed. In frame image B of Figure 13, the coordinates identifying the right elbow and right wrist were changed.
[0103] If the position detection result is not determined to be incorrect, the sleeping posture detection unit 134 takes the position detection result as input and outputs sleeping posture information (sleeping posture detection result). If the position detection result is determined to be incorrect, the sleeping posture detection unit 134 takes the position detection result (correction) as input and outputs sleeping posture information (sleeping posture detection result).
[0104] The output sleeping posture detection result is the sleeping position (sleeping posture) of people such as supine, prone, facing right, and facing left. Alternatively, the sleeping posture detection unit 134 can detect movements such as turning over and cramping during sleep based on the detection results of multiple consecutive positions, and output them as the sleeping posture detection result.
[0105] The output sleep posture detection results can be in any form if they can confirm a person's health status. For example, the output could be information indicating sleep symptoms such as frequent shaking of the body, turning over, sudden twitching of a part of the body, leg shaking, or sleeping in an unnatural posture.
[0106] The sleeping posture detection unit 134 outputs any one of the results Z1 to Zm as the sleeping posture detection result. For example, result Z1 is "supine". Result Z2 is "prone". Result Z3 is "facing right". Result Z4 is "facing left". Result Z5 is "rolling over". Result Z6 is "spasm".
[0107] In this embodiment, the display 121 displays text that indicates the sleeping posture detection result. For example, the display 121 may show an animation (frame image) of a person sleeping on their back, along with the text "supine". Alternatively, the display 121 may show an animation (frame image) of a person in a state of spasm, along with the text "spasm". Thus, by observing the display 121, the posture or movement of the sleeping person can be confirmed.
[0108] Furthermore, in this embodiment, the determination unit 133 treats the current position detection result as erroneous and corrects it, based on the difference between the first position group and the second position group, and the consistency between the first frame image and the second frame image. That is, even though the human body has not moved, if it is determined that the detection result differs between frames, the position detection result is judged to be incorrect and is corrected. Pose detection techniques (pose estimation techniques) such as OpenPose output the detection result based on a single frame. In contrast, this embodiment improves the detection accuracy of position detection by considering the difference between the images of successive frames. In this embodiment, special attention is paid to the state of a sleeping person, in which body movement is minimal. Thus, in this embodiment, the detection accuracy of the position detection result is improved in a state where differences between frames are unlikely to occur. Therefore, the accuracy of sleeping posture detection can also be improved at the same time, and the health status of a person while sleeping can be appropriately confirmed.
[0109] [Example of detection result display]
[0110] Figures 9 to 14 are diagrams showing examples of the display of detection results. Figure 9 shows an example of the display of position detection results and sleeping posture detection results when the person is sleeping on their back. As shown in Figure 9, a frame image captured by the camera 124 is displayed on the monitor 121. The person being filmed is asleep lying on their back, and the image (frame image) of the person lying on their back is displayed on the monitor 121.
[0111] The position detection unit 132 outputs the positions of n (23 in this embodiment) parts of the human body as position detection results. If an error is determined in the position detection result, the position detection result is corrected. The position image generated based on the position detection result is superimposed on the front side of the image of the person in the frame image and displayed on the display 121.
[0112] The sleeping posture detection unit 134 takes the position detection result as input and outputs sleeping posture information as the sleeping posture detection result. The display 121 displays text that confirms the sleeping posture detection result.
[0113] In the example of Figure 9, on the display 121, the position image is displayed superimposed on the front side of the image of a person sleeping on their back. For example, images that identify the right ear, right eye, nose, left eye, and left ear are displayed as position images, together with lines connecting them. Furthermore, images that identify the right shoulder, right elbow, and right wrist are displayed as position images, together with lines connecting them. Because the positions are detected correctly, the position images are displayed at positions corresponding to the parts of the human body in the frame image.
[0114] In the upper right corner of the display 121, a text image showing "supine" is displayed as the sleeping posture detection result.
[0115] Figure 10 shows an example of the display of position detection results and sleeping posture detection results when the person is sleeping facing right. As shown in Figure 10, a frame image captured by camera 124 is displayed on monitor 121. The person being filmed is sleeping facing right, and an image (frame image) of the person sleeping facing right is displayed on monitor 121.
[0116] In the example of Figure 10, on display 121, a position image is displayed as a position detection result, superimposed on the image of a person sleeping facing right. Furthermore, in the upper right corner of display 121, the text image "facing right" is displayed as the sleeping posture detection result.
[0117] In the example of Figure 10, because the positions are detected correctly, the position image is displayed at positions corresponding to the parts of the human body in the frame image. Since the right hand is hidden in the image, the positions of the right elbow and right wrist are not output as detection results.
[0118] Figure 11 shows an example of the display of detection results when the left hand of a person sleeping on their back moves gradually. In Figure 11, images are acquired sequentially in time series, in the order of frame image A, frame image B, and frame image C. The same applies to Figures 12 to 14 below. As in Figure 9, as shown in Figure 11, the image of a person sleeping on their back, the position image, and the text image "supine" are displayed on the monitor 121 as frame image A.
[0119] In the next acquired frame image B, assume the person's left hand moves upwards. In this case, the image of a person sleeping on their back with their left hand moved upwards is displayed on monitor 121. Because the left hand has moved, a difference image that identifies the movement of the left hand is generated.
[0120] For example, as the difference image, an image is displayed in which regions that changed between frames are shown in a specific color (e.g., yellow). In frame image B, the region corresponding to the left hand is shown in yellow. Furthermore, a position image and the text image "supine" are displayed.
[0121] In the next acquired frame image C, suppose the person's left hand moves further upwards. In this case, the display 121 shows the left hand moved further upwards. A difference image is generated based on the movement of the left hand, and the region corresponding to the left hand is shown in yellow. A position image and the text image "supine" are also displayed.
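A difference image of the kind described for Figure 11 can be sketched by comparing two frames pixel by pixel and coloring the changed pixels. The sketch assumes grayscale frames stored as nested lists of integers and an arbitrary change threshold; the source does not specify how the difference image is computed, and the function names are hypothetical.

```python
YELLOW = (255, 255, 0)


def difference_overlay(prev_gray, curr_gray, threshold):
    """Return an RGB image in which pixels that changed by more than threshold are yellow."""
    overlay = []
    for prev_row, curr_row in zip(prev_gray, curr_gray):
        row = []
        for p, c in zip(prev_row, curr_row):
            # Unchanged pixels keep their gray level; changed pixels are highlighted.
            row.append(YELLOW if abs(c - p) > threshold else (c, c, c))
        overlay.append(row)
    return overlay


def changed_fraction(prev_gray, curr_gray, threshold):
    """Fraction of pixels that changed, usable as a simple frame-consistency measure."""
    total = sum(len(row) for row in curr_gray)
    changed = sum(
        1
        for prev_row, curr_row in zip(prev_gray, curr_gray)
        for p, c in zip(prev_row, curr_row)
        if abs(c - p) > threshold
    )
    return changed / total if total else 0.0


frame_a = [[10, 10], [10, 10]]
frame_b = [[10, 200], [10, 12]]

overlay = difference_overlay(frame_a, frame_b, threshold=20)
fraction = changed_fraction(frame_a, frame_b, threshold=20)
```

A small changed fraction would correspond to two frame images being judged consistent, although the actual consistency measure used by the second calculation unit is not given in this section.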
[0122] Figure 12 shows an example in which the person does not move in frame images A to C, but the position detection result is incorrect in frame image B. As in Figure 9, as shown in Figure 12, the display 121 shows an image of a person sleeping on their back, a position image, and the text image "supine" as frame image A.
[0123] In the next acquired frame image B, although the person has not moved, the position detection result is incorrect, and the position image is displayed on monitor 121 shifted significantly toward the right side of the screen. The position detection result is therefore determined to be incorrect and is corrected. In the correction, the position detection result corresponding to the previous frame image A is used. Thus, the corrected position detection result is displayed correctly.
[0124] Figure 13 shows an example in which the person moves in frame images A to C, and the position detection result in frame image B is determined to be incorrect. As in Figure 9, as shown in Figure 13, the image of a person sleeping on their back, the position image, and the text image "supine" are displayed on the monitor 121 as frame image A.
[0125] In the next acquired frame image B, a differential image is displayed that can determine the movement of the left hand due to a slight movement of the left hand. However, the position detection results for the left wrist and left elbow are incorrect, displaying a position image indicating a large movement of the left hand. Furthermore, relative to the time from when frame A was captured to when frame B was captured, the left hand moves at a speed impossible for a human to move.
[0126] For example, it's impossible for the left hand to move from bottom to top in less than one second. In this case, when comparing frame image A and frame image B, the position detection result is determined to be incorrect because the change in the position of the left wrist or left elbow exceeds a predefined reference value.
[0127] If the position detection result is determined to be incorrect, it is corrected. By correcting the position detection result, the position detection result corresponding to the previous frame image A is used. Thus, the corrected position detection result is displayed correctly.
[0128] Figure 14 shows an example in which the person does not move in frame images A to C, but some of the positions cannot be detected in frame image B. As in Figure 9, as shown in Figure 14, the display 121 shows an image of a person sleeping on their back, a position image, and the text image "supine" as frame image A.
[0129] Suppose all positions are detected in frame image A. However, suppose that in the next acquired frame image B, although the person has not moved, some positions on the left side cannot be detected. In this case as well, the position detection results corresponding to the previous frame image A can be used. Thus, the corrected position detection results are displayed correctly.
[0130] [Regarding error elimination processing]
[0131] Figure 15 is a flowchart illustrating an example of the error elimination process in a variation of this embodiment. In this embodiment, the determination unit 133 changes the falsely detected second position group to the same position information as the first position group. However, it is not limited to this; the determination unit 133 may also change the falsely detected second position group based on multiple position groups output by the position detection unit 132 before and after outputting the second position group (the current position detection result).
[0132] Specifically, if the error elimination process of the modified example is started, in S61, if the determination unit 133 sets the frame image (the second frame image) acquired this time to the third group ("Yes" in S61), the process is advanced to S62.
[0133] In S61, if the determination unit 133 has not set the currently acquired frame image (the second frame image) to the third group ("No" in S61), the error elimination process ends. In S62, the determination unit 133 excludes the falsely detected second position group (the current position detection result) from the output result of the position detection unit 132 and proceeds to S63.
[0134] In S63, the determination unit 133 modifies the falsely detected second position group based on the multiple position groups output by the position detection unit 132 before and after outputting the second position group (the current position detection result). The modification of the position detection result can be performed according to the position of each detected part of the human body.
[0135] For example, the current position detection result can be calculated by linear interpolation between the previous and next position detection results. In this case, the current position detection result cannot be corrected until the next position detection result is output, so the position image is displayed on display 121 with a one-frame delay. However, if position detection is performed not in real time but on a pre-captured animation, the display does not need to be delayed by one frame.
[0136] Furthermore, without being limited to this, the current position detection result can also be calculated by linear extrapolation from the previous position detection result and the one before it. Additionally, the current position detection result can be corrected based on three or more position detection results.
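The two linear corrections mentioned above can be sketched per part as follows. The sketch assumes equally spaced frames, so that the interpolated value is the midpoint of the neighboring results and the extrapolated value continues the last observed step; the function names and sample coordinates are hypothetical.

```python
Position = tuple[float, float]


def interpolate(previous: dict[str, Position], following: dict[str, Position]) -> dict[str, Position]:
    """Estimate the current positions as the midpoint of the previous and next results."""
    return {
        part: (
            (previous[part][0] + following[part][0]) / 2,
            (previous[part][1] + following[part][1]) / 2,
        )
        for part in previous
        if part in following
    }


def extrapolate(before_previous: dict[str, Position], previous: dict[str, Position]) -> dict[str, Position]:
    """Estimate the current positions by continuing the step between the two preceding results."""
    return {
        part: (
            2 * previous[part][0] - before_previous[part][0],
            2 * previous[part][1] - before_previous[part][1],
        )
        for part in previous
        if part in before_previous
    }


two_back = {"left_wrist": (90.0, 220.0)}
one_back = {"left_wrist": (100.0, 200.0)}
one_ahead = {"left_wrist": (110.0, 180.0)}

from_neighbors = interpolate(one_back, one_ahead)
from_history = extrapolate(two_back, one_back)
```

Interpolation needs the next result and therefore introduces the one-frame display delay noted above, while extrapolation uses only past results and can be applied in real time.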
[0137] [Aspects]
[0138] Those skilled in the art will understand that the above embodiments and their variations are specific examples of the following aspects.
[0139] (Item 1) A posture detection device includes an acquisition unit, a position detection unit, and a determination unit. The acquisition unit acquires frame images sequentially in a time series. The position detection unit inputs information obtained from the frame images into a learned model and outputs position information of multiple parts of the human body from the learned model. The determination unit outputs information related to false detections by the position detection unit. The position detection unit outputs a first position group when the first frame image acquired by the acquisition unit is used as input, and outputs a second position group when the second frame image acquired by the acquisition unit, which is different from the first frame image, is used as input. The determination unit outputs information related to false detections by the position detection unit based on the difference between the first and second position groups and the consistency between the first and second frame images.
[0140] Based on this configuration, it is possible to identify false detections of positional information for multiple parts of the human body. This allows for the removal of falsely detected positional information, thus preventing the detection of movements that did not actually occur.
[0141] (Item 2) The posture detection apparatus described in Item 1 further includes a first calculation unit and a second calculation unit. The first calculation unit calculates the difference between a first position group and a second position group. The second calculation unit calculates the consistency between a first frame image and a second frame image. Based on the difference between the first position group and the second position group calculated by the first calculation unit and the consistency between the first frame image and the second frame image calculated by the second calculation unit, the determination unit outputs information related to false detections by the position detection unit.
[0142] With this configuration, since the first calculation unit calculates the difference between the first position group and the second position group, the difference between the first position group and the second position group can be determined with high accuracy compared to the case of visual confirmation. Furthermore, since the consistency between the first frame image and the second frame image is calculated, the consistency between the first frame image and the second frame image can be determined with high accuracy. As a result, information related to false detections by the position detection unit can be output with high accuracy.
[0143] (Item 3) In the posture detection device described in Item 1 or 2, when the determination unit determines that the first position group and the second position group are different based on the difference degree between the first position group and the second position group, and determines that the first frame image and the second frame image are consistent based on the consistency degree between the first frame image and the second frame image, the determination unit outputs information that can determine that the position detection unit has misdetected the second position group as information related to the misdetection of the position detection unit.
[0144] Based on this configuration, false detections in the second position group can be identified. Therefore, false detections in the second position group can be eliminated, thus preventing the detection of human movements that did not actually occur.
[0145] (Item 4) In the posture detection device described in Item 3, when the difference between the first position group and the second position group for at least one of the multiple parts exceeds a predetermined reference, the determination unit also outputs information that can determine that the position detection unit has misdetected the second position group as information related to the misdetection of the position detection unit.
[0146] Based on this configuration, it is possible to identify false detections of positional information for multiple parts of the human body under extreme positional changes. Therefore, falsely detected positional information can be eliminated, thus preventing the detection of human movements that did not actually occur.
[0147] (Item 5) In the posture detection device described in Item 3 or 4, the determination unit excludes the falsely detected second position group from the output result of the position detection unit.
[0148] Based on this configuration, it is possible to identify false detections in the second position group and remove false detections in the second position group, thus preventing the detection of human movements that did not actually occur.
[0149] (Item 6) In the posture detection device described in Item 3 or 4, the determination unit changes the falsely detected second position group to the same position information as the first position group.
[0150] Based on this configuration, it is possible to identify false detections in the second position group and replace the falsely detected second position group with the correctly detected position group, thus preventing the detection of human movements that did not actually occur.
[0151] (Item 7) In the posture detection device described in Item 3 or 4, the determination unit changes the falsely detected second position group based on the multiple position groups output by the position detection unit before and after outputting the second position group.
[0152] Based on this configuration, it is possible to identify false detections in the second position group and to modify the falsely detected second position group based on the correctly detected position group, thus preventing the detection of human movements that did not actually occur.
[0153] (Item 8) In the posture detection device described in any one of Items 1 to 7, the determination unit classifies the second frame image into a first group based on the difference between the first frame image and the second frame image. The determination unit classifies the second frame image that was input to the position detection unit to output the second position group into a second group based on the difference between the first position group and the second position group. The determination unit determines a second frame image that is included in the second group but not in the first group to be a frame image for which the position detection unit misdetected the second position group.
[0154] With this configuration, since the determination unit calculates the difference between the first position group and the second position group, and calculates the consistency between the first frame image and the second frame image, it can determine the difference between the first position group and the second position group, and the consistency between the first frame image and the second frame image with high accuracy compared to the case of confirmation by the human eye.
[0155] (Item 9) A posture detection method includes: a step of acquiring frame images sequentially in time series; a step of inputting information obtained from the frame images into a learned model and outputting position information of multiple parts of the human body from the learned model; and a step of outputting information related to false detection for the output position information of the multiple parts. The step of outputting the position information of the multiple parts includes: a step of outputting a first position group when the acquired first frame image is used as input; and a step of outputting a second position group when the acquired second frame image, which is different from the first frame image, is used as input. In the step of outputting information related to false detection, the information is output based on the difference between the first position group and the second position group and the consistency between the first frame image and the second frame image.
[0156] Based on this configuration, it is possible to identify false detections of positional information for multiple parts of the human body. This allows for the removal of falsely detected positional information, thus preventing the detection of movements that did not actually occur.
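The decision step of this method can be sketched as below. The disclosure does not fix the metrics in this passage, so the sketch assumes the position difference is the largest per-part Euclidean distance and the frame consistency is a mean absolute pixel difference on grayscale frames. Both thresholds and all function names are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
PositionGroup = List[Point]
Frame = List[List[int]]          # grayscale image as rows of pixel values (assumed)


def position_difference(first: PositionGroup, second: PositionGroup) -> float:
    """Largest distance moved by any single body part between two groups."""
    return max(math.dist(a, b) for a, b in zip(first, second))


def frames_consistent(
    first: Frame, second: Frame, max_mean_abs_diff: float = 2.0
) -> bool:
    """Treat two frames as consistent if their mean absolute pixel difference is small."""
    total = 0
    count = 0
    for row_a, row_b in zip(first, second):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count <= max_mean_abs_diff


def is_false_detection(
    first_frame: Frame,
    second_frame: Frame,
    first_group: PositionGroup,
    second_group: PositionGroup,
    position_threshold: float = 20.0,
) -> bool:
    """Flag the second group when positions differ but the frames are consistent."""
    positions_differ = position_difference(first_group, second_group) > position_threshold
    return positions_differ and frames_consistent(first_frame, second_frame)


frame_a = [[10, 10], [10, 10]]
frame_b = [[10, 11], [10, 10]]          # nearly identical to frame_a
frame_c = [[200, 200], [200, 200]]      # clearly different scene content
group_a = [(5.0, 5.0), (20.0, 20.0)]
group_b = [(5.0, 5.0), (60.0, 50.0)]    # second part jumps by 50 pixels

flag_static = is_false_detection(frame_a, frame_b, group_a, group_b)
flag_moving = is_false_detection(frame_a, frame_c, group_a, group_b)
```

With an almost unchanged image the 50-pixel jump is flagged as a false detection, whereas the same jump accompanied by a large image change is accepted as real movement.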
[0157] (Item 10) A sleeping posture determination method includes: the posture detection method described in Item 9; and the step of outputting sleeping posture information based on the output position information of the multiple body parts and the output information related to false detection.
[0158] Based on this configuration, sleeping posture information can be output based on the position information of multiple parts of the human body that has been correctly detected, which improves the accuracy of the output sleeping posture information.
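How the false detection information feeds the sleeping posture output can be sketched as a filter in front of a classifier. This passage does not specify how a posture is derived from the positions, so the classifier is passed in by the caller. The toy rule used in the example, the labels, and all names are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Point = Tuple[float, float]
PositionGroup = List[Point]


def sleeping_postures(
    groups: List[PositionGroup],
    false_flags: List[bool],
    classify: Callable[[PositionGroup], str],
) -> List[Optional[str]]:
    """Classify only the groups that were not flagged as false detections.

    Flagged frames yield None so that the result stays aligned with the
    time series of frames.
    """
    return [
        None if is_false else classify(group)
        for group, is_false in zip(groups, false_flags)
    ]


def toy_classifier(group: PositionGroup) -> str:
    # Purely illustrative rule: compare the x coordinates of two parts.
    return "left" if group[0][0] < group[1][0] else "right"


groups = [
    [(10.0, 0.0), (20.0, 0.0)],
    [(90.0, 0.0), (20.0, 0.0)],   # flagged as a false detection
    [(11.0, 0.0), (20.0, 0.0)],
]
flags = [False, True, False]
postures = sleeping_postures(groups, flags, toy_classifier)
```

Because the flagged group is skipped, the spurious "right" label that it would have produced never reaches the sleeping posture output.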
[0159] The embodiments disclosed herein should be considered illustrative and not restrictive in all respects. The scope of the invention is defined not by the description of the above embodiments but by the claims, and is intended to include all modifications within the meaning and scope equivalent to the claims.
[0160] Explanation of reference numerals in the attached figures
[0161] 100 posture detection device
[0162] 101 Hard Drive
[0163] 102 CPU
[0164] 103 Memory
[0165] 104 Display Interface
[0166] 105 Peripheral device interface
[0167] 106 Camera Interface
[0168] 111 OS
[0169] 112 Learned model for position detection
[0170] 113 Troubleshooting Procedure
[0171] 114 Sleeping Posture Detection Program
[0172] 121 Monitor
[0173] 122 Mouse
[0174] 123 Keyboard
[0175] 124 Camera
[0176] 131 Acquisition unit
[0177] 132 Position detection unit
[0178] 133 Determination unit
[0179] 134 Sleeping posture detection unit
Claims
1. A posture detection device, characterized by comprising:
an acquisition unit that acquires frame images sequentially in time series;
a position detection unit that inputs information obtained from the frame images into a learned model and outputs position information of a plurality of parts of the human body from the learned model; and
a determination unit that outputs information related to false detection by the position detection unit, wherein
the position detection unit outputs a first position group as the position information of the plurality of parts when a first frame image acquired by the acquisition unit is used as input,
the position detection unit outputs a second position group as the position information of the plurality of parts when a second frame image acquired by the acquisition unit, which is different from the first frame image, is used as input,
the determination unit outputs information related to false detection of the second position group based on the difference between the first position group and the second position group and the consistency between the first frame image and the second frame image,
when the determination unit determines, based on the difference between the first position group and the second position group, that the two position groups differ, and determines, based on the consistency between the first frame image and the second frame image, that the two frame images are consistent, the determination unit outputs, as the information related to false detection by the position detection unit, information identifying that the position detection unit has falsely detected the second position group, and
the determination unit excludes the falsely detected second position group from the output of the position detection unit.
2. The posture detection device according to claim 1, characterized by further comprising:
a first calculation unit that calculates the difference between the first position group and the second position group; and
a second calculation unit that calculates the consistency between the first frame image and the second frame image, wherein
the determination unit outputs the information related to false detection by the position detection unit based on the difference between the first position group and the second position group calculated by the first calculation unit and the consistency between the first frame image and the second frame image calculated by the second calculation unit.
3. The posture detection device according to claim 1, characterized in that, when the difference between the first position group and the second position group for at least one of the plurality of parts exceeds a predetermined reference, the determination unit also outputs, as the information related to false detection by the position detection unit, information identifying that the position detection unit has falsely detected the second position group.
4. The posture detection device according to claim 1, characterized in that the determination unit changes the falsely detected second position group to the same position information as the first position group.
5. The posture detection device according to claim 1, characterized in that the determination unit changes the falsely detected second position group based on a plurality of position groups output by the position detection unit before and after the second position group is output.
6. The posture detection device according to claim 1, characterized in that the determination unit:
classifies second frame images into a first group based on the difference between the first frame image and the second frame image;
classifies second frame images that were input to the position detection unit to output the second position group into a second group based on the difference between the first position group and the second position group; and
determines a second frame image that is included in the second group but not in the first group to be a second frame image for which the position detection unit has falsely detected the second position group.
7. A posture detection method, characterized by comprising:
a step of acquiring frame images sequentially in time series;
a step of inputting information obtained from the frame images into a learned model and outputting position information of a plurality of parts of the human body from the learned model; and
a step of outputting information related to false detection for the output position information of the plurality of parts, wherein
the step of outputting the position information of the plurality of parts includes:
a step of outputting a first position group as the position information of the plurality of parts when an acquired first frame image is used as input; and
a step of outputting a second position group as the position information of the plurality of parts when an acquired second frame image, which is different from the first frame image, is used as input,
in the step of outputting information related to false detection, information identifying false detection of the second position group is output as the information related to false detection, based on the difference between the first position group and the second position group and the consistency between the first frame image and the second frame image,
when the first position group and the second position group are determined to differ based on the difference between them, and the first frame image and the second frame image are determined to be consistent based on the consistency between them, information identifying false detection of the second position group is output as the information related to false detection, and
the falsely detected second position group is excluded from the output of the step of outputting the position information of the plurality of parts.
8. A sleeping posture determination method, characterized by comprising:
the posture detection method according to claim 7; and
a step of outputting sleeping posture information based on the output position information of the plurality of parts and the output information related to false detection.