Intelligent interactive projection method and interactive projection device

By cropping and extracting user action features in the interactive projection method and combining them with historical interaction records for matching and judgment, the problem of low action recognition accuracy in the existing technology is solved, and more efficient user interaction is achieved.

CN120340067B · Active · Publication Date: 2026-06-26 · NANJING ARTS INST

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: NANJING ARTS INST
Filing Date: 2025-04-01
Publication Date: 2026-06-26

IPC: G06F3/01; G06V40/10; G06V40/20; G06V10/26; G06V10/74; G06V10/75

CPC: G06V40/113; G06V40/28; G06V10/273; G06V10/761; G06V10/752; G06F3/017

AI Tagging

Technology Topics

Feature extraction; Vision algorithms

AI Technical Summary

Technical Problem

In existing interactive projection methods, traditional action recognition algorithms struggle to accurately extract and identify key features of user actions, resulting in low recognition accuracy, misjudgments, and inaccurate matching, which negatively impacts the overall interactive experience.

Method used

By capturing user actions through a camera, analyzing and recognizing static or real-time action signals using computer vision algorithms, cropping images and extracting feature actions, and combining historical interaction records for matching and judgment, the action classification and matching mechanism is optimized.

Benefits of technology

It improves the accuracy and flexibility of action recognition, reduces false positives, and enhances the accuracy and efficiency of interaction.

✦ Generated by Eureka AI based on patent content.

Figure CN120340067B_ABST

Abstract

The application discloses an intelligent interactive projection method and an interactive projection device, and relates to the technical field of interactive projection. The application addresses the technical problem that low recognition accuracy prevents the algorithm from accurately detecting the key features of user actions, which degrades the interaction. For static motion signals, the application generates the signal in the motion signal generation stage by detecting the detailed features of user gestures and the multi-dimensional information of specific objects; compared with some simple motion detection methods, it can recognize static actions more accurately. For real-time motion signals, the application improves the accuracy of real-time motion recognition by comprehensively capturing motion features, and processes image data efficiently using computer vision algorithms and signal processing technologies. During real-time motion signal processing, the application meets real-time requirements by quickly preprocessing images, extracting features, and processing in segments, which keeps the interaction smooth.

Description

Technical Field

[0001] This invention relates to the field of interactive projection technology, specifically to an intelligent interactive projection method and an interactive projection device.

Background Technology

[0002] With the development of the intelligent era, human-computer interaction technology has become a research hotspot. Action recognition technology based on computer vision has emerged. It captures user actions through cameras and uses computer vision algorithms to analyze and recognize the actions, thereby generating corresponding signals to realize human-computer interaction. This technology has a wide range of applications in smart homes, smart education, smart medical rehabilitation and other fields, and can provide users with a more convenient, efficient and natural interactive experience.

[0003] According to patent application CN106774827B, a projection interaction method, a projection interaction device, and a smart terminal are disclosed. The method includes: receiving a gesture image captured by a camera during projection; acquiring a projection image at the same moment as the gesture image and enlarging the projection image to the same size as the gesture image; comparing the enlarged projection image with the gesture image and cropping different regions from the two images as target regions; performing static gesture recognition on the target region, extracting the gestures within the target region, matching the gestures with a preset gesture template to obtain the corresponding instructions; and using the obtained instructions to control the projection process.

[0004] However, in some existing interactive projection methods, traditional action recognition algorithms have difficulty accurately extracting and recognizing user actions, resulting in low recognition accuracy. The algorithm cannot reliably detect the key features of user actions, which leads to misjudgments. Furthermore, when the recognized actions are matched against the preset action library, inaccurate matches or failures to match are likely to occur, further degrading the overall interaction.

Summary of the Invention

[0005] To address the shortcomings of existing technologies, this invention provides an intelligent interactive projection method and an interactive projection device, which solve the problem that low recognition accuracy prevents the algorithm from accurately detecting the key features of user actions and thereby degrades the interaction.

[0006] To achieve the above objectives, the present invention provides the following technical solution: an intelligent interactive projection method, which specifically includes the following steps:

[0007] The camera captures user movements, and computer vision algorithms are used to analyze and identify them to generate static or real-time motion signals.

[0008] The generated static motion signal is analyzed: the user motion image is cropped to obtain a cropped image, and feature extraction is performed on it to obtain the feature action. The feature action is then compared with the gesture images to generate either a matching signal or a mismatch signal.

[0009] By comparing and analyzing the matching signals, the matching results of the gesture images are determined to generate interactive information. By comparing and analyzing the non-matching signals, the gesture image with the highest similarity is selected as the pre-selected action. Combined with historical interaction records, the standard matching result is determined to generate interactive information.

[0010] The real-time motion signal is analyzed by preprocessing the user motion image to obtain a preprocessed image and extracting features to obtain user motion features. At the same time, the motion is segmented according to the motion node to obtain segmented motion. Pre-selected pixels are selected based on the pixel values of the segmented motion pixels and used as the standard to generate user motion features. These are then combined to obtain preprocessed motion. Finally, the preprocessed motion is compared with the gesture image to generate interactive information or secondary analysis signals.

[0011] The generated secondary analysis signal is processed to match the segmented actions with the gesture images to obtain matching results. At the same time, the results to be analyzed are filtered according to the proportion of the number of segmented actions, and the interaction criteria are determined by combining historical interaction records to generate interaction information.

[0012] As a further aspect of the present invention, the specific method for generating static or real-time action signals is as follows:

[0013] The system acquires a sequence of motion images and uses computer vision algorithms to identify the image sequence. If the user's action is identified as a gesture in a relatively static state, a static motion signal is generated; otherwise, if the user's action is identified as a gesture generated during movement, a real-time motion signal is generated.

[0014] As a further aspect of the present invention, the specific method for analyzing the generated static action signal is as follows:

[0015] The user action image is acquired and cropped to obtain a cropped image. At the same time, the features of the user action in the cropped image are extracted to obtain the feature action. The feature action is then compared and matched with the gesture image.

[0016] If the feature action exists in the gesture image, a matching signal is generated; otherwise, if the feature action does not exist in the gesture image, a mismatch signal is generated, and both are analyzed separately.

[0017] As a further aspect of the present invention, the specific method for analyzing the matching signals and mismatch signals is as follows:

[0018] By comparing and analyzing the matching signals, the corresponding matching results in the gesture image are obtained, and the interaction information is generated based on the matching results.

[0019] By comparing the mismatch signals, the similarity between the feature action and the gesture image is calculated. The gesture image with the highest similarity is selected as the pre-selected action. Then, the historical interaction records are obtained, and the relationship between the pre-selected action and the historical interaction records is determined.

[0020] If a pre-selected action exists in the historical interaction record, the interaction will be performed based on the pre-selected action to generate interaction information; otherwise, if no pre-selected action exists in the historical interaction record, secondary interaction information will be generated.

[0021] As a further aspect of the present invention, the specific method for analyzing real-time action signals is as follows:

[0022] The system acquires user action images corresponding to real-time actions, performs grayscale conversion, noise reduction, and normalization to obtain preprocessed images, extracts features from the preprocessed images as user action features, segments them according to action nodes to obtain segmented actions, and filters pre-selected pixels based on the pixel values of the segmented action pixels.

[0023] As a further aspect of the present invention, the specific method for obtaining the pre-selected pixels through screening is as follows:

[0024] The system acquires the motion images corresponding to the segmented actions and performs single-frame segmentation to obtain single-frame images. For the pixels of user action features and their corresponding pixel values in the single-frame images, the obtained pixel values are compared with preset values. The specific values of the preset values are set by the operator. Pixels whose pixel values meet the preset values are recorded as pre-selected pixels, while pixels that do not meet the preset values are removed.

[0025] As a further aspect of the present invention, the specific method for obtaining the preprocessing action to generate interactive information or secondary analysis signal is as follows:

[0026] Preprocessing features are generated based on the pre-selected pixels, and the same procedure is repeated until preprocessing features have been obtained for all segmented actions. The preprocessing features are then combined to obtain the preprocessed action.

[0027] The preprocessed action is compared and analyzed with the gesture image. If the preprocessed action exists in the gesture image, the corresponding matching image is obtained and interaction information is generated. Otherwise, if the preprocessed action does not exist in the gesture image, a secondary analysis signal is generated.

[0028] As a further aspect of the present invention, the specific method for processing the generated secondary analysis signal is as follows:

[0029] All segmented actions are acquired and matched with gesture images to obtain corresponding matching results. At the same time, the proportion of segmented actions in the matching results is calculated and compared with the filtering threshold. Matching results with a proportion greater than the filtering threshold are recorded as results to be analyzed.

[0030] Obtain the historical interaction records corresponding to the results to be analyzed, and at the same time obtain the number of interactions corresponding to the results to be analyzed. Sort them in descending order of the number of interactions, calculate the similarity between the gesture images and real-time actions in the results to be analyzed, and select the results to be analyzed with the highest similarity as the interaction standard to generate interaction information.

[0031] An intelligent interactive projection device includes a camera, a data processing center, and a projection module, wherein the data processing center includes an image processing module, a motion analysis module, and an interactive information generation module.

[0032] The camera is used, after being turned on, to capture images of user actions between the camera and the projection surface and send them to the data processing center.

[0033] The data processing center is used to analyze the acquired user action images, identify user actions to generate static or real-time action signals, and transmit the generated signals to the motion analysis module.

[0034] The motion analysis module analyzes the acquired static or real-time motion signals. For static motion signal analysis, it crops the user's motion image to obtain a cropped image, extracts features to obtain feature actions, and compares it with the gesture image to generate a matching or mismatch signal. For matching signals, it analyzes the matching result of the gesture image to generate interaction information. For mismatch signals, it selects the gesture image with the highest similarity and records it as a pre-selected action. It also combines historical interaction records to determine the standard matching result, generates interaction information, and transmits it to the interaction information generation module.

[0035] The real-time motion signal is analyzed by preprocessing the user motion image to obtain a preprocessed image and extracting features to obtain user motion features. At the same time, the motion is segmented according to the motion node to obtain segmented motion. Pre-selected pixels are selected based on the pixel values of the segmented motion pixels and used as the standard to generate user motion features. These are then combined to obtain preprocessed motion. Finally, the preprocessed motion is compared with the gesture image to generate interactive information or secondary analysis signals.

[0036] The secondary analysis signal is analyzed, the segmented actions are matched with the gesture images to obtain the matching results, and the results to be analyzed are filtered according to the proportion of the number of segmented actions. The interaction standard is determined by combining the historical interaction records, the interaction information is generated, and the interaction information is transmitted to the interaction information generation module.

[0037] The interactive information generation module transmits the acquired interactive information to the projection module.

[0038] The projection module is used to acquire interactive information and generate interactive commands, and then control the projection according to the commands.

[0039] This invention provides an intelligent interactive projection method and an interactive projection device. Compared with the prior art, it has the following advantages:

[0040] This invention precisely preserves the action region and reduces interference from irrelevant information by cropping the user's action image when processing static action signals; at the same time, it uses advanced feature extraction algorithms to extract characteristic actions that can represent the action, thereby improving the accuracy of feature extraction.

[0041] For static action signals, when comparing them with the gesture image database, it can not only determine whether there is a complete match, but also calculate the similarity between the feature action and all gesture images when there is a mismatch, and make a judgment by combining historical interaction records, which improves the accuracy and flexibility of matching.

[0042] For real-time motion signal processing, after obtaining the preprocessed motion, it is compared and analyzed with the gesture image, considering the similarity of multiple key features such as motion contour and posture. Simultaneously, in the secondary analysis signal processing, segmented motions are matched with the gesture image, and the results to be analyzed are filtered according to their proportion. Interaction criteria are determined by combining historical interaction records, further optimizing the motion classification and matching mechanism and improving the matching success rate.

Attached Figure Description

[0043] Figure 1 is a diagram illustrating the steps of the method of the present invention;

[0044] Figure 2 is a block diagram of the device of the present invention.

Detailed Implementation

[0045] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0046] Example 1. Referring to Figure 1, this application provides an intelligent interactive projection method, which specifically includes the following steps:

[0047] Step S1

[0048] By capturing user movements in real time using a camera, and then using advanced computer vision algorithms to perform deep analysis and recognition on these image data after acquiring the sequence of motion images, corresponding motion signals are generated through a series of complex image processing, feature extraction, and pattern matching operations. These signals can be specifically divided into static motion signals and real-time motion signals.

[0049] The generation of static motion signals occurs when the algorithm analyzes user actions and identifies the user action as a gesture in a relatively static state, or the position and posture of a specific object.

[0050] For example, when a user makes an "OK" gesture in front of a camera, the computer vision algorithm detects features such as the degree of finger bending and the positional relationship of the fingertips to determine that this is an "OK" gesture, thereby generating a corresponding static motion signal to indicate that the user has made the specific "OK" gesture.

[0051] For example, when a camera captures a cup on a table, the algorithm analyzes the cup's position, angle, and outline in the image to determine that the cup is upright and located in the upper left corner of the table. At this point, a static motion signal is generated to describe the position and posture of the cup as a specific object.

[0052] Real-time motion signals are generated when the algorithm recognizes that the user's actions are in a dynamic motion process. For example, when a user runs in front of a camera, the algorithm tracks the positional changes of various parts of the user's body (such as legs and arms) in continuous images, calculates parameters such as their trajectory, speed, and acceleration, and then generates real-time motion signals to represent the user's running motion.

[0053] For example, when a user holds a remote control and waves it in the air, the algorithm can capture the positional changes of the remote control at different times, generate real-time motion signals, and demonstrate that the remote control is in motion and its motion characteristics.
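The patent does not say how the algorithm decides between a static and a real-time signal. As a rough sketch only, one simple possibility is to look at how much consecutive grayscale frames differ. The function names and the `motion_threshold` value below are hypothetical and are not taken from the patent.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[float]]  # grayscale image as rows of pixel values


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average absolute pixel difference between two equally sized frames."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count if count else 0.0


def classify_signal(frames: List[Frame], motion_threshold: float = 5.0) -> str:
    """Return "static" when consecutive frames barely change, else "real-time".

    motion_threshold is an assumed tuning value, not a figure from the patent.
    """
    if len(frames) < 2:
        return "static"
    diffs = [mean_abs_diff(prev, cur) for prev, cur in zip(frames, frames[1:])]
    return "real-time" if max(diffs) > motion_threshold else "static"
```

A held "OK" gesture would produce near-zero frame differences and fall into the static branch, while running or waving a remote control would exceed the threshold.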

[0054] Step S2

[0055] In the process of recognizing and interacting with user actions, the generated static action signals are first analyzed in depth. Using specific signal processing algorithms, the static action signals are converted into corresponding user action images. Next, a cropping operation is performed on the user action images to precisely remove irrelevant and redundant areas, retaining only the action regions directly related to the user's actions, thus obtaining a cropped image.

[0056] After obtaining the cropped image, advanced feature extraction algorithms are used to extract features of the user's actions from the cropped image, thereby obtaining characteristic actions that can represent the action. Then, the characteristic actions are compared and matched one by one with gesture images in a pre-built gesture image library. During the comparison process, if the current characteristic action is found to be a perfect match with an image in the gesture image library, a matching signal is generated; if no matching gesture image is found, a mismatch signal is generated.

[0057] For example, assuming the gesture image library contains gesture images such as "clenched fist", "thumbs up", and "wave", a matching signal will be generated when the feature action is completely consistent with the feature of the "thumbs up" gesture image; if the feature action does not match any gesture image in the library, a mismatch signal will be generated.

[0058] When a matching signal is generated, it is further analyzed. By parsing the signal, the corresponding matching result is obtained from the gesture image library, and corresponding interactive information is generated based on it. For example, if the matching result is the "thumbs up" gesture, the interactive information might be to display a "like successful" message on the screen, or to trigger a like-related function, such as liking a video or an article.

[0059] When a mismatch signal is generated, it also needs to be analyzed. At this point, the similarity between the feature action and all gesture images in the gesture image library is calculated. During the calculation, a specific similarity calculation algorithm, such as the cosine similarity algorithm, is used to calculate the similarity between each gesture image and the feature action, and the images are sorted in descending order of similarity. The gesture image with the highest similarity is selected and recorded as the pre-selected action. Next, historical interaction records are retrieved, and the pre-selected action is matched against actions in the historical interaction records.
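The ranking described above can be sketched as follows, assuming each gesture image and the feature action have already been reduced to numeric feature vectors of equal length. The patent does not define that representation, so the vector form, the function names, and the example library are all illustrative.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_gestures(
    feature: Sequence[float], library: Dict[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Score every library gesture and sort in descending order of similarity."""
    scored = [(name, cosine_similarity(feature, vec)) for name, vec in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical two-dimensional feature vectors, for illustration only.
library = {"fist": [1.0, 0.0], "thumbs_up": [0.0, 1.0], "wave": [1.0, 1.0]}
ranked = rank_gestures([0.9, 1.0], library)
preselected_action = ranked[0][0]  # gesture with the highest similarity
```

The first entry of the sorted list plays the role of the pre-selected action that is then checked against the historical interaction records.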

[0060] If an action matching the pre-selected action is found in the historical interaction record, the interaction will proceed based on the pre-selected action, and corresponding interaction information will be generated. For example, if there are multiple "waving" action records in the historical interaction record, and the current pre-selected action is also determined to be "waving," then the interaction information may be to activate a function related to waving, such as controlling the smart device to turn on or off.

[0061] Conversely, if the pre-selected action is not found in the historical interaction record, secondary interaction information is generated to prompt the user to perform the action again, or to provide some guidance information to help the user adjust the action to obtain a more accurate matching result.
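Taken together, the static-signal branch in Step S2 amounts to a small decision procedure. The sketch below is one reading of it. The result labels "interaction" and "secondary", and the representation of history as a list of action names, are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple


def resolve_static_action(
    exact_match: Optional[str],
    ranked: List[Tuple[str, float]],
    history: List[str],
) -> Dict[str, str]:
    """Decide what the static branch emits.

    exact_match: library gesture that matched the feature action, if any.
    ranked: (gesture, similarity) pairs sorted by descending similarity.
    history: names of actions found in the historical interaction records.
    """
    if exact_match is not None:
        # Matching signal: interact directly on the matched gesture.
        return {"type": "interaction", "action": exact_match}
    if not ranked:
        # Nothing to pre-select; ask the user to repeat the action.
        return {"type": "secondary", "action": ""}
    preselected = ranked[0][0]
    if preselected in history:
        # Mismatch signal, but the pre-selected action is known from history.
        return {"type": "interaction", "action": preselected}
    # Pre-selected action is absent from history: prompt or guide the user.
    return {"type": "secondary", "action": preselected}
```

For instance, a mismatch whose closest library gesture is "wave" still yields an interaction if "wave" appears in the history, and otherwise yields the secondary prompt.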

[0062] Step S3

[0063] Once the real-time motion signal is generated, it is first analyzed in depth. Through signal parsing and conversion techniques, a series of user motion images corresponding to the real-time motion are obtained. These images completely record the user's posture changes during the motion process. To facilitate subsequent processing, preprocessing operations are performed on the obtained user motion images, specifically including grayscale conversion, noise reduction, and normalization.

[0064] Grayscale conversion transforms a color image into a grayscale image, reducing data dimensionality while preserving key information. For example, converting a color image of a user running into grayscale allows subsequent processing to focus more on the movement outline itself, rather than color information. Noise reduction utilizes algorithms such as median filtering and Gaussian filtering to remove noise interference from the image, ensuring image clarity and accuracy. For instance, it removes noise caused by camera sensor noise or transmission interference, making movement details clearer. Normalization maps image pixel values to a specific range, such as [0, 1] or [-1, 1], to eliminate the impact of differences in brightness and contrast between different images on subsequent analysis, making movement images taken at different times and under different lighting conditions comparable. After this series of preprocessing operations, a preprocessed image is obtained.
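A minimal pure-Python sketch of the three preprocessing operations is given below, assuming an image is a list of rows of (R, G, B) tuples. The BT.601 luma weights, the 3x3 window, and the min-max scaling to [0, 1] are common choices picked for illustration. The patent names the operations but not these parameters.

```python
from statistics import median
from typing import List, Tuple

RGBImage = List[List[Tuple[int, int, int]]]
GrayImage = List[List[float]]


def to_grayscale(image: RGBImage) -> GrayImage:
    """Weighted sum of R, G, B using the common BT.601 luma weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in image]


def median_filter_3x3(image: GrayImage) -> GrayImage:
    """Replace each pixel by the median of its 3x3 neighbourhood (edges clamped)."""
    height, width = len(image), len(image[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            row.append(median(window))
        out.append(row)
    return out


def normalize(image: GrayImage) -> GrayImage:
    """Min-max scale pixel values into [0, 1]; a flat image maps to all zeros."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in image]
    return [[(p - lo) / (hi - lo) for p in row] for row in image]


def preprocess(image: RGBImage) -> GrayImage:
    """Grayscale conversion, then noise reduction, then normalization."""
    return normalize(median_filter_3x3(to_grayscale(image)))
```

The median filter is used here because it removes isolated noisy pixels while keeping edges fairly sharp. A Gaussian filter, which the text also mentions, would be an equally valid substitute.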

[0065] Next, feature extraction is performed on the preprocessed image. Algorithms such as edge detection and contour extraction are used to obtain user action features, which are presented in the form of action contours. For example, for an image of a user waving their hand, the outline of the arm movement can be clearly delineated after feature extraction. To further refine the analysis, the obtained user action features are segmented according to action nodes. Action nodes can be key locations with significant posture changes during the action, such as the starting position of the arm when the user is doing a push-up, the position of the arm bending to the lowest point, and the ending position of the arm straightening again. By dividing the continuous action features into action nodes, segmented action features are obtained.
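Segmentation at action nodes can be pictured as cutting a frame sequence at given indices. The sketch below assumes the node positions are already known (for a push-up: the frames where the arms reach the lowest point and where they straighten again). How nodes are detected is not specified in the patent, and the function name is hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def segment_by_nodes(frames: Sequence[T], node_indices: Sequence[int]) -> List[List[T]]:
    """Split a frame sequence into segments that start at each action node.

    node_indices are assumed to be frame positions of significant posture
    change; indices outside the open range (0, len(frames)) are ignored.
    """
    bounds = sorted({i for i in node_indices if 0 < i < len(frames)})
    starts = [0] + bounds
    ends = bounds + [len(frames)]
    return [list(frames[s:e]) for s, e in zip(starts, ends)]


# Ten frames cut at two hypothetical nodes give three segmented actions.
segments = segment_by_nodes(list(range(10)), [3, 7])
```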

[0066] For each segmented action, its corresponding action image is acquired, and this image is then segmented into individual frames. This breaks down the continuous sequence of action images into single-frame images. This allows for more detailed analysis of each frame. Within each single frame, the pixels representing the user's action features are extracted, and the pixel value for each pixel is calculated sequentially.

[0067] For example, in a single frame image, the pixel value of a certain pixel on the outline of a user's arm movement may be a value in the range of [0, 255] after grayscale conversion.

[0068] The obtained pixel values are compared with preset values, which are set by the operator according to actual needs and scenarios, and the preset values are a range. For example, the operator sets the pixel value range to [100, 180] based on the expectation of specific action features.

[0069] The system selects pixels whose values meet a preset range and records them as pre-selected pixels, while discarding pixels that do not meet the range. Based on these pre-selected pixels, user action features are regenerated and recorded as pre-processed features. The same process is applied to each segment of the action to obtain its corresponding pre-processed features. Finally, all pre-processed features are combined to form a complete pre-processed action.
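The pixel screening step can be sketched as a range test on the grayscale values of the feature pixels. The example uses the [100, 180] range from the text. Treating both bounds as inclusive, and representing feature pixels as (row, column) coordinates, are assumptions.

```python
from typing import List, Sequence, Tuple

Coord = Tuple[int, int]  # (row, column)


def preselect_pixels(
    frame: Sequence[Sequence[int]],
    feature_coords: Sequence[Coord],
    lo: int = 100,
    hi: int = 180,
) -> List[Coord]:
    """Keep the feature pixels whose grayscale value lies in [lo, hi].

    lo and hi stand for the operator-set preset range; inclusive bounds
    are an assumption, since the patent only gives the range itself.
    """
    return [(y, x) for (y, x) in feature_coords if lo <= frame[y][x] <= hi]


frame = [[90, 100], [180, 200]]
contour = [(0, 0), (0, 1), (1, 0), (1, 1)]
kept = preselect_pixels(frame, contour)  # pixels with values 100 and 180 survive
```

Running this per single-frame image and per segmented action gives the pre-selected pixels from which the preprocessed features are regenerated.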

[0070] After constructing the preprocessed action, it is compared and analyzed with the gesture image. The gesture image can be a pre-stored standard action image library or a reference action image set in a specific scene. If the preprocessed action matches a certain action in the gesture image, that is, the key features such as action outline and posture are highly similar, the system obtains the corresponding matching image and generates interaction information according to the preset interaction rules.

[0071] For example, in a smart home control scenario, a gesture image library stores images of various hand gestures used to control home appliances. When a user's wave gesture is processed and matched with a hand gesture in the library used to control the TV switch, the system acquires the matching image and generates interactive information to turn the TV on or off. Conversely, if no match is found in the gesture images after preprocessing, a secondary analysis signal is generated.

[0072] Step S4

[0073] First, all relevant segmented actions are extracted from the signal. These segmented actions are the result of previously dividing the user's real-time action characteristics according to action nodes, and each segmented action represents a key stage in the user's action process.

[0074] Next, for each segmented action, a comprehensive match is performed between it and gesture images in a pre-built gesture image library. The matching process employs advanced image recognition algorithms to compare the similarity of the segmented action's contour, posture, and other key features with the gesture images. After the matching operation, a series of corresponding matching results are obtained, each indicating that a certain segmented action has a certain degree of similarity to a specific gesture image in the gesture image library.

[0075] Next, calculate the percentage of segmented actions in each matching result. For example, suppose that in a real-time motion analysis, a total of 10 segmented actions are obtained, and 3 of them successfully match a specific gesture image. Then, the percentage of this matching result is 30% (3 ÷ 10 × 100%). Sort all matching results in descending order of percentage. This allows for quick filtering of results that have a high degree of matching with real-time motion segments.

[0076] Next, the percentage of each sorted matching result is compared with a filtering threshold set by the operator based on actual needs. The filtering threshold is a key parameter that determines which matching results are sufficiently credible and worthy of further analysis. For example, based on experience and the requirement for system accuracy, the operator might set the filtering threshold to 20%. The system automatically retains the matching results whose percentage exceeds the filtering threshold and records them as results to be analyzed. This step eliminates results with low relevance to real-time actions or those that may be false matches, thus narrowing the scope of subsequent analysis and improving processing efficiency.
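The proportion calculation and threshold screening in Step S4 can be sketched as below. The input is assumed to be one entry per segmented action, holding the name of the gesture image it matched or `None` if it matched nothing. That representation and the function names are illustrative.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def match_proportions(segment_matches: List[Optional[str]]) -> Dict[str, float]:
    """Share of all segmented actions that matched each gesture image."""
    total = len(segment_matches)
    if total == 0:
        return {}
    counts = Counter(m for m in segment_matches if m is not None)
    return {gesture: n / total for gesture, n in counts.items()}


def results_to_analyze(
    proportions: Dict[str, float], threshold: float = 0.20
) -> List[Tuple[str, float]]:
    """Keep results strictly above the threshold, sorted by descending share."""
    kept = [(g, p) for g, p in proportions.items() if p > threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# 10 segmented actions: 3 match "wave", 2 match "fist", 5 match nothing.
matches = ["wave"] * 3 + ["fist"] * 2 + [None] * 5
proportions = match_proportions(matches)    # wave: 3/10, fist: 2/10
pending = results_to_analyze(proportions)   # only "wave" exceeds 20%
```

With the 20% threshold from the text, a result at exactly 20% is dropped because the comparison is "greater than", following the summary's wording.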

[0077] After obtaining the results to be analyzed, the system retrieves the historical interaction records corresponding to each result. These records contain data on past user interactions in similar action scenarios, which is crucial for understanding the intent behind the current real-time action. Simultaneously, the system counts the number of interactions for each result, reflecting the frequency of that match in history. The results are then sorted from highest to lowest interaction count, prioritizing those that have appeared more frequently historically, as these are more likely to represent the user's current true intent.

[0078] Finally, the similarity between the gesture images and real-time actions in the analysis results is calculated sequentially. A more precise algorithm is used for similarity calculation, comprehensively considering multiple factors such as the action's temporal sequence, spatial location, and posture changes. After calculating the similarity of all the analysis results, the result with the highest similarity is selected as the interaction standard. Based on this interaction standard, the system generates corresponding interaction information according to preset interaction rules.
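One way to read the last two paragraphs is that interaction counts fix the order in which candidates are examined, and similarity picks the winner. In the sketch below, the dictionaries holding interaction counts and similarity scores are assumed inputs. The patent does not specify the "more precise" similarity algorithm, so the scores are taken as given.

```python
from typing import Dict, List, Optional


def choose_interaction_standard(
    candidates: List[str],
    history_counts: Dict[str, int],
    similarity: Dict[str, float],
) -> Optional[str]:
    """Pick the result to be analyzed that serves as the interaction standard.

    Candidates are first ordered by historical interaction count (descending),
    then the one with the highest similarity to the real-time action wins.
    Because max() returns the first maximal item, ties in similarity are
    resolved in favour of the more frequently used gesture.
    """
    ordered = sorted(candidates, key=lambda g: history_counts.get(g, 0), reverse=True)
    if not ordered:
        return None
    return max(ordered, key=lambda g: similarity[g])


# Hypothetical counts and scores, for illustration only.
counts = {"wave": 5, "fist": 9, "ok": 1}
scores = {"wave": 0.8, "fist": 0.8, "ok": 0.6}
standard = choose_interaction_standard(["wave", "fist", "ok"], counts, scores)
```

Here "wave" and "fist" tie on similarity, so the more frequently used "fist" is selected. Using the history order as a tie-breaker is a design choice of this sketch, not something the patent states.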

[0079] Example 2: Referring to Figure 2, this application provides an intelligent interactive projection device, including a camera, a data processing center, and a projection module, wherein the data processing center includes an image processing module, a motion analysis module, and an interactive information generation module.

[0080] The camera, once turned on, captures images of user actions between the camera and the projection surface and sends them to the data processing center.

[0081] The data processing center is used to analyze the acquired user action images, identify user actions to generate static or real-time motion signals, and transmit them to the motion analysis module.

[0082] The motion analysis module analyzes the acquired static or real-time motion signals. For static motion signal analysis, it crops the user's motion image to obtain a cropped image, extracts features to obtain feature actions, and compares it with the gesture image to generate a matching or mismatch signal. For matching signals, it analyzes the matching result of the gesture image to generate interaction information. For mismatch signals, it selects the gesture image with the highest similarity and records it as a pre-selected action. It also combines historical interaction records to determine the standard matching result, generates interaction information, and transmits it to the interaction information generation module.
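The static-signal branch in [0082] can be outlined as a small decision function. This is a sketch under several assumptions that the text does not state: features are fixed-length tuples, a "match" means the best similarity reaches a hypothetical `match_threshold`, the gesture library is non-empty, and the toy `overlap_similarity` stands in for whatever similarity measure the device actually uses. The "secondary" outcome follows the wording of claim 1 for a pre-selected action that is absent from the historical interaction records.

```python
def overlap_similarity(a, b):
    """Toy similarity: fraction of positions where two equal-length tuples agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def analyze_static(feature, gesture_library, history, match_threshold=0.9):
    """Return (outcome, gesture_id) for one static feature action.

    gesture_library maps gesture ids to reference features (must be non-empty);
    history is the set of gesture ids found in the historical interaction records.
    """
    scores = {gid: overlap_similarity(feature, ref)
              for gid, ref in gesture_library.items()}
    best_id = max(scores, key=scores.get)
    if scores[best_id] >= match_threshold:
        return ("interaction", best_id)   # matching signal
    # Mismatch signal: the most similar gesture becomes the pre-selected action.
    if best_id in history:
        return ("interaction", best_id)   # confirmed by historical records
    return ("secondary", best_id)         # not found in historical records

library = {"open_palm": (1, 1, 1, 1), "fist": (0, 0, 0, 0)}
print(analyze_static((1, 1, 1, 0), library, history={"open_palm"}))
print(analyze_static((1, 1, 1, 0), library, history=set()))
```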

[0083] For real-time motion signals, the module preprocesses the user motion image to obtain a preprocessed image and extracts features to obtain user motion features. At the same time, the motion is segmented according to motion nodes to obtain segmented motions. Pre-selected pixels are filtered based on the pixel values of the segmented motion pixels and used as the standard to generate preprocessed features. These preprocessed features are then combined to obtain the preprocessed motion. Finally, the preprocessed motion is compared with the gesture image to generate interaction information or a secondary analysis signal.
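Part of the real-time pipeline in [0083] can be illustrated for a single frame as follows. The sketch covers grayscale conversion, normalization, and the pixel filter, and omits noise reduction and motion segmentation. It makes these assumptions beyond the text: frames are nested lists of 8-bit RGB tuples, grayscale uses the ITU-R BT.601 luma weights, and a pixel "meets" the operator-set preset value when its normalized value is greater than or equal to it. All function names are hypothetical.

```python
def to_gray(rgb_frame):
    """Convert rows of (r, g, b) tuples to grayscale with BT.601 luma weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_frame]

def normalize(gray_frame, max_value=255.0):
    """Scale grayscale values into the range 0..1."""
    return [[v / max_value for v in row] for row in gray_frame]

def preselect_pixels(norm_frame, preset):
    """Return (row, col) positions whose value meets the preset (assumed: >=)."""
    return [(y, x)
            for y, row in enumerate(norm_frame)
            for x, v in enumerate(row)
            if v >= preset]

# A hypothetical 2x2 single-frame image taken from one segmented motion.
frame = [[(255, 255, 255), (0, 0, 0)],
         [(10, 10, 10), (200, 200, 200)]]
selected = preselect_pixels(normalize(to_gray(frame)), preset=0.5)
print(selected)  # [(0, 0), (1, 1)]
```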

[0084] For the secondary analysis signal, the segmented actions are matched with the gesture images to obtain the matching results, and the results to be analyzed are filtered according to the proportion of segmented actions that match each gesture image. The interaction standard is then determined by combining the historical interaction records, the interaction information is generated, and the interaction information is transmitted to the interaction information generation module.

[0085] The interactive information generation module transmits the acquired interactive information to the projection module.

[0086] The projection module is used to acquire interactive information and generate interactive commands, and then control the projection according to the commands.

[0087] The data in the above formulas are all calculated using numerical values, without substituting the units of the parameters. In addition, the contents not described in detail in this specification are all prior art known to those skilled in the art.

[0088] The above embodiments are only used to illustrate the technical methods of the present invention and are not intended to limit it. Although the present invention has been described in detail with reference to preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions can be made to the technical methods of the present invention without departing from the spirit and scope of the technical methods of the present invention.

Claims

1. An intelligent interactive projection method, characterized in that, The method specifically includes the following steps: The camera captures user movements, and computer vision algorithms are used to analyze and identify them to generate static or real-time motion signals. The generated static motion signal is analyzed, the user motion image is cropped to obtain a cropped image, and feature extraction is performed to obtain feature motion. At the same time, it is compared with the gesture image to generate a comparison matching or comparison mismatch signal. By comparing and analyzing the matching signals, the corresponding matching results in the gesture image are obtained, and the interaction information is generated based on the matching results. By comparing the mismatch signals, the similarity between the feature action and the gesture image is calculated. The gesture image with the highest similarity is selected as the pre-selected action. Then, the historical interaction records are obtained, and the relationship between the pre-selected action and the historical interaction records is determined. If a pre-selected action exists in the historical interaction record, the interaction will be performed based on the pre-selected action to generate interaction information; otherwise, if no pre-selected action exists in the historical interaction record, secondary interaction information will be generated. The system analyzes real-time motion signals, preprocesses user motion images to obtain preprocessed images, extracts features to obtain user motion features, segments motions according to motion nodes, filters pre-selected pixels based on pixel values of segmented motion pixels, and uses these as standards to generate preprocessed features. The generated preprocessed features are then combined to obtain preprocessed motions, which are then compared with gesture images to generate interactive information or secondary analysis signals. 
The generated secondary analysis signal is processed, and the segmented actions are matched with the gesture images to obtain matching results. Simultaneously, the results to be analyzed are filtered according to the proportion of segmented actions, where the proportion represents the percentage of segmented actions that successfully match a given gesture image out of the total number of segmented actions. Interaction criteria are determined by combining historical interaction records, and interaction information is generated. The specific processing method is as follows: All segmented actions are acquired and matched with gesture images to obtain corresponding matching results. At the same time, the proportion of segmented actions in the matching results is calculated and compared with the filtering threshold. Matching results with a proportion greater than the filtering threshold are recorded as results to be analyzed. Obtain the historical interaction records corresponding to the results to be analyzed, and at the same time obtain the number of interactions corresponding to the results to be analyzed. Sort them in descending order of the number of interactions, calculate the similarity between the gesture images and real-time actions in the results to be analyzed, and select the results to be analyzed with the highest similarity as the interaction standard to generate interaction information.

2. The intelligent interactive projection method according to claim 1, characterized in that, The specific method for generating static or real-time action signals is as follows: The system acquires a sequence of motion images and uses computer vision algorithms to identify the image sequence. If the user's action is identified as a gesture in a relatively static state, a static motion signal is generated; otherwise, if the user's action is identified as a gesture generated during movement, a real-time motion signal is generated.

3. The intelligent interactive projection method according to claim 1, characterized in that, The specific method for analyzing the generated static action signal is as follows: The user action image is acquired and cropped to obtain a cropped image. At the same time, the features of the user action in the cropped image are extracted to obtain the feature action. The feature action is then compared and matched with the gesture image. If the feature action exists in the gesture image, a matching signal is generated; otherwise, if the feature action does not exist in the gesture image, a mismatch signal is generated, and both are analyzed separately.

4. The intelligent interactive projection method according to claim 1, characterized in that, The specific method for analyzing real-time motion signals is as follows: The system acquires user action images corresponding to real-time actions, performs grayscale conversion, noise reduction, and normalization to obtain preprocessed images, extracts features from the preprocessed images as user action features, segments them according to action nodes to obtain segmented actions, and filters pre-selected pixels based on the pixel values of the segmented action pixels.

5. The intelligent interactive projection method according to claim 4, characterized in that, The specific method for obtaining the pre-selected pixels through filtering is as follows: The system acquires the motion images corresponding to the segmented actions and performs single-frame segmentation to obtain single-frame images. For the pixels of user action features and their corresponding pixel values in the single-frame images, the obtained pixel values are compared with preset values. The specific values of the preset values are set by the operator. Pixels whose pixel values meet the preset values are recorded as pre-selected pixels, while pixels that do not meet the preset values are removed.

6. The intelligent interactive projection method according to claim 1, characterized in that, The specific method for obtaining preprocessing actions to generate interactive information or secondary analysis signals is as follows: Preprocessing features are generated based on preselected pixels, and so on to obtain preprocessing features for all segmented actions. At the same time, the preprocessing features are combined to obtain the preprocessing actions. The preprocessed action is compared and analyzed with the gesture image. If the preprocessed action exists in the gesture image, the corresponding matching image is obtained and interaction information is generated. Otherwise, if the preprocessed action does not exist in the gesture image, a secondary analysis signal is generated.

7. An intelligent interactive projection device, used to execute the intelligent interactive projection method according to any one of claims 1-6, characterized in that, The device includes a camera, a data processing center, and a projection module, and the data processing center includes an image processing module, a motion analysis module, and an interactive information generation module. The camera is used to capture images of user actions between the camera and the projection surface after being turned on and send them to the data processing center. The data processing center is used to analyze the acquired user action images, identify user actions to generate static or real-time action signals, and transmit both to the action analysis module. The motion analysis module analyzes the acquired static or real-time motion signals. For static motion signal analysis, it crops the user's motion image to obtain a cropped image, extracts features to obtain feature actions, and compares it with the gesture image to generate a matching or mismatch signal. For matching signals, it analyzes the matching result of the gesture image to generate interaction information. For mismatch signals, it selects the gesture image with the highest similarity and records it as a pre-selected action. It also combines historical interaction records to determine the standard matching result, generates interaction information, and transmits it to the interaction information generation module. The system analyzes real-time motion signals, preprocesses user motion images to obtain preprocessed images, extracts features to obtain user motion features, segments motions according to motion nodes, filters pre-selected pixels based on pixel values of segmented motion pixels, and uses these as standards to generate preprocessed features. 
The generated preprocessed features are then combined to obtain preprocessed motions, which are then compared with gesture images to generate interactive information or secondary analysis signals. The secondary analysis signal is analyzed, the segmented actions are matched with the gesture images to obtain the matching results, and the results to be analyzed are filtered according to the proportion of the number of segmented actions. The interaction standard is determined by combining the historical interaction records, the interaction information is generated, and the interaction information is transmitted to the interaction information generation module. The interactive information generation module transmits the acquired interactive information to the projection module; The projection module is used to acquire interactive information and generate interactive commands, and then control the projection according to the commands.

Citation Information

Patent Citations

CN106774827B
CN106774827A
CN111680594A
