
405 results about "multi-modal fusion" patented technology

Gesture recognition method based on 3D-CNN and convolutional LSTM

The invention discloses a gesture recognition method based on 3D-CNN and convolutional LSTM. The method comprises the following steps: the length of the video input to the 3D-CNN is normalized by a temporal jittering strategy; the normalized video is fed to the 3D-CNN to learn short-term spatio-temporal features of the gesture; based on the short-term spatio-temporal features extracted by the 3D-CNN, long-term spatio-temporal features of the gesture are learned by a two-layer convolutional LSTM network, eliminating the influence of complex backgrounds on gesture recognition; the dimensionality of the extracted long-term spatio-temporal features is reduced by a spatial pyramid pooling (SPP) layer, and the resulting multi-scale features are fed into the fully connected layer of the network; finally, through a late multi-modal fusion method, the prediction results of the networks are averaged and fused to obtain a final prediction score. By learning the short-term and long-term spatio-temporal features of the gesture with different networks and combining them, and by training the network with batch normalization, the efficiency and accuracy of gesture recognition are improved.
Owner:BEIJING UNION UNIVERSITY
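
The SPP step above maps variable-size ConvLSTM feature maps to one fixed-length, multi-scale vector for the fully connected layer. Below is a minimal sketch of such a layer, assuming PyTorch; the pool levels and feature sizes are illustrative, not taken from the patent.

```python
# Minimal spatial pyramid pooling sketch (assumes PyTorch).
import torch
import torch.nn.functional as F

def spatial_pyramid_pool(feat: torch.Tensor, levels=(1, 2, 4)) -> torch.Tensor:
    """Pool a (N, C, H, W) feature map at several grid sizes and
    concatenate the results into one fixed-length vector per sample."""
    n, c = feat.shape[:2]
    pooled = [F.adaptive_max_pool2d(feat, level).reshape(n, c * level * level)
              for level in levels]
    return torch.cat(pooled, dim=1)  # shape: (N, C * sum(level**2))

# Hypothetical batch of long-term features from the ConvLSTM stage:
features = torch.randn(8, 256, 14, 14)
vector = spatial_pyramid_pool(features)  # (8, 256 * (1 + 4 + 16)) = (8, 5376)
```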

Method and system for in-store shopper behavior analysis with multi-modal sensor fusion

The present invention provides a comprehensive method for automatically and unobtrusively analyzing the in-store behavior of people visiting a physical space, using multi-modal fusion of multiple types of sensors. The sensors employed may include cameras that capture a plurality of images and mobile signal sensors that capture a plurality of Wi-Fi signals. The present invention integrates the input sensor measurements to reliably and persistently track people's physical attributes and to detect their interactions with retail elements. The physical and contextual attributes collected from the processed shopper tracks include the changes in motion dynamics triggered by implicit and explicit interactions with retail elements, comprising the behavior information for each person's trip. The present invention integrates point-of-sale transaction data with the shopper behavior by finding the transaction data that corresponds to a shopper trajectory and fusing the two to generate a complete intermediate representation of the shopper trip data, called a TripVector. The shopper behavior analyses are carried out on the extracted TripVector. The analyzed behavior information for the shopper trips yields exemplary behavior analyses comprising map generation as a visualization of the behavior and derivation of quantitative shopper metrics at multiple scales (e.g., store-wide and category-level), including path-to-purchase shopper metrics (e.g., traffic distribution, shopping action distribution, buying action distribution, conversion funnel) and category dynamics (e.g., dominant path, category correlation, category sequence). The present invention includes a set of derived methods for different sensor configurations.
Owner:VIDEOMINING CORP
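
As a rough illustration of the TripVector idea, the sketch below pairs a point-of-sale transaction with the shopper trajectory that passed nearest the checkout around the transaction time. The field names, the matching rule, and the thresholds are assumptions made for the sketch, not the patent's specification.

```python
# Hypothetical TripVector record and transaction-to-trajectory association.
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class TripVector:
    shopper_id: str
    waypoints: List[Tuple[float, float, float]]  # (t, x, y) fused camera + Wi-Fi track
    interactions: List[dict]                     # detected shopper-to-shelf events
    transaction: Optional[dict] = None           # POS basket associated with this trip

def associate_transaction(trips, transaction, checkout_xy, max_dt=120.0):
    """Attach a POS transaction to the trip whose track passed nearest the
    checkout lane within max_dt seconds of the transaction timestamp."""
    def score(trip):
        cands = [(x - checkout_xy[0]) ** 2 + (y - checkout_xy[1]) ** 2
                 for t, x, y in trip.waypoints
                 if abs(t - transaction["time"]) <= max_dt]
        return min(cands) if cands else float("inf")
    best = min(trips, key=score)
    if score(best) != float("inf"):
        best.transaction = transaction
    return best
```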

Human-machine interaction multi-mode early intervention system for improving social interaction capacity of autistic children

Active | CN102354349A | Improve social interaction skills; improve social interaction | Input/output for user-computer interaction; graph reading | USB; visual perception
The invention discloses a human-machine interaction multi-modal early intervention system for improving the social interaction capacity of autistic children. The system comprises a multi-point touch screen, a computer, and three cameras mounted at the left side, the right side, and above the middle of the touch screen; each camera is provided with a microphone and connected to the computer through a USB (Universal Serial Bus) interface. The system is provided with six basic modules: a visual signal processing module, a voice signal processing module, a physical interactive interface module, a multi-modal fusion module, an intelligent control console module, and a real scene simulation module. These modules combine computer vision, voice recognition, behavior identification, intelligent agent, and virtual reality technologies to support and improve the social interaction capacity of autistic children. The development of several children in the learning environment was tracked for half a year: the social interaction capacity of most children improved markedly, and the other children also made some progress in interaction capacity.
Owner:HUAZHONG NORMAL UNIV
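
The multi-modal fusion module described above has to merge time-stamped events from the visual and voice pipelines; the sketch below shows one simple way such merging could work, pairing each touch or gaze event with the nearest utterance. The event schema and the one-second window are assumptions, not the patent's design.

```python
# Toy fusion of visual and voice event streams by timestamp proximity.
def fuse_events(visual_events, voice_events, window=1.0):
    """Pair each visual event with the closest utterance within `window` seconds."""
    fused = []
    for v in visual_events:  # e.g. {"t": 3.2, "kind": "touch", "target": "apple"}
        near = [u for u in voice_events if abs(u["t"] - v["t"]) <= window]
        utterance = min(near, key=lambda u: abs(u["t"] - v["t"])) if near else None
        fused.append({"visual": v, "speech": utterance})
    return fused
```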

Multi-mode information fusion-based classroom learning state monitoring method and system

The invention discloses a classroom learning state monitoring method and system based on multi-modal information fusion. The method specifically comprises the following steps: acquiring an indoor scene image and locating the faces in the scene image; estimating face orientation postures in each face region so as to estimate the attention of the students; estimating facial expressions in each face region; acquiring skin conductance signals of the students so as to estimate their degrees of physiological activation; and recording the frequency and correctness of the students' interactive answers in the classroom so as to estimate their degrees of participation. The four-dimensional information (attention, learning mood, physiological activation degree, and classroom participation degree) is then fused to analyze the students' classroom learning states. The invention furthermore provides a system for realizing the method. With the method and system, the learning states of students in classrooms can be objectively and accurately monitored and analyzed in real time, perfecting the analysis of the teaching process and sharpening the differentiation of teaching effects.
Owner:HUAZHONG NORMAL UNIV
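
The final fusion step combines the four per-student signals into one learning-state estimate. A minimal sketch follows, assuming all signals are pre-scaled to [0, 1] and using illustrative weights; the patent does not disclose its fusion weights or rule.

```python
# Weighted fusion of the four classroom signals (weights are assumptions).
import numpy as np

def learning_state(attention, emotion, activation, participation,
                   weights=(0.3, 0.2, 0.2, 0.3)):
    """All inputs assumed pre-scaled to [0, 1]; returns one weighted score."""
    signals = np.array([attention, emotion, activation, participation])
    return float(np.dot(weights, signals))

score = learning_state(attention=0.8, emotion=0.6, activation=0.5, participation=0.9)
```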

Multi-modality autofluorescence molecular tomographic imaging instrument and reconstruction method

The invention discloses a multi-modality autofluorescence molecular tomographic imaging instrument comprising a signal gathering module, a signal preprocessing module, a system control module, and a signal post-processing module. The method of the invention comprises: determining the feasible region of the light source through X-ray imaging and autofluorescence tomographic imaging based on multi-stage adaptive finite elements combined with a digital mouse atlas; reconstructing the optical characteristic parameters of the target region; and, through modality fusion and adaptive optimized refinement of local regions according to posterior error estimation, obtaining the fluorescent light source in the reconstruction target region. The ill-posedness of autofluorescence molecular tomography can be efficiently mitigated, and precise reconstruction of the autofluorescent light source can be carried out in a complicated target region by the multi-modality fusion imaging mode. The precise reconstruction is accomplished by a liquid-nitrogen-cooled CCD probe, multi-angle fluorescence probing technology, multi-modality fusion technology, and an autofluorescence molecular tomographic imaging algorithm based on multi-stage adaptive finite elements that accounts for the non-uniformity of the reconstruction target region.
Owner:INST OF AUTOMATION CHINESE ACAD OF SCI
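
The ill-posedness mentioned above is typically tamed by regularization. As a loose, generic illustration (not the patent's adaptive finite element formulation), the sketch below applies Tikhonov regularization to a linear system A q = b relating the unknown source distribution q to boundary measurements b; A and lambda are placeholders.

```python
# Generic Tikhonov-regularized reconstruction of an ill-posed linear system.
import numpy as np

def reconstruct_source(A, b, lam=1e-3):
    """Solve min ||A q - b||^2 + lam ||q||^2 in closed form."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)
```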

Method for recovering real-time three-dimensional body posture based on multimodal fusion

Inactive | CN102800126A | The motion capture process is easy; improve stability | 3D-image rendering; 3D modelling | Color image; time domain
The invention relates to a method for recovering real-time three-dimensional body posture based on multi-modal fusion. The method recovers three-dimensional skeleton information of a human body by combining depth map analysis, color identification, face detection, and other technologies to obtain the real-world coordinates of the main joint points of the body. Based on scene depth images and scene color images synchronously acquired at different moments, the position of the head is acquired by face detection; the positions of the four limb end points, marked with colors, are acquired by color identification; the positions of the elbows and knees are computed from the limb end point positions and the mapping relation between the color images and the depth images; and the acquired skeleton is smoothed using temporal information to reconstruct the movement of the human body in real time. Compared with the conventional technology of recovering three-dimensional body posture with near-infrared equipment, the method improves the stability of the recovery and makes the human motion capture process more convenient.
Owner:ZHEJIANG UNIV
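
Two steps of the method lend themselves to a compact sketch: back-projecting a color-marker pixel to camera-space coordinates through the depth map, and temporally smoothing the recovered joints. The pinhole intrinsics (fx, fy, cx, cy) and the smoothing factor are assumed inputs, not values from the patent.

```python
# Pinhole back-projection of a marker pixel plus exponential joint smoothing.
import numpy as np

def pixel_to_world(u, v, depth_map, fx, fy, cx, cy):
    """Back-project pixel (u, v) to camera-space coordinates via the pinhole model."""
    z = depth_map[v, u]
    return np.array([(u - cx) * z / fx, (v - cy) * z / fy, z])

def smooth_skeleton(prev_joints, new_joints, alpha=0.7):
    """Blend the new frame's joints with the previous estimate to suppress jitter."""
    return alpha * new_joints + (1.0 - alpha) * prev_joints
```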

Multi-modal emotion recognition method based on fusion attention network

The invention discloses a multi-modal emotion recognition method based on a fusion attention network. The method comprises: extracting high-dimensional features of the three modalities of text, vision, and audio, and aligning and normalizing them at the word level; inputting them into bidirectional gated recurrent unit (GRU) networks for training; extracting the state information output by the bidirectional GRU networks of the three single-modality sub-networks and calculating the degree of correlation of the state information among the modalities; calculating the attention distribution of the modalities at each moment, which serves as the weight parameter of the state information at that moment; weighting and averaging the state information of the three modality sub-networks with the corresponding weight parameters to obtain a fused feature vector used as input to the fully connected network; and inputting the text, vision, and audio to be recognized into the trained bidirectional GRU network of each modality to obtain the final emotion intensity output. The method can solve the problem of uniform modality weights during multi-modal fusion and improves emotion recognition accuracy under multi-modal fusion.
Owner:ZHEJIANG UNIV OF TECH
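
A hedged sketch of the attention-weighted fusion step, assuming PyTorch: per-modality GRU states are scored for relevance, softmax-normalized across modalities at each time step, and averaged into one fused vector. The dot-product scoring against a mean context is an assumption; the patent computes correlations among the modal state vectors directly.

```python
# Attention over modalities at each time step, then weighted-average fusion.
import torch
import torch.nn.functional as F

def attention_fuse(states):
    """states: (M, T, D) hidden states for M modalities over T aligned steps."""
    context = states.mean(dim=0)                        # (T, D) shared reference
    scores = (states * context.unsqueeze(0)).sum(-1)    # (M, T) modality relevance
    weights = F.softmax(scores, dim=0)                  # attention over modalities
    fused = (weights.unsqueeze(-1) * states).sum(dim=0) # (T, D)
    return fused.mean(dim=0)                            # single fusion vector (D,)

fused = attention_fuse(torch.randn(3, 20, 128))  # text, vision, audio
```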

Mobile multi-modal interaction method and device based on augmented reality

The invention discloses a mobile multi-modal interaction method and device based on augmented reality. The method comprises the following steps: displaying a human-computer interaction interface through augmented reality, wherein the augmented reality scene comprises interaction information including virtual objects; the user issues interaction instructions through gesture and voice, the semantics of the different modalities are understood through a multi-modal fusion method, and the modal data of gesture and voice are fused to generate a multi-modal fusion interaction instruction; and after the user's interaction instruction takes effect, the result is returned to the augmented reality virtual scene, and information feedback is provided through the change of the scene. The device of the invention comprises a gesture sensor, a PC (Personal Computer), a microphone, optical see-through augmented reality display equipment, and a Wi-Fi (Wireless Fidelity) router. The method and device embody a human-centered design, are natural and intuitive, lower the learning load, and improve interaction efficiency.
Owner:SOUTH CHINA UNIV OF TECH
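
As a toy illustration of fusing gesture and voice into one instruction, the sketch below resolves a deictic reference ("that") in a parsed utterance using the gesture's pointing target. The slot names are assumptions for the sketch, not the patent's representation.

```python
# Resolve a spoken deictic reference with the pointing gesture's target.
def fuse_instruction(gesture, utterance):
    """gesture: e.g. {"type": "point", "target": "cube_3"};
    utterance: a parsed intent such as {"action": "move", "object": "that"}."""
    if utterance.get("object") in (None, "that", "this"):
        utterance = {**utterance, "object": gesture.get("target")}  # fill deixis
    return utterance

cmd = fuse_instruction({"type": "point", "target": "cube_3"},
                       {"action": "move", "object": "that"})
# -> {"action": "move", "object": "cube_3"}
```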

Semantic segmentation method and system for RGB-D image

The invention discloses a semantic segmentation method and system for RGB-D images. The semantic segmentation method comprises the following steps: extracting RGB coding features and depth coding features of an RGB-D image in multiple stages; inputting the RGB coding features and the depth coding features of each stage into an attention model to obtain the multi-modal fusion feature corresponding to that stage; extracting the context semantic information of the fifth-stage multi-modal fusion features using a long short-term memory network; splicing the fifth-stage multi-modal fusion features with the context semantic information to obtain context semantic features; and up-sampling the context semantic features and fusing them with the multi-modal fusion features of the corresponding stages through skip connections to obtain the semantic segmentation map and the semantic segmentation model. By extracting RGB coding features and depth coding features of the RGB-D image in multiple stages, the method effectively utilizes the color and depth information of the RGB-D image, and by using a long short-term memory network it effectively mines the context semantic information of the image, so that the semantic segmentation accuracy for RGB-D images is improved.
Owner:HANGZHOU WEIMING XINKE TECH CO LTD +1
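
A minimal sketch of a per-stage attention fusion block, assuming PyTorch: channel weights are computed from the concatenated RGB and depth features and used to reweight each stream before summing. The squeeze-and-excite gating form and channel counts are illustrative, not the patent's exact attention model.

```python
# Channel-attention fusion of RGB and depth features for one encoder stage.
import torch
import torch.nn as nn

class FusionAttention(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(2 * channels, 2 * channels, kernel_size=1),
            nn.Sigmoid())

    def forward(self, rgb, depth):
        both = torch.cat([rgb, depth], dim=1)  # (N, 2C, H, W)
        w = self.gate(both)                    # per-channel weights in (0, 1)
        w_rgb, w_d = w.chunk(2, dim=1)
        return w_rgb * rgb + w_d * depth       # fused (N, C, H, W)

fuse = FusionAttention(64)
out = fuse(torch.randn(1, 64, 56, 56), torch.randn(1, 64, 56, 56))
```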

Hand posture estimation system and method based on RGBD fusion network

The invention provides a hand posture estimation system and method based on an RGBD fusion network. The system comprises a global depth feature extraction module, a residual module, a multi-modal feature fusion module, and a branch-parallel interference elimination module. The global depth feature extraction module adopts a residual network with two parallel, cross-fused paths: the upper path carries a high-resolution feature map and the lower path a low-resolution feature map; multi-scale feature fusion is carried out by cross-fusing the multi-resolution information, and the network output is finally predicted from the high-resolution feature map. The input of the system is divided into a depth image processing branch and an RGB color image processing branch; the features extracted by the two branches undergo multi-modal fusion to form global features, which are sent to the branch-parallel interference elimination module for feature extraction of the hand branches, yielding reinforced hand branch features used for final joint position prediction. The method achieves more accurate hand posture estimation mainly through the synthesis of information from the color images and the depth images.
Owner:DALIAN UNIV OF TECH
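
The two-branch fusion plus branch-parallel heads can be sketched as follows, assuming PyTorch; the backbone layers, feature sizes, and five per-finger heads are placeholders standing in for the architecture the patent describes.

```python
# Two-branch RGB + depth fusion with per-finger joint regression heads.
import torch
import torch.nn as nn

class RGBDHandNet(nn.Module):
    def __init__(self, feat=128, joints_per_branch=4):
        super().__init__()
        self.rgb_branch = nn.Sequential(nn.Conv2d(3, feat, 3, 2, 1), nn.ReLU())
        self.depth_branch = nn.Sequential(nn.Conv2d(1, feat, 3, 2, 1), nn.ReLU())
        # one small head per finger ("branch-parallel" joint regression)
        self.heads = nn.ModuleList(
            nn.Linear(2 * feat, joints_per_branch * 3) for _ in range(5))

    def forward(self, rgb, depth):
        g = torch.cat([self.rgb_branch(rgb).mean((2, 3)),
                       self.depth_branch(depth).mean((2, 3))], dim=1)  # global feature
        return torch.stack([h(g) for h in self.heads], dim=1)  # (N, 5, J*3)

net = RGBDHandNet()
pred = net(torch.randn(2, 3, 64, 64), torch.randn(2, 1, 64, 64))
```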

Multi-mode molecular tomography system

Inactive | CN101984928A | Does not involve spatial registration issues; consistent geometric coordinates | Surgery; computerised tomographs | Soft X-ray; optical tomography
The invention relates to a multi-mode molecular tomography system, characterized by comprising one or more light sources (an X-ray source, a near-infrared laser source, and a finite-spectral-width light source) for projecting scanning light onto the object to be scanned, an electric loading device, an imaging device, and a control and processing device. The imaging device obtains the intensity distribution data of the X-rays, visible light, or near-infrared light emerging from the surface of the scanned object and inputs the data into the control and processing device. The control and processing device controls the motion of the scanned object through the electric loading device and comprises a tomography module, which receives the data from the imaging device, utilizes the XCT (X-ray computed tomography) and DOT (diffuse optical tomography) modes to reconstruct the outer boundary and the approximate information of all internal tissues, reconstructs XCT, DOT, FMT (fluorescence molecular tomography), and BLT (bioluminescence tomography) single-mode or fused multi-mode tomography images, and outputs them. The multi-mode molecular tomography system is applicable to the field of X-ray and optical biomedical imaging.
Owner:PEKING UNIV
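
Because all modalities here share one gantry coordinate frame, reconstructed volumes can be fused voxel-wise without a registration step; the alpha blend below is only an illustrative fusion rule, not the system's reconstruction algorithm.

```python
# Voxel-wise blend of co-registered volumes (illustrative fusion rule only).
import numpy as np

def fuse_volumes(xct, optical, alpha=0.6):
    """Blend an anatomical XCT volume with a co-registered optical (DOT/FMT/BLT)
    reconstruction; both assumed normalized to [0, 1] on the same voxel grid."""
    assert xct.shape == optical.shape, "shared geometry implies identical grids"
    return alpha * xct + (1.0 - alpha) * optical
```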