Method of video speech recognition and search

A video speech recognition technology, applied in speech recognition, speech analysis, special data processing applications, etc. It addresses problems such as the lack of detail in video search, the large volume of video search data, and clip and screenshot capture not yet being applied to search.

Active Publication Date: 2011-05-25

HUAQIN TECH CO LTD

5 cites, 28 cited by


AI Technical Summary

Problems solved by technology

[0002] Cloud technology and search technology are now widely used across industries, but video search technology is still at an exploratory stage. Because of the large amount of data involved, a video search is hard to express through image content or video clips, and results do not reach the level of detail. The video search in wide use today is a keyword search based on the file name and manually added labels.

Speech recognition technology is also widely used in various fields, but currently only as standalone speech recognition. Most applications recognize only short segments of speech, and the technology has not been researched or exploited in depth.

Likewise, it is currently possible to extract and play a clip from the middle of a video, or to capture a screenshot at a given point in time, but neither capability is applied to search.


Examples


Embodiment 1

[0032] As shown in Figure 2, in public security applications, the sound is captured while the camera's video content is recorded and is processed with speech recognition technology. The result is stored in the cloud; alternatively, only the text file is saved in the cloud and the actual data is kept on other convenient large-capacity storage. When audio and video files are searched, the results can take two forms: the text alone, or the text together with the video clips and screenshots of the corresponding time slices.
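The storage split and the two result forms described above could be sketched roughly as follows. This is only an illustration under assumptions: the `Segment` and `Recording` types, the `retrieve` function, the `with_media` flag, and the idea of referring to the externally stored media by a URI string are all hypothetical and do not come from the patent text. Speech recognition is assumed to have already produced the time-sliced text.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds into the recording
    end: float
    text: str     # recognized speech for this time slice


@dataclass
class Recording:
    # Hypothetical reference to the actual audio/video data, which may
    # live on separate bulk storage rather than in the cloud text index.
    media_uri: str
    segments: list[Segment]


def retrieve(recordings: list[Recording], keyword: str,
             with_media: bool = False) -> list[dict]:
    """Search the stored text; optionally attach clip and screenshot references."""
    kw = keyword.lower()
    hits = []
    for rec in recordings:
        for seg in rec.segments:
            if kw in seg.text.lower():
                hit = {"text": seg.text}
                if with_media:
                    # Clip = the matching time slice; screenshot = its first frame.
                    hit["clip"] = (rec.media_uri, seg.start, seg.end)
                    hit["screenshot_at"] = seg.start
                hits.append(hit)
    return hits
```

With `with_media=False` the search touches only the text index, which matches the case where only the text file is kept in the cloud; with `with_media=True` each hit also carries enough information to fetch the clip and a screenshot from wherever the media is stored.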

Embodiment 2

[0034] As shown in Figure 3, in personal applications, video files can also be searched in a way similar to online video media. Special applications are possible as well. For example, when tidying up, the user records where items are stored while saying the name of each item aloud; later, entering an item name in a search finds where that item is stored. This avoids the difficulty of finding things when the organizer has forgotten or when someone other than the organizer is searching. For instance, while tidying a room, the user films and says: "put summer clothes here, dad's shirts here, mom's coats here, brother's pencils here, my sister's cosmetics here." Searching for "shirt" then returns multiple shirt results; from the screenshots, the user identifies the time slice of the target shirt or finds the location directly. This household application can greatly reduce conflicts caused by missing items or by family members remembering the same thing differently, and is especially c...

Embodiment 3

[0036] As shown in Figure 4, for searching online video media, the cloud analyzes the video, converts the sound into text by speech recognition technology, and marks the text along the Time Line in a subtitle-like manner. A search can then list the matching subtitle text together with the video clips and screenshots at the corresponding Time Line positions. For example, when the user remembers only some of the lines of a certain movie, this technology can be used to retrieve the video from those lines.
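One way to picture searching subtitle-like text along the Time Line is the sketch below. It assumes the recognizer has already produced a list of timed text entries; the `Cue` type and the `find_fragment` function are hypothetical names, and plain case-insensitive substring matching stands in for whatever matching the real system uses. Because a remembered line may straddle two adjacent entries, the entries are joined before matching and each match is mapped back to a time slice.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # seconds from the beginning of the video
    end: float
    text: str     # recognized speech shown for this time slice


def find_fragment(cues: list[Cue], fragment: str) -> list[tuple[float, float]]:
    """Return (start, end) time slices whose joined text contains the fragment."""
    needle = " ".join(fragment.lower().split())
    if not needle:
        return []

    # Join the normalized cue texts and remember each cue's character span.
    spans = []  # (first_char, end_char, cue)
    joined = ""
    for cue in cues:
        text = " ".join(cue.text.lower().split())
        if joined:
            joined += " "
        spans.append((len(joined), len(joined) + len(text), cue))
        joined += text

    hits = []
    pos = joined.find(needle)
    while pos != -1:
        end = pos + len(needle)
        # Cues whose character span overlaps the match.
        matched = [c for s, e, c in spans if s < end and e > pos]
        hits.append((matched[0].start, matched[-1].end))
        pos = joined.find(needle, pos + 1)
    return hits
```

Each returned time slice can then be used to cut the corresponding clip or take a screenshot for the result list.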

[0037] In summary, by adopting the above technology, the present invention can search videos both broadly and in a targeted way, and the technology can also be used for rapid positioning in public security and in searching for personal items.


Abstract

The invention discloses a method of video speech recognition and search, comprising the following steps: 1) converting all of the sound in a video into text by speech recognition; 2) storing the text independently or attaching it to the video; 3) selecting the several words that occur most frequently in the text as the word labels of the video, the word labels being appended after the video's file name; and 4) searching the word labels of all videos. The method can be used to search videos both broadly and in a targeted way, and to carry out quick positioning in public security and private item searches.
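Steps 3 and 4 of the abstract can be illustrated with a minimal sketch, assuming step 1 has already produced a transcript string for each video. The function names, the stop-word list, the default of three labels, and the underscore-separated naming scheme are all assumptions made for illustration; the abstract specifies only that the most frequent words become labels appended after the file name.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the abstract does not say how common words are handled.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "are", "here", "put"}


def word_labels(transcript: str, k: int = 3) -> list[str]:
    """Step 3: pick the k most frequent non-stop words of a transcript."""
    words = [w for w in re.findall(r"[a-z']+", transcript.lower())
             if w not in STOP_WORDS]
    return [word for word, _ in Counter(words).most_common(k)]


def labelled_name(file_name: str, labels: list[str]) -> str:
    """Step 3: append the labels after the file name, before any extension."""
    if not labels:
        return file_name
    suffix = "_" + "_".join(labels)
    stem, dot, ext = file_name.rpartition(".")
    if not dot:  # no extension present
        return file_name + suffix
    return f"{stem}{suffix}{dot}{ext}"


def search(index: dict[str, list[str]], query: str) -> list[str]:
    """Step 4: return the videos whose word labels contain the query word."""
    q = query.lower()
    return sorted(name for name, labels in index.items() if q in labels)
```

Here `index` maps each video file name to its label list. The lookup is an exact match on labels, so handling word variants such as "shirt" versus "shirts" would need an extra normalization step that this sketch leaves out.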

Description

Technical field

[0001] The invention relates to the field of video processing, and in particular to a method of video speech recognition and retrieval.

Background

[0002] Cloud technology and search technology are now widely used across industries, but video search technology is still at an exploratory stage. Because of the large amount of data involved, a video search is hard to express through image content or video clips, and results do not reach the level of detail. The video search in wide use today is a keyword search based on the file name and manually added labels. Speech recognition technology is also widely used in various fields, but currently only as standalone speech recognition; most applications recognize only short segments of speech, and the technology has not been researched or exploited in depth. Likewise, it is currently possible to extract and play a clip from the middle of a video, or to capture a screenshot at a certain ...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G10L15/26, G06F17/30

Inventor: 刘伟奇

Owner: HUAQIN TECH CO LTD
