Method and device for intercepting voice of a target person in a video

A technique for intercepting the voice of a target person in a video, applied in fields such as instruments, character and pattern recognition, and electrical components. It addresses the problems that existing approaches demand high audio clarity, that speech is difficult to intercept, and that speech interception efficiency is low.

Active Publication Date: 2019-06-18
SPEAKIN TECH CO LTD

AI Technical Summary

Problems solved by technology

[0003] The embodiments of the present application provide a method and device for intercepting the voice of a target person in a video, addressing the problem that current voice separation algorithms place high demands on audio clarity, and

Examples


Embodiment Construction

[0039] To enable those skilled in the art to better understand the solution of this application, the technical solutions in the embodiments of the application are described clearly and completely below with reference to the drawings in those embodiments. The described embodiments are only some of the embodiments of this application, not all of them. All other embodiments obtained by persons of ordinary skill in the art on the basis of these embodiments, without creative effort, fall within the scope of protection of this application.

[0040] This application provides a method and device for intercepting the voice of a target person in a video. It addresses the technical problems that current voice separation algorithms place high demands on audio clarity, that noise has a large influence in noisy environments, that voice interception is difficult, and that the efficiency of voice in...

Abstract

The embodiments of the invention disclose a method and a device for intercepting the voice of a target person in a video. The method comprises the following steps: using a lip-shape voice activity detection model, assigning a first mark to each video frame of the audio and video file in which the target person shows voice activity, and a second mark to each video frame in which the target person shows no voice activity, to obtain a first mark sequence; determining the first start-stop time points of the video frames in the first mark sequence that carry the first mark for a preset number of consecutive frames; and determining the second start-stop time points of the corresponding voice frames in the audio and video file, so that the corresponding voice segment in the audio and video file is intercepted directly according to the second start-stop time points. According to the method and the device, the voice segment file of the target person is obtained through the human-voice separation algorithm and human-voice separation is realized. This solves the technical problems that current human-voice separation algorithms place high demands on audio clarity, that the audio must first be denoised before human-voice separation, that noise has a large influence in noisy environments, that voice interception is difficult, and that voice interception efficiency is low.
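The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patent's implementation: the lip-shape voice activity detection model is not shown, and the per-frame marks are assumed to be already available as a list of 1 (first mark) and 0 (second mark). The function names `speech_intervals`, `to_sample_ranges`, and `cut_audio`, as well as the frame rate, sample rate, and minimum run length used in the example, are hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple

SPEAKING = 1      # "first mark": lip activity detected for the target person
NOT_SPEAKING = 0  # "second mark": no lip activity detected


def speech_intervals(marks: Sequence[int], fps: float,
                     min_run: int) -> List[Tuple[float, float]]:
    """Return (start, stop) times in seconds for each run of at least
    `min_run` consecutive SPEAKING frames in the mark sequence."""
    intervals = []
    run_start = None
    # A trailing NOT_SPEAKING sentinel closes a run that reaches the last frame.
    for i, mark in enumerate(list(marks) + [NOT_SPEAKING]):
        if mark == SPEAKING:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start >= min_run:
                intervals.append((run_start / fps, i / fps))
            run_start = None
    return intervals


def to_sample_ranges(intervals: Sequence[Tuple[float, float]],
                     sample_rate: int) -> List[Tuple[int, int]]:
    """Map video start-stop times to audio sample index ranges."""
    return [(round(start * sample_rate), round(stop * sample_rate))
            for start, stop in intervals]


def cut_audio(samples: Sequence, ranges: Sequence[Tuple[int, int]]) -> List:
    """Slice the audio track directly at the given sample ranges."""
    return [samples[a:b] for a, b in ranges]


# Example with assumed values: 25 fps video, 16 kHz audio, runs of >= 3 frames.
marks = [0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1]
intervals = speech_intervals(marks, fps=25, min_run=3)
ranges = to_sample_ranges(intervals, sample_rate=16000)
```

In this sketch the single isolated speaking frame at index 5 is discarded because it is shorter than the assumed minimum run, which mirrors the abstract's requirement of a preset number of consecutive marked frames. The audio is cut purely by time, so no denoising or separation step appears in the code.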

Description

Technical field

[0001] The present application relates to the technical field of speech processing, in particular to a method and device for intercepting the speech of a target person in a video.

Background

[0002] When public security authorities perform voiceprint identification, they need to compare against the voiceprint of a suspect's voice. When extracting the voiceprint, some collected audio files were recorded in noisy environments with many speakers, so the human voices in the audio must be separated to obtain the voice of the target person. Dedicated voice separation algorithms exist, but they place high demands on audio clarity: the audio must be denoised before voice separation is performed. In a noisy environment the influence of noise is large and speech is difficult to intercept, which leads to the technical problem of low voice interception efficiency.

Summary of the invention

[0003] The embod...

Application Information

IPC(8): H04N21/439, H04N21/845, G06K9/62, G06K9/00
Inventor: 郑棉洲, 吕莉丽
Owner: SPEAKIN TECH CO LTD