Multimodal method for non-intrusive confirmation of a conference speaker's identity

Multimodal speaker identification technology, applied in neural learning methods, character and pattern recognition, biological neural network models, etc., to achieve high accuracy and improve efficiency
CN110807370A · Pending · Publication Date: 2020-02-18 · 南京星耀智能科技有限公司

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Application (China)
Current Assignee / Owner: 南京星耀智能科技有限公司
Publication Date: 2020-02-18


Abstract

The invention provides a multimodal method for non-intrusively confirming the identity of a conference speaker. The method draws on three modalities available in a conference (image, voice and text) and confirms the speaker's identity by recognizing the speaker's facial expression, voice and speaking style. Specifically, it comprises an expression recognition method based on a deep learning model, a voice recognition method based on an artificial intelligence algorithm, and a method for recognizing speech content using a text clustering algorithm. The whole process is automatic and requires no manual intervention: the artificial intelligence models confirm the speaker's identity without the speaker having to take any action, which greatly improves meeting and office efficiency while maintaining high accuracy.
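The abstract names three recognizers (expression, voice, speaking style) but this excerpt does not say how their outputs are combined. As a minimal sketch only, assuming each recognizer emits a per-speaker score in [0, 1], one common option is weighted score-level fusion followed by a threshold. The function names, the equal weights, and the 0.5 threshold below are all hypothetical and are not taken from the patent.

```python
from typing import Dict, Mapping, Optional

# Hypothetical equal weights; the source does not specify any weighting.
DEFAULT_WEIGHTS: Mapping[str, float] = {"expression": 1.0, "voice": 1.0, "text": 1.0}


def fuse_scores(
    modality_scores: Mapping[str, Mapping[str, float]],
    weights: Mapping[str, float] = DEFAULT_WEIGHTS,
) -> Dict[str, float]:
    """Return the weighted average score per speaker across the given modalities."""
    total_weight = sum(weights[m] for m in modality_scores)
    if total_weight <= 0:
        raise ValueError("need at least one modality with a positive weight")
    fused: Dict[str, float] = {}
    for modality, scores in modality_scores.items():
        share = weights[modality] / total_weight
        for speaker, score in scores.items():
            fused[speaker] = fused.get(speaker, 0.0) + share * score
    return fused


def confirm_speaker(
    modality_scores: Mapping[str, Mapping[str, float]],
    weights: Mapping[str, float] = DEFAULT_WEIGHTS,
    threshold: float = 0.5,  # assumed acceptance threshold, not from the source
) -> Optional[str]:
    """Return the best-scoring speaker, or None if no fused score reaches the threshold."""
    fused = fuse_scores(modality_scores, weights)
    if not fused:
        return None
    best = max(fused, key=lambda name: fused[name])
    return best if fused[best] >= threshold else None


# Illustrative scores only; real values would come from the three recognizers.
example_scores = {
    "expression": {"alice": 0.9, "bob": 0.1},
    "voice": {"alice": 0.6, "bob": 0.4},
    "text": {"alice": 0.3, "bob": 0.7},
}
confirmed = confirm_speaker(example_scores)
```

Fusing at the score level keeps the three recognizers independent, so any one of them could be retrained or replaced without touching the others; whether the patented method works this way cannot be determined from the text shown here.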

Description

Technical field

[0001] The invention belongs to the field of natural language processing and relates in particular to a multimodal method for non-intrusively confirming the identity of a conference speaker.

Background art

[0002] As the economy develops, efficient office work depends more and more on conference systems. Many conference systems today need to record what each speaker says so that the meeting can be summarized and reported. Meeting this requirement calls for an intelligent, fast method for distinguishing speakers.

[0003] Most current conference systems use a microphone to record the speaker's voice and thereby capture the content of the speech. Distinguishing different speakers requires assigning a microphone to each speaker. However, assigning multiple microphones may cause crosstalk: because the microphones are too close together, multiple microphones will recognize a person ...

Claims
