Beat type-based speech recognition endpoint dynamic control method and system

By dynamically adjusting the silence timeout threshold used for speech recognition endpoint detection according to beat type and user characteristics, the method addresses the high erroneous segmentation rate, long response delay, and inability to adapt to different speaking rhythms found in existing technologies, achieving efficient endpoint detection.

CN122245291A · Pending · Publication Date: 2026-06-19 · BEIJING FLOW ELEMENT TECHNOLOGY CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Application (China)
Current Assignee / Owner: BEIJING FLOW ELEMENT TECHNOLOGY CO LTD
Filing Date: 2026-03-06
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

Existing speech recognition endpoint detection technologies cannot adapt to rhythm differences across users and scenarios, cannot distinguish pauses within a sentence from pauses at the end of a sentence, and struggle to balance the erroneous segmentation rate against response delay. They also cannot effectively handle changes in beat type during multi-turn dialogues, which results in insufficient recognition accuracy and response speed.

Method used

Multi-dimensional dialogue features, such as turn interval time, pause ratio within a turn, and speech rate, are extracted in real time. A rule-based decision tree matches these features to a beat type, and the silence timeout threshold used for endpoint detection is adjusted dynamically according to that beat type. The threshold is further optimized based on user history data and scene patterns.
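The matching step described above can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the beat type names ("rapid", "normal", "deliberate"), the feature cut-offs, the millisecond values, and the `user_scale` factor standing in for the user-history adjustment do not come from the patent text, which only names the three features and the use of a rule-based decision tree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TurnFeatures:
    turn_interval_s: float   # gap between the previous turn's end and this turn's start
    pause_ratio: float       # fraction of the turn spent in silence, in [0, 1]
    speech_rate_wps: float   # speech rate in words per second


# Hypothetical beat types and base silence timeouts (milliseconds).
BASE_TIMEOUT_MS = {"rapid": 300, "normal": 600, "deliberate": 1200}


def match_beat_type(f: TurnFeatures) -> str:
    """Tiny rule-based decision tree; all cut-offs are illustrative assumptions."""
    if f.pause_ratio >= 0.35 or f.speech_rate_wps < 1.5:
        return "deliberate"   # many pauses or slow speech: wait longer
    if f.turn_interval_s < 0.5 and f.speech_rate_wps >= 3.0:
        return "rapid"        # quick back-and-forth: cut sooner
    return "normal"


def silence_timeout_ms(f: TurnFeatures, user_scale: float = 1.0) -> int:
    """Base timeout for the matched beat type, scaled per user and clamped."""
    base = BASE_TIMEOUT_MS[match_beat_type(f)]
    return max(200, min(2000, round(base * user_scale)))


if __name__ == "__main__":
    turn = TurnFeatures(turn_interval_s=1.0, pause_ratio=0.4, speech_rate_wps=2.5)
    print(match_beat_type(turn), silence_timeout_ms(turn))
```

Keeping the rules as plain comparisons is what makes this kind of approach cheap to evaluate per turn; a per-user scale factor is only one of several ways the history-based optimization could be realized.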

Benefits of technology

The method reduces the overall erroneous segmentation rate by 78.7%, to 3.8%, improves response speed, adapts to the needs of different beat types, and has low computational overhead, outperforming semantic analysis-based methods.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN122245291A_ABST
Patent Text Reader

Abstract

A method and system for dynamic endpoint control in speech recognition based on beat type, belonging to the field of speech recognition technology. The method includes: extracting multi-dimensional dialogue features of the current turn in real time; matching the multi-dimensional dialogue features against preset beat type determination conditions to determine the beat type of the current turn; dynamically adjusting the silence timeout threshold used for endpoint detection according to the beat type; and performing endpoint detection on the current turn using the adjusted silence timeout threshold. The method also supports re-determining the beat type and updating the threshold during the turn, and can adaptively optimize the determination conditions and adjustment strategies based on the user's historical dialogue data and erroneous segmentation events. This application can significantly improve the accuracy of speech recognition endpoint detection, reduce erroneous segmentation, and enhance user experience.
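As a rough illustration of the last two steps in the abstract (endpoint detection with an adjustable silence timeout, and re-determining the threshold during the turn), here is a minimal Python sketch. It assumes a per-frame voice activity flag is already available, a 20 ms frame, and a caller-supplied `timeout_for` function that maps the running pause ratio to a timeout in milliseconds. These names, the frame length, and the re-evaluation interval are hypothetical and not taken from the patent.

```python
from typing import Callable, Iterable, Optional

FRAME_MS = 20  # assumed duration of one audio frame


def detect_endpoint(
    vad_flags: Iterable[bool],
    timeout_for: Callable[[float], int],
    reeval_every: int = 25,
) -> Optional[int]:
    """Return the index of the frame where the endpoint is declared, or None.

    vad_flags    -- True for frames containing speech, False for silence
    timeout_for  -- maps the running pause ratio to a silence timeout in ms
    reeval_every -- re-determine the timeout every N frames within the turn
    """
    speech_frames = 0
    pause_frames = 0          # silent frames seen after speech has started
    trailing_silence_ms = 0   # length of the current run of silence
    timeout_ms = timeout_for(0.0)

    for i, is_speech in enumerate(vad_flags):
        if is_speech:
            speech_frames += 1
            trailing_silence_ms = 0
        elif speech_frames:
            pause_frames += 1
            trailing_silence_ms += FRAME_MS

        # Periodically update the threshold from the features observed so far.
        if speech_frames and (i + 1) % reeval_every == 0:
            pause_ratio = pause_frames / (speech_frames + pause_frames)
            timeout_ms = timeout_for(pause_ratio)

        if speech_frames and trailing_silence_ms >= timeout_ms:
            return i
    return None


if __name__ == "__main__":
    # A speaker who pauses for 400 ms mid-sentence, then stops.
    flags = [True] * 5 + [False] * 20 + [True] * 5 + [False] * 60
    adaptive = lambda ratio: 600 if ratio >= 0.3 else 300
    print(detect_endpoint(flags, adaptive, reeval_every=5))
    print(detect_endpoint(flags, adaptive, reeval_every=25))
```

In this toy input, frequent re-evaluation lets the timeout grow before the mid-sentence pause exceeds it, while infrequent re-evaluation cuts the turn during that pause. This is only meant to show why updating the threshold within a turn matters, not how the patented system schedules its updates.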