Beat type-based speech recognition endpoint dynamic control method and system
The method dynamically adjusts the silence timeout threshold used for speech recognition endpoint detection, tuning it to the beat type of the dialogue and to user characteristics. This addresses the high miscutting rate, long response delay, and inability to adapt to different speaking rhythms found in existing technologies, yielding more efficient endpoint detection.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Applications (China)
- Current Assignee / Owner: BEIJING FLOW ELEMENT TECHNOLOGY CO LTD
- Filing Date: 2026-03-06
- Publication Date: 2026-06-19
AI Technical Summary
Existing speech recognition endpoint detection technologies cannot adapt to rhythm differences between users and scenarios, cannot distinguish pauses within a sentence from pauses at the end of a sentence, and struggle to balance the miscutting rate against response delay. They also cannot effectively handle changes in beat type across multi-turn dialogues, which leaves both recognition accuracy and response speed insufficient.
Dialogue features are extracted in real time along multiple dimensions, including turn interval time, the pause ratio within a turn, and speech rate. A rule-based decision tree matches these features to a beat type, and endpoint detection then uses a silence timeout threshold that is set for that beat type and further optimized using the user's historical data and scene patterns.
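The matching step described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the beat type names (`rapid`, `moderate`, `deliberate`), every numeric cut-off, the base timeouts, and the `user_mean_pause_ms` parameter are assumptions chosen for illustration, since the summary does not disclose them. Scene patterns are left out for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DialogueFeatures:
    turn_interval_s: float   # gap between the previous turn ending and this turn starting
    pause_ratio: float       # fraction of the turn spent in silence, 0.0 to 1.0
    speech_rate_sps: float   # speech rate in syllables per second


# Hypothetical beat types and base silence timeouts in milliseconds.
BASE_TIMEOUT_MS = {"rapid": 400, "moderate": 700, "deliberate": 1200}


def match_beat_type(f: DialogueFeatures) -> str:
    """Rule-based decision tree over the three features (illustrative rules only)."""
    if f.turn_interval_s < 0.5 and f.speech_rate_sps >= 5.0:
        return "rapid"
    if f.pause_ratio > 0.3 or f.speech_rate_sps < 3.0:
        return "deliberate"
    return "moderate"


def silence_timeout_ms(f: DialogueFeatures,
                       user_mean_pause_ms: Optional[float] = None) -> int:
    """Pick the base timeout for the matched beat type, then adapt it to the user."""
    base = BASE_TIMEOUT_MS[match_beat_type(f)]
    if user_mean_pause_ms is None:
        return base
    # Assumed policy: never cut sooner than 1.5x the user's typical
    # in-sentence pause, and keep the result within 300..2000 ms.
    adjusted = max(base, 1.5 * user_mean_pause_ms)
    return int(min(max(adjusted, 300), 2000))
```

For example, a fast exchange such as `DialogueFeatures(0.3, 0.1, 6.0)` maps to the shortest timeout, while passing a large `user_mean_pause_ms` lengthens it so that a speaker who habitually pauses mid-sentence is not cut off early.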
The method reduces the overall miscutting rate by 78.7%, to 3.8%, and improves response speed. It adapts to different beat types, has low computational overhead, and outperforms methods based on semantic analysis.
Figure CN122245291A_ABST