47 results about How to "Implement speech recognition" patented technology

Speech recognition model training method, speech recognition method and related devices

The embodiment of the invention provides a speech recognition model training method, a speech recognition method and related devices. The training method comprises the steps of: determining a training current mixed-language audio; obtaining training initial acoustic features; using a first language module to obtain a training first time-sequence position acoustic feature; using a second language module to obtain a training second time-sequence position acoustic feature; fusing and text-coding the two time-sequence position acoustic features to obtain a training current fusion text feature; obtaining a first training current prediction text feature from the training current fusion text feature and a previous reference text feature; obtaining a first loss from the first training current prediction text feature and a current reference text feature; then obtaining a model loss; and adjusting the parameters of the speech recognition model according to the model loss until the trained speech recognition model is obtained. The speech recognition model training method, the speech recognition method and the related devices provided by the embodiment of the invention can improve speech recognition accuracy.
Owner:BEIJING CENTURY TAL EDUCATION TECH CO LTD
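
The loss computation in the abstract above can be illustrated with a minimal sketch. It is not the patented model: the element-wise average used for fusion, the `transition` table that stands in for conditioning on the previous reference text feature, and every function name are assumptions made for illustration. The parameter update step is omitted.

```python
import math

def fuse(first_feat, second_feat):
    """Fuse the two per-language features (assumed rule: element-wise average)."""
    return [(a + b) / 2.0 for a, b in zip(first_feat, second_feat)]

def log_softmax(scores):
    """Numerically stable log-softmax over a list of scores."""
    m = max(scores)
    log_total = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_total for s in scores]

def position_loss(fused, prev_ref_id, ref_id, transition):
    """Cross-entropy of the current reference token, given the fused feature
    and the previous reference token (hypothetical additive conditioning)."""
    scores = [f + t for f, t in zip(fused, transition[prev_ref_id])]
    return -log_softmax(scores)[ref_id]

def model_loss(first_feats, second_feats, ref_ids, transition, start_id=0):
    """Average the per-position losses over the whole sequence."""
    prev_ids = [start_id] + ref_ids[:-1]
    losses = [
        position_loss(fuse(f1, f2), prev, ref, transition)
        for f1, f2, prev, ref in zip(first_feats, second_feats, prev_ids, ref_ids)
    ]
    return sum(losses) / len(losses)
```

With all-zero features and an all-zero transition table, every token is equally likely, so the loss equals the logarithm of the vocabulary size. A training loop would lower this value by adjusting the parameters that produce the features.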

Smart voice cell phone or smart voice tablet computer

The invention discloses a smart voice cell phone or smart voice tablet computer and relates to the technical field of electronic devices. The smart voice cell phone or smart voice tablet computer comprises a cloud server and an electronic device. A voice wake-up module, a voice recognition module, a voice command module, a semantic analysis module, a voice synthesis module and a program control module are arranged in the electronic device. The voice recognition module is connected with the cloud server through a wired communication module. The voice recognition module is connected with the voice command module through an information analysis and processing and signal conversion and transmission module. The voice command module is connected with the program control module through an information processing and signal transmission module. With the voice wake-up module, the voice recognition module, the voice command module, the semantic analysis module, the voice synthesis module and the program control module arranged in the electronic device, the smart voice cell phone or smart voice tablet computer achieves voice interaction, wakes up programs directly through the voice wake-up module without the applications being opened manually, and operates software applications through an easy-to-use voice interaction function.
Owner:ANHUI SEMXUM INFORMATION TECH CO LTD

Intelligent financial counseling robot convenient to use

Inactive | CN108406782A | Handling financial consulting services conveniently | Improve stability | Programme-controlled manipulator | Computer case | Bevel gear
The invention discloses an intelligent financial counseling robot convenient to use and relates to the technical field of financial counseling equipment. The robot comprises a bottom plate. The top of the bottom plate is fixedly connected with a crate. A transmission case is slidably connected between the two sides of the inner wall of the crate. The bottom of the inner wall of the transmission case is fixedly connected with a motor through a connecting block, and the outer surface of the output shaft of the motor is fixedly connected with a first bevel gear. A bidirectional threaded rod is rotationally connected between the two sides of the inner wall of the transmission case through bearings, and the outer surface of the middle of the bidirectional threaded rod is fixedly connected with a second bevel gear that meshes with the first bevel gear. The height of the touch screen can be adjusted automatically according to the height of a client and the distance to the client, so the robot suits different clients and clients can conveniently use it to handle financial counseling business.
Owner:朱晓丹

Conference summary generation method and device, electronic equipment and storage medium

The invention discloses a conference summary generation method and device, electronic equipment and a storage medium. The method comprises the following steps: extracting a spectrogram of conference voice data; using an acoustic model of a preset intelligent decoding engine to determine, according to the spectrogram, a first probability value between a signal feature of the conference voice data and a phoneme template, so as to obtain a phoneme feature corresponding to the signal feature; using a language model of the preset intelligent decoding engine to determine a second probability value between the phoneme feature and a character template, so as to obtain a character feature corresponding to the phoneme feature; and decoding the conference voice data with a decoder of the preset intelligent decoding engine according to the first probability value and the second probability value to obtain conference text data. This realizes end-to-end voice recognition without directly extracting voice features and improves voice recognition efficiency and accuracy in complex scenes. Finally, an error correction operation is performed on the conference text data to generate a conference summary, which further ensures the accuracy of the final result.
Owner:广西中科曙光云计算有限公司 +1
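
The decoding step in the abstract above combines an acoustic probability with a language-model probability. The sketch below shows one common way to do that, by summing log-probabilities and keeping the best candidate. The candidate list, the probability values and the `lm_weight` parameter are illustrative assumptions, not details taken from the patent.

```python
import math

def decode(candidates, lm_weight=1.0):
    """Pick the candidate text with the highest combined log-score.

    Each candidate is (text, acoustic_probability, language_model_probability).
    The combined score is log(acoustic) + lm_weight * log(language model).
    """
    best_text, best_score = None, -math.inf
    for text, acoustic_p, lm_p in candidates:
        score = math.log(acoustic_p) + lm_weight * math.log(lm_p)
        if score > best_score:
            best_text, best_score = text, score
    return best_text

# Made-up probabilities: the acoustic model slightly prefers the wrong
# transcript, and the language model corrects it.
candidates = [
    ("recognize speech", 0.4, 0.5),
    ("wreck a nice beach", 0.5, 0.01),
]
best = decode(candidates)
```

Setting `lm_weight=0.0` ignores the language model, in which case the acoustically preferred candidate wins instead.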

Paint spraying robot voice recognition method based on multi-scale enhanced BiLSTM model

The invention discloses a paint spraying robot voice recognition method based on a multi-scale enhanced BiLSTM model. The method comprises the following steps: 1) acquiring common spraying sound instructions with a signal acquisition system, with an NI-9234 selected as the data acquisition card; 2) adding Gaussian white noise to the collected audio signal 100 times to generate noisy signals, computing the corresponding Mel spectrum sequence for each, and then computing the average of the 100 Mel spectrum sequences; 3) extracting features from the average Mel spectrum sequence with a multi-scale convolution filter, and then further mining the extracted features with a BiLSTM model to obtain the corresponding output; 4) splicing the outputs of the BiLSTM model together, feeding the spliced outputs to a fully connected layer and a Softmax layer, and finally realizing speech recognition in combination with a CTC algorithm; and 5) embedding the model trained in steps 1) to 4) into a spraying robot, so that the corresponding spraying tasks are carried out intelligently. The model enables an intelligent voice recognition function for the spraying robot and has high practical application value.
Owner:JINLING INST OF TECH
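
Step 4) of the abstract above ends with a CTC algorithm applied to the Softmax output. The abstract does not say which decoding variant is used, so the sketch below assumes the simplest one, greedy CTC decoding: take the most probable label per frame, collapse consecutive repeats, then drop the blank label. The label set and the blank index are assumptions for illustration.

```python
def ctc_greedy_decode(frame_probs, labels, blank=0):
    """Greedy CTC decoding over per-frame probability vectors.

    frame_probs: list of per-frame probability lists (Softmax outputs).
    labels: label for each class index; labels[blank] is the CTC blank.
    """
    # Most probable class index for every frame.
    best_ids = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]

    decoded = []
    prev = None
    for idx in best_ids:
        # Keep a label only when it starts a new run and is not the blank.
        if idx != prev and idx != blank:
            decoded.append(labels[idx])
        prev = idx
    return decoded
```

A blank between two identical labels keeps both of them, which is how CTC represents a genuinely repeated symbol.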

Method and system for realizing voice age and/or gender recognition service, and medium

The invention relates to the field of voice recognition and particularly relates to a method, a system and a device for realizing a voice age and/or gender recognition service, and a medium. It aims to solve the technical problems of calling an existing voice age and/or gender recognition model remotely and accurately and of deploying it simply and conveniently. To this end, a terminal calls a server with a serialized voice age/gender recognition request under a predefined gRPC framework, and the server recognizes the age/gender through a configured age/gender voice recognition service: the corresponding voice age/gender recognition deep neural network model is selected to decode and determine the age and/or gender information of the target object, and this information is returned to the terminal. Because the age and/or gender service mode and the remote calling architecture are configured, the corresponding model is called after the model type is determined. Calling is therefore more accurate and does not depend on a fixed framework; the method is more flexible, highly expandable, uses resources efficiently and supports high concurrency, and it also makes iterative updating of the algorithm model easier.
Owner:GUANGZHOU YUNCONG INFORMATION TECH CO LTD
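
The server-side model selection described in the abstract above can be sketched as a registry lookup. This is an illustration only: JSON stands in for the serialized gRPC message, no real gRPC call is made, and the request fields (`tasks`, `samples`), the registry and the placeholder models are all hypothetical.

```python
import json

def age_model(samples):
    # Placeholder: a real service would run a trained neural network here.
    return {"age_group": "placeholder"}

def gender_model(samples):
    # Placeholder: a real service would run a trained neural network here.
    return {"gender": "placeholder"}

# Hypothetical registry mapping a requested task type to its model.
MODEL_REGISTRY = {"age": age_model, "gender": gender_model}

def handle_request(serialized_request):
    """Deserialize the request, select the model per task, return the result."""
    request = json.loads(serialized_request)
    result = {}
    for task in request["tasks"]:
        model = MODEL_REGISTRY.get(task)
        if model is None:
            return json.dumps({"error": f"unsupported task: {task}"})
        result.update(model(request["samples"]))
    return json.dumps(result)
```

Adding a new model type only requires a new registry entry, which is one way to get the expandability the abstract mentions.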

Device for transferring speech recognition to video

The invention discloses a device for transforming speech recognition into video, which comprises: an identifying code establishing module, used for establishing a corresponding identifying code for each type of video resource when a media server is started; an audio stream receiving module, connected with the identifying code establishing module and used for establishing a connecting channel for the audio stream and receiving the audio stream after the media server receives a request from an application server; a speech identifying module, connected with the audio stream receiving module and used for identifying the audio data and outputting the identified data to a transformation processing module; the transformation processing module, connected with the speech identifying module and the identifying code establishing module and used for transforming the received speech identification data and comparing the transformed data with the identifying codes established by the identifying code establishing module, thus realizing video transformation; and a video stream output module, connected with the transformation processing module and used for outputting the transformed video stream to a terminal unit through a network.
Owner:ZTE CORP
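
The identifying-code comparison in the abstract above can be sketched as a table lookup. The sketch assumes that speech has already been recognized as text, that identifying codes are small integers assigned at startup, and that video streams are keyed by code. None of these details come from the patent, and all names are hypothetical.

```python
def build_code_table(video_types):
    """Assign an identifying code to each video resource type at startup."""
    return {name.lower(): code for code, name in enumerate(video_types, start=1)}

def to_code(recognized_text, code_table):
    """Transform recognized speech text into an identifying code, if any."""
    return code_table.get(recognized_text.strip().lower())

def select_stream(recognized_text, code_table, streams):
    """Compare the transformed code with the table and pick the video stream."""
    code = to_code(recognized_text, code_table)
    return streams.get(code)

code_table = build_code_table(["News", "Sports"])
streams = {1: "news.ts", 2: "sports.ts"}  # hypothetical stream identifiers
chosen = select_stream(" Sports ", code_table, streams)
```

A request whose recognized text matches no established code returns `None`, so the caller can decide how to handle an unrecognized command.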