354 results about "Audio data querying" patented technology

Audio request interaction system

A person can use a portable electronic device to electronically purchase or otherwise request a product, service or other deliverable related to audio programming to which the person is listening at the time they initiate the request. The request is fulfilled by a service that analyzes the audio content to identify the deliverable the person desires.
Owner:NCR CORP

Conference record generation method based on voice recognition, device and storage medium

The invention discloses a conference record generation method based on voice recognition, and further discloses an electronic device and a computer storage medium. The conference record generation method comprises the steps that a conference record generation instruction sent by a user is received, and audio to be converted is acquired; sentence division is performed on the to-be-converted audio, and audio sentences of the to-be-converted audio are obtained; voiceprint features are extracted from the recognized audio sentences separately, the voiceprint features corresponding to the audio sentences are compared with a preset voiceprint feature library for analysis to determine speaker identity information corresponding to each audio sentence, and the audio sentences are divided into voice segments according to the speaker identity information to determine a voice segment set corresponding to the to-be-converted audio; target voice recognition models corresponding to the voice segments are called to sequentially obtain text corresponding to each voice segment; and a conference record corresponding to the to-be-converted audio is generated. By using the conference record generation method based on voice recognition, accuracy and efficiency of conference record generation can be improved.
Owner:招商局金融科技有限公司
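The segmentation step in the abstract above (sentences labeled by speaker, then divided into voice segments) might be sketched as follows. This is only an illustration: it assumes each sentence already carries a speaker label from the voiceprint comparison, and it assumes that a segment is a run of consecutive sentences from the same speaker, which the abstract does not state. The function name and the clip identifiers are hypothetical.

```python
from itertools import groupby


def build_segments(labeled_sentences):
    """Group time-ordered (speaker, clip_id) pairs into per-speaker segments.

    Assumption: consecutive sentences by the same speaker form one segment.
    """
    return [
        (speaker, [clip_id for _, clip_id in group])
        for speaker, group in groupby(labeled_sentences, key=lambda item: item[0])
    ]


# Hypothetical speaker labels and clip identifiers.
sentences = [("alice", "s1"), ("alice", "s2"), ("bob", "s3"), ("alice", "s4")]
for speaker, clips in build_segments(sentences):
    print(speaker, clips)
```

Each resulting segment could then be passed to the recognition model chosen for that segment, as the abstract describes.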

Systems and methods for recording, searching, and sharing spoken content in media files

Systems for recording, searching for, and sharing media files among a plurality of users are disclosed. The systems include a server that is configured to receive, index, and store a plurality of media files, which are received by the server from a plurality of sources, within at least one database in communication with the server. In addition, the server is configured to make one or more of the media files accessible to one or more persons other than the original sources of such media files. Still further, the server is configured to transcribe the media files into text; receive and publish comments associated with the media files within a graphical user interface of a website; and allow users to query and play back excerpted portions of such media files.
Owner:VOICEBASE

Audio processing method and device based on artificial intelligence

The invention discloses an audio processing method and device based on artificial intelligence. One concrete implementation of the method includes the following steps: an audio file to be processed is converted into an image to be processed; the content feature of the image to be processed is extracted; a target image is determined according to a style feature and the content feature of the image to be processed, where the style feature is obtained from a template image converted from a template audio file; and the target image is converted into the processed audio file. With this implementation, the processed audio file takes on the style of the template audio without changing the content of the audio file to be processed, and audio processing efficiency and flexibility are improved.
Owner:BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

Server device, client device, information processing system, information processing method, and program

An information processing method of an information processing system including a client device functioning as a client and a server device functioning as a server which are capable of communication via a network, includes the steps of: at the server device, managing format identification information provided so as to be unique for each content data within the range of each data format, corresponding to the contents of each content; at the client device, causing execution of communication via network, for specifying a content, as to the server device; at the server device, transmitting, to the client device, format identification information correlated with a specified content in response to specification of a content via network from the client device; and at the client device, managing the received and obtained format identification information as corresponding to the specified content.
Owner:SONY CORP

Systems, methods, and computer products for recommending media suitable for a designated style of use

Media suitable for a designated style of use are recommended from among a database of media objects. A prediction engine is trained using a plurality of media object lists, each of the media object lists containing metadata associated with a plurality of media objects, the media object lists corresponding to the designated style of use, and the prediction engine being trained to calculate the likelihood that a media object is suitable for the designated style of use. The prediction engine includes a binomial classification model trained with feature vectors that include behavioral data, acoustic data and cultural data for the plurality of media objects. The trained prediction engine is applied to media objects in the database of media objects so as to calculate likelihoods that the media objects are suitable for the designated style of use. One or more media objects are recommended using the calculated likelihoods.
Owner:SPOTIFY

Virtual Wireless Multitrack Recording System

Disclosed are systems and methods for wirelessly recording multi-track audio files without the data corruption or loss of data that typically occurs with wireless data transmission. In some aspects of the present invention, each performer is equipped with a local audio device capable of locally recording the respective performer's audio while also transmitting it to a master recorder. The locally recorded audio may then be used to repair or replace any audio lost or corrupted during transmission to the master recorder. Such repair or replacement may be performed electronically or via playback of the locally recorded audio. In other aspects of the present invention, a master recorder is not required since all locally recorded audio may be combined or otherwise processed post-recording. Locally recorded audio may include identifiers to aid in post-recording identification of such audio. A multi-memory unit is also provided to facilitate manipulation and processing of audio files.
Owner:ZAXCOM

Methods, systems, and computer program products for categorizing/rating content uploaded to a network for broadcasting

Methods, systems, and computer program products that automatically categorize and/or assign ratings to content (video and audio content) uploaded by individuals who want to broadcast the content to others via a communications network, such as an IPTV network, are provided. When an individual uploads content to a network, a network service automatically extracts an audio stream from the uploaded content. Words in the extracted audio stream are identified. For each identified word, a preexisting library of selected words is queried to determine if a match exists between words in the library and words in the extracted audio stream. The selected words in the library are associated with a particular content category or content rating. If a match exists between an identified word and a word in the library, the uploaded content is assigned a content category and/or rating associated with the matched word.
Owner:AT&T INTPROP I LP
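The library lookup described in the abstract above might look roughly like the sketch below. It assumes the words of the audio stream have already been identified and are available as a transcript string. The library contents, category names, and ratings are invented for illustration and do not come from the patent.

```python
import re

# Hypothetical word library: each selected word maps to a (category, rating) pair.
WORD_LIBRARY = {
    "goal": ("sports", "G"),
    "election": ("news", "PG"),
    "explosion": ("action", "PG-13"),
}


def categorize(transcript, library=WORD_LIBRARY):
    """Return every (category, rating) pair whose library word occurs in the transcript."""
    words = re.findall(r"[a-z']+", transcript.lower())
    return {library[word] for word in words if word in library}


print(sorted(categorize("What a goal! The election coverage follows.")))
```

Returning a set lets one upload pick up several categories or ratings at once. How conflicting ratings would be resolved is left open here.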

Content atomization

Organizing and publishing content in a content management system wherein content, including text, images and video, is received and segmented into content atoms. One or more tags are associated with the content atoms to allow device specific presentation of the content atoms.
Owner:SINCLAIR BROADCAST GRP

Speech retrieval apparatus and speech retrieval method

Inactive | US20110071833A1 | Error robustness | Error speech recognition accuracy | Audio data retrieval | Speech recognition | Serialization | Acoustic model
Disclosed are a speech retrieval apparatus and a speech retrieval method for searching, in a speech database, for an audio file matching an input search term by using an acoustic model serialization code, a phonemic code, a sub-word unit, and a speech recognition result of speech. The speech retrieval apparatus comprises a first conversion device, a first division device, a first speech retrieval unit creation device, a second conversion device, a second division device, a second speech retrieval unit creation device, and a matching device. The speech retrieval method comprises a first conversion step, a first division step, a first speech retrieval unit creation step, a second conversion step, a second division step, a second speech retrieval unit creation step, and a matching step.
Owner:RICOH KK

Voice command processing method and electronic device utilizing the same

A voice command processing method provides a unified voice control interface to access and control Internet of things (IoT) devices and to configure the values of attributes of graphical user interface (GUI) elements, attributes of applications, and attributes of the IoT devices. When a voice command comprises an expression of a percentage or a fraction of a baseline value of an attribute, or an exact value of the attribute of an IoT device, the unified voice control interface sets the attribute of the IoT device in response to the percentage, the fraction, or the exact value in the voice command.
Owner:CLOUD NETWORK TECH SINGAPORE PTE LTD
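As a rough illustration of the three value forms named in the abstract above (percentage of a baseline, fraction of a baseline, exact value), the sketch below resolves a transcribed command to a number. The function name, the regular expressions, and the example phrasings are assumptions made here, not details from the patent.

```python
import re
from fractions import Fraction


def resolve_attribute_value(command, baseline):
    """Return the attribute value implied by a transcribed command, or None."""
    text = command.lower()

    # Percentage of the baseline, e.g. "50%" or "50 percent".
    match = re.search(r"(\d+(?:\.\d+)?)\s*(?:%|percent)", text)
    if match:
        return baseline * float(match.group(1)) / 100

    # Fraction of the baseline, e.g. "3/4".
    match = re.search(r"(\d+)\s*/\s*(\d+)", text)
    if match:
        numerator, denominator = int(match.group(1)), int(match.group(2))
        if denominator == 0:
            return None
        return baseline * float(Fraction(numerator, denominator))

    # Otherwise treat the first number as an exact value.
    match = re.search(r"\d+(?:\.\d+)?", text)
    return float(match.group(0)) if match else None


print(resolve_attribute_value("set brightness to 50 percent", 80))
print(resolve_attribute_value("set volume to 3/4", 100))
print(resolve_attribute_value("set temperature to 22", 30))
```

The percentage check runs first so that a command such as "50 percent" is not mistaken for the exact value 50.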

Wireless environment method and apparatus

A method and apparatus are provided for a first controlled device, such as a wireless local transmitter that accepts a plurality of digital audio signals and corresponding program information signals converted from a controlled source, such as the encoded digital data provided by a digital data signal source, typically a community antenna television (CATV) cable or direct broadcast satellite, then modulates said digital audio and corresponding program information signals on different carrier frequencies and transmits the modulated signals to a plurality of second controlled devices, such as remote digital receiver/tuners that demodulate said signals to output music in stereophonic sound and display the corresponding program information by means of an alphanumeric display. The first and second controlled devices contain microprocessor systems for communicating, controlling, storing, processing, and display of digital data within the operation of the respective system. A high speed, error free digital signal protocol is utilized for transmitting digital audio and corresponding program information signals to and from the digital receiver/tuner. The digital data transmitter and digital receiver/tuners utilize the 44.1 kilohertz (kHz) Compact Disc (CD) clock signal embedded in digital audio signals provided by an established delivery system to derive clocking signals for reception and processing of digital audio signals and for implementing the display information communications protocol.
Owner:DIGITAL STREAM IP LLC

Audio ownership system

System for providing music to users.
Owner:VILCAUSKAS ANDY +1

Diagnostic system and diagnostic method

Active | US20170076451A1 | Medical diagnosis can be effectively assisted | Reduce data | Image enhancement | Image analysis | Reference image | Data encoding
A first interface for reading a medical patient image record is provided. Furthermore, provision is made of an encoding module for machine-based learning of data encodings of image patterns by an unsupervised deep learning and for establishing a deep-learning-reduced data encoding of a patient image pattern contained in the patient image record. Furthermore, provision is made of a comparison module for comparing the established data encoding with reference encodings of reference image patterns stored in a database and for selecting a reference image pattern with a reference encoding which is similar to the established data encoding. An assignment module serves to establish a key term assigned to the selected reference image pattern and to assign the established key term to the patient image pattern. A second interface is provided for outputting the established key term with assignment to the patient image pattern.
Owner:SIEMENS HEALTHCARE GMBH

Three-dimensional generalized space

According to one embodiment, audio and non-audio data can be represented as sound sources in a three-dimensional sound space adapted to also provide visual data. Non-audio data can be associated with audio sound sources presented in the sound space. Navigation within this combined three-dimensional audio/visual space can be based primarily on the audio aspects of the sound sources with the details of the non-audio data being presented on demand, for example, when the listener navigates through the combined three-dimensional audio/visual space to a particular sound source at which point the non-audio data associated with that sound source can be presented.
Owner:AVAYA INC

A rhythm point recognition method and device, an electronic device and a storage medium

The invention discloses a rhythm point recognition method and device, an electronic device and a storage medium. The method comprises the following steps of determining at least one alternative rhythm point in an audio signal according to the spectral characteristics of the audio signal to be identified, and obtaining the starting point time corresponding to each alternative rhythm point; mapping each alternative rhythm point into a trend fitting envelope signal of the audio signal according to the corresponding starting point time, and determining a target rhythm point in each alternative rhythm point according to the waveform characteristics of the trend fitting envelope signal; and determining volume information corresponding to each target rhythm point according to the beat information of the audio signal, fitting an envelope signal and the beat information of the audio signal according to the fluctuation of the audio signal, and determining the duration corresponding to each target rhythm point. According to the embodiment of the invention, the rhythm points can be automatically and accurately identified, and the rhythm point identification efficiency is improved.
Owner:BEIJING BYTEDANCE NETWORK TECH CO LTD

Article recommendation method based on Chinese similarity calculation

The invention discloses an article recommendation method based on Chinese similarity calculation. The method comprises the following specific steps: crawling the main content of an article by utilizing a Python crawler program; obtaining word vectors from the main content of the crawled article, and training; converting the article to be recommended into a word vector matrix; and converting the keyword word group of the user into a matrix, reading the word vector matrix converted from the article obtained in the previous step, carrying out standardized processing on the word vector matrix data, carrying out the matrix calculation, and ranking the articles according to a similarity coefficient. The article recommendation method based on Chinese similarity calculation can help an internet user efficiently find articles of interest, and it has a wider application range, a lower manual labeling cost, and good recommendation diversity.
Owner:武汉掌游科技有限公司
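The ranking step in the abstract above could be approximated as in the sketch below. The abstract does not specify the standardization or the matrix calculation, so this substitutes a simple stand-in: each word vector matrix is averaged into a single vector and articles are ordered by cosine similarity to the user's keyword vector. All names and the toy two-dimensional vectors are hypothetical.

```python
import math


def mean_vector(matrix):
    """Average the rows (word vectors) of a matrix into one vector."""
    return [sum(column) / len(matrix) for column in zip(*matrix)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_articles(keyword_matrix, article_matrices):
    """Return article titles ordered from most to least similar to the keywords."""
    query = mean_vector(keyword_matrix)
    scores = {
        title: cosine(query, mean_vector(matrix))
        for title, matrix in article_matrices.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy word vectors; real ones would come from a trained word-vector model.
articles = {
    "football news": [[1.0, 0.0], [0.9, 0.1]],
    "cooking tips": [[0.0, 1.0], [0.1, 0.9]],
}
print(rank_articles([[1.0, 0.1]], articles))
```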

Ground-to-air call data analysis method and system based on voice recognition

Inactive | CN110335609A | Solve the problem of separate playback | Solve the problem of focusing on monitoring the voice of the flight | Speech recognition | Neural architectures | Statistical analysis | Data analysis system
The invention discloses a ground-to-air call data analysis method and a ground-to-air call data analysis system based on voice recognition. The method comprises the steps of continuously collecting and storing ground-to-air call voice data and radar data by means of an air traffic control recorder; converting into a wav format file through adaptive decoding; intercepting a complete statement to form a voice file through an endpoint detection technology based on deep learning; converting the voice file into text information through an air traffic control voice recognition model based on deep learning; determining control instruction intention and parameters through an air traffic control semantic understanding model based on deep learning; and performing ground-air call data analysis such as voice and instruction data statistical analysis, monitoring data and voice synchronous playback, key monitoring and the like based on each generated file. According to the method provided by the invention, the efficiency and accuracy of ground-air call data analysis work are comprehensively improved. The problem that during the air traffic control safety management practice, the control command quality evaluation and the after-event analysis are carried out by completely relying on manual listening, recording, query and statistical analysis of ground-air calls is solved.
Owner:SICHUAN UNIV +1

Phone-call recording access failure reason recognizing method

The invention belongs to the field of voice recognition, and particularly relates to a phone-call recording access failure reason recognizing method. The method comprises the following steps: marking access failure reasons by signals; if reasons cannot be obtained by signal classification, extracting an audio fingerprint characteristic sequence from to-be-recognized phone-call recording, and searching from an audio fingerprint database by the sequence; if matched fingerprints can be found out, marking the access failure reason for to-be-recognized phone-call according to access failure reason labels in fingerprint key values; and if the matched fingerprints cannot be found out, recognizing audio contents into text contents by automatic voice recognition, classifying in an access failure document classifying model by a text classifying method on the basis of the contents, and marking the to-be-recognized phone-call recording by access failure reason classifying results obtained by classification. The method can recognize recording files in an offline manner, streaming phone-call voice can also be recognized, the universality is high, and the phone-call recording access failure reason recognizing method is suitable for different application scenarios of the call center.
Owner:北京灵伴即时智能科技有限公司
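The three-stage fallback described in the abstract above (signal, then audio fingerprint, then text classification) might be organized as in the sketch below. It is heavily simplified: fingerprint search is reduced to an exact dictionary lookup, the speech recognition step is assumed to have already produced a transcript, and the keyword classifier stands in for a trained text classification model. Every name, signal code, and label here is hypothetical.

```python
def keyword_classifier(transcript):
    """Toy stand-in for the text classification model."""
    text = transcript.lower()
    if "switched off" in text:
        return "powered_off"
    if "busy" in text:
        return "busy"
    return "unknown"


def classify_failure(signal_code, fingerprint, transcript,
                     signal_map, fingerprint_db, text_classifier=keyword_classifier):
    """Return (reason, stage) using the first stage that yields a reason."""
    # Stage 1: the signalling information may already name the reason.
    reason = signal_map.get(signal_code)
    if reason is not None:
        return reason, "signal"

    # Stage 2: look the audio fingerprint up in the fingerprint database.
    reason = fingerprint_db.get(fingerprint)
    if reason is not None:
        return reason, "fingerprint"

    # Stage 3: fall back to classifying the recognized text.
    return text_classifier(transcript), "text"


signal_map = {486: "busy"}               # hypothetical signal code table
fingerprint_db = {"fp-001": "no_answer"}  # hypothetical fingerprint labels

print(classify_failure(486, None, "", signal_map, fingerprint_db))
print(classify_failure(None, "fp-001", "", signal_map, fingerprint_db))
print(classify_failure(None, "fp-999", "The phone is switched off.", signal_map, fingerprint_db))
```

Ordering the stages from cheapest to most expensive means speech recognition only runs on recordings the earlier stages cannot label.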

Air conditioner system fault detecting method and device, and electronic equipment

The invention relates to the technical field of air conditioners, in particular to an air conditioner system fault detecting method and device, and electronic equipment. The air conditioner system fault detecting method comprises the following steps: a plurality of sets of audio data are acquired when various types of faults of an air conditioner system occur, and an audio sample set is obtained; characteristic information of all sets of audio in the audio sample set is extracted, and the correlation relationship between the various fault types and the characteristic information of all sets of audio is established; the fault audio data detected in real time are obtained, and the fault audio characteristic information is extracted; and the fault characteristic information and a fault audio database are matched, and the fault type of the air conditioner system is determined according to the correlation relationship. According to the scheme, the fault type, corresponding to the fault audio data, of the air conditioner system can be accurately detected through intelligent decision means.
Owner:PING AN TECH (SHENZHEN) CO LTD

Sharing method and device of network works, server and storage medium

The present invention discloses a sharing method and device of network works, a server and a storage medium. The method comprises the following steps of: selecting original works shared on a network page; employing new works uploaded by a user, and performing derivative work on the original works to obtain shared works; and performing sharing of the shared works on the network page. In the invention, the sharing method and device of network works, the server and the storage medium form shared works through joint derivative work of the original works and the new works to enrich the resources of the shared works, reduce the creation difficulty of the shared works and allow more users to take part in the creation of the shared works.
Owner:BEIJING DAJIA INTERNET INFORMATION TECH CO LTD

Audio detection method, device, apparatus, and computer-readable storage medium

The invention discloses an audio detection method, an apparatus, a device and a computer-readable storage medium. The method comprises the following steps: obtaining a training sample set comprising a preset number of training samples and the sample tag of each training sample, wherein the training samples comprise audio samples corresponding to the sample tags for spliced audio and original audio respectively; training a pre-created convolutional neural network by using the training sample set to obtain an audio recognition model; and obtaining an audio sample with an unknown sample tag as the sample to be tested and inputting it into the audio recognition model, so as to obtain the recognition result, output by the audio recognition model, of whether the sample to be tested is spliced audio or original audio. The application realizes audio detection based on a convolutional neural network; the audio detection method not only is shown by experiments to have high accuracy, but also places no special requirements on the training samples or the sample to be tested, and generalizes well.
Owner:SPEAKIN TECH CO LTD

Lyrics display processing method and device, electronic device and computer storage medium

The invention relates to the technical field of audio processing, and discloses a lyrics display processing method and device, an electronic device and a computer-readable storage medium. The lyrics display processing method comprises the following steps: obtaining song information and original lyrics information of a track to be played; then, based on a preset lyrics format, analyzing the original lyrics information to obtain target lyrics information that conforms to the preset lyrics format, wherein the preset lyrics format includes an identification information part and a lyrics body part; and then, when playback of the song information is detected, synchronously displaying the target lyrics information corresponding to the current playback time of the song information. The method of the embodiment of the present application enables not only synchronously displaying the corresponding lyrics line by line according to the playing progress of the current song, but also accurately displaying the corresponding color of the lyrics word by word according to the playing progress of the song, thereby greatly improving the user experience.
Owner:BEIJING MICROLIVE VISION TECH CO LTD

Voiceprint retrieval method based on deep Hash

The invention discloses a voiceprint retrieval method based on deep Hash by which the effects of low storage space and efficient retrieval in a voiceprint retrieval task are achieved. The method comprises a step of training a deep voiceprint hash model, a step of constructing a hash coding database and a step of retrieving the query voice in the database, and is characterized by firstly constructing an end-to-end deep neural network structure, and training the deep neural network model by utilizing the voice data marked with a speaker identity to obtain a deep voiceprint hash function, and then calculating the Hash codes corresponding to the training set through the deep voiceprint Hash function, and constructing a database; for the newly inputted voice data, using the deep voiceprint hash function to calculate a corresponding hash code, and adding the hash code to a database in real time. During the retrieval process, for the given voice, the deep voiceprint hash function is used for calculating the corresponding hash code, and finally a retrieval result is obtained in the database based on index or Hamming distance sorting.
Owner:NANJING UNIV
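The final retrieval step in the abstract above, sorting database entries by Hamming distance to the query's hash code, can be sketched as follows. The hash codes here are hand-written 8-bit integers used purely for illustration. In the described method they would be produced by the trained deep voiceprint hash function, which this sketch does not attempt to model. Function names and utterance identifiers are hypothetical.

```python
def hamming_distance(code_a, code_b):
    """Number of bit positions in which two integer hash codes differ."""
    return bin(code_a ^ code_b).count("1")


def retrieve(query_code, database, top_k=3):
    """Return the top_k (utterance_id, distance) pairs closest to the query code."""
    scored = [
        (utterance_id, hamming_distance(query_code, code))
        for utterance_id, code in database.items()
    ]
    return sorted(scored, key=lambda pair: pair[1])[:top_k]


# Hypothetical database of utterance ids and 8-bit hash codes.
database = {
    "spk_a_01": 0b10110010,
    "spk_a_02": 0b10110011,
    "spk_b_01": 0b01001100,
}
print(retrieve(0b10110110, database, top_k=2))
```

Because the comparison is an XOR followed by a bit count, compact binary codes keep both storage and lookup cost low, which matches the motivation stated in the abstract.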