22 results about "Punctuation" patented technology

Punctuation (formerly sometimes called pointing) is the use of spacing, conventional signs, and certain typographical devices as aids to the understanding and correct reading of written text, whether read silently or aloud. Another description is, "It is the practice, action, or system of inserting points or other small marks into texts in order to aid interpretation; division of text into sentences, clauses, etc., by means of such marks."

Punctuation prediction method, content display method, device, equipment, medium and product

Pending | CN122334185A | Software engineering | Mechanical engineering

This application relates to a punctuation prediction method, content display method, apparatus, device, medium, and product. The method includes: acquiring a set of user texts corresponding to a target punctuation mark, wherein the user texts in the set contain the target punctuation mark; extracting usage preferences for the target punctuation mark from the user texts in the set to obtain usage preference features corresponding to the target punctuation mark; filtering target user texts from the set that match the usage preference features based on the usage preference features; and generating text based on the target user texts to obtain target generated text corresponding to the target punctuation mark. The target user texts and the target generated text are used to train a target punctuation prediction model, which is used to predict punctuation marks in text. This method can improve the accuracy of punctuation prediction.

Owner: SHUXING TECH (BEIJING) CO LTD

Electronic text analysis for detecting computer-generated interaction in text-based communications

Active | US12664364B2 | Natural language data processing | Human–computer interaction | E-text

A text analysis process for detecting computer-generated text is provided. In some cases, a text-based chat interaction may be initiated and analyzed to determine whether the text-based chat generated by a communicating entity is computer-generated. The text of the chat session may be analyzed to evaluate punctuation, use of emojis, spacing, grammar, words, phrases, and the like to determine a further likelihood of whether the text is computer-generated. A duration of the chat session may be used as a scoring factor. The various probabilities and scores may be combined to provide a composite score.

Owner: BANK OF AMERICA CORP

The Word of God (WOG): the 1,197,000 letter string of encoded Hebrew letters underlying the original Bible

Pending | US20260141175A1 | Natural language translation | Semantic analysis | Algorithm | Pattern detection

A data structure and associated methods for analysis of a continuous 1,197,000-letter unvocalized Hebrew string referred to as the Word of God (WOG). The data structure contains only the twenty-two classical Hebrew letters and their five final forms, with no spacing, punctuation, vowelization, or editorial symbols. Intrinsic placement of the final letters enables deterministic segmentation of the string into 305,490 lexical units and 23,206 verses without external conventions. Fixed letter-number assignments provide a numeric architecture for evaluating substrings, detecting alterations, identifying encoded mathematical correspondences, and performing pattern analysis. The system preserves full semantic range by supporting multiple morphologically valid interpretations of unvocalized Hebrew strings. Methods for segmentation, numeric evaluation, reconstruction, integrity verification, semantic analysis, and mathematical pattern detection are provided thereby providing a reproducible foundation for computational and linguistic research.

Owner: JURAVIN DON KARL

Determining semantic and grammatical correctness of user-expanded sentence using integrated programmatic and specialized guided and constrained artificial intelligence

Pending | US20260140972A1 | Natural language translation | Digital data information retrieval | User input | Semantic system

A system and method guide an artificial intelligence engine to determine the semantic and grammatical correctness of a user-expanded sentence in real time. The sentence validation process receives input from the user: a sentence fragment that the user wishes to expand, and the user-expanded sentence that the user constructs from that fragment. The inputs are broken down into tokens using a word-level tokenization algorithm, which identifies tokens by splitting the text at spaces, punctuation marks, and other delimiters. A token comparison algorithm then assesses the relationship between the sentence fragment and the user-expanded sentence to analyze order and placement. Once the token comparison is complete, a prompt generator produces a prompt for evaluating the user-expanded sentence grammatically and semantically. Real-time feedback is provided to the user based on that evaluation.

Owner: 2HR LEARNING INC
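
The tokenization and token comparison steps in this abstract can be illustrated with a minimal Python sketch. It assumes that word-level tokens are runs of word characters and that "order and placement" means the fragment's tokens must appear in the expanded sentence in their original order. The function names `tokenize` and `fragment_preserved` are hypothetical and are not taken from the patent.

```python
import re


def tokenize(text: str) -> list[str]:
    """Word-level tokens: split at whitespace and punctuation, lowercased."""
    return [t.lower() for t in re.findall(r"\w+", text)]


def fragment_preserved(fragment: str, expanded: str) -> bool:
    """True if every fragment token occurs in the expanded sentence, in order."""
    remaining = iter(tokenize(expanded))
    # `tok in remaining` consumes the iterator up to the first match,
    # so later fragment tokens can only match later positions.
    return all(tok in remaining for tok in tokenize(fragment))


print(fragment_preserved("the dog barked", "The big dog barked loudly, twice."))  # True
print(fragment_preserved("the dog barked", "Barked the dog."))  # False
```

A check like this only covers order and placement; the grammatical and semantic evaluation in the abstract is delegated to the prompted AI engine and is not modeled here.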

A method for regulatory speech segmentation based on speech recognition and end-point detection

Active | CN117238279B | Automatic segmentation | Speech segmentation

The application provides a regulation voice segmentation method based on speech recognition and endpoint detection, which is applied to air traffic control voice audio stream segmentation, and comprises the following steps: step 1, constructing a punctuation model based on speech recognition and a speech endpoint detection model; step 2, using the punctuation model based on speech recognition to recognize the audio data stream of the regulation voice, and outputting the corresponding text and sentence end identifier of the audio data stream; step 3, using the speech endpoint detection model to judge the speech starting point and ending point contained in the audio data stream of the regulation voice; step 4, segmenting the audio data stream of the regulation voice into audio segments; and step 5, applying the audio segments as data materials to the speech recognition process of the air traffic control system. Through the combination of speech recognition and endpoint detection, the application realizes the automatic segmentation of the air traffic control voice audio stream, and improves the accuracy and efficiency of the segmentation.

Owner: THE 28TH RES INST OF CHINA ELECTRONICS TECH GROUP CORP

A method for intelligent punctuation compression and line layout for East Asian text typesetting

Pending | CN122287553A | Algorithm | Theoretical computer science

This invention discloses an intelligent punctuation compression and line layout method for East Asian text typesetting. East Asian full-width punctuation marks are categorized into six types based on their typesetting function: start punctuation, end punctuation, sentence-end punctuation, sentence-in-sentence punctuation, exclamation / interrogative punctuation, and center punctuation. Each type defines an independent compressibility and compression direction. The categorized punctuation marks are modeled as composite typesetting elements carrying both width flexibility parameters and line break cost parameters. The line break cost and remaining compression capacity are correlated within the same element, and the composite typesetting element is incorporated into the cost optimization process of the line layout algorithm. The selection of line break points and punctuation spacing allocation are jointly determined in the same optimization calculation. A power-law cost function and hierarchical spacing allocation priority (punctuation compression takes precedence over word spacing adjustment, which in turn takes precedence over character spacing adjustment) are employed, and rules are implemented through hard constraints. Vertical layout mode, vertical-center-horizontal unit modeling, and cross-language configurable rules are supported. This invention solves the problem of suboptimal global typesetting quality caused by the decoupling of punctuation compression and line break algorithms in existing technologies.

Owner: BEIJING ADVANCED OPEN SOURCE TECHNOLOGY CO LTD

Text sentence breaking method, system and related device based on combination of acoustics and semantics

Pending | CN122116888A | Semantic analysis | Speech recognition | Feature set | Semantics

The application provides a text punctuation method and system based on the combination of acoustics and semantics, and related equipment. The method comprises: obtaining audio data containing voice instructions; processing the audio data based on a preset voice recognition engine and a semantic model, identifying at least one candidate punctuation point in the text corresponding to the audio data, and obtaining a semantic punctuation probability of each candidate punctuation point; for each candidate punctuation point, extracting an acoustic feature set corresponding to the candidate punctuation point in the audio data; calculating a fusion punctuation score for the candidate punctuation point according to the semantic punctuation probability and the acoustic feature set; based on the fusion punctuation score, determining whether to perform a punctuation operation at the candidate punctuation point, and outputting the final text punctuation result of the audio data. The application fuses and calibrates pure text semantic analysis by introducing acoustic information, reduces the ambiguity of punctuation, and improves the accuracy of complex voice instruction punctuation.

Owner: SHENZHEN TONGXINGZHE TECH
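
The fusion step in this abstract, which combines a semantic punctuation probability with an acoustic feature set, can be sketched as a weighted score. Everything specific below is an assumption for illustration: the abstract does not name the acoustic features, the fusion formula, or any thresholds, so `pause_sec`, `pitch_drop`, the weights, and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    position: int         # token index after which a break might be placed
    semantic_prob: float  # semantic punctuation probability, in [0, 1]
    pause_sec: float      # assumed acoustic feature: silence after the token
    pitch_drop: float     # assumed acoustic feature, normalised to [0, 1]


def acoustic_score(c: Candidate, max_pause: float = 0.6) -> float:
    """Map the assumed acoustic features to a score in [0, 1]."""
    pause = min(c.pause_sec / max_pause, 1.0)
    return 0.7 * pause + 0.3 * c.pitch_drop


def fused_score(c: Candidate, alpha: float = 0.5) -> float:
    """Linear fusion of semantic and acoustic evidence (assumed form)."""
    return alpha * c.semantic_prob + (1 - alpha) * acoustic_score(c)


def choose_breaks(cands: list[Candidate], threshold: float = 0.5) -> list[int]:
    """Keep the candidate positions whose fused score reaches the threshold."""
    return [c.position for c in cands if fused_score(c) >= threshold]


cands = [
    Candidate(3, semantic_prob=0.8, pause_sec=0.05, pitch_drop=0.1),
    Candidate(7, semantic_prob=0.6, pause_sec=0.60, pitch_drop=0.8),
]
print(choose_breaks(cands))  # [7]
```

In this example the first candidate looks likely on text alone but has almost no pause, so the acoustic evidence pulls its fused score below the threshold. That is the kind of ambiguity reduction the abstract attributes to adding acoustic information.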

Intelligent display voice-to-text method and system, and medium

Active | CN116320614B | Search words | Document transformation

The application discloses an intelligent display voice-to-text method and system and a medium, and the method comprises the following steps: when an audio and video file is captured, the audio and video file is converted into text; each sentence of text is segmented according to a pre-agreed punctuation mark to obtain segmented sentence text information; a pre-established dictionary table is searched; if historical data corresponding to the segmented sentence text information is found in the dictionary table, a corresponding previously combined text paragraph is obtained from the dictionary table; the text paragraph is rendered; single characters in the rendered text paragraph content are processed according to sensitive words, forbidden words or search words; and the processed single characters are synchronized with the text paragraph on a page for display. Through the application, a user can quickly find out where a violation is located, and can also quickly jump to a corresponding progress for manual auditing to confirm whether a problem exists, thereby greatly reducing the workload of the user for compliance processing.

Owner: SHENZHEN CRAFTSMAN NETWORK TECH CO LTD

A small language-oriented audio and video subtitle optimization generation method and system

Pending | CN122313949A | Southeast Asia | Spoken language

This invention relates to the field of artificial intelligence technology and discloses a method and system for optimizing and generating audio and video subtitles for less commonly spoken languages. The method includes the following steps: Step 1: Audio extraction and preprocessing; Step 2: Language recognition for the less commonly spoken language; Step 3: Language-aware punctuation restoration module; Step 4: Subtitle readability-driven segmentation module; Step 5: Subtitle format encapsulation and output. This application establishes a complete technical process encompassing audio preprocessing, language-aware speech recognition, structured text restoration, subtitle segmentation optimization, and time alignment. This method significantly improves the accuracy of speech-to-text conversion in less commonly spoken language videos and enhances the structural integrity and readability of subtitles. It is particularly suitable for practical application scenarios involving less commonly spoken languages in regions such as Southeast Asia, where data is scarce and languages are diverse. It has broad application value and practical significance in media dissemination, educational videos, and government services.

Owner: XINGZHOU DIGITAL TECH (ZHUHAI) CO LTD

A semantic enhancement-based Amdo Tibetan language generation method and system

Pending | CN122337173A | Medicine | Semantic feature

This application relates to a semantically enhanced Amdo Tibetan language generation method and system. The method extracts semantic features containing lexical boundaries through a pre-trained encoder with full-word masking. After dimensional transformation and phoneme-level repetition expansion, these features are added element-wise to the phoneme features, so that each phoneme directly carries the complete semantics of its corresponding word, fundamentally eliminating semantic fragmentation in prosodic modeling. Simultaneously, the sequence termination vector is independently input into the end-energy predictor and jointly trained with a variational inference adversarial training framework to obtain the generation model and the sentence-end energy decay coefficient. During inference, after synthesizing audio segments according to sentence-end punctuation, the end of the segments is faded out based on the decay coefficient and a silence buffer block is spliced. This ensures the naturalness and accuracy of the speech flow rhythm and logical stress while completely eliminating sentence-end elision and acoustic truncation noise, significantly improving the prosodic naturalness and speech purity of the synthesized Amdo Tibetan speech.

Owner: HAINAN TIBETAN AUTONOMOUS PREFECTURE TIBETAN INFORMATION TECH RES CENT

Speech recognition method and device, electronic equipment and storage medium

Pending | CN122290602A | Time information | Speech sound

This application provides a speech recognition method, apparatus, electronic device, and storage medium. The speech recognition method includes: acquiring an audio stream; recognizing the audio stream to obtain a token text sequence, the token text sequence including multiple token texts and time information corresponding to each token text; traversing the multiple token texts according to the time information; if it is determined that the length of the paragraph before the current token text is greater than a preset length, or the time difference between the current token text and the previous token text is greater than a preset duration, segmenting before the current token text to divide the token text sequence into multiple paragraphs; adding punctuation marks to each of the multiple paragraphs, and outputting the paragraphs with punctuation marks, thereby enabling reasonable segmentation that conforms to semantic logic and improving the accuracy of subsequent semantic understanding.

Owner: SHENZHEN STREAMING VIDEO TECH
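
The segmentation rule in this abstract (start a new paragraph before the current token when the paragraph so far exceeds a preset length, or when the gap to the previous token exceeds a preset duration) can be sketched as follows. The sketch assumes each token carries a start and end time, measures length in characters, and leaves out the later step that adds punctuation to each paragraph. The name `segment` and the default thresholds are hypothetical.

```python
def segment(tokens, max_chars=20, max_gap=1.0):
    """Split (text, start_sec, end_sec) tokens into paragraphs.

    A new paragraph starts before the current token when the paragraph
    built so far is longer than max_chars, or when the silence since the
    previous token is longer than max_gap seconds.
    """
    paragraphs, current = [], []
    prev_end = None
    for text, start, end in tokens:
        length = sum(len(t) for t in current)
        gap = 0.0 if prev_end is None else start - prev_end
        if current and (length > max_chars or gap > max_gap):
            paragraphs.append(current)
            current = []
        current.append(text)
        prev_end = end
    if current:
        paragraphs.append(current)
    return paragraphs


tokens = [
    ("hello", 0.0, 0.4),
    ("there", 0.5, 0.9),
    ("next", 2.5, 2.8),   # 1.6 s after the previous token ends
    ("topic", 2.9, 3.3),
]
print(segment(tokens))  # [['hello', 'there'], ['next', 'topic']]
```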

A subtitle line level alignment method, electronic device and storage medium

Pending | CN122334192A | Semantic vector | Algorithm

This application provides a subtitle line-level alignment method, electronic device, and storage medium, including comparing the number of sentences in two text segments; intelligently segmenting the target language text according to its linguistic habits; using a pre-trained large-scale bilingual semantic embedding model to convert each sentence of the source and target language texts into a high-dimensional semantic vector; calculating the semantic similarity matrix between all sentence pairs and finding an alignment path with optimal semantic similarity globally; introducing a length ratio penalty factor to balance semantic similarity and sentence length reasonableness; if blank lines exist, locating the text line that produces the blank line and its adjacent areas, segmenting the preceding sentence of the target language text according to common punctuation marks of that language to organize sentence fragments and eliminate blank lines. This method adopts a "coarse-to-fine, progressively layered" processing strategy, ultimately outputting an alignment result without blank lines and with reasonable semantic matching.

Owner: FUZHOU CHANGXIN INFORMATION TECH CO LTD
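
The global alignment step in this abstract (a similarity matrix over all sentence pairs, an optimal alignment path, and a length ratio penalty) can be sketched with a small dynamic program. This is an illustration under stated assumptions, not the patented procedure: the similarity matrix is passed in directly instead of being computed by a bilingual embedding model, the penalty is taken to be `lam * |log(len(src) / len(tgt))|`, sentences may be left unaligned, and all sentences are non-empty. The name `align` and the parameter `lam` are hypothetical.

```python
import math


def align(src, tgt, sim, lam=0.5):
    """Return monotonic (i, j) sentence pairs with the highest total score."""
    n, m = len(src), len(tgt)

    def pair_score(i, j):
        # Semantic similarity minus a penalty for mismatched sentence lengths.
        return sim[i][j] - lam * abs(math.log(len(src[i]) / len(tgt[j])))

    # best[i][j]: best total score using the first i source and j target sentences.
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(
                best[i - 1][j],                                 # skip a source sentence
                best[i][j - 1],                                 # skip a target sentence
                best[i - 1][j - 1] + pair_score(i - 1, j - 1),  # align the pair
            )

    # Trace back from the end to recover the aligned pairs.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if best[i][j] == best[i - 1][j - 1] + pair_score(i - 1, j - 1):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif best[i][j] == best[i - 1][j]:
            i -= 1
        else:
            j -= 1
    return pairs[::-1]


src = ["Good morning.", "How are you today?", "Fine."]
tgt = ["Bonjour.", "Comment allez-vous aujourd'hui ?", "Bien."]
sim = [[0.9, 0.2, 0.3], [0.1, 0.8, 0.2], [0.3, 0.1, 0.9]]  # made-up similarities
print(align(src, tgt, sim))  # [(0, 0), (1, 1), (2, 2)]
```

Target sentences that end up unaligned are the natural place to apply the abstract's follow-up step of re-splitting a neighbouring sentence at punctuation to eliminate blank lines.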

Training methods for language models using data generation and reinforcement learning

Active | US12646507B2 | Speech recognition | Linguistic model | Automatic speech

The disclosed method generates helpful training data for a language model, for example, a model implementing a punctuation restoration task, for real-world ASR texts. The method uses a reinforcement learning method using a generative AI model to generate additional data to train the language model. The method allows the generative AI model to learn from real-world ASR text to generate more effective training examples based on gradient feedback from the language model.

Owner: ADOBE INC

A teaching interaction quality evaluation method and system based on a large language model

Active | CN121810129B | Data processing applications | Semantic analysis | Automatic speech | Speech sound

The present application relates to the field of teaching interaction quality evaluation, and in particular to a teaching interaction quality evaluation method and system based on a large language model. The method comprises the following steps: audio transcription, converting classroom audio into original transcription text through speech activity detection, speaker classification, automatic speech recognition, and punctuation recovery; transcription refinement, using a large language model combined with a pre-school education knowledge base to perform context-based text correction on the original transcription text to generate refined transcription text; and quality evaluation, based on a pre-school education quality evaluation scale, using few-shot example guidance and chain-of-thought reasoning for each scoring point to determine whether the refined transcription text contains a speech segment that meets the scoring point, performing binary scoring, and generating an interaction quality evaluation report containing the compliance rate for each evaluation dimension, teaching highlight analysis, and phased teaching optimization suggestions. The present application significantly improves evaluation efficiency and is suitable for teaching interaction quality evaluation.

Owner: THE CHINESE UNIV OF HONG KONG (SHENZHEN)

A punctuation prediction method, device, apparatus and storage medium

Active | CN116052672B | Speech recognition | Energy efficient computing | Medicine | Speech sound

The application provides a punctuation prediction method, device and equipment and a storage medium, wherein the method comprises: obtaining a recognition result as a target text every time a speech recognition system recognizes a speech segment in target speech and outputs a recognition result; obtaining local context information of each word in the target text, and predicting punctuation in the target text based on the local context information of each word in the target text to obtain a prediction result as a preliminary punctuation prediction result of the target text; and determining a final punctuation prediction result of the target text based on a recognition result attribute of the target text and the preliminary punctuation prediction result of the target text. The punctuation prediction method provided by the application can be used to predict punctuation of a target text, and can obtain a relatively accurate and stable punctuation prediction result. The punctuation prediction method provided by the application has high prediction efficiency and good user experience.

Owner: IFLYTEK CO LTD

Deep learning based ancient Chinese automatic punctuation and semantic analysis system

Active | CN120764552B | Improve adaptability | Reduce sentence segmentation error rate | Semantic analysis | Text processing | Graph neural networks | Knowledge graph

The application discloses an ancient Chinese automatic sentence breaking and semantic analysis system based on deep learning, and relates to the technical field of ancient Chinese analysis. In its sentence breaking module, the system combines a gated recurrent unit (GRU) with a convolutional neural network (CNN) and adds image character features, so that it can effectively handle semantic ambiguity, long-distance dependence, homophones, and variant characters. The semantic analysis module constructs a knowledge graph containing historical and cultural common knowledge and dynamic relationships between words, injects this prior knowledge through a graph neural network, and thereby improves the depth of semantic understanding. The output module provides a sentence breaking result that includes the basis for each break, together with a visual semantic analysis result, so that users can clearly understand the processing logic, which improves trustworthiness and convenience. The system can efficiently process large numbers of unlabeled ancient Chinese texts, fully excavate the value of cultural heritage such as classics and inscriptions, promote research on grammar and vocabulary, and help the inheritance and spread of ancient Chinese culture.

Owner: HEBEI AGRICULTURAL UNIV.

Apparatus and method for detecting morphological incompleteness of self-introduction answer

Active | KR102993221B1 | User input | User interface

An apparatus for detecting formal incompleteness of a self-introduction response according to the present disclosure comprises: an input / output module that receives user input or provides output to the user based on a user interface; a memory in which at least one process for performing an operation to detect an error category of a sentence within the self-introduction is stored; and at least one processor that performs an operation to detect an error category of a sentence within the self-introduction according to the process. The at least one processor is configured to acquire a response sentence from the input self-introduction, detect an error category corresponding to each of the response sentences among predefined error categories, list at least one error category detected in each response sentence, and provide each response sentence and the listed error category to the input / output module. The error category may include at least one of a non-answer that is not in the form of an answer, a non-sentence that is not in the form of a sentence, a non-word that is not a word, a minor error containing punctuation that does not conform to rules, and a bullet that is a sentence written in a bulleted descriptive form.

Owner: MUHAYU CO LTD

Punctuation generation method, apparatus, and storage medium

Active | CN115775555B | Engineering | Speech sound

The application relates to the technical field of artificial intelligence, in particular to a punctuation generation method and device and a storage medium, the method comprising the following steps: obtaining pronunciation information corresponding to an audio signal; inputting at least one first character into a first model to obtain first punctuation symbol indication information, the first character being a character obtained by splitting the pronunciation information corresponding to the audio signal, and the first punctuation symbol indication information indicating a punctuation symbol corresponding to each first character; determining a second text according to the first character, the first punctuation symbol indication information and a first text, wherein the first text comprises a character corresponding to the audio signal, and the second text comprises the character corresponding to the audio signal and punctuation symbols. According to the embodiment of the application, the model occupied space and energy consumption can be reduced, and a more efficient method for generating punctuation in a speech recognition result is realized.

Owner: HUAWEI TECH CO LTD

Punctuation judgment method and device, electronic equipment and storage medium

Active | CN115376133B | Neural learning methods | Medicine | Verbal expression

The present disclosure provides a punctuation judgment method and device, electronic equipment and storage medium, comprising: recognizing the character information, label information, point information and sentence class information of a target sentence, and obtaining a target feature sequence of the target sentence; recombining each character feature in the target feature sequence to obtain label recombination features and point recombination features of the target feature sequence; and performing judgment on the label information and point information of the target sentence according to the label recombination features and the point recombination features, and obtaining a punctuation judgment result of the target sentence. In this way, the present disclosure can guide users to correctly use punctuation marks and improve language expression ability.

Owner: BEIJING CENTURY TAL EDUCATION TECH CO LTD

A text broadcast method, an electronic device and a storage medium

Pending | CN122293784A | Improve experience | Chinese characters | Broadcasting

This application relates to the field of data processing technology, providing a text broadcasting method, electronic device, and storage medium. In this method, the text to be broadcast can be standardized. For example, for Chinese text, numbers represented in forms other than Chinese and symbols to be broadcast can be converted into their corresponding Chinese characters, and unnecessary punctuation can be removed to obtain the language text. Then, the broadcast duration of the text to be broadcast is predicted based on the language text. This improves the accuracy of predicting the broadcast duration of the text to be broadcast.

Owner: HONOR DEVICE CO LTD
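
The standardization step in this abstract (converting non-Chinese numbers and symbols into Chinese characters and removing unnecessary punctuation) can be sketched as below. The sketch makes simplifying assumptions that are not in the abstract: digits are read one by one instead of as full numerals, only two symbols are mapped, and the duration estimate is a fixed time per character, where the application presumably uses a trained predictor. The names `normalize`, `estimate_seconds`, and `sec_per_char` are hypothetical.

```python
DIGITS = dict(zip("0123456789", "零一二三四五六七八九"))
SYMBOLS = {"+": "加", "=": "等于"}  # small illustrative subset
DROP = set("，。！？、；：“”‘’（）,.!?;:\"'()")


def normalize(text: str) -> str:
    """Spell out digits and symbols in Chinese; drop punctuation and spaces."""
    out = []
    for ch in text:
        if ch in DIGITS:
            out.append(DIGITS[ch])
        elif ch in SYMBOLS:
            out.append(SYMBOLS[ch])
        elif ch in DROP or ch.isspace():
            continue
        else:
            out.append(ch)
    return "".join(out)


def estimate_seconds(text: str, sec_per_char: float = 0.25) -> float:
    """Crude duration estimate from the number of characters actually spoken."""
    return len(normalize(text)) * sec_per_char


print(normalize("房间号是305，请上3楼。"))  # 房间号是三零五请上三楼
print(estimate_seconds("房间号是305，请上3楼。"))  # 2.75
```

Normalizing first matters because the raw string and the spoken string differ in length: punctuation is not read aloud, while digits and symbols expand into one or more spoken characters.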

A real-time punctuation restoration method based on efficient corpus screening

Active | CN120998203B | Natural language processing | Data set

The present application relates to a real-time punctuation restoration method based on efficient corpus screening. The method first downloads multiple open-source Chinese error-correction corpus data sets, cleans the data, and removes punctuation to build simulated speech recognition results. It then uses the corpora to form multiple data sets by different methods and applies data weighting. Finally, it compares the prediction accuracy obtained with the different data sets and keeps changing the data set generation method to fine-tune the model according to the model's restoration performance. The application makes effective use of Chinese error-correction corpora and the open-source CT-transformer model, and achieves good experimental results in speech recognition post-processing tasks. Through data augmentation that splices multiple sentences into one line, mixes different corpora in different proportions, and weights different punctuation marks, it addresses real-time punctuation restoration for speech recognition text and effectively improves the restoration result.

Owner: KUNMING UNIV OF SCI & TECH
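
The data construction step in this abstract (removing punctuation from clean text to simulate speech recognition output, and splicing several sentences into one line) can be sketched as follows. The per-character label scheme is an assumption borrowed from common sequence-labeling setups for punctuation restoration; the abstract does not specify it. The names `PUNCT`, `to_training_pair`, and `splice` are hypothetical.

```python
PUNCT = set("，。？！、；：")


def to_training_pair(sentence: str) -> tuple[str, list[str]]:
    """Strip punctuation and label each kept character.

    Each label is the punctuation mark that followed the character in the
    original text, or "O" if none did.
    """
    chars, labels = [], []
    for ch in sentence:
        if ch in PUNCT:
            if labels:
                labels[-1] = ch  # attach the mark to the preceding character
        else:
            chars.append(ch)
            labels.append("O")
    return "".join(chars), labels


def splice(sentences: list[str]) -> tuple[str, list[str]]:
    """Join several sentences into one line before stripping punctuation."""
    return to_training_pair("".join(sentences))


text, labels = to_training_pair("今天天气好，我们出去吧。")
print(text)    # 今天天气好我们出去吧
print(labels)  # ['O', 'O', 'O', 'O', '，', 'O', 'O', 'O', 'O', '。']
```

The corpus mixing and punctuation weighting mentioned in the abstract would operate on pairs like these and are not shown.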

Semantic enhancement-based large model question answering method and device, storage medium and equipment

Pending | CN122152975A | Semantic analysis | Other databases querying | Semantic search | Questions and answers

The application discloses a semantic enhancement-based large model question and answer method and device, a storage medium and equipment, and belongs to the technical field of deep learning. A question and a knowledge base are obtained, which includes different types of feature data corresponding to semantic fragments. The semantic fragments are obtained by fragmenting a document according to punctuation symbol grouping and priority. The punctuation symbol grouping includes at least one punctuation symbol. The number of word units of each semantic fragment is in the interval of n-m to n, and m is less than or equal to a predetermined threshold. For each knowledge base, the question and the corresponding type of feature data in the knowledge base are subjected to semantic retrieval and keyword retrieval to obtain a plurality of semantic fragments. The question and the corresponding type of feature data in the knowledge base are subjected to gallery retrieval to obtain a gallery retrieval result. The plurality of semantic fragments are reordered and context expanded to obtain a complete context. A large model is used to generate an answer based on the complete context and the gallery retrieval result. The application can improve retrieval efficiency and generate comprehensive and accurate answers.

Owner: BEIJING UCAP INTERNET TECH
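
The fragmentation rule in this abstract (split a document at punctuation chosen by group priority, keeping each semantic fragment between n-m and n word units) can be sketched as below. The sketch assumes whitespace-separated words as the word units, three punctuation groups in the priority order shown, and a hard cut at n units when no punctuation falls in the allowed range. The final fragment may be shorter than n-m. The names `GROUPS` and `chunk` are hypothetical.

```python
GROUPS = [set(".!?"), set(";:"), set(",")]  # assumed priority, highest first


def chunk(words, n=8, m=3):
    """Split words into fragments of n-m..n units, cutting after punctuation."""
    if not 0 <= m < n:
        raise ValueError("need 0 <= m < n")
    chunks, start = [], 0
    while len(words) - start > n:
        # Candidate positions for the fragment's last word, giving lengths n down to n - m.
        window = range(start + n - 1, start + n - m - 2, -1)
        cut = start + n  # fallback: hard cut at n word units
        for group in GROUPS:
            hit = next((i for i in window if words[i][-1] in group), None)
            if hit is not None:
                cut = hit + 1
                break
        chunks.append(words[start:cut])
        start = cut
    if start < len(words):
        chunks.append(words[start:])
    return chunks


text = ("Punctuation aids reading. It marks clauses, sentences, and pauses; "
        "writers rely on it daily. Readers do too.")
for fragment in chunk(text.split()):
    print(" ".join(fragment))
```

With n=8 and m=3 the second fragment ends at "daily." because a sentence-ending mark in the allowed range outranks any comma or semicolon there.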