Text sentence segmentation position recognition method and system, electronic equipment and storage medium

A technology for recognizing sentence segmentation positions, applied in the information field, that addresses problems such as the low accuracy of downstream tasks.

Pending Publication Date: 2020-10-09
CTRIP COMP TECH SHANGHAI

AI Technical Summary

Problems solved by technology

[0004] The technical problem to be solved by the present invention is that, in the prior art, text data obtained by speech recognition is not segmented into sentences, which lowers the accuracy of downstream tasks such as intent recognition, named entity recognition, and classification. To overcome this defect, the invention provides a method and system for recognizing sentence segmentation positions in text, an electronic device, and a storage medium.

Examples

Embodiment 1

[0046] This embodiment provides a method for recognizing sentence segmentation positions in text. Referring to Figure 1, the method comprises the following steps:

[0047] Step S101, receiving text data after speech recognition, and splitting words in the text data into characters.

[0048] In a specific implementation, the customer service robot converts human speech into text through ASR (Automatic Speech Recognition), which yields the text data received in step S101. The text data obtained from ASR consists of characters or words without any punctuation marks, such as ["Hello", "Excuse me", "Order", "Number", "Yes", "How much"]. The lack of punctuation directly lowers the accuracy of subsequent tasks such as user intent matching, user scene recognition, and user emotion classification. However, segmenting the recognized text data ...
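
As an illustration of the splitting in step S101, the following is a minimal Python sketch, not the patent's implementation. The function name split_into_characters and the returned word_index list are hypothetical helpers introduced here; the input reuses two tokens from the example above.

```python
from typing import List, Tuple


def split_into_characters(words: List[str]) -> Tuple[List[str], List[int]]:
    """Flatten ASR word tokens into a character sequence.

    Also returns, for each character, the index of the word it came from.
    The word index is an assumption added for illustration; the source
    only states that words are split into characters.
    """
    chars: List[str] = []
    word_index: List[int] = []
    for i, word in enumerate(words):
        for ch in word:
            chars.append(ch)
            word_index.append(i)
    return chars, word_index


# Two tokens taken from the example in paragraph [0048].
asr_words = ["Hello", "Order"]
chars, word_index = split_into_characters(asr_words)
print(chars)
print(word_index)
```

Keeping the word index alongside the characters is one possible way to relate character-level and word-level features later on.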

Embodiment 2

[0104] This embodiment provides a system for recognizing sentence segmentation positions in text. Referring to Figure 3, the recognition system 20 includes a receiving module 21, a local feature extraction module 22, a semantic feature extraction module 23, a splicing module 24, a prediction module 25 and a recognition module 26.

[0105] The receiving module 21 is used for receiving text data after speech recognition, and splitting words in the text data into characters.

[0106] The local feature extraction module 22 is used to map each character into a character vector, and use the CNN model to extract the local features of the character vector to obtain the first hidden vector.

[0107] The semantic feature extraction module 23 is used to map the words in the text data into word vectors, and use the Bi-LSTM model to extract the semantic features of each word vector to obtain the second hidden vector.

[0108] The splicing module 24 is used for spl...
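
The splicing of the two hidden vectors can be pictured as a per-position concatenation. The sketch below is a simplified illustration under stated assumptions: the excerpt does not say how word-level vectors are aligned with characters, so repeating each word's vector for every character of that word is an assumption made here, and the names splice, char_hidden, word_hidden and word_index are hypothetical. Plain Python lists stand in for the outputs of the CNN and Bi-LSTM models.

```python
from typing import List

Vector = List[float]


def splice(char_hidden: List[Vector],
           word_hidden: List[Vector],
           word_index: List[int]) -> List[Vector]:
    """Concatenate each character's first hidden vector with the second
    hidden vector of the word that character belongs to.

    ASSUMPTION: the character-to-word alignment via word_index is not
    described in the source; it is introduced for illustration only.
    """
    if len(char_hidden) != len(word_index):
        raise ValueError("one word index is required per character")
    return [c + word_hidden[w] for c, w in zip(char_hidden, word_index)]


# Toy values standing in for model outputs (made up for this example).
char_hidden = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # three characters
word_hidden = [[1.0], [2.0]]                         # two words
word_index = [0, 0, 1]                               # characters 0-1 in word 0

spliced = splice(char_hidden, word_hidden, word_index)
print(spliced)
```

Each spliced vector has a length equal to the sum of the two input dimensions, and the resulting sequence is what would be passed on to the prediction stage.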

Embodiment 3

[0114] Figure 4 is a schematic structural diagram of an electronic device provided in this embodiment. The electronic device includes a memory, a processor, and a computer program stored in the memory and operable on the processor. When the processor executes the program, the method for recognizing sentence segmentation positions in text of Embodiment 1 is implemented. The electronic device 3 shown in Figure 4 is only an example, and should not impose any limitation on the functions and application scope of the embodiments of the present invention.

[0115] The electronic device 3 may take the form of a general computing device; for example, it may be a server device. Components of the electronic device 3 may include, but are not limited to: the at least one processor 4 mentioned above, the at least one memory 5 mentioned above, and the bus 6 connecting different system components (including the memory 5 and the processor 4).

[0116] The bus 6 includes a data bus, an address bus and a control bus. ...

Abstract

The invention discloses a method and system for recognizing sentence segmentation positions in text, an electronic device, and a storage medium. The method comprises the steps of: receiving text data after speech recognition, and splitting words in the text data into characters; extracting a local feature of each character vector by using a CNN model to obtain a first hidden vector; using a Bi-LSTM model to extract semantic features of each word vector to obtain a second hidden vector; splicing the first hidden vector and the second hidden vector and inputting the result into a CRF model; decoding the output of the CRF model to obtain a label for each character vector; and identifying all sentence segmentation positions according to the label corresponding to each character. In this method, local features and semantic features are extracted by the CNN model and the Bi-LSTM model respectively, and the CRF model is used as the output layer, so that sentence segmentation positions in text are recognized and the accuracy of downstream tasks such as intent recognition, named entity recognition, and classification is further improved.
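
The last two steps of the abstract (decoding the CRF output into labels, then reading segmentation positions off the labels) can be sketched as follows. This is a generic Viterbi decoder over hand-written scores, not the patent's trained model. The label set ("B" for "a sentence ends after this character", "O" for "no break"), the function names viterbi_decode and break_positions, and all score values are assumptions made for illustration; the source does not specify its tag scheme.

```python
from typing import Dict, List


def viterbi_decode(emissions: List[Dict[str, float]],
                   transitions: Dict[str, Dict[str, float]],
                   labels: List[str]) -> List[str]:
    """Return the highest-scoring label sequence for a linear-chain model.

    emissions[t][y] is the score of label y at position t;
    transitions[p][y] is the score of moving from label p to label y.
    """
    if not emissions:
        return []
    score = {y: emissions[0][y] for y in labels}
    backpointers: List[Dict[str, str]] = []
    for emit in emissions[1:]:
        new_score: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for y in labels:
            best_prev = max(labels, key=lambda p: score[p] + transitions[p][y])
            new_score[y] = score[best_prev] + transitions[best_prev][y] + emit[y]
            pointers[y] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Trace the best path backwards from the best final label.
    path = [max(labels, key=lambda y: score[y])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def break_positions(tags: List[str], break_label: str = "B") -> List[int]:
    """Indices of characters after which a sentence break is predicted."""
    return [i for i, tag in enumerate(tags) if tag == break_label]


# Hypothetical label set and made-up scores for a 4-character input.
LABELS = ["O", "B"]
emissions = [
    {"O": 2.0, "B": 0.0},
    {"O": 0.0, "B": 1.5},
    {"O": 1.0, "B": 1.2},
    {"O": 0.5, "B": 2.0},
]
# Penalize two consecutive breaks; all other transitions are neutral.
transitions = {"O": {"O": 0.0, "B": 0.0}, "B": {"O": 0.0, "B": -5.0}}

tags = viterbi_decode(emissions, transitions, LABELS)
positions = break_positions(tags)
print(tags, positions)
```

The transition penalty shows why a CRF output layer can help here: position 2 slightly prefers "B" on its own, but the decoder chooses "O" there because a break was already placed at position 1.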

Description

Technical field

[0001] The present invention relates to the field of information technology, in particular to a method and system for recognizing sentence segmentation positions in text, an electronic device, and a storage medium.

Background

[0002] With the development of artificial intelligence technology, many repetitive tasks will be completed by machines, and customer service robots are one example. For customer service robots to serve customers well, the accuracy of downstream tasks, such as recognizing the intent of what customers say and named entity recognition, is crucial. Sentence segmentation serves as a bridge in this process. When a customer's utterance is too long for correct intent recognition or classification, the long sentence is truncated, that is, turned into short sentences, so as to improve the accuracy of subsequent intent recognition, named entity recognition, and classification tasks. [0...

Application Information

IPC(8): G06F40/211, G06F40/284, G06F40/30, G06F40/126, G06N3/04
CPC: G06F40/211, G06F40/284, G06F40/30, G06F40/126, G06N3/049, G06N3/045
Inventor 杨赫罗超吉聪睿胡泓
Owner CTRIP COMP TECH SHANGHAI