Terminal and method for determining sentence marking sequence on basis of CRF (conditional random fields)

A conditional random field (CRF) sequence-tagging technique, applied in the field of computing, that addresses problems of existing approaches: more complex training-corpus preparation, longer training time, lower work efficiency, and a heavier model-training workload.

Active Publication Date: 2017-05-31
北京易观数智科技股份有限公司

AI Technical Summary

Problems solved by technology

[0004] Labeling the training corpus sequence with existing techniques increases the model-training workload, in particular the complexity of preparing the training corpus and the training time, and reduces work efficiency.

Examples

Embodiment 1

[0095] The method for determining a sentence tagging sequence based on a conditional random field provided by the present invention can be implemented on a terminal. The terminal may be a mobile terminal, such as a mobile phone, smartphone, notebook computer, digital broadcast receiver, personal digital assistant (PDA), tablet computer (PAD), portable multimedia player (PMP), or navigation device, or a stationary terminal, such as a digital TV or desktop computer.

[0096] If the terminal has an operating system, the operating system may be UNIX, Linux, Windows, Mac OS X, Android, Windows Phone, or the like.

[0097] Application software (Application, APP) is a third-party application program for smart terminals. Users can use various application software for office work, entertainment, and information acquisition. The formats include ipa, pxl, deb, apk, ...

Embodiment 2

[0110] Figure 4 is a flow chart of the second embodiment of the method for determining a sentence tagging sequence based on a conditional random field according to the present invention. As shown in Figure 4, the method provided by this embodiment is applied on a mobile phone and may include the following steps:

[0111] Step 401: Process the sentence to be tagged according to the probability model of the conditional random field, and obtain, for every individual character in the sentence, the probability values of all of its relationships within that sentence.

[0112] The mobile phone processes the sentence to be tagged according to the probability model of the conditional random field, and obtains, for every individual character in the sentence, the probability values of all of its relationships within that sentence....
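The step from per-character probability values to a final tagging sequence can be sketched as follows. This is a minimal illustration, not the patent's method: it assumes the "relationships" are the four S/B/M/E position tags, and it assumes the preset rule is simply that the tag sequence must be well formed (for example, B may only be followed by M or E). The names `ALLOWED_NEXT`, `decode`, and the sample probabilities are hypothetical. The search is a Viterbi-style dynamic program over the per-character probabilities.

```python
import math

TAGS = ("B", "M", "E", "S")
# Assumed rule set (not taken from the patent): which tag may follow which.
ALLOWED_NEXT = {"B": ("M", "E"), "M": ("M", "E"), "E": ("B", "S"), "S": ("B", "S")}
START_TAGS = ("B", "S")  # a sentence can only start a word
END_TAGS = ("S", "E")    # a sentence can only end a word


def _logp(p):
    """Log-probability that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def decode(char_probs):
    """Return the highest-scoring well-formed tag sequence.

    char_probs holds one dict per character, mapping a tag to its probability.
    """
    if not char_probs:
        return []
    # best[t] = (log score, path) of the best valid path ending in tag t
    best = {
        t: (_logp(char_probs[0].get(t, 0.0)) if t in START_TAGS else float("-inf"), [t])
        for t in TAGS
    }
    for probs in char_probs[1:]:
        new_best = {}
        for t in TAGS:
            # Only predecessors that are allowed to be followed by t are considered.
            candidates = [best[p] for p in TAGS if t in ALLOWED_NEXT[p]]
            score, path = max(candidates, key=lambda c: c[0])
            new_best[t] = (score + _logp(probs.get(t, 0.0)), path + [t])
        best = new_best
    _, path = max((best[t] for t in END_TAGS), key=lambda c: c[0])
    return path


# Hypothetical probabilities for a three-character sentence.
example = [
    {"B": 0.6, "M": 0.1, "E": 0.1, "S": 0.2},
    {"B": 0.5, "M": 0.2, "E": 0.2, "S": 0.1},
    {"B": 0.1, "M": 0.1, "E": 0.7, "S": 0.1},
]
print(decode(example))
```

Picking the most probable tag for each character independently would give B, B, E here, which is not a well-formed sequence; applying the assumed rule during the search yields B, M, E instead.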

Embodiment 3

[0137] Figure 8 is a schematic structural diagram of an embodiment of a terminal for determining a sentence tagging sequence based on a conditional random field according to the present invention. As shown in Figure 8, the terminal 08 provided by this embodiment includes a processing module 81 and a determining module 82, wherein:

[0138] The processing module 81 is used to process the sentence to be tagged according to the probability model of the conditional random field, and to obtain, for every individual character in the sentence, the probability values of all of its relationships within that sentence; wherein the probability model of the conditional random field is obtained by performing conditional random field probability model training on the second training corpus sequence, and the second training corpus sequence is by performing the i...

Abstract

The invention discloses a terminal for determining a sentence tagging sequence on the basis of conditional random fields (CRF). The terminal comprises a processing module and a determining module. The processing module processes a sentence to be tagged according to a CRF statistical model and obtains, for every character of the sentence, the probability values of all of its relationships within the sentence. The CRF statistical model is obtained by CRF statistical model training on a second training corpus sequence; the second training corpus sequence is obtained by performing single-character spacing on a first training corpus sequence, that is, separating the characters of the first training corpus sequence one by one with space marks, and then performing sequence labeling. The determining module determines the tagging sequence of the sentence to be tagged according to the probability values of all relationships and preset rules. The invention further discloses a method for determining the sentence tagging sequence on the basis of CRF. With the terminal and the method, the model-training workload and complexity can be reduced, and the training time can be shortened.
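The single-character spacing step can be illustrated with a short sketch. It assumes the step amounts to placing one space mark between every pair of adjacent characters and discarding any whitespace already present; the function name `space_characters` and the sample sentence are hypothetical.

```python
def space_characters(text):
    """Separate the characters of text one by one with single spaces.

    Whitespace already present in the input is dropped so that every
    remaining character is followed by exactly one separator.
    """
    return " ".join(ch for ch in text if not ch.isspace())


spaced = space_characters("自然语言处理")
print(spaced)          # characters separated by single spaces
print(spaced.split())  # the individual characters, ready for sequence labeling
```

After this step each character is its own token, so sequence labeling can assign one tag per character without any prior word segmentation.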

Description

Technical field

[0001] The invention relates to the field of intelligent data analysis on terminals, in particular to a terminal and a method for determining a sentence tagging sequence based on a conditional random field.

Background

[0002] The conditional random field (CRF), a discriminative probability model, is a machine learning model commonly used for part-of-speech tagging, word segmentation, and named entity recognition on natural language text.

[0003] When conditional random fields are applied in natural language processing (NLP), a field of artificial intelligence, the training corpus sequence must be word-segmented and sequence-labeled before the model is trained. The process is as follows: the tags S/B/E/M indicate the position of each character in a sentence, where S represents a single-character word, B the first character of a word, E the last character of a word, and M a middle character of a word; for example, for a...
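The S/B/E/M scheme described in the background can be sketched in a few lines. This is an illustration only: the function name `bmes_tags` and the sample sentence are hypothetical, and the input is assumed to be a sentence that has already been segmented into words.

```python
def bmes_tags(words):
    """Return one S/B/M/E tag per character for an already segmented sentence."""
    tags = []
    for word in words:
        if not word:
            continue  # ignore empty strings left over from splitting
        if len(word) == 1:
            tags.append("S")  # single-character word
        else:
            tags.append("B")                     # first character of the word
            tags.extend("M" * (len(word) - 2))   # middle characters, if any
            tags.append("E")                     # last character of the word
    return tags


words = ["我", "喜欢", "自然语言"]
for char, tag in zip("".join(words), bmes_tags(words)):
    print(char, tag)
```

Each character receives exactly one tag, so the tag list always has the same length as the unsegmented sentence.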

Application Information

IPC(8): G06K9/62
CPC: G06F18/2415, G06F18/214
Inventor: 李博, 梁怀宗, 张淑燕
Owner: 北京易观数智科技股份有限公司