Chinese word vector modeling method

A word vector modeling method in the field of vector modeling. It addresses the poor representation of unregistered (out-of-vocabulary) words, low intelligence, and inconvenient use, with the stated effects of high intelligence, improved accuracy, and convenient use.

Inactive Publication Date: 2020-08-14
WEIFANG UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

[0005] The Chinese word vector modeling method proposed by the present invention addresses the following shortcomings of existing methods: they only introduce information such as radicals and strokes in a simple way, cannot reasonably represent unregistered (out-of-vocabulary) words, and cannot automatically update sentences. As a result, they do not keep pace with current needs, have a low degree of intelligence, and are inconvenient to use.



Examples


Embodiment 1

[0034] Referring to Figure 1, a Chinese word vector modeling method comprises the following steps:

[0035] S1: Obtain the corpus base of form and sound features of Chinese words, classify it into types, and assign each type an A-type number;

[0036] S2: Assign B-type numbers sequentially to the corpus base entries within each type;

[0037] S3: Map the numbers into the vector space;

[0038] S4: Build a basic model based on the corpus;

[0039] S5: Input Chinese words, detect the sentence length, and detect the corpus base elements in the Chinese words;

[0040] S6: Convert the detected corpus base elements into numbers, and check the conversion result;

[0041] S7: Map the numbers to real-valued vectors in the vector space;

[0042] S8: Perform word segmentation on the corpus base of the Chinese words, and check the result;

[0043] S9: Input the word segmentation results into the basic model;

[0044] S10: Complete the Chinese word vector modeling.
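Steps S1 to S3 can be illustrated with a minimal sketch. The source does not specify how the corpus base is organised or how vectors are initialised, so everything below is an assumption made for illustration: the `CORPUS_BASE` entries, the function names, the zero-based numbering, and the random initialisation are all hypothetical.

```python
import random

# Hypothetical corpus base grouped by feature type (form = radicals,
# sound = pinyin syllables). These entries are illustrative only.
CORPUS_BASE = {
    "form": ["氵", "木", "口"],
    "sound": ["shui", "mu", "kou"],
}


def number_corpus_base(corpus_base):
    """S1-S2 (assumed reading): give each type an A-type number and each
    entry within a type a sequential B-type number."""
    numbering = {}
    for a_num, (type_name, entries) in enumerate(corpus_base.items()):
        for b_num, entry in enumerate(entries):
            numbering[(type_name, entry)] = (a_num, b_num)
    return numbering


def build_vector_space(numbering, dim=4, seed=0):
    """S3 (assumed reading): map each (A, B) number to a real-valued vector.
    Random initialisation is a placeholder, not the patent's method."""
    rng = random.Random(seed)
    return {
        number: [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        for number in numbering.values()
    }


numbering = number_corpus_base(CORPUS_BASE)
vectors = build_vector_space(numbering)
```

Keying each entry by an (A, B) pair keeps the type and the position within the type separate, so a new entry can be appended to one type without renumbering the others.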

[0045] In the present embodiment, in sai...

Embodiment 2

[0048] Referring to Figure 1, a Chinese word vector modeling method comprises the following steps:

[0049] S1: Obtain the corpus base of form and sound features of Chinese words, classify it into types, and assign each type an A-type number;

[0050] S2: Assign B-type numbers sequentially to the corpus base entries within each type;

[0051] S3: Map the numbers into the vector space;

[0052] S4: Build a basic model based on the corpus;

[0053] S5: Input Chinese words, detect the sentence length, and detect the corpus base elements in the Chinese words;

[0054] S6: Convert the detected corpus base elements into numbers, and check the conversion result;

[0055] S7: Map the numbers to real-valued vectors in the vector space;

[0056] S8: Perform word segmentation on the corpus base of the Chinese words, and check the result;

[0057] S9: Input the word segmentation results into the basic model;

[0058] S10: Complete the Chinese word vector modeling.
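The input-side steps (S5 to S8) might look roughly like the sketch below. The source names no segmentation algorithm, so forward maximum matching is swapped in here as a common baseline. The lexicon, the numbers, the vectors, and the function names are hypothetical, and the sketch segments before looking up numbers, which simplifies the order listed above.

```python
# Hypothetical lexicon, numbering, and vectors; a real system would derive
# all three from the corpus base rather than hard-code them.
LEXICON = {"中文", "词向量", "建模", "方法"}
NUMBERS = {"中文": (0, 0), "词向量": (0, 1), "建模": (0, 2), "方法": (0, 3)}
VECTORS = {
    (0, 0): [0.1, 0.2],
    (0, 1): [0.3, 0.4],
    (0, 2): [0.5, 0.6],
    (0, 3): [0.7, 0.8],
}


def segment(text, lexicon, max_len=4):
    """Forward maximum matching: at each position take the longest lexicon
    entry, falling back to a single character when nothing matches."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


def lookup(tokens, numbers, vectors):
    """Convert each token to its number and fetch the vector. A number of
    None marks a failed conversion (for example an unregistered word)."""
    rows = []
    for token in tokens:
        number = numbers.get(token)
        rows.append((token, number, vectors.get(number)))
    return rows


sentence = "中文词向量建模方法"
length = len(sentence)            # sentence length check from S5
tokens = segment(sentence, LEXICON)
rows = lookup(tokens, NUMBERS, VECTORS)
```

Returning `None` for a failed conversion leaves the handling of unregistered words to the caller; how the patent represents them is not described in the visible text.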

[0059] In the present embodiment, in the...


Abstract

The invention belongs to the field of vector modeling and in particular relates to a Chinese word vector modeling method. Existing Chinese word vector modeling methods only introduce information such as radicals and strokes in a simple way and cannot reasonably represent unregistered (out-of-vocabulary) words. They also cannot automatically update sentences, do not keep pace with current needs, and have a low degree of intelligence, which makes them inconvenient to use. To solve these problems, the invention provides a method comprising the following steps: S1, obtaining a corpus base of the form and sound features of Chinese words, classifying it into types, and assigning A-type numbers to the types; S2, sequentially assigning B-type numbers to the corpus base entries within each type; and S3, mapping the numbers into a vector space. With this method, unregistered words can be reasonably represented, sentences can be updated automatically, current needs can be kept pace with, the degree of intelligence is high, and use is convenient.

Description

Technical field

[0001] The invention relates to the technical field of vector modeling, in particular to a Chinese word vector modeling method.

Background

[0002] Word embeddings have become an essential part of deep learning-based natural language processing systems. Such systems encode words and sentences as fixed-length dense vectors, which greatly improves how neural networks process text data.

[0003] Existing Chinese word vector modeling methods only introduce information such as radicals and strokes in a simple way. They cannot reasonably represent unregistered (out-of-vocabulary) words, cannot automatically update sentences, do not keep pace with current needs, and have a low degree of intelligence, which makes them inconvenient to use.

[0004] Therefore, we propose a Chinese word vector modeling method to solve the above problems.

Contents of the invention

[0005] A Chinese word vector modeling method proposed by the present ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/284, G06F40/289, G06F16/31
CPC: G06F16/316, G06F40/284, G06F40/289
Inventor: 王君君
Owner: WEIFANG UNIV OF SCI & TECH