A time series data similarity measurement method and measurement system

A time series data similarity measurement technology, applied in neural learning methods, medical data mining, electrical digital data processing, and related fields. It addresses problems such as reduced efficiency and accuracy of similarity calculation and loss of time information, and yields a dense, reasonable, and effective representation.


CN109948646A | Inactive | Publication Date: 2019-06-28 | XI AN JIAOTONG UNIV


Examples


Embodiment

[0080] Referring to Figure 1, a time series data similarity measurement method according to an embodiment of the present invention is applied to the similarity measurement of electronic health records and includes the following steps:

[0081] S101, constructing an effective representation of medical sequence events in electronic health records.

[0082] Step 1: The electronic health record (EMR) matrix is too sparse, so the first task is to make the sparse matrix dense and reduce the dimensionality of the high-dimensional matrix. Referring to Figure 2, each EMR matrix is converted into an event sequence in which events are arranged by their relative time of occurrence; events that occur on the same day are left unordered. The result is a vector H.

[0083] Step 2: Use word2vec to map each medical event in the electronic health record to a fixed-length vector, which captures the relative relationships among the medical events in the record. word2vec is an efficient...
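Steps 1 and 2 can be illustrated with a minimal sketch. It assumes a small day-by-event count matrix, hypothetical event names, and a hypothetical window size, none of which come from the patent text. The first function flattens the sparse matrix into a time-ordered sequence H; the second draws the (center, context) pairs on which a skip-gram word2vec model is typically trained. Same-day events are sorted by name only to make the output reproducible, since the method treats them as unordered.

```python
from typing import List, Tuple


def emr_to_sequence(emr: List[List[int]], event_names: List[str]) -> List[Tuple[int, str]]:
    """Flatten a sparse day-by-event matrix into a time-ordered list of (day, event)."""
    sequence = []
    for day, row in enumerate(emr):
        # Events on the same day carry no order; sorting is only for reproducibility.
        todays = sorted(event_names[j] for j, count in enumerate(row) if count > 0)
        sequence.extend((day, name) for name in todays)
    return sequence


def skipgram_pairs(events: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Draw (center, context) pairs from a sliding window, as skip-gram training input."""
    pairs = []
    for i, center in enumerate(events):
        lo = max(0, i - window)
        hi = min(len(events), i + window + 1)
        pairs.extend((center, events[j]) for j in range(lo, hi) if j != i)
    return pairs


# Hypothetical toy record: rows are days, columns are event types.
event_names = ["dx_diabetes", "lab_hba1c", "rx_metformin"]
emr = [
    [1, 1, 0],  # day 0: diagnosis and lab test
    [0, 0, 0],  # day 1: no events (this sparsity is what Step 1 removes)
    [0, 0, 1],  # day 2: prescription
    [0, 1, 0],  # day 3: follow-up lab test
]

H = emr_to_sequence(emr, event_names)
pairs = skipgram_pairs([name for _, name in H], window=1)
print(H)
print(pairs)
```

In practice the pairs would be fed to a word2vec implementation rather than consumed directly; the sketch stops at the training input because the excerpt does not state which word2vec variant or hyperparameters are used.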


Abstract

The invention discloses a time series data similarity measurement method and measurement system. The method comprises the following steps: first, learning a vector representation of each event across all time series data; second, mapping the occurrence time of each event into a vector with the same dimension as the event vector and embedding it into the event vector through vector addition; finally, feeding the resulting event sequence representation into a convolutional neural network for supervised learning, thereby learning a robust time series data similarity measurement model. Similarity measurement is then carried out with the obtained model. The method makes the representation of time series data more reasonable and effective, so the accuracy of time series data similarity measurement can be improved.
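The second step of the abstract, adding a time vector to each event vector, can be sketched as follows. The abstract does not say how occurrence time is mapped to a vector, so this sketch assumes a sinusoidal encoding; that choice, the function names, and the toy values are illustrative assumptions rather than the patented mapping.

```python
import math
from typing import List


def time_vector(t: float, dim: int) -> List[float]:
    """Encode time t as a dim-length vector (sinusoidal encoding, an assumption)."""
    vec = []
    for k in range(dim):
        freq = 1.0 / (10000 ** (2 * (k // 2) / dim))
        # Even positions use sine, odd positions use cosine.
        vec.append(math.sin(t * freq) if k % 2 == 0 else math.cos(t * freq))
    return vec


def add_time(event_vecs: List[List[float]], times: List[float]) -> List[List[float]]:
    """Embed each event's occurrence time into its vector by element-wise addition."""
    dim = len(event_vecs[0])
    return [
        [e + p for e, p in zip(vec, time_vector(t, dim))]
        for vec, t in zip(event_vecs, times)
    ]


# Hypothetical 4-dimensional event vectors and their occurrence days.
event_vecs = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.1, 0.0, 0.2]]
times = [0, 2]

fused = add_time(event_vecs, times)
print(fused)
```

Because the time vector has the same dimension as the event vector, the fused sequence keeps its shape and could be passed to a convolutional network unchanged, which is the property the abstract relies on.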

Description

Technical field

[0001] The invention belongs to the technical field of time series data similarity, and in particular relates to a time series data similarity measurement method and a measurement system.

Background

[0002] Data similarity measurement is a basic problem in data science and arises in many application fields, such as natural language processing, data retrieval, and cohort analysis. Real-world scenarios contain large amounts of time series data, and these data are typically temporal, high-dimensional, heterogeneous, sparse, of unequal dimension, and irregular.

[0003] At present, sequence representations based on one-hot vectors are commonly used. Because such representations are sparse and high-dimensional, they seriously reduce the efficiency and accuracy of similarity calculation. In addition, existing methods usually aggregate sequence events within a specific time period, ignorin...


Application Information

Publication date: 28 Jun 2019
Publication number: CN109948646A
IPC: G06K9/62; G06N3/04; G06N3/08; G06F17/27; G16H50/70
Inventors: 钱步月; 张先礼