Analysis device, analysis method, and analysis program
The analysis device enhances prediction accuracy by aggregating and analyzing time-series data with missing values using CNNs, addressing the limitations of conventional methods in handling missing data.
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Applications
- Current Assignee / Owner
- HITACHI LTD
- Filing Date
- 2024-12-16
- Publication Date
- 2026-06-26
AI Technical Summary
Conventional techniques fail to accurately handle missing values in time-series data, leading to low prediction accuracy, especially in predicting cognitive functions using wearable devices.
An analysis device that processes time-series data by aggregating detected values and missing features, performing frequency analysis, and using convolutional neural networks (CNN) to identify states in living organisms, thereby incorporating missing data meaningfully into the learning process.
Improves prediction accuracy by actively utilizing missing data features, enabling accurate identification of states such as sleep-wake patterns and cognitive decline.
Abstract
Description
Technical Field
[0001] The present invention relates to an analysis device, an analysis method, and an analysis program for analyzing data.
Background Art
[0002] Patent Document 1 discloses a time-series data feature amount extraction device. Based on a received input time-series data length and a received minimum observation interval, this device processes a received group of unevenly spaced time-series data into a group of evenly spaced time-series data that includes missing values and a missing information group that indicates the presence or absence of missing values. It takes as the error the difference between the non-missing elements of the matrix of the evenly spaced time-series data group and the corresponding output elements of the output layer of a model, learns the weight vectors of each layer of the model, and stores the weight vectors in a storage unit as model parameters. It then receives time-series data for which feature amounts are to be extracted, inputs that data into the model, calculates the values of the intermediate layer of the model using the stored model parameters, and outputs the calculated values of the intermediate layer as feature amounts representing the temporal change of the data.
Prior Art Documents
Patent Documents
[0003]
Patent Document 1
Summary of the Invention
Problems to be Solved by the Invention
[0004] However, the technique of Patent Document 1 only ignores missing values and does not consider accurately filling in missing values or finding meaning in the missing values themselves. Therefore, for example, the accuracy of predicting cognitive functions using wearable devices is low, especially when dealing with time-series data related to cognitive functions including missing values.
[0005] Thus, the conventional techniques described above cannot achieve sufficient prediction accuracy when the missing data pattern itself can be a meaningful feature for prediction.
[0006] The present invention aims to improve the accuracy of predictions when there are gaps in time-series biological data.
Means for Solving the Problem
[0007] An analysis device according to one aspect of the invention disclosed in this application has a processor that executes: an input process that inputs first time-series data of detected values of a living organism and second time-series data indicating whether the organism is in a sleep state or a wake state; an extraction process that extracts, from the first time-series data input by the input process, the time-series detected values for the period in which the second time-series data indicates either the sleep state or the wake state; a first aggregation process that aggregates, for each unit time, the time-series detected values extracted by the extraction process for the period in either state, calculates statistical values, and outputs first one-dimensional array data, which is a time series of the statistical values; a second aggregation process that aggregates, for each unit time, the missing values in the time-series detected values for the period in either state, calculates missing features, and outputs second one-dimensional array data, which is a time series of the missing features; a division process that divides the first one-dimensional array data output by the first aggregation process based on a specific frequency to generate a plurality of first two-dimensional array data consisting of elements with statistical values and elements without statistical values, and divides the second one-dimensional array data output by the second aggregation process based on the specific frequency to generate a plurality of second two-dimensional array data consisting of elements with missing features and elements without missing features; a combining process that combines the plurality of first two-dimensional array data and the plurality of second two-dimensional array data generated by the division process; and an identification process that performs a convolution operation on the combination result from the combining process, extracts features of the combination result, and outputs an identification result that identifies the state of the living organism.
Effects of the Invention
[0008] According to a typical embodiment of the present invention, it is possible to improve the prediction accuracy when there are gaps in time-series biological data. Problems, configurations, and effects other than those mentioned above will be clarified by the following description of the embodiments.
Brief Description of the Drawings
[0009]
[Figure 1] Figure 1 is a block diagram showing an example of the hardware configuration of the analysis device.
[Figure 2] Figure 2 is a block diagram showing an example of the functional configuration of the analysis device.
[Figure 3] Figure 3 is a block diagram showing an example of the functional configuration of the learning unit.
[Figure 4] Figure 4 is an explanatory diagram showing examples of aggregation performed by the first aggregation unit and the second aggregation unit.
[Figure 5] Figure 5 is an explanatory diagram showing an example of frequency analysis performed by the frequency analysis unit.
[Figure 6] Figure 6 is an explanatory diagram showing an example of the generation of the first two-dimensional array data group by the division unit.
[Figure 7] Figure 7 is an explanatory diagram showing an example of the generation of the second two-dimensional array data group by the division unit.
[Figure 8] Figure 8 is an explanatory diagram showing Example 1 of combined two-dimensional array data.
[Figure 9] Figure 9 is an explanatory diagram showing Example 2 of combined two-dimensional array data.
[Figure 10] Figure 10 is a flowchart showing an example of the learning process procedure performed by the learning unit.
[Figure 11] Figure 11 is a block diagram showing an example of the functional configuration of the prediction unit.
[Figure 12] Figure 12 is a flowchart showing an example of the prediction process procedure performed by the prediction unit.
[Figure 13] Figure 13 is an explanatory diagram showing an example of the display screen.
[Figure 14] Figure 14 is an explanatory diagram showing Example 3 of combined two-dimensional array data.
Modes for Carrying Out the Invention
[0010] <Figure 1: Hardware configuration of the analysis device> Figure 1 is a block diagram showing an example of the hardware configuration of an analysis device. The analysis device 100 includes a processor 101, a storage device 102, an input device 103, an output device 104, and a communication interface (communication IF) 105. The processor 101, storage device 102, input device 103, output device 104, and communication IF 105 are connected by a bus 106. The processor 101 controls the analysis device 100. The storage device 102 serves as the work area for the processor 101. The storage device 102 is a non-transitory or transitory recording medium that stores various programs and data. Examples of the storage device 102 include ROM (Read Only Memory), RAM (Random Access Memory), HDD (Hard Disk Drive), and flash memory. The input device 103 inputs data. Examples of the input device 103 include a keyboard, mouse, touch panel, numeric keypad, scanner, microphone, and sensor. The output device 104 outputs data. Examples of the output device 104 include a display, a printer, and a speaker. The communication IF 105 connects to a network and sends and receives data.
[0011] <Figure 2 Functional configuration of the analysis device 100> Figure 2 is a block diagram showing an example of the functional configuration of the analysis device 100. The analysis device 100 has a learning unit 201 and a prediction unit 202. The learning unit 201 extracts features from one-dimensional time-series data, performs learning with teacher labels using the extracted features, and generates a learning model 203. The prediction unit 202 extracts features from the time-series data to be predicted, inputs the extracted features into the learning model 203, and identifies the time-series data to be predicted.
[0012] The learning unit 201 and the prediction unit 202 are specifically realized, for example, by causing the processor 101 to execute a program stored in the storage device 102 shown in FIG. 1. Further, at least one of the learning unit 201 and the prediction unit 202 may be implemented on another computer communicable with the analysis device 100.
[0013] The learning model 203 is stored in the storage device 102 shown in FIG. 1 or another computer communicable with the analysis device 100.
[0014] <Figure 3 Functional configuration example of the learning unit 201> Figure 3 is a block diagram showing a functional configuration example of the learning unit 201. The learning unit 201 includes an input unit 309, an extraction unit 310, a first aggregation unit 311, a second aggregation unit 312, a frequency analysis unit 313, a division unit 314, a combining unit 315, an identification unit 316, and an adjustment unit 317.
[0015] The input unit 309 inputs two types of synchronized one-dimensional time series data 300 and 301. The time series data 300 is one-dimensional biological data indicating the temporal change of biological data detected from a living body. Hereinafter, it may be referred to as one-dimensional biological data 300. The biological data is, for example, the number of steps, heart rate, body surface temperature, blood pressure, blood oxygen concentration, pulse rate, or acceleration (for example, the acceleration of arm movement) detected by a wearable device, such as a smartwatch, worn on a living body such as a human. The one-dimensional biological data 300 forms a learning data set in which it is paired with a teacher label 370 (for example, 0: normal, 1: cognitive function decline).
[0016] The time series data 301 is sleep-wake state data indicating whether a living body, such as a human, is in a sleep state or a wake state. Hereinafter, it may be referred to as sleep-wake state data 301. The wearable device has a predictive model for predicting the sleep-wake state data 301.
[0017] The predictive model is a classifier trained on a training dataset of subjects, where the training data includes, for example, the subject's step count, heart rate, body surface temperature, blood pressure, blood oxygen saturation, pulse rate, and acceleration (e.g., acceleration of arm movement), and the subject's sleep or wake state is the ground truth label.
[0018] The predictive model, for example, takes biometric data such as a person's step count, heart rate, blood pressure, blood oxygen saturation, pulse rate, body surface temperature, and acceleration (for example, the acceleration of arm movement) as input and outputs a prediction result indicating whether the person is in a sleep or wakeful state. This time-series prediction result is the sleep-wake state data 301.
[0019] Furthermore, if the predictive model is implemented within a wearable device, and the user forgets to wear the device, not only the one-dimensional biometric data 300 but also the sleep-wake state data 301 will not be detected. For this reason, the predictive model may be implemented within the analysis device 100 instead of the wearable device. In this case, even if the user forgets to wear the wearable device, the analysis device 100 can predict the sleep-wake pattern from the sleep-wake state data 301, assuming there are no missing data points, using the predictive model.
[0020] The extraction unit 310 extracts, from the one-dimensional biological data 300, the one-dimensional biological data for the periods in which the sleep-wake state data 301 indicates the wake state. The extraction unit 310 likewise extracts the one-dimensional biological data for the periods in which the sleep-wake state data 301 indicates the sleep state. The one-dimensional biological data extracted by the extraction unit 310 is referred to as extracted one-dimensional biological data 300. Specifically, data extracted for the wake state is referred to as wake state extracted one-dimensional biological data, and data extracted for the sleep state is referred to as sleep state extracted one-dimensional biological data.
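The extraction described in this paragraph can be sketched as a simple filter over two synchronized series. This is only an illustrative sketch: the function name `extract_by_state`, the state encoding (`SLEEP = 0`, `WAKE = 1`), and the use of `None` for a missing detected value are assumptions, not details given in the source.

```python
from typing import List, Optional

# Hypothetical encoding of the sleep-wake state data 301.
SLEEP, WAKE = 0, 1


def extract_by_state(values: List[Optional[float]],
                     states: List[int],
                     target: int) -> List[Optional[float]]:
    """Keep the samples whose synchronized state equals `target`.

    `None` marks a missing detected value (an assumption). Missing
    samples are kept so that later steps can still count them.
    """
    if len(values) != len(states):
        raise ValueError("the two series must be synchronized (same length)")
    return [v for v, s in zip(values, states) if s == target]


# Toy data: one sample per detection interval.
heart_rate = [62.0, None, 60.0, 75.0, 78.0, None]
state = [SLEEP, SLEEP, SLEEP, WAKE, WAKE, WAKE]

wake_part = extract_by_state(heart_rate, state, WAKE)
sleep_part = extract_by_state(heart_rate, state, SLEEP)
```

Keeping the missing samples in the extracted series matters here, because the second aggregation unit 312 counts them afterwards.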
[0021] The first aggregation unit 311 aggregates the detected values from the extracted one-dimensional biological data 300 at unit time intervals. The unit time is a time interval longer than the detection interval between consecutive detected values in the extracted one-dimensional biological data 300. For example, if the extracted one-dimensional biological data 300 consists of detected values every second, the unit time is one minute. The first aggregation unit 311 aggregates the detected values at unit time intervals and calculates statistical values. A statistical value is a statistic of the detected values within a unit time interval, such as their sum, mean, maximum, minimum, median, or mode. The time-series data of the statistical values is referred to as the first one-dimensional array data 320. In the first one-dimensional array data 320, consecutive statistical values may be connected by line segments.
[0022] The second aggregation unit 312 aggregates the missing values in the extracted one-dimensional biological data 300 for each unit time. A missing value is a detected value that should have been detected at a detection time point but was not detected for some reason, or a detected value, such as an outlier, that was detected but fell outside a predetermined range. The second aggregation unit 312 aggregates the missing values for each unit time and calculates missing features. A missing feature is, for example, the number of missing values per unit time (hereinafter referred to as the number of missing values). The time-series data of the missing features is referred to as the second one-dimensional array data 330. In the second one-dimensional array data 330, consecutive missing features may be connected by line segments.
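As a small illustration of treating out-of-range detected values as missing, the sketch below replaces them with a missing marker. The function name `mark_missing`, the `None` marker, and the example range of 30 to 220 are hypothetical choices, not values from the source.

```python
from typing import List, Optional


def mark_missing(values: List[Optional[float]],
                 low: float,
                 high: float) -> List[Optional[float]]:
    """Return a copy in which undetected or out-of-range values are None."""
    return [v if v is not None and low <= v <= high else None
            for v in values]


# Hypothetical heart-rate samples; 250.0 and -5.0 are treated as outliers.
heart_rate = [61.0, None, 250.0, 63.0, -5.0]
cleaned = mark_missing(heart_rate, 30.0, 220.0)
missing_count = sum(v is None for v in cleaned)
```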
[0023] [Figure 4: Aggregation by the first aggregation unit 311 and the second aggregation unit 312] Figure 4 is an explanatory diagram showing examples of aggregation by the first aggregation unit 311 and the second aggregation unit 312. (A) shows an example of interpolation when there are missing values in the extracted one-dimensional biological data 300. (B) shows an example of aggregation by the first aggregation unit 311 and the second aggregation unit 312. In Figure 4, as an example, the detection interval for the extracted one-dimensional biological data 300 is set to 15 seconds and the unit time to 1 minute.
[0024] (A) If the extracted one-dimensional biological data 300 contains a missing value, the missing value can be interpolated from the two detected values adjacent to the time at which it occurred. With linear interpolation, however, the interpolated value is merely the average of those two detected values, not the true value that went undetected. Interpolation also erases the fact that a value was missing, so when the missing values themselves carry meaning, interpolation lowers learning accuracy and prediction accuracy.
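To make the drawback concrete, the following sketch fills an isolated gap by linear interpolation. The helper name `linear_interpolate` and the `None` marker are assumptions for illustration; the sketch only handles a single missing sample between two detected neighbours.

```python
from typing import List, Optional


def linear_interpolate(values: List[Optional[float]]) -> List[Optional[float]]:
    """Fill isolated gaps with the average of the two adjacent detected values."""
    out = list(values)
    for i in range(1, len(values) - 1):
        left, right = values[i - 1], values[i + 1]
        if values[i] is None and left is not None and right is not None:
            out[i] = (left + right) / 2
    return out


observed = [70.0, None, 90.0]
filled = linear_interpolate(observed)
# After filling, the series no longer records that a value was missing.
missing_after = sum(v is None for v in filled)
```

The filled value is 80.0 whatever the undetected value actually was, and the count of missing values drops to zero, which is the information the second aggregation unit 312 is meant to preserve.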
[0025] (B) The first aggregation unit 311 aggregates the detected values for each unit time and calculates statistical values from the aggregated detected values. For example, in interval B1, there are 3 detected values and 1 missing value, so the first aggregation unit 311 calculates statistical values for the 3 detected values. Also, in interval B2, there are 2 detected values and 2 missing values, so the first aggregation unit 311 calculates statistical values for the 2 detected values. The series of statistical values calculated in this way constitutes the first one-dimensional array data 320.
[0026] (B) The second aggregation unit 312 aggregates the missing values for each unit time and calculates the number of missing values in that unit time by counting them. For example, in interval B1, there are 3 detected values and 1 missing value, so the second aggregation unit 312 calculates the number of missing values as 1. Also, in interval B2, there are 2 detected values and 2 missing values, so the second aggregation unit 312 calculates the number of missing values as 2. The series of missing value counts calculated in this way constitutes the second one-dimensional array data 330.
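The two aggregations for intervals B1 and B2 can be sketched as follows, with a 15-second detection interval and a 1-minute unit time (four samples per unit). The function name `aggregate`, the `None` marker for missing values, the choice of the mean as the statistic, and the rule of emitting `None` when a unit has no detected value at all are assumptions made for this illustration.

```python
from statistics import mean
from typing import Callable, List, Optional, Sequence, Tuple


def aggregate(values: Sequence[Optional[float]],
              samples_per_unit: int,
              stat: Callable[[List[float]], float] = mean
              ) -> Tuple[List[Optional[float]], List[int]]:
    """Return (statistical values, missing counts), one entry per unit time."""
    stats: List[Optional[float]] = []
    missing_counts: List[int] = []
    for start in range(0, len(values), samples_per_unit):
        window = values[start:start + samples_per_unit]
        present = [v for v in window if v is not None]
        # The statistic is computed over detected values only.
        stats.append(stat(present) if present else None)
        missing_counts.append(len(window) - len(present))
    return stats, missing_counts


series = [10.0, 12.0, None, 14.0,   # B1: 3 detected values, 1 missing
          20.0, None, None, 22.0]   # B2: 2 detected values, 2 missing
first_1d, second_1d = aggregate(series, samples_per_unit=4)
```

Here `first_1d` plays the role of the first one-dimensional array data 320 and `second_1d` that of the second one-dimensional array data 330. Any other statistic named in the text (sum, maximum, median, and so on) could be passed as `stat`.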
[0027] Returning to Figure 3, the frequency analysis unit 313 performs frequency analysis on the first one-dimensional array data 320 and outputs specific frequencies to the division unit 314.
[0028] [Figure 5 Frequency analysis by frequency analysis unit 313] Figure 5 is an explanatory diagram showing an example of frequency analysis performed by the frequency analysis unit 313. The frequency analysis unit 313 performs a Fast Fourier Transform on the first one-dimensional array data 320 to generate a frequency spectrum 500. The frequency spectrum 500 shows the intensity for each frequency. The intensity of a frequency is a value that depends on the change in the detected value, and the more periodic the change, the higher the intensity of that frequency. In the frequency spectrum 500, the frequencies are denoted as f1, f2, f3, ..., fn (n is an integer of 1 or more) in descending order of intensity. If f1, f2, f3, ..., fn are not distinguished, they are referred to as a specific frequency fn.
[0029] The frequency analysis unit 313 selects a specific frequency from the frequency spectrum 500 and outputs it to the division unit 314. The specific frequency is, for example, the frequency f1 with the maximum intensity in the frequency spectrum 500. The specific frequency is not limited to f1, but may also be, for example, a frequency whose intensity is above a threshold, or any of the top n (where n is an integer greater than or equal to 1) frequencies fn in terms of intensity.
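A minimal sketch of selecting the frequency with the highest intensity is shown below. The text names a Fast Fourier Transform; to stay within the standard library this sketch uses a direct discrete Fourier transform, which yields the same spectrum more slowly. The function names, the removal of the mean before the transform, and the assumption that the input has no missing statistical values are all choices of this illustration.

```python
import cmath
import math
from typing import List


def spectrum(x: List[float]) -> List[float]:
    """Intensity for k = 1 .. len(x)//2 cycles per record (mean removed)."""
    n = len(x)
    avg = sum(x) / n
    centered = [v - avg for v in x]
    intensities = []
    for k in range(1, n // 2 + 1):
        coeff = sum(c * cmath.exp(-2j * math.pi * k * t / n)
                    for t, c in enumerate(centered))
        intensities.append(abs(coeff))
    return intensities


def dominant_frequency(x: List[float], unit_time: float = 1.0) -> float:
    """Frequency (cycles per time unit) with the highest intensity."""
    intensities = spectrum(x)
    best_k = max(range(len(intensities)), key=intensities.__getitem__) + 1
    return best_k / (len(x) * unit_time)


# Toy series: a pattern that repeats every 8 unit times, 32 samples long.
signal = [math.sin(2 * math.pi * t / 8) for t in range(32)]
f1 = dominant_frequency(signal)
period = 1 / f1
```

For this toy series the selected frequency is 0.125 cycles per unit time, so the division period 1 / fn is 8 unit times.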
[0030] Returning to Figure 3, the division unit 314 divides the first one-dimensional array data 320 based on a specific frequency fn to generate a plurality of first two-dimensional array data. Specifically, for example, the division unit 314 divides the first one-dimensional array data 320 with a period of 1 / fn, which is the reciprocal of the specific frequency fn. Each of the divided plurality of first two-dimensional array data is matrix data in a coordinate plane composed of a horizontal axis indicating time and a vertical axis indicating statistical values, where, for example, elements with statistical values (including line segments if consecutive statistical values are connected by line segments) have a value of 1, and elements without statistical values have a value of 0. The values of both elements are just examples, and they can be different values. The divided plurality of first two-dimensional array data are referred to as the first two-dimensional array data group 351.
[0031] Similarly, the division unit 314 divides the second one-dimensional array data 330 based on a specific frequency fn to generate a plurality of second two-dimensional array data. Specifically, for example, the division unit 314 divides the second one-dimensional array data 330 with a period of 1 / fn, which is the reciprocal of the specific frequency fn. Each of the divided plurality of second two-dimensional array data is matrix data in a coordinate plane composed of a horizontal axis indicating time and a vertical axis indicating missing features (number of missing values), where, for example, elements with missing features (which may include line segments if consecutive missing features are connected by line segments) are set to 1, and elements without missing features are set to 0. The values of both elements are just examples, and they may be different. The divided plurality of second two-dimensional array data are referred to as the second two-dimensional array data group 352.
[0032] The number of divisions of the first one-dimensional array data 320 and the number of divisions of the second one-dimensional array data 330 are the same. The division unit 314 may divide the first one-dimensional array data 320 and the second one-dimensional array data 330 using a predetermined specific frequency fn, rather than a specific frequency fn from the frequency analysis unit 313.
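The division into periods and the conversion of each segment into a 0/1 matrix might be sketched as follows. The function names, the rounding of values onto a fixed number of rows, the dropping of a trailing partial segment, and the toy numbers are assumptions; connecting consecutive values by line segments is not implemented here.

```python
from typing import List, Optional, Sequence

Grid = List[List[int]]


def split_by_period(series: Sequence, period: int) -> List[Sequence]:
    """Cut the series into consecutive segments of `period` samples.

    A trailing partial segment is dropped (an assumption of this sketch).
    """
    if period < 1:
        raise ValueError("period must be at least one sample")
    count = len(series) // period
    return [series[i * period:(i + 1) * period] for i in range(count)]


def rasterize(segment: Sequence[Optional[float]],
              v_min: float, v_max: float, height: int) -> Grid:
    """Columns are time, rows are value levels (row 0 is the largest value).

    An element is 1 where a value exists and 0 elsewhere.
    """
    grid = [[0] * len(segment) for _ in range(height)]
    span = v_max - v_min
    for col, v in enumerate(segment):
        if v is None:
            continue  # no value in this column
        level = 0 if span == 0 else round((v - v_min) / span * (height - 1))
        level = min(max(level, 0), height - 1)
        grid[height - 1 - level][col] = 1
    return grid


stats = [0.0, 1.0, 2.0, 3.0, 0.0, None, 2.0, 3.0]   # toy array data 320
missing = [0, 1, 0, 0, 0, 4, 0, 1]                  # toy array data 330
period = round(1 / 0.25)  # fn = 0.25 cycles per unit time, so 4 samples

first_group = [rasterize(s, 0.0, 3.0, 4) for s in split_by_period(stats, period)]
second_group = [rasterize(s, 0, 4, 5) for s in split_by_period(missing, period)]
```

Because both one-dimensional arrays have the same length and are cut with the same period, the two groups contain the same number of matrices, as the text requires.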
[0033] [Figure 6: Example of generation of the first two-dimensional array data group 351 by the division unit 314] Figure 6 is an explanatory diagram showing an example of the generation of the first two-dimensional array data group 351 by the division unit 314. The division unit 314 divides the first one-dimensional array data 320 into first two-dimensional array data 601 to 606 with a period of 1 / fn, which is the reciprocal of a specific frequency fn.
[0034] If the first two-dimensional array data 601 to 606 are not distinguished, they will be referred to as the first two-dimensional array data 600. In the first two-dimensional array data 600, consecutive statistical values may be connected by line segments. The first two-dimensional array data 600 is, for example, matrix data in which elements where statistical values (including line segments if consecutive statistical values are connected by line segments) exist in that coordinate space are represented as 1, and elements where they do not exist are represented as 0. The first two-dimensional array data 600 may also be image data.
[0035] [Figure 7: Example of generation of the second two-dimensional array data group 352 by the division unit 314] Figure 7 is an explanatory diagram showing an example of the generation of the second two-dimensional array data group 352 by the division unit 314. The division unit 314 divides the second one-dimensional array data 330 into second two-dimensional array data 701 to 706 with a period of 1 / fn, which is the reciprocal of a specific frequency fn.
[0036] If the second two-dimensional array data 701 to 706 are not distinguished, they will be referred to as the second two-dimensional array data 700. In the second two-dimensional array data 700, consecutive missing counts may be connected by line segments. The second two-dimensional array data 700, like the first two-dimensional array data 600, is matrix data in which elements where a missing count (including line segments if consecutive missing counts are connected by line segments) exists in that coordinate space are represented as 1, and elements where it does not exist are represented as 0. The second two-dimensional array data 700 may also be image data.
[0037] Returning to Figure 3, the combining unit 315 combines the first two-dimensional array data group 351 and the second two-dimensional array data group 352, generating the combined two-dimensional array data 360 as a result. Specifically, for example, the combining unit 315 combines the first two-dimensional array data 601 to 606 and the second two-dimensional array data 701 to 706 in a distributed manner.
[0038] [Figures 8 and 9: Combined two-dimensional array data 360] Figure 8 is an explanatory diagram showing Example 1 of the combined two-dimensional array data 360. In Figure 8, the combined two-dimensional array data 360 is an example in which the first two-dimensional array data 600 and the second two-dimensional array data 700 are combined by arranging them alternately one by one in the direction of the vertical axis (statistical value, number of missing values).
[0039] Figure 9 is an explanatory diagram showing Example 2 of the combined two-dimensional array data 360. In Figure 9, the combined two-dimensional array data 360 is an example in which the first two-dimensional array data 600 and the second two-dimensional array data 700 are combined by arranging them alternately two at a time in the direction of the vertical axis (statistical value, number of missing values).
[0040] The combining by the combining unit 315 depends, for example, on the size of the kernel 800. The kernel 800 is a filter used when performing a convolution operation on the combined two-dimensional array data 360 in the CNN. The combining unit 315 combines the first two-dimensional array data 601 to 606 and the second two-dimensional array data 701 to 706 so that, regardless of where the kernel 800 is positioned in the combined two-dimensional array data 360, it covers at least part of one or more first two-dimensional array data 600 and one or more second two-dimensional array data 700.
[0041] Let the size of kernel 800 be m × m (where m is an integer greater than or equal to 1). Also, let V be the length of the vertical axis of the first two-dimensional array data 600 and the second two-dimensional array data 700, respectively. If m > V, then the first two-dimensional array data 600 and the second two-dimensional array data 700 are arranged alternately one at a time along the vertical axis, as shown in Figure 8. If m > 2V, then the first two-dimensional array data 600 and the second two-dimensional array data 700 are arranged alternately two at a time along the vertical axis. Therefore, if m > kV (where k is an integer greater than or equal to 1), then the first two-dimensional array data 600 and the second two-dimensional array data 700 are arranged alternately k at a time along the vertical axis.
[0042] In Figures 8 and 9, the vertical axis is shorter than the horizontal axis (time), so the combining unit 315 arranges the first two-dimensional array data 600 and the second two-dimensional array data 700 alternately in the vertical direction. However, if the horizontal axis is shorter than the vertical axis, the combining unit 315 can arrange the first two-dimensional array data 600 and the second two-dimensional array data 700 alternately in the horizontal direction.
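The vertical interleaving rule (k matrices at a time when m > kV) can be sketched as below. The text only states the condition m > kV; choosing the largest such k, falling back to k = 1 when no k satisfies it, and the function names are assumptions of this sketch.

```python
from typing import List

Grid = List[List[int]]


def block_run_length(m: int, v: int) -> int:
    """Largest k >= 1 with m > k * v, falling back to 1 (an assumption)."""
    return max((m - 1) // v, 1)


def interleave_vertically(first: List[Grid], second: List[Grid], k: int) -> Grid:
    """Stack k first matrices, then k second matrices, and repeat."""
    if len(first) != len(second):
        raise ValueError("both groups must contain the same number of matrices")
    rows: Grid = []
    for i in range(0, len(first), k):
        for grid in first[i:i + k]:
            rows.extend(grid)
        for grid in second[i:i + k]:
            rows.extend(grid)
    return rows


# Toy groups with V = 1 row per matrix so the ordering is easy to read.
first_group = [[[1, 0]], [[1, 1]]]
second_group = [[[0, 0]], [[0, 1]]]

one_at_a_time = interleave_vertically(first_group, second_group, k=1)
two_at_a_time = interleave_vertically(first_group, second_group, k=2)
```

With k = 1 the rows alternate between the two groups, as in Figure 8; with k = 2 two matrices of one group are followed by two of the other, as in Figure 9.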
[0043] Returning to Figure 3, the identification unit 316 has a CNN. The identification unit 316 inputs the combined two-dimensional array data 360 into the CNN, extracts features from the combined two-dimensional array data 360, and outputs an identification result based on the extracted features. The features are an array composed of numerical values obtained by convolving the kernel 800 onto statistical values such as voltage values detected from a sensor. The identification result is an output value from the fully connected layer included in the CNN, and takes a value in the range of 0 to 1.
[0044] The adjustment unit 317 adjusts the weights of the CNN kernel 800 based on the identification result from the identification unit 316 and the teacher label 370 indicating the state of the organism in the extracted one-dimensional biological data 300, thereby setting up a learning model 203 that identifies the state of the organism. Specifically, for example, the adjustment unit 317 calculates the value of the loss function based on the difference between the identification result and the teacher label 370, and performs backpropagation to reduce the value of the loss function, thereby controlling the weights of the kernel 800 used in the CNN's convolution operation.
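The idea of adjusting kernel weights to reduce a loss can be illustrated with a deliberately tiny stand-in for the CNN: a single m × m kernel, average pooling over all kernel positions, a sigmoid output in the range 0 to 1, and one gradient-descent step on the binary cross-entropy loss. This is not the network of the source, which also has a fully connected layer. The architecture, the loss, the learning rate, and all names here are assumptions.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def conv_mean(x: Matrix, w: Matrix) -> float:
    """Slide kernel w over x (no padding) and average the responses."""
    m = len(w)
    rows, cols = len(x) - m + 1, len(x[0]) - m + 1
    total = 0.0
    for i in range(rows):
        for j in range(cols):
            total += sum(w[a][b] * x[i + a][j + b]
                         for a in range(m) for b in range(m))
    return total / (rows * cols)


def predict(x: Matrix, w: Matrix, bias: float) -> float:
    """Identification result in the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-(conv_mean(x, w) + bias)))


def train_step(x: Matrix, label: int, w: Matrix, bias: float,
               lr: float = 0.5) -> Tuple[Matrix, float, float]:
    """One gradient step; returns (new kernel, new bias, loss before the step)."""
    m = len(w)
    rows, cols = len(x) - m + 1, len(x[0]) - m + 1
    p = predict(x, w, bias)
    loss = -(label * math.log(p) + (1 - label) * math.log(1 - p))
    dz = p - label  # gradient of the loss w.r.t. the pre-sigmoid value
    new_w = [[w[a][b] - lr * dz
              * sum(x[i + a][j + b] for i in range(rows) for j in range(cols))
              / (rows * cols)
              for b in range(m)] for a in range(m)]
    return new_w, bias - lr * dz, loss


# Toy combined two-dimensional array data with teacher label 1.
x = [[1, 0, 0, 1], [0, 1, 1, 0], [0, 1, 1, 0], [1, 0, 0, 1]]
w0 = [[0.0, 0.0], [0.0, 0.0]]
w1, b1, loss1 = train_step(x, 1, w0, 0.0)
w2, b2, loss2 = train_step(x, 1, w1, b1)
```

In this toy run the loss before the second step is smaller than before the first, which is the behaviour that repeated backpropagation over the learning data set aims for.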
[0045] <Figure 10: Learning process procedure by the learning unit 201> Figure 10 is a flowchart showing an example of the learning process procedure performed by the learning unit 201.
[0046] (Step S1000) The learning unit 201 receives one-dimensional biological data 300 and sleep-wake state data 301 via the input unit 309.
[0047] (Step S1001) The learning unit 201, using the extraction unit 310, extracts one-dimensional data from the one-dimensional biological data 300 for the period in which the sleep-wake state data 301 is either in an awakened state or a sleep state.
[0048] (Step S1002) The learning unit 201, using the first aggregation unit 311, aggregates the detected values of the extracted one-dimensional biological data 300 extracted in step S1001 into statistical values per unit time, as shown in Figure 4, and generates the first one-dimensional array data 320.
[0049] (Step S1003) The learning unit 201, using the second aggregation unit 312, aggregates the number of missing values per unit time, as shown in Figure 4, and generates the second one-dimensional array data 330.
[0050] (Step S1004) The learning unit 201, using the frequency analysis unit 313, performs frequency analysis on the first one-dimensional array data 320 as shown in Figure 5, and generates a frequency spectrum 500.
[0051] (Step S1005) The learning unit 201, using the frequency analysis unit 313, selects a specific frequency fn from the frequency spectrum 500, as shown in Figure 5.
[0052] (Step S1006) The learning unit 201, using the division unit 314, divides the first one-dimensional array data 320 at a period of 1 / fn with a specific frequency fn, as shown in Figure 6, and generates the first two-dimensional array data group 351.
[0053] (Step S1007) The learning unit 201, using the division unit 314, divides the second one-dimensional array data 330 at a period of 1 / fn with a specific frequency fn, as shown in Figure 7, and generates a second two-dimensional array data group 352.
[0054] (Step S1008) The learning unit 201, using the combining unit 315, combines the first two-dimensional array data group 351 and the second two-dimensional array data group 352, as shown in Figures 8 and 9, to generate combined two-dimensional array data 360.
[0055] (Step S1009) The learning unit 201, using the identification unit 316, inputs the combined two-dimensional array data 360 into the CNN, performs a convolution operation to extract features from the combined two-dimensional array data 360, and outputs the identification result.
[0056] (Step S1010) The learning unit 201 learns the relationship between the identification result output in step S1009 and the teacher label 370 using the adjustment unit 317. Specifically, the learning unit 201, using the adjustment unit 317, calculates the value of the loss function based on the difference between the identification result and the teacher label 370, and controls the weights of the kernel 800 by performing backpropagation to reduce the value of the loss function.
[0057] The learning unit 201 repeats steps S1002 to S1010 for each combination of extracted one-dimensional biological data 300 and teacher label 370. This generates the learning model 203, which is a CNN in which the finally obtained weights of the kernel 800 are set.
[0058] In this way, by actively incorporating missing data into the learning process rather than ignoring it, it is possible to generate a learning model 203 that can find meaning in the missing data itself.
[0059] <Figure 11 Example of Functional Configuration of Prediction Unit 202> Figure 11 is a block diagram showing an example of the functional configuration of the prediction unit 202. The prediction unit 202 includes an input unit 309, an extraction unit 310, a first aggregation unit 311, a second aggregation unit 312, a frequency analysis unit 313, a division unit 314, a coupling unit 315, and an identification unit 1117. The input unit 309, the extraction unit 310, the first aggregation unit 311, the second aggregation unit 312, the frequency analysis unit 313, the division unit 314, and the coupling unit 315 have the same configurations as those of the learning unit 201.
[0060] In the prediction unit 202, the one-dimensional data input to the input unit 309 consists of two types of synchronized data of the prediction target: one-dimensional biological data 1110 and sleep-wake state data 1111. The extraction unit 310 therefore extracts, from the one-dimensional biological data 1110, the one-dimensional data for the times when the sleep-wake state data 1111 indicates the awake state. The extraction unit 310 can likewise extract, from the one-dimensional biological data 1110, the one-dimensional data for the times when the sleep-wake state data 1111 indicates the sleep state.
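The extraction described above amounts to masking one synchronized series by the other. The sketch below assumes that the two series have the same length, that the sleep-wake state is encoded as 1 (awake) and 0 (asleep), and that undetected values are stored as NaN; the encoding, the function name `extract_by_state`, and the sample values are hypothetical.

```python
import numpy as np

AWAKE, ASLEEP = 1, 0  # hypothetical encoding of the sleep-wake state data

def extract_by_state(values, states, target_state):
    """Keep detected values (including NaN gaps) at times when the state matches."""
    values = np.asarray(values, dtype=float)
    states = np.asarray(states)
    return values[states == target_state]

heart_rate = np.array([62, 60, np.nan, 75, np.nan, 80, 78, 61])
state = np.array([0, 0, 0, 1, 1, 1, 1, 0])

awake_values = extract_by_state(heart_rate, state, AWAKE)
asleep_values = extract_by_state(heart_rate, state, ASLEEP)
```

The NaN gaps are deliberately kept in the extracted series, because the later aggregation counts them as missing values.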
[0061] Furthermore, the first aggregation unit 311 outputs the first one-dimensional array data 1120, the second aggregation unit 312 outputs the second one-dimensional array data 1130, the frequency analysis unit 313 outputs the specific frequency fn selected by the learning unit 201, the division unit 314 outputs the first two-dimensional array data group 1151 and the second two-dimensional array data group 1152, and the coupling unit 315 outputs the two-dimensional array data 1160.
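As a rough illustration of how a specific frequency fn might be selected once during learning and then reused for division at prediction time, the sketch below picks the non-zero frequency with the highest spectral intensity, as in claim 3. The use of a real FFT, the mean-filling of gaps before the transform, and the function name `select_specific_frequency` are assumptions made for this example only.

```python
import numpy as np

def select_specific_frequency(first_1d, unit_time=1.0):
    """Return the non-zero frequency with the highest spectral intensity."""
    x = np.asarray(first_1d, dtype=float)
    # Assumption: elements without a statistical value are filled with the mean.
    x = np.where(np.isnan(x), np.nanmean(x), x)
    spectrum = np.abs(np.fft.rfft(x - x.mean()))
    freqs = np.fft.rfftfreq(len(x), d=unit_time)
    peak = np.argmax(spectrum[1:]) + 1  # skip the zero-frequency bin
    return freqs[peak]

# Hypothetical hourly statistics over one week with a daily rhythm.
hours = np.arange(24 * 7)
signal = 70 + 10 * np.sin(2 * np.pi * hours / 24)

fn = select_specific_frequency(signal)   # selected during learning
samples_per_period = int(round(1 / fn))  # reused when dividing at prediction time
```

Storing fn (or the derived number of samples per period) alongside the learning model keeps the array shapes at prediction time identical to those seen during learning.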
[0062] The identification unit 1117 inputs the two-dimensional array data 1160 into the learning model 203 and outputs the identification result by performing a convolution operation with a kernel 800 whose weights are controlled by the learning unit 201.
[0063] <Figure 12 Prediction processing procedure by prediction unit 202> Figure 12 is a flowchart showing an example of the prediction processing procedure performed by the prediction unit 202.
[0064] (Step S1200) The prediction unit 202 receives one-dimensional biological data 1110 and sleep-wake state data 1111 to be predicted via the input unit 309.
[0065] (Step S1201) The prediction unit 202 uses the extraction unit 310 to extract one-dimensional data from the one-dimensional biological data 1110 at times when the sleep-wake state data 1111 indicates either the awake state or the sleep state.
[0066] (Step S1202) The prediction unit 202, using the first aggregation unit 311, aggregates the detected values of the extracted one-dimensional biological data 1110 into statistical values per unit time, as shown in Figure 4, and generates the first one-dimensional array data 1120.
[0067] (Step S1203) The prediction unit 202, using the second aggregation unit 312, aggregates the number of missing values per unit time, as shown in Figure 4, and generates the second one-dimensional array data 1130.
[0068] (Step S1204) The prediction unit 202, using the division unit 314, divides the first one-dimensional array data 1120 at a period of 1 / fn of a specific frequency fn selected in the learning unit 201, as shown in Figure 6, and generates the first two-dimensional array data group 1151.
[0069] (Step S1205) The prediction unit 202, using the division unit 314, divides the second one-dimensional array data 1130 at a period of 1 / fn of a specific frequency fn selected in the learning unit 201, as shown in Figure 7, and generates a second two-dimensional array data group 1152.
[0070] (Step S1206) The prediction unit 202, using the coupling unit 315, combines the first two-dimensional array data group 1151 and the second two-dimensional array data group 1152, as shown in Figures 8 and 9, to generate two-dimensional array data 1160.
[0071] (Step S1207) The prediction unit 202 inputs the two-dimensional array data 1160 into the CNN via the identification unit 1117, performs a convolution operation with the kernel 800 whose weights were adjusted by the learning unit 201, extracts features from the two-dimensional array data 1160, and outputs the identification result. Because the missing features are convolved together with the statistical values, two inputs whose statistical values change in the same way over time can still be told apart: if one of them contains many missing features, it can be identified as likely to reflect some abnormality in the organism. In this way, by using a learning model 203 that actively incorporates missing features rather than ignoring them, it is possible to obtain an identification result in which the missing features themselves are meaningful.
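The aggregation in steps S1202 and S1203 can be sketched as follows. The choice of the mean as the statistical value, the valid range of 30 to 220, and the function name `aggregate_per_unit` are assumptions for illustration; a value is treated as missing when it was not detected (NaN here) or lies outside the assumed range.

```python
import numpy as np

def aggregate_per_unit(values, samples_per_unit, valid_range=(30.0, 220.0)):
    """Return (mean of valid values, number of missing values) for each unit time."""
    v = np.asarray(values, dtype=float)
    v = v[: len(v) // samples_per_unit * samples_per_unit]  # drop an incomplete tail
    lo, hi = valid_range
    with np.errstate(invalid="ignore"):
        # Assumption: missing means not detected (NaN) or outside the valid range.
        missing = np.isnan(v) | (v < lo) | (v > hi)
    missing = missing.reshape(-1, samples_per_unit)
    valid_sum = np.where(missing, 0.0, v.reshape(-1, samples_per_unit)).sum(axis=1)
    valid_count = (~missing).sum(axis=1)
    stats = np.full(len(valid_count), np.nan)  # NaN where a unit has no valid value
    np.divide(valid_sum, valid_count, out=stats, where=valid_count > 0)
    return stats, missing.sum(axis=1)

# Hypothetical heart-rate samples, four samples per unit time.
samples = [60, 62, np.nan, 64,
           np.nan, np.nan, np.nan, np.nan,
           70, 500, 72, 74]
stats, missing_counts = aggregate_per_unit(samples, samples_per_unit=4)
```

The two returned arrays play the roles of the first and second one-dimensional array data: the second unit has no statistical value at all, yet its missing count of four still carries information.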
[0072] Therefore, for example, if an elderly person removes a wearable device that detects the one-dimensional biological data 1110 and the sleep-wake state data 1111 while bathing and then forgets to put it back on, the one-dimensional biological data 1110 will be missing. Because the learning model 203 has learned, from the extracted one-dimensional biological data 300 and its teacher label 370, the duration of long-term data loss in the awake state, the number of times long-term data loss occurs, and the interval between occurrences of long-term data loss, the prediction unit 202 can predict whether or not the elderly person has dementia.
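To make the three quantities named above concrete, the sketch below computes them explicitly from a per-unit-time missing flag. In the embodiment these patterns are learned implicitly by the CNN and are not computed this way; the function name `long_gap_features`, the threshold `min_len`, and the sample flags are hypothetical.

```python
import numpy as np

def long_gap_features(is_missing, min_len=3):
    """Durations, count, and start-to-start intervals of long runs of missing units."""
    m = np.asarray(is_missing, dtype=bool).astype(int)
    edges = np.diff(np.concatenate(([0], m, [0])))
    starts = np.flatnonzero(edges == 1)   # first index of each missing run
    ends = np.flatnonzero(edges == -1)    # index just past each missing run
    lengths = ends - starts
    keep = lengths >= min_len             # assumption: "long-term" means >= min_len units
    starts, lengths = starts[keep], lengths[keep]
    return {
        "durations": lengths.tolist(),
        "count": int(keep.sum()),
        "intervals": np.diff(starts).tolist(),
    }

# Hypothetical awake-state flags: 1 means the unit time had missing data.
flags = [0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0]
features = long_gap_features(flags)
```

Here the single missing unit at index 6 is ignored as short, leaving two long gaps of three and four units that start seven units apart.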
[0073] <Figure 13: Example of display screen> Figure 13 is an explanatory diagram showing an example of a display screen. On display screen 1300, the first one-dimensional array data 320, the second one-dimensional array data 330, and the frequency spectrum 500 are displayed.
[0074] <Variation> Figure 14 is an explanatory diagram illustrating Example 3 of two-dimensional array data. In Figures 8 and 9, the first two-dimensional array data 600 and the second two-dimensional array data 700 were arranged so that both the statistical values and the numbers of missing values were included in a single kernel 800. In contrast, the combined two-dimensional array data 360 in Figure 14 is arranged so that a single kernel 800 contains only one of the first two-dimensional array data 600 and the second two-dimensional array data 700.
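The difference between the two arrangements can be sketched by tracking which type of data each vertical kernel position covers. Figure 14 is not reproduced here, so the block layout, the 2-row kernel, and the row stride of 2 are assumptions; `window_sources` is a hypothetical helper.

```python
import numpy as np

def window_sources(row_source, kernel_rows, stride):
    """For each vertical kernel position, list which data types the kernel covers."""
    return [sorted(set(row_source[i:i + kernel_rows]))
            for i in range(0, len(row_source) - kernel_rows + 1, stride)]

first = np.ones((4, 24))    # hypothetical rows of statistical values
second = np.zeros((4, 24))  # hypothetical rows of missing-value counts

# Arrangement in which each kernel position covers both types: rows alternate.
interleaved = np.empty((8, 24))
interleaved[0::2] = first
interleaved[1::2] = second
interleaved_src = ["stat", "miss"] * 4

# Variation: the two types are kept in separate blocks.
blocked = np.vstack([first, second])
blocked_src = ["stat"] * 4 + ["miss"] * 4

mixed = window_sources(interleaved_src, kernel_rows=2, stride=2)
separate = window_sources(blocked_src, kernel_rows=2, stride=2)
```

With the interleaved rows every kernel position sees both types, whereas with the blocked rows each position sees only statistical values or only missing counts, provided the stride keeps the kernel from straddling the block boundary.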
[0075] As explained above, this embodiment extracts one-dimensional biological data in consideration of the state of the living organism, thereby improving prediction accuracy when the one-dimensional biological data contains missing values. For example, by extracting the one-dimensional biological data of the awake state from the one-dimensional biological data 1110 and the sleep-wake state data 1111 detected by a wearable device worn by an elderly person, a decline in the cognitive function of the elderly person can be predicted.
[0076] Elderly individuals with cognitive decline are likely to forget to put a wearable device back on after removing it for bathing or other activities. Therefore, missing features (such as the length of the missing period) of the one-dimensional biological data extracted for awake periods, which arise from the device not being worn, are useful for detecting cognitive decline. Performing a convolution operation on such missing features generates a learning model 203 capable of classifying individuals into a group without abnormalities and a group with cognitive decline, which improves classification accuracy. As another example, this learning model 203 could also be used to predict sleep-wake patterns in data where individuals forgot to put their devices back on.
[0077] Furthermore, even if data loss occurs while the wearable device is worn during sleep, such loss is excluded from the one-dimensional biological data extracted for the awake state, because the organism is in the sleep state at that time. Therefore, focusing on the one-dimensional biological data extracted for the awake state improves classification accuracy.
[0078] Furthermore, in the example described above, one-dimensional biological data extracted for the awake state of the organism was used, but one-dimensional biological data extracted for the sleep state of the organism may also be used. In this case, for example, the learning unit 201 uses heart rate and blood oxygen concentration as training data and "normal" or "sleep apnea syndrome" as the teacher label, learns whether or not the organism has sleep apnea syndrome, and generates a learning model 203. The prediction unit 202 then predicts whether or not the organism has sleep apnea syndrome.
[0079] The analysis device 100 described above is connected to a wearable device that detects one-dimensional time-series data via a network such as the Internet, LAN (Local Area Network), or WAN (Wide Area Network). The analysis device 100 may also be implemented within the wearable device. The prediction unit 202 may also be implemented within the wearable device.
[0080] It should be noted that the present invention is not limited to the embodiments described above, but includes various modifications and equivalent configurations within the spirit of the attached claims. For example, the embodiments described above are described in detail to make the present invention easier to understand, and the present invention is not necessarily limited to having all of the described configurations. Furthermore, some of the configurations of one embodiment may be replaced with those of another embodiment. Furthermore, some of the configurations of one embodiment may be added to those of another embodiment. Furthermore, some of the configurations of each embodiment may be added, deleted, or replaced with other configurations.
[0081] Furthermore, each of the aforementioned configurations, functions, processing units, and processing means may be implemented in hardware, for example, by designing them as integrated circuits, or they may be implemented in software by having a processor interpret and execute programs that realize each function.
[0082] Information such as programs, tables, and files that implement each function can be stored in memory, hard disks, SSDs (Solid State Drives), or on recording media such as IC (Integrated Circuit) cards, SD cards, and DVDs (Digital Versatile Discs).
[0083] Furthermore, the control lines and information lines shown are those deemed necessary for explanation purposes and do not necessarily represent all control lines and information lines required for implementation. In reality, it can be assumed that almost all components are interconnected.
[Explanation of Symbols]
[0084]
100 Analysis device
201 Learning unit
202 Prediction unit
203 Learning model
300 Time-series data (one-dimensional biological data)
301 Time-series data (sleep-wake state data)
309 Input unit
310 Extraction unit
311 First aggregation unit
312 Second aggregation unit
313 Frequency analysis unit
314 Division unit
315 Coupling unit
316 Identification unit
317 Adjustment unit
320 First one-dimensional array data
330 Second one-dimensional array data
351 First two-dimensional array data group
352 Second two-dimensional array data group
360 Combined two-dimensional array data
370 Teacher label
500 Frequency spectrum
600 First two-dimensional array data
700 Second two-dimensional array data
800 Kernel
1110 Time-series data (one-dimensional biological data)
1111 Time-series data (sleep-wake state data)
1117 Identification unit
1120 First one-dimensional array data
1130 Second one-dimensional array data
1151 First two-dimensional array data group
1152 Second two-dimensional array data group
1160 Combined two-dimensional array data
fn Specific frequency
Claims
1. An analysis device having a processor, wherein the processor executes:
an input process that inputs first time-series data of detected values of a living organism and second time-series data indicating whether the living organism is in a sleep state or an awake state;
an extraction process that extracts, from the first time-series data input by the input process, the time-series detected values for a period in which the second time-series data indicates either the sleep state or the awake state;
a first aggregation process that aggregates the time-series detected values extracted by the extraction process for the period of either of the states, calculates a statistical value for each unit time, and outputs first one-dimensional array data of the time-series statistical values;
a second aggregation process that aggregates missing values in the time-series detected values for the period of either of the states, calculates missing features, and outputs second one-dimensional array data of the time-series missing features;
a division process that divides the first one-dimensional array data output by the first aggregation process based on a specific frequency to generate a plurality of first two-dimensional array data consisting of elements in which the statistical value exists and elements in which the statistical value does not exist, and divides the second one-dimensional array data output by the second aggregation process based on the specific frequency to generate a plurality of second two-dimensional array data consisting of elements in which the missing feature exists and elements in which the missing feature does not exist;
a combining process that combines the plurality of first two-dimensional array data and the plurality of second two-dimensional array data generated by the division process; and
an identification process that performs a convolution operation on the combined result obtained by the combining process to extract feature quantities of the combined result, and outputs an identification result identifying the state of the living organism.
2. The analysis device according to claim 1, wherein the processor executes a frequency analysis process that performs frequency analysis on the first one-dimensional array data and selects the specific frequency from the frequency analysis result, and in the division process, the processor divides the first one-dimensional array data based on the specific frequency selected by the frequency analysis process to generate the plurality of first two-dimensional array data, and divides the second one-dimensional array data based on the specific frequency selected by the frequency analysis process to generate the plurality of second two-dimensional array data.
3. The analysis device according to claim 2, wherein the specific frequency is the frequency with the highest intensity in the frequency analysis result.
4. The analysis device according to claim 2, wherein the specific frequency is a frequency in the frequency analysis result whose intensity is above a predetermined threshold.
5. The analysis device according to claim 1, wherein the missing feature is the number of occurrences of the missing values.
6. The analysis device according to claim 1, wherein the missing value is a point in time at which the detected value was not detected.
7. The analysis device according to claim 1, wherein the missing value is a case in which the detected value is outside a predetermined range.
8. The analysis device according to claim 1, wherein, in the combining process, the processor combines the plurality of first two-dimensional array data and the plurality of second two-dimensional array data such that the kernel used for the convolution operation includes one of the plurality of first two-dimensional array data and one of the plurality of second two-dimensional array data.
9. The analysis device according to claim 1, wherein the processor executes an adjustment process that adjusts a learning model for identifying the state of the living organism, based on the identification result and a training label indicating the state of the living organism in the time-series data of the detected values.
10. The analysis device according to claim 1, having a learning model that identifies the state of the living organism, wherein, in the identification process, the processor inputs the combined result to the learning model and outputs the identification result from the learning model.
11. The analysis device according to claim 1, wherein said either of the states is the awake state.
12. The analysis device according to claim 1, wherein said either of the states is the sleep state.
13. An analysis method in which a processor executes:
an input process that inputs first time-series data of detected values of a living organism and second time-series data indicating whether the living organism is in a sleep state or an awake state;
an extraction process that extracts, from the first time-series data input by the input process, the time-series detected values for a period in which the second time-series data indicates either the sleep state or the awake state;
a first aggregation process that aggregates the time-series detected values extracted by the extraction process for the period of either of the states, calculates a statistical value for each unit time, and outputs first one-dimensional array data of the time-series statistical values;
a second aggregation process that aggregates missing values in the time-series detected values for the period of either of the states, calculates missing features, and outputs second one-dimensional array data of the time-series missing features;
a division process that divides the first one-dimensional array data output by the first aggregation process based on a specific frequency to generate a plurality of first two-dimensional array data consisting of elements in which the statistical value exists and elements in which the statistical value does not exist, and divides the second one-dimensional array data output by the second aggregation process based on the specific frequency to generate a plurality of second two-dimensional array data consisting of elements in which the missing feature exists and elements in which the missing feature does not exist;
a combining process that combines the plurality of first two-dimensional array data and the plurality of second two-dimensional array data generated by the division process; and
an identification process that performs a convolution operation on the combined result obtained by the combining process to extract feature quantities of the combined result, and outputs an identification result identifying the state of the living organism.
14. An analysis program that causes a processor to execute:
an input process that inputs first time-series data of detected values of a living organism and second time-series data indicating whether the living organism is in a sleep state or an awake state;
an extraction process that extracts, from the first time-series data input by the input process, the time-series detected values for a period in which the second time-series data indicates either the sleep state or the awake state;
a first aggregation process that aggregates the time-series detected values extracted by the extraction process for the period of either of the states, calculates a statistical value for each unit time, and outputs first one-dimensional array data of the time-series statistical values;
a second aggregation process that aggregates missing values in the time-series detected values for the period of either of the states, calculates missing features, and outputs second one-dimensional array data of the time-series missing features;
a division process that divides the first one-dimensional array data output by the first aggregation process based on a specific frequency to generate a plurality of first two-dimensional array data consisting of elements in which the statistical value exists and elements in which the statistical value does not exist, and divides the second one-dimensional array data output by the second aggregation process based on the specific frequency to generate a plurality of second two-dimensional array data consisting of elements in which the missing feature exists and elements in which the missing feature does not exist;
a combining process that combines the plurality of first two-dimensional array data and the plurality of second two-dimensional array data generated by the division process; and
an identification process that performs a convolution operation on the combined result obtained by the combining process to extract feature quantities of the combined result, and outputs an identification result identifying the state of the living organism.