The invention discloses a method for optimizing the storage performance of massive small files based on time sequence prediction, and belongs to the field of information storage. The method comprises the following steps: collecting historical file access records with
time information to obtain a
data set; after the
data set is preprocessed into discrete
time series data, rolling a time window over the data set to generate a training data set, wherein the training sample at any time t takes the data from time t-n to time t as input data, and takes the data at time t+1 as
label data; establishing a
time sequence prediction model based on a
recurrent neural network, and performing training, validation and testing in sequence using a training set, a validation set and a test set obtained by dividing the training data set, thereby obtaining a target model; predicting the change trend of the
file size using the target model so as to distinguish large files from small files according to the predicted file size; and storing the large files directly, while aggregating the small files according to their time sequence and then storing them. The method can thereby optimize the storage performance of massive small files in a distributed storage system.
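
The rolling-window step described above can be illustrated with a minimal sketch. It assumes the preprocessed data is a plain list of numeric values (for example file sizes per time step), and the helper names `make_windows` and `chronological_split` as well as the 70/15/15 split percentages are hypothetical choices for illustration, not taken from the invention. Each sample uses the values from time t-n to time t as input and the value at time t+1 as the label; the samples are then divided in time order into training, validation and test sets.

```python
from typing import List, Sequence, Tuple


def make_windows(series: Sequence[float], n: int) -> Tuple[List[List[float]], List[float]]:
    """Roll a window over the series: input = values at t-n..t, label = value at t+1."""
    if n < 0:
        raise ValueError("n must be non-negative")
    inputs: List[List[float]] = []
    labels: List[float] = []
    # t runs over every position that has n predecessors and one successor.
    for t in range(n, len(series) - 1):
        inputs.append(list(series[t - n : t + 1]))
        labels.append(series[t + 1])
    return inputs, labels


def chronological_split(samples: Sequence, train_pct: int = 70, val_pct: int = 15):
    """Split samples in time order (no shuffling); the percentages are assumed values."""
    total = len(samples)
    train_end = total * train_pct // 100
    val_end = total * (train_pct + val_pct) // 100
    return samples[:train_end], samples[train_end:val_end], samples[val_end:]


if __name__ == "__main__":
    sizes = [10, 20, 30, 40, 50, 60]  # made-up example values
    x, y = make_windows(sizes, n=2)
    for window, label in zip(x, y):
        print(window, "->", label)
```

Splitting in time order rather than at random keeps later observations out of the training set, which is a common precaution for time series models; the source does not state how the division is performed.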