The present invention relates to a method for searching for similar tobacco leaves based on the near-infrared spectrum of tobacco leaves, which the invention uses as its basic data. Distributed sampling is first carried out on the target tobacco leaves of each species; the samples are pre-treated; the near-infrared spectra of the samples are obtained by scanning the samples on a near-infrared spectrometer;
principal component analysis (PCA) is performed on the plurality of near-infrared spectra of the target tobacco leaves of each species to obtain loading matrices, eigenvalues and standardized residual errors, which together form a data model of the target tobacco leaves of each species; the near-infrared spectrum of an unknown tobacco leaf is then decomposed into principal components using the loading matrix of each target tobacco leaf data model, so as to obtain the principal component score and the decomposition residual error of the unknown tobacco leaf; the principal component
score of the unknown tobacco leaf is used to calculate the principal component space distance to each target tobacco leaf data model, and the residual error distance between the decomposition residual error of the unknown tobacco leaf and the standardized residual error of each target tobacco leaf data model is also calculated; the distance between the unknown tobacco leaf and each target tobacco leaf is measured as the square root of the sum of the squares of the principal component space distance and the residual error distance, a smaller distance indicating a higher similarity; finally, the distances between the unknown tobacco leaf and each target tobacco leaf are compared and sorted by magnitude, so as to obtain the similar tobacco leaf search result.
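
The search procedure described above can be illustrated with a minimal Python sketch using NumPy. The abstract does not give the exact formulas, so several choices here are assumptions made only for illustration: the spectra are mean-centered before PCA, the principal component space distance is taken as the score distance scaled by the eigenvalues, the standardized residual error is taken as the root-mean-square residual norm of the training spectra, and the combined distance is the root of the sum of squares of the two. The function names `build_model`, `distance_to_model` and `search_similar` are hypothetical.

```python
import numpy as np


def build_model(spectra, n_components=2):
    """Fit a PCA data model to the NIR spectra (rows = samples) of one target leaf."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=0)
    centered = spectra - mean  # assumption: mean-centering before PCA
    # SVD of the centered spectra; the rows of vt are the loading vectors.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    loadings = vt[:n_components].T  # shape (n_wavelengths, n_components)
    eigvals = s[:n_components] ** 2 / (len(spectra) - 1)
    residuals = centered - centered @ loadings @ loadings.T
    # Assumption: the "standardized residual error" is the RMS residual
    # norm of the training spectra.
    resid_scale = np.sqrt(np.mean(np.sum(residuals ** 2, axis=1)))
    return {"mean": mean, "loadings": loadings,
            "eigvals": eigvals, "resid_scale": resid_scale}


def distance_to_model(spectrum, model):
    """Combined distance between one unknown spectrum and one target model."""
    centered = np.asarray(spectrum, dtype=float) - model["mean"]
    scores = centered @ model["loadings"]               # principal component score
    residual = centered - scores @ model["loadings"].T  # decomposition residual error
    # Assumption: score distance scaled by the eigenvalues (Mahalanobis-like).
    d_pc = np.sqrt(np.sum(scores ** 2 / model["eigvals"]))
    # Assumption: residual norm relative to the model's residual scale.
    d_res = np.linalg.norm(residual) / model["resid_scale"]
    # Root of the sum of squares of the two distances.
    return float(np.sqrt(d_pc ** 2 + d_res ** 2))


def search_similar(spectrum, models):
    """Return (name, distance) pairs sorted from most to least similar."""
    ranked = [(name, distance_to_model(spectrum, m)) for name, m in models.items()]
    return sorted(ranked, key=lambda pair: pair[1])
```

In use, one model would be built per target tobacco leaf from its sample spectra, for example `models = {"leaf_1": build_model(spectra_1), "leaf_2": build_model(spectra_2)}`, and `search_similar(unknown_spectrum, models)` would return the targets ordered by increasing distance. The sketch assumes each model has non-zero eigenvalues and a non-zero residual scale, which holds when the number of retained components is smaller than the rank of the sample spectra.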