The present invention relates to a method for searching for similar tobacco leaves based on their near-infrared (NIR) spectra, which serve as the basic data. Distributed sampling is first performed on the target tobacco leaves of each species; the samples are pre-treated and then scanned on a near-infrared spectrometer to obtain their NIR spectra. Principal component analysis (PCA) is performed on the plurality of NIR spectra of the target tobacco leaves of each species to obtain the loading matrix, eigenvalues and standardized residuals, which together form the data model of the target tobacco leaves of that species. The NIR spectrum of an unknown tobacco leaf is then decomposed into principal components using the loading matrix of each target tobacco leaf data model, yielding the principal component scores and the decomposition residual of the unknown tobacco leaf. The principal component space distance between the scores of the unknown tobacco leaf and each target tobacco leaf data model is calculated, as is the residual distance between the decomposition residual of the unknown tobacco leaf and the standardized residuals of that data model. The distance between the unknown tobacco leaf and each target tobacco leaf is measured as the square root of the sum of the squares of the principal component space distance and the residual distance; the smaller the distance, the higher the similarity. Finally, the distances between the unknown tobacco leaf and each target tobacco leaf are compared and sorted by magnitude to obtain the similar tobacco leaf search result.
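
The search procedure described above can be illustrated with a minimal Python sketch using NumPy. It is not the claimed implementation. The function names (build_model, distance_to_model, search_similar) are hypothetical, and several details that the text leaves open are filled in with assumptions: spectra are mean-centered before PCA, the principal component space distance is taken as a Mahalanobis-type distance using the eigenvalues, the residual distance is the residual norm of the unknown leaf divided by the root-mean-square residual norm of the model samples, and the two are combined as the square root of the sum of their squares. Sample pre-treatment and spectrum acquisition are not shown.

```python
import numpy as np

def build_model(spectra, n_components):
    """Fit a PCA data model for one target tobacco leaf.

    spectra: array of shape (n_samples, n_wavelengths).
    Assumption: spectra are mean-centered before decomposition.
    """
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=0)
    centered = spectra - mean
    # SVD of the centered data; rows of vt are the principal axes.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    loadings = vt[:n_components].T                      # (n_wavelengths, k)
    eigenvalues = s[:n_components] ** 2 / (spectra.shape[0] - 1)
    scores = centered @ loadings
    residuals = centered - scores @ loadings.T
    # Assumption: the "standardized residual" is summarized as the
    # root-mean-square residual norm over the model samples.
    residual_scale = np.sqrt(np.mean(np.sum(residuals ** 2, axis=1)))
    return {
        "mean": mean,
        "loadings": loadings,
        "eigenvalues": eigenvalues,
        "residual_scale": max(residual_scale, 1e-12),   # guard against zero
    }

def distance_to_model(spectrum, model):
    """Distance between an unknown spectrum and one target data model."""
    centered = np.asarray(spectrum, dtype=float) - model["mean"]
    scores = centered @ model["loadings"]               # principal component scores
    residual = centered - model["loadings"] @ scores    # decomposition residual
    # Assumption: Mahalanobis-type distance in principal component space.
    d_score = np.sqrt(np.sum(scores ** 2 / model["eigenvalues"]))
    # Assumption: residual norm relative to the model's residual scale.
    d_resid = np.linalg.norm(residual) / model["residual_scale"]
    # Square root of the sum of squares of the two distances.
    return float(np.hypot(d_score, d_resid))

def search_similar(spectrum, models):
    """Return (name, distance) pairs sorted from most to least similar."""
    distances = [(name, distance_to_model(spectrum, m)) for name, m in models.items()]
    return sorted(distances, key=lambda item: item[1])
```

In this sketch, models is a dictionary mapping a target tobacco leaf name to the model returned by build_model, and the first entry returned by search_similar is the target with the smallest distance, that is, the highest similarity.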