Journal of Southwest Petroleum University (Science & Technology Edition) ›› 2025, Vol. 47 ›› Issue (4): 62-74. DOI: 10.11885/j.issn.1674-5086.2022.05.04.01

• GEOLOGY EXPLORATION •

A Logging Interpretation Method for Tight Sandstone Diagenetic Facies Based on Improved SMOTE and Random Forest Algorithm

ZHEN Yan1,2, KANG Jintao1,2, ZHAO Xiaoming1,2, GE Jiawang1,2, DAI Maolin1,2   

  1. School of Geoscience and Technology, Southwest Petroleum University, Chengdu, Sichuan 610500, China;
    2. Sichuan Key Laboratory of Natural Gas Geology, Chengdu, Sichuan 610500, China
  • Received: 2022-05-04 Published: 2025-07-25

Abstract: Log-based interpretation of diagenetic facies is key to predicting high-quality tight sandstone reservoirs. Compared with conventional statistical methods, machine learning methods can effectively improve the accuracy of diagenetic facies log interpretation. However, because the number of samples is insufficient, the interpretation results remain non-unique. To address the sample imbalance problem in the log interpretation of diagenetic facies, this paper builds on the classical SMOTE (Synthetic Minority Over-sampling Technique) algorithm by taking the spatial constraints on new samples into account. It proposes the RESMOTE (Repeat SMOTE) algorithm, which augments the minority-class samples in the imbalanced data, and then uses a random forest model to identify and interpret the diagenetic facies. The experimental results show that RESMOTE outperforms the classical SMOTE, Borderline-SMOTE, and ADASYN algorithms, and that the accuracy of the random forest model improves from 77.27% to 91.06%. The RESMOTE algorithm can ensure the accuracy of the new data, effectively mitigates the over-fitting and low accuracy of conventional logging lithofacies identification and classification methods, and has important application value for the prediction of high-quality tight sandstone reservoirs.
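As a rough illustration of the oversampling step the abstract builds on, the sketch below implements classical SMOTE only: each synthetic point is placed at a random position on the segment between a minority sample and one of its k nearest minority neighbours. It does not reproduce RESMOTE, whose spatial constraint is not specified in the abstract. The function name `smote`, its parameters, and the sample values are hypothetical and chosen purely for illustration; they are not data from the paper.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Classical SMOTE sketch: interpolate between a minority sample
    and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical, made-up values standing in for two normalised log curves
# measured on a rare (minority) diagenetic facies.
minority_samples = [(0.20, 0.80), (0.25, 0.75), (0.30, 0.85), (0.22, 0.90)]
new_samples = smote(minority_samples, n_new=6, k=3)
balanced_minority = minority_samples + new_samples
```

In a workflow like the one described, the enlarged minority set would be merged with the majority-class samples before training a classifier such as a random forest. Because every synthetic point lies on a segment between two real minority samples, it stays inside their bounding box; this is also the limitation that variants adding spatial constraints aim to address.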

Key words: tight sandstone, diagenetic facies, RESMOTE algorithm, random forest, log interpretation
