Journal of Southwest Petroleum University (Science & Technology Edition), 2025, Vol. 47, Issue (4): 62-74. DOI: 10.11885/j.issn.1674-5086.2022.05.04.01

• Geological Exploration •

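A Logging Interpretation Method for Tight Sandstone Diagenetic Facies Based on Improved SMOTE and Random Forest Algorithm

ZHEN Yan1,2, KANG Jintao1,2, ZHAO Xiaoming1,2, GE Jiawang1,2, DAI Maolin1,2

  1. School of Geoscience and Technology, Southwest Petroleum University, Chengdu, Sichuan 610500, China;
  2. Sichuan Key Laboratory of Natural Gas Geology, Chengdu, Sichuan 610500, China
  • Received: 2022-05-04 Published: 2025-07-25
  • Corresponding author: ZHAO Xiaoming, E-mail: zhxim98@163.com
  • About the authors: ZHEN Yan, born in 1985, female, Han ethnicity, from Mianzhu, Sichuan, associate researcher, Ph.D. Her research focuses on artificial intelligence in oil and gas development geology and on spatiotemporal big data mining. E-mail: zhenyan0824@163.com
    KANG Jintao, born in 1997, male, Han ethnicity, from Lüliang, Shanxi, master's student. His research focuses on machine learning and artificial intelligence in petroleum geology. E-mail: 863107879@qq.com
    ZHAO Xiaoming, born in 1982, male, Han ethnicity, from Anqiu, Shandong, professor. His research focuses on oil and gas field development geology, deep-water sedimentology, unconventional oil and gas geology, carbon dioxide storage and utilization, and artificial intelligence in petroleum geology. E-mail: zhxim98@163.com
    GE Jiawang, born in 1988, male, Han ethnicity, from Zhijiang, Hubei, associate researcher. His research focuses on sedimentology, sequence stratigraphy, and development sedimentology. E-mail: gjwddn@163.com
    DAI Maolin, born in 1997, male, Han ethnicity, from Yilong, Sichuan, master's student. His research focuses on oil and gas development geology. E-mail: 1503048438@qq.com
  • Funding:
    National Natural Science Foundation of China (42072183, 41902124)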

Abstract: Logging interpretation of diagenetic facies is key to predicting high-quality tight sandstone reservoirs. Compared with conventional statistical methods, machine learning methods can effectively improve the accuracy of diagenetic facies logging interpretation. However, because the number of samples is insufficient, the interpretation results still suffer from non-uniqueness. To address the sample imbalance problem in diagenetic facies logging interpretation, this paper proposes the RESMOTE (Repeat SMOTE) algorithm, which builds on the classical SMOTE (Synthetic Minority Over-sampling Technique) algorithm and takes the spatial constraints of newly generated samples into account. RESMOTE generates additional samples for the minority classes in the imbalanced data, and a random forest model is then used to identify and interpret the diagenetic facies. The results show that RESMOTE outperforms the classical SMOTE, Borderline-SMOTE, and ADASYN algorithms, and that the accuracy of the random forest model improves from 77.27% to 91.06%. RESMOTE helps ensure the accuracy of the newly generated data and effectively mitigates the overfitting and low accuracy of conventional logging-based lithofacies identification and classification methods, which makes it valuable for predicting high-quality tight sandstone reservoirs.

Key words: tight sandstone, diagenetic facies, RESMOTE algorithm, random forest, log interpretation
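
The abstract does not spell out how RESMOTE applies its spatial constraint, so the following is only a minimal sketch of the classical SMOTE baseline that the paper builds on: each synthetic minority sample is placed at a random position on the segment between a minority sample and one of its k nearest minority-class neighbours. The function name `smote`, the parameters `k` and `seed`, and the two-feature log values are all hypothetical and are not taken from the paper.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by classical SMOTE interpolation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbours of the base point (base excluded)
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # random position along the segment from base to neighbour
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical two-feature log responses for a rare diagenetic facies class
minority = [[2.45, 85.0], [2.50, 90.0], [2.48, 88.0], [2.52, 92.0]]
majority_count = 12  # assumed size of the largest class

# Oversample the minority class until it matches the majority class size
new_points = smote(minority, majority_count - len(minority))
balanced_minority = minority + new_points
```

A balanced training set built this way could then be passed to a random forest classifier (for example scikit-learn's `RandomForestClassifier`); that step is left out to keep the sketch free of dependencies.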

CLC Number: