Journal of Southwest Petroleum University (Science & Technology Edition) ›› 2022, Vol. 44 ›› Issue (2): 113-122. DOI: 10.11885/j.issn.1674-5086.2020.04.20.02

• Petroleum and Natural Gas Engineering •

Correlation Mining of Hidden Hazards in Drilling Based on Support Matrix Apriori Algorithm

WANG Bing, HUANG Dan, LI Wenjing

  1. School of Computer Science, Southwest Petroleum University, Chengdu, Sichuan 610500, China
  • Received: 2020-04-20 Published: 2022-04-22
  • Corresponding author: WANG Bing, E-mail: w9521423@sina.com
  • About the authors: WANG Bing (male, born 1977, Han nationality, from Nanchong, Sichuan) is an associate professor with a master's degree. His research covers drilling safety evaluation, data analysis, and data mining. E-mail: w9521423@sina.com
    HUANG Dan (female, born 1993, Han nationality, from Chengdu, Sichuan) is a master's student working on data mining and machine learning. E-mail: 1085152827@qq.com
    LI Wenjing (male, born 1993, Han nationality, from Chengdu, Sichuan) is a master's student working on machine learning. E-mail: 565229097@qq.com
  • Supported by:
    National Science and Technology Major Project (2016ZX05020–006)

Abstract: Applying data mining to the distribution patterns and underlying mechanisms of hidden hazards in drilling operations is an important and pressing problem. Drilling hazard data are redundant and complex, so mining them tends to lose frequent itemsets and to generate them inefficiently. To address this, an Apriori algorithm based on a support matrix is proposed. First, a Boolean matrix is introduced to represent the transaction database, which avoids repeated database scans. Second, a support matrix is constructed by multiplying transaction matrices, so that support values are read directly from the matrix and the support calculation is simplified. Finally, the join strategy of the algorithm is optimized, which simplifies frequent itemset generation, and the matrix is progressively reduced during computation. Experiments on UCI datasets show that the improved Apriori algorithm effectively improves execution efficiency. The algorithm is then applied to association mining of historical drilling hazard data. The mining results give safety managers a sound basis for decisions and support effective identification and risk control of hidden hazards in drilling operations, so the approach is significant and suitable for wider application.

Key words: data mining, drilling hazards, Apriori algorithm, association rules, support matrix
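
The first two steps named in the abstract (a Boolean transaction matrix and a support matrix obtained by matrix multiplication) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the item labels A to D, and the threshold are hypothetical, and only 1-itemsets and 2-itemsets are covered. It assumes the support matrix is S = MᵀM, where M is the 0/1 transaction matrix, so that S[j][j] is the support count of item j and S[j][k] is the count of transactions containing both j and k.

```python
from itertools import combinations


def to_boolean_matrix(transactions):
    """Encode transactions as a 0/1 matrix: one row per transaction, one column per item."""
    items = sorted({item for t in transactions for item in t})
    col = {item: j for j, item in enumerate(items)}
    matrix = [[0] * len(items) for _ in transactions]
    for i, t in enumerate(transactions):
        for item in t:
            matrix[i][col[item]] = 1
    return items, matrix


def support_matrix(matrix):
    """Compute S = M^T M by accumulating each transaction row's contribution."""
    n_items = len(matrix[0]) if matrix else 0
    s = [[0] * n_items for _ in range(n_items)]
    for row in matrix:
        present = [j for j, v in enumerate(row) if v]
        for j in present:
            for k in present:
                s[j][k] += 1
    return s


def frequent_1_and_2_itemsets(transactions, min_count):
    """Read frequent 1- and 2-itemsets off the support matrix (hypothetical helper)."""
    items, matrix = to_boolean_matrix(transactions)
    s = support_matrix(matrix)
    # Drop infrequent items first; this mirrors the idea of shrinking the matrix.
    keep = [j for j in range(len(items)) if s[j][j] >= min_count]
    freq1 = {(items[j],): s[j][j] for j in keep}
    freq2 = {
        (items[j], items[k]): s[j][k]
        for j, k in combinations(keep, 2)
        if s[j][k] >= min_count
    }
    return freq1, freq2


# Hypothetical hazard records; each set lists the hazard types seen in one inspection.
records = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"B", "C", "D"},
    {"A", "B"},
]
freq1, freq2 = frequent_1_and_2_itemsets(records, min_count=3)
```

The database is read once to build the matrix, and every later support lookup is a matrix entry rather than a fresh scan. Extending the idea to longer itemsets, and the optimized join strategy mentioned in the abstract, are outside this sketch.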

CLC Number: