Journal of Dali University ›› 2021, Vol. 6 ›› Issue (12): 5-11.

• Mathematics and Computer Science •

High Average-Utility Co-Location Patterns Mining Algorithm with Rare Features

  1. College of Mathematics and Computer, Dali University, Dali, Yunnan 671003, China
  • Received: 2021-04-15 Online: 2021-12-15 Published: 2022-01-12
  • Corresponding author: Li Xiaowei, lecturer, Ph.D., E-mail: lixiaowei_xidian@163.com
  • About the author: Zeng Xin, lecturer, mainly engaged in research on data mining and computer application technology.
  • Funding:
    National Natural Science Foundation of China (71661001; 61902049); Joint Special Fund Project for Basic Research of Local Undergraduate Universities (Part) in Yunnan Province (2018FH001-062; 2018FH001-063); Dali University Data Security and Application Innovation Team Project (ZKLX2020308)

Abstract:

Spatial high-utility co-location pattern mining measures a pattern by the sum of the participating utilities of all its features, without considering how pattern length and rare features affect the pattern's utility. In general, the longer a pattern is, or if it contains rare features, the greater its utility may be. Building on research into spatial high-utility co-location pattern mining, this paper considers both pattern length and the possible presence of rare features. First, definitions related to high average-utility co-location pattern mining with rare features are proposed. Then, HAUWR, a high average-utility co-location pattern mining algorithm with rare features, is constructed, and extensive experiments are run on real and synthetic datasets. The experimental results show that HAUWR can mine the complete set of co-location patterns that satisfy the conditions and that it scales well. Finally, to examine the effect of pattern length on high-utility co-location patterns, HAUWR is compared with HUWR, a high-utility co-location pattern mining algorithm with rare features, in terms of dataset size, distance threshold, and feature rarity.

Key words: spatial data mining, high average-utility, co-location patterns, rare features, pattern length
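
The abstract contrasts a summed utility measure with one that accounts for pattern length. The toy sketch below illustrates only that contrast. It is not the paper's definition of participating utility or of HAUWR: the feature names, the per-feature utility values, and both helper functions are hypothetical, and the treatment of rare features is not modelled at all.

```python
from typing import Dict, Tuple

Pattern = Tuple[str, ...]

# Hypothetical per-feature utilities of a pattern's features (made-up numbers).
FEATURE_UTILITY: Dict[str, float] = {"A": 6.0, "B": 4.0, "C": 2.0}


def summed_utility(pattern: Pattern, utility: Dict[str, float]) -> float:
    """Sum the utilities of all features in the pattern (assumed simplification)."""
    return sum(utility[feature] for feature in pattern)


def average_utility(pattern: Pattern, utility: Dict[str, float]) -> float:
    """Divide the summed utility by the pattern length (assumed simplification)."""
    if not pattern:
        raise ValueError("pattern must contain at least one feature")
    return summed_utility(pattern, utility) / len(pattern)


if __name__ == "__main__":
    for pattern in [("A", "B"), ("A", "B", "C")]:
        total = summed_utility(pattern, FEATURE_UTILITY)
        mean = average_utility(pattern, FEATURE_UTILITY)
        print(pattern, "sum:", total, "average:", mean)
```

With these made-up values, adding feature C raises the summed utility of the pattern while lowering its average, which is the kind of length effect the abstract says the summed measure ignores.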
