You can download the program as a PDF.

Day 1: Monday, Dec 9, 2024	Sessions/Forums	Room
8:00 – 8:30	Welcome coffee break
8:30 – 14:00	Causal Representation Learning (CRL)	CS 3
8:30 – 16:00	Sentiment Elicitation from Natural Text for Information Retrieval and Extraction (SENTIRE)	CS 4
8:30 – 11:00	Responsible AI to Increase Clinical Decision Trust: Explainability & Reliability of Machine Learning Models (TRUST)	CS 5
8:30 – 16:00	International Workshop on Data Mining for Service (DMS2024)	CS 6
8:30 – 11:00	Workshop on AI for Financial Crime Fight (AI4FCF)	CS 7
8:30 – 11:00	International Workshop on Spatial and Spatio-Temporal Data Mining (SSTDM)	CS 8
8:30 – 11:00	The 2024 Workshop on Optimization Based Techniques for Emerging Data Mining Problems (OEDM)	CS 9
11:00 – 11:30	Coffee Break
11:30 – 16:00	Incremental Classification and Clustering, Concept Drift, Novelty Detection, Active Learning in Big/Fast data Context (IncrLearn)	CS 5
11:30 – 16:00	International Workshop on Data Mining in Finance (DMF)	CS 7
11:30 – 13:00	Workshop on Information Seeking with Big Models (BigIS)	CS 8
11:30 – 13:00	Deep Learning and Clustering (DLC)	CS 9
13:00 – 14:00	Lunch
14:00 – 16:00	Workshop on Emerging Trends in Deep Learning for Healthcare (ETDLH)	CS 3
14:00 – 16:00	The 11th ICDM Workshop on High Dimensional Data Mining (HDM)	CS 8
14:00 – 16:00	Data Mining in Biomedical Informatics and Healthcare (DMBIH)	CS 9
16:00 – 16:30	Coffee Break
16:30 – 19:00	Demo Track	CS 5
16:30 – 19:00	International Workshop on Multimodal Content Analysis for Social Good (MM4SG)	CS 3
16:30 – 19:00	International Workshop on Data-Centric AI (DCAI)	CS 4
16:30 – 19:00	2nd International Workshop on Adaptable, Reliable, and Responsible Learning (ARRL)	CS 6
16:30 – 18:00	Advances in AI-Driven Data Mining for Autonomous Systems (AIDM-AS)	CS 7
16:30 – 18:00	Machine Learning for Cybersecurity (MLC)	CS 8
16:30 – 18:00	International Workshop on AI for Nudging and Personalization (WAIN)	CS 9
18:00 – 19:00	The 2nd International Workshop on User Understanding from Big Data Workshop (DMU2)	CS 7
18:00 – 19:00	Neverending Machine Learning (NML)	CS 8
18:00 – 19:00	Evolutionary Data Mining and Machine Learning Workshop (EDMML)	CS 9
18:00 – 20:00	Steering Committee Meeting with ICDM 2024 & 2025 Main Organizers (By invitation only)	TBA

Conference Agenda

Day 2: Tuesday, Dec 10, 2024		08:00 – 08:45	Welcome coffee break
		08:45 – 09:00	Opening and Welcome (Eric Xing, Conference Chairs, PC Chairs, Local Chairs)
		09:00 – 10:00	Keynote 1: Preslav Nakov, “Towards Safe, Truly Open, and Factual Large Language Models” (Room CHB)
10:00 – 17:30	Tutorial 1 (5 hours in total) Causality and Large Models Room CHB	10:00 – 10:30	Coffee break
		10:30 – 12:00	Session A1-1	Session A2-1	Session A3-1	Session A5-1
			Room CS 5	Room CS 7	Room CS 9	Room Hive
		12:00 – 13:30	Lunch
		13:30 – 15:00	Session A4-1	Session A5-2	Session A6-1	Session A3-2
			Room CS 5	Room CS 7	Room CS 9	Room Hive
		15:00 – 15:30	Coffee break
		15:30 – 17:00	Session A1-2	Session A2-2	Session A3-3	Session A6-2
			Room CS 5	Room CS 7	Room CS 9	Room Hive
		18:00 – 20:00	Welcome Reception (Location: ADNEC)

Day 3: Wednesday, Dec 11, 2024		08:00 – 09:00	Welcome coffee break
		09:00 – 10:00	Session A1-3	Session A3-4	Session A5-3	Session A6-3
			Room CS 5	Room CS 7	Room CS 9	Room Hive
		10:00 – 10:30	Coffee break
10:30 – 12:00	Women Forum Room Hive	10:30 – 12:00	Session A5-4	Session A6-4	Session A1-4	Session A2-3
			Room CS 5	Room CS 7	Room CS 9	Room CHB
		12:00 – 13:30	Lunch
		13:30 – 14:30	Keynote 2: Bernhard Schölkopf, “Towards causal world models and digital twins” (Room CHB)
		14:30 – 18:00	Organised desert trip
		18:30 – 20:30	Banquet and Award Ceremony (Location: Desert)

Day 4: Thursday, Dec 12, 2024		08:00 – 09:00	Welcome coffee break
		09:00 – 10:00	Keynote 3: Claudia Plant, “Clustering: Balancing Abstraction and Representation” (Room CHB)
10:00 – 12:00	Tutorial 2 Hypergraph Neural Networks: An In-Depth and Step-by-Step Guide Room CHB	10:00 – 10:30	Coffee break
		10:30 – 12:00	Session A3-5	Session A5-5	Session A6-5	Session A1-5
			Room CS 5	Room CS 7	Room CS 9	Room Hive
		12:00 – 13:30	Lunch
13:30 – 15:30	Tutorial 3 Uncertain Boundaries: Multidisciplinary Approaches to Copyright Issues in Generative AI Room CHB	13:30 – 15:00	Session A1-6	Session A2-5	Session A3-6	Session A2-4
			Room CS 5	Room CS 7	Room CS 9	Room Hive
		15:00 – 15:30	Coffee break
		15:30 – 17:30	Panel discussion: TBA Room CHB

		17:30	Conference concluding remarks

Conference paper presentations

Keynote Lecture: 60 minutes (about 45 minutes for talk and 15 minutes for Q and A)

Main conference regular paper (R): 20 minutes (about 15 minutes for talk and 5 minutes for Q and A)

Main conference short paper (S): 15 minutes (about 10 minutes for talk and 5 minutes for Q and A)

Day 2: December 10, 2024

Session A1-1 Foundations, algorithms, models, and theory of data mining

Room CS 5, 10:30-11:40

Session Chair: Xingquan (Hill) Zhu, Florida Atlantic University, xzhu3@fau.edu

10:30	DM306	Efficient Network Embedding by Approximate Equitable Partitions	R
		Giuseppe Squillace, Mirco Tribastone, Max Tschaikowski, and Andrea Vandin
10:50	DM319	ADOD: Adaptive Density Outlier Detection	R
		Li Qian, Jing Qian, Xin Sun, Wengang Guo, and Christian Böhm
11:10	DM227	Matrix Profile for Anomaly Detection on Multidimensional Time Series	S
		Chin-Chia Michael Yeh, Audrey Der, Uday Singh Saini, Vivian Lai, Yan Zheng, Junpeng Wang, Xin Dai, Zhongfang Zhuang, Yujie Fan, Huiyuan Chen, Prince Aboagye, Liang Wang, Wei Zhang, and Eamonn Keogh
11:25	DM271	CL4CO: A Curriculum Training Framework for Graph-based Neural Combinatorial Optimization	S
		Yang Liu, Chuan Zhou, Peng Zhang, Zhao Li, Shuai Zhang, Xixun Lin, and Xindong Wu

Session A2-1 Deep learning and statistical methods for data mining

Room CS 7, 10:30-11:45

Session Chair: Flavio Giobergia, Politecnico di Torino, flavio.giobergia@polito.it

10:30	DM211	Generating Realistic Tabular Data with Large Language Model	R
		Dang Nguyen, Sunil Gupta, Kien Do, Thin Nguyen, and Svetha Venkatesh
10:50	DM245	HyperTime: A Dynamic Hypergraph Approach for Time Series Classification	R
		Raneen Younis and Zahra Ahmadi
11:10	DM301	Improving Time Series Encoding with Noise-Aware Self-Supervised Learning and an Efficient Encoder	R
		Duy Nguyen Anh, Trang Tran, Hieu Pham Huy, Le Nguyen Phi, and Lam Nguyen Minh
11:30	DM295	QUCE: The Minimisation and Quantification of Path-Based Uncertainty for Generative Counterfactual Explanations	S
		Jamie Duell, Monika Seisenberger, Hsuan Fu, and Xiuyi Fan

Session A3-1 Mining from heterogeneous data sources

Room CS 9, 10:30-11:40

Session Chair: Yue He, Tsinghua University, heyuethu@mail.tsinghua.edu.cn

10:30	DM216	Graph Community Augmentation with GMM-based Modeling in Latent Space	R
		Shintaro Fukushima and Kenji Yamanishi
10:50	DM233	Solving Combinatorial Optimization Problem over Graph through QUBO Transformation and Deep Reinforcement Learning	R
		Tianle Pu, Chao Chen, Li Zeng, Shixuan Liu, Rui Sun, and Changjun Fan
11:10	DM384	Exploratory Combinatorial Optimization Problem Solving via Gauge Transformation	S
		Tianle Pu, Changjun Fan, Mutian Shen, Yizhou Lu, Li Zeng, Zohar Nussinov, Chao Chen, and Zhong Liu
11:25	DM259	2DXformer: Dual Transformers for Wind Power Forecasting with Dual Exogenous Variables	S
		Yajuan Zhang, Jiahai Jiang, Yule Yan, Liang Yang, and Ping Zhang

Session A5-1 Data mining for modelling, visualization, personalization, and recommendation

Room Hive, 10:30-11:40

Session Chair: Di Wu, Sun Yat-Sen University, China, wudi27@mail.sysu.edu.cn

10:30	DM410	Contrastive Learning for Adapting Language Model to Sequential Recommendation	R
		Fei-Yao Liang, Wu-Dong Xi, Xing-Xing Xing, Wei Wan, Chang-Dong Wang, Min Chen, and Mohsen Guizani
10:50	DM419	Cross-Store Next-Basket Recommendation	R
		Liangchen Ma, Ya Li, Zifeng Mai, Feiyao Liang, Chang-Dong Wang, Min Chen, and Mohsen Guizani
11:10	DM517	DifFaiRec: Generative Fair Recommender with Conditional Diffusion Model	S
		Zhenhao Jiang and Jicong Fan
11:25	DM663	A Parameter Update Balancing Algorithm for Multi-task Ranking Models in Recommendation Systems	S
		Jun Yuan, Guohao Cai, and Zhenhua Dong

Session A4-1 Data mining systems and platforms

Room CS 5, 13:30-15:00

Session Chair: Juan Garcia, Universidad de Guayaquil, juan.garciap1@ug.edu.ec

13:30	DM366	Designing an attack-defense game: how to increase the robustness of financial transaction models via a competition	R
		Alexey Zaytsev, Alex Natekin, Evgeni Vorsin, Valerii Smirnov, Georgii Smirnov, Oleg Sidorshin, Alexander Senin, Alexander Dudin, Maria Kovaleva, and Dmitry Berestnev
13:50	DM409	Scaling Disk Failure Prediction via Multi-Source Stream Mining	R
		Shujie Han, Zirui Ou, Qun Huang, and Patrick P. C. Lee
14:10	DM455	APOLLO: Differential Private Online Multi-Sensor Data Prediction with Certified Performance	R
		Honghui Xu, Wei Li, Shaoen Wu, Liang Zhao, and Zhipeng Cai
14:30	DM559	FGLBA: Enabling Highly-Effective and Stealthy Backdoor Attack on Federated Graph Learning	S
		Qing Lu, Miao Hu, Di Wu, Yipeng Zhou, Mohsen Guizani, and Quan Z. Sheng
14:45	DM583	Enhancing Entity Alignment on Probabilistic Knowledge Graphs	S
		Yunfei Li, Lu Chen, Chengfei Liu, Rui Zhou, and Jianxin Li

Session A5-2 Data mining for modelling, visualization, personalization, and recommendation

Room CS 7, 13:30-15:00

Session Chair: Yejing Wang, City University of Hong Kong, yejing.wang@my.cityu.edu.hk

13:30	DM277	Transitivity-Encoded Graph Attention Networks for Complementary Item Recommendations	R
		Jin Shang, Yang Jiao, Chenghuan Guo, Minghao Sun, Yan Gao, Jia Liu, Michinari Momma, Itetsu Taru, and Yi Sun
13:50	DM288	SR-PredictAO: Session-based Recommendation with High-Capability Predictor Add-On	R
		Ruida WANG, Raymond Chi-Wing Wong, and Weile TAN
14:10	DM331	Enhancing Embeddings Quality with Stacked Gate for Click-Through Rate Prediction	R
		Caihong Mu, Yunfei Fang, Jialiang Zhou, and Yi Liu
14:30	DM241	Hi-Gen: Generative Retrieval For Large-Scale Personalized E-commerce Search	S
		Yanjing Wu, Yinfu Feng, Jian Wang, Wenji Zhou, Yunan Ye, Rong Xiao, and Jun Xiao
14:45	DM343	Exploitation or Exploration Next? User Behavior Decoupling and Emerging Intent Modeling for Next-Item Recommendation	S
		Nengjun Zhu, Lingdan Sun, Xiangfeng Luo, Jian Cao, Qi Zhang, and Xinjiang Lu

Session A6-1 Applications of data mining

Room CS 9, 13:30-15:00

Session Chair: Meikang Qiu, Augusta University, qiumeikang@ieee.org

13:30	DM254	Towards Efficient Ridesharing via Order-Vehicle Pre-Matching Using Attention Mechanism	R
		Zhidan Liu, Jinye Lin, Zhiyu Xia, Chao Chen, and Kaishun Wu
13:50	DM270	DFDG: Data-Free Dual-Generator Adversarial Distillation for One-Shot Federated Learning	R
		Kangyang Luo, Shuai Wang, Yexuan Fu, Renrong Shao, Xiang Li, Yunshi Lan, Ming Gao, and Jinlong Shu
14:10	DM359	Debunking Fake News in Online Social Networks without Text Analysis	R
		Xing Su, Jian Yang, Jia Wu, and Zitai Qiu
14:30	DM266	Goal-guided Generative Prompt Injection Attack on Large Language Models	S
		Chong Zhang, Mingyu Jin, Qinkai Yu, Chengzhi Liu, Haochen Xue, and Xiaobo Jin
14:45	DM385	SplitSEE: A Splittable Self-supervised Framework for Single-channel EEG Representation Learning	S
		Rikuto Kotoge, Zheng Chen, Tasuku Kimura, Yasuko Matsubara, Takufumi Yanagisawa, Haruhiko Kishima, and Yasushi Sakurai

Session A3-2 Mining from heterogeneous data sources

Room Hive, 13:30-15:00

Session Chair: Djellel Difallah, NYU Abu Dhabi, djellel@nyu.edu

13:30	DM388	ELiCiT: Effective and Lightweight Lossy Compression of Tensors
		Jihoon Ko, Taehyung Kwon, Jinhong Jung, and Kijung Shin
13:50	DM393	LISA: Learning-Integrated Space Partitioning Framework for Traffic Accident Forecasting on Heterogeneous Spatiotemporal Data
		Bang An, Xun Zhou, Amin Khezerlou, Nick Street, Jinping Guan, and Jun Luo
14:10	DM430	Emotional Synchronization for Audio-Driven Talking-Head Generation
		Zhao Zhang, Yan Luo, Zhichao Zuo, Richang Hong, Yi Yang, and Meng Wang
14:30	DM605	SemiFDA: Domain Adaptation in Semi-Supervised Federated Learning
		Michele Craighero, Giorgio Rossi, Beatrice Rossi, Diego Carrera, Diego Stucchi, Pasqualina Fragneto, and Giacomo Boracchi
14:45	DM649	Controllable Visit Trajectory Generation with Spatiotemporal Constraints
		Haowen Lin, John Krumm, Cyrus Shahabi, and Li Xiong

Session A1-2 Foundations, algorithms, models, and theory of data mining

Room CS 5, 15:30-17:00

Session Chair: Yuewen Sun, MBZUAI, yuewen.sun@mbzuai.ac.ae

15:30	DM378	Probabilistic Matrix Factorization-based Three-stage Label Completion for Crowdsourcing	R
		Boyi Yang, Liangxiao Jiang, and Wenjun Zhang
15:50	DM413	HomoMGC: Homophily-enhanced Adaptive Graph Refinement for Multi-view Graph Clustering	R
		Man-Sheng Chen, Xiao-Sha Cai, Chang-Dong Wang, Dong Huang, Min Chen, and Mohsen Guizani
16:10	DM442	GADIN: Generative Adversarial Denoise Imputation Network for Incomplete Data	R
		Dong Li, Zhicong Liu, Mingfeng Hu, Baoyan Song, and Xiaohuan Shan
16:30	DM325	Generalized Sparse Additive Model with Unknown Link Function	S
		Peipei Yuan, Xinge You, Hong Chen, Xuelin Zhang, and Qinmu Peng
16:45	DM462	Towards Expressive Graph Representations for Graph Neural Networks	S
		Chengsheng Mao, Liang Yao, and Yuan Luo

Session A2-2 Deep learning and statistical methods for data mining

Room CS 7, 15:30-17:00

Session Chair: Evgenii Tsymbalov, Amazon, etsymbalov@gmail.com

15:30	DM323	Graph Contrastive Learning with Adversarial Structure Refinement (GCL-ASR)	R
		Jiangwen Chen, Kou Guang, Qiyang Li, and Tan Hao
15:50	DM412	GQ*: Towards Generalizable Deep Q-Learning for Steiner Tree in Graphs	R
		Wei Huang, Hanchen Wang, Dong Wen, Xuefeng Chen, Wenjie Zhang, and Ying Zhang
16:10	DM315	Hierarchical Explanations for Text Classification Models: Fast and Effective	R
		Zhenyu Nie, Zheng Xiao, Huizhang Luo, Xuan Liu, and Anthony Theodore Chronopoulos
16:30	DM449	Channel-Attentive Graph Neural Networks	S
		Tuğrul Hasan Karabulut and İnci M. Baytaş
16:45	DM580	Cascading Multimodal Feature Enhanced Contrast Learning for Music Recommendation	S
		Qimeng Yang, Shijia Wang, Da Guo, Dongjin Yu, Qiang Xiao, Dongjing Wang, and Chuanjiang Luo

Session A3-3 Mining from heterogeneous data sources

Room CS 9, 15:30-17:00

Session Chair: Guangyi Chen, MBZUAI, Guangyi.Chen@mbzuai.ac.ae

15:30	DM327	Adaptive Loss-aware Modulation for Multimedia Retrieval
		Jian Zhu, Yu Cui, Zeyi Sun, Yuyang Dai, Xi Wang, Lei Liu, Cheng Luo, and Li-Rong Dai
15:50	DM337	Towards Cross-domain Few-shot Graph Anomaly Detection
		Jiazhen Chen, Sichao Fu, Zhibin Zhang, Zheng Ma, Mingbin Feng, Tony Wirjanto, and Qinmu Peng
16:10	DM383	Informative Subgraphs Aware Masked Auto-Encoder in Dynamic Graphs
		Pengfei Jiao, Xinxun Zhang, Mengzhou Gao, and Tianpeng Li
16:30	DM320	A Momentum Contrastive Learning Framework for Query-POI Matching
		Yuting Qiang, Jianbin Zheng, Lixia Wu, Haomin Wen, Junhong Lou, and Minhui Deng
16:45	DM371	Multi-modal Sarcasm Detection via Dual Synergetic Perception Graph Convolutional Networks
		Xingjie Zhuang and Zhixin Li

Session A6-2 Applications of data mining

Room Hive, 15:30-17:00

Session Chair: Xun Zhou, Harbin Institute of Technology, Shenzhen, zhouxun2023@hit.edu.cn

15:30	DM690	Dual Cross-Stage Partial Learning for Enhanced Object Detection in Dehazed Images	R
		Jinbiao Zhao, Zhao Zhang, Jiahuan Ren, Haijun Zhang, Zhongqiu Zhao, and Meng Wang
15:50	DM697	Resource2Box: Learning To Rank Resources in Distributed Search Using Box Embedding	R
		Ulugbek Ergashev, Geon Lee, Kijung Shin, Eduard Dragut, and Weiyi Meng
16:10	DM709	ChronoCTI: Mining Knowledge Graph of Temporal Relations among Cyberattack Actions	R
		Md Rayhanur Rahman, Brandon Wroblewski, Quinn Matthews, Brantley Morgan, Timothy Menzies, and Laurie Williams
16:30	DM749	Addressing Delayed Feedback in Conversion Rate Prediction: A Domain Adaptation Approach	S
		Leisheng Yu, Yanxiao Cai, Lucas Chen, Minxing Zhang, Wei-Yen Day, Li Li, Rui Chen, Soo-Hyun Choi, and Xia Hu
16:45	DM753	Hypergraph-Enhanced Contrastively Regularized Transformer for Multi-Behavior E-commerce Product Recommendation	S
		Shuiying Liao and P. Y. Mok

Day 3: December 11, 2024

Session A1-3 Foundations, algorithms, models, and theory of data mining

Room CS 5, 09:00-10:00

Session Chair: Mengyue Yang, Bristol University, mengyue.yang.20@ucl.ac.uk

09:00	DM363	Scalable Order-Preserving Pattern Mining	R
		Ling Li, Wiktor Zuba, Grigorios Loukides, Solon Pissis, and Maria Matsangidou
09:20	DM546	Efficiently Manipulating Structural Graph Clustering Under Jaccard Similarity	R
		Chuanyu Zong, Rui Fang, Meng-xiang Wang, Tao Qiu, and Anzhen Zhang
09:40	DM617	IIFE: Interaction Information Based Automated Feature Engineering	S
		Tom Overman, Diego Klabjan, and Jean Utke

Session A3-4 Mining from heterogeneous data sources

Room CS 7, 09:00-10:00

Session Chair: Haoxuan Li, Peking University, hxli@stu.pku.edu.cn

09:00	DM322	Adaptive Graph Neural Networks for Cold-start Multimedia Recommendation
		Zhen Li, Jibin Wang, Zhuo Chen, Kun Wu, Yuanzhen Wei, and Hai Huang
09:20	DM482	EEiF: Efficient Isolated Forest with e Branches for Anomaly Detection
		Yifan Zhang, Haolong Xiang, Xuyun Zhang, Xiaolong Xu, Wei Fan, Qin Zhang, and Lianyong Qi
09:40	DM223	MetaSTC: A Meta Spatio-Temporal Learning Paradigm for Traffic Flow Prediction
		Kexin Xu, Zhemeng Yu, Yucen Gao, Songjian Zhang, Jun Fang, Xiaofeng Gao, and Guihai Chen

Session A5-3 Data mining for modelling, visualization, personalization, and recommendation

Room CS 9, 09:00-10:00

Session Chair: Przemyslaw Kazienko, Wroclaw Tech, kazienko@pwr.edu.pl

09:00	DM438	Early Fire Detection based on Local Morphological Knowledge Matching
		Xinzhi Wang, Mengyue Li, Nengjun Zhu, Jiayan Qian, and Zhanyi Zheng
09:20	DM402	RecCoder: Reformulating Sequential Recommendation as Large Language Model-Based Code Completion
		Kai-Huang Lai, Wudong Xi, Xingxing Xing, Wei Wan, Chang-Dong Wang, Min Chen, and Mohsen Guizani
09:40	DM726	ExoTST: Exogenous-Aware Temporal Sequence Transformer for Time Series Prediction
		Kshitij Tayal, Arvind Renganathan, Xiaowei Jia, Vipin Kumar, and Dan Lu

Session A6-3 Applications of data mining

Room Hive, 09:00-10:00

Session Chair: Maurizio Atzori, University of Cagliari, atzori@unica.it

09:00	DM655	Financial Risk Assessment via Long-term Payment Behavior Sequence Folding	R
		Yiran Qiao, Yateng Tang, Xiang Ao, Qi Yuan, Ziming Liu, Chen Shen, and Xuehao Zheng
09:20	DM743	Adaptive Process-Guided Learning: An Application in Predicting Lake DO Concentrations	R
		Runlong Yu, Chonghao Qiu, Robert Ladwig, Paul Hanson, Yiqun Xie, Yanhua Li, and Xiaowei Jia
09:40	DM334	Interdependency Matters: Graph Alignment for Multivariate Time Series Anomaly Detection	S
		Yuanyi Wang, Haifeng Sun, Chengsen Wang, Mengde Zhu, Wei Tang, Jingyu Wang, Qi Qi, Zirui Zhuang, and Jianxin Liao

Session A5-4 Data mining for modelling, visualization, personalization, and recommendation

Room CS 5, 10:30-11:35

Session Chair: Parham Moradi, RMIT University, parham.moradi@rmit.edu.au

10:30	DM367	Continuous Exact Explanations of Neural Networks	R
		Alice Dethise and Marco Canini
10:50	DM414	Periodic Prompt on Dynamic Heterogeneous Graph for Next Basket Recommendation	S
		Ru-Bin Li, Man-Sheng Chen, Xin-Yu Ding, Chang-Dong Wang, Sihong Xie, Shuangyin Liu, Min Chen, and Mohsen Guizani
11:05	DM454	A Condensed Transition Graph Framework for Zero-shot Link Prediction with Large Language Models	S
		Mingchen Li, Chen Ling, Rui Zhang, and Liang Zhao
11:20	DM495	Influence-aware Group Recommendation for Social Media Propagation	S
		Chengkun He, Xiangmin Zhou, Chen Wang, Longbing Cao, Jie Shao, and Zahir Tari

Session A6-4 Applications of data mining

Room CS 7, 10:30-11:40

Session Chair: Flavio Giobergia, Politecnico di Torino, flavio.giobergia@polito.it

10:30	DM373	Utilitarian Online Learning from Open-World Soft Sensing	R
		Heng Lian, Yu Huang, Xingquan Zhu, and Yi He
10:50	DM641	CounterFair: Group Counterfactuals for Bias Detection, Mitigation and Subgroup Identification	R
		Alejandro Kuratomi, Zed Lee, Panayiotis Tsaparas, Guilherme Dinis Junior, Evaggelia Pitoura, Tony Lindgren, and Panagiotis Papapetrou
11:10	DM573	D-Cube: Exploiting Hyper-Features of Diffusion Model for Robust Medical Classification	S
		Minhee Jang, Juheon Son, Thanaporn Viriyasaranon, Junho Kim, and Jang-hwan Choi
11:25	DM648	Survival Analysis with Multiple Noisy Labels	S
		Donna Tjandra and Jenna Wiens

Session A1-4 Foundations, algorithms, models, and theory of data mining

Room CS 9, 10:30-11:40

Session Chair: Mubarak Gwaza Abdu-Aguye, MBZUAI, Mubarak.Abdu-Aguye@mbzuai.ac.ae

10:30	DM488	Margin-bounded Confidence Scores for Out-of-Distribution Detection
		Lakpa Tamang, Mohamed Reda Bouadjenek, Richard Dazeley, and Sunil Aryal
10:50	DM515	Fast and Accurate Triangle Counting in Graph Streams Using Predictions
		Cristian Boldrin and Fabio Vandin
11:10	DM390	Accurate and Fast Estimation of Temporal Motifs using Path Sampling
		Yunjie Pan, Omkar Bhalerao, C. Seshadhri, and Nishil Talati
11:25	DM326	SHADE: Deep Density-based Clustering
		Anna Beer, Pascal Weber, Lukas Miklautz, Collin Leiber, Walid Durani, Christian Böhm, and Claudia Plant

Session A2-3 Deep learning and statistical methods for data mining

Room CHB, 10:30-11:45

Session Chair: Evgenii Tsymbalov, Amazon, etsymbalov@gmail.com

10:30	DM461	Combining Self-Supervision and Privileged Information for Representation Learning from Tabular Data
		Haoyu Yang, Gyorgy Simon, Michael Steinbach, Genevieve Melton, and Vipin Kumar
10:50	DM510	Towards Dynamic University Course Timetabling Problem: An Automated Approach Augmented via Reinforcement Learning
		Yanan Xiao, XiangLin Li, Lu Jiang, Pengfei Wang, Kaidi Wang, and Na Luo
11:10	DM591	HFGNN: Efficient Graph Neural Networks using Hub-Fringe Structures
		Pak Lon Ip, Sheng Hui Zhang, Xue Kai Wei, Tsz Nam Chan, and Leong Hou U
11:30	DM628	Unsupervised Domain Adaptation for Action Recognition via Self-Ensembling and Conditional Embedding Alignment
		Indrajeet Ghosh, Garvit Chugh, Abu Zaher Md Faridee, and Nirmalya Roy

Women Forum

Day 3: December 11, 2024, Room Hive, 10:30-12:00

Co-chairs: Prof. Xiaochun Yang & Prof. Xiaofeng Gao

10:30	Forum Opening
	Prof. Kun Zhang, Program Committee Co-Chair
10:35	Warm-Up Speech (with Personal Experience Sharing)
	Prof. Xiaochun Yang & Prof. Xiaofeng Gao, Women Forum Co-Chairs
10:50	Intelligent Knowledge Discovery—Explorations in Talent Analytics
	Dr. Ying Sun, The Hong Kong University of Science and Technology, Guangzhou, China
11:10	Breaking Barriers in Time Series Analysis
	Dr. Zahra Ahmadi, Hannover Medical School, Germany
11:30	Exploring Data Science: A Personal Journey
	Ms. Li Qian, Ludwig-Maximilians-Universität München, Germany
11:45	The Research on Machine Learning for Data Management
	Ms. Chaohong Ma, Renmin University of China
12:00	Closing Speech
	Prof. Elena Baralis, Program Committee Co-Chair

Day 4: December 12, 2024

Session A3-5 Mining from heterogeneous data sources

Room CS 5, 10:30-11:40

Session Chair: Djellel Difallah, NYU Abu Dhabi, djellel@nyu.edu

10:30	DM436	High-Fidelity Diffusion Editor for Zero-Shot Text-Guided Video Editing
		Yan Luo, Zhichao Zuo, Zhao Zhang, Zhongqiu Zhao, Haijun Zhang, and Richang Hong
10:50	DM475	Align Along Time and Space: A Graph Latent Diffusion Model for Traffic Dynamics Prediction
		Yuhang Liu, Yingxue Zhang, Xin Zhang, Yu Yang, Yiqun Xie, Sahar Ghanipoor Machiani, Yanhua Li, and Jun Luo
11:10	DM483	Futures Quantitative Investment with Heterogeneous Continual Graph Neural Network
		Zhizhong Tan, Min Hu, Bin Liu, and Guosheng Yin
11:25	DM497	Multi-Hyperbolic Space-based Heterogeneous Graph Attention Network
		Jongmin Park, Seunghoon Han, Jong-Ryul Lee, and Sungsu Lim

Session A5-5 Data mining for modelling, visualization, personalization, and recommendation

Room CS 7, 10:30-11:25

Session Chair: Shirui Pan, Griffith University, s.pan@griffith.edu.au

10:30	DM747	DISCO: A Hierarchical Disentangled Cognitive Diagnosis Framework for Interpretable Job Recommendation
		Xiaoshan Yu, Chuan Qin, Qi Zhang, Chen Zhu, Haiping Ma, Xingyi Zhang, and Hengshu Zhu
10:50	DM778	Bi-level User Modeling for Deep Recommender Systems
		Yejing Wang, Dong Xu, Xiangyu Zhao, Zhiren Mao, Peng Xiang, Ling Yan, Yao Hu, Zijian Zhang, Xuetao Wei, and Qidong Liu
11:10	DM708	An Explainable Recommender System by Integrating Graph Neural Networks and User Reviews
		Sahar Batmani, Parham Moradi, Narges Haidari, and Mahdi Jalili

Session A6-5 Applications of data mining

Room CS 9, 10:30-11:25

Session Chair: Ling Chen, University of Technology Sydney, ling.chen@uts.edu.au

10:30	DM806	A Learned Approach to Index Algorithm Selection	R
		Chaohong Ma, Xiaohui Yu, Yifan Li, Aishan Maoliniyazi, and Xiaofeng Meng
10:50	DM772	TAN: A Tripartite Alignment Network Enhancing Composed Image Retrieval with Momentum Distillation	R
		Yongquan Wan, Erhe Yang, Cairong Yan, Guobing Zou, and Bofeng Zhang
11:10	DM604	AnomalyLLM: Few-shot Anomaly Edge Detection for Dynamic Graphs using Large Language Models	S
		Shuo Liu, Yao Di, Lanting Fang, Zhetao Li, Wenbin Li, Kaiyu Feng, Xiaowen Ji, and Jingping Bi

Session A1-5 Foundations, algorithms, models, and theory of data mining

Room Hive, 10:30-11:40

Session Chair: Shaoan Xie, Carnegie Mellon University, shaoan@cmu.edu

10:30	DM667	Scalable Graph Classification via Random Walk Fingerprints
		Peiyan Li, Honglian Wang, and Christian Böhm
10:50	DM717	Warm-Starting Contextual Bandits under Latent Reward Scaling
		Bastian Oetomo, R. Malinga Perera, Renata Borovica-Gajic, and Benjamin I. P. Rubinstein
11:10	DM446	Constructing $\epsilon$-Constrained Sparsified $\beta^s$-Complexes using Space Partitioning Trees
		Rohit Singh and Philip Wilsey
11:25	DM394	DynoGraph: Dynamic Graph Construction for Nonlinear Dimensionality Reduction
		Li Qian, Claudia Plant, Yalan Qin, Jing Qian, and Christian Böhm

Session A1-6 Foundations, algorithms, models, and theory of data mining

Room CS 5, 13:30-14:55

Session Chair: Jiuyong Li, University of South Australia, jiuyong.li@unisa.edu.au

13:30	DM776	PROMIPL: A Probabilistic Generative Model for Multi-Instance Partial-Label Learning
		Yin-Fang Yang, Wei Tang, and Min-Ling Zhang
13:50	DM783	A Novel Shadow Variable Catcher for Addressing Selection Bias in Recommendation Systems
		Qingfeng Chen, Boquan Wei, Debo Cheng, Jiuyong Li, Lin Liu, and Shichao Zhang
14:10	DM672	Reducing Unfairness in Distributed Community Detection
		Hao Zhang, Malith Jayaweera, Bin Ren, Yanzhi Wang, and Sucheta Soundarajan
14:25	DM780	An Efficient Graph Autoencoder with Lightweight Desmoothing Decoder and Long-Range Modeling
		Jinyong Wen, Tao Zhang, Chunxia Zhang, Shiming Xiang, and Chunhong Pan
14:40	DM798	MoRE-LLM: Mixture of Rule Experts Guided by a Large Language Model
		Alexander Koebler, Ingo Thon, and Florian Buettner

Session A2-5 Deep learning and statistical methods for data mining

Room CS 7, 13:30-15:00

Session Chair: Chuan Zhou, Peking University, zhouchuancn@pku.edu.cn

13:30	DM634	Counterfactual Brain Graph Augmentation Guided Bi-Level Contrastive Learning for Disorder Analysis	R
		Guangwei Dong, Xuexiong Luo, Jing Du, Jia Wu, Shan Xue, Jian Yang, and Amin Beheshti
13:50	DM734	Feature Map Purification for Enhancing Adversarial Robustness of Deep Timeseries Classifiers	R
		Mubarak Abdu-Aguye, Zaigham Zaheer, and Karthik Nandakumar
14:10	DM790	EMIT - Event Based Masked Auto Encoding for Irregular Time Series	R
		Hrishikesh Patel, Ruihong Qiu, Adam Irwin, Shazia Sadiq, and Sen Wang
14:30	DM809	PC3: Enhancing Concurrency in High-Conflict Transactions with Prior Cascading Control	S
		Zhibin Wang, Jiangtao Cui, Xiyue Gao, Hui Zhang, Guiqi Ren, Yixiao Liu, Hui Li, and Kankan Zhao
14:45	DM795	Handling Non-IID Data in Federated Learning Using Metaheuristic Optimization Techniques	S
		Amin Birashk, Sadaf MD Halim, and Latifur Khan

Session A3-6 Mining from heterogeneous data sources

Room CS 9, 13:30-15:00

Session Chair: Kijung Shin, KAIST, kijungs@kaist.ac.kr

13:30	DM713	Traffic Pattern Sharing for Federated Traffic Flow Prediction with Personalization
		Hang Zhou, Wentao Yu, Sheng Wan, Yongxin Tong, Tianlong Gu, and Chen Gong
13:50	DM745	TROPICAL: Transformer-based Hypergraph Learning for Camouflaged Fraudsters Detection
		Venus Haghighi, Behnaz Soltani, Nasrin Shabani, Jia Wu, Yang Zhang, Lina Yao, Quan Z. Sheng, and Jian Yang
14:10	DM760	MOStream: A Modular and Self-Optimizing Data Stream Clustering Algorithm
		Zhengru Wang, Xin Wang, and Shuhao Zhang
14:30	DM467	Weakly-Supervised Graph Classification with Even a Single Key Subgraph Per Class
		Lu Zhang, Chenbo Zhang, Jihong Guan, and Shuigeng Zhou
14:45	DM681	Graph Rhythm Network: Beyond Energy Modeling for Deep Graph Neural Networks
		Yufei Jin and Xingquan Zhu

Session A2-4 Deep learning and statistical methods for data mining

Room Hive, 13:30-15:00

Session Chair: Omkar Bhalerao, University of California, Santa Cruz, obhalera@ucsc.edu

13:30	DM610	A Bayesian Hierarchical Model for Orthogonal Tucker Decomposition with Oblivious Tensor Compression
		Matthew Pietrosanu, Bei Jiang, and Linglong Kong
13:50	DM611	Normalizing self-supervised learning for provably reliable Change Point Detection
		Alexandra Bazarova, Evgenia Romanenkova, and Alexey Zaytsev
14:10	DM729	Enhancing Distribution and Label Consistency for Graph Out-of-Distribution Generalization
		Song Wang, Xiaodong Yang, Rashidul Islam, Huiyuan Chen, Minghua Xu, Jundong Li, and Yiwei Cai
14:25	DM741	CAKD: A Correlation-Aware Knowledge Distillation Framework Based on Decoupling Kullback-Leibler Divergence
		Zao Zhang, Huaming Chen, Pei Ning, Nan Yang, and Dong Yuan
14:40	DM812	Rank Supervised Contrastive Learning for Time Series Classification
		Qianying Ren, Dongsheng Luo, and Dongjin Song

Tutorials

Tutorial 1: Causality and Large Models

Presenters: Haoxuan Li, Chuan Zhou, Mengyue Yang, Mingming Gong, Jun Wang, Xiao-Hua Zhou

Abstract: Our tutorial explores the synergies between causality and large models, also known as “foundation models,” which have demonstrated remarkable capabilities in data mining for healthcare, finance, and education. However, there are increasing concerns about the trustworthiness and interpretability of these complex “black-box” LLMs, despite their promising performance in data mining domains. A growing community of researchers is turning towards a more principled framework to address these concerns, better understand the behavior of large models, and improve their reliability and interpretability. Specifically, this tutorial will focus on three directions: causal agents for decision-making, LLMs for causality, and benefiting LLMs with causality. We also introduce some open challenges and potential future directions for this area. We hope this tutorial will stimulate more ideas on this topic and facilitate the development of causality-aware large models.

Duration: One whole day

Tutorial 2: Hypergraph Neural Networks: An In-Depth and Step-by-Step Guide

Presenters: Sunwoo Kim, Soo Yong Lee, Yue Gao, Alessia Antelmi, Mirko Polato, Kijung Shin

Abstract: Higher-order interactions (HOIs) are ubiquitous in real-world networks. Investigation of deep learning for networks of HOIs, expressed as hypergraphs, has become an important agenda for the data mining and machine learning communities. Thus, hypergraph neural networks (HNNs) have emerged as a powerful tool for representation learning on hypergraphs. Given the emerging trend, we provide a timely tutorial dedicated to HNNs. We cover the (1) inputs, (2) message passing schemes, (3) training strategies, (4) applications (e.g., recommender systems and time series analysis), and (5) open problems of HNNs. This tutorial is intended for researchers and practitioners who are interested in (hyper)graph representation learning and its applications.

Duration: Half-a-day

Tutorial 3: Uncertain Boundaries: Multidisciplinary Approaches to Copyright Issues in Generative AI

Abstract: As generative AI systems become more prevalent in creative fields, concerns about intellectual property rights have grown, particularly regarding the production of content that closely resembles human-created work. Recent controversies, where AI models have generated near-replicas of copyrighted material, underscore the urgency of reviewing the current copyright framework and developing methods to mitigate infringement risks. To this end, this tutorial offers a comprehensive analysis of these copyright challenges, examining them throughout the AI development life cycle and providing developers with actionable strategies. It begins by discussing the foundational goals and considerations for copyright in generative AI, followed by methods for detecting and assessing potential violations in AI outputs. Next, it introduces techniques to safeguard creative works and datasets from unauthorized replication. The tutorial also covers training methods aimed at minimising the risk of AI models reproducing protected content. Finally, it reviews the state of AI copyright regulation and suggests future research pathways to address existing gaps.

Duration: Half-a-day

Keynotes

Keynote 1: Preslav Nakov

Title: Towards Safe, Truly Open, and Factual Large Language Models

Abstract: We will discuss several initiatives towards safe, truly open, and factual large language models (LLMs). First, we will present Do-Not-Answer, a dataset for evaluating the guardrails of LLMs, which is at the core of the safety mechanisms incorporated in Jais, the world's leading open Arabic-centric foundation and instruction-tuned large language model, and Nanda, our recently released open Hindi LLM. Next, we will discuss the LLM360 initiative of MBZUAI's Institute on Foundation Models, aiming at developing fully transparent open-source LLMs. We will then examine the factuality challenges associated with large language models, and we will present some recent relevant tools for addressing these challenges developed at MBZUAI: (i) OpenFactCheck, a framework for fact-checking LLM output, for building customized fact-checking systems, and for benchmarking LLMs for factuality, (ii) LM-Polygraph, a tool for predicting an LLM's uncertainty in its output using cheap and fast uncertainty quantification techniques, and (iii) LLM-DetectAIve, a tool for machine-generated text detection.

Bio: Preslav Nakov is Professor and Department Chair for NLP at the Mohamed bin Zayed University of Artificial Intelligence. He is part of the core team at MBZUAI's Institute for Foundation Models that developed Jais, the world's best open-source Arabic-centric LLM, Nanda, the world's best Hindi model, and LLM360, the first truly open LLM. Previously, he was Principal Scientist at the Qatar Computing Research Institute, HBKU, where he led the Tanbih mega-project, developed in collaboration with MIT, which aims to limit the impact of "fake news", propaganda and media bias by making users aware of what they are reading, thus promoting media literacy and critical thinking. He received his PhD degree in Computer Science from the University of California at Berkeley, supported by a Fulbright grant. He is Chair-Elect of the European Chapter of the Association for Computational Linguistics (EACL), Secretary of ACL SIGSLAV, and Secretary of the Truth and Trust Online board of trustees. Formerly, he was PC chair of ACL 2022 and President of ACL SIGLEX. He is also a member of the editorial boards of several journals, including Computational Linguistics, TACL, ACM TOIS, IEEE TASL, IEEE TAC, CS&L, NLE, AI Communications, and Frontiers in AI. He authored a Morgan & Claypool book on Semantic Relations between Nominals, two books on computer algorithms, and 250+ research papers. He received a Best Paper Award at ACM WebSci'2022, a Best Long Paper Award at CIKM'2020, a Best Resource Paper Award at EACL'2024, a Best Demo Paper Award (Honorable Mention) at ACL'2020, a Best Task Paper Award (Honorable Mention) at SemEval'2020, a Best Poster Award at SocInfo'2019, and the Young Researcher Award at RANLP'2011. He was also the first to receive the Bulgarian President's John Atanasoff award, named after the inventor of the first automatic electronic digital computer.
His research has been featured by over 100 news outlets, including Reuters, Forbes, Financial Times, CNN, Boston Globe, Aljazeera, DefenseOne, Business Insider, MIT Technology Review, Science Daily, Popular Science, Fast Company, The Register, WIRED, and Engadget.

Photo: https://mbzuai.ac.ae/study/faculty/preslav-nakov/

Keynote 2: Bernhard Schölkopf

Title: Towards causal world models and digital twins

Abstract: Research on understanding and building artificially intelligent systems has moved from symbolic approaches to statistical learning, and is now beginning to study interventional models relying on concepts of causality. Some of the hard open problems of machine learning and AI are intrinsically related to causality, and progress may require advances in our understanding of how to model and infer causality from data, as well as conceptual progress on what constitutes a causal representation and a causal world model. I will present basic concepts and thoughts, and some applications to astronomy.

Bio: Bernhard Schölkopf's scientific interests are in machine learning and causal inference. He has applied his methods to a number of different fields, ranging from biomedical problems to computational photography and astronomy. Bernhard studied physics and mathematics and earned his Ph.D. in computer science in 1997, becoming a Max Planck director in 2001. He has (co-)received the Berlin-Brandenburg Academy Prize, the Royal Society Milner Award, the Leibniz Award, the BBVA Foundation Frontiers of Knowledge Award, and the ACM-AAAI Allen Newell Award. He is a Fellow of the CIFAR Program "Learning in Machines and Brains" and a Professor at ETH Zurich. He helped start the MLSS series of Machine Learning Summer Schools. In 2023, he founded the ELLIS Institute Tuebingen and acts as its scientific director.

Keynote 3: Claudia Plant

Title: Clustering: Balancing Abstraction and Representation

Abstract: How can we find a natural grouping of a large real-world data set? Clustering requires a balance between abstraction and representation. To identify clusters, we need to abstract from superfluous details of individual objects. But we also need a rich representation that emphasizes the key features shared by groups of objects that distinguish them from other groups of objects.

Each clustering algorithm implements a different trade-off between abstraction and representation. Classical K-means implements a high level of abstraction - details are simply averaged out - combined with a very simple representation - all clusters are Gaussians in the original data space. We will see how approaches to subspace and deep clustering support high-dimensional and complex data by allowing richer representations. However, with increasing representational expressiveness comes the need to explicitly enforce abstraction in the objective function to ensure that the resulting method performs clustering and not just representation learning. We will see how current deep clustering methods define and enforce abstraction through centroid-based and density-based clustering losses. Balancing the conflicting goals of abstraction and representation is challenging. Ideas from subspace clustering help by learning one latent space for the information that is relevant to clustering and another latent space to capture all other information in the data.

The talk ends with an outlook on future research in clustering. In my view, future methods will more adaptively balance abstraction and representation to improve performance, energy efficiency and interpretability. By automatically finding the sweet spot between abstraction and representation, the human brain is very good at clustering and other related tasks such as single-shot learning. So, there is still much to be explored.

Bio: Claudia Plant is a full professor and leader of the Data Mining and Machine Learning research group at the Faculty of Computer Science, University of Vienna, Austria. Her group focuses on new methods for exploratory data mining, e.g., clustering, anomaly detection, graph mining and matrix factorization. Many approaches relate unsupervised learning to data compression, i.e., the better the found patterns compress the data, the more information we have learned. Other methods rely on finding statistically independent patterns or multiple non-redundant solutions, relying on deep learning or nature-inspired concepts such as synchronization. Indexing techniques and methods for parallel hardware support exploring massive data. Claudia Plant has co-authored over 150 peer-reviewed publications, among them more than 30 contributions to KDD and ICDM and 4 Best Paper Awards. Papers on scalability aspects appeared at SIGMOD and ICDE, and the results of interdisciplinary projects appeared in leading application-related journals such as Bioinformatics, Cerebral Cortex and Water Research.