- Authors: Yeahoon Kwon, Yesong Choe, Soungmin Park, Neil Dhir, Sanghack Lee
- Conference: Thirty-Ninth Conference on Neural Information Processing Systems (NeurIPS 2025)
- Poster Session: Thursday, December 4, 2025 • 11:00 AM – 2:00 PM PST • San Diego, Exhibit Hall C,D,E #2603
- Blog Post: Interactive Explainer
NS-SCMMAB is a Python library for non-stationary structural causal bandits that addresses sequential decision-making in environments with evolving causal mechanisms. Unlike traditional multi-armed bandit (MAB) formulations that assume fixed reward distributions, our framework models how causal structures change over time and how interventions propagate temporally.
- Temporal Causal Modeling: Explicitly models how causal structures evolve over time using temporal structural causal models (SCMs)
- Non-Myopic Intervention Strategies: Identifies intervention sequences that maximize both immediate and long-term rewards through POMIS+ (Possibly-Optimal Minimal Intervention Sets with Future Support)
- Graphical Causal Tools: Provides graphical characterization and algorithms for identifying optimal intervention strategies in non-stationary environments
- Comprehensive Experiments: Includes all code to reproduce the experimental results from the NeurIPS 2025 paper
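To make the idea of temporal propagation concrete, the following standalone sketch computes how an intervention at time step 1 changes the expected reward at time step 2 in a small binary chain (X1 -> Z1 -> Y1 and X2 -> Z2 -> Y2, linked by a temporal edge X1 -> X2). It does not use the NS-SCMMAB API. It assumes that every mechanism is the XOR of its single parent with an independent Bernoulli noise term; the noise probabilities and the helper names `xor_prob` and `expected_rewards` are illustrative choices, not part of the library.

```python
# Assumed noise probabilities P(U = 1); illustrative values only.
NOISE = {
    "U_X1": 0.5, "U_Z1": 0.2, "U_Y1": 0.1,
    "U_X2": 0.3, "U_Z2": 0.2, "U_Y2": 0.1,
}


def xor_prob(p, q):
    """P(A xor B = 1) for independent A ~ Bernoulli(p), B ~ Bernoulli(q)."""
    return p * (1 - q) + (1 - p) * q


def expected_rewards(x1=None):
    """Return (E[Y1], E[Y2]); x1=None means no intervention, else do(X1=x1)."""
    # Under do(X1=x1) the mechanism of X1 is replaced by a constant.
    p_x1 = NOISE["U_X1"] if x1 is None else float(x1)
    # Time step 1: X1 -> Z1 -> Y1
    p_z1 = xor_prob(p_x1, NOISE["U_Z1"])
    p_y1 = xor_prob(p_z1, NOISE["U_Y1"])
    # Temporal edge X1 -> X2, then time step 2: X2 -> Z2 -> Y2
    p_x2 = xor_prob(p_x1, NOISE["U_X2"])
    p_z2 = xor_prob(p_x2, NOISE["U_Z2"])
    p_y2 = xor_prob(p_z2, NOISE["U_Y2"])
    return p_y1, p_y2


if __name__ == "__main__":
    for label, x1 in [("no intervention", None), ("do(X1=0)", 0), ("do(X1=1)", 1)]:
        e_y1, e_y2 = expected_rewards(x1)
        print(f"{label}: E[Y1]={e_y1:.3f}, E[Y2]={e_y2:.3f}, total={e_y1 + e_y2:.3f}")
```

Under these assumed parameters, do(X1=1) raises both expected rewards relative to no intervention. The effect on Y2 is weaker than on Y1 because it has to pass through the noisy temporal edge.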
- Python 3.9 or higher
- Operating System: Linux or macOS (tested on both)
The easiest way to install NS-SCMMAB is to clone the repository and install it with pip in editable mode:
git clone https://github.com/yeahoon-k/NS-SCMMAB.git
cd NS-SCMMAB
pip install -e .

This will automatically install all required dependencies:
- numpy >= 1.21.2
- scipy >= 1.7.1
- networkx >= 2.6.3
- joblib >= 1.0.1
- matplotlib >= 3.4.3
- seaborn >= 0.11.2
- tqdm >= 4.62.0
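If you want to confirm that an existing environment already satisfies these minimum versions, a small check along the following lines may help. This is a sketch that relies only on the standard library (`importlib.metadata`); the helper names `parse_version`, `meets_minimum`, and `check_requirements` are hypothetical, and the version comparison is deliberately simplistic (it only looks at the leading numeric components).

```python
import re
from importlib import metadata

# Minimum versions copied from the dependency list above.
REQUIREMENTS = {
    "numpy": "1.21.2",
    "scipy": "1.7.1",
    "networkx": "2.6.3",
    "joblib": "1.0.1",
    "matplotlib": "3.4.3",
    "seaborn": "0.11.2",
    "tqdm": "4.62.0",
}


def parse_version(text):
    """Turn '1.21.2' or '3.8.0rc1' into a tuple of leading integers."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings by their numeric components."""
    return parse_version(installed) >= parse_version(minimum)


def check_requirements(requirements=REQUIREMENTS):
    """Map each package name to (installed_version_or_None, is_satisfied)."""
    report = {}
    for name, minimum in requirements.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
            continue
        report[name] = (installed, meets_minimum(installed, minimum))
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_requirements().items():
        status = "ok" if ok else "missing or too old"
        print(f"{name}: installed={installed}, required>={REQUIREMENTS[name]} ({status})")
```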
Here's a simple example to get you started with NS-SCMMAB:
import numpy as np
from src.model import CausalDiagram, StructuralCausalModel
from src.pomis_plus import POMISplusSEQ
from src.bandits import ThompsonSamplingBandit
# Define a simple non-stationary causal structure
# Time step 1: X1 -> Z1 -> Y1
# Time step 2: X2 -> Z2 -> Y2, with X1 -> X2 (temporal dependency)
# Create causal diagram
cd = CausalDiagram()
cd.add_edges([
    ('X1', 'Z1'), ('Z1', 'Y1'),  # Time step 1
    ('X2', 'Z2'), ('Z2', 'Y2'),  # Time step 2
    ('X1', 'X2')                 # Temporal edge
])
# Define structural equations
def f_X1(u): return u['U_X1']
def f_Z1(x1, u): return (x1 + u['U_Z1']) % 2
def f_Y1(z1, u): return (z1 + u['U_Y1']) % 2
def f_X2(x1, u): return (x1 + u['U_X2']) % 2
def f_Z2(x2, u): return (x2 + u['U_Z2']) % 2
def f_Y2(z2, u): return (z2 + u['U_Y2']) % 2
# Define exogenous distributions
P_U = {
    'U_X1': 0.5, 'U_Z1': 0.2, 'U_Y1': 0.1,
    'U_X2': 0.3, 'U_Z2': 0.2, 'U_Y2': 0.1
}
# Create SCM
scm = StructuralCausalModel(
    graph=cd,
    functions={'X1': f_X1, 'Z1': f_Z1, 'Y1': f_Y1,
               'X2': f_X2, 'Z2': f_Z2, 'Y2': f_Y2},
    exogenous_dist=P_U
)
# Compute POMIS+ sequences
reward_vars = ['Y1', 'Y2']
pomis_plus_sequences = POMISplusSEQ(cd, reward_vars, time_horizon=2)
print("POMIS+ Intervention Sequences:")
for i, seq in enumerate(pomis_plus_sequences, 1):
    print(f"Sequence {i}: {seq}")

from src.scm_bandits import SCMBandit
from src.bandits import ThompsonSamplingBandit
# Create a bandit problem from the SCM
bandit = SCMBandit(scm, reward_vars=['Y1', 'Y2'])
# Initialize Thompson Sampling
ts = ThompsonSamplingBandit(
    intervention_sequences=pomis_plus_sequences,
    n_trials=10000
)
# Run the bandit algorithm
cumulative_regret = ts.run(bandit)
print(f"Final cumulative regret: {cumulative_regret[-1]:.2f}")

import matplotlib.pyplot as plt
plt.figure(figsize=(10, 6))
plt.plot(cumulative_regret, label='Thompson Sampling with POMIS+')
plt.xlabel('Trials')
plt.ylabel('Cumulative Regret')
plt.title('Performance on Non-Stationary Causal Bandit')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()

To reproduce the experiments from the NeurIPS 2025 paper:
# Run experiments (~2 hours on 48-core server, ~4-6 hours on typical machines)
python -m src.NIPS2025POMISPLUS_exp.test_nsbandit_strategies

This creates a bandit_results/ directory with results for three experimental tasks:
- Task 1: Standard non-stationary chain structure (Fig. 3 in paper)
- Task 2: Non-stationary structure with collider (Fig. 4 in paper)
- Task 3: Complex structure with long-range dependencies (Fig. 6 in paper)
# Generate figures as in the paper
python -m src.NIPS2025POMISPLUS_exp.test_drawing_re

This produces:
- Cumulative regret plots comparing POMIS+ vs myopic POMIS
- Optimal arm selection probability plots
- Results for both Thompson Sampling and KL-UCB algorithms
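For readers unfamiliar with the two reported metrics, the sketch below shows one common way to compute cumulative regret and the optimal arm selection rate for a Bernoulli bandit run with Beta-Bernoulli Thompson Sampling. It is a simplified, stationary illustration written with the standard library only, not the repository's experiment code: the arm means are hypothetical, each arm stands in for one candidate intervention, and the function names `run_thompson_sampling` and `running_optimal_fraction` are assumptions made for this example.

```python
import random


def run_thompson_sampling(arm_means, n_trials, seed=0):
    """Run Beta-Bernoulli Thompson Sampling on fixed (hypothetical) arm means.

    Returns the cumulative pseudo-regret after each trial and a list of
    booleans marking the trials on which the optimal arm was pulled.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    successes = [0] * n_arms
    failures = [0] * n_arms
    best_mean = max(arm_means)
    best_arm = arm_means.index(best_mean)

    cumulative_regret, optimal_flags = [], []
    total_regret = 0.0
    for _ in range(n_trials):
        # Sample a plausible mean for each arm from its Beta(1 + s, 1 + f) posterior.
        samples = [rng.betavariate(successes[a] + 1, failures[a] + 1)
                   for a in range(n_arms)]
        arm = max(range(n_arms), key=samples.__getitem__)

        # Draw a Bernoulli reward and update the chosen arm's posterior counts.
        if rng.random() < arm_means[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1

        # Pseudo-regret: gap between the best arm's mean and the chosen arm's mean.
        total_regret += best_mean - arm_means[arm]
        cumulative_regret.append(total_regret)
        optimal_flags.append(arm == best_arm)
    return cumulative_regret, optimal_flags


def running_optimal_fraction(optimal_flags):
    """Fraction of trials so far on which the optimal arm was selected."""
    fractions, hits = [], 0
    for t, flag in enumerate(optimal_flags, start=1):
        hits += int(flag)
        fractions.append(hits / t)
    return fractions


if __name__ == "__main__":
    regret, flags = run_thompson_sampling([0.3, 0.5, 0.7], n_trials=2000)
    print(f"Final cumulative regret: {regret[-1]:.2f}")
    print(f"Optimal arm fraction: {running_optimal_fraction(flags)[-1]:.3f}")
```

In the paper's setting these curves are averaged over many independent runs; a single run like the one above is noisy and is only meant to show how the quantities are defined.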
To run specific tasks:
from src.NIPS2025POMISPLUS_exp import test_nsbandit_strategies
# Run only Task 1
test_nsbandit_strategies.run_task(task_id=1, n_trials=100000, n_runs=200)

You can customize the experimental parameters:
from src.NIPS2025POMISPLUS_exp.scm_examples import create_task1_scm
from src.scm_bandits import run_bandit_experiment
# Create custom SCM
scm = create_task1_scm()
# Run with custom parameters
results = run_bandit_experiment(
    scm=scm,
    algorithm='thompson_sampling',  # or 'kl_ucb'
    n_trials=50000,
    n_runs=100,
    use_pomis_plus=True  # Set to False for myopic baseline
)

NS-SCMMAB/
├── src/                                  # Main package
│   ├── model.py                          # Structural Causal Model implementation
│   ├── bandits.py                        # Bandit algorithms (Thompson Sampling, KL-UCB)
│   ├── scm_bandits.py                    # SCM-specific bandit formulations
│   ├── pomis_plus.py                     # POMIS+ algorithm implementation
│   ├── where_do.py                       # POMIS identification utilities
│   ├── utils.py                          # Helper functions
│   ├── viz_util.py                       # Visualization utilities
│   └── NIPS2025POMISPLUS_exp/            # Experimental code
│       ├── test_nsbandit_strategies.py   # Main experiment runner
│       ├── test_drawing_re.py            # Figure generation
│       ├── scm_examples.py               # Task-specific SCM definitions
│       ├── construct_pomis.py            # POMIS construction utilities
│       ├── report_cum_regret_oap.py      # Result reporting
│       └── report_mean_rewards.py        # Reward analysis
├── pyproject.toml                        # Package configuration
├── LICENSE.txt                           # MIT License
└── README.md                             # This file
If you use this code in your research, please cite our paper:
@inproceedings{kwon2025nonstationary,
  title={Non-Stationary Structural Causal Bandits},
  author={Kwon, Yeahoon and Choe, Yesong and Park, Soungmin and Dhir, Neil and Lee, Sanghack},
  booktitle={Thirty-Ninth Conference on Neural Information Processing Systems (NeurIPS)},
  year={2025},
  url={https://openreview.net/pdf?id=F4LhOqhxkk}
}

This project is licensed under the MIT License. See the LICENSE.txt file for details.
This work was supported by:
- IITP (RS-2022-II220953, RS-2025-02263754) grants funded by the Korean government
- NRF (RS-2023-00211904, RS-2023-00222663) grants funded by the Korean government
- Basic Science Research Program through the NRF funded by the Ministry of Education (RS-2025-25418030)
For questions or issues:
- Open an issue: GitHub Issues
- Email: [email protected] or [email protected]
- Paper: Non-Stationary Structural Causal Bandits (NeurIPS 2025)
- Conference Page: NeurIPS 2025 Poster #119070
- Blog Post: Interactive Explainer
- Lee, S., & Bareinboim, E. (2018). Structural Causal Bandits: Where to Intervene? NeurIPS 2018.
- Lee, S., & Bareinboim, E. (2019). Structural Causal Bandits with Non-manipulable Variables. AAAI 2019.
- Pearl, J. (2009). Causality: Models, Reasoning and Inference. Cambridge University Press.