IBM/AssetOpsBench

AI Agents for Industrial Asset Operations & Maintenance

AssetOps MultiAgentBench EMNLP 2025 NeurIPS 2025 AAAI 2026

📘 Tutorials: Learn more from our detailed guides:
ReActXen IoT Agent (EMNLP 2025) | FailureSensorIQ (NeurIPS 2025) | AssetOpsBench Lab (AAAI 2026) | Spiral (AAAI 2026) | AssetOpsBench Technical Material

📄 Paper | 🤗 HF-Dataset | 📢 IBM Blog | 🤗 HF Blog | Contributors

Kaggle Hugging Face Open In Colab


📢 Call for Scenario Contribution

We are expanding AssetOpsBench to cover a broader range of industrial challenges. We invite researchers and practitioners to contribute new scenarios, particularly in the following areas:

  • Asset Classes: Turbines, HVAC Systems, Pumps, Transformers, CNC Machines, Robotics, Engines, and so on.
  • Task Domains: Prognostics and Health Management, Remaining Useful Life (RUL) estimation, Root Cause Analysis (RCA), Diagnostic Analysis, and Predictive Maintenance.

How to contribute:

  1. Define your scenario following our Utterance Guideline and Ground Truth Guideline.

  2. Explore the Hugging Face dataset for examples.

  3. Submit a Pull Request or open an Issue with the tag new-scenario.

  4. Contact us via email if you have any questions:


Resources


📑 Table of Contents

  1. Announcements
  2. Introduction
  3. Datasets
  4. AI Agents
  5. Multi-Agent Frameworks
  6. System Diagram
  7. Leaderboards
  8. Docker Setup
  9. Talks & Events
  10. External Resources
  11. Contributors

Announcements (Papers, Invited Talks, etc)

  • 📊 Dataset Update: AssetOpsBench now covers 9 asset classes (Chiller, AHU, Pump, Motor, Bearing, Engine, Rotors, Boilers, Turbine) and a wider range of tasks (Remaining Useful Life, Fault Classification, Rule Monitoring, etc.).
    Hugging Face Dataset
    Special thanks to primary contributors: 👥 @DeveloperMindset123, @ChathurangiShyalika, @Fabio-Lorenzi1

  • 📰 AAAI-2026: SPIRAL: Symbolic LLM Planning via Grounded and Reflective Search Authors
    Code

  • 🎯 AAAI-2026 Lab: From Inception to Productization: Hands-on Lab for the Lifecycle of Multimodal Agentic AI in Industry 4.0
    Website Authors AAAI 2026 Slides

  • 📰 AABA4ET/AAAI-2026: Agentic Code Generation for Heuristic Rules in Equipment Monitoring Authors

  • 📰 IAAI/AAAI-2026: Diversity Meets Relevancy: Multi-Agent Knowledge Probing for Industry 4.0 Applications Authors

  • 📰 IAAI/AAAI-2026: Deployed AI Agents for Industrial Asset Management: CodeReAct Framework for Event Analysis and Work Order Automation Authors

  • 📰 AAAI-2026 Demo: AssetOpsBench-Live: Privacy-Aware Online Evaluation of Multi-Agent Performance in Industrial Operations
    Authors Demo Video

  • 📰 NeurIPS-2025 Social: Evaluating Agentic Systems
    Talk: Building Reliable Agentic Benchmarks: Insights from AssetOpsBench | Total Registered Users: 2000+ | Conference
    Speaker
    Attend on Luma

  • 🕓 Past Event: 2025-10-03 – 2-Hour Workshop: AI Agents and Their Role in Industry 4.0 Applications
    Event Host

  • πŸ† Accepted Papers: Parts of papers are accepted at NeurIPS 2025, EMNLP 2025 Research Track, and EMNLP 2025 Industry Track.

  • πŸš€ 2025-09-01: CODS 2025 Competition launched – Access AI Agentic Challenge AssetOpsBench-Live.

  • πŸ“¦ 2025-06-01: AssetOpsBench v1.0 released with 141 industrial Scenarios.

✨ Stay tuned for new tracks, competitions, and community events.


Introduction

AssetOpsBench is a unified framework for developing, orchestrating, and evaluating domain-specific AI agents in industrial asset operations and maintenance.

It provides:

  • 4 domain-specific agents
  • 2 multi-agent orchestration frameworks

Designed for maintenance engineers, reliability specialists, and facility planners, it allows reproducible evaluation of multi-step workflows in simulated industrial environments.


Datasets: 141 Scenarios

AssetOpsBench scenarios span multiple domains:

| Domain | Example Task |
|--------|--------------|
| IoT | "List all sensors of Chiller 6 in MAIN site" |
| FMSR | "Identify failure modes detected by Chiller 6 Supply Temperature" |
| TSFM | "Forecast 'Chiller 9 Condenser Water Flow' for the week of 2020-04-27" |
| WO | "Generate a work order for Chiller 6 anomaly detection" |

Some tasks focus on a single domain; others are multi-step, end-to-end workflows.
Explore all scenarios in the HF-Dataset.
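
As a rough illustration of how scenarios like the example tasks above could be organized by domain, the sketch below groups utterances under a domain tag. The `Scenario` class, its field names, and `group_by_domain` are hypothetical and are not the dataset's actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """Hypothetical record; the real dataset schema may differ."""
    domain: str     # e.g. "IoT", "FMSR", "TSFM", "WO"
    utterance: str  # natural-language task given to the agent


# Example tasks taken from this README.
SCENARIOS = [
    Scenario("IoT", "List all sensors of Chiller 6 in MAIN site"),
    Scenario("FMSR", "Identify failure modes detected by Chiller 6 Supply Temperature"),
    Scenario("TSFM", "Forecast 'Chiller 9 Condenser Water Flow' for the week of 2020-04-27"),
    Scenario("WO", "Generate a work order for Chiller 6 anomaly detection"),
]


def group_by_domain(scenarios):
    """Return a dict mapping each domain to its list of utterances."""
    grouped = defaultdict(list)
    for scenario in scenarios:
        grouped[scenario.domain].append(scenario.utterance)
    return dict(grouped)
```

Calling `group_by_domain(SCENARIOS)` yields one entry per domain, which is a convenient shape for running single-domain evaluations separately from end-to-end ones.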


AI Agents

Domain-Specific Agents (Important tools)

  • IoT Agent: get_sites, get_history, get_assets, get_sensors
  • FMSR Agent: get_sensors, get_failure_modes, get_failure_sensor_mapping
  • TSFM Agent: forecasting, timeseries_anomaly_detection
  • WO Agent: generate_work_order
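
The agent-to-tool mapping above can be written down as a simple lookup table. This is only a sketch for orientation: the tool names come from the list above, while `AGENT_TOOLS` and `agents_for_tool` are hypothetical names and do not reflect how the repository registers tools.

```python
# Tool names as listed in this README; the dict itself is illustrative.
AGENT_TOOLS = {
    "IoT": ["get_sites", "get_history", "get_assets", "get_sensors"],
    "FMSR": ["get_sensors", "get_failure_modes", "get_failure_sensor_mapping"],
    "TSFM": ["forecasting", "timeseries_anomaly_detection"],
    "WO": ["generate_work_order"],
}


def agents_for_tool(tool_name):
    """Return the sorted names of all agents that expose the given tool."""
    return sorted(
        agent for agent, tools in AGENT_TOOLS.items() if tool_name in tools
    )
```

Note that `get_sensors` appears under both the IoT and FMSR agents, so a lookup by tool name can return more than one agent.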

Multi-Agent Frameworks (Blueprints)

  • MetaAgent: ReAct-based single-agent-as-tool orchestration
  • AgentHive: plan-and-execute sequential workflow
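
To make the plan-and-execute pattern concrete, here is a minimal sketch of a sequential workflow in which each step is routed to a named agent and receives the results of earlier steps. It is not the AgentHive implementation; `plan_and_execute`, `echo_agent`, and the example plan are all hypothetical stand-ins.

```python
def plan_and_execute(plan, agents):
    """Run (agent_name, task) steps in order, passing prior results as context."""
    context = []
    for agent_name, task in plan:
        result = agents[agent_name](task, context)
        context.append(result)
    return context


def echo_agent(name):
    """Build a stub agent that only reports what it was asked to do."""
    def run(task, context):
        return f"{name} handled '{task}' with {len(context)} prior result(s)"
    return run


# Stub agents standing in for real domain agents.
AGENTS = {name: echo_agent(name) for name in ("IoT", "TSFM", "WO")}

# A fixed plan; in a real system a planner LLM would produce this.
PLAN = [
    ("IoT", "fetch sensor history"),
    ("TSFM", "detect anomalies"),
    ("WO", "generate work order"),
]
```

A ReAct-style orchestrator differs in that the next step is chosen after observing each result rather than fixed up front; the loop above would then ask the model for the next `(agent_name, task)` pair on every iteration.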

MCP Environment

The src/ directory contains MCP servers and a plan-execute runner built on the Model Context Protocol. See INSTRUCTIONS.md for setup, usage, and testing.


Leaderboards

  • Evaluated with 7 Large Language Models
  • Trajectories scored using LLM Judge (Llama-4-Maverick-17B)
  • 6-dimensional criteria measure reasoning, execution, and data handling
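
One plausible way to turn per-trajectory judge verdicts into leaderboard numbers is a pass rate per criterion. The sketch below assumes the judge returns a boolean for each of six criteria; the placeholder names `criterion_1` to `criterion_6` and the `pass_rates` helper are hypothetical, and the benchmark's actual criteria and aggregation may differ.

```python
# Placeholder names; the real six criteria are defined by the benchmark.
CRITERIA = [f"criterion_{i}" for i in range(1, 7)]


def pass_rates(judgements):
    """Compute the fraction of trajectories passing each criterion.

    `judgements` is a list of dicts mapping criterion name -> bool,
    one dict per scored trajectory (an assumed format).
    """
    if not judgements:
        return {criterion: 0.0 for criterion in CRITERIA}
    total = len(judgements)
    return {
        criterion: sum(bool(j[criterion]) for j in judgements) / total
        for criterion in CRITERIA
    }
```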

Example: MetaAgent leaderboard

meta_agent_leaderboard


Run AssetOpsBench in Docker

  • Please Refer to the
  • Pre-built Docker Images: assetopsbench-basic (minimal) & assetopsbench-extra (full)
  • Conda environment: assetopsbench
  • Full setup guide

```shell
cd /path/to/AssetOpsBench
chmod +x benchmark/entrypoint.sh
docker-compose -f benchmark/docker-compose.yml build
docker-compose -f benchmark/docker-compose.yml up
```

External Resources


Star History Chart


Contributors

Thanks go to these wonderful people ✨

  • DhavalRepo18: 💻 📖
  • ShuxinLin: 💻 📖
  • jtrayfield: 💻 📖
  • nianjunz: 💻 📖
  • ChathurangiShyalika: 💻 📖
  • PUSHPAK-JAISWAL: 💻 📖
  • bradleyjeck: 💻 📖
  • florenzi002: 💻 📖
  • kushwaha001: 💻
  • Mohit Gupta: 📖
  • Ayan Das: 📖 💻
