feat(engine): add datamachine_engine_snapshot filter at job init #2071
Merged
Adds a single filter hook in RunFlowAbility::execute() that fires once per job, immediately before the engine snapshot is persisted. Filters receive the assembled snapshot, the job ID, and the source flow + pipeline rows. Returning a modified array enriches engine_data with context that lives outside DM core.

Driving use case: Data Machine Code projects active workspace identity (repo, handle, branch, path) into engine_data so AI directives, abilities, and tool calls can answer "which repo is this job operating against". DMC already collects this context but never surfaces it into the AI runtime. See Extra-Chill/data-machine-code#423.

Pattern is intentionally tiny: one apply_filters() call, no new classes, no schema changes, no CLI surface. The filter is the public extension point; downstream plugins own the semantics of what they project. DM core stays agnostic about workspace, repo, or any consumer-specific concept.

Defensive: filter return value is validated as array; non-array returns silently preserve the prior snapshot so a misbehaving filter cannot corrupt engine_data.

Refs Extra-Chill/data-machine-code#423
Refs Extra-Chill/extrachill-docs#31
Adds a single-filter extension point to `RunFlowAbility::execute()` so downstream plugins can enrich the engine snapshot at job initialization without forking DM or building parallel persistence machinery.
Motivation
DM core has no way for an external plugin to inject runtime context into a job's `engine_data` at the moment the job is created. Today the only public path is the `initial_data` input on `run-flow` — which is fine when the caller knows the context, but useless for cases where the context lives in a third plugin that the caller doesn't know about.
Concrete driver: Data Machine Code already tracks the identity of every workspace and worktree (`WorktreeContextInjector`), but that knowledge never reaches AI directives, abilities, or tool calls during a job. There is no point in the run-flow pipeline where DMC can stamp "this job is bound to `extrachill-artist-platform@docs/agent-run-123`" into `engine_data`. See Extra-Chill/data-machine-code#423.
What this adds
One `apply_filters()` call at the right spot:
```php
$filtered_snapshot = apply_filters( 'datamachine_engine_snapshot', $engine_snapshot, $job_id, $flow, $pipeline );
if ( is_array( $filtered_snapshot ) ) {
    $engine_snapshot = $filtered_snapshot;
}
```
Fires immediately before `datamachine_set_engine_data()`. Filters receive the assembled snapshot, the job ID, and the source flow + pipeline rows. Returning a modified array enriches the snapshot. Returning a non-array silently preserves the prior snapshot so a misbehaving filter cannot corrupt `engine_data`.
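For illustration, a downstream callback could look roughly like the sketch below. It is not the DMC projector: the callback name, the `workspace` key, the `example_key` entry, and the job ID are all hypothetical, and the small `add_filter()` / `apply_filters()` stand-ins exist only so the snippet runs outside WordPress (they ignore priority, unlike the real functions).

```php
<?php
// Stand-ins so the sketch runs outside WordPress. Inside WordPress the real
// add_filter()/apply_filters() already exist and this block is skipped.
if ( ! function_exists( 'add_filter' ) ) {
    $GLOBALS['sketch_filters'] = array();

    function add_filter( $hook, $callback, $priority = 10, $accepted_args = 1 ) {
        // Priority is ignored here; callbacks run in registration order.
        $GLOBALS['sketch_filters'][ $hook ][] = array( $callback, $accepted_args );
        return true;
    }

    function apply_filters( $hook, $value, ...$args ) {
        foreach ( $GLOBALS['sketch_filters'][ $hook ] ?? array() as $entry ) {
            list( $callback, $accepted_args ) = $entry;
            // Pass the current value plus extra args, trimmed to what the callback accepts.
            $call_args = array_slice( array_merge( array( $value ), $args ), 0, $accepted_args );
            $value     = call_user_func_array( $callback, $call_args );
        }
        return $value;
    }
}

// Hypothetical callback: the 'workspace' key and its field are illustrative,
// not a schema defined by DM or DMC.
function sketch_project_workspace( $snapshot, $job_id, $flow, $pipeline ) {
    if ( ! is_array( $snapshot ) ) {
        return $snapshot; // Leave unexpected input untouched.
    }
    $snapshot['workspace'] = array(
        'handle' => 'extrachill-artist-platform@docs/agent-run-123',
    );
    return $snapshot;
}

// Request all four arguments the filter passes.
add_filter( 'datamachine_engine_snapshot', 'sketch_project_workspace', 10, 4 );

// Simulated call site, mirroring the snippet above with made-up inputs.
$engine_snapshot   = array( 'example_key' => 'example_value' );
$filtered_snapshot = apply_filters( 'datamachine_engine_snapshot', $engine_snapshot, 42, array(), array() );
if ( is_array( $filtered_snapshot ) ) {
    $engine_snapshot = $filtered_snapshot;
}
```

Registering with `$accepted_args = 4` matters in WordPress: with the default of 1, the callback would receive only the snapshot and not the job ID, flow, or pipeline rows.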
Why this shape
The filter pattern matches existing DM conventions (`datamachine_directives`, `datamachine_memory_files`, `datamachine_agent_modes`, `datamachine_agent_mode_{slug}`). One filter call, fully backward-compatible: DM runs unchanged when no callbacks are registered.
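The backward-compatibility and guard behavior can be sketched in isolation. The helper below is hypothetical and not part of DM; it restates the `is_array()` check as a pure function so the three cases (no callbacks, a misbehaving callback, a well-behaved callback) are easy to compare. The snapshot contents are made up.

```php
<?php
/**
 * Hypothetical helper mirroring the guard: keep the filtered value only
 * when it is an array, otherwise fall back to the prior snapshot.
 */
function sketch_guard_snapshot( array $engine_snapshot, $filtered_snapshot ): array {
    return is_array( $filtered_snapshot ) ? $filtered_snapshot : $engine_snapshot;
}

$original = array( 'example_key' => 'example_value' );

// No callbacks registered: apply_filters() returns its input unchanged,
// so the guard sees the same array it started with.
$unchanged = sketch_guard_snapshot( $original, $original );

// A misbehaving callback returned null (for example, it forgot to return).
$after_null = sketch_guard_snapshot( $original, null );

// A well-behaved callback returned an enriched array.
$enriched = sketch_guard_snapshot( $original, $original + array( 'extra' => true ) );
```

In the first two cases the result equals `$original`; only the third carries the added `extra` key.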
Downstream consumers
Smoke testing
`php -l` clean. Pattern is structurally identical to the existing `datamachine_directives` filter loop already in DM; no new code paths.
End-to-end smoke (a filter callback that mutates `engine_data` and an AI step reading it back) is exercised by the companion DMC PR's projector against the artist-platform docs-agent pilot.
Scope
One file, one filter, 24 net lines (mostly the docblock). No tests added in this PR because the change is purely a passthrough; the actual behavior under test belongs to the downstream consumer.
Refs Extra-Chill/data-machine-code#423
Refs Extra-Chill/extrachill-docs#31