Possible issue with concurrent access to mutable Map with runningDags in JobScheduler #803

@filiphornak

Describe the bug

This is not a proven bug, but the internal state that tracks running DAGs in JobScheduler is declared as

private val runningDags = mutable.Map.empty[RunningDagsKey, Future[Unit]]
mutable.Map does not guarantee correct synchronization between multiple threads. The current implementation appears to behave synchronously, but the map is still shared between numerous Futures. One example is
.map {
  _.foreach { dag =>
    logger.debug(s"Deploying dag = ${dag.id}")
    runningDags.put(RunningDagsKey(dag.id, dag.workflowId), executors.executeDag(dag))
  }
}
which runs inside a different Future and may therefore execute on another thread.

Making this state thread-safe would also allow workflows to execute continuously instead of depending on the heartbeat cycle, which might improve throughput and reduce delays in the future.

Expected behavior

We have guarantees that the internal state of JobScheduler is consistent across threads.
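
One possible way to get such a guarantee, sketched here under assumptions rather than taken from the project, is to replace mutable.Map with scala.collection.concurrent.TrieMap from the Scala standard library. In the sketch below, RunningDagsRegistry, register and removeCompleted are hypothetical names, and RunningDagsKey is a simplified stand-in with only the two fields used in the snippet above; the real JobScheduler may need a different shape.

```scala
import scala.collection.concurrent.TrieMap
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

// Hypothetical stand-in for the project's key type (assumption: two Long ids).
final case class RunningDagsKey(dagId: Long, workflowId: Long)

// Hypothetical, simplified registry; not the actual JobScheduler code.
final class RunningDagsRegistry {
  // TrieMap is a lock-free concurrent map, safe to share between threads.
  private val runningDags = TrieMap.empty[RunningDagsKey, Future[Unit]]

  // Registers a DAG run; may be called from any Future or thread.
  def register(key: RunningDagsKey, run: Future[Unit]): Unit = {
    runningDags.put(key, run)
    ()
  }

  // Drops entries whose Future has finished. The two-argument remove only
  // deletes the entry if it is still mapped to the same Future, so a run
  // registered concurrently under the same key is not lost.
  def removeCompleted(): Unit =
    runningDags.foreach { case (key, run) =>
      if (run.isCompleted) runningDags.remove(key, run)
    }

  def size: Int = runningDags.size
}

object RunningDagsDemo {
  def main(args: Array[String]): Unit = {
    import scala.concurrent.ExecutionContext.Implicits.global

    val registry = new RunningDagsRegistry
    // Register 1000 distinct keys from Futures running on the global pool.
    val writers = (1 to 1000).map { i =>
      Future(registry.register(RunningDagsKey(i.toLong, 1L), Future.unit))
    }
    Await.result(Future.sequence(writers), 10.seconds)
    println(registry.size) // prints 1000

    registry.removeCompleted()
    println(registry.size) // prints 0, since every Future.unit is already completed
  }
}
```

A concurrent map only makes individual operations safe. Any check-then-act sequence on runningDags (for example "is this workflow already running? if not, start it") would still need an atomic operation such as putIfAbsent or a different design.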

Additional context

The PR for this issue should be merged after the additional trace or debug messages mentioned in issue #802 have been merged.

Labels: backend, bug
