feat: add nodeId to Dataset, DatasetVersion, and Job API response models #3102

Open
psaikaushik wants to merge 2 commits into MarquezProject:main from psaikaushik:feat/1461-add-nodeId-to-api-models

Conversation

@psaikaushik

Summary

Adds a computed nodeId field to the Dataset, DatasetVersion, and Job API response models. API consumers can pass this nodeId directly to the Lineage API instead of constructing it manually.

Closes #1461

Changes

Dataset.java

Added getNodeId() method that returns NodeId.of(id), producing a nodeId in the format:

dataset:<namespace>:<name>

DatasetVersion.java

Added getNodeId() method that returns a version-qualified nodeId:

dataset:<namespace>:<name>#<version_uuid>

Job.java

Added getNodeId() method that returns NodeId.of(id), producing a nodeId in the format:

job:<namespace>:<name>
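
The three formats above can be sketched in plain Java. This is only an illustration of the string layout: `NodeIdSketch` and its methods are hypothetical stand-ins and do not reproduce the actual `NodeId.of()` implementation in Marquez.

```java
import java.util.UUID;

// Hypothetical helper for illustration only; not the Marquez NodeId class.
final class NodeIdSketch {
  private static final String DELIM = ":";
  private static final String VERSION_DELIM = "#";

  // dataset:<namespace>:<name>
  static String datasetNodeId(String namespace, String name) {
    return "dataset" + DELIM + namespace + DELIM + name;
  }

  // dataset:<namespace>:<name>#<version_uuid>
  static String datasetVersionNodeId(String namespace, String name, UUID version) {
    return datasetNodeId(namespace, name) + VERSION_DELIM + version;
  }

  // job:<namespace>:<name>
  static String jobNodeId(String namespace, String name) {
    return "job" + DELIM + namespace + DELIM + name;
  }

  public static void main(String[] args) {
    UUID version = UUID.randomUUID();
    System.out.println(datasetNodeId("my-namespace", "my-dataset"));
    System.out.println(datasetVersionNodeId("my-namespace", "my-dataset", version));
    System.out.println(jobNodeId("my-namespace", "my-job"));
  }
}
```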

Example API Response (before vs. after)

Before:

{
  "id": { "namespace": "my-namespace", "name": "my-dataset" },
  "type": "DB_TABLE",
  "name": "my-dataset",
  ...
}

After:

{
  "id": { "namespace": "my-namespace", "name": "my-dataset" },
  "type": "DB_TABLE",
  "name": "my-dataset",
  "nodeId": "dataset:my-namespace:my-dataset",
  ...
}
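
A consumer could then take the `nodeId` value from the response and build a lineage request from it. The sketch below assumes the Lineage API is served at `/api/v1/lineage` with a `nodeId` query parameter and that the server runs at `http://localhost:5000`; both are assumptions for illustration, and `LineageQuerySketch` is a hypothetical name.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical client-side helper; endpoint path and base URL are assumptions.
final class LineageQuerySketch {
  // Builds a lineage request URI from a nodeId copied verbatim from an API response.
  static URI lineageUri(String baseUrl, String nodeId) {
    // Percent-encode the nodeId so ':' and '#' are safe inside a query string.
    String encoded = URLEncoder.encode(nodeId, StandardCharsets.UTF_8);
    return URI.create(baseUrl + "/api/v1/lineage?nodeId=" + encoded);
  }

  public static void main(String[] args) {
    System.out.println(lineageUri("http://localhost:5000", "dataset:my-namespace:my-dataset"));
  }
}
```

Encoding matters mostly for version-qualified nodeIds, where an unencoded `#` would otherwise be read as the start of a URI fragment.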

Design Decisions

  • Computed, not stored: nodeId is derived from existing fields (id, namespace, name, version) via getNodeId(), so no database changes are needed.
  • Reuses NodeId class: Leverages the existing NodeId.of() factory methods already used in lineage graph construction.
  • JobVersion: The issue also mentions JobVersion, but there is no dedicated JobVersion service model — job version data is served through the Job model. The nodeId on Job covers this use case.
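
The "computed, not stored" decision can be illustrated with a minimal model whose nodeId is derived from its id on every call. `DatasetKey`, `DatasetModel`, and `ComputedNodeIdDemo` are hypothetical stand-ins, not the Marquez model classes, and the sketch assumes Java 16 or later for records.

```java
// Hypothetical stand-ins for illustration; not the Marquez model classes.
record DatasetKey(String namespace, String name) {}

record DatasetModel(DatasetKey id, String type) {
  // Derived from the id on each call; no extra field or database column backs it.
  String getNodeId() {
    return "dataset:" + id.namespace() + ":" + id.name();
  }
}

final class ComputedNodeIdDemo {
  public static void main(String[] args) {
    DatasetModel dataset =
        new DatasetModel(new DatasetKey("my-namespace", "my-dataset"), "DB_TABLE");
    System.out.println(dataset.getNodeId());
  }
}
```

Because the value is recomputed from the id, it cannot drift out of sync with the namespace and name it describes.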

Tests

Added NodeIdOnModelsTest.java with tests verifying:

  • Dataset nodeId format: dataset:namespace:name
  • DatasetVersion nodeId includes version: dataset:namespace:name#uuid
  • Job nodeId format: job:namespace:name

Adds a computed `nodeId` field to the Dataset, DatasetVersion, and Job
response models. This allows API consumers to directly use the nodeId
to query the Lineage API without having to construct it manually.

- Dataset: returns `dataset:<namespace>:<name>`
- DatasetVersion: returns `dataset:<namespace>:<name>#<version>`
- Job: returns `job:<namespace>:<name>`

Closes MarquezProject#1461
@boring-cyborg boring-cyborg bot added the api API layer changes label Apr 12, 2026
Development

Successfully merging this pull request may close these issues.

Add nodeId to API models
