Add metric view support to OSS UC server and Spark connector #1

Open
chenwang-databricks wants to merge 26 commits into main from metric-view-support

@chenwang-databricks chenwang-databricks commented Mar 20, 2026

PR Checklist

  • A description of the changes is added to the description of this PR.
  • If there is a related issue, make sure it is linked to this PR.
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added or modified a feature, documentation in docs is updated

Description of changes

Summary

This PR adds end-to-end support for Spark metric views in Unity Catalog, spanning the OpenAPI spec, server persistence, and the Spark connector. It uses invoker's rights (not definer's rights) -- the querying user must have SELECT on both the metric view and all source tables.

Permission Model: Invoker's Rights

Per UC TLG decision, external/untrusted engines (including OSS Spark) cannot be trusted to enforce definer's rights. This is consistent with how Databricks UC handles non-PE (Single User) clusters. The user's own permissions are checked directly on each table -- no MAPS (metadata-and-permissions-snapshot) endpoint, no credential caching, no view-mediated authorization bypass.
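The invoker's rights rule can be illustrated with a minimal sketch. This is not the server's authorization code: the class name, the `canQuery` method, and the grant map are hypothetical stand-ins, assuming SELECT grants are available as a map from securable name to grantees.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: class, method, and parameter names are illustrative
// and are not part of the UC server or connector.
final class InvokerRightsSketch {

    // Returns true only if the user holds SELECT on the metric view itself
    // and on every source table it reads from.
    static boolean canQuery(String user,
                            String metricView,
                            List<String> sourceTables,
                            Map<String, Set<String>> selectGrants) {
        if (!hasSelect(user, metricView, selectGrants)) {
            return false;
        }
        for (String table : sourceTables) {
            if (!hasSelect(user, table, selectGrants)) {
                return false;
            }
        }
        return true;
    }

    // selectGrants maps a securable's full name to the users granted SELECT on it.
    private static boolean hasSelect(String user, String securable,
                                     Map<String, Set<String>> selectGrants) {
        Set<String> grantees = selectGrants.get(securable);
        return grantees != null && grantees.contains(user);
    }
}
```

A user with SELECT on the view but not on a source table is rejected, which matches the negative credential test listed under Tests.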

Server Changes

OpenAPI spec (api/all.yaml)

  • Added METRIC_VIEW to TableType enum
  • Added view_definition (string) and view_dependencies (DependencyList) to CreateTable and TableInfo schemas
  • Added DependencyList, Dependency, TableDependency, FunctionDependency schemas
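For orientation, a metric view create request built from these fields might be shaped as below. This is a sketch under assumptions: the nesting of `view_dependencies` (a `dependencies` list whose entries carry a `table` object with `table_full_name`) is assumed here and should be checked against `api/all.yaml`; the helper class is hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that assembles a CreateTable-style payload as nested maps.
// Field nesting under view_dependencies is an assumption, not taken from the spec.
final class MetricViewPayloadSketch {

    static Map<String, Object> createTableRequest(String catalog,
                                                  String schema,
                                                  String name,
                                                  String yamlDefinition,
                                                  List<String> sourceTables) {
        // One dependency entry per source table.
        List<Map<String, Object>> dependencies = new ArrayList<>();
        for (String fullName : sourceTables) {
            Map<String, Object> tableRef = new LinkedHashMap<>();
            tableRef.put("table_full_name", fullName);
            Map<String, Object> dependency = new LinkedHashMap<>();
            dependency.put("table", tableRef);
            dependencies.add(dependency);
        }
        Map<String, Object> dependencyList = new LinkedHashMap<>();
        dependencyList.put("dependencies", dependencies);

        Map<String, Object> request = new LinkedHashMap<>();
        request.put("catalog_name", catalog);
        request.put("schema_name", schema);
        request.put("name", name);
        request.put("table_type", "METRIC_VIEW");
        request.put("view_definition", yamlDefinition); // the metric view YAML text
        request.put("view_dependencies", dependencyList);
        return request;
    }
}
```

Serializing the returned map with any JSON library would give a request body carrying the three metric-view-specific fields next to the usual table identifiers.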

Dependency storage

  • New DependencyDAO -- Hibernate entity for the uc_dependencies table, storing view-to-source-table dependency relationships
  • New DependencyRepository -- CRUD operations for dependency records
  • HibernateConfigurator -- registered DependencyDAO for automatic table creation

Table repository (TableRepository.java)

  • createTable() -- handles METRIC_VIEW table type: stores view definition, schema, and persists dependencies
  • getTable() -- for METRIC_VIEW tables, loads and attaches dependency information to the response
  • deleteTable() -- cascading deletion of associated dependency records
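The create/get/delete behavior above can be sketched with an in-memory stand-in. The real implementation uses Hibernate and the uc_dependencies table; the class below is hypothetical and only illustrates that dependencies are written on create, returned on get, and removed together with the view on delete.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for TableRepository plus DependencyRepository.
final class InMemoryMetricViewStore {
    private final Map<String, String> definitions = new HashMap<>();        // view name -> YAML
    private final Map<String, List<String>> dependencies = new HashMap<>(); // view name -> source tables

    // Stores the view definition and its dependency rows together.
    void createMetricView(String fullName, String yaml, List<String> sourceTables) {
        if (definitions.containsKey(fullName)) {
            throw new IllegalStateException("already exists: " + fullName);
        }
        definitions.put(fullName, yaml);
        dependencies.put(fullName, new ArrayList<>(sourceTables));
    }

    // Returns the dependencies that would be attached to the get response.
    List<String> getDependencies(String fullName) {
        if (!definitions.containsKey(fullName)) {
            throw new IllegalArgumentException("not found: " + fullName);
        }
        return List.copyOf(dependencies.get(fullName));
    }

    // Deleting the view also deletes its dependency rows (the cascade).
    void deleteMetricView(String fullName) {
        definitions.remove(fullName);
        dependencies.remove(fullName);
    }

    boolean hasDependencyRows(String fullName) {
        return dependencies.containsKey(fullName);
    }
}
```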

Tests

  • BaseMetricViewCRUDTest / SdkMetricViewCRUDTest -- integration tests for metric view CRUD with dependency tracking (2 tests)
  • SdkMetricViewAccessControlTest -- invoker's rights access control tests (4 tests):
    • User with SELECT on both view and source table can get credentials (positive)
    • User with SELECT on view but NOT source table is denied credentials (negative)
    • User with SELECT on view can read metric view metadata (positive)
    • User without SELECT on view is denied metadata access (negative)

Connector Changes

UCSingleCatalog.scala

  • createTable() -- detects table_type=METRIC_VIEW and routes to dedicated createMetricViewFromTableInfo() method
  • createMetricViewFromTableInfo() -- constructs CreateTable request with METRIC_VIEW type, view definition, schema, and dependencies
  • loadTable() -- detects METRIC_VIEW type and routes to loadMetricView(), which returns a V1Table with CatalogTableType.METRIC_VIEW and the YAML as viewText
  • Standard invoker's rights: no MAPS call, no ThreadLocal cache, no Dependent parameter -- source tables are loaded via normal getTable and credentials are vended with the user's own permissions

Companion PRs

- Add view_dependencies field to CreateTable and TableInfo in OpenAPI spec
- Add dependent parameter to GenerateTemporaryTableCredential for
  view-mediated authorization (definer's rights model)
- Create DependencyDAO and DependencyRepository for persisting view
  dependencies in uc_dependencies table
- Update TableRepository to handle METRIC_VIEW table type with
  dependency storage/retrieval on create, get, and delete
- Update TemporaryTableCredentialsService to support view-mediated
  credential vending via the dependent parameter
- Make AuthorizeKey repeatable for multiple authorization parameters
- Add BaseMetricViewCRUDTest and SdkMetricViewCRUDTest for metric view
  CRUD integration testing
- Add MetricViewContext thread-local for passing metric view identity
  during credential vending (definer's rights model)
- Handle METRIC_VIEW table type in UCSingleCatalog.createTable to route
  metric view creation through UCProxy with proper metadata
- Add createMetricView to UCProxy for constructing METRIC_VIEW CreateTable
  requests with view_definition, schema, and dependency properties
- Update build.sbt to support building against Spark 4.2.0-SNAPSHOT with
  unmanaged JARs from assembly directory
- Fix Guava import references: replace org.sparkproject.guava with
  com.google.common across connector Java files to support Spark 4.2
  where Guava is no longer shaded
…ctured TableInfo

- Replace MetricViewContext thread-local with Spark's AnalysisContext.metricViewId
- Use CatalogTableType.METRIC_VIEW instead of view.viewWithMetrics property
- Handle createTable(ident, TableInfo) for metric views with structured dependencies
- Populate view dependencies in listTables for METRIC_VIEW type

Implement GET /tables/{full_name}/metadata-snapshot that returns the
metric view metadata plus resolved source table metadata in a single
call. This allows the Spark connector to resolve source table metadata
without requiring direct SELECT on source tables (definer's rights for
metadata access).

Server changes:
- Add MetadataSnapshot model and endpoint to OpenAPI spec
- Implement getMetadataSnapshot in TableRepository with dependency resolution
- Add authorized endpoint in TableService (SELECT on metric view only)
- Fix backtick typo in handleDependentCredentialRequest method name

Connector changes:
- Add metadataSnapshotCache to UCSingleCatalog companion object
- Call metadata snapshot API in loadMetricView and cache source table info
- Check cache in loadTable before calling getTable (consume-once semantics)

Tests:
- Add SdkMetricViewAccessControlTest with 8 test cases covering both
  credential vending (4 tests) and metadata snapshot (4 tests) permissions
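The consume-once cache semantics described in this commit could look roughly like the sketch below. The class is hypothetical and does not reproduce the connector's metadataSnapshotCache; it only shows an entry being removed by the first successful read. Per the later commits in this PR, the cache was removed again when the design moved to invoker's rights.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical consume-once cache: each entry can be read at most one time.
final class ConsumeOnceCache<V> {
    private final Map<String, V> entries = new ConcurrentHashMap<>();

    void put(String key, V value) {
        entries.put(key, value);
    }

    // remove() returns the previous value and deletes the entry atomically,
    // so a second take() for the same key yields an empty Optional.
    Optional<V> take(String key) {
        return Optional.ofNullable(entries.remove(key));
    }
}
```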
…mpatibility

Replace the custom GET /metadata-snapshot endpoint with the standard
POST /metadata-and-permissions-snapshot (MAPS) endpoint to align with
Databricks UC wire format. This enables the OSS connector to work
against both OSS UC and Databricks UC without code changes.

Key changes:
- OpenAPI: new MAPS endpoint with nested response shape
  (MetadataAndPermissionsSnapshotResponse wrapping MetadataSnapshotResponse),
  rename TableResult.missing_reason to reason, add Dependent/TableDependent
  schemas for structured credential dependent field
- Server: new MetadataSnapshotService handler for MAPS, remove old
  per-table GET endpoint from TableService, update credential handler
  to parse nested Dependent structure
- Connector: call getMetadataAndPermissionsSnapshot, unwrap nested
  response, construct structured Dependent for credential vending,
  convert metadataSnapshotCache to ThreadLocal for thread safety
- Tests: update all 9 integration tests in SdkMetricViewAccessControlTest

Per UC TLG decision, external/untrusted engines cannot be trusted to
enforce definer's rights. OSS Spark uses invoker's rights: users must
have SELECT on both the metric view and all source tables.

Removed:
- MAPS endpoint (MetadataSnapshotService, route, OpenAPI schemas)
- Structured Dependent/TableDependent types from OpenAPI spec
- dependent parameter from GenerateTemporaryTableCredential
- Definer's rights logic in TemporaryTableCredentialsService
- ThreadLocal metadataSnapshotCache in connector
- MAPS call and AnalysisContext.setMetricViewId in loadMetricView
- Dependent construction in credential vending

Kept:
- view_definition and view_dependencies on CreateTable/TableInfo
- uc_dependencies table and DependencyDAO
- DependencyList/Dependency/TableDependency/FunctionDependency schemas
- Metric view CRUD and backward compatibility

Tests: Rewrote SdkMetricViewAccessControlTest from 9 definer's rights
tests to 4 invoker's rights tests (all passing).

The @Repeatable annotation and AuthorizeKeys container were only needed
for the definer's rights implementation (multiple @AuthorizeKey on the
credential vending parameter). No longer needed with invoker's rights.

The original file had no functional changes from definer's rights removal
-- only cosmetic diffs (import reordering, whitespace, unused authorizer
parameter). Revert to main version to keep the diff clean.

With invoker's rights, metric views use the same standard permission
checks as regular tables. There is no metric-view-specific permission
logic to test -- the existing table access control tests already cover
this behavior.

Revert Guava import changes (org.sparkproject.guava -> com.google.common)
and credential vending reformatting that were unrelated to metric views.
The connector diff now only contains metric view additions:
- createTable(TableInfo) override with METRIC_VIEW routing
- loadMetricView method returning V1Table with YAML
- createMetricViewFromTableInfo for V2 catalog create path
- METRIC_VIEW detection in loadTable

OSS UC no longer persists view dependencies (no uc_dependencies table).
Under invoker's rights, Spark resolves dependencies at query time by
parsing the YAML. The view_dependencies field is accepted in CreateTable
payloads for wire compatibility with Databricks UC but not persisted.

Removed:
- DependencyDAO.java and DependencyRepository.java
- DependencyRepository from Repositories.java
- DependencyDAO from HibernateConfigurator
- Dependency create/read/delete logic in TableRepository
- Dependency assertions in BaseMetricViewCRUDTest (now tests that
  view_dependencies is accepted without error)

Remove metric-view-specific branching in createTable(ident, tableInfo).
The method now forwards all TableInfo fields (tableType, viewDefinition,
viewDependencies, columns, properties) to the UC server generically,
working for any table type without type-specific conditional logic.

Now that DelegatingCatalogExtension forwards createTable(Identifier,
TableInfo) properly (Spark PR), the override can live on UCProxy like
all other TableCatalog methods. UCSingleCatalog simply delegates to
the delegate chain without casting or bypassing DeltaCatalog.

Factor out common logic between createTable(TableInfo) and
createTable(StructType, ...) into two shared helpers:
- initCreateTable: sets name, schema, catalog, comment, properties
- convertColumns: converts Spark Column[] to UC ColumnInfo[]

Both createTable overloads now use the same base initialization,
ensuring consistent behavior for common fields. The TableInfo overload
adds tableType, viewDefinition, and viewDependencies on top; the legacy
overload adds storage location, data source format, and partitions.

The old createTable(StructType, Transform[], Map) now converts its
arguments into a TableInfo and calls createTable(Identifier, TableInfo).
All table creation logic lives in one method that handles all fields:
tableType, viewDefinition, viewDependencies, columns, partitions,
storage location, data source format, and properties.
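The delegation described in this commit can be sketched as follows. The types are hypothetical simplifications: `TableSpec` stands in for Spark's TableInfo, strings stand in for columns and partition transforms, and the `MANAGED` default for the legacy path is an assumption made for the example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of "legacy overload converts its arguments and delegates".
final class CreateTableSketch {

    // Simplified stand-in for a structured table description.
    static final class TableSpec {
        final String tableType;
        final List<String> columns;
        final List<String> partitions;
        final Map<String, String> properties;
        final String viewDefinition; // null for tables that are not views

        TableSpec(String tableType, List<String> columns, List<String> partitions,
                  Map<String, String> properties, String viewDefinition) {
            this.tableType = tableType;
            this.columns = columns;
            this.partitions = partitions;
            this.properties = properties;
            this.viewDefinition = viewDefinition;
        }
    }

    // Legacy-style overload: wraps its arguments and calls the structured overload.
    static Map<String, Object> createTable(String ident, List<String> columns,
                                           List<String> partitions,
                                           Map<String, String> properties) {
        TableSpec spec = new TableSpec("MANAGED", columns, partitions, properties, null);
        return createTable(ident, spec);
    }

    // Single place where the request is assembled for every table type.
    static Map<String, Object> createTable(String ident, TableSpec spec) {
        Map<String, Object> request = new LinkedHashMap<>();
        request.put("name", ident);
        request.put("table_type", spec.tableType);
        request.put("columns", spec.columns);
        request.put("partitions", spec.partitions);
        request.put("properties", spec.properties);
        if (spec.viewDefinition != null) {
            request.put("view_definition", spec.viewDefinition);
        }
        return request;
    }
}
```

Because both entry points end in the same method, a field added to the request later only needs to be handled once.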
Merge loadMetricView into loadTable by branching on whether the table
has storage (storageLocation != null) rather than checking table type.
Tables with storage get credential vending, partition extraction, and
storage format. Tables without storage (metric views, future view
types) get empty storage, viewText from viewDefinition, and skip
credential vending. No more type-specific method.
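A minimal sketch of the storage-based branch, with hypothetical names: `Loaded` stands in for the V1Table the connector returns, and the boolean flag stands in for the credential vending call.

```java
// Hypothetical sketch: branch on storage presence rather than on table type.
final class LoadTableSketch {

    // Simplified stand-in for the loaded table.
    static final class Loaded {
        final String location;           // empty when the table has no storage
        final String viewText;           // null when the table has storage
        final boolean credentialsVended; // whether credential vending was performed

        Loaded(String location, String viewText, boolean credentialsVended) {
            this.location = location;
            this.viewText = viewText;
            this.credentialsVended = credentialsVended;
        }
    }

    static Loaded load(String storageLocation, String viewDefinition) {
        if (storageLocation != null) {
            // Tables with storage: vend credentials and keep the location.
            return new Loaded(storageLocation, null, true);
        }
        // Tables without storage (metric views): empty storage, definition as view text.
        return new Loaded("", viewDefinition, false);
    }
}
```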
Re-introduce uc_dependencies persistence so that view_dependencies
round-trips through the API: what clients send on create is persisted
and returned on read. This makes the API self-consistent and enables
non-Spark clients to discover metric view dependencies.

Restored:
- DependencyDAO and DependencyRepository
- DependencyRepository in Repositories and HibernateConfigurator
- Dependency create/read/delete logic in TableRepository (via shared
  attachDependencies helper)
- Dependency assertions in BaseMetricViewCRUDTest

Also removed unnecessary blank line in UCSingleCatalog.scala.

Server now validates that view_dependencies is provided when creating
a metric view, matching how view_definition is already required.

Rewrote tests:
- testMetricViewCRUD: includes dependencies in payload (mimics Spark),
  verifies dependency round-trip on GET, tests full CRUD lifecycle
- testMetricViewWithSqlSource: tests SQL statement as source with
  dependencies
- testCreateMetricViewWithoutDefinitionFails: negative test
- testCreateMetricViewWithoutDependenciesFails: negative test
- Fixed VIEW_DEFINITION to use YAML format instead of SQL
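The required-field check exercised by the two negative tests might look like the sketch below. The class, method signature, and error messages are hypothetical. The commit does not say whether an empty dependency list counts as missing, so this sketch rejects only a null one.

```java
import java.util.List;

// Hypothetical validation sketch for metric view creation.
final class MetricViewValidationSketch {

    // Throws if a METRIC_VIEW create request lacks a definition or dependencies.
    // Other table types pass through unchanged.
    static void validate(String tableType, String viewDefinition, List<String> viewDependencies) {
        if (!"METRIC_VIEW".equals(tableType)) {
            return;
        }
        if (viewDefinition == null || viewDefinition.trim().isEmpty()) {
            throw new IllegalArgumentException("view_definition is required for METRIC_VIEW");
        }
        if (viewDependencies == null) {
            throw new IllegalArgumentException("view_dependencies is required for METRIC_VIEW");
        }
    }
}
```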
UDFs are not supported in metric view expressions (the OSS connector
does not implement FunctionCatalog), so function dependencies are not
applicable. Removed:
- FunctionDependency schema from all.yaml
- function field from Dependency schema
- FunctionDependency case in connector dependency conversion
- FUNCTION handling in DependencyDAO.from()