Problem
_connect.py:298-303 catches the NotImplementedError raised by set_timezone_to_utc(), logs it only at DEBUG level, and proceeds normally. BigQuery and ClickHouse both raise here.
This is silent data corruption. When timestamps are compared across databases and one session is in a non-UTC timezone, the bisection algorithm produces wrong ranges and phantom diffs or, worse, masks real diffs.
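To make the failure mode concrete, here is a minimal sketch (not taken from data_diff) of how one stored instant renders differently under two session timezones. The helper `render_in_session` and the UTC-5 offset are illustrative assumptions; fixed-offset zones are used so the snippet runs without a timezone database.

```python
from datetime import datetime, timedelta, timezone

# One instant, as both databases store it internally.
instant = datetime(2024, 3, 10, 12, 0, 0, tzinfo=timezone.utc)

# Hypothetical session zones: one connection was switched to UTC, the other
# stayed at a UTC-5 default because set_timezone_to_utc() raised.
utc_session = timezone.utc
local_session = timezone(timedelta(hours=-5))


def render_in_session(ts: datetime, session_tz: timezone) -> str:
    """Render ts as the naive text a session in session_tz would return."""
    return ts.astimezone(session_tz).replace(tzinfo=None).isoformat(sep=" ")


a = render_in_session(instant, utc_session)
b = render_in_session(instant, local_session)

# Same instant, different text: a value-level comparison reports a diff
# that does not exist in the underlying data.
phantom_diff = a != b
```

The reverse case follows the same way: two different instants that are five hours apart would render identically, hiding a real diff.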
Scope
- BigQuery (bigquery.py:154): Implement using SET @@time_zone = 'UTC' session variable
- ClickHouse (clickhouse.py:98): Implement using SET session_timezone = 'UTC'
- _connect.py:298-303: Elevate the caught NotImplementedError log from DEBUG to WARNING. Consider adding a --require-utc flag that makes this an error.
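The three scope items could look roughly like the sketch below. It is not the actual data_diff code: the class names, `apply_utc`, the `run_query` callback, and the `require_utc` parameter are hypothetical, and it assumes `set_timezone_to_utc()` returns a SQL string for the connection to execute. Only the two SET statements come from this issue.

```python
import logging

logger = logging.getLogger("data_diff.sketch")


class BigQueryDialectSketch:
    def set_timezone_to_utc(self) -> str:
        # Statement proposed in this issue for BigQuery sessions.
        return "SET @@time_zone = 'UTC'"


class ClickhouseDialectSketch:
    def set_timezone_to_utc(self) -> str:
        # Statement proposed in this issue for ClickHouse sessions.
        return "SET session_timezone = 'UTC'"


class UnsupportedDialectSketch:
    def set_timezone_to_utc(self) -> str:
        raise NotImplementedError("no session timezone statement for this dialect")


def apply_utc(dialect, run_query, require_utc: bool = False) -> bool:
    """Try to switch the session to UTC; return True if the statement ran."""
    try:
        statement = dialect.set_timezone_to_utc()
    except NotImplementedError as exc:
        if require_utc:
            # Hypothetical --require-utc behaviour: fail instead of continuing.
            raise RuntimeError("session timezone could not be set to UTC") from exc
        # WARNING instead of DEBUG, so the user sees that results may be skewed.
        logger.warning("Could not set session timezone to UTC; timestamp diffs may be wrong")
        return False
    run_query(statement)
    return True
```

Keeping the strict mode behind an opt-in parameter keeps existing runs against unsupported databases working, while the WARNING makes the risk visible by default.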
Key Files
data_diff/databases/_connect.py:296-304
data_diff/databases/bigquery.py:154
data_diff/databases/clickhouse.py:98
Acceptance Criteria
- BigQuery set_timezone_to_utc() sets session to UTC
- ClickHouse set_timezone_to_utc() sets session to UTC