Symptom

On the production LDS API (lightningdotspacecom/lds-api:latest, deployed at lightning.space, dfxprd Postgres lds-api DB), two queries fail every ~60 s on the database side. They have been observed continuously since the Azure → dfxprd Stage 2.2 cutover (2026-05-26). DEV is silent; only PRD logs them.
Rates (3 h sample, Loki):

| Container | Errors / 20 min | Cadence |
| --- | --- | --- |
| lds-api-postgres-1 | 40 | 2 distinct queries × ~1/min |
The lds-api container itself returns 201 Created to the HTTP caller; only Postgres surfaces the error in its log. Functional impact is limited to whatever the caller does with the empty result (no customer-visible LDS endpoint is broken), but the noise drowns out real PG errors and indicates that the publish/pull contract is out of sync.
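Reproduction (verbatim from lds-api-postgres-1 log)

Query A: relation "asset" does not exist

Query B: column monitoring_balance.rootstockbalance does not exist. Postgres response:

ERROR: column monitoring_balance.rootstockbalance does not exist at character 852
HINT: Perhaps you meant to reference the column "monitoring_balance.rootstockBalance".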
Root cause
Both queries match the exact shape produced by SupportService.getRawData() in src/subdomains/support/services/support.service.ts:90-105, i.e. POST /v1/support/db (SupportController.getRawData, ADMIN role) accepts an arbitrary { table, select, where, join } from a Bearer-authenticated caller. The endpoint is the data-pull contract that an external consumer (almost certainly DFX-API PRD) polls every 60 s to mirror lds-api rows.
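For orientation, the two failing requests can be sketched as payloads of this shape. This is only an illustration: the field names come from the description above, while the field types, the `RawDataRequest` name, and the `table` value of the second payload are assumptions, not taken from the repository.

```typescript
// Hypothetical request body for POST /v1/support/db.
// Field names follow the issue text; the types are assumptions.
interface RawDataRequest {
  table: string;
  select?: string[];
  where?: string;
  join?: string;
}

// Query A: references a table that was renamed to asset_account.
const queryA: RawDataRequest = { table: "asset" };

// Query B: selects a mixed-case column without quoting it.
// The table name is inferred from the column prefix (assumption).
const queryB: RawDataRequest = {
  table: "monitoring_balance",
  select: ["monitoring_balance.rootstockBalance"],
};

console.log(JSON.stringify([queryA, queryB]));
```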
The PRD caller is hardcoded to two stale schema entries:

| Caller-supplied | Current schema state | Removed in |
| --- | --- | --- |
| table: "asset" | renamed to asset_account | migration 1715583133193-setupFrankencoinPay.js (ALTER TABLE "asset" RENAME TO "asset_account"), 2024-05-13 |
| select: ["monitoring_balance.rootstockBalance"] (unquoted) | column still exists in DB, but no longer in MonitoringBalanceEntity (src/subdomains/monitoring/entities/monitoring-balance.entity.ts); Rootstock client/service/module/enum deleted | |
So the bug is dual-sided:

1. Schema drift: the caller's column list has not been updated across two schema changes (one ~2 years old, one ~3 months old). This is a configuration-side bug, but lds-api is where the failure surfaces.
2. Identifier quoting: even if rootstockBalance were still in the schema, the caller's payload is monitoring_balance.rootstockBalance (unquoted), which Postgres case-folds to rootstockbalance. The DDL column is "rootstockBalance" (mixed-case, created as such in migration 1762853446614-changeMonitoringColumns.js). Postgres' own hint shows it knows the right name: Perhaps you meant to reference the column "monitoring_balance.rootstockBalance". This is the same class of bug that PR #184 ("fix: remove non-existent version column from chainSwaps query") fixed for chainSwaps.version.
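The case-folding behaviour described above can be illustrated with a small sketch. The helpers `foldIdentifier` and `quoteIdentifier` are hypothetical names introduced here, not functions from lds-api, and the dot-splitting is a simplification that ignores quoted identifiers containing a dot.

```typescript
// Emulates how Postgres resolves a dotted identifier path:
// unquoted parts are folded to lower case, double-quoted parts keep their case.
function foldIdentifier(path: string): string {
  return path
    .split(".")
    .map((part) =>
      part.length >= 2 && part.startsWith('"') && part.endsWith('"')
        ? part.slice(1, -1).replace(/""/g, '"')
        : part.toLowerCase(),
    )
    .join(".");
}

// Wraps every part of a dotted path in double quotes,
// doubling any embedded double quote.
function quoteIdentifier(path: string): string {
  return path
    .split(".")
    .map((part) => `"${part.replace(/"/g, '""')}"`)
    .join(".");
}

const raw = "monitoring_balance.rootstockBalance";
console.log(foldIdentifier(raw)); // monitoring_balance.rootstockbalance
console.log(quoteIdentifier(raw)); // "monitoring_balance"."rootstockBalance"
console.log(foldIdentifier(quoteIdentifier(raw))); // monitoring_balance.rootstockBalance
```

The first output matches the name in the Postgres error; the quoted form is the one that resolves to the mixed-case DDL column.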
DEV vs PRD asymmetry
{container_name="lds-api-postgres-1", server="dfxdev"} for the same 3 h: 0 errors. The DEV consumer either polls with the correct (updated) table list, or does not poll dfxdev at all. The mismatch is exclusively in the PRD caller config.
Impact
Severity: medium-low. No customer endpoint returns wrong data. The PRD consumer (probably DFX-API → DFX core DB) is silently failing to mirror two tables; downstream reports on asset and monitoring_balance.rootstockBalance go stale.
Noise: ~6 PG-side errors per minute, ~8 600 / day. Drowns out real errors in monitoring (60 % of all detected_level=error lines in lds-api-postgres on PRD).
Not a deploy-time regression of the Stage 2.2 cutover — the same queries failed on Azure MSSQL too (likely with a different error). The cutover just made the noise visible in Loki because the new dfxprd Postgres logs at ERROR severity.
Suggested fixes (any combination, not exclusive)
1. Owner: caller side (DFX-API PRD). Update the polling config: drop asset (use asset_account if mirroring is still wanted) and drop monitoring_balance.rootstockBalance from the select list. This is the actual root-cause fix; no lds-api change is strictly required.
2. Owner: this repo, server-side hardening of /v1/support/db. Reject requests that reference a non-existent table or column with 400 before hitting Postgres, so the failure surfaces on the HTTP-response side and stops polluting PG logs. The current .catch(...) re-throws as BadRequestException, but only after the SQL has already been parsed and rejected by PG, so the PG log line still gets written. Pre-validation against information_schema.tables / information_schema.columns would both prevent the PG error and give the caller a usable diagnostic.
3. Identifier quoting. When the select list contains mixed-case names, wrap them in "…" so Postgres does not case-fold. Same class of fix as PR #184 ("fix: remove non-existent version column from chainSwaps query", chainSwaps.version).

(1) alone fixes the immediate noise. (2) + (3) make the endpoint robust against future schema renames so a stale caller fails fast and audibly.
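The pre-validation idea in (2) could look roughly like the following sketch. It is not the repository's code: `SchemaSnapshot`, `RawDataRequest`, and `validateRequest` are hypothetical names, the snapshot is assumed to be loaded from information_schema.columns by some means left open here, and the `id` columns in the example are placeholders.

```typescript
// Snapshot of the live schema: table name -> exact-case column names.
// Assumption: populated elsewhere from information_schema.columns.
type SchemaSnapshot = Map<string, Set<string>>;

interface RawDataRequest {
  table: string;
  select?: string[];
}

// Returns a list of problems; an empty list means the request may proceed.
function validateRequest(schema: SchemaSnapshot, req: RawDataRequest): string[] {
  const errors: string[] = [];
  if (!schema.has(req.table)) {
    errors.push(`unknown table "${req.table}"`);
  }
  for (const entry of req.select ?? []) {
    // Unqualified names are resolved against the requested table.
    const dot = entry.indexOf(".");
    const table = dot === -1 ? req.table : entry.slice(0, dot);
    const column = dot === -1 ? entry : entry.slice(dot + 1);
    const columns = schema.get(table);
    if (columns === undefined) {
      // The requested table itself was already reported above.
      if (table !== req.table) errors.push(`unknown table "${table}"`);
    } else if (!columns.has(column)) {
      errors.push(`unknown column "${table}"."${column}"`);
    }
  }
  return errors;
}

// Example snapshot; the "id" columns are placeholders.
const schema: SchemaSnapshot = new Map([
  ["asset_account", new Set(["id"])],
  ["monitoring_balance", new Set(["id", "rootstockBalance"])],
]);

console.log(validateRequest(schema, { table: "asset" }));
console.log(
  validateRequest(schema, {
    table: "monitoring_balance",
    select: ["monitoring_balance.rootstockBalance"],
  }),
);
```

In the service, a non-empty result would be turned into a BadRequestException before any SQL is sent. Because the column still exists in the database, a check against information_schema accepts rootstockBalance in its exact case, so Query B would still need the quoting from (3) to succeed. Validating against entity metadata instead would reject it.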
Verification once fixed
Expect: 0 (or only sporadic real errors), down from current ~60/30 min.
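One possible way to script this check is sketched below, building a LogQL instant query for Loki's /loki/api/v1/query endpoint. The label value server="dfxprd" is assumed by analogy with the DEV selector above, the base URL is a placeholder, and both helper names are hypothetical.

```typescript
// Builds a LogQL query that counts error-level lines over a time window.
function buildErrorCountQuery(container: string, server: string, window: string): string {
  return `sum(count_over_time({container_name="${container}", server="${server}"} | detected_level="error" [${window}]))`;
}

// Builds the URL for Loki's instant-query HTTP endpoint.
function buildLokiQueryUrl(baseUrl: string, logql: string): string {
  return `${baseUrl}/loki/api/v1/query?query=${encodeURIComponent(logql)}`;
}

// server="dfxprd" and the base URL are assumptions for illustration.
const logql = buildErrorCountQuery("lds-api-postgres-1", "dfxprd", "30m");
const url = buildLokiQueryUrl("http://loki.example:3100", logql);
console.log(logql);
console.log(url);
```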