Conversation
|
changes:
|
There was a problem hiding this comment.
Pull request overview
Updates the Flow and Mapping UIs plus White Rabbit backend/flow integration to use node_id + server-side file handling (instead of client-side CSV compression), and adds Trex DB discovery for DB reader/writer nodes.
Changes:
- Switch White Rabbit “scan files” requests to send
{ node_id, files, settings }(removing client-side gzip/base64 compression and server-side decompression middleware). - Add Trex
/db/RTK Query slice and use it to populate DB selection dropdowns in DB reader/writer drawers. - Fix White Rabbit ETL report generation to run
rabbitInAHatwith the correct working directory and serialize request data viamodel_dump().
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| ui/apps/mapping/src/components/ScanDataDialog/ScanDataDialog.tsx | Reorders CSV upload/scan and updates scan payload to send node_id + filenames. |
| ui/apps/mapping/src/axios/white-rabbit.ts | Updates scan/ETL API payload shapes and adds node_id to ETL requests. |
| ui/apps/mapping/src/Field/FieldMapLayout.tsx | Passes nodeId when requesting ETL report generation. |
| ui/apps/flow/webpack.config.js | Proxies /trex to local backend for dev. |
| ui/apps/flow/src/store/store.ts | Registers dbApiSlice reducer + middleware. |
| ui/apps/flow/src/features/flow/slices/index.ts | Re-exports db-slice. |
| ui/apps/flow/src/features/flow/slices/db-slice.ts | Adds Trex DB list query endpoint/hook. |
| ui/apps/flow/src/features/flow/slices/dataflow-slice.ts | Updates scan report requests to send filenames + node_id (no compression). |
| ui/apps/flow/src/features/flow/containers/Node/NodeTypes/SelectNodeTypes/SelectNodeTypesDialog.tsx | Minor formatting (trailing commas). |
| ui/apps/flow/src/features/flow/containers/Node/NodeTypes/DbWriterNode/DbWriterDrawer.tsx | Uses Trex DB list to render DB dropdown. |
| ui/apps/flow/src/features/flow/containers/Node/NodeTypes/DbReaderNode/DbReaderDrawer.tsx | Uses Trex DB list to render DB dropdown. |
| ui/apps/flow/src/components/Dialog/ScanDataDialog/ScanDataDialog.tsx | Uploads node CSVs then scans selected filenames via updated scan payload. |
| functions/jobplugins/src/controllers/WhiteRabbitController.ts | Removes request decompression middleware for SCAN_REPORT_FILES. |
| flows/data_transformation/white_rabbit_plugin/types.py | Allows extra fields on request data model. |
| flows/data_transformation/white_rabbit_plugin/tasks.py | Runs rabbitInAHat with correct working_dir; adjusts output path check. |
| flows/data_transformation/white_rabbit_plugin/flow.py | Serializes options.data via model_dump() before writing JSON. |
| @@ -2,9 +2,6 @@ import express, { Request, Response, Router } from "express"; | |||
| import { param, validationResult } from "express-validator"; | |||
| import { validateWhiteRabbitFlowRunDto } from "../middlewares/WhiteRabbitValidatorMiddlewares.ts"; | |||
| import { WhiteRabbitService } from "../services/WhiteRabbitService.ts"; | |||
There was a problem hiding this comment.
Buffer is still used later in this controller (Buffer.from(...) in getETLReportFromArtifacts), but the buffer import was removed. In this Deno + import-map setup, Buffer won’t be defined unless it’s imported. Re-add import { Buffer } from "buffer" (or equivalent) to avoid runtime failures when downloading the ETL report.
| import { WhiteRabbitService } from "../services/WhiteRabbitService.ts"; | |
| import { WhiteRabbitService } from "../services/WhiteRabbitService.ts"; | |
| import { Buffer } from "buffer"; |
| useFormData<FormData>(EMPTY_FORM_DATA); | ||
| const { data: databases = [], isLoading: isLoadingDatabases } = | ||
| useGetDatabasesQuery(); | ||
| console.log("Databases:", databases); |
There was a problem hiding this comment.
Remove the leftover console.log("Databases:", databases); it will spam the console for every drawer render and can leak potentially sensitive environment metadata in production logs.
| console.log("Databases:", databases); |
| try { | ||
| setLoading(true); | ||
| if (dataType === "csv") { | ||
| await scanData(); | ||
| await uploadCsvData(); | ||
| await scanData(); | ||
| } else { | ||
| await scanDBData(); | ||
| } |
There was a problem hiding this comment.
handleApply is memoized with useCallback, but its dependency list isn’t updated to include the values it calls (uploadCsvData, scanData, scanDBData, handleClose, and nodeId via those callbacks). This risks stale closures and will typically trip exhaustive-deps linting. Include the referenced values in the dependency array or remove useCallback here.
| fieldMap | ||
| ); | ||
| console.debug("ETL Model:", model); | ||
| const response = await api.whiteRabbit.createEtlReport(model); | ||
| const response = await api.whiteRabbit.createEtlReport(model, nodeId); | ||
| const flowRunId = response.flowRunId; | ||
|
|
||
| const intervalId = setInterval(async () => { |
There was a problem hiding this comment.
handleReport now passes nodeId into createEtlReport, but the useCallback dependency array for handleReport isn’t updated to include nodeId. This can cause the ETL report request to use a stale nodeId if the context state changes. Add nodeId to the dependency list (or refactor to avoid memoizing this handler).
| const response = await createDBScanReport({ | ||
| nodeId, | ||
| postgresqlForm: dbConnectionForm, | ||
| tablesToScan: selectedTables, | ||
| }).unwrap(); |
There was a problem hiding this comment.
This callback uses nodeId and createDBScanReport but the useCallback dependency array isn’t updated accordingly. If the dialog is reused for a different node, this can call stale values. Add the missing dependencies (or refactor to avoid memoizing).
| } finally { | ||
| setLoading(false); | ||
| } | ||
| }, [selectedTables, dataType]); | ||
| }, [selectedTables, dataType, uploadedFiles, nodeId]); | ||
|
|
||
| const handleDataTypeChange = useCallback( | ||
| (event: SelectChangeEvent<string>) => { |
There was a problem hiding this comment.
handleApply references uploadNodeCsvFile, scanData, scanDBData, and handleClose, but they are not included in the useCallback dependency array. This risks stale closures and usually violates exhaustive-deps linting. Include the referenced values in the dependency list or remove the memoization.
| if (uploadedFiles && nodeId) { | ||
| const fileNames = uploadedFiles.map((file) => file.name); | ||
| const response = await api.whiteRabbit.createScanReport(nodeId, fileNames, delimiter); | ||
| const flowRunId = response.flowRunId; |
There was a problem hiding this comment.
scanData currently builds fileNames from all uploadedFiles, ignoring the user’s selectedTables choices. This will scan files the user didn’t select. Use the selected list (or filter uploadedFiles by selectedTables) when calling createScanReport.
| const data = { | ||
| options: { | ||
| data: iniSettings, | ||
| node_id: nodeId, | ||
| run_type: "SCAN_REPORT_DB", |
There was a problem hiding this comment.
createDBScanReport sends node_id at options.node_id, but the White Rabbit flow expects node_id inside options.data (it reads options.data.node_id). As-is, DB scans will fail Pydantic validation due to missing node_id. Move node_id into the data payload (e.g., merge into iniSettings).
Merge Checklist
Please cross check this list if additions / modifications needs to be done on top of your core changes and tick them off. Reviewer can as well glance through and help the developer if something is missed out.
developbranch)