Add Account_Id to insert statement when inserting a record with an Account_Id. #7972

Open
markmckinnon wants to merge 7 commits into sleuthkit:develop from markmckinnon:Fix-account-id-population-in-central-repository

Conversation

@markmckinnon
Contributor

@markmckinnon markmckinnon commented Jul 10, 2025

Update the Account_ID column when a CR record is inserted, if the table has an account_id column. This has not been done since version 4.19.x.

Summary by CodeRabbit

  • Chores
    • Enhanced bulk import logic to adapt per-destination to whether account fields are present, improving robustness and throughput.
  • Bug Fixes
    • Ensures records with missing or invalid account identifiers are bound and inserted correctly during bulk operations, preventing insertion failures and preserving data integrity.

Update Account_ID column if the table has an account_id.
@bcarrier
Member

bcarrier commented Apr 6, 2026

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Apr 6, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai

coderabbitai Bot commented Apr 6, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

commitAttributeInstancesBulk() now precomputes which correlation instance tables include an account_id column, generates per-table INSERT statements with or without account_id, and conditionally binds eamArtifact.getAccountId() or SQL NULL per row only for tables that have the column.

Changes

Cohort / File(s) | Summary
Bulk Insertion Account Support: Core/src/org/sleuthkit/autopsy/centralrepository/datamodel/RdbmsCentralRepo.java | Precompute per-correlation-table boolean for account_id presence; generate two INSERT SQL variants (with/without account_id); bind an extra parameter for account_id only when the target table has that column, using the artifact's account ID or SQL NULL.
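
The per-table INSERT generation described in the walkthrough can be sketched roughly as follows. This is an illustration only, not the code in RdbmsCentralRepo.java: the class name, the `buildInsertSql` and `countPlaceholders` helpers, the table name, and the conflict clause string are all assumptions made for the example, while the column list and subselects are taken from the diff quoted later in this thread.

```java
/** Sketch: choose an INSERT variant per table, depending on whether it has account_id. */
final class BulkInsertSqlSketch {

    /**
     * Builds the INSERT text for one instance table.
     * conflictClause is a stand-in for whatever getConflictClause() returns (assumption).
     */
    static String buildInsertSql(String tableName, boolean tableHasAccount, String conflictClause) {
        String columns = "case_id, data_source_id, value, file_path, known_status, comment, file_obj_id";
        String placeholders = "?, ?, ?, ?, ?";
        if (tableHasAccount) {
            // One extra column and one extra bind parameter (index 9) for account tables.
            columns += ", account_id";
            placeholders += ", ?";
        }
        return "INSERT INTO " + tableName
                + " (" + columns + ") "
                + "VALUES ((SELECT id FROM cases WHERE case_uid=? LIMIT 1), "
                + "(SELECT id FROM data_sources WHERE datasource_obj_id=? AND case_id=? LIMIT 1), "
                + placeholders + ") "
                + conflictClause;
    }

    /** Counts the '?' bind markers in a statement, to check it against the bind calls. */
    static int countPlaceholders(String sql) {
        return (int) sql.chars().filter(c -> c == '?').count();
    }

    public static void main(String[] args) {
        // "example_instances" and the conflict clause are hypothetical values.
        String sql = buildInsertSql("example_instances", true, "ON CONFLICT DO NOTHING");
        System.out.println(sql);
        System.out.println("bind parameters: " + countPlaceholders(sql));
    }
}
```

Building both variants from one column list keeps the placeholder count and the column count in step, which is the property the later review comments depend on when they bind parameter index 9.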

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 I hop through rows with careful art,
Adding accounts when they take part,
If none exist, I leave a space,
Keep bulk inserts in tidy place,
A fluffy dev with nimble heart.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title accurately describes the main change: conditionally adding account_id to INSERT statements when records have account IDs and the target table supports it.
Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

@bcarrier
Member

bcarrier commented Apr 6, 2026

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Apr 6, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
Core/src/org/sleuthkit/autopsy/centralrepository/datamodel/RdbmsCentralRepo.java (1)

1674-1679: Unused variables and redundant computations.

Variables tableName (line 1676) and isAccount (line 1677) are declared and assigned but never used. The same values are computed again inline on line 1678.

♻️ Proposed cleanup
         Map<String, Boolean> correlationHasAccount = new HashMap<>();
         for (CorrelationAttributeInstance.Type artifactType : artifactTypes) {
-           String tableName = correlationTypeToInstanceTableName(artifactType);
-           Boolean isAccount = Boolean.valueOf(correlationAttribHasAnAccount(artifactType)); 
-           correlationHasAccount.put(correlationTypeToInstanceTableName(artifactType), Boolean.valueOf(correlationAttribHasAnAccount(artifactType)));
+           correlationHasAccount.put(
+               correlationTypeToInstanceTableName(artifactType),
+               correlationAttribHasAnAccount(artifactType));
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@Core/src/org/sleuthkit/autopsy/centralrepository/datamodel/RdbmsCentralRepo.java`
around lines 1674 - 1679, The loop in RdbmsCentralRepo builds
correlationHasAccount but declares unused locals tableName and isAccount and
recomputes values; fix by removing the redundant variables or using them:
compute String tableName = correlationTypeToInstanceTableName(artifactType) once
and boolean isAccount = correlationAttribHasAnAccount(artifactType) once, then
call correlationHasAccount.put(tableName, Boolean.valueOf(isAccount)) (or simply
put(tableName, isAccount) if types allow) so no values are recomputed and no
unused variables remain.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@Core/src/org/sleuthkit/autopsy/centralrepository/datamodel/RdbmsCentralRepo.java`:
- Around line 1751-1753: The code risks an NPE by calling bulkPs.setLong(9,
eamArtifact.getAccountId()) without null-checking the Long; update the block
that checks correlationHasAccount.get(tableName) to first retrieve Long
accountId = eamArtifact.getAccountId(), then if accountId != null and accountId
>= 0 call bulkPs.setLong(9, accountId) otherwise call bulkPs.setNull(9,
java.sql.Types.BIGINT) (matching the null/negative handling used in
addArtifactInstance); ensure you reference the same PreparedStatement variable
(bulkPs) and parameter index 9 when making the change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: be68330e-0712-4137-bde3-d713aa99faef

📥 Commits

Reviewing files that changed from the base of the PR and between 81f9ca2 and 0eb32be.

📒 Files selected for processing (1)
  • Core/src/org/sleuthkit/autopsy/centralrepository/datamodel/RdbmsCentralRepo.java

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@Core/src/org/sleuthkit/autopsy/centralrepository/datamodel/RdbmsCentralRepo.java`:
- Around line 1674-1679: The lookup of correlationHasAccount can return null
(because bulkArtifacts via addAttributeInstanceBulk can introduce tableName
values not present when correlationHasAccount is built from
getDefinedCorrelationTypes), so replace direct boolean evaluations of
correlationHasAccount.get(tableName) in RdbmsCentralRepo (locations around the
checks after correlationTypeToInstanceTableName/correlationAttribHasAnAccount)
with a null-safe check using
Boolean.TRUE.equals(correlationHasAccount.get(tableName)); update all
occurrences (including the checks that previously assumed a primitive boolean at
the spots corresponding to the logic around bulkArtifacts and
addAttributeInstanceBulk) to prevent NullPointerException.
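
The failure mode in this comment comes from auto-unboxing a `Boolean` that `Map.get` returned as null. A small sketch, with a hypothetical class name and hypothetical table names, contrasting the unsafe lookup with the `Boolean.TRUE.equals(...)` form suggested above:

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: null-safe versus unsafe lookup in a Map<String, Boolean>. */
final class NullSafeLookupSketch {

    /** Treats a missing (or null-valued) entry as false instead of unboxing it. */
    static boolean tableHasAccount(Map<String, Boolean> correlationHasAccount, String tableName) {
        return Boolean.TRUE.equals(correlationHasAccount.get(tableName));
    }

    /** Unsafe variant: the implicit unboxing throws NullPointerException for a missing key. */
    static boolean unsafeLookup(Map<String, Boolean> correlationHasAccount, String tableName) {
        return correlationHasAccount.get(tableName);
    }

    public static void main(String[] args) {
        Map<String, Boolean> correlationHasAccount = new HashMap<>();
        correlationHasAccount.put("known_instances", Boolean.TRUE); // hypothetical table name

        System.out.println(tableHasAccount(correlationHasAccount, "known_instances"));
        System.out.println(tableHasAccount(correlationHasAccount, "late_added_instances"));
        try {
            unsafeLookup(correlationHasAccount, "late_added_instances");
        } catch (NullPointerException e) {
            System.out.println("unsafe lookup threw NullPointerException");
        }
    }
}
```

`getOrDefault(tableName, false)` behaves the same way for keys that are absent from the map. The two differ only if a key is explicitly mapped to null, in which case `getOrDefault` returns that null.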

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a49416dd-9416-401c-997f-bd81aceabb83

📥 Commits

Reviewing files that changed from the base of the PR and between 0eb32be and ffc394d.

📒 Files selected for processing (1)
  • Core/src/org/sleuthkit/autopsy/centralrepository/datamodel/RdbmsCentralRepo.java

address coderabbitai comments

@coderabbitai coderabbitai Bot left a comment

♻️ Duplicate comments (1)
Core/src/org/sleuthkit/autopsy/centralrepository/datamodel/RdbmsCentralRepo.java (1)

1692-1757: ⚠️ Potential issue | 🔴 Critical

Reuse the null-safe tableHasAccount decision.

addAttributeInstanceBulk() can create new bulkArtifacts entries on the fly, so the lookup at Line 1751 is still nullable. Line 1693 already treats a missing entry as false, but Line 1751 auto-unboxes the same lookup and can still abort the whole batch. Compute the boolean once per table and reuse it for both SQL construction and parameter binding.

🛠️ Proposed fix
-                    String sql;
-                    if (correlationHasAccount.getOrDefault(tableName, false)) {
+                    final boolean tableHasAccount = correlationHasAccount.getOrDefault(tableName, false);
+                    String sql;
+                    if (tableHasAccount) {
                         sql = "INSERT INTO "
                               + tableName
                               + " (case_id, data_source_id, value, file_path, known_status, comment, file_obj_id, account_id) "
                               + "VALUES ((SELECT id FROM cases WHERE case_uid=? LIMIT 1), "
                               + "(SELECT id FROM data_sources WHERE datasource_obj_id=? AND case_id=? LIMIT 1), ?, ?, ?, ?, ?, ?) "
                               + getConflictClause();
                     } else {
                         sql = "INSERT INTO "
                               + tableName
                               + " (case_id, data_source_id, value, file_path, known_status, comment, file_obj_id) "
                               + "VALUES ((SELECT id FROM cases WHERE case_uid=? LIMIT 1), "
                               + "(SELECT id FROM data_sources WHERE datasource_obj_id=? AND case_id=? LIMIT 1), ?, ?, ?, ?, ?) "
                               + getConflictClause();
                     }
...
-                                if (correlationHasAccount.get(tableName)) {
+                                if (tableHasAccount) {
                                     if (eamArtifact.getAccountId() != null && eamArtifact.getAccountId() >= 0) {
                                         bulkPs.setLong(9, eamArtifact.getAccountId());
                                     } else {
                                         bulkPs.setNull(9, Types.INTEGER);
                                     }
                                 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@Core/src/org/sleuthkit/autopsy/centralrepository/datamodel/RdbmsCentralRepo.java`
around lines 1692 - 1757, The lookup correlationHasAccount.get(tableName) is
auto-unboxed later and can NPE; compute a null-safe boolean once (e.g., boolean
tableHasAccount = correlationHasAccount.getOrDefault(tableName, false)) at the
start of the addAttributeInstanceBulk() loop and use tableHasAccount both when
building the sql string and when binding parameters (replace
correlationHasAccount.get(tableName) with tableHasAccount) to avoid nullable
lookups.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cb5c2ad1-835b-4fb3-80a0-9c275a9a044b

📥 Commits

Reviewing files that changed from the base of the PR and between ffc394d and 4ab6100.

📒 Files selected for processing (1)
  • Core/src/org/sleuthkit/autopsy/centralrepository/datamodel/RdbmsCentralRepo.java

Fixes suggested by codeRabbit
Coderabbit suggested fix

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
Core/src/org/sleuthkit/autopsy/centralrepository/datamodel/RdbmsCentralRepo.java (1)

1751-1757: Consider using Types.BIGINT and caching the account ID.

The null-handling logic correctly addresses the NPE risk. Two minor improvements:

  1. The account_id column is defined as BIGINT in the schema, so Types.BIGINT would be more accurate (though Types.INTEGER works in practice and matches existing code at line 1062).
  2. Caching getAccountId() in a local variable avoids calling the getter twice.
♻️ Optional refactor
                             if (tableHasAccount) {
-                                if (eamArtifact.getAccountId() != null && eamArtifact.getAccountId() >= 0) {
-                                    bulkPs.setLong(9, eamArtifact.getAccountId());
+                                Long accountId = eamArtifact.getAccountId();
+                                if (accountId != null && accountId >= 0) {
+                                    bulkPs.setLong(9, accountId);
                                 } else {
-                                    bulkPs.setNull(9, Types.INTEGER);
+                                    bulkPs.setNull(9, Types.BIGINT);
                                 }
                             }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@Core/src/org/sleuthkit/autopsy/centralrepository/datamodel/RdbmsCentralRepo.java`
around lines 1751 - 1757, Update the block guarded by tableHasAccount to cache
eamArtifact.getAccountId() into a local Long (e.g., accountId) and use that
variable for the null-check and set operations; when writing a non-null value
call bulkPs.setLong(9, accountId) and when null call bulkPs.setNull(9,
Types.BIGINT) so the JDBC type matches the schema and you avoid calling
eamArtifact.getAccountId() twice.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 92f25df1-3f1c-4358-aae6-d8303d97ad0f

📥 Commits

Reviewing files that changed from the base of the PR and between 4ab6100 and caf4716.

📒 Files selected for processing (1)
  • Core/src/org/sleuthkit/autopsy/centralrepository/datamodel/RdbmsCentralRepo.java

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@Core/src/org/sleuthkit/autopsy/centralrepository/datamodel/RdbmsCentralRepo.java`:
- Around line 1674-1679: The code is using correlationAttribHasAnAccount() as a
proxy for whether an instance table actually has an account_id column, which can
cause account IDs to be dropped during addAttributeInstanceBulk(); update the
logic in the loop that builds correlationHasAccount so that for each tableName
you either (preferred) always include account_id in the INSERT and bind
eamArtifact.getAccountId() or a SQL NULL when the artifact has no account, or
(alternative) call doesColumnExist(conn, tableName, "account_id") to determine
the actual schema and set correlationHasAccount based on that; ensure
addAttributeInstanceBulk() and the prepared statement generation bind account_id
appropriately when present or bind NULL when absent so account data is not
silently lost (reference symbols: correlationAttribHasAnAccount,
addAttributeInstanceBulk, correlationHasAccount map, doesColumnExist,
eamArtifact.getAccountId(), RdbmsCentralRepoFactory instance table definitions).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 04748871-0779-4bd8-98fe-fda82094b3ee

📥 Commits

Reviewing files that changed from the base of the PR and between caf4716 and 8fa78a3.

📒 Files selected for processing (1)
  • Core/src/org/sleuthkit/autopsy/centralrepository/datamodel/RdbmsCentralRepo.java

Update to use doesColumnExist
@bcarrier
Member

@markmckinnon What's the context behind this? Wouldn't the DB get upgraded when it is opened and therefore always have the account ID? Are there paths where the upgrade isn't forced?

@markmckinnon
Contributor Author

@bcarrier The database had the account_id column, but that column has never had any data in it since 4.19.x. It holds empty values for anything inserted since 4.19.x into tables that have account_id. This change fixes new data inserted going forward.
