Skip to content

fix: handle archived blob reruns#46

Merged
woutervanranst merged 4 commits into
masterfrom
terrific-topaz
Apr 3, 2026
Merged

fix: handle archived blob reruns#46
woutervanranst merged 4 commits into
masterfrom
terrific-topaz

Conversation

@woutervanranst

@woutervanranst woutervanranst commented Apr 3, 2026

Copy link
Copy Markdown
Owner

Summary

  • normalize Azure BlobArchived open-write conflicts into the existing blob-already-exists recovery path
  • add archive recovery matrix coverage for Archive and Cold tiers, including partial blobs with missing metadata
  • consolidate reusable core test blob fakes under src/Arius.Core.Tests/Fakes/ and reuse them in hydration/list tests

Testing

  • dotnet test --project src/Arius.Core.Tests/Arius.Core.Tests.csproj
  • dotnet test --project src/Arius.AzureBlob.Tests/Arius.AzureBlob.Tests.csproj

Summary by CodeRabbit

  • Bug Fixes

    • Treats archived-blob responses from Azure Storage as an “already exists” condition to improve upload/retry behavior
  • Tests

    • Added matrix tests for archive recovery covering existing blobs and missing metadata
    • Introduced several shared fake blob container services and replaced inline test stubs for more consistent, recordable test behavior
  • Chores

    • Updated dependencies: Spectre.Console 0.54.0 → 0.55.0, TUnit 1.24.13 → 1.24.31

@coderabbitai

coderabbitai Bot commented Apr 3, 2026

Copy link
Copy Markdown
Contributor

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 17ac51cb-067e-46ef-a4d2-07d4cbe922e5

📥 Commits

Reviewing files that changed from the base of the PR and between de9e3d5 and 830389e.

📒 Files selected for processing (4)
  • src/Arius.Core.Tests/Fakes/FakeInMemoryBlobContainerService.cs
  • src/Arius.Core.Tests/Fakes/FakeMetadataOnlyBlobContainerService.cs
  • src/Arius.Core.Tests/Fakes/FakeRecordingBlobContainerService.cs
  • src/Arius.Core.Tests/Fakes/FakeSeededBlobContainerService.cs

📝 Walkthrough

Walkthrough

Consolidates multiple test-local fake blob container implementations into shared fakes, adds archive-recovery matrix tests, adjusts Azure blob error handling to treat 409/BlobArchived as “already exists”, and bumps two package versions.

Changes

Cohort / File(s) Summary
Production bug fix
src/Arius.AzureBlob/AzureBlobContainerService.cs
Treat HTTP 409 RequestFailedException with Azure error code "BlobArchived" as an already-exists condition in IsAlreadyExistsError; updated XML docs to reflect real Azure responses.
New shared test fakes infrastructure
src/Arius.Core.Tests/Fakes/FakeInMemoryBlobContainerService.cs, src/Arius.Core.Tests/Fakes/FakeMetadataOnlyBlobContainerService.cs, src/Arius.Core.Tests/Fakes/FakeRecordingBlobContainerService.cs, src/Arius.Core.Tests/Fakes/FakeSeededBlobContainerService.cs
Added four internal test doubles implementing IBlobContainerService: in-memory store with open-write semantics and conflict simulation, metadata-only lookup with request tracking, lightweight recording fake (uploads/HEAD checks), and seeded read-only blob service.
New archive recovery tests
src/Arius.Core.Tests/Archive/ArchiveRecoveryTests.cs
Added ArchiveRecoveryMatrixTests with matrix-driven cases covering re-run behavior when target blobs already exist (with and without metadata), asserting archive success, chunk index entries, and deletion/retry recovery.
Test updates to use centralized fakes
src/Arius.Core.Tests/List/FileHydrationStatusResolverTests.cs, src/Arius.Core.Tests/List/ResolveFileHydrationStatusesHandlerTests.cs, src/Arius.Core.Tests/FileTree/TreeServiceTests.cs, src/Arius.Core.Tests/Ls/LsHandlerTests.cs
Replaced various per-file fake IBlobContainerService implementations with the new centralized fakes from Arius.Core.Tests.Fakes and removed duplicate local fake classes; updated test plumbing and assertions to match new fakes.
Package dependency updates
src/Directory.Packages.props
Bumped Spectre.Console 0.54.0→0.55.0 and TUnit 1.24.13→1.24.31.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

Possibly related PRs

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 6.96% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix: handle archived blob reruns' directly describes the main change: normalizing Azure BlobArchived errors into the existing blob-already-exists recovery path to enable retry logic for archived blobs.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch terrific-topaz

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov

codecov Bot commented Apr 3, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 71.05%. Comparing base (251dede) to head (830389e).
⚠️ Report is 5 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master      #46      +/-   ##
==========================================
+ Coverage   70.94%   71.05%   +0.10%     
==========================================
  Files          59       59              
  Lines        4684     4684              
  Branches      624      624              
==========================================
+ Hits         3323     3328       +5     
+ Misses       1229     1224       -5     
  Partials      132      132              
Flag Coverage Δ
linux 80.23% <100.00%> (+0.12%) ⬆️
windows 67.10% <0.00%> (+0.19%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/Arius.AzureBlob/AzureBlobContainerService.cs`:
- Around line 244-246: IsAlreadyExistsError currently treats a 409 BlobArchived
as equivalent to BlobAlreadyExists, which is incorrect for unconditioned writes;
change the code so BlobArchived is only treated as an "already exists" error on
the precondition-protected path (OpenWriteAsync which sets IfNoneMatch=ETag.All)
— either remove BlobArchived from IsAlreadyExistsError and handle it locally
inside OpenWriteAsync, or add a boolean/flag parameter to IsAlreadyExistsError
(e.g., preconditionProtected) and only map BlobArchived when that flag is true;
ensure UploadAsync continues to use IsAlreadyExistsError without the
BlobArchived mapping so archived-blob write attempts are reported as
archived-write failures, not duplicate-exists errors.

In `@src/Arius.Core.Tests/Archive/ArchiveRecoveryTests.cs`:
- Around line 55-74: The test
Archive_LargeBlobWithoutMetadata_Rerun_DeletesAndRetries only exercises the
Archive tier path; add the Cold-tier variant so the delete-and-retry branch is
validated for BlobTier.Cold as well. Either duplicate the test as
Archive_LargeBlobWithoutMetadata_Rerun_DeletesAndRetries_Cold (calling
env.ArchiveAsync(BlobTier.Cold) and asserting the same behaviors:
env.Blobs.DeletedBlobNames contains blobName and
finalMeta.Metadata[BlobMetadataKeys.AriusType] equals
BlobMetadataKeys.TypeLarge) or parameterize the existing test to run for both
BlobTier.Archive and BlobTier.Cold; ensure you reuse the same setup calls
(SeedLargeBlobAsync, ClearMetadata, ThrowAlreadyExistsOnOpenWrite) and the same
assertions.
- Around line 18-34: The test
Archive_LargeBlobAlreadyExistsWithMetadata_Rerun_Continues currently only
asserts overall success and index update; modify it to also assert the
rerun-specific side effect by verifying the already-exists recovery path was
exercised — for example, after calling env.Blobs.SeedLargeBlobAsync(...) and
env.Blobs.ThrowAlreadyExistsOnOpenWrite(...), assert that the seeded chunk was
not re-uploaded (check a call count or last-upload timestamp on the fake blob
store) or that existing blob metadata was read (assert a metadata lookup flag or
mock method was invoked). Apply the same additional assertion pattern to the
other similar test (lines referenced in the review) so both validate the rerun
recovery path rather than just final success.

In `@src/Arius.Core.Tests/Fakes/InMemoryBlobContainerService.cs`:
- Around line 71-77: The fake currently computes BlobMetadata.ContentLength from
blob.Content.LongLength which yields 0 for metadata-only seeds; update the logic
in GetMetadataAsync (and the other similar returns around SeedMetadata and the
blocks at the other occurrences) to preserve and return the stored content
length for metadata-only seeds (use the blob.ContentLength or stored length
field instead of blob.Content.LongLength) so BlobMetadata.ContentLength reflects
the original seeded size when seeding metadata-only entries; adjust the
BlobMetadata construction sites to pull the correct length field.
- Around line 24-34: UploadAsync and OpenWriteAsync are non-atomic:
detect-then-write allows two concurrent writers to both succeed; change the
in-memory logic to reserve names atomically and enforce
BlobAlreadyExistsException. In UploadAsync use an atomic insertion (e.g.,
_blobs.TryAdd or lock around _blobs) so that if the key already exists you
immediately throw BlobAlreadyExistsException rather than overwriting; for
OpenWriteAsync reserve the blob name up-front by inserting a placeholder
StoredBlob with a “committed/in-progress” flag (the existing StoredBlob(...,
false) looks suitable) and ensure the IDispoable/stream returned on
OpenWriteAsync either replaces that placeholder with a committed StoredBlob on
successful Dispose/Commit or removes the placeholder on Cancel/Dispose without
commit so concurrent OpenWriteAsync/UploadAsync callers will see the reservation
and fail atomically.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 67fb576e-2c85-421f-b1b1-8196c26f951a

📥 Commits

Reviewing files that changed from the base of the PR and between 251dede and c6a59e1.

📒 Files selected for processing (6)
  • src/Arius.AzureBlob/AzureBlobContainerService.cs
  • src/Arius.Core.Tests/Archive/ArchiveRecoveryTests.cs
  • src/Arius.Core.Tests/Fakes/InMemoryBlobContainerService.cs
  • src/Arius.Core.Tests/Fakes/MetadataOnlyBlobContainerService.cs
  • src/Arius.Core.Tests/List/FileHydrationStatusResolverTests.cs
  • src/Arius.Core.Tests/List/ResolveFileHydrationStatusesHandlerTests.cs

Comment on lines 244 to +246
private static bool IsAlreadyExistsError(RequestFailedException ex) =>
(ex.Status == 412 && ex.ErrorCode == "ConditionNotMet") ||
(ex.Status == 409 && ex.ErrorCode == "BlobAlreadyExists");
(ex.Status == 409 && ex.ErrorCode is "BlobAlreadyExists" or "BlobArchived");

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

For Azure Blob Storage, when writing to an existing Archive-tier block blob without an If-None-Match precondition (for example BlobClient.UploadAsync overwrite paths / Put Block), does the service return 409 BlobArchived as a generic archived-blob write failure rather than an already-exists precondition failure?

💡 Result:

Yes, when writing to an existing Archive-tier block blob in Azure Blob Storage without an If-None-Match precondition (e.g., via BlobClient.UploadAsync with overwrite=true or Put Block operations), the service returns a 409 Conflict with the specific error code "BlobArchived" ("This operation is not permitted on an archived blob."). This is a generic archived-blob write failure specific to the blob's archived state, not a precondition failure like "BlobAlreadyExists" (which occurs when creating a new blob with If-None-Match:* on an existing non-archived blob).

Citations:


🏁 Script executed:

cat -n src/Arius.AzureBlob/AzureBlobContainerService.cs | sed -n '50,100p'

Repository: woutervanranst/Arius7

Length of output: 2235


🏁 Script executed:

cat -n src/Arius.AzureBlob/AzureBlobContainerService.cs | sed -n '230,260p'

Repository: woutervanranst/Arius7

Length of output: 1221


🏁 Script executed:

# Check the broader context of exception handling around line 64 and 93
rg -A 15 -B 5 "IsAlreadyExistsError" src/Arius.AzureBlob/AzureBlobContainerService.cs

Repository: woutervanranst/Arius7

Length of output: 2022


Scope the BlobArchived alias to optimistic-concurrency writes only.

The IsAlreadyExistsError helper is reused by both UploadAsync (line 64) and OpenWriteAsync (line 93). However, UploadAsync does not set a precondition (no IfNoneMatch), so when it encounters a 409 BlobArchived error, this is a generic archived-blob write failure—not a precondition conflict. Mapping it to BlobAlreadyExistsException misrepresents the error: the blob is not a duplicate; it cannot be written to because it is archived.

OpenWriteAsync sets IfNoneMatch=ETag.All, making it appropriate for this path to treat BlobArchived as an existence check failure. Keep the BlobArchived mapping local to OpenWriteAsync or gate it on the precondition-protected path to prevent UploadAsync(overwrite: true) from misreporting archived blobs as already-exists conflicts.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/Arius.AzureBlob/AzureBlobContainerService.cs` around lines 244 - 246,
IsAlreadyExistsError currently treats a 409 BlobArchived as equivalent to
BlobAlreadyExists, which is incorrect for unconditioned writes; change the code
so BlobArchived is only treated as an "already exists" error on the
precondition-protected path (OpenWriteAsync which sets IfNoneMatch=ETag.All) —
either remove BlobArchived from IsAlreadyExistsError and handle it locally
inside OpenWriteAsync, or add a boolean/flag parameter to IsAlreadyExistsError
(e.g., preconditionProtected) and only map BlobArchived when that flag is true;
ensure UploadAsync continues to use IsAlreadyExistsError without the
BlobArchived mapping so archived-blob write attempts are reported as
archived-write failures, not duplicate-exists errors.

Comment on lines +18 to +34
[Test]
[MatrixDataSource]
public async Task Archive_LargeBlobAlreadyExistsWithMetadata_Rerun_Continues(
[Matrix(BlobTier.Archive, BlobTier.Cold)] BlobTier uploadTier)
{
using var env = new ArchiveTestEnvironment();
var content = env.WriteRandomFile("large.bin", 2 * 1024 * 1024);
var contentHash = env.ComputeHash(content);

await env.Blobs.SeedLargeBlobAsync(BlobPaths.Chunk(contentHash), content, uploadTier);
env.Blobs.ThrowAlreadyExistsOnOpenWrite(BlobPaths.Chunk(contentHash));

var result = await env.ArchiveAsync(uploadTier);

result.Success.ShouldBeTrue(result.ErrorMessage);
env.Lookup(contentHash).ShouldNotBeNull();
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Assert a rerun-specific side effect here.

These tests only prove that archiving eventually succeeded and the index was updated, which would also pass on a clean first archive. Please add one assertion that the already-exists recovery path actually fired, e.g. that the seeded chunk blob was not re-uploaded or that existing blob metadata was consulted.

As per coding guidelines, **/*.Tests/**: Test projects. Focus on test coverage gaps and assertion quality rather than style.

Also applies to: 36-53

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/Arius.Core.Tests/Archive/ArchiveRecoveryTests.cs` around lines 18 - 34,
The test Archive_LargeBlobAlreadyExistsWithMetadata_Rerun_Continues currently
only asserts overall success and index update; modify it to also assert the
rerun-specific side effect by verifying the already-exists recovery path was
exercised — for example, after calling env.Blobs.SeedLargeBlobAsync(...) and
env.Blobs.ThrowAlreadyExistsOnOpenWrite(...), assert that the seeded chunk was
not re-uploaded (check a call count or last-upload timestamp on the fake blob
store) or that existing blob metadata was read (assert a metadata lookup flag or
mock method was invoked). Apply the same additional assertion pattern to the
other similar test (lines referenced in the review) so both validate the rerun
recovery path rather than just final success.

Comment on lines +55 to +74
[Test]
public async Task Archive_LargeBlobWithoutMetadata_Rerun_DeletesAndRetries()
{
using var env = new ArchiveTestEnvironment();
var content = env.WriteRandomFile("partial.bin", 2 * 1024 * 1024);
var contentHash = env.ComputeHash(content);
var blobName = BlobPaths.Chunk(contentHash);

await env.Blobs.SeedLargeBlobAsync(blobName, content, BlobTier.Archive);
env.Blobs.ClearMetadata(blobName);
env.Blobs.ThrowAlreadyExistsOnOpenWrite(blobName, throwOnce: true);

var result = await env.ArchiveAsync(BlobTier.Archive);

result.Success.ShouldBeTrue(result.ErrorMessage);
env.Blobs.DeletedBlobNames.ShouldContain(blobName);

var finalMeta = await env.Blobs.GetMetadataAsync(blobName);
finalMeta.Metadata[BlobMetadataKeys.AriusType].ShouldBe(BlobMetadataKeys.TypeLarge);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add the Cold-tier variant for the partial-blob retry case.

This is the one rerun branch in the new coverage that still only runs under BlobTier.Archive, so a BlobTier.Cold regression in the delete-and-retry path would still go unnoticed.

As per coding guidelines, **/*.Tests/**: Test projects. Focus on test coverage gaps and assertion quality rather than style.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/Arius.Core.Tests/Archive/ArchiveRecoveryTests.cs` around lines 55 - 74,
The test Archive_LargeBlobWithoutMetadata_Rerun_DeletesAndRetries only exercises
the Archive tier path; add the Cold-tier variant so the delete-and-retry branch
is validated for BlobTier.Cold as well. Either duplicate the test as
Archive_LargeBlobWithoutMetadata_Rerun_DeletesAndRetries_Cold (calling
env.ArchiveAsync(BlobTier.Cold) and asserting the same behaviors:
env.Blobs.DeletedBlobNames contains blobName and
finalMeta.Metadata[BlobMetadataKeys.AriusType] equals
BlobMetadataKeys.TypeLarge) or parameterize the existing test to run for both
BlobTier.Archive and BlobTier.Cold; ensure you reuse the same setup calls
(SeedLargeBlobAsync, ClearMetadata, ThrowAlreadyExistsOnOpenWrite) and the same
assertions.

Comment on lines +24 to +34
public async Task UploadAsync(string blobName, Stream content, IReadOnlyDictionary<string, string> metadata, BlobTier tier, string? contentType = null, bool overwrite = false, CancellationToken cancellationToken = default)
{
await using var ms = new MemoryStream();
await content.CopyToAsync(ms, cancellationToken);

if (!overwrite && _blobs.ContainsKey(blobName))
throw new BlobAlreadyExistsException(blobName);

_blobs[blobName] = new StoredBlob(ms.ToArray(), new Dictionary<string, string>(metadata), tier, contentType, false);
UploadedBlobNames.Add(blobName);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Make the fake's create-if-not-exists behavior atomic.

UploadAsync still splits existence detection from the write, and OpenWriteAsync does not reserve the name until dispose. Two writers for the same blob can therefore both succeed in this fake and the last commit wins, which hides the BlobAlreadyExistsException behavior the real implementation is supposed to surface.

Also applies to: 36-55

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/Arius.Core.Tests/Fakes/InMemoryBlobContainerService.cs` around lines 24 -
34, UploadAsync and OpenWriteAsync are non-atomic: detect-then-write allows two
concurrent writers to both succeed; change the in-memory logic to reserve names
atomically and enforce BlobAlreadyExistsException. In UploadAsync use an atomic
insertion (e.g., _blobs.TryAdd or lock around _blobs) so that if the key already
exists you immediately throw BlobAlreadyExistsException rather than overwriting;
for OpenWriteAsync reserve the blob name up-front by inserting a placeholder
StoredBlob with a “committed/in-progress” flag (the existing StoredBlob(...,
false) looks suitable) and ensure the IDispoable/stream returned on
OpenWriteAsync either replaces that placeholder with a committed StoredBlob on
successful Dispose/Commit or removes the placeholder on Cancel/Dispose without
commit so concurrent OpenWriteAsync/UploadAsync callers will see the reservation
and fail atomically.

Comment on lines +71 to +77
return Task.FromResult(new BlobMetadata
{
Exists = true,
Tier = blob.Tier,
ContentLength = blob.Content.LongLength,
IsRehydrating = blob.IsRehydrating,
Metadata = new Dictionary<string, string>(blob.Metadata)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Preserve ContentLength in metadata-only seeds.

SeedMetadata stores an empty payload, and GetMetadataAsync then reports blob.Content.LongLength, so any metadata-only seed comes back as 0 bytes. That makes this fake inaccurate for recovery paths that use meta.ContentLength to continue after a BlobAlreadyExistsException.

Also applies to: 126-127, 178-183

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/Arius.Core.Tests/Fakes/InMemoryBlobContainerService.cs` around lines 71 -
77, The fake currently computes BlobMetadata.ContentLength from
blob.Content.LongLength which yields 0 for metadata-only seeds; update the logic
in GetMetadataAsync (and the other similar returns around SeedMetadata and the
blocks at the other occurrences) to preserve and return the stored content
length for metadata-only seeds (use the blob.ContentLength or stored length
field instead of blob.Content.LongLength) so BlobMetadata.ContentLength reflects
the original seeded size when seeding metadata-only entries; adjust the
BlobMetadata construction sites to pull the correct length field.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
src/Arius.Core.Tests/Fakes/FakeInMemoryBlobContainerService.cs (2)

24-34: ⚠️ Potential issue | 🟠 Major

Make create-if-not-exists atomic in both write paths.

UploadAsync and OpenWriteAsync still have check-then-write races. Two concurrent writers can both pass existence checks and overwrite each other, which hides the BlobAlreadyExistsException behavior tests are supposed to validate.

💡 Proposed fix
+private readonly ConcurrentDictionary<string, StoredBlob> _blobs = new(StringComparer.Ordinal);
+
 public async Task UploadAsync(string blobName, Stream content, IReadOnlyDictionary<string, string> metadata, BlobTier tier, string? contentType = null, bool overwrite = false, CancellationToken cancellationToken = default)
 {
     await using var ms = new MemoryStream();
     await content.CopyToAsync(ms, cancellationToken);
 
-    if (!overwrite && _blobs.ContainsKey(blobName))
-        throw new BlobAlreadyExistsException(blobName);
-
-    _blobs[blobName] = new StoredBlob(ms.ToArray(), new Dictionary<string, string>(metadata), tier, contentType, false);
+    var committed = new StoredBlob(ms.ToArray(), new Dictionary<string, string>(metadata), tier, contentType, false, ms.Length);
+    if (overwrite)
+    {
+        _blobs[blobName] = committed;
+    }
+    else if (!_blobs.TryAdd(blobName, committed))
+    {
+        throw new BlobAlreadyExistsException(blobName);
+    }
+
     UploadedBlobNames.Add(blobName);
 }
 
 public Task<Stream> OpenWriteAsync(string blobName, string? contentType = null, CancellationToken cancellationToken = default)
 {
@@
-    if (_blobs.ContainsKey(blobName))
+    var placeholder = new StoredBlob(Array.Empty<byte>(), new Dictionary<string, string>(), null, contentType, false, 0);
+    if (!_blobs.TryAdd(blobName, placeholder))
         throw new BlobAlreadyExistsException(blobName);
 
     return Task.FromResult<Stream>(new CommitOnDisposeStream(bytes =>
     {
-        _blobs[blobName] = new StoredBlob(bytes, new Dictionary<string, string>(), null, contentType, false);
+        _blobs[blobName] = new StoredBlob(bytes, new Dictionary<string, string>(), null, contentType, false, bytes.LongLength);
         UploadedBlobNames.Add(blobName);
-    }));
+    },
+    onDisposeWithoutCommit: () => _blobs.TryRemove(blobName, out _)));
 }

Also applies to: 36-55

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/Arius.Core.Tests/Fakes/FakeInMemoryBlobContainerService.cs` around lines
24 - 34, The UploadAsync and OpenWriteAsync methods have a check-then-write race
on the _blobs dictionary; make the create-if-not-exists operation atomic by
replacing the separate ContainsKey + assignment with an atomic insertion (e.g.,
use ConcurrentDictionary.TryAdd or a shared lock around the check-and-write) so
that when insertion fails you throw BlobAlreadyExistsException; update both
UploadAsync and OpenWriteAsync to perform the atomic add of a new StoredBlob
(and only add blobName to UploadedBlobNames after successful insertion) so
concurrent writers cannot both succeed.

71-77: ⚠️ Potential issue | 🟠 Major

Preserve seeded ContentLength for metadata-only blobs.

SeedMetadata writes empty content, and GetMetadataAsync derives length from blob.Content.LongLength, so metadata-only seeds always report 0. This breaks recovery scenarios that rely on meta.ContentLength.

💡 Proposed fix
 return Task.FromResult(new BlobMetadata
 {
     Exists = true,
     Tier = blob.Tier,
-    ContentLength = blob.Content.LongLength,
+    ContentLength = blob.ContentLength,
     IsRehydrating = blob.IsRehydrating,
     Metadata = new Dictionary<string, string>(blob.Metadata)
 });
@@
 public void SeedBlob(string blobName, byte[] content, BlobTier? tier = null, IReadOnlyDictionary<string, string>? metadata = null, string? contentType = null, bool isRehydrating = false)
-    => _blobs[blobName] = new StoredBlob(content, metadata is null ? new Dictionary<string, string>() : new Dictionary<string, string>(metadata), tier, contentType, isRehydrating);
+    => _blobs[blobName] = new StoredBlob(content, metadata is null ? new Dictionary<string, string>() : new Dictionary<string, string>(metadata), tier, contentType, isRehydrating, content.LongLength);
 
 public void SeedMetadata(string blobName, BlobMetadata metadata)
-    => _blobs[blobName] = new StoredBlob(Array.Empty<byte>(), new Dictionary<string, string>(metadata.Metadata), metadata.Tier, null, metadata.IsRehydrating);
+    => _blobs[blobName] = new StoredBlob(Array.Empty<byte>(), new Dictionary<string, string>(metadata.Metadata), metadata.Tier, null, metadata.IsRehydrating, metadata.ContentLength);
@@
 private sealed record StoredBlob(
     byte[] Content,
     Dictionary<string, string> Metadata,
     BlobTier? Tier,
     string? ContentType,
-    bool IsRehydrating);
+    bool IsRehydrating,
+    long? ContentLength);

Also applies to: 126-127, 178-183

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/Arius.Core.Tests/Fakes/FakeInMemoryBlobContainerService.cs` around lines
71 - 77, GetMetadataAsync currently sets BlobMetadata.ContentLength from
blob.Content.LongLength which returns 0 for metadata-only seeds created by
SeedMetadata; update GetMetadataAsync (and the other similar occurrences) to
prefer any seeded content length property on the blob (e.g., blob.ContentLength
or a stored metadata length) when present and fall back to
blob.Content.LongLength only if no seeded length exists so seeded metadata-only
blobs preserve their original ContentLength; touch the methods GetMetadataAsync
and SeedMetadata (and the similar blocks at the other locations mentioned) to
read and use the seeded length field rather than always using
blob.Content.LongLength.
🧹 Nitpick comments (7)
src/Arius.Core.Tests/Fakes/FakeMetadataOnlyBlobContainerService.cs (1)

16-23: Add operation names to NotSupportedException for faster test failure diagnosis.

These throws are correct for a focused fake, but adding messages would make failing tests quicker to triage.

💡 Suggested tweak
-    public Task UploadAsync(string blobName, Stream content, IReadOnlyDictionary<string, string> metadata, BlobTier tier, string? contentType = null, bool overwrite = false, CancellationToken cancellationToken = default) => throw new NotSupportedException();
-    public Task<Stream> OpenWriteAsync(string blobName, string? contentType = null, CancellationToken cancellationToken = default) => throw new NotSupportedException();
-    public Task<Stream> DownloadAsync(string blobName, CancellationToken cancellationToken = default) => throw new NotSupportedException();
-    public Task SetMetadataAsync(string blobName, IReadOnlyDictionary<string, string> metadata, CancellationToken cancellationToken = default) => throw new NotSupportedException();
-    public Task SetTierAsync(string blobName, BlobTier tier, CancellationToken cancellationToken = default) => throw new NotSupportedException();
-    public Task CopyAsync(string sourceBlobName, string destinationBlobName, BlobTier destinationTier, RehydratePriority? rehydratePriority = null, CancellationToken cancellationToken = default) => throw new NotSupportedException();
-    public Task DeleteAsync(string blobName, CancellationToken cancellationToken = default) => throw new NotSupportedException();
+    public Task UploadAsync(string blobName, Stream content, IReadOnlyDictionary<string, string> metadata, BlobTier tier, string? contentType = null, bool overwrite = false, CancellationToken cancellationToken = default) => throw new NotSupportedException("UploadAsync is not supported by FakeMetadataOnlyBlobContainerService.");
+    public Task<Stream> OpenWriteAsync(string blobName, string? contentType = null, CancellationToken cancellationToken = default) => throw new NotSupportedException("OpenWriteAsync is not supported by FakeMetadataOnlyBlobContainerService.");
+    public Task<Stream> DownloadAsync(string blobName, CancellationToken cancellationToken = default) => throw new NotSupportedException("DownloadAsync is not supported by FakeMetadataOnlyBlobContainerService.");
+    public Task SetMetadataAsync(string blobName, IReadOnlyDictionary<string, string> metadata, CancellationToken cancellationToken = default) => throw new NotSupportedException("SetMetadataAsync is not supported by FakeMetadataOnlyBlobContainerService.");
+    public Task SetTierAsync(string blobName, BlobTier tier, CancellationToken cancellationToken = default) => throw new NotSupportedException("SetTierAsync is not supported by FakeMetadataOnlyBlobContainerService.");
+    public Task CopyAsync(string sourceBlobName, string destinationBlobName, BlobTier destinationTier, RehydratePriority? rehydratePriority = null, CancellationToken cancellationToken = default) => throw new NotSupportedException("CopyAsync is not supported by FakeMetadataOnlyBlobContainerService.");
+    public Task DeleteAsync(string blobName, CancellationToken cancellationToken = default) => throw new NotSupportedException("DeleteAsync is not supported by FakeMetadataOnlyBlobContainerService.");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/Arius.Core.Tests/Fakes/FakeMetadataOnlyBlobContainerService.cs` around
lines 16 - 23, The NotSupportedException throws in the fake should include
operation-specific messages to speed debugging: update each method (UploadAsync,
OpenWriteAsync, DownloadAsync, SetMetadataAsync, SetTierAsync, CopyAsync,
DeleteAsync) so the thrown NotSupportedException contains a short descriptive
message like "Operation UploadAsync is not supported" (optionally including the
blobName or key parameters for more context) instead of the parameterless
NotSupportedException.
src/Arius.Core.Tests/Ls/LsHandlerTests.cs (1)

520-521: MakeHandler signature change limits flexibility.

The helper now accepts FakeSeededBlobContainerService instead of IBlobContainerService. This is fine for this file since all tests use the seeded fake, but if future tests need a different fake implementation, the signature would need to change. Consider keeping the interface type if flexibility is desired.

Alternative to maintain flexibility
-    private ListQueryHandler MakeHandler(FakeSeededBlobContainerService blobs, ChunkIndexService index) =>
+    private ListQueryHandler MakeHandler(IBlobContainerService blobs, ChunkIndexService index) =>
         new(blobs, s_encryption, index, NullLogger<ListQueryHandler>.Instance, "account", "container");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/Arius.Core.Tests/Ls/LsHandlerTests.cs` around lines 520 - 521, The
MakeHandler helper currently takes a FakeSeededBlobContainerService which
reduces flexibility; change its first parameter type to the interface
IBlobContainerService (i.e., signature: MakeHandler(IBlobContainerService blobs,
ChunkIndexService index)) and keep the rest of the construction of
ListQueryHandler the same so tests can pass any IBlobContainerService
implementation; update any using/imports if necessary and ensure callers still
compile with the fake instance.
src/Arius.Core.Tests/Fakes/FakeSeededBlobContainerService.cs (2)

23-29: Consider using Azure SDK exception type for consistency.

FileNotFoundException works but doesn't match what the real Azure SDK throws for missing blobs (RequestFailedException with status 404). If tests need to verify exception handling behavior that mirrors production, consider throwing a similar exception. For basic seeded-data tests, this is fine.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/Arius.Core.Tests/Fakes/FakeSeededBlobContainerService.cs` around lines 23
- 29, The test fake FakeSeededBlobContainerService.DownloadAsync currently
throws FileNotFoundException for missing blobs; change it to throw Azure's
RequestFailedException with a 404 status to match real Blob SDK behavior so
tests mirror production. Update the DownloadAsync method to detect missing
blobName (where _blobs.TryGetValue fails) and throw new
RequestFailedException(Status: 404, Message: ...) or the appropriate constructor
overload, preserving the blobName in the message so tests can assert on status
and message like production code does.

11-13: Same thread-safety consideration as FakeRecordingBlobContainerService.

The backing Dictionary<string, byte[]> is not thread-safe. If tests seed blobs concurrently or access the dictionary during enumeration, this could cause issues. For test isolation this is typically fine since each test creates its own instance.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/Arius.Core.Tests/Fakes/FakeSeededBlobContainerService.cs` around lines 11
- 13, The _blobs Dictionary in FakeSeededBlobContainerService is not thread-safe
which can break concurrent tests; replace the Dictionary<string, byte[]> _blobs
with a thread-safe collection (e.g., ConcurrentDictionary<string, byte[]>) or
guard accesses with a lock, and update AddBlob and any enumeration/reads to use
the chosen thread-safe API; refer to the _blobs field and the AddBlob method in
FakeSeededBlobContainerService when making the change.
src/Arius.Core.Tests/Fakes/FakeRecordingBlobContainerService.cs (2)

11-12: Consider thread-safety for parallel test execution.

HashSet<string> is not thread-safe. If TUnit runs tests in parallel and multiple tests share a fake instance, concurrent Add calls could corrupt the set. Consider using ConcurrentDictionary<string, byte> or document that instances must not be shared across parallel tests.

Thread-safe alternative
-    public HashSet<string> Uploaded { get; } = new(StringComparer.Ordinal);
-    public HashSet<string> HeadChecked { get; } = new(StringComparer.Ordinal);
+    private readonly ConcurrentDictionary<string, byte> _uploaded = new(StringComparer.Ordinal);
+    private readonly ConcurrentDictionary<string, byte> _headChecked = new(StringComparer.Ordinal);
+
+    public IReadOnlyCollection<string> Uploaded => _uploaded.Keys;
+    public IReadOnlyCollection<string> HeadChecked => _headChecked.Keys;

Then update the recording calls:

-        Uploaded.Add(blobName);
+        _uploaded.TryAdd(blobName, 0);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/Arius.Core.Tests/Fakes/FakeRecordingBlobContainerService.cs` around lines
11 - 12, The Uploaded and HeadChecked properties on
FakeRecordingBlobContainerService use HashSet<string>, which is not thread-safe;
change them to a thread-safe collection such as ConcurrentDictionary<string,
byte> (e.g., Uploaded and HeadChecked as ConcurrentDictionary<string, byte>) and
update any recording calls that currently call .Add/.Contains to use TryAdd and
ContainsKey (or indexer) respectively; ensure any iteration over these
collections uses the dictionary's Keys if a set-like view is needed, or document
that the fake must not be shared across parallel tests if you prefer not to
change the type.

28-32: GetMetadataAsync always returns Exists = false, limiting test coverage.

Per context snippet 1, TreeService skips uploads when meta.Exists is true (lines 353-356). With this fake always returning Exists = false, tests cannot exercise the remote-blob-exists deduplication path. If that path needs coverage, consider adding a constructor parameter or method to seed "existing" blob names.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/Arius.Core.Tests/Fakes/FakeRecordingBlobContainerService.cs` around lines
28 - 32, FakeRecordingBlobContainerService.GetMetadataAsync currently always
returns BlobMetadata { Exists = false }, preventing tests from exercising the
deduplication path in TreeService; modify FakeRecordingBlobContainerService to
accept a seed of existing blob names (via constructor parameter or
AddExisting(string) method) stored in a HashSet and change GetMetadataAsync to
check that set (and still record HeadChecked.Add(blobName)) and return
BlobMetadata { Exists = true } when the blobName is seeded; update tests to seed
the desired blob names when verifying remote-blob-exists behavior.
src/Arius.Core.Tests/FileTree/TreeServiceTests.cs (1)

490-527: Test covers disk-cache deduplication, not remote-blob deduplication.

The BuildAsync_DeduplicatesBlob_WhenAlreadyOnDisk test verifies disk-cache hit behavior correctly. However, since FakeRecordingBlobContainerService.GetMetadataAsync always returns Exists = false, the remote-blob-exists path (TreeService lines 353-356) is never exercised. Consider adding a separate test with a fake that returns Exists = true for specific blob names to cover that deduplication path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/Arius.Core.Tests/FileTree/TreeServiceTests.cs` around lines 490 - 527,
The test only covers disk-cache deduplication; add a new unit test (e.g.,
BuildAsync_DeduplicatesBlob_WhenAlreadyInRemote) that uses a modified fake blob
service where FakeRecordingBlobContainerService.GetMetadataAsync returns a
BlobMetadata { Exists = true } for the specific blob name(s) produced by
TreeBuilder.BuildAsync so the TreeService remote-exists branch (the logic around
TreeService lines 353-356) is exercised; instantiate TreeBuilder with that fake,
call BuildAsync with a manifest similar to the existing test, assert no uploads
were performed (Uploaded.Count == 0) and that the returned root matches
expectations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/Arius.Core.Tests/Fakes/FakeInMemoryBlobContainerService.cs`:
- Around line 24-34: The UploadAsync and OpenWriteAsync methods have a
check-then-write race on the _blobs dictionary; make the create-if-not-exists
operation atomic by replacing the separate ContainsKey + assignment with an
atomic insertion (e.g., use ConcurrentDictionary.TryAdd or a shared lock around
the check-and-write) so that when insertion fails you throw
BlobAlreadyExistsException; update both UploadAsync and OpenWriteAsync to
perform the atomic add of a new StoredBlob (and only add blobName to
UploadedBlobNames after successful insertion) so concurrent writers cannot both
succeed.
- Around line 71-77: GetMetadataAsync currently sets BlobMetadata.ContentLength
from blob.Content.LongLength which returns 0 for metadata-only seeds created by
SeedMetadata; update GetMetadataAsync (and the other similar occurrences) to
prefer any seeded content length property on the blob (e.g., blob.ContentLength
or a stored metadata length) when present and fall back to
blob.Content.LongLength only if no seeded length exists so seeded metadata-only
blobs preserve their original ContentLength; touch the methods GetMetadataAsync
and SeedMetadata (and the similar blocks at the other locations mentioned) to
read and use the seeded length field rather than always using
blob.Content.LongLength.

---

Nitpick comments:
In `@src/Arius.Core.Tests/Fakes/FakeMetadataOnlyBlobContainerService.cs`:
- Around line 16-23: The NotSupportedException throws in the fake should include
operation-specific messages to speed debugging: update each method (UploadAsync,
OpenWriteAsync, DownloadAsync, SetMetadataAsync, SetTierAsync, CopyAsync,
DeleteAsync) so the thrown NotSupportedException contains a short descriptive
message like "Operation UploadAsync is not supported" (optionally including the
blobName or key parameters for more context) instead of the parameterless
NotSupportedException.

In `@src/Arius.Core.Tests/Fakes/FakeRecordingBlobContainerService.cs`:
- Around line 11-12: The Uploaded and HeadChecked properties on
FakeRecordingBlobContainerService use HashSet<string>, which is not thread-safe;
change them to a thread-safe collection such as ConcurrentDictionary<string,
byte> (e.g., Uploaded and HeadChecked as ConcurrentDictionary<string, byte>) and
update any recording calls that currently call .Add/.Contains to use TryAdd and
ContainsKey (or indexer) respectively; ensure any iteration over these
collections uses the dictionary's Keys if a set-like view is needed, or document
that the fake must not be shared across parallel tests if you prefer not to
change the type.
- Around line 28-32: FakeRecordingBlobContainerService.GetMetadataAsync
currently always returns BlobMetadata { Exists = false }, preventing tests from
exercising the deduplication path in TreeService; modify
FakeRecordingBlobContainerService to accept a seed of existing blob names (via
constructor parameter or AddExisting(string) method) stored in a HashSet and
change GetMetadataAsync to check that set (and still record
HeadChecked.Add(blobName)) and return BlobMetadata { Exists = true } when the
blobName is seeded; update tests to seed the desired blob names when verifying
remote-blob-exists behavior.

In `@src/Arius.Core.Tests/Fakes/FakeSeededBlobContainerService.cs`:
- Around line 23-29: The test fake FakeSeededBlobContainerService.DownloadAsync
currently throws FileNotFoundException for missing blobs; change it to throw
Azure's RequestFailedException with a 404 status to match real Blob SDK behavior
so tests mirror production. Update the DownloadAsync method to detect missing
blobName (where _blobs.TryGetValue fails) and throw new
RequestFailedException(Status: 404, Message: ...) or the appropriate constructor
overload, preserving the blobName in the message so tests can assert on status
and message like production code does.
- Around line 11-13: The _blobs Dictionary in FakeSeededBlobContainerService is
not thread-safe which can break concurrent tests; replace the Dictionary<string,
byte[]> _blobs with a thread-safe collection (e.g., ConcurrentDictionary<string,
byte[]>) or guard accesses with a lock, and update AddBlob and any
enumeration/reads to use the chosen thread-safe API; refer to the _blobs field
and the AddBlob method in FakeSeededBlobContainerService when making the change.

In `@src/Arius.Core.Tests/FileTree/TreeServiceTests.cs`:
- Around line 490-527: The test only covers disk-cache deduplication; add a new
unit test (e.g., BuildAsync_DeduplicatesBlob_WhenAlreadyInRemote) that uses a
modified fake blob service where
FakeRecordingBlobContainerService.GetMetadataAsync returns a BlobMetadata {
Exists = true } for the specific blob name(s) produced by TreeBuilder.BuildAsync
so the TreeService remote-exists branch (the logic around TreeService lines
353-356) is exercised; instantiate TreeBuilder with that fake, call BuildAsync
with a manifest similar to the existing test, assert no uploads were performed
(Uploaded.Count == 0) and that the returned root matches expectations.

In `@src/Arius.Core.Tests/Ls/LsHandlerTests.cs`:
- Around line 520-521: The MakeHandler helper currently takes a
FakeSeededBlobContainerService which reduces flexibility; change its first
parameter type to the interface IBlobContainerService (i.e., signature:
MakeHandler(IBlobContainerService blobs, ChunkIndexService index)) and keep the
rest of the construction of ListQueryHandler the same so tests can pass any
IBlobContainerService implementation; update any using/imports if necessary and
ensure callers still compile with the fake instance.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: efc2d447-94ce-4e90-b339-5078512c6d08

📥 Commits

Reviewing files that changed from the base of the PR and between 421c894 and de9e3d5.

📒 Files selected for processing (9)
  • src/Arius.Core.Tests/Archive/ArchiveRecoveryTests.cs
  • src/Arius.Core.Tests/Fakes/FakeInMemoryBlobContainerService.cs
  • src/Arius.Core.Tests/Fakes/FakeMetadataOnlyBlobContainerService.cs
  • src/Arius.Core.Tests/Fakes/FakeRecordingBlobContainerService.cs
  • src/Arius.Core.Tests/Fakes/FakeSeededBlobContainerService.cs
  • src/Arius.Core.Tests/FileTree/TreeServiceTests.cs
  • src/Arius.Core.Tests/List/FileHydrationStatusResolverTests.cs
  • src/Arius.Core.Tests/List/ResolveFileHydrationStatusesHandlerTests.cs
  • src/Arius.Core.Tests/Ls/LsHandlerTests.cs
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/Arius.Core.Tests/List/FileHydrationStatusResolverTests.cs
  • src/Arius.Core.Tests/List/ResolveFileHydrationStatusesHandlerTests.cs
  • src/Arius.Core.Tests/Archive/ArchiveRecoveryTests.cs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant