Skip to content

fix: 5 issues from encrypted-payload E2E testing#127

Merged
27Bslash6 merged 11 commits into
mainfrom
fix/e2e-smoke-findings
May 28, 2026
Merged

fix: 5 issues from encrypted-payload E2E testing#127
27Bslash6 merged 11 commits into
mainfrom
fix/e2e-smoke-findings

Conversation

@27Bslash6
Copy link
Copy Markdown
Contributor

@27Bslash6 27Bslash6 commented May 27, 2026

Summary

Fixes 5 issues discovered while building @cache.io + CACHEKIT_MASTER_KEY encrypted-payload E2E smoke tests against the live SaaS.

  • Silent fail-open on L2 decrypt/integrity failureSerializationError/EncryptionError at L2 deserialize sites was caught by broad except Exception with no specific warning. Now caught before the generic handler and logged as "L2 cache decrypt/integrity failure for {key}: {err}". Fail-open behavior preserved (recompute).
  • @cache.secure silently L1-only with CACHEKIT_API_KEYDefaultBackendProvider only resolved Redis. Now checks CACHEKIT_API_KEY first (highest priority) and creates CachekitIOBackend, matching the documented env-var priority order. All decorators auto-detect the SaaS backend.
  • SaaS-incompatible cache keys for nested/lambda functionsfunc.__qualname__ contains angle brackets (outer.<locals>.inner, <lambda>) that fail the SaaS regex /^[a-zA-Z0-9_.]{1,200}$/. Now sanitized before entering the key. Deterministic mapping, well-behaved names unchanged.
  • Inconsistent encryption ergonomics docsEncryptionConfig and @cache.io docstrings now document the tenant-mode requirement and point to CACHEKIT_MASTER_KEY env var or @cache.secure() as ergonomic alternatives.
  • .env.dev not gitignoredconftest.py auto-loads it via load_dotenv; secret-leak footgun fixed.

Also includes the encrypted-payload E2E smoke test (tests/integration/saas/test_sdk_secure_e2e.py) that surfaced these issues.

Test plan

  • make quick-check passes (format, lint, type-check; critical tests pre-existing Redis binary path issue)
  • All unit tests pass (1421 passed, 8 skipped)
  • New TestFuncNameSanitization — 5 tests (nested func, lambda, SaaS regex compliance, normal func, deterministic)
  • New TestL2DecryptFailureWarning — 4 tests (SerializationError, EncryptionError, generic exception, recompute)
  • New TestDefaultBackendProvider — 6 tests (CachekitIO detection, Redis fallback, caching, tenant context)
  • test_sdk_secure_e2e.py 6/6 green via live SaaS harness (requires CACHEKIT_API_KEY)

Summary by CodeRabbit

  • New Features

    • Automatic backend detection (SaaS when API key present, otherwise Redis).
    • Automatic client-side encryption detection via master-key env var.
  • Bug Fixes

    • Cache-key sanitization to meet SaaS constraints.
    • Fail-open handling and targeted observability for decrypt/integrity failures.
  • Documentation

    • Clarified encryption/backend docs; exported configs redact master key.
  • Chores

    • Ignore .env.dev; refreshed secrets baseline and CI audit ignore list.
  • Tests

    • Added unit and integration tests for encryption, key sanitization, backend selection, and decrypt observability.

Review Change Stack

27Bslash6 added 7 commits May 27, 2026 21:52
tests/integration/saas/conftest.py auto-loads .env.dev via load_dotenv.
Without this entry, committing a .env.dev with real API keys is a
secret-leak footgun.
Python's __qualname__ for nested functions and lambdas contains angle
brackets (e.g., outer.<locals>.inner, <lambda>) which fail the SaaS
key validation regex /^[a-zA-Z0-9_.]{1,200}$/.

Replace disallowed chars with '_', collapse '..' runs, and cap at
200 chars.  The mapping is deterministic — same function always
produces the same key.  Functions that were 400ing before now get
valid (different) keys; well-behaved names are unchanged.
DefaultBackendProvider only resolved Redis — when CACHEKIT_API_KEY was
set with no REDIS_URL, @cache.secure silently fell through to L1-only
mode (the Redis init failed and was swallowed).

Now checks CACHEKIT_API_KEY first (highest priority) and creates
CachekitIOBackend when found, matching the documented env-var
priority order.  All decorators — including @cache.secure — auto-detect
the SaaS backend without explicit backend= or @cache.io.
When L2 cached data fails deserialization (GCM decrypt failure, wrong
key after rotation, integrity check mismatch, corrupt data), the
decorator silently treated it as a miss.  L1 already logged this
(wrapper.py:691) but L2 did not.

Catch SerializationError/EncryptionError before the broad Exception at
five L2 deserialize sites (wrapper.py async primary, double-check,
lock-timeout paths; cache_handler.py sync and async).  Emit a
logger().warning identifying the failure, then continue fail-open
(recompute).  A broken key rotation or tampered ciphertext now shows
up in logs instead of manifesting as a silent performance cliff.
EncryptionConfig(enabled=True) requires single_tenant_mode=True or a
tenant_extractor, but @cache.io doesn't set either automatically
(unlike @cache.secure which does).

Document the requirement in both the EncryptionConfig and
DecoratorConfig.io() docstrings, and point users to the
CACHEKIT_MASTER_KEY env var path (no code change) or @cache.secure()
as the ergonomic alternative.
Covers roundtrip (encrypt → store in SaaS → decrypt), zero-knowledge
(stored bytes are opaque ciphertext), and tamper detection (mutated
GCM tag → recompute, never serve forged data).

Env-parameterized: runs against any environment via CACHEKIT_API_KEY
and CACHEKIT_API_URL.  Skipped when the API key is absent.
DecoratorConfig.io() never wired CACHEKIT_MASTER_KEY into an
EncryptionConfig — the env var was silently ignored, and data was
stored as plaintext.  The E2E test passed by accident because the
SerializationWrapper JSON envelope defeated _plaintext_recoverable
(it couldn't parse the envelope, so it reported "not recoverable"
regardless of encryption state).

Now io() checks get_settings().master_key and builds an
EncryptionConfig(enabled=True, single_tenant_mode=True) when the
env var is present.  Also fix _plaintext_recoverable to unwrap the
envelope before checking, so the test actually validates encryption.
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 27, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8e95601d-dec4-4bb1-8194-43fa15db04ce

📥 Commits

Reviewing files that changed from the base of the PR and between 91b4e5c and 171f7ea.

📒 Files selected for processing (1)
  • src/cachekit/cache_handler.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/cachekit/cache_handler.py

📝 Walkthrough

Walkthrough

Environment-driven backend selection (SaaS via API key, Redis fallback), client-side encryption auto-detection from master key, explicit L2 deserialization error handling with observability, cache-key sanitization for SaaS limits, and corresponding tests/config updates.

Changes

SaaS backend and encryption support

Layer / File(s) Summary
Backend auto-detection and provider selection
src/cachekit/backends/provider.py, tests/unit/backends/test_provider.py
DefaultBackendProvider now lazily selects backends by checking CACHEKIT_API_KEY: uses CachekitIOBackend if set, falls back to Redis via RedisBackendConfig.from_env(). Backend instance is cached on first get_backend() call. Tenant context defaults to "default" when unset. Tests verify selection logic and backend caching behavior.
Cache key sanitization for SaaS
src/cachekit/key_generator.py, tests/unit/test_cache_key_generator.py
New _sanitize_func_name() classmethod normalizes function identifiers by replacing disallowed characters, collapsing consecutive dots, and truncating to fixed length. Regex constants enforce SaaS cache-key constraints. Tests validate nested function, lambda, and module-level function sanitization, regex compliance, and determinism.
Automatic encryption detection and config changes
src/cachekit/cache_handler.py, src/cachekit/config/decorator.py, src/cachekit/config/nested.py, tests/unit/test_cache_handler_encryption_autodetect.py
CacheSerializationHandler auto-enables encryption and single-tenant mode when get_settings().master_key is present and encryption not explicitly set. DecoratorConfig.to_dict() redacts encryption.master_key; docstrings updated. EncryptionConfig.master_key uses field(..., repr=False) to omit key from repr(). Unit tests added for auto-detection and updated to expect redaction/hidden repr.
L2 deserialization error handling and observability
src/cachekit/cache_handler.py, src/cachekit/decorators/wrapper.py, tests/unit/test_l2_decrypt_observability.py
Sync and async cache get paths now catch SerializationError separately, logging "L2 cache decrypt/integrity failure" warning and returning None to trigger recomputation. Async wrapper reports deserialization-specific failures via features.handle_cache_error(). Double-check lock re-read paths also handle SerializationError explicitly. Tests verify fail-open behavior and warning messages for integrity/decrypt failures versus generic backend exceptions.
SaaS integration E2E tests
tests/integration/saas/test_sdk_secure_e2e.py
New test suite validates client-side encryption against live CachekitIO backend. Monkeypatches CachekitIOBackend.set to capture ciphertext written to SaaS. Tests verify encrypted round-trip integrity (plaintext returned to caller, stored bytes are ciphertext), zero-knowledge (plaintext absent from stored ciphertext), and tamper detection (GCM tag mutation causes recomputation). Fixtures manage API key, master key, namespace, and settings singleton.
Tests and config/secrets updates
.gitignore, .secrets.baseline, tests/unit/config/*, tests/unit/test_single_tenant_mode.py, .github/workflows/security-fast.yml
Adds .env.dev to .gitignore. Updates .secrets.baseline filters, a recorded secret line number, and timestamp. Tests updated to expect redacted master_key in exported config and to ensure master_key is omitted from EncryptionConfig.__repr__. New unit tests cover encryption auto-detection and adjusted single-tenant-mode test setup. CI pip-audit ignore list adjusted.

Sequence Diagram(s)

sequenceDiagram
  participant Application
  participant CacheOperationHandler
  participant Backend
  participant CacheSerializationHandler
  participant Logger
  participant Features
  Application->>CacheOperationHandler: get_cached_value(key)
  CacheOperationHandler->>Backend: backend.get(key)
  Backend->>CacheSerializationHandler: deserialize_data(bytes, cache_key)
  CacheSerializationHandler-->>Backend: raises SerializationError
  Backend-->>CacheOperationHandler: propagate SerializationError
  CacheOperationHandler->>Logger: "L2 cache decrypt/integrity failure" (cache_key)
  CacheOperationHandler->>Features: handle_cache_error(op="cache_get_deserialize", corr_id)
  CacheOperationHandler-->>Application: return None (cache miss)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • cachekit-io/cachekit-py#83: Addresses AAD v0x03 deserialization by ensuring cache_key is passed to deserialize_data() in wrapper paths; this PR adds targeted SerializationError handling in the same deserialization flows.

Poem

🐰 Keys scrubbed, the backends choose with grace,
SaaS wakes when the API key takes its place.
Master key hums — encryption comes to light,
Tampered bytes fail, recompute through the night.
Hop, patch, and test — the cache sleeps safe and tight.

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 75.38% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check ❓ Inconclusive The PR description provides comprehensive detail on all 5 issues fixed, motivation, test results, and testing coverage. However, it does not follow the required template structure with explicit sections like 'Type of Change', 'Security Checklist', and 'Documentation Validation Checklist'. Consider using the repository's standard PR template structure to ensure all required checkboxes and sections are explicitly addressed.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title 'fix: 5 issues from encrypted-payload E2E testing' accurately and concisely summarizes the main change—fixing multiple issues discovered during E2E testing.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/e2e-smoke-findings

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 27, 2026

Codecov Report

❌ Patch coverage is 85.71429% with 8 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/cachekit/decorators/wrapper.py 45.45% 6 Missing ⚠️
src/cachekit/cache_handler.py 87.50% 2 Missing ⚠️

📢 Thoughts on this report? Let us know!

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
tests/unit/test_l2_decrypt_observability.py (1)

38-79: ⚡ Quick win

Add async-path parity tests for decrypt/integrity warnings.

This file validates only sync get_cached_value, but the same behavior was changed in get_cached_value_async and should be covered with pytest.mark.asyncio cases.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/test_l2_decrypt_observability.py` around lines 38 - 79, The
existing tests only cover the synchronous get_cached_value path; add equivalent
async tests for get_cached_value_async to ensure decrypt/integrity warnings are
emitted in the async code path. Create pytest.mark.asyncio test functions
mirroring test_serialization_error_logs_warning,
test_encryption_error_logs_warning,
test_generic_exception_does_not_trigger_decrypt_warning, and
test_recompute_on_decrypt_failure but call await
handler.get_cached_value_async("test:key") using the same handler from
_make_handler (with deserialize_side_effect set to SerializationError,
EncryptionError, and ConnectionError as appropriate) and the same caplog
assertions checking for "decrypt/integrity failure", specific error text like
"GCM tag mismatch", or "Backend operation failed" and that the call returns
None.
tests/unit/backends/test_provider.py (1)

306-370: ⚡ Quick win

Add regression coverage for tenant-context changes across calls.

Current tests validate default tenant setup, but not that a later tenant_context change is honored on subsequent get_backend() calls in Redis mode.

Suggested test shape
+    def test_get_backend_uses_current_tenant_each_call_in_redis_mode(self) -> None:
+        provider = DefaultBackendProvider()
+        with mock.patch.dict("os.environ", {}, clear=False):
+            import os
+            os.environ.pop("CACHEKIT_API_KEY", None)
+            from cachekit.backends.redis.provider import tenant_context
+
+            t1 = tenant_context.set("tenant-a")
+            b1 = provider.get_backend()
+            tenant_context.reset(t1)
+
+            t2 = tenant_context.set("tenant-b")
+            b2 = provider.get_backend()
+            tenant_context.reset(t2)
+
+            assert b1 is not b2
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/backends/test_provider.py` around lines 306 - 370, Add a test to
cover tenant-context changes across calls: instantiate DefaultBackendProvider,
patch RedisBackendConfig.from_env to return a config with redis_url, patch
RedisBackendProvider to return a mock backend, and patch
cachekit.backends.redis.provider.tenant_context so its get() returns different
values across calls (e.g., side_effect=[None, "new-tenant"]). Call
provider.get_backend() twice and assert tenant_context.set was first
called_with("default") and then called_with("new-tenant") (or assert_has_calls
in that order) to ensure the provider honors a changed tenant_context on
subsequent get_backend() calls.
src/cachekit/key_generator.py (1)

46-50: 💤 Low value

Consider importing re at the module level for consistency.

The __import__("re") pattern is unconventional. This file already imports standard library modules (hashlib, sys, datetime, etc.) at the top. Adding import re there would be more consistent and idiomatic.

♻️ Proposed refactor

Add to the imports at the top of the file (e.g., after line 5):

import re

Then update the class-level constants:

-    _FUNC_ALLOWED_RE = __import__("re").compile(r"[^A-Za-z0-9_.]")
-    _DOUBLE_DOT_RE = __import__("re").compile(r"\.{2,}")
+    _FUNC_ALLOWED_RE = re.compile(r"[^A-Za-z0-9_.]")
+    _DOUBLE_DOT_RE = re.compile(r"\.{2,}")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cachekit/key_generator.py` around lines 46 - 50, Replace the
unconventional use of __import__("re") with a module-level import: add "import
re" to the top-level imports and update the class-level constants
_FUNC_ALLOWED_RE and _DOUBLE_DOT_RE to use re.compile(...) (leave _FUNC_NAME_MAX
as-is); this makes the code idiomatic and consistent with other standard-library
imports used in this module.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cachekit/backends/provider.py`:
- Around line 116-144: The code caches a tenant-scoped Redis backend in
get_backend() (self._backend) causing the first tenant to be pinned; change
get_backend() so Redis backends are resolved per-tenant instead of stored in
self._backend: detect when you create RedisBackendProvider (use
RedisBackendConfig.from_env(), RedisBackendProvider and tenant_context), then
either (A) do not assign self._backend and simply return provider.get_backend()
on each call, or (B) maintain a mapping (e.g., self._backend_by_tenant dict
keyed by tenant_context.get()) and store provider.get_backend() per tenant key;
update any initialization logic that sets default tenant_context to still work
with the chosen per-tenant strategy.

---

Nitpick comments:
In `@src/cachekit/key_generator.py`:
- Around line 46-50: Replace the unconventional use of __import__("re") with a
module-level import: add "import re" to the top-level imports and update the
class-level constants _FUNC_ALLOWED_RE and _DOUBLE_DOT_RE to use re.compile(...)
(leave _FUNC_NAME_MAX as-is); this makes the code idiomatic and consistent with
other standard-library imports used in this module.

In `@tests/unit/backends/test_provider.py`:
- Around line 306-370: Add a test to cover tenant-context changes across calls:
instantiate DefaultBackendProvider, patch RedisBackendConfig.from_env to return
a config with redis_url, patch RedisBackendProvider to return a mock backend,
and patch cachekit.backends.redis.provider.tenant_context so its get() returns
different values across calls (e.g., side_effect=[None, "new-tenant"]). Call
provider.get_backend() twice and assert tenant_context.set was first
called_with("default") and then called_with("new-tenant") (or assert_has_calls
in that order) to ensure the provider honors a changed tenant_context on
subsequent get_backend() calls.

In `@tests/unit/test_l2_decrypt_observability.py`:
- Around line 38-79: The existing tests only cover the synchronous
get_cached_value path; add equivalent async tests for get_cached_value_async to
ensure decrypt/integrity warnings are emitted in the async code path. Create
pytest.mark.asyncio test functions mirroring
test_serialization_error_logs_warning, test_encryption_error_logs_warning,
test_generic_exception_does_not_trigger_decrypt_warning, and
test_recompute_on_decrypt_failure but call await
handler.get_cached_value_async("test:key") using the same handler from
_make_handler (with deserialize_side_effect set to SerializationError,
EncryptionError, and ConnectionError as appropriate) and the same caplog
assertions checking for "decrypt/integrity failure", specific error text like
"GCM tag mismatch", or "Backend operation failed" and that the call returns
None.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 19ebac08-2f27-4cd4-af27-a9e9dab0c2a4

📥 Commits

Reviewing files that changed from the base of the PR and between b67aa1b and 58cfcb3.

📒 Files selected for processing (12)
  • .gitignore
  • .secrets.baseline
  • src/cachekit/backends/provider.py
  • src/cachekit/cache_handler.py
  • src/cachekit/config/decorator.py
  • src/cachekit/config/nested.py
  • src/cachekit/decorators/wrapper.py
  • src/cachekit/key_generator.py
  • tests/integration/saas/test_sdk_secure_e2e.py
  • tests/unit/backends/test_provider.py
  • tests/unit/test_cache_key_generator.py
  • tests/unit/test_l2_decrypt_observability.py

Comment thread src/cachekit/backends/provider.py Outdated
CacheSerializationHandler.__init__ now auto-detects CACHEKIT_MASTER_KEY
when encryption is not explicitly configured.  This is the single
convergence point for ALL backends and presets — Redis, CachekitIO,
File, Memcached all get encryption when the env var is set.

Previously only @cache.io() and @cache.secure() detected the env var;
@cache.production(), @cache.minimal(), @cache.dev(), and bare @cache
silently stored plaintext.

Also:
- Revert .io() inline detection (commit 58cfcb3) — handler handles it
- Fix EncryptionConfig master_key repr leak (CWE-532: repr=False)
- Mask master_key in DecoratorConfig.to_dict() (CWE-200: [REDACTED])
- Guard clause skips auto-detect when tenant_extractor is set (user
  expressing multi-tenant intent)
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/cachekit/cache_handler.py (1)

821-823: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid double-logging decrypt failures.

CacheSerializationHandler.deserialize_data() still catches inner SerializationErrors in its generic except Exception block at Lines 698-700, so a corrupted entry will emit an error there and then hit this warning here. That keeps the old noisy error path alive even though this path is intentionally fail-open.

Suggested fix
         try:
             # Unwrap cache data envelope
             serialized_data, metadata_dict, serializer_name = SerializationWrapper.unwrap(data)
             ...
         except ValueError:
             # cache_key missing for encrypted data - FAIL CLOSED (re-raise)
             raise
+        except SerializationError:
+            raise
         except Exception as e:
             get_logger().error(f"Deserialization failed with {self.serializer_name}: {e}")
             raise SerializationError(f"Failed to deserialize data with {self.serializer_name}: {e}") from e

Also applies to: 854-856

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cachekit/cache_handler.py` around lines 821 - 823, The duplicate logging
is caused because CacheSerializationHandler.deserialize_data() swallows
SerializationError in its generic except block and also logs it later in the
specific except SerializationError handler; update deserialize_data() so the
generic "except Exception as e" does not eat SerializationError—i.e. if
isinstance(e, SerializationError): raise—to let the outer except
SerializationError handle/log it (apply the same change to the other similar
generic except block that precedes the except SerializationError at the second
location).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cachekit/cache_handler.py`:
- Around line 309-319: The code currently treats the boolean parameter
encryption as both "unset" and "explicitly False", causing CACHEKIT_MASTER_KEY
to override callers who passed encryption=False; change the API so
auto-detection only runs when encryption is unset (use a tri-state/sentinel or
change the decorator to pass encryption=None for "auto" instead of False), then
update the block in cache_handler.py that checks (encryption, master_key,
tenant_extractor) to only enable encryption when encryption is None/UNSET and
settings.master_key exists; ensure related places including the cache handler
__init__ and the default-serializer guard check the new sentinel (or an explicit
auto_detect_encryption flag) rather than a plain bool so explicit
encryption=False remains honored and auto-detect only applies when encryption
was not provided.

---

Outside diff comments:
In `@src/cachekit/cache_handler.py`:
- Around line 821-823: The duplicate logging is caused because
CacheSerializationHandler.deserialize_data() swallows SerializationError in its
generic except block and also logs it later in the specific except
SerializationError handler; update deserialize_data() so the generic "except
Exception as e" does not eat SerializationError—i.e. if isinstance(e,
SerializationError): raise—to let the outer except SerializationError handle/log
it (apply the same change to the other similar generic except block that
precedes the except SerializationError at the second location).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 07f02c52-5de7-46ac-b29e-92cedf136e10

📥 Commits

Reviewing files that changed from the base of the PR and between 58cfcb3 and 2e6a297.

📒 Files selected for processing (8)
  • .secrets.baseline
  • src/cachekit/cache_handler.py
  • src/cachekit/config/decorator.py
  • src/cachekit/config/nested.py
  • tests/unit/config/test_decorator_config.py
  • tests/unit/config/test_nested_configs.py
  • tests/unit/test_cache_handler_encryption_autodetect.py
  • tests/unit/test_single_tenant_mode.py
✅ Files skipped from review due to trivial changes (1)
  • tests/unit/config/test_decorator_config.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/cachekit/config/nested.py

Comment on lines +309 to +319
# Auto-detect encryption from CACHEKIT_MASTER_KEY when not explicitly configured.
# This is the single convergence point for ALL backends and presets.
if not encryption and master_key is None and tenant_extractor is None:
from cachekit.config.singleton import get_settings

settings = get_settings()
if settings.master_key:
encryption = True
master_key = settings.master_key.get_secret_value()
single_tenant_mode = True

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Preserve an explicit encryption=False opt-out.

Because encryption is still a plain bool, this block can't tell “unset” from “explicitly disabled”. As written, CACHEKIT_MASTER_KEY will force encryption on even for callers that intentionally passed encryption=False, which can now trip the default-serializer guard later in __init__. If the contract is “auto-detect only when not explicitly configured”, this needs a tri-state/sentinel or a separate auto-detect flag from the decorator layer.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cachekit/cache_handler.py` around lines 309 - 319, The code currently
treats the boolean parameter encryption as both "unset" and "explicitly False",
causing CACHEKIT_MASTER_KEY to override callers who passed encryption=False;
change the API so auto-detection only runs when encryption is unset (use a
tri-state/sentinel or change the decorator to pass encryption=None for "auto"
instead of False), then update the block in cache_handler.py that checks
(encryption, master_key, tenant_extractor) to only enable encryption when
encryption is None/UNSET and settings.master_key exists; ensure related places
including the cache handler __init__ and the default-serializer guard check the
new sentinel (or an explicit auto_detect_encryption flag) rather than a plain
bool so explicit encryption=False remains honored and auto-detect only applies
when encryption was not provided.

pyarrow 22.0.0 has a C++ use-after-free in IPC file reading with
pre-buffering enabled.  The advisory explicitly states "not exposed
in language bindings (Python, Ruby, C GLib), so these bindings are
not vulnerable."  Dev-only transitive dep (optional cachekit[data]).
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/security-fast.yml:
- Around line 96-97: Update the security workflow comments to (1) keep the
PYSEC-2026-113 ignore justification but tighten wording to note that the Arrow
C++ pre-buffer API (RecordBatchFileReader::PreBufferMetadata) is not exposed to
Python/PyArrow and that src/cachekit/serializers/arrow_serializer.py only uses
pa.ipc.new_file / pa.ipc.open_file (no pre-buffering), and (2) change the
urllib3/GHSA-mf9v-mfxr-j63j justification to state the issue originates from the
dev dependency set used by the uv sync --group dev task (i.e., dev deps include
urllib3 via that group) rather than implying no runtime impact; edit the
comments adjacent to the uv run pip-audit command and the urllib3 note
accordingly so they reference PYSEC-2026-113 and GHSA-mf9v-mfxr-j63j with the
clarified rationales.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f7000351-a862-4895-b844-e4039613298d

📥 Commits

Reviewing files that changed from the base of the PR and between 2e6a297 and 1c7613f.

📒 Files selected for processing (1)
  • .github/workflows/security-fast.yml

Comment thread .github/workflows/security-fast.yml Outdated
27Bslash6 added 2 commits May 28, 2026 17:58
…ionale

- DefaultBackendProvider no longer caches the Redis backend instance.
  RedisBackendProvider.get_backend() returns per-request tenant-scoped
  wrappers via ContextVar; caching the first result pinned the tenant.
  Now cache the provider (factory) and call get_backend() each time.
  CachekitIO is stateless, still cached.

- Tighten pip-audit ignore rationale: urllib3 CVEs noted as dev-dep
  (runtime uses httpx); PYSEC-2026-113 clarified that PreBufferMetadata
  is not exposed to Python bindings.

- Comment 2 (encryption=False opt-out) is by design: CACHEKIT_MASTER_KEY
  is a fleet-wide switch with no per-function opt-out (confirmed).
deserialize_data() had a catch-all `except Exception` that wrapped
SerializationError/EncryptionError into a new SerializationError before
re-raising.  The outer except SerializationError handler in
get_cached_value() then logged it again — double logging and the
original exception type was lost.

Let SerializationError pass through the catch-all so the outer handler
sees the original exception (EncryptionError vs SerializationError).
@27Bslash6 27Bslash6 merged commit b1aab22 into main May 28, 2026
39 of 44 checks passed
@27Bslash6 27Bslash6 deleted the fix/e2e-smoke-findings branch May 28, 2026 09:07
27Bslash6 added a commit that referenced this pull request May 29, 2026
… (#133)

CI's Tests jobs have been red since PR #127 (May 28) with 14 integration
test failures on all six Python versions, e.g.
https://github.com/cachekit-io/cachekit-py/actions/runs/26573103303

Three independent bugs combined to produce the failure. Each one alone
is harmless; together they make CI nondeterministic across test
orderings.

1. `tests/conftest.py:setup_redis_env` was an autouse fixture that
   pre-set `CACHEKIT_MASTER_KEY="a"*64` for every test. Inert before
   #127, but that PR added auto-detection in
   `CacheSerializationHandler.__init__` (`cache_handler.py:309-318`):
   any handler constructed without explicit encryption now reads
   `get_settings()` and turns encryption on if MASTER_KEY is present.

2. The v0.6.0 cross-SDK rule (`cache_handler.py:351-362`) rejects
   `serializer='auto'` and custom serializer instances when encryption
   is on. With #1's silent env-set, every test using AutoSerializer or
   ArrowSerializer started raising `ConfigurationError` at decoration
   time.

3. `test_master_key_validation_invalid_length` (lines 182-215) wrote
   `"too_short_key"` directly to `os.environ` with a try/finally that
   only restored the env `if original_key is not None`. That assumed the
   conftest pre-set was always present — once #1 is removed the restore
   branch is skipped, leaking `"too_short_key"` to every subsequent
   test's `bytes.fromhex()`. Same shape for the sibling missing-key test.

Fixes:

- Remove the implicit MASTER_KEY set from `setup_redis_env`. Tests that
  need encryption already use `monkeypatch.setenv("CACHEKIT_MASTER_KEY",
  ...)` — verified across all 14 encryption-using files.
- Add `reset_settings()` brackets to `setup_redis_env` so the
  `get_settings()` singleton can never carry env state across tests.
- Convert the two master-key validation tests to `monkeypatch.setenv`
  /`delenv` so their cleanup is automatic.
- Remove the 2 `TestSerializerWithEncryptionInProduction` tests. They
  combined `@cache(serializer=ArrowSerializer())` with MASTER_KEY,
  asserting that Arrow flowed through the encryption path. It never
  did: `cache_handler.py:497-498` builds `EncryptionWrapper` without
  passing the user's serializer, so the wrapper silently fell back to
  StandardSerializer (MessagePack). The v0.6.0 rule made that silent
  substitution loud. Making Arrow+encryption actually work needs the
  wrapper-wiring fix at line 498 plus protocol-spec updates and a
  cross-SDK marker on `SerializerProtocol` — tracked separately.

Verified: 1448 unit + 493 critical/integration tests pass locally with
a clean env.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant