SECURITY: Path Traversal in LLM Cache File Operations #53

@jeremyeder

Vulnerability Summary

Severity: HIGH (CVSS 7.5)
CWE: CWE-22 (Path Traversal)
Location: src/agentready/services/llm_cache.py:37,68
Impact: Arbitrary file read/write outside cache directory

Description

The LLMCache class constructs file paths from partially user-controlled cache keys without validation, allowing path traversal attacks.

# VULNERABLE CODE (llm_cache.py:37)
def get(self, cache_key: str) -> DiscoveredSkill | None:
    cache_file = self.cache_dir / f"{cache_key}.json"  # No validation!
    
    if not cache_file.exists():
        return None
    
    with open(cache_file, "r", encoding="utf-8") as f:
        data = json.load(f)

Attack Vector

Cache keys are generated from:

# llm_cache.py:96
key_data = f"{attribute_id}_{score}_{evidence_hash}"
return hashlib.sha256(key_data.encode()).hexdigest()[:16]
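
To see what this derivation can emit, the sketch below reproduces the two lines above as a standalone function and checks the alphabet of its output. The function name and signature are assumed from the proof of concept in this issue; the real method in llm_cache.py may differ.

```python
import hashlib
import string


def generate_key(attribute_id: str, score: float, evidence_hash: str) -> str:
    """Standalone copy of the key derivation quoted above (assumed signature)."""
    key_data = f"{attribute_id}_{score}_{evidence_hash}"
    return hashlib.sha256(key_data.encode()).hexdigest()[:16]


key = generate_key("../../../etc/passwd", 100.0, "a" * 16)

# hexdigest() only emits the characters 0-9 and a-f, so the key cannot
# contain '.', '/' or '\\' regardless of the inputs.
HEX_CHARS = set(string.digits + "abcdef")
print(len(key), set(key) <= HEX_CHARS)  # prints: 16 True
```

The hashed key is therefore safe as a filename component; the risk lies with any key that reaches the file operations without passing through this function.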

Keys produced this way are safe on their own: hexdigest() emits only the characters 0-9 and a-f, so a crafted attribute_id such as ../../../etc/passwd does not survive hashing, and no choice of inputs yields a key containing ../.

The exposure is that get() and the other affected location (llm_cache.py:68) accept any string as cache_key and never check that it came from generate_key(). Any caller that passes a key from another source, such as a malicious assessor or data read from a modified repository, has that string joined directly onto the cache directory.

Proof of Concept

# generate_key() returns 16 hex characters, so hashing neutralizes this value
attribute_id = "../../../etc/passwd"
score = 100.0
evidence_hash = "a" * 16
safe_key = LLMCache.generate_key(attribute_id, score, evidence_hash)

# The traversal needs a key that reaches get() without being hashed
cache_key = "../../../etc/passwd"

# Results in path traversal
cache_file = Path(".agentready/llm-cache") / f"{cache_key}.json"
# Resolves outside the cache directory, to a file named passwd.json
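
The escape can be reproduced safely inside a temporary directory. This is a minimal sketch, assuming the key reaches the path construction without going through generate_key(); `unvalidated_cache_path` is a hypothetical stand-in for the vulnerable line, not a function from the codebase.

```python
import tempfile
from pathlib import Path


def unvalidated_cache_path(cache_dir: Path, cache_key: str) -> Path:
    """Mirror the vulnerable path construction: no checks on cache_key."""
    return cache_dir / f"{cache_key}.json"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    cache_dir = root / ".agentready" / "llm-cache"
    cache_dir.mkdir(parents=True)

    # A key that did not come from generate_key()
    hostile_key = "../../outside"
    target = unvalidated_cache_path(cache_dir, hostile_key).resolve()

    # The resolved path sits two levels above the cache directory.
    escaped = not target.is_relative_to(cache_dir)
    print(target.name, escaped)  # prints: outside.json True
```

Note that the target always ends in .json, which limits which existing files can be read but does not limit where new files can be created.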

Security Impact

  • Information disclosure: Read .json files outside the cache directory (the .json suffix is always appended)
  • Arbitrary file write: Write malicious JSON to .json paths outside the cache directory
  • Cache poisoning: Inject malicious skills into cache
  • Denial of service: Fill disk with cache files in arbitrary locations

Remediation

Immediate Fix (P0)

Add path validation to prevent traversal:

# SECURITY: Path traversal prevention in cache operations
# Why: User-influenced cache keys could contain ../ sequences
# Prevents: Path Traversal (CWE-22)
# Alternative considered: Filesystem sandboxing rejected due to portability

def get(self, cache_key: str) -> DiscoveredSkill | None:
    """Get cached skill if exists and not expired."""
    # Validate cache key format (ASCII alphanumeric only; str.isalnum()
    # alone would also accept non-ASCII letters and digits)
    if not (cache_key.isascii() and cache_key.isalnum()):
        logger.warning("Invalid cache key format: %r", cache_key)
        return None

    cache_file = self.cache_dir / f"{cache_key}.json"

    # SECURITY: Ensure resolved path is within cache directory.
    # Compare path components, not string prefixes: a startswith() check
    # would accept a sibling directory such as <cache_dir>-evil/.
    try:
        cache_file = cache_file.resolve()
        if not cache_file.is_relative_to(self.cache_dir.resolve()):
            logger.error("Path traversal attempt blocked: %r", cache_key)
            return None
    except (OSError, RuntimeError) as e:
        logger.error("Path resolution error: %s", e)
        return None

    if not cache_file.exists():
        logger.debug("Cache miss: %s", cache_key)
        return None

    # ... rest of method
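
One detail worth checking in any containment test like the one above is whether it compares whole path components or raw strings. The sketch below contrasts the two, using hypothetical helper names and example paths that are not part of the codebase; `Path.is_relative_to` requires Python 3.9 or later.

```python
from pathlib import Path


def is_inside(base: Path, candidate: Path) -> bool:
    """Return True if candidate resolves to a location under base."""
    # is_relative_to() compares whole path components.
    return candidate.resolve().is_relative_to(base.resolve())


def is_inside_by_prefix(base: Path, candidate: Path) -> bool:
    """String-prefix variant, shown only to illustrate its weakness."""
    return str(candidate.resolve()).startswith(str(base.resolve()))


base = Path("/tmp/agentready-demo/llm-cache")
sibling = Path("/tmp/agentready-demo/llm-cache-evil/x.json")

print(is_inside(base, sibling))            # False
print(is_inside_by_prefix(base, sibling))  # True: the prefix check is fooled
```

The sibling directory shares a string prefix with the cache directory, so only the component-wise comparison rejects it.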

Additional Protections

  1. Strict key format validation:

    import re
    CACHE_KEY_PATTERN = re.compile(r'[a-f0-9]{16}')
    # fullmatch() rejects a trailing newline, which match() with '$' would accept
    if not CACHE_KEY_PATTERN.fullmatch(cache_key):
        raise ValueError(f"Invalid cache key: {cache_key!r}")

  2. Filesystem permissions:

    # Set restrictive permissions on cache directory
    # (mode is not applied if the directory already exists, so chmod as well)
    cache_dir.mkdir(parents=True, exist_ok=True, mode=0o700)
    cache_dir.chmod(0o700)

  3. File size limits:

    # Prevent DoS via large cache files
    if cache_file.stat().st_size > 1_000_000:  # 1MB max
        logger.warning(f"Cache file too large: {cache_file}")
        return None
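
The key-format and size checks can be combined into one read helper. The following is a sketch under stated assumptions: `read_cache_entry` and `MAX_CACHE_FILE_BYTES` are hypothetical names, and the real method returns a DiscoveredSkill rather than raw JSON.

```python
from __future__ import annotations

import json
import re
import tempfile
from pathlib import Path

CACHE_KEY_PATTERN = re.compile(r"[a-f0-9]{16}")
MAX_CACHE_FILE_BYTES = 1_000_000  # 1 MB, matching the limit suggested above


def read_cache_entry(cache_dir: Path, cache_key: str) -> dict | None:
    """Return parsed JSON for cache_key, or None if the key or file is rejected."""
    # Reject anything that is not exactly 16 lowercase hex characters.
    if not CACHE_KEY_PATTERN.fullmatch(cache_key):
        return None
    cache_file = cache_dir / f"{cache_key}.json"
    if not cache_file.is_file():
        return None
    # Refuse oversized files before reading them into memory.
    if cache_file.stat().st_size > MAX_CACHE_FILE_BYTES:
        return None
    with open(cache_file, "r", encoding="utf-8") as f:
        return json.load(f)


with tempfile.TemporaryDirectory() as tmp:
    cache_dir = Path(tmp) / "llm-cache"
    cache_dir.mkdir(mode=0o700)
    entry = cache_dir / "0123456789abcdef.json"
    entry.write_text('{"skill": "demo"}', encoding="utf-8")

    hit = read_cache_entry(cache_dir, "0123456789abcdef")
    rejected = read_cache_entry(cache_dir, "../../etc/passwd")
    miss = read_cache_entry(cache_dir, "fedcba9876543210")
```

Because the key pattern excludes '.', '/' and '\\', a key that passes it cannot leave the cache directory; a resolved-path check remains useful as defense in depth, for example against symlinks placed inside the cache.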

References

Related Vulnerabilities

Same pattern exists in:

  • FileCreationFix.apply() - file_path validation needed
  • FileModificationFix.apply() - file_path validation needed
  • CodeSampler._format_code_samples() - path validation needed
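
For these call sites the input is a file path, not a hash, so a format check is not available and the resolved location has to be tested directly. A minimal sketch of a shared helper follows; `resolve_within` is a hypothetical name, and how file_path reaches these methods is an assumption, since their code is not shown in this issue.

```python
from pathlib import Path


def resolve_within(root: Path, user_path: str) -> Path:
    """Resolve user_path against root and reject anything that leaves root."""
    root = root.resolve()
    # Joining an absolute user_path discards root entirely, so the check
    # below must run on the resolved result, not on the input string.
    candidate = (root / user_path).resolve()
    if not candidate.is_relative_to(root):
        raise ValueError(f"path escapes {root}: {user_path!r}")
    return candidate


repo_root = Path("/tmp/agentready-demo/repo")
ok = resolve_within(repo_root, "docs/README.md")

blocked = []
for hostile in ("../../etc/passwd", "/etc/passwd"):
    try:
        resolve_within(repo_root, hostile)
    except ValueError:
        blocked.append(hostile)
```

Both the relative and the absolute hostile paths end up in `blocked`, while the ordinary relative path resolves under the repository root.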
