This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Splat is an AWS Lambda function that renders PDFs from HTML/CSS/JS using either PrinceXML or Playwright. It's designed as a self-hosted alternative to DocRaptor.
Key Technologies:
- Python 3.11 runtime
- PrinceXML 14 for fast CSS-based PDF rendering
- Playwright 1.43.0 with Chromium for JavaScript-heavy pages
- Docker-based deployment to AWS Lambda
- Boto3 for S3 integration
mise run install # Bootstrap project: install deps, setup pre-commit hooksmise run test # Run tests in docker-compose with Lambda emulation
mise run ci:test # Run tests in CI mode (builds first, no interactive mode)mise run format # Run ruff formatter
mise run ruff-check # Run ruff linter with auto-fix
mise run lint # Run both format and ruff-checkmise run build # Build docker image
mise run start # Start local Lambda + MinIO (S3) with hot-reload
# Test with CLI (requires local server running)
./splat_cli.py -o /tmp/test.pdf -c "<h1>Hello</h1>"
./splat_cli.py -o /tmp/google.pdf -b https://google.com
# Test against deployed Lambda
./splat_cli.py -o /tmp/test.pdf -c "<h1>Hello</h1>" --function-name splat-stagingdocker compose --profile test run --rm -it dev pytest tests/test_lambda_e2e.py::test_check_license_returns_a_license_payload
docker compose --profile test run --rm -it dev pytest tests/test_lambda_e2e.py -k "princexml and A4"File: lambda_function.py:412
- Function:
lambda_handler(event, context) - Main Logic:
handle_event(payload)(no error handling, called by handler)
1. init() - Setup fonts if fonts/ directory exists
2. Pydantic validation (Payload model)
3. License check (if check_license=true)
4. create_pdf() - Generate PDF based on input source
├─ pdf_from_document_content() - Direct HTML string
├─ pdf_from_document_url() - Fetch remote HTML via requests
└─ pdf_from_browser_url() - Visit with Playwright, extract HTML
5. deliver_pdf() - Send PDF to destination
├─ deliver_pdf_to_s3_bucket() - Upload to S3, return presigned URL
├─ deliver_pdf_to_presigned_url() - POST to external presigned URL (10 retries)
└─ deliver_pdf_via_streaming_base64() - Base64 encode (max 5.5MB)
-
PrinceXML (
prince_handler()at line 272): Fast, CSS-focused, optional JavaScript- Binary:
/var/task/prince - License:
./prince-engine/license/license.dat - Custom fonts via
FONTCONFIG_PATHenvironment variable
- Binary:
-
Playwright (
playwright_page_to_pdf()at line 104): Full browser, JS execution- Launch args optimized for Lambda (30+ flags, no sandbox)
- Network logging for debugging
- Waits for "domcontentloaded" + "load" events
- Emulates print media type
Purpose: Django/Python library for invoking Lambda
Key Functions:
configure_splat()- Set function name, region, bucket, sessionpdf_from_html()- Full S3 workflow (upload HTML to temp location, invoke Lambda with document_url)pdf_from_html_without_s3()- Streaming mode (embed HTML in payload, receive base64)
Files:
uptick_splat/__init__.py- Public APIuptick_splat/config.py- Configuration with defaultsuptick_splat/utils.py- Lambda invocation orchestration (150+ lines)uptick_splat/logging.py- Structured logginguptick_splat/exceptions.py- Exception definitions
- Payload Model:
lambda_function.py:48- Pydantic validation for all input params - PDF Creation:
lambda_function.py:233-266- Orchestrates create_pdf() based on input source - PrinceXML Handler:
lambda_function.py:272-309- Subprocess execution with structured logs - Playwright Context Manager:
lambda_function.py:104-194- Browser lifecycle with error handling - S3 Delivery with Retry:
lambda_function.py:342-362- POST with exponential backoff logic - License Check:
lambda_function.py:368-392- Parse PrinceXML license file
- PDF Generation:
uptick_splat/utils.py:39-98- pdf_from_html() with S3 temp storage - Streaming Mode:
uptick_splat/utils.py:101-126- pdf_from_html_without_s3() - Config Management:
uptick_splat/config.py:11-21- Config dataclass with defaults
-
tests/test_lambda_e2e.py- End-to-end integration tests (151 lines)- License validation
- Renderer matrix (princexml/playwright × A4/Letter)
- Input sources (document_url, document_content, browser_url)
- Output modes (presigned_url, base64, bucket_name)
- Error handling
-
tests/utils.py- Test helpers (MinIO S3 client, Lambda HTTP wrapper)
-
lambda service: Production-like container on port 8080
- Watch mode: Hot-reload on
lambda_function.pychanges - AWS env vars configured
- Watch mode: Hot-reload on
-
minio service: S3-compatible storage on ports 9000 (API), 9090 (console)
- Auto-creates "test" bucket
- Credentials: root/password
-
dev service: Development container with uv for running tests
Stage 1 (Build):
- Base:
mcr.microsoft.com/playwright/python:v1.43.0-jammy - Install: gcc, cmake, libcurl for compiling dependencies
- Copy:
lambda_requirements.txt→/var/task
Stage 2 (Runtime):
- Base: Same Playwright image
- Install: AWS Lambda RIE, PrinceXML 14.2-aws-lambda
- Copy: Custom fonts from
splat-private/fonts/(if exists) - Copy: License from
splat-private/license.dat(if exists) - Entrypoint:
/entry_script.sh lambda_function.lambda_handler
license.dat- PrinceXML license file (place in root before build)fonts.zip- Must containfonts/folder with font files (place in root before build)
# entry_script.sh
if [ -z "${AWS_LAMBDA_RUNTIME_API}" ]; then
# Local: Use Lambda RIE
exec /usr/local/bin/aws-lambda-rie /usr/bin/python -m awslambdaric $@
else
# AWS: Use native runtime
exec /usr/bin/python -m awslambdaric $@
fi- Response Size: 6MB max for base64 streaming (checked at line 318)
- Timeout: 10 minutes for Playwright page loads
- Temp Storage:
/tmpdirectory, cleaned up in finally block
- Retry logic: Up to 10 attempts (line 342-362)
- Handles AWS 500/503 transient failures
- No exponential backoff implemented (constant retries)
- SplatPDFGenerationFailure: Custom exception with status codes (400, 403, 500)
- Sentry Integration: Automatic capture with AwsLambdaIntegration
- Network Logging: Playwright requests/responses logged for debugging
requests==2.31.0
boto3==1.34.0
sentry-sdk==1.39.0
awslambdaric
pydantic
playwright==1.43.0
requires-python = ">=3.9" # Library supports 3.9+, Lambda uses 3.11
dependencies:
django = ">=3.1, <5.0.0"
boto3
playwright = "1.43.0"
dev dependencies:
pytest = ">=8.1.1"
ruff = ">=0.3.7"
- Linter: Ruff with 120 character line length
- Formatter: Ruff format
- Type Hints: MyPy support (py.typed marker file)
- Pre-commit Hooks: Configured in
.pre-commit-config.yaml
SENTRY_DSN- Error tracking endpoint (optional)FONTCONFIG_PATH- Custom fonts directory (set by init() if fonts exist)AWS_ACCESS_KEY_ID,AWS_SECRET_ACCESS_KEY,AWS_DEFAULT_REGION- AWS credentials
AWS_LAMBDA_RUNTIME_API- Presence indicates AWS environment vs local RIE
- Check network logs in Sentry (automatically captured)
- Verify browser_headers are properly formatted
- Ensure target URL is publicly accessible from Lambda
- Check timeout settings (default: 10 min)
- PrinceXML: CSS-based, no JS execution by default
- Playwright: Full browser, always executes JS
- Hybrid approach:
browser_url+renderer: princexml(visit with Playwright, render with PrinceXML)
- Ensure
fonts.zipcontainsfonts/folder (not root-level fonts) - Verify FONTCONFIG_PATH is set (check init() logs)
- PrinceXML includes Liberation fonts by default
- Use
{"check_license": true}payload to check status - Response includes: type, expiry_date, remaining_count (if limited)
- Demo mode watermarks PDFs if no valid license