
RE1-T119 Bug fix #382

Merged: ucswift merged 2 commits into master from develop on May 14, 2026

@ucswift ucswift commented May 14, 2026

Summary by CodeRabbit

  • Chores
    • Optimized Docker build for the TTS service using a refined multi-stage approach and reduced runtime footprint.
    • Adjusted runtime library path handling for the worker image to enforce the dedicated runtime library directory.

request-info Bot commented May 14, 2026

Thanks for opening this, but we'd appreciate a little more information. Could you update it with more details?

coderabbitai Bot commented May 14, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8576236c-fd32-49ee-84af-6ea4f04adb16

📥 Commits

Reviewing files that changed from the base of the PR and between 05a9326 and 1f9e908.

📒 Files selected for processing (2)
  • Web/Resgrid.Web.Tts/Dockerfile
  • Workers/Resgrid.Workers.Console/Dockerfile
🚧 Files skipped from review as they are similar to previous changes (1)
  • Web/Resgrid.Web.Tts/Dockerfile

📝 Walkthrough

Walkthrough

The Dockerfiles move TTS runtime preparation into a new publish stage that installs runtime packages, prepares Piper/ffmpeg/espeak assets, collects required shared libraries, and then the final stage copies only those artifacts and sets LD_LIBRARY_PATH. The workers image now overwrites LD_LIBRARY_PATH to /usr/local/lib/wkhtmltopdf.

Changes

TTS Runtime Image Multistage Build Refactoring

| Layer / File(s) | Summary |
| --- | --- |
| Multistage publish: install deps, prepare Piper, collect libs (Web/Resgrid.Web.Tts/Dockerfile) | Adds publish stage from build: installs ffmpeg, libespeak-ng1, ca-certificates, curl; downloads/unpacks Piper assets; creates app user/group and appends entries; collects deduplicated shared libraries for Piper/ffmpeg into /tmp/ttsdeps. |
| Final stage assembly and runtime config (Web/Resgrid.Web.Tts/Dockerfile) | Changes final stage to FROM base AS final; copies prepared Piper binary/voices, espeak data, ffmpeg/ffprobe, collected TTS libraries, captured /etc/passwd and /etc/group, and published .NET output; sets LD_LIBRARY_PATH to /usr/local/lib/tts. |
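
The "collects deduplicated shared libraries" step can be pictured with a small shell sketch. This is not the PR's actual Dockerfile code: the function name `collect_libs` and its arguments are hypothetical, and it assumes a text file holding one library path per line.

```shell
# Hypothetical helper: copy each unique, existing library listed in a file
# into a destination directory. Names and paths are illustrative only.
collect_libs() {
    list_file=$1
    dest_dir=$2
    mkdir -p "$dest_dir"
    # sort -u removes duplicate paths gathered from several binaries.
    sort -u "$list_file" | while IFS= read -r lib; do
        # Skip blank lines and anything that is not a regular file
        # (the -f test follows symlinks).
        [ -n "$lib" ] && [ -f "$lib" ] || continue
        # -L copies the file a symlink points to, not the link itself.
        cp -L "$lib" "$dest_dir/"
    done
}
```

A call such as `collect_libs /tmp/ttsdeps/libs.txt /tmp/ttsdeps/lib` would leave only real files in the destination, so a later cross-stage copy has no links to resolve. The `lib` subdirectory in that call is an assumed name.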

Workers image LD_LIBRARY_PATH change

| Layer / File(s) | Summary |
| --- | --- |
| Overwrite LD_LIBRARY_PATH in final image (Workers/Resgrid.Workers.Console/Dockerfile) | Final-stage ENV LD_LIBRARY_PATH now sets /usr/local/lib/wkhtmltopdf instead of appending to an existing LD_LIBRARY_PATH. |
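
The summary does not say why the workers image stopped appending. A plausible reason, matching the inline comment on the TTS Dockerfile in this review, is that appending an unset variable leaves a trailing colon, and glibc's dynamic linker treats an empty path segment as the current directory. The sketch below only illustrates that expansion in plain shell; both function names are hypothetical.

```shell
# Hypothetical helper: prepend a directory to a colon-separated search path
# without leaving an empty segment when the existing value is unset or empty.
prepend_path() {
    new_dir=$1
    existing=${2-}
    # ${existing:+:$existing} expands to ":$existing" only when existing is non-empty.
    printf '%s\n' "${new_dir}${existing:+:$existing}"
}

# Naive version for comparison: leaves a trailing colon when existing is empty.
naive_prepend_path() {
    printf '%s\n' "$1:${2-}"
}
```

With an empty second argument, `naive_prepend_path /usr/local/lib/wkhtmltopdf ""` yields a value ending in `:`, while `prepend_path` yields the bare directory. Setting the variable to a fixed value, as this PR does, sidesteps the question entirely.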

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • Resgrid/Core#365: The main PR's Dockerfile refactor changes how TTS/Piper assets and runtime libraries are staged and copied into the final image, which directly overlaps with #365's adjustments to copying libpiper_phonemize.so*/updating the dynamic linker in the Dockerfile's final stage.
  • Resgrid/Core#364: Both PRs modify Web/Resgrid.Web.Tts/Dockerfile in the Piper TTS download/build flow (the main PR moves asset preparation into a publish stage, while the other PR changes the PIPER_VERSION build-arg that determines which Piper artifact URL is fetched).
  • Resgrid/Core#340: Both PRs modify Web/Resgrid.Web.Tts/Dockerfile to change the multi-stage build/runtime layout (stage definitions and FROM base AS final/final assembly flow).
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ❓ Inconclusive | The title 'RE1-T119 Bug fix' is vague and does not clearly describe what bug was fixed or what the main changes accomplish. | Provide a more descriptive title that specifies the bug or issue being fixed, such as 'Fix Docker multi-stage builds for TTS and Workers runtime dependencies' or similar. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
Web/Resgrid.Web.Tts/Dockerfile (1)

38-38: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fallback symlink branch will produce a broken espeak-ng-data in the final image.

When the Piper tarball does not include espeak-ng-data, the else branch creates a symlink pointing to /usr/lib/x86_64-linux-gnu/espeak-ng-data (provided by libespeak-ng1). That target only exists in the publish stage — the final stage’s base image does not install libespeak-ng1, and COPY preserves symlinks rather than dereferencing directory symlinks across stages, so the final image ends up with a dangling /usr/share/espeak-ng-data and Piper will fail to find phoneme data at runtime.

Either resolve the symlink at copy time (e.g. cp -RL the data into a real directory in publish before the final COPY) or assert that the Piper bundle contains espeak-ng-data and fail the build otherwise.

🛡️ Proposed fix (materialize data in publish stage)
-	&& if [ -d /tmp/piper/espeak-ng-data ]; then cp -R /tmp/piper/espeak-ng-data /usr/share/; else ln -sf /usr/lib/x86_64-linux-gnu/espeak-ng-data /usr/share/espeak-ng-data; fi \
+	&& if [ -d /tmp/piper/espeak-ng-data ]; then \
+		cp -R /tmp/piper/espeak-ng-data /usr/share/; \
+	elif [ -d /usr/lib/x86_64-linux-gnu/espeak-ng-data ]; then \
+		cp -R /usr/lib/x86_64-linux-gnu/espeak-ng-data /usr/share/espeak-ng-data; \
+	else \
+		echo "espeak-ng-data not found" >&2; exit 1; \
+	fi \

Also applies to: 70-70
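
The "materialize or fail" idea can also be written as a reusable shell function. This is a minimal sketch rather than the project's code: `materialize_dir` is a hypothetical name, and it assumes the destination does not exist yet.

```shell
# Hypothetical helper: copy the first existing source directory to dest as a
# real directory tree, following symlinks; fail if no source exists.
# Assumes dest does not exist before the call.
materialize_dir() {
    dest=$1
    shift
    for src in "$@"; do
        if [ -d "$src" ]; then
            mkdir -p "$(dirname "$dest")"
            # -R recurses; -L dereferences symlinks so no link survives the copy.
            cp -RL "$src" "$dest"
            return 0
        fi
    done
    echo "no source directory found for $dest" >&2
    return 1
}
```

In the publish stage this might be called as `materialize_dir /usr/share/espeak-ng-data /tmp/piper/espeak-ng-data /usr/lib/x86_64-linux-gnu/espeak-ng-data`, so a missing data directory stops the build instead of producing a dangling link.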

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Web/Resgrid.Web.Tts/Dockerfile` at line 38, The fallback branch in the
Dockerfile line that currently does "if [ -d /tmp/piper/espeak-ng-data ]; then
cp -R /tmp/piper/espeak-ng-data /usr/share/; else ln -sf
/usr/lib/x86_64-linux-gnu/espeak-ng-data /usr/share/espeak-ng-data; fi" produces
a dangling symlink in the final image because /usr/lib/... only exists in the
publish stage; change this to materialize a real directory in the publish stage
or fail the build: detect absence of /tmp/piper/espeak-ng-data and then copy
(dereference) the espeak-ng-data contents from the publish-stage location into
/usr/share/espeak-ng-data (e.g., use a dereferencing copy such as cp -RL)
or exit with an error if neither source exists so the final
image contains real data rather than a symlink.
🧹 Nitpick comments (3)
Web/Resgrid.Web.Tts/Dockerfile (3)

25-55: 💤 Low value

Single mega-RUN couples unrelated concerns; consider splitting for cache locality.

This 30-line RUN does apt install, Piper extraction, espeak data placement, voice-model download, user creation, and dependency collection in one shot. Any change to the voice list (likely the most-edited part) invalidates the apt + Piper layers and re-pays the install cost on every rebuild. Splitting into ~3 RUNs (apt + Piper binary, voice downloads, user + lib collection) would give much better layer caching during iteration without changing image size meaningfully.

Not blocking — purely a build-time ergonomics improvement.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Web/Resgrid.Web.Tts/Dockerfile` around lines 25 - 55, The big RUN that
handles apt installs, Piper extraction, espeak data handling, voice-model
downloads and cleanup should be split into separate RUNs to improve layer
caching: 1) a RUN that does apt-get update/install (ffmpeg, libespeak-ng1,
ca-certificates, curl), creates /usr/local/share/piper-voices, downloads and
installs the Piper binary using PIPER_VERSION and performs the espeak-ng-data
copy/link and ldconfig, then cleans /tmp; 2) a separate RUN that performs the
voice-model curl loop writing to /usr/local/share/piper-voices (the list of
voice paths and the curl --retry invocations); and 3) a final RUN for any user
creation, permissions (chmod +x /usr/local/bin/piper), and any remaining lib
copying (find /tmp/piper -name '*.so*' -> /usr/local/lib) if still
needed—preserve command order and cleanup (rm -rf) in each RUN to keep image
size unchanged.

59-63: ⚡ Quick win

ldd scan is missing ffprobe and parses "not found" lines as paths.

Two small issues in the dependency-collection loop:

  1. ffprobe is copied into the final image (line 72) but is not included in the for bin in … list, so any library it needs that piper/ffmpeg do not pull in will be absent at runtime. In practice ffprobe’s deps are a subset of ffmpeg’s, but it is safer to include it explicitly.
  2. awk '/=>/ {print $3}' also matches "lib… => not found" lines, producing the literal string "not" instead of a path. The downstream [ -f "$lib" ] filters it out, but it is worth tightening the filter and surfacing missing libraries instead of silently dropping them.
♻️ Proposed refactor
-	&& for bin in /usr/local/bin/piper /usr/bin/ffmpeg; do \
-		ldd "$bin" 2>/dev/null | awk '/=>/ {print $3}' >> /tmp/ttsdeps/libs.txt; \
-	done \
+	&& for bin in /usr/local/bin/piper /usr/bin/ffmpeg /usr/bin/ffprobe; do \
+		if ldd "$bin" 2>/dev/null | grep -q 'not found'; then \
+			echo "Missing libraries for $bin" >&2; \
+			ldd "$bin" | grep 'not found' >&2; \
+			exit 1; \
+		fi; \
+		ldd "$bin" 2>/dev/null | awk '/=> \// {print $3}' >> /tmp/ttsdeps/libs.txt; \
+	done \
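
The two parsing concerns can be separated into small filters that read ldd-style text from stdin, which makes them easy to check against sample output. This is a sketch only; `resolved_libs` and `missing_libs` are hypothetical names, and the field positions assume the usual `name => path (address)` layout.

```shell
# Hypothetical helpers operating on ldd-style text read from stdin.

# Print only resolved absolute library paths (third field starts with "/").
resolved_libs() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3 }'
}

# Print the names of libraries that ldd reported as "not found".
missing_libs() {
    awk '$2 == "=>" && $3 == "not" && $4 == "found" { print $1 }'
}
```

A build step could run `ldd "$bin" | missing_libs` first and abort if it prints anything, then append the output of `ldd "$bin" | resolved_libs` to the library list.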
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Web/Resgrid.Web.Tts/Dockerfile` around lines 59 - 63, The
dependency-collection loop currently iterates "for bin in /usr/local/bin/piper
/usr/bin/ffmpeg" and uses "ldd \"$bin\" | awk '/=>/ {print $3}'" which omits
ffprobe and also extracts the word "not" from "=> not found"; update the loop to
include /usr/bin/ffprobe alongside /usr/local/bin/piper and /usr/bin/ffmpeg,
tighten the ldd parsing (e.g. only print the third field when it is an absolute
path or exclude "not found" lines) instead of blindly using awk '/=>/ {print
$3}', and surface missing libraries by capturing ldd lines containing "not
found" to a separate file (e.g. /tmp/ttsdeps/missing.txt) so missing deps are
visible rather than silently ignored.

62-63: ⚖️ Poor tradeoff

Filter system libraries from dependency collection to avoid shadowing the runtime base image.

Lines 59-63 collect all libraries from ldd output (piper, ffmpeg) plus everything in /usr/local/lib, then copy them to /tmp/ttsdeps/. This includes system libraries like libc.so.6, libstdc++.so.6, libgcc_s.so.1, etc., from the SDK image. Because /usr/local/lib/tts is placed first in LD_LIBRARY_PATH (line 78), these copies will take precedence over the runtime base image's system libraries.

Both stages currently use Debian 13, so this is harmless today, but creates a hidden coupling: if the SDK base drifts a glibc patch version ahead, the runtime will silently use the older/newer libc from /usr/local/lib/tts, risking version mismatches like GLIBC_X.X not found errors.

Filter out system libraries guaranteed by the runtime base (libc, libstdc++, libgcc_s, libm, libdl, libpthread, librt, ld-linux*) and keep only TTS-specific ones (libespeak-ng and piper's bundled dependencies). Use a grep exclusion on the ldd output:

ldd "$bin" 2>/dev/null | awk '/=>/ {print $3}' | grep -vE '/(libc|libstdc\+\+|libgcc_s|libm|libdl|libpthread|librt|ld-linux[^/]*)\.so' >> /tmp/ttsdeps/libs.txt

Also applies to: 73
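
One detail matters when writing such an exclusion: bare names like libc or libm are substrings of unrelated libraries (libcurl, libcrypto, libmp3lame), so the pattern should be anchored on the file name. The filter below is a minimal sketch with a hypothetical function name; the exclusion list is the one given in this comment.

```shell
# Hypothetical filter: drop core system libraries from a list of paths on stdin.
# The pattern is anchored between "/" and ".so" so that libc does not also
# match libcurl or libcrypto, and libm does not match libmp3lame.
drop_system_libs() {
    # grep -v exits non-zero when nothing is printed; "|| true" keeps that
    # from aborting a script that runs under "set -e".
    grep -vE '/(libc|libstdc\+\+|libgcc_s|libm|libdl|libpthread|librt|ld-linux[^/]*)\.so' || true
}
```

The same function could be applied to both the ldd-derived list and the `find /usr/local/lib` list before the copy loop.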

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Web/Resgrid.Web.Tts/Dockerfile` around lines 62 - 63, The Dockerfile is
copying all .so files from /usr/local/lib into /tmp/ttsdeps, which can shadow
runtime system libraries; update the dependency collection for ldd outputs and
the /usr/local/lib copy to exclude standard system libs (libc, libstdc++,
libgcc_s, libm, libdl, libpthread, librt, ld-linux*) so only TTS-specific libs
(espeak-ng, piper and their bundled deps) are collected; specifically, change
the ldd collection pipeline that writes to /tmp/ttsdeps/libs.txt to filter out
those names (use a grep -vE exclusion) and likewise modify the find/copy step
(the line with find /usr/local/lib -name '*.so*' -type f >>
/tmp/ttsdeps/libs.txt and the subsequent copy loop) to skip files matching those
same system-library patterns, preserving LD_LIBRARY_PATH=/usr/local/lib/tts
semantics.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@Web/Resgrid.Web.Tts/Dockerfile`:
- Line 78: The ENV line in the Dockerfile sets LD_LIBRARY_PATH to
"/usr/local/lib/tts:${LD_LIBRARY_PATH}". When LD_LIBRARY_PATH is unset in the
base image, this expands to a value with a trailing colon (an empty path
segment), which lets the dynamic linker treat CWD as a library search location;
change the ENV declaration to set LD_LIBRARY_PATH explicitly without the
expansion, e.g. replace the ENV line with one that sets
LD_LIBRARY_PATH="/usr/local/lib/tts" (remove the ${LD_LIBRARY_PATH} and trailing
colon) so no empty path segment is introduced.
- Around line 74-75: The Dockerfile currently copies /etc/passwd and /etc/group
from the publish stage (COPY --from=publish /etc/passwd /etc/passwd and COPY
--from=publish /etc/group /etc/group), which overwrites base-image system
accounts; remove those two COPY lines and instead append only the app user/group
entries into the final image's /etc/passwd and /etc/group via a RUN that echoes
the new group and user lines (so you preserve base users like root/_apt), and
also remove corresponding groupadd/useradd invocations from the publish stage
(they're unnecessary and not available in the hardened runtime).
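
For the second item, appending entries rather than overwriting the files might look like the following sketch. It assumes the final image has a shell available for RUN; the function name, the `app` account name, the ID, the home directory, and the login shell are all illustrative and not taken from the PR.

```shell
# Hypothetical helper: append a group and a user entry to passwd/group files,
# skipping entries that already exist. Name, ID, home, and shell are illustrative.
append_app_user() {
    passwd_file=$1
    group_file=$2
    name=$3
    id=$4
    # group format: name:password:GID:members
    grep -q "^${name}:" "$group_file" 2>/dev/null ||
        printf '%s:x:%s:\n' "$name" "$id" >> "$group_file"
    # passwd format: name:password:UID:GID:GECOS:home:shell
    grep -q "^${name}:" "$passwd_file" 2>/dev/null ||
        printf '%s:x:%s:%s::/nonexistent:/usr/sbin/nologin\n' "$name" "$id" "$id" >> "$passwd_file"
}
```

Calling `append_app_user /etc/passwd /etc/group app 1000` twice leaves a single `app` entry in each file and keeps existing accounts such as root untouched.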

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c6253ce3-d943-428f-b5c8-9d7064d3a304

📥 Commits

Reviewing files that changed from the base of the PR and between 4c339bc and 05a9326.

📒 Files selected for processing (1)
  • Web/Resgrid.Web.Tts/Dockerfile

Comment thread Web/Resgrid.Web.Tts/Dockerfile Outdated
Comment thread Web/Resgrid.Web.Tts/Dockerfile Outdated
ucswift commented May 14, 2026

Approve

@github-actions github-actions Bot left a comment

This PR is approved.

@ucswift ucswift merged commit c5bb106 into master May 14, 2026
19 checks passed