[SPARK-56630][INFRA] Surface javadoc crash culprit in unidoc failure output #55548
Closed
cloud-fan wants to merge 4 commits into
Unidoc currently fails with ~100 `[error]` lines on genjavadoc-generated
Java stubs under `target/java/...` (private Scala case-class `apply`
methods that produce invalid `static public abstract R apply(T1, ...)`
Java). These errors are benign -- every PR emits them -- but they
overshadow the real cause when javadoc hard-exits mid-stream on specific
doc-comment content. The actual crash signal is the last
`Generating .../<Class>.html...` line before
`javadoc exited with exit code 1`, which a developer has to hunt for by
hand in multi-thousand-line CI logs.
Tee sbt output to `target/unidoc-build.log` and, on failure, print a
framed banner with:
- the HTML file javadoc was generating when it died,
- the inferred source class to audit,
- a one-paragraph hint about the usual scaladoc triggers
(wiki-style `[[...]]` links, inline-backtick code refs),
- an explicit note that the `[error]` lines on `target/java/...`
stubs are not the cause.
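The tee step that feeds this banner could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper name `stream_and_capture_sketch`, its signature, and its return value are all assumptions made for the example. It relies only on Ruby's standard `open3` library, so no shell `pipefail` handling is involved.

```ruby
require "open3"
require "fileutils"

# Hypothetical helper (name and signature are illustrative only).
# Runs `cmd` (an argv array), echoes every output line to `out`, and writes
# the same line to `log_path`. Returns [success?, captured_lines].
def stream_and_capture_sketch(cmd, log_path, out: $stdout)
  FileUtils.mkdir_p(File.dirname(log_path))
  lines = []
  status = nil
  File.open(log_path, "w") do |log|
    # popen2e merges the child's stderr into its stdout, so one pipe
    # carries everything the child prints.
    Open3.popen2e(*cmd) do |stdin, output, wait_thr|
      stdin.close
      output.each_line do |line|
        out.print(line)   # live output for the CI console
        log.print(line)   # persistent copy for later diagnosis
        lines << line
      end
      status = wait_thr.value
    end
  end
  [status.success?, lines]
end
```

A caller would pass something like `["build/sbt", "unidoc"]` and `"target/unidoc-build.log"`, then run the diagnosis only when the first return value is false.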
Heuristic only; when the log doesn't match the mid-HTML-crash pattern
(e.g. scaladoc failure, sbt env issue) the banner says so and points
back to the full log above.
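The heuristic can be illustrated with a short sketch, under stated assumptions. The function names, the regular expressions, the banner wording, and the rule for turning an HTML path into a class name (take everything from `org/apache/` onward) are all hypothetical choices for this example; only the two log markers, `Generating .../<Class>.html...` and `javadoc exited with exit code N`, come from the description above.

```ruby
# All names and patterns below are illustrative assumptions.
ANSI_ESCAPE  = /\e\[[0-9;]*m/
GENERATING   = /Generating\s+(\S+\.html)\.\.\./
JAVADOC_EXIT = /javadoc exited with exit code \d+/

# Returns the last HTML path seen before the javadoc exit line, or nil
# when the log does not match the mid-HTML-crash pattern.
def find_crash_html(log_text)
  last_html = nil
  log_text.each_line do |raw|
    line = raw.gsub(ANSI_ESCAPE, "")   # strip colour codes before matching
    if (m = line.match(GENERATING))
      last_html = m[1]
    elsif line.match?(JAVADOC_EXIT)
      return last_html
    end
  end
  nil
end

# Assumed inference rule: keep the path from "org/apache/" onward,
# drop ".html", and turn slashes into dots.
def html_to_class(html_path)
  rel = html_path[%r{org/apache/.*\.html\z}] || html_path
  rel.sub(/\.html\z/, "").tr("/", ".")
end

def unidoc_banner(log_text)
  html = find_crash_html(log_text)
  body =
    if html
      [
        "javadoc died while generating: #{html}",
        "Source class to audit:         #{html_to_class(html)}",
        "Note: [error] lines on target/java/... stubs are not the cause."
      ]
    else
      ["No mid-HTML crash pattern found; see the full log above."]
    end
  rule = "=" * 72
  [rule, *body, rule].join("\n")
end
```

Returning `nil` when no `Generating` line precedes the exit marker keeps the fallback explicit, so the banner never names a class it did not actually find in the log.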
Co-authored-by: Isaac
Intentionally reintroduces the scaladoc pattern that hard-exited javadoc on PR apache#51419 (wiki-style [[TableIdentifier]] / [[toQualifiedNameParts]] refs plus backtick-inline `Seq[String]`) in CatalogV2Implicits.IdentifierHelper. CI should fail at the unidoc step and the new diagnostic banner should name this class as the culprit. Drop this commit before merging. Co-authored-by: Isaac
…agnostic helper, drop magic 100
The bait commit on this branch ( The banner correctly named |
…stic" This reverts commit a4b30e8.
HyukjinKwon approved these changes on Apr 26, 2026
Merged to master.
juliuszsompolski pushed a commit to juliuszsompolski/apache-spark that referenced this pull request on Apr 28, 2026
…ilures

The unidoc step in CI sometimes failed with the SPARK-56630 (apache#55548) banner reporting "Javadoc exited but no class HTML generation was in progress" or with no actionable diagnostic at all. Two javadoc defaults were masking the underlying causes:

1. `-Xmaxerrs 100`: javadoc bails during source loading once the cumulative count of benign genjavadoc-stub errors crosses 100, before any HTML is generated. Every Spark unidoc run produces ~100 such errors (`error: cannot find symbol` on type variables, `error: illegal combination of modifiers: abstract and static`) -- the SPARK-56630 PR description documents that these are inert. When the count tips past 100 the build fails with a wall of those errors and no Generating .html line for the SPARK-56630 banner to point at.
2. `-Xmaxwarns 100`: even when javadoc completes HTML generation, the doclint warnings on a full Spark unidoc run number in the tens of thousands (`no comment`, `empty <p> tag`, `no @return`, `no @param ...`). Anything past the first 100 is silently dropped, including per-link `error: reference not found` lines that share the warn stream.

Setting both ceilings to 999999 keeps javadoc producing output past the real volume so the SPARK-56630 banner can identify the crashing class.

`-verbose` makes javadoc emit a `<path>.java:<line>: error: reference not found` line for every broken {@link} during HTML generation. Without it, javadoc tracks reference errors in its internal counter and reports the bulk total in the final `<N> errors / <M> warnings` summary, but does not print a file:line for each one. The flag also dumps "Loading source file ..." progress lines and grows the unidoc log by an order of magnitude; that is the price of being able to debug reference errors at all from CI logs.

This change does not silence any failure -- javadoc still exits non-zero when there are real errors. It only removes the noise / clipping masks.
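To check whether a captured log may have been clipped by the default ceilings, one could tally its diagnostics along these lines. This is a rough sketch under assumptions: the function name, the substring tests (`": error:"`, `": warning:"`, `"/target/java/"`), and the "at or above 100 means possibly clipped" rule are illustrative guesses about the log shape, not behaviour taken from this commit.

```ruby
# Default ceiling for -Xmaxerrs / -Xmaxwarns, as stated in the commit message.
DEFAULT_CEILING = 100

# Hypothetical helper: tallies javadoc diagnostics in a captured log and
# flags streams whose count has reached the default ceiling.
def summarize_diagnostics(log_text)
  counts = { stub_errors: 0, other_errors: 0, warnings: 0 }
  log_text.each_line do |line|
    if line.include?(": error:")
      # Assumption: genjavadoc stubs live under a target/java/ directory.
      key = line.include?("/target/java/") ? :stub_errors : :other_errors
      counts[key] += 1
    elsif line.include?(": warning:")
      counts[:warnings] += 1
    end
  end
  total_errors = counts[:stub_errors] + counts[:other_errors]
  counts.merge(
    errors_possibly_clipped: total_errors >= DEFAULT_CEILING,
    warnings_possibly_clipped: counts[:warnings] >= DEFAULT_CEILING
  )
end
```

A result with `errors_possibly_clipped: true` and a high `stub_errors` count would be consistent with the first failure mode described above, though it does not prove it.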
juliuszsompolski pushed a commit to juliuszsompolski/apache-spark that referenced this pull request on Apr 28, 2026
The diagnostic banner added in SPARK-56630 (apache#55548) names the class javadoc was rendering when it crashed mid-HTML-generation. With `-verbose` now enabled in JavaUnidoc / unidoc / javacOptions, javadoc also emits a per-error line of the form `<path>.java:<line>: error: reference not found` for every broken {@link} it can't resolve during HTML generation, and the build then fails on the non-zero error count even though all HTML files were produced.

This commit makes the banner scan the captured log for those messages and list them in the diagnostic output, so the developer sees the file:line of each broken {@link} alongside the existing class-crash hint.

A short note on the most common cause is included in the banner: [[Class.member]] in scaladoc when Class is a regular class/trait (not a Scala object) trips javadoc's inner-class lookup; the fix is to use [[Class#member]] (the Javadoc-canonical member separator), which genjavadoc passes through unchanged.
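The log scan described here might look something like the sketch below. The function name, the regular expression, and the returned hash shape are assumptions for illustration; only the message form `<path>.java:<line>: error: reference not found` is taken from the commit message.

```ruby
# Illustrative pattern for the per-error line javadoc prints with -verbose.
REFERENCE_NOT_FOUND = /(\S+\.java):(\d+): error: reference not found/

# Hypothetical helper: returns the unique file/line pairs of broken {@link}
# references found in a captured unidoc log.
def broken_link_locations(log_text)
  hits = log_text.each_line.map do |raw|
    line = raw.gsub(/\e\[[0-9;]*m/, "")   # strip ANSI colour codes first
    m = line.match(REFERENCE_NOT_FOUND)
    { file: m[1], line: m[2].to_i } if m
  end
  hits.compact.uniq
end
```

De-duplicating the hits keeps the banner short if the same location shows up more than once in the log.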
What changes were proposed in this pull request?
Adds a diagnostic banner to the unidoc step in `docs/_plugins/build_api_docs.rb`. When `build/sbt unidoc` fails, the script now scans the captured sbt output and prints a framed summary naming the `<Class>.html` javadoc was generating when it died, the inferred source class to audit, and a one-paragraph hint about the usual scaladoc triggers.

Implementation:
- `stream_and_capture` tees sbt output to both stdout and `target/unidoc-build.log` (Ruby-only, no shell `pipefail` reliance).
- `diagnose_unidoc_failure` finds the last `Generating .../<Class>.html...` line before `javadoc exited with exit code N` and prints a culprit-pointer banner. ANSI colour codes are stripped before regex matching.

Why are the changes needed?
Today, when javadoc hard-exits during unidoc HTML generation -- typically because of a specific scaladoc construct (e.g. wiki-style `[[Class]]` links or backtick-inline code refs) in an exposed Scala source -- the failing PR's CI log shows ~100 `[error]` lines on `target/java/...` files. Those errors are benign: they're genjavadoc-emitted Java stubs (`static public abstract R apply(T1, T2, T3, T4)`) that every PR produces, and `javadoc` always complains about them but normally still finishes. They are not the cause of the failure.

The actual signal is the last `Generating .../<Class>.html...` line before `javadoc exited with exit code 1`, which a developer has to find by hand in a multi-thousand-line log. The error reporting does not differentiate the benign noise from the real crash, so the failure consistently looks like it's "in" `ErrorInfo.java` / `LexicalThreadLocal.java` / similar, when it's actually in a Scala source that none of those names point to.

A recent example: PR #51419 hit this exact misdirection -- the log was full of errors on `common/utils/target/java/...` stubs, but the real culprit was a doc comment in `CatalogV2Implicits.IdentifierHelper` that triggered a hard exit during HTML generation. The diagnostic in this PR would have named that class directly.

Does this PR introduce any user-facing change?
No. CI-only output change visible in the unidoc step of the doc-gen job.

How was this patch tested?
`org/apache/spark/sql/connector/catalog/CatalogV2Implicits.IdentifierHelper.html` as the crash class.
`DO NOT MERGE: break a docstring to validate the unidoc diagnostic`) intentionally reintroduces the same `[[...]]` + backtick-inline scaladoc pattern in `CatalogV2Implicits.IdentifierHelper.asTableIdentifierOpt` so that this PR's CI run actually exercises the new path. Once the banner fires and names that class on the failing CI run, that commit will be dropped from this PR.

Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude (Anthropic)