Closed
2 changes: 2 additions & 0 deletions AGENTS.md
@@ -20,6 +20,8 @@ Spark Connect protocol is defined in proto files under `sql/connect/common/src/m

Avoid introducing non-ASCII characters in code or comments. String literals may contain non-ASCII when the content requires it (error messages, test data, etc.). Identifiers are ASCII by convention. The common failure mode is typographic characters (em-dash, smart quotes, ellipsis, non-breaking space) sneaking into comments; scalastyle flags some of these. Spot-check before committing: `grep -rn -P "[^\x00-\x7F]" <files>`.

+ Scaladoc member-link convention: when linking to a method or field of another class from a `/** */` doc comment, use `[[Class#method]]`, not `[[Class.method]]`. genjavadoc passes wiki-style links through as javadoc `{@link ...}`, and javadoc reads `.` as the inner-class separator. If `Class` is a regular `class` / `trait` (not a Scala `object`) and has no companion-object member with that name, javadoc cannot resolve the link and the unidoc step fails with `error: reference not found`. The `#` form is the Javadoc-canonical member separator and resolves cleanly. Same-class members can still be referenced bare as `[[methodName]]`.
+
## Build and Test

Build and tests can take a long time. Before running tests, ask the user if they have more changes to make.
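
Reviewer sketch (not part of the diff): the member-link convention added to AGENTS.md above could be spot-checked with a small script along these lines. The helper name `suggest_hash_links` and the sample doc comment are hypothetical, and the heuristic (uppercase owner, lowercase member) is an assumption: it cannot tell a Scala `object` from a regular `class`, so it may flag links that resolve fine.

```ruby
# Hypothetical helper, not part of the PR: flags scaladoc wiki links of the
# form [[Class.member]] and proposes the [[Class#member]] spelling.
# Assumption: an owner segment starting with an uppercase letter followed by
# a segment starting with a lowercase letter is a class-to-member link.
MEMBER_LINK = /\[\[((?:\w+\.)*[A-Z]\w*)\.([a-z]\w*)\]\]/

def suggest_hash_links(text)
  text.scan(MEMBER_LINK).map do |owner, member|
    ["[[#{owner}.#{member}]]", "[[#{owner}##{member}]]"]
  end
end

# Invented doc comment used only for illustration.
sample = <<~DOC
  /**
   * See [[Dataset.select]] and [[org.apache.spark.sql.Column]].
   * Already fine: [[Dataset#filter]].
   */
DOC

suggest_hash_links(sample).each do |from, to|
  puts "#{from} -> #{to}"
end
```

Package-qualified class links and links already using `#` are left alone; any hit still needs a manual check of whether the owner is an `object`.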
51 changes: 46 additions & 5 deletions docs/_plugins/build_api_docs.rb
@@ -164,11 +164,19 @@ def stream_and_capture(command, log_file)
end

# Scans the captured unidoc log and prints a pointer to the most likely
- # culprit source file. The heuristic: when javadoc dies mid-HTML-generation,
- # the last "Generating .../X.html" line before "javadoc exited with exit code"
- # names the class that tripped it. Prints nothing actionable if the failure
- # mode doesn't match (e.g. a scaladoc error), in which case the full log above
- # already shows what's wrong.
+ # culprit source file. Two failure modes are surfaced:
+ #
+ # 1. javadoc dies mid-HTML-generation. The last "Generating .../X.html" line
+ # before "javadoc exited with exit code" names the class that tripped it.
+ #
+ # 2. javadoc completes HTML generation but reports a non-zero "<N> errors"
+ # count from doclint reference checks. With "-verbose" enabled in the
+ # javacOptions, each such error appears in the log as
+ # <path>.java:<line>: error: reference not found
+ # and we list them so the developer knows exactly which {@link} to fix.
+ #
+ # Prints nothing actionable if neither pattern matches (e.g. a scaladoc
+ # error), in which case the full log above already shows what's wrong.
def diagnose_unidoc_failure(log_file)
return unless File.exist?(log_file)
begin
@@ -187,6 +195,22 @@ def diagnose_unidoc_failure(log_file)
end
end

+ # "error: reference not found" lines come from javadoc's doclint reference
+ # check on broken {@link Class.member} or {@link Class#member} refs in the
+ # generated stubs (under target/java/...). The line number in the message
+ # refers to the *generated* .java file, not the original .scala source, so
+ # finding the offending scaladoc usually means opening that target/java
+ # file at that line and tracing its {@link ...} back to the .scala doc.
+ ansi = /\e\[[0-9;]*[A-Za-z]/
+ ref_errors = []
+ lines.each do |line|
+ stripped = line.gsub(ansi, '')
+ if stripped =~ %r{^(?:\[(?:error|warn|info)\]\s+)?(\S+\.java):(\d+):\s+error: reference not found}
+ ref_errors << "#{$1}:#{$2}"
+ end
+ end
+ ref_errors.uniq!
+
banner = "=" * 78
$stderr.puts ""
$stderr.puts banner
@@ -209,6 +233,23 @@ def diagnose_unidoc_failure(log_file)
$stderr.puts " NOTE: the '[error]' lines above on files under"
$stderr.puts " target/java/... are benign genjavadoc stubs -- every PR"
$stderr.puts " emits them and they do not cause the exit. Ignore them."
+ elsif !ref_errors.empty?
+ $stderr.puts ""
+ $stderr.puts " Javadoc reference-resolution errors (each one is a broken"
+ $stderr.puts " {@link} in a doc comment that genjavadoc copied verbatim"
+ $stderr.puts " from the corresponding scaladoc; fix the [[link]] in the"
+ $stderr.puts " Scala source):"
+ $stderr.puts ""
+ ref_errors.first(50).each { |e| $stderr.puts " #{e}" }
+ if ref_errors.size > 50
+ $stderr.puts " ... and #{ref_errors.size - 50} more"
+ end
+ $stderr.puts ""
+ $stderr.puts " Common cause: [[Class.member]] in scaladoc when Class is a"
+ $stderr.puts " regular `class`/`trait` (not a Scala `object`) and there is"
+ $stderr.puts " no companion-object member with that name. genjavadoc emits"
+ $stderr.puts " {@link Class.member}, javadoc reads `.` as the inner-class"
+ $stderr.puts " separator and fails to resolve. Use [[Class#member]] instead."
elsif javadoc_exit_idx
$stderr.puts ""
$stderr.puts " Javadoc exited but no class HTML generation was in progress;"
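
Reviewer sketch (not part of the diff): the reference-error scan added to `diagnose_unidoc_failure` can be exercised in isolation roughly as follows. The two regexes are copied from the hunk above; the wrapper `collect_ref_errors` and every log line in `sample_log` are invented for illustration, with placeholder paths and line numbers.

```ruby
# Patterns copied from the diff; the wrapper method is hypothetical.
ANSI = /\e\[[0-9;]*[A-Za-z]/
REF_ERROR = %r{^(?:\[(?:error|warn|info)\]\s+)?(\S+\.java):(\d+):\s+error: reference not found}

# Returns unique "path:line" strings for each reference-not-found error.
def collect_ref_errors(lines)
  lines.map do |line|
    m = line.gsub(ANSI, '').match(REF_ERROR)
    "#{m[1]}:#{m[2]}" if m
  end.compact.uniq
end

# Invented log lines; not taken from a real build.
sample_log = [
  "[\e[31merror\e[0m] /tmp/target/java/org/example/Foo.java:42: error: reference not found",
  "[error] /tmp/target/java/org/example/Foo.java:42: error: reference not found",
  "/tmp/target/java/org/example/Bar.java:7: error: reference not found",
  "[info] Generating /tmp/api/org/example/Foo.html...",
  "[error] /tmp/target/java/org/example/Baz.java:3: error: cannot find symbol"
]

puts collect_ref_errors(sample_log)
```

The colored and plain copies of the `Foo.java:42` error collapse into one entry, the unprefixed `Bar.java:7` line still matches because the `[error]` prefix is optional, and other javadoc errors such as `cannot find symbol` are ignored.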
5 changes: 4 additions & 1 deletion project/SparkBuild.scala
@@ -1699,7 +1699,10 @@ object Unidoc {
"-tag", "todo:X",
"-tag", "groupname:X",
"-tag", "inheritdoc",
- "--ignore-source-errors", "-notree"
+ "--ignore-source-errors", "-notree",
+ "-Xmaxerrs", "999999",
+ "-Xmaxwarns", "999999",
+ "-verbose"
)
},
