Closed
26 changes: 13 additions & 13 deletions core/src/main/scala/org/apache/spark/util/Utils.scala
@@ -32,6 +32,7 @@ import java.util.{HexFormat, Locale, Properties, Random, UUID}
 import java.util.concurrent._
 import java.util.concurrent.TimeUnit.NANOSECONDS
 import java.util.zip.{GZIPInputStream, ZipInputStream}
+import javax.management.ObjectName

 import scala.annotation.tailrec
 import scala.collection.Map
@@ -2144,19 +2145,18 @@ private[spark] object Utils

   /** Return a heap dump. Used to capture dumps for the web UI */
   def getHeapHistogram(): Array[String] = {
-    val pid = String.valueOf(ProcessHandle.current().pid())
-    val jmap = System.getProperty("java.home") + "/bin/jmap"
-    val builder = new ProcessBuilder(jmap, "-histo:live", pid)
Contributor


Root cause

The root cause is JDK-8354460 in JDK 25, which changes jmap to use streaming output. If that output is not read in time, the buffer fills up and the process appears to hang.

https://bugs.openjdk.org/browse/JDK-8354460

Test

The MBean approach looks better; the following is only my verification test.

With the -J-Djdk.attach.allowStreamingOutput=false parameter, Spark generates the heap histogram normally.

    val builder = new ProcessBuilder(jmap, "-J-Djdk.attach.allowStreamingOutput=false", "-histo:live", pid)
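
A self-contained version of this check might look like the sketch below. It only assembles the jmap command with or without the flag quoted above and collects the output with the same loop shape as the removed code. The names `JmapHistoCheck`, `jmapCommand`, and `run` are hypothetical and do not exist in Spark, and the sketch is not claimed to reproduce the hang.

```scala
import java.io.{BufferedReader, InputStreamReader}
import scala.collection.mutable.ArrayBuffer

// Hypothetical helper for illustration only; not part of Spark.
object JmapHistoCheck {
  /** Builds the jmap argument list; `streaming = false` adds the flag from the comment. */
  def jmapCommand(javaHome: String, pid: Long, streaming: Boolean): Seq[String] = {
    val jmap = s"$javaHome/bin/jmap"
    val flag = if (streaming) Nil else Seq("-J-Djdk.attach.allowStreamingOutput=false")
    (jmap +: flag) ++ Seq("-histo:live", pid.toString)
  }

  /** Starts the command and returns the non-empty lines written to its stdout. */
  def run(cmd: Seq[String]): Array[String] = {
    val p = new ProcessBuilder(cmd: _*).start()
    val rows = ArrayBuffer.empty[String]
    val r = new BufferedReader(new InputStreamReader(p.getInputStream))
    try {
      var line = r.readLine()
      while (line != null) {
        if (line.nonEmpty) rows += line
        line = r.readLine()
      }
    } finally r.close()
    p.waitFor()
    rows.toArray
  }
}
```

Assuming a JDK that ships `jmap`, it could be called as `JmapHistoCheck.run(JmapHistoCheck.jmapCommand(System.getProperty("java.home"), ProcessHandle.current().pid(), streaming = false))`.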

Related

JVM freezes when executing a jmap command using the Process and Runtime API

https://bugs.openjdk.org/browse/JDK-8372842

Member Author


@cxzl25, thanks for finding the root cause! So the code comment is not correct. Would you like to send a follow-up PR to fix it?

Member


Nice analysis!

-    val p = builder.start()
-    val rows = ArrayBuffer.empty[String]
-    Utils.tryWithResource(new BufferedReader(new InputStreamReader(p.getInputStream()))) { r =>
-      var line = ""
-      while (line != null) {
-        if (line.nonEmpty) rows += line
-        line = r.readLine()
-      }
-    }
-    rows.toArray
+    // SPARK-55809: Use DiagnosticCommandMBean in-process instead of spawning jmap as
+    // a subprocess.
+    // Spawning `jmap -histo <pid>` requires a self-attach via signal/socket handshake.
+    // Any concurrent GC (G1, ZGC, Shenandoah) can put the JVM into states where the
+    // Signal Dispatcher cannot service the attachment handshake, causing jmap to hang.
+    // DiagnosticCommandMBean runs gcClassHistogram in-process and coordinates with GC
+    // through internal JVM APIs, avoiding the unreliable inter-process attach.
+    val server = ManagementFactory.getPlatformMBeanServer
+    val on = new ObjectName("com.sun.management:type=DiagnosticCommand")
+    val result = server.invoke(on, "gcClassHistogram",
+      Array[AnyRef](Array[String]()), Array[String]("[Ljava.lang.String;"))
+    result.asInstanceOf[String].split("\n").filter(_.nonEmpty)
   }

   def getThreadDumpForThread(threadId: Long): Option[ThreadStackTrace] = {
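
To try the in-process approach outside Spark, a minimal standalone sketch follows. `histogramLines` makes the same `gcClassHistogram` MBean call as the diff. `HistogramSketch`, `Row`, and `parse` are hypothetical additions for illustration, and the parser assumes HotSpot's usual `num: #instances #bytes class name` row layout, which may differ on other JVMs.

```scala
import java.lang.management.ManagementFactory
import javax.management.ObjectName

// Hypothetical standalone sketch; not part of Spark.
object HistogramSketch {
  /** One parsed histogram row (illustrative type). */
  final case class Row(instances: Long, bytes: Long, className: String)

  // Matches data rows such as "   1:   7634   758160  [B (java.base@17)".
  // Assumption: HotSpot's usual row layout.
  private val RowPattern = """^\s*\d+:\s+(\d+)\s+(\d+)\s+(.+)$""".r

  /** Invokes the DiagnosticCommand MBean in-process and returns non-empty lines. */
  def histogramLines(): Array[String] = {
    val server = ManagementFactory.getPlatformMBeanServer
    val name = new ObjectName("com.sun.management:type=DiagnosticCommand")
    val result = server.invoke(name, "gcClassHistogram",
      Array[AnyRef](Array[String]()), Array[String]("[Ljava.lang.String;"))
    result.asInstanceOf[String].split("\n").filter(_.nonEmpty)
  }

  /** Keeps only data rows; header, separator, and total lines do not match. */
  def parse(lines: Seq[String]): Seq[Row] = lines.collect {
    case RowPattern(n, b, cls) => Row(n.toLong, b.toLong, cls.trim)
  }
}
```

On a HotSpot JVM, `HistogramSketch.parse(HistogramSketch.histogramLines().toSeq).take(5)` should give the first five rows of the histogram without starting a subprocess.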