Skip to content

Conversation

@Spxg
Copy link

@Spxg Spxg commented Dec 12, 2025

While enabling coverage instrumentation for a Rust program, I encountered llvm-cov exiting with an error: "malformed instrumentation profile data: function name is empty". I found that the Rust linker defaults to using --gc-sections. The issue is resolved by setting --no-gc-sections.

Every covfun record holds a hash of its symbol name, and llvm-cov will exit fatally if it can't resolve that hash back to an entry in the binary's __llvm_prf_names linker section.

However, WASM stores __llvm_covfun in CustomSection, while __llvm_prf_names is stored in the DATA section. The former not be GC, whereas the latter may be GC, causing llvm-cov execution to fail.

` __llvm_prf_names` -> `hash(name) to name` map, PrfMap eg.
`__llvm_covfun` -> iterate -> PrfMap.looksUp(hashValue) -> malformed if not found

This PR makes __llvm_covfun and __llvm_covmap will be discarded if the object is not live.

@github-actions
Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot
Copy link
Member

llvmbot commented Dec 12, 2025

@llvm/pr-subscribers-llvm-binary-utilities
@llvm/pr-subscribers-lld

@llvm/pr-subscribers-backend-webassembly

Author: None (Spxg)

Changes

While enabling coverage instrumentation for a Rust program, I encountered llvm-cov exiting with an error: "malformed instrumentation profile data: function name is empty". I found that the Rust linker defaults to using --gc-sections. The issue is resolved by setting --no-gc-sections.

Every covfun record holds a hash of its symbol name, and llvm-cov will exit fatally if it can't resolve that hash back to an entry in the binary's __llvm_prf_names linker section.

However, WASM stores __llvm_covfun in CustomSection, while __llvm_prf_names is stored in the DATA section. The former cannot be GC, whereas the latter may be GC, causing llvm-cov execution to fail.

This flag records whether it is an InstrProfSegment so that it can be enqueued during GC.


Full diff: https://github.com/llvm/llvm-project/pull/172023.diff

4 Files Affected:

  • (modified) lld/wasm/InputChunks.h (+3)
  • (modified) lld/wasm/InputFiles.cpp (+17)
  • (modified) lld/wasm/MarkLive.cpp (+12-1)
  • (modified) llvm/include/llvm/BinaryFormat/Wasm.h (+1)
diff --git a/lld/wasm/InputChunks.h b/lld/wasm/InputChunks.h
index 1fe78d76631f1..7cc49da1064fc 100644
--- a/lld/wasm/InputChunks.h
+++ b/lld/wasm/InputChunks.h
@@ -83,6 +83,9 @@ class InputChunk {
 
   bool isTLS() const { return flags & llvm::wasm::WASM_SEG_FLAG_TLS; }
   bool isRetained() const { return flags & llvm::wasm::WASM_SEG_FLAG_RETAIN; }
+  bool isInstrProfSegment() const {
+    return flags & llvm::wasm::WASM_SEG_FLAG_PRF;
+  }
 
   ObjFile *file;
   OutputSection *outputSec = nullptr;
diff --git a/lld/wasm/InputFiles.cpp b/lld/wasm/InputFiles.cpp
index 387b5eb10ba2f..fedaa4b737e07 100644
--- a/lld/wasm/InputFiles.cpp
+++ b/lld/wasm/InputFiles.cpp
@@ -577,6 +577,23 @@ void ObjFile::parse(bool ignoreComdats) {
     if (!seg->isTLS() &&
         (seg->name.starts_with(".tdata") || seg->name.starts_with(".tbss")))
       seg->flags |= WASM_SEG_FLAG_TLS;
+
+    // Every covfun record holds a hash of its symbol name, and `llvm-cov`
+    // will exit fatally if it can't resolve that hash back to an entry in
+    // the binary's `__llvm_prf_names` linker section.
+    //
+    // However, WASM stores `__llvm_covfun` in `CustomSection`, while
+    // `__llvm_prf_names` is stored in the `DATA` section. The former cannot be
+    // GC, whereas the latter may be GC, causing `llvm-cov` execution to fail.
+    //
+    // This flag records whether it is an `InstrProfSegment` so that it can be
+    // enqueued during GC.
+    if (seg->name == getInstrProfSectionName(IPSK_name, Triple::Wasm) ||
+        seg->name == getInstrProfSectionName(IPSK_cnts, Triple::Wasm) ||
+        seg->name == getInstrProfSectionName(IPSK_data, Triple::Wasm)) {
+      seg->flags |= WASM_SEG_FLAG_PRF;
+    }
+
     segments.emplace_back(seg);
   }
   setRelocs(segments, dataSection);
diff --git a/lld/wasm/MarkLive.cpp b/lld/wasm/MarkLive.cpp
index 2b2cf19f14b30..c51214b6920cf 100644
--- a/lld/wasm/MarkLive.cpp
+++ b/lld/wasm/MarkLive.cpp
@@ -43,6 +43,7 @@ class MarkLive {
   void enqueue(InputChunk *chunk);
   void enqueueInitFunctions(const ObjFile *sym);
   void enqueueRetainedSegments(const ObjFile *file);
+  void enqueueInstrProfSegments(const ObjFile *file);
   void mark();
   bool isCallCtorsLive();
 
@@ -104,6 +105,13 @@ void MarkLive::enqueueRetainedSegments(const ObjFile *file) {
       enqueue(chunk);
 }
 
+// Mark instr profile segments.
+void MarkLive::enqueueInstrProfSegments(const ObjFile *file) {
+  for (InputChunk *chunk : file->segments)
+    if (chunk->isInstrProfSegment())
+      enqueue(chunk);
+}
+
 void MarkLive::run() {
   // Add GC root symbols.
   if (!ctx.arg.entry.empty())
@@ -117,7 +125,9 @@ void MarkLive::run() {
   if (ctx.sym.callDtors)
     enqueue(ctx.sym.callDtors);
 
-  for (const ObjFile *obj : ctx.objectFiles)
+  for (const ObjFile *obj : ctx.objectFiles) {
+    // Enqueue instr profile segments
+    enqueueInstrProfSegments(obj);
     if (obj->isLive()) {
       // Enqueue constructors in objects explicitly live from the command-line.
       enqueueInitFunctions(obj);
@@ -125,6 +135,7 @@ void MarkLive::run() {
       // command-line.
       enqueueRetainedSegments(obj);
     }
+  }
 
   mark();
 
diff --git a/llvm/include/llvm/BinaryFormat/Wasm.h b/llvm/include/llvm/BinaryFormat/Wasm.h
index cf90a43d0d7e8..1f76330a29efb 100644
--- a/llvm/include/llvm/BinaryFormat/Wasm.h
+++ b/llvm/include/llvm/BinaryFormat/Wasm.h
@@ -229,6 +229,7 @@ enum WasmSegmentFlag : unsigned {
   WASM_SEG_FLAG_STRINGS = 0x1,
   WASM_SEG_FLAG_TLS = 0x2,
   WASM_SEG_FLAG_RETAIN = 0x4,
+  WASM_SEG_FLAG_PRF = 0x8,
 };
 
 // Kinds of tag attributes.

@llvmbot
Copy link
Member

llvmbot commented Dec 12, 2025

@llvm/pr-subscribers-lld-wasm

Author: None (Spxg)

Changes

While enabling coverage instrumentation for a Rust program, I encountered llvm-cov exiting with an error: "malformed instrumentation profile data: function name is empty". I found that the Rust linker defaults to using --gc-sections. The issue is resolved by setting --no-gc-sections.

Every covfun record holds a hash of its symbol name, and llvm-cov will exit fatally if it can't resolve that hash back to an entry in the binary's __llvm_prf_names linker section.

However, WASM stores __llvm_covfun in CustomSection, while __llvm_prf_names is stored in the DATA section. The former cannot be GC, whereas the latter may be GC, causing llvm-cov execution to fail.

This flag records whether it is an InstrProfSegment so that it can be enqueued during GC.


Full diff: https://github.com/llvm/llvm-project/pull/172023.diff

4 Files Affected:

  • (modified) lld/wasm/InputChunks.h (+3)
  • (modified) lld/wasm/InputFiles.cpp (+17)
  • (modified) lld/wasm/MarkLive.cpp (+12-1)
  • (modified) llvm/include/llvm/BinaryFormat/Wasm.h (+1)
diff --git a/lld/wasm/InputChunks.h b/lld/wasm/InputChunks.h
index 1fe78d76631f1..7cc49da1064fc 100644
--- a/lld/wasm/InputChunks.h
+++ b/lld/wasm/InputChunks.h
@@ -83,6 +83,9 @@ class InputChunk {
 
   bool isTLS() const { return flags & llvm::wasm::WASM_SEG_FLAG_TLS; }
   bool isRetained() const { return flags & llvm::wasm::WASM_SEG_FLAG_RETAIN; }
+  bool isInstrProfSegment() const {
+    return flags & llvm::wasm::WASM_SEG_FLAG_PRF;
+  }
 
   ObjFile *file;
   OutputSection *outputSec = nullptr;
diff --git a/lld/wasm/InputFiles.cpp b/lld/wasm/InputFiles.cpp
index 387b5eb10ba2f..fedaa4b737e07 100644
--- a/lld/wasm/InputFiles.cpp
+++ b/lld/wasm/InputFiles.cpp
@@ -577,6 +577,23 @@ void ObjFile::parse(bool ignoreComdats) {
     if (!seg->isTLS() &&
         (seg->name.starts_with(".tdata") || seg->name.starts_with(".tbss")))
       seg->flags |= WASM_SEG_FLAG_TLS;
+
+    // Every covfun record holds a hash of its symbol name, and `llvm-cov`
+    // will exit fatally if it can't resolve that hash back to an entry in
+    // the binary's `__llvm_prf_names` linker section.
+    //
+    // However, WASM stores `__llvm_covfun` in `CustomSection`, while
+    // `__llvm_prf_names` is stored in the `DATA` section. The former cannot be
+    // GC, whereas the latter may be GC, causing `llvm-cov` execution to fail.
+    //
+    // This flag records whether it is an `InstrProfSegment` so that it can be
+    // enqueued during GC.
+    if (seg->name == getInstrProfSectionName(IPSK_name, Triple::Wasm) ||
+        seg->name == getInstrProfSectionName(IPSK_cnts, Triple::Wasm) ||
+        seg->name == getInstrProfSectionName(IPSK_data, Triple::Wasm)) {
+      seg->flags |= WASM_SEG_FLAG_PRF;
+    }
+
     segments.emplace_back(seg);
   }
   setRelocs(segments, dataSection);
diff --git a/lld/wasm/MarkLive.cpp b/lld/wasm/MarkLive.cpp
index 2b2cf19f14b30..c51214b6920cf 100644
--- a/lld/wasm/MarkLive.cpp
+++ b/lld/wasm/MarkLive.cpp
@@ -43,6 +43,7 @@ class MarkLive {
   void enqueue(InputChunk *chunk);
   void enqueueInitFunctions(const ObjFile *sym);
   void enqueueRetainedSegments(const ObjFile *file);
+  void enqueueInstrProfSegments(const ObjFile *file);
   void mark();
   bool isCallCtorsLive();
 
@@ -104,6 +105,13 @@ void MarkLive::enqueueRetainedSegments(const ObjFile *file) {
       enqueue(chunk);
 }
 
+// Mark instr profile segments.
+void MarkLive::enqueueInstrProfSegments(const ObjFile *file) {
+  for (InputChunk *chunk : file->segments)
+    if (chunk->isInstrProfSegment())
+      enqueue(chunk);
+}
+
 void MarkLive::run() {
   // Add GC root symbols.
   if (!ctx.arg.entry.empty())
@@ -117,7 +125,9 @@ void MarkLive::run() {
   if (ctx.sym.callDtors)
     enqueue(ctx.sym.callDtors);
 
-  for (const ObjFile *obj : ctx.objectFiles)
+  for (const ObjFile *obj : ctx.objectFiles) {
+    // Enqueue instr profile segments
+    enqueueInstrProfSegments(obj);
     if (obj->isLive()) {
       // Enqueue constructors in objects explicitly live from the command-line.
       enqueueInitFunctions(obj);
@@ -125,6 +135,7 @@ void MarkLive::run() {
       // command-line.
       enqueueRetainedSegments(obj);
     }
+  }
 
   mark();
 
diff --git a/llvm/include/llvm/BinaryFormat/Wasm.h b/llvm/include/llvm/BinaryFormat/Wasm.h
index cf90a43d0d7e8..1f76330a29efb 100644
--- a/llvm/include/llvm/BinaryFormat/Wasm.h
+++ b/llvm/include/llvm/BinaryFormat/Wasm.h
@@ -229,6 +229,7 @@ enum WasmSegmentFlag : unsigned {
   WASM_SEG_FLAG_STRINGS = 0x1,
   WASM_SEG_FLAG_TLS = 0x2,
   WASM_SEG_FLAG_RETAIN = 0x4,
+  WASM_SEG_FLAG_PRF = 0x8,
 };
 
 // Kinds of tag attributes.

@Spxg Spxg changed the title [Coverage][WebAssembly] Keep InstrProf segments during gc [Coverage][WebAssembly] Discard InstrProf sections if object not live Dec 13, 2025
@Spxg Spxg changed the title [Coverage][WebAssembly] Discard InstrProf sections if object not live [Coverage][WebAssembly] Remove InstrProf sections if object not live Dec 13, 2025
@Spxg Spxg force-pushed the w/prf branch 2 times, most recently from 5ca22ad to dc84939 Compare December 13, 2025 03:05
@Spxg Spxg marked this pull request as draft December 13, 2025 03:15
@Spxg Spxg marked this pull request as ready for review December 13, 2025 03:34
@Spxg Spxg changed the title [Coverage][WebAssembly] Remove InstrProf sections if object not live [Coverage][WebAssembly] Discard InstrProf sections if object not live Dec 13, 2025
if (!file->isLive() &&
(name == getInstrProfSectionName(IPSK_covfun, Triple::Wasm) ||
name == getInstrProfSectionName(IPSK_covmap, Triple::Wasm)))
continue;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm.. I wonder if we can just add if (!file->isLive()) continue to the outer loop here. At least there are no failing tests when I do this locally.

It seem pretty odd to be including any custom sections form object files that are not live.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm.. I wonder if we can just add if (!file->isLive()) continue to the outer loop here.

LGTM, but it's not only calculateCustomSections that is used in this way, also populateProducers etc.. It might be necessary to determine whether it's a non-live object at an earlier stage and exit.

Anyway, I'm not really in favor of doing it in this PR (because the cov issue is clear, it should be given priority consideration), it might cause the problem to escalate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants