@bbotella
Contributor

This is a follow-up PR on top of #4255.

@bbotella force-pushed the CASSSIDECAR-254-takeover branch 2 times, most recently from 52edd69 to 52c1ed0, on November 26, 2025 17:02
@bbotella force-pushed the CASSSIDECAR-254-takeover branch from 98683de to 073febe on December 1, 2025 23:07
try
{
createLogDir();
return Files.readAllBytes(new File(logDir, resultFile).toPath());
Contributor

We need to validate the file size here against a configured limit. If the file is too large, we can cause an OOM or affect GC on the server side, or cause an OOM on the client side (nodetool). I suppose the default value should be around 25 MiB.
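
A minimal sketch of the size check suggested above, assuming the limit is passed in by the caller. The class name `BoundedResultReader`, the method `readBounded`, and the constant `DEFAULT_MAX_BYTES` are hypothetical and not part of this PR; only `Files.size` and `Files.readAllBytes` are real JDK calls.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class BoundedResultReader
{
    // Hypothetical default, taken from the "around 25 MiB" suggestion in the review comment.
    static final long DEFAULT_MAX_BYTES = 25L * 1024 * 1024;

    /**
     * Reads the whole file into memory only if its current size is within maxBytes.
     * Throws IOException otherwise, so the caller can report the problem instead of
     * allocating a huge array.
     */
    static byte[] readBounded(Path file, long maxBytes) throws IOException
    {
        long size = Files.size(file);
        if (size > maxBytes)
            throw new IOException(String.format("File %s is %d bytes, which exceeds the limit of %d bytes",
                                                file.getFileName(), size, maxBytes));
        return Files.readAllBytes(file);
    }
}
```

The check is not atomic: a file that is still being written could grow between `Files.size` and `Files.readAllBytes`, so this only guards against results that are already too large when fetched.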

Contributor

@netudima shouldn't we also limit the duration of profiling? I think something sensible should be used there as well, but at the same time nothing too short. Like 12h? That should be enough for everybody, I guess.

Contributor

@smiklosovic smiklosovic Dec 3, 2025

Also, we cannot limit the size of a file before it is created, can we? Can this be limited in async-profiler itself? Otherwise we can basically only react to it. So what if async-profiler creates a file bigger than your limit? How do we actually get that information, given that this is all asynchronous and we would have to wait for the duration to expire (if we do not stop it ourselves)?
What do you want to do with such a big file? Remove it?
We were also thinking about compressing it before sending.
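
The compression idea could be sketched as below, assuming plain gzip over the in-memory result. The class `ResultCompression` and its methods are hypothetical names, not code from this PR; `InputStream.readAllBytes` requires Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class ResultCompression
{
    /** Compresses the raw result bytes with gzip (what the server side would do before sending). */
    static byte[] gzip(byte[] raw) throws IOException
    {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Closing the gzip stream flushes the trailer, so it must happen before toByteArray().
        try (GZIPOutputStream gz = new GZIPOutputStream(out))
        {
            gz.write(raw);
        }
        return out.toByteArray();
    }

    /** Restores the original bytes (what the client side would do before writing the file). */
    static byte[] gunzip(byte[] compressed) throws IOException
    {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed)))
        {
            return gz.readAllBytes();
        }
    }
}
```

Compression reduces what goes over the wire, but both the raw and the compressed copies still sit on the heap here, so it does not replace a size limit.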

Contributor

> should not we also limit duration of profiling?

Yes, I think it makes sense to set an upper limit. Maybe 12h is even too big; I think 1h is more realistic (I do not remember ever using more than 15 minutes in practice). If we are speaking about hours, that is more of a continuous-profiling use case and should be implemented in a different way.
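
A minimal sketch of such an upper bound, assuming the requested duration has already been parsed into a `java.time.Duration`. `ProfilingDurationLimit`, `validate`, and `MAX_DURATION` are hypothetical names, and the 1h cap is only one of the values floated in this thread.

```java
import java.time.Duration;

final class ProfilingDurationLimit
{
    // Hypothetical cap; the thread discusses anything between 1h and 12h.
    static final Duration MAX_DURATION = Duration.ofHours(1);

    /** Returns the requested duration unchanged if it is positive and not above max. */
    static Duration validate(Duration requested, Duration max)
    {
        if (requested.isZero() || requested.isNegative())
            throw new IllegalArgumentException("Profiling duration must be positive: " + requested);
        if (requested.compareTo(max) > 0)
            throw new IllegalArgumentException("Profiling duration " + requested + " exceeds the maximum of " + max);
        return requested;
    }
}
```

Rejecting an oversized request, rather than silently clamping it, keeps the operator aware that the run will be shorter than asked for.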

Contributor

@netudima netudima Dec 3, 2025

> Also we can not limit size of a file before it is created

I have not seen such logic in async-profiler, only in FlightRecorder. I am not sure we can handle this use case easily enough to be worth implementing.

Contributor Author

I guess the time limit would only apply in safe mode, right?

That of course mitigates the problem, but does not quite make it bulletproof. I guess such a guardrail could depend on a server-side monitor that automatically stops profiling if it is about to cause GC problems? That worries me, as I think we may be getting into the area of overcomplicating things. Using nodetool against a live database is complex, and you can do worse things with it than profiling for an hour. Do those commands have guardrails?

{
try
{
createLogDir();
Contributor

Should this method (as well as fetch) be read-only and not create a directory, especially if the profiler is not enabled?

Contributor Author

I think there's no harm there?

Contributor

We still want to return the list of files even if it is not enabled. There is no harm in creating that directory, really.
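
For comparison, the read-only alternative raised in the question above could look roughly like this: list the files if the directory exists and return an empty list otherwise, without creating anything. `ReadOnlyListing` and `list` are hypothetical names, not the PR's code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class ReadOnlyListing
{
    /** Returns the sorted names of regular files in logDir, or an empty list if it does not exist. */
    static List<String> list(Path logDir) throws IOException
    {
        // No directory yet means no results yet; nothing is created as a side effect.
        if (!Files.isDirectory(logDir))
            return Collections.emptyList();

        // Files.list holds a directory handle, so the stream is closed via try-with-resources.
        try (Stream<Path> files = Files.list(logDir))
        {
            return files.filter(Files::isRegularFile)
                        .map(p -> p.getFileName().toString())
                        .sorted()
                        .collect(Collectors.toList());
        }
    }
}
```

Either approach returns the same listing once the directory exists; the difference is only whether a list or fetch call may touch the file system.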

}
});
}
else
Contributor

Why do we need different logic for text files compared to binary ones?

Contributor

Because some results of async-profiler are not text: they are not HTML but binary, like JFR and similar.

So when I fetch a result, I need to know what kind of file I am going to save it into. If I do new String(byte[]) and write that to a file, it will be corrupted, because these bytes are not strings (only in the HTML case are they).

Contributor

I could probably do Files.write(new File(localFile).toPath(), content, CREATE, TRUNCATE_EXISTING, WRITE); for both cases, but I am not sure how it will behave for strings. It will probably be fine.
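
A small sketch of why writing the raw bytes should work for both cases: `Files.write` with a `byte[]` does no charset decoding, whereas a round trip through `new String(...)` replaces byte sequences that are not valid in the charset. `FetchWriteDemo` and its methods are hypothetical names, and UTF-8 is an assumed charset.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

import static java.nio.file.StandardOpenOption.CREATE;
import static java.nio.file.StandardOpenOption.TRUNCATE_EXISTING;
import static java.nio.file.StandardOpenOption.WRITE;

final class FetchWriteDemo
{
    /** Writes the fetched bytes as-is, whether they are HTML text or binary data such as JFR. */
    static void save(Path localFile, byte[] content) throws IOException
    {
        Files.write(localFile, content, CREATE, TRUNCATE_EXISTING, WRITE);
    }

    /** Returns true if the bytes are unchanged after being decoded to a String and encoded again. */
    static boolean survivesStringRoundTrip(byte[] content)
    {
        byte[] roundTripped = new String(content, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(content, roundTripped);
    }
}
```

With this approach the client no longer needs to know whether a result is text or binary; it only needs the target file name.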

{
try
{
createLogDir();
Contributor

We still want to return the list of files even if it is not enabled. There is no harm in creating that directory, really.

@bbotella force-pushed the CASSSIDECAR-254-takeover branch from 62ba006 to f53c997 on December 10, 2025 15:07
Add enable/disable AsyncProfiler JVM flags

Refactor

Add output format option & initial tests

Refactor & add tests

Add tests for system properties & refactor

Checkstyle fixes & switching airlift with picocli (CASSANDRA-17445)

Add more tests

Fix Picocli 'Profile' command integration

Address feedback

Changes.txt

Remove not needed property

Add missing licenses

Apply feedback

Fix help tests

refactoring
fix commands
hardened validation and simplification of commands
fix log dir
fixed tests

more fixes

add purge command

more hardening

implement list and fetch

Remove unused import

more fixes

added documentation
fixed nodetool help output tests
added status command
introduced binary download of files in fetch command if necessary
hardened code
ability to specify duration in human format (e.g. 5m)
improved error parsing
ability to execute purge, list and fetch even with disabled profiler
async-profiler is disabled by default
added nodetool tests
add startup check checking kernel parameters

Updates documentation with different profiler info

Applies some feedback

Merge AsyncProfiler and AsyncProfilerService

Improvements

Improve refactor

Fixed folder for profiles

Fix help tests

Use config log folder

Applies feedback

more fixes
@bbotella force-pushed the CASSSIDECAR-254-takeover branch from f53c997 to 1b6e538 on December 10, 2025 15:09

4 participants