Runtime startup performance improvements#2515
Conversation
|
Would it be possible to have an equivalent branch on monodroid? That way I can both buddy test manually and launch a CI run on the changes. |
e61bb21 to
a301bb4
Compare
...marin.Forms-Performance-Integration/Droid/Xamarin.Forms.Performance.Integration.Droid.csproj
Show resolved
Hide resolved
...marin.Forms-Performance-Integration/Droid/Xamarin.Forms.Performance.Integration.Droid.csproj
Show resolved
Hide resolved
src/monodroid/jni/monodroid-glue.cc
Outdated
| } else { | ||
| // Inflate environment from user app assembly | ||
| load_assembly (domain, env, assembly); | ||
| // load_assembly (domain, env, assembly); |
There was a problem hiding this comment.
Yep, assemblies are loaded fully on-demand now
There was a problem hiding this comment.
@grendello if assemblies are loaded on-demand now, we should check and make sure a scenario still works in Xamarin.Forms:
- A custom renderer (or anything using their DI) is defined in a separate assembly than the app
- Xamarin.Forms hits
AppDomain.GetAssemblies: https://github.com/xamarin/Xamarin.Forms/blob/505a855cad39322c8b235281d71c35b70c9df129/Xamarin.Forms.Platform.Android/Forms.cs#L412-L415
I'm wondering if something would break here?
I have a test project you could try: https://github.com/jonathanpeppers/glidex/tree/master/glidex.forms.sample
There was a problem hiding this comment.
@jonathanpeppers I'll test it, thanks. It would be unfortunate of XF relied on AppDomain.GetAssemblies to work properly...
There was a problem hiding this comment.
@jonathanpeppers just tested - it works! :)
a301bb4 to
8e0de65
Compare
0c9c111 to
95134b7
Compare
| static java.lang.Class mono_android_IGCUserPeer; | ||
| static java.lang.Class mono_android_GCUserPeer; | ||
|
|
||
| static { |
There was a problem hiding this comment.
Is a static block actually needed, as opposed to just inline initialization?
There was a problem hiding this comment.
Is there a difference? I thought this looked nicely delimited from the other code, just that
There was a problem hiding this comment.
The difference is that when we add Yet Another Field, using "inline assignment" means the new field is a ~1 line change, not a 2 line change, and it would presumably be harder to accidentally forget the assignment line.
Otherwise, I don't think there's a huge difference.
src/monodroid/jni/android-system.h
Outdated
|
|
||
| MonoAotMode get_mono_aot_mode () const | ||
| { | ||
| if (knownEnvVars.MonoAOT == nullptr || *knownEnvVars.MonoAOT == '\0') |
There was a problem hiding this comment.
Instead of strdup()ing a string into knownEnvVars.MonoAOT, should the string instead be mapped into a MonoAotMode enum so that we don't need to re-switch on the first character of knownEnvVars.MonoAOT? It would save memory, at least.
src/monodroid/jni/jni-wrappers.h
Outdated
| } | ||
|
|
||
| protected: | ||
| jstring_wrapper () |
There was a problem hiding this comment.
Why protected and not private? This type isn't currently subclassed.
We should also declare-but-not-implement the copy constructor and assignment operator:
jstring_wrapper (const jstring_wrapper&);
jstring_wrapper& operator=(const jstring_wrapper&);
These members cannot be public, as we cannot sanely permit copying of these instances.
src/monodroid/jni/jni-wrappers.h
Outdated
| return *this; | ||
| } | ||
|
|
||
| jstring_wrapper& operator= (jstring new_js) noexcept |
There was a problem hiding this comment.
why jstring and not const jstring?
|
|
||
| jstring_wrapper& operator= (jstring new_js) noexcept | ||
| { | ||
| assign (new_js); |
There was a problem hiding this comment.
Has exception safety changed in the last 20 years? Shouldn't this instead be akin to:
jstring_wrapper t(env, new_js);
std::swap (*this, t);
with an appropriate specialization for std::swap()?
There was a problem hiding this comment.
We don't use exceptions and we don't want to use the STL. There's no danger here.
95134b7 to
a26f21f
Compare
|
Note: we're currently using a tiny bit of STL stuff ( |
|
I need to figure out what's going on with RTTI and exceptions before this can be merged |
| string[] lines = File.ReadAllLines (javaEnv); | ||
|
|
||
| Assert.IsTrue (lines.Any (x => x.Contains ("\"XAMARIN_BUILD_ID\",")), | ||
| "The environment should contain a XAMARIN_BUIL_ID"); |
There was a problem hiding this comment.
Typo in the error message.
| MainAssembly="$(MonoAndroidLinkerInputDir)$(TargetFileName)" | ||
| OutputDirectory="$(IntermediateOutputPath)android\src\mono" | ||
| OutputDirectory="$(IntermediateOutputPath)android\src\mono" | ||
| EnvironmentOutputDirectory="$(IntermediateOutputPath)android\src\mono\android\app" |
There was a problem hiding this comment.
Why not just stick the environment file into the mono package, so it can just reuse OutputDirectory?
There was a problem hiding this comment.
Because conceptually it is application environment, so it belongs in the app package and also, src/mono/android/app is created anyway so we're reusing an existing location anyway.
|
|
||
| #if defined (WINDOWS) | ||
| pthread_mutex_t AndroidSystem::readdir_mutex = PTHREAD_MUTEX_INITIALIZER; | ||
| std::mutex AndroidSystem::readdir_mutex; |
There was a problem hiding this comment.
Wait...this is in a #if defined (WINDOWS) block.
Why's this need to be changed? Does std::mutex even work on Windows, or in our MXE environment?
/me adds full-mono-integration-build. We'll need to rebuild things. :-/
There was a problem hiding this comment.
It's a part of the standard C++11: https://en.cppreference.com/w/cpp/thread/mutex
We build on mac/linux with gcc which is fully C++11 compliant and I imagine that the current msvc is compliant as well.
There was a problem hiding this comment.
It seems msvc is fine with <mutex>: https://docs.microsoft.com/en-us/cpp/standard-library/mutex?view=vs-2017
| finline-limit=300 | ||
| fvisibility=hidden | ||
| fstack-protector | ||
| fno-rtti |
There was a problem hiding this comment.
I'm surprised this wasn't already in here. ¯_(ツ)_/¯
| @@ -1,8 +1,8 @@ | |||
| #include <climits> | |||
There was a problem hiding this comment.
Why are we moving away from the C++ headers?
Are we now aiming for a C++ Runtime Library of none?
There is also the option to have no STL. There are no linking or licensing requirements in that case. No C++ standard headers are available.
I'm not even sure how we'd set that in CMakeLists.txt.
There was a problem hiding this comment.
<climits> is just a wrapper around limits.h which is already included by other headers in the sources, it's just not needed. Also, climits is not part of the STL but of the C++ standard runtime.
There was a problem hiding this comment.
As for the STL we use system (in https://github.com/grendello/xamarin-android/blob/runtime-profile/build-tools/scripts/cmake-common.props#L5) which is fine for us.
| MONO_AOT_MODE_FULL | ||
| MONO_AOT_MODE_FULL, | ||
|
|
||
| MONO_AOT_MODE_UNKNOWN = 0xBADBAD |
There was a problem hiding this comment.
Will this value be passed into mono, and if so, what will happen?
There was a problem hiding this comment.
It won't be passed to Mono (https://github.com/grendello/xamarin-android/blob/runtime-profile/src/monodroid/jni/monodroid-glue.cc#L1904-L1905)
| size_t len; | ||
| jstring_wrapper *wrappers; | ||
| jstring_wrapper static_wrappers[5]; | ||
| jstring_wrapper invalid_wrapper; |
There was a problem hiding this comment.
Could/should this be static? I'm not sure I see much sense in "wasting" 3sizeof(void) bytes of space.
There was a problem hiding this comment.
They should not be static. That would make them shared across all the instances - dangerous and invalid (especially for the array). This is a tradeoff between memory use (not much of it used) and speed (large savings when we don't have to allocate the wrappers array for each instance of the class). The only one out of these three that could be static is invalid_wrapper but that again creates uncertainty about what happens if the returned reference is retained and used in an unknown way by some unknown future code - a static version would be shared across all the code that would get it and any modifications to it would be visible everywhere the reference is retained. In short - it introduces uncertainty for the sake of saving two sizeof(void*) which is not worth it IMO.
There was a problem hiding this comment.
It's for saving three sizeof(void*), as jstring_wrapper has three pointer members: env, jstr, cstr.
And I only suggested using static on invalid_wrapper, not anything else.
Since invalid_wrapper will have a null env pointer, the instance is ~useless. (Also thread-safe, as all possibly mutating operations do nothing when env==nullptr.)
It's potentially "not worth it," but jstring_array_wrapper is huge: 3*sizeof(void*) + sizeof(size_t) + 6*3*sizeof(void*) == 88 bytes on a 32-bit ABI, double that on a 64-bit ABI.
On the stack.
Perhaps 88 bytes isn't as much as I fear, but IIRC we only have like 2-4KB stacks on Android...
There was a problem hiding this comment.
The default stack size in bionic is 1MB, so I wouldn't worry too much...
src/monodroid/jni/monodroid-glue.cc
Outdated
|
|
||
| if (strstr (apk_file, "/Mono.Android.DebugRuntime") == NULL && | ||
| strstr (apk_file, "/Mono.Android.Platform.ApiLevel_") == NULL) | ||
| if (strstr (apk_file.get_cstr (), "/Mono.Android.DebugRuntime") == nullptr && strstr (apk_file.get_cstr (), "/Mono.Android.Platform.ApiLevel_") == nullptr) |
There was a problem hiding this comment.
Why merge the two lines into one 156-character wide line?
| } | ||
|
|
||
| void* | ||
| operator new (size_t size, const std::nothrow_t&) noexcept |
There was a problem hiding this comment.
Welp, i suppose aborting the process on OOM isn't throwing an exception, so 👍?
There was a problem hiding this comment.
It's a required standard overload, we must implement it. Also, we compile without exceptions anyway and also we call the new implementation above this one in the source which aborts the process on OOM so there's no issue here.
There was a problem hiding this comment.
I suppose I should have been more explicit.
Given that we don't have exceptions, having operator new(size_t) abort on OOM makes sense.
However, the idea with operator new (size_t, nothrow_t) is that it can instead return nullptr on OOM instead of raising an exception. Explicit null checking is thus expected (and required!) when nothrow is used:
int* a = new (std::nothrow) int;
if (a == nullptr) {
// OOM
// ...
}So inverting these could plausibly make sense: have operator new (size_t, nothrow_t) call ::malloc(), returning nullptr when malloc() returns NULL, and have operator new (size_t) call operator new (size_t, nothrow_t) and abort if nullptr is returned.
That said, it's kinda moot on Linux, as malloc() won't ever return NULL (yay OOM killer?).
I'm not sure if Android can return NULL from malloc().
| } | ||
|
|
||
| void | ||
| operator delete (void* ptr) noexcept |
There was a problem hiding this comment.
It's truly a pity that the delete operator doesn't take a void*& parameter, so it could null out the parameter automatically...
|
Looks like the PR build hung: Note: at the time of this comment, it is 21:51, so we're approaching the 3h timeout value. I'm aborting the build and will restart it. (I wanted to restart the build anyway, as I added the full-mono-integration-build label...) |
497b29f to
ebb4fff
Compare
|
The most recent PR build failed in the MXE portion of wat? It looks like gcc/ld doesn't like the template meta-programming within There's also quite a few warnings about signed vs. unsigned integer comparisons: |
|
Well... it seems mingw we have on bots is too old...: |
26e658a to
86843a2
Compare
|
@jonpryor gcc works just fine with the metaprogramming bit, it's mxe gcc that's too old. |
The goal of this commit is to make Xamarin.Android apps start faster. The
changes are focused only around the very first stage of application startup -
between Android invoking our Java startup code (MonoPackageManager) and the end
of our `Runtime.init` (`Java_mono_android_Runtime_init` in the native runtime)
which is when the user application is fully initialized and ready to start is
launcher Activity.
In order to achieve the goal, the following changes were made:
* Java class lookup ("reflection").
We used to call the `FindClass` JNI function as part of the startup at a cost
of several milliseconds. We now put the class handles (accessed with the
`.class` Java accessor) in the `Runtime` class and initialize them from the
static constructor. We then read those fields from within `Runtime.init`,
which is passed a reference to the Java instance of the Runtime class.
Additonally, a handful of class field/method lookups were moved out of the
init code so that the code that doesn't use them doesn't have to pay the tax.
* Android API level is passed to `Runtime.init` from Java instead of using JNI
from the native code.
* Limit logging.
Previously whenever any of the `log_{info,debug}` functions were called we'd
spend time preparing all the parameters to pass to the function, sometimes
involving memory allocation, function calls, etc - only to discard all of
that work **inside** the `log_*` call because the logging category used in
that call was disabled. Now we check whether the category is enabled before
we set out to construct the parameters.
* Java/JNI type wrappers for string and array of strings.
This both a convenience/correctness as well as a performance change.
Introduced are two C++ wrapper classes for the `jstring` and `object array`
types (specialized for object == jstring) which take care of efficiently
caching the retrieved strings as well as of correctly deleting local
references to the obtained objects. Both classes, `jstring_wrapper` and
`jstring_array_wrapper` are optimized so that they compile into the
equivalent of the current, hand-written, code. They also take care to make
the minimum necessary number of calls in order to access the strings, both
standalone and from arrays, as well as to release the resources.
The string and array wrappers are passed around as references, thus using the
minimum amount of memory.
* Do not preload managed assemblies.
We used to preload all of the application assemblies in order to find and
invoke type initializers. This resulted in the list of assemblies being
processed twice at the great expense of time. We now don't call the type
initializers at all and the assemblies are loaded on demand.
* Do not store application environment variables in a file inside the apk.
The textual file used to be read from the apk(s) early in the process,
requiring iteration over all the application apk files, opening each of them,
browsing through the ZIP entries and, finally, reading the file line by line,
parsing into the name and value parts to create either a
property (`mono.aot`, `mono.llvm`) or any environment variables requested by
the application developer (or the XA runtime).
To speed the process up, this commit replaces the text file with a Java class
generated during application build which contains an array of `"name",
"value"` pairs. The class is passed to `Java_mono_android_Runtime_init` and
its elements are used to create the requested environment variables. A
handful of variables is special-cased in that they are not placed in the
environment but rather to set flags in the `AndroidSystem` class. The
variables are `mono.aot`, `mono.llvm` and `__XA_DSO_IN_APK`. This allowed to
remove calls to create (fake) system properties as well as `getenv` in the
init native function.
* Don't try load LLVM.so when it won't be there because we're not using llvm
* Convert package name to hash using optimized code without calling snprintf
* Desktop build is determined on compile time instead of dynamically
* xamarin_getifaddrs are initialized on demand, not at the startup.
Startup time improvements for the XF integration test app (average, Pixel 3 XL):
* Debug mode:
Old: 1s 440ms
New: 1s 100ms
* Release mode:
Old: 650ms
New: 270ms
86843a2 to
5310271
Compare
|
Proposed commit message: |
|
Good work! When do you expect to release it? |
Fixes: #1443 Context: https://jenkins.mono-project.com/view/Xamarin.Android/job/xamarin-android/plot/Tests%20times/ A frequent complaint with Xamarin.Android is that application startup times can be slower than desired. For example, see the [Xamarin.Forms app startup data][xf-startup], in particular the **init-Release** column (times in milliseconds), which is the time at which the `Runtime.init: end native-to-managed transition` message is printed when `timing` messages are enabled. As of Jenkins build 1359, the **init-Release** value is 518ms. (Note that the times in that table are not necessarily useful, nor are they future proof: they come from the emulator that the Jenkins build machine uses, and that emulator *has* changed and *will* change in the future. Additionally, because it's an x86-accelerated emulator, times are not meaningful to Android hardware devices. The values also fluctuate a lot. Better than nothing? ¯\_(ツ)_/¯) Optimize some startup operations to reduce the **init-Release** value: * JNI Handling Improvements: * Reduce Java class lookup. * Don't lookup `android.os.Build.VERSION.SDK_INT` via JNI. * `jstring` handling improvements. * Review logging messages. * Improve package name hash generation. * Improve environment variable processing. * Mono Handling Improvements * Stop preloading all assemblies. * Avoid using "system properties" to control Mono features. * Desktop version is now a compile-time build option, not runtime. * Initialize `xamarin_getifaddrs()` on-demand, not at startup. [xf-startup]: https://jenkins.mono-project.com/view/Xamarin.Android/job/xamarin-android/plot/Tests%20times/#plot-8694435842335223270.csv ~~ JNI Handling Improvements ~~ Previously, JNI was used to lookup the `android.os.Build.VERSION.SDK_INT` value. We now pass `Build.VERSION.SDK_INT` into `Runtime.init()`, avoiding the JNI calls. Previously, `JNIEnv::FindClass()` was used during process startup in order to lookup multiple types which are required for operation. These types are now fields on the `mono.android.Runtime` type, and `JNIEnv::GetStaticObjectField()` is used to obtain the `jclass`. Additionally, a handful of class field/method lookups were moved out of the init code so that the code that doesn't use them doesn't have to pay the tax. `jstring` and `jobjectArray`-of-`jstring` values are now handled by the new `jstring_wrapper` and `jstring_array_wrapper` types, which take care of efficiently caching the retrieved strings as well as of correctly deleting local references to the obtained objects. Both classes are optimized so that they compile into the equivalent of the current, hand-written, code. They also take care to make the minimum necessary number of calls in order to access the strings, both standalone and from arrays, as well as to release the resources. The string and array wrapper instances are passed around as references, thus using the minimum amount of memory. ~~ Log Messages ~~ Previously whenever any of the `log_{info,debug}()` functions were called we'd spend time preparing all the parameters to pass to the function, sometimes involving memory allocation, function calls, etc., only to discard all of that work **inside** the `log_*` call because the logging category used in that call was disabled. Now we check whether the category is enabled before we set out to construct the parameters. ~~ Environment Processing ~~ Since [2012-Aug-16][env-support], `@(AndroidEnvironment)` has worked by creating a file named `environment` within the `.apk` -- which is stored uncompressed within the `.apk` -- and the file is then processed, calling **setenv**(3) to store the recorded values. There's a fair bit of potentially hairy string manipulation here, from ***C***, which is not entirely ideal or performant. To speed the process up, this commit replaces the `environment` file with a Java class generated during application build which contains an array of `"name", "value"` pairs. The class is passed to `Java_mono_android_Runtime_init()` and its elements are used to create the requested environment variables. Some of these variables are special-cased; instead of using them for **setenv**(3), they control flags in the `AndroidSystem` class * `mono.aot`: The `$(AndroidAotMode)` value; which *kind* of AOT mono should support at runtime. * `mono.llvm`: The `$(EnableLLVM)` value; whether LLVM-generated AOT files should be used. * `__XA_DSO_IN_APK`: Support in-place reading of `.so` files; See commit 95ca102. [env-support]: https://github.com/xamarin/monodroid/commit/dbd73ec8343477c8fd10379c6577a38c072a78fa ~~ Mono Handling Improvements ~~ During process startup startup, *every* assembly within the `.apk` would be loaded so that a `Java.Interop.__TypeRegistrations` type could be probed, and if it existed, the `__TypeRegistrations.RegisterPackages()` method would be invoked. This was done in order to [better support "type mappings"][typemaps] between Java names and C# names (and vice versa). However, this support hasn't been required since the introduction of the [`typemap.jm` and `typemap.mj` files][typemap-files]; we just hadn't gotten around to removing the `__TypeRegistrations.RegisterPackages()` invocation. *Not* loading every assembly on startup allows assemblies to be loaded on-demand, and improves startup times. [typemaps]: #1443 (comment) [typemap-files]: https://github.com/xamarin/monodroid/commit/e69b76e2b86f92fa8440756c44ad3e2205563ac7 ~~ Startup Time Summary ~~ Startup times for `tests/Xamarin.Forms-Performance-Integration`, average of 50 iterations of (uninstall app, install app, launch app): * Debug configuration: Old: 1s 440ms New: 1s 100ms * Release configuration: Old: 650ms New: 270ms
PR #2515 highlighted an issue with a few of our repeat build tests. Tests were failing because the `_Sign` target was being skipped. Looking at the build logs, the `BuildApk` target was being called, as was the `_CreateBaseApk` target, so what is going on? It turns out that in the `<BuildApk/>` task we use the `MonoAndroidHelper.CopyIfZipChanged()` method to move the temp zip file over to the actual one. In this case, the zip was identical! Thus the "new" zip was not copied, and the `_Sign` target did not need to run. How can the zip be the same? Didn't it have an updated file? Not quite; we just touched the timestamp. In this case the zip ends up with the exact same CRC values for each file. We use the CRC to detect changes, so no changes were detected. It turns out our build system was behaving as expected, it was just our test was invalid. The fix here is to update the tests to make sure we do change a file. This means the CRC's will change and the targets will run.
The goal of this commit is to make Xamarin.Android apps start faster. The
changes are focused only around the very first stage of application startup -
between Android invoking our Java startup code (MonoPackageManager) and the end
of our
Runtime.init(Java_mono_android_Runtime_initin the native runtime)which is when the user application is fully initialized and ready to start is
launcher Activity.
In order to achieve the goal, the following changes were made:
Java class lookup ("reflection").
We used to call the
FindClassJNI function as part of the startup at a costof several milliseconds. We now put the class handles (accessed with the
.classJava accessor) in theRuntimeclass and initialize them from thestatic constructor. We then read those fields from within
Runtime.init,which is passed a reference to the Java instance of the Runtime class.
Additonally, a handful of class field/method lookups were moved out of the
init code so that the code that doesn't use them doesn't have to pay the tax.
Android API level is passed to
Runtime.initfrom Java instead of using JNIfrom the native code.
Limit logging.
Previously whenever any of the
log_{info,debug}functions were called we'dspend time preparing all the parameters to pass to the function, sometimes
involving memory allocation, function calls, etc - only to discard all of
that work inside the
log_*call because the logging category used inthat call was disabled. Now we check whether the category is enabled before
we set out to construct the parameters.
Java/JNI type wrappers for string and array of strings.
This both a convenience/correctness as well as a performance change.
Introduced are two C++ wrapper classes for the
jstringandobject arraytypes (specialized for object == jstring) which take care of efficiently
caching the retrieved strings as well as of correctly deleting local
references to the obtained objects. Both classes,
jstring_wrapperandjstring_array_wrapperare optimized so that they compile into theequivalent of the current, hand-written, code. They also take care to make
the minimum necessary number of calls in order to access the strings, both
standalone and from arrays, as well as to release the resources.
The string and array wrappers are passed around as references, thus using the
minimum amount of memory.
Do not preload managed assemblies.
We used to preload all of the application assemblies in order to find and
invoke type initializers. This resulted in the list of assemblies being
processed twice at the great expense of time. We now don't call the type
initializers at all and the assemblies are loaded on demand.
Do not store application environment variables in a file inside the apk.
The textual file used to be read from the apk(s) early in the process,
requiring iteration over all the application apk files, opening each of them,
browsing through the ZIP entries and, finally, reading the file line by line,
parsing into the name and value parts to create either a
property (
mono.aot,mono.llvm) or any environment variables requested bythe application developer (or the XA runtime).
To speed the process up, this commit replaces the text file with a Java class
generated during application build which contains an array of
"name", "value"pairs. The class is passed toJava_mono_android_Runtime_initandits elements are used to create the requested environment variables. A
handful of variables is special-cased in that they are not placed in the
environment but rather to set flags in the
AndroidSystemclass. Thevariables are
mono.aot,mono.llvmand__XA_DSO_IN_APK. This allowed toremove calls to create (fake) system properties as well as
getenvin theinit native function.
Don't try load LLVM.so when it won't be there because we're not using llvm
Convert package name to hash using optimized code without calling snprintf
Desktop build is determined on compile time instead of dynamically
xamarin_getifaddrs are initialized on demand, not at the startup.
Startup time improvements for the XF integration test app (average, Pixel 3 XL):
Debug mode:
Old: 1s 440ms
New: 1s 100ms
Release mode:
Old: 650ms
New: 270ms