Closed
Conversation
085b37f to
9951cbb
Compare
8222c3b to
ed4c97d
Compare
ed4c97d to
9b7079f
Compare
c3435ea to
07c1ac6
Compare
* jt/binsearch-with-fanout: packfile: refactor hash search with fanout table packfile: remove GIT_DEBUG_LOOKUP log statements
* jk/cached-commit-buffer: revision: drop --show-all option commit: drop uses of get_cached_commit_buffer() Git 2.16.2
c66768f to
437cc16
Compare
437cc16 to
2646653
Compare
2646653 to
ba1221a
Compare
derrickstolee
pushed a commit
that referenced
this pull request
Apr 2, 2018
The function ce_write_entry() uses a 'self-initialised' variable
construct, for the symbol 'saved_namelen', to suppress a gcc
'-Wmaybe-uninitialized' warning, given that the warning is a false
positive.
For the purposes of this discussion, the ce_write_entry() function has
three code blocks of interest, that look like so:
/* block #1 */
if (ce->ce_flags & CE_STRIP_NAME) {
saved_namelen = ce_namelen(ce);
ce->ce_namelen = 0;
}
/* block #2 */
/*
* several code blocks that contain, among others, calls
* to copy_cache_entry_to_ondisk(ondisk, ce);
*/
/* block #3 */
if (ce->ce_flags & CE_STRIP_NAME) {
ce->ce_namelen = saved_namelen;
ce->ce_flags &= ~CE_STRIP_NAME;
}
The warning implies that gcc thinks it is possible that the first
block is not entered, the calls to copy_cache_entry_to_ondisk()
could toggle the CE_STRIP_NAME flag on, thereby entering block #3
with saved_namelen unset. However, the copy_cache_entry_to_ondisk()
function does not write to ce->ce_flags (it only reads). gcc could
easily determine this, since that function is local to this file,
but it obviously doesn't.
In order to suppress this warning, we make it clear to the reader
(human and compiler), that block #3 will only be entered when the
first block has been entered, by introducing a new 'stripped_name'
boolean variable. We also take the opportunity to change the type
of 'saved_namelen' to 'unsigned int' to match ce->ce_namelen.
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The hashclose() method behaves very differently depending on the flags parameter. In particular, the file descriptor is not always closed. Perform a simple rename of "hashclose()" to "finalize_hashfile()" in preparation for functional changes. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we want to use a hashfile on the temporary file for a lockfile, then we need finalize_hashfile() to fully write the trailing hash but also keep the file descriptor open. Do this by adding a new CSUM_HASH_IN_STREAM flag along with a functional change that checks this flag before writing the checksum to the stream. This differs from previous behavior since it would be written if either CSUM_CLOSE or CSUM_FSYNC is provided. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add document specifying the binary format for commit graphs. This format allows for: * New versions. * New hash functions and hash lengths. * Optional extensions. Basic header information is followed by a binary table of contents into "chunks" that include: * An ordered list of commit object IDs. * A 256-entry fanout into that list of OIDs. * A list of metadata for the commits. * A list of "large edges" to enable octopus merges. The format automatically includes two parent positions for every commit. This favors speed over space, since using only one position per commit would cause an extra level of indirection for every merge commit. (Octopus merges suffer from this indirection, but they are very rare.) Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add Documentation/technical/commit-graph.txt with details of the planned commit graph feature, including future plans. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach git the 'commit-graph' builtin that will be used for writing and reading packed graph files. The current implementation is mostly empty, except for an '--object-dir' option. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach Git to write a commit graph file by checking all packed objects to see if they are commits, then store the file in the given object directory. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach git-commit-graph to write graph files. Create new test script to verify this command succeeds without failure. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
ba1221a to
15b8e28
Compare
Teach git-commit-graph to read commit graph files and summarize their contents. Use the read subcommand to verify the contents of a commit graph file in the tests. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit graph feature is controlled by the new core.commitGraph config setting. This defaults to 0, so the feature is opt-in. The intention of core.commitGraph is that a user can always stop checking for or parsing commit graph files if core.commitGraph=0. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach write_commit_graph() to walk all parents from the commits discovered in packfiles. This prevents gaps given by loose objects or previously-missed packfiles. Also automatically add commits from the existing graph file, if it exists. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach Git to inspect a commit graph file to supply the contents of a struct commit when calling parse_commit_gently(). This implementation satisfies all post-conditions on the struct commit, including loading parents, the root tree, and the commit date. If core.commitGraph is false, then do not check graph files. In test script t5318-commit-graph.sh, add output-matching conditions on read-only graph operations. By loading commits from the graph instead of parsing commit buffers, we save a lot of time on long commit walks. Here are some performance results for a copy of the Linux repository where 'master' has 678,653 reachable commits and is behind 'origin/master' by 59,929 commits. | Command | Before | After | Rel % | |----------------------------------|--------|--------|-------| | log --oneline --topo-order -1000 | 8.31s | 0.94s | -88% | | branch -vv | 1.02s | 0.14s | -86% | | rev-list --all | 5.89s | 1.07s | -81% | | rev-list --all --objects | 66.15s | 58.45s | -11% | Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach git-commit-graph to inspect the objects only in a certain list of pack-indexes within the given pack directory. This allows updating the commit graph iteratively. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
derrickstolee
pushed a commit
that referenced
this pull request
Dec 23, 2020
Test 5572.63 ("branch has no merge base with remote-tracking
counterpart") was introduced in 4d36f88 (submodule: do not pass null
OID to setup_revisions, 2018-05-24), as a regression test for the bug
this commit was fixing (preventing a 'fatal: bad object' error when the
current branch and the remote-tracking branch we are pulling have no
merge-base).
However, the commit message for 4d36f88 does not describe in which
real-life situation this bug was encountered. The brief discussion on the
mailing list [1] does not either.
The regression test is not really representative of a real-life
scenario: both the local repository and its upstream have only a single
commit, and the "no merge-base" scenario is simulated by recreating this
root commit in the local repository using 'git commit-tree' before
calling 'git pull --rebase --recurse-submodules'. The rebase succeeds
and results in the local branch being reset to the same root commit as
the upstream branch.
The fix in 4d36f88 modifies 'submodule.c::submodule_touches_in_range'
so that if 'excl_oid' is null, which is the case when the 'git merge-base
--fork-point' invocation in 'builtin/pull.c::get_rebase_fork_point'
errors (no fork-point), then instead of 'incl_oid --not excl_oid' being
passed to setup_revisions, only 'incl_oid' is passed, and
'submodule_touches_in_range' examines 'incl_oid' and all its ancestors
to verify that they do not touch the submodule.
In test 5572.63, the recreated lone root commit in the local repository is
thus the only commit being examined by 'submodule_touches_in_range', and
this commit *adds* the submodule. However, 'submodule_touches_in_range'
*succeeds* because 'combine-diff.c::diff_tree_combined' (see the
backtrace below) returns early since this commit is the root commit
and has no parents.
#0 diff_tree_combined at combine-diff.c:1494
#1 0x0000000100150cbe in diff_tree_combined_merge at combine-diff.c:1649
#2 0x00000001002c7147 in collect_changed_submodules at submodule.c:869
#3 0x00000001002c7d6f in submodule_touches_in_range at submodule.c:1268
#4 0x00000001000ad58b in cmd_pull at builtin/pull.c:1040
In light of all this, add a note in t5572 documenting this peculiar
test.
[1] https://lore.kernel.org/git/20180524204729.19896-1-jonathantanmy@google.com/t/#u
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
derrickstolee
added a commit
that referenced
this pull request
Jan 4, 2021
The previous change reduced time spent in strlen() while comparing consecutive paths in verify_cache(), but we can do better. The conditional checks the existence of a directory separator at the correct location, but only after doing a string comparison. Swap the order to be logically equivalent but perform fewer string comparisons. To test the effect on performance, I used a repository with over three million paths in the index. I then ran the following command on repeat: git -c index.threads=1 commit --amend --allow-empty --no-edit Here are the measurements over 10 runs after a 5-run warmup: Benchmark #1: v2.30.0 Time (mean ± σ): 854.5 ms ± 18.2 ms Range (min … max): 825.0 ms … 892.8 ms Benchmark #2: Previous change Time (mean ± σ): 833.2 ms ± 10.3 ms Range (min … max): 815.8 ms … 849.7 ms Benchmark #3: This change Time (mean ± σ): 815.5 ms ± 18.1 ms Range (min … max): 795.4 ms … 849.5 ms This change is 2% faster than the previous change and 5% faster than v2.30.0. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
derrickstolee
added a commit
that referenced
this pull request
Jan 4, 2021
The previous change reduced time spent in strlen() while comparing consecutive paths in verify_cache(), but we can do better. The conditional checks the existence of a directory separator at the correct location, but only after doing a string comparison. Swap the order to be logically equivalent but perform fewer string comparisons. To test the effect on performance, I used a repository with over three million paths in the index. I then ran the following command on repeat: git -c index.threads=1 commit --amend --allow-empty --no-edit Here are the measurements over 10 runs after a 5-run warmup: Benchmark #1: v2.30.0 Time (mean ± σ): 854.5 ms ± 18.2 ms Range (min … max): 825.0 ms … 892.8 ms Benchmark #2: Previous change Time (mean ± σ): 833.2 ms ± 10.3 ms Range (min … max): 815.8 ms … 849.7 ms Benchmark #3: This change Time (mean ± σ): 815.5 ms ± 18.1 ms Range (min … max): 795.4 ms … 849.5 ms This change is 2% faster than the previous change and 5% faster than v2.30.0. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
derrickstolee
added a commit
that referenced
this pull request
Jan 4, 2021
This is enough to make 'git -c core.fsmonitor="" status -uno' operate entirely on a sparse index without expanding to a full index. Warm cache case: Benchmark #1: full index (git -c core.fsmonitor= status -uno) Time (mean ± σ): 2.564 s ± 0.058 s [User: 1.489 s, System: 1.006 s] Range (min … max): 2.494 s … 2.685 s 10 runs Benchmark #2: sparse index (git -c core.fsmonitor= status -uno) Time (mean ± σ): 127.0 ms ± 4.1 ms [User: 118.2 ms, System: 183.5 ms] Range (min … max): 119.3 ms … 134.2 ms 22 runs Summary 'sparse index (git -c core.fsmonitor= status -uno)' ran 20.19 ± 0.80 times faster than 'full index (git -c core.fsmonitor= status -uno)' Cold cache case: Benchmark #1: full index (git -c core.fsmonitor= status -uno) Time (mean ± σ): 3.911 s ± 0.080 s [User: 1.550 s, System: 1.503 s] Range (min … max): 3.777 s … 4.020 s 10 runs Benchmark #2: sparse index (git -c core.fsmonitor= status -uno) Time (mean ± σ): 2.973 s ± 0.097 s [User: 291.3 ms, System: 1082.6 ms] Range (min … max): 2.811 s … 3.078 s 10 runs Summary 'sparse index (git -c core.fsmonitor= status -uno)' ran 1.32 ± 0.05 times faster than 'full index (git -c core.fsmonitor= status -uno)' Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
derrickstolee
added a commit
that referenced
this pull request
Jan 7, 2021
The previous change reduced time spent in strlen() while comparing consecutive paths in verify_cache(), but we can do better. The conditional checks the existence of a directory separator at the correct location, but only after doing a string comparison. Swap the order to be logically equivalent but perform fewer string comparisons. To test the effect on performance, I used a repository with over three million paths in the index. I then ran the following command on repeat: git -c index.threads=1 commit --amend --allow-empty --no-edit Here are the measurements over 10 runs after a 5-run warmup: Benchmark #1: v2.30.0 Time (mean ± σ): 854.5 ms ± 18.2 ms Range (min … max): 825.0 ms … 892.8 ms Benchmark #2: Previous change Time (mean ± σ): 833.2 ms ± 10.3 ms Range (min … max): 815.8 ms … 849.7 ms Benchmark #3: This change Time (mean ± σ): 815.5 ms ± 18.1 ms Range (min … max): 795.4 ms … 849.5 ms This change is 2% faster than the previous change and 5% faster than v2.30.0. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
derrickstolee
added a commit
that referenced
this pull request
Jan 19, 2021
This is enough to make 'git -c core.fsmonitor="" status -uno' operate entirely on a sparse index without expanding to a full index. Warm cache case: Benchmark #1: full index (git -c core.fsmonitor= status -uno) Time (mean ± σ): 2.564 s ± 0.058 s [User: 1.489 s, System: 1.006 s] Range (min … max): 2.494 s … 2.685 s 10 runs Benchmark #2: sparse index (git -c core.fsmonitor= status -uno) Time (mean ± σ): 127.0 ms ± 4.1 ms [User: 118.2 ms, System: 183.5 ms] Range (min … max): 119.3 ms … 134.2 ms 22 runs Summary 'sparse index (git -c core.fsmonitor= status -uno)' ran 20.19 ± 0.80 times faster than 'full index (git -c core.fsmonitor= status -uno)' Cold cache case: Benchmark #1: full index (git -c core.fsmonitor= status -uno) Time (mean ± σ): 3.911 s ± 0.080 s [User: 1.550 s, System: 1.503 s] Range (min … max): 3.777 s … 4.020 s 10 runs Benchmark #2: sparse index (git -c core.fsmonitor= status -uno) Time (mean ± σ): 2.973 s ± 0.097 s [User: 291.3 ms, System: 1082.6 ms] Range (min … max): 2.811 s … 3.078 s 10 runs Summary 'sparse index (git -c core.fsmonitor= status -uno)' ran 1.32 ± 0.05 times faster than 'full index (git -c core.fsmonitor= status -uno)' Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
derrickstolee
added a commit
that referenced
this pull request
Jan 20, 2021
The previous change reduced time spent in strlen() while comparing consecutive paths in verify_cache(), but we can do better. The conditional checks the existence of a directory separator at the correct location, but only after doing a string comparison. Swap the order to be logically equivalent but perform fewer string comparisons. To test the effect on performance, I used a repository with over three million paths in the index. I then ran the following command on repeat: git -c index.threads=1 commit --amend --allow-empty --no-edit Here are the measurements over 10 runs after a 5-run warmup: Benchmark #1: v2.30.0 Time (mean ± σ): 854.5 ms ± 18.2 ms Range (min … max): 825.0 ms … 892.8 ms Benchmark #2: Previous change Time (mean ± σ): 833.2 ms ± 10.3 ms Range (min … max): 815.8 ms … 849.7 ms Benchmark #3: This change Time (mean ± σ): 815.5 ms ± 18.1 ms Range (min … max): 795.4 ms … 849.5 ms This change is 2% faster than the previous change and 5% faster than v2.30.0. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
derrickstolee
added a commit
that referenced
this pull request
Jan 25, 2021
By testing 'git -c core.fsmonitor= status -uno', we can check for the simplest index operations that can be made sparse-aware. The necessary implementation details are already integrated with sparse-checkout, so modify command_requires_full_index to be zero for cmd_status(). By running the debugger for 'git status -uno' after that change, we find two instances of ensure_full_index() that were added for extra safety, but can be removed without issue. In refresh_index(), we loop through the index entries. The refresh_cache_ent() method copies the sparse directories into the refreshed index without issue. The loop within run_diff_files() skips things that are in stage 0 and have skip-worktree enabled, so seems safe to disable ensure_full_index() here. TODO: performance numbers at this point are confusing: Benchmark #1: full index (git -c core.fsmonitor= status -uno) Time (mean ± σ): 2.522 s ± 0.079 s [User: 1.511 s, System: 0.833 s] Range (min … max): 2.454 s … 2.715 s 10 runs Benchmark #2: sparse index, old (git -c core.fsmonitor= status -uno) Time (mean ± σ): 3.370 s ± 0.036 s [User: 3.166 s, System: 0.294 s] Range (min … max): 3.318 s … 3.428 s 10 runs Benchmark #3: sparse index (git -c core.fsmonitor= status -uno) Time (mean ± σ): 5.196 s ± 0.056 s [User: 5.189 s, System: 0.185 s] Range (min … max): 5.138 s … 5.269 s 10 runs This shows that the previous change (Benchmark #2) had some overhead with ensure_full_index() compared to a full index (Benchmark #1) but the current change got much slower for some reason! Note that ensure_full_index() is not called anywhere in this process! What is going on? Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
derrickstolee
added a commit
that referenced
this pull request
Jan 25, 2021
By testing 'git -c core.fsmonitor= status -uno', we can check for the simplest index operations that can be made sparse-aware. The necessary implementation details are already integrated with sparse-checkout, so modify command_requires_full_index to be zero for cmd_status(). By running the debugger for 'git status -uno' after that change, we find two instances of ensure_full_index() that were added for extra safety, but can be removed without issue. In refresh_index(), we loop through the index entries. The refresh_cache_ent() method copies the sparse directories into the refreshed index without issue. The loop within run_diff_files() skips things that are in stage 0 and have skip-worktree enabled, so seems safe to disable ensure_full_index() here. TODO: performance numbers at this point are confusing: Benchmark #1: full index (git -c core.fsmonitor= status -uno) Time (mean ± σ): 2.522 s ± 0.079 s [User: 1.511 s, System: 0.833 s] Range (min … max): 2.454 s … 2.715 s 10 runs Benchmark #2: sparse index, old (git -c core.fsmonitor= status -uno) Time (mean ± σ): 3.370 s ± 0.036 s [User: 3.166 s, System: 0.294 s] Range (min … max): 3.318 s … 3.428 s 10 runs Benchmark #3: sparse index (git -c core.fsmonitor= status -uno) Time (mean ± σ): 5.196 s ± 0.056 s [User: 5.189 s, System: 0.185 s] Range (min … max): 5.138 s … 5.269 s 10 runs This shows that the previous change (Benchmark #2) had some overhead with ensure_full_index() compared to a full index (Benchmark #1) but the current change got much slower for some reason! Note that ensure_full_index() is not called anywhere in this process! What is going on? Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
derrickstolee
added a commit
that referenced
this pull request
Jan 25, 2021
The cache-tree extension was previously disabled with sparse indexes.
However, the cache-tree is an important performance feature for commands
like 'git status' and 'git add'. Integrate it with sparse directory
entries.
When writing a sparse index, completely clear and recalculate the cache
tree. By starting from scratch, the only integration necessary is to
check if we hit a sparse directory entry and create a leaf of the
cache-tree that has an entry_count of one and no subtrees.
Once the cache-tree exists within a sparse index, we finally get
improved performance. I test the sparse index performance using a
private monorepo with over 2.1 million files at HEAD, but with a
sparse-checkout definition that has only 68,000 paths in the populated
cone. The sparse index has about 2,000 sparse directory entries. I
compare three scenarios:
1. Use the full index. The index size is ~186 MB.
2. Use the sparse index. The index size is ~5.5 MB.
3. Use a commit where HEAD matches the populated set. The full index
size is ~5.3MB.
The third benchmark is included as a theoretical optimium for a
repository of the same object database.
First, a clean 'git status' improves from 3.1s to 240ms.
Benchmark #1: full index (git status)
Time (mean ± σ): 3.167 s ± 0.036 s [User: 2.006 s, System: 1.078 s]
Range (min … max): 3.100 s … 3.208 s 10 runs
Benchmark #2: sparse index (git status)
Time (mean ± σ): 239.5 ms ± 8.1 ms [User: 189.4 ms, System: 226.8 ms]
Range (min … max): 226.0 ms … 251.9 ms 13 runs
Benchmark #3: small tree (git status)
Time (mean ± σ): 195.3 ms ± 4.5 ms [User: 116.5 ms, System: 84.4 ms]
Range (min … max): 188.8 ms … 202.8 ms 15 runs
The optimimum is still 45ms faster. This is due in part to the 2,000+
sparse directory entries, but there might be other optimizations to make
in the sparse-index case. In particular, I find that this performance
difference disappears when I disable FS Monitor, which is somewhat
disabled in the sparse-index case, but might still be adding overhead.
The performance numbers for 'git add .' are much closer to optimal:
Benchmark #1: full index (git add .)
Time (mean ± σ): 3.076 s ± 0.022 s [User: 2.065 s, System: 0.943 s]
Range (min … max): 3.044 s … 3.116 s 10 runs
Benchmark #2: sparse index (git add .)
Time (mean ± σ): 218.0 ms ± 6.6 ms [User: 195.7 ms, System: 206.6 ms]
Range (min … max): 209.8 ms … 228.2 ms 13 runs
Benchmark #3: small tree (git add .)
Time (mean ± σ): 217.6 ms ± 5.4 ms [User: 131.9 ms, System: 86.7 ms]
Range (min … max): 212.1 ms … 228.4 ms 14 runs
In this test, I also used "echo >>README.md" to append a line to the
README.md file, so the 'git add .' command is doing _something_ other
than a no-op. Without this edit (and FS Monitor enabled) the small
tree case again gains about 30ms on the sparse index case.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
derrickstolee
added a commit
that referenced
this pull request
Jan 25, 2021
The cache-tree extension was previously disabled with sparse indexes.
However, the cache-tree is an important performance feature for commands
like 'git status' and 'git add'. Integrate it with sparse directory
entries.
When writing a sparse index, completely clear and recalculate the cache
tree. By starting from scratch, the only integration necessary is to
check if we hit a sparse directory entry and create a leaf of the
cache-tree that has an entry_count of one and no subtrees.
Once the cache-tree exists within a sparse index, we finally get
improved performance. I test the sparse index performance using a
private monorepo with over 2.1 million files at HEAD, but with a
sparse-checkout definition that has only 68,000 paths in the populated
cone. The sparse index has about 2,000 sparse directory entries. I
compare three scenarios:
1. Use the full index. The index size is ~186 MB.
2. Use the sparse index. The index size is ~5.5 MB.
3. Use a commit where HEAD matches the populated set. The full index
size is ~5.3MB.
The third benchmark is included as a theoretical optimium for a
repository of the same object database.
First, a clean 'git status' improves from 3.1s to 240ms.
Benchmark #1: full index (git status)
Time (mean ± σ): 3.167 s ± 0.036 s [User: 2.006 s, System: 1.078 s]
Range (min … max): 3.100 s … 3.208 s 10 runs
Benchmark #2: sparse index (git status)
Time (mean ± σ): 239.5 ms ± 8.1 ms [User: 189.4 ms, System: 226.8 ms]
Range (min … max): 226.0 ms … 251.9 ms 13 runs
Benchmark #3: small tree (git status)
Time (mean ± σ): 195.3 ms ± 4.5 ms [User: 116.5 ms, System: 84.4 ms]
Range (min … max): 188.8 ms … 202.8 ms 15 runs
The optimimum is still 45ms faster. This is due in part to the 2,000+
sparse directory entries, but there might be other optimizations to make
in the sparse-index case. In particular, I find that this performance
difference disappears when I disable FS Monitor, which is somewhat
disabled in the sparse-index case, but might still be adding overhead.
The performance numbers for 'git add .' are much closer to optimal:
Benchmark #1: full index (git add .)
Time (mean ± σ): 3.076 s ± 0.022 s [User: 2.065 s, System: 0.943 s]
Range (min … max): 3.044 s … 3.116 s 10 runs
Benchmark #2: sparse index (git add .)
Time (mean ± σ): 218.0 ms ± 6.6 ms [User: 195.7 ms, System: 206.6 ms]
Range (min … max): 209.8 ms … 228.2 ms 13 runs
Benchmark #3: small tree (git add .)
Time (mean ± σ): 217.6 ms ± 5.4 ms [User: 131.9 ms, System: 86.7 ms]
Range (min … max): 212.1 ms … 228.4 ms 14 runs
In this test, I also used "echo >>README.md" to append a line to the
README.md file, so the 'git add .' command is doing _something_ other
than a no-op. Without this edit (and FS Monitor enabled) the small
tree case again gains about 30ms on the sparse index case.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
derrickstolee
added a commit
that referenced
this pull request
Jan 25, 2021
The cache-tree extension was previously disabled with sparse indexes.
However, the cache-tree is an important performance feature for commands
like 'git status' and 'git add'. Integrate it with sparse directory
entries.
When writing a sparse index, completely clear and recalculate the cache
tree. By starting from scratch, the only integration necessary is to
check if we hit a sparse directory entry and create a leaf of the
cache-tree that has an entry_count of one and no subtrees.
Once the cache-tree exists within a sparse index, we finally get
improved performance. I test the sparse index performance using a
private monorepo with over 2.1 million files at HEAD, but with a
sparse-checkout definition that has only 68,000 paths in the populated
cone. The sparse index has about 2,000 sparse directory entries. I
compare three scenarios:
1. Use the full index. The index size is ~186 MB.
2. Use the sparse index. The index size is ~5.5 MB.
3. Use a commit where HEAD matches the populated set. The full index
size is ~5.3MB.
The third benchmark is included as a theoretical optimium for a
repository of the same object database.
First, a clean 'git status' improves from 3.1s to 240ms.
Benchmark #1: full index (git status)
Time (mean ± σ): 3.167 s ± 0.036 s [User: 2.006 s, System: 1.078 s]
Range (min … max): 3.100 s … 3.208 s 10 runs
Benchmark #2: sparse index (git status)
Time (mean ± σ): 239.5 ms ± 8.1 ms [User: 189.4 ms, System: 226.8 ms]
Range (min … max): 226.0 ms … 251.9 ms 13 runs
Benchmark #3: small tree (git status)
Time (mean ± σ): 195.3 ms ± 4.5 ms [User: 116.5 ms, System: 84.4 ms]
Range (min … max): 188.8 ms … 202.8 ms 15 runs
The optimimum is still 45ms faster. This is due in part to the 2,000+
sparse directory entries, but there might be other optimizations to make
in the sparse-index case. In particular, I find that this performance
difference disappears when I disable FS Monitor, which is somewhat
disabled in the sparse-index case, but might still be adding overhead.
The performance numbers for 'git add .' are much closer to optimal:
Benchmark #1: full index (git add .)
Time (mean ± σ): 3.076 s ± 0.022 s [User: 2.065 s, System: 0.943 s]
Range (min … max): 3.044 s … 3.116 s 10 runs
Benchmark #2: sparse index (git add .)
Time (mean ± σ): 218.0 ms ± 6.6 ms [User: 195.7 ms, System: 206.6 ms]
Range (min … max): 209.8 ms … 228.2 ms 13 runs
Benchmark #3: small tree (git add .)
Time (mean ± σ): 217.6 ms ± 5.4 ms [User: 131.9 ms, System: 86.7 ms]
Range (min … max): 212.1 ms … 228.4 ms 14 runs
In this test, I also used "echo >>README.md" to append a line to the
README.md file, so the 'git add .' command is doing _something_ other
than a no-op. Without this edit (and FS Monitor enabled) the small
tree case again gains about 30ms on the sparse index case.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
derrickstolee
added a commit
that referenced
this pull request
Jan 25, 2021
The cache-tree extension was previously disabled with sparse indexes.
However, the cache-tree is an important performance feature for commands
like 'git status' and 'git add'. Integrate it with sparse directory
entries.
When writing a sparse index, completely clear and recalculate the cache
tree. By starting from scratch, the only integration necessary is to
check if we hit a sparse directory entry and create a leaf of the
cache-tree that has an entry_count of one and no subtrees.
Once the cache-tree exists within a sparse index, we finally get
improved performance. I test the sparse index performance using a
private monorepo with over 2.1 million files at HEAD, but with a
sparse-checkout definition that has only 68,000 paths in the populated
cone. The sparse index has about 2,000 sparse directory entries. I
compare three scenarios:
1. Use the full index. The index size is ~186 MB.
2. Use the sparse index. The index size is ~5.5 MB.
3. Use a commit where HEAD matches the populated set. The full index
size is ~5.3MB.
The third benchmark is included as a theoretical optimium for a
repository of the same object database.
First, a clean 'git status' improves from 3.1s to 240ms.
Benchmark #1: full index (git status)
Time (mean ± σ): 3.167 s ± 0.036 s [User: 2.006 s, System: 1.078 s]
Range (min … max): 3.100 s … 3.208 s 10 runs
Benchmark #2: sparse index (git status)
Time (mean ± σ): 239.5 ms ± 8.1 ms [User: 189.4 ms, System: 226.8 ms]
Range (min … max): 226.0 ms … 251.9 ms 13 runs
Benchmark #3: small tree (git status)
Time (mean ± σ): 195.3 ms ± 4.5 ms [User: 116.5 ms, System: 84.4 ms]
Range (min … max): 188.8 ms … 202.8 ms 15 runs
The optimimum is still 45ms faster. This is due in part to the 2,000+
sparse directory entries, but there might be other optimizations to make
in the sparse-index case. In particular, I find that this performance
difference disappears when I disable FS Monitor, which is somewhat
disabled in the sparse-index case, but might still be adding overhead.
The performance numbers for 'git add .' are much closer to optimal:
Benchmark #1: full index (git add .)
Time (mean ± σ): 3.076 s ± 0.022 s [User: 2.065 s, System: 0.943 s]
Range (min … max): 3.044 s … 3.116 s 10 runs
Benchmark #2: sparse index (git add .)
Time (mean ± σ): 218.0 ms ± 6.6 ms [User: 195.7 ms, System: 206.6 ms]
Range (min … max): 209.8 ms … 228.2 ms 13 runs
Benchmark #3: small tree (git add .)
Time (mean ± σ): 217.6 ms ± 5.4 ms [User: 131.9 ms, System: 86.7 ms]
Range (min … max): 212.1 ms … 228.4 ms 14 runs
In this test, I also used "echo >>README.md" to append a line to the
README.md file, so the 'git add .' command is doing _something_ other
than a no-op. Without this edit (and FS Monitor enabled) the small
tree case again gains about 30ms on the sparse index case.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
derrickstolee
added a commit
that referenced
this pull request
Jan 25, 2021
The cache-tree extension was previously disabled with sparse indexes.
However, the cache-tree is an important performance feature for commands
like 'git status' and 'git add'. Integrate it with sparse directory
entries.
When writing a sparse index, completely clear and recalculate the cache
tree. By starting from scratch, the only integration necessary is to
check if we hit a sparse directory entry and create a leaf of the
cache-tree that has an entry_count of one and no subtrees.
Once the cache-tree exists within a sparse index, we finally get
improved performance. I test the sparse index performance using a
private monorepo with over 2.1 million files at HEAD, but with a
sparse-checkout definition that has only 68,000 paths in the populated
cone. The sparse index has about 2,000 sparse directory entries. I
compare three scenarios:
1. Use the full index. The index size is ~186 MB.
2. Use the sparse index. The index size is ~5.5 MB.
3. Use a commit where HEAD matches the populated set. The full index
size is ~5.3MB.
The third benchmark is included as a theoretical optimium for a
repository of the same object database.
First, a clean 'git status' improves from 3.1s to 240ms.
Benchmark #1: full index (git status)
Time (mean ± σ): 3.167 s ± 0.036 s [User: 2.006 s, System: 1.078 s]
Range (min … max): 3.100 s … 3.208 s 10 runs
Benchmark #2: sparse index (git status)
Time (mean ± σ): 239.5 ms ± 8.1 ms [User: 189.4 ms, System: 226.8 ms]
Range (min … max): 226.0 ms … 251.9 ms 13 runs
Benchmark #3: small tree (git status)
Time (mean ± σ): 195.3 ms ± 4.5 ms [User: 116.5 ms, System: 84.4 ms]
Range (min … max): 188.8 ms … 202.8 ms 15 runs
The optimimum is still 45ms faster. This is due in part to the 2,000+
sparse directory entries, but there might be other optimizations to make
in the sparse-index case. In particular, I find that this performance
difference disappears when I disable FS Monitor, which is somewhat
disabled in the sparse-index case, but might still be adding overhead.
The performance numbers for 'git add .' are much closer to optimal:
Benchmark #1: full index (git add .)
Time (mean ± σ): 3.076 s ± 0.022 s [User: 2.065 s, System: 0.943 s]
Range (min … max): 3.044 s … 3.116 s 10 runs
Benchmark #2: sparse index (git add .)
Time (mean ± σ): 218.0 ms ± 6.6 ms [User: 195.7 ms, System: 206.6 ms]
Range (min … max): 209.8 ms … 228.2 ms 13 runs
Benchmark #3: small tree (git add .)
Time (mean ± σ): 217.6 ms ± 5.4 ms [User: 131.9 ms, System: 86.7 ms]
Range (min … max): 212.1 ms … 228.4 ms 14 runs
In this test, I also used "echo >>README.md" to append a line to the
README.md file, so the 'git add .' command is doing _something_ other
than a no-op. Without this edit (and FS Monitor enabled) the small
tree case again gains about 30ms on the sparse index case.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
derrickstolee
added a commit
that referenced
this pull request
Jan 25, 2021
The cache-tree extension was previously disabled with sparse indexes.
However, the cache-tree is an important performance feature for commands
like 'git status' and 'git add'. Integrate it with sparse directory
entries.
When writing a sparse index, completely clear and recalculate the cache
tree. By starting from scratch, the only integration necessary is to
check if we hit a sparse directory entry and create a leaf of the
cache-tree that has an entry_count of one and no subtrees.
Once the cache-tree exists within a sparse index, we finally get
improved performance. I test the sparse index performance using a
private monorepo with over 2.1 million files at HEAD, but with a
sparse-checkout definition that has only 68,000 paths in the populated
cone. The sparse index has about 2,000 sparse directory entries. I
compare three scenarios:
1. Use the full index. The index size is ~186 MB.
2. Use the sparse index. The index size is ~5.5 MB.
3. Use a commit where HEAD matches the populated set. The full index
size is ~5.3MB.
The third benchmark is included as a theoretical optimium for a
repository of the same object database.
First, a clean 'git status' improves from 3.1s to 240ms.
Benchmark #1: full index (git status)
Time (mean ± σ): 3.167 s ± 0.036 s [User: 2.006 s, System: 1.078 s]
Range (min … max): 3.100 s … 3.208 s 10 runs
Benchmark #2: sparse index (git status)
Time (mean ± σ): 239.5 ms ± 8.1 ms [User: 189.4 ms, System: 226.8 ms]
Range (min … max): 226.0 ms … 251.9 ms 13 runs
Benchmark #3: small tree (git status)
Time (mean ± σ): 195.3 ms ± 4.5 ms [User: 116.5 ms, System: 84.4 ms]
Range (min … max): 188.8 ms … 202.8 ms 15 runs
The optimimum is still 45ms faster. This is due in part to the 2,000+
sparse directory entries, but there might be other optimizations to make
in the sparse-index case. In particular, I find that this performance
difference disappears when I disable FS Monitor, which is somewhat
disabled in the sparse-index case, but might still be adding overhead.
The performance numbers for 'git add .' are much closer to optimal:
Benchmark #1: full index (git add .)
Time (mean ± σ): 3.076 s ± 0.022 s [User: 2.065 s, System: 0.943 s]
Range (min … max): 3.044 s … 3.116 s 10 runs
Benchmark #2: sparse index (git add .)
Time (mean ± σ): 218.0 ms ± 6.6 ms [User: 195.7 ms, System: 206.6 ms]
Range (min … max): 209.8 ms … 228.2 ms 13 runs
Benchmark #3: small tree (git add .)
Time (mean ± σ): 217.6 ms ± 5.4 ms [User: 131.9 ms, System: 86.7 ms]
Range (min … max): 212.1 ms … 228.4 ms 14 runs
In this test, I also used "echo >>README.md" to append a line to the
README.md file, so the 'git add .' command is doing _something_ other
than a no-op. Without this edit (and FS Monitor enabled) the small
tree case again gains about 30ms on the sparse index case.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
derrickstolee
pushed a commit
that referenced
this pull request
Feb 17, 2021
This is modelled on the version of handle_directory_level_conflicts()
from merge-recursive.c, but is massively simplified due to the following
factors:
* strmap API provides simplifications over using direct hashmap
* we have a dirs_removed field in struct rename_info that we have an
easy way to populate from collect_merge_info(); this was already
used in compute_rename_counts() and thus we do not need to check
for condition #2.
* The removal of condition #2 by handling it earlier in the code also
obviates the need to check for condition #3 -- if both sides renamed
a directory, meaning that the directory no longer exists on either
side, then neither side could have added any new files to that
directory, and thus there are no files whose locations we need to
move due to such a directory rename.
In fact, the same logic that makes condition #3 irrelevant means
condition #1 is also irrelevant so we could drop this function.
However, it is cheap to check if both sides rename the same directory,
and doing so can save future computation. So, simply remove any
directories that both sides renamed from the list of directory renames.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
derrickstolee
pushed a commit
that referenced
this pull request
Feb 17, 2021
Add some timing instrumentation for both merge-ort and diffcore-rename; I used these to measure and optimize performance in both, and several future patch series will build on these to reduce the timings of some select testcases. === Setup === The primary testcase I used involved rebasing a random topic in the linux kernel (consisting of 35 patches) against an older version. I added two variants, one where I rename a toplevel directory, and another where I only rebase one patch instead of the whole topic. The setup is as follows: $ git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git $ git branch hwmon-updates fd8bdb23b91876ac1e624337bb88dc1dcc21d67e $ git branch hwmon-just-one fd8bdb23b91876ac1e624337bb88dc1dcc21d67e~34 $ git branch base 4703d9119972bf586d2cca76ec6438f819ffa30e $ git switch -c 5.4-renames v5.4 $ git mv drivers pilots # Introduce over 26,000 renames $ git commit -m "Rename drivers/ to pilots/" $ git config merge.renameLimit 30000 $ git config merge.directoryRenames true === Testcases === Now with REBASE standing for either "git rebase [--merge]" (using merge-recursive) or "test-tool fast-rebase" (using merge-ort), the testcases are: Testcase #1: no-renames $ git checkout v5.4^0 $ REBASE --onto HEAD base hwmon-updates Note: technically the name is misleading; there are some renames, but very few. Rename detection only takes about half the overall time. Testcase #2: mega-renames $ git checkout 5.4-renames^0 $ REBASE --onto HEAD base hwmon-updates Testcase #3: just-one-mega $ git checkout 5.4-renames^0 $ REBASE --onto HEAD base hwmon-just-one === Timing results === Overall timings, using hyperfine (1 warmup run, 3 runs for mega-renames, 10 runs for the other two cases): merge-recursive merge-ort no-renames: 18.912 s ± 0.174 s 14.263 s ± 0.053 s mega-renames: 5964.031 s ± 10.459 s 5504.231 s ± 5.150 s just-one-mega: 149.583 s ± 0.751 s 158.534 s ± 0.498 s A single re-run of each with some breakdowns: --- no-renames --- merge-recursive merge-ort overall runtime: 19.302 s 14.257 s inexact rename detection: 7.603 s 7.906 s everything else: 11.699 s 6.351 s --- mega-renames --- merge-recursive merge-ort overall runtime: 5950.195 s 5499.672 s inexact rename detection: 5746.309 s 5487.120 s everything else: 203.886 s 17.552 s --- just-one-mega --- merge-recursive merge-ort overall runtime: 151.001 s 158.582 s inexact rename detection: 143.448 s 157.835 s everything else: 7.553 s 0.747 s === Timing observations === 0) Maximum speedup The "everything else" row represents the maximum speedup we could achieve if we were to somehow infinitely parallelize inexact rename detection, but leave everything else alone. The fact that this is so much smaller than the real runtime (even in the case with virtually no renames) makes it clear just how overwhelmingly large the time spent on rename detection can be. 1) no-renames 1a) merge-ort is faster than merge-recursive, which is nice. However, this still should not be considered good enough. Although the "merge" backend to rebase (merge-recursive) is sometimes faster than the "apply" backend, this is one of those cases where it is not. In fact, even merge-ort is slower. The "apply" backend can complete this testcase in 6.940 s ± 0.485 s which is about 2x faster than merge-ort and 3x faster than merge-recursive. One goal of the merge-ort performance work will be to make it faster than git-am on this (and similar) testcases. 2) mega-renames 2a) Obviously rename detection is a huge cost; it's where most the time is spent. We need to cut that down. If we could somehow infinitely parallelize it and drive its time to 0, the merge-recursive time would drop to about 204s, and the merge-ort time would drop to about 17s. I think this particular stat shows I've subtly baked a couple performance improvements into merge-ort and into fast-rebase already. 3) just-one-mega 3a) not much to say here, it just gives some flavor for how rebasing only one patch compares to rebasing 35. === Goals === This patch is obviously just the beginning. Here are some of my goals that this measurement will help us achieve: * Drive the cost of rename detection down considerably for merges * After the above has been achieved, see if there are other slowness factors (which would have previously been overshadowed by rename detection costs) which we can then focus on and also optimize. * Ensure our rebase testcase that requires little rename detection is noticeably faster with merge-ort than with apply-based rebase. Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Taylor Blau <ttaylorr@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
derrickstolee
pushed a commit
that referenced
this pull request
Feb 17, 2021
Specify the format of the on-disk reverse index 'pack-*.rev' file, as well as prepare the code for the existence of such files. The reverse index maps from pack relative positions (i.e., an index into the array of object which is sorted by their offsets within the packfile) to their position within the 'pack-*.idx' file. Today, this is done by building up a list of (off_t, uint32_t) tuples for each object (the off_t corresponding to that object's offset, and the uint32_t corresponding to its position in the index). To convert between pack and index position quickly, this array of tuples is radix sorted based on its offset. This has two major drawbacks: First, the in-memory cost scales linearly with the number of objects in a pack. Each 'struct revindex_entry' is sizeof(off_t) + sizeof(uint32_t) + padding bytes for a total of 16. To observe this, force Git to load the reverse index by, for e.g., running 'git cat-file --batch-check="%(objectsize:disk)"'. When asking for a single object in a fresh clone of the kernel, Git needs to allocate 120+ MB of memory in order to hold the reverse index in memory. Second, the cost to sort also scales with the size of the pack. Luckily, this is a linear function since 'load_pack_revindex()' uses a radix sort, but this cost still must be paid once per pack per process. As an example, it takes ~60x longer to print the _size_ of an object as it does to print that entire object's _contents_: Benchmark #1: git.compile cat-file --batch <obj Time (mean ± σ): 3.4 ms ± 0.1 ms [User: 3.3 ms, System: 2.1 ms] Range (min … max): 3.2 ms … 3.7 ms 726 runs Benchmark #2: git.compile cat-file --batch-check="%(objectsize:disk)" <obj Time (mean ± σ): 210.3 ms ± 8.9 ms [User: 188.2 ms, System: 23.2 ms] Range (min … max): 193.7 ms … 224.4 ms 13 runs Instead, avoid computing and sorting the revindex once per process by writing it to a file when the pack itself is generated. The format is relatively straightforward. It contains an array of uint32_t's, the length of which is equal to the number of objects in the pack. The ith entry in this table contains the index position of the ith object in the pack, where "ith object in the pack" is determined by pack offset. One thing that the on-disk format does _not_ contain is the full (up to) eight-byte offset corresponding to each object. This is something that the in-memory revindex contains (it stores an off_t in 'struct revindex_entry' along with the same uint32_t that the on-disk format has). Omit it in the on-disk format, since knowing the index position for some object is sufficient to get a constant-time lookup in the pack-*.idx file to ask for an object's offset within the pack. This trades off between the on-disk size of the 'pack-*.rev' file for runtime to chase down the offset for some object. Even though the lookup is constant time, the constant is heavier, since it can potentially involve two pointer walks in v2 indexes (one to access the 4-byte offset table, and potentially a second to access the double wide offset table). Consider trying to map an object's pack offset to a relative position within that pack. In a cold-cache scenario, more page faults occur while switching between binary searching through the reverse index and searching through the *.idx file for an object's offset. Sure enough, with a cold cache (writing '3' into '/proc/sys/vm/drop_caches' after 'sync'ing), printing out the entire object's contents is still marginally faster than printing its size: Benchmark #1: git.compile cat-file --batch-check="%(objectsize:disk)" <obj >/dev/null Time (mean ± σ): 22.6 ms ± 0.5 ms [User: 2.4 ms, System: 7.9 ms] Range (min … max): 21.4 ms … 23.5 ms 41 runs Benchmark #2: git.compile cat-file --batch <obj >/dev/null Time (mean ± σ): 17.2 ms ± 0.7 ms [User: 2.8 ms, System: 5.5 ms] Range (min … max): 15.6 ms … 18.2 ms 45 runs (Numbers taken in the kernel after cheating and using the next patch to generate a reverse index). There are a couple of approaches to improve cold cache performance not pursued here: - We could include the object offsets in the reverse index format. Predictably, this does result in fewer page faults, but it triples the size of the file, while simultaneously duplicating a ton of data already available in the .idx file. (This was the original way I implemented the format, and it did show `--batch-check='%(objectsize:disk)'` winning out against `--batch`.) On the other hand, this increase in size also results in a large block-cache footprint, which could potentially hurt other workloads. - We could store the mapping from pack to index position in more cache-friendly way, like constructing a binary search tree from the table and writing the values in breadth-first order. This would result in much better locality, but the price you pay is trading O(1) lookup in 'pack_pos_to_index()' for an O(log n) one (since you can no longer directly index the table). So, neither of these approaches are taken here. (Thankfully, the format is versioned, so we are free to pursue these in the future.) But, cold cache performance likely isn't interesting outside of one-off cases like asking for the size of an object directly. In real-world usage, Git is often performing many operations in the revindex (i.e., asking about many objects rather than a single one). The trade-off is worth it, since we will avoid the vast majority of the cost of generating the revindex that the extra pointer chase will look like noise in the following patch's benchmarks. This patch describes the format and prepares callers (like in pack-revindex.c) to be able to read *.rev files once they exist. An implementation of the writer will appear in the next patch, and callers will gradually begin to start using the writer in the patches that follow after that. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
derrickstolee
added a commit
that referenced
this pull request
Feb 17, 2021
The cache-tree extension was previously disabled with sparse indexes.
However, the cache-tree is an important performance feature for commands
like 'git status' and 'git add'. Integrate it with sparse directory
entries.
When writing a sparse index, completely clear and recalculate the cache
tree. By starting from scratch, the only integration necessary is to
check if we hit a sparse directory entry and create a leaf of the
cache-tree that has an entry_count of one and no subtrees.
Once the cache-tree exists within a sparse index, we finally get
improved performance. I test the sparse index performance using a
private monorepo with over 2.1 million files at HEAD, but with a
sparse-checkout definition that has only 68,000 paths in the populated
cone. The sparse index has about 2,000 sparse directory entries. I
compare three scenarios:
1. Use the full index. The index size is ~186 MB.
2. Use the sparse index. The index size is ~5.5 MB.
3. Use a commit where HEAD matches the populated set. The full index
size is ~5.3MB.
The third benchmark is included as a theoretical optimium for a
repository of the same object database.
First, a clean 'git status' improves from 3.1s to 240ms.
Benchmark #1: full index (git status)
Time (mean ± σ): 3.167 s ± 0.036 s [User: 2.006 s, System: 1.078 s]
Range (min … max): 3.100 s … 3.208 s 10 runs
Benchmark #2: sparse index (git status)
Time (mean ± σ): 239.5 ms ± 8.1 ms [User: 189.4 ms, System: 226.8 ms]
Range (min … max): 226.0 ms … 251.9 ms 13 runs
Benchmark #3: small tree (git status)
Time (mean ± σ): 195.3 ms ± 4.5 ms [User: 116.5 ms, System: 84.4 ms]
Range (min … max): 188.8 ms … 202.8 ms 15 runs
The optimimum is still 45ms faster. This is due in part to the 2,000+
sparse directory entries, but there might be other optimizations to make
in the sparse-index case. In particular, I find that this performance
difference disappears when I disable FS Monitor, which is somewhat
disabled in the sparse-index case, but might still be adding overhead.
The performance numbers for 'git add .' are much closer to optimal:
Benchmark #1: full index (git add .)
Time (mean ± σ): 3.076 s ± 0.022 s [User: 2.065 s, System: 0.943 s]
Range (min … max): 3.044 s … 3.116 s 10 runs
Benchmark #2: sparse index (git add .)
Time (mean ± σ): 218.0 ms ± 6.6 ms [User: 195.7 ms, System: 206.6 ms]
Range (min … max): 209.8 ms … 228.2 ms 13 runs
Benchmark #3: small tree (git add .)
Time (mean ± σ): 217.6 ms ± 5.4 ms [User: 131.9 ms, System: 86.7 ms]
Range (min … max): 212.1 ms … 228.4 ms 14 runs
In this test, I also used "echo >>README.md" to append a line to the
README.md file, so the 'git add .' command is doing _something_ other
than a no-op. Without this edit (and FS Monitor enabled) the small
tree case again gains about 30ms on the sparse index case.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
derrickstolee
pushed a commit
that referenced
this pull request
Mar 24, 2021
…sponse query_result can be be an empty strbuf (STRBUF_INIT) - in that case trying to read 3 bytes triggers a buffer overflow read (as query_result.buf = '\0'). Therefore we need to check query_result's length before trying to read 3 bytes. This overflow was introduced in: 940b94f (fsmonitor: log invocation of FSMonitor hook to trace2, 2021-02-03) It was found when running the test-suite against ASAN, and can be most easily reproduced with the following command: make GIT_TEST_OPTS="-v" DEFAULT_TEST_TARGET="t7519-status-fsmonitor.sh" \ SANITIZE=address DEVELOPER=1 test ==2235==ERROR: AddressSanitizer: global-buffer-overflow on address 0x0000019e6e5e at pc 0x00000043745c bp 0x7fffd382c520 sp 0x7fffd382bcc8 READ of size 3 at 0x0000019e6e5e thread T0 #0 0x43745b in MemcmpInterceptorCommon(void*, int (*)(void const*, void const*, unsigned long), void const*, void const*, unsigned long) /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:842:7 #1 0x43786d in bcmp /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:887:10 #2 0x80b146 in fsmonitor_is_trivial_response /home/ahunt/oss-fuzz/git/fsmonitor.c:192:10 #3 0x80b146 in query_fsmonitor /home/ahunt/oss-fuzz/git/fsmonitor.c:175:7 #4 0x80a749 in refresh_fsmonitor /home/ahunt/oss-fuzz/git/fsmonitor.c:267:21 #5 0x80bad1 in tweak_fsmonitor /home/ahunt/oss-fuzz/git/fsmonitor.c:429:4 #6 0x90f040 in read_index_from /home/ahunt/oss-fuzz/git/read-cache.c:2321:3 #7 0x8e5d08 in repo_read_index_preload /home/ahunt/oss-fuzz/git/preload-index.c:164:15 #8 0x52dd45 in prepare_index /home/ahunt/oss-fuzz/git/builtin/commit.c:363:6 #9 0x52a188 in cmd_commit /home/ahunt/oss-fuzz/git/builtin/commit.c:1588:15 #10 0x4ce77e in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11 #11 0x4ccb18 in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3 #12 0x4cb01c in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4 #13 0x4cb01c in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19 #14 0x6aca8d in main /home/ahunt/oss-fuzz/git/common-main.c:52:11 #15 0x7fb027bf5349 in __libc_start_main (/lib64/libc.so.6+0x24349) #16 0x4206b9 in _start /home/abuild/rpmbuild/BUILD/glibc-2.26/csu/../sysdeps/x86_64/start.S:120 0x0000019e6e5e is located 2 bytes to the left of global variable 'strbuf_slopbuf' defined in 'strbuf.c:51:6' (0x19e6e60) of size 1 'strbuf_slopbuf' is ascii string '' 0x0000019e6e5e is located 126 bytes to the right of global variable 'signals' defined in 'sigchain.c:11:31' (0x19e6be0) of size 512 SUMMARY: AddressSanitizer: global-buffer-overflow /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:842:7 in MemcmpInterceptorCommon(void*, int (*)(void const*, void const*, unsigned long), void const*, void const*, unsigned long) Shadow bytes around the buggy address: 0x000080334d70: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 0x000080334d80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x000080334d90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x000080334da0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x000080334db0: 00 00 00 00 00 00 00 00 00 00 00 00 f9 f9 f9 f9 =>0x000080334dc0: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9[f9]01 f9 f9 f9 0x000080334dd0: f9 f9 f9 f9 03 f9 f9 f9 f9 f9 f9 f9 02 f9 f9 f9 0x000080334de0: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9 0x000080334df0: f9 f9 f9 f9 01 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 0x000080334e00: f9 f9 f9 f9 00 00 00 00 f9 f9 f9 f9 01 f9 f9 f9 0x000080334e10: f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb Shadow gap: cc Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Acked-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
derrickstolee
added a commit
that referenced
this pull request
Mar 24, 2021
In an independent investigation, I noticed that do_write_index() in read-cache.c has its own hashing logic and buffering mechanism. Specifically, the ce_write() method was introduced by 4990aad (Speed up index file writing by chunking it nicely, 2005-04-20) and similar mechanisms were introduced a few months later in c38138c (git-pack-objects: write the pack files with a SHA1 csum, 2005-06-26). Based on the timing, in the early days of the Git codebase, I figured that these roughly equivalent code paths were never unified only because it got lost in the shuffle. The hashfile API has since been used extensively in other file formats, such as pack-indexes, mult-pack-indexes, and commit-graphs. Therefore, it seems prudent to unify the index writing code to use the same mechanism. However, upon doing that refactoring process, I noticed that this caused some commands that write the index to slow down by 1-2%! I then looked for a reason why this could be. First, I noticed that the mechanisms use different buffer sizes. The hashfile uses an 8KB buffer while the index uses an 128KB buffer. Testing with a variety of different buffer sizes made little difference. Next, I inspected the buffering code itself, and found an important difference. Specifically, every call to hashwrite() was causing a flush of the filestream, even if it was a very small write. With many callers using helpers like hashwrite_be32() to write integers in network-byte order, this was leading to many more file flushes than necessary. This change modifies hashwrite() to always populate the hashfile buffer, and only flush when that buffer is full. This is safe to do because all consumers of a hashfile must call finalize_hashfile(), which flushes the buffer at the start. It is worth noting that this is modifying logic introduced by a8032d1 (sha1write: don't copy full sized buffers, 2008-09-02) which reduces memcpy() calls when the input buffer is sufficiently longer than the hashfile's buffer, causing nr to be the length of the full buffer. Use the input buffer directly in these cases. Since we don't guarantee that the buffer is flushed by the end of hashwrite(), we need to group some offset logic into the condition that memcpy() is necessary. Note that nr is equal to sizeof(f->buffer) only when f->offset is zero, so that condition does not need to be added here. As for performance, I focused on known commands that spend a significant amount of time writing through the hashfile API, especially if using small buffers as in hashwrite_be32(). 'git multi-pack-index write' was an excellent example (deleting the multi-pack-index file between runs) and demonstrated this performance change in the Linux kernal repo: Benchmark #1: old Time (mean ± σ): 2.229 s ± 0.143 s [User: 1.409 s, System: 0.327 s] Range (min … max): 2.160 s … 2.636 s 10 runs Benchmark #2: new Time (mean ± σ): 2.162 s ± 0.005 s [User: 1.392 s, System: 0.323 s] Range (min … max): 2.152 s … 2.172 s 10 runs Summary 'new' ran 1.03 ± 0.07 times faster than 'old' Similarly, the same command on the Git repository gave these numbers: Benchmark #1: old Time (mean ± σ): 230.5 ms ± 6.3 ms [User: 140.5 ms, System: 42.9 ms] Range (min … max): 221.7 ms … 240.6 ms 12 runs Benchmark #2: new Time (mean ± σ): 220.6 ms ± 5.1 ms [User: 139.5 ms, System: 34.1 ms] Range (min … max): 214.0 ms … 229.0 ms 13 runs Summary 'new' ran 1.05 ± 0.04 times faster than 'old' Finally, to demonstrate that performance holds when frequently using large buffers, the numbers below are for 'git pack-objects' packing all objects in the Git repository between v2.30.0 and v2.31.1: Benchmark #1: old Time (mean ± σ): 1.003 s ± 0.045 s [User: 1.877 s, System: 0.167 s] Range (min … max): 0.931 s … 1.044 s 10 runs Benchmark #2: new Time (mean ± σ): 976.4 ms ± 42.2 ms [User: 1.854 s, System: 0.192 s] Range (min … max): 940.1 ms … 1049.3 ms 10 runs Summary 'new' ran 1.03 ± 0.06 times faster than 'old' With these consistent improvements of 3-5%, it will be possible to move the index writing logic over to hashfile without performance degradation. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
derrickstolee
pushed a commit
that referenced
this pull request
Apr 10, 2021
shorten_unambiguous_ref() returns an allocated string. We have to track it separately from the const refname. This leak has existed since: 9ab55da (git symbolic-ref --delete $symref, 2012-10-21) This leak was found when running t0001 with LSAN, see also LSAN output below: Direct leak of 19 byte(s) in 1 object(s) allocated from: #0 0x486514 in strdup /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3 #1 0x9ab048 in xstrdup /home/ahunt/oss-fuzz/git/wrapper.c:29:14 #2 0x8b452f in refs_shorten_unambiguous_ref /home/ahunt/oss-fuzz/git/refs.c #3 0x8b47e8 in shorten_unambiguous_ref /home/ahunt/oss-fuzz/git/refs.c:1287:9 #4 0x679fce in check_symref /home/ahunt/oss-fuzz/git/builtin/symbolic-ref.c:28:14 #5 0x679ad8 in cmd_symbolic_ref /home/ahunt/oss-fuzz/git/builtin/symbolic-ref.c:70:9 #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11 #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3 #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4 #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19 #10 0x69cc6e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11 #11 0x7f98388a4349 in __libc_start_main (/lib64/libc.so.6+0x24349) Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
derrickstolee
pushed a commit
that referenced
this pull request
Apr 10, 2021
dwim_ref() allocs a new string into ref. Instead of setting to NULL to discard it, we can FREE_AND_NULL. This leak appears to have been introduced in: 4cf76f6 (builtin/reset: compute checkout metadata for reset, 2020-03-16) This leak was found when running t0001 with LSAN, see also LSAN output below: Direct leak of 5 byte(s) in 1 object(s) allocated from: #0 0x486514 in strdup /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3 #1 0x9a7108 in xstrdup /home/ahunt/oss-fuzz/git/wrapper.c:29:14 #2 0x8add6b in expand_ref /home/ahunt/oss-fuzz/git/refs.c:670:12 #3 0x8ad777 in repo_dwim_ref /home/ahunt/oss-fuzz/git/refs.c:644:22 #4 0x6394af in dwim_ref /home/ahunt/oss-fuzz/git/./refs.h:162:9 #5 0x637e5c in cmd_reset /home/ahunt/oss-fuzz/git/builtin/reset.c:426:4 #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11 #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3 #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4 #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19 #10 0x69c5ce in main /home/ahunt/oss-fuzz/git/common-main.c:52:11 #11 0x7f57ebb9d349 in __libc_start_main (/lib64/libc.so.6+0x24349) Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
derrickstolee
pushed a commit
that referenced
this pull request
Apr 10, 2021
Most of these pointers can safely be freed when cmd_clone() completes, therefore we make sure to free them. The one exception is that we have to UNLEAK(repo) because it can point either to argv[0], or a malloc'd string returned by absolute_pathdup(). We also have to free(path) in the middle of cmd_clone(): later during cmd_clone(), path is unconditionally overwritten with a different path, triggering a leak. Freeing the first path immediately after use (but only in the case where it contains data) seems like the cleanest solution, as opposed to freeing it unconditionally before path is reused for another path. This leak appears to have been introduced in: f38aa83 (use local cloning if insteadOf makes a local URL, 2014-07-17) These leaks were found when running t0001 with LSAN, see also an excerpt of the LSAN output below (the full list is omitted because it's far too long, and mostly consists of indirect leakage of members of the refs we are freeing). Direct leak of 178 byte(s) in 1 object(s) allocated from: #0 0x49a53d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3 #1 0x9a6ff4 in do_xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:41:8 #2 0x9a6fca in xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:62:9 #3 0x8ce296 in copy_ref /home/ahunt/oss-fuzz/git/remote.c:885:8 #4 0x8d2ebd in guess_remote_head /home/ahunt/oss-fuzz/git/remote.c:2215:10 #5 0x51d0c5 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1308:4 #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11 #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3 #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4 #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19 #10 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11 #11 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349) Direct leak of 165 byte(s) in 1 object(s) allocated from: #0 0x49a53d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3 #1 0x9a6fc4 in do_xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:41:8 #2 0x9a6f9a in xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:62:9 #3 0x8ce266 in copy_ref /home/ahunt/oss-fuzz/git/remote.c:885:8 #4 0x51e9bd in wanted_peer_refs /home/ahunt/oss-fuzz/git/builtin/clone.c:574:21 #5 0x51cfe1 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1284:17 #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11 #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3 #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4 #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19 #10 0x69c42e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11 #11 0x7f8fef0c2349 in __libc_start_main (/lib64/libc.so.6+0x24349) Direct leak of 178 byte(s) in 1 object(s) allocated from: #0 0x49a53d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3 #1 0x9a6ff4 in do_xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:41:8 #2 0x9a6fca in xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:62:9 #3 0x8ce296 in copy_ref /home/ahunt/oss-fuzz/git/remote.c:885:8 #4 0x8d2ebd in guess_remote_head /home/ahunt/oss-fuzz/git/remote.c:2215:10 #5 0x51d0c5 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1308:4 #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11 #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3 #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4 #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19 #10 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11 #11 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349) Direct leak of 165 byte(s) in 1 object(s) allocated from: #0 0x49a6b2 in calloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3 #1 0x9a72f2 in xcalloc /home/ahunt/oss-fuzz/git/wrapper.c:140:8 #2 0x8ce203 in alloc_ref_with_prefix /home/ahunt/oss-fuzz/git/remote.c:867:20 #3 0x8ce1a2 in alloc_ref /home/ahunt/oss-fuzz/git/remote.c:875:9 #4 0x72f63e in process_ref_v2 /home/ahunt/oss-fuzz/git/connect.c:426:8 #5 0x72f21a in get_remote_refs /home/ahunt/oss-fuzz/git/connect.c:525:8 #6 0x979ab7 in handshake /home/ahunt/oss-fuzz/git/transport.c:305:4 #7 0x97872d in get_refs_via_connect /home/ahunt/oss-fuzz/git/transport.c:339:9 #8 0x9774b5 in transport_get_remote_refs /home/ahunt/oss-fuzz/git/transport.c:1388:4 #9 0x51cf80 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1271:9 #10 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11 #11 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3 #12 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4 #13 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19 #14 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11 #15 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349) Direct leak of 105 byte(s) in 1 object(s) allocated from: #0 0x49a859 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3 #1 0x9a71f6 in xrealloc /home/ahunt/oss-fuzz/git/wrapper.c:126:8 #2 0x93622d in strbuf_grow /home/ahunt/oss-fuzz/git/strbuf.c:98:2 #3 0x937a73 in strbuf_addch /home/ahunt/oss-fuzz/git/./strbuf.h:231:3 #4 0x939fcd in strbuf_add_absolute_path /home/ahunt/oss-fuzz/git/strbuf.c:911:4 #5 0x69d3ce in absolute_pathdup /home/ahunt/oss-fuzz/git/abspath.c:261:2 #6 0x51c688 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1021:10 #7 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11 #8 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3 #9 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4 #10 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19 #11 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11 #12 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349) Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
derrickstolee
pushed a commit
that referenced
this pull request
Apr 10, 2021
Make sure that we release the temporary strbuf during dwim_branch() for all codepaths (and not just for the early return). This leak appears to have been introduced in: f60a7b7 (worktree: teach "add" to check out existing branches, 2018-04-24) Note that UNLEAK(branchname) is still needed: the returned result is used in add(), and is stored in a pointer which is used to point at one of: - a string literal ("HEAD") - member of argv (whatever the user specified in their invocation) - or our newly allocated string returned from dwim_branch() Fixing the branchname leak isn't impossible, but does not seem worthwhile given that add() is called directly from cmd_main(), and cmd_main() returns immediately thereafter - UNLEAK is good enough. This leak was found when running t0001 with LSAN, see also LSAN output below: Direct leak of 60 byte(s) in 1 object(s) allocated from: #0 0x49a859 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3 #1 0x9ab076 in xrealloc /home/ahunt/oss-fuzz/git/wrapper.c:126:8 #2 0x939fcd in strbuf_grow /home/ahunt/oss-fuzz/git/strbuf.c:98:2 #3 0x93af53 in strbuf_splice /home/ahunt/oss-fuzz/git/strbuf.c:239:3 #4 0x83559a in strbuf_check_branch_ref /home/ahunt/oss-fuzz/git/object-name.c:1593:2 #5 0x6988b9 in dwim_branch /home/ahunt/oss-fuzz/git/builtin/worktree.c:454:20 #6 0x695f8f in add /home/ahunt/oss-fuzz/git/builtin/worktree.c:525:19 #7 0x694a04 in cmd_worktree /home/ahunt/oss-fuzz/git/builtin/worktree.c:1036:10 #8 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11 #9 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3 #10 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4 #11 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19 #12 0x69caee in main /home/ahunt/oss-fuzz/git/common-main.c:52:11 #13 0x7f7b7dd10349 in __libc_start_main (/lib64/libc.so.6+0x24349) Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
derrickstolee
pushed a commit
that referenced
this pull request
Apr 10, 2021
The primary goal of this change is to stop leaking init_db_template_dir.
This leak can happen because:
1. git_init_db_config() allocates new memory into init_db_template_dir
without first freeing the existing value.
2. init_db_template_dir might already contain data, either because:
2.1 git_config() can be invoked twice with this callback in a single
process - at least 2 allocations are likely.
2.2 A single git_config() allocation can invoke the callback multiple
times for a given key (see further explanation in the function
docs) - each of those calls will trigger another leak.
The simplest fix for the leak would be to free(init_db_template_dir)
before overwriting it. Instead we choose to convert to fetching
init.templatedir via git_config_get_value() as that is more explicit,
more efficient, and avoids allocations (the returned result is owned by
the config cache, so we aren't responsible for freeing it).
If we remove init_db_template_dir, git_init_db_config() ends up being
responsible only for forwarding core.* config values to
platform_core_config(). However platform_core_config() already ignores
non-core.* config values, so we can safely remove git_init_db_config()
and invoke git_config() directly with platform_core_config() as the
callback.
The platform_core_config forwarding was originally added in:
2878533 (mingw: respect core.hidedotfiles = false in git-init again, 2019-03-11
And I suspect the potential for a leak existed since the original
implementation of git_init_db_config in:
90b4518 (Add `init.templatedir` configuration variable., 2010-02-17)
LSAN output from t0001:
Direct leak of 73 byte(s) in 1 object(s) allocated from:
#0 0x49a859 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0x9a7276 in xrealloc /home/ahunt/oss-fuzz/git/wrapper.c:126:8
#2 0x9362ad in strbuf_grow /home/ahunt/oss-fuzz/git/strbuf.c:98:2
#3 0x936eaa in strbuf_add /home/ahunt/oss-fuzz/git/strbuf.c:295:2
#4 0x868112 in strbuf_addstr /home/ahunt/oss-fuzz/git/./strbuf.h:304:2
#5 0x86a8ad in expand_user_path /home/ahunt/oss-fuzz/git/path.c:758:2
#6 0x720bb1 in git_config_pathname /home/ahunt/oss-fuzz/git/config.c:1287:10
#7 0x5960e2 in git_init_db_config /home/ahunt/oss-fuzz/git/builtin/init-db.c:161:11
#8 0x7255b8 in configset_iter /home/ahunt/oss-fuzz/git/config.c:1982:7
#9 0x7253fc in repo_config /home/ahunt/oss-fuzz/git/config.c:2311:2
#10 0x725ca7 in git_config /home/ahunt/oss-fuzz/git/config.c:2399:2
#11 0x593e8d in create_default_files /home/ahunt/oss-fuzz/git/builtin/init-db.c:225:2
#12 0x5935c6 in init_db /home/ahunt/oss-fuzz/git/builtin/init-db.c:449:11
#13 0x59588e in cmd_init_db /home/ahunt/oss-fuzz/git/builtin/init-db.c:714:9
#14 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
#15 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
#16 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
#17 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
#18 0x69c4de in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
#19 0x7f23552d6349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
derrickstolee
pushed a commit
that referenced
this pull request
Apr 10, 2021
preprocess_options() allocates new strings for help messages for OPTION_ALIAS. Therefore we also need to clean those help messages up when freeing the returned options. First introduced in: 7c28058 (parse-options: teach "git cmd -h" to show alias as alias, 2020-03-16) The preprocessed options themselves no longer contain any indication that a given option is/was an alias - therefore we add a new flag to indicate former aliases. (An alternative approach would be to look back at the original options to determine which options are aliases - but that seems like a fragile approach. Or we could even look at the alias_groups list - which might be less fragile, but would be slower as it requires nested looping.) As far as I can tell, parse_options() is only ever used once per command, and the help messages are small - hence this leak has very little impact. This leak was found while running t0001. LSAN output can be found below: Direct leak of 65 byte(s) in 1 object(s) allocated from: #0 0x49a859 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3 #1 0x9aae36 in xrealloc /home/ahunt/oss-fuzz/git/wrapper.c:126:8 #2 0x939d8d in strbuf_grow /home/ahunt/oss-fuzz/git/strbuf.c:98:2 #3 0x93b936 in strbuf_vaddf /home/ahunt/oss-fuzz/git/strbuf.c:392:3 #4 0x93b7ff in strbuf_addf /home/ahunt/oss-fuzz/git/strbuf.c:333:2 #5 0x86747e in preprocess_options /home/ahunt/oss-fuzz/git/parse-options.c:666:3 #6 0x866ed2 in parse_options /home/ahunt/oss-fuzz/git/parse-options.c:847:17 #7 0x51c4a7 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:989:9 #8 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11 #9 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3 #10 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4 #11 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19 #12 0x69c9fe in main /home/ahunt/oss-fuzz/git/common-main.c:52:11 #13 0x7fdac42d4349 in __libc_start_main (/lib64/libc.so.6+0x24349) Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
derrickstolee
pushed a commit
that referenced
this pull request
Apr 10, 2021
transport_get_remote_refs() can populate the transport struct's
remote_refs. transport_disconnect() is already responsible for most of
transport's cleanup - therefore we also take care of freeing remote_refs
there.
There are 2 locations where transport_disconnect() is called before
we're done using the returned remote_refs. This patch changes those
callsites to only call transport_disconnect() after the returned refs
are no longer being used - which is necessary to safely be able to
free remote_refs during transport_disconnect().
This commit fixes the following leak which was found while running
t0000, but is expected to also fix the same pattern of leak in all
locations that use transport_get_remote_refs():
Direct leak of 165 byte(s) in 1 object(s) allocated from:
#0 0x49a6b2 in calloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
#1 0x9a72f2 in xcalloc /home/ahunt/oss-fuzz/git/wrapper.c:140:8
#2 0x8ce203 in alloc_ref_with_prefix /home/ahunt/oss-fuzz/git/remote.c:867:20
#3 0x8ce1a2 in alloc_ref /home/ahunt/oss-fuzz/git/remote.c:875:9
#4 0x72f63e in process_ref_v2 /home/ahunt/oss-fuzz/git/connect.c:426:8
#5 0x72f21a in get_remote_refs /home/ahunt/oss-fuzz/git/connect.c:525:8
#6 0x979ab7 in handshake /home/ahunt/oss-fuzz/git/transport.c:305:4
#7 0x97872d in get_refs_via_connect /home/ahunt/oss-fuzz/git/transport.c:339:9
#8 0x9774b5 in transport_get_remote_refs /home/ahunt/oss-fuzz/git/transport.c:1388:4
#9 0x51cf80 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1271:9
#10 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
#11 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
#12 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
#13 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
#14 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
#15 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
derrickstolee
pushed a commit
that referenced
this pull request
Apr 30, 2021
versions could be an empty string_list. In that case, versions->items is NULL, and we shouldn't be trying to perform pointer arithmetic with it (as that results in undefined behaviour). Moreover we only use the results of this calculation once when calling QSORT. Therefore we choose to skip creating relevant_entries and call QSORT directly with our manipulated pointers (but only if there's data requiring sorting). This lets us avoid abusing the string_list API, and saves us from having to explain why this abuse is OK. Finally, an assertion is added to make sure that write_tree() is called with a valid offset. This issue has probably existed since: ee4012d (merge-ort: step 2 of tree writing -- function to create tree object, 2020-12-13) But it only started occurring during tests since tests started using merge-ort: f3b964a (Add testing with merge-ort merge strategy, 2021-03-20) For reference - here's the original UBSAN commit that implemented this check, it sounds like this behaviour isn't actually likely to cause any issues (but we might as well fix it regardless): https://reviews.llvm.org/D67122 UBSAN output from t3404 or t5601: merge-ort.c:2669:43: runtime error: applying zero offset to null pointer #0 0x78bb53 in write_tree merge-ort.c:2669:43 #1 0x7856c9 in process_entries merge-ort.c:3303:2 #2 0x782317 in merge_ort_nonrecursive_internal merge-ort.c:3744:2 #3 0x77feef in merge_incore_nonrecursive merge-ort.c:3853:2 #4 0x6f6a5c in do_recursive_merge sequencer.c:640:3 #5 0x6f6a5c in do_pick_commit sequencer.c:2221:9 #6 0x6ef055 in single_pick sequencer.c:4814:9 #7 0x6ef055 in sequencer_pick_revisions sequencer.c:4867:10 #8 0x4fb392 in run_sequencer revert.c:225:9 #9 0x4fa5b0 in cmd_revert revert.c:235:8 #10 0x42abd7 in run_builtin git.c:453:11 #11 0x429531 in handle_builtin git.c:704:3 #12 0x4282fb in run_argv git.c:771:4 #13 0x4282fb in cmd_main git.c:902:19 #14 0x524b63 in main common-main.c:52:11 #15 0x7fc2ca340349 in __libc_start_main (/lib64/libc.so.6+0x24349) #16 0x4072b9 in _start start.S:120 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior merge-ort.c:2669:43 in Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
derrickstolee
pushed a commit
that referenced
this pull request
May 14, 2021
limit_list() iterates over the original revs->commits list, and consumes
many of its entries via pop_commit. However we might stop iterating over
the list early (e.g. if we realise that the rest of the list is
uninteresting). If we do stop iterating early, list will be pointing to
the unconsumed portion of revs->commits - and we need to free this list
to avoid a leak. (revs->commits itself will be an invalid pointer: it
will have been free'd during the first pop_commit.)
However the list pointer is later reused to iterate over our new list,
but only for the limiting_can_increase_treesame() branch. We therefore
need to introduce a new variable for that branch - and while we're here
we can rename the original list to original_list as that makes its
purpose more obvious.
This leak was found while running t0090. It's not likely to be very
impactful, but it can happen quite early during some checkout
invocations, and hence seems to be worth fixing:
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0x9ac084 in do_xmalloc wrapper.c:41:8
#2 0x9ac05a in xmalloc wrapper.c:62:9
#3 0x7175d6 in commit_list_insert commit.c:540:33
#4 0x71800f in commit_list_insert_by_date commit.c:604:9
#5 0x8f8d2e in process_parents revision.c:1128:5
#6 0x8f2f2c in limit_list revision.c:1418:7
#7 0x8f210e in prepare_revision_walk revision.c:3577:7
#8 0x514170 in orphaned_commit_warning builtin/checkout.c:1185:6
#9 0x512f05 in switch_branches builtin/checkout.c:1250:3
#10 0x50f8de in checkout_branch builtin/checkout.c:1646:9
#11 0x50ba12 in checkout_main builtin/checkout.c:2003:9
#12 0x5086c0 in cmd_checkout builtin/checkout.c:2055:8
#13 0x4cd91d in run_builtin git.c:467:11
#14 0x4cb5f3 in handle_builtin git.c:719:3
#15 0x4ccf47 in run_argv git.c:808:4
#16 0x4caf49 in cmd_main git.c:939:19
#17 0x69dc0e in main common-main.c:52:11
#18 0x7faaabd0e349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Indirect leak of 48 byte(s) in 3 object(s) allocated from:
#0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0x9ac084 in do_xmalloc wrapper.c:41:8
#2 0x9ac05a in xmalloc wrapper.c:62:9
#3 0x717de6 in commit_list_append commit.c:1609:35
#4 0x8f1f9b in prepare_revision_walk revision.c:3554:12
#5 0x514170 in orphaned_commit_warning builtin/checkout.c:1185:6
#6 0x512f05 in switch_branches builtin/checkout.c:1250:3
#7 0x50f8de in checkout_branch builtin/checkout.c:1646:9
#8 0x50ba12 in checkout_main builtin/checkout.c:2003:9
#9 0x5086c0 in cmd_checkout builtin/checkout.c:2055:8
#10 0x4cd91d in run_builtin git.c:467:11
#11 0x4cb5f3 in handle_builtin git.c:719:3
#12 0x4ccf47 in run_argv git.c:808:4
#13 0x4caf49 in cmd_main git.c:939:19
#14 0x69dc0e in main common-main.c:52:11
#15 0x7faaabd0e349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Teach Git to construct and consume a serialized commit graph for faster commit walks. Has ability to extend to include generation numbers.