Skip to content

Updates merge code to support merging many tablets#3934

Merged
keith-turner merged 3 commits into
apache:elasticityfrom
keith-turner:merge_one_million
Nov 9, 2023
Merged

Updates merge code to support merging many tablets#3934
keith-turner merged 3 commits into
apache:elasticityfrom
keith-turner:merge_one_million

Conversation

@keith-turner
Copy link
Copy Markdown
Contributor

Using the conditional writer changes from #3929 this commit updates the
merge code to support merging more tablets than will fit in memory. Also
adds merge and clone operations to the SplitMillionIT and merges a table
with one million tablets in the test. Without the changes in this
commit that test would cause the manager to die with an out of memory
error.

Using the conditional writer changes from apache#3929 this commit updates the
merge code to support merging more tablets than will fit in memory. Also
adds merge and clone operations to the SplitMillionIT and merges a table
with one million tablets in the test.  Without the changes in this
commit that test would cause the manager to die with an out of memory
error.
@keith-turner keith-turner requested a review from cshannon November 7, 2023 02:33
@keith-turner keith-turner linked an issue Nov 7, 2023 that may be closed by this pull request
8 tasks
Copy link
Copy Markdown
Contributor

@cshannon cshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, only comment as noted already is I'm just wondering if it would make sense to try and share some of the code for delete/split/merge fate operations for things like reserving tablets but may not be worth the refactor if there's enough small different and variables.

var opid = TabletOperationId.from(TabletOperationType.MERGING, tid);

AtomicLong opsAccepted = new AtomicLong(0);
BiConsumer<KeyExtent,Ample.ConditionalResult> resultConsumer = (extent, result) -> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we should have some shared Abstract class or utility for ReserveTablets for things like this BiConsumer that are basically identical between delete/merge

@cshannon cshannon self-requested a review November 7, 2023 18:03
Copy link
Copy Markdown
Contributor

@cshannon cshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I talked with @keith-turner in detail about this and we realized this would cause problems as this is currently written as using an async writer will flush periodically and when the last tablet is processed it could see disjoint files from the previous copying since it is no longer done in one buffering in memory. This is also a problem during failure and resume.I can let @keith-turner comment too but this will likely be reworked to add extra verification and to limit the merge operation so a number of files that can be done in one operation to prevent a bad state and to keep the copy/fencing as being buffered to it happens at once. Currently in 2.1/main it's not an issue since everything is buffered into one mutation before being written so it's atomic. We likely should add a file limit check there for merge as well to prevent blowing memory if too many files.

@keith-turner
Copy link
Copy Markdown
Contributor Author

Adding to @cshannon comment, we did come up with ways to work around the issues identified. However, we came to the conclusion that making merge able to create a single tablet with more files than can fit in memory would cause problems for split, compaction, scan, etc. Realized it was not worthwhile pursing these changes and decided it would be best to do the following.

  • Add a property that limits the amount files that would be merged into a single tablet. For example if the limit was 10K and if a merge tried to run on 1000 tablets each having 100 files then it would fail before making any metadata changes. If the merge ran then it would create a single tablet with 100K files.
  • Still allow merging a large number of tablets as long as there is not a large number of files. For example merging 1 million tablets with a total of 100 file across all tablets would still work.

Going to close this PR, rework this code and open a new PR that supports the above.

A check was added to ensure that tablets being merged will not result in
creating a tablet with a massive number of files.  Added test for
merging tablets with too many files.  Removed the code in merge that
wrote files in multiple mutations.  Went back to buffering all files and
writing them in a single mutation.
@keith-turner keith-turner reopened this Nov 8, 2023
@keith-turner
Copy link
Copy Markdown
Contributor Author

keith-turner commented Nov 8, 2023

@cshannon I reopened this PR after making the changes 4113dc5. Thought that made more sense than opening a new PR.

Copy link
Copy Markdown
Contributor

@cshannon cshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes LGTM, I like the tests and how they verify things are still in a good state even after failure if too many files are detected

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: ✅ Done

Development

Successfully merging this pull request may close these issues.

Tests table operations with lots of tablets

3 participants