Skip to content

Add a merge column marker to make merge idempotent#3957

Merged
cshannon merged 6 commits into
apache:mainfrom
cshannon:accumulo-3938
Nov 18, 2023
Merged

Add a merge column marker to make merge idempotent#3957
cshannon merged 6 commits into
apache:mainfrom
cshannon:accumulo-3938

Conversation

@cshannon
Copy link
Copy Markdown
Contributor

@cshannon cshannon commented Nov 18, 2023

This PR fixes no-chop merge so that it is idempotent and will no longer fail on recovery if the last tablet in the merge range was already fenced. Previously on resume if it tried to fence the files again there would be disjoint files seen from previously copied files during the merge. See #3938 for more details.

This PR makes the following changes:

  1. A new MERGED column marker is inserted into the last tablet as part of the mutation that contains all the fenced file metadata update so the marker is flushed and atomic with the metadata updates. If a failure occurs after this point and before completion on resume this marker can be read to know that fencing was already completed so we can skip this operation and continue to tablet deletion. Note that this marker is only set for merge and not for deletion as it isn't needed when deleting.
  2. The value of the column doesn't matter as just the existing of the marker is needed to know whether or not the files were already fenced..
  3. A new state called MERGED has also been added to MergeState. This new state is necessary so we can clear the marker. After the merge is completed the state is set to MERGED which allows the MERGED marker to be cleared from the last table if it was set (merge only and not delete).
  4. The update of the prev row on the last tablet has been moved to the same mutation as the metadata updates so that the operation is atomic.

Testing notes:

Trying to test a failure with an IT is not really possible due to how TGW is constructed so I did a lot of manual testing with throwing exceptions and recovery to make sure things worked properly. I also added whatever tests to this PR I could. I added unit tests to verify the new MERGED marker is properly encoded/decoded into the metadata table and I also updated the ITs to verify that the MERGED marker is cleared correctly.

This closes #3938

The new MERGE marker allows correctly resuming metadata updates for
no-chop merge if there was a failure and resume
@cshannon cshannon marked this pull request as ready for review November 18, 2023 16:10
@cshannon cshannon self-assigned this Nov 18, 2023
boolean isMerged = false;
Value hasPrevRowColumn = null;

while (iterator.hasNext()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think an assumption of this loop is that its iterating over keys from the same row. It would be good to validate that. In the case where the metadata table is corrupt (like a tablet is missing a prev end row, it would be good to fail here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a check to verify the columns are all seen from the same first row

Comment thread server/manager/src/main/java/org/apache/accumulo/manager/TabletGroupWatcher.java Outdated
Comment thread server/manager/src/main/java/org/apache/accumulo/manager/TabletGroupWatcher.java Outdated
@keith-turner
Copy link
Copy Markdown
Contributor

Wondering if moving deleting tablets to run in the new MERGED state would clean up the code any. Not sure.

Also moved deleteTablets to the MERGED state
@cshannon
Copy link
Copy Markdown
Contributor Author

Wondering if moving deleting tablets to run in the new MERGED state would clean up the code any. Not sure.

This was done in the latest commit

Copy link
Copy Markdown
Contributor

@keith-turner keith-turner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I did not get to analyze was how the addition of the new mergestate impacts the deleterows code. Does the deleterows code walk through the merging and merged states?

Comment thread server/manager/src/main/java/org/apache/accumulo/manager/TabletGroupWatcher.java Outdated
Comment on lines +1039 to +1043
try (BatchWriter bw = client.createBatchWriter(targetSystemTable)) {
deleteTablets(info, scanRange, bw, client);

// Clear the merged marker after we finish deleting tablets
clearMerged(info);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could pass the batchwriter and/or highTablet. Also could flush the batchwriter.

Suggested change
try (BatchWriter bw = client.createBatchWriter(targetSystemTable)) {
deleteTablets(info, scanRange, bw, client);
// Clear the merged marker after we finish deleting tablets
clearMerged(info);
try (BatchWriter bw = client.createBatchWriter(targetSystemTable)) {
deleteTablets(info, highTablet, scanRange, bw, client);
// flush any changes before removing merge marker, after merge marker is removed may not run again
bw.flush();
// Clear the merged marker after we finish deleting tablets
clearMerged(info, bw, highTablet);

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops good catch, I meant to pass the bw to that when I refactored it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can skip the flush as it's already done by deleteTablets here

private void deleteMergedTablets(MergeInfo info) throws AccumuloException {
KeyExtent range = info.getExtent();
Manager.log.debug("Deleting merged tablets for {}", range);
HighTablet highTablet = getHighTablet(range);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be able to just return if the merged marker is not present, would indicate running a 2nd time. But if runnning the code again in this case is not harmful that may not matter.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point, it probably doesn't matter either way but still can return early as no reason to execute the following code if the merge marker is gone

@cshannon
Copy link
Copy Markdown
Contributor Author

One thing I did not get to analyze was how the addition of the new mergestate impacts the deleterows code. Does the deleterows code walk through the merging and merged states?

Yeah the delete rows code walks through the same state but it doesn't need to clear the marker and it doesn't need to do anything extra as it's already idempotent.

The code checks on this line and then just skips to setting COMPLETE. We could just skip MERGED altogether for delete rows but figured we should just walk through all the states but either would work

Comment thread server/manager/src/main/java/org/apache/accumulo/manager/TabletGroupWatcher.java Outdated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Development

Successfully merging this pull request may close these issues.

No chop merge is not idempotent and can fail on recovery

3 participants