persist time to tablet in bulk update #4072
When bulk import operations set time and a tablet was hosted, the time was not persisted. The bulk import fate operation now persists time in tablet metadata. The tablet code assumed it was the only thing updating a tablet's time field. The tablet code was modified to accommodate the bulk import code running in the manager updating the tablet's time column in the metadata table.
dlmarion
left a comment
I'm not quite sure how tablet time works in general. It might be useful to have a discussion about it so that I/we can understand the changes here.
Here is some info. Each tablet has a concept of time that only moves forward. This time is persisted in the tablet's metadata table entry. As mutations arrive at a tablet, if a mutation's time was not explicitly set, then it is set on the mutation and the tablet's time is incremented. For bulk import, one can optionally set the tablet's current time on an entire file. This works by allocating a timestamp for the tablet and then persisting it with the bulk file entry. When the bulk file is read, if a timestamp is present for the file, then it is applied to everything in the file.
The ability to set time on bulk imports allows bulk imports to be properly ordered w.r.t. writes. Like the above, it would set the timestamp on the 2nd write, made by the bulk import, such that it is higher than the timestamp on the first write. This PR adds coordination between a hosted tablet and the bulk import code running in the manager to ensure that the timestamp is set correctly.
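To make the time model described in this comment concrete, here is a minimal sketch. It is not Accumulo's actual code: the class `TabletTimeSketch` and all of its method names are hypothetical, and it assumes a simple logical counter. It only illustrates three ideas from the comment: time moves forward only, a bulk file gets one allocated timestamp, and a time persisted by another writer (such as the manager) is merged without moving backward.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical illustration of logical tablet time; not an Accumulo class. */
public class TabletTimeSketch {
    // Last timestamp handed out for this tablet; it only ever increases.
    private final AtomicLong time;

    public TabletTimeSketch(long persistedTime) {
        // Start from the value read from the tablet's metadata entry.
        this.time = new AtomicLong(persistedTime);
    }

    /** Timestamp for a mutation that did not set its time explicitly. */
    public long nextMutationTime() {
        return time.incrementAndGet();
    }

    /** One timestamp for an entire bulk file, to be stored with the file entry. */
    public long allocateBulkTime() {
        return time.incrementAndGet();
    }

    /**
     * Merge a time that another component persisted to metadata.
     * Assumption: taking the maximum is enough to keep time monotonic.
     */
    public long mergePersistedTime(long persisted) {
        return time.accumulateAndGet(persisted, Math::max);
    }

    public long current() {
        return time.get();
    }

    public static void main(String[] args) {
        TabletTimeSketch tablet = new TabletTimeSketch(100);
        long firstWrite = tablet.nextMutationTime(); // ordinary write
        long bulkFile = tablet.allocateBulkTime();   // later bulk import
        System.out.println("first write: " + firstWrite);
        System.out.println("bulk file:   " + bulkFile);
        // An older persisted value must not move time backward.
        System.out.println("after stale merge: " + tablet.mergePersistedTime(50));
        // A newer persisted value moves time forward.
        System.out.println("after newer merge: " + tablet.mergePersistedTime(200));
    }
}
```

In this sketch the bulk file's timestamp is always higher than that of the earlier write, which is the ordering property the comment describes. How the real tablet code reconciles its in-memory time with the metadata column is not shown in this conversation, so the max-merge here is only an assumption.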
Does this close #3354?
This todo was already done in apache#4072