feat(db/trans-cache): optimize for bloomFilter initialization#5394
Conversation
Codecov Report
❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more. @@ Coverage Diff @@
## develop #5394 +/- ##
=============================================
+ Coverage 60.89% 60.94% +0.04%
- Complexity 9231 9246 +15
=============================================
Files 839 839
Lines 50029 50140 +111
Branches 5574 5575 +1
=============================================
+ Hits 30467 30558 +91
- Misses 17176 17194 +18
- Partials 2386 2388 +2
... and 3 files with indirect coverage changes 📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
| throw new RuntimeException(e); | ||
| } | ||
| try (OutputStream out = new BufferedOutputStream(Files.newOutputStream(file, | ||
| StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE))) { |
There was a problem hiding this comment.
Given that OS cache flushing to the disk is not immediate, is it possible that the property file flushed successfully but partly of bloomFilters is still in the OS cache(then corrupt when OS crashed)?
Add another option StandardOpenOption.SYNC? Or write all data into one file and make sure the metadata at the end also can achieve the atomic, but this will introduce complexity
There was a problem hiding this comment.
Given that OS cache flushing to the disk is not immediate, is it possible that the property file flushed successfully but partly of bloomFilters is still in the OS cache(then corrupt when OS crashed)?
Yeah, it's gonna happen. But it doesn't seem to be a problem, it will rollback to previous mode.
There was a problem hiding this comment.
Add another option StandardOpenOption.SYNC? Or write all data into one file and make sure the metadata at the end also can achieve the atomic, but this will introduce complexity
This PR is a helper feature. StandardOpenOption.SYNC is good , But it may slow down the closing procedure if you're so concerned about closing.
There was a problem hiding this comment.
How long does StandardOpenOption.SYNC take?
There was a problem hiding this comment.
300ms -> 1.2 s for ssd
There was a problem hiding this comment.
700ms -> +15s for non-ssd, StandardOpenOption.SYNC is not a good choice!
|
|
||
| private boolean recovery(int index, Path file) { | ||
| try (InputStream in = new BufferedInputStream(Files.newInputStream(file, | ||
| StandardOpenOption.READ, StandardOpenOption.DELETE_ON_CLOSE))) { |
There was a problem hiding this comment.
DELETE_ON_CLOSE means the bloomFilters files only work once, the situation that node recovery finished but shutdown immediately is ignored, right?
There was a problem hiding this comment.
kill -9 or kill -15 ? if kill -15 , bloomFilters will be dump again.
There was a problem hiding this comment.
the situation that node recovery finished but shutdown immediately is ignored, right?
Sorry, i've resubmitted it to avoid dump wrong cache, please recheck it.
| private BloomFilter<byte[]>[] bloomFilters = new BloomFilter[2]; | ||
| // filterStartBlock record the start block of the active filter | ||
| private volatile long filterStartBlock = INVALID_BLOCK; | ||
| private volatile long currentBlockNum = INVALID_BLOCK; |
There was a problem hiding this comment.
A check for currentBlockNum with block header is needed?
Node shutdown using Kill -9 may result in DBS being inconsistent, so this PR only works with a graceful shutdown? Just for a confirm
There was a problem hiding this comment.
A check for currentBlockNum with block header is needed?
Good question, but how might it be inconsistent? Yes ,a graceful shutdown is required
There was a problem hiding this comment.
Two questions, with no relation. A check just for a double check, seems it introduces no more benefit.
how might it be inconsistent
The whole shutdown process is complicated, still thinking
There was a problem hiding this comment.
Maybe the empty blocks will make different, a check is not required.
chainbase/src/main/java/org/tron/core/db2/common/TxCacheDB.java
Outdated
Show resolved
Hide resolved
| } | ||
|
|
||
| public void init() { | ||
| if (recovery()) { |
There was a problem hiding this comment.
- Recovery failure will affect the initialization logic
- Some exceptions are caught, and some are not. Why is this happening
- Logically speaking, as long as there is an exception, it is a failure, so there is no need to handle those exceptions, so you only need to handle the exception at the outermost layer
| "txCache.properties"); | ||
| Map<String, String> properties = readProperties(txCacheProperties); | ||
|
|
||
| if (properties == null || properties.size() != 3) { |
There was a problem hiding this comment.
If properties are read successfully but bloom fails, will there be any problem?
There was a problem hiding this comment.
If new fields are added to the properties database in the future, it will have an impact
There was a problem hiding this comment.
If properties are read successfully but bloom fails, will there be any problem?
It will fail and fallback to the original loading mode.
There was a problem hiding this comment.
If new fields are added to the properties database in the future, it will have an impact
Wouldn't consider it for now.
45840fc to
8566f5a
Compare
8566f5a to
7d1eac9
Compare
close #5355