Fix issue #125: sign block with old key if signing key updated in same block #1203
Conversation
When the signing key is updated in the same block, block production will fail.
Travis complains:
Related code: bitshares-core/tests/tests/block_tests.cpp, lines 327 to 331 at commit 37e72ad.
Failed to reopen the database file after it was closed? Any idea how to fix this?
Destroyed the db object and created a new one. Travis-CI no longer complains.
moved `init_witness_key_cache()` call from `plugin_startup()` to `plugin_initialize()`.
Force-pushed from 449745f to bf0d807.
   private:
      void _apply_block( const signed_block& next_block );
      processed_transaction _apply_transaction( const signed_transaction& trx );
      void _cancel_bids_and_revive_mpa( const asset_object& bitasset, const asset_bitasset_data_object& bad );

      /// For tracking signing keys of specified witnesses, only update when applied a block or popped a block
      flat_map< witness_id_type, optional<public_key_type> > _witness_key_cache;
IMO it is a bad idea to introduce block-dependent state in database that is not covered by object_db. This state must be maintained manually, which is fragile. I think the problem at hand can be solved in a simpler way.
@@ -250,7 +253,7 @@ block_production_condition::block_production_condition_enum witness_plugin::mayb
    }

    fc::time_point_sec scheduled_time = db.get_slot_time( slot );
-   graphene::chain::public_key_type scheduled_key = scheduled_witness( db ).signing_key;
+   graphene::chain::public_key_type scheduled_key = *db.find_witness_key_from_cache( scheduled_witness ); // should be valid
I think instead of using the cache, witness_plugin could simply connect to the database::applied_block signal and use that to retrieve the latest keys of the witness(es) it is interested in.
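A rough sketch of what that could look like in the witness plugin, assuming the boost::signals2 `applied_block` signal on `database`; the refresh helper and the way the plugin obtains its database reference are illustrative, not from this PR:

```cpp
// Sketch only: refresh a small local key cache every time a block is applied.
// "database()" stands for however the plugin obtains its chain::database reference.
void witness_plugin::plugin_startup()
{
   chain::database& d = database();
   d.applied_block.connect( [this]( const chain::signed_block& b )
   {
      refresh_witness_key_cache();   // hypothetical helper, sketched further below
   } );
   // ... existing startup logic ...
}
```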
Essentially you mean moving the cache to the witness plugin. Because there is no signal when a block is popped, the cache would need to live in (extended) chain state in order to deal with chain reorganization. Then, to make it work, we'd need either to replay the whole chain to enable the plugin, or to add code that refreshes the cache (copies data from main chain state to the extended part) on startup. And in case there is a reorganization back to the startup head block, we'd need to make sure the cache doesn't get emptied.
So far I don't see how that is simpler.
The witness plugin is only interested in the current head block. Blocks are only popped when switching forks, which means there will be block pushes immediately after the pop, which means the cache will automatically be updated in that case.
This is slightly different from the cache in database itself - that one should always be up to date, even after popping blocks.
Overall, I think even if the resulting code is not significantly simpler, the containment in the witness plugin is a big plus.
I agree that ideally it's better to put the logic in the witness plugin, because block generation is the witness plugin's business.
> Blocks are only popped when switching forks, which means there will be block pushes immediately after the pop, which means the cache will automatically be updated in that case.
One scenario to consider: if we put the cache in the witness plugin and code the plugin to track applied operations only, then after a block containing a `witness_update_operation` is popped, the cache may become inconsistent; and if the new block pushed afterwards doesn't contain the same `witness_update_operation`, the cache remains inconsistent. To solve this, we'd need additional code to detect that a chain reorganization has occurred and refresh the cache accordingly, if not simply refresh the whole cache on every block. However, refreshing the cache means fetching data from main chain state, and since signalling is a multi-threading thing, the main state could have changed while we fetch it, which means the cache is not 100% reliable.
By putting the cache in the database class we can guarantee its consistency. Perhaps there will still be edge cases, for example the witness plugin trying to generate a block while the database is in the middle of pushing one.
Another thing is that we don't have unit tests for the witness plugin, so adding more code there means more work and more risk.
The witness plugin typically needs to keep track of the keys of only one witness. Refreshing this after each pushed block should have minimal cost, and only for the node in question.
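For illustration, such a per-block refresh could be as small as the following sketch; `_witnesses` mirrors the plugin's configured witness IDs, while `_signing_keys` and the helper name are hypothetical:

```cpp
// Sketch only: look up the signing key of each controlled witness after a block
// is applied. With one or a handful of witnesses this is a negligible per-block cost.
void witness_plugin::refresh_witness_key_cache()
{
   const chain::database& d = database();
   for( const chain::witness_id_type& wit : _witnesses )   // usually just one entry
      _signing_keys[wit] = wit(d).signing_key;              // plain object_db lookup
}
```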
I still don't like refreshing from the plugin by querying chain state directly, because the chain state may already have become dirty by the time we refresh. For example, this scenario:
- there was a `witness_update_operation` in `_pending_transactions`
- a block is pushed to the database without that operation
- that operation gets pushed to the database again soon after the block push finishes
- the witness plugin receives the "applied_block" signal
- the witness plugin tries to refresh its local cache, but gets the updated key (dirty data)
That said, ideally we should keep a clean state, or a subset of it, in the database for plugins or other modules to use. This PR is an attempt in that direction; IMHO we can merge it now and refactor it into a more generic module later when necessary.
I think the applied_block signal is processed synchronously, right after the block has been applied. There shouldn't be any transactions pushed in between.
https://github.com/bitshares/bitshares-core/blob/2.0.180612/libraries/chain/db_block.cpp#L550
Oh, right, I forgot that. The state should be clean when processing the applied_block event.
By the way, why do we need to sleep here and there in the test cases? IIRC chain_test will sometimes segfault without sleeping. E.g.:
fc::usleep(fc::milliseconds(200)); // sleep a while to execute callback in another thread
I think it's required in some places where API callbacks happen. These are asynchronous.
Shouldn't be necessary in market_rounding_tests IMO.
libraries/chain/db_block.cpp (outdated):
    if( !(skip & skip_witness_signature) )
-      FC_ASSERT( witness_obj.signing_key == block_signing_private_key.get_public_key() );
+   {
+      auto signing_key = find_witness_key_from_cache( witness_id );
Instead of using the cache, simply move the existing FC_ASSERT down a couple of lines, after _pending_tx_session.reset();
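If I read the suggestion correctly, the ordering would be roughly as follows (a sketch with surrounding context abbreviated, not the final code):

```cpp
// Sketch only: perform the key check after the pending-transaction session has been
// undone, so witness_obj reflects head-block state rather than pending-transaction state.
_pending_tx_session.reset();

if( !(skip & skip_witness_signature) )
   FC_ASSERT( witness_obj.signing_key == block_signing_private_key.get_public_key() );
```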
It's cheaper to check here than to check after calling `_pending_tx_session.reset();`.
This will only hurt the node that is trying to produce without proper keys.
Even for nodes with proper keys, it's still slightly cheaper, because the cache would be much smaller than the object database.
For nodes with keys it may be cheaper, but the global cache is a penalty for all nodes.
Moved the signing key cache to the witness plugin as recommended by @pmconrad.
Resolved conflicts: libraries/chain/db_block.cpp
which means we assume it will be no more than 3 bytes after serialization. 21 bits allows at most 2^21-1 = 2,097,151 transactions in a block, which is practically safe, although theoretically the size is an unsigned_int that can be up to 56 bits.
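For reference, a base-128 varint stores 7 payload bits per byte, so 3 bytes cover exactly 21 bits of value. A small illustrative byte-count calculation (this is a sketch, not the actual fc::raw packing code):

```cpp
#include <cstdint>
#include <cstdio>

// Sketch only: each byte of a base-128 varint carries 7 payload bits,
// so N bytes can hold values up to 2^(7*N) - 1.
static unsigned varint_byte_count( uint64_t v )
{
   unsigned bytes = 1;
   while( v >>= 7 )   // every extra 7 bits of payload costs one more byte
      ++bytes;
   return bytes;
}

int main()
{
   std::printf( "%u\n", varint_byte_count( 2097151 ) );   // 2^21 - 1 -> 3 bytes
   std::printf( "%u\n", varint_byte_count( 2097152 ) );   // 2^21     -> 4 bytes
   return 0;
}
```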
Looks good, thanks! (And much simpler than before, IMO.)
The PR contains several unrelated changes. That's OK, IMO - no need to create separate PRs for minor code improvements. We need to keep this balanced though; perhaps not quite as many unrelated things in the future.
   {
      // Note: if this check failed (which won't happen in normal situations),
      // we would have temporarily broken the invariant that
      // _pending_tx_session is the result of applying _pending_tx.
At this point _pending_tx_session has been cleared, so the comment makes no sense.
Yes, `_pending_tx_session` has been cleared, but `_pending_tx` hasn't, so they're inconsistent. That's why I wrote the comment.
Thanks. Never thought of this as an invariant, but I guess it's ok.
@@ -532,7 +538,17 @@ void database::_apply_block( const signed_block& next_block )
    uint32_t skip = get_node_properties().skip_flags;
    _applied_ops.clear();

    FC_ASSERT( (skip & skip_merkle_check) || next_block.transaction_merkle_root == next_block.calculate_merkle_root(), "", ("next_block.transaction_merkle_root",next_block.transaction_merkle_root)("calc",next_block.calculate_merkle_root())("next_block",next_block)("id",next_block.id()) );
+   if( !(skip & skip_block_size_check) )
Should at least soft-fork protect this.
tests/tests/block_tests.cpp (outdated):
   // should fail to push if it's too large
   if( fc::raw::pack_size(maybe_large_block) > gpo.parameters.maximum_block_size )
   {
      BOOST_CHECK_THROW( db.push_block(maybe_large_block), fc::exception );
The test should ensure that this is triggered a couple of times. Perhaps increment a counter each time and BOOST_CHECK_EQUAL the counter after the loop ends. (Not sure how often it should trigger...)
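A rough sketch of that counter pattern (helper names and loop bounds are illustrative, not the actual test code; the final check could be BOOST_CHECK_EQUAL if the exact count is known):

```cpp
// Sketch only: count how many generated blocks actually exercise the oversized path,
// and fail the test if it never triggers.
uint32_t oversized_block_count = 0;
for( uint32_t i = 0; i < iterations; ++i )                  // "iterations" is illustrative
{
   signed_block maybe_large_block = build_test_block( i );  // hypothetical helper
   if( fc::raw::pack_size(maybe_large_block) > gpo.parameters.maximum_block_size )
   {
      BOOST_CHECK_THROW( db.push_block(maybe_large_block), fc::exception );
      ++oversized_block_count;
   }
   else
      db.push_block(maybe_large_block);
}
BOOST_CHECK_GE( oversized_block_count, 1u );                // triggered at least once
```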
Added a check to make sure it has been triggered at least once. I think it's enough. Didn't use `BOOST_CHECK_EQUAL` because it would be a pain when trying to tweak the loop in the future.
OK, I won't insist on the soft fork.
PR for #125.
This PR also contains a fix for the block size check and a fix for removing items from the fork db.
Things to be done:
- `witness_update_operation`
- `witness_create_operation`
- test case for update: I have done some manual tests; it's too much effort to create such test cases.
- `witness` plugin