While thinking about how to introduce the new crypto types required by the Sassafras protocol, I concluded that, before blindly starting to write code, it would be better to first open a discussion about how cryptographic primitives are used by Substrate, possibly reducing some technical debt along the way.
Things to know:
- KeyTypeId: identifies the context of application of a key (e.g. babe or grandpa)
- CryptoTypeId: identifies the cryptographic scheme (e.g. sr25519 or ed25519)
- Keystore: an object used to work with secrets persisted somewhere (typically in the local filesystem)
- raw public key: an opaque byte array
- constructed public key: a key of one particular scheme (e.g. sr25519::Public)
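To make the terms above concrete, here is a minimal sketch of how they could look as Rust types. The four-byte newtype layout and the concrete byte values are assumptions made for illustration and should be checked against sp_core; `RawPublic` and `Public::from_raw` are hypothetical names, not existing API.

```rust
/// Identifies the application a key is used for (assumed layout: four bytes).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct KeyTypeId(pub [u8; 4]);

/// Identifies the cryptographic scheme (assumed layout: four bytes).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct CryptoTypeId(pub [u8; 4]);

// Illustrative values only.
pub const BABE: KeyTypeId = KeyTypeId(*b"babe");
pub const GRANDPA: KeyTypeId = KeyTypeId(*b"gran");
pub const SR25519: CryptoTypeId = CryptoTypeId(*b"sr25");
pub const ED25519: CryptoTypeId = CryptoTypeId(*b"ed25");

/// A raw public key: opaque bytes, the scheme is carried separately.
pub type RawPublic = Vec<u8>;

pub mod sr25519 {
    /// A constructed public key: the scheme is encoded in the type.
    #[derive(Clone, Debug, PartialEq, Eq)]
    pub struct Public(pub [u8; 32]);

    impl Public {
        /// Hypothetical helper: build a typed key from raw bytes,
        /// rejecting inputs that are not exactly 32 bytes long.
        pub fn from_raw(raw: &[u8]) -> Option<Self> {
            if raw.len() != 32 {
                return None;
            }
            let mut bytes = [0u8; 32];
            bytes.copy_from_slice(raw);
            Some(Public(bytes))
        }
    }
}
```

The point of the distinction is that a raw key needs a CryptoTypeId next to it to be interpreted, while a constructed key does not.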
Keystore
Currently, there are some inconsistencies in the keystore API.
I'm going to refer to the Local keystore implementation (currently the only implementation provided by Substrate, not counting the testing one).
Signing
To sign a message there is the sign_with method, which takes a KeyTypeId, a CryptoTypeId and a raw public key.
The public key and key-type-id are used to look up the secret, while the crypto-type-id is used to select the proper scheme to sign the message. A constructed public key is built within the method after a match over all the supported CryptoTypeId values.
The interface also provides sign_with_any and sign_with_all, with an interface similar to sign_with.
The former is not used anywhere, and the latter can be replaced by a loop in the caller (IMO it is not called often enough to justify an extra method).
Separately, there is a special ecdsa_sign_prehashed method. It instead takes a constructed ecdsa::Public key.
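The sign_with dispatch described above can be sketched roughly as follows. This is not the actual keystore code: `ToyKeystore`, the error variants, the typed key stand-ins and `toy_sign` (a placeholder that is NOT cryptography) are all hypothetical, and the return type is simplified. It only shows the shape: raw key in, match over CryptoTypeId, constructed key built inside. The free function at the end shows sign_with_all written as a loop on the caller side.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct KeyTypeId(pub [u8; 4]);
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct CryptoTypeId(pub [u8; 4]);

pub const SR25519: CryptoTypeId = CryptoTypeId(*b"sr25");
pub const ED25519: CryptoTypeId = CryptoTypeId(*b"ed25");

// Stand-ins for the constructed (typed) public keys.
pub struct Sr25519Public(pub [u8; 32]);
pub struct Ed25519Public(pub [u8; 32]);

#[derive(Debug, PartialEq, Eq)]
pub enum Error {
    KeyNotSupported(CryptoTypeId),
    InvalidPublic,
    KeyNotFound,
}

// Placeholder "signature" (NOT cryptography): scheme tag, then msg XOR secret.
fn toy_sign(tag: u8, secret: &[u8], msg: &[u8]) -> Vec<u8> {
    let mut out = vec![tag];
    out.extend(msg.iter().zip(secret.iter().cycle()).map(|(m, s)| m ^ s));
    out
}

#[derive(Default)]
pub struct ToyKeystore {
    // Secrets are looked up by key type and raw public key.
    secrets: HashMap<(KeyTypeId, Vec<u8>), Vec<u8>>,
}

impl ToyKeystore {
    pub fn insert(&mut self, id: KeyTypeId, public: &[u8], secret: &[u8]) {
        self.secrets.insert((id, public.to_vec()), secret.to_vec());
    }

    fn secret(&self, id: KeyTypeId, public: &[u8]) -> Result<&[u8], Error> {
        self.secrets
            .get(&(id, public.to_vec()))
            .map(Vec::as_slice)
            .ok_or(Error::KeyNotFound)
    }

    pub fn sign_with(
        &self,
        id: KeyTypeId,
        crypto_id: CryptoTypeId,
        public: &[u8],
        msg: &[u8],
    ) -> Result<Vec<u8>, Error> {
        if public.len() != 32 {
            return Err(Error::InvalidPublic);
        }
        let mut raw = [0u8; 32];
        raw.copy_from_slice(public);
        // The typed key is only constructed here, after matching on the scheme.
        match crypto_id {
            SR25519 => {
                let public = Sr25519Public(raw);
                Ok(toy_sign(b's', self.secret(id, &public.0)?, msg))
            }
            ED25519 => {
                let public = Ed25519Public(raw);
                Ok(toy_sign(b'e', self.secret(id, &public.0)?, msg))
            }
            other => Err(Error::KeyNotSupported(other)),
        }
    }
}

/// sign_with_all expressed as a plain loop on the caller side.
pub fn sign_with_all(
    ks: &ToyKeystore,
    id: KeyTypeId,
    keys: &[(CryptoTypeId, Vec<u8>)],
    msg: &[u8],
) -> Vec<Result<Vec<u8>, Error>> {
    keys.iter()
        .map(|(crypto_id, public)| ks.sign_with(id, *crypto_id, public, msg))
        .collect()
}
```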
Lookup
To look up keys there is one stand-alone method for each crypto scheme (e.g. one for sr25519 and one for ed25519).
These return a typed Public key.
This API design is not consistent with the one used by sign_with, which instead:
- deals with raw public keys in the interface
- is a catch-all for all crypto schemes (more crypto agile)
Key generation
The API exposes methods to generate keys, and these are designed in the same way as the lookup methods, i.e. one stand-alone method for each scheme.
VRF
There is sr25519_vrf_sign, a stand-alone method for VRF signatures using sr25519. This uses constructed keys as well.
Rework
With the prospect of introducing new crypto schemes to support new protocols, I would like to discuss the interface of the keystore first.
As WiP extensions there are already:
- BLS12-381 signatures introduced by this PR from @drskalman. Here the bls_sign method is added as a stand-alone method, i.e. not part of sign_with:
  - it doesn't follow the same pattern as the others (i.e. it is not part of sign_with).
  - what if we introduce other BLS curves in the future?
- Sassafras will make use of ed_on_bls12_381 or ed_on_bls12_381_bandersnatch, a primitive that is related to, but not the same as, bls12-381. This will require its own methods as well.
- Just look at how many possible future curves, just from arkworks, we may require one day here
Option 1
One option is to provide a fully agile keystore and be less bound (at least in the interface) to the currently provided primitives.
This also accommodates users of Substrate whose clients use protocols with crypto schemes not already supported by us.
To do this we may think of removing the specialized methods from the interface, i.e. provide methods in a form similar to sign_with:
- sign(app_id: KeyTypeId, crypto_id: CryptoTypeId, public: &[u8], msg: &[u8]) -> Result<Vec<u8>>
- sign_prehashed(app_id: KeyTypeId, crypto_id: CryptoTypeId, public: &[u8], hash: &[u8]) -> Result<Vec<u8>>
- vrf_sign(app_id: KeyTypeId, crypto_id: CryptoTypeId, public: &[u8], msg: &[u8]) -> Result<Vec<u8>>
- public_keys(app_id: KeyTypeId, crypto_id: CryptoTypeId) -> Vec<Vec<u8>>
- etc.
This interface prevents the explosion of methods and is thus more sustainable.
Obviously, using (for example) vrf_sign with a crypto scheme that doesn't support it will return a "not supported" error.
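As a rough sketch of what Option 1 could look like, the snippet below keeps the method signatures proposed above and dispatches through a registry of pluggable backends. Everything beyond those signatures is an assumption for illustration: the `Scheme` trait, `AgileKeystore`, `register`, `insert`, the error variants and `ToyScheme` (a placeholder that is NOT cryptography) are hypothetical names, not existing Substrate API.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct KeyTypeId(pub [u8; 4]);
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct CryptoTypeId(pub [u8; 4]);

#[derive(Debug, PartialEq, Eq)]
pub enum Error {
    UnknownScheme(CryptoTypeId),
    KeyNotFound,
    NotSupported(&'static str),
}

/// Hypothetical per-scheme backend. Operations a scheme does not provide
/// keep the default body and report "not supported".
pub trait Scheme {
    fn sign(&self, secret: &[u8], msg: &[u8]) -> Result<Vec<u8>, Error>;

    fn vrf_sign(&self, _secret: &[u8], _msg: &[u8]) -> Result<Vec<u8>, Error> {
        Err(Error::NotSupported("vrf_sign"))
    }
}

/// Placeholder backend (NOT cryptography): tag byte, then msg XOR secret.
pub struct ToyScheme {
    pub tag: u8,
}

impl Scheme for ToyScheme {
    fn sign(&self, secret: &[u8], msg: &[u8]) -> Result<Vec<u8>, Error> {
        let mut out = vec![self.tag];
        out.extend(msg.iter().zip(secret.iter().cycle()).map(|(m, s)| m ^ s));
        Ok(out)
    }
}

#[derive(Default)]
pub struct AgileKeystore {
    schemes: HashMap<CryptoTypeId, Box<dyn Scheme>>,
    secrets: HashMap<(KeyTypeId, CryptoTypeId, Vec<u8>), Vec<u8>>,
}

impl AgileKeystore {
    /// New schemes are plugged in without touching the interface.
    pub fn register(&mut self, crypto_id: CryptoTypeId, scheme: Box<dyn Scheme>) {
        self.schemes.insert(crypto_id, scheme);
    }

    pub fn insert(&mut self, app_id: KeyTypeId, crypto_id: CryptoTypeId, public: &[u8], secret: &[u8]) {
        self.secrets.insert((app_id, crypto_id, public.to_vec()), secret.to_vec());
    }

    // Find the backend for the scheme and the secret for the raw public key.
    fn resolve(&self, app_id: KeyTypeId, crypto_id: CryptoTypeId, public: &[u8]) -> Result<(&dyn Scheme, &[u8]), Error> {
        let scheme = self.schemes.get(&crypto_id).ok_or(Error::UnknownScheme(crypto_id))?;
        let secret = self
            .secrets
            .get(&(app_id, crypto_id, public.to_vec()))
            .ok_or(Error::KeyNotFound)?;
        Ok((&**scheme, secret.as_slice()))
    }

    pub fn sign(&self, app_id: KeyTypeId, crypto_id: CryptoTypeId, public: &[u8], msg: &[u8]) -> Result<Vec<u8>, Error> {
        let (scheme, secret) = self.resolve(app_id, crypto_id, public)?;
        scheme.sign(secret, msg)
    }

    pub fn vrf_sign(&self, app_id: KeyTypeId, crypto_id: CryptoTypeId, public: &[u8], msg: &[u8]) -> Result<Vec<u8>, Error> {
        let (scheme, secret) = self.resolve(app_id, crypto_id, public)?;
        scheme.vrf_sign(secret, msg)
    }

    pub fn public_keys(&self, app_id: KeyTypeId, crypto_id: CryptoTypeId) -> Vec<Vec<u8>> {
        self.secrets
            .keys()
            .filter(|(a, c, _)| *a == app_id && *c == crypto_id)
            .map(|(_, _, public)| public.clone())
            .collect()
    }
}
```

The trade-off is visible here: the interface stays fixed as schemes are added, but mistakes such as calling vrf_sign on a scheme without VRF support surface at run time instead of compile time.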
However, it should also be considered that we still provide one method per primitive in the Crypto runtime interface.
So, if a good part of the primitives are exposed to the runtime as well, the explosion will be observed there at some point (but this may be considered an independent problem; more below).
Option 2
Remove the generic sign_with method from the keystore and provide only specialized stand-alone methods for all the schemes we support. We already do this for all the operations except for "signing". We would thus provide sr25519_sign, ed25519_sign and ecdsa_sign instead.
My only concern is that we will also have to introduce methods like:
- bls12_381_sign
- bls12_377_sign
- ed_over_bls12_381_sign
- ed_over_bls12_377_sign
- and there are others... who knows what we're going to support in the future
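For comparison, Option 2 could be sketched along these lines. The method names sr25519_sign and ed25519_sign come from the text above; the parameter and return types, the `Keystore` trait shape, `ToyKeystore` and the XOR placeholder (NOT cryptography) are assumptions for illustration only.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct KeyTypeId(pub [u8; 4]);

pub mod sr25519 {
    pub struct Public(pub [u8; 32]);
    #[derive(Debug, PartialEq, Eq)]
    pub struct Signature(pub Vec<u8>);
}

pub mod ed25519 {
    pub struct Public(pub [u8; 32]);
    #[derive(Debug, PartialEq, Eq)]
    pub struct Signature(pub Vec<u8>);
}

/// One typed method per scheme. Supporting a further scheme
/// (bls12_381_sign, ed_over_bls12_381_sign, ...) means adding
/// another method here and in every implementation.
pub trait Keystore {
    fn sr25519_sign(&self, id: KeyTypeId, public: &sr25519::Public, msg: &[u8]) -> Option<sr25519::Signature>;
    fn ed25519_sign(&self, id: KeyTypeId, public: &ed25519::Public, msg: &[u8]) -> Option<ed25519::Signature>;
}

/// Toy implementation holding a single secret for every key.
pub struct ToyKeystore {
    pub secret: Vec<u8>,
}

impl ToyKeystore {
    // Placeholder "signature" (NOT cryptography): msg XOR secret.
    fn toy_sign(&self, msg: &[u8]) -> Vec<u8> {
        msg.iter().zip(self.secret.iter().cycle()).map(|(m, s)| m ^ s).collect()
    }
}

impl Keystore for ToyKeystore {
    fn sr25519_sign(&self, _id: KeyTypeId, _public: &sr25519::Public, msg: &[u8]) -> Option<sr25519::Signature> {
        Some(sr25519::Signature(self.toy_sign(msg)))
    }

    fn ed25519_sign(&self, _id: KeyTypeId, _public: &ed25519::Public, msg: &[u8]) -> Option<ed25519::Signature> {
        Some(ed25519::Signature(self.toy_sign(msg)))
    }
}
```

What this buys is type safety: passing an ed25519::Public to sr25519_sign does not compile. The cost is the one named above, a trait that grows with every scheme.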
Host Functions
If we decide to go with Option 1, maybe we should also consider a refactoring of the sp_io crate in order to provide more fine-grained capabilities. For example, @achimcc is already adopting this approach to define all the host functions he requires to work with a bunch of "special" elliptic curve operations.
Maybe we should start grouping these host functions in separate files and organize them more logically.
For example, instead of one giant group containing crypto "stuff", divide them in a more fine-grained way:
- Bls12381
- Bls12377
- EdOnBls12381
- EdOnBls12377
- Sr25519
- Ed25519
- Ecdsa
- ...
Strict adoption of this approach also requires extracting the methods from the ever-growing sp_io::crypto module.
The ideal would be to have a module like sp_io::crypto::sr25519 instead of embedding all the crypto primitives in one single crypto module.
But maybe this is not backward compatible... Is it even possible to have submodules of sp_io::crypto for each crypto scheme? Maybe we can do this at least for new stuff?
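At the level of plain Rust modules, per-scheme submodules can coexist with the old flat names through re-exports, as in the sketch below. This is only an illustration of the module layout: it says nothing about how host functions are actually declared or registered, so the backward-compatibility question stays open. The function signatures are simplified to byte slices and the bodies are placeholders that perform NO real verification.

```rust
pub mod crypto {
    // Placeholder check shared by the toy schemes: NOT real verification.
    // Accepts a "signature" made of the scheme tag followed by msg.len() bytes.
    fn toy_verify(tag: u8, sig: &[u8], msg: &[u8], public: &[u8]) -> bool {
        sig.first() == Some(&tag) && sig.len() == 1 + msg.len() && !public.is_empty()
    }

    pub mod sr25519 {
        pub fn verify(sig: &[u8], msg: &[u8], public: &[u8]) -> bool {
            super::toy_verify(b's', sig, msg, public)
        }
    }

    pub mod ed25519 {
        pub fn verify(sig: &[u8], msg: &[u8], public: &[u8]) -> bool {
            super::toy_verify(b'e', sig, msg, public)
        }
    }

    // Flat names kept as aliases so existing call sites keep compiling.
    pub use self::ed25519::verify as ed25519_verify;
    pub use self::sr25519::verify as sr25519_verify;
}
```

With this layout, new schemes would only get a submodule, while crypto::sr25519_verify and crypto::sr25519::verify resolve to the same function.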
Final Considerations
All in all, I wanted to highlight and open a discussion about possible issues that I have noticed, before introducing even more new stuff.
The end goal is to introduce new things in a more organized and carefully designed way, before it is "too late" to rethink it (especially for the host functions part) and we accumulate more technical debt.