-
Notifications
You must be signed in to change notification settings - Fork 291
Description
Description of issue or feature request:
The TUF repository tool provides a functionality to distribute a large number of target files over multiple delegated roles (see delegate_hashed_bins).
To add a target to the corresponding bin, identified by the hash of the target path, or remove one from a bin the methods remove add_target_to_bin and remove_target_from_bin are provided. These methods are wrappers around add_target and remove_target, which don't call those methods directly but via a dispatcher helper _locate_and_update_target_in_bin, which also implements common functionality for the wrappers.
These methods should be refactored for the reasons and as outlined below:
Current behavior:
-
The current architecture doesn't allow to pass through different arguments to the wrapped methods, e.g. the
customargument only supported byadd_target:
https://github.com/theupdateframework/tuf/blob/4fe29d138df02035eed9716657cfe0a76e17e178/tuf/repository_tool.py#L2672 -
_locate_and_update_target_in_bindoesn't seem to be designed to dispatch to any other methods thanadd_targetandremove_targetto justify the dispatcher architecture -
_locate_and_update_target_in_binreplicates functionality from the wrapped methods, e.g. that the targets path exists as file in the current directory
https://github.com/theupdateframework/tuf/blob/4fe29d138df02035eed9716657cfe0a76e17e178/tuf/repository_tool.py#L2737-L2739
(see Targets'add_target(s)methods do not correctly check passed paths #957 for reasons why that might not even be necessary). -
_locate_and_update_target_in_binmakes risky assumption, e.g. it infers thehash_prefix_length(i.e.len(hex(number_of_bins - 1)[2:])), by taking the length of the first element in thepath_hash_prefixesfield of the first delegation of the targets object on which it is called.
https://github.com/theupdateframework/tuf/blob/4fe29d138df02035eed9716657cfe0a76e17e178/tuf/repository_tool.py#L2719-L2733 -
_locate_and_update_target_in_binmakes aO(1)searchO(n).
Expected behavior:
- simplify architecture to allow adding a
customargument toadd_target_to_binthat gets passed through - remove redundant code
- pass
hash_prefix_lengthornumber_of_binsinstead of inferring it from the metadata (the calling code should know this). - make bin search
O(1), e.g.:
def find_bin_for_hash(hash_prefix, bin_size):
prefix_len = len(hash_prefix)
hash_prefix = int(hash_prefix, 16)
low = hash_prefix - (hash_prefix % bin_size)
high = low + bin_size - 1
return f"{low:0{prefix_len}x}-{high:0{prefix_len}x}"