Skip to content

hashtable insert / lookup stats #506

@ThomasWaldmann

Description

@ThomasWaldmann

I recently improved the badly chosen load factor (old: 0.9, new: 0.75) and the table initial bucket count (old: 1024, new: 1031). Experiments have shown this gave a major speedup, but it is still unknown whether the hashtable works as it should (O(1) lookup speed) or just works much better as before, but still not at maximum performance.

Insert: statistics about how long the linear search usually is to find a free bucket.
Have: key x, bucket = hash(x), is it free? if not, how many buckets did we need to search until we found a free one?
also: how well did the hash function spread over all the buckets?

Lookup: statistics about how long the linear search usually is to find the correct bucket.
Wanted: key x, bucket = hash(x), is it there? if not, how many buckets did we need to visit until we found key x?

Implementation note: data gathering needs to be coded in C, analysis could by in Cython or Python.

Metadata

Metadata

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions