TL;DR
A C++ library implementing hopscotch hashing offers a high-performance, memory-efficient alternative to standard hash containers. It supports move-only types and various growth policies, promising better performance for many applications.
A new C++ library implementing a fast hash map and hash set using hopscotch hashing has been released, offering performance and memory advantages over standard containers like std::unordered_map.
The library provides two containers, tsl::hopscotch_map and tsl::hopscotch_set. Both use open addressing with hopscotch hashing to resolve collisions, which keeps colliding entries close to one another in memory. The resulting cache-friendly data structures reportedly outperform std::unordered_map in most benchmarks.
It supports move-only and non-default-constructible key and value types, heterogeneous lookups, and optional precomputed hashes for faster lookups. The library also provides variants with different growth policies (power of two, prime, and user-defined) so that users can balance speed, hash distribution, and security for their use case.
Compared to similar hash tables, tsl::hopscotch_map uses less memory and offers more functionality, including exception support, with thread safety comparable to that of the standard hash maps. The implementation is header-only, making it easy to integrate into existing projects.
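To make the growth-policy trade-off concrete, here is a minimal sketch of how a hash is typically reduced to a bucket index under a power-of-two policy versus a prime policy. The helper names (bucket_power_of_two, bucket_prime, distinct_buckets) are hypothetical and written only for this illustration; they are not the library's API, and the library's actual policies may differ in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical helpers for illustration only; these are not the library's API.

// Power-of-two policy: the modulo reduces to a single bit mask, which is
// cheap, but only the low bits of the hash decide the bucket.
inline std::size_t bucket_power_of_two(std::size_t hash, std::size_t bucket_count) {
    assert(bucket_count != 0 && (bucket_count & (bucket_count - 1)) == 0);
    return hash & (bucket_count - 1);
}

// Prime policy: a full modulo is needed, which is slower, but every bit of
// the hash influences the bucket.
inline std::size_t bucket_prime(std::size_t hash, std::size_t bucket_count) {
    assert(bucket_count != 0);
    return hash % bucket_count;
}

// Counts how many distinct buckets are hit by the hashes
// 0, stride, 2*stride, ..., (n-1)*stride.
template <class BucketFn>
std::size_t distinct_buckets(BucketFn bucket_of, std::size_t bucket_count,
                             std::size_t stride, std::size_t n) {
    std::set<std::size_t> seen;
    for (std::size_t i = 0; i < n; ++i) {
        seen.insert(bucket_of(i * stride, bucket_count));
    }
    return seen.size();
}
```

With a poor hash whose values are all multiples of 16, distinct_buckets(bucket_power_of_two, 16, 16, 100) is 1, meaning every key lands in the same bucket, while distinct_buckets(bucket_prime, 17, 16, 100) is 17, meaning the keys spread over all buckets. This is the usual reason a prime policy is preferred when the hash function cannot be trusted.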
Performance Benefits Over Standard Hash Maps
This new implementation could significantly improve the performance of applications relying heavily on hash-based data structures, especially in scenarios where cache efficiency and memory footprint are critical. Developers working on high-performance systems or large-scale data processing may see notable gains in speed and resource usage.
As an affiliate, we earn on qualifying purchases.
Development of High-Performance Hash Tables in C++
Hopscotch hashing was introduced as an alternative collision-resolution scheme for open-addressing hash tables, designed to improve cache locality and reduce lookup times. Before this release, most C++ code relied on the standard library's std::unordered_map, which can perform poorly with weak hash functions or high collision rates. The tsl library builds on hopscotch hashing, offering a modern, flexible implementation with several growth policies and features tailored for performance-critical applications.
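The core idea of hopscotch hashing can be sketched in a few dozen lines. The class below is a deliberately simplified illustration, not the tsl implementation: TinyHopscotchSet is a hypothetical name, the table has a fixed capacity (a real table would rehash instead of failing), there is no erase, and the hash is the identity so that collisions are easy to provoke. Each home bucket keeps a small bitmap recording which of the next H slots hold its keys, so a lookup only ever inspects one short, contiguous neighborhood.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative fixed-capacity hopscotch hash set for int keys.
// Assumptions: identity hash, no rehashing, no erase.
class TinyHopscotchSet {
public:
    static constexpr std::size_t H = 8;  // neighborhood size per home bucket

    explicit TinyHopscotchSet(std::size_t home_buckets)
        : homes_(home_buckets),
          keys_(home_buckets + H - 1, 0),  // padding so neighborhoods never wrap
          occupied_(home_buckets + H - 1, false),
          hop_info_(home_buckets, 0u) {
        assert(home_buckets > 0);
    }

    // Lookup inspects only the slots flagged in the home bucket's bitmap.
    bool contains(int key) const {
        const std::size_t home = home_of(key);
        for (std::size_t d = 0; d < H; ++d) {
            const bool owned = ((hop_info_[home] >> d) & 1u) != 0;
            if (owned && keys_[home + d] == key) return true;
        }
        return false;
    }

    // Returns false if the key is already present or no slot can be freed
    // inside the neighborhood (where a real table would grow and rehash).
    bool insert(int key) {
        if (contains(key)) return false;
        const std::size_t home = home_of(key);
        // Step 1: linear probe for the first empty slot at or after home.
        std::size_t free = home;
        while (free < keys_.size() && occupied_[free]) ++free;
        if (free == keys_.size()) return false;
        // Step 2: "hop" the empty slot backwards until it is within H of home.
        while (free - home >= H) {
            if (!hop_closer(free)) return false;
        }
        keys_[free] = key;
        occupied_[free] = true;
        hop_info_[home] |= (1u << (free - home));
        return true;
    }

private:
    std::size_t home_of(int key) const {
        return static_cast<std::size_t>(static_cast<unsigned>(key)) % homes_;
    }

    // Finds a key in the H-1 slots before `free` that can move into `free`
    // without leaving its own neighborhood. On success, `free` becomes the
    // slot that key vacated.
    bool hop_closer(std::size_t& free) {
        for (std::size_t owner = free - (H - 1); owner < free && owner < homes_; ++owner) {
            for (std::size_t d = 0; owner + d < free; ++d) {
                if ((hop_info_[owner] >> d) & 1u) {
                    const std::size_t from = owner + d;
                    keys_[free] = keys_[from];
                    occupied_[free] = true;
                    occupied_[from] = false;
                    hop_info_[owner] &= ~(1u << d);
                    hop_info_[owner] |= (1u << (free - owner));
                    free = from;
                    return true;
                }
            }
        }
        return false;
    }

    std::size_t homes_;
    std::vector<int> keys_;
    std::vector<bool> occupied_;
    std::vector<std::uint32_t> hop_info_;
};
```

For example, in a table with 16 home buckets that already holds the keys 0 through 8, inserting 16 (which also hashes to bucket 0) finds its first empty slot 9 positions away, so the sketch relocates key 2 into that slot and places 16 in the slot key 2 vacated. Both keys remain findable within their own eight-slot neighborhoods, which is what keeps lookups short and cache-local.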
“Our hopscotch-based hash map and set deliver better cache performance and memory efficiency compared to std::unordered_map, with minimal API differences.”
— the library developer
Remaining Questions About Implementation and Adoption
It is not yet clear how widely this library will be adopted in production environments or how it performs under various real-world workloads beyond initial benchmarks. Compatibility with existing codebases and integration with other C++ libraries remain to be tested in diverse scenarios.
Next Steps for Developers and Users
Developers are encouraged to test the library in their projects, compare performance with existing hash containers, and contribute to further development. Future updates may include additional features, optimizations, and broader compatibility testing.
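One way to run such a comparison is a small timing harness that is templated on the container type. The sketch below is an assumption-laden starting point rather than a rigorous benchmark: BenchResult and bench_insert_lookup are hypothetical names, the keys are synthetic 64-bit integers, and the only operations required of the container are emplace, find, end, and size as std::unordered_map provides them.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical result type for this sketch.
struct BenchResult {
    std::size_t hits;   // successful lookups; also keeps the loop from being optimized away
    double insert_ms;   // wall time for n insertions
    double lookup_ms;   // wall time for 2n lookups (half hit, half miss)
};

// Times n insertions followed by 2n lookups on any map type with a
// std::unordered_map-like interface and 64-bit integer keys and values.
template <class Map>
BenchResult bench_insert_lookup(std::size_t n) {
    using clock = std::chrono::steady_clock;
    using millis = std::chrono::duration<double, std::milli>;
    // Multiplying by an odd constant is a bijection modulo 2^64, so the
    // generated keys are distinct but scattered.
    const std::uint64_t odd = 0x9E3779B97F4A7C15ull;

    Map map;
    const auto t0 = clock::now();
    for (std::uint64_t i = 0; i < n; ++i) map.emplace(i * odd, i);
    const auto t1 = clock::now();
    assert(map.size() == n);

    std::size_t hits = 0;
    const auto t2 = clock::now();
    for (std::uint64_t i = 0; i < 2 * n; ++i) {
        if (map.find(i * odd) != map.end()) ++hits;
    }
    const auto t3 = clock::now();

    return BenchResult{hits, millis(t1 - t0).count(), millis(t3 - t2).count()};
}
```

Calling bench_insert_lookup<std::unordered_map<std::uint64_t, std::uint64_t>>(1000000) gives a baseline. Assuming the library's header is on the include path and its containers accept the same calls, the same function could be instantiated with tsl::hopscotch_map for a side-by-side comparison. Results from a synthetic loop like this vary with compiler flags, key distribution, and hardware, so they are no substitute for measuring a real workload.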
Key Questions
How does hopscotch hashing improve performance over std::unordered_map?
Hopscotch hashing improves cache locality and reduces collision resolution time, leading to faster lookups and insertions, especially in high-collision scenarios.
Is the library suitable for production use?
The library is designed for high performance and is available now, but users should evaluate its stability and performance in their specific workloads before deploying in critical systems.
What are the main features of this hash map implementation?
Features include support for move-only types, heterogeneous lookups, optional precomputed hashes, multiple growth policies, and a header-only design for easy integration.
Does this library support thread safety?
Its thread-safety guarantees are comparable to those of std::unordered_map and std::unordered_set: concurrent reads are safe as long as no thread is writing, while any concurrent modification requires external synchronization.
How does the performance compare in benchmarks?
Initial benchmarks indicate that the hopscotch implementation outperforms std::unordered_map in lookup and insertion speed in many cases, especially with cache-sensitive workloads.
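On the thread-safety question above, "external synchronization" usually means wrapping the container in a lock. The following is a minimal sketch under stated assumptions: GuardedMap is a hypothetical wrapper written for this article, it requires C++17 for std::shared_mutex and std::optional, and it is shown with std::unordered_map because any container offering operator[], find, and end in the same way should fit.

```cpp
#include <cassert>
#include <mutex>
#include <optional>
#include <shared_mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical wrapper for illustration; not part of any library.
template <class Map>
class GuardedMap {
public:
    using key_type = typename Map::key_type;
    using mapped_type = typename Map::mapped_type;

    // Writers take the lock exclusively.
    void put(const key_type& key, mapped_type value) {
        std::unique_lock<std::shared_mutex> lock(mutex_);
        map_[key] = std::move(value);
    }

    // Readers share the lock, so lookups may run in parallel with each
    // other but never alongside a writer. A copy is returned so that no
    // reference into the map outlives the lock.
    std::optional<mapped_type> get(const key_type& key) const {
        std::shared_lock<std::shared_mutex> lock(mutex_);
        const auto it = map_.find(key);
        if (it == map_.end()) return std::nullopt;
        return it->second;
    }

private:
    mutable std::shared_mutex mutex_;
    Map map_;
};

// Small single-threaded usage check.
inline void guarded_map_demo() {
    GuardedMap<std::unordered_map<std::string, int>> counts;
    counts.put("apples", 3);
    counts.put("apples", 4);  // overwrites the previous value
    assert(counts.get("apples").value() == 4);
    assert(!counts.get("pears").has_value());
}
```

A single reader-writer lock is the simplest choice and is often enough for read-heavy workloads. Write-heavy workloads may need sharding or a purpose-built concurrent container instead.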
Source: Hacker News