(Optional) More on Unordered Sets & Maps
Authors: Darren Yao, Benjamin Qi
Contributors: Neo Wang, Nathan Gong
Maintaining collections of distinct elements with hashing.
Warning!
You can (almost always) use ordered sets and maps instead, but it's good to know that these exist.
Resources | ||||
---|---|---|---|---|
IUSACO | module is based off this |
Hashing
Hashing refers to assigning a unique code to every variable/object which allows insertions, deletions, and searches in time, albeit with a high constant factor, as hashing requires a large constant number of operations. However, as the name implies, elements are not ordered in any meaningful way, so traversals of an unordered set will return elements in some arbitrary order.
Custom Hashing
C++
There is no built in method for hashing pairs or vectors. Namely,
unordered_set<vector<int>>
does not work. In this case, we can use an
sorted map (which supports all of the functions used
in the code above) or declare our own hash function.
Resources | ||||
---|---|---|---|---|
Mark Nelson | How to create user-defined hash function for |
The link provides an example of hashing pairs of strings, as well as other data structures like pairs.
#include <bits/stdc++.h>using namespace std;typedef pair<int,int> pi;#define f first#define s secondstruct hashPi {size_t operator()(const pi& p) const { return p.f^p.s; }};int main() {unordered_map<pi,int,hashPi> um;}
#include <bits/stdc++.h>using namespace std;typedef pair<int,int> pi;#define f first#define s secondnamespace std {template<> struct hash<pi> {size_t operator()(const pi& p) const { return p.f^p.s; }
Java
Java has its own hash functions for pre-defined objects like ArrayList
's.
However, a custom hash function is still needed for user-defined objects.
In order to create one, we can implement the hashCode
method.
Additionally, in order for HashSet
's and HashMap
's to work with
a custom class, we must also implement the equals
method.
import java.io.*;import java.util.*;public class HashTest {public static void main(String[] args) {// uses custom hash function in class PairSet<Pair> set = new HashSet<>();}static class Pair {
However, this hash function is quite bad; if we insert then they will all be mapped to the same bucket (so it would easily be hacked).
C++
Java
A better (and easier) way to hash custom objects is to use Java's built in
Objects.hash()
method. This method takes in multiple objects and uses them
to create a hash code.
import java.io.*;import java.util.*;public class HashTest {public static void main(String[] args) {// uses custom hash function in class PairSet<Pair> set = new HashSet<>();}static class Pair {
Hacking
Warning!
You don't need to know this for USACO, but you will need this to pass some of the problems in this module.
In USACO contests, unordered sets and maps generally fine, but the built-in hashing algorithm for C++ is vulnerable to pathological data sets causing abnormally slow runtimes. Apparently Java is not vulnerable to this, however.
C++
Resources | ||||
---|---|---|---|---|
CF | Explanation of this problem and how to fix it. |
Essentially, use unordered_map<int, int, custom_hash>
defined in the blog above
in place of unordered_map<int, int>
.
Another Hash Function
Resources | ||||
---|---|---|---|---|
Benq (from KACTL) |
struct chash { /// use most bits rather than just the lowest onesconst uint64_t C = ll(2e18*PI)+71; // large odd numberconst int RANDOM = rng(); // random 32-bit numberll operator()(ll x) const {// https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.htmlreturn __builtin_bswap64((x^RANDOM)*C);}};template<class K,class V> using um = unordered_map<K,V,chash>;
This section is not complete.
(explain assumptions that are required for this to work)