PrevNext
Rare

(Optional) More on Unordered Sets & Maps

Authors: Darren Yao, Benjamin Qi

Contributors: Neo Wang, Nathan Gong

Maintaining collections of distinct elements with hashing.

Warning!

You can (almost always) use ordered sets and maps instead, but it's good to know that these exist.

Resources
IUSACO

module is based off this

Hashing

Hashing refers to assigning a unique code to every variable/object which allows insertions, deletions, and searches in O(1)\mathcal{O}(1) time, albeit with a high constant factor, as hashing requires a large constant number of operations. However, as the name implies, elements are not ordered in any meaningful way, so traversals of an unordered set will return elements in some arbitrary order.

Custom Hashing

C++

There is no built in method for hashing pairs or vectors. Namely, unordered_set<vector<int>> does not work. In this case, we can use an sorted map (which supports all of the functions used in the code above) or declare our own hash function.

Resources
Mark Nelson

How to create user-defined hash function for unordered_map.

The link provides an example of hashing pairs of strings, as well as other data structures like pairs.

#include <bits/stdc++.h>
using namespace std;
typedef pair<int,int> pi;
#define f first
#define s second
struct hashPi {
size_t operator()(const pi& p) const { return p.f^p.s; }
};
int main() {
unordered_map<pi,int,hashPi> um;
}
#include <bits/stdc++.h>
using namespace std;
typedef pair<int,int> pi;
#define f first
#define s second
namespace std {
template<> struct hash<pi> {
size_t operator()(const pi& p) const { return p.f^p.s; }

Java

Java has its own hash functions for pre-defined objects like ArrayList's. However, a custom hash function is still needed for user-defined objects. In order to create one, we can implement the hashCode method.

Additionally, in order for HashSet's and HashMap's to work with a custom class, we must also implement the equals method.

import java.io.*;
import java.util.*;
public class HashTest {
public static void main(String[] args) {
// uses custom hash function in class Pair
Set<Pair> set = new HashSet<>();
}
static class Pair {

However, this hash function is quite bad; if we insert (0,0),(1,1),(2,2)(0,0), (1,1), (2,2) \ldots then they will all be mapped to the same bucket (so it would easily be hacked).

C++

Java

A better (and easier) way to hash custom objects is to use Java's built in Objects.hash() method. This method takes in multiple objects and uses them to create a hash code.

import java.io.*;
import java.util.*;
public class HashTest {
public static void main(String[] args) {
// uses custom hash function in class Pair
Set<Pair> set = new HashSet<>();
}
static class Pair {

Hacking

Warning!

You don't need to know this for USACO, but you will need this to pass some of the problems in this module.

In USACO contests, unordered sets and maps generally fine, but the built-in hashing algorithm for C++ is vulnerable to pathological data sets causing abnormally slow runtimes. Apparently Java is not vulnerable to this, however.

C++

Resources
CF

Explanation of this problem and how to fix it.

Essentially, use unordered_map<int, int, custom_hash> defined in the blog above in place of unordered_map<int, int>.

Another Hash Function

Resources
Benq (from KACTL)
struct chash { /// use most bits rather than just the lowest ones
const uint64_t C = ll(2e18*PI)+71; // large odd number
const int RANDOM = rng(); // random 32-bit number
ll operator()(ll x) const {
// https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
return __builtin_bswap64((x^RANDOM)*C);
}
};
template<class K,class V> using um = unordered_map<K,V,chash>;

This section is not complete.

Any help would be appreciated! Just submit a Pull Request on Github.

(explain assumptions that are required for this to work)

Module Progress:

PrevNext