Eightomic Hash 32 D: The Fastest 32-bit Hashing Algorithm Versus Murmur3A
Eightomic Hash 32 D is a 32-bit hashing algorithm as a substantial improvement to Murmur3A.
Library
Source
#ifndef EIGHTOMIC_HASH_32_D_H
#define EIGHTOMIC_HASH_32_D_H
#include <stdint.h>
#include <string.h>
uint32_t eightomic_hash_32_d(unsigned long input_count, const uint8_t *input);
#endif
#include "eightomic_hash_32_d.h"
uint32_t eightomic_align_32(const uint8_t *input, unsigned long i) {
uint32_t input_aligned;
memcpy(&input_aligned, &input[i], sizeof(uint32_t));
return input_aligned;
}
uint32_t eightomic_hash_32_d(unsigned long input_count, const uint8_t *input) {
const unsigned char input_remainder_count = input_count & 3;
uint32_t mix = 1111111111;
uint32_t mix_offset = 1111111111;
unsigned long i = 0;
input_count -= input_remainder_count;
while (i < input_count) {
mix += eightomic_align_32(input, i);
mix_offset += mix;
mix += ((mix << 14) | (mix >> 18)) - mix_offset;
mix_offset += mix_offset << 2;
mix += mix << 1;
i += 4;
}
switch (input_remainder_count) {
case 3:
mix += input[input_count + 2];
mix += mix << 3;
mix_offset += mix;
mix_offset = (mix_offset << 19) | (mix_offset >> 13);
case 2:
mix += input[input_count + 1];
mix += mix << 3;
mix_offset += mix;
case 1:
mix += input[input_count];
}
mix += mix << 3;
mix_offset += input_remainder_count + mix;
mix_offset = (mix_offset << 19) | (mix_offset >> 13);
mix += mix << 3;
mix_offset += input_count + mix;
mix_offset = (mix_offset << 19) | (mix_offset >> 13);
mix ^= mix_offset;
mix += (mix_offset << 27) | (mix_offset >> 5);
mix_offset ^= mix >> 3;
mix += (mix_offset << 8) | (mix_offset >> 24);
mix ^= mix_offset;
mix_offset += (mix << 14) | (mix >> 18);
mix_offset ^= ((mix << 9) | (mix >> 23)) + (mix_offset >> 7);
return mix + mix_offset;
}
Reference
eightomic_hash_32_d() is the hashing function that accepts the 2 following arguments in left-to-right order.
1: input_count is the unsigned long count of elements in the input array.
2: input is the const uint8_t array to hash.
The return value data type is uint32_t.
It returns the 32-bit unsigned integer hash digest result.
Example
#include <stdio.h>
#include "eightomic_hash_32_d.h"
int main(void) {
uint8_t input[8] = {'m', 'e', 's', 's', 'a', 'g', 'e', 0};
unsigned char i = 0;
while (i < 10) {
i++;
printf("Result %u is 0x%08x.\n", i, eightomic_hash_32_d(8, input));
input[7]++;
}
return 0;
}
Explanation
Eightomic Hash 32 D hashes 4-byte input segments with alternating arithmetic combinations, a bit rotation and trailing bitwise multiplications. mix_offset uses a larger multiplier than mix to compensate for the lack of a secondary summed bit rotation.
The finalization process is similar to the pseudo-cryptographic finalization from Eightomic Hash 32 C to increase diffusion for a good avalanche effect in a combination of 2 32-bit blocks.
The 1-3 byte input remainder hashes with a stripped-down variant of the aforementioned one-at-a-time hashing loop.
The finalization process simulates hashing appended null bytes to compensate efficiently for the minimal input mixing, then adds input_count to offset the usual collisions from hashing in 4-byte segments.
In rare instances when hash digests must be saved after program termination and re-used in multiple systems with varying endianness, keys should be re-hashed during initialization instead of slowing down runtime hashing with byte order alignment.
In rare instances when byte order is expected to change during runtime, replace eightomic_align_32() with the following endian-independent version with a tradeoff of slower speed.
uint32_t eightomic_align_32(const uint8_t *input, unsigned long i) {
uint32_t input_aligned = input[i];
input_aligned |= input[i + 1] << 8;
input_aligned |= input[i + 2] << 16;
input_aligned |= input[i + 3] << 24;
return input_aligned;
}
Seed tests in SMHasher are omitted to both discourage using hashing algorithms as PRNGs and prevent collision vulnerabilities from 2³² different initialized states.
It passes all extended SMHasher tests that don't require a seed.
It's faster than Murmur3A with all input sizes.