Eightomic

Eightomic Hash 32 D: The Fastest 32-bit Hashing Algorithm Versus Murmur3A

Eightomic Hash 32 D is a 32-bit hashing algorithm as a substantial improvement to Murmur3A.

Library

Source

#ifndef EIGHTOMIC_HASH_32_D_H #define EIGHTOMIC_HASH_32_D_H #include <stdint.h> #include <string.h> uint32_t eightomic_hash_32_d(unsigned long input_count, const uint8_t *input); #endif#include "eightomic_hash_32_d.h" uint32_t eightomic_align_32(const uint8_t *input, unsigned long i) { uint32_t input_aligned; memcpy(&input_aligned, &input[i], sizeof(uint32_t)); return input_aligned; } uint32_t eightomic_hash_32_d(unsigned long input_count, const uint8_t *input) { const unsigned char input_remainder_count = input_count & 3; uint32_t mix = 1111111111; uint32_t mix_offset = 1111111111; unsigned long i = 0; input_count -= input_remainder_count; while (i < input_count) { mix += eightomic_align_32(input, i); mix_offset += mix; mix += ((mix << 14) | (mix >> 18)) - mix_offset; mix_offset += mix_offset << 2; mix += mix << 1; i += 4; } switch (input_remainder_count) { case 3: mix += input[input_count + 2]; mix += mix << 3; mix_offset += mix; mix_offset = (mix_offset << 19) | (mix_offset >> 13); case 2: mix += input[input_count + 1]; mix += mix << 3; mix_offset += mix; case 1: mix += input[input_count]; } mix += mix << 3; mix_offset += input_remainder_count + mix; mix_offset = (mix_offset << 19) | (mix_offset >> 13); mix += mix << 3; mix_offset += input_count + mix; mix_offset = (mix_offset << 19) | (mix_offset >> 13); mix ^= mix_offset; mix += (mix_offset << 27) | (mix_offset >> 5); mix_offset ^= mix >> 3; mix += (mix_offset << 8) | (mix_offset >> 24); mix ^= mix_offset; mix_offset += (mix << 14) | (mix >> 18); mix_offset ^= ((mix << 9) | (mix >> 23)) + (mix_offset >> 7); return mix + mix_offset; }

Reference

eightomic_hash_32_d() is the hashing function that accepts the 2 following arguments in left-to-right order.

1: input_count is the unsigned long count of elements in the input array.

2: input is the const uint8_t array to hash.

The return value data type is uint32_t.

It returns the 32-bit unsigned integer hash digest result.

Example

#include <stdio.h> #include "eightomic_hash_32_d.h" int main(void) { uint8_t input[8] = {'m', 'e', 's', 's', 'a', 'g', 'e', 0}; unsigned char i = 0; while (i < 10) { i++; printf("Result %u is 0x%08x.\n", i, eightomic_hash_32_d(8, input)); input[7]++; } return 0; }

Explanation

Eightomic Hash 32 D hashes 4-byte input segments with alternating arithmetic combinations, a bit rotation and trailing bitwise multiplications. mix_offset uses a larger multiplier than mix to compensate for the lack of a secondary summed bit rotation.

The finalization process is similar to the pseudo-cryptographic finalization from Eightomic Hash 32 C to increase diffusion for a good avalanche effect in a combination of 2 32-bit blocks.

The 1-3 byte input remainder hashes with a stripped-down variant of the aforementioned one-at-a-time hashing loop.

The finalization process simulates hashing appended null bytes to compensate efficiently for the minimal input mixing, then adds input_count to offset the usual collisions from hashing in 4-byte segments.

In rare instances when hash digests must be saved after program termination and re-used in multiple systems with varying endianness, keys should be re-hashed during initialization instead of slowing down runtime hashing with byte order alignment.

In rare instances when byte order is expected to change during runtime, replace eightomic_align_32() with the following endian-independent version with a tradeoff of slower speed.

uint32_t eightomic_align_32(const uint8_t *input, unsigned long i) { uint32_t input_aligned = input[i]; input_aligned |= input[i + 1] << 8; input_aligned |= input[i + 2] << 16; input_aligned |= input[i + 3] << 24; return input_aligned; }

Seed tests in SMHasher are omitted to both discourage using hashing algorithms as PRNGs and prevent collision vulnerabilities from 2³² different initialized states.

It passes all extended SMHasher tests that don't require a seed.

It's faster than Murmur3A with all input sizes.