Eightomic

Dissection

Hash data efficiently with optimal, unique non-cryptographic hashing algorithms.

Library

Eightomic Hash 32 A

The fastest 32-bit, OAAT hashing algorithm versus CDB and DJB2.

#include <stdint.h>

uint32_t eightomic_hash_32_a(const unsigned long input_count, const uint8_t *input)
{
    uint32_t mix = 111111;
    unsigned long i = 0;

    while (i < input_count) {
        mix = (input[i] ^ mix) - ((mix << 25) | (mix >> 7));
        i++;
    }

    return mix;
}
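A minimal usage sketch, assuming the listing above is compiled in the same translation unit; the key string and 1024-bucket table size are illustrative only:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumes eightomic_hash_32_a() from the listing above is defined in this file. */
int main(void)
{
    const char *key = "example";
    const uint32_t digest = eightomic_hash_32_a(strlen(key), (const uint8_t *) key);

    /* Map the digest to a bucket index; the 1024-bucket table size is illustrative. */
    printf("digest: %" PRIu32 ", bucket: %" PRIu32 "\n", digest, (uint32_t) (digest & 1023u));

    return 0;
}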

It matches or exceeds the speed of CDB and DJB2 at all input sizes.

Both CDB and DJB2 produce between 100% and 200000% more collisions across all SMHasher tests relevant to generalized, high-throughput, non-adversarial hashing.

The hash function expression was selected from among dozens of similar candidate expressions that had slower speeds and weaker collision test results.

The non-prime initialization value 111111, which occupies between 16 and 32 bits, produced superior test results compared to every other run of repeated 1 digits at different lengths.

Furthermore, all other shift rotation operand values yield inferior test results compared to 25 and 7.

When faster speeds are required instead of lower collisions, removing the mix XOR operand yields a small additional speed gain. The tradeoff is that digests must either be truncated to a few bits or computed over pre-tested input sets that aren't expected to change, such as country codes or specific ID formats.
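As a sketch of that tradeoff, one possible reading of removing the mix XOR operand is shown below; the variant name and the exact resulting expression are assumptions, not part of the library:

#include <stdint.h>

/* Hypothetical variant (name and expression are assumptions): the XOR with mix is */
/* dropped from the loop expression for a small speed gain, at the cost described  */
/* above (truncated digests or fixed, pre-tested input sets).                      */
uint32_t eightomic_hash_32_a_no_xor(const unsigned long input_count, const uint8_t *input)
{
    uint32_t mix = 111111;
    unsigned long i = 0;

    while (i < input_count) {
        mix = input[i] - ((mix << 25) | (mix >> 7));
        i++;
    }

    return mix;
}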

Eightomic Hash 32 B

The fastest 32-bit, OAAT hashing algorithm versus 32-bit FNV-1a and MicroOAAT.

#include <stdint.h>

uint32_t eightomic_hash_32_b(const unsigned long input_count, const uint8_t *input)
{
    uint32_t mix = 111111111;
    uint32_t mix_offset = 1;
    unsigned long i = 0;

    if (input_count > 3) {
        while (i < input_count) {
            mix -= input[i];
            mix += mix << 3;
            mix_offset -= mix;
            mix_offset = (mix_offset << 27) | (mix_offset >> 5);
            i++;
        }

        mix ^= mix_offset;
        mix = (mix ^ mix_offset) + ((mix << 10) | (mix >> 22));

        return mix + ((mix_offset << 27) | (mix_offset >> 5));
    }

    while (i < input_count) {
        mix += input[i];
        mix += (mix << 23) | (mix >> 9);
        mix += mix << 3;
        i++;
    }

    return mix;
}

It matches or exceeds the speed of 32-bit FNV-1a and MicroOAAT at all input sizes.

It's a substantial improvement across all SMHasher tests, particularly in light of the critical 32-bit collision vulnerabilities present in the compared hashing algorithms.

It appears similar to MicroOAAT due to the limited options when designing the fastest one-at-a-time hashing function with good collision avoidance, but the technical difference is vast.

It uses a subtraction assignment for each input byte instead of XOR or addition assignments. The remaining operations emulate multiplication with combinatorial bit rotations.

Any bitwise multiplication with a left shift operand greater than << 3 is too slow in the core hashing loop, and anything smaller doesn't produce enough distribution to omit a second bitwise multiplication.
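For reference, that shift-and-add step is a fixed constant multiplication; the small helper below shows the equivalence:

#include <stdint.h>

/* mix += mix << 3 is the same as multiplying mix by 9 modulo 2^32, so the loop */
/* performs its multiplication with one shift and one add.                      */
uint32_t shift_add_multiply_by_9(uint32_t mix)
{
    return mix + (mix << 3); /* equal to (uint32_t) (mix * 9) */
}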

It left-rotates bits in mix_offset instead of right-rotating bits in mix.

Furthermore, the bit rotation is non-blocking, which allows instruction-level parallelism across each subsequent byte iteration.

The speed gains from these design differences allow for a minimal finalization sequence with finely-tuned shift values and additive bit rotations in both directions.

Input sizes less than 4 bytes fall through to a minimal hashing algorithm that matches the speed of 32-bit FNV-1a.
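A minimal sketch of both code paths, assuming eightomic_hash_32_b() from the listing above is compiled in the same translation unit; the inputs are illustrative only:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    const char *short_key = "US";         /* 2 bytes: takes the minimal small-input loop */
    const char *long_key = "example-key"; /* 11 bytes: takes the main loop               */

    printf("short: %" PRIu32 "\n",
           eightomic_hash_32_b(strlen(short_key), (const uint8_t *) short_key));
    printf("long: %" PRIu32 "\n",
           eightomic_hash_32_b(strlen(long_key), (const uint8_t *) long_key));

    return 0;
}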

Eightomic Hash 32 C

The fastest 32-bit, OAAT hashing algorithm versus GoodOAAT and Jenkins' OAAT.

#include <stdint.h>

uint32_t eightomic_hash_32_c(const unsigned long input_count, const uint8_t *input)
{
    uint32_t mix = 1;
    uint32_t mix_offset = 1111111111;
    unsigned long i = 0;

    while (i < input_count) {
        mix += input[i];
        mix += mix << 3;
        mix_offset += mix + mix_offset;
        mix_offset = (mix_offset << 19) | (mix_offset >> 13);
        i++;
    }

    mix ^= mix_offset;
    mix += (mix_offset << 27) | (mix_offset >> 5);
    mix_offset ^= mix >> 4;
    mix += (mix_offset << 8) | (mix_offset >> 24);
    mix ^= mix_offset >> 3;
    mix_offset += (mix << 14) | (mix >> 18);
    mix_offset ^= ((mix << 9) | (mix >> 23)) + (mix_offset >> 7);

    return mix ^ mix_offset;
}

It's faster than GoodOAAT on average at input sizes segmented from 1 byte to 32 bytes and 1 byte to 64 bytes. It's faster at all input sizes greater than 64 bytes.

It's faster than Jenkins' OAAT at all input sizes.

It passes all seedless SMHasher tests.

It appears similar to GoodOAAT due to the limited options when designing the fastest one-at-a-time hashing function with good diffusion and collision avoidance, but the technical difference is vast.

The first expression is identical. It adds an input byte to mix before combining with mix_offset.

The second expression is identical. Any bitwise multiplication with a left shift operand greater than << 3 is too slow in the core hashing loop, and anything smaller doesn't produce enough distribution to omit a second bitwise multiplication.

It left-rotates bits in mix_offset after additive assignment instead of right-rotating bits in mix before additive assignment.

mix_offset is combined with mix plus an additional mix_offset as a cheap addition operand, instead of a slow bitwise multiplication after the bit rotation.
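The core loop from the listing above, reproduced with comments that map to the points in this dissection; the finalization sequence is omitted here, so this helper is for illustration rather than use:

#include <stdint.h>

/* Core loop of eightomic_hash_32_c with explanatory comments; the real function */
/* continues into the finalization sequence shown in the full listing above.     */
uint32_t eightomic_hash_32_c_core_loop(const unsigned long input_count, const uint8_t *input)
{
    uint32_t mix = 1;
    uint32_t mix_offset = 1111111111;
    unsigned long i = 0;

    while (i < input_count) {
        mix += input[i];                                      /* add the input byte to mix first          */
        mix += mix << 3;                                      /* shift-and-add multiplication (* 9)       */
        mix_offset += mix + mix_offset;                       /* cheap addition instead of a multiply     */
        mix_offset = (mix_offset << 19) | (mix_offset >> 13); /* left-rotate mix_offset after assignment  */
        i++;
    }

    return mix ^ mix_offset;
}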

Furthermore, the bit rotation is non-blocking, which allows instruction-level parallelism across each subsequent byte iteration.

The speed gains from these design differences allow for Xorshift operations in the finalization sequence to create a greater avalanche effect.

Eightomic Hash 32 D

The fastest 32-bit hashing algorithm versus Murmur3A.

#include <stdint.h>

uint32_t eightomic_align_32(const uint8_t *input, unsigned long i)
{
    uint32_t input_aligned = input[i];

    input_aligned |= input[i + 1] << 8;
    input_aligned |= input[i + 2] << 16;
    input_aligned |= input[i + 3] << 24;

    return input_aligned;
}

uint32_t eightomic_hash_32_d(unsigned long input_count, const uint8_t *input)
{
    const unsigned char input_remainder_count = input_count & 3;
    uint32_t mix = 1111111111;
    uint32_t mix_offset = 1111111111;
    unsigned long i = 0;

    input_count -= input_remainder_count;

    while (i < input_count) {
        mix += eightomic_align_32(input, i);
        mix_offset += mix;
        mix += ((mix << 14) | (mix >> 18)) - mix_offset;
        mix_offset += mix_offset << 2;
        mix += mix << 1;
        i += 4;
    }

    switch (input_remainder_count) {
    case 3:
        mix += input[input_count + 2];
        mix += mix << 3;
        mix_offset += mix;
        mix_offset = (mix_offset << 19) | (mix_offset >> 13);

    case 2:
        mix += input[input_count + 1];
        mix += mix << 3;
        mix_offset += mix;

    case 1:
        mix += input[input_count];
    }

    mix += mix << 3;
    mix_offset += input_remainder_count + mix;
    mix_offset = (mix_offset << 19) | (mix_offset >> 13);
    mix += mix << 3;
    mix_offset += input_count + mix;
    mix_offset = (mix_offset << 19) | (mix_offset >> 13);
    mix ^= mix_offset;
    mix += (mix_offset << 27) | (mix_offset >> 5);
    mix_offset ^= mix >> 3;
    mix += (mix_offset << 8) | (mix_offset >> 24);
    mix ^= mix_offset;
    mix_offset += (mix << 14) | (mix >> 18);
    mix_offset ^= ((mix << 9) | (mix >> 23)) + (mix_offset >> 7);

    return mix + mix_offset;
}

It's faster than Murmur3A on average at input sizes segmented from 1 byte to 32 bytes, 1 byte to 64 bytes and greater than 64 bytes.

Speed tests were performed with endian-safe memory alignment instead of using memcpy().
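For comparison, a memcpy()-based load of the kind the speed tests avoided might look like the sketch below; it reads the 4 bytes in native byte order, so unlike eightomic_align_32() it is only equivalent on little-endian machines. The function name is an assumption for illustration:

#include <stdint.h>
#include <string.h>

/* Hypothetical memcpy()-based 32-bit load; the result depends on host byte order, */
/* unlike the byte-by-byte, endian-safe eightomic_align_32().                      */
uint32_t memcpy_load_32(const uint8_t *input, unsigned long i)
{
    uint32_t input_aligned;

    memcpy(&input_aligned, input + i, sizeof(input_aligned));

    return input_aligned;
}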

It passes all seedless SMHasher tests.

It hashes 4-byte input segments with alternating arithmetic combinations, a bit rotation and trailing bitwise multiplications. mix_offset uses a larger multiplier than mix to compensate for the lack of a secondary summed bit rotation.
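The 4-byte segment loop from the listing above, reproduced with comments that map to that description; the remainder handling and finalization are omitted here:

#include <stdint.h>

uint32_t eightomic_align_32(const uint8_t *input, unsigned long i); /* defined in the listing above */

/* Segment loop of eightomic_hash_32_d with explanatory comments; input_count must */
/* already be rounded down to a multiple of 4 as in the full listing.              */
void eightomic_hash_32_d_segment_loop(const uint8_t *input, unsigned long input_count,
                                      uint32_t *mix, uint32_t *mix_offset)
{
    unsigned long i = 0;

    while (i < input_count) {
        *mix += eightomic_align_32(input, i);                /* fold in the next 4-byte segment                       */
        *mix_offset += *mix;                                 /* alternating arithmetic combination                    */
        *mix += ((*mix << 14) | (*mix >> 18)) - *mix_offset; /* bit rotation mixed back with mix_offset               */
        *mix_offset += *mix_offset << 2;                     /* trailing multiplication (* 5), the larger multiplier  */
        *mix += *mix << 1;                                   /* trailing multiplication (* 3) for mix                 */
        i += 4;
    }
}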

The finalization process is similar to the pseudo-cryptographic finalization from Eightomic Hash 32 C to increase diffusion for a good avalanche effect in a combination of 2 32-bit blocks.

The 1-3 byte input remainder is hashed with a stripped-down variant of the aforementioned one-at-a-time hashing loop.

The finalization process simulates hashing appended null bytes to compensate efficiently for the minimal input mixing, then adds input_count to offset the usual collisions from hashing in 4-byte segments.

Eightomic Hash 32 E

The fastest 32-bit hashing algorithm versus CityHash32 and XXHash32.

#include <stdint.h>

uint32_t eightomic_align_32(const uint8_t *input, unsigned long i)
{
    uint32_t input_aligned = input[i];

    input_aligned |= input[i + 1] << 8;
    input_aligned |= input[i + 2] << 16;
    input_aligned |= input[i + 3] << 24;

    return input_aligned;
}

uint32_t eightomic_hash_32_e(unsigned long input_count, const uint8_t *input)
{
    const unsigned long input_count_capture = input_count;
    uint32_t a = 1;
    uint32_t b = 11;
    uint32_t c = 111;
    uint32_t d = 1111;
    uint32_t e = 11111;
    uint32_t f = 111111;
    uint32_t g = 1111111;
    uint32_t h = 11111111;
    uint32_t mix = 1111111111;
    uint32_t mix_offset = 111111111;
    unsigned long i = 0;

    if (input_count >= 32) {
        i = 31;

        while (i < input_count) {
            mix_offset = mix;
            mix += a + b + c + d + e + f + g + h;
            a += eightomic_align_32(input, i - 3) + ((a << 8) | (a >> 24)) + mix_offset;
            b += eightomic_align_32(input, i - 7) + ((b << 9) | (b >> 23)) + mix_offset;
            c += eightomic_align_32(input, i - 11) + ((c << 10) | (c >> 22));
            d += eightomic_align_32(input, i - 15) + ((d << 11) | (d >> 21));
            e += eightomic_align_32(input, i - 19) + ((e << 12) | (e >> 20));
            f += eightomic_align_32(input, i - 23) + ((f << 13) | (f >> 19));
            g += eightomic_align_32(input, i - 27) + ((g << 14) | (g >> 18));
            h += eightomic_align_32(input, i - 31) + ((h << 15) | (h >> 17));
            i += 32;
        }

        if (i >= input_count) {
            i -= 32;
        }

        mix_offset += a + b + c + d + e + f + g + h;
        i++;
    }

    if ((input_count - i) >= 16) {
        i += 16;
        a += eightomic_align_32(input, i - 16) + ((a << 8) | (a >> 24));
        b += eightomic_align_32(input, i - 12) + ((b << 23) | (b >> 9));
        c += eightomic_align_32(input, i - 8) + ((c << 10) | (c >> 22));
        d += eightomic_align_32(input, i - 4) + ((d << 21) | (d >> 11));
        mix += a + b + c + d;
    }

    if ((input_count - i) >= 8) {
        i += 8;
        a += eightomic_align_32(input, i - 8) + ((a << 8) | (a >> 24));
        b += eightomic_align_32(input, i - 4) + ((b << 23) | (b >> 9));
        mix += a + b;
    }

    if (i != input_count) {
        mix += ((a << 8) | (a >> 24)) + mix_offset;
        input_count -= i;

        if (input_count >= 4) {
            a += eightomic_align_32(input, i) + ((a << 23) | (a >> 9));

            if (input_count != 4) {
                mix += a + b;

                switch (input_count) {
                case 7:
                    b += (input[i + 6] << 16) | (input[i + 5] << 8) | input[i + 4];
                    goto finalize_a;

                case 6:
                    b += (input[i + 5] << 8) | input[i + 4];
                    goto finalize_a;

                case 5:
                    b += input[i + 4];
                }
            }

            goto finalize_a;
        }

        switch (input_count) {
        case 3:
            a ^= (input[i + 2] << 16) | (input[i + 1] << 8) | input[i];
            goto finalize_a;

        case 2:
            a ^= (input[i + 1] << 8) | input[i];
            goto finalize_a;

        case 1:
            a ^= input[i];
        }
    }

finalize_a:
    a += b + mix_offset;
    mix += (a << 8) | (a >> 24);

    if (input_count_capture >= 16) {
        mix += ((c << 22) | (c >> 10)) + ((d << 11) | (d >> 21));

        if (input_count_capture >= 32) {
            mix_offset += ((e << 20) | (e >> 12)) ^ ((g << 18) | (g >> 14));
            mix += (h << 15) | (h >> 17);
            mix_offset += input_count_capture + mix;
            mix += ((a << 13) | (a >> 19)) ^ h;
            b ^= mix_offset;
            mix_offset += a ^ ((b << 18) | (b >> 14));
            c ^= mix;
            mix += b ^ ((c << 15) | (c >> 17));
            e ^= mix_offset;
            mix_offset += d ^ ((e << 18) | (e >> 14));
            f ^= ((e << 18) | (e >> 14)) + mix;
            mix_offset += e ^ ((f << 19) | (f >> 13));
            g ^= mix_offset;
            mix += f ^ ((g << 21) | (g >> 11));
            goto finalize_b;
        }

        mix_offset += input_count_capture + mix;
        mix += ((a << 13) | (a >> 19)) ^ h;
        b ^= mix_offset;
        mix_offset += a ^ ((b << 18) | (b >> 14));
        c ^= mix;
        mix += b ^ ((c << 15) | (c >> 17));
        e ^= mix_offset;
        mix_offset += d ^ ((e << 18) | (e >> 14));
        goto finalize_b;
    }

    mix_offset += input_count_capture + mix;
    mix += ((a << 13) | (a >> 19)) ^ h;
    b ^= mix_offset;
    mix_offset += a ^ ((b << 18) | (b >> 14));

    if (input_count_capture >= 8) {
        c ^= mix;
        mix += b ^ ((c << 15) | (c >> 17));
        e ^= mix_offset;
        mix_offset += d ^ ((e << 18) | (e >> 14));
    }

finalize_b:
    mix ^= mix_offset;
    mix += (mix_offset << 27) | (mix_offset >> 5);
    mix_offset ^= mix >> 4;
    mix += (mix_offset << 8) | (mix_offset >> 24);
    mix ^= mix_offset >> 3;
    mix_offset += (mix << 14) | (mix >> 18);
    mix_offset ^= ((mix << 9) | (mix >> 23)) + (mix_offset >> 7);

    return input_count_capture + mix + mix_offset;
}

It's faster than CityHash32 at all input sizes.

It's faster than XXHash32 on average at input sizes segmented from 1 byte to 32 bytes, 1 byte to 64 bytes and greater than 64 bytes.

Speed tests were performed with endian-safe memory alignment instead of using memcpy().

It passes all seedless SMHasher tests while both CityHash32 and XXHash32 fail some tests.

It uses incrementally-decreasing block sizes based on the largest possible block size within input_count. It starts by hashing blocks of 256 bits, then decreases to 128 bits, 64 bits, 32 bits and the remainder instead of branching into one of several hash functions based on input_count.
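In outline, the control flow of the listing above is a single cascade rather than a dispatch into separate functions; the sketch below keeps only the block-size structure and replaces the real state updates with placeholder mixing, so it is a structural illustration rather than a usable hash:

#include <stdint.h>

/* Structural sketch only: the lanes, rotations and finalization of                */
/* eightomic_hash_32_e are replaced with a placeholder accumulator to show the     */
/* incrementally-decreasing block cascade (256, 128, 64, 32 bits, then remainder). */
uint32_t block_cascade_outline(unsigned long input_count, const uint8_t *input)
{
    uint32_t mix = 0;
    unsigned long i = 0;

    while ((input_count - i) >= 32) { /* 256-bit blocks while at least 32 bytes remain */
        mix += input[i];
        i += 32;
    }

    if ((input_count - i) >= 16) { /* then one 128-bit block */
        mix += input[i];
        i += 16;
    }

    if ((input_count - i) >= 8) { /* then one 64-bit block */
        mix += input[i];
        i += 8;
    }

    while (i < input_count) { /* then the 32-bit block and 1-7 byte remainder, byte-wise here */
        mix += input[i];
        i++;
    }

    return mix; /* the real function ends with the shared finalization sequence */
}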

The finalization process decreases collisions by mixing between 16 and 256 bits of state into mix and mix_offset, with the amount determined by input_count, which keeps inputs smaller than 32 bytes fast.

The remaining finalization process is similar to the pseudo-cryptographic finalization from Eightomic Hash 32 C to increase diffusion for a good avalanche effect in a combination of 2 32-bit blocks.

License

Eightomic License

© Eightomic <https://eightomic.com>

Permission to both copy this software and use copies of this software is granted only when the following 2 conditions are satisfied.

1. These software license conditions in entirety, both unmodified and without obfuscation, must be included wherever an instance of either a copy of this software or a derivation from a copy of this software is implemented.

2. Liability for damages incurred in conjunction with this software must not be assumed by Eightomic.