
Hash 32 D
It's the fastest 32-bit hashing algorithm versus Murmur3A.
Algorithm
Source
#include <stdint.h>
uint32_t eightomic_align_32(const uint8_t *input, unsigned long i) {
uint32_t input_aligned = input[i];
input_aligned |= input[i + 1] << 8;
input_aligned |= input[i + 2] << 16;
input_aligned |= input[i + 3] << 24;
return input_aligned;
}
uint32_t eightomic_hash_32_d(unsigned long input_count, const uint8_t *input) {
const unsigned char input_remainder_count = input_count & 3;
uint32_t mix = 1111111111;
uint32_t mix_offset = 1111111111;
unsigned long i = 0;
input_count -= input_remainder_count;
while (i < input_count) {
mix += eightomic_align_32(input, i);
mix_offset += mix;
mix += ((mix << 14) | (mix >> 18)) - mix_offset;
mix_offset += mix_offset << 2;
mix += mix << 1;
i += 4;
}
switch (input_remainder_count) {
case 3:
mix += input[input_count + 2];
mix += mix << 3;
mix_offset += mix;
mix_offset = (mix_offset << 19) | (mix_offset >> 13);
case 2:
mix += input[input_count + 1];
mix += mix << 3;
mix_offset += mix;
case 1:
mix += input[input_count];
}
mix += mix << 3;
mix_offset += input_remainder_count + mix;
mix_offset = (mix_offset << 19) | (mix_offset >> 13);
mix += mix << 3;
mix_offset += input_count + mix;
mix_offset = (mix_offset << 19) | (mix_offset >> 13);
mix ^= mix_offset;
mix += (mix_offset << 27) | (mix_offset >> 5);
mix_offset ^= mix >> 3;
mix += (mix_offset << 8) | (mix_offset >> 24);
mix ^= mix_offset;
mix_offset += (mix << 14) | (mix >> 18);
mix_offset ^= ((mix << 9) | (mix >> 23)) + (mix_offset >> 7);
return mix + mix_offset;
}
License
It's free and open source.
Reference
eightomic_hash_32_d() is the hashing function that accepts the following 2 arguments in left-to-right order.
1: input_count is the unsigned long count of elements in the input array.
2: input is the const uint8_t array to hash.
The return value data type is uint32_t.
It returns the 32-bit unsigned integer hash digest result.
Explanation
It's faster than Murmur3A on average at input sizes segmented from 1 byte to 32 bytes, 1 byte to 64 bytes and greater than 64 bytes.
Speed tests were performed with endian-safe, slower memory reads instead of using memcpy(), but eightomic_align_32() is adjustable whenever the implementation allows endian-unsafe reads or unaligned reads.
It passes all seedless SMHasher tests.
Auxiliary seed input isn't a strict requirement for non-cryptographic hashing algorithms, but Eightomic is developing and testing an improved version that accepts a 32-bit seed integer when implementations require it.
It hashes 4-byte input segments with alternating arithmetic combinations, a bit rotation and trailing bitwise multiplications. mix_offset uses a larger multiplier than mix to compensate for the lack of a secondary summed bit rotation.
The finalization process is similar to the pseudo-cryptographic finalization from Eightomic Hash 32 C to increase diffusion for a good avalanche effect in a combination of 2 32-bit blocks.
The 1-3 byte input remainder hashes with a stripped-down variant of the aforementioned one-at-a-time hashing loop.
The finalization process simulates hashing appended null bytes to compensate efficiently for the minimal input mixing, then adds input_count to offset the usual collisions from hashing in 4-byte segments.