String Vector Denotation

We’ll denote our chars in string with integers:

a -> 1    g -> 7    m -> 13   s -> 19   y -> 25
b -> 2    h -> 8    n -> 14   t -> 20   z -> 26
c -> 3    i -> 9    o -> 15   u -> 21
d -> 4    j -> 10   p -> 16   v -> 22
e -> 5    k -> 11   q -> 17   w -> 23
f -> 6    l -> 12   r -> 18   x -> 24

HashCode Solution

Method 1 - Just Sum up Char Codes

One way to create hash is just take the chars in string and add them up. For eg. if we have chars in range [a-z], we can assign chars numbers, like a having 1, b having 2 and so on. So, bac -> 2+1+3 = 6. But this will result in a lot fo collisions.

Method 2 - Use Prime Multiplication and Summation - smaller power on base in start

Let’s take prime, ( P = 11 ) for this example. Then we can generate the hashcode like this:

$$ \text{hash code} = (1st letter) \times P^0 + (2nd , letter) \times P^1 + (3rd , letter) \times P^2 + \ldots $$

Examples

Lets take prime as 11. The hash value of abc will be:

$$ 1 \times 11^0 + 2 \times 11^1 + 3 \times 11^2 = 1 + 22 + 363 = 386 $$

Hashcode of bac will be:

$$ 2 \times 11^0 + 1 \times 11^1 + 3 \times 11^2 = 2 + 11 + 363 = 376 $$

So, there is no collision.

Code

Java
int hashCode(String s) {
	int code = 0;
	int P = 11;//prime number
	for (int i = 0; i<s.length; i++) {
		code = code + s.charAt(i) * power(P, i);
	}
	return code;
}
Python
def char_to_value(ch: str) -> int:
    """ Convert character to its corresponding integer value. """
    return ord(ch) - ord('a') + 1

def power(x: int, y: int) -> int:
    """ Calculate x raised to the power y. """
    return x ** y

def hash_code(s: str) -> int:
    code = 0
    P = 11  # prime number
    for i in range(len(s)):
        code += char_to_value(s[i]) * power(P, i)
    return code

Method 3 - Use Prime Multiplication and Summation - smaller power on base in end

We can also do prime multiplication from start. So, for example for abc and base 11:

$$ 1 \times 11^2 + 2 \times 11^1 + 3 \times 11^0 = 121 + 22 + 3 = 146 $$

Code

Java
int hashCode(String s) {
	int code = 0;
	int P = 11;//prime number
	for (int i = s.length-1; i>=0; i--) {
		code = code + s.charAt(i) * power(P, i);
	}
	return code;
}
Python
def hash_code_reverse(s: str) -> int:
    code = 0
    P = 11  # prime number
    length = len(s)
    for i in range(length):
        code += char_to_value(s[i]) * power(P, length - i - 1)
    return code

Handling Overflow

When adding numbers, there can be overflow for long strings or larger prime numbers. To overcome this, we can use the modulo operator.

MOD = 2^31 for integers in java
MOD = (1<<31) - 1;
Hash = Hash % MOD

But is it enough to stop the overflow. While adding all the numbers, there may occur overflow at the final step. So, we not take modulo while adding but also while multiplying.

Basically we can use following mod property:

(a+b) % m = (a%m + b%m) % m
(a*b) % m = (a%m * b%m) % m
(a/b) % m = (a * inverse of b, if exists) % m

So, we can do something like this:

(1st letter) X (prime)⁰ % MOD + (2nd letter) X (prime)¹ % MOD + (3rd letter) X (prime)² % MOD + ......

Code

Java
int hashCode(String s) {
	int code = 0;
	int P = 11;//prime number
	int m = (1<<31) - 1;
	for (int i = 0; i<s.length, i++) {
		code = (code + s.charAt(i) * power(P, i) % m) % m;
	}
	return code;
}
Python
def hash_code_with_mod(s: str, mod: int = (1 << 31) - 1) -> int:
    code = 0
    P = 11  # prime number
    for i in range(len(s)):
        code = (code + char_to_value(s[i]) * power(P, i) % mod) % mod
    return code