Bloom Filter Implementation in JavaScript
Bloom filters are probabilistic data structures used to test whether an element is a member of a set. They offer a space-efficient way to check for membership, but with a small chance of false positives (reporting an element as present when it's not). This challenge asks you to implement a Bloom filter in JavaScript, demonstrating your understanding of bitwise operations and hashing.
Problem Description
You are tasked with creating a JavaScript class called BloomFilter that implements a Bloom filter. The class should allow you to add elements, check if an element is potentially present, and provide information about the filter's size and number of hash functions. The Bloom filter should use multiple hash functions to reduce the probability of false positives.
Key Requirements:
- Constructor: The constructor should accept two arguments: size (the number of bits in the filter) and numHashFunctions (the number of hash functions to use).
- add(element): This method should add an element to the Bloom filter. It should hash the element with each of the numHashFunctions hash functions and set the corresponding bits in the filter to 1.
- contains(element): This method should check if an element is potentially present in the Bloom filter. It should hash the element with each of the numHashFunctions hash functions and check if all corresponding bits in the filter are set to 1. If any bit is 0, the element is definitely not in the filter.
- getSize(): Returns the size of the Bloom filter (number of bits).
- getNumHashFunctions(): Returns the number of hash functions used.
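The requirements above can be sketched as follows. This is one possible shape, not a reference solution, and several choices are assumptions of the sketch: the FNV-1a-style string hash, the seed value 0x9e3779b9 used to get a second base hash, the positions() helper, keying elements by type plus String(element), and throwing on invalid arguments. It derives the bit positions by double hashing rather than by the fixed +1, +2 offsets mentioned in the Notes.

```javascript
// FNV-1a-style 32-bit hash over UTF-16 code units.
// The seed parameter is an assumption of this sketch: it lets one routine
// serve as two different base hashes.
function fnv1a(str, seed = 0x811c9dc5) {
  let hash = seed >>> 0;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // FNV prime, kept to 32 bits
  }
  return hash;
}

class BloomFilter {
  constructor(size, numHashFunctions) {
    if (!Number.isInteger(size) || size <= 0 ||
        !Number.isInteger(numHashFunctions) || numHashFunctions <= 0) {
      throw new Error("Size and numHashFunctions must be positive integers.");
    }
    this.size = size;
    this.numHashFunctions = numHashFunctions;
    this.bits = new Uint8Array(size); // one entry per bit, all 0 initially
  }

  // Hypothetical helper: yields numHashFunctions bit positions for an
  // element using double hashing, (base + i * step) % size.
  *positions(element) {
    const key = typeof element + ":" + String(element);
    const base = fnv1a(key) % this.size;
    const step = (fnv1a(key, 0x9e3779b9) % this.size) || 1; // never 0
    for (let i = 0; i < this.numHashFunctions; i++) {
      yield (base + i * step) % this.size;
    }
  }

  add(element) {
    for (const pos of this.positions(element)) this.bits[pos] = 1;
  }

  contains(element) {
    for (const pos of this.positions(element)) {
      if (this.bits[pos] === 0) return false; // definitely not present
    }
    return true; // possibly present
  }

  getSize() { return this.size; }
  getNumHashFunctions() { return this.numHashFunctions; }
}
```

With this sketch, contains() on a freshly constructed filter returns false for any element, and contains() on an element that was added returns true.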
Expected Behavior:
- Adding an element should set the appropriate bits in the filter.
- Checking for an element that was added should always return true; a Bloom filter never produces false negatives.
- Checking for an element that was not added should return false, apart from a small probability of a false positive.
- The getSize() and getNumHashFunctions() methods should return the correct values.
Edge Cases to Consider:
- size and numHashFunctions should be positive integers. Handle invalid inputs gracefully (e.g., throw an error or return a default value).
- The hash functions should distribute elements evenly across the bit array to minimize collisions and false positives.
- Consider how to handle different data types for the elements being added (strings, numbers, etc.). The hash functions should work consistently across these types.
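One way to make hashing consistent across data types is to convert every element to a string key before hashing. The helper below is a hypothetical sketch (the name toKey and the "type:value" key format are assumptions, not part of the challenge). It treats the number 123 and the string "123" as different elements; whether they should be distinct is a design decision the challenge leaves open. Objects are keyed by JSON.stringify, which depends on property order and only suits plain JSON-serializable data.

```javascript
// Hypothetical helper: build a stable string key for any element.
function toKey(element) {
  if (element === null) return "null";
  if (typeof element === "object") {
    // Assumption: objects and arrays are plain, JSON-serializable data.
    return "object:" + JSON.stringify(element);
  }
  // Primitives: prefix with the type so 123 and "123" get different keys.
  return typeof element + ":" + String(element);
}

console.log(toKey(123));      // number:123
console.log(toKey("123"));    // string:123
console.log(toKey([1, 2]));   // object:[1,2]
```

The resulting key can then be passed to whichever string hash the filter uses.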
Examples
Example 1:
Input:
const filter = new BloomFilter(100, 3);
filter.add("apple");
filter.add("banana");
Output:
filter.contains("apple"); // true
filter.contains("banana"); // true
filter.contains("orange"); // false (likely, but could be a false positive)
Explanation: The filter is initialized with a size of 100 bits and 3 hash functions. "apple" and "banana" are added, setting the corresponding bits. "orange" is not added, so it's likely to return false, but a false positive is possible.
Example 2:
Input:
const filter = new BloomFilter(20, 2);
filter.add(123);
filter.add(456);
Output:
filter.contains(123); // true
filter.contains(456); // true
filter.contains(789); // false (likely)
Explanation: Demonstrates the filter working with numbers.
Example 3: (Edge Case)
Input:
const filter = new BloomFilter(0, 0); // Invalid size and numHashFunctions
filter.add("test");
Output:
Error: Size and numHashFunctions must be positive integers. (or similar error handling)
Explanation: Shows how to handle invalid input.
Constraints
- size must be a positive integer.
- numHashFunctions must be a positive integer.
- The Bloom filter should use a bit array (represented as an array of booleans or numbers) to store the bits.
- The hash functions should be simple and efficient (e.g., using modulo arithmetic or bitwise operations). You can use a single hash function and derive multiple hash functions from it.
- The space complexity of the Bloom filter should be proportional to size.
- The time complexity of add and contains should be O(numHashFunctions).
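If the bit array is stored as numbers, the bits can be packed 32 to a word and accessed with bitwise operations, which keeps memory close to size bits. The class below is a minimal sketch of that idea; the name BitArray and its set/get methods are hypothetical and not required by the challenge. An array of booleans satisfies the constraints equally well.

```javascript
// Hypothetical helper: stores `size` bits packed into 32-bit words.
class BitArray {
  constructor(size) {
    this.size = size;
    this.words = new Uint32Array(Math.ceil(size / 32));
  }

  // Set the bit at `index` to 1.
  set(index) {
    // index >>> 5 selects the word (index / 32); index & 31 selects the bit.
    this.words[index >>> 5] |= 1 << (index & 31);
  }

  // Return true if the bit at `index` is 1.
  get(index) {
    return (this.words[index >>> 5] & (1 << (index & 31))) !== 0;
  }
}

const bits = new BitArray(100); // 100 bits fit in 4 words
bits.set(42);
console.log(bits.get(42)); // true
console.log(bits.get(43)); // false
```

Both set and get run in constant time, so add and contains remain O(numHashFunctions) when built on top of it.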
Notes
- You can use any JavaScript features you are comfortable with.
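- Consider using a single hash function and deriving multiple hash functions from it to reduce code complexity. For example, you could use a single hash function h(x) and generate numHashFunctions different hash values as h1(x) = h(x) % size, h2(x) = (h(x) + 1) % size, h3(x) = (h(x) + 2) % size, and so on. This is the simplest scheme, but every element then sets a run of adjacent bits determined by a single hash value, which gives a higher false positive rate than independent hash functions would. Combining two base hashes h(x) and g(x), for example as (h(x) + i * g(x)) % size, spreads the bits more evenly.
- The goal is to create a functional and reasonably efficient Bloom filter implementation. Focus on clarity and correctness over extreme optimization.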
- Error handling for invalid inputs is important.
- Remember that Bloom filters are probabilistic; false positives are possible. The probability of false positives depends on the size of the filter, the number of hash functions, and the number of elements that have been added.
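To get a feel for how these parameters interact, the standard approximation for the false positive probability of a filter with m bits, k hash functions, and n added elements is (1 - e^(-kn/m))^k. It assumes the hash functions are independent and uniformly distributed, which simple derived hashes only approximate, so treat the result as a rough estimate. The function names below are hypothetical.

```javascript
// Approximate false positive probability: (1 - e^(-k*n/m))^k.
// Assumes independent, uniformly distributed hash functions.
function estimateFalsePositiveRate(size, numHashFunctions, numElements) {
  const fractionSet = 1 - Math.exp(-numHashFunctions * numElements / size);
  return Math.pow(fractionSet, numHashFunctions);
}

// Number of hash functions that minimizes the estimate: (m / n) * ln 2,
// rounded and kept at 1 or more.
function optimalNumHashFunctions(size, numElements) {
  return Math.max(1, Math.round((size / numElements) * Math.LN2));
}

// Example 1 from above: 100 bits, 3 hash functions, 2 elements added.
console.log(estimateFalsePositiveRate(100, 3, 2));
// A 100-bit filter expected to hold 10 elements.
console.log(optimalNumHashFunctions(100, 10)); // 7
```

For the parameters of Example 1 the estimate is roughly 0.0002, which is why "orange" is very likely, though not guaranteed, to be reported as absent.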