Counting Bloom Filter
Basic Concept
A Counting Bloom filter is a generalization of the Bloom filter that tests whether the count of a given element in a sequence of elements is less than a given threshold. As with a Bloom filter, false positives are possible but false negatives are not: a query returns either "possibly greater than or equal to the threshold" or "definitely less than the threshold".
Algorithm description
- Most parameters of a Counting Bloom filter, such as n and k, are defined as in a Bloom filter. Here m denotes the number of counters, which take the place of the m bits in a Bloom filter.
- An empty Counting Bloom filter is an array of m counters, all initialized to 0.
- As in a Bloom filter, k different hash functions must be defined, each of which maps (hashes) a set element to one of the m counter positions with a uniform random distribution. Also as in a Bloom filter, k is a constant much smaller than m, and m is proportional to the number of elements to be added.
- The main difference from a Bloom filter lies in adding an element. To add an element, feed it to each of the k hash functions to obtain k array positions, and increment the counter at each of these positions by 1.
- To query for an element with a threshold θ (that is, to check whether the count of the element is less than θ), feed it to each of the k hash functions to obtain k counter positions.
- If any of the counters at these positions is less than θ, the count of the element is definitely less than θ: if the count were greater than or equal to θ, all the corresponding counters would be greater than or equal to θ.
- If all of them are greater than or equal to θ, then either the count really is greater than or equal to θ, or the counters have reached θ by chance because other elements that hash to the same positions incremented them.
- If all of the counters are greater than or equal to θ even though the count is less than θ, the result is a false positive. As with a Bloom filter, the rate of false positives should be minimized.
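The add and query steps above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the class name CountingBloomFilter, the method names add and count_at_least, and the way the k hash functions are simulated (salting SHA-256 with the function index) are all assumptions made for the example and do not come from the description above.

```python
import hashlib


class CountingBloomFilter:
    """Illustrative sketch: m counters and k hash functions."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        # An empty filter is an array of m counters, all set to 0.
        self.counters = [0] * m

    def _positions(self, item: str) -> list[int]:
        # Assumption: the k hash functions are simulated by prefixing
        # the item with the function index i before hashing with SHA-256.
        positions = []
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            positions.append(int.from_bytes(digest[:8], "big") % self.m)
        return positions

    def add(self, item: str) -> None:
        # Increment the counter at each of the k positions by 1.
        for pos in self._positions(item):
            self.counters[pos] += 1

    def count_at_least(self, item: str, theta: int) -> bool:
        # False: the count is definitely less than theta.
        # True: the count is possibly greater than or equal to theta.
        return all(self.counters[pos] >= theta for pos in self._positions(item))


cbf = CountingBloomFilter(m=1024, k=3)
for word in ["apple", "pear", "apple", "apple"]:
    cbf.add(word)

# "apple" was added three times, so this query cannot be a false negative.
print(cbf.count_at_least("apple", 3))  # True
# A higher threshold normally gives False, but a false positive is possible.
print(cbf.count_at_least("apple", 4))
```

If two of the simulated hash functions happen to map an element to the same position, that counter is incremented twice for one add. This only pushes counters upward, so the "definitely less than θ" answer remains reliable in this sketch.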