Find duplicates in an array using a bit array in Python


Suppose we have an array of numbers whose values range from 1 to n, where n is at most 32,000. The array may contain duplicate entries, and we do not know the value of n. If only 4 kilobytes of memory are available, how can we display all duplicates in the array?
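
As a quick check of the memory budget, the sketch below assumes that "4 kilobytes" means 4 × 1024 bytes and that one bit is used per possible value. The constant names are illustrative and do not come from the original problem.

```python
# Assumption: "4 kilobytes" means 4 * 1024 bytes.
MEMORY_BYTES = 4 * 1024
MAX_VALUE = 32000  # largest value the array may contain

# One flag bit per possible value.
available_bits = MEMORY_BYTES * 8

# Number of 32-bit words needed, matching the (n / 2^5) + 1 sizing
# used in the steps that follow.
words_needed = (MAX_VALUE >> 5) + 1

print(available_bits)      # 32768
print(words_needed * 4)    # 4004 (bytes)
```

Since 32,768 bits are available and only about 32,000 are needed, a bit array with one flag per value fits in the stated budget.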

So, if the input is like [2, 6, 2, 11, 13, 11], then the output will be [2, 11], as 2 and 11 appear more than once in the given array.

To solve this, we will follow these steps −

Create a bit-array data structure bit_arr; it has the following methods −

  • Define the constructor. This will take n

    • arr := an array of size (n / 2^5) + 1, filled with 0

  • Define a function get_val(). This will take pos

    • index := pos / 2^5 (integer division)

    • bitNo := pos AND 31

    • return true when (arr[index] AND (2^bitNo)) is not 0

  • Define a function set_val(). This will take pos

    • index := pos / 2^5 (integer division)

    • bitNo := pos AND 31

    • arr[index] := arr[index] OR (2^bitNo)

From the main method, where nums is the input array, do the following −

  • bits := bit_arr(32000)

  • for i in range 0 to size of nums − 1, do

    • num := nums[i]

    • if bits.get_val(num) is true, then

      • display num

    • otherwise,

      • bits.set_val(num)
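
To make the index and bit arithmetic in the steps above concrete, here is a small sketch that maps a position to its word index and bit number, assuming 32-bit words. The helper name `locate` is hypothetical and is used only for illustration.

```python
def locate(pos):
    """Return (word index, bit number) for a position, assuming 32-bit words."""
    index = pos >> 5    # same as integer division by 2^5
    bit_no = pos & 31   # same as pos modulo 32
    return index, bit_no

print(locate(11))      # (0, 11)
print(locate(37))      # (1, 5)
print(locate(31999))   # (999, 31)
```

Each group of 32 consecutive positions shares one word, so position 37 lands in the second word (index 1) at bit 5.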

Example

Let us see the following implementation to get a better understanding −

class bit_arr:
    def __init__(self, n):
        # One 32-bit word for every 32 positions, plus one for the remainder
        self.arr = [0] * ((n >> 5) + 1)

    def get_val(self, pos):
        index = pos >> 5    # word that holds the bit
        bit_no = pos & 31   # bit offset within that word
        return (self.arr[index] & (1 << bit_no)) != 0

    def set_val(self, pos):
        index = pos >> 5
        bit_no = pos & 31
        self.arr[index] |= (1 << bit_no)

def is_duplicate(nums):
    # One bit per possible value, up to 32,000
    bits = bit_arr(32000)
    for num in nums:
        if bits.get_val(num):
            print(num, end=" ")   # bit already set: num was seen before
        else:
            bits.set_val(num)

nums = [2, 6, 2, 11, 13, 11]
is_duplicate(nums)

Input

[2, 6, 2, 11, 13, 11]

Output

2 11
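
In CPython, a list of integers stores object references, so a list-backed bit array takes more memory than the bits it represents. A minimal alternative sketch, assuming values lie between 0 and 32,000, uses a `bytearray` with one bit per value. The function name `find_duplicates` and the `max_value` parameter are illustrative and not part of the implementation above.

```python
def find_duplicates(nums, max_value=32000):
    """Return each repeated occurrence of a value, in the order encountered."""
    # One bit per value in 0..max_value, packed eight to a byte.
    seen = bytearray((max_value >> 3) + 1)
    duplicates = []
    for num in nums:
        index = num >> 3          # byte that holds the bit
        mask = 1 << (num & 7)     # bit within that byte
        if seen[index] & mask:
            duplicates.append(num)   # already seen once
        else:
            seen[index] |= mask
    return duplicates

print(find_duplicates([2, 6, 2, 11, 13, 11]))   # [2, 11]
```

As with the printing version, a value that occurs three times is reported twice. The result list is collected here only for convenience; to stay within a strict memory budget, the values could be printed as they are found instead.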

Updated on: 25-Aug-2020
