Find Duplicates

10.8 Find Duplicates: You have an array with all the numbers from 1 to N, where N is at most 32,000. The array may have duplicate entries and you do not know what N is. With only 4 kilobytes of memory available, how would you print all duplicate elements in the array?
  • Brainstorm:
    • Given 32000 numbers each of which are 32 bit integers, we have a total of 1024000 bits of information, and insufficient 4kb = 2^12 bits = 4096 bits of memory. (250x fewer).
    • Store in 500 hash tables. That contain the digits as well as the frequencies of the digits.
    • Merge the hash tables with external merge sort.
    • Print out frequencies stored.
  • Solution:
    • Bit vector.