Find Duplicates

10.8 Find Duplicates: You have an array with all the numbers from 1 to N, where N is at most 32,000. The array may have duplicate entries and you do not know what N is. With only 4 kilobytes of memory available, how would you print all duplicate elements in the array?

  • Brainstorm:

    • Given 32000 numbers each of which are 32 bit integers, we have a total of 1024000 bits of information, and insufficient 4kb = 2^12 bits = 4096 bits of memory. (250x fewer).

    • Store in 500 hash tables. That contain the digits as well as the frequencies of the digits.

    • Merge the hash tables with external merge sort.

    • Print out frequencies stored.

  • Solution:

    • Bit vector.

Last updated