Data Structures and Algorithms B
  • Introduction
  • Stable Marrage
  • Huffman Codes
    • ABL
    • Implementation
  • Graph Algorithms
    • Connect Component
    • Bipartiteness
    • Strongly Connected Components
      • Implementation
    • Topological Sort
      • Implementation 1
      • Implementation 2
    • Dijkstra’s Shortest Path
    • Minimum Spanning Tree
      • Prims
        • Implementation 1
      • Kruskels
        • Implementation 1
      • Traveling Salesman Problem
    • k-Clustering
    • Dynamic Programming with Trees
      • Implementation 1
      • Implementation 2
    • Disjoint Sets
      • Implementation
    • Eularian Cycle
      • Implementation
    • Hamiltonian Path
      • Implementation
  • Divide and Conquer
    • Merge Sort
    • Integer Multiplication
    • Closest Pair
    • Master Method
  • Dynamic Programming
    • Interval Scheduling
    • Knapsack Problem
  • Network Flow
    • Maximum Network Flow
    • Improvements
    • Image Segmentation
  • String Algorithms
    • Z Algorithm
    • Boyer-Moore
    • Knuth-Morris-Pratt
    • Suffix Trees
      • Naive Implementation
      • Ukkonens Algorithm
        • Implementation
      • Applications
        • Longest Common Substring
        • Longest Palindromic Substring
        • Longest Repeated Substring
  • Randomized Algorithms
    • Bloom Filters
      • Implementation
Powered by GitBook
On this page

Was this helpful?

  1. String Algorithms
  2. Suffix Trees
  3. Applications

Longest Repeated Substring

PreviousLongest Palindromic SubstringNextRandomized Algorithms

Last updated 5 years ago

Was this helpful?

Problem Find the longest common substring shared within the same word.

Examples

LRS in AAAAAAAAAA is : AAAAAAAAA
LRS in ABCDEFG is    : No repeated substring
LRS in ABABABA is    : ABABA
LRS in ATCGATCGA is  : ATCGA

Solution

Assume S = "ABABABA". Begin by building the suffix tree for S.

S = A B A B A B A
I = 1 2 3 4 5 6 7

Next, return the path label to the deepest internal node.

Why this works

As mentioned before, an internal node carries the property that it is shares a common substring with its parent. In addition, it branches out to at least two other occurrences. So the deepest node will not necessarily the longest path label, or the most repeated substring, but the longest substring that is repeated at least 2 times.

A new branch is created per new substring suffix. The deeper into the tree, the more characters are shared with another substring in the tree. Therefore the deepest node will contain the longest shared occurrence.