Data Structures and Algorithms B
  • Introduction
  • Stable Marrage
  • Huffman Codes
    • ABL
    • Implementation
  • Graph Algorithms
    • Connect Component
    • Bipartiteness
    • Strongly Connected Components
      • Implementation
    • Topological Sort
      • Implementation 1
      • Implementation 2
    • Dijkstra’s Shortest Path
    • Minimum Spanning Tree
      • Prims
        • Implementation 1
      • Kruskels
        • Implementation 1
      • Traveling Salesman Problem
    • k-Clustering
    • Dynamic Programming with Trees
      • Implementation 1
      • Implementation 2
    • Disjoint Sets
      • Implementation
    • Eularian Cycle
      • Implementation
    • Hamiltonian Path
      • Implementation
  • Divide and Conquer
    • Merge Sort
    • Integer Multiplication
    • Closest Pair
    • Master Method
  • Dynamic Programming
    • Interval Scheduling
    • Knapsack Problem
  • Network Flow
    • Maximum Network Flow
    • Improvements
    • Image Segmentation
  • String Algorithms
    • Z Algorithm
    • Boyer-Moore
    • Knuth-Morris-Pratt
    • Suffix Trees
      • Naive Implementation
      • Ukkonens Algorithm
        • Implementation
      • Applications
        • Longest Common Substring
        • Longest Palindromic Substring
        • Longest Repeated Substring
  • Randomized Algorithms
    • Bloom Filters
      • Implementation
Powered by GitBook
On this page

Was this helpful?

  1. String Algorithms
  2. Suffix Trees

Ukkonens Algorithm

PreviousNaive ImplementationNextImplementation

Last updated 5 years ago

Was this helpful?

Both KMP and Boyer Moore algorithm run at O(m+n)O(m+n)O(m+n) time complexity, where mmm is the length of the text, and nnn is the length of the string. The weaknesses of these algorithms become more apparent as the text space increases.

Ukkonens construction of suffix trees ensures linear time search for the string S, that is independent from the size of the text T. Once the tree is constructed, main other operations can be complete and can go beyond the scope of just the needle and in haystack problem. Suffix trees, for example, provide a linear time solution to the longest common substring problem.

Improvements

  • O(1) space efficiency per node, through the use of two indices that describe the start and end positions of a suffix substring of S. The number of nodes will stay linear to the given word S, so we achieve O(n)O(n)O(n) space complexity.

  • Node creations are constant time, since only a constant number of node attributes are initialized at a given time independent of S and T.

  • There is no need to traverse down the tree. Instead memory is maintained on to where to make inserts, either in the form of (1) a pointer to a node that was left from a previous insertion, or (2) a network of suffix links.