Effects of suffix repetition rates of a string on the performance of string matching algorithms

Abstract

The highly efficient Boyer-Moore's string matching algorithm utilizes information on multi-occurrences of string suffixes in a pattern string to avoid backtracks in searching the pattern string. One hypothesis is that Boyer-Moore's algorithm even benefits more from highly self-repetitive patterns. In this paper, the author studies how multi-occurrences of string suffixes affect the performance of the Boyer-Moore's algorithm as well as some other well known string search algorithms. The paper introduces a new concept of suffix repetition rate (SRR) to measure how frequently the suffixes of a string occur inside of the string. Using this measurement, experiments with several thousands patterns over the entire range of SRRs have been carried out, and the results show that increasing of SRR on pattern strings does not improve the performance of a searching algorithm in terms of efficiency.

Department(s)

Computer Science

Document Type

Conference Proceeding

DOI

https://doi.org/10.1109/ICIS.2009.29

Keywords

Algorithm efficiency, String matching, String suffix, Suffix repetition rates

Publication Date

11-10-2009

Journal Title

Proceedings of the 2009 8th IEEE/ACIS International Conference on Computer and Information Science, ICIS 2009

Share

COinS