You are on page 1of 4

10/19/2014

C# Boyer-Moore String Search Example

Go

C# Boyer-Moore

C#: Method

Boyer-Moore is a string searching


algorithm. Compared to IndexOf, this is a
faster way to search for substrings in a
string. It avoids checking most positions
in the source string. The implementation
here uses constant characters.
IndexOf

Example
First, this program demonstrates how to use the
Boyer-Moore style algorithm. It uses a lookup table to
determine the next possible location of a string to find.
The implementation here is hard-coded.
Info:
This enhances the method's performance, but
detracts from its usefulness in programs.
The program has methods that search for the characters "ABCD"
uppercase in that order in the source string and returns true or false.
Contains1 is a method that implements parts of the Boyer-Moore
string searching algorithm.
Program that uses simple Boyer-Moore search: C#

C# Delegate
C#
com Programming C#

using System;
class Program
{
static void Main()
{
const string input1
const string input2
const string input3
const string input4

=
=
=
=

"There were ABCD Perls";


"ABCD string";
"No ABC dee string";
"Test BAC";

Console.WriteLine(Contains1(input1));
Console.WriteLine(Contains1(input2));
Console.WriteLine(Contains1(input3));
Console.WriteLine(Contains1(input4));
Console.WriteLine(Contains2(input1));
Console.WriteLine(Contains2(input2));
Console.WriteLine(Contains2(input3));
Console.WriteLine(Contains2(input4));
}
static bool Contains1(string value)
{
// Searches for 4-letter constant string using Boyer-Moore style algorithm.
// ... Uses switch as lookup table.
int i = 3; // First index to check.
int length = value.Length;
while (i < length)
http://www.dotnetperls.com/boyer-moore

1/4

10/19/2014

C# Boyer-Moore String Search Example

{
switch (value[i])
{
case 'D':
// Last character in pattern found.
// ... Check for definite match.
if (value[i - 1] == 'C' &&
value[i - 2] == 'B' &&
value[i - 3] == 'A')
{
return true;
}
// Must be at least 4 characters away.
i += 4;
continue;
case 'C':
// Must be at least 1 character away.
i += 1;
continue;
case 'B':
// Must be at least 2 characters away.
i += 2;
continue;
case 'A':
// Must be at least 3 characters away.
i += 3;
continue;
default:
// Must be at least 4 characters away.
i += 4;
continue;
}
}
// Nothing found.
return false;
}
static bool Contains2(string value)
{
// Searches for 4-letter constant string with IndexOf.
return value.IndexOf("ABCD", StringComparison.Ordinal) != -1;
}
}
Output
True
True
False
False
True
True
False
False

In this example, the input string is passed to the


method, and the pattern string, which here contains
four constant uppercase letters, is searched. The final
character in the pattern (D) is searched for at each
place.
Switch
And:
http://www.dotnetperls.com/boyer-moore

2/4

10/19/2014

C# Boyer-Moore String Search Example

If D is found, we use an if-statement to check the previous three


characters for C, B and A.
If
If the character is any of ABCD, we
advance the maximum number of
places where the string might start
(one to four). If the character is not
one of ABCD, then we know that ABCD
cannot be found unless we advance
four places.
Tip:
This special logic often reduces the number of character checks by
75% in this program.

Discussion
Unfortunately, this constant character checking
code is only useful in limited situations. If you
have a program that must scan for a certain
important word or name in many strings, then you
could use this code to do that.
So:
For example, you could use this code to check
for values in HTTP headers, such as to detect browser user agents.
However, the inflexibility of this code can
be seen as an advantage. In the .NET
Framework, loading a constant onto the
evaluation stack is much faster than
loading a variable reference or field.
Therefore:
A more advanced method that is configurable would likely be
slower because it would have to access memory more often.

Benchmark
Let's look at the benchmark results for the
Boyer-Moore style string searching algorithm
versus the Contains2 method, which uses
the IndexOf method. The Boyer-Moore style
algorithm is about 86% faster than the IndexOf method.
Note:
The Boyer-Moore algorithm is often faster than other direct
character testing expressions in the C# language.
Method calls tested: separate loops
Contains1("This is an A example test of ABCD and more");
Contains2("This is an A example test of ABCD and more");
http://www.dotnetperls.com/boyer-moore

3/4

10/19/2014

C# Boyer-Moore String Search Example

Benchmark results
Contains1: 23.60 ns
Contains2: 166.98 ns

Summary
We implemented and demonstrated a BoyerMoore string searching method that uses
constant characters. We noted how the
principles of the Boyer-Moore algorithm can
be used to improve the performance of string
searching for substrings.
Therefore:
The method here could be adapted for fast searching of other
constant substrings, but is limited in its uses.

C# Example
C# Source Code
C# Enum

http://www.dotnetperls.com/boyer-moore

4/4

You might also like