Please navigate to the bottom of the page for Table of Contents

Saturday, May 21, 2011

String pattern matching

In almost any non-trivial string application, pattern matching is an indispensable part of the code. Finding the position where the search string (pattern) exists in the source is generally referred to pattern matching. Of course, more complex pattern matching involves the use of regular expressions and other advanced techniques. There have been many algorithms that have been developed for solving this simple functionality of pattern matching. In this post, we will discuss the simplest of all: brute force search.

Brute Force Search is the simplest of all search algorithms. In this algorithm, the pattern string is searched character by character in the source string. The search begins at position 0 of the source and continues till either a match is found or the entire source string (minus the pattern string length) has been traversed.

Let’s look at the code to implement is simple brute force search technique.

#include "stdafx.h"
#include <string.h>

// searches for the input pattern string in the
// source string and returns either the position
// where the mach is found or -1 if no match
// was found
int PatternMatching(char* source, char* pattern)
{
// find the source and pattern lengths
int sourceLength = strlen(source);
int patternLength = strlen(pattern);

for (int pass = 0; pass <= sourceLength - patternLength; pass++)
{
// initialize the current positions
// for source and pattern

// for each pass we will have to go through the
// entire pattern string
int patCurrPos = 0;

// for each, we need to start the search at
// the current source string position
int srcCurrPos = pass;

// till we have the match, continue looping
while ((source[srcCurrPos] == pattern[patCurrPos]) && (patCurrPos < patternLength))
{
patCurrPos++;
srcCurrPos++;
}
// do we have a match
if (patCurrPos == patternLength)
return pass;
}

// match not found
return -1;
}

3 comments:

  1. Please explain kth largest element in an array ,
    largest sum of sub array ,
    sum of two numbers is m then find that two numbers are present in an array or not ...
    Programs too.

    ReplyDelete
  2. in your programme, 'while condition' should check the pattern length first. otherwise you will get index out of range exception
    ex:
    while ((patCurrPos < patternLength) && (source[srcCurrPos] == pattern[patCurrPos]))
    {
    patCurrPos++;
    srcCurrPos++;
    }

    ReplyDelete