I want to evolve a generic *binary search* algorithm out
of a conventional one. The starting point is a C++ binary search
algorithm which takes an integer array, the number of elements
in the array and the value searched for as arguments. `binary_search`
returns a constant pointer to the element - if found - the
*nil* pointer else.

const int* binary_search (const int* array, int n, int x) { const int* lo = array, *hi = array + n, *mid; while(lo != hi) { mid = lo + (hi - lo) / 2; if (x == *mid) return mid; if (x < *mid) hi = mid; else lo = mid + 1; } return 0; }

Let us look at the assumptions this algorithm makes about its
environment. `binary_search` only works with integer arrays.
To make it work with arrays of arbitrary types we transform `binary_search`
in a template function.

template<class T> const T* binary_search (const T* array, int n, const T& x) { const T* lo = array, *hi = array + n, *mid; while(lo != hi) { mid = lo + (hi - lo) / 2; if (x == *mid) return mid; if (x < *mid) hi = mid; else lo = mid + 1; } return 0; }

Now the algorithm is designed for use with arrays of different
types. In case of not finding the value searched for, a special
pointer - *nil* - is returned. This requires that such a
value exists. Since we don't want to make this assumption, in
case of an unsuccessful search we return the pointer `array
+ n` (yes, a past-the-end pointer) instead.

template<class T> const T* binary_search (const T* array, int n, const T& x) { const T* lo = array, *hi = array + n, *mid; while(lo != hi) { mid = lo + (hi - lo) / 2; if (x == *mid) return mid; if (x < *mid) hi = mid; else lo = mid + 1; } return array + n; }

Instead of handing over` array` as pointer to the first
element and a size, we could also specify a pointer to the first
and past the last element to approach STL's iterator concept.

template<class T> const T* binary_search (T* first, T* last, const T& value) { const T* lo = array, *hi = array + n, *mid; while(lo != hi) { mid = lo + (hi - lo) / 2; if (value == *mid) return mid; if (value < *mid) hi = mid; else lo = mid + 1; } return last; }

To specify a pointer to the end of a container instead of handing
over its size has the advantage that it has not to be possible
to compute `last` out of `first` with `first+n`.
This is important for containers that don't allow random access
to their elements. Because our `binary_search` needs random
access to the elements of the container, this is of little importance
in our example. Another advantage is that the *difference*
type (here` int`) doesn't have to be explicitly handed
over, so the user of `binary_search` doesn't even have
to know it. The difference type is the type which is used to express
the type of the difference of two arbitrary iterators (pointers),
for example `last` - `first` could be of the type
`signed long`.

The last step to fully adapt the algorithm to the STL style is to change the first and last pointer type from pointers to the value type to an appropriate iterator type. By this step, the information of how the algorithm steps from one element to the next is torn away from the algorithm implementation and is hidden in the iterator objects. Now, no assumptions about the mechanism to iterate through the elements are made. This mechanism is handed over to the algorithm by the iterator objects. So, the algorithm is separated from the container it works on, all the operations that deal with iterators are provided by the iterator objects themselves.

Since `binary_search` needs random access to the elements
of the container it is called for and so iterators handed over
to `binary_search` have to satisfy the requirements of
random access iterators, we name the type of `first` and
`last` "`RandomAccessIterator`":

template<class RandomAccessIterator, class T> RandomAccessIterator binary_search (RandomAccessIterator first, RandomAccessIterator last, const T& value) { RandomAccessIterator not_found = last, mid; while(first != last) { mid = first + (last - first) / 2; if (value == *mid) return mid; if (value < *mid) last = mid; else first = mid + 1; } return not_found; }

The only assumptions the algorithm makes are the random access
to elements of type `T` between the two iterators (pointers)
`first` and `last` and that `operator==`
and `operator<` are defined for type `T` and
the value type of the iterator.

This generic binary search algorithm hasn't lost anything of its functionality, especially not when dealing with built in types.

int x[10]; // array of ten integer values int search_value; // value searched for // initialize variables int* i = binary_search (&x[0], &x[10], search_value); if (i == &x[10]) cout << "value not found"; else cout << "value found";

All the STL algorithms are constructed like our example algorithm - they try to make as few assumptions as possible about the environment they are run in.

Johannes Weidl (J.Weidl@infosys.tuwien.ac.at) - Apr 16, 1996