Visualizing Algorithm Trajectories in C++ Containers
Consider the following code:
#include <stdio.h>
#include <list>
#include <set>
#include <algorithm>
using namespace std;
template <class S>
void ls(S *s)
{
typename S::iterator it;
it = s->begin();
printf("{ ");
while (it!=s->end()) {
int n= *it;
printf("%d ", n);
++it;
}
printf("}\n");
}
int main()
{
int i;
int n;
list<int> l;
set<int> s;
for(i=0; i<10; i++) {
l.push_back(i);
}
for(i=0; i<10; i++) {
s.insert(i);
}
ls(&l);
ls(&s);
list<int>::iterator li;
set<int>::iterator si;
li = find(l.begin(), l.end(), 9);
if(li!=l.end()) printf("%d\n", *li);
si = s.find(9);
if(si!=s.end()) printf("%d\n", *si);
return 0;
}
Here we have two containers: a list<int> and a set<int>. Due to the high similarity in their interfaces, a developer writing code might not immediately perceive which one is better. Delegating work to the container feels like losing control. "You can't tell from the code," but "set has higher retrieval efficiency," one might whisper.
If you don't want to test with tens or hundreds of thousands of elements to verify efficiency, how can you know that set is faster?
There's a trick: using a wrapper type that is binary identical to the original element type. For the type parameter T (here int), we can define a wrapper wrap_T that has the exact same binary layout. We can then overload its constructors, destructor, assignment, comparisons, etc., to intercept these operations and trace their invocation. Let's define a wrapper for an integer:
struct integer {
int x;
integer(int n) {x=n;}
~integer() { printf("~ %d\n", x); }
bool operator<(const struct integer &r) const {
printf("{%d.%d} ", x, r.x);
if(x<r.x) return true;
return false;
}
bool operator==(const struct integer &r) const {
printf("%d. ", x);
if(x==r.x) return true;
return false;
}
};
Now we can observe the differences:
#define TYPE integer
struct Predict {
TYPE t;
Predict(int n):t(n){}
Predict(const Predict &p):t(p.t){printf("copy constructor\n");}
bool operator()(const TYPE &v) {
printf("%d ", v.x);
if(t==v) return true;
return false;
}
};
template <class S>
typename S::iterator find2(S *s, Predict &p)
{
typename S::iterator si;
S &t= *s;
si= find_if(t.begin(), t.end(), p);
printf("\n");
return si;
}
int main()
{
int i;
int n;
list<int> l;
set<int> s;
for(i=0; i<10; i++) {
l.push_back(i);
}
for(i=0; i<10; i++) {
s.insert(i);
}
ls(&l);
ls(&s);
list<int>::iterator li;
set<int>::iterator si;
li = find(l.begin(), l.end(), 9);
if(li!=l.end()) printf("%d\n", *li);
si = s.find(9);
if(si!=s.end()) printf("%d\n", *si);
Predict p(9);
((list<TYPE>::iterator&)li) =find2((list<TYPE>*)&l, p);
if(li!=l.end()) printf("%d\n", *li);
(set<TYPE>::iterator&)si = ((set<TYPE>&)s).find(TYPE(9));
if(si!=s.end()) printf("%d\n", *si);
return 0;
}
The output is:
{ 0 1 2 3 4 5 6 7 8 9 }
{ 0 1 2 3 4 5 6 7 8 9 }
9
9
copy constructor
copy constructor
0 9. 1 9. 2 9. 3 9. 4 9. 5 9. 6 9. 7 9. 8 9. 9 9. ~ 9
~ 9
9
{3.9} {5.9} {7.9} {8.9} {9.9} {9.9} ~ 9
9
~ 9
We performed two groups of type casts, treating the int container as if it were a container of integer. Since integer is a direct wrapper of int with identical binary representation, this cast is valid at runtime. After the conversion, the overloaded functions reveal the comparison traces for list::find and set::find.
list::find scans the entire container and compares each element sequentially. In contrast, set::find only compares a subset of elements. Therefore, set retrieval is faster. list::find uses operator== (via the Predict functor), while set::find uses operator<. When both a<b and b<a are false, equality is inferred. This is why in the set::find output, 9 is compared with 9 twice. Additionally, the Predict object is passed by value into find_if, and internally find_if calls another helper __find_if, so the Predict is copied twice during the process.
Finally, note that if the Predict functor always returns false, the search becomes a full traversal. In this example, we implemented a custom ls function using iterators to print all element. A lazy alterntaive is to use std::find_if directly:
struct Predict2 {
bool operator()(int v) {
printf("%d ", v);
return false;
}
};
template <class S>
void ls2(S *s)
{
S &t= *s;
printf("{ ");
find_if(t.begin(), t.end(), Predict2());
printf("}\n");
}