Simplified Template Implementation of the C++ Standard Library Vector
Unlike the basic string implementation we built earlier, the vector here will be a template class. Separating declarations and definitions of template classes causes linker errors, so we implement both in a single vector.h file—this mirrors how standard library implementations (e.g., SGI STL for GCC) handle it, placing small inline functions (like size and begin) inside the class and large ones (like _insert_aux) outside.
1. Learning Vector Structure from Source Code
Both string and vector use contiguous memory for underlying storage. For this guide, we focus on the SGI STL version (open-source for learning), skipping more complex commercial implementations like VS's PJ STL.
Key findings from stl_vector.h source code:
- Vector’s core uses three raw-pointer-based iterators:
_start,_finish, and_end_of_storage. _startpoints to the first element,_finishto the element after the last valid entry, and_end_of_storageto the memory adress after the allocated capacity. This aligns with how STL containers define ranges as [begin, end).- Memory allocation relies on an internal memory pool for efficiency, but we will use standard
new/deletefor simplicity.
2. Implementing Core Vector Members
2.1 Core Data Members & Iterators
template <typename T>
class Vector {
public:
// Iterator definition (raw pointer for contiguous storage)
typedef T* iterator;
typedef const T* const_iterator;
// Inline accessor functions (standard in library vector)
iterator begin() { return _start; }
iterator end() { return _finish; }
const_iterator begin() const { return _start; }
const_iterator end() const { return _finish; }
size_t size() const { return _finish - _start; }
size_t capacity() const { return _end_of_storage - _start; }
T& operator[](size_t index) {
assert(index < size());
return _start[index];
}
const T& operator[](size_t index) const {
assert(index < size());
return _start[index];
}
// Constructor & Destructor
Vector() : _start(nullptr), _finish(nullptr), _end_of_storage(nullptr) {}
~Vector() {
if (_start) {
delete[] _start;
_start = _finish = _end_of_storage = nullptr;
}
}
// Reserve capacity (only expands if n > current capacity)
void reserve(size_t new_cap) {
if (new_cap <= capacity()) return;
size_t old_sz = size();
T* temp = new T[new_cap];
if (_start) {
// Copy existing elements
memcpy(temp, _start, sizeof(T) * old_sz);
delete[] _start;
}
_start = temp;
_finish = temp + old_sz;
_end_of_storage = temp + new_cap;
}
// Push back an element
void push_back(const T& value) {
if (_finish == _end_of_storage) {
size_t new_cap = capacity() == 0 ? 4 : 2 * capacity();
reserve(new_cap);
}
*_finish = value;
++_finish;
}
// Pop back the last element
void pop_back() {
assert(size() > 0);
--_finish;
}
// Insert single element at position
iterator insert(iterator pos, const T& value) {
assert(pos >= begin() && pos < end());
// Handle capacity expansion
if (_finish == _end_of_storage) {
size_t offset = pos - _start;
size_t new_cap = capacity() == 0 ? 4 : 2 * capacity();
reserve(new_cap);
// Recompute position after memory reallocation
pos = begin() + offset;
}
// Shift elements right
for (iterator it = end(); it > pos; --it) {
*it = *(it - 1);
}
*pos = value;
++_finish;
return pos; // Return updated iterator to avoid external iterator invalidation
}
// Remove single element at position
iterator erase(iterator pos) {
assert(pos >= begin() && pos < end());
// Shift elements left
for (iterator it = pos; it + 1 < end(); ++it) {
*it = *(it + 1);
}
--_finish;
return pos;
}
private:
iterator _start; // Points to first element
iterator _finish; // Points to position after last valid element
iterator _end_of_storage; // Points to position after allocated capacity
};
2.2 Iterator Invalidation
Capacity Expantion
Calling push_back or insert on a full vector triggers a memory reallocation (reserve). This invalidates all existing iterators because:
- Original memory is freed after copying data.
- External iterators still point to the old (deleted) memory.
The insert function in our code avoids internal iterator invalidation by re-calculating pos after reallocation and returns an updated iterator for external use.
After Erase
Even without reallocation, erase invalidates all iterators pointing to or after the erased position. The C++ standard explicitly bans using these invalidated iterators.
3. Using the Vector
A basic test function demonstrates core operations:
#include <iostream>
#include <cassert>
#include "vector.h"
int main() {
Vector<int> vec;
// Push back elements
vec.push_back(1);
vec.push_back(2);
vec.push_back(3);
vec.push_back(4);
vec.push_back(5); // Triggers reserve (capacity 4 → 8)
// Test operator[] and size
assert(vec.size() == 5);
assert(vec[3] == 4);
// Test insert
auto it = vec.insert(vec.begin() + 2, 6); // Should return iterator to 6
assert(*it == 6);
assert(vec.size() == 6);
// Test range-based for loop
int sum = 0;
for (int num : vec) sum += num;
assert(sum == 1+2+6+3+4+5);
// Test erase
it = vec.erase(vec.begin() + 3); // Erase 3, returns iterator to 4
assert(*it == 4);
assert(vec.size() == 5);
std::cout << "All test cases passed!" << std::endl;
return 0;
}
4. Finding Elements with STL Algorithm
Unlike string, vector uses a generic STL find function (std::find from <algorithm>) instead of member methods, highlighting template reusability:
#include <algorithm>
// Find element 6
auto find_it = std::find(vec.begin(), vec.end(), 6);
if (find_it != vec.end()) {
std::cout << "Found value at index: " << find_it - vec.begin() << std::endl;
}
Key Note: std::find returns vec.end() if the element is not found, which must be checked before dereferencing.