Sorting Algorithm Performance Comparison: Bubble Sort vs Quick Sort
Timing Considerations for Sorting Algorithms
When implementing sorting algorithms and measuring their execution time, it's crucial to understand the distinction between wall-clock time and CPU time.
The clock() functon in C returns processor time rather than elapsed real time. During I/O operations or system calls like sleep(), the process yields the CPU, so clock() will not accumulate time during these waiting periods:
clock_t start, finish;
double elapsed;
start = clock();
sleep(3); // CPU is released during sleep
finish = clock();
elapsed = ((double)(finish - start)) / CLOCKS_PER_SEC;
printf("Elapsed: %lf\n", elapsed);
This code outputs a value near zero because clock() measures CPU consumption, not wall time. For measuring actual elapsed time, gettimeofday() or clock_gettime() are more appropriate on Linux systems.
Random Number Generation in C
The standard rand() function returns values in the range [0, RAND_MAX]. On Windows, RAND_MAX equals 2^15 - 1, which severely limits randomness quality. A common workaround is to combine two rand() calls:
int value = (rand() << 15) | rand();
This approach extends the range significantly, though it doesn't cover all possible values in the larger range. On Linux, RAND_MAX is typically 2^31 - 1, so for moderate data sizes, this concern is less pressing.
Array Duplication
When comparing multiple sorting algorithms on identical datasets, preserving the original array is essential. The memcpy() function from string.h provides fast memory copying:
int original[100000];
int duplicate[100000];
// populate original with random values
memcpy(duplicate, original, sizeof(original));
Bubble Sort Implementation
Bubble sort operates with O(n²) time complexity. The algorithm performs backward passes through the array, swapping adjacent elements that are out of order:
void bubble_sort(int* data, int length) {
for (int i = length - 1; i > 0; i--) {
for (int j = 0; j < i; j++) {
if (data[j] > data[j+1]) {
int temp = data[j];
data[j] = data[j+1];
data[j+1] = temp;
}
}
}
}
Optimization: XOR Swap
Memory writes during element swaps can be reduced using XOR bit manipulation:
data[j] ^= data[j+1];
data[j+1] ^= data[j];
data[j] ^= data[j+1];
This works because XOR satisfies both commutativity and associativity, and a ^ a equals zero while a ^ 0 equals a. Note that this only applies when the two memory locations are distinct.
Quick Sort Implementation
Quick sort employs a divide-and-conquer strategy, selecting a pivot and partitioning elements around it. Recursive processing continues until subarrays reach size one:
int partition(int* data, int low, int high) {
int pivot = data[high];
int i = low - 1;
for (int j = low; j < high; j++) {
if (data[j] <= pivot) {
i++;
int temp = data[i];
data[i] = data[j];
data[j] = temp;
}
}
int temp = data[i+1];
data[i+1] = data[high];
data[high] = temp;
return i + 1;
}
void quick_sort(int* data, int low, int high) {
if (low < high) {
int pivot_index = partition(data, low, high);
quick_sort(data, low, pivot_index - 1);
quick_sort(data, pivot_index + 1, high);
}
}
Output Redirection
For automated testing, freopen() redirects standard output to a file:
freopen("output.txt", "w", stdout);
// sorting operations and printf calls here
// restore interactive output
freopen("/dev/tty", "w", stdout);
Verifying Correctness
The diff utility compares two output files to confirm that both algorithms produce identical sorted results:
diff bubble_output.txt quick_output.txt
Python Implementation
Python's time.time() provides wall-clock time measurement:
import time
import random
import copy
start = time.time()
# sorting operations
end = time.time()
elapsed = end - start
print(f"Elapsed: {elapsed}")
Random number generation uses random.choices() for sequences with potential repetitions:
data = random.choices(range(100000), k=100000)
Critical: Deep Copy for Array Duplication
Python requires deep copying rather than shallow copying when duplicating arrays. Shallow copies only copy references, so modifications affect both datasets:
original = random.choices(range(100000), k=100000)
duplicate = copy.deepcopy(original)
Python Bubble Sort
def bubble_sort(arr):
n = len(arr)
for i in range(n - 1, 0, -1):
for j in range(i):
if arr[j] > arr[j + 1]:
arr[j], arr[j + 1] = arr[j + 1], arr[j]
return arr
Python Quick Sort
def quick_sort(arr):
if len(arr) <= 1:
return arr
pivot = arr[len(arr) // 2]
left = [x for x in arr if x < pivot]
middle = [x for x in arr if x == pivot]
right = [x for x in arr if x > pivot]
return quick_sort(left) + middle + quick_sort(right)
Performance Characteristics
Both implementations demonstrate quick sort's O(n log n) average complexity significantly outperforming bubble sort's O(n²) when sorting large datasets of 100,000 elements.