\documentclass[a4paper, 11pt]{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{amsmath, amssymb}
\usepackage{graphicx}
\usepackage{siunitx}
\usepackage{booktabs}
\usepackage{float}
\usepackage[makestderr]{pythontex}

\title{Computer Science: Sorting Algorithm Analysis and Comparison}
\author{Computational Science Templates}
\date{\today}

\begin{document}
\maketitle

\begin{abstract}
This document presents a comprehensive analysis of sorting algorithms, including comparison-based sorts (QuickSort, MergeSort, HeapSort, InsertionSort, BubbleSort) and non-comparison sorts (CountingSort, RadixSort). We implement each algorithm in Python, measure empirical performance across different input sizes and distributions, analyze time and space complexity, and compare stability and practical applicability. The analysis demonstrates the trade-offs between sorting strategies and guides algorithm selection for specific use cases.
\end{abstract}

\section{Introduction}
Sorting is one of the most fundamental operations in computer science, with applications ranging from database systems to computational geometry. Understanding both the theoretical complexity and the practical performance of different sorting algorithms is essential for efficient algorithm design and selection.

\section{Mathematical Framework}

\subsection{Comparison-Based Sorting Lower Bound}
Any comparison-based sorting algorithm requires $\Omega(n \log n)$ comparisons in the worst case, as established by the decision-tree model.
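The bound follows from a leaf-counting argument: a decision tree that sorts $n$ distinct keys must contain at least $n!$ leaves (one per permutation), while a binary tree of height $h$ has at most $2^h$ leaves. The worst-case number of comparisons $h$ therefore satisfies
\begin{equation}
h \;\ge\; \log_2(n!) \;\ge\; \log_2\!\left(\left(\tfrac{n}{2}\right)^{n/2}\right) = \frac{n}{2}\log_2\frac{n}{2} = \Omega(n \log n).
\end{equation}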

\subsection{Time Complexity Summary}
\begin{align}
\text{QuickSort: } &O(n \log n) \text{ average}, \; O(n^2) \text{ worst} \\
\text{MergeSort: } &O(n \log n) \text{ in all cases} \\
\text{HeapSort: } &O(n \log n) \text{ in all cases} \\
\text{InsertionSort: } &O(n^2) \text{ worst}, \; O(n) \text{ best} \\
\text{BubbleSort: } &O(n^2) \text{ worst}, \; O(n) \text{ best (with early exit)} \\
\text{CountingSort: } &O(n + k) \text{ where } k \text{ is the key range} \\
\text{RadixSort: } &O(d(n + k)) \text{ where } d \text{ is the number of digits}
\end{align}
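
For the divide-and-conquer bounds above, the MergeSort recurrence shows where the $n \log n$ term originates: with $n$ a power of two, each of the $\log_2 n$ levels of recursion performs $cn$ total merging work, so
\begin{equation}
T(n) = 2\,T\!\left(\frac{n}{2}\right) + cn \quad\Longrightarrow\quad T(n) = cn\log_2 n + n\,T(1) = O(n \log n).
\end{equation}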

\section{Computational Analysis}

\begin{pycode}
import numpy as np
import time
import matplotlib.pyplot as plt
import sys
sys.setrecursionlimit(10000)
plt.rc('text', usetex=True)
plt.rc('font', family='serif')

np.random.seed(42)

# Store results
results = {}

# Sorting algorithm implementations
def quicksort(arr):
    """QuickSort with median-of-three pivot selection"""
    if len(arr) <= 1:
        return arr
    if len(arr) <= 10:
        return insertion_sort(arr)

    # Median-of-three pivot (the swaps below also reorder the caller's list in place)
    mid = len(arr) // 2
    if arr[0] > arr[mid]:
        arr[0], arr[mid] = arr[mid], arr[0]
    if arr[0] > arr[-1]:
        arr[0], arr[-1] = arr[-1], arr[0]
    if arr[mid] > arr[-1]:
        arr[mid], arr[-1] = arr[-1], arr[mid]
    pivot = arr[mid]

    # Note: this out-of-place partition allocates O(n) auxiliary lists per call;
    # the O(log n) space figure quoted later refers to the classical in-place scheme.
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quicksort(left) + middle + quicksort(right)

def mergesort(arr):
    """MergeSort - stable, O(n log n)"""
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = mergesort(arr[:mid])
    right = mergesort(arr[mid:])
    return merge(left, right)

def merge(left, right):
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])
    result.extend(right[j:])
    return result

def heapsort(arr):
    """HeapSort - O(n log n), O(1) auxiliary space (sorts a copy to leave the input untouched)"""
    arr = arr.copy()
    n = len(arr)

    def heapify(arr, n, i):
        largest = i
        left = 2 * i + 1
        right = 2 * i + 2
        if left < n and arr[left] > arr[largest]:
            largest = left
        if right < n and arr[right] > arr[largest]:
            largest = right
        if largest != i:
            arr[i], arr[largest] = arr[largest], arr[i]
            heapify(arr, n, largest)

    # Build max heap
    for i in range(n // 2 - 1, -1, -1):
        heapify(arr, n, i)

    # Extract elements
    for i in range(n - 1, 0, -1):
        arr[0], arr[i] = arr[i], arr[0]
        heapify(arr, i, 0)

    return arr

def insertion_sort(arr):
    """InsertionSort - stable, adaptive"""
    arr = arr.copy()
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key
    return arr

def bubble_sort(arr):
    """BubbleSort - stable, simple, with early exit on a pass without swaps"""
    arr = arr.copy()
    n = len(arr)
    for i in range(n):
        swapped = False
        for j in range(0, n - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        if not swapped:
            break
    return arr

def counting_sort(arr):
    """CountingSort - stable, O(n+k)"""
    if not arr:
        return arr
    min_val, max_val = min(arr), max(arr)
    range_val = max_val - min_val + 1
    count = [0] * range_val
    output = [0] * len(arr)

    for num in arr:
        count[num - min_val] += 1

    for i in range(1, len(count)):
        count[i] += count[i - 1]

    # Place elements right-to-left so equal keys keep their input order (stability)
    for num in reversed(arr):
        output[count[num - min_val] - 1] = num
        count[num - min_val] -= 1

    return output

def radix_sort(arr):
    """RadixSort using CountingSort as subroutine"""
    if not arr:
        return arr
    max_val = max(arr)
    exp = 1
    arr = arr.copy()

    while max_val // exp > 0:
        arr = counting_sort_radix(arr, exp)
        exp *= 10

    return arr

def counting_sort_radix(arr, exp):
    """Stable counting sort on the decimal digit selected by exp"""
    n = len(arr)
    output = [0] * n
    count = [0] * 10

    for num in arr:
        index = (num // exp) % 10
        count[index] += 1

    for i in range(1, 10):
        count[i] += count[i - 1]

    for i in range(n - 1, -1, -1):
        index = (arr[i] // exp) % 10
        output[count[index] - 1] = arr[i]
        count[index] -= 1

    return output
\end{pycode}
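
As a quick sanity check (a minimal sketch that reuses only the functions defined above; the variable names below are illustrative), every implementation can be compared against Python's built-in \texttt{sorted} on a random integer array:

\begin{pycode}
# Sanity check: each implementation must agree with Python's built-in sorted()
check_arr = list(np.random.randint(0, 1000, 500))
check_expected = sorted(check_arr)
all_sorts = {'QuickSort': quicksort, 'MergeSort': mergesort, 'HeapSort': heapsort,
             'InsertionSort': insertion_sort, 'BubbleSort': bubble_sort,
             'CountingSort': counting_sort, 'RadixSort': radix_sort}
all_ok = all(func(check_arr.copy()) == check_expected for func in all_sorts.values())
print(r'All %d implementations agree with the built-in \texttt{sorted} '
      r'on a 500-element random array: \textbf{%s}.' % (len(all_sorts), all_ok))
\end{pycode}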

\subsection{Performance Measurement}

\begin{pycode}
# Test configurations
sizes = [100, 500, 1000, 2000, 5000]
algorithms = {
    'QuickSort': quicksort,
    'MergeSort': mergesort,
    'HeapSort': heapsort,
    'InsertionSort': insertion_sort,
}

# Performance measurement
performance = {name: [] for name in algorithms}
performance['BubbleSort'] = []

for n in sizes:
    arr = list(np.random.randint(0, 10000, n))

    for name, func in algorithms.items():
        t0 = time.time()
        func(arr.copy())
        performance[name].append((time.time() - t0) * 1000)

    # BubbleSort only for smaller sizes
    if n <= 2000:
        t0 = time.time()
        bubble_sort(arr.copy())
        performance['BubbleSort'].append((time.time() - t0) * 1000)
    else:
        performance['BubbleSort'].append(np.nan)

# Store results for reporting (index 2 corresponds to n = 1000)
results['quick_1000'] = performance['QuickSort'][2]
results['merge_1000'] = performance['MergeSort'][2]
results['heap_1000'] = performance['HeapSort'][2]
results['insertion_1000'] = performance['InsertionSort'][2]
results['bubble_1000'] = performance['BubbleSort'][2]
\end{pycode}
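
The measurements above use single \texttt{time.time()} calls, which is adequate for the trends shown but noisy for small inputs. A more robust variant (a sketch; the helper \texttt{time\_sort} below is not used by the preceding measurements) takes the median of repeated runs with \texttt{time.perf\_counter}:

\begin{pycode}
# Sketch of a lower-noise timing helper: median of repeated runs, perf_counter clock
def time_sort(func, arr, repeats=5):
    """Return the median wall-clock time in ms over several runs of func(arr)."""
    times = []
    for _ in range(repeats):
        data = arr.copy()
        t0 = time.perf_counter()
        func(data)
        times.append((time.perf_counter() - t0) * 1000)
    return sorted(times)[repeats // 2]

demo_arr = list(np.random.randint(0, 10000, 1000))
print(r'Median-of-5 QuickSort time at $n = 1000$: \SI{%.3f}{\milli\second}.'
      % time_sort(quicksort, demo_arr))
\end{pycode}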

\subsection{Algorithm Comparison Visualization}

\begin{pycode}
fig, axes = plt.subplots(2, 3, figsize=(15, 10))

# Linear scale comparison
ax1 = axes[0, 0]
for name in ['QuickSort', 'MergeSort', 'HeapSort']:
    ax1.plot(sizes, performance[name], 'o-', linewidth=2, markersize=6, label=name)
ax1.set_xlabel('Array Size')
ax1.set_ylabel('Time (ms)')
ax1.set_title('$O(n \\log n)$ Algorithms')
ax1.legend()
ax1.grid(True, alpha=0.3)

# O(n^2) algorithms
ax2 = axes[0, 1]
ax2.plot(sizes, performance['InsertionSort'], 'o-', linewidth=2, label='InsertionSort')
ax2.plot(sizes[:4], performance['BubbleSort'][:4], 's-', linewidth=2, label='BubbleSort')
ax2.set_xlabel('Array Size')
ax2.set_ylabel('Time (ms)')
ax2.set_title('$O(n^2)$ Algorithms')
ax2.legend()
ax2.grid(True, alpha=0.3)

# Log-log scale
ax3 = axes[0, 2]
for name in ['QuickSort', 'MergeSort', 'InsertionSort']:
    ax3.loglog(sizes, performance[name], 'o-', linewidth=2, label=name)
# Theoretical curves, scaled to match the measurements at n = 1000 (index 2)
n_theory = np.array(sizes)
nlogn = n_theory * np.log(n_theory)
nlogn = nlogn / nlogn[2] * performance['QuickSort'][2]
n2 = n_theory**2
n2 = n2 / n2[2] * performance['InsertionSort'][2]
ax3.loglog(n_theory, nlogn, 'k--', alpha=0.5, label='$O(n\\log n)$')
ax3.loglog(n_theory, n2, 'k:', alpha=0.5, label='$O(n^2)$')
ax3.set_xlabel('Array Size')
ax3.set_ylabel('Time (ms)')
ax3.set_title('Log-Log Scale Comparison')
ax3.legend()
ax3.grid(True, alpha=0.3)

# Best/Average/Worst case for QuickSort
ax4 = axes[1, 0]
quick_random = []
quick_sorted = []
quick_reverse = []

for n in sizes:
    # Random
    arr = list(np.random.randint(0, 10000, n))
    t0 = time.time()
    quicksort(arr)
    quick_random.append((time.time() - t0) * 1000)

    # Nearly sorted: shuffle the first 10% of the elements.
    # (Shuffling the slice expression directly would act on a copy, so write it back.)
    arr = list(range(n))
    prefix = arr[:n // 10]
    np.random.shuffle(prefix)
    arr[:n // 10] = prefix
    t0 = time.time()
    quicksort(arr)
    quick_sorted.append((time.time() - t0) * 1000)

    # Reverse sorted (worst for naive quicksort, but our version handles it)
    arr = list(range(n, 0, -1))
    t0 = time.time()
    quicksort(arr)
    quick_reverse.append((time.time() - t0) * 1000)

ax4.plot(sizes, quick_random, 'b-o', linewidth=2, label='Random')
ax4.plot(sizes, quick_sorted, 'g-s', linewidth=2, label='Nearly Sorted')
ax4.plot(sizes, quick_reverse, 'r-^', linewidth=2, label='Reverse')
ax4.set_xlabel('Array Size')
ax4.set_ylabel('Time (ms)')
ax4.set_title('QuickSort: Input Distribution Effect')
ax4.legend()
ax4.grid(True, alpha=0.3)

# InsertionSort on different distributions
ax5 = axes[1, 1]
ins_random = []
ins_sorted = []
ins_reverse = []

for n in sizes[:4]:  # Limit size for O(n^2)
    # Random
    arr = list(np.random.randint(0, 10000, n))
    t0 = time.time()
    insertion_sort(arr)
    ins_random.append((time.time() - t0) * 1000)

    # Sorted
    arr = list(range(n))
    t0 = time.time()
    insertion_sort(arr)
    ins_sorted.append((time.time() - t0) * 1000)

    # Reverse
    arr = list(range(n, 0, -1))
    t0 = time.time()
    insertion_sort(arr)
    ins_reverse.append((time.time() - t0) * 1000)

ax5.plot(sizes[:4], ins_random, 'b-o', linewidth=2, label='Random')
ax5.plot(sizes[:4], ins_sorted, 'g-s', linewidth=2, label='Sorted')
ax5.plot(sizes[:4], ins_reverse, 'r-^', linewidth=2, label='Reverse')
ax5.set_xlabel('Array Size')
ax5.set_ylabel('Time (ms)')
ax5.set_title('InsertionSort: Input Distribution Effect')
ax5.legend()
ax5.grid(True, alpha=0.3)

# Speedup over BubbleSort
ax6 = axes[1, 2]
speedups = {}
for name in ['QuickSort', 'MergeSort', 'HeapSort']:
    speedups[name] = [performance['BubbleSort'][i] / performance[name][i]
                      if not np.isnan(performance['BubbleSort'][i]) else np.nan
                      for i in range(len(sizes))]
    ax6.plot(sizes[:4], speedups[name][:4], 'o-', linewidth=2, label=name)

ax6.set_xlabel('Array Size')
ax6.set_ylabel('Speedup vs BubbleSort')
ax6.set_title('Algorithm Speedup')
ax6.legend()
ax6.grid(True, alpha=0.3)

results['quicksort_speedup_2000'] = speedups['QuickSort'][3]

plt.tight_layout()
plt.savefig('sorting_comparison.pdf', bbox_inches='tight', dpi=150)
print(r'\begin{figure}[H]')
print(r'\centering')
print(r'\includegraphics[width=\textwidth]{sorting_comparison.pdf}')
print(r'\caption{Sorting algorithm comparison: (a) $O(n\log n)$ algorithms, (b) $O(n^2)$ algorithms, (c) log-log scale, (d) QuickSort distributions, (e) InsertionSort distributions, (f) speedup analysis.}')
print(r'\label{fig:comparison}')
print(r'\end{figure}')
plt.close()
\end{pycode}

\subsection{Non-Comparison Based Sorting}

\begin{pycode}
# Test non-comparison sorts
sizes_nc = [1000, 5000, 10000, 50000, 100000]
counting_times = []
radix_times = []
quick_times_large = []

for n in sizes_nc:
    arr = list(np.random.randint(0, 10000, n))

    # CountingSort
    t0 = time.time()
    counting_sort(arr)
    counting_times.append((time.time() - t0) * 1000)

    # RadixSort
    t0 = time.time()
    radix_sort(arr)
    radix_times.append((time.time() - t0) * 1000)

    # QuickSort for comparison
    t0 = time.time()
    quicksort(arr)
    quick_times_large.append((time.time() - t0) * 1000)

fig, axes = plt.subplots(1, 2, figsize=(12, 5))

# Linear time sorts
ax1 = axes[0]
ax1.plot(sizes_nc, counting_times, 'b-o', linewidth=2, label='CountingSort')
ax1.plot(sizes_nc, radix_times, 'g-s', linewidth=2, label='RadixSort')
ax1.plot(sizes_nc, quick_times_large, 'r-^', linewidth=2, label='QuickSort')
ax1.set_xlabel('Array Size')
ax1.set_ylabel('Time (ms)')
ax1.set_title('Non-Comparison vs Comparison Sorts')
ax1.legend()
ax1.grid(True, alpha=0.3)

results['counting_100k'] = counting_times[-1]
results['radix_100k'] = radix_times[-1]
results['quick_100k'] = quick_times_large[-1]

# Effect of value range on CountingSort
ax2 = axes[1]
ranges = [100, 1000, 10000, 100000, 1000000]
counting_range_times = []
n_fixed = 10000

for r in ranges:
    arr = list(np.random.randint(0, r, n_fixed))
    t0 = time.time()
    counting_sort(arr)
    counting_range_times.append((time.time() - t0) * 1000)

ax2.semilogx(ranges, counting_range_times, 'b-o', linewidth=2)
ax2.set_xlabel('Value Range (k)')
ax2.set_ylabel('Time (ms)')
ax2.set_title(f'CountingSort: Effect of Range (n={n_fixed})')
ax2.grid(True, alpha=0.3)

plt.tight_layout()
plt.savefig('sorting_linear.pdf', bbox_inches='tight', dpi=150)
print(r'\begin{figure}[H]')
print(r'\centering')
print(r'\includegraphics[width=\textwidth]{sorting_linear.pdf}')
print(r'\caption{Non-comparison sorting: (a) performance comparison with QuickSort, (b) CountingSort dependence on value range.}')
print(r'\label{fig:linear}')
print(r'\end{figure}')
plt.close()
\end{pycode}

\subsection{Stability and Memory Analysis}

\begin{pycode}
# Demonstrate stability (illustrative helper; the stability/space tables below are set analytically)
class KeyOnly:
    """Wrapper that compares on .key only, so tie order exposes (in)stability."""
    def __init__(self, key, tag):
        self.key, self.tag = key, tag
    def __lt__(self, other): return self.key < other.key
    def __le__(self, other): return self.key <= other.key
    def __gt__(self, other): return self.key > other.key
    def __eq__(self, other): return self.key == other.key

def test_stability(sort_func, name):
    """Empirical stability check: equal keys must keep their original order."""
    if name in ['CountingSort', 'RadixSort']:
        # Integer-only implementations; their stability follows from the
        # right-to-left placement pass in counting_sort / counting_sort_radix.
        return 'N/A'
    keys = [3, 1, 3, 2, 1, 2, 3, 1, 2, 3, 1, 2]         # > 10 elements, so QuickSort
    data = [KeyOnly(k, i) for i, k in enumerate(keys)]  # uses its real partition
    tags = [r.tag for r in sort_func(data)]
    expected = [i for _, i in sorted((k, i) for i, k in enumerate(keys))]
    return 'Stable' if tags == expected else 'Unstable'

# Memory usage comparison (theoretical)
memory_complexity = {
    'QuickSort': '$O(\\log n)$',
    'MergeSort': '$O(n)$',
    'HeapSort': '$O(1)$',
    'InsertionSort': '$O(1)$',
    'BubbleSort': '$O(1)$',
    'CountingSort': '$O(n + k)$',
    'RadixSort': '$O(n + k)$'
}

stability = {
    'QuickSort': 'No',
    'MergeSort': 'Yes',
    'HeapSort': 'No',
    'InsertionSort': 'Yes',
    'BubbleSort': 'Yes',
    'CountingSort': 'Yes',
    'RadixSort': 'Yes'
}

# Visualization of sorting process
fig, axes = plt.subplots(2, 3, figsize=(15, 10))

# Sample array for visualization
sample = list(np.random.randint(1, 50, 20))

# Before sorting
ax1 = axes[0, 0]
ax1.bar(range(len(sample)), sample, color='steelblue', edgecolor='black', alpha=0.7)
ax1.set_xlabel('Index')
ax1.set_ylabel('Value')
ax1.set_title('Unsorted Array')
ax1.grid(True, alpha=0.3)

# After QuickSort
ax2 = axes[0, 1]
sorted_sample = quicksort(sample)
colors = plt.cm.viridis(np.linspace(0, 1, len(sorted_sample)))
ax2.bar(range(len(sorted_sample)), sorted_sample, color=colors, edgecolor='black')
ax2.set_xlabel('Index')
ax2.set_ylabel('Value')
ax2.set_title('After Sorting')
ax2.grid(True, alpha=0.3)

# Comparison count estimation
ax3 = axes[0, 2]
n_vals = np.array([10, 50, 100, 500, 1000])
comparisons_nlogn = n_vals * np.log2(n_vals)
comparisons_n2 = n_vals * (n_vals - 1) / 2
ax3.semilogy(n_vals, comparisons_nlogn, 'b-o', linewidth=2, label='$O(n\\log n)$')
ax3.semilogy(n_vals, comparisons_n2, 'r-s', linewidth=2, label='$O(n^2)$')
ax3.set_xlabel('Array Size')
ax3.set_ylabel('Number of Comparisons')
ax3.set_title('Theoretical Comparison Count')
ax3.legend()
ax3.grid(True, alpha=0.3)

# Small array performance (where O(n^2) can win)
ax4 = axes[1, 0]
small_sizes = [5, 10, 20, 30, 50, 100]
small_quick = []
small_insertion = []
small_merge = []

for n in small_sizes:
    arr = list(np.random.randint(0, 1000, n))

    # Multiple runs for small arrays
    runs = 100
    t0 = time.time()
    for _ in range(runs):
        quicksort(arr.copy())
    small_quick.append((time.time() - t0) / runs * 1000)

    t0 = time.time()
    for _ in range(runs):
        insertion_sort(arr.copy())
    small_insertion.append((time.time() - t0) / runs * 1000)

    t0 = time.time()
    for _ in range(runs):
        mergesort(arr.copy())
    small_merge.append((time.time() - t0) / runs * 1000)

ax4.plot(small_sizes, small_quick, 'b-o', linewidth=2, label='QuickSort')
ax4.plot(small_sizes, small_insertion, 'g-s', linewidth=2, label='InsertionSort')
ax4.plot(small_sizes, small_merge, 'r-^', linewidth=2, label='MergeSort')
ax4.set_xlabel('Array Size')
ax4.set_ylabel('Time (ms)')
ax4.set_title('Small Array Performance')
ax4.legend()
ax4.grid(True, alpha=0.3)

# Nearly sorted arrays (best case for adaptive sorts)
ax5 = axes[1, 1]
nearly_sizes = [100, 500, 1000, 2000]
swap_percentages = []

for n in nearly_sizes:
    # Create nearly sorted array by swapping 5% of the elements
    arr = list(range(n))
    num_swaps = n // 20
    for _ in range(num_swaps):
        i, j = np.random.randint(0, n, 2)
        arr[i], arr[j] = arr[j], arr[i]

    t_ins = time.time()
    insertion_sort(arr.copy())
    t_ins = (time.time() - t_ins) * 1000

    t_quick = time.time()
    quicksort(arr.copy())
    t_quick = (time.time() - t_quick) * 1000

    swap_percentages.append((n, t_ins, t_quick))

ns, t_ins_list, t_quick_list = zip(*swap_percentages)
ax5.plot(ns, t_ins_list, 'g-o', linewidth=2, label='InsertionSort')
ax5.plot(ns, t_quick_list, 'b-s', linewidth=2, label='QuickSort')
ax5.set_xlabel('Array Size')
ax5.set_ylabel('Time (ms)')
ax5.set_title('Nearly Sorted Arrays (5\\% swapped)')
ax5.legend()
ax5.grid(True, alpha=0.3)

# Algorithm selection guide
ax6 = axes[1, 2]
scenarios = ['Small\n($<$50)', 'General\nPurpose', 'Stable\nNeeded', 'Memory\nLimited', 'Integer\nKeys']
recommendations = ['Insertion', 'Quick', 'Merge', 'Heap', 'Radix']
colors_rec = ['#FF9999', '#99FF99', '#9999FF', '#FFFF99', '#FF99FF']
bars = ax6.bar(scenarios, [1]*5, color=colors_rec, edgecolor='black')
ax6.set_ylabel('Recommendation')
ax6.set_title('Algorithm Selection Guide')
for bar, rec in zip(bars, recommendations):
    height = bar.get_height()
    ax6.text(bar.get_x() + bar.get_width()/2., height/2,
             rec, ha='center', va='center', fontsize=10, fontweight='bold')
ax6.set_ylim(0, 1.2)
ax6.set_yticks([])

plt.tight_layout()
plt.savefig('sorting_analysis.pdf', bbox_inches='tight', dpi=150)
print(r'\begin{figure}[H]')
print(r'\centering')
print(r'\includegraphics[width=\textwidth]{sorting_analysis.pdf}')
print(r'\caption{Sorting analysis: (a) unsorted array, (b) sorted array, (c) comparison counts, (d) small array performance, (e) nearly sorted arrays, (f) selection guide.}')
print(r'\label{fig:analysis}')
print(r'\end{figure}')
plt.close()
\end{pycode}

\section{Results and Discussion}

\subsection{Performance at $n = 1000$}

\begin{table}[H]
\centering
\caption{Sorting Algorithm Performance ($n = 1000$, random integer keys)}
\label{tab:performance}
\begin{tabular}{lcccc}
\toprule
\textbf{Algorithm} & \textbf{Time (ms)} & \textbf{Complexity (average)} & \textbf{Stable} & \textbf{Space} \\
\midrule
QuickSort & \py{f"{results['quick_1000']:.3f}"} & $O(n \log n)$ & No & $O(\log n)$ \\
MergeSort & \py{f"{results['merge_1000']:.3f}"} & $O(n \log n)$ & Yes & $O(n)$ \\
HeapSort & \py{f"{results['heap_1000']:.3f}"} & $O(n \log n)$ & No & $O(1)$ \\
InsertionSort & \py{f"{results['insertion_1000']:.3f}"} & $O(n^2)$ & Yes & $O(1)$ \\
BubbleSort & \py{f"{results['bubble_1000']:.3f}"} & $O(n^2)$ & Yes & $O(1)$ \\
\bottomrule
\end{tabular}
\end{table}

\subsection{Large-Scale Performance ($n = 100{,}000$)}

For large arrays of integers drawn from a limited range (keys in $[0, 10000)$):
\begin{itemize}
\item CountingSort: \py{f"{results['counting_100k']:.2f}"} ms
\item RadixSort: \py{f"{results['radix_100k']:.2f}"} ms
\item QuickSort: \py{f"{results['quick_100k']:.2f}"} ms
\end{itemize}

\subsection{Speedup Analysis}

At $n = 2000$, QuickSort achieves a \py{f"{results['quicksort_speedup_2000']:.1f}"}$\times$ speedup over BubbleSort on random input.

\subsection{Algorithm Selection Guidelines}

\begin{enumerate}
\item \textbf{General purpose}: QuickSort (fast average case, good cache performance)
\item \textbf{Guaranteed $O(n \log n)$}: MergeSort or HeapSort
\item \textbf{Stability required}: MergeSort
\item \textbf{Memory constrained}: HeapSort ($O(1)$ space)
\item \textbf{Small arrays}: InsertionSort (low overhead)
\item \textbf{Nearly sorted}: InsertionSort ($O(n)$ best case)
\item \textbf{Integer keys, limited range}: CountingSort or RadixSort
\end{enumerate}
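
These guidelines can be condensed into a small dispatch helper. The sketch below is typeset only (not executed); the helper name \texttt{choose\_sort} and its keyword flags are illustrative rather than part of the template, and it simply encodes the rules above in terms of the functions defined earlier.

\begin{pyverbatim}
def choose_sort(n, nearly_sorted=False, need_stable=False,
                memory_limited=False, small_integer_range=False):
    """Pick a sorting routine for the given input characteristics."""
    if n < 50 or nearly_sorted:
        return insertion_sort      # low overhead; O(n) on nearly sorted input
    if small_integer_range:
        return counting_sort       # O(n + k); radix_sort for wider integer ranges
    if need_stable:
        return mergesort           # stable with guaranteed O(n log n)
    if memory_limited:
        return heapsort            # O(n log n) using O(1) auxiliary space
    return quicksort               # best average-case constant factors
\end{pyverbatim}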

\section{Conclusion}
This analysis demonstrated the implementation and comparison of major sorting algorithms. Key findings include:
\begin{enumerate}
\item $O(n \log n)$ algorithms dramatically outperform $O(n^2)$ algorithms for large inputs
\item QuickSort typically provides the best practical performance due to cache efficiency
\item MergeSort guarantees $O(n \log n)$ and stability but requires $O(n)$ extra space
\item HeapSort provides $O(n \log n)$ with $O(1)$ space but poor cache performance
\item InsertionSort excels for small or nearly sorted arrays
\item Non-comparison sorts achieve $O(n)$ for integer keys with limited range
\item Algorithm selection depends on input characteristics, memory constraints, and stability requirements
\end{enumerate}

Understanding these trade-offs enables optimal algorithm selection for specific use cases.

\end{document}