(1) Selection Sort
Select the minimum value in an array, then put it to the front.
Time complexity .
1def selectionSort(a):
2
3 for step in range(len(a)):
4 min_idx = step
5 for i in range(step + 1, len(a)):
6 if a[i] < a[min_idx]:
7 min_idx = i
8 a[step], a[min_idx] = a[min_idx], a[step]
9
10 return a
(2) Insertion Sort
Traverse each value, put it forward if the former one is smaller until the former one is larger.
Time complexity .
x1def insertionSort(a):
2
3 for i, key in enumerate(a[1:]):
4
5 j = i - 1
6 while j >= 0 and key < a[j]:
7 a[j + 1] = a[j]
8 j = j - 1
9
10 a[j + 1] = key
11
12 return a
(3) Exchange Sort
Traverse each key in the array, swap if the key and the following key pairs are unordered. Note that this is different from bubble sort.
Time complexity .
xxxxxxxxxx
71def exchangeSort(a):
2 for i in range(len(a)):
3 for j in range(i, len(a)):
4 if a[i] > a[j]:
5 a[i], a[j] = a[j], a[i]
6
7 return a
(4) Bubble Sort: Baseline
Select each pair of keys, swap them if the former one is larger than the later one. Remember the last i
keys must be sorted in the iteration so that we can skip them.
Time complexity .
xxxxxxxxxx
61def bubbleSort(a):
2 for i in range(len(a)):
3 for j in range(0, len(a) - i - 1):
4 if a[j] > a[j + 1]:
5 a[j], a[j+1] = a[j+1], a[j]
6 return a
(5) Bubble Sort: Optimized
In the bubble sort, we can stop early when there's no more swaps in one iteration, which means the rest of the array is already sorted. We can add a swapped
variable to show if we need to go to the next iteration.
Time complexity .
xxxxxxxxxx
91def bubbleSort(a):
2 for i in range(len(a)):
3 swapped = False
4 for j in range(0, len(a) - i - 1):
5 if a[j] > a[j + 1]:
6 a[j], a[j+1] = a[j+1], a[j]
7 swapped = True
8 if not swapped: break
9 return a
(6) Merge Sort
Merge sort is a divide and conquer solution which first divide the array into pairs. Then we sort these pairs and finally merge these sorted arrays together with mergeTwoSortedArray
function of time complexity of .
In general, this sorting technique has a time complexity of .
xxxxxxxxxx
181def mergesort(data):
2
3 if len(data)==0:
4 return []
5
6 elif len(data)==1:
7 return [data[0]]
8
9 elif len(data)==2:
10 if data[0]<data[1]:
11 return data
12 return [data[1], data[0]]
13
14 else:
15 m = len(data) // 2
16 left = mergesort(data[0:m])
17 right = mergesort(data[m:])
18 return mergeTwoSortedArray(left, right)
Here is an example of mergeTwoSortedArray
function with time complexity of .
xxxxxxxxxx
241def mergeTwoSortedArray(list1, list2):
2 result = []
3
4 p1 = 0
5 p2 = 0
6
7 while True:
8 if p1 == len(list1) and p2 == len(list2):
9 return result
10 elif p1 == len(list1):
11 result.append(list2[p2])
12 p2 += 1
13 elif p2 == len(list2):
14 result.append(list1[p1])
15 p1 += 1
16 else:
17 if list1[p1] > list2[p2]:
18 result.append(list2[p2])
19 p2 += 1
20 else:
21 result.append(list1[p1])
22 p1 += 1
23
24 return result
(7) Quick Sort
Quick sort is another divide and conquer algorithm. The idea of quick sort is to select one key as pivot, then put all the keys smaller than it on its left, and put all the other keys on its right by function partition
. Then we continue this process for the left and right part until we get a sorted array.
xxxxxxxxxx
71def quickSort(a, low, high):
2 if low < high:
3 pi = partition(a, low, high)
4 quickSort(a, low, pi - 1)
5 quickSort(a, pi + 1, high)
6
7quickSort(a, 0, len(a)-1)
There are many ways to perform the partition. Here is a naive solution of time complexity .
xxxxxxxxxx
61def partition(a, lo, hi):
2 pivot = a[hi]
3 left = [a[i] for i in range(lo,hi) if a[i]<pivot]
4 right = [a[i] for i in range(lo,hi) if a[i]>=pivot]
5 a[lo:hi+1] = left + [pivot] + right
6 return lo + len(left)
Or we can also use the following partition function for partition with ime complexity , where . This method is called Lomuto partition scheme.
xxxxxxxxxx
121def partition(a, lo, hi):
2
3 pivot = a[hi]
4
5 i = lo - 1
6 for j in range(lo, hi):
7 if a[j] <= pivot:
8 i = i + 1
9 a[i], a[j] = a[j], a[i]
10 a[i+1], a[hi] = a[hi], a[i+1]
11
12 return i + 1
Let's say we know some domain knowledge of the keys we have in the array, then we can develop some heuristic sorting algorithms that is much much faster than the general theorical limition .
(1) Sorting Non-negative Integers: Pigeonhole
The idea is very simple. It is like we first count the amout of each integer we have in the list. And then we output the integers in the order from low to high with repeating times we calculated. So this is not actually sorting.
Time complexity is .
xxxxxxxxxx
131def pigeonholeSort(a):
2
3 result = []
4
5 size = max(a) + 1
6 holes = [0] * size
7 for x in a:
8 holes[x] += 1
9
10 for i in range(0,size):
11 result.extend([i] * holes[i])
12
13 return result
Note that when we have a short array with an extraordinary large maximum value, we are going to experience a huge cost of the space complexity and the algorithm will become super slow!
(2) Sorting Non-negative Integers: Bucket Sort
To deal with the problem of pigeonhole, we can use a hashtable like data structure to store the keys. The difference is that these buckets actually have an order that we can directly join the sorted keys in each bucket to get a sorted result. The worst time complexity is and the average time complexity is .
xxxxxxxxxx
151def bsort(a, nbuckets=5) -> list:
2
3 mx = max(a)
4 buckets = [[] for i in range(nbuckets)]
5 result = []
6
7 for x in a:
8 x_normalized = x / mx
9 i = int(x_normalized * (nbuckets-1))
10 buckets[i].append(x)
11
12 for i in range(nbuckets):
13 result.extend(sorted(buckets[i]))
14
15 return result
(3) Sorting Strings: Bucket Sort
Bucket sort can also be used to sort a string array. It is similar to a hashtable with index of letters. The problem is that if we have all the words start with the same letter in an array, this data structure will be useless. In order to deal with this problem, we will introduce trie structure in the next section.
xxxxxxxxxx
151def pstr_sort(a):
2
3 result = []
4
5 nbuckets = ord('z') - ord('a') + 1
6 buckets = [[] for i in range(nbuckets)]
7
8 for x in a:
9 i = ord(x[0]) - ord('a')
10 buckets[i].append(x)
11
12 for i in range(nbuckets):
13 result.extend(sorted(buckets[i]))
14
15 return result