“Quick” Sort

Some time back, I came across a video in which then-Google CEO Eric Schmidt asked Barack Obama, “What is the most efficient way to sort a million 32-bit integers?”, to which he vaguely answered, “Probably the Bubble Sort would not be the way to go…”, and the audience started laughing. At the time I knew only two sorting techniques – Bubble Sort and Selection Sort – and both seemed to be wrong answers. So I Googled ‘Most Efficient Sorting Techniques’ and came up with three very efficient techniques – Merge Sort, Quick Sort and Heap Sort. Today, I’m going to be talking about Quick Sort.

First of all, what is sorting? Sorting is the process of arranging a list of elements in a specific order. There are many different sorting techniques – Merge Sort, Bubble Sort, Insertion Sort, Selection Sort, Heap Sort, Radix Sort and Quick Sort, to name a few. I always found Quick Sort interesting because, duh, ‘Quick’ is right there in the name. So it must be fast, and it is: Quick Sort is usually the fastest sorting technique in practice, even though its worst-case complexity is not the best.

The Quick Sort technique works on the principle of Divide and Conquer. It divides the array into sub-arrays around a pivot element and then applies the same technique to the sub-arrays until the whole array is sorted. The basic steps involved are:

  1. Choose a pivot element.
  2. Partition the array into two sub-arrays: elements smaller than the pivot to its left, and elements greater than the pivot to its right.
  3. Now repeat the steps for the sub-arrays.
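For instance, take the array [4, 1, 7, 3, 5] and pick the last element, 5, as the pivot. Partitioning moves everything smaller than 5 to its left, giving [4, 1, 3, 5, 7]: the pivot 5 is now in its final position, and the same steps are repeated on the sub-arrays [4, 1, 3] and [7].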

[Figure: animation of Quick Sort in action]

The Pivot Element

The most important step in Quick Sort is choosing the right pivot element, because this is what decides the running time of the program. The goal is to choose a pivot that gives us a balanced partition. But sometimes we may choose the pivot in such a way that we get an unbalanced partition, or even just one sub-array. Usually we choose either the first or the last element of the array as the pivot. To achieve better partitioning, we can pick a few elements from the list and use their median as the pivot. The more elements we consider, the better the pivot value, and in turn the better the partition.

After selecting a pivot, we partition the array into a left sub-array and a right sub-array. The elements of the left sub-array are smaller than the pivot and the elements of the right sub-array are greater than the pivot. Because of this, after partitioning, the pivot element is in its final place and does not need any further sorting.
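As an illustration, here is a minimal Python sketch of the median-of-three idea (the helper and its name are my own, not from any library): it samples the first, middle and last elements of the range and moves their median to the end, where the partitioning step described below expects to find the pivot.

def median_of_three(A, low, high):
    # Sample three elements and order them so that A[mid] holds their median.
    mid = (low + high) // 2
    if A[mid] < A[low]:
        A[low], A[mid] = A[mid], A[low]
    if A[high] < A[low]:
        A[low], A[high] = A[high], A[low]
    if A[high] < A[mid]:
        A[mid], A[high] = A[high], A[mid]
    # Move the median to A[high], where the partitioning step looks for the pivot.
    A[mid], A[high] = A[high], A[mid]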


Algorithm

Algorithm QuickSort(A, low, high)
//Sorts the array A using the Quick Sort technique
//Input: Array A[low…high]
//Output: Array A sorted in non-decreasing order
{
    if (low < high)
    {
        r <- Partition(A, low, high)
        QuickSort(A, low, r-1)
        QuickSort(A, r+1, high)
    }
}

Algorithm Partition(A, low, high)
//Partitions the array A[low…high] around the pivot A[high]
//Input: Array A[low…high]
//Output: Partitioned array A, with the final index of the pivot returned
{
    pivot <- A[high]
    i <- low-1
    for j <- low to high-1, do
        if A[j] <= pivot, then
            i <- i+1
            exchange A[i] <-> A[j]
    exchange A[i+1] <-> A[high]
    return i+1
}
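If you want to run the algorithm, here is a direct Python translation of the pseudocode above (a sketch of the same last-element-pivot partitioning, sorting the array in place):

def partition(A, low, high):
    # Take the last element as the pivot.
    pivot = A[high]
    i = low - 1
    # Move every element no larger than the pivot to the left side.
    for j in range(low, high):
        if A[j] <= pivot:
            i += 1
            A[i], A[j] = A[j], A[i]
    # Put the pivot into its final sorted position and return that index.
    A[i + 1], A[high] = A[high], A[i + 1]
    return i + 1

def quick_sort(A, low, high):
    # Partition, then recurse on the sub-arrays on either side of the pivot.
    if low < high:
        r = partition(A, low, high)
        quick_sort(A, low, r - 1)
        quick_sort(A, r + 1, high)

nums = [9, 4, 7, 3, 1, 8]
quick_sort(nums, 0, len(nums) - 1)
print(nums)  # prints [1, 3, 4, 7, 8, 9]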

Complexity Analysis

Best-Case:   This is when we achieve a balanced partitioning, that is, the two sub-arrays are of the same size or differ in size by one. The tree of sub-problem sizes is shown below:

[Figure: recursion tree of sub-problem sizes for the best case]

T(n) = 2T(n/2) + n ;  T(1) = 1

T(n) = 2ᵏT(n/2ᵏ) + kn ;  n/2ᵏ = 1 ⇒ n = 2ᵏ ⇒ k = log₂ n

T(n) = nT(n/n) + n log₂ n = nT(1) + n log₂ n

T(n) ∈ Θ(n log₂ n)
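For example, with n = 8 the sub-problem sizes halve from 8 to 4 to 2 to 1, so there are k = log₂ 8 = 3 levels of halving, and the partitioning work at each level adds up to about n.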

Worst-Case:   This happens when the partitions are completely unbalanced, for example when the list is already sorted. The pivot element is then either the smallest or the largest element in the list, resulting in only one sub-array in every recursive call. If the original call takes cn time for some constant c, the recursive call on n-1 elements takes c(n-1) time, the recursive call on n-2 elements takes c(n-2) time, and so on. Here’s a tree of the sub-problem sizes with their partitioning times:

[Figure: recursion tree of sub-problem sizes and partitioning times for the worst case]

When we total up the partitioning times for each level, we get

cn + c(n-1) + c(n-2) + … + 2c = c(n + (n-1) + (n-2) + … + 2) = c((n+1)(n/2) − 1)
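As a quick sanity check, for n = 4 the sum is c(4 + 3 + 2) = 9c, which matches c((4+1)(4/2) − 1) = c(10 − 1) = 9c.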

We have some low-order terms and constant coefficients, but when we use big-Θ notation, we ignore them. In big-Θ notation, Quick Sort’s worst-case running time is Θ(n²).

Average-Case:   The average-case running time of Quick Sort is much closer to the best case than to the worst case. It occurs when the partitioning is unbalanced but we still get two sub-arrays in each recursive call. For example, suppose the partitioning algorithm always produces a 9-to-1 proportional split; the tree of sub-problem sizes would then look like this:

[Figure: recursion tree for a 9-to-1 split]

It does not always have to be a 9-to-1 split; it could be a 3-to-1 split or anything else. A tree for the 3-to-1 split sub-problems is shown below:

[Figure: recursion tree for a 3-to-1 split]

As the figures show, after about log n levels we get down to a sub-problem size of 1 (the logic behind this is explained in the next section). The effort at each level is at most n, so the total time complexity is O(n log n).
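For example, with a 9-to-1 split the larger side shrinks by a factor of 10/9 at each level, so for n = 1,000,000 it takes about 131 levels (the log of 10⁶ to the base 10/9) to reach size 1, which is still proportional to log n.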

Depth of a Tree

The degree of a tree is the maximum number of children that a node in the tree can have. So in a binary tree, the degree is 2, because each node has at most two children. The depth of a node is its distance from the root node. So the depth of the root node is 0, and let’s say it has i children, where i is the degree of the tree. Those i nodes are at depth 1, and they in turn have i children each.

  • So at depth 0, we have 1 node, or i⁰
  • At depth 1, we have i nodes, or i¹
  • At depth 2, we have i × i nodes, or i²

So, in general, the number of nodes at depth d would be iᵈ.

[Figure: levels of a binary tree]

Now, the depth of a tree is the distance between its root node and its farthest leaf. So if a tree has depth d, we can say n = iᵈ, or d = logᵢ n, where n is the number of nodes at depth d.

Therefore, in general, we can say that the depth of a tree is proportional to log n.
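For example, a binary tree (i = 2) with n = 8 nodes at its deepest level has depth d = log₂ 8 = 3.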

As we can see, even though the worst-case time complexity of Merge Sort is better than that of Quick Sort, Quick Sort performs much faster in the average case. Also, Merge Sort divides the list, sorts the pieces and then merges them before ending up with the sorted list, whereas Quick Sort only needs to partition the array, and the sorting happens as part of the partitioning. Therefore, Quick Sort is the fastest sorting technique in its average case.
