Sorting

This article is a chapter of Algorithms, a book I am writing on GitBook. Since it records the content of a course, the text below is in English.

Sorting refers to arranging data in a particular format. A sorting algorithm specifies the way to arrange data in a particular order (the examples in this chapter are sorted in non-decreasing order).

Before diving into the different kinds of sorting algorithms, let's have a look at some definitions:

  • in-place sorting: does not require any extra space while sorting
  • not-in-place sorting: requires extra space while sorting
  • stable sorting: equal elements keep the relative order in which they appear after sorting
  • unstable sorting: equal elements may not keep the relative order in which they appear after sorting
  • adaptive sorting algorithm: takes advantage of already 'sorted' elements and does not try to re-order them
  • non-adaptive sorting algorithm: does not take advantage of already 'sorted' elements and forces every single element to be re-ordered

With the above knowledge, we can go into the specific sorting algorithms.

NOTE: The implementation code of the algorithms will NOT appear in the following paragraphs. But at the end of this chapter, there is a link to my github where you can find all the algorithms implemented in Java.

Bubble Sort

Bubble sort is a simple sorting algorithm. It compares each pair of adjacent elements and swaps them if they are out of order. This algorithm is not suitable for large inputs, because in the worst case its time complexity is O(n^2), where n is the size of the input.

Let's walk through an example. Assume the array to be sorted is [2, 8, 4, 6, 3]. According to bubble sort, the steps are as follows:

  1. compare 2 and 8; good, they are in the right order.
  2. compare 8 and 4; oh, they are out of order, so swap them. The array is now [2, 4, 8, 6, 3].
  3. compare 8 and 6; the same as above, so the array changes to [2, 4, 6, 8, 3].
  4. compare 8 and 3, so the array becomes [2, 4, 6, 3, 8].
  5. repeat the four steps above until every item is in the position where it is supposed to be, making the array [2, 3, 4, 6, 8].

From the five steps above, we can derive the following pseudocode:

BubbleSort(Array)
    for i <- 0 to Array.length - 1
        for j <- 0 to Array.length - 1 - i
            if Array[j] > Array[j + 1]
                // assume swap(a, b) swaps the two elements
                swap(Array[j], Array[j + 1])

There are two questions: why does i go from 0 to length - 1, and why does j go from 0 to length - 1 - i?

I think the answers are as follows:

  1. the comparison is between the current element and the next element, so if the loop included the last element of the Array, an ArrayIndexOutOfBounds error would occur.
  2. every outer loop places one element in its final position, so in the inner loop we do not need to compare the elements that are already in the correct position.

From the above pseudocode, we can see a nested loop consisting of two for loops. So the time complexity should be O(n^2).
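To make the pseudocode concrete, here is a minimal Java sketch of bubble sort (the class and method names are my own; the official implementations live in the github repository linked at the end of this chapter):

```java
import java.util.Arrays;

public class BubbleSort {
    // Repeatedly compare adjacent elements and swap the out-of-order pairs;
    // after pass i, the largest i + 1 elements are already in place.
    static void bubbleSort(int[] array) {
        for (int i = 0; i < array.length - 1; i++) {
            for (int j = 0; j < array.length - 1 - i; j++) {
                if (array[j] > array[j + 1]) {
                    int tmp = array[j];
                    array[j] = array[j + 1];
                    array[j + 1] = tmp;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] array = {2, 8, 4, 6, 3};              // the example from this section
        bubbleSort(array);
        System.out.println(Arrays.toString(array)); // [2, 3, 4, 6, 8]
    }
}
```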

Insertion Sort

Insertion sort is an in-place, comparison-based sorting algorithm. Here, a sub-list is maintained which is always sorted. For any element that needs to be sorted, just find the right place (the previous element is smaller than it and the following one is equal to or greater than it), shift the following elements, and insert it.

Let's also take an example to see how insertion sort works. Assume the array to be sorted is [3, 7, 5, 1, 8, 4]. The sorting steps are as follows:

  1. take 3 into the sub-list, which means it is already sorted. Now the sub-list is [3].
  2. take 7 into the sub-list and compare it with the last element of the sub-list, which is 3. 7 is greater than 3, so just put it behind the 3. Now the sub-list is [3, 7].
  3. take 5 into the sub-list and compare it with 7. Oh no, it is smaller than 7, so compare it with the previous element, 3. We find that 5 is greater than 3, so shift 7 to the next position and insert 5. Now the sub-list is [3, 5, 7].
  4. repeat the third step to get [1, 3, 5, 7] as the sub-list.
  5. repeat until all elements are in the sub-list.
  6. finally, the sub-list should be [1, 3, 4, 5, 7, 8].

Importantly, although I use the word 'sub-list', it does not mean this algorithm requires any extra space. Insertion sort is an in-place sorting algorithm.

From the above example, we can abstract the basic steps, which apply to any array:

  1. if it is the first element, it is already sorted
  2. pick the next element
  3. compare it with the elements in the sorted list (sub-list)
  4. shift all the elements in the sorted list that are greater than the value to be sorted
  5. insert the value
  6. repeat until the array is sorted

With the abstract steps, we can write the pseudocode for this algorithm:

InsertionSort(Array)
    for i <- 1 to Array.length
        key <- Array[i]
        j <- i - 1
        while j >= 0 && key < Array[j]
            Array[j + 1] <- Array[j--]
        Array[j + 1] <- key

Looking at the pseudocode above, insertion sort is simple to implement. Let's think about its time complexity. We can see a for loop and a while loop. In the worst case, each loop needs to go through all the elements. So the time complexity of insertion sort should also be O(n^2).
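The pseudocode translates to Java directly; here is a minimal sketch (names are mine, not from the repository):

```java
import java.util.Arrays;

public class InsertionSort {
    // Grow a sorted sub-list at the front of the array: shift every sorted
    // element greater than the key one step right, then drop the key in.
    static void insertionSort(int[] array) {
        for (int i = 1; i < array.length; i++) {
            int key = array[i];
            int j = i - 1;
            while (j >= 0 && key < array[j]) {
                array[j + 1] = array[j];
                j--;
            }
            array[j + 1] = key;
        }
    }

    public static void main(String[] args) {
        int[] array = {3, 7, 5, 1, 8, 4};           // the example from this section
        insertionSort(array);
        System.out.println(Arrays.toString(array)); // [1, 3, 4, 5, 7, 8]
    }
}
```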

Selection Sort

Selection sort is a simple sorting algorithm. It is an in-place, comparison-based algorithm in which the list is divided into two parts: the sorted part at the left end and the unsorted part at the right end. Initially, the sorted part is empty and the unsorted part is the entire list.

The idea of selection sort is to select the smallest element from the unsorted list and swap it with the leftmost unsorted element. Repeating this selection, the array becomes sorted.

It is a really easy algorithm, so I don't think we need an example to show how it works; here is the pseudocode:

SelectionSort(Array)
    for i <- 0 to Array.length
        smallest <- i
        for j <- i + 1 to Array.length
            if Array[j] < Array[smallest]
                smallest <- j
        if smallest != i
            swap(Array[smallest], Array[i])

Okay, we analyse the time complexity the same way. There are two loops, and in the worst case they need to go through the whole array. So we can conclude that its time complexity should be O(n^2).
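A minimal Java sketch of selection sort, matching the pseudocode above (names are my own):

```java
import java.util.Arrays;

public class SelectionSort {
    // On each pass, find the smallest element of the unsorted right part
    // and swap it with the leftmost unsorted element.
    static void selectionSort(int[] array) {
        for (int i = 0; i < array.length - 1; i++) {
            int smallest = i;
            for (int j = i + 1; j < array.length; j++) {
                if (array[j] < array[smallest]) {
                    smallest = j;
                }
            }
            if (smallest != i) {
                int tmp = array[i];
                array[i] = array[smallest];
                array[smallest] = tmp;
            }
        }
    }

    public static void main(String[] args) {
        int[] array = {2, 8, 4, 6, 3};
        selectionSort(array);
        System.out.println(Arrays.toString(array)); // [2, 3, 4, 6, 8]
    }
}
```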

Merge Sort

Here we talk about merge sort. Before we start, we cannot miss a significant algorithm design concept named divide-and-conquer. A lot of algorithms are based on this theory, and merge sort is one of them. By the way, with a worst-case time complexity of O(n log n), merge sort is one of the most respected algorithms in the sorting world.

Broadly, we can understand divide-and-conquer approach as three step process:

  1. divide: this step involves breaking the problem into smaller sub-problems. Sub-problems should represent parts of the original problem. Generally, a recursive approach is used to divide the problem into sub-problems.
  2. conquer: this step solves the many smaller sub-problems.
  3. merge: when the smaller sub-problems are solved, this stage recursively combines their solutions until they form the answer to the original problem.

OK, let's turn back to merge sort. We need to apply the divide-and-conquer theory, which means taking the above three steps:

  1. if there is only one element in the array, it is already sorted
  2. divide the array recursively into two halves until it can no longer be divided
  3. merge the smaller arrays into a new array in sorted order

For example, we have an array [4, 5, 8, 1, 3, 9, 2, 7]. According to the rules:

  1. obviously, there is more than one element in the array, so we have to divide the original array into two arrays. Now we have array1 [4, 5, 8, 1] and array2 [3, 9, 2, 7].

  2. following the rules, we have to divide them recursively. So we will get the following sub-arrays in order:

    • [4, 5] [8, 1] [3, 9] [2, 7]
    • [4] [5] [8] [1] [3] [9] [2] [7]
  3. in this example, we recursively call the divide step three times. We can see above that we have already got separate arrays which each contain only one element. So we should combine (merge) them to build the final answer.

  4. the combination order should be as follows:

    • [4, 5] [1, 8] [3, 9] [2, 7]
    • [1, 4, 5, 8] [2, 3, 7, 9]
    • [1, 2, 3, 4, 5, 7, 8, 9]

After three combinations, we get the sorted array. But what is the mechanism of the combination step? The explanation is as follows:

  1. compare the first elements of the two adjacent arrays (they both have a pointer; let's assume one is i and the other is j), and put the smaller one into another temporary array (let's call it sorted; it is used to store the sorted elements, which means merge sort is a not-in-place sorting algorithm)
  2. advance the pointer of the array whose element was just moved into the sorted array, and compare the two elements again. Do the same thing repeatedly until one array is empty.
  3. move the remaining elements of the non-empty array to the end of the sorted array.

The words above are rather abstract, so let's see the pseudocode:

// f is the first subscript of the Array
// l is the last subscript of the Array
MergeSort(Array, f, l)
    if (f < l)
        m <- (f + l) / 2
        MergeSort(Array, f, m)
        MergeSort(Array, m + 1, l)
        Merge(Array, f, m, l)

Merge(Array, f, m, l)
    i <- f
    j <- m + 1
    k <- 0
    while i <= m && j <= l
        if (Array[i] < Array[j])
            sorted[k++] = Array[i++]
        else
            sorted[k++] = Array[j++]
    while (i <= m)
        sorted[k++] = Array[i++]
    while (j <= l)
        sorted[k++] = Array[j++]
    // assume that copy(sorted, Array) copies the sorted elements back into Array[f..l]
    copy(sorted, Array)
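Here is a minimal Java sketch of the pseudocode (names are my own; `System.arraycopy` plays the role of the assumed copy helper):

```java
import java.util.Arrays;

public class MergeSort {
    // Divide Array[f..l] into halves, sort each half recursively,
    // then merge the two sorted halves.
    static void mergeSort(int[] array, int f, int l) {
        if (f < l) {
            int m = (f + l) / 2;
            mergeSort(array, f, m);
            mergeSort(array, m + 1, l);
            merge(array, f, m, l);
        }
    }

    // Merge the sorted runs array[f..m] and array[m+1..l] through a
    // temporary array (the extra space that makes merge sort
    // not-in-place), then copy the result back.
    static void merge(int[] array, int f, int m, int l) {
        int[] sorted = new int[l - f + 1];
        int i = f, j = m + 1, k = 0;
        while (i <= m && j <= l) {
            sorted[k++] = (array[i] < array[j]) ? array[i++] : array[j++];
        }
        while (i <= m) sorted[k++] = array[i++];
        while (j <= l) sorted[k++] = array[j++];
        System.arraycopy(sorted, 0, array, f, sorted.length);
    }

    public static void main(String[] args) {
        int[] array = {4, 5, 8, 1, 3, 9, 2, 7};     // the example from this section
        mergeSort(array, 0, array.length - 1);
        System.out.println(Arrays.toString(array)); // [1, 2, 3, 4, 5, 7, 8, 9]
    }
}
```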

Quick Sort

It's time to talk about quick sort. It is a highly efficient sorting algorithm (the average-case time complexity is O(n log n), though the worst case is O(n^2)) and is based on partitioning an array of data into smaller arrays. A large array is partitioned into two arrays: one holds values smaller than a specified value, say the pivot, based on which the partition is made, and the other holds values greater than the pivot value.

From the above, we know that partitioning is an important part of quick sort. Once you finish the partition, the rest of the job is to do the partition recursively and figure out when to stop. So let's learn about partitioning.

  1. the first job is to determine a pivot. I choose the last element (you can choose any one you want, but if you do not choose the last one, you need to move it to the end of the array)
  2. create two pointers: pointer j points to the first element of the array, and pointer i is one step behind j, which means i = j - 1
  3. compare Array[j] with the pivot. If Array[j] is less than the pivot, move i forward one step, and if i and j do not point at the same element, swap them. If Array[j] is greater than or equal to the pivot, just move j forward. Repeat until j reaches position Array.length - 1, then swap the element at position i + 1 with the pivot

Using the partition algorithm recursively, we end up with the smallest possible partitions. Each partition is then processed by quick sort. We define the recursive algorithm for quicksort as below:

  1. make the right-most element the pivot
  2. partition the array using pivot value
  3. quicksort left partition recursively
  4. quicksort right partition recursively

I am almost dizzy after writing this opaque explanation; I think for a programmer the pseudocode may be easier to accept.

// f is the first subscript of the Array
// l is the last subscript of the Array
QuickSort(Array, f, l)
    if (f < l)
        m <- QuickPartition(Array, f, l)
        QuickSort(Array, f, m - 1)
        QuickSort(Array, m + 1, l)

// begin is the first subscript of the Array
// pivot is the last subscript of the Array
QuickPartition(Array, begin, pivot)
    i <- begin - 1
    for j <- begin to pivot - 1
        if Array[j] < Array[pivot]
            i <- i + 1
            if i != j
                swap(Array[i], Array[j])
    swap(Array[i + 1], Array[pivot])
    return i + 1
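The pseudocode maps to Java like this (a minimal sketch; names are mine):

```java
import java.util.Arrays;

public class QuickSort {
    static void quickSort(int[] array, int f, int l) {
        if (f < l) {
            int m = quickPartition(array, f, l);
            quickSort(array, f, m - 1);
            quickSort(array, m + 1, l);
        }
    }

    // Partition around the last element: everything smaller than the pivot
    // ends up to its left, everything greater to its right.
    // Returns the pivot's final index.
    static int quickPartition(int[] array, int begin, int pivot) {
        int i = begin - 1;
        for (int j = begin; j < pivot; j++) {
            if (array[j] < array[pivot]) {
                i++;
                if (i != j) swap(array, i, j);
            }
        }
        swap(array, i + 1, pivot);
        return i + 1;
    }

    static void swap(int[] array, int x, int y) {
        int tmp = array[x];
        array[x] = array[y];
        array[y] = tmp;
    }

    public static void main(String[] args) {
        int[] array = {4, 5, 8, 1, 3, 9, 2, 7};
        quickSort(array, 0, array.length - 1);
        System.out.println(Arrays.toString(array)); // [1, 2, 3, 4, 5, 7, 8, 9]
    }
}
```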

Heap Sort

Finally, we come to heap sort. It needs some simple knowledge of binary trees, or more precisely, complete binary trees. Let me list the things you should know:

  1. a binary tree T with n levels is complete if all levels except possibly the last are completely full, and the last level has all its nodes on the left side
  2. for any node at index i (the root node is at index 0), its left child is at index 2*i+1 and its right child is at index 2*i+2

If you want to use heap sort, you need to follow three steps:

  1. build the maximum heap, which means building the complete binary tree so that each parent node's value is always greater than or equal to its children's values
  2. swap the root node with the last element of the array (heap sort is a kind of in-place sorting) and delete that node (decrease the array size by one)
  3. maintain the root node to make sure the heap is still a maximum heap

Using recursion, we can build the algorithm easily. We can use the same maintenance method for both the first step and the third step. Let's see the pseudocode:

HeapSort(Array)
    size <- Array.length
    for i <- size / 2 - 1 downto 0
        HeapKeepMax(Array, i, size)
    for k <- Array.length - 1 downto 1
        swap(Array[k], Array[0])
        size <- size - 1
        HeapKeepMax(Array, 0, size)

// i is the node which needs to be maintained
HeapKeepMax(Array, i, size)
    leftChild <- 2 * i + 1
    rightChild <- leftChild + 1
    largest <- i
    if leftChild < size && Array[largest] < Array[leftChild]
        largest <- leftChild
    if rightChild < size && Array[largest] < Array[rightChild]
        largest <- rightChild
    if largest != i
        swap(Array[i], Array[largest])
        HeapKeepMax(Array, largest, size)
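Here is a minimal Java sketch of heap sort following the pseudocode (names are mine; note the bounds checks that keep leaf nodes from reading past the heap):

```java
import java.util.Arrays;

public class HeapSort {
    static void heapSort(int[] array) {
        int size = array.length;
        // build the max heap bottom-up, starting from the last non-leaf node
        for (int i = size / 2 - 1; i >= 0; i--) {
            heapKeepMax(array, i, size);
        }
        // repeatedly move the root (maximum) to the end and shrink the heap
        for (int k = array.length - 1; k > 0; k--) {
            swap(array, 0, k);
            size--;
            heapKeepMax(array, 0, size);
        }
    }

    // Sift node i down until the subtree rooted at i is a max heap again.
    static void heapKeepMax(int[] array, int i, int size) {
        int left = 2 * i + 1;
        int right = left + 1;
        int largest = i;
        if (left < size && array[largest] < array[left]) largest = left;
        if (right < size && array[largest] < array[right]) largest = right;
        if (largest != i) {
            swap(array, i, largest);
            heapKeepMax(array, largest, size);
        }
    }

    static void swap(int[] array, int x, int y) {
        int tmp = array[x];
        array[x] = array[y];
        array[y] = tmp;
    }

    public static void main(String[] args) {
        int[] array = {3, 7, 5, 1, 8, 4};
        heapSort(array);
        System.out.println(Arrays.toString(array)); // [1, 3, 4, 5, 7, 8]
    }
}
```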

Congratulations! You have finished the sorting chapter. Here is a link to my github repository where you can find the Java implementations of the algorithms mentioned above.

Click here to get the code. Thank you for reading.

Reference link

tutorialspoint -> data structure and algorithm -> sorting techniques
