Comparing sorting exchanges

Spherical algorithms in a vacuum are great. However, let us go down from heaven to the sinful earth and see how all this theoretical beauty will show itself in practice.

Analysis of the next sorting class will end with tests for class sorting. Today we will drive away (not in the sense of throwing it out, but in the sense of running it on tests) sorting exchanges. Sorting other classes will be banished later.

In today's race are:

The weakest group

These algorithms do not claim anything due to depressing time complexity. We test them purely for reference.

Silly sorting

Sluggish sorting

Blunt sorting

Rather, it’s interesting not how good they are, but how bad they are, and which of them is the worst.

Core group

Here gathered exchange sorts a la O ( n ² ). Dwarf sorting (and its optimization) + all sorts of variations of bubble sorting.

Dwarf sort

Dwarf sort (optimized)

Bubble Sort

Cocktail Sort

Even-odd sort

This is the most interesting part of today's measurements. It is among the representatives of the main group that the most persistent struggle is expected.

The strongest group

One type of bubble - sorting a comb - managed to overcome the O ( n ² ) barrier and reach the respectable O ( n log n ). So in our competition, quick sorting has a worthy opponent. In addition, let's test the recently invented k-sort , which is a kind of quick sort.

Hairbrush sorting

Quick sort

K-sort

The strongest, by the way, is not only comparable with each other. For them on some data sets the main group will be a serious competitor.

Source

Sort exchanges in Python

def stooge_rec(data, i = 0, j = None): if j is None: j = len(data) - 1 if data[j] < data[i]: data[i], data[j] = data[j], data[i] if j - i > 1: t = (j - i + 1) // 3 stooge_rec(data, i, j - t) stooge_rec(data, i + t, j) stooge_rec(data, i, j - t) return data def stooge(data): return stooge_rec(data, 0, len(data) - 1) def slow_rec(data, i, j): if i >= j: return data m = (i + j) // 2 slow_rec(data, i, m) slow_rec(data, m + 1, j) if data[m] > data[j]: data[m], data[j] = data[j], data[m] slow_rec(data, i, j - 1) return data def slow(data): return slow_rec(data, 0, len(data) - 1) def stupid(data): i, size = 1, len(data) while i < size: if data[i - 1] > data[i]: data[i - 1], data[i] = data[i], data[i - 1] i = 1 else: i += 1 return data def gnome(data): i, size = 1, len(data) while i < size: if data[i - 1] <= data[i]: i += 1 else: data[i - 1], data[i] = data[i], data[i - 1] if i > 1: i -= 1 return data def gnome_opt(data): i, j, size = 1, 2, len(data) while i < size: if data[i - 1] <= data[i]: i, j = j, j + 1 else: data[i - 1], data[i] = data[i], data[i - 1] i -= 1 if i == 0: i, j = j, j + 1 return data def bubble(data): changed = True while changed: changed = False for i in range(len(data) - 1): if data[i] > data[i+1]: data[i], data[i+1] = data[i+1], data[i] changed = True return data def shaker(data): up = range(len(data) - 1) while True: for indices in (up, reversed(up)): swapped = False for i in indices: if data[i] > data[i+1]: data[i], data[i+1] = data[i+1], data[i] swapped = True if not swapped: return data def odd_even(data): n = len(data) isSorted = 0 while isSorted == 0: isSorted = 1 for i in range(1, n - 1, 2): if data[i] > data[i + 1]: data[i], data[i + 1] = data[i + 1], data[i] isSorted = 0 for i in range(0, n - 1, 2): if data[i] > data[i + 1]: data[i], data[i + 1] = data[i + 1], data[i] isSorted = 0 return data def comb(data): gap = len(data) swaps = True while gap > 1 or swaps: gap = max(1, int(gap / 1.25)) # minimum gap is 1 swaps = False for i in range(len(data) - gap): j = i + gap if data[i] > data[j]: data[i], data[j] = data[j], data[i] swaps = True return data def quick(data): less = [] pivotList = [] more = [] if len(data) <= 1: return data else: pivot = data[0] for i in data: if i < pivot: less.append(i) elif i > pivot: more.append(i) else: pivotList.append(i) less = quick(less) more = quick(more) return less + pivotList + more def k(data): return k_rec(data, 0, len(data) - 1) def k_rec(data, left, right): key = data[left] i = left j = right + 1 k = left + 1 p = left + 1 temp = False while j - i >= 2: if key <= data[p]: if p != j and j != right + 1: data[j] = data[p] elif j == right + 1: temp = data[p] j -= 1 p = j else: data[i] = data[p] i += 1 k += 1 p = k data[i] = key if temp: data[i + 1] = temp if left < i - 1: k_rec(data, left, i - 1) if right > i + 1: k_rec(data, i + 1, right) return data

PHP sorting exchanges

  // function StoogeSort(&$arr, $i, $j) { if($arr[$j] < $arr[$i]) list($arr[$j], $arr[$i]) = array($arr[$i], $arr[$j]); if(($j - $i) > 1) { $t = ($j - $i + 1) / 3; StoogeSort($arr, $i, $j - $t); StoogeSort($arr, $i + $t, $j); StoogeSort($arr, $i, $j - $t); } return $arr; } // function SlowSort(&$arr, $i, $j) { if($i >= $j) return $arr; $m = ($i + $j) / 2; SlowSort($arr, $i, $m); SlowSort($arr, $m + 1, $j); if($arr[$m] > $arr[$j]) list($arr[$m], $arr[$j]) = array($arr[$j], $arr[$m]); SlowSort($arr, $i, $j - 1); return $arr; } // function StupidSort($arr) { $i = 1; $size = count($arr); while($i < $size) { if($arr[$i - 1] <= $arr[$i]){ ++$i; } else { list($arr[$i - 1], $arr[$i]) = array($arr[$i], $arr[$i - 1]); $i = 1; } } return $arr; } // function GnomeSort($arr) { $i = 1; $size = count($arr); while($i < $size) { if($arr[$i - 1] <= $arr[$i]){ ++$i; } else { list($arr[$i - 1], $arr[$i]) = array($arr[$i], $arr[$i - 1]); if($i > 1) --$i; } } return $arr; } // () function GnomeSort_opt($arr) { $i = 1; $j = 2; $size = count($arr); while($i < $size) { if($arr[$i - 1] <= $arr[$i]){ $i = $j; ++$j; } else { list($arr[$i - 1], $arr[$i]) = array($arr[$i], $arr[$i - 1]); if(--$i == 0){ $i = $j; $j++; } } } return $arr; } // function BubbleSort($arr) { $c = count($arr) - 1; do { $swapped = false; for ($i = 0; $i < $c; ++$i) { if ($arr[$i] > $arr[$i + 1]) { list($arr[$i + 1], $arr[$i]) = array($arr[$i], $arr[$i + 1]); $swapped = true; } } } while($swapped); return $arr; } // function ShakerSort($arr){ do{ $swapped = false; for($i = 0; $i < count($arr); $i++){ if(isset($arr[$i + 1])) { if($arr[$i] > $arr[$i+1]) { list($arr[$i], $arr[$i + 1]) = array($arr[$i + 1], $arr[$i]); $swapped = true; } } } if ($swapped == false) break; $swapped = false; for($i = count($arr) - 1; $i >= 0; $i--){ if(isset($arr[$i - 1])){ if($arr[$i] < $arr[$i - 1]) { list($arr[$i], $arr[$i - 1]) = array($arr[$i - 1], $arr[$i]); $swapped = true; } } } } while($swapped); return $arr; } //- function OddEvenSort($arr){ $n = count($arr); $sorted = false; while(!$sorted) { $sorted = true; for($i = 1; $i < $n - 2; $i += 2) { if($arr[$i] > $arr[$i + 1]) { list($arr[$i], $arr[$i + 1]) = array($arr[$i + 1], $arr[$i]); $sorted = false; } } for($i = 0; $i < $n - 1; $i += 2) { if($arr[$i] > $arr[$i + 1]) { list($arr[$i], $arr[$i + 1]) = array($arr[$i + 1], $arr[$i]); $sorted = false; } } } } // function CombSort($arr){ $gap = count($arr); $swap = true; while($gap > 1 || $swap){ if($gap > 1) $gap /= 1.25; $swap = false; $i = 0; while($i + $gap < count($arr)){ if($arr[$i] > $arr[$i + $gap]){ list($arr[$i], $arr[$i + $gap]) = array($arr[$i + $gap], $arr[$i]); $swap = true; } ++$i; } } return $arr; } // function QuickSort($arr) { $loe = $gt = array(); if(count($arr) < 2) { return $arr; } $pivot_key = key($arr); $pivot = array_shift($arr); foreach($arr as $val){ if($val <= $pivot){ $loe[] = $val; }elseif ($val > $pivot){ $gt[] = $val; } } return array_merge(QuickSort($loe), array($pivot_key => $pivot), QuickSort($gt)); } //k- function K_Sort($arr) { K_Sort_rec($arr, 0, count($arr) - 1); return $arr; } function K_Sort_rec(&$arr, $left, $right) { $key = $arr[$left]; $i = $left; $j = $right + 1; $k = $p = $left + 1; $temp = false; while($j - $i >= 2) { if($key <= $arr[$p]) { if(($p != $j) && ($j != ($right + 1))) { $arr[$j] = $arr[$p]; } elseif($j == ($right + 1)) { $temp = $arr[$p]; } --$j; $p = $j; } else { $arr[$i] = $arr[$p]; ++$i; ++$k; $p = $k; } } $arr[$i] = $key; if($temp) $arr[$i + 1] = $temp; if($left < ($i - 1)) K_Sort_rec($arr, $left, $i - 1); if($right > ($i + 1)) K_Sort_rec($arr, $i + 1, $right); }

Time

Time is everywhere measured in seconds with an accuracy of microseconds.

If the algorithm is fed immediately with a pack of arrays (out of ten, one hundred, or even a million pieces), then the total time is indicated.

Iron - my faithful old laptop, who knew the best of times. So, it is quite possible that everything will work much faster for you.

Worst of the worst

Immediately deal with outsiders. For them, prepared arrays of 40/50/60 elements, containing random numbers from 0 to 9.

type = random, unique = False, min = 0, max = 9, count = 1
size
40		50		60
Python	Php	Python	Php	Python	Php
0.004	0.001	0.005	0.003	0.007	0.004
0.007	0.009	0.018	0.009	0.018	0.023
0.02	8.26647	0.053	20.8722	0.12901	45.9786

Blunt sorting is an order of magnitude faster than silly, and silly an order of magnitude faster than sluggish. There is nothing more to watch.

Middling

To test middling, packs of one hundred arrays of 100 elements each (unique numbers from 100 to 999), as well as packs of ten arrays of 1000 elements each (unique numbers from 1000 to 9999) are generated.

Randomized arrays, almost sorted arrays, and reverse-ordered arrays were prepared.

Serednyachki: Random

type = random, unique = True
size ^{(from min to max)} × count
100 ^{(from 100 to 999)} × 100		1000 ^{(1000 to 9999)} × 10
Python	Php	Python	Php
0.14101	0.18901	1.59109	1.7251
0.20601	0.22301	2.33013	2.09712
0.15301	0.22901	1.6711	2.23613
0.21601	0.26402	2.55515	2.73116
0.16701	0.33402	1.7251	3.13218

Approximately the same results for all. Implementations on Python are slightly faster than PHP.

Mets: Almost sorted

type = almost unique = true
size ^{(from min to max)} × count
100 ^{(from 100 to 999)} × 100		1000 ^{(1000 to 9999)} × 10
Python	Php	Python	Php
0.009	0.005	0.009	0.005
0.009	0.014	0.01	0.014
0.01	0.01	0.011	0.008
0.011	0.008	0.013	0.008
0.011	0.017	0.013	0.017

The same results for all, plus or minus. From the interesting: partial data ordering improves the performance of all algorithms tenfold.

Serednyachki: Reverse

type = reverse, unique = True
size ^{(from min to max)} × count
100 ^{(from 100 to 999)} × 100		1000 ^{(1000 to 9999)} × 10
Python	Php	Python	Php
0.26801	0.35902	3.10018	3.4292
0.39602	0.45303	4.49726	4.19524
0.22701	0.38402	2.48114	3.64421
0.30202	0.42603	3.34419	4.17524
0.30402	0.64404	3.36519	6.22036

Here is an interesting conclusion - the reverse ordering does not really affect the speed, which has fallen by only 2 times compared with the random number.

The Middle Games: Binary

type = random, unique = False, min = 0, max = 1
size × count
100 × 100		1000 × 10
Python	Php	Python	Php
0.073	0.094	0.81105	0.86905
0.10401	0.11301	1.16307	1.06606
0.08801	0.12901	0.86405	1.18207
0.11501	0.15001	1.30107	1.41608
0.11401	0.25801	1.31908	2.46314

From more or less interesting: if instead of numbers from 100 to 9999 to sort zeros and ones, then the speed indicators will not grow much.

Champions

Here the main intrigue is whether the sorting with a comb and k-sorting will be able to press the fast one off the pedestal?

Champions: Random

To find out, we will create packages of randomly arranged arrays: 10 pieces of 10 thousand elements and 10 pieces of 100 thousand elements each. I wanted to give millionaires to the input algorithms, but I didn’t pull the laptop. That RAM is not enough, the depth of recursion is too great.

type = random, unique = False, count = 10
size ^{(from min to max)}
10,000 ^{(from 10,000 to 99999)}		100,000 ^{(from 100,000 to 999999)}
Python	Php	Python	Php
0.80205	0.63104	10.4506	8.24647
0.47703	1.64009	6.65138	26.5435
0.90005	2.18613	15.8089	29.4867

K-sorting on Python is slower, and PHP is faster than quick sort.
Sorting with a comb in comparison with a fast one looks solid, although it is inferior to it in all tests (and in these and subsequent ones) for any data sets.

However, there is a comb and an important distinct advantage. It has no recursion and therefore, unlike its colleagues, it successfully processes arrays consisting of millions of elements.

Special cases

Although champions are hundreds of times faster than average ones, on some specific data sets, sorting data more simply shows that they are not embroidered as well.

Special cases: Almost sorted

Let's try almost sorted arrays, just take a larger size than those that we tested for average ones.

A package of 10 arrays of 10 thousand each and a package of 10 arrays of 100 thousand elements each.

type = almost, unique = False
size ^{(from min to max)} × count
10,000 ^{(from 10,000 to 99999)} × 10		100,000 ^{(from 100,000 to 999999)} × 10
Python	Php	Python	Php
0.43202	0.058	0.81705	0.58003
0.084	0.062	0.85805	0.54003
0.11601	0.12401	1.41908	1.36008
0.14001	0.11901	1.83411	1.41708
0.13801	0.23101	1.6491	2.56915
?	122.088	?	?
0.71504	1.58909	9.78356	19.4731

Quick sorting is not present here at all - there was not enough RAM. At the same time, randomized arrays for 10/100 thousand quicksort elements process with a bang. k-sort faced similar problems, pulled only 10-thousandth of PHP. Large, almost sorted arrays are a degenerate case for quick sorting and its modifications. Because of this arrangement of elements, recursion divides an array into parts for almost every pair of adjacent elements, which leads to the maximum possible number of recursion calls.

By the way, the situation is similar with large reverse-ordered arrays. The same problem arises as with almost ordered - the algorithm has to call itself for each pair of adjacent elements.

Sorting a comb, though it works one and a half to two times slower than quick or k (in general), but it does not have the above-mentioned memory overflows. No recursion - no problem.

Well, we note that all middlings unanimously overtook the champions on large, almost sorted arrays. Here the principle has worked: "Hush and you go - you will continue."

Special cases: Million scarlet roses of small arrays

Small arrays - element middle. If you need to handle a huge number of small sets of numbers, you can get a time gain by taking a "primitive" bubble or dwarf sort. And quick sorting and others like it to use for larger tasks.

type = random, unique = True
size ^{(from min to max)} × count
5 ^{(from 10 to 99)} × 1000000		10 ^{(10 to 99)} × 1,000,000
Python	Php	Python	Php
11.77267	17.888	24.29039	33.6659
12.51872	18.165	29.33268	35.202
15.44188	17.861	30.06672	36.7911
14.18681	19.7831	31.65981	39.3533
13.63978	24.3374	28.41662	52.864
14.21881	21.1452	25.80548	32.5419
14.58683	28.5016	24.85942	50.3139
18.53806	30.7238	29.6647	58.2493

With exchange sortings everything is clear. Now let's get down to the really interesting - sorting inserts.

Links

Excel-AlgoLab application , with which you can step by step to view the visualization of these (and not only these) sorts.

All arrays in the form of JSON are also laid out on the Google disk .

Wiki / Wiki - Silly / Stooge , Slow , Gnome / Gnome , Bubble / Bubble , Shaker / Shaker , Odd-Even / Odd-Even , Comb / Fast, Quick

Series articles

Excel application AlgoLab.xlsm
Exchange sorting
- Comparing sorting exchanges
Sorting inserts
Sort by selection
Merge sort

Source: https://habr.com/ru/post/415691/

All Articles