Hash Sort

1 year ago

Hash Sort is a very quick and simple sorting algorithm, released into the public domain. It could also be called Leaf Sort, because it split-sorts the values into equal-width bins, or Linear Sort or Function Sort, because the default implementation uses a linear function to classify values into bins (buckets).

If the number of bins is very large compared to the number of unsorted values, each value will most likely land in its own bin. The algorithm then runs in a single pass in O(n) time and needs O(n) extra storage, for example 4 times the space used by the values. If the bin count is 2, it resembles a binary search, except that it does not need sorted data and sorts the values as it goes. The algorithm is also trivially multi-threadable, because each bin is an independent sub-problem.

If the data in your application do not follow a uniform linear distribution, for example if they are Gaussian or logarithmic, that distribution can be built into the hash function. The hash function tries to fill the bins with equal numbers of items, so the sorting problem splits into as many sub-problems as there are bins, each about 1/(bin count) of the original size. For example, sorting 1000 items with 10 bins turns the problem into 10 sorts of 100 items, then 100 sorts of 10 items, and finally 1000 sorts of 1 item, at which point the array is sorted. The fastest setup, which also uses the most memory, goes directly to the 1000 sorts of 1 item.

Each bin should be either a linked list, which gives O(1) insertion and O(n) traversal of all items in arbitrary order, or a vector-like array pre-allocated to an expected size, such as n/10 items per bin.

The algorithm can be seen as an extension of quicksort with multiple pivots, one per bin boundary. With evenly filled bins, the number of levels is logarithmic in n and each level does O(n) work. The same bins can also serve as a data structure offering O(1) hash search and hash insert.
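The description above can be sketched in Python roughly as follows. This is a minimal illustration, not the author's implementation: the names `hash_sort` and `bin_count` are hypothetical, Python lists stand in for the linked-list or vector bins, and the linear hash assumes roughly uniformly distributed numeric values. Heavily skewed data will fill the bins unevenly and deepen the recursion.

```python
def hash_sort(values, bin_count=10):
    """Recursive equal-width bin sort (illustrative sketch, assumed names)."""
    if bin_count < 2:
        raise ValueError("bin_count must be at least 2")
    if len(values) <= 1:
        return list(values)

    lo, hi = min(values), max(values)
    if lo == hi:
        # All values are equal, so there is nothing left to split.
        return list(values)

    # Linear hash: map the range [lo, hi] onto bin indices 0 .. bin_count - 1.
    scale = bin_count / (hi - lo)
    bins = [[] for _ in range(bin_count)]
    for v in values:
        # Clamp so that the maximum value lands in the last bin.
        index = min(int((v - lo) * scale), bin_count - 1)
        bins[index].append(v)

    # Bins are already ordered relative to each other; sort each one
    # recursively and concatenate the results.
    result = []
    for b in bins:
        result.extend(hash_sort(b, bin_count))
    return result


print(hash_sort([5, 3, 9, 1, 7]))  # [1, 3, 5, 7, 9]
```

Because the minimum and maximum always fall into different bins, every sub-problem is strictly smaller than its parent, so the recursion terminates. To approximate the one-pass setup with 4x storage, one could call `hash_sort(data, bin_count=4 * len(data))`; values that still collide in a bin are simply split again.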
