A Novel In-Place Sorting Algorithm with O(n log z) Comparisons and O(n log z) Moves

In-place sorting algorithms play an important role in many fields such as very large database systems, data warehouses, data mining, etc. Such algorithms maximize the size of data that can be processed in main memory without input/output operations. In this paper, a novel in-place sorting algorithm is presented. The algorithm comprises two phases; rearranging the input unsorted array in place, resulting segments that are ordered relative to each other but whose elements are yet to be sorted. The first phase requires linear time, while, in the second phase, elements of each segment are sorted inplace in the order of z log (z), where z is the size of the segment, and O(1) auxiliary storage. The algorithm performs, in the worst case, for an array of size n, an O(n log z) element comparisons and O(n log z) element moves. Further, no auxiliary arithmetic operations with indices are required. Besides these theoretical achievements of this algorithm, it is of practical interest, because of its simplicity. Experimental results also show that it outperforms other in-place sorting algorithms. Finally, the analysis of time and space complexity, and required number of moves are presented, along with the auxiliary storage requirements of the proposed algorithm.





References:
[1] A. LaMarca and R. E. Ladner, "The influence of caches on the
performance of heaps," Journal of Experimental Algorithmics, vol. 1,
Article 4, 1996.
[2] D. Knuth, The Art of Computer Programming: Volume 3 / Sorting and
Searching, Addison-Wesley Publishing Company, 1973.
[3] Y. Azar, A. Broder, A. Karlin, and E. Upfal, "Balanced allocations," in
Proceedings of 26th ACM Symposium on the Theory of Computing,
1994, pp.593-602.
[4] Andrei Broder and Michael Mitzenmacher."Using multiple hash
functions to improve IP lookups," in Proceedings of IEEE INFOCOM,
2001.
[5] Jan Van Lunteren, "Searching very large routing tables in wide
embedded memory," in Proceedings of IEEE Globecom, November
2001.
[6] Ramesh C. Agarwal, "A super scalar sort algorithm for RISC
processors," SIGMOD Record (ACM Special Interest Group on
Management of Data), 25(2):240-246, June 1996.
[7] R. Anantha Krishna, A. Das, J. Gehrke, F. Korn, S. Muthukrishnan, and
D. Shrivastava, "Efficient approximation of correlated sums on data
streams," TKDE, 2003.
[8] A. Arasu and G. S. Manku, "Approximate counts and quantiles over
sliding windows," PODS, 2004.
[9] N. Bandi, C. Sun, D. Agrawal, and A. El Abbadi, "Hardware
acceleration in commercial databases: A case study of spatial
operations," VLDB, 2004.
[10] Peter A. Boncz, Stefan Manegold, and Martin L. Kersten, "Database
architecture optimized for the new bottleneck: Memory access," in
Proceedings of the Twenty-fifth International Conference on Very Large
Databases, 1999, pp. 54-65.
[11] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction
to Algorithms. MIT Press, Cambridge, MA, 2nd edition, 2001.
[12] Abhinandan Das, Johannes Gehrke, and Mirek Riedewald,
"Approximate join processing over data streams," in Proceedings of the
2003 ACM SIGMOD international conference on Management of data,
ACM Press, 2003, pp.40-51.
[13] A. LaMarca and R. Ladner, "The influence of caches on the performance
of sorting," in Proc. of the ACM/SIAM SODA, 1997, pp. 370-379.
[14] Stefan Manegold, Peter A. Boncz, and Martin L. Kersten, "What
happens during a join? Dissecting CPU and memory optimization
effects," in Proceedings of 26th International Conference on Very Large
Data Bases, 2000, pp. 339-350.
[15] A. Andersson, T. Hagerup, J. H┬░astad, and O. Petersson, "Tight bounds
for searching a sorted array of strings," SIAM Journal on Computing,
30(5):1552-1578, 2001.
[16] L. Arge, P. Ferragina, R. Grossi, and J.S. Vitter, "On sorting strings in
external memory," ACM STOC -97, 1997, pp.540-548.
[17] M.A. Bender, E.D. Demaine, andM. Farach-Colton, "Cache-oblivious Btrees,"
IEEE FOCS -00, 2000, pp.399-409.
[18] J.L. Bentley and R. Sedgewick, "Fast algorithms for sorting and
searching strings," ACM-SIAM SODA -97, 1997, pp.360-369.
[19] G. Franceschini, "Proximity mergesort: Optimal in-place sorting in the
cacheoblivious model," ACM-SIAM SODA -04, 2004, pp.284-292.
[20] G. Franceschini. "Sorting stably, in-place, with O(n log n) comparisons
and O(n) moves," STACS -05, 2005.
[21] G. Franceschini and V. Geffert. "An In-Place Sorting with O(n log n)
Comparisons and O(n) Moves," IEEE FOCS -03, 2003, pp. 242-250.