Showing posts with label merge sort. Show all posts
Showing posts with label merge sort. Show all posts

Monday, March 28, 2011

Mergesort in Java

A simple implementation of Mergesort in Java.
Mergesort algorithm is used in Java SE Collections.sort() method.
Provided implementation has an optimization which prevents already ordered arrays to be merged thus guarantees O(n*log n) time performance.

Mergesort is particularly useful for "online" sorting which means that all amount of data is not known in advance and is becoming available to sorting mechanism chunk-by-chunk.
In this case algorithm can sort each new chunk and then effectively merge it into existing sorted structure. For such cases mergesort is much more effective than quicksort.

One of the "cons" for mergesort is that it uses O(n) auxiliary memory in distinct of quicksort which operates in constant O(1) memory.

It is possible to implement merge sort to be in-place (use constant auxiliary memory) and stay stable, however these algorithms are overcomplicated and their complexity usually is not balanced by their benefit.

public class Mergesort {
 public static int[] sort(int... array) {
  return new Mergesort().sort(array, 0, array.length - 1);  
 }
 
 private int[] sort(int[] array, int start, int end) {
  if (start >= end) {
   return array;   
  }
  int pivot = start + (end - start) / 2 + 1;
//  At this point end may become bigger than start because of integer
//  division. If pivot == start => pivot - 1 < start
  sort(array, start, pivot - 1);
  sort(array, pivot, end);
  merge(array, start, pivot, end);
  return array;
 }

 public int[] merge(int[] array, int start, int pivot, int end) {
  if (start == end) {
//   array of one element is already sorted
   return array;
  }
  
  if (array[pivot - 1] <= array[pivot]) {
//   Handling the case when array is already properly sorted. This optimization
//   guarantees O(n*log n) sorting time 
   return array;
  }
  
  int length = end - start + 1;
  int[] merged = new int[length];
  for (int i = start, j = pivot, storeIndex = 0; i < pivot || j <= end; storeIndex++) {
   if (i == pivot) {
    merged[storeIndex] = array[j++];
   } else if (j > end) {
    merged[storeIndex] = array[i++];
   } else {
    merged[storeIndex] = array[i] <= array[j] ? array[i++] : array[j++];
   }
  }
  System.arraycopy(merged, 0, array, start, length);
  return array;
 }
}

Don't trust Wiki blindly

Sometimes Wiki contains inconsistencies even for basic computer science articles. Wiki page for merge sort states that Java SE 7 uses merge sort as a default implementation for Arrays.sort() method. However that's not true.


Oracle Java SE 7 documentation for Arrays.sort() clearly states that the default implementation is based on tuned Double-Pivot version of Quicksort.

This is important if some applications depend on stable sorting algorithms.
Merge sort is stable, quicksort is not (in most implementations).