Detailed explanation of the direct insertion sort algorithm and a Java implementation

高洛峰

Release: 2017-01-19 09:30:51

Direct insertion sort

The idea of direct insertion sort is easy to understand. It is as follows:
1. Divide the array to be sorted into a sorted part and an unsorted part. Initially, the first element alone is considered sorted.
2. Starting from the second element, find the appropriate position for the element in the sorted subarray and insert it there.
3. Repeat this process until the last element has been inserted into the sorted subarray.
4. Sorting is complete.

Example:
The idea is simple, but the code is not as easy to write as bubble sort. First, how do you determine the appropriate position? Is it the position where the element is greater than or equal to its left neighbor and less than or equal to its right neighbor? Written that way, there are many boundary conditions to consider and too many checks. Second, inserting an element into an array inevitably requires moving many elements. How should that movement be controlled?
Actually, this is not a problem with the algorithm itself but with expressing it in a programming language. Even a mature algorithm often needs slight adjustment when it is implemented in a specific language. Since we are discussing Java here, let's talk about the Java implementation.
To solve the problems above, we refine the second step slightly. Instead of comparing from the start of the sorted subarray, we compare from its end and move toward the front. As long as an element is larger than the value to be inserted, we shift it one position toward the back. When we reach an element that is no larger than the value (or the start of the array), we place the value in the vacated position. This gives the following code:
InsertArray.java

public class InsertArray {
  // Backing array
  private long[] arr;
 
  // Number of valid elements in the array
  private int elems;
 
  // Default constructor
  public InsertArray() {
    arr = new long[50];
  }
 
  public InsertArray(int max) {
    arr = new long[max];
  }
 
  // Append a value
  public void insert(long value) {
    arr[elems] = value;
    elems++;
  }
 
  // Display the data
  public void display() {
    for (int i = 0; i < elems; i++) {
      System.out.print(arr[i] + " ");
    }
    System.out.println();
  }
 
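  // Insertion sort (ascending)
  public void insertSort() {
    long select = 0L;
    for(int i = 1; i < elems; i++) {
      // arr[0..i-1] is already sorted; select is the next value to insert
      select = arr[i];
      int j = 0;
      // Shift strictly larger elements one slot to the right.
      // Using > rather than >= leaves equal elements in place, which keeps the sort stable.
      for(j = i;j > 0 && arr[j - 1] > select; j--) {
        arr[j] = arr[j - 1];
      }
      arr[j] = select;
    }
  }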
}

Test class:
TestInsertArray.java

public class TestInsertArray {
  public static void main(String[] args) {
    InsertArray iArr = new InsertArray();
    iArr.insert(85);
    iArr.insert(7856);
    iArr.insert(12);
    iArr.insert(8);
    iArr.insert(5);
    iArr.insert(56);
 
    iArr.display();
    iArr.insertSort();
    iArr.display();
  }
 
}
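
To see the sorted part grow round by round, the following is a minimal sketch that runs the same kind of backward-scanning loop on a plain array and records a snapshot after each insertion. The class name InsertionTrace and the method sortWithTrace are hypothetical names used only for illustration, and the loop uses a strict comparison when shifting.

```java
import java.util.Arrays;

// Hypothetical helper class for illustration; not part of the article's code.
class InsertionTrace {
    // Sorts a in ascending order and returns one snapshot per round,
    // formatted as "sorted part | unsorted part".
    public static String[] sortWithTrace(long[] a) {
        String[] snapshots = new String[Math.max(a.length - 1, 0)];
        for (int i = 1; i < a.length; i++) {
            long current = a[i];          // first element of the unsorted part
            int j = i;
            // Shift larger elements of the sorted part one slot to the right.
            while (j > 0 && a[j - 1] > current) {
                a[j] = a[j - 1];
                j--;
            }
            a[j] = current;               // place the value in the vacated slot
            // a[0..i] is now sorted; a[i+1..] is still unsorted.
            snapshots[i - 1] = Arrays.toString(Arrays.copyOfRange(a, 0, i + 1))
                    + " | " + Arrays.toString(Arrays.copyOfRange(a, i + 1, a.length));
        }
        return snapshots;
    }

    public static void main(String[] args) {
        long[] data = {85, 7856, 12, 8, 5, 56};
        for (String s : sortWithTrace(data)) {
            System.out.println(s);
        }
        // Last line printed: [5, 8, 12, 56, 85, 7856] | []
    }
}
```

Each printed line shows the sorted subarray to the left of the bar, one element longer than in the previous round.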

Algorithm performance/complexity
Let's discuss the time complexity of direct insertion sort. Regardless of the input, the algorithm always performs n-1 insertion rounds. However, the insertion point of each element depends heavily on the input data, so the cost is not fixed. We discuss the best, worst, and average cases.
1. Best case: The algorithm does best when the array is already in the required order (ascending, in our discussion). In that case each element needs only one comparison and no element is moved, so the time complexity is O(n).
2. Worst case: The worst case is when the array is in reverse order. If the rounds are numbered so that round i places the i-th element, round i needs i-1 comparisons and i assignments, for 2i-1 operations. The total is the sum of 2i-1 for i from 1 to n, which is n^2, so the time complexity is O(n^2).
3. Average case: From the analysis above, the number of operations in the average case is approximately n^2/2 (note: this counts assignments and comparisons; counting moves and comparisons instead gives approximately n^2/4). The time complexity is still O(n^2).
As for space complexity, all movement happens within the array itself. The only overhead is one temporary variable that holds the value being inserted (some data structure books call it a "sentinel"). Therefore the extra space is O(1).
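
The best and worst cases can be checked empirically. The following is a rough sketch, assuming we count only key comparisons and element shifts (the bookkeeping assignments to the temporary variable are left out, so the exact totals differ slightly from the accounting above, though the growth rate is the same). InsertionCost and sortAndCount are hypothetical names.

```java
// Hypothetical helper for illustration; not part of the article's code.
class InsertionCost {
    // Sorts a in place and returns {comparisons, shifts}.
    public static long[] sortAndCount(long[] a) {
        long comparisons = 0;
        long shifts = 0;
        for (int i = 1; i < a.length; i++) {
            long current = a[i];
            int j = i;
            while (j > 0) {
                comparisons++;             // one key comparison: a[j-1] vs current
                if (a[j - 1] <= current) {
                    break;                 // insertion point found
                }
                a[j] = a[j - 1];           // shift one slot to the right
                shifts++;
                j--;
            }
            a[j] = current;
        }
        return new long[] {comparisons, shifts};
    }

    public static void main(String[] args) {
        int n = 8;
        long[] ascending = new long[n];
        long[] descending = new long[n];
        for (int k = 0; k < n; k++) {
            ascending[k] = k;
            descending[k] = n - k;
        }
        long[] best = sortAndCount(ascending);
        long[] worst = sortAndCount(descending);
        // Sorted input: n-1 comparisons and no shifts.
        System.out.println("best:  " + best[0] + " comparisons, " + best[1] + " shifts");
        // Reversed input: n(n-1)/2 comparisons and n(n-1)/2 shifts.
        System.out.println("worst: " + worst[0] + " comparisons, " + worst[1] + " shifts");
    }
}
```

With n = 8 this reports 7 comparisons and 0 shifts for the sorted input, and 28 comparisons and 28 shifts for the reversed input, which matches linear growth in the best case and quadratic growth in the worst case.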

Algorithm stability
The algorithm only shifts elements that are strictly larger than the value being inserted and never swaps distant elements, so a value is never moved past another value equal to it. Direct insertion sort is therefore a stable sorting method.
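
Stability is easier to see when each key carries a tag that records its input position. The following is a small sketch under that assumption: each row is a pair {key, tag}, and the class and method names (StableInsertionDemo, sortByKey) are hypothetical.

```java
import java.util.Arrays;

// Hypothetical demo class for illustration; not part of the article's code.
class StableInsertionDemo {
    // Sorts rows by rows[i][0] (the key); rows[i][1] is a tag recording input order.
    public static void sortByKey(int[][] rows) {
        for (int i = 1; i < rows.length; i++) {
            int[] current = rows[i];
            int j = i;
            // Strict '>': rows with an equal key are never shifted past 'current'.
            while (j > 0 && rows[j - 1][0] > current[0]) {
                rows[j] = rows[j - 1];
                j--;
            }
            rows[j] = current;
        }
    }

    public static void main(String[] args) {
        int[][] rows = { {5, 0}, {3, 1}, {5, 2}, {3, 3} };
        sortByKey(rows);
        System.out.println(Arrays.deepToString(rows));
        // prints [[3, 1], [3, 3], [5, 0], [5, 2]]
    }
}
```

Rows with equal keys come out with their tags still in ascending order. Changing the comparison to `>=` would shift equal keys past the inserted row and break this property.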

Algorithm variant
If there is a lot of data to sort, scanning from back to front for every element is costly. To speed up the search, binary search can be used to locate the insertion point. Binary search needs only O(log n) comparisons per lookup, so it greatly reduces the number of comparisons when there is a lot of data or the input is close to the worst case. The number of element moves is unchanged, so the overall complexity remains O(n^2). Some books call this method half insertion sort; it is more commonly known as binary insertion sort. Its code is somewhat more complicated, and I will post it when I have time.
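
As a rough illustration of the binary search variant (this is a sketch and not the author's promised implementation), the search can look for the first position whose value is greater than the one being inserted, then shift the tail with System.arraycopy. The class name BinaryInsertionSort is hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch of binary (half) insertion sort; not the article's code.
class BinaryInsertionSort {
    public static void sort(long[] a) {
        for (int i = 1; i < a.length; i++) {
            long current = a[i];
            // Binary search in the sorted prefix a[0..i) for the first index
            // whose value is greater than current. Stopping after equal
            // values keeps the sort stable.
            int lo = 0;
            int hi = i;
            while (lo < hi) {
                int mid = (lo + hi) >>> 1;
                if (a[mid] <= current) {
                    lo = mid + 1;
                } else {
                    hi = mid;
                }
            }
            // Shift a[lo..i) one slot to the right, then insert.
            System.arraycopy(a, lo, a, lo + 1, i - lo);
            a[lo] = current;
        }
    }

    public static void main(String[] args) {
        long[] data = {85, 7856, 12, 8, 5, 56};
        sort(data);
        System.out.println(Arrays.toString(data));
        // prints [5, 8, 12, 56, 85, 7856]
    }
}
```

The search costs O(log n) comparisons per element, but the shift still moves up to i elements, which is why the overall bound does not improve.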
In addition, there are 2-way insertion sort and table insertion sort. 2-way insertion sort improves further on half (binary) insertion sort and greatly reduces the number of moves, to about n^2/8. However, it does not eliminate moves, nor does it lower the order of complexity. Table insertion sort changes the storage structure entirely: it does not move records but maintains a linked list, replacing record moves with pointer updates. Its complexity is therefore still O(n^2).
For 2-way insertion sort and table insertion sort, you can refer to the book "Data Structure" edited by Yan Weimin and Wu Weimin.

Applicable scenarios of the algorithm
Because of its O(n^2) complexity, insertion sort is not suitable for large arrays. It is a good choice when the amount of data is small, however, and it is commonly used as a supplement to quick sort. For example, the sort algorithm in STL and the qsort algorithm in stdlib use insertion sort alongside quick sort to handle small numbers of elements. As another example, in the JDK 7 implementation of the sort method of java.util.Arrays, insertion sort is used when the length of the array to be sorted is less than 47.
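
The hybrid approach can be sketched as a quick sort that hands small subarrays to insertion sort. This is a simplified illustration and not the JDK or STL implementation: the class name HybridSort is hypothetical, and the cutoff of 16 is an arbitrary value chosen for the example (real libraries tune their own thresholds).

```java
import java.util.Arrays;

// Hypothetical sketch of quick sort with an insertion sort cutoff.
class HybridSort {
    // Assumed cutoff for illustration only.
    static final int CUTOFF = 16;

    public static void sort(long[] a) {
        quickSort(a, 0, a.length - 1);
    }

    // Sorts a[lo..hi] (inclusive bounds).
    static void quickSort(long[] a, int lo, int hi) {
        if (hi - lo + 1 <= CUTOFF) {
            insertionSort(a, lo, hi);      // small (or empty) range
            return;
        }
        long pivot = a[lo + (hi - lo) / 2];
        int i = lo;
        int j = hi;
        // Partition: move values < pivot to the left, > pivot to the right.
        while (i <= j) {
            while (a[i] < pivot) i++;
            while (a[j] > pivot) j--;
            if (i <= j) {
                long t = a[i];
                a[i] = a[j];
                a[j] = t;
                i++;
                j--;
            }
        }
        quickSort(a, lo, j);
        quickSort(a, i, hi);
    }

    // Direct insertion sort restricted to a[lo..hi].
    static void insertionSort(long[] a, int lo, int hi) {
        for (int i = lo + 1; i <= hi; i++) {
            long current = a[i];
            int j = i;
            while (j > lo && a[j - 1] > current) {
                a[j] = a[j - 1];
                j--;
            }
            a[j] = current;
        }
    }

    public static void main(String[] args) {
        long[] data = new long[40];
        for (int k = 0; k < data.length; k++) {
            data[k] = (k * 37L) % 41;      // arbitrary scrambled values for the demo
        }
        sort(data);
        System.out.println(Arrays.toString(data));
    }
}
```

Below the cutoff, the low constant overhead of insertion sort tends to outweigh its quadratic growth, which is the reason library sorts switch to it for short ranges.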
