In the shortest space possible, this article reviews the characteristics, implementations, and performance of all the collection and concurrent collection classes. It is meant for anyone who is not yet fully confident about being "proficient in Java".
List
ArrayList
Implemented as an array. This saves space, but an array has a capacity limit: when the limit is exceeded, the capacity grows by 50% and the contents are copied to a new array with System.arraycopy(), so it is best to pass in an estimated size up front. By default, an array of size 10 is created the first time an element is inserted.
Accessing elements by index with get(i)/set(i,e) is fast, which is the basic advantage of arrays.
Appending to the end with add(e) is also fast, but inserting or deleting by index with add(i,e), remove(i), or remove(e) must shift the affected elements with System.arraycopy(), so performance degrades; this is the basic disadvantage of arrays.
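A small sketch of the cost differences above (the class name is made up for illustration): appends are cheap, index-based inserts and value-based removes shift elements.

```java
import java.util.ArrayList;
import java.util.List;

public class ArrayListDemo {
    public static List<Integer> build() {
        // Passing an estimated size avoids repeated grow-and-copy cycles.
        List<Integer> list = new ArrayList<>(1000);
        for (int i = 0; i < 5; i++) {
            list.add(i);                 // append at the tail: cheap
        }
        list.add(0, 99);                 // insert at index 0: shifts every element right
        list.remove(Integer.valueOf(3)); // remove by value: scans, then shifts
        return list;
    }

    public static void main(String[] args) {
        System.out.println(build()); // [99, 0, 1, 2, 4]
    }
}
```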
LinkedList
Implemented as a doubly linked list. A linked list has no capacity limit, but the doubly linked list itself consumes more space and requires extra pointer manipulation per operation.
Accessing elements by index with get(i)/set(i,e) sadly has to traverse the list to move a pointer into position (starting from the tail if i is greater than half the list's size).
Inserting or deleting an element only requires updating the pointers of the neighboring nodes, but you still have to traverse part of the list to reach the position. Only operations at the two ends of the list, such as add(), addFirst(), removeLast(), or remove() on an iterator(), avoid the pointer traversal.
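The end operations and iterator-based removal mentioned above can be sketched like this (the class name is hypothetical):

```java
import java.util.Iterator;
import java.util.LinkedList;

public class LinkedListDemo {
    public static LinkedList<String> build() {
        LinkedList<String> list = new LinkedList<>();
        list.add("b");        // appends at the tail
        list.addFirst("a");   // head operation: no traversal needed
        list.addLast("c");    // tail operation: no traversal needed
        // Removing via the iterator unlinks the current node directly,
        // instead of re-traversing from the head as remove(e) would.
        for (Iterator<String> it = list.iterator(); it.hasNext(); ) {
            if (it.next().equals("b")) {
                it.remove();
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(build()); // [a, c]
    }
}
```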
CopyOnWriteArrayList
A concurrency-optimized ArrayList. It uses the CopyOnWrite strategy: copy a snapshot of the array, modify the snapshot, then point the internal array reference at the new copy once the modification is done.
Since modifications happen on a snapshot that is invisible to readers, there is only a write lock and no read lock; but because the copying is expensive, it is typically suited to read-mostly, write-rarely scenarios. If updates are frequent or the array is large, Collections.synchronizedList(list), which guards all operations with a single lock, is the better way to get thread safety.
It adds an addIfAbsent(e) method, which traverses the whole array to check whether the element already exists; as you can imagine, the performance is not great.
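A quick sketch of the snapshot behavior described above (the class name is made up): the iterator sees the array as it was when iteration began, so concurrent-style modification during iteration neither shows up nor throws ConcurrentModificationException.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class CowListDemo {
    public static String snapshotIteration() {
        List<String> list = new CopyOnWriteArrayList<>(new String[] {"a", "b"});
        StringBuilder seen = new StringBuilder();
        for (String s : list) {   // iterates over the snapshot taken at this point
            seen.append(s);
            list.add("c");        // each add copies the array; iteration is unaffected
        }
        return seen + "/" + list.size();
    }

    public static void main(String[] args) {
        System.out.println(snapshotIteration()); // ab/4
    }
}
```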
Supplement
Whichever implementation you choose, the operations that work by value, contains(e), indexOf(e), and remove(e), must traverse and compare every element, so, as you can imagine, the performance is not great.
There is no SortedList sorted by element value, and among the thread-safe classes there is no lock-free ConcurrentLinkedList; when you make do with the equivalent classes in Set and Queue, you lose some List-specific methods.
Map
HashMap
A hash-bucket array implemented as an Entry[] array; the array index is obtained by taking the Key's hash value modulo the size of the bucket array.
When inserting, if two Keys fall into the same bucket (for example, hash values 1 and 17 both map to the same bucket after modulo 16), the Entries in that bucket are stored as a singly linked list via each Entry's next field: the Entry that arrives later points its next at the Entry already in the bucket.
When looking up a Key whose hash is 17, first locate the bucket, then walk the linked list of Entries in it, comparing Key values one by one.
When the number of Entries reaches 75% of the number of buckets (many articles say it is when 75% of the buckets are in use, but the code says otherwise), the bucket array is doubled and all existing Entries are redistributed, so here too it is best to supply an estimated size.
Taking the modulo with a bitwise AND (hash & (arrayLength-1)) is faster, so the array size is always a power of two. If you pass an arbitrary initial value such as 17, it is rounded up to 32. The default initial size, allocated when the first element is inserted, is 16.
When traversing with iterator(), the order therefore appears random.
JDK8 adds a threshold, 8 by default: when the Entries in one bucket exceed it, they are stored as a red-black tree instead of a singly linked list, to speed up Key lookup.
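The power-of-two modulo trick above can be shown in a few lines (the class and method names are made up for illustration); note how hashes 1 and 17 collide in a 16-bucket table, as in the earlier example:

```java
public class BucketIndexDemo {
    // With a table size that is a power of two, (hash & (length - 1))
    // gives the same result as (hash % length) for non-negative hashes,
    // but with a single cheap bitwise operation.
    public static int bucketIndex(int hash, int tableLength) {
        return hash & (tableLength - 1);
    }

    public static void main(String[] args) {
        System.out.println(bucketIndex(1, 16));  // 1
        System.out.println(bucketIndex(17, 16)); // 1 -> same bucket, chained as a list
    }
}
```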
LinkedHashMap
Extends HashMap by layering a doubly linked list on top, making it famously the most memory-hungry data structure here. In return, iterator() traverses entries in insertion order (updates do not count; if the accessOrder attribute is set to true, every read and write access counts too).
The implementation adds before/after pointers to each Entry; on insertion, the Entry links itself in front of the header Entry. If every read and write access must be re-ordered, the accessed Entry splices its neighbors' before/after pointers together to delete itself from the list, then re-links itself at the front of the header.
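The accessOrder mode described above is what makes LinkedHashMap the classic building block for an LRU cache. A minimal sketch (the class and method names are made up), using the real removeEldestEntry() hook:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruCacheDemo {
    // accessOrder=true moves an entry to the tail of the linked list on every
    // get()/put(), so the head is always the least recently used entry.
    public static <K, V> Map<K, V> lruCache(int capacity) {
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity; // evict the head when over capacity
            }
        };
    }

    public static void main(String[] args) {
        Map<String, Integer> cache = lruCache(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // touch "a", so "b" becomes the eldest entry
        cache.put("c", 3); // evicts "b"
        System.out.println(cache.keySet()); // [a, c]
    }
}
```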
TreeMap
Implemented with a red-black tree; for details please see an introductory tutorial, as space is limited here. iterator() traverses in Key order: either the ascending order of Keys implementing the Comparable interface, or the order dictated by a supplied Comparator. As you would expect, inserting and deleting in a tree costs more than in a HashMap.
Supports the SortedMap interface, e.g. firstKey()/lastKey() to get the smallest and largest Keys, or subMap(fromKey, toKey) and tailMap(fromKey) to cut out a slice of the Map.
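The sorted traversal and slicing methods just mentioned can be sketched as follows (the class name is hypothetical):

```java
import java.util.TreeMap;

public class TreeMapDemo {
    public static String navigate() {
        TreeMap<Integer, String> map = new TreeMap<>();
        map.put(30, "c");
        map.put(10, "a");
        map.put(20, "b");
        // Entries are kept sorted by Key, regardless of insertion order.
        return map.firstKey() + "," + map.lastKey() + ","
                + map.subMap(10, 30).keySet()   // toKey is exclusive
                + "," + map.tailMap(20).keySet();
    }

    public static void main(String[] args) {
        System.out.println(navigate()); // 10,30,[10, 20],[20, 30]
    }
}
```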
ConcurrentHashMap
A concurrency-optimized HashMap. By default it has 16 write locks (configurable to more), which effectively spreads out the chance of blocking, and it has no read locks.
The data structure is a Segment[] array; inside each Segment is a hash-bucket array, and each Segment holds one lock. A Key is first hashed to a Segment, then hashed to a bucket within it.
Supports the ConcurrentMap interface, e.g. putIfAbsent(key, value), its opposite remove(key, value), and the CAS-style replace(key, oldValue, newValue).
There is no read lock because put/remove are atomic actions (for example, put is a single assignment to an array element/Entry pointer), so a read operation never sees the intermediate state of an update.
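The ConcurrentMap methods above in action (the class and method names are made up for illustration):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class ConcurrentMapDemo {
    public static String casOps() {
        ConcurrentMap<String, Integer> map = new ConcurrentHashMap<>();
        map.putIfAbsent("k", 1);                    // inserts: key not yet present
        map.putIfAbsent("k", 2);                    // no-op: key already present
        boolean swapped = map.replace("k", 1, 10);  // CAS succeeds: old value matches
        boolean failed  = map.replace("k", 1, 20);  // CAS fails: old value is now 10
        return map.get("k") + "," + swapped + "," + failed;
    }

    public static void main(String[] args) {
        System.out.println(casOps()); // 10,true,false
    }
}
```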
ConcurrentSkipListMap
JDK6's new concurrency-optimized SortedMap, implemented with a SkipList. The SkipList is a simplified alternative to the red-black tree and a popular ordered-collection algorithm; please see an introductory tutorial, as space is limited here. The concurrent package uses it because it supports a CAS-based lock-free algorithm, whereas red-black trees have no good lock-free algorithm.
One oddity: its size() should not be called casually, because it has to traverse the whole map to count the elements.
Supplement
Regarding null: HashMap and LinkedHashMap allow it freely; TreeMap without a Comparator does not allow null Keys; ConcurrentHashMap allows neither null Keys nor null values, in JDK7 as well as JDK8 (why is this?); ConcurrentSkipListMap likewise allows neither null Keys nor null values in any JDK.
Set
A Set is almost always implemented with an internal Map: the Map's keySet() is a Set, and the values are dummies, all sharing the same Object instance. A Set's characteristics therefore follow those of its internal Map implementation.
HashSet: internally a HashMap.
LinkedHashSet: internally a LinkedHashMap.
TreeSet: internally a TreeMap; a SortedSet.
ConcurrentSkipListSet: internally a ConcurrentSkipListMap; a concurrency-optimized SortedSet.
CopyOnWriteArraySet: internally a CopyOnWriteArrayList; a concurrency-optimized Set. It uses the addIfAbsent() method to keep elements unique, and as mentioned above, that method's performance is mediocre.
Supplement: a ConcurrentHashSet seems to be missing; there ought to be a simple implementation wrapping a ConcurrentHashMap, but the JDK does not provide one. Jetty wraps its own, and Guava simply uses java.util.Collections.newSetFromMap(new ConcurrentHashMap()) to implement it.
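The Guava trick just mentioned fits in one line; a sketch (the class and method names are made up):

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ConcurrentSetDemo {
    // Back a Set with a ConcurrentHashMap so every Set operation
    // inherits the map's concurrency behavior.
    public static Set<String> newConcurrentHashSet() {
        return Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());
    }

    public static void main(String[] args) {
        Set<String> set = newConcurrentHashSet();
        set.add("a");
        set.add("a"); // duplicate, ignored
        set.add("b");
        System.out.println(set.size()); // 2
    }
}
```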
Queue
A Queue is a List that enters at one end and exits at the other, so it too can be implemented with an array or a linked list.
--Ordinary Queue--
LinkedList
Yes, the doubly-linked-list LinkedList is both a List and a Queue, and it is the only Queue that allows null elements.
ArrayDeque
A double-ended Queue implemented with a circular array. Its size is always a power of two, 16 by default.
An ordinary array can only append quickly at the tail. To support FIFO and also remove elements quickly from the head, a circular array keeps two indices, head and tail. When an element is dequeued, the head index advances. When an element is enqueued and the end of the array space is already used, the element wraps around to array[0] (possible because a head index greater than 0 means head elements have been dequeued and left room) and the tail index points to 0; the next element goes into array[1], with the tail index pointing to 1, and so on. When the tail index catches up with the head index, all of the array's space is used up, and the array doubles in size.
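The two-ended behavior described above in a few lines (the class and method names are hypothetical):

```java
import java.util.ArrayDeque;

public class ArrayDequeDemo {
    public static String fifoAndLifo() {
        ArrayDeque<Integer> deque = new ArrayDeque<>();
        deque.addLast(1);   // tail index advances
        deque.addLast(2);
        deque.addFirst(0);  // head index moves backwards, wrapping around the array
        // Both ends pop in O(1), with no element shifting:
        int first = deque.pollFirst(); // 0
        int last  = deque.pollLast();  // 2
        return first + "," + last + "," + deque;
    }

    public static void main(String[] args) {
        System.out.println(fifoAndLifo()); // 0,2,[1]
    }
}
```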
PriorityQueue
A priority queue implemented with a binary heap; please see an introductory tutorial for details. It is no longer FIFO: elements dequeue according to their Comparable implementation or a supplied Comparator, and the smaller the value, the higher the priority and the earlier it dequeues. Note, however, that iterator() does not return elements in sorted order.
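A sketch of the dequeue-order behavior just described (the class and method names are made up): poll() is sorted, direct iteration is not guaranteed to be.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

public class PriorityQueueDemo {
    public static List<Integer> drain() {
        PriorityQueue<Integer> pq = new PriorityQueue<>();
        pq.add(3);
        pq.add(1);
        pq.add(2);
        // Iterating pq directly would walk the heap array, NOT sorted order;
        // only repeated poll() yields elements smallest-first.
        List<Integer> out = new ArrayList<>();
        while (!pq.isEmpty()) {
            out.add(pq.poll()); // always removes the smallest element
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drain()); // [1, 2, 3]
    }
}
```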
--Thread-safe queue--
ConcurrentLinkedQueue/ConcurrentLinkedDeque
Unbounded concurrency-optimized Queue, based on a linked list, implements a lock-free algorithm that relies on CAS.
ConcurrentLinkedQueue's structure is a singly linked list plus two pointers, head/tail. Enqueueing must both set the next pointer of the current tail element and move tail to point at the newly added element, and those two CAS actions cannot be made atomic together, so a special algorithm is needed; please see an introductory tutorial, as space is limited here.
PriorityBlockingQueue
An unbounded concurrency-optimized PriorityQueue, also based on a binary heap, using a single lock shared by reads and writes. Although it implements the BlockingQueue interface, it does not really behave like a bounded blocking queue: it automatically grows when space runs out.
DelayQueue
Contains a PriorityQueue internally and is likewise unbounded. Elements must implement the Delayed interface, returning on each call how long remains until their trigger time; a value less than or equal to 0 means the trigger time has arrived.
In poll(), peek() is used to examine the element at the head of the queue and check whether its trigger time has been reached. ScheduledThreadPoolExecutor uses a similar structure.
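A minimal sketch of the Delayed contract described above (the Task class is entirely made up): getDelay() reports the remaining time, and take() blocks until the head's delay expires.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

public class DelayQueueDemo {
    // A hypothetical task that becomes available at a fixed trigger time.
    static class Task implements Delayed {
        final String name;
        final long triggerAtMillis;

        Task(String name, long delayMillis) {
            this.name = name;
            this.triggerAtMillis = System.currentTimeMillis() + delayMillis;
        }

        @Override
        public long getDelay(TimeUnit unit) {
            // Time remaining until the trigger; <= 0 means ready to be taken.
            return unit.convert(triggerAtMillis - System.currentTimeMillis(),
                    TimeUnit.MILLISECONDS);
        }

        @Override
        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.MILLISECONDS),
                    other.getDelay(TimeUnit.MILLISECONDS));
        }
    }

    public static String takeInOrder() throws InterruptedException {
        DelayQueue<Task> queue = new DelayQueue<>();
        queue.put(new Task("later", 100));
        queue.put(new Task("sooner", 10));
        // take() blocks until the head element's delay has expired.
        return queue.take().name + "," + queue.take().name;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(takeInOrder()); // sooner,later
    }
}
```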
--Thread-safe blocking queue--
A BlockingQueue's length is bounded, ensuring that producers and consumers cannot drift too far apart in speed and avoiding memory exhaustion; the length cannot be changed after it is set. When enqueueing into a full queue, or dequeueing from an empty one, the different methods behave as shown in the table below:
          May throw exception    Returns special value    May block    May block with timeout
Enqueue   add(e)                 offer(e)                 put(e)       offer(e, timeout, unit)
Dequeue   remove()               poll()                   take()       poll(timeout, unit)
Examine   element()              peek()                   none         none
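The non-blocking column of the table in action (the class and method names are made up): offer() and poll() signal a full or empty queue via their return value instead of throwing.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BlockingQueueDemo {
    public static String fullAndEmpty() {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1);
        boolean ok1 = queue.offer("a"); // true: the queue had room
        boolean ok2 = queue.offer("b"); // false: queue full (add() would throw instead)
        String out1 = queue.poll();     // "a"
        String out2 = queue.poll();     // null: queue empty (remove() would throw instead)
        return ok1 + "," + ok2 + "," + out1 + "," + out2;
    }

    public static void main(String[] args) {
        System.out.println(fullAndEmpty()); // true,false,a,null
    }
}
```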
ArrayBlockingQueue
A fixed-length concurrency-optimized BlockingQueue, implemented with a circular array. It has one lock shared by reads and writes, plus two Conditions, notFull and notEmpty, to manage blocking when the queue is full or empty.
LinkedBlockingQueue/LinkedBlockingDeque
An optionally-bounded concurrency-optimized BlockingQueue, implemented with a linked list, so the length can even be set to Integer.MAX_VALUE. Exploiting the linked list's characteristics, it splits takeLock and putLock into two separate locks, still using notEmpty and notFull to manage blocking when the queue is full or empty.
Supplement
JDK7 adds a LinkedTransferQueue. Its transfer(e) method does not return until the element put in by the producer has been taken by a consumer, which makes it handier than SynchronousQueue. Worth studying when you have time.
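A sketch of the hand-off semantics just mentioned (the class and method names are made up): tryTransfer() succeeds only if a consumer is already waiting, while transfer() blocks until one takes the element.

```java
import java.util.concurrent.LinkedTransferQueue;

public class TransferQueueDemo {
    public static String demo() throws InterruptedException {
        LinkedTransferQueue<String> queue = new LinkedTransferQueue<>();
        // tryTransfer hands the element directly to a waiting consumer,
        // or returns false immediately (without enqueueing) if none is waiting.
        boolean handedOff = queue.tryTransfer("x");

        Thread consumer = new Thread(() -> {
            try {
                queue.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        queue.transfer("y"); // blocks until the consumer has taken "y"
        consumer.join();
        return handedOff + "," + queue.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo()); // false,0
    }
}
```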