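ArrayList is a data structure we run into every day, so knowing its details inside out is a must.
In day-to-day development, using ArrayList mostly comes down to creating an object with new and then working with it through methods such as add(), get(), and set().
That being the case, let's start with its constructors.
ArrayList has three constructors: a no-argument constructor, one that takes an initial capacity, and one that takes an initial collection.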
/**
 * Constructs an empty list with the specified initial capacity.
 *
 * @param  initialCapacity  the initial capacity of the list
 * @throws IllegalArgumentException if the specified initial capacity
 *         is negative
 */
public ArrayList(int initialCapacity) {
    if (initialCapacity > 0) {
        this.elementData = new Object[initialCapacity];
    } else if (initialCapacity == 0) {
        this.elementData = EMPTY_ELEMENTDATA;
    } else {
        throw new IllegalArgumentException("Illegal Capacity: "+
                                           initialCapacity);
    }
}

/**
 * Constructs an empty list with an initial capacity of ten.
 */
public ArrayList() {
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}

/**
 * Constructs a list containing the elements of the specified
 * collection, in the order they are returned by the collection's
 * iterator.
 *
 * @param c the collection whose elements are to be placed into this list
 * @throws NullPointerException if the specified collection is null
 */
public ArrayList(Collection<? extends E> c) {
    elementData = c.toArray();
    if ((size = elementData.length) != 0) {
        // c.toArray might (incorrectly) not return Object[] (see 6260652)
        if (elementData.getClass() != Object[].class)
            elementData = Arrays.copyOf(elementData, size, Object[].class);
    } else {
        // replace with empty array.
        this.elementData = EMPTY_ELEMENTDATA;
    }
}
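These are easy enough to follow; even without explanation, the names used above roughly tell you what they stand for.
To be precise, though, we first need to look at what each field actually represents, so let's trace back to their declarations.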
/**
 * Default initial capacity.
 */
private static final int DEFAULT_CAPACITY = 10;

/**
 * Shared empty array instance used for empty instances.
 */
private static final Object[] EMPTY_ELEMENTDATA = {};

/**
 * Shared empty array instance used for default sized empty instances. We
 * distinguish this from EMPTY_ELEMENTDATA to know how much to inflate when
 * first element is added.
 */
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

/**
 * The array buffer into which the elements of the ArrayList are stored.
 * The capacity of the ArrayList is the length of this array buffer. Any
 * empty ArrayList with elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA
 * will be expanded to DEFAULT_CAPACITY when the first element is added.
 */
transient Object[] elementData; // non-private to simplify nested class access

/**
 * The size of the ArrayList (the number of elements it contains).
 *
 * @serial
 */
private int size;
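By now you can probably guess how ArrayList is implemented: yes, it is backed by an array. You may wonder, though, why EMPTY_ELEMENTDATA and DEFAULTCAPACITY_EMPTY_ELEMENTDATA are declared as two separate constants when both are just empty arrays. The source comments explain it: EMPTY_ELEMENTDATA represents a plain empty array, while DEFAULTCAPACITY_EMPTY_ELEMENTDATA marks a list that was created with the default capacity. The two are distinct objects that are told apart by reference (==), and when the first element is added to a list holding the latter, the array is expanded to DEFAULT_CAPACITY.
Looking back at the constructors: calling new ArrayList() with no arguments does not allocate a 10-slot array right away. It only assigns the shared DEFAULTCAPACITY_EMPTY_ELEMENTDATA, and the array of DEFAULT_CAPACITY slots is allocated when the first element is added. Passing an initial capacity allocates an array of exactly that size immediately, and a capacity of 0 simply assigns EMPTY_ELEMENTDATA.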
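Seen from the caller's side, the three constructors can be exercised as in the sketch below. The class and method names (ConstructorDemo, viaNoArg, and so on) are made up for illustration; only the ArrayList calls themselves come from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ConstructorDemo {
    // No-arg constructor: starts empty; the backing array is allocated on the first add.
    static List<String> viaNoArg() {
        List<String> list = new ArrayList<>();
        list.add("a");
        return list;
    }

    // Capacity constructor: pre-sizes the backing array, but size() is still 0.
    static List<String> viaCapacity(int expectedElements) {
        return new ArrayList<>(expectedElements);
    }

    // Collection constructor: copies the elements in the collection's iteration order.
    static List<String> viaCollection() {
        return new ArrayList<>(Arrays.asList("x", "y", "z"));
    }

    // Returns true if a negative capacity is rejected with IllegalArgumentException.
    static boolean negativeCapacityRejected() {
        try {
            new ArrayList<String>(-1);
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(viaNoArg());
        System.out.println(viaCapacity(100).size());
        System.out.println(viaCollection());
        System.out.println(negativeCapacityRejected());
    }
}
```

Pre-sizing with the capacity constructor is mainly worthwhile when the number of elements is known in advance, since it avoids the repeated array copies that growth would otherwise cause.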
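Once the underlying principle is clear, adding, updating, and removing become simple: they are just insert, update, and delete operations on an array. There is one catch, though: an array has a fixed length, so unlike a linked list it cannot keep accepting new elements indefinitely. Let's look at the source to see how this is handled.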
/**
 * Appends the specified element to the end of this list.
 *
 * @param e element to be appended to this list
 * @return <tt>true</tt> (as specified by {@link Collection#add})
 */
public boolean add(E e) {
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    elementData[size++] = e;
    return true;
}

/**
 * Inserts the specified element at the specified position in this
 * list. Shifts the element currently at that position (if any) and
 * any subsequent elements to the right (adds one to their indices).
 *
 * @param index index at which the specified element is to be inserted
 * @param element element to be inserted
 * @throws IndexOutOfBoundsException {@inheritDoc}
 */
public void add(int index, E element) {
    rangeCheckForAdd(index);

    ensureCapacityInternal(size + 1);  // Increments modCount!!
    System.arraycopy(elementData, index, elementData, index + 1,
                     size - index);
    elementData[index] = element;
    size++;
}
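The shifting step in add(int, E) is easier to picture on a bare array. The following is a minimal sketch, not the JDK code: insertAt is a hypothetical helper, and it assumes the array already has at least one free slot (which is what ensureCapacityInternal guarantees in the real method).

```java
import java.util.Arrays;

class InsertShiftDemo {
    // Hypothetical helper: inserts value at index among the first `size` slots of data.
    // Assumes data.length > size, i.e. capacity has already been ensured by the caller.
    static void insertAt(Object[] data, int size, int index, Object value) {
        if (index < 0 || index > size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        }
        // Move the elements in [index, size) one slot to the right.
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = value;
    }

    public static void main(String[] args) {
        Object[] data = {"a", "b", "d", null}; // 3 elements, capacity 4
        insertAt(data, 3, 2, "c");
        System.out.println(Arrays.toString(data)); // [a, b, c, d]
    }
}
```

Because every element after the insertion point has to move, inserting near the front of a large list costs time proportional to the list size, whereas appending at the end only touches one slot.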
Both add() methods call ensureCapacityInternal before inserting. The name already hints at what it does, so let's dig into it.
private void ensureCapacityInternal(int minCapacity) {
    ensureExplicitCapacity(calculateCapacity(elementData, minCapacity));
}

// Compute the minimum capacity required for this expansion
private static int calculateCapacity(Object[] elementData, int minCapacity) {
    if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        return Math.max(DEFAULT_CAPACITY, minCapacity);
    }
    return minCapacity;
}

private void ensureExplicitCapacity(int minCapacity) {
    modCount++;

    // overflow-conscious code
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}
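Remember the difference between EMPTY_ELEMENTDATA and DEFAULTCAPACITY_EMPTY_ELEMENTDATA mentioned earlier? Here is the hard evidence. When computing the minimum capacity, the code checks whether elementData is the DEFAULTCAPACITY_EMPTY_ELEMENTDATA instance and, if so, requires at least DEFAULT_CAPACITY slots.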
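To make the effect of the two markers visible, here is a stripped-down model of the same idea. TinyList and its capacity() method are invented for this sketch (the real ArrayList does not expose its capacity), and the overflow handling of the JDK code is left out on purpose.

```java
import java.util.Arrays;

class TinyList {
    private static final int DEFAULT_CAPACITY = 10;
    // Two distinct empty arrays, used purely as identity markers.
    private static final Object[] EMPTY = {};
    private static final Object[] DEFAULT_EMPTY = {};

    private Object[] elementData;
    private int size;

    TinyList() {
        elementData = DEFAULT_EMPTY;
    }

    TinyList(int initialCapacity) {
        if (initialCapacity < 0) {
            throw new IllegalArgumentException("Illegal Capacity: " + initialCapacity);
        }
        elementData = (initialCapacity == 0) ? EMPTY : new Object[initialCapacity];
    }

    void add(Object e) {
        int minCapacity = size + 1;
        // Reference comparison: only lists built with the no-arg constructor match.
        if (elementData == DEFAULT_EMPTY) {
            minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
        }
        if (minCapacity > elementData.length) {
            int grown = elementData.length + (elementData.length >> 1);
            elementData = Arrays.copyOf(elementData, Math.max(grown, minCapacity));
        }
        elementData[size++] = e;
    }

    int capacity() {
        return elementData.length;
    }

    public static void main(String[] args) {
        TinyList byDefault = new TinyList();
        TinyList zeroSized = new TinyList(0);
        byDefault.add("x");
        zeroSized.add("x");
        System.out.println(byDefault.capacity()); // 10
        System.out.println(zeroSized.capacity()); // 1
    }
}
```

In this model a list created with an explicit capacity of 0 grows to exactly one slot on the first add, while a list created with no arguments jumps straight to ten.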
Let's keep going.
/**
 * The maximum size of array to allocate.
 * Some VMs reserve some header words in an array.
 * Attempts to allocate larger arrays may result in
 * OutOfMemoryError: Requested array size exceeds VM limit
 */
private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

/**
 * Increases the capacity to ensure that it can hold at least the
 * number of elements specified by the minimum capacity argument.
 *
 * @param minCapacity the desired minimum capacity
 */
private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:
    elementData = Arrays.copyOf(elementData, newCapacity);
}

private static int hugeCapacity(int minCapacity) {
    if (minCapacity < 0) // overflow
        throw new OutOfMemoryError();
    return (minCapacity > MAX_ARRAY_SIZE) ?
        Integer.MAX_VALUE :
        MAX_ARRAY_SIZE;
}
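That completes the growth path. As the code shows, each expansion by default adds half of the old capacity, so the new capacity is roughly 1.5 times the old one (if that is still smaller than what is needed, minCapacity is used instead; see the code for the details). The existing elements are then copied into the new array.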
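The arithmetic in grow() can be replayed in isolation to see which capacities a default-constructed list passes through. This is a simplified sketch: nextCapacity and capacitySequence are names invented here, and the MAX_ARRAY_SIZE and overflow handling are deliberately omitted.

```java
import java.util.Arrays;

class GrowthDemo {
    // Same arithmetic as grow(), without the MAX_ARRAY_SIZE / overflow handling.
    static int nextCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        return Math.max(newCapacity, minCapacity);
    }

    // Capacities seen when a full array keeps receiving one more element.
    static int[] capacitySequence(int start, int steps) {
        int[] capacities = new int[steps];
        int capacity = start;
        for (int i = 0; i < steps; i++) {
            capacities[i] = capacity;
            capacity = nextCapacity(capacity, capacity + 1);
        }
        return capacities;
    }

    public static void main(String[] args) {
        // Starting from DEFAULT_CAPACITY = 10
        System.out.println(Arrays.toString(capacitySequence(10, 6)));
        // [10, 15, 22, 33, 49, 73]
    }
}
```

Note that the shift rounds down, so 15 grows to 22 rather than 22.5, and that very small arrays fall back to minCapacity: a capacity of 1 grows to 2 only because 1 + (1 >> 1) is still 1.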
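Since adding elements grows the array, does removing a good number of elements shrink it again? No! ArrayList does, however, provide a method for trimming the unused capacity, which we can call when appropriate.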
/**
 * Trims the capacity of this <tt>ArrayList</tt> instance to be the
 * list's current size.  An application can use this operation to minimize
 * the storage of an <tt>ArrayList</tt> instance.
 */
public void trimToSize() {
    modCount++;
    if (size < elementData.length) {
        elementData = (size == 0)
          ? EMPTY_ELEMENTDATA
          : Arrays.copyOf(elementData, size);
    }
}
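A typical moment to call trimToSize() is after a list has been filled and then cut down to a small, long-lived remainder. The sketch below is only a usage illustration; TrimDemo and buildAndTrim are hypothetical names, and since the capacity is not observable through the public API, the effect shows up in memory usage rather than in anything the code can print.

```java
import java.util.ArrayList;

class TrimDemo {
    // Fills a list with `total` integers, keeps the first `keep`, then trims the spare capacity.
    // Assumes 0 <= keep <= total.
    static ArrayList<Integer> buildAndTrim(int total, int keep) {
        ArrayList<Integer> list = new ArrayList<>();
        for (int i = 0; i < total; i++) {
            list.add(i);
        }
        // Remove everything after the first `keep` elements; the backing array keeps its length.
        list.subList(keep, list.size()).clear();
        // Replace the backing array with one that is exactly size() long.
        list.trimToSize();
        return list;
    }

    public static void main(String[] args) {
        System.out.println(buildAndTrim(1000, 3)); // [0, 1, 2]
    }
}
```

Trimming costs one array copy, so it tends to pay off only when the list will live for a while and will not grow again soon.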
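You may also have noticed a variable that keeps showing up in the source: modCount. It is incremented whenever the list is structurally modified, for example by add() or trimToSize(), whereas set() on an existing index leaves it alone. Something tells me this little field is not as simple as it looks, so let's read its documentation.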
/**
 * The number of times this list has been <i>structurally modified</i>.
 * Structural modifications are those that change the size of the
 * list, or otherwise perturb it in such a fashion that iterations in
 * progress may yield incorrect results.
 *
 * <p>This field is used by the iterator and list iterator implementation
 * returned by the {@code iterator} and {@code listIterator} methods.
 * If the value of this field changes unexpectedly, the iterator (or list
 * iterator) will throw a {@code ConcurrentModificationException} in
 * response to the {@code next}, {@code remove}, {@code previous},
 * {@code set} or {@code add} operations.  This provides
 * <i>fail-fast</i> behavior, rather than non-deterministic behavior in
 * the face of concurrent modification during iteration.
 *
 * <p><b>Use of this field by subclasses is optional.</b> If a subclass
 * wishes to provide fail-fast iterators (and list iterators), then it
 * merely has to increment this field in its {@code add(int, E)} and
 * {@code remove(int)} methods (and any other methods that it overrides
 * that result in structural modifications to the list).  A single call to
 * {@code add(int, E)} or {@code remove(int)} must add no more than
 * one to this field, or the iterators (and list iterators) will throw
 * bogus {@code ConcurrentModificationExceptions}.  If an implementation
 * does not wish to provide fail-fast iterators, this field may be
 * ignored.
 */
protected transient int modCount = 0;
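The comment is quite clear: the field is used by iterators. If someone modifies the list while you are traversing it with an iterator, the values the iterator returns from then on may no longer match what you expect. The fail-fast mechanism guards against this by throwing a ConcurrentModificationException instead. Also note that this is a best-effort check: it is not guaranteed to behave the same way when the list is modified concurrently by other threads.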
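The fail-fast check is easy to trigger from a single thread. The following sketch, with made-up names FailFastDemo, removeDuringForEachThrows, and removeOddsSafely, contrasts a removal through the list during a for-each loop with a removal through the iterator itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class FailFastDemo {
    // Removes an element through the list while a for-each loop is iterating over it.
    // Returns true if the iterator detects the change and throws.
    static boolean removeDuringForEachThrows() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        try {
            for (Integer n : list) {
                if (n == 1) {
                    list.remove(n); // remove(Object): a structural change, modCount is incremented
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes elements through the iterator, which keeps its expected modCount in sync.
    static List<Integer> removeOddsSafely() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        for (Iterator<Integer> it = list.iterator(); it.hasNext(); ) {
            if (it.next() % 2 != 0) {
                it.remove();
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeDuringForEachThrows()); // true
        System.out.println(removeOddsSafely());          // [2, 4]
    }
}
```

The exception is raised by the next call to next() after the modification, not by the remove() call itself, which is why the check can be missed if the loop happens to end right after the change.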
That about wraps up the analysis, so it is time to say goodbye. See you next time~
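Extras
- Take care when using SubList: it is only a logical view over the original list, not a separately created sub-list, so any change made through the SubList also affects the original list.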
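The view behaviour of subList() can be checked with a few lines. In this sketch SubListDemo and its methods are hypothetical names; it shows a write going through the view, an independent copy made with the collection constructor, and a view becoming unusable after the parent list is structurally modified.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

class SubListDemo {
    // Writes through the view and returns the original list to show it changed.
    static List<String> writeThroughView() {
        List<String> original = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        List<String> view = original.subList(1, 3); // view of [b, c]
        view.set(0, "B");
        return original; // [a, B, c, d]
    }

    // Copies the range first, so later writes do not touch the original list.
    static List<String> writeToCopy() {
        List<String> original = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        List<String> copy = new ArrayList<>(original.subList(1, 3));
        copy.set(0, "B");
        return original; // [a, b, c, d]
    }

    // Returns true if the view fails fast after the parent list is structurally modified.
    static boolean staleViewThrows() {
        List<String> original = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        List<String> view = original.subList(1, 3);
        original.add("e"); // structural change made directly on the parent
        try {
            view.get(0);
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(writeThroughView()); // [a, B, c, d]
        System.out.println(writeToCopy());      // [a, b, c, d]
        System.out.println(staleViewThrows());  // true
    }
}
```

When an independent sub-list is what you actually need, wrapping the view in new ArrayList<>(...) as in writeToCopy is the usual way to get one.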