Cache_t初识
我们在前面对类的结构探索中知道了类结构体成员如下
struct objc_class : objc_object {
// Class ISA;
Class superclass;
cache_t cache; // formerly cache pointer and vtable
class_data_bits_t bits; // class_rw_t * plus custom rr/alloc flags
...
}
我们通过地址偏移探索知道在bits
中包含了类的属性和方法,那么cache_t cache
又是什么呢?
从名字上我们可以简单的认知就是缓存
,那究竟是不是缓存呢?它到底缓存了类中的什么信息呢?我们接下来就探索来cache_t
。
Cache_t源码探索
还是老规矩,从源码下手分析其结构,我们来到其源码的地方如下:
struct cache_t {
#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_OUTLINED
explicit_atomic<struct bucket_t *> _buckets;
explicit_atomic<mask_t> _mask;
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
explicit_atomic<uintptr_t> _maskAndBuckets;
mask_t _mask_unused;
// How much the mask is shifted by.
static constexpr uintptr_t maskShift = 48;
// Additional bits after the mask which must be zero. msgSend
// takes advantage of these additional bits to construct the value
// `mask << 4` from `_maskAndBuckets` in a single instruction.
static constexpr uintptr_t maskZeroBits = 4;
// The largest mask value we can store.
static constexpr uintptr_t maxMask = ((uintptr_t)1 << (64 - maskShift)) - 1;
// The mask applied to `_maskAndBuckets` to retrieve the buckets pointer.
static constexpr uintptr_t bucketsMask = ((uintptr_t)1 << (maskShift - maskZeroBits)) - 1;
// Ensure we have enough bits for the buckets pointer.
static_assert(bucketsMask >= MACH_VM_MAX_ADDRESS, "Bucket field doesn't have enough bits for arbitrary pointers.");
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4
// _maskAndBuckets stores the mask shift in the low 4 bits, and
// the buckets pointer in the remainder of the value. The mask
// shift is the value where (0xffff >> shift) produces the correct
// mask. This is equal to 16 - log2(cache_size).
explicit_atomic<uintptr_t> _maskAndBuckets;
mask_t _mask_unused;
static constexpr uintptr_t maskBits = 4;
static constexpr uintptr_t maskMask = (1 << maskBits) - 1;
static constexpr uintptr_t bucketsMask = ~maskMask;
#else
#error Unknown cache mask storage type.
#endif
#if __LP64__
uint16_t _flags;
#endif
uint16_t _occupied;
...
}
从源码我们可以看出cache_t
依然是一个结构体,在里面做了诸多判断,我们首先弄清楚这些判断是什么意思,点进其中一个发现
#if defined(__arm64__) && __LP64__
#define CACHE_MASK_STORAGE CACHE_MASK_STORAGE_HIGH_16 //真机
#elif defined(__arm64__) && !__LP64__
#define CACHE_MASK_STORAGE CACHE_MASK_STORAGE_LOW_4 //真机 非64位
#else
#define CACHE_MASK_STORAGE CACHE_MASK_STORAGE_OUTLINED //MacOS、模拟器
#endif
显然这是对我们的架构进行了区分。可以看到在真机中把mask
和buckets
合并到一起为_maskAndBuckets
。
通过源码得到cache_t
包含了以下成员
buckets探索
点进buckets
可以看到下面信息
#if __arm64__
explicit_atomic<uintptr_t> _imp;
explicit_atomic<SEL> _sel;
#else
explicit_atomic<SEL> _sel;
explicit_atomic<uintptr_t> _imp;
#endif
对真机和非真机进行了判断,我们可以看到两个很重要的东西imp
和sel
,那么这是不是代表里面缓存类类的方法呢?同样我们可以通过地址偏移去分析。首先还是新建一个类SYPerson
,添加个方法
- (void)helloWorld;
- (void)sayGoodJo;
//调用
SYPerson *person = [[SYPerson alloc]init];
[person helloWorld];
在方法调用前我们打下断点就行lldb调试打印如下
//获取类的首地址
(lldb) p/x pClass
(Class) $0 = 0x0000000100002420 CCPerson
//地址偏移0x10 即16位打印出cache_t地址
(lldb) p (cache_t *)0x0000000100002430
(cache_t *) $1 = 0x0000000100002430
//打印cache_t信息
(lldb) p *$1
(cache_t) $2 = {
_buckets = {
std::__1::atomic<bucket_t *> = 0x0000000100794100 {
_sel = {
std::__1::atomic<objc_selector *> = ""
}
_imp = {
std::__1::atomic<unsigned long> = 3265552
}
}
}
_mask = {
std::__1::atomic<unsigned int> = 3
}
_flags = 32784
_occupied = 1
}
//调用buckets()方法
(lldb) p $2.buckets()
(bucket_t *) $3 = 0x0000000100794100
//打印buckets中的信息
(lldb) p *$3
(bucket_t) $4 = {
_sel = {
std::__1::atomic<objc_selector *> = ""
}
_imp = {
std::__1::atomic<unsigned long> = 3265552
}
}
//打印sel,发现了初始化的init方法
(lldb) p $4.sel()
(SEL) $5 = "init"
//试图打印其它方法
(lldb) p *($3+1)
(bucket_t) $6 = {
_sel = {
std::__1::atomic<objc_selector *> = (null) //发现为空,说明只有一个init方法
}
_imp = {
std::__1::atomic<unsigned long> = 0
}
}
//接下来执行一个实例方法打印
2020-09-19 18:59:40.848267+0800 KCObjc[8767:115568] SYPerson -- -[CCPerson helloWorld]
//再次打印
(lldb) p *($3+1)
(bucket_t) $7 = {
_sel = {
std::__1::atomic<objc_selector *> = ""
}
_imp = {
std::__1::atomic<unsigned long> = 10496
}
}
//打印sel,这里可以看出在执行完helloWorld方法后在cache中就可以找到了,说明已经缓存进去了
(lldb) p $7.sel()
(SEL) $8 = "helloWorld"
//打印imp
(lldb) p $7.imp(pClass)
(IMP) $9 = 0x0000000100000d20 (KCObjc`-[CCPerson helloWorld])
通过lldb调试发现init
和helloWorld
都加入了缓存。验证了之前所说的buckets
中缓存了方法.
_occupied & _mask
- _occupied表示哈希表中 sel-imp 的占用大小 (即可以理解为分配的内存中已经存储了sel-imp的的个数),
- _mask是指掩码数据,用于在哈希算法或者哈希冲突算法中计算哈希下标,其中mask 等于capacity - 1
cache_t
在下层通过系统的算法分配内存空间时候会根据_occupied
的值增加进行扩容,扩容后会将原来的内存都清除,重新开辟内存。