04. Cache
- «Speed»: latency
- «Speed»: throughput
- + (hardware-specific) «Speed»: granularity (e.g. reading from disk by sectors vs. reading from disk cache by bytes)
- Cache
- keeping the «actual» (currently relevant) part of «slow» storage data on «fast» storage
Why cache < storage?
- Cost: energy, money
- Efficiency (searching among less data / decoding a shorter address)
What is «actuality»? See Locality_of_reference
- Temporal locality (once referenced ⇒ can be referenced soon again)
- Spatial locality (once referenced ⇒ neighbours can be referenced soon)
(other: branches and equidistant references)
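A minimal sketch of the two locality types, assuming a contiguous row-major array; the sizes and names are illustrative, not from the notes:

```python
# Summing the same data row-by-row vs. column-by-column. In a language
# with contiguous arrays the first order enjoys spatial locality
# (stride 1); the second defeats it (stride N).

N = 256
flat = list(range(N * N))  # stands in for an N x N row-major matrix

def sum_row_major():
    s = 0
    for r in range(N):
        for c in range(N):
            s += flat[r * N + c]  # step of 1: neighbours, spatial locality
    return s

def sum_col_major():
    s = 0
    for c in range(N):
        for r in range(N):
            s += flat[r * N + c]  # step of N: spatial locality breach
    return s

assert sum_row_major() == sum_col_major()  # same result, different locality
```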
What to keep in cache:
- What was read from storage
- + a certain amount of following data (e.g. a disk read returns a whole sector anyway)
- + prediction voodoo (e.g. equidistant references)
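A toy sketch of the «equidistant» prediction idea, assuming a simple stride detector; an illustration only, not any real prefetcher's algorithm:

```python
# If the recent accesses were equally spaced, guess that the next access
# continues the sequence and prefetch that address.

def predict_next(history):
    """Predicted next address, or None if the recent stride is unstable."""
    if len(history) < 3:
        return None
    strides = [b - a for a, b in zip(history, history[1:])]
    if all(s == strides[0] for s in strides):
        return history[-1] + strides[0]
    return None

print(predict_next([100, 104, 108]))   # 112: equidistant, worth prefetching
print(predict_next([100, 104, 200]))   # None: no stable stride
```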
TODO What to throw out of cache:
- Random: easy to implement, but inefficient;
- LFU (least frequently used): needs counters, age meters, and search;
- LRU (least recently used): needs age and search;
- FIFO: relatively easy, but ineffective;
- FIFO+LRU: needs direct queue modification, but no age/counters (a newly arrived element is appended, or an existing one moved, to the top of the queue); see the sketch below
There is always a worst-case access pattern for every cache architecture.
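A minimal sketch of the FIFO+LRU queue scheme, assuming a software dictionary-based cache; `load` stands for a hypothetical fetch from «slow» storage:

```python
from collections import OrderedDict

# No age fields or counters, just a queue: a touched element is appended
# (or moved) to the top, the victim is taken from the bottom.

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = OrderedDict()      # key -> value, bottom = oldest

    def access(self, key, load):
        if key in self.queue:           # hit: move to the top of the queue
            self.queue.move_to_end(key)
            return self.queue[key]
        if len(self.queue) >= self.capacity:
            self.queue.popitem(last=False)  # evict from the bottom
        self.queue[key] = load(key)     # miss: fetch from «slow» storage
        return self.queue[key]

cache = LRUCache(capacity=2)
up = lambda k: k.upper()
cache.access("a", up)   # miss, queue: [a]
cache.access("b", up)   # miss, queue: [a, b]
cache.access("a", up)   # hit, queue: [b, a]
cache.access("c", up)   # miss, evicts b, queue: [a, c]
```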
- Cache hit
- access to cached element
- Cache miss
- access to non-cached element
We must know the address (storage index) of a cached element. If the element is too small, caching is inefficient because of too much metadata:
| 00 | 10 04 00 00 | a0 78 4f 95 |
- 8 bits — age
- 30 bits — address
- 32 bits — cached word
38 bits of metadata over 32 bits of data.
Note: the address is 30 bits instead of 32 because we don't need the trailing two bits, which are zero when addressing a word (4 bytes) instead of a byte.
So let's go further and cache a whole memory block:
| 00 | 10 04 00 | a0 78 4f 95 … 34 45 |
- 8 bits — age
- 24 bits — address tag (strictly, 22 bits suffice: 32 − 10 offset bits for a 1024-byte line; 24 keeps it byte-aligned)
- 8192 bits — cache line (256 words)
32 bits of metadata over 8192 bits of data.
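The same arithmetic, spelled out (assuming the 32-bit address space, 8-bit age field and 256-word lines from the layouts above):

```python
AGE = 8

# Per-word caching: 30-bit word address (32 minus 2 zero trailing bits)
word_meta, word_data = AGE + 30, 32          # 38 bits over 32 bits
print(word_meta / word_data)                 # ~1.19: metadata exceeds data

# Per-line caching: 24-bit tag, 256 words x 32 bits = 8192 bits
line_meta, line_data = AGE + 24, 256 * 32    # 32 bits over 8192 bits
print(line_meta / line_data)                 # ~0.004: negligible overhead
```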
- Tag
- part of memory address that corresponds to a memory block
- Line
- memory block that corresponds to a tag
Caching a whole line of words at once (possibly triggered by access to a single word) is statistically a good tactic because of spatial locality.
Direct cache
Each line can cache one block out of a fixed subset of blocks (those that map to it). If the stored tag equals the address tag, it is a hit; if they differ or the line is unused, it is a miss (and the cache line is filled).
- Tag size = bit length of the number of blocks competing for one line (address space // cache size)
- Number of lines = cache size // line size
- (these are the commonly used formulas)
- Exploits spatial locality, but is weak against locality breaches
- Has an efficient implementation (sketch below)
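A minimal sketch of a direct cache lookup, assuming 64-byte lines, 256 lines and a hypothetical `load_block` that fetches an aligned block from «slow» storage:

```python
LINE_SIZE = 64        # bytes per line -> 6 offset bits
NUM_LINES = 256       # lines in cache -> 8 index bits

tags = [None] * NUM_LINES    # stored tag per line (None = unused)
lines = [None] * NUM_LINES   # cached block per line

def split(address):
    offset = address % LINE_SIZE                  # byte within the line
    index = (address // LINE_SIZE) % NUM_LINES    # the only line it may use
    tag = address // (LINE_SIZE * NUM_LINES)      # which competing block it is
    return tag, index, offset

def read(address, load_block):
    tag, index, offset = split(address)
    if tags[index] == tag:          # tags equal: hit
        return lines[index][offset]
    tags[index] = tag               # not equal or unused: miss, fill the line
    lines[index] = load_block(address - offset)
    return lines[index][offset]
```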
Associative cache
Any line can cache any memory block (aligned to cache line size).
- Tag size = bit length of the number of blocks (address space // line size) ⇒ larger than in a direct cache
- Which block a line holds is recorded when it is cached
- Needs a hardware implementation of tag search (we don't know where a block is cached, if it is at all)
- ⇒ Efficient only if small
- Exploits temporal locality, robust against spatial locality breaches (sketch below)
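A minimal sketch of an associative cache with LRU eviction; the sizes and `load_block` are the same illustrative assumptions as above:

```python
from collections import OrderedDict

# Any line may hold any line-aligned block, so a lookup must search all
# stored tags; here the tag is the full block number.

LINE_SIZE = 64
NUM_LINES = 8                 # kept small: every lookup scans all tags

cache = OrderedDict()         # tag (block number) -> block data, LRU order

def read(address, load_block):
    tag, offset = address // LINE_SIZE, address % LINE_SIZE
    if tag in cache:                  # tag found anywhere: hit
        cache.move_to_end(tag)
        return cache[tag][offset]
    if len(cache) >= NUM_LINES:       # miss on a full cache: evict LRU line
        cache.popitem(last=False)
    cache[tag] = load_block(address - offset)
    return cache[tag][offset]
```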
Multi-associative cache
How to violate spatial locality:
- Perform random access to various addresses (rarely accidental)
- Have more than one spot of code/data flow (e.g. in a multitasking environment, and that is common)
A multi-associative cache is a superposition of the direct and associative caches:
- Each bank works as a direct cache
- ⇒ tag size is equal to the direct cache tag size
- There are a number of banks, and data is searched among them in an associative manner
- With a small number of banks, the search is relatively fast
- Can hold multiple locality spots
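A minimal sketch of a multi-associative cache, again with illustrative sizes and a hypothetical `load_block`:

```python
from collections import OrderedDict

# The index picks a set as in a direct cache, then the tag is searched
# only among that set's banks (ways).

LINE_SIZE = 64
NUM_SETS = 64     # direct-cache part: index selects one set
NUM_WAYS = 4      # associative part: banks searched within the set

sets = [OrderedDict() for _ in range(NUM_SETS)]   # per-set: tag -> data, LRU

def read(address, load_block):
    offset = address % LINE_SIZE
    index = (address // LINE_SIZE) % NUM_SETS     # as in a direct cache
    tag = address // (LINE_SIZE * NUM_SETS)       # direct-cache-sized tag
    ways = sets[index]
    if tag in ways:                   # search limited to NUM_WAYS banks
        ways.move_to_end(tag)
        return ways[tag][offset]
    if len(ways) >= NUM_WAYS:         # miss: evict this set's LRU line
        ways.popitem(last=False)
    ways[tag] = load_block(address - offset)
    return ways[tag][offset]
```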
Write cache
- Increases performance
- Temporal locality ⇒ multiple writes to the cache lead to a single write to storage
Cache strategies:
- Write through: no write cache
- if already cached — need to update the cached copy
- if the cache is missed:
- do nothing (only reads are cached)
- or
- also cache the data as if there had been a read (but this adds complexity and is surprisingly ineffective)
- Write back: the program continues after writing to the cache; the actual memory write is performed «in the background» somehow (see the sketch below)
- hot line: a cache line that is not synchronized with memory
- temporal locality: keep hot lines in cache as long as possible
- any other memory access, read or write (e.g. DMA), must be
- restructured to work through the cache
- or
- cause a cache flush
- or
- invalidate the corresponding cache line
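A minimal sketch of the write-back idea, assuming a dict-like «slow» storage and leaving the eviction policy out of scope; `WriteBackCache` and its methods are hypothetical names:

```python
# Writes land in the cache, the line becomes «hot» (not synchronized),
# and storage is updated only on eviction or an explicit flush
# (e.g. before DMA touches storage).

class WriteBackCache:
    def __init__(self, storage, capacity):
        self.storage = storage        # «slow» dict-like storage
        self.capacity = capacity
        self.data = {}                # key -> cached value
        self.hot = set()              # keys not yet written back

    def write(self, key, value):
        if key not in self.data and len(self.data) >= self.capacity:
            self._evict()
        self.data[key] = value
        self.hot.add(key)             # many cache writes, one storage write

    def _evict(self):
        key, value = self.data.popitem()   # any victim; policy is out of scope
        if key in self.hot:                # hot line: write back first
            self.storage[key] = value
            self.hot.discard(key)

    def flush(self):                  # synchronize all hot lines
        for key in self.hot:
            self.storage[key] = self.data[key]
        self.hot.clear()

storage = {}
cache = WriteBackCache(storage, capacity=4)
cache.write("x", 1)
cache.write("x", 2)     # temporal locality: still one pending storage write
cache.flush()
print(storage)          # {'x': 2}
```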