Cache coherence - Wikipedia, the free encyclo...

Source: Baidu Wenku (百度文库); editor: Shenma Wenxue Wang (神马文学网); time: 2024/04/28 00:38:49
Cache coherence
From Wikipedia, the free encyclopedia

Multiple Caches of Shared Resource
In computing, cache coherence (also cache coherency) refers to the consistency of data stored in local caches of a shared resource. Cache coherence is a special case of memory coherence.
When clients in a system maintain caches of a common memory resource, problems may arise with inconsistent data. This is particularly true of CPUs in a multiprocessing system. Referring to the "Multiple Caches of Shared Resource" figure, if the top client has a copy of a memory block from a previous read and the bottom client changes that memory block, the top client could be left with a stale copy of that block without any notification of the change. Cache coherence is intended to manage such conflicts and maintain consistency between cache and memory.
Contents
1 Definition
2 Cache coherence mechanisms
3 Coherency protocol
4 Further reading
5 See also
Definition
Coherence defines the behavior of reads and writes to the same memory location. The coherence of caches is obtained if the following conditions are met:
A read made by a processor P to a location X that follows a write by the same processor P to X, with no writes to X by another processor occurring between the write and the read made by P, must always return the value written by P. This condition is related to the preservation of program order, and it must hold even in uniprocessor architectures.
A read made by a processor P1 to location X that follows a write by another processor P2 to X must return the value written by P2 if no other writes to X by any processor occur between the two accesses. This condition defines the concept of a coherent view of memory. If processors can still read the old value after the write made by P2, the memory is incoherent.
Writes to the same location must be sequenced. In other words, if location X receives two different values A and B, in this order, from any two processors, no processor can read location X as B and then read it as A. The location X must be seen with values A and B in that order.
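The third condition (write serialization) can be illustrated with a small checker. This is only a sketch under stated assumptions: the function name `respects_write_order` and the trace format are hypothetical, each written value is assumed to be distinct, and reads are assumed to be listed in the order each processor performed them.

```python
from typing import Dict, List, Sequence


def respects_write_order(write_order: Sequence[str],
                         observed: Dict[str, List[str]]) -> bool:
    """Check that no processor sees the writes to X out of order.

    write_order: the values written to location X, in their agreed sequence.
    observed:    for each processor, the values it read from X, in read order.
    """
    # Map each written value to its position in the write sequence.
    position = {value: i for i, value in enumerate(write_order)}
    for reads in observed.values():
        last = -1
        for value in reads:
            idx = position[value]
            if idx < last:
                # This processor saw a later write, then an earlier one.
                return False
            last = idx
    return True


# P3 never sees A at all, which is allowed; it only must not go backwards.
ok = respects_write_order(
    ["A", "B"],
    {"P1": ["A", "B"], "P2": ["A", "A", "B"], "P3": ["B", "B"]},
)
# P2 reads B and then A, which violates write serialization.
bad = respects_write_order(
    ["A", "B"],
    {"P1": ["A", "B"], "P2": ["B", "A"]},
)
```

The checker covers only the ordering condition; it does not say how soon a write must become visible, which is the job of the memory consistency model.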
These conditions are defined on the assumption that read and write operations take effect instantaneously. However, this does not happen in computer hardware, given memory latency and other aspects of the architecture. A write by processor P1 may not be seen by a read from processor P2 if the read is made very shortly after the write. The memory consistency model defines when a written value must be seen by a subsequent read instruction issued by the other processors.
Cache coherence mechanisms
Directory-based coherence: In a directory-based system, the data being shared is placed in a common directory that maintains the coherence between caches. The directory acts as a filter through which the processor must ask permission to load an entry from the primary memory to its cache. When an entry is changed the directory either updates or invalidates the other caches with that entry.
Snooping is the process where the individual caches monitor address lines for accesses to memory locations that they have cached. When a write operation is observed to a location that a cache has a copy of, the cache controller invalidates its own copy of the snooped memory location.
Snarfing is where a cache controller watches both address and data in an attempt to update its own copy of a memory location when a second master modifies a location in main memory. When a write operation is observed to a location that a cache has a copy of, the cache controller updates its own copy of the snarfed memory location with the new data.
Distributed shared memory systems mimic these mechanisms in an attempt to maintain consistency between blocks of memory in loosely coupled systems.
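The difference between snooping (invalidate the local copy) and snarfing (update the local copy) can be sketched with a toy bus model. Everything here is illustrative: the `Cache` and `Bus` classes, the `policy` strings, and the write-through memory are assumptions made for the example, not a description of any real cache controller.

```python
class Cache:
    def __init__(self, name, policy):
        self.name = name
        self.policy = policy  # "invalidate" (snooping) or "update" (snarfing)
        self.lines = {}       # address -> value, valid lines only

    def observe_write(self, addr, value):
        """React to a write by another bus master seen on the bus."""
        if addr not in self.lines:
            return  # nothing cached for this address, nothing to do
        if self.policy == "invalidate":
            del self.lines[addr]      # snooping: drop the stale copy
        else:
            self.lines[addr] = value  # snarfing: take the new data


class Bus:
    def __init__(self):
        self.memory = {}
        self.caches = []

    def attach(self, cache):
        self.caches.append(cache)

    def read(self, cache, addr):
        # On a miss, fill the line from main memory.
        if addr not in cache.lines:
            cache.lines[addr] = self.memory.get(addr, 0)
        return cache.lines[addr]

    def write(self, writer, addr, value):
        writer.lines[addr] = value
        self.memory[addr] = value  # assumption: write-through to memory
        for cache in self.caches:
            if cache is not writer:
                cache.observe_write(addr, value)


bus = Bus()
top = Cache("top", "invalidate")
bottom = Cache("bottom", "invalidate")
snarfer = Cache("snarfer", "update")
for c in (top, bottom, snarfer):
    bus.attach(c)

bus.memory[0x10] = 1
bus.read(top, 0x10)      # top caches the old value
bus.read(snarfer, 0x10)  # so does the snarfing cache
bus.write(bottom, 0x10, 2)

top_still_valid = 0x10 in top.lines  # the snooping cache dropped its copy
snarfed_value = snarfer.lines[0x10]  # the snarfing cache holds the new data
top_reread = bus.read(top, 0x10)     # the miss refills from memory
```

In this model the snooping cache pays for a later miss, while the snarfing cache pays by capturing data on every observed write, whether or not it reads that location again.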
The two most commonly studied coherence mechanisms are snooping and directory-based, each with its own benefits and drawbacks. Snooping protocols tend to be faster, if enough bandwidth is available, since every transaction is a request/response seen by all processors. The drawback is that snooping does not scale: every request must be broadcast to all nodes in the system, so as the system gets larger, the size of the (logical or physical) bus and the bandwidth it provides must grow. Directories, on the other hand, tend to have longer latencies (a 3-hop request/forward/respond sequence) but use much less bandwidth, since messages are point-to-point rather than broadcast. For this reason, many of the larger systems (>64 processors) use directory-based cache coherence.
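The bandwidth argument can be made concrete with a rough message count. The sketch below is a deliberate simplification: the `Directory` class, its two-message request/reply cost, and `snooping_write_deliveries` are hypothetical, acknowledgements and the forwarding hop are not modelled, and a broadcast is counted as one delivery per other node.

```python
class Directory:
    def __init__(self):
        self.sharers = {}  # block address -> set of node ids holding a copy
        self.messages = 0  # point-to-point messages sent so far

    def read(self, node, addr):
        self.messages += 2  # request to the directory, data reply
        self.sharers.setdefault(addr, set()).add(node)

    def write(self, node, addr):
        self.messages += 2  # request to the directory, permission reply
        others = self.sharers.get(addr, set()) - {node}
        self.messages += len(others)  # one invalidation per actual sharer
        self.sharers[addr] = {node}   # the writer is now the only holder
        return others


def snooping_write_deliveries(num_nodes):
    # A broadcast write must be observed by every other node.
    return num_nodes - 1


directory = Directory()
directory.read(3, 0x40)
directory.read(7, 0x40)
invalidated = directory.write(3, 0x40)

directory_total = directory.messages             # 2 + 2 + 2 + 1
broadcast_cost = snooping_write_deliveries(64)   # one write on 64 nodes
```

With only two sharers, the directory sends a single invalidation regardless of system size, while the broadcast cost of a single write grows with the number of nodes.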
Coherency protocol
A coherency protocol is a protocol that maintains consistency between all the caches in a system of distributed shared memory; the protocol maintains memory coherence according to a specified consistency model. Most cache protocols in multiprocessors support the sequential consistency model, while in software distributed shared memory, models supporting release consistency or weak consistency are more popular.
Transitions between states in any specific implementation of these protocols may vary. For example, an implementation may choose different update and invalidation transitions such as update-on-read, update-on-write, invalidate-on-read, or invalidate-on-write. The choice of transition may affect the amount of inter-cache traffic, which in turn may affect the amount of cache bandwidth available for actual work. This should be taken into consideration in the design of distributed software that could cause strong contention between the caches of multiple processors.
Various models and protocols have been devised for maintaining cache coherence, such as:
MSI protocol
MESI protocol aka Illinois protocol
MOSI protocol
MOESI protocol
MERSI protocol
MESIF protocol
Write-once protocol
Synapse protocol
Berkeley protocol
Firefly protocol
Dragon protocol
Choice of the consistency model is crucial to designing a cache coherent system. Coherence models differ in performance and scalability; each must be evaluated for every system design.
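As one concrete instance of the protocols listed above, the state of a single cache line under MSI can be written as a transition table. This is a sketch of the commonly taught invalidation-based form with states Modified, Shared, and Invalid; the event names (`local_read`, `remote_write`, and so on) are made up for the example, bus actions such as write-back are omitted, and real implementations may choose different transitions.

```python
# (current state, event) -> next state for one cache line.
# "local" events come from this cache's own processor;
# "remote" events are accesses by other caches seen on the interconnect.
TRANSITIONS = {
    ("I", "local_read"): "S",    # miss: fetch a shared copy
    ("I", "local_write"): "M",   # miss: fetch with exclusive ownership
    ("I", "remote_read"): "I",
    ("I", "remote_write"): "I",
    ("S", "local_read"): "S",    # hit
    ("S", "local_write"): "M",   # upgrade; other copies are invalidated
    ("S", "remote_read"): "S",
    ("S", "remote_write"): "I",  # another cache took ownership
    ("M", "local_read"): "M",    # hit
    ("M", "local_write"): "M",   # hit
    ("M", "remote_read"): "S",   # supply the data, keep a shared copy
    ("M", "remote_write"): "I",  # supply the data, give up the line
}


def next_state(state, event):
    return TRANSITIONS[(state, event)]


def run(events, state="I"):
    """Apply a sequence of events to a line starting in the given state."""
    for event in events:
        state = next_state(state, event)
    return state


# I -> S -> M -> S -> I
final = run(["local_read", "local_write", "remote_read", "remote_write"])
```

Protocols such as MESI or MOESI extend this table with extra states to avoid some of the traffic that plain MSI generates.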
Further reading
Handy, Jim. The Cache Memory Book. Academic Press, Inc., 1998. ISBN 0-12-322980-4
See also
ccNUMA
Write barrier