The buffer cache

6.6. The buffer cache
Reading from a disk is very slow compared to accessing (real) memory. In addition, it is common to read the same part of a disk several times during relatively short periods of time. For example, one might first read an e-mail message, then read the letter into an editor when replying to it, then make the mail program read it again when copying it to a folder. Or, consider how often the command ls might be run on a system with many users. By reading the information from disk only once and then keeping it in memory until no longer needed, one can speed up all but the first read. This is called disk buffering, and the memory used for the purpose is called the buffer cache.
Since memory is, unfortunately, a finite, nay, scarce resource, the buffer cache usually cannot be big enough (it can't hold all the data one ever wants to use). When the cache fills up, the data that has been unused for the longest time is discarded and the memory thus freed is used for the new data.
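The replacement rule described above (discard whatever has gone unused longest) can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not how any kernel implements its cache: the class name LRUBlockCache, the fake_disk function, and the capacity of two blocks are all made up for illustration.

```python
from collections import OrderedDict


class LRUBlockCache:
    """Toy cache that discards the least recently used block when full."""

    def __init__(self, capacity, read_from_disk):
        self.capacity = capacity              # assumed to be at least 1
        self.read_from_disk = read_from_disk  # hypothetical slow read function
        self.blocks = OrderedDict()           # block number -> data, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, block_no):
        if block_no in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block_no)  # now the most recently used
            return self.blocks[block_no]
        self.misses += 1
        data = self.read_from_disk(block_no)
        if len(self.blocks) >= self.capacity:
            self.blocks.popitem(last=False)    # drop the longest-unused block
        self.blocks[block_no] = data
        return data


def fake_disk(block_no):
    # Stand-in for a slow disk read; a real read would take milliseconds.
    return f"data-{block_no}"


cache = LRUBlockCache(capacity=2, read_from_disk=fake_disk)
for n in [1, 2, 1, 3, 2]:
    cache.read(n)
print(cache.hits, cache.misses)
```

Only the second read of block 1 is served from memory here; reading block 3 pushes out block 2, which then has to be fetched from the fake disk again.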
Disk buffering works for writes as well. On the one hand, data that is written is often soon read again (e.g., a source code file is saved to a file, then read by the compiler), so putting data that is written in the cache is a good idea. On the other hand, by only putting the data into the cache, not writing it to disk at once, the program that writes runs quicker. The writes can then be done in the background, without slowing down the other programs.
Most operating systems have buffer caches (although they might be called something else), but not all of them work according to the above principles. Some are write-through: the data is written to disk at once (it is kept in the cache as well, of course). The cache is called write-back if the writes are done at a later time. Write-back is more efficient than write-through, but also a bit more prone to errors: if the machine crashes, or the power is cut at a bad moment, or the floppy is removed from the disk drive before the data in the cache waiting to be written gets written, the changes in the cache are usually lost. This might even mean that the filesystem (if there is one) is not in full working order, perhaps because the unwritten data held important changes to the bookkeeping information.
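The difference between the two policies, and why write-back is the riskier one, can be illustrated with a small simulation. It is a sketch under simplifying assumptions: ToyDisk, ToyCache, and the crash method are hypothetical names, and a "crash" is modeled as simply losing everything held in memory.

```python
class ToyDisk:
    """Stand-in for persistent storage: survives a simulated crash."""

    def __init__(self):
        self.blocks = {}


class ToyCache:
    """Cache in front of a ToyDisk, in write-through or write-back mode."""

    def __init__(self, disk, write_back):
        self.disk = disk
        self.write_back = write_back
        self.cached = {}    # block number -> data held in memory
        self.dirty = set()  # blocks changed in memory but not yet on disk

    def write(self, block_no, data):
        self.cached[block_no] = data
        if self.write_back:
            self.dirty.add(block_no)           # defer the disk write
        else:
            self.disk.blocks[block_no] = data  # write-through: write at once

    def flush(self):
        # Write every pending block to disk.
        for block_no in sorted(self.dirty):
            self.disk.blocks[block_no] = self.cached[block_no]
        self.dirty.clear()

    def crash(self):
        # Memory contents vanish; only what reached the disk remains.
        self.cached.clear()
        self.dirty.clear()


through_disk = ToyDisk()
through = ToyCache(through_disk, write_back=False)
through.write(7, "new contents")
through.crash()

back_disk = ToyDisk()
back = ToyCache(back_disk, write_back=True)
back.write(7, "new contents")
back.crash()

print(through_disk.blocks)  # the write survived
print(back_disk.blocks)     # the write was lost
```

Calling back.flush() before the crash would have saved the data.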
Because of this, you should never turn off the power without using a proper shutdown procedure, and you should never remove a floppy from the disk drive until it has been unmounted (if it was mounted) or until whatever program is using it has signaled that it is finished and the floppy drive light is no longer lit. The sync command flushes the buffer, i.e., forces all unwritten data to be written to disk, and can be used when one wants to be sure that everything is safely written. In traditional UNIX systems, there is a program called update running in the background which does a sync every 30 seconds, so it is usually not necessary to use sync. Linux has an additional daemon, bdflush, which does a more imperfect sync more frequently to avoid the sudden freeze due to heavy disk I/O that sync sometimes causes.
Under Linux, bdflush is started by update. There is usually no reason to worry about it, but if bdflush happens to die for some reason, the kernel will warn about this, and you should start it by hand (/sbin/update).
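A program can also ask for the same kind of flush that the sync command performs. The following is a minimal sketch using Python's standard os module; the helper name write_durably and the temporary file are assumptions made for the example. os.fsync asks the kernel to write one file's cached data to disk, while os.sync (available on Unix-like systems) requests a flush of everything, much like the sync command.

```python
import os
import tempfile


def write_durably(path, data):
    """Write data and ask the kernel to put it on disk before returning."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # move data from Python's own buffer to the kernel
        os.fsync(f.fileno())  # ask the kernel to write this file's data to disk


# Hypothetical file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "example.txt")
write_durably(path, b"important data\n")

# System-wide flush, comparable to running the sync command.
if hasattr(os, "sync"):
    os.sync()
```

Flushing a single file is usually cheaper than a system-wide sync, which is why the sketch uses os.fsync for the data it actually cares about.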
The cache does not actually buffer files, but blocks, which are the smallest units of disk I/O (under Linux, they are usually 1 KB). This way, directories, super blocks, other filesystem bookkeeping data, and non-filesystem disks are cached as well.
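Because caching happens per block, any read or write of a byte range is really an operation on the blocks that contain those bytes. As a rough illustration, assuming the 1 KB block size quoted above (blocks_for_range is a hypothetical helper, not a system call):

```python
BLOCK_SIZE = 1024  # 1 KB, the block size mentioned in the text


def blocks_for_range(offset, length, block_size=BLOCK_SIZE):
    """Return the numbers of the blocks touched by a byte range."""
    if length <= 0:
        return []
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return list(range(first, last + 1))


print(blocks_for_range(0, 1024))    # → [0]
print(blocks_for_range(1000, 100))  # → [0, 1]
```

A 100-byte read that straddles a block boundary touches two blocks, so both end up in the cache.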
The effectiveness of a cache is primarily decided by its size. A small cache is next to useless: it will hold so little data that all cached data is flushed from the cache before it is reused. The critical size depends on how much data is read and written, and how often the same data is accessed. The only way to know is to experiment.
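The "critical size" effect is easy to reproduce in a small experiment. The sketch below assumes a least-recently-used cache and an artificial access pattern (eight blocks read in the same order ten times); the function hit_rate and the pattern are invented for illustration, and real workloads are far less regular.

```python
from collections import OrderedDict


def hit_rate(cache_size, accesses):
    """Fraction of accesses served by an LRU cache of cache_size blocks (>= 1)."""
    cache = OrderedDict()
    hits = 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # most recently used
        else:
            if len(cache) >= cache_size:
                cache.popitem(last=False)  # discard the longest-unused block
            cache[block] = True
    return hits / len(accesses)


accesses = list(range(8)) * 10  # blocks 0..7, read in order, ten times over
for size in (2, 4, 8):
    print(size, hit_rate(size, accesses))
# prints:
# 2 0.0
# 4 0.0
# 8 0.9
```

With two or four slots, every block is discarded before it is needed again, so the cache never helps. Once all eight blocks fit, only the first pass goes to disk.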
If the cache is of a fixed size, it is not very good to have it too big, either, because that might make the free memory too small and cause swapping (which is also slow). To make the most efficient use of real memory, Linux automatically uses all free RAM for buffer cache, but also automatically makes the cache smaller when programs need more memory.
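One way to watch this happen on a Linux system is to look at the Buffers and Cached lines of /proc/meminfo, which is not mentioned in the text above and is brought in here as an assumption about the reader's system. The sketch parses that format; parse_meminfo is a hypothetical helper, and the numbers in SAMPLE are made up rather than measured.

```python
def parse_meminfo(text):
    """Parse 'Name:   value kB' lines into a dict of integers."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


# Made-up sample in the /proc/meminfo format, not real measurements.
SAMPLE = """\
MemTotal:       16384 kB
MemFree:         2048 kB
Buffers:          512 kB
Cached:          8192 kB
"""

info = parse_meminfo(SAMPLE)
cache_kb = info["Buffers"] + info["Cached"]
print(cache_kb, "kB used for caching in the sample")
```

On a real machine, passing the contents of /proc/meminfo to parse_meminfo before and after starting a memory-hungry program should show these figures shrinking as the program's needs grow.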
Under Linux, you do not need to do anything to make use of the cache; it happens completely automatically. Except for following the proper procedures for shutdown and removing floppies, you do not need to worry about it.