
Tuesday, June 23, 2009

Storage Hierarchy

The storage hierarchy is the range of memory and storage devices within a computer system. The following list starts with the slowest devices and ends with the fastest. See storage and memory.

VERY SLOW
  Punch cards (obsolete)
  Punched paper tape (obsolete)

FASTER
  Bubble memory
  Floppy disks

MUCH FASTER
  Magnetic tape
  Optical discs (CD-ROM, DVD-ROM, MO, etc.)
  Magnetic disks with movable heads
  Magnetic disks with fixed heads (obsolete)
  Low-speed bulk memory

FASTEST
  Flash memory
  Main memory
  Cache memory
  Microcode
  Registers
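The tiers above can be sketched as an ordered list of levels. The access-time figures below are rough order-of-magnitude estimates for illustration only, not measurements of any specific device:

```python
# Illustrative sketch: the storage hierarchy as an ordered list of levels.
# The access-time figures are rough order-of-magnitude assumptions.
HIERARCHY = [
    # (level, approximate access time in seconds)
    ("registers",     1e-9),
    ("cache memory",  1e-8),
    ("main memory",   1e-7),
    ("flash memory",  1e-4),
    ("magnetic disk", 1e-2),
    ("magnetic tape", 1e1),   # includes seek/mount time
]

def slowdown(level_a, level_b):
    """Return how many times slower level_b is than level_a."""
    times = dict(HIERARCHY)
    return times[level_b] / times[level_a]

print(slowdown("registers", "magnetic disk"))  # roughly ten million
```

The point of the ordering is the enormous gap between adjacent tiers: each step down trades speed for capacity and persistence.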






Storage Hierarchy


To clarify the "guarantees" provided at different settings of the persistence spectrum without binding the application to a specific environment or set of storage devices, MBFS implements the continuum, in part, with a logical storage hierarchy. The hierarchy is defined by N levels:



1. LM (Local Memory storage): very high-speed volatile storage located on the machine creating the file.

2. LCM (Loosely Coupled Memory storage): high-speed volatile storage consisting of the idle memory space available across the system.

3-N. DA (Distributed Archival storage): slower-speed stable storage space located across the system.

Logically, each successive level of the hierarchy is characterized by stronger persistence, larger storage capacity, and slower access times. The LM level is simply locally addressable memory (whether on or off CPU). The LCM level combines the idle memory of machines throughout the system into a loosely coupled, and constantly changing, storage space. The DA level may actually consist of any number of sub-levels (denoted DA1, DA2, ..., DAn), each of increasing persistence (or capacity) and decreasing performance. LM data will be lost if the current machine crashes or loses power. LCM data has the potential to be lost if one or more machines crash or lose power. DA data is guaranteed to survive power outages and machine crashes. Replication and error correction are provided at the LCM and DA levels to improve the persistence offered by those levels.

Each level of the logical MBFS hierarchy is ultimately implemented by a physical storage device. LM is implemented using standard RAM on the local machine, and LCM using the idle memory of workstations throughout the network. The DA sub-levels must be mapped to some organization of the available archival storage devices in the system. The system administrator is expected to define the mapping via a system configuration file. For example, DA1 might be mapped to the distributed disk system while DA2 is mapped to the distributed tape system.
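The administrator-defined mapping might look like the following sketch. MBFS's actual configuration format is not given here, so the structure and device names are illustrative assumptions:

```python
# Hypothetical sketch of a configuration mapping the logical DA sub-levels
# to physical archival devices; the format and names are illustrative,
# not MBFS's real configuration syntax.
DA_MAPPING = {
    "DA1": "distributed-disk",   # stable, fastest archival tier
    "DA2": "distributed-tape",   # higher capacity, slower access
}

def device_for(level):
    """Resolve a logical DA level to its configured physical device."""
    return DA_MAPPING[level]

print(device_for("DA2"))  # distributed-tape
```

Because applications name only the logical level, swapping the physical device behind DA2 requires changing only this mapping, not the application.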


Because applications are written using the logical hierarchy, they can be run in any environment, regardless of the mapping. The persistence guarantees provided by the three main levels of the hierarchy (LM, LCM, DA1) are well defined. In general, applications can use the other layers of the DA to achieve higher persistence guarantees without knowing the exact details of the persistence guaranteed; only that it is better. For applications that want to change their storage behavior based on the characteristics of the current environment, the details of each DA level's persistence guarantees, such as the expected mean time to failure, can be obtained via a stat() call to the file system. Thus, MBFS makes the layering abstraction explicit while hiding the details of the devices used to implement it. Applications can control persistence with or without exact knowledge of the characteristics of the hardware used to implement it. Once the desired persistence level has been selected, MBFS's loosely coupled memory system uses an addressing algorithm to distribute data to idle machines and employs a migration algorithm to move data off machines that change from idle to active. The details of the addressing and migration algorithms can be found in [15,14]; the archival storage levels use the same algorithms. Finally, MBFS provides whole-file consistency via callbacks similar to Andrew [19] and a Unix security and protection model.
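An application that adapts its storage behavior to the environment might query per-level persistence characteristics and pick the fastest level that meets its needs, in the spirit of the stat() call described above. This is a sketch, not the real MBFS API; the field names and MTTF figures are illustrative assumptions:

```python
# Sketch (not the real MBFS interface): choose the fastest storage level
# whose expected mean time to failure meets the application's requirement.
# All names and numbers below are illustrative assumptions.
LEVEL_STATS = {
    "LM":  {"mttf_hours": 0},        # volatile: lost on local crash
    "LCM": {"mttf_hours": 100},      # replicated idle memory
    "DA1": {"mttf_hours": 50_000},   # stable archival storage
}

def pick_level(required_mttf_hours):
    """Choose the fastest level whose expected MTTF meets the requirement."""
    for level in ("LM", "LCM", "DA1"):   # fastest first
        if LEVEL_STATS[level]["mttf_hours"] >= required_mttf_hours:
            return level
    return "DA1"  # fall back to the most persistent well-defined level

print(pick_level(1_000))  # DA1
```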


Caching
Caching is a well-known concept in computer science: when a program repeatedly accesses the same set of instructions, a massive performance benefit can be realized by storing those instructions in RAM. Retrieving them quickly from RAM spares the program from accessing the disk thousands or even millions of times during execution. Caching on the web is similar in that it avoids a round trip to the origin web server each time a resource is requested and instead retrieves the file from the local computer's browser cache or from a proxy cache closer to the user.
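A minimal sketch of this idea: keep the results of expensive lookups in memory so that repeated requests avoid the slow path (disk access, or on the web, a round trip to the origin server). The function names here are illustrative:

```python
# Minimal caching sketch: the slow path (standing in for disk or network
# access) is taken once per key; later requests are served from memory.
cache = {}
slow_fetches = 0

def fetch(key):
    """Simulated slow retrieval (stands in for disk or network access)."""
    global slow_fetches
    slow_fetches += 1
    return f"data-for-{key}"

def cached_fetch(key):
    if key not in cache:          # miss: pay the slow cost once
        cache[key] = fetch(key)
    return cache[key]             # hit: served from memory

for _ in range(1000):
    cached_fetch("index.html")
print(slow_fetches)  # 1
```

A thousand requests cost one slow fetch; the other 999 are cache hits.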
 
The most commonly encountered caches on the web are the ones found in a user's web browser, such as Internet Explorer, Mozilla, and Netscape. When a web page, image, or JavaScript file is requested through the browser, each of these resources may be accompanied by HTTP header directives that tell the browser how long the object can be considered fresh, that is, for how long the resource can be retrieved directly from the browser cache rather than from the origin or proxy server. Since the browser represents the cache closest to the end user, it offers the maximum performance benefit whenever content can be stored there.
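The freshness check a browser performs can be sketched as follows. A cached response may be served directly while its age is below the max-age given in the Cache-Control header; this is a simplified reading of the HTTP caching rules, since real browsers also consider Expires, validators, and other directives:

```python
# Simplified freshness check based on the Cache-Control max-age directive.
def is_fresh(cache_control, age_seconds):
    """Return True if a cached response with the given age is still fresh."""
    for directive in cache_control.split(","):
        directive = directive.strip()
        if directive.startswith("max-age="):
            max_age = int(directive.split("=", 1)[1])
            return age_seconds < max_age
    return False  # no max-age: treat as not fresh in this simplified model

print(is_fresh("public, max-age=3600", 120))   # True
print(is_fresh("public, max-age=3600", 7200))  # False
```

While is_fresh returns True, the resource never leaves the user's machine, which is why the browser cache offers the largest performance benefit.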
 
Coherency and consistency
Transactional Coherence and Consistency (TCC) offers a way to simplify parallel programming by executing all code in transactions. In TCC systems, transactions serve as the fundamental unit of parallel work, communication, and coherence. As each transaction completes, it writes all of its newly produced state to shared memory atomically, while restarting other processors that have speculatively read from the modified data. With this mechanism, a TCC-based system automatically handles data synchronization correctly, without programmer intervention. To gain the benefits of TCC, programs must be decomposed into transactions. Decomposing a program into transactions is largely a matter of performance tuning rather than correctness, and a few basic transaction programming optimization techniques are sufficient to obtain good performance over a wide range of applications with little programmer effort.
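The commit-and-restart mechanism can be modeled with a toy simulation: each transaction buffers its writes and publishes them to shared memory atomically at commit, and committing restarts any other transaction that speculatively read a location the committer wrote. This is an illustrative software model, not a hardware TCC implementation:

```python
# Toy model of TCC: buffered writes, atomic commit, and restarting of
# transactions that speculatively read data the committer modified.
shared = {"x": 0}

class Transaction:
    def __init__(self, name):
        self.name = name
        self.reads, self.writes = set(), {}
        self.restarted = False

    def read(self, addr):
        self.reads.add(addr)                # track speculative reads
        return self.writes.get(addr, shared[addr])

    def write(self, addr, value):
        self.writes[addr] = value           # buffered until commit

    def commit(self, others):
        shared.update(self.writes)          # atomic publish of new state
        for t in others:                    # restart conflicting readers
            if self.writes.keys() & t.reads:
                t.restarted = True

t1, t2 = Transaction("t1"), Transaction("t2")
t2.read("x")            # t2 speculatively reads x
t1.write("x", 42)
t1.commit(others=[t2])  # t1 publishes x; t2 must restart
print(shared["x"], t2.restarted)  # 42 True
```

The conflict detection (intersecting the committer's write set with other transactions' read sets) is what lets the system handle synchronization without explicit locks.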
