The mechanism of virtual memory was developed in the years 1960. It is based on the use of a mass memory (standard Hard drive or in the past a drum), for the goal, inter alia, to make it possible programs to be able to be carried out in a material environment having less main memory than necessary (or, considering differently, to make turn more programs than the main memory cannot contain some!)

The virtual memory allows:

History

The standard commodity of James Kilburn, published in 1962, described the first computer equipped with a management system of paginated virtual memory and using a drum like extension of the main memory to tori of Ferrite: the Atlas.

Today, all the computers have a mechanism of management of the virtual memory, except some Supercalculateur S or systems embarked real-time.

Memory virtuelle paginated

The principle is the following:
  • the addresses memories emitted by the processor are virtual addresses, indicating the position of a word in the virtual memory .
  • This memory virtual is made of of the same zones cuts, called pages. A virtual address is thus a couple (number of page, displacement in the page). The size of the pages is a power of two, in order to determine without calculation displacement (10 bits of weak weight of the virtual address for pages of 1024 words), and the number of page (other bits).
  • the physical memory is also made up of of the same zones cuts, called “frameworks” ( frames in English), in which seat the pages take.
  • a mechanism of translation ( translation , or generation of address ) ensures the conversion of the virtual addresses into physical addresses, by consulting a table of the pages” ( page counts English ) to know the number of the framework which contains the required page. The physical address obtained is the couple (number of framework, displacement).
  • It can have there more pages than executives (it is all the interest there): the pages which are not in memory are stored on another support (disc), they will be brought back within a framework when one needs some.

The table of the pages is indexed by the number of page. Each line is called “entered the table of the pages” ( pages counts entry , shortened PTE), and contains the number of framework. The table of the pages being able to be located anywhere in memory, a special register (PTBR for Page Counts Register Base) preserves its address.

In practice, the mechanism of translation belongs to an electronic circuit called DRIVEN (memory management links) which contains also part of the table of the pages, stored in a associative memory made of fast registers. This avoids having to consult the table of the pages (in memory) for each access report (including the table look-up of the pages?!)

Here a real example of a machine whose processor generates virtual addresses on 32 bits, thus being able to reach 4 Gio of memory. The size of the page is of 4  KB. One from of deduced that the field displacement occupies the 12 bits of weak weight, and the field number of page 20 bits of strong weight.

One will note the presence of a special field pertaining to each PTE. To simplify, we reduced the width of this field to a bit: the bit of validity . If this one is to 0, that means that the number of framework is invalid. It is thus necessary to obtain a technique making it possible to update this PTE to make it valid.

Three cases can occur:

  1. the entry is valid: it replaces the number of page to form the physical address.

  2. the entry in the table of the pages is invalid. In this case it is necessary to find a framework free in memory physical and to put its number in this entry of the table of the pages.
  3. the entry in the table of the pages is valid but corresponds at an address on the mass memory where the contents of the framework are. A mechanism will have to bring back these data to place them in physical memory.

Allowance with the request

See also: Algorithms of replacement of the lines of mask

In the last both cases, a interruption - called defect of page (or sometimes for lack of page, translation of English page fault - is generated by the material and gives the hand to the operating system. This one with the responsibility of find a framework available in memory main in order to allocate it with the process responsible for this defect of page, and if required to reload the contents of this frame by the contents saved on the mass memory (usually the hard drive, on a zone called zone of exchange or swap ).

It may be that there is no more no free framework in main memory: this one is occupied with 100  %. In this case a algorithm of pagination to the responsibility to choose a page “victim”. This page will be seen either immediately reallocated with the petitioning process, or it will be initially safeguarded on hard drive, and the entry of the table of the pages which it reference will be updated. It will be noted that the victim page can belong very well to the process which lacks place.

Below some examples of algorithms are listed. The first line corresponds to the chain of references , i.e. the order in which the process will reach the pages. It is supposed that the main memory is made up of three frames . The frame victim will appear underlined. The initial defects of page are not counted (they are identical number some is the selected algorithm).

  • the optimal algorithm : the number of defects of page is tiny room to 6. The rule of replacement is “to replace the frame which will not be used for the longest length of time”. Unfortunately, this algorithm would require to know the future. The other algorithms will thus try to approach this optimal solution.

  • FIFO ( First in, first out or “sunken First, left first”): the framework victim is that which was brought in memory there is longest (more “old”). Note that it is not necessary to preserve the moment to which a framework was replaced: it is enough to maintain a structure FIFO, to replace the framework whose number appears at the head, and to insert the number of the new framework in last position. This algorithm gives place to 12 replacements:

  • the algorithm generally used is called LRU ( Least recently used , “is the least recently used”). It consists in choosing as victim the framework which was not referred since longest. One can implement it is by adding bits in each entry of the table of the pages which indicate when the last reference to this entry took place, that is to say via a structure of list where one will bring in first position the framework recently referred, the executives victims thus remaining in last positions. This algorithm gives place to 8 replacements:

  • Other algorithms:
    • random Replacement: where the frame victim is randomly selected.
    • LFU ( Least frequently used “is the least most often used”): one keeps a meter which is incremented with each time the framework is referred, and the victim will be the framework whose meter is lowest. Disadvantage: with the starting of the program some pages can be intensely used, then never again thereafter. The value of the meter will be so high that they will be replaced only too tardily. It is also necessary to manage the case of capacity overshooting of the meter…

It can be relatively easy to find cases pathological which make an algorithm unusable. For example, for algorithm LRU, it would be about a program which uses 5 pages in a loop on a machine which counts only 4 cadres'. It initially sequentially will use the 4 first frameworks (1, 2,3,4) then a defect of page will occur and it is the page 1, in the past charged, which will be the victim. The pages used are now (5, 2,3,4). Since the program buckles, it needs page 1 (following page 5). This time, the victim page is page 2, replaced by the 1: (5, 1,3,4), then (5, 1,2,4), (5, 1,2,3), etc a defect of page is generated with each iteration…

Anomaly of Belady

Intuitively, to increase the number of frameworks of pages (i.e. to increase the main memory), must reduce the number of defects of page.

The anomaly of Belady (1970) is a counterexample which indeed shows that it is not absolutely true with algorithm FIFO, the reader will be able to check by itself that the sequence of references (3, 2,1,0,3,2,4,3,2,1,0,4) led to

  • 9 defects of page with 3 frameworks,
  • 10 defects of page with 4 frameworks.

Remark : one should not exaggerate the range of this curiosity. It shows certainly that algorithm FIFO does not have in general a property which one would have expected (to add memory reduces the defects of page) but it does not show that it does not have it on average . And in any case algorithm FIFO is never used for the replacement of page.

In addition, one can show that certain algorithms of replacement of pages (LRU for example) are not prone to this type of anomaly.

Method of allowance in a multiprogrammé system

The methods of selection of the victim page evoked above can apply either to the pages belonging to a process (one speaks then about “local allowance”), or in all the pages and thus with all the memory (in this case the technique of allowance is known as “total”).

In a system of total allowance, the execution time of a process can largely vary from one authority to another because the number of defects of page does not depend on the process itself. Of another dimensioned, this system allows the number of frameworks allocated with a process to evolve/move.

Divide of memory in a paginated system

The following diagram shows three processes in the course of execution, for example a text editor named ED. The three authorities all are located at the same virtual addresses (1, 2,3,4,5). This program uses two distinct zones memory: the pages which contain the code, i.e. instructions describing the program, and the data field, the file in the course of edition. It is enough to keep the same entries in the table of the pages so that the three authorities share the zone of code. On the other hand, the entries corresponding to the pages of data are, they, distinct.

Protection

bits of protections are added to each entry of the table of the pages. Thus one will be able easily to make the distinction between the pages allocated with the core, in reading alone, etc Voir the example below.

Effectiveness

Three main issues are encountered:
  1. size of the table of the pages: for an architecture where 20 bits are reserved for the number of page, the table will occupy 4 minimum Mo of memory (1024 entries of 4  KB each one). This problem is solved by the use of several tables of pages: the fields number of page will be broken up into several, each one indicating a displacement in the table moreover low level. VAX and the Pentium S support two levels, SPARC three, Motorola 680x0 four… One can also segment the table of the pages.
  2. access time: the table of the pages being located in memory, one would need two accesses report by request on behalf of the processor. To mitigate this problem the entries most often used are preserved in an associative memory (Mémoire hiding place) called TLB for Translation Lookaside Buffer . Each virtual address from the processor is sought in the TLB; if there is correspondence, the entry of the TLB is used, if not an interruption is started and the TLB will have to be updated by the entry of the table of the pages stored in memory before the faulty instruction is not started again. All the current microprocessors have a TLB.
  3. Phenomenon of trashing (collapse): the more the Taux of multiprogramming increases, the less each process is seen allocating pages. At the end of one moment, the system saturates because too many defects of page are generated. The phenomenon of trashing appears with each time, in a hierarchical storage system, one of the levels is seen overloaded. It is for example the case if the memory hiding place is too small. At this time the ceaseless return tickets of data along the hierarchy strongly will decrease the output of the computer. It is possible to decrease the effects of this behavior is by adding material resources (to add memory), to decrease the rate of multiprogramming, or to modify the priority of the processes.

Principle of locality

The behavior of the programs is not chaotic: the program starts, it calls upon functions (or parts of code) which call of them others in their turn, etc Each one of these calls defines an area. It is probable that the program will spend much time to be carried out within some areas: it is the principle of locality. The same principle can be applied to the pages containing of the data.

In other words, a program frequently reaches a small whole of pages, and this whole of pages evolves/moves slowly with time.

If one is able to store these often reached spaces, one decreases the chances to see a program putting itself at trasher , i.e. to claim pages which one has just withdrawn to him recently.

The working set : workspace

One can define a parameter, Δ, which is the number of references to the pages reached by the process during a certain amount of time. The figure below watch the value of the workspace at two different moments:

It is necessary to choose the value of Δ carefully: too much small it does not cover the nominal workspace of the process, if it is too large it includes useless pages. If Δ is equal ad infinitum, it covers the totality of the program, of course.

For a single process, one can represent graphically how the memory is allocated to him, and visualize the workspaces:

The “plates” are zones where there is no defect of page: allocated space is sufficiently large to contain all the frames of which the process needs during a relatively long time. The defects of pages take place in the ascending part of the transition, while memory is released when the curve falls down towards the next workspace: zone of balance.

It is with the operating system to implement the algorithms so that the value of Δ is optimum so that the Taux of multiprogramming and the use of the Central processing unit are maximized. In other words: to avoid the trashing . If the sum of the workspaces of each process is higher than the number of frames available, there will be inevitably collapse.

Fragmentation

A system paginated with the disadvantage of generating an internal fragmentation: a whole page is allocated with a process, whereas only some bytes are occupied. For example, if one supposes a size of page of 4  KB, a process needing 5.000 bytes will be seen allocating 2 pages, that is to say 8.192 bytes, nearly 40% “is lost”.

Prépagination

One of the advantages of the virtual memory is to be able to begin the execution of a program as soon as its first page of code is charged in memory. The prepagination not only will charge the first page, but the few following ones, of which the probability of being reached is very high.

Cut pages for some computers

Here indicated out of bits, addressable total space, the width of the fields number of page, and displacement.

Example

  • Here an example, drawn from the handbook of the Tahoe, a clone of VAX:
the addresses are coded on 32 bits (4 Gio of total space)
the size of the page is of 1 KB (coded on 10 bits).
the entries in the table of the pages are with this format:

3 3 2 2 2 2 2 1 0 7 3 2 1 0 0 +---+------+-----+---+---+---+------------------------------------------------+ | V | PROT | | NR | M | U | NDP | +---+------+-----+---+---+---+------------------------------------------------+

Fields M, U, NR and NDP are valid only if the bit V is to 1. When V is to 0, field NDP contains the address on the hard drive where the page is.

Fields PROT must be interpreted as this (the value of the field is given into binary on 4 bits):

The bit 24, NR (Non-cachée), means that the page is not hides some and that the system must read or write directly since or towards the memory.

The bit M (Modifiée) is modified by the material if the contents of the page are modified.

The bit U (Utilized) indicates if the page to summer read or written by a process. It is useful, in partnership with the others, for the determination of the workspace ( Working Set ) of a process (cf above).

  • the call system vfork (2) of the operating system Unix creates a new context (process) by duplicating the table of the pages of the process which makes the call (its father ). The part of the table of the pages marked in reading alone (the code) will be duplicated such as it is. The pages which correspond to the data are marked Copy one Write . When Unix must carry out a writing on a marked page Copy one Write , it will allocate a news frame , will recopy the contents of the original frame and finally will make the modification requested on this news frame . Finally vfork (2) is thus a call inexpensive system because the weather is not large thing…

  • For those which can read the sources C of Unix, the definition of the PTE is given in the file <… /pte.h> of various architectures. An excellent example in the manner of using the PTE since a user's program is provided in the source of the program PS of BSD 4.3.

Segmentation

The segmentation offers a sight of the more consistent memory with that of the user. Indeed, this one does not consider (or seldom!) memory like a continuation of pages but rather by spaces, or areas, dedicated to a particular use as for example: the code of a program, the data, the pile, a whole of subroutines, modules, a table, etc the segmentation reflects this organization.

Each logical object will be indicated by a segment. In a segment addressing will be done using a displacement. The couple (segment, displacement) will be translated into address memory by the means of a table of segments containing two fields, limit and bases . The base is the address of beginning of the segment, and limit the last address of the same segment:

Problem of fragmentation

The paginated systems encounter a problem of fragmentation intern : place is lost at the end of a page. The segmented systems know an external problem of fragmentation : spaces between segments are too small to place new fragments, this space is thus lost.

It is possible to recover it by compacting the memory, i.e. by moving the segments - while reflecting these modifications in the tables of the segments - so that it are contiguous. Nevertheless this operation is expensive.

Share segments

It is possible to share segments between process, as illustrated on the figure below, where two processes Ed1 and Ed2 share the same segment of code (program) but have disjoined segments for the data and of different sizes.

Protection in a segmented system

This protection will be ensured by additional bits added in the table of the segments, in the same way that for a paginated system.

Example of microprocessors with architecture segmented memory

The most known example is the Intel 8086 and its four registers:
  • CS, for Code Segment: point towards the segment containing the current program.
  • DS, for Data Segment: point towards the segment containing the data of the program in the course of execution.
  • ES, for will Extra Segment: point towards the segment of which the use is left to the programmer.
  • S, for Stack Segment: point towards the segment containing the pile.

The successors of the 8086 are also segmented:

  • the 80286 can manage 16 physical Mo of memory and 1 Go of virtual memory is 16.384 segments of 64 KB.
  • the 80386 4 Go of physical memory, 64 To of virtual memory, is 16.384 segments of 4 Go.

Paginated/segmented analog and digital systems

It is possible to mix the two preceding modes:
  1. the segmented Pagination, where the table of the pages will be segmented. In other words, the number of page p of the couple (p, d) of the virtual address will be interpreted like a segment (S, p'). This system solves the problem of size of the table of the pages.
  2. the paginated Segmentation, where each segment will be paginated. In other words, the field displacement D of the couple (S, d) of the virtual address will be interpreted like a number of page and a displacement (p, of).

Swapping

It is sometimes necessary to remove all the pages or segments of a process of the main memory. In this case the process will be known as swappé , and all the data belonging to him will be stored in mass memory. That can occur for processes sleeping for a long time, whereas the operating system needs to allocate memory with the active processes. The pages or segments of code (program) will be never swappés , but quite simply réassignés, because one can find them in the file corresponding to the program (the file of achievable the ). For this reason, the operating system prohibits the access in writing to an achievable file in the course of use; symmetrically, it is impossible to launch the execution of a file as long as it is held open for an access in writing by another process.

External bibliography/bonds

  • One level storage system , Kilburn, Edwards, Lanigan, Summer, ANGER Transactions one elecronic computers, EC-11, vol. April 2nd, th and th 1962, pp. 223-235.
  • Computer Organization and Design , Hennessy, Patterson, Morgan Koffman, ISBN 1558604286.
  • Operating system concepts , Patterson, Silberschatz, ISBN 020151379X.
  • Computer Organization & Architecture , Hennessy, Patterson, Morgan Koffman, ISBN 0333645510.
  • Archtecture Computer: In Quantitative Approach
  • Computation Structures , Stephen A. Ward, Robert H. Halstead ISBN 026273088X.
  • Structured Computer Organization
  • VAX refers Manual
  • Sperry 7000/40 Architecture and Assembly Language Manual

Simple: Virtual memory

Random links:Douai | Misiones (province) | Phil Holliday | Kévin Remains | OJ Jae-hyeon

© 2007-2008 speedlook.com; article text available under the terms of GFDL, from fr.wikipedia.org