[Sauers04] 8.1. Virtual Address Space


Chapter 8. Memory Bottlenecks

This chapter describes major memory bottlenecks, starting with a review of some important concepts in the area of memory management. This is followed by a description of typical bottleneck symptoms and some techniques for diagnosing and tuning them. Chapter 8 covers the following topics:

  • Virtual Address Space Layouts

  • Adaptive Address Space

  • Variable-Sized Pages

  • Cell Local Memory Allocation

  • Paging, Swapping, and Deactivation

  • Dynamic Buffer Cache

  • Memory-Mapped Files and Semaphores

  • Process and Thread Execution using Fork() and Vfork()

  • Sticky Bit

  • Malloc()

  • Shared Memory

  • Shared Libraries

  • Memory Management Policies

  • Sizing Memory and the Swap Area

  • Memory Metrics

  • Types of Memory Management Bottlenecks

  • Expensive System Calls

  • Tuning Memory Bottlenecks

  • Memory-Related Tuneable Parameters

8.1. Virtual Address Space

While the amount of physical memory available for HP-UX is determined by the number of memory chips installed on the computer, the system can make a much larger amount of space available to each process through the use of virtual memory. Virtual addressing is used to access the virtual memory on the system. In order to understand the major bottlenecks that affect memory, it is necessary to know how virtual addressing works on Precision Architecture and IPF machines. Section 6.1.3, “Virtual Memory” on page 108 describes the virtual addressing mechanism at the hardware level in some detail for PA-RISC and IPF. This section will describe how HP-UX uses the hardware virtual addressing features to provide a virtual address space configuration for applications.

HP-UX provides a global virtual address space layout, which has fundamental performance benefits. In particular, the global layout provides for more sharing of virtual addresses, which results in fewer TLB entries, fewer OS data structures, and less memory used for maintaining virtual mappings. A strict global layout, however, also imposes a static layout of virtual memory, which can be restrictive at times, especially for 32-bit applications. HP-UX works around these restrictions by providing multiple static layouts, which allows for flexibility when running different application types.

In general, HP-UX partitions the global virtual address space into several regions of memory that contain particular memory objects. For instance, one region always holds the virtual memory of the operating system itself. Other regions hold memory objects that can be shared among multiple processes. Finally, some regions hold private memory objects for an application.

In general, the amount of virtual memory directly available to an application in HP-UX is as follows:

  • For 32-bit processes, the virtual address space available to each process is 4 GB, spread over four 1 GB quadrants that are used for various kinds of memory objects.

  • For 64-bit programs (running under HP-UX 11.0 or later), the process address space is 16 TB for PA-RISC, spread over four 4 TB quadrants, whereas for 64-bit IPF processes, the architected user level address space is enormous: seven regions of 2^61 bytes each.

The actual limit on virtual memory is based on the CPU and memory system implementation, including the size of registers and TLB entries. Thus, the upper bound on the size of virtual memory for a given system is often less than the absolute limit of the overall architecture.

The next sections describe the general and specific address space layouts used in HP-UX for both PA-RISC and IPF platforms. The performance aspects of each layout are also discussed.

8.1.1. PA-RISC Virtual Address Space Layout

In HP-UX systems utilizing PA-RISC processors, processes will have different virtual memory layouts depending on whether the 32-bit or 64-bit process model is used. Different levels of HP PA provide different amounts of virtual address space (VAS), as shown in Table 8-1.

Table 8-1. PA-RISC Virtual Address Space Levels
Level   Virtual Addressing
0       32-bit physical addressing only
        (Note: HP has never made a Level 0 system)
1       48-bit virtual addressing
        2^48 total VAS (256 TB)
        2^16 (65,536) spaces of 4 GB each
2       64-bit virtual addressing
        2^64 total VAS
        2^32 spaces of 4 GB each
3       96-bit virtual addressing
        2^96 total VAS
        up to 2^64 spaces of 4 GB to 2^62 bytes each


8.1.1.1. PA-RISC 32-bit Virtual Space Layout

The 32-bit virtual layout for PA-RISC systems consists of four 1 GB quadrants. These quadrants will hold different memory objects depending on which 32-bit virtual layout is used by an application. The quadrant used is usually implicitly selected by the top two bits of the virtual address. The contents of the space register associated with the quadrant are used to form the complete global virtual address. Figure 8-1 shows the default 32-bit virtual address organization, while the other 32-bit layouts will be described later in the chapter. The chatr(1) command can be used to determine the type of 32-bit executable. If the chatr(1) output has “shared executable” in the description, then the executable queried has the default virtual address layout, which is referred to as the SHARE_MAGIC layout.
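The quadrant selection described above can be illustrated with a short Python sketch. It only models the arithmetic (top two bits select the quadrant, the remaining 30 bits are the offset within it); the function names are hypothetical and are not part of any HP-UX interface.

```python
QUADRANT_BITS = 30                     # each quadrant spans 2**30 bytes (1 GB)
QUADRANT_SIZE = 1 << QUADRANT_BITS

def quadrant_of(addr32: int) -> int:
    """Return the quadrant (1-4) selected by the top two bits of a 32-bit address."""
    if not 0 <= addr32 < (1 << 32):
        raise ValueError("expected a 32-bit address")
    return (addr32 >> QUADRANT_BITS) + 1

def offset_in_quadrant(addr32: int) -> int:
    """Return the byte offset of the address within its 1 GB quadrant."""
    return addr32 & (QUADRANT_SIZE - 1)

for addr in (0x00001000, 0x40005200, 0x80000000, 0xC0000000):
    print(f"{addr:#010x} -> Quadrant {quadrant_of(addr)}, "
          f"offset {offset_in_quadrant(addr):#x}")
```

In the default SHARE_MAGIC layout, an address in Quadrant 1 would refer to shared text and an address in Quadrant 2 to private data.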

Figure 8-1. HP-UX VAS SHARE_MAGIC Format for 32-Bit HP-UX


In the SHARE_MAGIC layout, each quadrant has a specific purpose such as text, data, or shared objects. Shared text, which is code that can be used by many processes, is located in Quadrant 1. Private data is in Quadrant 2. This data includes:

  • Initialized data

  • Uninitialized data (BSS)

  • Dynamically allocated memory

  • u_areas

  • Kernel stack

  • User stacks

  • All data from shared libraries

  • Private memory-mapped files

Originally in HP-UX, shared library text occupied Quadrant 3, while Quadrant 4 was for shared memory. This layout limited the amount of shared memory in the system to only 1 GB. Therefore, in HP-UX 10.0 and beyond, the third and fourth quadrants are globally allocated. Shared library text, shared memory-mapped files, and shared memory can go anywhere within the third or fourth quadrants, but no single segment can cross the boundary between them. This limits any one segment size to a 1 GB maximum. Note that because shared objects are mapped globally, the total size of all shared objects in the system must fit within the third and fourth quadrants. Finally, the last 256 MB of the fourth quadrant is reserved for hard and soft physical address space (I/O space), which is used for addressing hardware devices based on slot number. This limits the fourth quadrant to only 768 MB of shared data.
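The placement rules above (a segment must sit wholly inside Quadrant 3 or inside the usable part of Quadrant 4) can be checked with a small Python sketch. The address ranges are derived from the 1 GB quadrant size and the 256 MB I/O reservation; the helper name `placement` is hypothetical, and the sketch says nothing about how HP-UX actually chooses addresses.

```python
GB = 1 << 30
MB = 1 << 20
IO_SPACE = 256 * MB                    # reserved at the top of Quadrant 4

# Usable shared ranges as half-open [low, high) byte intervals.
SHARED_RANGES = {
    "Q3": (2 * GB, 3 * GB),
    "Q4": (3 * GB, 4 * GB - IO_SPACE),
}

def placement(start: int, size: int):
    """Return the quadrant that wholly contains the segment, or None."""
    if size <= 0:
        return None
    end = start + size
    for name, (low, high) in SHARED_RANGES.items():
        if low <= start and end <= high:
            return name
    return None

total_shared = sum(high - low for low, high in SHARED_RANGES.values())
print(total_shared // MB, "MB of shared space in total")   # 1 GB + 768 MB
print(placement(2 * GB + 512 * MB, 1 * GB))                # crosses Q3/Q4: None
```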

Note the following characteristics with the 32-bit SHARE_MAGIC format:

  • Text is limited to 1 GB

  • Private Data is limited to 1 GB

  • Text is shared

  • Text is read-only

  • Text and data are demand paged in

  • Swap space is reserved only for the data

The use of four fixed-size quadrants can be a limitation to the amount of virtual address space that a 32-bit application can access, since each quadrant is used for different memory objects. Loosely speaking, if a particular memory object does not occupy the entire quadrant, then the process will not be able to consume the entire 4 GB virtual address space. This is an artifact of HP-UX's global virtual address space. For most applications, the default SHARE_MAGIC layout provides a good compromise for distributing the configurable memory for private and shared data objects. However, some applications may require more shared space or more private space than is provided by the default layout. Therefore, other static layouts for 32-bit applications are described later that allow different amounts of private and shared memory to be allocated.

8.1.1.2. PA-RISC 64-bit Virtual Space Layout

The sole virtual space layout for 64-bit PA-RISC applications is slightly different from the default 32-bit layout. The biggest difference is that the quadrants are 4 TB in size compared to the 1 GB 32-bit quadrants. With 64-bit PA-RISC applications, the large quadrant size makes private versus shared memory space configuration a non-issue since a single 4 TB quadrant is much larger than physical memory available today, and most applications are still far away from consuming the entire 4 TB virtual address space.

The quadrant size is only 4 TB (42 bits' worth of offset) for 64-bit PA-RISC applications, due to a trade-off in HP-UX between the number of spaces supported and the size of each quadrant. Recall from Section 6.5.2.1, “Virtual Address Space” on page 144, that PA-RISC processors form the global virtual address by bitwise ORing the lower 32 bits of the space register with the upper 32 bits of the virtual offset. To keep from generating duplicate addresses due to the ORing function, HP-UX makes sure the lower 10 bits of each space identifier that it creates contain all zeros. Since all current PA-RISC processors contain at most 32-bit space registers, this leaves 22 bits for specifying space identifiers, or approximately four million space identifiers. HP-UX also makes sure that it does not create an offset with more than the lower 42 bits, thus creating the current 4 TB limit.
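A simplified Python model may help show why the 10 reserved zero bits avoid duplicate addresses. It follows only the description in the text (OR the space identifier into the upper 32 bits of the offset) and ignores other hardware details such as the quadrant-select bits; all function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF
RESERVED_LOW_BITS = 10                 # low bits of each space ID kept at zero
OFFSET_BITS = 42                       # 2**42 bytes = 4 TB per quadrant

def make_space_id(n: int) -> int:
    """Return the n-th space identifier, with its lower 10 bits all zero."""
    if not 0 <= n < (1 << (32 - RESERVED_LOW_BITS)):
        raise ValueError("only 2**22 space identifiers are available")
    return n << RESERVED_LOW_BITS

def global_va(space_id: int, offset: int) -> int:
    """OR the space ID with the upper 32 bits of the offset (simplified model)."""
    if space_id & ((1 << RESERVED_LOW_BITS) - 1):
        raise ValueError("space ID must have its lower 10 bits clear")
    if not 0 <= offset < (1 << OFFSET_BITS):
        raise ValueError("offset must fit in 42 bits")
    upper = (space_id & MASK32) | (offset >> 32)
    return (upper << 32) | (offset & MASK32)

def split_global_va(gva: int):
    """Recover (space_id, offset); possible only because the fields never overlap."""
    upper = gva >> 32
    low10 = upper & ((1 << RESERVED_LOW_BITS) - 1)
    return upper - low10, (low10 << 32) | (gva & MASK32)

sid = make_space_id(5)
off = (1 << OFFSET_BITS) - 1           # largest 42-bit offset
print(split_global_va(global_va(sid, off)) == (sid, off))
print(1 << 22, "space identifiers")    # roughly four million
```

Without the reserved bits, space ID 1 with offset 0 and space ID 0 with offset 2^32 would both produce the same ORed upper word, which is the duplication the text refers to.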

The second biggest difference with the 64-bit virtual layout is that the quadrants contain different data than the 32-bit layout. Quadrant 1 contains shared 64-bit objects, and a portion of it is set aside for shared 32-bit objects. The 2-4 GB address range of Quadrant 1 is for shared access with the third and fourth quadrants for 32-bit objects. This allows 64-bit applications and 32-bit applications to share up to 2 GB of data with each other. This 2-4 GB range also includes the 256 MB of 32-bit I/O space that is at the end of Quadrant 4 for 32-bit applications. The 64-bit I/O space is also configured into Quadrant 1 from 4 GB to 68 GB, making this range unavailable for user data. Quadrant 2 contains the shared text and Quadrant 3 contains the private data. Finally, Quadrant 4 is fully allocated to globally shared objects. Figure 8-2 shows the 64-bit SHARE_MAGIC organization.

Figure 8-2. HP-UX PA-RISC SHARE_MAGIC Format for 64-Bit executables


While the 64-bit layout allows a far greater number of addresses, only a subset of the possible addresses within the 4 TB is normally used. The actual number of addresses used is constrained by the total swap space available.

8.1.2. IPF Virtual Address Space Layout

The IPF virtual address space layout is different from the PA-RISC layout and is also different between 32-bit and 64-bit processes. For IPF machines, the address space is a flat 64-bit address range. However, the OS divides this address space into eight 2^61-byte (2 exabyte) regions with the top three bits of each virtual address selecting a region register (RR) to use. The region register is analogous to the PA-RISC space register. The region register is 24 bits in size and it is combined with the 61-bit offset of the virtual address to form an 85-bit global virtual address that is used by the hardware and OS. The following sections describe the 32-bit and 64-bit layouts for the Itanium® Processor Family.
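The region arithmetic above can be sketched in Python as follows. The region identifier values in `region_registers` are made up for illustration, and the helper names are hypothetical; only the bit widths (3 region-select bits, 61 offset bits, 24-bit region identifier) come from the text.

```python
OFFSET_BITS = 61                       # offset within a 2**61-byte region
RID_BITS = 24                          # width of a region identifier

def region_number(va64: int) -> int:
    """Top three bits of a 64-bit virtual address select the region register."""
    return va64 >> OFFSET_BITS

def region_offset(va64: int) -> int:
    """Lower 61 bits are the offset within the region."""
    return va64 & ((1 << OFFSET_BITS) - 1)

def global_va(va64: int, region_registers) -> int:
    """Concatenate the 24-bit region ID with the 61-bit offset (85 bits total)."""
    rid = region_registers[region_number(va64)]
    if not 0 <= rid < (1 << RID_BITS):
        raise ValueError("region ID must fit in 24 bits")
    return (rid << OFFSET_BITS) | region_offset(va64)

# Hypothetical region IDs for the eight region registers of one process.
region_registers = [0x000010, 0x000011, 0x000012, 0x000013,
                    0x000014, 0x000015, 0x000016, 0x000017]

va = 0x2000000040005200
print(region_number(va), hex(region_offset(va)))
print(global_va(va, region_registers).bit_length() <= RID_BITS + OFFSET_BITS)
```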

8.1.2.1. 32-bit Virtual Space Layout

The default 32-bit IPF virtual memory layout is similar in many ways to the PA-RISC 32-bit layout. There are four 1 GB regions that make up the user space virtual address layout. The first octant contains shared text, the second contains private data, and the third and fourth octants contain shared data. Unlike PA-RISC, the u_areas for the 32-bit applications are stored in the fifth octant instead of being allocated with the private data. (Note: a u_area is a process-specific data structure called the “user area”. It is pointed to from the process table entry for a specific process.) Also, the fourth octant does not have a missing 256 MB for I/O space as the fourth quadrant in PA-RISC does. Figure 8-3 shows the default virtual address layout for 32-bit IPF processes, which is referred to as the SHARE_MAGIC configuration, as in PA-RISC.

Figure 8-3. Default IPF Per-Process Virtual Address Space (32 Bits)


All pointers used to reference memory on IPF platforms are 64 bits in size, even for 32-bit executables. However, pointers in 32-bit executables are stored using only 32 bits. In order to create a 64-bit pointer that can be used to reference memory from the 32 bits of stored address, the IPF architecture provides the instructions addp4 and shladdp4 to convert 32-bit addresses into 64-bit addresses. This conversion process is known as swizzling. The swizzling process zeros the upper 32 bits of the new 64-bit address and copies bits 31 and 30 of the 32-bit address to bits 62 and 61 (higher order bits) of the new 64-bit address (see Figure 8-4). For example, for the 32-bit virtual address 0x40005200, the swizzled 64-bit address would be 0x2000000040005200.
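The swizzling rule can be written as a few lines of Python. This is only an arithmetic sketch of what the text describes, not a model of the addp4 or shladdp4 instructions themselves; the function name `swizzle` is hypothetical.

```python
def swizzle(addr32: int) -> int:
    """Convert a 32-bit address to a 64-bit one by copying bits 31:30 to bits 62:61."""
    if not 0 <= addr32 < (1 << 32):
        raise ValueError("expected a 32-bit address")
    region_bits = addr32 >> 30         # bits 31 and 30 of the 32-bit address
    return (region_bits << 61) | addr32

print(hex(swizzle(0x40005200)))        # the example address from the text
```

Because bit 63 is always zero after swizzling, only regions 0 through 3 can be selected by a swizzled pointer.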

Figure 8-4. Swizzling a 32-bit Pointer

Because only two bits of the 32-bit address are used to select the region register, 32-bit IPF executables have direct access only to region registers 0 through 3. Region registers 4 through 7 are either unused by 32-bit processes or are used by the HP-UX kernel.

8.1.2.2. 64-bit Virtual Space Layout

The 64-bit IPF virtual address layout is somewhat different from the PA-RISC layout. First, the 64-bit IPF layout has a much larger virtual address range than PA-RISC. For instance, there are almost eight exabytes of virtual address space available for shared 64-bit objects, compared to the eight terabytes in PA-RISC versions of HP-UX. In addition, there are four exabytes of private data versus the four terabytes in the PA-RISC layout. The IPF 64-bit virtual address space layout is shown in Figure 8-5.

Figure 8-5. IPF Per-Process Virtual Address Space (64 Bits)


As in PA-RISC, the 64-bit virtual address layout for IPF binaries was designed to allow efficient interaction with 32-bit binaries. In particular, the layout allows the 2 GB shared region of the 32-bit layout to be shared with 64-bit applications. This is accomplished by mapping the shared area referenced by the 32-bit process into the first octant of the 64-bit process, and then making sure that the region identifier used for the first octant for the 64-bit application is identical to the region identifier used for the third and fourth octants in the 32-bit application. If a 64-bit application wants to access data in the 32-bit shared memory area, it can load the 32-bit pointer and cast it to a 64-bit pointer. The compiler will zero-extend the 32-bit pointer, thus creating a virtual address that ends up selecting the first octant for the region ID. Since the first octant region identifier is identical on all 64-bit processes and also identical to the third and fourth octants for all 32-bit processes, the resulting global virtual address is identical between a 32-bit and 64-bit access of the pointer.
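The sharing argument above can be traced with a small Python sketch. The region identifier values are invented for illustration, and the helper names are hypothetical; the sketch only assumes what the text states, namely that the first octant of a 64-bit process uses the same region identifier as the third and fourth octants of a 32-bit process.

```python
OFFSET_BITS = 61
SHARED_RID = 0x00ABCD                  # hypothetical shared region identifier

# Region number -> region ID, for one 32-bit and one 64-bit process.
rr_32bit = {0: 0x000101, 1: 0x000102, 2: SHARED_RID, 3: SHARED_RID}
rr_64bit = {0: SHARED_RID}

def swizzle(addr32: int) -> int:
    """32-bit process: bits 31:30 of the pointer become region-select bits 62:61."""
    return ((addr32 >> 30) << OFFSET_BITS) | addr32

def zero_extend(addr32: int) -> int:
    """64-bit process: casting a 32-bit pointer leaves the upper 32 bits zero."""
    return addr32

def global_va(va64: int, region_registers) -> int:
    rid = region_registers[va64 >> OFFSET_BITS]
    return (rid << OFFSET_BITS) | (va64 & ((1 << OFFSET_BITS) - 1))

ptr = 0x80001000                       # lies in the third octant of the 32-bit layout
same = global_va(swizzle(ptr), rr_32bit) == global_va(zero_extend(ptr), rr_64bit)
print(same)
```

Both accesses resolve to the same region identifier and the same offset, so they name the same global virtual address.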

8.1.3. Modifying the Default Virtual Address Space Layout

The default virtual address space layouts for PA-RISC and IPF applications under HP-UX were previously shown. However, these default layouts don't necessarily work well for all applications. In particular, the 32-bit layouts can be overly restrictive for applications that access very little shared data but need to access a lot of private data or applications that access lots of shared data but have little private data.

Executable programs compiled for HP-UX include a magic number in their a.out file. This number tells the operating system what type of virtual memory layout to use for an application. Based on the virtual memory layout, the kernel changes how it interprets references in the code to either the four quadrants for PA-RISC or the eight octants for IPF, as described in the previous section. The default virtual address layout has a magic type that is called SHARE_MAGIC. Other magic types include EXEC_MAGIC and SHMEM_MAGIC. Additionally, there are modifications to these three layouts which are referred to as the Q3 private/Q4 private layouts. Note that for 64-bit executables, only the SHARE_MAGIC type is used. All of the other magic types are available for 32-bit executables only.

8.1.3.1. EXEC_MAGIC

The EXEC_MAGIC format was introduced with HP-UX 10.0 on the Series 800 and with HP-UX 9.01 on the Series 700. In this format, the private data objects and text objects are allowed to share quadrants 1 and 2 for PA-RISC and octants 1 and 2 for IPF. The objects can exceed 1 GB in size, but must be less than approximately 2 GB. This layout doubles the amount of private data that a 32-bit application can access and therefore may improve performance for applications that need to access more than 1 GB of private data, such as multi-threaded or technical applications.

One potential drawback of EXEC_MAGIC executables is that the text (code) is private versus shared. Having private text means there can be multiple virtual copies in memory compared to one shared copy. Having multiple virtual copies of the text means that there will be separate sets of instruction TLB entries needed for each copy and the separate instantiations of the program will not share cache lines in the upper level instruction caches, which are virtually indexed on both PA-RISC and IPF processors. HP-UX does use virtual aliasing of the text pages in an EXEC_MAGIC executable to allocate only one set of physical pages when multiple processes are created, so this allows the physical memory usage to stay the same as with a SHARE_MAGIC executable.

The lack of virtual address sharing for the text may cause severe performance problems if multiple copies of the same program are run at the same time. This is especially true for PA-RISC processors if the application has a large text “footprint,” such as is the case for many commercial applications. The problem is that separate cache lines will need to be allocated in the cache for each text copy and this can cause severe cache pressure. Normally with a SHARE_MAGIC executable, many processes all share the same virtual mapping so only a single copy resides in the instruction cache. For IPF processors, private text is not so much of a problem because the virtually indexed first level caches are relatively small, so these caches most likely don't hold the required data on a context switch anyway. TLB pressure can still be an issue for IPF machines with private text regions and multiple copies of an application.

For a multi-threaded application, however, having private text is usually not a performance loss because all threads within the process share the private text just like they share the private data. Only if many of the same multi-threaded processes are executed at the same time will a multi-threaded application potentially have performance issues with an EXEC_MAGIC layout.

Text is writable with EXEC_MAGIC even though text is not modified during its execution. Since it is writable, swap space must be reserved for the text segment, although a lazy swap allocation scheme is used in which swap space is allocated only for text pages that are modifiable. Even this lazy swap is wasteful, since virtually no modern application uses modifiable code.

For 32-bit IPF applications, the EXEC_MAGIC type is only supported in HP-UX 11i v2 and later releases. The HP-UX 11i v1.5 and v1.6 releases have no support for the EXEC_MAGIC layout. The EXEC_MAGIC layout is enabled by linking an application with the -N linker flag.

Due to the potential performance problems, it is recommended that this form of magic only be used for programs that really need the extra private data storage, are multi-threaded, or use at most one process per processor. In most non-threaded applications, it may be more efficient to convert a 32-bit application to a 64-bit application rather than using EXEC_MAGIC, unless you are constrained to run the application on a system that supports only 32-bit applications.

8.1.3.2. SHMEM_MAGIC

SHMEM_MAGIC is similar to EXEC_MAGIC, but instead of extending the private data area, it increases the virtual address space available to global shared objects. SHMEM_MAGIC achieves this goal at the expense of the virtual address space available for process text and private data, which is limited to 1 GB together when using this option. The total amount of area available for shared objects is 2.75 GB under this layout for PA-RISC and 3 GB for IPF. In this layout, the first quadrant or octant is used for private text and data. The other three quadrants or octants are for shared data, with the second quadrant or octant available for System V shared memory only.

This layout has all of the same performance problems as the EXEC_MAGIC layout because text is no longer shared among multiple processes. However, for 32-bit applications that need a lot of shared memory, this layout can help increase the amount of shared memory available and, potentially, improve performance. For instance, a multi-threaded database application could use a SHMEM_MAGIC layout to increase the amount of its shared cache for data read from the database.
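The shared-space figures quoted for the 32-bit layouts follow from simple arithmetic, sketched here in Python. The sketch assumes only what the text states: 1 GB quadrants or octants, two of them shared under SHARE_MAGIC and three under SHMEM_MAGIC, with 256 MB of I/O space reserved on PA-RISC but not on IPF. The function name is hypothetical.

```python
GB = 1 << 30
MB = 1 << 20

SHARED_AREAS = {"SHARE_MAGIC": 2, "SHMEM_MAGIC": 3}   # shared 1 GB quadrants/octants
IO_RESERVED = {"PA-RISC": 256 * MB, "IPF": 0}         # carved out of the last one

def shared_capacity(layout: str, arch: str) -> int:
    """Bytes of 32-bit virtual address space usable for shared objects."""
    return SHARED_AREAS[layout] * GB - IO_RESERVED[arch]

for arch in ("PA-RISC", "IPF"):
    for layout in ("SHARE_MAGIC", "SHMEM_MAGIC"):
        print(f"{arch:8s} {layout:12s} {shared_capacity(layout, arch) / GB:.2f} GB")
```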

From Chris's Consulting Log: An application developer insisted on using the SHMEM_MAGIC virtual memory layout to increase the amount of shared memory available to a 32-bit application from 1.75 GB to 2.75 GB on HP-UX. Unfortunately, the application was written as a multi-process application and it had a very large text footprint. When the developer switched from the standard SHARE_MAGIC layout to SHMEM_MAGIC, performance dropped by a factor of two even though they were now using much more memory for the internal buffer cache. The problem was that a huge number of TLB misses and instruction cache misses resulted from the lack of shared text, and this overwhelmed any reduction in I/O from disk that was obtained by the increased shared memory. This developer quickly reverted to the default SHARE_MAGIC layout for the 32-bit application and eagerly awaited the arrival of 64-bit computing.


The SHMEM_MAGIC layout has been available for PA-RISC-based machines since HP-UX 10.20 and has been available for IPF-based machines since HP-UX 11i v2. The -N linker option must first be used to make the application an EXEC_MAGIC application, and then the chatr(1) command must be used with the -M option for PA-RISC and the +as shmem_magic option for IPF, to convert the EXEC_MAGIC executable to a SHMEM_MAGIC executable.

8.1.3.3. Q3/Q4 Private Data

Some 32-bit applications need more than the 1.9 GB of memory provided by the EXEC_MAGIC virtual layout in PA-RISC, so starting with HP-UX 11i, chatr(1) options were created to modify the meaning of the third and fourth quadrants for any of the 32-bit virtual layouts. When the chatr(1) +q3p enable option is used, the third quadrant is used for private data and when the chatr(1) +q4p enable option is used, the fourth quadrant is used for private data. If the fourth quadrant is chosen to use private data, then the third quadrant is automatically set to use private data as well. For an EXEC_MAGIC executable, enabling +q4p will allow up to 3.8 GB of private data for a 32-bit PA-RISC application. Enabling +q3p will enable up to 2.85 GB of private data over the default of 1.9 GB for an EXEC_MAGIC layout.

These chatr(1) options can greatly increase the amount of private data available to a 32-bit application and potentially provide a performance improvement by having this extra memory available. When these options are used with EXEC_MAGIC and SHMEM_MAGIC executables, all of the negative performance aspects surrounding private text that exist with the generic EXEC_MAGIC or SHMEM_MAGIC layout still apply. However, using these options with the standard SHARE_MAGIC layout allows more private data at the cost of shared data while still maintaining the good properties of shared text. These chatr(1) options are not available for 32-bit IPF applications. Instead, an adaptive address space model is used.
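As a rough planning aid, the private data limits quoted above for a 32-bit PA-RISC EXEC_MAGIC executable can be put into a small lookup, sketched here in Python. The three limits come from the text; the function name and the idea of picking the least invasive option first are assumptions made for illustration, not an HP-UX tool or recommendation.

```python
# Approximate private data limits (in GB) for a 32-bit PA-RISC EXEC_MAGIC
# executable, ordered from least to most invasive, as quoted in the text.
EXEC_MAGIC_PRIVATE_LIMITS_GB = [
    ("EXEC_MAGIC (default)", 1.9),
    ("EXEC_MAGIC with chatr +q3p enable", 2.85),
    ("EXEC_MAGIC with chatr +q4p enable", 3.8),
]

def smallest_sufficient_option(needed_gb: float):
    """Return the first option whose limit covers the need, or None if none does."""
    for name, limit_gb in EXEC_MAGIC_PRIVATE_LIMITS_GB:
        if needed_gb <= limit_gb:
            return name
    return None                        # beyond 3.8 GB a 64-bit build is needed

for need in (1.5, 2.5, 3.5, 3.9):
    print(f"{need} GB -> {smallest_sufficient_option(need)}")
```

Each step up gives a quadrant to private data and therefore removes it from the space available to shared objects.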

8.1.4. Adaptive Address Space

Traditionally, HP-UX provides a global address space (GAS) for all processes running on the system. The GAS design provides advantages in TLB design and OS memory management code. However, some application designers like to write their applications using address aliasing, which provides some programming conveniences. Address aliasing is not supported with a GAS operating system.

Starting in the HP-UX 11i v2 release, with IPF systems only, the HP-UX operating system allows a process to allocate memory similar to how a multiple address space (MAS) operating system like Linux would. HP refers to this ability as an adaptive address space because it allows an application running under HP-UX to choose between a global address space layout or a more private address space layout. This functionality is only available on IPF-based servers for both 32-bit and 64-bit applications.

Application writers using a multiple address space system often take advantage of programming shortcuts by using features such as address aliasing. The adaptive address space functionality was added to make it easier to port programs from a multiple address space OS like Linux to HP-UX and to remove some of the global address space restrictions for 32-bit applications that require a large amount of private data. The chatr(1) command can be used to mark an application as using the multiple address space layout using the +as mpas option. MPAS stands for mostly private address space, as compared to the standard mostly global address space normally found in HP-UX. A 32-bit application must first have been linked with the -N linker option in order to apply the +as mpas option.

Under the MPAS model, 32-bit processes can allocate any combination of private and shared objects up to the 4 GB 32-bit virtual address space limit. This allows great flexibility for an application compared to the various other global virtual memory layouts. For instance, with the 32-bit global address layout, the 1 GB octants can cause limitations in the total amount of memory available to an application, given that each octant must be used for a particular type, like text, private data, or shared objects. An MPAS layout does not have this restriction. For 64-bit applications, however, memory limitations are not really an issue, so there is no real memory allocation advantage for the MPAS layout over the standard global address space layout. The 64-bit MPAS layout is shown in Figure 8-6. Octant 7, which contains the global shared objects, can only be accessed by using the MAP_GLOBAL flag to mmap(2) calls, or by using the IPC_GLOBAL flag for shmget(2) calls.

Figure 8-6. IPF Mostly Private Virtual Address Space (64 Bits)

Application writers may like the MPAS layout because it provides more flexibility compared to the standard global address space layouts. For instance, for 32-bit applications, a shared object can now be greater than 1 GB in size. In addition, more control over mmap(2) operations, such as being able to map part of a shared file multiple times within a single process, is allowed using the MPAS model. With the MPAS model, an application can also map certain shared objects at a different virtual address in separate processes. Sharing is provided through virtual aliasing, where two separate virtual addresses can map to the same physical address. This allows a process to use the same virtual address for a shared object on each invocation of a given program. Most applications, however, don't have a need to map objects at a particular address.

MPAS applications may not run as efficiently as the normal global address space executables, particularly if the application shares a lot of data. The problem with multiple address space executables is that they use TLB mappings to provide protection between processes instead of the protection IDs used by the global address space layout. This means there will be increased TLB pressure when multiple MPAS-type processes are executing. In addition, running multiple MAS-type executables may also affect the overall system performance by requiring more page table entries and thus causing even global address space processes to spend more time in TLB misses. Finally, the 32-bit MPAS model will have the same performance issues with private text as the EXEC_MAGIC layout. The 64-bit MPAS model uses shared text versus private, so it does not have the text-side TLB or cache issues.

Variable page allocation, described later in the chapter, may be worse using the MAS layout because each virtual address chosen for a shared object needs to be aligned the same. If all mappings for a shared object can't be aligned the same, the page size currently allocated will be demoted to the smallest page size that aligns properly for all mappings. A global address space layout does not have this issue because all shared addresses use the same virtual address.
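The alignment constraint can be illustrated with a Python sketch. It reads "aligned the same" as meaning that every mapping address has the same remainder modulo the page size, and it returns the largest page size for which that holds; both that reading and the list of page sizes are assumptions for illustration, not a description of the HP-UX kernel's algorithm.

```python
KB = 1 << 10
# Illustrative page sizes only; each one is a multiple of the one before it.
PAGE_SIZES = [4 * KB, 16 * KB, 64 * KB, 256 * KB, 1024 * KB]

def usable_page_size(mapping_addrs, page_sizes=PAGE_SIZES) -> int:
    """Largest page size at which every mapping has the same alignment.

    Assumes all mappings are at least aligned to the smallest page size.
    """
    usable = min(page_sizes)
    for size in sorted(page_sizes):
        remainders = {addr % size for addr in mapping_addrs}
        if len(remainders) == 1:       # all mappings aligned identically
            usable = size
    return usable

# Global address space: one virtual address for everyone, so no demotion.
print(usable_page_size([0x40000000]) // KB, "KB")
# MPAS: two processes map the object at differently aligned addresses.
print(usable_page_size([0x40000000, 0x50010000]) // KB, "KB")
```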

In general, the increased flexibility and private memory available with the MPAS model may outweigh any potential negative performance aspects for 32-bit executables, especially multi-threaded ones. Applications that share a lot of data between processes may want to consider not using the MPAS model, given the potential TLB performance hits and the potential for not being able to allocate large pages. In any event, testing should be done if possible to verify that the MPAS model is not causing performance problems compared to a global address space layout. For 64-bit executables, the MPAS virtual layout is not recommended unless a simple port of an application is needed. For 64-bit applications, there are no 1 GB octant restrictions on memory objects, so the memory allocation flexibility of the MPAS layout is not needed.

8.1.5. Shared Memory Windows

Shared memory is normally a globally visible resource to all applications. With HP-UX's global virtual address space, all applications use the same virtual address to access shared data. For 32-bit applications, all shared objects must be placed in memory quadrants 3 and 4 with the default SHARE_MAGIC layout; therefore, the total size of all shared objects that 32-bit applications can access is 1.75 GB. This may be acceptable on smaller systems or with smaller applications. On large systems, however, where the total size of shared memory segments for different applications exceeds 1.75 GB, this can be severely limiting for 32-bit applications.

For HP-UX 11.0, extension media has been released that will allow 32-bit applications to access shared memory in windows that are visible only to the group of processes that are authorized through the use of a unique key. This feature is standard starting with HP-UX 11i. Each shared memory window provides up to 2 GB of shared object space, which is visible only to the group of 32-bit processes configured to access it with the setmemwindow(1m) command. The total amount of shared memory in a shared memory window depends on the magic number of the executable. SHARE_MAGIC and EXEC_MAGIC executables can use a shared window of up to 1 GB, whereas SHMEM_MAGIC executables can use a shared window of up to 2 GB.

The per-process virtual address space for SHARE_MAGIC and EXEC_MAGIC virtual layouts using shared memory windows is identical to the normal layouts for these magic types with the exception that the third quadrant contains the shared memory window. For SHMEM_MAGIC layouts, both the second and third quadrants contain the memory window. In addition, using the -b option of the setmemwindow(1m) command with SHMEM_MAGIC layouts will cause the second and third quadrants to use the same space identifier so a single contiguous shared memory segment can be created. Applications that have been chatr'ed to use Q3 or Q4 private data don't work in a windowed environment, given that their third quadrant will be private data versus being able to be used for the shared memory window.

Use of shared memory windows constrains the globally accessible shared object virtual address space to 768 MB on PA-RISC and 1 GB on IPF for 32-bit applications. This means that all shared libraries, memory-mapped files, and shared memory segments that must be accessible to all processes on the system must fit into Quadrant 4. Therefore, use shared memory windows with care.

The number of shared memory windows is configurable with the tunable parameter max_mem_window. Each group of applications can access its own private memory window. The shared objects placed in Quadrant 4 remain globally visible. Therefore, HP-UX tries to load all shared libraries into Quadrant 4 when shared memory windows are used. Shared objects placed into Quadrant 3 are not globally visible. Rather, these objects are visible only to the set of applications that attach to a particular shared memory window.

Using memory windows has several side effects:

  • Shared libraries that cannot be placed into Quadrant 4 are placed in Quadrant 3 and must be mapped into each shared memory window, consuming extra memory for each window.

  • The IPC_GLOBAL attribute must be used to force a shared memory segment into the globally-shared fourth quadrant using shmat(2).

  • The MAP_GLOBAL attribute must be used to force a memory-mapped file into the globally-shared fourth quadrant using mmap(2).

  • Processes must be in the same memory window to share data in a window. Processes can always share data in the globally-shared fourth quadrant.

  • Child processes inherit the shared memory window ID.

  • The shared memory window ID may be shared among a group of processes by inheritance or by use of a unique key referred to by the processes.

Shared memory windows are not needed for 64-bit applications, given the huge amount of virtual address space available in the 64-bit model.

8.1.6. Page Protection

Pages of memory in PA-RISC and IPF can have two types of protection assigned to them: authorization, which is granted with a Protection ID, and access rights (read/write/execute), which are the same as the actual file permissions for shared libraries and memory-mapped files. Note that in HP-UX, text normally does not have write permission, which means that code cannot be modified in memory.

If many different memory objects are accessed within a process in HP-UX, protection ID thrashing can occur and cause a performance degradation. As mentioned in Section 6.5.4.3, “Protection ID Issues” on page 149, there is a cache of protection IDs stored in the chip of each PA-RISC and IPF processor. This cache is not large, and a protection ID entry is needed for each separate memory object accessed. For instance, each shared memory segment created requires a separate protection ID, the text segment requires an ID, and the private data segment requires an ID. HP-UX maps shared libraries using a global protection ID, so a separate ID is not needed for each shared library. Applications can easily experience protection ID thrashing or even run out of protection IDs by allocating several shared memory-mapped objects, so care needs to be taken when creating memory maps or shared memory segments in HP-UX. The mostly private address space virtual layout on IPF servers running HP-UX 11i v2 and later doesn't use as many protection IDs, so these protection ID issues are not as relevant there.

The mprotect(2) system call can be used to change or check access rights on pieces of memory that have been allocated via mmap(2).
