A memory-related issue unfolded using performance tools for AIX (July 28, 2009)


The problem

Upon the launch of a specific application, we noticed that the memory was getting completely drained. This application was the only process running on the system. There were also issues doing manual transactions, as there were frequent timeouts due to no response from the server.


Problem analysis and resolution

The initial thought was that the memory drain was related to the application. However, after using the performance tools, we changed our minds.

In Indian mythology, there is a reference to a medicine called Sanjeevini, which can cure almost all problems. topas is the Sanjeevini of performance tools, as it gives an overview of all of the system resources and is used as a starting point for performance analysis. It helped in this scenario by providing the first stepping stone to attack the problem. When we first observed the problem, topas showed the total percentage of computational memory as 99%. This was also the time when the team was observing issues while doing manual transactions. The team was performing test runs related to only one application.

So, as the next step of the investigation, the application was stopped, and topas was used once again to check the status of computational memory. This time the computational memory was 78%, an alarming figure considering no other application was running, and it provided a new direction to our train of thought.

The other tools that were of great help in the investigation were svmon, vmstat, and vmo.

svmon command

The svmon command is a performance-measurement tool that captures and analyzes a snapshot of virtual memory.

The svmon -G command displays the following global memory report. It shows the used and free sizes of real and virtual memory in the system.

svmon -G


Figure 1. svmon global memory report

This memory report shows that out of 1998848 pages (page size=4K) of total memory,there were 1996881 pages in use and 1967 pages free.

However, although the above report shows only 1967 free pages, that does not imply any memory-related constraint or memory bottleneck: to improve I/O performance, AIX® tries to use as much free memory as possible for file caching when it is not explicitly requested by the application or the kernel. Moreover, the report states that out of the total paging space size of 3145728 pages, the Inuse paging space is 99556 pages.
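The page counts quoted above are easier to judge once converted to megabytes. The following is a minimal sketch, assuming the 4K page size stated for the report; the helper name pages_to_mb is illustrative and not part of any AIX tool.

```python
# Convert the svmon -G page counts quoted in the text into megabytes.
# Assumption: every count is in 4K pages, as stated for the report.
PAGE_SIZE_KB = 4


def pages_to_mb(pages: int, page_size_kb: int = PAGE_SIZE_KB) -> float:
    """Return the size in MB of the given number of pages."""
    return pages * page_size_kb / 1024


total_mb = pages_to_mb(1998848)       # total real memory
inuse_mb = pages_to_mb(1996881)       # real memory in use
free_mb = pages_to_mb(1967)           # real memory free
pgsp_total_mb = pages_to_mb(3145728)  # paging space size
pgsp_inuse_mb = pages_to_mb(99556)    # paging space in use

print(f"real memory : {total_mb:.0f} MB total, {free_mb:.2f} MB free")
print(f"paging space: {pgsp_total_mb:.0f} MB total, {pgsp_inuse_mb:.1f} MB in use "
      f"({pgsp_inuse_mb / pgsp_total_mb:.1%})")
```

The low paging space usage is consistent with the observation that the small free-page count alone does not indicate a memory bottleneck.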

The svmon -P command displays the memory usage statistics for all the processes.

svmon -P


Figure 2. Process memory utilization report

The process memory utilization report shows that a Java™ process has an Inuse memory of 166690 pages. Upon adding the Inuse memory used by all the processes in this report, we observed that the sum total of memory in use by different processes was significantly less than the total memory of that system. This observation was also an indication that memory is not the limiting factor.
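The comparison described above can be sketched as a simple sum over per-process Inuse values. This is an illustration only: the text quotes a figure for the Java process alone, so the dictionary below contains just that entry, and the helper name inuse_share is hypothetical.

```python
# Share of real memory accounted for by per-process Inuse pages.
# Only the Java figure is quoted in the text; other processes from
# svmon -P would be added to this dictionary in the same way.
TOTAL_PAGES = 1998848   # total real memory in 4K pages (svmon -G)
PAGE_SIZE_KB = 4

inuse_pages_by_process = {"java": 166690}


def inuse_share(pages_by_process: dict, total_pages: int) -> float:
    """Return the fraction of total pages used by the listed processes."""
    return sum(pages_by_process.values()) / total_pages


share = inuse_share(inuse_pages_by_process, TOTAL_PAGES)
java_mb = inuse_pages_by_process["java"] * PAGE_SIZE_KB / 1024

print(f"java Inuse: {java_mb:.1f} MB, {share:.1%} of real memory")
```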

vmstat command

Another performance-monitoring tool, vmstat, was used for reporting statistics about kernel threads, virtual memory, disks, and CPU activity.

# vmstat 2 10


Figure 3. vmstat report

This vmstat report shows that there is approximately 114MB of free memory. Moreover, no page-outs are reported. However, the last five entries do state that there was one blocked thread and that there is some I/O wait happening in the system.

vmo

The performance parameters were also checked using the vmo command. The vmo command displays and adjusts the Virtual Memory Manager parameters.

vmo -a


Figure 4. vmo command output

While observing the output of the vmo command, we noticed that lgpg_regions is set to 256 and lgpg_size is 16777216 (or 16MB). AIX treats large pages as pinned memory and does not provide paging support for large pages. The data of an application that is backed by large pages remains in physical memory until the application completes. In our case, this means that 256 pages of size 16MB are pinned and reserved.

If you look back at Figure 1, the output of svmon -G, you can see that for a large page size of 16MB, the pool size is 256, and since the memory for large pages is pinned, 256 pages of 16MB each cannot be paged out. Looking at Figure 2, the output of the svmon -P command, you can see that the first line of the output has the last column named 16MB and its corresponding value is 'N'. This means that the process with PID 266372 is not using 16MB pages; that is, the application is not making use of the large pages that have been reserved by the vmo command. Similarly, when other running processes were checked in the same report, we observed that none of the applications was using the large page support. Hence, 256*16=4096MB is simply blocked and not used at all by any running process.

From Figure 3, you can see that the total physical memory is 7808MB. As is clear from the above analysis, out of that, 4096MB is reserved for large pages. It implies that all the running applications can make use of only 7808-4096=3712MB of memory. Since a big chunk of memory was blocked and left unused, memory was getting exhausted completely.
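The arithmetic behind this conclusion can be reproduced from the values quoted in the text. The sketch below is a worked calculation only; the variable names mirror the vmo tunables, but the code does not query a system.

```python
# Worked calculation of the large page reservation, using values quoted
# in the text (vmo output and the total memory shown by vmstat).
MB = 1024 * 1024

lgpg_regions = 256        # number of large pages reserved
lgpg_size = 16777216      # size of each large page in bytes (16MB)
total_memory_mb = 7808    # total physical memory

reserved_mb = lgpg_regions * lgpg_size // MB
available_mb = total_memory_mb - reserved_mb
reserved_share = reserved_mb / total_memory_mb

print(f"reserved for large pages: {reserved_mb} MB ({reserved_share:.1%})")
print(f"left for 4K pages       : {available_mb} MB")
```

With these values, more than half of the physical memory is pinned in a pool that no running process was using.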

Hence, either you should not reserve memory for large pages, or you should ensure that the application actually uses the large pages that have been reserved.


Configuring the large pages

The application or the system can be configured to use large pages.

Configuring the application for large pages

The blpdata flag is used with the ldedit command to enable an executable file to request large pages. More details regarding it can be found in the Resources section.

Configuring the system for large pages

By default, the system does not allocate any memory to the large page physical memory pool. The vmo command can be used to configure the size of the large page physical memory pool using the lgpg_regions and lgpg_size options.

The LDR_CNTRL environment variable can be set so that the application's data and heap segments use large pages.

More details regarding the usage of the vmo command for configuring large pages and the LDR_CNTRL variable can be found in the Resources section.
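As an illustration of scoping LDR_CNTRL to a single application rather than exporting it for a whole session, the sketch below builds an environment for one child process. The value LARGE_PAGE_DATA=Y is taken here as an assumption about the setting that requests large pages for data and heap; it is not quoted in this article, so check it against the documentation in the Resources section. The application path in the comment is hypothetical.

```python
import os


def large_page_env(base_env=None):
    """Return a copy of the environment with LDR_CNTRL set for large pages.

    Assumption: LARGE_PAGE_DATA=Y is the LDR_CNTRL value that requests
    large pages for the data and heap segments.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["LDR_CNTRL"] = "LARGE_PAGE_DATA=Y"
    return env


env = large_page_env({"PATH": "/usr/bin"})
print(env["LDR_CNTRL"])

# To launch one application with this setting (hypothetical path):
#   import subprocess
#   subprocess.run(["/path/to/application"], env=large_page_env())
```

Passing the variable only to the target process keeps other programs started from the same shell from requesting large pages unintentionally.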

In the above case study, changing the value of lgpg_regions to a nominal value and making the application use the large pages helped in resolving the memory problem and helped in increasing the overall system performance as well.

Deciding the value for lgpg_regions

A general recommendation cannot be given for the value of lgpg_regions or for other performance-related tuning parameters; yet a nominal value can be decided by identifying the workload on the system. In the above case study, when the LDR_CNTRL variable was exported, the application started making use of large pages. After this, the vmstat command was used again.

# vmstat -l 10

This command displays the large page statistics at 10-second intervals.


Figure 5. Large page statistics

This report has a section for large pages, which provides detailed statistics related to large pages. The two fields titled alp and flp correspond to active large pages and free large pages, respectively. For this case study, the value of alp was not going beyond 80. So, the judicious way of utilizing memory is to lower the value of lgpg_regions from 256 to a value of, say, 100. This returned a significant chunk of memory to the pool of 4K memory, which increased the memory available for other running applications and hence improved the overall performance of the system. In short, tuning the parameters after identifying the workload of the system is the key to the efficient usage of system resources.
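The effect of lowering lgpg_regions can be quantified with the figures from this case study. The following is a small worked sketch; the helper name pool_mb is illustrative, and the numbers are the ones quoted in the text (16MB large pages, a peak alp of 80, 7808MB of physical memory).

```python
# Memory returned to the 4K pool when lgpg_regions is lowered from 256 to 100.
LARGE_PAGE_MB = 16       # lgpg_size of 16777216 bytes
TOTAL_MEMORY_MB = 7808   # total physical memory


def pool_mb(regions: int) -> int:
    """Return the size in MB of a large page pool with the given region count."""
    return regions * LARGE_PAGE_MB


old_regions, new_regions, peak_alp = 256, 100, 80

returned_mb = pool_mb(old_regions) - pool_mb(new_regions)
headroom_regions = new_regions - peak_alp
available_4k_mb = TOTAL_MEMORY_MB - pool_mb(new_regions)

print(f"returned to 4K pool : {returned_mb} MB")
print(f"large page headroom : {headroom_regions} regions above the observed peak")
print(f"memory for 4K pages : {available_4k_mb} MB")
```

A value of 100 leaves some margin above the observed peak while still releasing most of the original reservation.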


Conclusion

The main purpose of large page usage is to improve system performance for high-performance computing applications or any memory-access-intensive application that uses large amounts of virtual memory. Large pages are useful, but they should be used only in specific scenarios. Large page support is a special-purpose performance improvement feature and is not recommended for general use.


Resources

Learn

  • The Performance management guide discusses Large pages in AIX.

  • The AIX Performance Tools Handbook discusses AIX performance tools.

  • The AIX 6.1 information center is your source for technical information about the AIX operating system.

  • The AIX and UNIX developerWorks zone provides a wealth of information relating to all aspects of AIX systems administration and expanding your UNIX skills.

  • New to AIX and UNIX? Visit the New to AIX and UNIX page to learn more.

  • Browse the technology bookstore for books on this and other technical topics.

Discuss

  • Check out developerWorks blogs and get involved in the developerWorks community.

  • Participate in the AIX and UNIX forums:
    • AIX Forum
    • AIX Forum for developers
    • Cluster Systems Management
    • IBM Support Assistant Forum
    • Performance Tools Forum
    • Virtualization Forum
    • More AIX and UNIX Forums

About the authors

Saravanan Devendran joined IBM in 1997, fresh out of college. He has 12 years of experience in AIX Operating System Development. He has worked on various subsystems in AIX and has expertise in Performance Tooling. He developed various performance-monitoring tools for AIX like tprof, mpstat, and more. Currently, he is responsible for the architecture of performance tools in the AIX operating system.

Kiran Grover has 10 years of experience in the IT industry and is currently a part of the AIX Performance Tools Development Team at IBM. She has a Master's degree in Mathematics and an M.Tech in Computer Science. Catching the train of thoughts and penning down the ideas is one of her hobbies. She is a nature admirer who wants to make the planet greener.