[Milberg09] Chapter 15. Network I/O: Tuning

The most important command for tuning AIX network parameters is the no command. First, take a look at the first few parameters, using the -a flag:

root@lpar37p682e[/] > no -a

arpqsize = 12
arpt_killc = 20
arptab_bsiz = 7
arptab_nb = 149
bcastping = 0
clean_partial_conns = 0
delayack = 0
delayackports = {}

As an alternative, you can use the -L flag, which provides much more detailed information.

The no command provides more than 100 parameters you can tune. In older versions of AIX, thewall was an important tunable whose default you needed to change; this parameter defined the upper limit for network kernel buffers. Today, this size is defined at installation time depending on the amount of RAM and the kernel type. For example, if you are running AIX 5.3 on a 64-bit kernel, the parameter is set at half the size of real memory. (I actually used to enjoy playing around with thewall, so I'm not sure I like the new approach.) You can use netstat -m to detect shortages or failures of network memory requests. In the following example, there are no shortages (failures):

root@lpar37p682e[/etc/tunables] > netstat -m
Kernel malloc statistics

******* CPU 0 *******
By size   inuse   calls failed delayed   free  hiwat  freed
32          117     217      0       0     11   5240      0
64          109    6523      0       1     83   5240      0
128         975   15951      0      29    785   2620      0
256         520   67637      0      30   1016   5240      0

Streams mblk statistic failures

0 high priority mblk failures
0 medium priority mblk failures
0 low priority mblk failures
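
Checking the failed column by eye gets tedious on a system with many CPUs and bucket sizes. The following is a minimal Python sketch, not an AIX-supplied tool, that scans text shaped like the netstat -m listing above and reports any bucket with failed requests. The column position (fourth field) and the eight-column row shape are assumptions taken from this sample output; the function names are hypothetical.

```python
# Sample text shaped like the netstat -m listing above (values copied from it).
SAMPLE = """\
Kernel malloc statistics

******* CPU 0 *******
32 117 217 0 0 11 5240 0
64 109 6523 0 1 83 5240 0
128 975 15951 0 29 785 2620 0
256 520 67637 0 30 1016 5240 0
"""

FAILED_COLUMN = 3  # assumed order: size, inuse, calls, failed, delayed, ...


def failed_by_bucket(netstat_m_text):
    """Return {bucket_size: total_failed} summed over every CPU section."""
    failures = {}
    for line in netstat_m_text.splitlines():
        fields = line.split()
        # Data rows are assumed to be exactly eight integers; banners,
        # headers, and the mblk summary lines do not match and are skipped.
        if len(fields) == 8 and all(f.isdigit() for f in fields):
            size = int(fields[0])
            failures[size] = failures.get(size, 0) + int(fields[FAILED_COLUMN])
    return failures


def has_shortage(netstat_m_text):
    """True if any bucket recorded at least one failed request."""
    return any(count > 0 for count in failed_by_bucket(netstat_m_text).values())


if __name__ == "__main__":
    print(failed_by_bucket(SAMPLE))
    print("shortage detected" if has_shortage(SAMPLE) else "no failures")
```

In practice you would feed the function the captured output of netstat -m instead of the embedded sample.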

Although you can change many parameters using the no utility, most of them are better left alone. The most important parameters are those that relate to TCP streaming workload tuning:

  • tcp_sendspace: This parameter controls how much kernel buffer space is used to buffer data that an application sends. You really want to bump this value up from the default because, once the limit is reached, the sending application is suspended until TCP has sent enough of the buffered data to free space.

  • tcp_recvspace: In addition to controlling the amount of buffer space consumed by receive buffers, this value helps AIX determine the TCP window size it advertises to the sender.

  • udp_sendspace: When using UDP, you can set this value no higher than 65536 because IP has an upper limit of 65,536 bytes per packet.

  • udp_recvspace: This value should be greater than udp_sendspace because it needs to handle as many simultaneous UDP packets per socket as it can. You can easily set this parameter to 10 times the value of udp_sendspace.

Let's use no to make a few changes. First, increase the size of udp_sendspace:

root@lpar37p682e[/] > no -p -o udp_sendspace=65536

Setting udp_sendspace to 65536
Setting udp_sendspace to 65536 in nextboot file

Next, change udp_recvspace to the recommended configuration of 10 times udp_sendspace:

root@lpar37p682e[/] > no -p -o udp_recvspace=655360

Setting udp_recvspace to 655360
Setting udp_recvspace to 655360 in nextboot file
Change to tunable udp_recvspace, will only be effective for future connections


Note that the -p flag makes the change persistent: it applies the new value immediately and also writes it to the /etc/tunables/nextboot stanza file, so the setting is retained after a reboot.

Regarding the TCP parameters for higher-speed adapters, setting tcp_sendspace to twice the value of tcp_recvspace is a reasonable configuration.

Two other important workload parameters of the no command are rfc1323 and sb_max. The rfc1323 tunable enables the TCP window scaling option, which lets TCP use a larger window size. Turning on this parameter enables the best TCP performance. The sb_max tunable sets an upper limit on the number of socket buffers queued to an individual socket, controlling the amount of buffer space consumed by buffers (queued to either a sender or receiver socket). This number should usually be less than thewall and approximately four times the size of the largest value of the TCP or UDP send and receive settings. At a minimum, make it twice that value: for example, if your udp_recvspace value is 655360, you can't go wrong by setting sb_max to at least double that, 1310720.
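
The sizing rules of thumb in this section (udp_recvspace at ten times udp_sendspace, sb_max at a multiple of the largest send or receive buffer) are simple arithmetic. The following Python sketch works through them with the values used in this chapter; the function names and the factor parameter are illustrative assumptions, not part of the no command.

```python
def recommended_udp_recvspace(udp_sendspace, multiplier=10):
    """Rule of thumb from the text: udp_recvspace = 10 x udp_sendspace."""
    return udp_sendspace * multiplier


def suggested_sb_max(buffer_sizes, factor=2):
    """Return factor x the largest configured send/receive buffer.

    factor=2 reproduces the doubling example in the text; factor=4 gives the
    more generous "approximately four times" guideline.
    """
    return factor * max(buffer_sizes.values())


def buffers_fit(sb_max, buffer_sizes):
    """Check that no single buffer setting exceeds sb_max."""
    return all(size <= sb_max for size in buffer_sizes.values())


if __name__ == "__main__":
    buffers = {
        "tcp_sendspace": 262144,
        "tcp_recvspace": 262144,
        "udp_sendspace": 65536,
        "udp_recvspace": recommended_udp_recvspace(65536),
    }
    print(buffers["udp_recvspace"])               # 655360
    print(suggested_sb_max(buffers))              # 1310720
    print(suggested_sb_max(buffers, factor=4))    # 2621440
    print(buffers_fit(suggested_sb_max(buffers), buffers))
```

The computed value would then be applied with no -o sb_max=... in the same way as the other tunables shown earlier.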

Another useful no tunable, tcp_nodelayack, prompts TCP to send an immediate rather than a delayed acknowledgment. Although sending an immediate acknowledgment can add more overhead in some environments, it can greatly improve network performance in others. If changing this parameter does not improve performance in your environment, you can quickly change it back.

Let's also review ipqmaxlen. This tunable controls the length of the IP input queue. If you see the queue's overflow counter increasing (using netstat -s), raising the maximum length of this queue can help fix the overflow.

What about Address Resolution Protocol (ARP)? When many clients are connected to the system, you might want to tune the ARP cache. You can examine the relevant statistics using netstat:

root@lpar37p682e[/etc/tunables] > netstat -p arp

arp:
10 packets sent
0 packets purged

If you see a high purge count, increase the size of the ARP table. In the preceding example, no increase is needed.

Here are the no parameters that relate to ARP:

root@lpar37p682e[/etc/tunables] > no -a | grep arp

arpqsize = 12
arpt_killc = 20
arptab_bsiz = 7
arptab_nb = 149

You can tune these buffers either systemwide or according to specific interfaces. To tune by interface, set the no command's use_isno option to 1 (this option is enabled by default in AIX 5.3):

root@lpar37p682e[/etc/tunables] > no -a | grep use
use_isno = 1

Disabling the use_isno parameter (by setting it to 0) can serve as a diagnostic tool of sorts by setting the buffer values across the board to help isolate performance problems. When these values are set for the specific interfaces, they actually override the default value in the no view, which can sometimes confuse system administrators. You can view specific interface settings using either ifconfig or lsattr:

# ifconfig en0

en0: flags=1e080863,480 GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),CHAIN>
inet 172.29.135.44 netmask 0xffffc000 broadcast 172.29.191.255
tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1

In this example, look at the settings using ifconfig (see the last line, which references a couple of the tunables mentioned earlier). You can change these options (by interface) using SMIT or the chdev or ifconfig command. Note that ifconfig will not update the Object Data Manager (ODM), so on reboot, the settings will revert to their previous values. For this reason, you should use SMIT. Use the smit tcpip fastpath, and go to Further configuration > Network interfaces > Change/Show characteristics of an interface.
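
Because interface-specific values silently override the systemwide ones when use_isno is 1, it can help to compute which value is actually in effect. The sketch below is an illustration only: it pulls the three tunables shown in the ifconfig listing above out of the text and applies the override rule described in this section. The function names are hypothetical, and the systemwide numbers in the usage example are placeholder values, not output from a real system.

```python
# Tunable names that appear on the last line of the ifconfig listing above.
ISNO_NAMES = {"tcp_sendspace", "tcp_recvspace", "rfc1323"}

SAMPLE_IFCONFIG = """\
inet 172.29.135.44 netmask 0xffffc000 broadcast 172.29.191.255
tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
"""


def interface_tunables(ifconfig_text):
    """Extract 'name value' pairs for the known tunables from ifconfig text."""
    found = {}
    for line in ifconfig_text.splitlines():
        tokens = line.split()
        # Walk adjacent token pairs and keep those that look like "name 123".
        for name, value in zip(tokens, tokens[1:]):
            if name in ISNO_NAMES and value.isdigit():
                found[name] = int(value)
    return found


def effective_value(name, systemwide, per_interface, use_isno=1):
    """Interface value wins only when use_isno is enabled and a value is set."""
    if use_isno and name in per_interface:
        return per_interface[name]
    return systemwide[name]


if __name__ == "__main__":
    iface = interface_tunables(SAMPLE_IFCONFIG)
    # Placeholder systemwide values, chosen only to show the override.
    systemwide = {"tcp_sendspace": 16384, "tcp_recvspace": 16384, "rfc1323": 0}
    for key in sorted(systemwide):
        print(key, effective_value(key, systemwide, iface),
              effective_value(key, systemwide, iface, use_isno=0))
```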

15.1. Name Resolution

Name resolution is another area that can impact performance. If you know how you want to resolve names (using either DNS or the hosts file), make sure name resolution is set up correctly in the /etc/netsvc.conf file. If you're using DNS, take out the local entry if you are not using a hosts file at all, or leave it in if you are using it as a backup to DNS (but make it the second entry). If you're not using DNS, remove the bind entry; if it is the first entry in the record, every lookup is slowed by first trying to resolve through a name server that doesn't exist.
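
The ordering rules above can be written down as a small decision function. This Python sketch builds the hosts line for /etc/netsvc.conf from two yes/no answers; the function name is hypothetical, and the exact "hosts = a, b" spelling is my assumption about the file's syntax, so check it against an existing file on your system before relying on it.

```python
def netsvc_hosts_line(use_dns, use_hosts_file):
    """Build a resolver-order line following the rules in the text.

    - DNS in use: bind comes first, local (the hosts file) only as a backup.
    - No DNS: bind is left out so lookups never wait on a missing name server.
    """
    order = []
    if use_dns:
        order.append("bind")
    if use_hosts_file:
        order.append("local")
    if not order:
        raise ValueError("at least one resolution method is required")
    return "hosts = " + ", ".join(order)


if __name__ == "__main__":
    print(netsvc_hosts_line(use_dns=True, use_hosts_file=True))
    print(netsvc_hosts_line(use_dns=True, use_hosts_file=False))
    print(netsvc_hosts_line(use_dns=False, use_hosts_file=True))
```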

15.2. Maximum Transfer Unit

The maximum transfer unit (MTU) is defined as the largest packet that can be sent over a network. The size depends on the type of network. For example, 16 Mbps token-ring has a default MTU size of 17,914, while Fiber Distributed Data Interface (FDDI) has a default size of 4,352. Ethernet's default size is 1,500 (or 9,000 with jumbo frames enabled). Larger packets mean fewer packet transfers, which results in higher bandwidth utilization on your system. An exception to this rule is if your application prefers smaller packets.
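
To see how much the MTU changes the packet count, the following Python sketch estimates how many packets a bulk TCP transfer needs at the two Ethernet MTU sizes mentioned above. It assumes 40 bytes of IPv4 and TCP headers per packet with no options, which is a simplification; the function name is hypothetical.

```python
def packets_needed(payload_bytes, mtu, header_bytes=40):
    """Packets required to carry payload_bytes, assuming fixed header overhead."""
    per_packet = mtu - header_bytes
    if per_packet <= 0:
        raise ValueError("MTU must exceed the header size")
    # Ceiling division without floating point.
    return -(-payload_bytes // per_packet)


if __name__ == "__main__":
    one_mib = 1024 * 1024
    print(packets_needed(one_mib, 1500))  # standard Ethernet: 719
    print(packets_needed(one_mib, 9000))  # jumbo frames: 118
```

Under these assumptions, jumbo frames cut the packet count for the same megabyte by roughly a factor of six, which is where the lower per-packet processing overhead comes from.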

If you're using Gigabit Ethernet, you can use the jumbo frames option. To support the use of jumbo frames, your switch must be configured accordingly. To change to jumbo frames, use the smit devices fastpath and go to Communication > Ethernet > Adapter > Change/Show characteristics of an Ethernet adapter. You can make the change from there.

15.3. Tuning: Client

The biod daemon plays an important role in connectivity. While biod self-tunes the number of threads (the daemon process creates and kills threads as needed), you can adjust the maximum number of biod threads, depending on the overall load. An important concept to understand here is that increasing the number of threads alone will not alleviate performance problems caused by CPU, I/O, or memory bottlenecks. For example, if your CPU is near 100 percent utilization, increasing the number of threads won't help you at all.

Increasing the number of threads can help when multiple application threads access the same files and you don't find any other types of bottlenecks. Using the lsof command can help you further determine which threads are accessing which files. From earlier tuning sections, you might remember the Virtual Memory Manager parameters minperm and maxperm. Unlike when you tune database servers, with NFS you want to let the VMM use as much RAM as possible for NFS data caching. Most NFS clients have little need for working segment pages. To ensure that all memory is used for file caching, set both maxperm and maxclient to 100 percent:

root@lpar24ml162f_pub[/tmp] > vmo -o maxperm%=100

Setting maxperm% to 100
root@lpar24ml162f_pub[/tmp] > vmo -o maxclient%=100
Setting maxclient% to 100

Note that in the event that your application uses databases and could benefit from performing its own file data caching, you should not set maxperm and maxclient to 100 percent. In this situation, set these numbers low and mount your file systems using concurrent I/O over NFS.

NFS maintains caches on each client system that contain attributes of the most recently accessed files and directories. The mount command controls the length of time that these entries are kept in cache.

The mount parameters you can change include the following: acdirmin, acdirmax, acregmin, acregmax, and actimeo. For example, the acregmin parameter specifies the minimum length of time after an actual update that file entries will be retained. When a file is updated, its removal from cache depends on this parameter's value.

Using the mount command, you can also specify whether you want a hard or soft mount. With a soft mount, if an error occurs, it is reported immediately to the requesting program; with a hard mount, NFS keeps retrying. These retries themselves could lead to performance problems. From a reliability standpoint, hard mounting read and write directories is recommended to prevent possible data corruption.

Mount parameters rsize and wsize define the maximum sizes of RPC packets for read and write requests, respectively. The default value is 32,768 bytes. With NFS 3 and 4, if your NFS volumes are mounted on high-speed networks, you should increase this setting to 65,536. On the other hand, if your network is extremely slow, you might think about decreasing the default to reduce the amount of packet fragmentation by sending shorter packets. However, if you do decrease the default, more packets will need to be sent, which could increase overall network utilization.
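
The trade-off between rsize/wsize and the number of RPCs is easy to quantify. The Python sketch below counts the RPCs needed to move a given amount of data at each size and assembles the corresponding mount option string. The helper names are hypothetical; only the rsize, wsize, hard, and soft option names come from the text.

```python
def rpc_count(transfer_bytes, rpc_size):
    """Number of read or write RPCs needed to move transfer_bytes."""
    return -(-transfer_bytes // rpc_size)  # ceiling division


def mount_options(rsize, wsize, hard=True):
    """Build a comma-separated option string for the mount -o flag."""
    return ",".join([
        "rsize=%d" % rsize,
        "wsize=%d" % wsize,
        "hard" if hard else "soft",
    ])


if __name__ == "__main__":
    one_mib = 1024 * 1024
    print(rpc_count(one_mib, 32768))   # default size: 32 RPCs
    print(rpc_count(one_mib, 65536))   # larger size: 16 RPCs
    print(mount_options(65536, 65536))
```

Halving the RPC count only helps if the network can carry the larger requests without fragmentation problems, which is why the text suggests the larger size for high-speed networks only.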

Understand your network, and tune it accordingly!

15.4. Tuning: Server

Before examining specific NFS parameters, always try to decrease the load on the network while also looking at your CPU and I/O subsystems. CPU bottlenecks often contribute to what appears to be an NFS-specific problem. For example, NFS can use either TCP or UDP, depending on the version and your preference. Make sure your tcp_sendspace and tcp_recvspace tunables are set to values higher than the defaults because this can have an impact on your server by increasing network performance. You tune these values with the no command:

root@lpar24ml162f_pub[/tmp] > no -a | grep send

ipsendredirects = 1
ipsrcroutesend = 1
send_file_duration = 300
tcp_sendspace = 1638
udp_sendspace = 9216

root@lpar24ml162f_pub[/] > no -o tcp_sendspace=524288

Setting tcp_sendspace to 524288
Change to tunable tcp_sendspace, will only be effective for future connection


If you are running Version 4 of NFS, make sure you turn on rfc1323. Doing so allows for TCP window sizes greater than 64K. Set this value on the client as well.

root@lpar24ml162f_pub[/] > no -o rfc1323=1

Setting rfc1323 to 1

As an alternative, you can set the rfc1323 tunable using the nfso command, which manages the NFS tuning parameters:

root@lpar24ml162f_pub[/] > nfso -o nfs_rfc1323=1

Setting nfs_rfc1323 to 1

Setting rfc1323 with nfso configures the TCP window to affect only NFS (as opposed to no, which applies this setting across the board). If you have already set this option with no, you don't need to change it, although you might want to in case some other Unix administrator decides to play around with the no command.
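
A common way to judge whether a link can benefit from windows larger than 64K is the bandwidth-delay product: the amount of data that must be in flight to keep the link full. This is my own illustration rather than something stated in this chapter, and the link speeds and round-trip times below are assumed example values. The sketch compares that product with 65,535 bytes, the largest window TCP can advertise without window scaling.

```python
# Largest window expressible in TCP's 16-bit window field without scaling.
MAX_UNSCALED_WINDOW = 65535


def bandwidth_delay_product(bits_per_second, rtt_ms):
    """Bytes in flight needed to fill the link: bandwidth x round-trip time."""
    return bits_per_second * rtt_ms // 8000  # /8 for bytes, /1000 for ms


def needs_window_scaling(bits_per_second, rtt_ms):
    """True when filling the link requires a window above the unscaled limit."""
    return bandwidth_delay_product(bits_per_second, rtt_ms) > MAX_UNSCALED_WINDOW


if __name__ == "__main__":
    # Assumed example links: (name, bits per second, round-trip time in ms).
    for name, bps, rtt in [("100 Mbit LAN", 100_000_000, 1),
                           ("1 Gbit LAN", 1_000_000_000, 1),
                           ("1 Gbit WAN", 1_000_000_000, 20)]:
        print(name, bandwidth_delay_product(bps, rtt),
              needs_window_scaling(bps, rtt))
```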

Similar to the client, if the server is a dedicated NFS server, make sure you tune your VMM parameters accordingly. Set maxperm and maxclient to 100 percent so that the VMM can use as much memory as possible for caching file pages.

On the server, tune nfsd, which is multithreaded, the same way you tuned biod. (Other daemons you can tune include rpc.mountd and rpc.lockd.) Like biod, nfsd self-tunes, depending on the load. Increase the number of threads using the nfso command. One parameter to check is nfs_max_read_size, which sets the maximum size of RPCs for read replies. Look at what nfs_max_read_size is set to below:

root@lpar24ml162f_pub[/tmp] > nfso -L nfs_max_read_size

NAME                 CUR    DEF    BOOT   MIN    MAX    UNIT    TYPE
     DEPENDENCIES
---------------------------------------------------------------------------
nfs_max_read_size    32K    32K    32K    512    64K    Bytes      D

Let's increase it to 64K (using bytes):

root@lpar24ml162f_pub[/tmp] > nfso -o nfs_max_read_size=65536
root@lpar24ml162f_pub[/tmp] > nfso -L nfs_max_read_size

NAME                 CUR    DEF    BOOT   MIN    MAX    UNIT    TYPE
     DEPENDENCIES
---------------------------------------------------------------------------
nfs_max_read_size    64K    32K    32K    512    64K    Bytes      D

We just changed nfs_max_read_size to the maximum value allowed. If you want to keep the new values, add your changes to the /etc/tunables/nextboot file so that the settings will remain changed after a reboot.
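
When comparing many tunables, it can be convenient to turn a row of the nfso -L listing into numbers. The Python sketch below parses one data row shaped like the listing above and converts the K suffix to bytes. The eight-column layout is assumed from this sample output, the M and G suffixes are included only as a guess at what other rows might use, and the function names are hypothetical.

```python
SUFFIXES = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def to_bytes(text):
    """Convert values such as '512' or '64K' to an integer byte count."""
    if text and text[-1] in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)


def parse_tunable_row(row):
    """Split one data row into named fields (column order assumed from sample)."""
    name, cur, default, boot, minimum, maximum, unit, kind = row.split()
    return {
        "name": name,
        "cur": to_bytes(cur),
        "def": to_bytes(default),
        "boot": to_bytes(boot),
        "min": to_bytes(minimum),
        "max": to_bytes(maximum),
        "unit": unit,
        "type": kind,
    }


if __name__ == "__main__":
    row = parse_tunable_row("nfs_max_read_size 64K 32K 32K 512 64K Bytes D")
    print(row["cur"], row["max"])
    if row["cur"] == row["max"]:
        print(row["name"], "is already at its maximum")
    if row["cur"] != row["boot"]:
        print(row["name"], "differs from its boot value and will revert on reboot")
```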

The nfso command offers additional parameters you can modify. To list them all, use the -a or -L flag.