System Tuning overview


Subject : System Tuning overview

Description :

1.Performance Tuning Overview Summary

1) Ideal System

	a) High but not full CPU utilization ( 70 - 90 %)

	b) CPU Time spent in user application ( 85 + % in user )

	c) Low Disk utilization ( 5 - 15 % for each disk )

	d) Low Network utilization
	   ( 10 - 30 % per network, less than 5 % collisions )

2) Standard Tuning Steps

	1st. Application Tuning

	2nd. DataBase Tuning

	3rd. OS Tuning ( System Tuning )

3) Components of System Performance

	KERNEL, MEMORY, CPU, DISK(I/O), NETWORK

4) System Resources

	* CPU - number of CPUs
	* I/O Devices - disk, printer, terminal, transfer information
	* Memory - primary memory(RAM), secondary memory(disk)
	* Kernel - Kernel parameters ( /etc/system)


< Rules and Tunables Quick Reference Tables >


1.Disk I/O Rules


Table : Disk I/O Performance Rules for Sun OS 4.X
-----------------------------------------------------------------------------
         Rule for Each Disk Drive                       Level   Action
-----------------------------------------------------------------------------
(iostat -D30.util < 5%)&&(other disks white or green)   White   No Problem
-----------------------------------------------------------------------------
(iostat -D30.util < 5%)&&(other disks amber or red)     Blue    1.Idle Disk
-----------------------------------------------------------------------------
5% <= iostat -D30.util < 35%                            Green   No Problem
-----------------------------------------------------------------------------
35% <= iostat -D30.util < 65%                           Amber   2. Busy Disk
-----------------------------------------------------------------------------
65% <= iostat -D30.util                                 Red     2. Busy Disk
-----------------------------------------------------------------------------

White : Low usage
Blue : underutilization or imbalance
Green : target utilization levels or no problem
Amber : warning level
Red : critical level
Black : prevents your application from running

cf) iostat - report I/O statistics
SYNOPSIS
     iostat [ -cdDIt ] [ -l n ] [ disk ...  ] [ interval [  count ] ]

          -D   For each disk, report the reads per second, writes  per
               second, and percentage disk utilization.

          ex) hyundai2% iostat -D 30
                    dk0           dk1           dk2           dk3 
           rps wps util  rps wps util  rps wps util  rps wps util 
             0   0  0.7    0   0  0.4    0   0  0.3    0   0  0.1 
             0   1  1.0    0   2  3.4    0   0  0.0    0   0  0.0 
             0   0  0.7    0   3  4.8    0   0  0.0    0   0  0.1 
             0   0  0.6    0   2  3.6    0   0  0.3    0   0  0.1 
             0   0  0.6    0   4  6.2    0   0  0.0    0   0  0.1 
             0   0  0.5    0   2  3.0    0   0  0.0    0   0  0.1
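
The SunOS 4.X rule table above can be read as a small classifier. The Python
sketch below is illustrative only: the function names are hypothetical, and it
assumes the utilization figure has already been read from the util column of
iostat -D 30.

```python
def sunos4_disk_level(util, others_busy=False):
    """Classify one disk from its iostat -D utilization percentage.

    util: percent utilization over the sample interval.
    others_busy: True if any other disk is at the amber or red level.
    """
    if util < 5.0:
        # An idle disk is only flagged when other disks are overloaded.
        return "Blue" if others_busy else "White"
    if util < 35.0:
        return "Green"
    if util < 65.0:
        return "Amber"
    return "Red"


def classify_disks(utils):
    """Apply the rule to every disk; utils maps disk name -> util percent."""
    busy = {d for d, u in utils.items() if u >= 35.0}
    return {
        d: sunos4_disk_level(u, others_busy=bool(busy - {d}))
        for d, u in utils.items()
    }


# Last sample row of the iostat -D 30 output above.
print(classify_disks({"dk0": 0.5, "dk1": 3.0, "dk2": 0.0, "dk3": 0.1}))
# {'dk0': 'White', 'dk1': 'White', 'dk2': 'White', 'dk3': 'White'}
```

Every disk in the sample is below 5% and no other disk is busy, so all four
come out White.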

     
Table : Disk I/O Performance Rules for Solaris 2
--------------------------------------------------------------------------
         Rule for Each Disk Drive                       Level   Action
--------------------------------------------------------------------------
(iostat -x30.b < 5%)&&(other disks white or green)      White   No Problem
--------------------------------------------------------------------------
(iostat -x30.b < 5%)&&(other disks amber or red)        Blue    1.Idle Disk
--------------------------------------------------------------------------
(5% <= iostat -x30.b)&&(iostat -x30.svc_t < 30ms)       Green   No Problem
--------------------------------------------------------------------------
(20%<=iostat -x30.b)&&(30ms<=iostat -x30.svc_t < 50ms)  Amber   2.Busy Disk 
--------------------------------------------------------------------------
(20%<=iostat -x30.b)&&(50ms<=iostat -x30.svc_t )        Red     2.Busy Disk 
--------------------------------------------------------------------------
(20%<=iostat -x30.b)&&(50ms<=iostat -x30.svc_t ) &&     Amber   3.Floppy/CD 
(iostat -x30.disk=="fd0" || iostat -x30.disk == "sd6")
--------------------------------------------------------------------------
0% == iostat -x30.w                                     Green   No Problem
--------------------------------------------------------------------------
0% < iostat -x30.w < 5%                                 Amber   4.SCSI Busy 
--------------------------------------------------------------------------
5% <= iostat -x30.w                                     Red     4.SCSI Busy
--------------------------------------------------------------------------



NAME
     iostat - report I/O statistics

SYNOPSIS
     /usr/bin/iostat [ -cdDItx ] [ -l n ] [ disk ...  ]
          [ interval [ count ] ]

           -x        For each disk, report  extended  disk  statistics.
                          The output is in tabular form.

           %w        percent of time there are transactions  waiting
                          for service (queue non-empty)

           %b        percent of time the disk is busy  (transactions
                          in progress)

           svc_t     average service time, in milliseconds

ex)hyundai3% iostat -x 30
                                 extended disk statistics 
disk      r/s  w/s   Kr/s   Kw/s wait actv  svc_t  %w  %b 
fd0       0.0  0.0    0.0    0.0  0.0  0.0    0.0   0   0 
sd1       0.1  0.0    0.3    0.2  0.0  0.0  168.0   0   0 
sd3       1.4  0.5    6.0    4.1  0.0  0.1   78.7   0   2 
                                 extended disk statistics 
disk      r/s  w/s   Kr/s   Kw/s wait actv  svc_t  %w  %b 
fd0       0.0  0.0    0.0    0.0  0.0  0.0    0.0   0   0 
sd1       0.0  0.0    0.0    0.0  0.0  0.0    0.0   0   0 
sd3       0.0  0.2    0.0    1.6  0.0  0.0   60.3   0   0 
                                 extended disk statistics 
disk      r/s  w/s   Kr/s   Kw/s wait actv  svc_t  %w  %b 
fd0       0.0  0.0    0.0    0.0  0.0  0.0    0.0   0   0 
sd1       0.0  0.0    0.0    0.0  0.0  0.0    0.0   0   0 
sd3       1.1  1.9    4.8   24.4  0.0  0.1   20.0   0   4 
                                 extended disk statistics 
disk      r/s  w/s   Kr/s   Kw/s wait actv  svc_t  %w  %b 
fd0       0.0  0.0    0.0    0.0  0.0  0.0    0.0   0   0 
sd1       0.0  0.0    0.0    0.0  0.0  0.0    0.0   0   0 
sd3       0.0  0.9    0.0    7.1  0.0  0.1  139.8   0   1 
                                 extended disk statistics 
disk      r/s  w/s   Kr/s   Kw/s wait actv  svc_t  %w  %b 
fd0       0.0  0.0    0.0    0.0  0.0  0.0    0.0   0   0 
sd1       0.0  0.0    0.0    0.0  0.0  0.0    0.0   0   0 
sd3       0.0  0.8    0.0    5.1  0.0  0.1   85.8   0   1
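
The Solaris 2 disk rules combine %b and svc_t, and treat %w separately. The
sketch below is one possible reading, not a definitive implementation: the
function names are hypothetical, and the table does not say what to do when
%b is between 5 and 20 while svc_t is 30 ms or more, so that case is treated
as Green here as an explicit assumption.

```python
SLOW_DEVICES = ("fd0", "sd6")  # floppy and CD, as named in the table


def solaris2_disk_level(disk, busy_pct, svc_t_ms, others_busy=False):
    """Classify one disk from the %b and svc_t columns of iostat -x."""
    if busy_pct < 5.0:
        return "Blue" if others_busy else "White"
    if svc_t_ms < 30.0:
        return "Green"
    if busy_pct < 20.0:
        # ASSUMPTION: the table is silent for 5 <= %b < 20 with a slow
        # service time; this sketch reports it as Green.
        return "Green"
    if svc_t_ms < 50.0:
        return "Amber"
    # 50 ms or more: Red, except for the floppy/CD special case.
    return "Amber" if disk in SLOW_DEVICES else "Red"


def scsi_wait_level(wait_pct):
    """Classify the %w column (time with transactions waiting)."""
    if wait_pct == 0:
        return "Green"
    if wait_pct < 5.0:
        return "Amber"
    return "Red"


# First sample for sd3 above: svc_t 78.7 ms, %w 0, %b 2.
print(solaris2_disk_level("sd3", 2, 78.7), scsi_wait_level(0))
# White Green
```

Note that sd3 has a long service time in the sample but is almost idle, so
the rule does not flag it.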


ref)

1) Idle Disk
- An idle disk is a waste of I/O throughput when other disks are overloaded.
  Rebalance the load or stripe this disk together with a busy one.

2) Busy Disk
- A busy or slow disk reduces system throughput and increases user response
  times. Rebalance the load or stripe this disk together with an idle one.


3) Floppy/CD
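- To keep a slow device from holding things up, check the floppy, which is
  device fd0. The CD is usually a very slow disk and is configured as target
  number 6 on the first SCSI bus (sd6). If the floppy or CD stays active for a
  long time, the system slows down, and the data should be moved to a faster
  disk.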

4) SCSI Busy.
- If disk commands are waiting, this is probably due to an overloaded SCSI bus.
  There are no direct measures of SCSI bus utilization levels.



2.Network Rules


Table : Network Performance Rules Based on Ethernet Collisions
------------------------------------------------------------------------------
         Rule for Each Network Interface                    Level   Action
------------------------------------------------------------------------------
(0 < netstat-i30.output.packets<10)&& (100*netstat -i30.    White  No problem
output.colls / netstat -i30.output.packets < 0.5%)
&& (other nets white or green)    
------------------------------------------------------------------------------
(0 < netstat-i30.output.packets<10)&& (100*netstat -i30.    Blue  1.Inactive
output.colls / netstat -i30.output.packets < 0.5%)               Net
&& (other nets amber or red)    
------------------------------------------------------------------------------
(10<= netstat-i30.output.packets)&& (0.5% <= 100*           Green  No problem
netstat -i30.output.colls / netstat -i30.output.packets 
< 2.0%)              
------------------------------------------------------------------------------
(10<= netstat-i30.output.packets)&& (2.0% <= 100*           Amber  2.Busy Net
netstat -i30.output.colls / netstat -i30.output.packets 
< 5.0%)              
------------------------------------------------------------------------------
(10<= netstat-i30.output.packets)&& (5.0% <= 100*           Red    2.Busy Net
netstat -i30.output.colls / netstat -i30.output.packets)
------------------------------------------------------------------------------
network type is not ie, le, ne, or qe; it is bf or nf.      Green  3.Not Ether 
------------------------------------------------------------------------------

ex)hyundai3#netstat -i 30
    input   le0       output           input  (Total)    output
packets errs  packets errs  colls  packets errs  packets errs  colls 
10429   0     22168   14    1      10643   0     22382   14    1     
5       0     0       0     0      5       0     0       0     0     
4       0     0       0     0      4       0     0       0     0     
2       0     0       0     0      2       0     0       0     0     
2       0     0       0     0      2       0     0       0     0
2       0     0       0     0      2       0     0       0     0     
2       0     0       0     0      2       0     0       0     0     
2       0     0       0     0      2       0     0       0     0     
2       0     0       0     0      2       0     0       0     0     
2       0     0       0     0      2       0     0       0     0     
2       0     0       0     0      2       0     0       0     0     
2       0     0       0     0      2       0     0       0     0

1) Inactive net
- An inactive network is a waste of throughput when other networks are
  overloaded. Rebalance the load so that all networks are used more evenly.

2) Busy Net
- A network with too many collisions has reduced throughput and increased
  response times. If possible, move some of the load to inactive networks.
  Add more Ethernets or upgrade to a faster interface type like FDDI or ATM.

3) Not Ether
- If the interface type name (for example le in le0) does not end in e, it is
  not Ethernet, so the collision-based network performance rule does not apply.
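
The collision rules above can be sketched as a Python function. This is a
hedged illustration with hypothetical names. The table leaves some cases
open (no output packets at all, fewer than 10 packets with a high collision
rate, and 10 or more packets with under 0.5% collisions); how the sketch
handles each of them is an assumption and is marked as such in the comments.

```python
ETHERNET_TYPES = {"ie", "le", "ne", "qe"}


def ethernet_level(ifname, out_packets, out_colls, others_busy=False):
    """Classify one interface from the netstat -i output columns."""
    # Strip the unit number: "le0" -> "le".
    if ifname.rstrip("0123456789") not in ETHERNET_TYPES:
        return "Green"  # Not Ether: collision rule does not apply
    if out_packets <= 0:
        # ASSUMPTION: no output at all is treated like an inactive net.
        return "Blue" if others_busy else "White"
    coll_pct = 100.0 * out_colls / out_packets
    if out_packets < 10 and coll_pct < 0.5:
        return "Blue" if others_busy else "White"
    # ASSUMPTION: light traffic with collisions falls through to the
    # percentage thresholds, and a busy net under 0.5% counts as Green.
    if coll_pct < 2.0:
        return "Green"
    if coll_pct < 5.0:
        return "Amber"
    return "Red"


# First line of the netstat -i 30 output above for le0.
print(ethernet_level("le0", 22168, 1))
# Green
```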



3.NFS Client Rules


Table : NFS Client Performance Rules
-----------------------------------------------------------------------------
         Rule for Each NFS Client                       Level   Action
-----------------------------------------------------------------------------
0 == nfsstat -rc.calls                                  White  No Client NFS
-----------------------------------------------------------------------------
nfsstat -rc.timeout < 0.05 * nfsstat -rc.calls          Green  No Problem
-----------------------------------------------------------------------------
0.05 * nfsstat -rc.calls <= nfsstat -rc.timeout &&      Red    1.Bad Net
nfsstat -rc.badxid == 0
-----------------------------------------------------------------------------
0.05 * nfsstat -rc.calls <= (nfsstat -rc.timeout ==     Red    2.Slow NFS 
nfsstat -rc.badxid) 
-----------------------------------------------------------------------------

ex)hyundai3# nfsstat -rc

Client rpc:
calls      badcalls   retrans    badxids    timeouts   waits      newcreds   
121        0          0          0          0          0          0          
badverfs   timers     toobig     nomem      cantsend   bufulocks  
0          3          0          0          0          0          


1) Bad net
- Packets are not making it to or from the NFS server.
  Fix the network H/W or reduce NFS packet sizes.

2) Slow NFS
- The NFS server is so slow that the client times out and retransmits, and
  the server then performs duplicate requests. Increase the client time-out.
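
The NFS client rules compare timeouts and badxids against 5% of calls. A
minimal Python sketch follows; the function name and the "Unclassified"
label are hypothetical, and the last branch covers a case the table does not
describe (timeouts above the threshold with a nonzero badxid count that
differs from the timeout count).

```python
def nfs_client_level(calls, timeouts, badxids):
    """Classify an NFS client from the nfsstat -rc counters."""
    if calls == 0:
        return ("White", "No Client NFS")
    if timeouts < 0.05 * calls:
        return ("Green", "No Problem")
    if badxids == 0:
        # Requests time out and no duplicate replies ever arrive.
        return ("Red", "Bad Net")
    if timeouts == badxids:
        # Every timeout was eventually answered twice: the server is slow.
        return ("Red", "Slow NFS")
    # ASSUMPTION: the table gives no action for mixed counts.
    return ("Red", "Unclassified")


# Counters from the nfsstat -rc output above.
print(nfs_client_level(calls=121, timeouts=0, badxids=0))
# ('Green', 'No Problem')
```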


 
4.Memory Rules


Table : Virtual memory Rules for SunOS 4
---------------------------------------------------------------------------
         Virtual Memory Rule                   Level         Action
---------------------------------------------------------------------------
100000k <= pstat -s.available                  White         1.Swap Waste
---------------------------------------------------------------------------
10000k <= pstat -s.available< 100000k          Green         No Problem
---------------------------------------------------------------------------
4000k <= pstat -s.available< 10000k            Amber         2.Swap Low 
---------------------------------------------------------------------------
1000k <= pstat -s.available< 4000k             Red           2.Swap Low
---------------------------------------------------------------------------
pstat -s.available < 1000k                     Black         3.No Swap
---------------------------------------------------------------------------



Table : Virtual memory Rules for Solaris 2
---------------------------------------------------------------------------
         Virtual Memory Rule                   Level         Action
---------------------------------------------------------------------------
100000k <= vmstat30.swap                       White         1.Swap Waste
---------------------------------------------------------------------------
10000k <= vmstat30.swap < 100000k              Green         No Problem
---------------------------------------------------------------------------
4000K <= vmstat30.swap < 10000k                Amber         2.Swap Low
---------------------------------------------------------------------------
1000K <= vmstat30.swap < 4000k                 Red           2.Swap Low
---------------------------------------------------------------------------
vmstat30.swap < 1000k                          Black         3.No Swap 
---------------------------------------------------------------------------

ex)hyundai3# vmstat 30
 procs     memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr f0 s1 s3 --   in   sy   cs us sy id
 0 0 0  63708  6920   0   5  6  3  3  0  1  0  0  1  0   83  206   59  1  2 97
 0 0 0  67656  6604   0   0  0  0  0  0  0  0  0  0  0   18  118   93  0  1 99
 0 0 0  67656  6604   0   0  0  0  0  0  0  0  0  0  0   11   91   69  0  1 99
 0 0 0  67656  6604   0   0  0  0  0  0  0  0  0  0  0    1    3   13  0  0 100
 0 0 0  67656  6604   0   0  0  0  0  0  0  0  0  0  0   15  121   89  0  1 99
 0 0 0  67656  6604   0   0  0  0  0  0  0  0  0  0  0   17  133   94  0  1 99
 0 0 0  67656  6604   0   0  0  0  0  0  0  0  0  0  0   18  168  104  0  1 99
 0 0 0  67656  6600   0   0  0  0  0  0  0  0  0  0  0   12   92   73  0  1 99
 0 0 0  67656  6600   0   0  0  0  0  0  0  0  0  0  0  559   25   26  0  2 98
 0 0 0  67656  6600   0   0  0  0  0  0  0  0  0  0  0   44  203  160  0  1 99
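
The swap thresholds in the two virtual memory tables are the same, so one
function can cover both. The sketch below is illustrative; the function name
is hypothetical, and the input is assumed to be available swap in Kbytes
(the vmstat swap column on Solaris 2, or pstat -s available on SunOS 4).

```python
def swap_level(avail_kbytes):
    """Classify available swap space given in Kbytes."""
    if avail_kbytes >= 100000:
        return ("White", "Swap Waste")
    if avail_kbytes >= 10000:
        return ("Green", "No Problem")
    if avail_kbytes >= 4000:
        return ("Amber", "Swap Low")
    if avail_kbytes >= 1000:
        return ("Red", "Swap Low")
    return ("Black", "No Swap")


# swap column from the vmstat 30 output above.
print(swap_level(67656))
# ('Green', 'No Problem')
```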

- On Sun systems, paging is monitored with vmstat or sar. sar may be better
  for logging information, but vmstat is more concise and, for the interactive
  user, packs more information into each line.
  The SunOS 4 version does not show free swap space, and its avm field is
  always 0.

* Runnable Queue: vmstat procs r and sar -q runq-sz, runocc

                  This is an important measure when deciding whether a system
                  needs more CPU power. When a process is ready to run but no
                  CPU is free to run it, the process must wait in this queue.
                  It is like lining up for service at a bank at lunchtime: if
                  there are more tellers, the line moves quickly and the wait
                  is acceptable.
                  On a multiprocessor machine the queue empties faster than on
                  a uniprocessor. A rough estimate of the service time is the
                  queue length divided by the number of CPUs. One CPU time
                  slice is 50 to 100 ms, and a service time of up to four
                  time slices can usually be regarded as acceptable
                  performance.
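
The rough run queue estimate can be worked through in a few lines of Python.
This sketch reads the estimate as queue length divided by the number of CPUs,
measured in time slices, which is an interpretation of the text rather than a
documented formula; the function names are hypothetical.

```python
def run_queue_slices(runq_len, ncpus):
    """Rough wait, in time slices: run queue length per CPU."""
    if ncpus < 1:
        raise ValueError("ncpus must be at least 1")
    return runq_len / ncpus


def run_queue_ok(runq_len, ncpus, max_slices=4):
    """True if the estimated wait is within four time slices."""
    return run_queue_slices(runq_len, ncpus) <= max_slices


def wait_ms_range(runq_len, ncpus):
    """Estimated wait in ms, using the 50 to 100 ms time slice range."""
    slices = run_queue_slices(runq_len, ncpus)
    return (slices * 50, slices * 100)


print(run_queue_ok(8, 2), wait_ms_range(8, 2))
# True (200.0, 400.0)
```

With 8 runnable processes on 2 CPUs the estimate is 4 time slices, which is
at the edge of what the text calls acceptable.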

* Blocked Queue: vmstat procs b

                  These are processes that are waiting to run but are blocked
                  on a resource (paging, I/O, and so forth).

* Swapped Queue: vmstat procs w and sar -q swpq-sz, swpocc

                  This is the number of processes that are swapped out because
                  of a shortage of RAM.

* Swap Space: vmstat swap, sar -r freeswap and swap -s

                  vmstat swap reports the available swap space in Kbytes.
hyundai3% vmstat 2
 procs     memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr f0 s1 s3 s6   in   sy   cs us sy id
 0 0 0  64244  2724   0   4  2  0  0  0  0  0  0  1  0   62   93   49  0  1 99
 0 0 0  65148  2252   0   4  0  0  0  0  0  0  0  0  0   34  198   98  2  0 98

hyundai3% sar -r 2

SunOS hyundai3 5.4 Generic_Patch sun4m    12/11/95

16:08:31 freemem freeswap
16:08:33     526   129968

hyundai3% swap -s
total: 29460k bytes allocated + 10616k reserved = 40076k used, 65168k available

                  sar -r freeswap reports swap space in 512-byte blocks.
                    
* Free Memory: vmstat free and sar -r freemem

               vmstat reports free memory in Kbytes. This is the RAM that is
               immediately available whenever a process starts up or needs
               more memory. sar reports it as freemem, in pages. The absolute
               value of this variable has no meaning; it is relative to other
               kernel thresholds.
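
Because vmstat reports Kbytes while sar -r reports 512-byte blocks (freeswap)
and pages (freemem), comparing the two needs a unit conversion. The Python
sketch below uses hypothetical helper names. The 4096-byte page size is an
assumption and should be checked on the target machine, for example with the
pagesize command.

```python
def blocks_to_kbytes(blocks, block_size=512):
    """Convert sar -r freeswap (512-byte blocks) to Kbytes."""
    return blocks * block_size // 1024


def pages_to_kbytes(pages, page_size=4096):
    """Convert sar -r freemem (pages) to Kbytes.

    ASSUMPTION: 4096-byte pages; the real page size is machine dependent.
    """
    return pages * page_size // 1024


# Values from the sar -r 2 sample above.
print(blocks_to_kbytes(129968))  # 64984
print(pages_to_kbytes(526))      # 2104
```

The converted freeswap figure of 64984 Kbytes is close to the 65148 Kbytes
shown by vmstat and the 65168k available shown by swap -s; the samples were
taken at slightly different moments.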


* Reclaims: vmstat re

              The number of pages reclaimed from the free list. See page 200.

* Minor Faults: vmstat mf and sar -p vflt

              Minor faults are caused by address space or H/W address
              translation faults.

* Other Fault Types:sar -p pflt, slock, vmstat -s copy-on-write, zero fill

  hyundai3% vmstat -s
        0 swap ins
        0 swap outs
        0 pages swapped in
        0 pages swapped out
    60401 total address trans. faults taken
     4185 page ins
        0 page outs
     7331 pages paged in
        0 pages paged out
     1676 total reclaims
     1676 reclaims from free list
        0 micro (hat) faults
    60401 minor (as) faults
     3896 major faults
    15431 copy-on-write faults
    17651 zero fill page faults
        0 pages examined by the clock daemon
        0 revolutions of the clock hand
        0 pages freed by the clock daemon
      578 forks
       69 vforks
      719 execs
   195165 cpu context switches
   771716 device interrupts
    81319 traps
  1220654 system calls
    76683 total name lookups (cache hits 92%)
       64 toolong
    10591 user   cpu
    38152 system cpu
   800334 idle   cpu
    25045 wait   cpu

               There are many kinds of page faults. Protection faults are
               caused by illegal accesses, the kind that produce
               "segmentation violation core dump" messages. The full set of
               fault types is reported by vmstat -s.

* Attach To Existing Pages: vmstat at and sar -p atch

              This is the number of attaches to shared pages that are already
              in use by other processes. (1.X)

* Pages Swapped In: vmstat -S si and sar -w swpin, bswin

              vmstat -S si reports the number of Kbytes/s swapped in.
              sar -w bswin reports the number of 512-byte blocks swapped in.

* Pages Swapped Out: vmstat -S so and sar -w swpot, bswot

              vmstat -S so reports the number of Kbytes/s swapped out.
              sar -w bswot reports the number of 512-byte blocks swapped out.

* Pages Paged In: vmstat pi and sar -p pgin, ppgin

              vmstat pi reports Kbytes/s. sar reports the number of page
              faults and the number of pages paged in from swap space or
              from the file system. Because the file system block size is
              8 K, a page fault often brings in 2 pages, or 8 K.

* Pages Freed: vmstat fr and sar -g pgfree

             Pages freed is the rate at which memory is put onto the free
             list by the page scanner daemon.
             fr is the number of Kbytes freed per second, and sar -g pgfree
             is the number of pages freed per second.
            

 
1) Swap Waste
- You have a lot of unused swap space

2) Swap Low
- There is not much swap space left, so the system is likely to run out of
  virtual memory. Reduce the size or number of programs running on the
  system, or add more swap space before it is completely exhausted.
  P4: Swap space
  - Most application vendors can tell you how much swap space their
    application needs. If you cannot find out how much swap space you need,
    configure at least 64 MB of virtual memory to start with. You can add
    more later, so do not over-allocate at first.
    In SunOS 4, swap space had to be larger than RAM, and at least 64 MB of
    swap space was needed.
    In Solaris 2, swap space should be the difference between the RAM size
    and 64 MB: 16 MB of RAM needs 48 MB of swap, 32 MB of RAM needs 32 MB of
    swap, and 64 MB or more of RAM needs no swap. If your application vendor
    says a Solaris 2 application needs 64 MB of RAM and 128 MB of swap, that
    adds up to 192 MB of virtual memory, which you could also configure as
    96 MB of RAM and 96 MB of swap. If your system runs short of swap space,
    add swap files or add RAM.
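
The Solaris 2 sizing arithmetic above can be checked with a short Python
sketch. The function names are hypothetical, and the code only restates the
rule of thumb from the text (RAM plus swap should reach the virtual memory
target); it is not a sizing tool.

```python
def solaris2_min_swap_mb(ram_mb, target_vm_mb=64):
    """Swap needed so that RAM + swap reaches the virtual memory target."""
    return max(0, target_vm_mb - ram_mb)


def swap_for_vendor_requirement(vendor_ram_mb, vendor_swap_mb, installed_ram_mb):
    """Swap needed to meet a vendor's RAM + swap total on a given machine."""
    total_vm_mb = vendor_ram_mb + vendor_swap_mb
    return max(0, total_vm_mb - installed_ram_mb)


for ram in (16, 32, 64, 128):
    print(ram, solaris2_min_swap_mb(ram))
# 16 48
# 32 32
# 64 0
# 128 0

# Vendor asks for 64 MB RAM + 128 MB swap; the machine has 96 MB RAM.
print(swap_for_vendor_requirement(64, 128, 96))
# 96
```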

3) No Swap
- The system has effectively run out of swap space. Programs will fail or
  hang. Add more swap space or kill applications immediately.


Table : Physical memory Rules for SunOS 4 and Solaris 2
---------------------------------------------------------------------------
         Physical Memory Rule                    Level         Action
---------------------------------------------------------------------------
vmstat30.sr == 0                                 White         1.RAM Waste
---------------------------------------------------------------------------
0 < vmstat30.sr < 200                            Green         No Problem
---------------------------------------------------------------------------
200 <= vmstat30.sr < 300                         Amber         2. RAM Low
---------------------------------------------------------------------------
300 <= vmstat30.sr                               Red           2. RAM Low
---------------------------------------------------------------------------

1) RAM Waste
- You have more RAM than you need.
  The system does not even need to reclaim inactive pages.

2) RAM Low
- The system is scanning through memory looking for pages to free at a 
  high rate. This indicates that, as well as inactive pages, active pages
  may be stolen from processes.
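
The physical memory rule depends only on the page scan rate (the sr column
of vmstat). A minimal Python sketch with a hypothetical function name
follows. It places the boundary values 200 and 300 in the higher level, which
is an assumption made so that every scan rate gets a level.

```python
def ram_level(scan_rate):
    """Classify physical memory pressure from the vmstat sr column."""
    if scan_rate == 0:
        return ("White", "RAM Waste")
    if scan_rate < 200:
        return ("Green", "No Problem")
    if scan_rate < 300:
        # ASSUMPTION: exactly 200 counts as Amber.
        return ("Amber", "RAM Low")
    # ASSUMPTION: exactly 300 counts as Red.
    return ("Red", "RAM Low")


# sr values 1 and 0 appear in the vmstat 30 output earlier in this section.
print(ram_level(1), ram_level(0))
# ('Green', 'No Problem') ('White', 'RAM Waste')
```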


Table : kernel memory Rules for Solaris 2
---------------------------------------------------------------------------
         Kernel Memory Rule                             Level   Action
---------------------------------------------------------------------------
((0 < sar-k1.sml_mem.fail)||(0