Subject : System failure analysis (savecore, hangup, panic, watchdog reset)

Description :

1. Setup savecore ( 1.X and 2.X )
2. Hangup
3. Panic
4. Watchdog Reset


		< 1. Setup savecore >



1. Solaris 1.X : How to setup savecore

1) Customizing /etc/rc.local
....
# Default is to not do a savecore
#
# mkdir -p /var/crash/`hostname`
# echo -n 'checking for crash dump... '
# intr savecore /var/crash/`hostname`
# echo ''

2) Remove the leading "#" from the savecore lines to enable savecore at boot.

# Default is to not do a savecore
#
mkdir -p /var/crash/`hostname`
echo -n 'checking for crash dump... '
intr savecore /var/crash/`hostname`
echo ''

3) The -p option of mkdir creates the parent directories if they
  do not already exist.

4) Configuring a special dump device

- We have already discussed the dump device and know that the primary swap
  device is normally used as the dump device.

cf) config vmunix swap on sd1b
    config vmunix swap on dumps on sd2f


2. Solaris 2.X : How to setup savecore

1) Customizing /etc/rc2.d/S20sysetup
......
##
## Default is to not do a savecore
##
if [ ! -d /home/lsh/crash/`uname -n` ]
then mkdir -p /home/lsh/crash/`uname -n`
fi
                echo 'checking for crash dump...\c '
savecore /home/lsh/crash/`uname -n`
                echo ''
....

2) Displaying the dumpfile kernel variable via adb

hyundai3# adb -k /dev/ksyms /dev/mem
physmem 3e1a
dumpfile/20X
dumpfile:
dumpfile:       0               0               0               0
                2f646576        2f64736b        2f633074        33643073
                31000000        0               0               0
                0               0               0               0
                0               0               0               0
dumpfile+10/X
dumpfile+0x10:  2f646576
dumpfile+10/s
dumpfile+0x10:  /dev/dsk/c0t3d0s1
$q



		< 2. System Hangup>


1. What is a system hang ?

- System hangs can be a major frustration for a system administrator.
  Sooner or later every administrator watches a system that is alive one
  moment and dead the next, or that slows to a crawl, and eventually ends up
  facing a "hung" system. System hangs have many different causes, but they
  share one common symptom: the system is no longer fully usable. Unlike
  panics, which always leave the system completely unusable, a system hang
  may eat up system resources slowly until the system finally becomes
  completely useless.

- When looking at kernel errors, you will find that not every problem
  causes a panic with a core dump. Sometimes a system hangs instead, and it
  is then desirable to force a core dump so that the contents of memory can
  be examined to find out what caused the hang.

2. What conditions cause hangs ?

- A common cause of system hangs is deadlock: one process waits for
  something locked by another process, while that other process waits for a
  resource locked by the first.

- System hangs can also occur when resources dry up and the system has to sit
  around waiting for more resources before it can continue doing what was
  asked of it.

- Sometimes a system hangs because of hardware problems. For example, a
  faulty data transfer cable attached to a disk drive causes communication
  problems between the system and the disk drive.
  The result is a hung bus.

- When an application hangs in a loop, the other users of the system are
  not affected. That is, as long as the process is not eating up disk space
  or other kernel resources, the hung program does not affect the rest of
  the system.

- Hangs arise under a variety of conditions and show different symptoms.

   * First, the system may not even respond to ping, a command that sends a
   low-level ICMP request to the hung system.
   If it does respond, the kernel is still alive enough to recognize and
   answer network interrupts.

   * The system may echo keyboard characters or follow mouse movements yet
    fail to respond to commands or even to the abort sequence. This can be a
    hang caused by deadlock, in which processes wait for resources to become
    available before they can continue. In that case the processes never
    become ready to run. The output of ps will probably show many processes
    in the D wait state.

   * If the hang is complete, with not even keyboard echo, it may be a
    STREAMS problem. Sometimes even L1-A has no effect in this case.

   * On server systems, the LEDs on the CPU board indicate the state of the
   system. Normally they show a bouncing or regularly moving light. If the
   lights are moving but very slowly, the system is very busy.
   Either the kernel is looping, or one or more external devices, such as
   modem lines, are generating a flood of interrupts.
   Frozen lights indicate a hardware problem.

3. Capturing system hang information

- In most cases a crash dump of a hung system can be forced, although this
  is not guaranteed for every system hang condition.
  To force a dump you must drop to the boot PROM monitor, which suspends
  all current program execution. On a Sun keyboard this is done with L1-A.
  On systems using ASCII terminals for the console, usually the Break key can 
  be used to get to the boot PROM monitor.

- Not every hang can be interrupted. If L1-A does not work, unplugging the
  console keyboard, or the terminal, for a few minutes sometimes does.
  If all of this fails, the only option left is to power the system down.


4. Sun-4d

- The psrinfo (print processor info) and psradm (processor admin) commands
  are useful for displaying the status of a multiprocessor system and for
  controlling it.

- sun4d systems (SPARCserver 1000, SPARCcenter 2000) have, in addition to
  special hardware features that help with diagnosis, a new command called
  prtdiag.

- There are two different kinds of watchdog reset, and both usually
  indicate a hardware problem. A system watchdog reset is usually caused by
  a hardware error and resets the whole system.
  
- The POST routines save information about the watchdog reset, which can
  then be displayed with the prtdiag -v command.

- A local CPU watchdog reset occurs when a single processor is reset due to
  a trap occurring while traps are disabled (a "standard" watchdog).
  The system drops into the OBP.




		< 3. Panic >


1. What happened ?

- Computers crash. It's just a fact of life.
  Depending on the hardware and software, some crash often and some never do.

- For as long as UNIX has existed, many people have wanted to analyze UNIX
  system crash dumps, and they have been able to work out the cause from the
  files a UNIX system produces after a crash.


2. What is a system crash ?

- According to UNIX, computer systems have been crashing since midnight,
  January 1, 1970.
 
- A system crash is a condition in which the system suddenly becomes
  unusable, usually in one of the following ways:
  system panics and bad traps, watchdog resets, dropping out to the boot PROM.


3. What conditions cause panics ?

- Some people detest panics. Panics are better thought of as safeguards
  for the system and for data integrity.

- A system panic message has one of two causes:
   a failed software consistency check, or a hardware fault.

- A good OS programmer checks the integrity of system resources while
  referencing and manipulating them, and inserts calls to the panic()
  routine into that code. For example, when a system programmer's code is
  about to free a disk block that is marked as in use, it first verifies
  that the block is still marked as in use. If the block turns out to be
  marked free before the code has freed it, the code must not free it.
  But how did the block magically become free? How, where, and what went so
  badly wrong? By calling panic() at this point, the system programmer
  brings the system to an abrupt halt, which protects the system and
  prevents further corruption until the problem is found.
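
  The block-freeing check described above can be sketched in C as follows.
  This is only an illustrative user-space model, not kernel source: NBLOCKS,
  in_use, panic_sketch(), mark_in_use() and free_block() are hypothetical
  names, and panic_sketch() merely stands in for the kernel's panic() by
  printing a message and aborting.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define NBLOCKS 64                 /* hypothetical number of disk blocks */

static bool in_use[NBLOCKS];       /* hypothetical allocation map */

/* Stand-in for the kernel's panic(): report the problem and stop at once. */
static void panic_sketch(const char *msg)
{
    fprintf(stderr, "panic: %s\n", msg);
    abort();
}

/* Mark a block as allocated. */
void mark_in_use(int blk)
{
    if (blk < 0 || blk >= NBLOCKS)
        panic_sketch("mark_in_use: bad block number");
    in_use[blk] = true;
}

/* Free a block, but only after checking that it is still marked in use. */
void free_block(int blk)
{
    if (blk < 0 || blk >= NBLOCKS)
        panic_sketch("free_block: bad block number");
    if (!in_use[blk])
        panic_sketch("free_block: freeing free block"); /* check failed */
    in_use[blk] = false;
}
```

  Calling free_block() twice on the same block trips the second check and
  stops the program, which is the user-space analogue of halting the system
  before the allocation map is corrupted any further.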

- panic() is called only while the OS is in kernel mode. However, any
  program that exercises a bug in the OS can cause a panic. For example, a
  user program that uses a new device driver still being debugged enters
  kernel mode every time the driver is used, and once in kernel mode a
  panic can occur. To the user it looks as if the program caused the panic,
  but the program was only the trigger for the events that led to it.
  In short, if a system panics, it is because the system detected a
  condition under which data integrity was in doubt or data corruption was
  suspected.

- Consider the concept of data integrity from the point of view of
  user-level programming. If you write a program that opens a file with the
  open() system call, you will probably check the status returned by open()
  to confirm that the open really succeeded before moving on to the next
  step. If open() failed, your program will probably report the failure and
  exit, prompt for a new file name, or simply take some other course of
  action. If you ignore the status returned by open(), you will run into
  trouble somewhere further down the line, and the integrity of your data
  will be at risk.
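
  A minimal sketch of that check in C, assuming a POSIX system. The helper
  name open_checked() is hypothetical; open() and perror() are the standard
  calls.

```c
#include <fcntl.h>
#include <stdio.h>

/*
 * Open a file read-only and check the status before going any further.
 * Returns the file descriptor, or -1 after reporting why open() failed.
 */
int open_checked(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd == -1) {
        perror(path);   /* report the failure instead of ignoring it */
        return -1;
    }
    return fd;
}
```

  The caller decides what to do with a -1 result: exit, prompt for another
  file name, or take some other action. What it must not do is carry on as
  if the descriptor were valid.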

- Does the car you drive have anything like a panic() routine?
  If it is fitted with an air bag, the answer is yes. When the car senses
  something like a high-speed collision at the front bumper, the air bag
  inflates to protect the driver.

- The software (kernel) contains many hardcoded validity tests that check
  for invalid pointers or impossible conditions before continuing. The
  resulting panic is one of two types: a regular panic message or an
  assertion.
   - For regular panic messages, the message itself is usually all you get.
   Each message is unique and identifies the problem precisely; it appears
   in exactly one place in the source code.

- Assertion messages come from a macro that prints a message describing
   the erroneous condition on the console, followed by the message
   "panic: assertion failed". In this case the item of interest is the
   condition message that precedes the panic: line; it gives the failed
   test, the source file, and the line number within the code.
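
  An assertion macro of this kind might look like the sketch below. The
  names ASSERT_SKETCH, assert_failed_sketch() and format_assertion() are
  hypothetical, and the exact wording of the condition message is an
  assumption; only the "panic: assertion failed" line is taken from the
  text above.

```c
#include <stdio.h>
#include <stdlib.h>

/* Build the condition message: the failed test, the file and the line. */
int format_assertion(char *buf, size_t n, const char *test,
                     const char *file, int line)
{
    return snprintf(buf, n, "assertion failed: %s, file: %s, line: %d",
                    test, file, line);
}

/* Print the condition message, then the panic line, and stop. */
void assert_failed_sketch(const char *test, const char *file, int line)
{
    char msg[256];

    format_assertion(msg, sizeof msg, test, file, line);
    fprintf(stderr, "%s\n", msg);
    fprintf(stderr, "panic: assertion failed\n");
    abort();
}

/* #cond turns the tested expression into the text that gets printed. */
#define ASSERT_SKETCH(cond) \
    ((cond) ? (void)0 : assert_failed_sketch(#cond, __FILE__, __LINE__))
```

  A call such as ASSERT_SKETCH(bp != NULL) costs one comparison when the
  condition holds, and names the test, file and line when it does not.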

- Unexpected hardware traps cause panics. This generally happens when the
  kernel accesses an invalid address. Because the OS itself is not paged, a
  fault from kernel code means immediate death. Unlike software panic
  messages, hardware traps produce a dump of the exact system state and a
  traceback printed on the console.
  This usually also appears in the /var/adm/messages file.
 
- The usual panics that indicate a hardware-related or hardware-detected
   fault are:
    - trap : for any unexpected trap into or from kernel mode
    - bus error (Sun-3) : a kernel segmentation violation.
    - text fault : an attempt to fetch an instruction from a bad place.
    - data fault: generally an erroneous pointer
    - address alignment: also generally a bad pointer.
    - illegal instruction : possibly an attempt to execute "data"



4. A word about bad traps

- A computer system also crashes when the hardware detects a condition that
  should never occur. On UNIX systems this kind of crash is called a
  "bad trap". From the system administrator's point of view, bad traps and
  software panics should be handled in the same way. UNIX systems process
  millions of traps a day, so do not panic just because you hear the word
  "trap". On rare occasions, however, a system does run into a bad trap,
  and when it does it invokes panic().


- In SPARC terms, a trap causes an immediate branch into kernel code,
  interrupting the normal execution of instructions.
  The interruption may be caused by a user request (a system call) or by
  some external event (a page fault, a disk interrupt, a keystroke).
  In every case the interruption is processed by the hardware and by very
  low-level software, so understanding how traps are taken and handled
  requires an understanding of the system architecture.
  The CPU hardware recognizes the type of trap and tries to go to the
  correct location to handle it. The kernel must set up several control
  registers to make sure the appropriate trap-handling code is reached.
  Once the system is up and user processes are running, a trap is the only
  way the kernel can regain control from a user program.
  A trap is the means by which a user request is processed (the kernel runs
  on behalf of the user program) and by which a device is controlled (the
  kernel runs because of some external request).

5. Kinds of traps

- Two basic kinds of trap can occur: synchronous and asynchronous.
  A synchronous trap is caused by an operation or instruction.
  It may be an actual trap instruction, or a hardware-detected error such
  as bad address alignment, a bad address (bus timeout), an illegal
  instruction, or a floating-point coprocessor error. These traps are taken
  immediately: the hardware stops the current instruction in its tracks and
  heads for kernel space.

- A synchronous trap is taken before the instruction changes any state in
  the processor. So when the trap is due to a recoverable hardware fault,
  the instruction can be restarted once trap handling has finished and the
  problem has been repaired. Page faults are a good example.
  An asynchronous trap can be requested at any time but is processed only
  when an instruction has fully completed.
  These traps are caused by external events such as interrupts.
  They do not affect the operation of any instruction; they only cause a
  break in the instruction stream. It is as if a subroutine call into the
  kernel had been invisibly planted in the running code.

- Both kinds of trap can occur in user programs and inside the kernel.
  Both switch the processor to kernel (supervisor) mode and transfer
  control to the kernel trap code, where software decides what to do.
  Thus a page fault from a user program is generally acceptable: the kernel
  loads the appropriate page and lets the instruction continue.
  A page fault from the kernel, however, is bad news, and the trap code
  stops with a panic.

6. Trap sequence

- The hardware performs the same sequence of operations whether the trap
  is a synchronous fault or an asynchronous interrupt.
  Interrupt requests, page faults, illegal instructions, and system calls
  are all handled in the same way.
  The trap recognition sequence passes control to the kernel, entering
  kernel (supervisor) mode with saved information about where the trap
  occurred and what kind of trap it was.

- trap sequence as performed by the H/W looks like:
  1) Recognize the trap
  2) Get to a new window ( an implicit save instruction)
  3) Set TBR according to the trap type
  4) Force a branch to the trap instructions. - the address in the TBR

- Turning off the Enable Traps bit delays interrupt recognition, so traps
  should be disabled for as short a time as possible. Code that runs with
  traps disabled must be written very carefully: if a trap is requested
  while traps are disabled, a watchdog reset occurs.

- The current window pointer (CWP, in the Processor Status Register)
  identifies the register window currently in use. The register windows
  behave like a circular buffer, so stepping through the complete register
  set eventually wraps around. The windows then overlap, and the new
  register window pointed to is not actually free for use. This is the
  source of the window overflow trap (or a window underflow, when moving in
  the other direction).
  Because a trap at this moment would cause a watchdog reset, the CWP is
  changed anyway and ends up pointing at an invalid window. For this reason
  the hardware and the software (the trap handling code) may use only the
  local registers (%l0-%l7). The other registers must not be touched.
  This creates a nonstandard stack frame on the stack; for example, the
  return address (in %i7) is not a valid pointer.
  The Trap Base Register is normally set up once during system
  initialization and points to some page boundary.

          Trap Base Address           Trap Type   0000
             (20 bits)                 (8 bits) 

  - The low 4 bits are always 0. The next 8 bits are the trap type field,
   which the hardware fills in automatically according to the type of trap.
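
  The layout above implies a simple address calculation, sketched here in
  C. The function and macro names are hypothetical and the base address in
  the usage note is made up; the field widths (20, 8 and 4 bits) are the
  ones shown in the diagram.

```c
#include <stdint.h>

#define TBR_BASE_MASK 0xFFFFF000u  /* upper 20 bits: trap base address */
#define TBR_TT_SHIFT  4            /* trap type sits above 4 zero bits */
#define TBR_TT_MASK   0xFFu        /* 8-bit trap type field */

/* Address the CPU branches to for trap type tt, given the current TBR. */
uint32_t trap_vector(uint32_t tbr, unsigned tt)
{
    return (tbr & TBR_BASE_MASK) |
           ((uint32_t)(tt & TBR_TT_MASK) << TBR_TT_SHIFT);
}

/* Extract the trap type that the hardware stored in the TBR. */
unsigned tbr_trap_type(uint32_t tbr)
{
    return (unsigned)((tbr >> TBR_TT_SHIFT) & TBR_TT_MASK);
}
```

  With a made-up base of 0xF0004000, trap type 9 lands at 0xF0004090.
  Because the low 4 bits are zero, consecutive trap types are 16 bytes
  apart.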

7. Trap frames

- Structurally, a trap frame is no different from any other type of stack
  frame. A trap frame holds the address of the instruction that caused the
  trap in local register %l1 and the next PC address in local register %l2.
  As mentioned above, this is done by the hardware.
  The software that handles the trap may do other things with the
  registers, but usually at least the PC address is available in %l1.
- Synchronous traps resulting from an instruction usually show a frame
  from the fault function or trap function in the stack trace, with the
  trap frame appearing immediately after it.
- Asynchronous faults, usually caused by external device interrupts, can
  be recognized by the interrupt-handling code. This may be the clock
  function, hardclock, or specific code dedicated to one particular
  interrupt level (level 10). On the stack, the return address referring to
  such functions, whether interrupt or fault handlers, usually points to
  the trap frame immediately preceding it. If you look carefully at such a
  frame, with the code address in %l1, the address in %l2 is usually that
  address plus 4.
  Device interrupts are usually recognized by the name of the interrupt
  service routine, which usually ends in int. For example, zsint() is the
  service routine for the ZS (serial keyboard/mouse) device.
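
  The relation between %l1 (PC) and %l2 (next PC) in a trap frame can be
  checked with a one-line helper, sketched below. The function name is
  hypothetical. As an assumption stated here rather than in the text above:
  when the two values are not 4 bytes apart, one possible explanation is
  that the trapped instruction sat in the delay slot of a taken branch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True when the saved next PC (%l2) is simply the instruction after the
 * saved PC (%l1), which is the usual case in a trap frame.
 */
bool npc_is_sequential(uint32_t l1_pc, uint32_t l2_npc)
{
    return l2_npc == l1_pc + 4u;
}
```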

8. Trap types

- Each trap type has a unique number, which is used to modify the Trap
  Base Register and so direct the CPU to the correct trap-handling routine.
  The types assigned by the SPARC chip specifications correspond roughly to
  their priority. Trap priorities matter only when trap or interrupt
  requests arrive simultaneously. After you have seen a few Bad Trap
  panics, these numbers will become familiar (for example, a data fault is
  trap type 9).

- The most common trap types and their meanings:
   1 : Illegal instruction access (text fault)
   2 : Illegal instruction
   3 : Privileged instruction
   4 : Floating-point disabled
   5 : Window overflow
   6 : Window underflow
   7 : Memory address alignment error
   8 : Floating-point exception
   9 : Data access exception ( data fault)
   17: Interrupt level 1
   18: Interrupt level 2 up to
   31: Interrupt level 15
   128: Software trap #0 up to
   255: Software trap #127
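
  When reading a Bad Trap message it can help to turn the number back into
  a name. The sketch below does so for the types in the table above. The
  function name describe_trap() is hypothetical, and anything outside the
  table is reported as unknown rather than guessed.

```c
#include <stdio.h>

/* Write a description of trap type tt into buf and return buf. */
const char *describe_trap(unsigned tt, char *buf, size_t n)
{
    static const char *const low[] = {
        NULL,
        "Illegal instruction access (text fault)",
        "Illegal instruction",
        "Privileged instruction",
        "Floating-point disabled",
        "Window overflow",
        "Window underflow",
        "Memory address alignment error",
        "Floating-point exception",
        "Data access exception (data fault)",
    };

    if (tt >= 1 && tt <= 9)
        snprintf(buf, n, "%s", low[tt]);
    else if (tt >= 17 && tt <= 31)       /* interrupt levels 1..15 */
        snprintf(buf, n, "Interrupt level %u", tt - 16);
    else if (tt >= 128 && tt <= 255)     /* software traps #0..#127 */
        snprintf(buf, n, "Software trap #%u", tt - 128);
    else
        snprintf(buf, n, "Trap type %u (not in this table)", tt);
    return buf;
}
```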

9. Returning from traps

- The system must be able to return to the code that was interrupted or in
  which the trap occurred. A special instruction, rett, performs this
  return-from-trap operation. It undoes the sequence of events that took
  place when the hardware recognized the trap.

10. panic() routine.

- The panic() routine abruptly interrupts all normal process scheduling.
  From the user's point of view the system is dead. panic() copies the
  contents of memory as-is to the dump device. By default the dump device
  is normally the primary swap device. It is rare to see a separate chunk
  of disk set aside for dumps, although the system can be set up that way.
  On most UNIX systems the dump device must be a disk partition. On some
  systems a tape drive can be specified.

- panic() records critical information about the current state of the CPU.
  This information includes the CPU registers, the stack pointer, and
  various state registers.

- Once panic() has finished dumping memory to the dump device, it reboots
  the system.


11. Panic messages

- Depending on the system programmer and on the operation in progress,
  some panic messages are quite terse, while others are fairly detailed.
  Sometimes you will see the name of the calling routine, the variables in
  use, and even the source line number; sometimes you will see only a
  cryptic word that means something to the programmer alone.


12. Kernel Tracebacks 

- Determining the exact cause of a panic requires the source code, but
  looking at the stack often yields interesting information that provides
  clues to the nature of the problem.
  Sun-3 systems push the parameters of a function call onto the stack,
  whereas Sun-4/SPARC systems use registers.
  A Sun-3 stack traceback therefore shows a varying number of parameters,
  but a SPARC stack always shows exactly six.
  Some of these may be scratch registers while others are valid; there is
  no way to tell how many parameters were actually passed without checking
  the code.

- The stack traceback usually shows the last routine called when the code
  died. For a hardware fault, the PC value gives the actual location;
  use adb's ?i command to display the real function. Also, on SPARC
  systems traps store the PC value in a different register, which can
  produce what looks like an erroneous traceback. In many cases on Sun-4
  systems you should ignore the address immediately preceding the trap
  function, because it is not necessarily valid.
  Although it is not easy to determine what the parameters really are, a
  run of zeros, small constants, or odd numbers in the first few registers
  may indicate bad parameters passed down the call chain.

- Many times device drivers are involved.
  Check for these in the traceback.
  Driver routines generally begin with a 2- or 3-letter abbreviation, which
  is used as the prefix of the function name and is printed as the device
  name by the probe routine at boot time.
  Examples are xystrate, zsopen and stwrite, and str for STREAMS-related
  routines.
  Also watch for interrupt service routines. If xyintr appears in the
  stack, the traceback information before it is generally irrelevant:
  the panic or trap occurred in interrupt code and is probably related to
  the device, not to the current process context.



		< 4. Watchdog Reset >



1. What is a watchdog ?

- Sometimes a system prints the message "watchdog reset" on the console
  and drops to the PROM. This is not a panic: the kernel is no longer in
  control. Memory is not dumped to disk, and the CPU is simply reset.

- The underlying cause of a watchdog reset may be related to hardware, but
  it is usually a software problem. The immediate cause is a trap, such as
  a page fault, that occurs while another trap is being handled. The kernel
  handles a trap with the Enable Traps bit in the PSR (Processor Status
  Register) reset (turned off), which prevents the CPU from taking another
  trap until handling of the first one is far enough along. In other
  words, no further trap is supposed to occur until the system has dealt
  with the first trap. If for some reason a trap does occur during this
  period, the system must take the trap but cannot, because the bit is
  off, so it quits on the spot. That is a watchdog reset: an unrecoverable
  situation in which the CPU is essentially forced into its reset state.
  After a watchdog reset, the only thing you can do is reboot.

- Because of the nature of a watchdog reset, not even kadb can catch one
  when it happens. However, you can obtain some status information before
  rebooting simply by using a few OpenBoot PROM commands.

2. Can you get a core file ?

- Not usually. Because of the destructive nature of a watchdog reset, even
  if you do get the boot PROM ok prompt, the CPU registers have already
  been clobbered, and the sync command will either fail or produce a
  useless core dump: one that is unreadable or contains no good data to
  examine. It is always worth a try, but there are other things you should
  do first.

3.  What do you do next ?

- Once you have the boot PROM ok prompt, a few important PROM commands are
  available for dumping out information about the state of the system at
  the time of the watchdog reset:
* .registers : Displays many of the internal CPU registers.
* .locals - Dumps out the registers in the current register "window."
            These are the registers that were in use at the time of the crash.
* .psr - Prints the Processor Status Register contents in a readable format.
* ctrace - Displays the return stack (like $c in adb)
* wd-dump (sun4d only)
- Unfortunately the kernel is not running at this point, so you cannot
  save this information to a file. You will probably have to write it down
  on paper.
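
  If only a raw PSR value has been written down, its fields can be decoded
  by hand or with a small helper like the sketch below. The bit positions
  follow the SPARC V8 PSR layout, which is background knowledge rather than
  something stated in this note. The struct and function names are
  hypothetical, and the value in the usage note is made up.

```c
#include <stdint.h>

struct psr_fields {
    unsigned pil;  /* processor interrupt level, bits 11..8 */
    unsigned s;    /* supervisor mode, bit 7 */
    unsigned ps;   /* previous supervisor mode, bit 6 */
    unsigned et;   /* Enable Traps, bit 5 */
    unsigned cwp;  /* current window pointer, bits 4..0 */
};

/* Split a raw PSR value into the fields relevant to watchdog analysis. */
struct psr_fields decode_psr(uint32_t psr)
{
    struct psr_fields f;

    f.pil = (psr >> 8) & 0xFu;
    f.s   = (psr >> 7) & 1u;
    f.ps  = (psr >> 6) & 1u;
    f.et  = (psr >> 5) & 1u;
    f.cwp = psr & 0x1Fu;
    return f;
}
```

  For a made-up value whose low 12 bits are 0xFC2, this gives PIL 15, S 1,
  PS 1, ET 0 and CWP 2: supervisor mode with traps disabled, the state in
  which a further trap leads to a watchdog reset.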

4. Watchdog analysis.

- Because a watchdog reset occurs while the system is processing a trap,
  the actual PC value is of little use: it will lie in the kernel trap
  handling code. The trace information is the most important and useful
  output. While you are in the PROM the kernel is not running, and its
  symbol table is not available to the PROM code, so the output of the PROM
  commands is entirely hexadecimal: raw numeric addresses. After the
  system has rebooted, you can use adb on the live system to try to match
  the addresses to functions in the kernel. address/i displays the location
  and the instruction for each address taken from the stack trace.
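
  The mapping from a raw address to "function plus offset" amounts to
  finding the nearest symbol at or below the address. The sketch below
  shows the idea on a small in-memory table. It is not how adb is
  implemented; struct ksym and nearest_symbol() are hypothetical names, and
  the table contents are supplied by the caller.

```c
#include <stddef.h>
#include <stdint.h>

struct ksym {
    uint32_t    addr;   /* start address of the symbol */
    const char *name;   /* symbol name */
};

/*
 * Return the symbol with the highest start address that is <= addr,
 * or NULL if addr lies below every symbol in the table.
 */
const struct ksym *nearest_symbol(const struct ksym *tab, size_t n,
                                  uint32_t addr)
{
    const struct ksym *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if (tab[i].addr <= addr &&
            (best == NULL || tab[i].addr > best->addr))
            best = &tab[i];
    }
    return best;
}
```

  The offset to print next to the name is addr minus the returned symbol's
  start address.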

5. Summary

- Analyzing a watchdog reset is not an easy task. Only a few PROM commands
  are available, and the information you get is not always worth the
  effort. If watchdog resets occur repeatedly, you may get consistent
  results and learn which functions are involved.
  Even though watchdog resets are software problems, they can often be
  tied to a particular piece of hardware (CPU, memory, motherboard...).
  With luck this can be seen from the stack trace. When dealing with a
  system that suffers from watchdog resets, look at the system as a whole:
  both the hardware and the software may be at fault.




Revision History

Date written : 96.06.13
Author : 이승훈

Date revised : 
Revised by :