Subject : System failure analysis (savecore, hangup, panic, watchdog reset)

Description :
  1. Setup savecore ( 1.X and 2.X )
  2. Hangup
  3. Panic
  4. Watchdog Reset

< 1. Setup savecore >

1. Solaris 1.X : How to set up savecore

  1) Customizing /etc/rc.local

        ....
        # Default is to not do a savecore
        #
        # mkdir -p /var/crash/`hostname`
        # echo -n 'checking for crash dump... '
        # intr savecore /var/crash/`hostname`
        # echo ''

  2) The default is to not do a savecore. Remove the comment marks to
     enable it:

        # Default is to not do a savecore
        #
        mkdir -p /var/crash/`hostname`
        echo -n 'checking for crash dump... '
        intr savecore /var/crash/`hostname`
        echo ''

  3) The -p option of mkdir says to create the parent directories if they
     don't already exist.

  4) Configuring a special dump device
     - We have already discussed the dump device and know that the primary
       swap device is normally used as the dump device.

       cf) config vmunix swap on sd1b
           config vmunix swap on dumps on sd2f

2. Solaris 2.X : How to set up savecore

  1) Customizing /etc/rc2.d/S20sysetup

        ......
        ##
        ## Default is to not do a savecore
        ##
        if [ ! -d /home/lsh/crash/`uname -n` ]
        then mkdir -p /home/lsh/crash/`uname -n`
        fi
        echo 'checking for crash dump...\c '
        savecore /home/lsh/crash/`uname -n`
        echo ''
        ....

  2) Displaying the dumpfile kernel variable via adb

        hyundai3# adb -k /dev/ksyms /dev/mem
        physmem 3e1a
        dumpfile/20X
        dumpfile:
        dumpfile:       0               0               0               0
                        2f646576        2f64736b        2f633074        33643073
                        31000000        0               0               0
                        0               0               0               0
                        0               0               0               0
        dumpfile+10/X
        dumpfile+0x10:  2f646576
        dumpfile+10/s
        dumpfile+0x10:  /dev/dsk/c0t3d0s1
        $q

< 2. System Hangup >

1. What is a system hang ?

  - System hangs can be a great frustration to a system administrator.
    Sooner or later every administrator watches a system go from alive to
    barely alive, slowing down considerably, and finally ends up looking
    at a "hung" system. System hangs have many different causes, but they
    all show one common symptom: the system is no longer fully usable.
    Unlike a panic, which always leaves the system completely unusable at
    once, a hang slowly eats up system resources until the system finally
    becomes useless.
  - When you look at kernel errors you will find that not every problem
    ends in a panic with a core dump. Sometimes the system simply hangs,
    and it is then desirable to force a core dump so that the contents of
    memory can be examined to find out what caused the hang.

2. What conditions cause hangs ?

  - A common cause of a system hang is deadlock: one process waits for
    something that is locked by another process, while that other process
    waits for a resource the first process has locked.
  - System hangs can also occur when resources dry up and the system has
    to sit around waiting for more resources before it can continue doing
    what was asked of it.
  - Sometimes a system hangs because of hardware problems. For example, a
    faulty data transfer cable on a disk drive causes communication
    problems between the system and the drive, and the result is a hung
    bus.
  - When an application falls into a loop and hangs, other users of the
    system are not affected. As long as the process is not consuming disk
    space or other kernel resources, the hung program has no effect on the
    rest of the system.
  - Hangs arise under various conditions and show different
    characteristics.
    * The system may not even respond to ping, the command that sends a
      low-level ICMP request to the hung system. If it does respond, the
      kernel is at that moment still able to recognize and answer network
      interrupts.
    * The system may echo keyboard characters or follow mouse movements
      but not respond to commands or even to the abort sequence. This can
      be a hang caused by deadlock, that is, processes waiting for
      resources to become available before they can continue. In this
      case those processes never become ready to run. The ps output will
      probably show many processes in the D wait state.
    * If the hang is complete, with no keyboard echo at all, it may be a
      STREAMS problem. Sometimes even L1-A is of no use in this case.
    * On server systems the LEDs on the CPU board show the state of the
      system. In the normal case there is a bouncing or regularly moving
      light. If the lights move but very slowly, the system is very busy:
      either the kernel is looping or there is a flood of interrupts from
      external devices such as one or more modem lines. Frozen lights
      indicate a H/W problem.

3. Capturing system hang information

  - In most cases a crash dump of a hung system can be forced. However,
    this is not guaranteed for all system hang conditions. To force a
    dump you must drop down to the boot PROM monitor, suspending all
    current program execution. The key sequence is L1-A. On systems using
    ASCII terminals for the console, usually the Break key can be used to
    get to the boot PROM monitor.
  - Not all hang situations can be interrupted. If L1-A does not work,
    unplugging the console keyboard, or unplugging the terminal for a few
    minutes, sometimes does. If all of this fails, the only option left is
    to power down the system.

4. Sun-4d

  - The psrinfo (print processor info) and psradm (processor admin)
    commands are useful for displaying the status of, and controlling, a
    multiprocessor system.
  - sun4d systems (SPARCserver 1000, SPARCcenter 2000) have special H/W
    features that are useful for system diagnosis, and in addition a new
    command called prtdiag.
  - There are two different kinds of watchdog reset, and both usually
    indicate a H/W problem. A system watchdog reset is usually caused by
    a H/W error and resets the system.
  - The POST routines save information about the watchdog reset, which
    can be checked with the command prtdiag -v.
  - A local CPU watchdog reset occurs when a single processor is reset
    due to a trap occurring when traps are disabled (a "standard"
    watchdog). The system drops into the OBP.

< 3. Panic >

1. What happened ?

  - Computers crash. It's just a fact of life. Depending on the H/W and
    S/W, some crash frequently and some never crash at all.
  - Ever since UNIX has existed, many people have tried to analyze UNIX
    system crash dumps, and they have been able to determine the cause
    from the files produced after a UNIX system crashes.

2. What is a system crash ?

  - According to UNIX, computer systems have been crashing since
    midnight, January 1, 1970.
  - A system crash means the system suddenly becomes unusable, often
    under one of the following conditions: system panics and bad traps,
    watchdog resets, dropping out to the boot PROM.

3. What conditions cause panics ?

  - Some people detest panics. They are better thought of as safeguards
    of system and data integrity.
  - A system panic message has one of two causes: a software consistency
    check or a hardware fault.
  - A good O/S programmer inserts calls to the panic() routine into code
    that checks the integrity of system resources while referencing and
    manipulating them. For example, when a system programmer's code is
    about to free a disk block that is marked as in use, it first
    verifies that the block is still marked as in use. If the block turns
    out to be marked free before the code has freed it, the code must not
    free it. But how did the block become free as if by magic? How,
    where, and what went so terribly wrong? By calling panic() at this
    point the system programmer can bring the system to an abrupt halt,
    which protects the system and prevents further corruption until the
    problem has been found.
  - panic() is called only while the O/S is in kernel mode. However, any
    program that exercises a bug in the O/S can cause a panic. For
    example, a user program that uses a new device driver still being
    debugged moves into kernel mode every time the driver is used. Once
    in kernel mode, panics can happen. To the user it looks as if his
    program caused the panic, but in fact the program was only the
    trigger for the events that led to the panic.
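  The disk block check described above can be sketched in a few lines of
  Python. This is only an illustration of the idea, not kernel code: the
  names KernelPanic, panic, free_block and the in_use set are hypothetical,
  and a real panic() would dump memory and reboot instead of raising an
  exception.

```python
class KernelPanic(Exception):
    """Stands in for the kernel's panic(): stop at once, keep the evidence."""


def panic(message):
    # A real panic() dumps memory and reboots; this sketch only raises.
    raise KernelPanic("panic: " + message)


def free_block(in_use, block):
    """Release a disk block, but only if it is still marked as in use.

    in_use is a hypothetical allocation map: the set of block numbers
    that are currently allocated.
    """
    if block not in in_use:
        # The block is already free. Freeing it again would corrupt the
        # free list, so stop before any further damage is done.
        panic("free_block: freeing free block %d" % block)
    in_use.remove(block)


allocated = {10, 11, 12}
free_block(allocated, 11)          # normal case: block 11 was in use
try:
    free_block(allocated, 11)      # second free trips the consistency check
except KernelPanic as err:
    print(err)
```

  The check costs one lookup per free, and it stops the system at the first
  sign of an inconsistent allocation map rather than letting the damage
  spread.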
  - In short, if the system panics, it is because the system detected a
    condition in which data integrity is in doubt or data corruption is
    suspected.
  - Let us look at the concept of data integrity from the point of view
    of user-level programming. If you write a program that opens a file
    with the open() system call, you will probably check the status
    returned by open() to see whether the open really succeeded before
    going on to the next step. If open() failed, your program will
    probably report it and exit, prompt for a new file name, or simply
    take some other course of action. If you ignore the status returned
    by the open() system call, you will run into potential problems
    further down the line, and your data integrity will be at risk.
  - Does the car you drive have anything like a panic() routine? If it is
    fitted with an air bag, the answer is yes. If the front bumper
    suddenly senses something like a high-speed collision, the air bag
    inflates and protects the driver.
  - The software (kernel) contains a great many hardcoded validity tests
    that check for invalid pointers or impossible conditions before
    continuing. Panics can be one of two types: a regular panic message
    or an assertion.
  - For regular panic messages, the message itself is usually all you
    get. Each message is unique and identifies the problem exactly; you
    can locate it in the source code.
  - Assertion messages come from a macro that prints a message on the
    console identifying the erroneous condition, followed by the message
    "panic: assertion failed". In this case the item of interest is the
    condition message preceding the panic: line, which identifies the
    test, the file, and the line number in the code.
  - Unexpected hardware traps cause panics. This is generally the case
    when an invalid address is accessed from the kernel. Because the OS
    itself is not paged, a fault from kernel code is cause for immediate
    death.
    Unlike software panic messages, hardware traps show the exact state
    of the system and result in a traceback printed on the console. This
    usually also appears in the /var/adm/messages file.
  - Common panics that indicate a hardware-related or hardware-detected
    fault are:
    - trap : for any unexpected trap into or from kernel mode
    - bus error (Sun-3) : a kernel segmentation violation
    - text fault : an attempt to fetch an instruction from a bad place
    - data fault : generally an erroneous pointer
    - address alignment : also generally a bad pointer
    - illegal instruction : possibly an attempt to execute "data"

4. A word about bad traps

  - A computer system also crashes when the H/W detects a condition that
    should never happen. On UNIX systems this kind of crash is called a
    "bad trap", and from the system administrator's point of view bad
    traps and S/W panics should be handled in the same way. UNIX systems
    take millions of traps a day, so if you hear about a trap, don't
    panic. On rare occasions, however, you may meet a bad trap. If your
    UNIX system does, it will invoke panic().
  - In SPARC terms, a trap causes an immediate branch into kernel code,
    that is, an interruption of the normal execution of instructions.
    This interruption can be caused by a user request (a system call) or
    by some external event (a page fault, a disk interrupt, a keystroke).
    In either case the interruption is processed by the H/W and by very
    low-level software, so understanding how traps are taken and handled
    requires an understanding of the system architecture. The CPU H/W
    recognizes the type of the trap and tries to go to the correct
    location to handle it. The kernel must set up several control
    registers to make sure that the proper trap handling code is reached.
    Once the system is up and user processes are running, a trap is the
    only way for the kernel to gain control from a user program.
    A trap is the means by which a user request is processed (the kernel
    runs on behalf of the user program) and by which a device is
    controlled (the kernel runs because of some external request).

5. Kinds of traps

  - Two basic kinds of trap can occur: synchronous and asynchronous. A
    synchronous trap is caused by the operation of an instruction. It can
    be an actual trap instruction, or a H/W error such as bad address
    alignment, a bad address (bus timeout), an illegal instruction, or a
    floating-point coprocessor error. These traps are taken immediately:
    the H/W stops the operation of the current instruction in its tracks
    and heads for kernel space.
  - A synchronous trap occurs before any state in the processor has been
    changed. Therefore, when the trap was caused by a recoverable H/W
    fault, the instruction can be restarted once trap handling has
    finished. Page faults are a good example. An asynchronous trap can be
    requested at any time but is processed only after an instruction has
    completed. These traps are caused by external events such as
    interrupts. They do not affect the operation of an instruction; they
    only cause a break in the instruction stream. It is as if a
    subroutine call into the kernel had been invisibly planted in the
    code.
  - Both kinds of trap can be taken in a user program and inside the
    kernel. Both switch to kernel (supervisor) mode and transfer control
    to the kernel trap code, where software decides what to do. A page
    fault from a user program is normally acceptable: the kernel loads
    the proper page and lets the instruction continue. A page fault from
    the kernel, however, is bad news, and the trap code stops with a
    panic.

6. Trap sequence

  - The H/W performs one sequence of operations whether the trap is a
    synchronous fault or an asynchronous interrupt. Interrupt requests,
    page faults, illegal instructions, and system calls are all handled
    in the same way.
    The trap recognition sequence passes control to the kernel and enters
    kernel (supervisor) mode with saved information about where the trap
    occurred and what kind of trap it was.
  - The trap sequence as performed by the H/W looks like:
      1) Recognize the trap
      2) Get to a new window (an implicit save instruction)
      3) Set TBR according to the trap type
      4) Force a branch to the trap instructions (the address in the TBR)
  - Turning off the Enable Traps bit delays interrupt recognition, so the
    time spent with traps disabled must be kept as short as possible and
    that code must be written very carefully. If a trap is requested
    while traps are disabled, a watchdog will occur.
  - The current window pointer (CWP, in the Processor Status Register)
    points to the register window currently in use. The registers behave
    like a circular buffer, so the pointer goes round through the
    complete register set. Eventually the windows overlap and the new
    register window is not actually free for use. This is the source of
    the window overflow trap (or of a window underflow, when moving in
    the other direction). Because a trap at this moment would cause a
    watchdog reset, the CWP is simply changed and may point to an invalid
    window. For this reason the H/W and the S/W (the trap handling
    process) may use only the local registers (%l0-%l7). The other
    registers must not be touched. This produces a nonstandard stack
    frame on the stack; for example, the return address (in %i7) is not
    actually a valid pointer.
  - The Trap Base Register is normally set up once during system
    initialization and points to some page boundary.

        +-------------------+-----------+------+
        | Trap Base Address | Trap Type | 0000 |
        |     (20 bits)     |  (8 bits) |      |
        +-------------------+-----------+------+

  - The low-order bits are always 0. The next 8 bits are the trap type
    field, which the H/W fills in automatically according to the type of
    trap, as defined by the H/W.

7. Trap frames

  - A trap frame is structurally no different from any other type of
    stack frame. A trap frame holds the address of the instruction that
    caused the trap in local register %l1 and the next PC address in
    local register %l2. As mentioned above, this is done by the H/W.
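  The hardware steps above (new window, PC saved in %l1 and next PC in %l2,
  trap type written into the TBR, branch to the TBR address) can be modeled
  roughly as follows. This is a simplified sketch, not a SPARC emulator:
  the Cpu class, the NWINDOWS value of 8 and all addresses are assumptions
  made for illustration, and only synchronous traps are modeled (an
  interrupt arriving while traps are disabled would simply stay pending).

```python
NWINDOWS = 8  # assumption: the number of register windows varies by CPU


class WatchdogReset(Exception):
    """A synchronous trap was requested while traps were disabled."""


def trap_vector(tbr, trap_type):
    """Handler address: trap base address (20 bits) | type (8 bits) | 0000."""
    tba = tbr & 0xFFFFF000                # upper 20 bits, a page boundary
    return tba | ((trap_type & 0xFF) << 4)


class Cpu:
    def __init__(self, tbr):
        self.tbr = tbr
        self.cwp = 0                      # current window pointer
        self.et = True                    # Enable Traps bit in the PSR
        self.pc, self.npc = 0, 4
        self.locals = [0] * 8             # %l0-%l7 of the current window

    def take_trap(self, trap_type):
        # 1) The trap is recognized. With traps disabled there is no way
        #    to take it, so the CPU gives up: a watchdog reset.
        if not self.et:
            raise WatchdogReset("trap type %d requested with ET=0" % trap_type)
        self.et = False                   # no nested traps from here on
        # 2) Move to a new window without checking whether it is free.
        self.cwp = (self.cwp - 1) % NWINDOWS
        self.locals = [0] * 8             # stand-in for the new window's locals
        self.locals[1] = self.pc          # %l1 = address of trapping instruction
        self.locals[2] = self.npc         # %l2 = next PC
        # 3) Write the trap type into the TBR, 4) branch to that address.
        self.tbr = trap_vector(self.tbr, trap_type)
        self.pc, self.npc = self.tbr, self.tbr + 4


cpu = Cpu(tbr=0xF0004000)                 # hypothetical trap table address
cpu.pc, cpu.npc = 0xF0041000, 0xF0041004  # hypothetical faulting code
cpu.take_trap(9)                          # data access exception
print(hex(cpu.pc))                        # 0xf0004090
try:
    cpu.take_trap(9)                      # a second fault inside the handler
except WatchdogReset as err:
    print("watchdog reset:", err)
```

  Each trap type gets a 16-byte slot in the trap table, which is why the
  type is shifted left by 4 bits before being merged with the base address.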
  - The S/W that handles the trap may do other things with the registers,
    but normally at least the PC address is available in %l1.
  - Synchronous traps resulting from an instruction usually have a frame
    from the fault function or the trap function, with the trap frame
    appearing directly after it in the stack trace.
  - Asynchronous faults, which are usually caused by external device
    interrupts, can be recognized by the interrupt-handling code. This
    may be a clock function such as hardclock, or specific code dedicated
    to one particular interrupt level (level 10). A return address on the
    stack that refers to functions like these (an interrupt or fault
    handler) usually means that the frame just before it is a trap frame.
    If you look closely at such a frame, the address in %l2 is normally
    the code address in %l1 plus 4. Device interrupts are usually
    recognized by the name of the interrupt service routine, which
    normally ends in int. For example, zsint() is the service routine for
    the ZS (serial keyboard/mouse) device.

8. Trap types

  - Each trap type has a unique number, which is used to modify the Trap
    Base Register and so direct the CPU to the correct trap-handling
    routine. The types assigned by the SPARC chip specs correspond
    roughly to their priority. Trap priorities matter only when trap or
    interrupt requests arrive at the same time. After you have seen a few
    Bad Trap panics these numbers will become familiar (a data fault, for
    example, is trap type 9).
  - The most common trap types and their meanings:

          1 : Illegal instruction access (text fault)
          2 : Illegal instruction
          3 : Privileged instruction
          4 : Floating-point disabled
          5 : Window overflow
          6 : Window underflow
          7 : Memory address alignment error
          8 : Floating-point exception
          9 : Data access exception (data fault)
         17 : Interrupt level 1
         18 : Interrupt level 2
              up to
         31 : Interrupt level 15
        128 : Software trap #0
              up to
        255 : Software trap #127

9. Returning from traps

  - The system must be able to return to the code that was interrupted or
    in which the trap occurred.
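  The trap type numbering listed in section 8 can be expressed as a small
  lookup, which makes the two ranges (interrupt levels and software traps)
  explicit. This is a sketch for reading "Bad Trap" messages, not kernel
  code; the function name describe_trap is hypothetical and only the types
  listed in this document are covered.

```python
TRAP_NAMES = {
    1: "Illegal instruction access (text fault)",
    2: "Illegal instruction",
    3: "Privileged instruction",
    4: "Floating-point disabled",
    5: "Window overflow",
    6: "Window underflow",
    7: "Memory address alignment error",
    8: "Floating-point exception",
    9: "Data access exception (data fault)",
}


def describe_trap(trap_type):
    """Translate a trap type number into the description used in section 8."""
    if trap_type in TRAP_NAMES:
        return TRAP_NAMES[trap_type]
    if 17 <= trap_type <= 31:
        # Types 17..31 map onto interrupt levels 1..15.
        return "Interrupt level %d" % (trap_type - 16)
    if 128 <= trap_type <= 255:
        # Types 128..255 map onto software traps #0..#127.
        return "Software trap #%d" % (trap_type - 128)
    return "Unlisted trap type %d" % trap_type


for tt in (9, 26, 128):
    print(tt, describe_trap(tt))
```

  Type 26, for instance, decodes to interrupt level 10, the level mentioned
  in section 7.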
  - The return is performed by a special instruction, rett (return from
    trap). It undoes the sequence of events that took place when the H/W
    recognized the trap.

10. panic() routine.

  - The panic() routine abruptly interrupts all normal process
    scheduling. From the user's point of view the system is dead. panic()
    copies the contents of memory as they are to the dump device. By
    default the dump device is normally the primary swap device. It is
    rare to see a separate chunk of disk used for dumps, but the system
    can be set up that way. On most UNIX systems the dump device must be
    a disk partition. On some systems a tape drive can be specified.
  - panic() records critical information about the current state of the
    CPU. This information includes the CPU registers, the stack pointer,
    and various state registers.
  - Once panic() has finished dumping memory to the dump device, it
    reboots the system.

11. Panic messages

  - Depending on the system programmer and the operation in progress,
    some panic messages are quite terse while others provide considerable
    detail. Sometimes you will see the name of the calling routine, the
    variables in use, and even the line number in the source; at other
    times you will see only a rather cryptic word that nobody but the
    programmer can recognize.

12. Kernel Tracebacks

  - Determining the exact cause of a panic requires source code, but
    looking at the stack often yields interesting information that gives
    a clue to the nature of the problem. Sun-3 systems push parameters on
    the stack for a function call, whereas Sun-4/SPARC systems use
    registers. A Sun-3 stack traceback therefore shows a varying number
    of parameters, but a SPARC stack always shows exactly six. Some of
    these may be leftover scratch values in the registers while others
    are valid; there is no way to know how many parameters were really
    passed without checking the code.
  - The stack traceback normally shows the last routine that was called
    when the code died.
    For a H/W fault, look at the PC value for the actual location. The ?i
    command of adb shows the real function; try it. Also, on SPARC
    systems traps save the PC value in different registers, which makes
    the traceback look erroneous. In many cases on Sun-4 systems you
    should ignore the address just before the trap function, because it
    is not necessarily valid. Although it is not easy to tell what the
    parameters really are, several zeros, small constants, or odd numbers
    in the first few registers may indicate bad parameters passed down
    the call chain.
  - Many times device drivers are involved. Check for these in the
    traceback. Driver routines generally begin with a 2 or 3-letter
    abbreviation, the name printed for the device by the probe routine at
    boot time, followed by the name of the function performed. Examples
    are xystrate, zsopen, and stwrite; STREAMS-related routines begin
    with str. Also watch for interrupt service routines. If xyintr
    appears in the stack, the traceback information before it is
    generally irrelevant: the panic or trap occurred inside interrupt
    code and is probably related to the device, not to the current
    process context.

< 4. Watchdog Reset >

1. What is a watchdog ?

  - Sometimes the system prints the message "watchdog reset" on the
    console and drops down to the PROM. This is not a panic. The system
    is no longer in control: it does not dump memory to disk, and the CPU
    is reset.
  - The underlying cause of a watchdog reset may be related to H/W, but
    usually it is a S/W problem. The immediate cause is a trap, such as a
    page fault, that occurs while another trap is being handled. The
    kernel handles a trap with the Enable Traps bit in the PSR (Processor
    Status Register) reset (turned off), which prevents the CPU from
    processing another trap until the first one has been dealt with. In
    other words, no other trap may be generated until the system has
    completely handled the first trap.
    If a trap does occur during this period for any reason, the system
    would have to take the trap, but it cannot because the bit is off, so
    the system quits on the spot. This is a watchdog reset: an
    unrecoverable situation that essentially forces the CPU into the
    reset state. The only thing you can do after a watchdog reset is
    reboot.
  - Because of the nature of a watchdog reset, even kadb cannot catch one
    when it happens. However, with a few OpenBoot PROM commands you can
    collect some status information before rebooting.

2. Can you get a core file ?

  - Not usually. Given the destructive nature of a watchdog, even if you
    do see the boot PROM ok prompt, the CPU registers have already been
    trashed, and the sync command will either fail or give you a useless
    core dump, one that is unreadable or contains no good data to
    examine. It is always worth a try, but there are other things you
    should do first.

3. What do you do next ?

  - Once you have the boot PROM ok prompt, several important PROM
    commands are available. The following commands dump out information
    about the state of the system at the time it took the watchdog.
    * .registers - Display many of the kernel internal CPU registers.
    * .locals - Dumps out the registers in the current register "window."
      These are the registers that were in use at the time of the crash.
    * .psr - Prints the Processor Status Register contents in a readable
      format.
    * ctrace - Displays the return stack (like $c in adb).
    * wd-dump (sun4d only)
  - Unfortunately the kernel is not running at this point, so you cannot
    capture this information in a file. You will probably have to write
    it down on paper.

4. Watchdog analysis.

  - A watchdog reset occurs while the system is processing traps, so the
    actual PC value is of little use. The code to examine is the kernel
    trap handling code, and the trace information is the most important
    and useful output. While you are using the PROM the kernel is not
    running, and the symbol table is not available to the PROM code.
    The output of the PROM commands is therefore entirely hexadecimal:
    raw numeric addresses. After the system has been rebooted, you can
    use adb on the live system to try to match them to functions in the
    kernel. address/i displays the location and the instruction for each
    address taken from the stack trace.

5. Summary

  - Analyzing watchdog reset is not an easy task. Only a few PROM
    commands are available, and you will not always get useful
    information in return for your effort. If a number of watchdog resets
    occur, you may get consistent results and learn which functions are
    involved. Although watchdog resets are said to be software problems,
    they are often related to a specific piece of H/W (CPU, memory,
    M/B...). With luck, this can be seen from the stack trace. When
    dealing with a system that suffers from watchdog resets, look at the
    system as a whole, both H/W and S/W, to find where the problem lies.

Revision History
  Date written : 96.06.13
  Author       : 이승훈
  Date revised :
  Revised by   :