All times are UTC - 6 hours





PostPosted: Sat Apr 07, 2007 11:21 pm 
Joined: Sat Feb 05, 2005 3:26 pm
Posts: 121
Location: Calgary, Alberta - Canada!
Hi Guys,

Well, I'm still running an older backend (R5D1). I swapped out my backend hardware for some newer equipment, and now I'm having some weird issues. Everything otherwise works fine, i.e. I can watch TV with no problems on any of my frontends. However, every night the backend slows right the heck down on me; top is reporting load averages of

top - 22:53:07 up 12:43, 3 users, load average: 4.10, 2.16, 0.87

It is extremely weird; this always happens around 10:49 at night. The only way to fix it is a reboot. Going back through the logs I don't find anything that stands out, and top is not reporting anything that is taking too much CPU.

Can anyone think of any other places I could look, or have ideas about what may have happened?

The hardware upgrade was:

before: AMD Athlon XP 1900+
1.5 GB SDRAM (PC133)

to: Pentium 4 HT
1 GB DDR 400 (I think)

Everything else is the same: HD, DVD burner, video card.

Thanks!


Last edited by j0ly on Tue Apr 24, 2007 9:54 am, edited 1 time in total.


PostPosted: Sun Apr 08, 2007 9:11 am
Joined: Thu Mar 25, 2004 11:00 am
Posts: 9551
Location: Arlington, MA
j0ly wrote:
It is extremely weird; this always happens around 10:49 at night.

Do you have a UPS on the system? I know that the ones on my system used to beep (until I turned off the brown out alert) at around 6:50AM every weekday as a large number of coffee machines, setback thermostats, clock radios, ... all went on at the same time. ;-)

I'm a bit fanatical that every system and especially every server deserves its own UPS. It's the legacy of 11 years in lightning country, where no tree over 60 feet tall had a living top, there was a 50% chance of thunderstorms every day in the summer time, and the power surges killed two external modems and a half dozen surge suppressors. Thankfully none of the protected systems suffered as the surge suppressors "took one for the team".


PostPosted: Sun Apr 08, 2007 11:26 am
Joined: Sat Feb 05, 2005 3:26 pm
Posts: 121
Location: Calgary, Alberta - Canada!
I do have a UPS on the MBE, but I don't think that is it. I think the problem is a process taking up some serious CPU. I can log in to the backend (ssh) and use top, but it takes like 10 minutes to load top. Nothing seems to be taking huge CPU time, but the system is reporting a load of around 4.0.

If I hard reboot the system (holding in power on the box and then powering back on), the system is fine for a day.

So it seems that something running every day at around the same time is killing my box. I thought it might be mythfilldatabase, as that is something that runs every day, but that seems to be fine.

Then last night I was watching TV (a recorded show) and my frontend froze. Again I could ssh in to the system, but the response time was horrible. Checking the logs this morning shows that at the time of the freeze the last thing run in the syslog was

Quote:
Apr 7 23:35:01 mbe /USR/SBIN/CRON[6783]: (root) CMD ( nice -n 19 /usr/local/bin/mythrename.pl --link /myth/pretty)


I'm wondering if this is killing my box somehow... Does anyone else have any other ideas?
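
One way to check whether a scheduled job lines up with the slowdown is to list what cron started around that time. This is only a sketch: the helper name cron_jobs_at is made up for illustration, /var/log/syslog is an assumed default path that varies by distribution, and the log layout is assumed to match the quoted line above.

```shell
# Hypothetical helper: list cron commands logged at a given time prefix,
# e.g. "23:3" matches anything from 23:30:00 to 23:39:59.
# Assumes the classic syslog layout "Mon D HH:MM:SS host ...".
cron_jobs_at() {
  time_prefix=$1
  log_file=${2:-/var/log/syslog}   # assumed default; adjust for your system
  awk -v t="$time_prefix" '
    # Keep only cron entries whose time field starts with the prefix
    /CRON\[[0-9]+\]/ && index($3, t) == 1 { print }
  ' "$log_file"
}
```

For example, `cron_jobs_at 22:4` would list anything cron started between 22:40 and 22:49, which can then be compared against the time the load starts climbing.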


PostPosted: Sun Apr 08, 2007 2:51 pm
Site Admin
Joined: Fri Sep 19, 2003 6:37 pm
Posts: 2659
Location: Whittier, Ca
Boot from the CD and run memtest...


PostPosted: Sun Apr 08, 2007 3:38 pm
Joined: Thu Mar 25, 2004 11:00 am
Posts: 9551
Location: Arlington, MA
You might want to try running top in batch mode (see "man top" for details) for say a 15 minute period bridging the time when the problem usually occurs.
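
A rough sketch of that approach, assuming the procps version of top (its -b, -d and -n options). The log path and the helper name high_load_samples are illustrative only, not anything shipped with the distribution.

```shell
# Capture step (start it shortly before the slowdown is expected), e.g.:
#   top -b -d 60 -n 15 > /tmp/top-batch.log
# -b = batch mode, -d = seconds between samples, -n = number of samples.

# Hypothetical helper: print the header lines of a saved batch log whose
# 1-minute load average exceeds a threshold (default 2.0).
high_load_samples() {
  log_file=$1
  threshold=${2:-2.0}
  awk -v limit="$threshold" '
    /^top - / && /load average:/ {
      line = $0
      sub(/.*load average: */, "", line)   # keep e.g. "4.10, 2.16, 0.87"
      split(line, avg, /, */)              # avg[1] is the 1-minute figure
      if (avg[1] + 0 > limit + 0) print $0
    }' "$log_file"
}
```

Running `high_load_samples /tmp/top-batch.log` afterwards shows at which timestamps the load climbed, so the process list from those samples can be read first.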


PostPosted: Sun Apr 08, 2007 4:47 pm
Joined: Sat Feb 05, 2005 3:26 pm
Posts: 121
Location: Calgary, Alberta - Canada!
Both really good suggestions, I'll try them tonight and see what happens... As always, thanks guys!

-S


PostPosted: Sun Apr 08, 2007 7:44 pm
Joined: Sun Jun 12, 2005 10:55 pm
Posts: 3161
Location: Warwick, RI
Hi,
Since the main source of the system is the same (HD with original install?), I think I would check the CMOS settings for something turned on that is not needed. A printer port without a printer connected can give false IRQ 7 hits, and power-save features, monitor sleep modes, and similar features are things you don't want enabled.

Also, what did you upgrade from? The old board may not have had everything the new system does, so the problem wasn't apparent then. SATA should be disabled (if you can) if you aren't using it.

Sadly, there are a few combinations of hardware that just don't play nicely together, so you may not want to rule that out. Cecil in his forward thinking put in tier 3 just for those situations.

I have one "modern" mobo (EPIA MII 10000) that would throw IRQ 9 errors at me until I turned off APIC in the CMOS. The errors didn't appear to bother anything, with the exception of myself, that I could tell.

Another thing that will make things sluggish is network activity waiting on a DNS lookup or a reverse lookup. An example I ran into was with proftpd: connecting to it would seem to hang, but it was really trying to check the source IP. Turning that feature off makes us both happy. Maybe check for cron jobs. Also, if you have done something that has activated mail, it will put a squeeze on response as it tries to connect to a non-valid mail server.

If it is a specific task, it may be worthwhile doing a ps ax to see what process was fired up during that time window.

Just some ideas that may be helpful :)
Mike.


PostPosted: Mon Apr 09, 2007 8:45 am
Joined: Sat Feb 05, 2005 3:26 pm
Posts: 121
Location: Calgary, Alberta - Canada!
OK, so here's what's up so far...

Cecil: I did a memtest and the memory is fine; it came back clean.

tjc: I checked top, and the only process going crazy was mythbackend, running at 99% CPU. I killed it (/etc/init.d/mythbackend stop) and the computer still didn't become responsive again.

mjl: I disabled a ton of settings in the BIOS, like SATA, as well as any power settings that would put the computer into a standby or low-power mode.

Finally, I checked messages and a couple of lines really stood out, like

Quote:
8 19:12:59 mbe kernel: ivtv0 warning: ENC: REG_DMAXFER 2 wait failed


There were about 50 identical lines of this one... and these other ones stood out as well.

Quote:
Apr 8 20:38:48 mbe kernel: BIOS-provided physical RAM map:
Apr 8 20:38:48 mbe kernel: BIOS-e820: 0 - 09f800 (usable)
Apr 8 20:38:48 mbe kernel: BIOS-e820: 09f800 - 0a0000 (reserved)
Apr 8 20:38:48 mbe kernel: BIOS-e820: 0e0000 - 0100000 (reserved)
Apr 8 20:38:48 mbe kernel: BIOS-e820: 0100000 - 03ff70000 (usable)
Apr 8 20:38:48 mbe kernel: BIOS-e820: 03ff70000 - 03ff7a000 (ACPI data)
Apr 8 20:38:48 mbe kernel: BIOS-e820: 03ff7a000 - 03ff80000 (ACPI NVS)
Apr 8 20:38:48 mbe kernel: BIOS-e820: 03ff80000 - 040 (reserved)
Apr 8 20:38:48 mbe kernel: BIOS-e820: 0fec00000 - 0fec10000 (reserved)
Apr 8 20:38:48 mbe kernel: BIOS-e820: 0fee00000 - 0fee01000 (reserved)
Apr 8 20:38:48 mbe kernel: BIOS-e820: 0ff800000 - 010 (reserved)

Quote:
Apr 8 20:39:19 mbe kernel: nvidia: module license 'NVIDIA' taints kernel.


here is a section of my top showing mythbackend using 99%

top - 20:13:06 up 2:48, 2 users, load average: 3.48, 2.41, 1.79

Tasks: 113 total, 1 running, 112 sleeping, 0 stopped, 0 zombie

Cpu(s): 41.7% us, 0.4% sy, 0.0% ni, 55.6% id, 2.1% wa, 0.1% hi, 0.1% si

Mem: 1032344k total, 1014192k used, 18152k free, 11000k buffers

Swap: 498004k total, 0k used, 498004k free, 800404k cached


Quote:
PID USER VIRT RES SHR S %CPU %MEM TIME+ TIME COMMAND
5373 mythtv 231m 31m 9388 S 99.9 3.2 480:19.09 480:19 mythbackend
22229 root 2208 1068 788 R 0.3 0.1 0:00.01 0:00 top
1 root 156 80 52 S 0.0 0.0 0:01.17 0:01 init
2 root 0 0 0 S 0.0 0.0 0:00.00 0:00 migration/0
3 root 0 0 0 S 0.0 0.0 0:00.00 0:00 ksoftirqd/0
4 root 0 0 0 S 0.0 0.0 0:00.00 0:00 watchdog/0
5 root 0 0 0 S 0.0 0.0 0:00.00 0:00 migration/1
6 root 0 0 0 S 0.0 0.0 0:00.00 0:00 ksoftirqd/1
7 root 0 0 0 S 0.0 0.0 0:00.00 0:00 watchdog/1


Does this help at all? Would reinstalling the backend clean this up? The other hardware changes from the old system were:

AMD to P4 HT
old NVIDIA card to new 5200

Thanks again guys


PostPosted: Mon Apr 09, 2007 6:55 pm
Joined: Sun Jun 12, 2005 10:55 pm
Posts: 3161
Location: Warwick, RI
Hi

Two things: how long did you let memtest run? You need to give it overnight at least.

Also when it goes into the slow down mode, what does ps ax show as the last few entries?

Mike


PostPosted: Mon Apr 09, 2007 7:00 pm
Joined: Sat Feb 05, 2005 3:26 pm
Posts: 121
Location: Calgary, Alberta - Canada!
I have since rebooted the box, and I'll wait for the nightly (and daily) slowdown, which seems to happen after a show is recorded. Then I'll post the ps ax results as well.


PostPosted: Tue Apr 10, 2007 9:57 am
Joined: Sat Feb 05, 2005 3:26 pm
Posts: 121
Location: Calgary, Alberta - Canada!
The system is acting slow right now, here are the results of the ps ax:

5774 ? Ss 0:00 /usr/bin/x11vnc -nap -wait 50 -displa
5815 ? Ss 0:00 /usr/bin/ssh-agent x-window-manager
5817 ? Ssl 0:13 mythfrontend
31437 ? Ss 0:00 sshd: root@ttyp0
31448 ttyp0 Ss+ 0:00 -bash
13341 ? S 0:00 /usr/sbin/apache
13348 ? S 0:00 /usr/sbin/apache
8369 ? Ss 0:00 sshd: root@ttyp1
8380 ttyp1 Ss 0:00 -bash
8400 ttyp1 R+ 0:00 ps ax


PostPosted: Tue Apr 10, 2007 10:18 am
Joined: Sat Feb 05, 2005 3:26 pm
Posts: 121
Location: Calgary, Alberta - Canada!
I thought I would post the results of top -b as well

Quote:
top - 01:16:38 up 9:34, 3 users, load average: 0.51, 0.28, 0.17
Tasks: 115 total, 1 running, 114 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0% us, 0.0% sy, 0.0% ni, 50.9% id, 49.0% wa, 0.1% hi, 0.1% si
Mem: 1032344k total, 998176k used, 34168k free, 8576k buffers
Swap: 498004k total, 0k used, 498004k free, 710428k cached

PID USER VIRT RES SHR S %CPU %MEM TIME+ TIME COMMAND
8509 root 2212 1136 844 R 0.5 0.1 0:00.03 0:00 top
3238 www-data 147m 10m 2588 D 0.3 1.1 0:00.77 0:00 apache
3239 www-data 150m 13m 2684 S 0.3 1.3 0:04.95 0:04 apache
1 root 160 80 52 S 0.0 0.0 0:01.12 0:01 init
2 root 0 0 0 S 0.0 0.0 0:00.00 0:00 migration/0
3 root 0 0 0 S 0.0 0.0 0:00.00 0:00 ksoftirqd/0
4 root 0 0 0 S 0.0 0.0 0:00.00 0:00 watchdog/0
5 root 0 0 0 S 0.0 0.0 0:00.00 0:00 migration/1
6 root 0 0 0 S 0.0 0.0 0:00.00 0:00 ksoftirqd/1
7 root 0 0 0 S 0.0 0.0 0:00.00 0:00 watchdog/1
8 root 0 0 0 S 0.0 0.0 0:03.38 0:03 events/0
9 root 0 0 0 S 0.0 0.0 0:01.41 0:01 events/1
10 root 0 0 0 S 0.0 0.0 0:00.01 0:00 khelper
11 root 0 0 0 S 0.0 0.0 0:00.00 0:00 kthread
14 root 0 0 0 S 0.0 0.0 0:00.30 0:00 kblockd/0
15 root 0 0 0 S 0.0 0.0 0:00.03 0:00 kblockd/1
16 root 0 0 0 S 0.0 0.0 0:00.00 0:00 kacpid
107 root 0 0 0 S 0.0 0.0 0:00.00 0:00 kseriod
213 root 0 0 0 S 0.0 0.0 0:00.35 0:00 pdflush
214 root 0 0 0 S 0.0 0.0 0:01.35 0:01 pdflush
215 root 0 0 0 S 0.0 0.0 0:04.86 0:04 kswapd0
216 root 0 0 0 S 0.0 0.0 0:00.00 0:00 kprefetchd
217 root 0 0 0 S 0.0 0.0 0:00.00 0:00 aio/0
218 root 0 0 0 S 0.0 0.0 0:00.00 0:00 aio/1
219 root 0 0 0 S 0.0 0.0 0:00.00 0:00 jfsIO
220 root 0 0 0 S 0.0 0.0 0:00.00 0:00 jfsCommit
221 root 0 0 0 S 0.0 0.0 0:00.00 0:00 jfsCommit
222 root 0 0 0 S 0.0 0.0 0:00.00 0:00 jfsSync
223 root 0 0 0 S 0.0 0.0 0:00.00 0:00 xfslogd/0
224 root 0 0 0 S 0.0 0.0 0:00.00 0:00 xfslogd/1
225 root 0 0 0 S 0.0 0.0 0:00.00 0:00 xfsdatad/0
226 root 0 0 0 S 0.0 0.0 0:00.00 0:00 xfsdatad/1
813 root 0 0 0 S 0.0 0.0 0:00.00 0:00 fcached
910 root 0 0 0 S 0.0 0.0 0:00.00 0:00 ata/0
911 root 0 0 0 S 0.0 0.0 0:00.00 0:00 ata/1
930 root 0 0 0 S 0.0 0.0 0:00.00 0:00 kpsmoused
935 root 0 0 0 S 0.0 0.0 0:00.00 0:00 kcryptd/0
936 root 0 0 0 S 0.0 0.0 0:00.00 0:00 kcryptd/1
937 root 0 0 0 S 0.0 0.0 0:00.00 0:00 kmirrord
938 root 0 0 0 S 0.0 0.0 0:00.00 0:00 kirqd
953 root 0 0 0 S 0.0 0.0 0:00.54 0:00 kjournald
1329 root 0 0 0 S 0.0 0.0 0:00.00 0:00 khubd
1682 root 0 0 0 S 0.0 0.0 0:00.00 0:00 khpsbpkt
2812 root 0 0 0 S 0.0 0.0 0:03.20 0:03 kjournald
2886 daemon 1712 368 272 S 0.0 0.0 0:00.00 0:00 portmap
2899 root 0 0 0 S 0.0 0.0 0:00.00 0:00 rpciod/0
2900 root 0 0 0 S 0.0 0.0 0:00.00 0:00 rpciod/1
2901 root 0 0 0 S 0.0 0.0 0:00.00 0:00 lockd
3194 root 1648 620 512 S 0.0 0.1 0:00.09 0:00 syslogd
3203 root 2640 1584 380 S 0.0 0.2 0:00.11 0:00 klogd
3230 root 137m 4288 3132 S 0.0 0.4 0:00.04 0:00 apache
3241 www-data 148m 10m 2588 S 0.0 1.1 0:00.86 0:00 apache
3243 www-data 151m 14m 2616 S 0.0 1.4 0:02.03 0:02 apache
3244 www-data 153m 16m 2600 S 0.0 1.6 0:05.54 0:05 apache
3246 root 1592 368 312 S 0.0 0.0 0:00.00 0:00 inetd
3278 root 0 0 0 S 0.0 0.0 0:01.20 0:01 lirc_dev
3281 root 2756 616 472 S 0.0 0.1 0:00.00 0:00 lircd
3836 root 2692 1332 1076 S 0.0 0.1 0:00.00 0:00 mysqld_safe
3872 mysql 70404 26m 3104 S 0.0 2.7 0:55.22 0:55 mysqld
3873 root 1584 508 444 S 0.0 0.0 0:00.00 0:00 logger
3904 root 1784 748 636 S 0.0 0.1 0:00.00 0:00 rpc.statd
3925 root 0 0 0 S 0.0 0.0 0:00.00 0:00 nfsd4
3926 root 0 0 0 S 0.0 0.0 0:03.38 0:03 nfsd
3927 root 0 0 0 S 0.0 0.0 0:03.26 0:03 nfsd
3928 root 0 0 0 S 0.0 0.0 0:03.46 0:03 nfsd
3929 root 0 0 0 S 0.0 0.0 0:03.44 0:03 nfsd
3930 root 0 0 0 S 0.0 0.0 0:03.29 0:03 nfsd
3931 root 0 0 0 S 0.0 0.0 0:03.56 0:03 nfsd
3932 root 0 0 0 S 0.0 0.0 0:03.19 0:03 nfsd
3933 root 0 0 0 S 0.0 0.0 0:03.38 0:03 nfsd
3937 root 1820 272 148 S 0.0 0.0 0:00.00 0:00 rpc.mountd
3946 root 5660 1388 888 S 0.0 0.1 0:00.26 0:00 nmbd
3948 root 8068 1564 952 S 0.0 0.2 0:00.02 0:00 smbd
3969 root 8068 936 324 S 0.0 0.1 0:00.00 0:00 smbd
4005 root 3620 960 676 S 0.0 0.1 0:00.00 0:00 sshd
4236 root 0 0 0 S 0.0 0.0 0:00.00 0:00 shpchpd
4293 root 8652 5284 1372 S 0.0 0.5 0:00.07 0:00 miniserv.pl
4721 root 0 0 0 S 0.0 0.0 0:00.00 0:00 ivtv_vbi/0
4722 root 0 0 0 S 0.0 0.0 0:00.00 0:00 ivtv_vbi/1
4738 root 0 0 0 S 0.0 0.0 0:00.06 0:00 msp34xx
4796 root 0 0 0 S 0.0 0.0 0:06.02 0:06 ivtv-enc
4797 root 0 0 0 S 0.0 0.0 2:30.97 2:30 ivtv-enc-vbi
4798 root 0 0 0 S 0.0 0.0 0:00.00 0:00 ivtv_vbi/0
4799 root 0 0 0 S 0.0 0.0 0:00.00 0:00 ivtv_vbi/1
4889 root 0 0 0 S 0.0 0.0 0:00.00 0:00 ivtv-enc
4890 root 0 0 0 S 0.0 0.0 0:00.00 0:00 ivtv-enc-vbi
4891 root 0 0 0 S 0.0 0.0 0:00.00 0:00 ivtv_vbi/0
4892 root 0 0 0 S 0.0 0.0 0:00.00 0:00 ivtv_vbi/1
4908 root 0 0 0 S 0.0 0.0 0:00.05 0:00 msp34xx
4966 root 0 0 0 S 0.0 0.0 0:02.53 0:02 ivtv-enc
4967 root 0 0 0 S 0.0 0.0 1:03.51 1:03 ivtv-enc-vbi
5316 root 1740 672 568 S 0.0 0.1 0:00.75 0:00 automount
5357 root 1744 676 568 S 0.0 0.1 0:00.75 0:00 automount
5405 mythtv 201m 33m 10m S 0.0 3.3 2:22.76 2:22 mythbackend
5425 daemon 1804 400 296 S 0.0 0.0 0:00.00 0:00 atd
5434 root 1876 712 568 S 0.0 0.1 0:00.00 0:00 cron
5496 root 9268 1500 1060 S 0.0 0.1 0:00.00 0:00 gdm
5505 root 9632 2348 1840 S 0.0 0.2 0:00.02 0:00 gdm
5611 root 37164 25m 3140 S 0.0 2.6 0:01.07 0:01 X
5628 root 1596 504 436 S 0.0 0.0 0:00.00 0:00 getty
5629 root 1592 504 436 S 0.0 0.0 0:00.00 0:00 getty
5630 root 1596 504 436 S 0.0 0.0 0:00.00 0:00 getty
5631 root 1596 508 436 S 0.0 0.0 0:00.00 0:00 getty
5632 root 1596 504 436 S 0.0 0.0 0:00.00 0:00 getty
5633 root 1592 504 436 S 0.0 0.0 0:00.00 0:00 getty
5771 mythtv 8732 4528 3180 S 0.0 0.4 0:00.37 0:00 x-window-manage
5774 root 6368 1580 380 S 0.0 0.2 0:00.00 0:00 x11vnc
5815 mythtv 3112 528 316 S 0.0 0.1 0:00.00 0:00 ssh-agent
5817 mythtv 166m 50m 18m S 0.0 5.0 0:13.33 0:13 mythfrontend
31437 root 14936 2172 1588 S 0.0 0.2 0:00.03 0:00 sshd
31448 root 3992 2804 1232 S 0.0 0.3 0:00.10 0:00 bash
13341 www-data 147m 10m 2560 S 0.0 1.1 0:00.63 0:00 apache
13348 www-data 148m 10m 2584 S 0.0 1.1 0:00.72 0:00 apache
8369 root 14936 2164 1580 S 0.0 0.2 0:00.06 0:00 sshd
8380 root 4004 2824 1240 S 0.0 0.3 0:00.11 0:00 bash


PostPosted: Tue Apr 10, 2007 3:59 pm
Site Admin
Joined: Fri Oct 31, 2003 11:40 pm
Posts: 357
Location: Irvine, Ca
How much space is free on the partitions? I've seen systems get "real busy" when a filesystem gets full.
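
A quick way to flag nearly full filesystems, as a sketch only. It assumes POSIX-style df output (df -P) piped in on stdin; the helper name full_filesystems and the 90% default are arbitrary choices for illustration.

```shell
# Hypothetical helper: read df-style output on stdin and print the mount
# point and usage of every filesystem at or above a percentage limit.
full_filesystems() {
  limit=${1:-90}
  awk -v limit="$limit" '
    NR > 1 {                      # skip the header row
      use = $5
      sub(/%/, "", use)           # "91%" -> "91"
      if (use + 0 >= limit + 0) print $6, $5
    }'
}
# Usage: df -P | full_filesystems 90
```

The cutoff is a judgment call; recording partitions in particular tend to sit close to full by design, so a high figure there is not necessarily the cause.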


PostPosted: Tue Apr 10, 2007 4:10 pm
Joined: Sat Feb 05, 2005 3:26 pm
Posts: 121
Location: Calgary, Alberta - Canada!
root@mbe:~# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/hda1 4.7G 2.2G 2.2G 51% /
/dev/hda3 182G 164G 18G 91% /myth
sbe:/myth 147G 51M 147G 1% /mnt/sbe1
root@mbe:~#


PostPosted: Tue Apr 10, 2007 5:58 pm
Joined: Thu Mar 25, 2004 11:00 am
Posts: 9551
Location: Arlington, MA
Here's the bit that caught my eye:
Quote:
Cpu(s): 0.0% us, 0.0% sy, 0.0% ni, 50.9% id, 49.0% wa, 0.1% hi, 0.1% si

That wait number is really high at 49%. This usually means that the system is hammering away at some device that requires lots of handholding. You haven't got a USB disk on that box, have you?

It could also be a network device and I see apache near the top of the list. Is your box behind a firewall or exposed? If it's exposed this could be some spider indexing your box... ;-)
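
To follow up on the wait figure, here is a sketch of two small filters. Both helper names are made up for illustration, and the ps invocation in the usage comment assumes the procps version of ps. The first pulls the "wa" value out of a top Cpu(s) line; the second lists processes in uninterruptible sleep (state D), which are normally the ones stuck waiting on I/O. In the pasted top output, apache PID 3238 is the one showing state D.

```shell
# Hypothetical helper: extract the iowait ("wa") percentage from top output.
iowait_percent() {
  awk -F',' '/Cpu\(s\)/ {
    for (i = 1; i <= NF; i++)
      if ($i ~ /wa/) {
        value = $i
        gsub(/[^0-9.]/, "", value)   # " 49.0% wa" -> "49.0"
        print value
      }
  }'
}

# Hypothetical helper: from "PID STAT COMMAND" rows on stdin, print the
# PID and command of every process whose state starts with D.
blocked_processes() {
  awk 'NR > 1 && $2 ~ /^D/ { print $1, $3 }'
}
# Usage:
#   top -b -n 1 | iowait_percent
#   ps -eo pid,stat,comm | blocked_processes
```

If the same process keeps showing up in state D while the wait figure is high, whatever device or mount it is reading from is a reasonable place to look next.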

