Tux

...making Linux just a little more fun!

using smp kernel, get 100% cpu usage one one cpu without any real load on the system

jim ruxton [cinetron at passport.ca]


Mon, 08 Oct 2007 14:21:21 -0400

Hi I'm on a P4 running an smp kernel ie.

uname -a : Linux jims-laptop 2.6.20-16-386 #2 Sun Sep 23 19:47:10 UTC 2007 i686 GNU/Linux

For some reason occasionaly one of my cpu's runs away and starts showing 100% cpu usage, however top only shows the program using the most cpu at 7% or less. Where is that 93% going? This is driving me crazy. I have to reboot the machine to get it back to normal. I've tried Fedora and Kubuntu. Though Kubuntu appeared more stable it still happened. Played around with changing max_cstate without success.I've just downgraded to a 386 kernel to see if that helps. This isn't a real solution however. Do you have any suggestions? I've asked around on forums and irc channels without luck. Thanks. Jim

output of cat /proc/cpuinfo :

processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 2
model name : Intel(R) Pentium(R) 4 CPU 3.06GHz
stepping : 9
cpu MHz : 3059.352
cache size : 512 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe cid xtpr
bogomips : 6123.68
clflush size : 64


Top    Back


Mulyadi Santosa [mulyadi.santosa at gmail.com]


Tue, 09 Oct 2007 02:21:23 +0700

Hi...

> Hi I'm on a P4 running an smp kernel ie.
>
> uname -a :
>  
> Linux jims-laptop 2.6.20-16-386 #2 Sun Sep 23 19:47:10 UTC 2007 i686
> GNU/Linux
>
> For some reason occasionaly one of my cpu's runs away and starts showing
> 100% cpu usage, however top only shows the program using the most cpu at
> 7% or less.

Looking at /proc/cpuinfo, sounds like you have P4 with hyperthread enabled, true? If so, this 7% max of CPU usage, is it only on one core or cumulative load over all cores?

>  Where is that 93% going? This is driving me crazy. I have to
> reboot the machine to get it back to normal. I've tried Fedora and
> Kubuntu. Though Kubuntu appeared more stable it still happened. Played
> around with changing max_cstate without success.I've just downgraded to
> a 386 kernel to see if that helps. This isn't a real solution however.
> Do you have any suggestions? I've asked around on forums and irc
> channels without luck. Thanks.
>   
My simple suggestion right now: please download latest stable kernel from kernel.org and compile by yourself. There is a chance you hit a (unknown?) kernel scheduler bug. Or it could be a power management problem (is it a mobile processor?).

Since I am not really knowledgeable in this field, try to fiddle with options during kernel configuration such as ACPI, APM, APIC support and so on.

let's see if that will cure your problem.

regards,

Mulyadi


Top    Back


Ben Okopnik [ben at linuxgazette.net]


Mon, 8 Oct 2007 15:30:00 -0400

On Mon, Oct 08, 2007 at 02:21:21PM -0400, jim ruxton wrote:

> Hi I'm on a P4 running an smp kernel ie.
> 
> uname -a :
>  
> Linux jims-laptop 2.6.20-16-386 #2 Sun Sep 23 19:47:10 UTC 2007 i686
> GNU/Linux
> 
> For some reason occasionaly one of my cpu's runs away and starts showing
> 100% cpu usage, however top only shows the program using the most cpu at
> 7% or less. Where is that 93% going? This is driving me crazy. I have to
> reboot the machine to get it back to normal. I've tried Fedora and
> Kubuntu. Though Kubuntu appeared more stable it still happened. Played
> around with changing max_cstate without success.I've just downgraded to
> a 386 kernel to see if that helps. This isn't a real solution however.
> Do you have any suggestions? I've asked around on forums and irc
> channels without luck. Thanks.

One of the things that might be worth trying is reducing everything to a minimum and seeing if it still happens. That is, try booting into single mode and shutting down any non-critical processes, then just letting the system "cook" for a little while. If it happens again, you've most likely narrowed it down to either a) the kernel or b) the hardware. For troubleshooting the former, check out the kernel compilation options (I seem to recall a developer/debugging option accessible via 'make config'.) For the latter, try a hairdryer aimed at the offending CPU. :)

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *


Top    Back


Thomas Adam [thomas.adam22 at gmail.com]


Mon, 8 Oct 2007 20:43:38 +0100

On 08/10/2007, Ben Okopnik <ben at linuxgazette.net> wrote:

> One of the things that might be worth trying is reducing everything to a
> minimum and seeing if it still happens. That is, try booting into single
> mode and shutting down any non-critical processes, then just letting the
> system "cook" for a little while. If it happens again, you've most
> likely narrowed it down to either a) the kernel or b) the hardware. For
> troubleshooting the former, check out the kernel compilation options (I
> seem to recall a developer/debugging option accessible via 'make
> config'.) For the latter, try a hairdryer aimed at the offending CPU. :)

Yes, and quite often the other main offender is choosing a wrong arch type for your system. That'll cause quite often either a high CPU load or massive amounts of disk access. In addition, specifying:

noapic

As a kernel parameter might also help.

-- Thomas Adam


Top    Back


René Pfeiffer [lynx at luchs.at]


Mon, 8 Oct 2007 21:53:39 +0200

Hello, Jim!

On Oct 08, 2007 at 1421 -0400, jim ruxton appeared and said:

> Hi I'm on a P4 running an smp kernel ie.
> 
> uname -a :
>  
> Linux jims-laptop 2.6.20-16-386 #2 Sun Sep 23 19:47:10 UTC 2007 i686
> GNU/Linux

Well, if you run a SMP kernel the output is usually something like this:

Linux nightfall 2.6.22.9 #1 SMP PREEMPT Tue Oct 2 22:02:38 CEST 2007 x86_64 GNU/Linux

uname shows a "SMP" next to the version. Double check if your kernel has activated SMP code. Your output of /proc/cpuinfo only shows one processor. You should have two (indicated by a different ID next to the field "processor").

It should be noted that although some P4 CPUs have hyperthreading not all mainboards or BIOS variants support it. I've seen P4s not capable of hyperthreading simply because of the hardware or the BIOS on the board.

> For some reason occasionaly one of my cpu's runs away and starts showing
> 100% cpu usage, however top only shows the program using the most cpu at
> 7% or less. Where is that 93% going? This is driving me crazy.

Does the system become unresponsive or sluggish?

Can you check if "top" shows any distribution of the load? My top and procinfo tools also show what the load is like, such as IRQ activity, waiting for I/O or other things. This is an output of procinfo on my SMP system (Intel Core 2 Duo):

lynx at nightfall:~$ procinfo 
Linux 2.6.22.9 (lynx at nightfall) (gcc [can't parse]) #???  2CPU [nightfall.luchs.at]
 
Memory:      Total        Used        Free      Shared     Buffers      
Mem:       1028328      967664       60664           0       16480
Swap:       995988         972      995016
 
Bootup: Tue Oct  2 22:06:31 2007    Load average: 0.03 0.07 0.08 1/133 27736
 
user  :       6:15:19.98   2.2%  page in :  4400622  disk 1:      451r     214w
nice  :       0:01:00.35   0.0%  page out:  3204961  disk 2:     6518r    4384w
system:       2:15:43.08   0.8%  page act:   858311  disk 3:   139898r  364672w
IOwait:       0:06:14.76   0.0%  page dea:   479245
hw irq:       0:25:24.70   0.1%  page flt: 34457299
sw irq:       0:18:43.76   0.1%  swap in :       25
idle  :  11d 10:38:31.98  95.6%  swap out:      233
uptime:   5d 23:37:23.38         context :109507629
[...]

There's user, nice, system, IOwait, the IRQ modes and idle time. And yes, now I see that procinfo also counts the CPUs by printing "2CPU". :) Using "x86info -v" is yet another way of checking SMP mode and CPU capabilities.

Best regards, Ren?.


Top    Back


Marty Leisner [leisner at rochester.rr.com]


Tue, 09 Oct 2007 00:26:59 -0400

jim ruxton <cinetron at passport.ca> writes on Mon, 08 Oct 2007 14:21:21 EDT

> Hi I'm on a P4 running an smp kernel ie.
> 
> uname -a :
>  
> Linux jims-laptop 2.6.20-16-386 #2 Sun Sep 23 19:47:10 UTC 2007 i686
> GNU/Linux
> 
> For some reason occasionaly one of my cpu's runs away and starts showing
> 100% cpu usage, however top only shows the program using the most cpu at
> 7% or less. Where is that 93% going? This is driving me crazy. I have to
> reboot the machine to get it back to normal. I've tried Fedora and
> Kubuntu. Though Kubuntu appeared more stable it still happened. Played
> around with changing max_cstate without success.I've just downgraded to
> a 386 kernel to see if that helps. This isn't a real solution however.
> Do you have any suggestions? I've asked around on forums and irc
> channels without luck. Thanks.
> Jim
> 
> 
> output of cat /proc/cpuinfo :
> 

top also reports kernel, idle, user and a few others...

Are you 100% user? Is the system sluggish?

What type of load average are you seeing?

Also, if you're SMP there's an option in top to display each CPU (in recent procps tops, type 1).

[tops a lovely program -- there's been so many flavors since Bill Lefebvre version -- all slightly different...

Have you also run other load monitors as a sanity check?

marty


Top    Back


jim ruxton [cinetron at passport.ca]


Tue, 09 Oct 2007 00:43:08 -0400

Thanks Mulaydi.

> > Hi I'm on a P4 running an smp kernel ie.
> >
> > uname -a :
> >  
> > Linux jims-laptop 2.6.20-16-386 #2 Sun Sep 23 19:47:10 UTC 2007 i686
> > GNU/Linux
> >
> > For some reason occasionaly one of my cpu's runs away and starts showing
> > 100% cpu usage, however top only shows the program using the most cpu at
> > 7% or less.
> 
> Looking at /proc/cpuinfo, sounds like you have P4 with hyperthread 
> enabled, true? If so, this 7% max of CPU usage, is it only on one core 
> or cumulative load over all cores?
No it is only on one core.

> >  Where is that 93% going? This is driving me crazy. I have to
> > reboot the machine to get it back to normal. I've tried Fedora and
> > Kubuntu. Though Kubuntu appeared more stable it still happened. Played
> > around with changing max_cstate without success.I've just downgraded to
> > a 386 kernel to see if that helps. This isn't a real solution however.
> > Do you have any suggestions? I've asked around on forums and irc
> > channels without luck. Thanks.
> >   
> My simple suggestion right now: please download latest stable kernel 
> from kernel.org and compile by yourself. There is a chance you hit a 
> (unknown?) kernel scheduler bug. Or it could be a power management 
> problem (is it a mobile processor?).
It is a P4 on a laptop. I'm not sure how to tell whether or not it is a mobile version?

> 
> Since I am not really knowledgeable in this field, try to fiddle with 
> options during kernel configuration such as ACPI, APM, APIC support and 
> so on.
I've tried with noapic as a kernal flag with no luck.

Jim


Top    Back


jim ruxton [cinetron at passport.ca]


Tue, 09 Oct 2007 00:48:33 -0400

Thanks Ben,

> > Hi I'm on a P4 running an smp kernel ie.
> > 
> > uname -a :
> >  
> > Linux jims-laptop 2.6.20-16-386 #2 Sun Sep 23 19:47:10 UTC 2007 i686
> > GNU/Linux
> > 
> > For some reason occasionaly one of my cpu's runs away and starts showing
> > 100% cpu usage, however top only shows the program using the most cpu at
> > 7% or less. Where is that 93% going? This is driving me crazy. I have to
> > reboot the machine to get it back to normal. I've tried Fedora and
> > Kubuntu. Though Kubuntu appeared more stable it still happened. Played
> > around with changing max_cstate without success.I've just downgraded to
> > a 386 kernel to see if that helps. This isn't a real solution however.
> > Do you have any suggestions? I've asked around on forums and irc
> > channels without luck. Thanks.
> 
> One of the things that might be worth trying is reducing everything to a
> minimum and seeing if it still happens. That is, try booting into single
> mode and shutting down any non-critical processes, then just letting the
> system "cook" for a little while. If it happens again, you've most
> likely narrowed it down to either a) the kernel or b) the hardware. For
> troubleshooting the former, check out the kernel compilation options (I
> seem to recall a developer/debugging option accessible via 'make
> config'.) For the latter, try a hairdryer aimed at the offending CPU. :)
I tried switching into Single User Mode once the 1 processor jumped to 100% but it didn't seem to change anything. I've tried unloading some modules but I keep getting Resource currently unavailable after running rmmod. Do you know what this would mean in this context? This glitch is hard to track down because sometimes it doesn't happen for a day or so after I've rebooted.

jim


Top    Back


jim ruxton [cinetron at passport.ca]


Tue, 09 Oct 2007 01:10:33 -0400

Thanks Ren? ,

> > 
> > uname -a :
> >  
> > Linux jims-laptop 2.6.20-16-386 #2 Sun Sep 23 19:47:10 UTC 2007 i686
> > GNU/Linux
> 
> Well, if you run a SMP kernel the output is usually something like this:
> 
> Linux nightfall 2.6.22.9 #1 SMP PREEMPT Tue Oct 2 22:02:38 CEST 2007
> x86_64 GNU/Linux
> 
> uname shows a "SMP" next to the version. Double check if your kernel has
> activated SMP code. Your output of /proc/cpuinfo only shows one
> processor. You should have two (indicated by a different ID next to the
> field "processor").
Hmm that is curious isn't it, beause htop and top show 2 processors.

> 
> It should be noted that although some P4 CPUs have hyperthreading not
> all mainboards or BIOS variants support it. I've seen P4s not capable of
> hyperthreading simply because of the hardware or the BIOS on the board.
But the fact I see two processors in htop and top that are constantly changing in terms of load I would think that would mean HT is enabled?

> 
> > For some reason occasionaly one of my cpu's runs away and starts showing
> > 100% cpu usage, however top only shows the program using the most cpu at
> > 7% or less. Where is that 93% going? This is driving me crazy.
> 
> Does the system become unresponsive or sluggish?
Very sluggish to input ie. enetering text or backspacing is really slow. Sometimes it can't retrieve a webpage.

> 
> Can you check if "top" shows any distribution of the load? My top and
> procinfo tools also show what the load is like, such as IRQ activity,
> waiting for I/O or other things. This is an output of procinfo on my
> SMP system (Intel Core 2 Duo):
Below is my procinfo: (at this time things are not going bonkers. I'll run this again next time processor 1 runs away)

Linux 2.6.20-16-generic (root at terranova) (gcc 4.1.2 ) #2 2CPU
[jims-laptop]
 
Memory:      Total        Used        Free      Shared     Buffers
Mem:        515428      481384       34044           0       25356
Swap:       489940           0      489940
 
Bootup: Mon Oct  8 17:04:34 2007    Load average: 0.27 0.09 0.02 1/141
7093
 
user  :       0:02:23.44   0.2%  page in :   297863  disk 1:    14884r
12157w
nice  :       0:00:00.50   0.0%  page out:   178564
system:       0:04:57.27   0.5%  page act:    71635
IOwait:       0:01:38.46   0.1%  page dea:    12933
hw irq:       0:00:07.60   0.0%  page flt:  2111672
sw irq:       0:01:06.03   0.1%  swap in :        0
idle  :      15:36:19.55  98.9%  swap out:        0
uptime:       7:53:13.86         context : 10939172
 
irq  0:   7098612 timer                 irq 16:      3368 yenta, Intel
ICH5
irq  1:      3092 i8042                 irq 17:    348537 uhci_hcd:usb1,
nvidi
irq  7:         0 parport0              irq 18:         3 uhci_hcd:usb2,
ohci1
irq  8:         3 rtc                   irq 19:         0 uhci_hcd:usb3
irq  9:   7793376 acpi                  irq 20:     10624 eth0
irq 12:    478552 i8042                 irq 21:         0 ehci_hcd:usb4
irq 14:     27046 libata                irq 22:   1179944 eth1
irq 15:    132569 libata

Below is the result of x86info -v: This statement at the bottom is kind of cryptic:

WARNING: Detected SMP, but unable to access cpuid driver.
Used Uniprocessor CPU routines. Results inaccurate.
Any idea what it could mean?? Thanks again for the help.

x86info v1.18.  Dave Jones 2001-2006
Feedback to <davej at redhat.com>.
 
Found 2 CPUs
CPU #1 /dev/cpu/0/cpuid: No such file or directory Found unknown cache descriptors: 40 50 5b 66 70 7b Family: 15 Model: 2 Stepping: 9 Type: 0 Brand: 9 CPU Model: Pentium 4 (Northwood) [D1] Original OEM Processor name string: Intel(R) Pentium(R) 4 CPU 3.06GHz Feature flags: Onboard FPU Virtual Mode Extensions Debugging Extensions Page Size Extensions Time Stamp Counter Model-Specific Registers Physical Address Extensions Machine Check Architecture CMPXCHG8 instruction Onboard APIC SYSENTER/SYSEXIT Memory Type Range Registers Page Global Enable Machine Check Architecture CMOV instruction Page Attribute Table 36-bit PSEs CLFLUSH instruction Debug Trace Store ACPI via MSR MMX support FXSAVE and FXRESTORE instructions SSE support SSE2 support CPU self snoop Hyper-Threading Thermal Monitor Pending Break Enable cntx-id xTPR Extended feature flags: Instruction trace cache: Size: 12K uOps 8-way associative. L1 Data cache: Size: 8KB Sectored, 4-way associative. line size=64 bytes. L2 unified cache: Size: 512KB Sectored, 8-way associative. line size=64 bytes. Instruction TLB: 4K, 2MB or 4MB pages, fully associative, 64 entries. Found unknown cache descriptors: 40 50 5b 66 70 7b Data TLB: 4KB or 4MB pages, fully associative, 64 entries. The physical package supports 2 logical processors -------------------------------------------------------------------------- CPU #2 Found unknown cache descriptors: 40 50 5b 66 70 7b Family: 15 Model: 2 Stepping: 9 Type: 0 Brand: 9 CPU Model: Pentium 4 (Northwood) [D1] Original OEM Processor name string: Intel(R) Pentium(R) 4 CPU 3.06GHz Feature flags: Onboard FPU Virtual Mode Extensions Debugging Extensions Page Size Extensions Time Stamp Counter Model-Specific Registers Physical Address Extensions Machine Check Architecture CMPXCHG8 instruction Onboard APIC SYSENTER/SYSEXIT Memory Type Range Registers Page Global Enable Machine Check Architecture CMOV instruction Page Attribute Table 36-bit PSEs CLFLUSH instruction Debug Trace Store ACPI via MSR MMX support FXSAVE and FXRESTORE instructions SSE support SSE2 support CPU self snoop Hyper-Threading Thermal Monitor Pending Break Enable cntx-id xTPR Extended feature flags: Instruction trace cache: Size: 12K uOps 8-way associative. L1 Data cache: Size: 8KB Sectored, 4-way associative. line size=64 bytes. L2 unified cache: Size: 512KB Sectored, 8-way associative. line size=64 bytes. Instruction TLB: 4K, 2MB or 4MB pages, fully associative, 64 entries. Found unknown cache descriptors: 40 50 5b 66 70 7b Data TLB: 4KB or 4MB pages, fully associative, 64 entries. The physical package supports 2 logical processors -------------------------------------------------------------------------- WARNING: Detected SMP, but unable to access cpuid driver. Used Uniprocessor CPU routines. Results inaccurate. ''


Top    Back


jim ruxton [cinetron at passport.ca]


Tue, 09 Oct 2007 01:15:40 -0400

Thanks Marty,

> > Hi I'm on a P4 running an smp kernel ie.
> > 
> > uname -a :
> >  
> > Linux jims-laptop 2.6.20-16-386 #2 Sun Sep 23 19:47:10 UTC 2007 i686
> > GNU/Linux
> > 
> > For some reason occasionaly one of my cpu's runs away and starts showing
> > 100% cpu usage, however top only shows the program using the most cpu at
> > 7% or less. Where is that 93% going? This is driving me crazy. I have to
> > reboot the machine to get it back to normal. I've tried Fedora and
> > Kubuntu. Though Kubuntu appeared more stable it still happened. Played
> > around with changing max_cstate without success.I've just downgraded to
> > a 386 kernel to see if that helps. This isn't a real solution however.
> > Do you have any suggestions? I've asked around on forums and irc
> > channels without luck. Thanks.
> > Jim
> > 
> > 
> > output of cat /proc/cpuinfo :
> > 
> 
> top also reports kernel, idle, user  and a few others...
> 
> Are you 100% user?  Is the system sluggish?
Yes I am the only user and the system is very sluggish.

> 
> What type of load average are you seeing?
I see 100% on one cpu and 7% or so on the other. In terms of overall load, I'll check again next time the sytem goes crazy.

> 
> Also, if you're SMP there's an option in top to display each CPU (in recent procps tops, type 1).
Yes I see that now , thanks. I had been using htops which shows both cpus automatically. I didn't know about entering 1 in top to get both cpus. Thanks. Jim.

> 
> Have you also run other load monitors as a sanity check?
Just htops but I can tell as soon as it happens because everything crawls along including closing down the system.

Jim


Top    Back


Ramon van Alteren [ramon at forgottenland.net]


Tue, 09 Oct 2007 15:09:54 +0200

Ben Okopnik wrote:

> On Mon, Oct 08, 2007 at 02:21:21PM -0400, jim ruxton wrote:
>   
>> Hi I'm on a P4 running an smp kernel ie.
>>
>> uname -a :
>>  
>> Linux jims-laptop 2.6.20-16-386 #2 Sun Sep 23 19:47:10 UTC 2007 i686
>> GNU/Linux
>>
>> For some reason occasionaly one of my cpu's runs away and starts showing
>> 100% cpu usage, however top only shows the program using the most cpu at
>> 7% or less. Where is that 93% going? This is driving me crazy. I have to
>> reboot the machine to get it back to normal. I've tried Fedora and
>> Kubuntu. Though Kubuntu appeared more stable it still happened. Played
>> around with changing max_cstate without success.I've just downgraded to
>> a 386 kernel to see if that helps. This isn't a real solution however.
>> Do you have any suggestions? I've asked around on forums and irc
>> channels without luck. Thanks.
>>     
>
> One of the things that might be worth trying is reducing everything to a
> minimum and seeing if it still happens. That is, try booting into single
> mode and shutting down any non-critical processes, then just letting the
> system "cook" for a little while. If it happens again, you've most
> likely narrowed it down to either a) the kernel or b) the hardware. For
> troubleshooting the former, check out the kernel compilation options (I
> seem to recall a developer/debugging option accessible via 'make
> config'.) For the latter, try a hairdryer aimed at the offending CPU. :)
>
>   
I've seen this behaviour before in combination with run-away acpi processes (acpid and friends) My own laptop (a dell D810 with a dual-core with Hyperthreading) shows the same behaviour if I try to hot-(un)dock it from it's docking station

Ramon


Top    Back


Ben Okopnik [ben at linuxgazette.net]


Tue, 9 Oct 2007 12:24:08 -0400

On Tue, Oct 09, 2007 at 12:48:33AM -0400, jim ruxton wrote:

> Ben Okopnik wrote:
> > 
> > One of the things that might be worth trying is reducing everything to a
> > minimum and seeing if it still happens. That is, try booting into single
> > mode and shutting down any non-critical processes, then just letting the
> > system "cook" for a little while. If it happens again, you've most
> > likely narrowed it down to either a) the kernel or b) the hardware. For
> > troubleshooting the former, check out the kernel compilation options (I
> > seem to recall a developer/debugging option accessible via 'make
> > config'.) For the latter, try a hairdryer aimed at the offending CPU. :)
>
> I tried switching into Single User Mode once the 1 processor jumped to
> 100% but it didn't seem to change anything. 

You need to go single-user before it happens. That way, you'll know if it's some program that you're running when you're not in single-user, which is the valuable piece of info here.

> I've tried unloading some
> modules but I keep getting Resource currently unavailable after running
> rmmod. 

Sounds like your process table got filled up - that's when I've seen that kind of message.

> Do you know what this would mean in this context? This glitch is
> hard to track down because sometimes it doesn't happen for a day or so
> after I've rebooted.

Unfortunately, this kind of problem usually breaks down to one of two possibilities:

1) You're just Joe Average-User, messing around at home and trying to learn something. Because of this, you have the leisure to experiment as I suggested, so you spend several days twiddling and fiddling, and you discover a kernel bug that earns you world-wide recognition, undying fame, and a bouquet of roses from Linus Torvalds (*NOW* you'll be able to get all the girls!)

2) You work in the corporate environment, and your company hasn't quite yet spent the money on the spares that you've been begging them to get for years. You curse them, curse the damn computers, and start madly swapping hardware and kernels until it sorta kinda works (i.e., no crashes in a 24-hour period), and forget all about it until the next problem comes along.

Or there's always whatever intermediate version of these that you come up with for yourself. Take your pick. :)

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *


Top    Back


Mulyadi Santosa [mulyadi.santosa at gmail.com]


Sat, 13 Oct 2007 01:53:54 +0700

Hi Jim...

I am sorry for this late reply...

> No it is only on one core.
>   
OK, I think you hit a classic Linux kernel "bug". Not a bug in its true meaning, but the inability of Linux kernel to recognize the specific nature of Hyperthread processor. AFAIK, HT is not like multi core CPU. You have one true core and a "sibling". This sibling is actually like...uhm can't find the right word...ok..let's just say a "dummy" processor. It's like additional pipelines but isn't really a complete processor. But in Linux kernel's point of view, the real core and the sibling are identical.

This brings one implication. During load balancing between real core and the sibling, sometimes Linux kernel brings heavy jobs to the sibling instead of toward the real core (the powerful one). I don't know if the recent kernel release has already fix this "bug", so you are free to test.

Or, if you're really desperate, try to stick with non SMP kernel. That way, your tasks will always get assigned to the real core, abandoning the weak sibling.

I hope this clears up your confusion.

regards,

Mulyadi


Top    Back