New machine freezing.


epworth
04-28-2009, 06:25 AM
Hi.

I recently built a new machine, with the following:


BIOSTAR TpowerX58 LGA 1366 Intel X58 ATX Intel Motherboard
Intel Core i7 920 Nehalem 2.66GHz 4 x 256KB L2 Cache 8MB L3 Cache
OCZ Platinum 12GB (6 x 2GB) 240-Pin DDR3 SDRAM DDR3 1333 (PC3 10666)
SeaSonic S12 Energy Plus SS-550HT 550W ATX12V V2.3
Powercolor Radeon x1950 256mb.


The machine's unstable. The BIOS has a copy of memtest86 modified
to support the Core i7 but with all six sticks of ram installed,
it'll freeze after a few hours of testing.

At this point, having tested just about everything, I think it's
either the motherboard, cpu or a thermal problem.

My notes:


Power supply assumed to be good. Voltages in BIOS appear
to be correct and stable.

Removed CPU/heatsink. Cleaned stock thermal compound with
Arcticlean solution. Applied 'Zalman Grease' thermal compound.

Idle temperature:

CPU 30 - 35C
Northbridge 50C
Ambient 25C

Tested ram sticks 1 .. 2 in slots 1 .. 2, 7.5 hours, no problem.
Tested ram sticks 1 .. 3 in slots 1 .. 2, 7.5 hours, no problem.
Tested ram sticks 1 .. 4 in slots 1 .. 2, 7.5 hours, no problem.
Tested ram sticks 1 .. 5 in slots 1 .. 2, 7.5 hours, no problem.
Tested ram sticks 1 .. 6 in slots 1 .. 2, 7.5 hours, no problem.
Tested ram sticks 1 .. 6 in slots 1 .. 6, page fault after 4 hours.
Tested ram sticks 1 .. 2 in slots 5 .. 6, 5 hours, no problem.

Suspicious of slot 4. Page fault occurred when accessing
8gb - 10gb. If memory is arranged sequentially from slot
1 .. 6, slot 4 contains the 8gb - 10gb range when using
all six sticks.
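
The sequential-layout reasoning above can be checked with a few lines of Python. This is only a sketch under stated assumptions: 2 GB per slot (from the parts list), slots numbered from 1, and no channel interleaving, which a triple-channel board may well use, so the real mapping could differ. The function names are made up for illustration. Under these assumptions the 8 GB - 10 GB range lands in slot 5; it would only be "slot 4" if the slots were counted from zero.

```python
GIB_PER_STICK = 2  # from the parts list: 6 x 2GB


def sequential_slot_ranges(num_slots, gib_per_stick=GIB_PER_STICK):
    """Map 1-based slot number -> (start_gb, end_gb).

    Assumes memory is laid out strictly in slot order, with no
    interleaving between channels (an assumption, not a known fact
    about this board).
    """
    return {
        slot: ((slot - 1) * gib_per_stick, slot * gib_per_stick)
        for slot in range(1, num_slots + 1)
    }


def slots_overlapping(start_gb, end_gb, ranges):
    """Return the slots whose range overlaps [start_gb, end_gb)."""
    return [
        slot
        for slot, (lo, hi) in ranges.items()
        if lo < end_gb and start_gb < hi
    ]


ranges = sequential_slot_ranges(6)
for slot, (lo, hi) in ranges.items():
    print(f"slot {slot}: {lo} GB - {hi} GB")

print("8 GB - 10 GB falls in slot(s):", slots_overlapping(8, 10, ranges))
```

Either way, the later run with four sticks in slots 3 .. 6 exercises both candidates.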

Tested ram sticks 1 .. 4 in slots 3 .. 6, 12 hours, no problem. Shows what I know...


My main questions are:

What sort of temperatures should I be seeing from the CPU,
northbridge, etc?

Am I going to need additional cooling with all six sticks
of ram in there? Currently using Zalman Grease compound on
the CPU and three case fans (2 x 80mm and 1 x 120mm). Case
seems to have decent airflow.

Anybody got any other ideas? I'm at a loss as to how to
get this machine to stay alive, otherwise.

deanjo
04-28-2009, 08:46 AM
My guess is that one of your ram sticks, once it gets up to temp, starts exhibiting the issues you are having now.

or

You're using an older BIOS that does not have the microcode updates to work around errata in the i7. Early i7 BIOSes were plagued with issues.

epworth
04-28-2009, 12:41 PM
Hi.

I've been testing the board today, running all six sticks.

Booted off a ubuntu 9.10 live cd and am watching the output
of 'sensors'.

Running prime95 with eight threads going pushed the cores
to an average temp of 80C. The machine crashed after about
an hour of that.

I've got the case open now with some enormous 40w floor fan
pointed at the exposed motherboard and the temperature output
running prime95 has been:


coretemp-isa-0000
Adapter: ISA adapter
Core 0: +79.0°C (high = +80.0°C, crit = +100.0°C)

coretemp-isa-0001
Adapter: ISA adapter
Core 1: +78.0°C (high = +80.0°C, crit = +100.0°C)

coretemp-isa-0002
Adapter: ISA adapter
Core 2: +77.0°C (high = +80.0°C, crit = +100.0°C)

coretemp-isa-0003
Adapter: ISA adapter
Core 3: +75.0°C (high = +80.0°C, crit = +100.0°C)

coretemp-isa-0004
Adapter: ISA adapter
Core 4: +78.0°C (high = +80.0°C, crit = +100.0°C)

coretemp-isa-0005
Adapter: ISA adapter
Core 5: +78.0°C (high = +80.0°C, crit = +100.0°C)

coretemp-isa-0006
Adapter: ISA adapter
Core 6: +75.0°C (high = +80.0°C, crit = +100.0°C)

coretemp-isa-0007
Adapter: ISA adapter
Core 7: +75.0°C (high = +80.0°C, crit = +100.0°C)


... for a few hours now.
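
For anyone wanting to log this instead of watching it scroll by, here is a minimal Python sketch that parses 'sensors' output in the format pasted above and reports how close each core is to its "high" threshold. The regular expression is written against that pasted format only; other lm-sensors versions or drivers may print the lines differently, and the function names are hypothetical.

```python
import re

# Two readings copied from the output above, used as sample input.
SAMPLE = """\
coretemp-isa-0000
Adapter: ISA adapter
Core 0: +79.0°C (high = +80.0°C, crit = +100.0°C)

coretemp-isa-0003
Adapter: ISA adapter
Core 3: +75.0°C (high = +80.0°C, crit = +100.0°C)
"""

# Matches lines such as: Core 0: +79.0°C (high = +80.0°C, crit = +100.0°C)
LINE_RE = re.compile(
    r"^Core (\d+):\s*\+?([\d.]+)°C\s*"
    r"\(high = \+?([\d.]+)°C, crit = \+?([\d.]+)°C\)"
)


def parse_core_temps(text):
    """Return one dict per 'Core N:' line found in the text."""
    readings = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            temp, high, crit = (float(match.group(i)) for i in (2, 3, 4))
            readings.append(
                {"core": int(match.group(1)), "temp": temp,
                 "high": high, "crit": crit}
            )
    return readings


def headroom(readings):
    """Smallest margin in degrees C between any core and its 'high' limit."""
    return min(r["high"] - r["temp"] for r in readings)


readings = parse_core_temps(SAMPLE)
for r in readings:
    print(f"core {r['core']}: {r['temp']}C, {r['high'] - r['temp']}C below high")
print("tightest headroom:", headroom(readings), "C")
```

To run it against the live machine, the sample text could be swapped for the output of subprocess.run(["sensors"], capture_output=True, text=True).stdout.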

I have actually updated the BIOS to the latest version
available from Biostar.

Tests continue...

Hephasteus
05-01-2009, 01:35 PM
My guess is that the northbridge, which is on the CPU on that platform, is getting a bit too toasty.

Try turning off hyperthreading in the BIOS. I bet it passes the test, because that will give some dead space to soak up the memory controller heat.

hax0r
05-01-2009, 02:19 PM
Try tweaking your options, memory latencies, and voltages within the BIOS. You should get a nice overclock if that Nehalem is a D0 revision. Install XP and go about debugging the system with apps like coretemp, superpi, windows memtest, and prime95. Put a 120mm fan on your sticks, and an additional fan around the CPU socket where the PWM circuitry resides. MX-2 is a great thermal paste. That PSU sounds cheapo; I would get a Corsair, OCZ, SilverStone, or PC Power & Cooling.

epworth
06-04-2009, 04:54 AM
Hello. I'd forgotten I'd posted here.

It turned out to be the BIOS setting the wrong
voltage for the RAM. The ram was specified as 1.65v
but was only receiving 1.5v. At high memory load,
with all six sticks inserted, things would die.

With that out of the way, I've been running the
machine at heavy load for weeks without issue.

Kano
06-04-2009, 05:17 AM
That's normal when you buy OC RAM: you have to adjust the voltage manually.

Celestemmcknight
06-07-2009, 07:30 AM
I think:

FF means the mobo has initialised the CPU, so you're OK there. Dx is about your RAM, so I would check everything around that: seating, heating, power settings, timings, ...

Did you change the OC jumper on the board?