Thermal Pressure On Tap For Linux 5.7 So The Scheduler Can Be Aware Of Overheating CPUs

  • Thermal Pressure On Tap For Linux 5.7 So The Scheduler Can Be Aware Of Overheating CPUs

    Phoronix: Thermal Pressure On Tap For Linux 5.7 So The Scheduler Can Be Aware Of Overheating CPUs

    For about two years now, Linaro has been working on "thermal pressure" support for the scheduler so that it can make better task placement decisions among CPU cores when any of the cores are being restricted by running too hot. That work is now set to finally land this spring with the Linux 5.7 kernel...
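
    As a rough illustration of the concept (simplified, with illustrative names rather than the kernel's actual code): thermal pressure boils down to the capacity a CPU loses while its frequency is thermally capped, and the scheduler can subtract that loss when judging how much room a CPU really has.

    Code:
    #include <stdio.h>

    #define SCHED_CAPACITY_SCALE 1024UL   /* capacity of the biggest CPU at its max frequency */

    struct cpu_state {
        unsigned long capacity_orig;    /* capacity at the highest possible frequency */
        unsigned long max_freq_khz;     /* hardware maximum frequency */
        unsigned long capped_freq_khz;  /* current thermally capped frequency */
    };

    /* Capacity lost to throttling: orig - orig * (capped / max). */
    static unsigned long thermal_pressure(const struct cpu_state *c)
    {
        unsigned long capped = c->capacity_orig * c->capped_freq_khz / c->max_freq_khz;
        return c->capacity_orig - capped;
    }

    int main(void)
    {
        /* One CPU capped from 2.8 GHz down to 2.0 GHz, one still at full speed. */
        struct cpu_state hot  = { SCHED_CAPACITY_SCALE, 2800000, 2000000 };
        struct cpu_state cool = { SCHED_CAPACITY_SCALE, 2800000, 2800000 };

        printf("hot  CPU: pressure=%lu usable=%lu\n",
               thermal_pressure(&hot),  hot.capacity_orig  - thermal_pressure(&hot));
        printf("cool CPU: pressure=%lu usable=%lu\n",
               thermal_pressure(&cool), cool.capacity_orig - thermal_pressure(&cool));
        return 0;
    }

    With those per-CPU figures tracked over time, load balancing can favor the core that still has usable capacity instead of piling work onto one that is being throttled.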


  • #2
    I'm doubting the benefits of this. Sounds rather obvious, but cores sit very tightly packed in the core complex of an SoC.
    If you're running one core hot, chances are you're going to run every core hot.
    We're not talking about different classes of cores in big.LITTLE configurations.

    Say you can squeeze out a few more percent by cycling a single task across cores as they cool.
    Then you're adding to the thermal runaway situation that caused the throttling in the first place, as you load up the cores.
    Maximum usage without throttling still means maximum power consumption, which brings you closer to throttling either way.
    And if you're already loading the cores at full tilt, there's not much you can do.

    Anyone with some wise insights into this rather complex problem?

    • #3
      Originally posted by milkylainen View Post
      I'm doubting the benefits of this. Sounds rather obvious, but cores sit very tightly packed in the core complex of an SoC.
      If you're running one core hot, chances are you're going to run every core hot.
      We're not talking about different classes of cores in big.LITTLE configurations.

      Say you can squeeze out a few more percent by cycling a single task across cores as they cool.
      Then you're adding to the thermal runaway situation that caused the throttling in the first place, as you load up the cores.
      Maximum usage without throttling still means maximum power consumption, which brings you closer to throttling either way.
      And if you're already loading the cores at full tilt, there's not much you can do.
      Even in big.LITTLE setups, though there you have a greater chance of small differences.

      But there are corner cases too.
      If the destination core doesn't share the same L2 cache, moving the process to another core (not in the same cluster) could cost you more.
      Imagine a big.LITTLE setup, which usually has separate L2 caches per cluster: always jumping between big and LITTLE cores could have an impact on performance too.
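
      A toy model of that trade-off (the numbers and the cross-cluster penalty are made up for illustration; the real scheduler uses its own heuristics):

      Code:
      #include <stdbool.h>
      #include <stdio.h>

      struct candidate {
          unsigned long usable_capacity;  /* capacity left after thermal pressure */
          bool shares_cache;              /* same L2/cluster as the current CPU?  */
      };

      /* Made-up penalty (in capacity units) for losing warm-cache state. */
      #define CROSS_CLUSTER_PENALTY 100UL

      static bool worth_migrating(const struct candidate *cur,
                                  const struct candidate *dst)
      {
          unsigned long gain = dst->usable_capacity > cur->usable_capacity
                             ? dst->usable_capacity - cur->usable_capacity : 0;
          unsigned long cost = dst->shares_cache ? 0 : CROSS_CLUSTER_PENALTY;

          return gain > cost;
      }

      int main(void)
      {
          struct candidate throttled_big = { 700, true  };  /* hot core, this cluster     */
          struct candidate cool_other    = { 750, false };  /* cooler core, other cluster */

          printf("migrate? %s\n",
                 worth_migrating(&throttled_big, &cool_other)
                 ? "yes" : "no (cache refill costs more than the thermal headroom gained)");
          return 0;
      }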

      • #4

        Never mind, it's mainly for ARM. Yeah, I can see why this is needed.
        Last edited by creative; 07 March 2020, 11:06 PM.

        • #5
          I would imagine the larger chips like Epyc that are made of multiple dies could see a nice improvement from this.

          • #6
            Originally posted by milkylainen View Post
            I'm doubting the benefits of this. Sounds rather obvious, but cores sit very tightly packed in the core complex of an SoC.
            If you're running one core hot, chances are you're going to run every core hot.
            We're not talking about different classes of cores in big.LITTLE configurations.

            Say you can squeeze out a few more percent by cycling a single task across cores as they cool.
            Then you're adding to the thermal runaway situation that caused the throttling in the first place, as you load up the cores.
            Maximum usage without throttling still means maximum power consumption, which brings you closer to throttling either way.
            And if you're already loading the cores at full tilt, there's not much you can do.

            Anyone with some wise insights into this rather complex problem?
            Actually, in the world of MCM chips (AKA chiplets), you can have considerably different temperatures between clusters, especially given that the 64-core Epyc processor has eight of them spread across a fairly large surface...

            - Gilboa
            oVirt-HV1: Intel S2600C0, 2xE5-2658V2, 128GB, 8x2TB, 4x480GB SSD, GTX1080 (to-VM), Dell U3219Q, U2415, U2412M.
            oVirt-HV2: Intel S2400GP2, 2xE5-2448L, 120GB, 8x2TB, 4x480GB SSD, GTX730 (to-VM).
            oVirt-HV3: Gigabyte B85M-HD3, E3-1245V3, 32GB, 4x1TB, 2x480GB SSD, GTX980 (to-VM).
            Devel-2: Asus H110M-K, i5-6500, 16GB, 3x1TB + 128GB-SSD, F33.

            • #7
              Originally posted by gilboa View Post

              Actually, in the world of MCM chips (AKA chiplets), you can have considerably different temperatures between clusters, especially given that the 64-core Epyc processor has eight of them spread across a fairly large surface...

              - Gilboa
              Yes, obviously. But this article is about ARM and what is likely to be tight core packaging.
              I fail to see the major point for, e.g., a cellphone big.LITTLE config.
              Sockets, or even chiplets, which house symmetrical units are likely to gain even more from this than SoCs.

              "The Linux thermal pressure code is designed with ARM SoCs in mind for better performance".

              • #8
                Originally posted by milkylainen View Post
                Sockets, or even chiplets, which house symmetrical units are likely to gain even more from this than SoCs.

                "The Linux thermal pressure code is designed with ARM SoCs in mind for better performance".
                Well,
                multi-node systems would indeed gain, but there are also corner cases, since those systems are mostly NUMA systems.
                Which means the memory latency is high when crossing from one node to another.

                So, if you have a process running on a node, that means (in principle) its memory was allocated in that node's memory region.
                Jumping from one node to another would have a big memory cost.

                So the scheduler also needs to be aware of that.
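
                If you want to put a number on that cost: the firmware reports relative inter-node memory distances (10 means local), which libnuma exposes. A minimal example, assuming libnuma is installed (link with -lnuma):

                Code:
                #include <numa.h>
                #include <stdio.h>

                int main(void)
                {
                    if (numa_available() < 0) {
                        fprintf(stderr, "NUMA not available on this system\n");
                        return 1;
                    }

                    int nodes = numa_max_node() + 1;

                    /* Distance grows with the memory access penalty between nodes. */
                    for (int a = 0; a < nodes; a++)
                        for (int b = 0; b < nodes; b++)
                            printf("node %d -> node %d: distance %d\n",
                                   a, b, numa_distance(a, b));
                    return 0;
                }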

                • #9
                  Originally posted by torsionbar28 View Post
                  I would imagine the larger chips like Epyc that are made of multiple dies could see a nice improvement from this.
                  It's actually the chips that are NOT made of multiple dies that will see the most improvement from this.

                  • #10
                    Originally posted by milkylainen View Post
                    I'm doubting the benefits of this. Sounds rather obvious, but cores sit very tightly packed in the core complex of an SoC.
                    If you're running one core hot, chances are you're going to run every core hot.
                    Not really; the issue here is hot spots. A core generates too much heat for its tiny surface area to move away, so you get a temperature spike in that core and then it throttles. This happens before the heat even spreads to neighboring cores.
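
                    You can watch that happen by pinning a load to one core and watching the per-core sensors diverge. A quick hwmon reader (sensor paths and labels vary by platform; values are in millidegrees Celsius):

                    Code:
                    #include <glob.h>
                    #include <stdio.h>

                    int main(void)
                    {
                        glob_t g;

                        /* Per-core sensors usually show up here (e.g. coretemp on x86). */
                        if (glob("/sys/class/hwmon/hwmon*/temp*_input", 0, NULL, &g) != 0) {
                            fprintf(stderr, "no hwmon temperature sensors found\n");
                            return 1;
                        }

                        for (size_t i = 0; i < g.gl_pathc; i++) {
                            FILE *f = fopen(g.gl_pathv[i], "r");
                            long mdeg;

                            if (f && fscanf(f, "%ld", &mdeg) == 1)
                                printf("%s: %.1f C\n", g.gl_pathv[i], mdeg / 1000.0);
                            if (f)
                                fclose(f);
                        }

                        globfree(&g);
                        return 0;
                    }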
