MINOR: cpu-topo: consider capacity when forming clusters

By using the cluster+capacity sorting function we can detect
heterogeneous clusters which are not properly reported. Thanks to this,
the following misnumbered machine featuring 4 big cores, 4 medium ones
and 4 small ones is properly detected with its clusters correctly
assigned:

      [keep] thr=  0 -> cpu=  0 pk=00 no=00 cl=000 ts=000 capa=1024
      [keep] thr=  1 -> cpu=  1 pk=00 no=00 cl=002 ts=008 capa=278
      [keep] thr=  2 -> cpu=  2 pk=00 no=00 cl=002 ts=009 capa=278
      [keep] thr=  3 -> cpu=  3 pk=00 no=00 cl=002 ts=010 capa=278
      [keep] thr=  4 -> cpu=  4 pk=00 no=00 cl=002 ts=011 capa=278
      [keep] thr=  5 -> cpu=  5 pk=00 no=00 cl=001 ts=004 capa=905
      [keep] thr=  6 -> cpu=  6 pk=00 no=00 cl=001 ts=005 capa=905
      [keep] thr=  7 -> cpu=  7 pk=00 no=00 cl=001 ts=006 capa=866
      [keep] thr=  8 -> cpu=  8 pk=00 no=00 cl=001 ts=007 capa=866
      [keep] thr=  9 -> cpu=  9 pk=00 no=00 cl=000 ts=001 capa=984
      [keep] thr= 10 -> cpu= 10 pk=00 no=00 cl=000 ts=002 capa=984
      [keep] thr= 11 -> cpu= 11 pk=00 no=00 cl=000 ts=003 capa=1024

This also has the benefit of always assigning the smallest IDs to the
highest performance clusters, so that simple configs can simply bind to
cluster 0 or clusters 0,1 and benefit from optimal performance.
Willy Tarreau 2025-02-27 19:50:09 +01:00
parent 4a6eaf6c5e
commit 204ac3c0b6

@@ -684,10 +684,12 @@ void cpu_compose_clusters(void)
 	int curr_gid, prev_gid;
 	int curr_lid, prev_lid;
 
-	/* Now we'll sort CPUs by topology and assign cluster IDs to those that
-	 * don't yet have one, based on the die/pkg/llc.
+	/* Now we'll sort CPUs by topology/cluster/capacity and assign cluster
+	 * IDs to those that don't have one, based on the die/pkg/lcc, and
+	 * double-check that capacity within a cluster doesn't vary by +/- 5%,
+	 * otherwise it indicates different clusters (typically big.little).
 	 */
-	cpu_reorder_by_locality(ha_cpu_topo, cpu_topo_maxcpus);
+	cpu_reorder_by_cluster_capa(ha_cpu_topo, cpu_topo_maxcpus);
 
 	prev_gid = prev_lid = -2; // make sure it cannot match even unassigned ones
 	curr_gid = curr_lid = -1;
@@ -710,7 +712,10 @@ void cpu_compose_clusters(void)
 		     (ha_cpu_topo[cpu].ca_id[4] < 0 && // no l4 ? check L3
 		      ((ha_cpu_topo[cpu].ca_id[3] != ha_cpu_topo[cpu-1].ca_id[3]) ||
 		       (ha_cpu_topo[cpu].ca_id[3] < 0 && // no l3 ? check L2
-		        (ha_cpu_topo[cpu].ca_id[2] != ha_cpu_topo[cpu-1].ca_id[2]))))) {
+		        (ha_cpu_topo[cpu].ca_id[2] != ha_cpu_topo[cpu-1].ca_id[2])))) ||
+		    (ha_cpu_topo[cpu].capa > 0 && ha_cpu_topo[cpu-1].capa > 0 &&
+		     (ha_cpu_topo[cpu].capa * 100 < ha_cpu_topo[cpu-1].capa * 95 ||
+		      ha_cpu_topo[cpu].capa * 95 > ha_cpu_topo[cpu-1].capa * 100))) {
 			curr_gid++;
 			curr_lid++;
 		}
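
The capacity test added in the second hunk can be tried in isolation. The following is only a minimal sketch, not the code from the tree: `capa_differs()` and `assign_clusters()` are hypothetical helper names, the package/node/cache comparisons are left out, and the input is assumed to be already sorted so that CPUs of similar capacity are adjacent (which is what the cluster+capacity sort is expected to provide). Only the integer comparison itself is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Returns non-zero when two known capacities differ by more than about 5%,
 * using the same integer comparison as the patch. A capacity <= 0 is
 * treated as unknown and never triggers a split.
 */
static int capa_differs(int curr, int prev)
{
	if (curr <= 0 || prev <= 0)
		return 0;
	return curr * 100 < prev * 95 || curr * 95 > prev * 100;
}

/* Hypothetical helper: assigns a cluster ID to each of the <n> capacities
 * in <capa>, assumed to be already sorted so that similar CPUs are
 * adjacent. IDs start at 0 and are stored into <cl>. Returns the number
 * of clusters found.
 */
static int assign_clusters(const int *capa, int *cl, size_t n)
{
	int id = -1;
	size_t i;

	assert(capa != NULL && cl != NULL);
	for (i = 0; i < n; i++) {
		/* the first CPU always opens a cluster, the next ones only
		 * do so when their capacity departs from the previous one.
		 */
		if (i == 0 || capa_differs(capa[i], capa[i - 1]))
			id++;
		cl[i] = id;
	}
	return id + 1;
}
```

Fed with the capacities from the commit message in decreasing order (1024, 1024, 984, 984, 905, 905, 866, 866, then four times 278), the sketch keeps 984 with 1024 and 866 with 905 since each pair stays within the tolerance, and splits at 984/905 and 866/278, which gives the three clusters shown in the dump above.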