MEDIUM: cpu-topo: change "efficiency" to consider per-core capacity

This is similar to the previous change to the "performance" policy, but it applies to the "efficiency" one. CPU clusters are now sorted by average per-CPU capacity, and a cluster is evicted when its per-CPU capacity is at least 125% of that of the previous (less capable) one. Comparing per-core capacity rather than total cluster capacity makes it possible to detect discrepancies between CPU cores, and to keep focusing on the efficient ones as a priority.
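As an illustration of the 125% test described above, here is a minimal sketch. The struct and function names are hypothetical and not part of the patch; only the field names mirror `capa` and `nb_cpu` from `ha_cpu_clusters[]`. It cross-multiplies instead of dividing, so the average does not need to be computed for the candidate cluster.

```c
#include <stddef.h>

/* Hypothetical, simplified view of a cluster: only the two fields the
 * test needs. The names mirror ha_cpu_clusters[].capa and .nb_cpu.
 */
struct cluster_sketch {
	unsigned int capa;   /* total capacity of the cluster */
	unsigned int nb_cpu; /* number of CPUs in the cluster */
};

/* Returns 1 if the cluster's average per-CPU capacity is at least 125% of
 * <ref_capa>, the per-CPU capacity of the previously retained cluster.
 * "capa / nb_cpu >= ref_capa * 10 / 8" is rewritten as a cross-multiplication
 * so that neither a division nor floating point is needed. A zero <ref_capa>
 * means "no reference yet" and never evicts.
 */
static int cluster_is_too_fast(const struct cluster_sketch *cl, unsigned int ref_capa)
{
	if (!ref_capa)
		return 0;
	return (unsigned long long)cl->capa * 8 >=
	       (unsigned long long)cl->nb_cpu * ref_capa * 10;
}
```

For instance, with a reference of 100 per CPU, a 4-CPU cluster of total capacity 500 (125 per CPU) is reported as too fast, while one of total capacity 400 (100 per CPU) is not. The 64-bit casts are an addition of this sketch to avoid overflow with large values; the patch itself uses plain integer arithmetic.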
2025-05-13 16:18:25 +02:00
commit 70b0dd6b0f
parent 6c88e27cf4
2 changed files with 21 additions and 17 deletions
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -1993,16 +1993,17 @@ cpu-policy <policy>
                        if not set, will be set to 1.

   - efficiency         exactly like group-by-cluster below, except that CPU
-                        clusters whose performance is more than twice that of
-                        the next less performant one are evicted. These are
-                        typically "big" or "performance" cores. This means that
-                        if more than one type of CPU cores are detected, only
-                        the efficient one will be used. This can make sense for
-                        use with moderate loads when the most powerful cores
-                        need to be available to the application or a security
-                        component. Some modern CPUs have a large number of such
-                        efficient CPU cores which can collectively deliver a
-                        decent level of performance while using less power.
+                        clusters composed of cores whose performance is more
+                        than 25% above that of the next less performant one are
+                        evicted. These are typically "big" or "performance"
+                        cores. This means that if more than one type of CPU
+                        cores are detected, only the efficient one will be
+                        used. This can make sense for use with moderate loads
+                        when the most powerful cores need to be available to
+                        the application or a security component. Some modern
+                        CPUs have a large number of such efficient CPU cores
+                        which can collectively deliver a decent level of
+                        performance while using less power.

   - first-usable-node  if the CPUs were not previously restricted at boot (for
                        example using the "taskset" utility), and if the
--- a/src/cpu_topo.c
+++ b/src/cpu_topo.c
@@ -1359,7 +1359,7 @@ static int cpu_policy_performance(int policy, int tmin, int tmax, int gmin, int

 /* the "efficiency" cpu-policy:
 *  - does nothing if nbthread or thread-groups are set
- *  - eliminates clusters whose total capacity is above half of others
+ *  - eliminates clusters whose per-cpu capacity is >= 125% of the previous one
 *  - tries to create one thread-group per cluster, with as many
 *    threads as CPUs in the cluster, and bind all the threads of
 *    this group to all the CPUs of the cluster.
@@ -1372,22 +1372,25 @@ static int cpu_policy_efficiency(int policy, int tmin, int tmax, int gmin, int g
 	if (global.nbthread || global.nbtgroups)
 		return 0;

-	/* sort clusters by reverse capacity */
-	cpu_cluster_reorder_by_capa(ha_cpu_clusters, cpu_topo_maxcpus);
+	/* sort clusters by average reverse capacity */
+	cpu_cluster_reorder_by_avg_capa(ha_cpu_clusters, cpu_topo_maxcpus);

 	capa = 0;
 	for (cluster = cpu_topo_maxcpus - 1; cluster >= 0; cluster--) {
-		if (capa && ha_cpu_clusters[cluster].capa > capa * 2) {
-			/* This cluster is more than twice as fast as the
-			 * previous one, we're not interested in using it.
+		if (capa && ha_cpu_clusters[cluster].capa * 8 >= ha_cpu_clusters[cluster].nb_cpu * capa * 10) {
+			/* This cluster is made of cores each at least 25% faster
+			 * than those of the previous cluster, we're not
+			 * interested in using it.
 			 */
 			for (cpu = 0; cpu <= cpu_topo_lastcpu; cpu++) {
 				if (ha_cpu_topo[cpu].cl_gid == ha_cpu_clusters[cluster].idx)
 					ha_cpu_topo[cpu].st |= HA_CPU_F_IGNORED;
 			}
 		}
+		else if (ha_cpu_clusters[cluster].nb_cpu)
+			capa = ha_cpu_clusters[cluster].capa / ha_cpu_clusters[cluster].nb_cpu;
 		else
-			capa = ha_cpu_clusters[cluster].capa;
+			capa = 0;
 	}

 	cpu_cluster_reorder_by_index(ha_cpu_clusters, cpu_topo_maxcpus);
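The scan in the last hunk can be tried in isolation with the sketch below. It is not the HAProxy code: `struct cl_sim` and `evict_fast_clusters()` are hypothetical names, the `ignored` field stands in for setting `HA_CPU_F_IGNORED` on each CPU of the cluster, and the array is assumed to be already sorted by decreasing average per-CPU capacity, with empty clusters last.

```c
#include <stddef.h>

/* Hypothetical stand-in for one ha_cpu_clusters[] entry */
struct cl_sim {
	unsigned int capa;   /* total capacity of the cluster */
	unsigned int nb_cpu; /* number of CPUs in the cluster */
	int ignored;         /* set to 1 when the cluster is evicted */
};

/* Walks <cl> from the last entry (least capable) to the first one, marks
 * as ignored every cluster whose per-CPU capacity is at least 125% of the
 * reference, and returns the number of evicted clusters. The reference is
 * the per-CPU capacity of the last cluster that was kept, or zero when no
 * non-empty cluster has been seen yet.
 */
static int evict_fast_clusters(struct cl_sim *cl, int count)
{
	unsigned int ref = 0;
	int evicted = 0;
	int i;

	for (i = count - 1; i >= 0; i--) {
		if (ref && (unsigned long long)cl[i].capa * 8 >=
		           (unsigned long long)cl[i].nb_cpu * ref * 10) {
			cl[i].ignored = 1;
			evicted++;
		}
		else if (cl[i].nb_cpu)
			ref = cl[i].capa / cl[i].nb_cpu;
		else
			ref = 0;
	}
	return evicted;
}
```

With three 4-CPU clusters of total capacity 4000, 2400 and 2200 followed by an empty entry, the reference first becomes 550, then 600 because 600 is below 125% of 550, and only the 1000-per-CPU cluster is evicted. Since the reference follows each retained cluster, a chain of small steps is tolerated while a single large jump is not.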