Re: [PATCH 09/30] sched_ext: Implement BPF extensible scheduler class
View on Lore: https://lore.kernel.org/all/Zn0joEebAdwjiTyT@gpd
Commit Message
On Tue, Jun 18, 2024 at 11:17:24AM -1000, Tejun Heo wrote:
...
> + /**
> + * set_weight - Set task weight
> + * @p: task to set weight for
> + * @weight: new eight [1..10000]
Small nit: eight -> weight
> + *
> + * Update @p's weight to @weight.
> + */
> + void (*set_weight)(struct task_struct *p, u32 weight);
-Andrea
Diff
No diff found.
Implementation Analysis
What This Email Addresses
Andrea Righi spots a typo in the sched_ext_ops kernel-doc comment for the set_weight callback:
/**
* set_weight - Set task weight
* @p: task to set weight for
* @weight: new eight [1..10000] ← "eight" should be "weight"
*
* Update @p's weight to @weight.
*/
void (*set_weight)(struct task_struct *p, u32 weight);
The word "eight" in the @weight parameter description should be "weight". This is a documentation typo that does not affect compilation or behavior.
Why set_weight Matters for sched_ext
The set_weight callback was added in v3 of the patch series (noted in the commit message: "ops.set_weight() added to allow BPF schedulers to track weight changes without polling p->scx.weight"). Without this callback, a BPF scheduler that wants to implement weight-aware scheduling has to either poll p->scx.weight on every scheduling decision (expensive) or miss updates entirely.
The callback's weight range [1..10000] maps to the Linux scheduler's nice-value-derived weight range. Weight 1 is the minimum (nice +19 equivalent), weight 10000 is an approximation of the nice 0 base weight (1024 in the kernel's SCHED_NORMAL scheme), and higher values correspond to negative nice values. BPF schedulers that implement proportional-share or weighted fair queueing need this value.
What the Community Decided
This is a trivial typo fix — it will be corrected in the next revision of the patch. No design changes are implied.
Design Insights Revealed
The set_weight callback is an example of a pattern repeated throughout sched_ext_ops: rather than having BPF schedulers poll task fields directly, sched_ext provides callbacks that notify the BPF scheduler when something changes. This push model is important because:
- BPF programs cannot safely dereference arbitrary task_struct fields without explicit whitelisting
- Polling from
ops.dispatch()orops.enqueue()would add overhead on every scheduling event - State changes (weight, cpumask, priority) happen infrequently; push notifications are efficient
The full set of "state notification" callbacks in sched_ext_ops includes set_weight, set_cpumask, and update_idle. Understanding that these exist as a group — and why they exist — is essential for writing correct BPF schedulers.
What Maintainers Should Know
Kernel-doc typos in sched_ext_ops callback descriptions are worth catching: those comments are the primary documentation BPF scheduler authors read when implementing a callback. Unlike internal kernel comments, these are part of the public BPF interface documentation and will appear in generated API docs. Reviewers like Andrea Righi who read the kernel-doc carefully are providing genuine value by catching these.