Critique
The strongest part of the design is that it now has a clearer ownership split:
- scheduler-owned state stays under the scheduler lock
- task-local membership stays under threads_lock
- VM teardown is split into “free user mappings” and “free paging objects”
- blocked-thread wakeups are classified by block_reason instead of being globally permissive

That is the right direction. The earlier failures were mostly from violating one of those ownership boundaries.
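
As a minimal sketch of the last point in that list, wakeups can be gated by a check keyed on block_reason. Only the name block_reason comes from the kernel; the enum values, wake_source_t, and wake_permitted() are hypothetical, and the rule that a dying task may wake any blocked sibling is an assumption based on the name scheduler_mark_task_dying_wake_task_threads().

```c
#include <stdbool.h>

/* Hypothetical reasons a thread may be blocked; the real kernel's
 * block_reason values may differ. */
typedef enum {
    BLOCK_NONE = 0,     /* thread is not blocked */
    BLOCK_DESCHEDULE,   /* blocked by deschedule() */
    BLOCK_SLEEP,        /* blocked until a tick deadline */
    BLOCK_CONDVAR,      /* blocked on a condition variable */
    BLOCK_WAIT          /* blocked in wait() for a child */
} block_reason_t;

/* Hypothetical sources that may try to wake a thread. */
typedef enum {
    WAKE_MAKE_RUNNABLE,
    WAKE_TIMER,
    WAKE_CV_SIGNAL,
    WAKE_CHILD_EXIT,
    WAKE_TASK_DYING
} wake_source_t;

/* Returns true only when this source is allowed to wake a thread that
 * is blocked for this reason. Anything not listed is rejected. */
bool wake_permitted(block_reason_t reason, wake_source_t source)
{
    if (reason == BLOCK_NONE) {
        return false;               /* nothing to wake */
    }
    if (source == WAKE_TASK_DYING) {
        return true;                /* assumption: task death wakes all siblings */
    }
    switch (reason) {
    case BLOCK_DESCHEDULE: return source == WAKE_MAKE_RUNNABLE;
    case BLOCK_SLEEP:      return source == WAKE_TIMER;
    case BLOCK_CONDVAR:    return source == WAKE_CV_SIGNAL;
    case BLOCK_WAIT:       return source == WAKE_CHILD_EXIT;
    default:               return false;
    }
}
```

The value of the default-deny shape is that a new block reason cannot be woken by anything until someone adds its case deliberately.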

The weakest part is that the kernel still mixes three different kinds of change in the same code paths:

- real design logic
- defensive invariants
- live debugging instrumentation

That makes it harder to tell what is fundamental versus temporary. The best example is the malloc-area probing in kern/malloc_wrappers.c. The current sfree() overlap check is useful, but it is still diagnostic policy living in the production allocator wrapper.
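
One way to separate the probe from allocator policy is to keep the check as a pure function and gate the call site behind a compile-time switch. The sketch below is an assumption about shape only: free_range_t is a stand-in for whatever malloc_lmm actually stores, and MALLOC_DEBUG_PROBES, SFREE_PROBE, and free_is_sane() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one free node in the allocator's region. */
typedef struct free_range {
    uintptr_t start;
    size_t len;
    const struct free_range *next;
} free_range_t;

/* True if [a, a+alen) and [b, b+blen) share at least one byte. */
static bool ranges_overlap(uintptr_t a, size_t alen, uintptr_t b, size_t blen)
{
    return a < b + blen && b < a + alen;
}

/* True if the chunk lies fully inside the region and overlaps no node
 * that is already on the free list. */
bool free_is_sane(uintptr_t region_start, size_t region_len,
                  const free_range_t *free_list,
                  uintptr_t chunk, size_t len)
{
    if (len == 0 || len > region_len || chunk < region_start) {
        return false;
    }
    if (chunk - region_start > region_len - len) {
        return false;               /* chunk runs past the region end */
    }
    for (const free_range_t *n = free_list; n != NULL; n = n->next) {
        if (ranges_overlap(chunk, len, n->start, n->len)) {
            return false;           /* double free or overlapping free */
        }
    }
    return true;
}

/* The call site in the wrapper compiles to nothing unless probes are on. */
#ifdef MALLOC_DEBUG_PROBES
#define SFREE_PROBE(...) assert(free_is_sane(__VA_ARGS__))
#else
#define SFREE_PROBE(...) ((void)0)
#endif
```

With this split, the decision to keep the probe permanently becomes a build-configuration choice rather than an edit to the wrapper.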

The scheduler/task design is much better than it was around e3f7f80, but it still has one conceptual cost: task exit is spread across task_task_vanish(), task_thread_exit(), scheduler_mark_task_dying_wake_task_threads(), scheduler_exit_thread(), and task_wait().
That split is correct for safety, but it is not simple. The saving grace is that the roles are now cleaner:

- task code decides when a task or thread is logically dead
- scheduler code decides how blocked siblings are woken and how zombies are published
- wait/reap code decides when memory is finally freed

The VM design is mostly solid now. The important improvement is that page_walk_user() no longer assumes “everything in user PDE range must be user-owned.” That old assumption was too strong for your actual kernel, because user-visible address spaces still contain kernel-only remnants in some page tables. The current design is more realistic: walk only the mappings the user-space walker actually owns.
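
The skip-instead-of-panic behavior can be sketched on a mock two-level table. This is not the code in kern/vm/vm.c: the types, walk_user(), unmap_entry(), and USER_PDE_START are hypothetical, and the tables[] array stands in for the frame address a real PDE would carry. The present (0x1) and user (0x4) flag bits are the standard x86 ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ENTRIES        1024u
#define PTE_PRESENT    0x1u   /* x86 present bit */
#define PTE_USER       0x4u   /* x86 user/supervisor bit */
#define USER_PDE_START 4u     /* hypothetical: first PDE of the user range */

typedef struct { uint32_t pte[ENTRIES]; } page_table_t;

typedef struct {
    uint32_t pde[ENTRIES];
    page_table_t *tables[ENTRIES];  /* mock: replaces the PDE's frame address */
} page_dir_t;

typedef void (*walk_fn)(uint32_t vaddr, uint32_t *pte, void *arg);

/* An entry is user-owned only if it is present and has the user bit. */
static int user_owned(uint32_t entry)
{
    return (entry & (PTE_PRESENT | PTE_USER)) == (PTE_PRESENT | PTE_USER);
}

/* Visits every user-owned PTE in the user range and returns how many
 * were visited. Kernel-only PDEs and PTEs are skipped, not treated as
 * corruption. */
size_t walk_user(page_dir_t *pd, walk_fn fn, void *arg)
{
    assert(pd != NULL && fn != NULL);
    size_t visited = 0;
    for (uint32_t i = USER_PDE_START; i < ENTRIES; i++) {
        if (!user_owned(pd->pde[i]) || pd->tables[i] == NULL) {
            continue;               /* kernel-only or absent page table */
        }
        page_table_t *pt = pd->tables[i];
        for (uint32_t j = 0; j < ENTRIES; j++) {
            if (!user_owned(pt->pte[j])) {
                continue;           /* kernel-only remnant: not ours */
            }
            fn((i << 22) | (j << 12), &pt->pte[j], arg);
            visited++;
        }
    }
    return visited;
}

/* Example callback: drop the mapping (models "free user mappings"). */
void unmap_entry(uint32_t vaddr, uint32_t *pte, void *arg)
{
    (void)vaddr;
    (void)arg;
    *pte = 0;
}
```

After a teardown walk with unmap_entry(), kernel-only entries are left exactly as they were, which is the ownership rule the paragraph above describes.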

One design choice I would still question is the timer-rate change in kern/inc/handler_installers_internals.h. Moving from 1000 Hz to 500 Hz may be fine, but it is a system-level behavior change, not just a bug fix. If the reason is “reduce scheduler/interrupt churn and make long tests stable,” that should be documented explicitly, because otherwise it looks like a timing tweak that could mask rather than fix races.
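
A sketch of what explicit documentation could look like next to the constant, together with the two values that change with it. The names TIMER_HZ, pit_divisor(), and tick_period_us() are hypothetical, and the rationale in the comment is the guess from the paragraph above, not a confirmed reason. The 1193182 Hz input clock is the standard PC interval-timer base frequency.

```c
#include <assert.h>
#include <stdint.h>

#define PIT_BASE_HZ 1193182u   /* standard PC interval-timer input clock */

/*
 * Timer interrupt rate.
 * Changed from 1000 Hz to 500 Hz.
 * Assumed rationale (replace with the real one): reduce scheduler and
 * interrupt churn so long-running tests are stable.
 * Side effect: anything that counts ticks now sees 2 ms per tick.
 */
#define TIMER_HZ 500u

/* Divisor to program into the timer for the requested rate, clamped
 * to the 16-bit range the hardware accepts. */
uint16_t pit_divisor(uint32_t hz)
{
    assert(hz != 0);
    uint32_t d = PIT_BASE_HZ / hz;
    if (d < 1) {
        d = 1;
    }
    if (d > 0xFFFFu) {
        d = 0xFFFFu;
    }
    return (uint16_t)d;
}

/* Length of one tick in microseconds (integer division). */
uint32_t tick_period_us(uint32_t hz)
{
    assert(hz != 0);
    return 1000000u / hz;
}
```

Deriving the divisor and tick period from one named constant also makes it harder for tick-based code such as sleep accounting to keep assuming 1 ms per tick after the rate changes.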

Summary of fixes since e3f7f80
The meaningful committed kern/ fixes since e3f7f80 are:

- VM walker ownership fix in kern/vm/vm.c: page_walk_user() now skips kernel-only PDEs/PTEs instead of panicking on them. This is the fix behind the earlier “non-user PDE/PTE in user range” failures during teardown.
- ZFOD/new-pages correctness fix in kern/vm/vm.c: when copying a ZFOD mapping during fork, the child now preserves PTE_NP_START and PTE_NP_END. Without that, remove_pages() metadata could be lost across fork.
- Kernel-stack churn fix in kern/task/task.c: added a bounded kernel-stack cache for 16 KB stacks. This is the big minclone_many stabilization change. It reduces allocator churn and avoids repeatedly freeing/remapping aligned kernel stacks.
- Thread-reap safety checks in kern/task/task.c: dead threads are now asserted to be fully detached from scheduler, condvar, and sleep queues before being freed. This closes a class of “freed while still linked” corruption bugs.
- Parent/child sanity checks in kern/task/task.c: after fork(), the child’s parent link is asserted both after assignment and before return. This is defensive, but it directly protects the wait/zombie design.
- Deschedule control-flow fix in kern/scheduler/scheduler.c: scheduler_deschedule() now takes the scheduler lock before the rej == NULL early-return path. That makes the function’s control flow consistent with its shared context_switch cleanup path and fixes the earlier boot-time assertion around scheduler unlocking.
- Hash-table rehash policy fix in kern/datastructure/hash_table.c: tombstone-heavy tables now rehash in place instead of always growing. This reduces unnecessary allocation churn in the scheduler’s task/thread registries.
- Early-boot sync fix in kern/sync/mutex.c and kern/sync/sem.c: mutexes and semaphores are now more permissive before the first thread exists. That removed bogus “interrupts must already be off” assumptions during early boot and allocator initialization.
- Allocator corruption probe in kern/malloc_wrappers.c: sfree() now validates that the freed chunk is inside malloc_lmm and does not overlap an already-free node. This is not a design fix; it is a targeted invariant check to catch double-free / overlapping-free corruption earlier.
- Frame-pool integrity check in kern/vm/vm.c: page_alloc() now asserts the popped frame is inside the managed physical-frame range. Again, this is defensive validation, not core design.
- Timer-rate tuning in kern/inc/handler_installers_internals.h: timer frequency changed from 1000 Hz to 500 Hz. This is a runtime-policy change, not a structural fix.
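
The hash-table item above describes a policy decision that can be sketched in isolation. The thresholds (act at 3/4 occupancy, rehash in place when live entries fill less than half the table) and the names rehash_action_t and rehash_policy() are hypothetical; the actual values in kern/datastructure/hash_table.c may differ.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    REHASH_NONE,       /* table is healthy, do nothing */
    REHASH_IN_PLACE,   /* same capacity, just drop tombstones */
    REHASH_GROW        /* allocate a larger table */
} rehash_action_t;

/* Decides what to do after an insert or delete.
 * capacity:   number of slots
 * live:       slots holding real entries
 * tombstones: slots marking deleted entries */
rehash_action_t rehash_policy(size_t capacity, size_t live, size_t tombstones)
{
    assert(live + tombstones <= capacity);
    size_t used = live + tombstones;

    /* Hypothetical threshold: act once 3/4 of the slots are used. */
    if (used * 4 < capacity * 3) {
        return REHASH_NONE;
    }
    /* Mostly tombstones: cleaning up is enough, growing would only
     * add allocation churn. */
    if (live * 2 < capacity) {
        return REHASH_IN_PLACE;
    }
    return REHASH_GROW;
}
```

For a registry whose entries come and go at a steady rate, as task and thread IDs do, this keeps the table at a stable size instead of doubling it every time deletions accumulate.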

Net assessment
Since e3f7f80, the real architectural improvements are:

- the VM teardown logic now matches the true ownership of mappings
- the task/scheduler lifecycle is safer under long-run churn
- the kernel-stack lifetime story is much better
- the deschedule/wait path is more internally consistent

The weaker changes are the timing tweak and the remaining diagnostic assertions mixed into hot code. Those are useful today, but I would treat them as provisional unless you want them as permanent invariants.


Two further changes not covered in the summary above:

- kern/loader.c: getbytes() now walks exec2obj_userapp_TOC by pointer instead of copying the full 264-byte TOC entry by value onto the kernel stack each iteration.
- kern/io/console.c: public console operations are now serialized with a console spinlock plus interrupt save/restore, so cursor/buffer/flush state is no longer reentrant under mixed thread and interrupt printing.
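
The loader change can be illustrated with a small lookup that takes each table entry by pointer. The struct layout below is an assumption chosen to match the 264-byte figure on a 32-bit target (256 + 4 + 4); toc_entry_t and getbytes_from() are hypothetical names, and the real getbytes() signature may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_EXECNAME_LEN 256

/* Assumed entry layout: name, pointer to the image bytes, image length. */
typedef struct {
    char execname[MAX_EXECNAME_LEN];
    const char *execbytes;
    int execlen;
} toc_entry_t;

/* Copies up to size bytes of the named image, starting at offset, into
 * buf. Returns the number of bytes copied, or -1 if the name is not
 * found or the arguments are invalid. */
int getbytes_from(const toc_entry_t *toc, int count, const char *name,
                  int offset, int size, char *buf)
{
    assert(toc != NULL);
    if (name == NULL || buf == NULL || offset < 0 || size < 0) {
        return -1;
    }
    for (int i = 0; i < count; i++) {
        /* Pointer into the table: no per-iteration copy of the whole
         * entry onto the stack. */
        const toc_entry_t *e = &toc[i];
        if (strcmp(e->execname, name) != 0) {
            continue;
        }
        if (offset > e->execlen) {
            return -1;
        }
        int n = e->execlen - offset;
        if (n > size) {
            n = size;
        }
        memcpy(buf, e->execbytes + offset, (size_t)n);
        return n;
    }
    return -1;
}
```

On a small kernel stack, avoiding a struct copy of that size per loop iteration matters more than it would in user space.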