make so-yaml PG-canonical and add pillar-change reactor stack

Two coupled changes that together let so_pillar.* be the canonical config store, with config edits driving service reloads automatically: so-yaml PG-canonical mode - Adds /opt/so/conf/so-yaml/mode (and SO_YAML_BACKEND env override) with three values: dual (legacy), postgres (PG-only for managed paths), disk (emergency rollback). Bootstrap files (secrets.sls, ca/init.sls, *.nodes.sls, top.sls, ...) stay disk-only regardless via the existing SkipPath allowlist in so_yaml_postgres.locate. - loadYaml/writeYaml/purgeFile now route to so_pillar.* in postgres mode: replace/add/get all read+write the database with no disk file ever appearing. PG failure is fatal in postgres mode (no silent fallback); dual mode preserves the prior best-effort mirror. - so_yaml_postgres gains read_yaml(path), is_pg_managed(path), and is_enabled() so so-yaml can answer "is this path PG-managed and is PG up" without reaching into private helpers. - schema_pillar.sls writes /opt/so/conf/so-yaml/mode = postgres after the importer succeeds, so flipping postgres:so_pillar:enabled flips so-yaml's behavior in lockstep with the schema being live. pg_notify-driven change fan-out - 008_change_notify.sql adds so_pillar.change_queue + an AFTER trigger on pillar_entry that enqueues the locator and pg_notifies 'so_pillar_change'. Queue is drained at-least-once so engine restarts don't lose events; pg_notify is just the wakeup signal. - New salt-master engine pg_notify_pillar.py LISTENs on the channel, drains the queue with FOR UPDATE SKIP LOCKED, debounces bursts, and fires 'so/pillar/changed' events grouped by (scope, role, minion). - Reactor so_pillar_changed.sls catches the tag and dispatches to orch.so_pillar_reload, which carries a DISPATCH map of pillar-path prefix -> (state sls, role grain set) so adding a new service to the auto-reload list is a one-line edit instead of a new reactor. - Engine + reactor wiring is gated on the same postgres:so_pillar:enabled flag as the schema and ext_pillar config so the whole stack flips on/off together. Tests: 21 new cases (112 total, all passing) covering mode resolution, PG-managed detection, and PG-canonical read/write/purge routing with the PG client stubbed.
2026-06-14 22:28:43 +02:00 · 2026-05-01 09:31:48 -04:00
parent 23255f88e0
commit 3d11694d51
13 changed files with 958 additions and 11 deletions
@@ -37,6 +37,16 @@ GRANT SELECT ON so_pillar.v_pillar_global,
    TO so_pillar_master;
 GRANT EXECUTE ON FUNCTION so_pillar.fn_pillar_secrets(text) TO so_pillar_master;

+-- Engine reads + drains the change queue from the salt-master process. It
+-- needs SELECT to find unprocessed rows and UPDATE to mark them processed.
+-- The queue contains only locator metadata (no pillar data), so the master
+-- role's existing privilege footprint is unchanged in practice.
+GRANT SELECT, UPDATE ON so_pillar.change_queue TO so_pillar_master;
+GRANT USAGE ON SEQUENCE so_pillar.change_queue_id_seq TO so_pillar_master;
+-- Writer needs INSERT (the trigger runs as table owner, so this is just for
+-- direct testing / manual replays from psql).
+GRANT INSERT ON so_pillar.change_queue TO so_pillar_writer;
+
 -- Writer needs CRUD on pillar_entry/minion/role_member plus access to seed tables.
 GRANT SELECT, INSERT, UPDATE, DELETE
    ON so_pillar.pillar_entry,