The SOC postgres database was renamed so_soc -> securityonion (see
POSTGRES_DB in salt/postgres/enabled.sls and the SOC postgres config in
salt/soc/defaults.yaml). The pillar_db beacon still hardcoded so_soc, so
every poll failed with 'database "so_soc" does not exist' (rc=2),
silently disabling active-push detection of audit_settings changes.
Update DATABASE to 'securityonion' and refresh the now-stale so_soc
references in the beacon and push_pillar reactor comments.

The active-push tunables (enabled, highstate_interval_hours, debounce_seconds,
drain_interval, batch, batch_wait) describe how Salt auto-applies changes, not
general grid config, so relocate them from the global namespace to a new
salt.auto_apply settings module.
- Add salt/salt/{defaults.yaml,auto_apply.map.jinja,soc_salt.yaml,adv_salt.yaml}.
auto_apply.map.jinja is a dedicated, side-effect-free merge map (the existing
salt/salt/map.jinja dereferences pillar.host.mainint at import time).
- Remove the push blocks from salt/global/{defaults,soc_global}.yaml.
- Register salt.soc_salt/salt.adv_salt in pillar/top.sls; seed the local pillar
stubs for fresh installs (make_some_dirs) and upgrades (ensure_salt_local_pillar
in soup, wired into up_to_3.2.0).
- Repoint all consumers: GLOBALMERGED.push.* -> AUTOAPPLY.* (schedule, salt
master, manager beacons, beacons_pushstate, orch.push_batch) and
pillar.get('global:push...') -> 'salt:auto_apply...' (push reactors,
so-push-drainer).
- Add a salt: fleetwide-highstate entry to pillar_push_map.yaml so edits keep
applying immediately, matching the prior global-namespace behavior.

The active-push feature detected pillar/settings changes via an inotify
beacon on the manager watching /opt/so/saltstack/local/pillar. Replace
that pillar watch with a custom salt beacon (pillar_db) that polls the
SOC so_soc.audit_settings table on a monotonic id watermark, so changes
made through SOC drive immediate pushes from the database instead of the
files. The suricata/strelka rule inotify watches (and pyinotify) are kept
unchanged, since rule-file edits are not recorded in audit_settings.
- salt/_beacons/pillar_db.py: new beacon. Polls audit_settings via
`docker exec so-postgres psql` (unix-socket trust auth), tracks the last
processed id in /opt/so/state/pillar_db_watch.id, seeds to MAX(id) on
first run (no history replay), and emits one event per new row.
- salt/reactor/push_pillar.sls: consume setting_id/node_id from the beacon
event instead of a file path. App = first dotted segment of setting_id,
looked up in pillar_push_map.yaml. Empty node_id -> the app's grid-wide
actions run as is; populated node_id -> the app's state(s) are retargeted
to that one node.
- salt/manager/files/beacons_pushstate.conf.jinja: drop the pillar inotify
block, add the pillar_db beacon (interval = push.drain_interval); keep
the suricata/strelka inotify watches.
- salt/salt/files/reactor_pushstate.conf: map salt/beacon/*/pillar_db/
audit_settings to push_pillar.sls; remove the pillar inotify reactor
lines; keep suricata/strelka.
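
The watermark behavior described for the pillar_db beacon can be sketched
roughly as follows. This is a minimal illustration, not the beacon's actual
code: the database access is replaced by two injected callables
(fetch_max_id, fetch_rows_after), the row shape (id, setting_id, node_id) is
an assumption, and the demo table and node name are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Callable, Dict, List, Tuple

# Assumed row shape: (id, setting_id, node_id). The real column set may differ.
Row = Tuple[int, str, str]


def poll_once(
    state_file: Path,
    fetch_max_id: Callable[[], int],
    fetch_rows_after: Callable[[int], List[Row]],
) -> List[Dict[str, object]]:
    """Return one event per row newer than the stored watermark."""
    if not state_file.exists():
        # First run: seed the watermark to MAX(id) so history is not replayed.
        state_file.write_text(str(fetch_max_id()))
        return []

    last_id = int(state_file.read_text().strip() or 0)
    events: List[Dict[str, object]] = []
    for row_id, setting_id, node_id in sorted(fetch_rows_after(last_id)):
        events.append({"id": row_id, "setting_id": setting_id, "node_id": node_id})
        last_id = row_id

    if events:
        # Persist the highest id seen so the next poll starts after it.
        state_file.write_text(str(last_id))
    return events


# In-memory stand-in for the audit table (hypothetical data).
TABLE: List[Row] = [(1, "telegraf.output", ""), (2, "ntp.config.servers", "sensor01")]


def max_id() -> int:
    return max((row[0] for row in TABLE), default=0)


def rows_after(last_id: int) -> List[Row]:
    return [row for row in TABLE if row[0] > last_id]


with tempfile.TemporaryDirectory() as tmp:
    state = Path(tmp) / "pillar_db_watch.id"
    first = poll_once(state, max_id, rows_after)   # seeds the watermark only
    TABLE.append((3, "telegraf.output", ""))
    second = poll_once(state, max_id, rows_after)  # emits the new row
    third = poll_once(state, max_id, rows_after)   # nothing new
    final_watermark = state.read_text()
```

Injecting the fetch functions keeps the watermark logic testable without a
running database; in the real beacon those calls go through
`docker exec so-postgres psql`.
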
The intent -> so-push-drainer -> orch.push_batch pipeline is unchanged.
Verified end-to-end on a standalone: a grid-wide telegraf.output change
re-applied telegraf fleetwide (container replaced), and a per-host
ntp.config.servers change applied ntp to only that node.
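
The two verified cases above follow from the reactor's routing rule, which
might look roughly like this sketch. The map layout (app -> list of
{"state", "tgt"} actions), the "*" target, and the node name sensor01 are
assumptions for illustration; the real pillar_push_map.yaml structure may
differ.

```python
from typing import Dict, List, Optional

Action = Dict[str, str]  # hypothetical shape: {"state": ..., "tgt": ...}

# Hypothetical excerpt of an allowlist keyed by app name.
PUSH_MAP: Dict[str, List[Action]] = {
    "telegraf": [{"state": "telegraf", "tgt": "*"}],
    "ntp": [{"state": "ntp", "tgt": "*"}],
}


def plan_actions(
    setting_id: str, node_id: str, push_map: Dict[str, List[Action]]
) -> Optional[List[Action]]:
    """Resolve a beacon event to the actions that should be queued."""
    # App = first dotted segment of setting_id.
    app = setting_id.split(".", 1)[0]
    actions = push_map.get(app)
    if actions is None:
        return None  # app is not allowlisted, so nothing is pushed
    if not node_id:
        # Empty node_id: use the app's grid-wide actions as is.
        return [dict(action) for action in actions]
    # Populated node_id: same state(s), retargeted to that one node.
    return [{"state": action["state"], "tgt": node_id} for action in actions]


grid_wide = plan_actions("telegraf.output", "", PUSH_MAP)
per_node = plan_actions("ntp.config.servers", "sensor01", PUSH_MAP)
ignored = plan_actions("unlisted.setting", "", PUSH_MAP)
```
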
- schedule highstate every 2 hours (was 15 minutes); interval lives in
global:push:highstate_interval_hours so the SOC admin UI can tune it and
so-salt-minion-check derives its threshold as (interval + 1) * 3600
- add inotify beacon on the manager + master reactor + orch.push_batch that
writes per-app intent files, with a so-push-drainer schedule on the manager
that debounces, dedupes, and dispatches a single orchestration
- pillar_push_map.yaml allowlists the apps whose pillar changes trigger an
immediate targeted state.apply (targets verified against salt/top.sls);
edits under pillar/minions/ trigger a state.highstate on that one minion
- host-batch every push orchestration (batch: 25%, batch_wait: 15) so rule
changes don't thundering-herd large fleets
- new global:push:enabled kill-switch tears down the beacon, reactor config,
and drainer schedule on the next highstate for operators who want to keep
highstate-only behavior
- set restart_policy: unless-stopped on 23 container states so docker
recovers crashes without waiting for the next highstate; leave registry
(always), strelka/backend (on-failure), kratos, and hydra unchanged, with
inline comments explaining why
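
Two of the items above reduce to small pieces of arithmetic and logic, shown
here as a sketch. The threshold formula is the one stated above; the drain
function is only one plausible reading of "debounces, dedupes" (wait until
the newest intent is older than debounce_seconds, then collapse duplicates),
and its names, intent shape, and the 30-second window are hypothetical.

```python
from typing import List, Tuple


def minion_check_threshold_seconds(highstate_interval_hours: int) -> int:
    """Threshold derived as (interval + 1) * 3600."""
    return (highstate_interval_hours + 1) * 3600


def drain(intents: List[Tuple[float, str]], now: float, debounce_seconds: float) -> List[str]:
    """Return the deduplicated apps to dispatch, or [] while still debouncing.

    Each intent is assumed to be a (timestamp, app) pair.
    """
    if not intents:
        return []
    newest = max(timestamp for timestamp, _ in intents)
    if now - newest < debounce_seconds:
        # Changes are still arriving; wait for the next drain interval.
        return []
    # Several intents for the same app collapse into one dispatch entry.
    return sorted({app for _, app in intents})


threshold = minion_check_threshold_seconds(2)
pending = [(100.0, "suricata"), (105.0, "suricata"), (106.0, "strelka")]
too_early = drain(pending, now=110.0, debounce_seconds=30.0)
ready = drain(pending, now=140.0, debounce_seconds=30.0)
```

With the 2-hour interval, the check threshold works out to 10800 seconds.
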