Load component and index templates as throttled background jobs (max 10
concurrent) instead of sequential curl PUTs, matching the bounded-concurrency
+ flock-serialized-output pattern used by the fleet/ILM load scripts. Keeps a
wait barrier between the component phase and the index phase so index
templates never load before their referenced component templates. Failures are
tracked via per-job marker files since counter increments can't escape
background subshells.
A single printf per block was not actually one write() call, so
concurrent jobs still occasionally interleaved their label and response
lines. Hold an flock around just the printf (curl still runs in
parallel) so each policy's block prints intact, keeping live
completion-order streaming.
Run the ~300 ILM policy PUTs concurrently (bounded to 10 in flight via a
throttle gate) instead of one serial curl per policy. Adds a put_policy
helper and waits for all background jobs before exiting. Preserves policy
parity; only the scheduling changes. Drops the dead empty sid cookie arg
(falls back to basic auth from curl.config as before).
- schedule highstate every 2 hours (was 15 minutes); interval lives in
global:push:highstate_interval_hours so the SOC admin UI can tune it and
so-salt-minion-check derives its threshold as (interval + 1) * 3600
- add inotify beacon on the manager + master reactor + orch.push_batch that
writes per-app intent files, with a so-push-drainer schedule on the manager
that debounces, dedupes, and dispatches a single orchestration
- pillar_push_map.yaml allowlists the apps whose pillar changes trigger an
immediate targeted state.apply (targets verified against salt/top.sls);
edits under pillar/minions/ trigger a state.highstate on that one minion
- host-batch every push orchestration (batch: 25%, batch_wait: 15) so rule
changes don't thundering-herd large fleets
- new global:push:enabled kill-switch tears down the beacon, reactor config,
and drainer schedule on the next highstate for operators who want to keep
highstate-only behavior
- set restart_policy: unless-stopped on 23 container states so docker
recovers crashes without waiting for the next highstate; leave registry
(always), strelka/backend (on-failure), kratos, and hydra alone with
inline comments explaining why