- schedule highstate every 2 hours (was 15 minutes); interval lives in
global:push:highstate_interval_hours so the SOC admin UI can tune it and
so-salt-minion-check derives its threshold as (interval + 1) * 3600
- add inotify beacon on the manager + master reactor + orch.push_batch that
writes per-app intent files, with a so-push-drainer schedule on the manager
that debounces, dedupes, and dispatches a single orchestration
- pillar_push_map.yaml allowlists the apps whose pillar changes trigger an
immediate targeted state.apply (targets verified against salt/top.sls);
edits under pillar/minions/ trigger a state.highstate on that one minion
- host-batch every push orchestration (batch: 25%, batch_wait: 15) so rule
  changes don't trigger a thundering herd across large fleets
- new global:push:enabled kill-switch tears down the beacon, reactor config,
and drainer schedule on the next highstate for operators who want to keep
highstate-only behavior
- set restart_policy: unless-stopped on 23 container states so docker
recovers crashes without waiting for the next highstate; leave registry
(always), strelka/backend (on-failure), kratos, and hydra alone with
inline comments explaining why
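A sketch of the pillar layout the push bullets above imply. Only global:push:enabled and global:push:highstate_interval_hours are named in this message; the pillar_push_map.yaml shape and app names are illustrative assumptions:

```yaml
# Pillar sketch -- keys under global:push as described above; values are examples.
global:
  push:
    enabled: true                  # kill-switch: false tears down beacon, reactor config, drainer
    highstate_interval_hours: 2    # so-salt-minion-check threshold: (2 + 1) * 3600 = 10800s

# pillar_push_map.yaml sketch -- allowlisted apps whose pillar changes trigger
# an immediate targeted state.apply. Entry names/structure here are assumptions;
# real targets are verified against salt/top.sls before dispatch.
suricata: {}
elasticsearch: {}
```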
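A minimal docker_container.running sketch of the restart_policy change (state and image names are illustrative, not taken from the repo):

```yaml
so-example:
  docker_container.running:
    - name: so-example
    - image: example/so-example:latest
    - restart_policy: unless-stopped  # docker restarts a crashed container without waiting for the next highstate
```

registry (always), strelka/backend (on-failure), kratos, and hydra keep their existing behavior, as noted above.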
Add ulimits as a configurable advanced setting for every container,
allowing customization through the web UI. Move hardcoded ulimits
from elasticsearch and zeek into defaults.yaml and fix elasticsearch
ulimits that were incorrectly nested under the environment key.
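A sketch of the corrected placement, assuming defaults.yaml keys of the form `<app>:ulimits`; the substance of the fix is that ulimits is a top-level docker_container.running argument, not a child of the environment key:

```yaml
# defaults.yaml (sketch)
elasticsearch:
  ulimits:
    - memlock=-1:-1
    - nofile=65536:65536

# container state (sketch) -- ulimits passed at the container level
so-elasticsearch:
  docker_container.running:
    - ulimits: {{ salt['pillar.get']('elasticsearch:ulimits', []) }}
```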
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This way, when kafka_controllers is updated the pillar value is refreshed, and any minions no longer in the controller list revert to the 'broker'-only role.
Needs more testing: when a new controller joins in this manner, Kafka errors out because the cluster metadata is out of sync. One solution is to remove /nsm/kafka/data/__cluster_metadata-0/quorum-state and restart the cluster. An alternative is to use the Kafka CLI tools to inform the cluster of the new voter; that is likely the best option, but it requires some sort of wrapper script for updating the cluster in place.
The easiest option is to have all receivers join the grid and then configure Kafka with specific controllers via the SOC UI prior to enabling Kafka. That way the Kafka cluster comes up in the desired configuration with no need to immediately modify the cluster.
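The quorum-state workaround above as a shell sketch; the restart step is deployment-specific, so it is left as a comment rather than a guessed command:

```shell
# Destructive workaround sketch: clear the stale controller quorum state so the
# cluster forgets the old voter set, then restart Kafka on each affected node.
rm -f /nsm/kafka/data/__cluster_metadata-0/quorum-state
# (restart the Kafka service/containers here -- mechanism depends on the deployment)
```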
Signed-off-by: reyesj2 <94730068+reyesj2@users.noreply.github.com>