Move highstate_interval_hours to salt.schedule and split schedule.sls

highstate_interval_hours describes the per-minion highstate schedule, not the
active-push pipeline, so relocate it from salt.auto_apply to a new salt.schedule
settings subtree. Repoint so-salt-minion-check at the new setting (it was still
reading the stale global:push:highstate_interval_hours pillar path) so its
restart grace period tracks the schedule again.

- Add salt.schedule.highstate_interval_hours to defaults.yaml/soc_salt.yaml and a
  side-effect-free salt/salt/schedule.map.jinja (SCHEDULEMERGED), matching the
  *MERGED map convention. Consumers read SCHEDULEMERGED.highstate_interval_hours.
- Split salt/schedule.sls into salt/salt/highstate_schedule.sls (every minion) and
  salt/salt/push_drain_schedule.sls (managers); update top.sls to apply the
  highstate schedule via '*' and the drainer schedule via the configured-manager
  block. Remove the now-empty schedule.sls aggregator.
- pillar_push_map.yaml and so-push-drainer: comment/doc updates only.
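
For reference, the grace period that so-salt-minion-check derives from this
setting is (highstate_interval_hours + 1) * 3600 seconds. The snippet below is a
minimal sketch of that arithmetic only; the function name threshold_seconds is
hypothetical and does not exist in the repo, and the value 2 is the default
from defaults.yaml.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this commit): mirrors the THRESHOLD
# arithmetic used by so-salt-minion-check.
threshold_seconds() {
  local interval_hours="$1"
  # Highstate interval plus one hour of grace, converted to seconds.
  echo $(( (interval_hours + 1) * 3600 ))
}

# Default interval of 2 hours gives a 3-hour grace period.
threshold_seconds 2   # prints 10800
```

Raising the interval in salt.schedule widens the grace period by the same
amount, so the health check does not need a separate setting.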
Josh Patterson
2026-06-26 10:51:41 -04:00
parent fa2ae1b87f
commit da94788255
9 changed files with 34 additions and 28 deletions
@@ -5,7 +5,7 @@
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
{% from 'salt/schedule.map.jinja' import SCHEDULEMERGED %}
# this script checks the time the file /opt/so/log/salt/state-apply-test was last modified and restarts the salt-minion service if it is outside a threshold date/time
# the file is modified via file.touch using a scheduled job healthcheck.salt-minion.state-apply-test that runs a state.apply.
@@ -23,8 +23,8 @@ SYSTEM_START_TIME=$(date -d "$(</proc/uptime awk '{print $1}') seconds ago" +%s)
LAST_HIGHSTATE_END=$([ -e "/opt/so/log/salt/lasthighstate" ] && date -r /opt/so/log/salt/lasthighstate +%s || echo 0)
LAST_HEALTHCHECK_STATE_APPLY=$([ -e "/opt/so/log/salt/state-apply-test" ] && date -r /opt/so/log/salt/state-apply-test +%s || echo 0)
# SETTING THRESHOLD TO ANYTHING UNDER 600 seconds may cause a lot of salt-minion restarts since the job to touch the file occurs every 5-8 minutes by default
-# THRESHOLD is derived from the global push highstate interval + 1 hour, so the minion-check grace period tracks the schedule automatically.
-THRESHOLD=$(( ({{ salt['pillar.get']('global:push:highstate_interval_hours', 2) }} + 1) * 3600 )) #within how many seconds the file /opt/so/log/salt/state-apply-test must have been touched/modified before the salt minion is restarted
+# THRESHOLD is derived from the salt schedule highstate interval + 1 hour, so the minion-check grace period tracks the schedule automatically.
+THRESHOLD=$(( ({{ SCHEDULEMERGED.highstate_interval_hours }} + 1) * 3600 )) #within how many seconds the file /opt/so/log/salt/state-apply-test must have been touched/modified before the salt minion is restarted
THRESHOLD_DATE=$((LAST_HEALTHCHECK_STATE_APPLY+THRESHOLD))
logCmd() {
+1 -1
@@ -10,7 +10,7 @@ so-push-drainer
===============
Scheduled drainer for the active-push feature. Runs on the manager every
-drain_interval seconds (default 15) via a salt schedule in salt/schedule.sls.
+drain_interval seconds (default 15) via a salt schedule in salt/salt/push_drain_schedule.sls.
For each intent file under /opt/so/state/push_pending/*.json whose last_touch
is older than debounce_seconds, this script:
+6 -6
@@ -186,12 +186,12 @@ registry:
tgt: 'G@role:so-eval or G@role:so-import or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-standalone'
# salt: fanout to a fleetwide highstate. The salt.auto_apply settings tune the
-# push pipeline itself (enabled, debounce/drain intervals, batch sizing) and the
-# per-minion highstate schedule; they are consumed by the manager's schedule,
-# beacons, and master reactor config as well as every minion's highstate
-# schedule, so a targeted re-apply isn't meaningful. A salt audit row only fires
-# for SOC-driven salt.auto_apply edits -- salt version bumps go through soup, not
-# SOC, so they never reach this map.
+# push pipeline itself (enabled, debounce/drain intervals, batch sizing) and
+# salt.schedule sets the per-minion highstate interval; they are consumed by the
+# manager's schedule, beacons, and master reactor config as well as every
+# minion's highstate schedule, so a targeted re-apply isn't meaningful. A salt
+# audit row only fires for SOC-driven salt.auto_apply / salt.schedule edits --
+# salt version bumps go through soup, not SOC, so they never reach this map.
salt:
- highstate: True
tgt: '*'
+2 -1
@@ -1,8 +1,9 @@
 salt:
   auto_apply:
     enabled: true
-    highstate_interval_hours: 2
     debounce_seconds: 30
     drain_interval: 15
     batch: '25%'
     batch_wait: 15
+  schedule:
+    highstate_interval_hours: 2
+11
@@ -0,0 +1,11 @@
+{% from 'vars/globals.map.jinja' import GLOBALS %}
+{% from 'salt/schedule.map.jinja' import SCHEDULEMERGED %}
+highstate_schedule:
+  schedule.present:
+    - function: state.highstate
+    - hours: {{ SCHEDULEMERGED.highstate_interval_hours }}
+    - maxrunning: 1
+{% if not GLOBALS.is_manager %}
+    - splay: 1800
+{% endif %}
@@ -1,15 +1,6 @@
{% from 'vars/globals.map.jinja' import GLOBALS %}
{% from 'salt/auto_apply.map.jinja' import AUTOAPPLY %}
-highstate_schedule:
-  schedule.present:
-    - function: state.highstate
-    - hours: {{ AUTOAPPLY.highstate_interval_hours }}
-    - maxrunning: 1
-{% if not GLOBALS.is_manager %}
-    - splay: 1800
-{% endif %}
{% if GLOBALS.is_manager and AUTOAPPLY.enabled %}
push_drain_schedule:
schedule.present:
+2
@@ -0,0 +1,2 @@
{% import_yaml 'salt/defaults.yaml' as SALT_DEFAULTS %}
{% set SCHEDULEMERGED = salt['pillar.get']('salt:schedule', SALT_DEFAULTS.salt.schedule, merge=True) %}
+7 -6
@@ -5,12 +5,6 @@ salt:
forcedType: bool
helpLink: push
global: True
-    highstate_interval_hours:
-      description: How often every minion in the grid runs a scheduled state.highstate, in hours. Lower values keep minions closer in sync at the cost of more load; higher values reduce load but increase worst-case latency for non-pushed changes. The salt-minion health check restarts a minion if its last highstate is older than this value plus one hour.
-      forcedType: int
-      helpLink: push
-      global: True
-      advanced: True
debounce_seconds:
description: Trailing-edge debounce window in seconds. A push intent must be quiet for this long before the drainer dispatches. Rapid bursts of edits within this window coalesce into one dispatch.
forcedType: int
@@ -36,3 +30,10 @@ salt:
helpLink: push
global: True
advanced: True
+  schedule:
+    highstate_interval_hours:
+      description: How often every minion in the grid runs a scheduled state.highstate, in hours. Lower values keep minions closer in sync at the cost of more load; higher values reduce load but increase worst-case latency for non-pushed changes. The salt-minion health check restarts a minion if its last highstate is older than this value plus one hour.
+      forcedType: int
+      helpLink: push
+      global: True
+      advanced: True
+2 -2
@@ -19,7 +19,7 @@ base:
- repo.client
- versionlock
- ntp
-    - schedule
+    - salt.highstate_schedule
- logrotate
# manager node on proper salt version with empty node_data pillar
@@ -55,6 +55,7 @@ base:
- motd
- salt.minion-check
- salt.lasthighstate
+    - salt.push_drain_schedule
- common
- docker
- docker_clean
@@ -300,7 +301,6 @@ base:
- nginx
- elasticfleet
- elasticfleet.install_agent_grid
-    - schedule
- stig
'*_hypervisor and I@features:vrt and G@saltversion:{{saltversion}}':