Commit Graph

18385 Commits

Author SHA1 Message Date
Josh Patterson 12f4447875 Replace inotify rule-watch beacon with poll-based rules_db beacon
Salt's stock inotify beacon leaks one kernel inotify instance every time
the minion rebuilds the beacon loader's __context__ (the orphaned
pyinotify.Notifier is never stopped), accumulating against
fs.inotify.max_user_instances=128 until inotify_init() fails with EMFILE
and rule-change push detection silently stops. This is independent of
disable_during_state_run.

Add a custom poll-based beacon (salt/_beacons/rules_db.py) modeled on
pillar_db.py: it fingerprints the suricata/strelka rule dirs each interval
(relpath + mtime_ns + size, temp files excluded) against a per-dir
watermark, emitting an event only on change. It holds zero inotify
instances, so the leak is impossible, and it keeps firing during state
runs. Swap the inotify beacon config and reactor tag mappings accordingly;
the push_suricata/push_strelka reactors are unchanged (they read only
data['path']).
2026-06-26 15:40:32 -04:00
Josh Patterson da94788255 Move highstate_interval_hours to salt.schedule and split schedule.sls
highstate_interval_hours describes the per-minion highstate schedule, not the
active-push pipeline, so relocate it from salt.auto_apply to a new salt.schedule
settings subtree. Repoint so-salt-minion-check at the new pillar path (it had
been left on the stale global:push path) so its restart grace period tracks the
schedule again.

- Add salt.schedule.highstate_interval_hours to defaults.yaml/soc_salt.yaml and a
  side-effect-free salt/salt/schedule.map.jinja (SCHEDULEMERGED), matching the
  *MERGED map convention. Consumers read SCHEDULEMERGED.highstate_interval_hours.
- Split salt/schedule.sls into salt/salt/highstate_schedule.sls (every minion) and
  salt/salt/push_drain_schedule.sls (managers); update top.sls to apply the
  highstate schedule via '*' and the drainer schedule via the configured-manager
  block. Remove the now-empty schedule.sls aggregator.
- pillar_push_map.yaml and so-push-drainer: comment/doc updates only.
2026-06-26 10:51:57 -04:00
Josh Patterson fa2ae1b87f Merge remote-tracking branch 'origin/3/dev' into saltthangs 2026-06-25 11:45:03 -04:00
Josh Patterson 5bf9751adf do not disable during state run 2026-06-25 11:44:38 -04:00
Josh Patterson 3effdbc91e do not disable during state run 2026-06-25 11:36:52 -04:00
Jason Ertel 30312b93a6 Merge pull request #16008 from Security-Onion-Solutions/jertel/wip
support multiple capinfos versions
2026-06-25 10:19:56 -04:00
Jason Ertel a9c03e39bb support multiple capinfos versions 2026-06-25 09:32:08 -04:00
Josh Patterson 8836529496 Merge remote-tracking branch 'origin/3/dev' into saltthangs 2026-06-25 08:13:32 -04:00
Josh Patterson b09c3776b7 Point pillar_db beacon at securityonion database
The SOC postgres database was renamed so_soc -> securityonion (see
POSTGRES_DB in salt/postgres/enabled.sls and the SOC postgres config in
salt/soc/defaults.yaml). The pillar_db beacon still hardcoded so_soc, so
every poll failed with 'database "so_soc" does not exist' (rc=2),
silently disabling active-push detection of audit_settings changes.

Update DATABASE to 'securityonion' and refresh the now-stale so_soc
references in the beacon and push_pillar reactor comments.
2026-06-24 16:51:32 -04:00
Josh Patterson dfdb1fbaeb Move global.push config to salt.auto_apply
The active-push tunables (enabled, highstate_interval_hours, debounce_seconds,
drain_interval, batch, batch_wait) described how Salt auto-applies changes, not
general grid config, so relocate them from the global namespace to a new
salt.auto_apply settings module.

- Add salt/salt/{defaults.yaml,auto_apply.map.jinja,soc_salt.yaml,adv_salt.yaml}.
  auto_apply.map.jinja is a dedicated, side-effect-free merge map (the existing
  salt/salt/map.jinja dereferences pillar.host.mainint at import time).
- Remove the push blocks from salt/global/{defaults,soc_global}.yaml.
- Register salt.soc_salt/salt.adv_salt in pillar/top.sls; seed the local pillar
  stubs for fresh installs (make_some_dirs) and upgrades (ensure_salt_local_pillar
  in soup, wired into up_to_3.2.0).
- Repoint all consumers: GLOBALMERGED.push.* -> AUTOAPPLY.* (schedule, salt
  master, manager beacons, beacons_pushstate, orch.push_batch) and
  pillar.get('global:push...') -> 'salt:auto_apply...' (push reactors,
  so-push-drainer).
- Add a salt: fleetwide-highstate entry to pillar_push_map.yaml so edits keep
  applying immediately, matching the prior global-namespace behavior.
2026-06-24 15:17:48 -04:00
Dan Marr 4d34470b84 Merge pull request #16005 from triggerman86/triggerman-fix-root_check-so-soup
Fix premature fail_setup function call in so-setup
2026-06-24 13:41:53 -04:00
Josh Patterson 61aa963a2d Merge remote-tracking branch 'origin/3/dev' into saltthangs 2026-06-24 08:10:27 -04:00
Jorge Reyes 81c8d54589 Merge pull request #16006 from Security-Onion-Solutions/reyesj2-patch-5
remove heayvnode FleetServer_* directory creation, and skip empty dir…
2026-06-23 15:53:34 -05:00
reyesj2 4f3b57f495 remove duplicate package-upgrade attempts, upgrade only when reported latest version differs from installed version 2026-06-23 15:52:10 -05:00
reyesj2 84228a819b remove heayvnode FleetServer_* directory creation, and skip empty directories during FleetServer policy management 2026-06-23 15:30:49 -05:00
Dan Marr 81ebea0451 Fix non-root exit checks at start of so-setup 2026-06-23 16:07:30 -04:00
Josh Patterson d71e80cf66 Merge remote-tracking branch 'origin/3/dev' into saltthangs 2026-06-23 10:32:32 -04:00
Josh Patterson a9f9d8bd0d Merge pull request #15985 from Security-Onion-Solutions/soupmod2
allow manager two full highstates during soup, improve elastic script runtime
2026-06-22 17:02:02 -04:00
Jason Ertel 953fdee3af Merge pull request #15984 from Security-Onion-Solutions/jertel/wip
Upgrade registry
2026-06-22 16:56:18 -04:00
Jason Ertel e2e3e690ca reset version 2026-06-22 16:52:29 -04:00
Josh Patterson 323491f58e Merge pull request #15983 from Security-Onion-Solutions/reyesj2-jpp
wip
2026-06-22 16:52:10 -04:00
reyesj2 96fcc0ec38 wip 2026-06-22 14:25:46 -05:00
Jason Ertel bcc60a4ae0 kilo version 2026-06-22 13:07:49 -04:00
Jason Ertel b77103aa9f upgrade registry 2026-06-22 13:01:02 -04:00
Jorge Reyes 63a2e20698 Merge pull request #15982 from Security-Onion-Solutions/reyesj2/wip
don't create stack trace when set -e is disabled
2026-06-18 15:25:41 -05:00
reyesj2 22d5c96bd5 don't create stack trace when set -e is disabled 2026-06-18 14:56:29 -05:00
Mike Reeves 28fdd1eb6f Merge pull request #15970 from Security-Onion-Solutions/udev
Pin NIC names by MAC via udev (run-once) from the common state
2026-06-18 14:28:09 -04:00
Josh Patterson d0bea2ebcb Restore grouped per-integration logging and retry 409s in fleet integration loader
elastic_fleet_load_integrations_dir now buffers each concurrent job's
output (header + API response) to a per-job file and prints them in
submission order after wait, restoring the readable serial-style output
while keeping concurrent writes.

Add --retry-all-errors to the integration create/update curl calls so
transient 409 conflicts from concurrent writes to the same agent policy
are retried (curl --retry alone does not retry 409).
2026-06-18 11:19:36 -04:00
Josh Patterson 62c01a9756 Merge remote-tracking branch 'origin/3/dev' into soupmod2 2026-06-18 09:53:44 -04:00
Jorge Reyes b143e1e577 Merge pull request #15979 from Security-Onion-Solutions/reyesj2/wip
add context to soup errors
2026-06-17 16:47:49 -05:00
reyesj2 16149df71f formatting 2026-06-16 18:21:28 -05:00
reyesj2 6a18f35020 add context to soup errors and optional soup debug log with xtrace output 2026-06-16 18:21:28 -05:00
Jason Ertel aa58225e8f Merge pull request #15974 from Security-Onion-Solutions/jertel/wip
es|ql defaults
2026-06-16 14:27:54 -04:00
Josh Patterson 8e33d0e1e9 Merge remote-tracking branch 'origin/3/dev' into soupmod2 2026-06-16 12:54:18 -04:00
Jorge Reyes acf48db915 Merge pull request #15978 from Security-Onion-Solutions/reyesj2-patch-1
remove pillar merge
2026-06-16 11:17:56 -05:00
reyesj2 3daed551df use --fail flag without set -x, since elasticsearch can return a 404 on the template lookup 2026-06-16 11:17:04 -05:00
reyesj2 4456bde1c8 check if template exists without --fail flag 2026-06-16 10:45:53 -05:00
Jorge Reyes 4a6c675223 skip kibana backport if the template doesn't exist 2026-06-16 10:33:11 -05:00
reyesj2 a769d4c680 another unneeded default 2026-06-16 09:32:37 -05:00
reyesj2 f68e3e47a1 remove pillar merge 2026-06-16 09:19:10 -05:00
Jorge Reyes b81257bf45 Merge pull request #15973 from Security-Onion-Solutions/reyesj2/dlm-support
Data stream lifecycle management support
2026-06-15 14:47:51 -05:00
reyesj2 1a423a2434 update message 2026-06-15 14:17:34 -05:00
reyesj2 95cae4c734 remove so-elasticsearch-indices-delete cron when using DLM 2026-06-15 13:32:45 -05:00
reyesj2 596471e140 using new annotation config 2026-06-15 13:31:53 -05:00
reyesj2 d10f21399c remove comments 2026-06-15 13:31:23 -05:00
Jason Ertel ae1ddf3817 es|ql defaults 2026-06-15 12:33:08 -04:00
Josh Brower ea73216f4e Merge pull request #15971 from Security-Onion-Solutions/delta
userid vs names
2026-06-15 15:28:03 +02:00
Josh Patterson 1ee555957a Speed up so-elastic-fleet-integration-upgrade
Fetch each agent policy once and extract integration name/package/version/id
locally via a single jq pass instead of re-fetching the identical policy JSON
1+3N times. Memoize epm/packages latest-version lookups so each package is
queried once instead of per (policy, integration). Dispatch the per-integration
dry-run+upgrade as throttled background jobs (MAX_FLEET_JOBS) with
flock-serialized output and a FAIL_FILE marker, mirroring
elastic_fleet_load_integrations_dir.

Behavior preserved: same elastic-defend-endpoints/fleet_server skips, same
AUTO_UPGRADE_INTEGRATIONS default-package gating (moved into jq, using $defaults
to avoid the jq $def keyword collision), and exit 1 on any failure so salt
retries.
2026-06-12 15:23:43 -04:00
Josh Patterson 43f72c1f9f Parallelize so-elasticsearch-templates-load template PUTs
Load component and index templates as throttled background jobs (max 10
concurrent) instead of sequential curl PUTs, matching the bounded-concurrency
+ flock-serialized-output pattern used by the fleet/ILM load scripts. Keeps a
wait barrier between the component phase and the index phase so index
templates never load before their referenced component templates. Failures are
tracked via per-job marker files since counter increments can't escape
background subshells.
2026-06-12 15:11:34 -04:00
Josh Brower 9031c1fd22 userid vs names 2026-06-12 11:18:59 -04:00