salt's pip.installed flagged so_pillar_psycopg2_in_salt_python as
failed because pip exits non-zero when it can't find the patchelf
binary to rewrite the psycopg2 wheel's RPATH after extraction. The
wheel is fully installed and importable regardless — the patchelf
step is a cosmetic post-install rewrite, not a build dependency. But
salt's failure cascade then short-circuited so_pillar_initial_import
and the so-yaml mode flip, leaving the install in dual-pillar mode
instead of PG-canonical.
Replaced it with a cmd.run state that runs pip with `|| true` and uses
an `import psycopg2` check as the actual readiness gate — the same
approach salt's own bootstrap takes. Also fixed the require: ref on
so_pillar_initial_import (it was `pip:`, and must be `cmd:` for the new
state type).
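A minimal sketch of the replacement, reusing the state IDs from the failure above; the pip package name and exact arguments are assumptions:

```yaml
so_pillar_psycopg2_in_salt_python:
  cmd.run:
    - name: |
        /opt/saltstack/salt/bin/python3 -m pip install psycopg2 || true
        /opt/saltstack/salt/bin/python3 -c 'import psycopg2'

so_pillar_initial_import:
  cmd.run:
    - name: so-pillar-import
    - require:
      # was `- pip: ...` before the state-type change
      - cmd: so_pillar_psycopg2_in_salt_python
```

The `|| true` swallows the patchelf exit code; the `-c 'import psycopg2'` line is what actually decides pass/fail.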
Supersedes the pre-install placement (right after secrets_pillar) from
the previous commit, which was broken: salt's ext_pillar overlay
shadowed disk pillar's elasticsearch subtree before so-pillar-import
had populated PG, so elasticsearch.enabled.sls failed rendering on
ELASTICSEARCHMERGED.auth.users.so_elastic_user.pass — that key lives
in elasticsearch/auth.sls, which is on the importer's secrets
allowlist and never makes it into so_pillar.pillar_entry. The install
would then hang forever waiting for the elasticsearch container that
the broken state never deployed.
The new placement is right after the final state.highstate completes:
1. drop adv_postgres.sls, flipping the flag to True
2. salt-call saltutil.refresh_pillar so the next state sees it
3. salt-call state.apply postgres.schema_pillar — deploys schema,
ALTERs role login passwords, installs psycopg2 into salt's
bundled python, runs so-pillar-import, writes
/opt/so/conf/so-yaml/mode=postgres
4. salt-call state.apply salt.master — re-renders engines.conf
with the pg_notify_pillar engine block, drops master.d
ext_pillar config, watch_in restarts salt-master and ext_pillar
takes over
verify_setup runs after this so its final checks see PG-canonical
mode in place. Same end state as the previous commit's intent, just
without the bootstrap chicken-and-egg.
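Step 1's flag file is a one-key pillar override; a sketch of the assumed body (the key path matches the postgres:so_pillar:enabled flag used throughout this series):

```yaml
# /opt/so/saltstack/local/pillar/postgres/adv_postgres.sls
postgres:
  so_pillar:
    enabled: True
```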
Drops a local pillar override (postgres.so_pillar.enabled = True) right
after secrets_pillar so the install-time highstate brings up
schema_pillar, ext_pillar_postgres, and the pg_notify_pillar engine
without operator intervention. Without this the whole PG-canonical
stack stays gated off on the default-False flag and the install lands
in legacy disk-pillar mode — which defeats the point of being on the
postsalt branch at all.
The new enable_so_pillar_postgres() function in so-functions is
idempotent (overwrites adv_postgres.sls with a fixed body) and the
generated file is mode 0644 socore:socore so it merges into pillar
under the existing local-pillar directory ownership convention.
Rollback path: edit /opt/so/saltstack/local/pillar/postgres/adv_postgres.sls
to set enabled: False, or delete the file. The schema and engine
config states will tear themselves down on the next highstate via
their existing else-branch absent states.
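A sketch of what enable_so_pillar_postgres() in so-functions might look like, given the idempotency and mode requirements above; the directory parameter is added here for illustration, and the chown is omitted:

```shell
enable_so_pillar_postgres() {
  # Parameter is for illustration; the real helper writes under
  # /opt/so/saltstack/local/pillar/postgres.
  local pillar_dir="${1:-/opt/so/saltstack/local/pillar/postgres}"
  mkdir -p "$pillar_dir"
  # Unconditionally overwrite with a fixed body: re-running is a no-op.
  cat > "$pillar_dir/adv_postgres.sls" <<'EOF'
postgres:
  so_pillar:
    enabled: True
EOF
  chmod 0644 "$pillar_dir/adv_postgres.sls"
  # The real helper also chowns the file socore:socore.
}
```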
Five blockers turned up the first time the so_pillar schema was applied
against a fresh standalone install. Fixing them in order:
1. 006_rls.sql ordering bug
006 issued GRANTs on so_pillar.change_queue and its sequence, but the
table isn't created until 008_change_notify.sql. 006 errored mid-file with
"relation so_pillar.change_queue does not exist", short-circuiting the
rest of the pillar staging chain. Moved the three change_queue grants
into 008 alongside the table creation so each file is self-contained.
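Sketched shape of 008 after the move; the column list and grant verbs are assumptions, the point is only that the file is self-contained:

```sql
-- 008_change_notify.sql: table and its grants travel together.
CREATE TABLE IF NOT EXISTS so_pillar.change_queue (
    id      bigserial PRIMARY KEY,
    locator text NOT NULL,
    queued  timestamptz NOT NULL DEFAULT now()
);
GRANT SELECT, INSERT, UPDATE, DELETE ON so_pillar.change_queue TO so_pillar_master;
GRANT USAGE ON SEQUENCE so_pillar.change_queue_id_seq TO so_pillar_master;
```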
2. so_pillar_* roles unable to log in
006 created the roles as NOLOGIN and set no password. Salt-master's
ext_pillar (postgres) and the pg_notify_pillar engine both connect as
so_pillar_master via TCP, so both came up with "password authentication
failed for user so_pillar_master". Added a templated cmd.run step in
schema_pillar.sls (so_pillar_role_login_passwords) that ALTERs all three
roles WITH LOGIN PASSWORD pulling from secrets:pillar_master_pass — the
same password ext_pillar_postgres.conf.jinja and the engines.conf
pg_notify_pillar block render with.
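A sketch of the templated step for one of the three roles; the docker-exec psql wrapper is an assumption:

```yaml
so_pillar_role_login_passwords:
  cmd.run:
    - name: >
        docker exec so-postgres psql -U postgres -d securityonion -c
        "ALTER ROLE so_pillar_master WITH LOGIN PASSWORD
        '{{ salt['pillar.get']('secrets:pillar_master_pass') }}'"
```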
3. Missing GRANT CONNECT ON DATABASE securityonion
USAGE on the schema is granted in 006 but CONNECT on the database isn't.
Engine + ext_pillar succeeded auth then died with "permission denied
for database securityonion". Added the explicit GRANT CONNECT in 006.
4. psycopg2 missing from salt's bundled python
/opt/saltstack/salt/bin/python3 doesn't ship psycopg2 by default, so
when salt-master tries to load the pg_notify_pillar engine its
`import psycopg2` fails inside salt's loader and the engine silently
doesn't start (no error in the salt log — you only notice when nothing
ever drains so_pillar.change_queue). Added a pip.installed state in
schema_pillar.sls bound to that interpreter via bin_env.
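The added state, roughly (package name is an assumption; the interpreter binding via bin_env is from the text above):

```yaml
so_pillar_psycopg2_in_salt_python:
  pip.installed:
    - name: psycopg2
    - bin_env: /opt/saltstack/salt/bin/python3
```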
5. engines.conf vs pg_notify_pillar_engine.conf list-replace
Salt's master.d/*.conf merge replaces top-level lists rather than
concatenating them. The engine config used to live in its own
master.d/pg_notify_pillar_engine.conf with `engines: [pg_notify_pillar]`
alongside the legacy `engines.conf` carrying `engines: [checkmine,
pillarWatch]`. Whichever loaded last won, so the engine never showed
up in the loaded set even when the file existed. Fold the
pg_notify_pillar declaration into engines.conf (now jinja-rendered,
gated on postgres:so_pillar:enabled), drop the standalone state from
pg_notify_pillar_engine.sls, and delete the now-orphaned conf jinja.
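After the fold, engines.conf carries one authoritative list; a sketch of the jinja-rendered template (the gating expression is an assumption):

```yaml
engines:
  - checkmine
  - pillarWatch
{%- if salt['pillar.get']('postgres:so_pillar:enabled', False) %}
  - pg_notify_pillar
{%- endif %}
```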
End state validated against a live standalone-net install on the dev rig:
salt-master ext_pillar reads from so_pillar.* with no errors, the
pg_notify_pillar engine LISTENs on so_pillar_change and drains the
change_queue (134-row backlog → 0 within seconds), and a so-yaml replace
on a pillar key flows disk → PG → ext_pillar → salt pillar.get with the
new value visible after a saltutil.refresh_pillar.
Same `sls.split('.')[0]` pattern as ext_pillar_postgres + pg_notify_pillar_engine.
For sls='postgres.schema_pillar' the split happened to evaluate to
'postgres', which is in manager_states, so the guard worked accidentally —
but it would break silently if anyone ever moved the file under a deeper
SLS path. Switch to a literal `{% if 'postgres' in allowed_states %}` for
the same intent-revealing pattern as the master.d guards.
Both SLS files used `sls.split('.')[0]` to derive what to look up in
allowed_states. For these files (sls='salt.master.ext_pillar_postgres'
and sls='salt.master.pg_notify_pillar_engine') that returns 'salt',
which is never in any role's allowed_states list — only specific keys
like 'salt.master', 'salt.minion', 'salt.cloud' are. The guard's else
branch fired on every highstate, emitting two cosmetic

    ID: <sls>_state_not_allowed
    Function: test.fail_without_changes
    Comment: Failure!

entries that polluted the so-setup error summary even on green installs.
Both states drop config under /etc/salt/master.d/ and watch_in the
salt-master service, so the natural intent is "only run when this node
hosts the salt master". Switching the guard to a literal
{% if 'salt.master' in allowed_states %}
expresses that directly without string-parsing the SLS path, and
matches the existing membership in manager_states (which is in turn
included in every manager-bearing role: so-eval, so-manager,
so-managerhype, so-managersearch, so-standalone, so-import).
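Before/after of the guard, sketched:

```jinja
{# Before: sls='salt.master.ext_pillar_postgres' → 'salt', never in
   allowed_states, so the else branch always fires. #}
{% if sls.split('.')[0] in allowed_states %}

{# After: says "only on salt-master hosts" directly. #}
{% if 'salt.master' in allowed_states %}
```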
The unscoped `umask 077` on postsalt's secrets_pillar path leaked into
every subsequent file write by so-setup (and the salt-call processes
it spawned) for the rest of the install. Every state-rendered config
file under /opt/so/conf landed at mode 0600 instead of 0644, which
broke any container that bind-mounts its config read-only and runs as
a non-root user after the entrypoint's gosu drop. The first concrete
casualty was the influxdb container, which exits with
"failed to load config file: open /conf/config.yaml: permission denied"
after init mode completes and re-execs as the influxdb user.
The chmod 0400 immediately after the printf already enforces the
intended file mode, so the umask was redundant for the key file
itself; scoping it to a subshell preserves the defense-in-depth
between the printf and the chmod without polluting the parent shell.
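A sketch of the scoped write; the function name and key path are illustrative:

```shell
write_pillar_key() {
  local keyfile="$1" secret="$2"
  # The subshell confines the umask: the file is born 0600 even in the
  # window before chmod, and the parent shell's umask never changes.
  (
    umask 077
    printf '%s\n' "$secret" > "$keyfile"
    chmod 0400 "$keyfile"
  )
}
```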
Two coupled changes that together let so_pillar.* be the canonical
config store, with config edits driving service reloads automatically:
so-yaml PG-canonical mode
- Adds /opt/so/conf/so-yaml/mode (and SO_YAML_BACKEND env override) with
three values: dual (legacy), postgres (PG-only for managed paths),
disk (emergency rollback). Bootstrap files (secrets.sls, ca/init.sls,
*.nodes.sls, top.sls, ...) stay disk-only regardless via the existing
SkipPath allowlist in so_yaml_postgres.locate.
- loadYaml/writeYaml/purgeFile now route to so_pillar.* in postgres
mode: replace/add/get all read+write the database with no disk file
ever appearing. PG failure is fatal in postgres mode (no silent
fallback); dual mode preserves the prior best-effort mirror.
- so_yaml_postgres gains read_yaml(path), is_pg_managed(path), and
is_enabled() so so-yaml can answer "is this path PG-managed and is
PG up" without reaching into private helpers.
- schema_pillar.sls writes /opt/so/conf/so-yaml/mode = postgres after
the importer succeeds, so flipping postgres:so_pillar:enabled flips
so-yaml's behavior in lockstep with the schema being live.
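The resolution order implied above (env override, then mode file, then legacy default) could look like this; function and constant names are illustrative:

```python
import os

MODE_FILE = "/opt/so/conf/so-yaml/mode"
VALID_MODES = {"dual", "postgres", "disk"}

def resolve_backend(env=None, mode_file=MODE_FILE):
    """Pick the so-yaml backend: SO_YAML_BACKEND beats the mode file,
    and anything missing or unrecognized falls back to legacy dual."""
    env = os.environ if env is None else env
    override = env.get("SO_YAML_BACKEND")
    if override in VALID_MODES:
        return override
    try:
        with open(mode_file) as fh:
            mode = fh.read().strip()
        if mode in VALID_MODES:
            return mode
    except OSError:
        pass
    return "dual"
```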
pg_notify-driven change fan-out
- 008_change_notify.sql adds so_pillar.change_queue + an AFTER trigger
on pillar_entry that enqueues the locator and pg_notifies
'so_pillar_change'. Queue is drained at-least-once so engine restarts
don't lose events; pg_notify is just the wakeup signal.
- New salt-master engine pg_notify_pillar.py LISTENs on the channel,
drains the queue with FOR UPDATE SKIP LOCKED, debounces bursts, and
fires 'so/pillar/changed' events grouped by (scope, role, minion).
- Reactor so_pillar_changed.sls catches the tag and dispatches to
orch.so_pillar_reload, which carries a DISPATCH map of pillar-path
prefix -> (state sls, role grain set) so adding a new service to
the auto-reload list is a one-line edit instead of a new reactor.
- Engine + reactor wiring is gated on the same postgres:so_pillar:enabled
flag as the schema and ext_pillar config so the whole stack flips
on/off together.
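One drain cycle, sketched; column names and batch size are assumptions, and $1 stands for the highest id just processed:

```sql
BEGIN;
SELECT id, locator
  FROM so_pillar.change_queue
 ORDER BY id
   FOR UPDATE SKIP LOCKED
 LIMIT 500;
-- ...debounce, group by (scope, role, minion), fire so/pillar/changed...
DELETE FROM so_pillar.change_queue WHERE id <= $1;
COMMIT;
-- A crash before COMMIT leaves the rows queued: at-least-once by construction.
```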
Tests: 21 new cases (112 total, all passing) covering mode resolution,
PG-managed detection, and PG-canonical read/write/purge routing with
the PG client stubbed.
Hooks every so-yaml.py write through a new so_yaml_postgres helper that
mirrors disk YAML mutations into so_pillar.pillar_entry via docker exec
psql. Disk remains canonical during the transition; PG mirror failures
are logged only when a real write error occurs (skipped paths and
postgres-unreachable cases stay silent so existing callers don't see
new noise on stderr).
Adds a `purge YAML_FILE` verb on so-yaml that deletes the file from
disk and removes the matching pillar_entry rows. For minion files it
also drops the so_pillar.minion row, which CASCADEs to pillar_entry +
role_member. Designed for so-minion's delete path (replaces rm -f) so
the audit log captures the deletion.
setup/so-functions::generate_passwords + secrets_pillar generate
secrets:pillar_master_pass and /opt/so/conf/postgres/so_pillar.key on
fresh installs, and append the password to existing secrets.sls files
on upgrade.
- salt/manager/tools/sbin/so_yaml_postgres.py: locate(), write_yaml(),
purge_yaml(), and a small CLI for diagnostics. Skips bootstrap and
mine-driven paths via the same allowlist used by so-pillar-import.
- salt/manager/tools/sbin/so-yaml.py: import the helper, hook
writeYaml() to mirror after every disk write, add purgeFile() and
the purge verb.
- salt/manager/tools/sbin/so-yaml_test.py: 16 new tests covering the
purge verb and the path-locator / write contract of so_yaml_postgres
without contacting Postgres. All 91 tests pass.
- setup/so-functions: generate_passwords adds PILLARMASTERPASS and
SO_PILLAR_KEY; secrets_pillar writes pillar_master_pass and the
pgcrypto master key file.
Idempotent importer that schema_pillar.sls runs once at the end of the
postgres state on first install, and that so-minion can call per-minion
on add/delete. UPSERTs into so_pillar.pillar_entry; the audit trigger handles
versioning so re-runs without SLS edits produce no version bumps.
Connects via docker exec so-postgres psql, so no DSN config is required
at first-install time. Skips bootstrap files (secrets.sls, postgres/
auth.sls, etc.), mine-driven nodes.sls files, and any file containing
Jinja templates — those stay disk-authoritative and ext_pillar_first:
False means they render before the PG overlay.
Auto-syncs to /usr/sbin via the existing manager_sbin file.recurse.
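The per-entry UPSERT presumably takes a shape like this (column names and conflict target are assumptions); an IS DISTINCT FROM guard is one way no-change re-runs stay version-neutral:

```sql
INSERT INTO so_pillar.pillar_entry (scope, locator, value)
VALUES ($1, $2, $3::jsonb)
ON CONFLICT (scope, locator) DO UPDATE
   SET value = EXCLUDED.value
 WHERE pillar_entry.value IS DISTINCT FROM EXCLUDED.value;
```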
Lays the database-backed pillar foundation for the postsalt branch. Salt
continues to read on-disk SLS first; the new ext_pillar config overlays
values from the so_pillar.* schema in so-postgres.
- salt/postgres/files/schema/pillar/00{1..7}_*.sql: idempotent DDL for
scope/role/role_member/minion/pillar_entry/pillar_entry_history/
drift_log, secret pgcrypto helpers, RLS, pg_cron retention.
- salt/postgres/schema_pillar.sls: applies the SQL files inside the
so-postgres container after it's healthy, configures the master_key
GUC, and runs so-pillar-import once. Gated on
postgres:so_pillar:enabled feature flag (default false).
- salt/salt/master/ext_pillar_postgres.{sls,conf.jinja}: drops
/etc/salt/master.d/ext_pillar_postgres.conf with list-form ext_pillar
queries (global/role/minion/secrets) and ext_pillar_first: False so
bootstrap pillars on disk render before the PG overlay.
- salt/postgres/init.sls + salt/salt/master.sls: include the new states.
Both new state branches are guarded so a default install with the flag
off is a no-op.
The static defaults listed postgres only on each role's self-hostgroup,
leaving sensor/searchnode/heavynode/receiver/fleet/idh/desktop/hypervisor
hostgroups unable to reach the manager's so-postgres in distributed
grids. A dynamic block in firewall/map.jinja added postgres to those
hostgroups only when telegraf.output was switched to POSTGRES/BOTH,
which left postgres unreachable by default.
Add postgres statically across manager/managerhype/managersearch/
standalone, mirroring influxdb's placement in every hostgroup that
already lists influxdb, and drop the now-redundant telegraf-gated
dynamic block from firewall/map.jinja.
When so-postgres was wired in (868cd1187), the import role's firewall
defaults were missed while every other manager-class role (manager,
managerhype, managersearch, standalone, eval) had postgres added to
their DOCKER-USER manager-hostgroup portgroups. As a result, on a
fresh import install the so-postgres container starts but tcp/5432 is
dropped at DOCKER-USER, so soc/kratos/telegraf can't reach it.
Add postgres alongside the existing influxdb entry so import nodes
match the other roles.
The soc binary on 3/dev does not register a postgres module, so injecting
postgres into soc.config.server.modules makes soc abort at launch with
'Module does not exist: postgres'. The soc-side module is staged on
feature/postgres but is not landing this release. Drop the injection
until the module ships; salt/postgres state and pillars are unchanged.
The digest-pull logic was added to make `docker push` work for multi-arch
upstream tags. Now that the push step is `docker buildx imagetools create`
pinned to the gpg-verified RepoDigest, the registry-to-registry copy
handles single- and multi-arch sources without help. Reverts the pull
back to the original line and removes the unused PLATFORM_OS/_ARCH
detection.
Replaces `docker push` with a registry-to-registry copy. On Docker 29.x
with the containerd image store, `docker push` of a freshly-pulled image
hits a path that wraps single-platform manifests in a synthetic index
and then can't push the layers it claims to reference, producing
`NotFound: content digest ...` even when the image is fully present.
Keep the local `docker tag` so so-image-pull's `docker images | grep :5000`
existence check continues to work.
docker pull of a multi-arch tag on Docker 29.x leaves the local tag
pointing at the image index rather than the platform-specific manifest.
The subsequent docker push then tries to push every sub-manifest the
index references and fails on layers we never fetched.
Resolve the local-platform manifest digest from the upstream index via
docker buildx imagetools inspect, pull by that digest, and re-tag locally
to the canonical tag. The signing flow and the existing tag/push to the
embedded registry are unchanged.
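The re-tag flow boils down to pinning repo@digest; a sketch with placeholder values (the real digest comes from `docker buildx imagetools inspect`, not run here):

```shell
# Real flow (commands shown for context only):
#   docker buildx imagetools inspect "$TAG"   # pick the local-platform digest
#   docker pull "$REPO@$DIGEST"
#   docker tag  "$REPO@$DIGEST" "$TAG"
TAG="registry.example.com/so-nginx:2.4.x"   # placeholder tag
DIGEST="sha256:deadbeef"                    # placeholder (truncated) digest
REPO="${TAG%:*}"                            # strip the tag suffix
PINNED="${REPO}@${DIGEST}"
echo "$PINNED"
```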