Compare commits

..

5 Commits

Author SHA1 Message Date
Mike Reeves a433e9524d Move onionconfig writes out of so-yaml 2026-05-12 16:05:55 -04:00
Mike Reeves 3d11694d51 make so-yaml PG-canonical and add pillar-change reactor stack
Two coupled changes that together let so_pillar.* be the canonical
config store, with config edits driving service reloads automatically:

so-yaml PG-canonical mode
- Adds /opt/so/conf/so-yaml/mode (and SO_YAML_BACKEND env override) with
  three values: dual (legacy), postgres (PG-only for managed paths),
  disk (emergency rollback). Bootstrap files (secrets.sls, ca/init.sls,
  *.nodes.sls, top.sls, ...) stay disk-only regardless via the existing
  SkipPath allowlist in so_yaml_postgres.locate.
- loadYaml/writeYaml/purgeFile now route to so_pillar.* in postgres
  mode: replace/add/get all read+write the database with no disk file
  ever appearing. PG failure is fatal in postgres mode (no silent
  fallback); dual mode preserves the prior best-effort mirror.
- so_yaml_postgres gains read_yaml(path), is_pg_managed(path), and
  is_enabled() so so-yaml can answer "is this path PG-managed and is
  PG up" without reaching into private helpers.
- schema_pillar.sls writes /opt/so/conf/so-yaml/mode = postgres after
  the importer succeeds, so flipping postgres:so_pillar:enabled flips
  so-yaml's behavior in lockstep with the schema being live.

pg_notify-driven change fan-out
- 008_change_notify.sql adds so_pillar.change_queue + an AFTER trigger
  on pillar_entry that enqueues the locator and pg_notifies
  'so_pillar_change'. Queue is drained at-least-once so engine restarts
  don't lose events; pg_notify is just the wakeup signal.
- New salt-master engine pg_notify_pillar.py LISTENs on the channel,
  drains the queue with FOR UPDATE SKIP LOCKED, debounces bursts, and
  fires 'so/pillar/changed' events grouped by (scope, role, minion).
- Reactor so_pillar_changed.sls catches the tag and dispatches to
  orch.so_pillar_reload, which carries a DISPATCH map of pillar-path
  prefix -> (state sls, role grain set) so adding a new service to
  the auto-reload list is a one-line edit instead of a new reactor.
- Engine + reactor wiring is gated on the same postgres:so_pillar:enabled
  flag as the schema and ext_pillar config so the whole stack flips
  on/off together.

Tests: 21 new cases (112 total, all passing) covering mode resolution,
PG-managed detection, and PG-canonical read/write/purge routing with
the PG client stubbed.
2026-05-01 09:31:48 -04:00
Mike Reeves 23255f88e0 add so-yaml dual-write to so_pillar.* + purge verb
Hooks every so-yaml.py write through a new so_yaml_postgres helper that
mirrors disk YAML mutations into so_pillar.pillar_entry via docker exec
psql. Disk remains canonical during the transition; PG mirror failures
are logged only when a real write error occurs (skipped paths and
postgres-unreachable cases stay silent so existing callers don't see
new noise on stderr).

Adds a `purge YAML_FILE` verb on so-yaml that deletes the file from
disk and removes the matching pillar_entry rows. For minion files it
also drops the so_pillar.minion row, which CASCADEs to pillar_entry +
role_member. Designed for so-minion's delete path (replaces rm -f) so
the audit log captures the deletion.

setup/so-functions::generate_passwords + secrets_pillar generate
secrets:pillar_master_pass and /opt/so/conf/postgres/so_pillar.key on
fresh installs, and append the password to existing secrets.sls files
on upgrade.

- salt/manager/tools/sbin/so_yaml_postgres.py: locate(), write_yaml(),
  purge_yaml(), and a small CLI for diagnostics. Skips bootstrap and
  mine-driven paths via the same allowlist used by so-pillar-import.
- salt/manager/tools/sbin/so-yaml.py: import the helper, hook
  writeYaml() to mirror after every disk write, add purgeFile() and
  the purge verb.
- salt/manager/tools/sbin/so-yaml_test.py: 16 new tests covering the
  purge verb and the path-locator / write contract of so_yaml_postgres
  without contacting Postgres. All 91 tests pass.
- setup/so-functions: generate_passwords adds PILLARMASTERPASS and
  SO_PILLAR_KEY; secrets_pillar writes pillar_master_pass and the
  pgcrypto master key file.
2026-04-30 17:09:58 -04:00
Mike Reeves d30b52b327 add so-pillar-import — seeds so_pillar.* from on-disk pillar tree
Idempotent importer that schema_pillar.sls runs once at end of postgres
state on first install, and that so-minion can call per-minion on add /
delete. UPSERTs into so_pillar.pillar_entry; the audit trigger handles
versioning so re-runs without SLS edits produce no version bumps.

Connects via docker exec so-postgres psql, so no DSN config is required
at first-install time. Skips bootstrap files (secrets.sls, postgres/
auth.sls, etc.), mine-driven nodes.sls files, and any file containing
Jinja templates — those stay disk-authoritative and ext_pillar_first:
False means they render before the PG overlay.

Auto-syncs to /usr/sbin via the existing manager_sbin file.recurse.
2026-04-30 16:34:05 -04:00
Mike Reeves 3fad895d6a add so_pillar schema + ext_pillar wiring (postsalt foundation)
Lays the database-backed pillar foundation for the postsalt branch. Salt
continues to read on-disk SLS first; the new ext_pillar config overlays
values from the so_pillar.* schema in so-postgres.

- salt/postgres/files/schema/pillar/00{1..7}_*.sql: idempotent DDL for
  scope/role/role_member/minion/pillar_entry/pillar_entry_history/
  drift_log, secret pgcrypto helpers, RLS, pg_cron retention.
- salt/postgres/schema_pillar.sls: applies the SQL files inside the
  so-postgres container after it's healthy, configures the master_key
  GUC, and runs so-pillar-import once. Gated on
  postgres:so_pillar:enabled feature flag (default false).
- salt/salt/master/ext_pillar_postgres.{sls,conf.jinja}: drops
  /etc/salt/master.d/ext_pillar_postgres.conf with list-form ext_pillar
  queries (global/role/minion/secrets) and ext_pillar_first: False so
  bootstrap pillars on disk render before the PG overlay.
- salt/postgres/init.sls + salt/salt/master.sls: include the new states.

Both new state branches are guarded so a default install with the flag
off is a no-op.
2026-04-30 16:30:57 -04:00
132 changed files with 1176 additions and 4035 deletions
-1
View File
@@ -11,7 +11,6 @@ body:
-
- 3.0.0
- 3.1.0
- 3.2.0
- Other (please provide detail below)
validations:
required: true
+11 -11
View File
@@ -1,17 +1,17 @@
### 3.1.0-20260528 ISO image released on 2026/05/28
### 3.0.0-20260331 ISO image released on 2026/03/31
### Download and Verify
3.1.0-20260528 ISO image:
https://download.securityonion.net/file/securityonion/securityonion-3.1.0-20260528.iso
3.0.0-20260331 ISO image:
https://download.securityonion.net/file/securityonion/securityonion-3.0.0-20260331.iso
MD5: 9D6FF58DEEE24089D722C73169765B3E
SHA1: 2B8B816B6CEC3B7F96B3C5E040EBF502DD2C412F
SHA256: 62FAB57E247C843D6A04F0796D8162C732B65D82FC3E4A59D087135B9FD32912
MD5: ECD318A1662A6FDE0EF213F5A9BD4B07
SHA1: E55BE314440CCF3392DC0B06BC5E270B43176D9C
SHA256: 7FC47405E335CBE5C2B6C51FE7AC60248F35CBE504907B8B5A33822B23F8F4D5
Signature for ISO image:
https://github.com/Security-Onion-Solutions/securityonion/raw/3/main/sigs/securityonion-3.1.0-20260528.iso.sig
https://github.com/Security-Onion-Solutions/securityonion/raw/3/main/sigs/securityonion-3.0.0-20260331.iso.sig
Signing key:
https://raw.githubusercontent.com/Security-Onion-Solutions/securityonion/3/main/KEYS
@@ -25,22 +25,22 @@ wget https://raw.githubusercontent.com/Security-Onion-Solutions/securityonion/3/
Download the signature file for the ISO:
```
wget https://github.com/Security-Onion-Solutions/securityonion/raw/3/main/sigs/securityonion-3.1.0-20260528.iso.sig
wget https://github.com/Security-Onion-Solutions/securityonion/raw/3/main/sigs/securityonion-3.0.0-20260331.iso.sig
```
Download the ISO image:
```
wget https://download.securityonion.net/file/securityonion/securityonion-3.1.0-20260528.iso
wget https://download.securityonion.net/file/securityonion/securityonion-3.0.0-20260331.iso
```
Verify the downloaded ISO image using the signature file:
```
gpg --verify securityonion-3.1.0-20260528.iso.sig securityonion-3.1.0-20260528.iso
gpg --verify securityonion-3.0.0-20260331.iso.sig securityonion-3.0.0-20260331.iso
```
The output should show "Good signature" and the Primary key fingerprint should match what's shown below:
```
gpg: Signature made Wed 27 May 2026 03:03:59 PM EDT using RSA key ID FE507013
gpg: Signature made Mon 30 Mar 2026 06:22:14 PM EDT using RSA key ID FE507013
gpg: Good signature from "Security Onion Solutions, LLC <info@securityonionsolutions.com>"
gpg: WARNING: This key is not certified with a trusted signature!
gpg: There is no indication that the signature belongs to the owner.
-1
View File
@@ -1 +0,0 @@
+1 -1
View File
@@ -1 +1 @@
3.2.0
3.1.0
-142
View File
@@ -1,142 +0,0 @@
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
# Custom salt beacon that watches the SOC audit_settings table in postgres for
# new settings changes and emits a beacon event per new row. This replaces the
# inotify watch on /opt/so/saltstack/local/pillar -- instead of monitoring pillar
# files on disk, we monitor the so_soc.audit_settings table that SOC writes to.
#
# Detection is poll-based with a monotonic `id` watermark persisted to
# WATERMARK_FILE: each pass selects rows with id greater than the last id seen,
# which makes it self-healing (a missed poll simply catches up on the next one).
#
# Each emitted event carries setting_id and node_id; the push_pillar reactor maps
# setting_id -> app via pillar_push_map.yaml and writes a push intent, after which
# the existing so-push-drainer / orch.push_batch pipeline takes over unchanged.
import logging
import os
import subprocess
log = logging.getLogger(__name__)
WATERMARK_FILE = '/opt/so/state/pillar_db_watch.id'
CONTAINER = 'so-postgres'
DATABASE = 'so_soc'
# Unaligned, tuples-only psql output with a field separator that cannot appear in
# an id/setting_id/node_id, so we can split each row reliably.
FIELD_SEP = '\x1f'
def __virtual__():
return True
def validate(config):
return True, 'valid'
def _read_watermark():
# Returns the last processed id, or None if the watermark has not been seeded.
try:
with open(WATERMARK_FILE, 'r') as f:
return int((f.read() or '').strip())
except (IOError, ValueError):
return None
def _write_watermark(value):
try:
os.makedirs(os.path.dirname(WATERMARK_FILE), exist_ok=True)
tmp = WATERMARK_FILE + '.tmp'
with open(tmp, 'w') as f:
f.write(str(int(value)))
os.rename(tmp, WATERMARK_FILE)
except OSError:
log.exception('pillar_db beacon: failed to persist watermark to %s', WATERMARK_FILE)
def _query(sql):
# Run a query against so_soc inside the so-postgres container over the unix
# socket (trust auth, no password). Returns stdout on success, or None on any
# failure so the caller can no-op and retry on the next interval.
cmd = [
'docker', 'exec', CONTAINER,
'psql', '-U', 'postgres', '-d', DATABASE,
'-tA', '-F', FIELD_SEP, '-c', sql,
]
try:
result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
except subprocess.TimeoutExpired:
log.warning('pillar_db beacon: psql timed out')
return None
except Exception:
log.exception('pillar_db beacon: failed to exec psql')
return None
if result.returncode != 0:
log.warning('pillar_db beacon: psql failed (rc=%s): %s',
result.returncode, (result.stderr or '').strip())
return None
return result.stdout
def beacon(config):
retval = []
watermark = _read_watermark()
# First run / missing watermark: seed to the current MAX(id) and emit nothing
# so we never replay the entire settings history into a fleetwide push.
if watermark is None:
seed = _query('SELECT COALESCE(MAX(id), 0) FROM audit_settings;')
if seed is None:
return retval # postgres not ready yet; retry next interval
try:
_write_watermark(int((seed or '0').strip() or 0))
except ValueError:
log.warning('pillar_db beacon: could not parse MAX(id) seed: %r', seed)
return retval
rows = _query(
"SELECT id, setting_id, COALESCE(node_id, '') FROM audit_settings "
"WHERE id > %d ORDER BY id;" % watermark
)
if rows is None:
return retval
max_id = watermark
for line in rows.splitlines():
# Do NOT str.strip() the whole line: Python treats the \x1f field
# separator (and \x1c-\x1e) as whitespace, so stripping would eat an
# empty trailing node_id field and make the row look malformed.
if not line.strip():
continue
parts = line.split(FIELD_SEP)
if len(parts) < 3:
log.warning('pillar_db beacon: skipping malformed row: %r', line)
continue
try:
row_id = int(parts[0])
except ValueError:
log.warning('pillar_db beacon: skipping row with non-int id: %r', line)
continue
setting_id = parts[1]
node_id = parts[2]
retval.append({
'tag': 'audit_settings',
'id': row_id,
'setting_id': setting_id,
'node_id': node_id,
})
if row_id > max_id:
max_id = row_id
if max_id > watermark:
_write_watermark(max_id)
log.info('pillar_db beacon: emitted %d change(s), watermark %d -> %d',
len(retval), watermark, max_id)
return retval
@@ -25,11 +25,9 @@ if [ ! -f $BACKUPFILE ]; then
# Create empty backup file
tar -cf $BACKUPFILE -T /dev/null
# Loop through all paths defined in global.sls, and append them to backup file if they exist
# Loop through all paths defined in global.sls, and append them to backup file
{%- for LOCATION in BACKUPLOCATIONS %}
if [[ -d {{ LOCATION }} || -f {{ LOCATION }} ]]; then
tar -rf $BACKUPFILE "${EXCLUSIONS[@]}" {{ LOCATION }}
fi
{%- endfor %}
fi
+14
View File
@@ -48,6 +48,13 @@ copy_so-yaml_manager_tools_sbin:
- force: True
- preserve: True
copy_so-config_manager_tools_sbin:
file.copy:
- name: /opt/so/saltstack/default/salt/manager/tools/sbin/so-config.py
- source: {{UPDATE_DIR}}/salt/manager/tools/sbin/so-config.py
- force: True
- preserve: True
copy_so-repo-sync_manager_tools_sbin:
file.copy:
- name: /opt/so/saltstack/default/salt/manager/tools/sbin/so-repo-sync
@@ -97,6 +104,13 @@ copy_so-yaml_sbin:
- force: True
- preserve: True
copy_so-config_sbin:
file.copy:
- name: /usr/sbin/so-config.py
- source: {{UPDATE_DIR}}/salt/manager/tools/sbin/so-config.py
- force: True
- preserve: True
copy_so-repo-sync_sbin:
file.copy:
- name: /usr/sbin/so-repo-sync
+2 -15
View File
@@ -192,21 +192,8 @@ update_docker_containers() {
echo "Unable to tag $image" >> "$LOG_FILE" 2>&1
exit 1
}
# Push to the embedded registry via a registry-to-registry copy. Avoids
# `docker push`, which on Docker 29.x with the containerd image store
# represents freshly-pulled images as an index whose layer content
# isn't reachable through the push path. The local `docker tag` above
# is preserved so so-image-pull's `:5000` existence check still works.
# Pin to the digest already gpg-verified above so we copy exactly the
# bytes we approved.
local VERIFIED_REF
VERIFIED_REF=$(echo "$DOCKERINSPECT" | jq -r ".[0].RepoDigests[] | select(. | contains(\"$CONTAINER_REGISTRY\"))" | head -n 1)
if [ -z "$VERIFIED_REF" ] || [ "$VERIFIED_REF" = "null" ]; then
echo "Unable to determine verified digest for $image" >> "$LOG_FILE" 2>&1
exit 1
fi
docker buildx imagetools create --tag $HOSTNAME:5000/$IMAGEREPO/$image "$VERIFIED_REF" >> "$LOG_FILE" 2>&1 || {
echo "Unable to copy $image to embedded registry" >> "$LOG_FILE" 2>&1
docker push $HOSTNAME:5000/$IMAGEREPO/$image >> "$LOG_FILE" 2>&1 || {
echo "Unable to push $image" >> "$LOG_FILE" 2>&1
exit 1
}
fi
+1 -3
View File
@@ -165,8 +165,6 @@ if [[ $EXCLUDE_FALSE_POSITIVE_ERRORS == 'Y' ]]; then
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|upgrading component template" # false positive (elasticsearch index or template names contain 'error')
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|upgrading composable template" # false positive (elasticsearch composable template names contain 'error')
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|Error while parsing document for index \[.ds-logs-kratos-so-.*object mapping for \[file\]" # false positive (mapping error occuring BEFORE kratos index has rolled over in 2.4.210)
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|No such container" # false positive (telegraf trying to run stats on an old container)
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|passwords do not match" # false positive (automated hydra test)
fi
if [[ $EXCLUDE_KNOWN_ERRORS == 'Y' ]]; then
@@ -229,7 +227,7 @@ if [[ $EXCLUDE_KNOWN_ERRORS == 'Y' ]]; then
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|from NIC checksum offloading" # zeek reporter.log
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|marked for removal" # docker container getting recycled
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|tcp 127.0.0.1:6791: bind: address already in use" # so-elastic-fleet agent restarting. Seen starting w/ 8.18.8 https://github.com/elastic/kibana/issues/201459
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|TransformTask\] \[logs-(tychon|aws_billing|microsoft_defender_endpoint|armis|o365_metrics|microsoft_sentinel|snyk|cyera|island_browser).*user so_kibana lacks the required permissions \[(logs|metrics)-\1" # Known issue with integrations starting transform jobs that are explicitly not allowed to start as a system user. This error should not be seen on fresh ES 9.3.3 installs or after SO 3.1.0 with soups addition of check_transform_health_and_reauthorize()
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|TransformTask\] \[logs-(tychon|aws_billing|microsoft_defender_endpoint|armis|o365_metrics|microsoft_sentinel|snyk).*user so_kibana lacks the required permissions \[(logs|metrics)-\1" # Known issue with integrations starting transform jobs that are explicitly not allowed to start as a system user. (installed as so_elastic / so_kibana)
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|manifest unknown" # appears in so-dockerregistry log for so-tcpreplay following docker upgrade to 29.2.1-1
fi
@@ -1,3 +1,5 @@
{% import_yaml 'salt/minion.defaults.yaml' as SALT_MINION_DEFAULTS -%}
#!/bin/bash
#
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
@@ -23,8 +25,7 @@ SYSTEM_START_TIME=$(date -d "$(</proc/uptime awk '{print $1}') seconds ago" +%s)
LAST_HIGHSTATE_END=$([ -e "/opt/so/log/salt/lasthighstate" ] && date -r /opt/so/log/salt/lasthighstate +%s || echo 0)
LAST_HEALTHCHECK_STATE_APPLY=$([ -e "/opt/so/log/salt/state-apply-test" ] && date -r /opt/so/log/salt/state-apply-test +%s || echo 0)
# SETTING THRESHOLD TO ANYTHING UNDER 600 seconds may cause a lot of salt-minion restarts since the job to touch the file occurs every 5-8 minutes by default
# THRESHOLD is derived from the global push highstate interval + 1 hour, so the minion-check grace period tracks the schedule automatically.
THRESHOLD=$(( ({{ salt['pillar.get']('global:push:highstate_interval_hours', 2) }} + 1) * 3600 )) #within how many seconds the file /opt/so/log/salt/state-apply-test must have been touched/modified before the salt minion is restarted
THRESHOLD={{SALT_MINION_DEFAULTS.salt.minion.check_threshold}} #within how many seconds the file /opt/so/log/salt/state-apply-test must have been touched/modified before the salt minion is restarted
THRESHOLD_DATE=$((LAST_HEALTHCHECK_STATE_APPLY+THRESHOLD))
logCmd() {
+1 -2
View File
@@ -9,8 +9,7 @@
prune_images:
cmd.run:
- name: so-docker-prune
- onlyif: command -v /usr/sbin/so-docker-prune >/dev/null 2>&1
- order: 9000
- order: last
{% else %}
-1
View File
@@ -19,7 +19,6 @@ wait_for_elasticsearch:
so-elastalert:
docker_container.running:
- image: {{ GLOBALS.registry_host }}:5000/{{ GLOBALS.image_repo }}/so-elastalert:{{ GLOBALS.so_version }}
- restart_policy: unless-stopped
- hostname: elastalert
- name: so-elastalert
- user: so-elastalert
@@ -15,7 +15,6 @@ include:
so-elastic-fleet-package-registry:
docker_container.running:
- image: {{ GLOBALS.registry_host }}:5000/{{ GLOBALS.image_repo }}/so-elastic-fleet-package-registry:{{ GLOBALS.so_version }}
- restart_policy: unless-stopped
- name: so-elastic-fleet-package-registry
- hostname: Fleet-package-reg-{{ GLOBALS.hostname }}
- detach: True
@@ -52,16 +51,6 @@ so-elastic-fleet-package-registry:
- {{ ULIMIT.name }}={{ ULIMIT.soft }}:{{ ULIMIT.hard }}
{% endfor %}
{% endif %}
wait_for_so-elastic-fleet-package-registry:
http.wait_for_successful_query:
- name: "http://localhost:8080/health"
- status: 200
- wait_for: 300
- request_interval: 15
- require:
- docker_container: so-elastic-fleet-package-registry
delete_so-elastic-fleet-package-registry_so-status.disabled:
file.uncomment:
- name: /opt/so/conf/so-status/so-status.conf
-1
View File
@@ -16,7 +16,6 @@ include:
so-elastic-agent:
docker_container.running:
- image: {{ GLOBALS.registry_host }}:5000/{{ GLOBALS.image_repo }}/so-elastic-agent:{{ GLOBALS.so_version }}
- restart_policy: unless-stopped
- name: so-elastic-agent
- hostname: {{ GLOBALS.hostname }}
- detach: True
-3
View File
@@ -26,9 +26,7 @@ include:
wait_for_elasticsearch_elasticfleet:
cmd.run:
- name: so-elasticsearch-wait
{% endif %}
{% if GLOBALS.role == "so-fleet" %}
# Sync Elastic Agent artifacts to Fleet Node
elasticagent_syncartifacts:
file.recurse:
@@ -42,7 +40,6 @@ elasticagent_syncartifacts:
so-elastic-fleet:
docker_container.running:
- image: {{ GLOBALS.registry_host }}:5000/{{ GLOBALS.image_repo }}/so-elastic-agent:{{ GLOBALS.so_version }}
- restart_policy: unless-stopped
- name: so-elastic-fleet
- hostname: FleetServer-{{ GLOBALS.hostname }}
- detach: True
+12 -3
View File
@@ -11,14 +11,24 @@ include:
- elasticfleet.config
# If enabled, automatically update Fleet Logstash Outputs
{% if ELASTICFLEETMERGED.config.server.enable_auto_configuration %}
{% if grains.role not in ['so-import', 'so-eval']%}
{% if ELASTICFLEETMERGED.config.server.enable_auto_configuration and grains.role not in ['so-import', 'so-eval'] %}
so-elastic-fleet-auto-configure-logstash-outputs:
cmd.run:
- name: /usr/sbin/so-elastic-fleet-outputs-update
- retry:
attempts: 4
interval: 30
{# Separate from above in order to catch elasticfleet-logstash.crt changes and force update to fleet output policy #}
so-elastic-fleet-auto-configure-logstash-outputs-force:
cmd.run:
- name: /usr/sbin/so-elastic-fleet-outputs-update --certs
- retry:
attempts: 4
interval: 30
- onchanges:
- x509: etc_elasticfleet_logstash_crt
- x509: elasticfleet_kafka_crt
{% endif %}
# If enabled, automatically update Fleet Server URLs & ES Connection
@@ -28,7 +38,6 @@ so-elastic-fleet-auto-configure-server-urls:
- retry:
attempts: 4
interval: 30
{% endif %}
# Automatically update Fleet Server Elasticsearch URLs & Agent Artifact URLs
so-elastic-fleet-auto-configure-elasticsearch-urls:
@@ -240,7 +240,7 @@ elastic_fleet_policy_create() {
--arg DESC "$DESC" \
--arg TIMEOUT $TIMEOUT \
--arg FLEETSERVER "$FLEETSERVER" \
'{"name": $NAME,"id":$NAME,"description":$DESC,"namespace":"default","monitoring_enabled":["logs"],"inactivity_timeout":$TIMEOUT,"has_fleet_server":$FLEETSERVER,"advanced_settings":{"agent_logging_level": "warning"}}'
'{"name": $NAME,"id":$NAME,"description":$DESC,"namespace":"default","monitoring_enabled":["logs"],"inactivity_timeout":$TIMEOUT,"has_fleet_server":$FLEETSERVER}'
)
# Create Fleet Policy
if ! fleet_api "agent_policies" -XPOST -H 'kbn-xsrf: true' -H 'Content-Type: application/json' -d "$JSON_STRING"; then
@@ -235,16 +235,6 @@ function update_kafka_outputs() {
{% endif %}
# Compare the current Elastic Fleet certificate against what is on disk
POLICY_CERT_SHA=$(jq -r '.item.ssl.certificate' <<< $RAW_JSON | openssl x509 -noout -sha256 -fingerprint)
DISK_CERT_SHA=$(openssl x509 -in /etc/pki/elasticfleet-logstash.crt -noout -sha256 -fingerprint)
if [[ "$POLICY_CERT_SHA" != "$DISK_CERT_SHA" ]]; then
printf "Certificate on disk doesn't match certificate in policy - forcing update\n"
UPDATE_CERTS=true
FORCE_UPDATE=true
fi
# Sort & hash the new list of Logstash Outputs
NEW_LIST_JSON=$(jq --compact-output --null-input '$ARGS.positional' --args -- "${NEW_LIST[@]}")
NEW_HASH=$(sha256sum <<< "$NEW_LIST_JSON" | awk '{print $1}')
@@ -232,6 +232,7 @@ printf '%s\n'\
" grid_enrollment_general: '$GRIDNODESENROLLMENTOKENGENERAL'"\
" grid_enrollment_heavy: '$GRIDNODESENROLLMENTOKENHEAVY'"\
"" >> "$pillar_file"
/usr/sbin/so-config.py import-file "$pillar_file" --note "so-elastic-fleet-setup"
#Store Grid Nodes Enrollment token in Global pillar
global_pillar_file=/opt/so/saltstack/local/pillar/global/soc_global.sls
@@ -239,6 +240,7 @@ printf '%s\n'\
" fleet_grid_enrollment_token_general: '$GRIDNODESENROLLMENTOKENGENERAL'"\
" fleet_grid_enrollment_token_heavy: '$GRIDNODESENROLLMENTOKENHEAVY'"\
"" >> "$global_pillar_file"
/usr/sbin/so-config.py import-file "$global_pillar_file" --note "so-elastic-fleet-setup"
# Call Elastic-Fleet Salt State
printf "\nApplying elasticfleet state"
+1 -18
View File
@@ -9,12 +9,9 @@
{% from 'elasticsearch/config.map.jinja' import ELASTICSEARCHMERGED %}
{% from 'elasticsearch/template.map.jinja' import ES_INDEX_SETTINGS, SO_MANAGED_INDICES %}
{% if GLOBALS.role != 'so-heavynode' %}
{% from 'elasticsearch/template.map.jinja' import ALL_ADDON_SETTINGS, ADDON_INDICES %}
{% from 'elasticsearch/template.map.jinja' import ALL_ADDON_SETTINGS %}
{% endif %}
include:
- elasticsearch.enabled
escomponenttemplates:
file.recurse:
- name: /opt/so/conf/elasticsearch/templates/component
@@ -38,20 +35,6 @@ so_index_template_dir:
{%- endfor %}
{%- endif %}
{% if GLOBALS.role != "so-heavynode" %}
# Clean up legacy and non-SO managed templates from the elasticsearch/templates/addon-index/ directory
addon_index_template_dir:
file.directory:
- name: /opt/so/conf/elasticsearch/templates/addon-index
- clean: True
{%- if ADDON_INDICES %}
- require:
{%- for index in ADDON_INDICES %}
- file: addon_index_template_{{index}}
{%- endfor %}
{%- endif %}
{% endif %}
# Auto-generate index templates for SO managed indices (directly defined in elasticsearch/defaults.yaml)
# These index templates are for the core SO datasets and are always required
{% for index, settings in ES_INDEX_SETTINGS.items() %}
+1 -4
View File
@@ -3958,13 +3958,10 @@ elasticsearch:
- vulnerability-mappings
- common-settings
- common-dynamic-mappings
- logs-redis.log@package
- logs-redis.log@custom
data_stream:
allow_custom_routing: false
hidden: false
ignore_missing_component_templates:
- logs-redis.log@custom
ignore_missing_component_templates: []
index_patterns:
- logs-redis.log*
priority: 501
-1
View File
@@ -24,7 +24,6 @@ include:
so-elasticsearch:
docker_container.running:
- image: {{ GLOBALS.registry_host }}:5000/{{ GLOBALS.image_repo }}/so-elasticsearch:{{ ELASTICSEARCHMERGED.version }}
- restart_policy: unless-stopped
- hostname: elasticsearch
- name: so-elasticsearch
- user: elasticsearch
+1 -2
View File
@@ -63,8 +63,7 @@
{ "set": { "if": "ctx.event?.dataset != null && !ctx.event.dataset.contains('.')", "field": "event.dataset", "value": "{{event.module}}.{{event.dataset}}" } },
{ "split": { "if": "ctx.event?.dataset != null && ctx.event.dataset.contains('.')", "field": "event.dataset", "separator": "\\.", "target_field": "dataset_tag_temp" } },
{ "append": { "if": "ctx.dataset_tag_temp != null", "field": "tags", "value": "{{dataset_tag_temp.1}}" } },
{ "grok": { "if": "ctx.http?.response?.status_code instanceof String", "field": "http.response.status_code", "patterns": ["%{NUMBER:http.response.status_code:long}(?:\\s+%{GREEDYDATA})?"], "ignore_failure": true } },
{ "convert": { "if": "ctx.http?.response?.status_code != null && !(ctx.http.response.status_code instanceof Number)", "field": "http.response.status_code", "type": "long", "ignore_failure": true } },
{ "grok": { "if": "ctx.http?.response?.status_code != null", "field": "http.response.status_code", "patterns": ["%{NUMBER:http.response.status_code:long} %{GREEDYDATA}"]} },
{ "set": { "if": "ctx?.metadata?.kafka != null" , "field": "kafka.id", "value": "{{metadata.kafka.partition}}{{metadata.kafka.offset}}{{metadata.kafka.timestamp}}", "ignore_failure": true } },
{ "remove": { "field": [ "message2", "type", "fields", "category", "module", "dataset", "dataset_tag_temp", "event.dataset_temp" ], "ignore_missing": true, "ignore_failure": true } },
{ "pipeline": { "name": "global@custom", "ignore_missing_pipeline": true, "description": "[Fleet] Global pipeline for all data streams" } }
+1 -74
View File
@@ -177,84 +177,12 @@
"description": "Extract IPs from Elastic Agent events (host.ip) and adds them to related.ip"
}
},
{
"script": {
"description": "Snapshot event.ingested into _tmp.event_ingested_pre_fleet before .fleet_final_pipeline-1 overwrites it with ES ingest time",
"lang": "painless",
"if": "ctx.event?.ingested != null && ctx.event?.created == null",
"ignore_failure": true,
"source": "ctx.putIfAbsent('_tmp', [:]); ctx._tmp.event_ingested_pre_fleet = ctx.event.ingested;"
}
},
{
"pipeline": {
"name": ".fleet_final_pipeline-1",
"ignore_missing_pipeline": true
}
},
{
"script": {
"description": "Calculate time from Elastic Agent to Logstash.",
"lang": "painless",
"if": "ctx._tmp?.logstash_from_agent != null",
"ignore_failure": true,
"source": "ZonedDateTime start = ctx._tmp.event_ingested_pre_fleet != null ? ZonedDateTime.parse(ctx._tmp.event_ingested_pre_fleet) : ZonedDateTime.parse(ctx['@timestamp']); ctx.event.putIfAbsent('ingestion', [:]); ctx.event.ingestion.latency_elasticagent_to_logstash = ChronoUnit.SECONDS.between(start, ZonedDateTime.parse(ctx._tmp.logstash_from_agent));"
}
},
{
"script": {
"description": "Calculate time from Logstash to Redis",
"lang": "painless",
"if": "ctx._tmp?.logstash_from_agent != null && ctx._tmp?.logstash_to_redis != null",
"ignore_failure": true,
"source": "ctx.event.putIfAbsent('ingestion', [:]); ctx.event.ingestion.latency_logstash_to_redis = ChronoUnit.SECONDS.between(ZonedDateTime.parse(ctx._tmp.logstash_from_agent), ZonedDateTime.parse(ctx._tmp.logstash_to_redis));"
}
},
{
"script": {
"description": "Calculate time message spends in redis queue (logstash delay in pulling event).",
"lang": "painless",
"if": "ctx._tmp?.logstash_to_redis != null && ctx._tmp?.logstash_from_redis != null",
"ignore_failure": true,
"source": "ctx.event.putIfAbsent('ingestion', [:]); ctx.event.ingestion.latency_redis_to_logstash = ChronoUnit.SECONDS.between(ZonedDateTime.parse(ctx._tmp.logstash_to_redis), ZonedDateTime.parse(ctx._tmp.logstash_from_redis));"
}
},
{
"script": {
"description": "Calculate time from Logstash to Elasticsearch (after read from Redis).",
"lang": "painless",
"if": "ctx._tmp?.logstash_from_redis != null",
"ignore_failure": true,
"source": "ctx.event.putIfAbsent('ingestion', [:]); ctx.event.ingestion.latency_logstash_to_elasticsearch = ChronoUnit.SECONDS.between(ZonedDateTime.parse(ctx._tmp.logstash_from_redis), metadata().now);"
}
},
{
"script": {
"description": "Calculate time from Elastic Agent to Kafka.",
"lang": "painless",
"if": "ctx._tmp?.logstash_from_kafka != null && ctx._tmp?.logstash_from_agent == null",
"ignore_failure": true,
"source": "ZonedDateTime start = ctx._tmp.event_ingested_pre_fleet != null ? ZonedDateTime.parse(ctx._tmp.event_ingested_pre_fleet) : ZonedDateTime.parse(ctx['@timestamp']); ctx.event.putIfAbsent('ingestion', [:]); ctx.event.ingestion.latency_elasticagent_to_kafka = ChronoUnit.SECONDS.between(start, ZonedDateTime.parse(ctx._tmp.logstash_from_kafka));"
}
},
{
"script": {
"description": "Calculate time message spends in Kafka queue (logstash delay in pulling event).",
"lang": "painless",
"if": "ctx._tmp?.logstash_from_kafka != null && ctx.metadata?.kafka?.timestamp != null && ctx._tmp?.logstash_from_agent == null",
"ignore_failure": true,
"source": "ctx.event.putIfAbsent('ingestion', [:]); ctx.event.ingestion.latency_kafka_queue = ChronoUnit.SECONDS.between(ZonedDateTime.ofInstant(Instant.ofEpochMilli(Long.parseLong(ctx.metadata.kafka.timestamp.toString())), ZoneId.of('UTC')), ZonedDateTime.parse(ctx._tmp.logstash_from_kafka));"
}
},
{
"script": {
"description": "Calculate time from Logstash to Elasticsearch (after read from Kafka).",
"lang": "painless",
"if": "ctx._tmp?.logstash_from_kafka != null && ctx._tmp?.logstash_from_agent == null",
"ignore_failure": true,
"source": "ctx.event.putIfAbsent('ingestion', [:]); ctx.event.ingestion.latency_kafka_to_elasticsearch = ChronoUnit.SECONDS.between(ZonedDateTime.parse(ctx._tmp.logstash_from_kafka), metadata().now);"
}
},
{
"remove": {
"field": "event.agent_id_status",
@@ -274,8 +202,7 @@
"event.dataset_temp",
"dataset_tag_temp",
"module_temp",
"datastream_dataset_temp",
"_tmp"
"datastream_dataset_temp"
],
"ignore_missing": true,
"ignore_failure": true
-71
View File
@@ -1,71 +0,0 @@
{
"description": "zeek.ja4d",
"processors": [
{
"set": {
"field": "event.dataset",
"value": "ja4d"
}
},
{
"remove": {
"field": [
"host"
],
"ignore_failure": true
}
},
{
"json": {
"field": "message",
"target_field": "message2",
"ignore_failure": true
}
},
{
"rename": {
"field": "message2.ja4d",
"target_field": "hash.ja4d",
"ignore_missing": true,
"if": "ctx?.message2?.ja4d != null && ctx.message2.ja4d.length() > 0"
}
},
{
"rename": {
"field": "message2.client_mac",
"target_field": "host.mac",
"ignore_missing": true,
"if": "ctx?.message2?.client_mac != null && ctx.message2.client_mac.length() > 0"
}
},
{
"rename": {
"field": "message2.hostname",
"target_field": "host.hostname",
"ignore_missing": true,
"if": "ctx?.message2?.hostname != null && ctx.message2.hostname.length() > 0"
}
},
{
"rename": {
"field": "message2.requested_ip",
"target_field": "dhcp.requested_address",
"ignore_missing": true,
"if": "ctx?.message2?.requested_ip != null && ctx.message2.requested_ip.length() > 0"
}
},
{
"rename": {
"field": "message2.vendor_class_id",
"target_field": "zeek.ja4d.vendor_class_id",
"ignore_missing": true,
"if": "ctx?.message2?.vendor_class_id != null && ctx.message2.vendor_class_id.length() > 0"
}
},
{
"pipeline": {
"name": "zeek.common"
}
}
]
}
+3 -22
View File
@@ -61,25 +61,15 @@
{% if ALL_ADDON_SETTINGS_ORIG.keys() | length > 0 %}
{% for index in ALL_ADDON_SETTINGS_ORIG.keys() %}
{% do ALL_ADDON_SETTINGS_GLOBAL_OVERRIDES.update({index: salt['defaults.merge'](ALL_ADDON_SETTINGS_ORIG[index], PILLAR_GLOBAL_OVERRIDES, in_place=False)}) %}
{# Explicitly excluding addon indices from ES_INDEX_SETTINGS_ORIG
When manager.soc_managed_annotations runs, new entries are added to the salt/elasticsearch/defaults.yaml file to support 'revert to default' functionality.
Subsequent map renders will then incorrectly include 'integration X' in 'ES_INDEX_SETTINGS_ORIG' due to being in the defaults.yaml file. #}
{% if index in ES_INDEX_SETTINGS_ORIG.keys() %}
{% do ES_INDEX_SETTINGS_ORIG.pop(index) %}
{% endif %}
{% endfor %}
{% endif %}
{% set ES_INDEX_SETTINGS = {} %}
{% macro create_final_index_template(DEFINED_SETTINGS, GLOBAL_OVERRIDES, FINAL_INDEX_SETTINGS, EXCLUDE_INDICES=[]) %}
{% macro create_final_index_template(DEFINED_SETTINGS, GLOBAL_OVERRIDES, FINAL_INDEX_SETTINGS) %}
{% do GLOBAL_OVERRIDES.update(salt['defaults.merge'](GLOBAL_OVERRIDES, ES_INDEX_PILLAR, in_place=False)) %}
{% for index, settings in GLOBAL_OVERRIDES.items() %}
{% if index in EXCLUDE_INDICES %}
{% continue %}
{% endif %}
{# prevent this action from being performed on custom defined indices. #}
{# the custom defined index is not present in either of the dictionaries and fails to reder. #}
{% if index in DEFINED_SETTINGS and index in GLOBAL_OVERRIDES %}
@@ -160,19 +150,10 @@
{% endfor %}
{% endmacro %}
{# Exclude addon integrations from final ES_INDEX_SETTINGS #}
{{ create_final_index_template(ES_INDEX_SETTINGS_ORIG, ES_INDEX_SETTINGS_GLOBAL_OVERRIDES, ES_INDEX_SETTINGS, ALL_ADDON_SETTINGS_ORIG.keys() | list ) }}
{# Exclude SO managed indices, otherwise ALL_ADDON_SETTINGS will include pillar values
of core integrations without merging defaults, resulting in an overlapping, but bad index template being generated. #}
{{ create_final_index_template(ALL_ADDON_SETTINGS_ORIG, ALL_ADDON_SETTINGS_GLOBAL_OVERRIDES, ALL_ADDON_SETTINGS, ES_INDEX_SETTINGS_ORIG.keys() | list ) }}
{{ create_final_index_template(ES_INDEX_SETTINGS_ORIG, ES_INDEX_SETTINGS_GLOBAL_OVERRIDES, ES_INDEX_SETTINGS) }}
{{ create_final_index_template(ALL_ADDON_SETTINGS_ORIG, ALL_ADDON_SETTINGS_GLOBAL_OVERRIDES, ALL_ADDON_SETTINGS) }}
{% set SO_MANAGED_INDICES = [] %}
{% for index, settings in ES_INDEX_SETTINGS.items() %}
{% do SO_MANAGED_INDICES.append(index) %}
{% endfor %}
{% set ADDON_INDICES = [] %}
{% for index, settings in ALL_ADDON_SETTINGS.items() %}
{% do ADDON_INDICES.append(index) %}
{% endfor %}
-33
View File
@@ -398,7 +398,6 @@ firewall:
- elasticsearch_rest
- docker_registry
- influxdb
- postgres
- sensoroni
- yum
- beats_5044
@@ -411,7 +410,6 @@ firewall:
portgroups:
- docker_registry
- influxdb
- postgres
- sensoroni
- yum
- beats_5044
@@ -429,7 +427,6 @@ firewall:
- yum
- docker_registry
- influxdb
- postgres
- sensoroni
searchnode:
portgroups:
@@ -440,7 +437,6 @@ firewall:
- yum
- docker_registry
- influxdb
- postgres
- elastic_agent_control
- elastic_agent_data
- elastic_agent_update
@@ -454,7 +450,6 @@ firewall:
- yum
- docker_registry
- influxdb
- postgres
- elastic_agent_control
- elastic_agent_data
- elastic_agent_update
@@ -464,7 +459,6 @@ firewall:
- yum
- docker_registry
- influxdb
- postgres
- elastic_agent_control
- elastic_agent_data
- elastic_agent_update
@@ -498,7 +492,6 @@ firewall:
portgroups:
- docker_registry
- influxdb
- postgres
- sensoroni
- yum
- elastic_agent_control
@@ -509,7 +502,6 @@ firewall:
- yum
- docker_registry
- influxdb
- postgres
- elastic_agent_control
- elastic_agent_data
- elastic_agent_update
@@ -618,7 +610,6 @@ firewall:
- elasticsearch_rest
- docker_registry
- influxdb
- postgres
- sensoroni
- yum
- beats_5044
@@ -631,7 +622,6 @@ firewall:
portgroups:
- docker_registry
- influxdb
- postgres
- sensoroni
- yum
- beats_5044
@@ -649,7 +639,6 @@ firewall:
- yum
- docker_registry
- influxdb
- postgres
- sensoroni
searchnode:
portgroups:
@@ -660,7 +649,6 @@ firewall:
- yum
- docker_registry
- influxdb
- postgres
- elastic_agent_control
- elastic_agent_data
- elastic_agent_update
@@ -674,7 +662,6 @@ firewall:
- yum
- docker_registry
- influxdb
- postgres
- elastic_agent_control
- elastic_agent_data
- elastic_agent_update
@@ -684,7 +671,6 @@ firewall:
- yum
- docker_registry
- influxdb
- postgres
- elastic_agent_control
- elastic_agent_data
- elastic_agent_update
@@ -716,7 +702,6 @@ firewall:
portgroups:
- docker_registry
- influxdb
- postgres
- sensoroni
- yum
- elastic_agent_control
@@ -727,7 +712,6 @@ firewall:
- yum
- docker_registry
- influxdb
- postgres
- elastic_agent_control
- elastic_agent_data
- elastic_agent_update
@@ -836,7 +820,6 @@ firewall:
- elasticsearch_rest
- docker_registry
- influxdb
- postgres
- sensoroni
- yum
- beats_5044
@@ -849,7 +832,6 @@ firewall:
portgroups:
- docker_registry
- influxdb
- postgres
- sensoroni
- yum
- beats_5044
@@ -867,7 +849,6 @@ firewall:
- yum
- docker_registry
- influxdb
- postgres
- sensoroni
searchnode:
portgroups:
@@ -877,7 +858,6 @@ firewall:
- yum
- docker_registry
- influxdb
- postgres
- elastic_agent_control
- elastic_agent_data
- elastic_agent_update
@@ -890,7 +870,6 @@ firewall:
- yum
- docker_registry
- influxdb
- postgres
- elastic_agent_control
- elastic_agent_data
- elastic_agent_update
@@ -900,7 +879,6 @@ firewall:
- yum
- docker_registry
- influxdb
- postgres
- elastic_agent_control
- elastic_agent_data
- elastic_agent_update
@@ -934,7 +912,6 @@ firewall:
portgroups:
- docker_registry
- influxdb
- postgres
- sensoroni
- yum
- elastic_agent_control
@@ -945,7 +922,6 @@ firewall:
- yum
- docker_registry
- influxdb
- postgres
- elastic_agent_control
- elastic_agent_data
- elastic_agent_update
@@ -1064,7 +1040,6 @@ firewall:
- elasticsearch_rest
- docker_registry
- influxdb
- postgres
- sensoroni
- yum
- beats_5044
@@ -1077,7 +1052,6 @@ firewall:
portgroups:
- docker_registry
- influxdb
- postgres
- sensoroni
- yum
- beats_5044
@@ -1089,7 +1063,6 @@ firewall:
portgroups:
- docker_registry
- influxdb
- postgres
- sensoroni
- yum
- beats_5044
@@ -1101,7 +1074,6 @@ firewall:
portgroups:
- docker_registry
- influxdb
- postgres
- sensoroni
- yum
- redis
@@ -1111,7 +1083,6 @@ firewall:
portgroups:
- docker_registry
- influxdb
- postgres
- sensoroni
- yum
- redis
@@ -1122,7 +1093,6 @@ firewall:
- yum
- docker_registry
- influxdb
- postgres
- elastic_agent_control
- elastic_agent_data
- elastic_agent_update
@@ -1159,7 +1129,6 @@ firewall:
portgroups:
- docker_registry
- influxdb
- postgres
- sensoroni
- yum
- elastic_agent_control
@@ -1170,7 +1139,6 @@ firewall:
- yum
- docker_registry
- influxdb
- postgres
- elastic_agent_control
- elastic_agent_data
- elastic_agent_update
@@ -1514,7 +1482,6 @@ firewall:
- kibana
- redis
- influxdb
- postgres
- elasticsearch_rest
- elasticsearch_node
- elastic_agent_control
+13
View File
@@ -1,5 +1,6 @@
{% from 'vars/globals.map.jinja' import GLOBALS %}
{% from 'docker/docker.map.jinja' import DOCKERMERGED %}
{% from 'telegraf/map.jinja' import TELEGRAFMERGED %}
{% import_yaml 'firewall/defaults.yaml' as FIREWALL_DEFAULT %}
{# add our ip to self #}
@@ -55,4 +56,16 @@
{% endif %}
{# Open Postgres (5432) to minion hostgroups when Telegraf is configured to write to Postgres #}
{% set TG_OUT = TELEGRAFMERGED.output | upper %}
{% if TG_OUT in ['POSTGRES', 'BOTH'] %}
{% if role.startswith('manager') or role == 'standalone' or role == 'eval' %}
{% for r in ['sensor', 'searchnode', 'heavynode', 'receiver', 'fleet', 'idh', 'desktop', 'import'] %}
{% if FIREWALL_DEFAULT.firewall.role[role].chain["DOCKER-USER"].hostgroups[r] is defined %}
{% do FIREWALL_DEFAULT.firewall.role[role].chain["DOCKER-USER"].hostgroups[r].portgroups.append('postgres') %}
{% endif %}
{% endfor %}
{% endif %}
{% endif %}
{% set FIREWALL_MERGED = salt['pillar.get']('firewall', FIREWALL_DEFAULT.firewall, merge=True) %}
-7
View File
@@ -1,10 +1,3 @@
global:
pcapengine: SURICATA
pipeline: REDIS
push:
enabled: true
highstate_interval_hours: 2
debounce_seconds: 30
drain_interval: 15
batch: '25%'
batch_wait: 15
-37
View File
@@ -59,41 +59,4 @@ global:
description: Allows use of Endgame with Security Onion. This feature requires a license from Endgame.
global: True
advanced: True
push:
enabled:
description: Master kill-switch for the active push feature. When disabled, rule and pillar changes are picked up at the next scheduled highstate instead of being pushed immediately.
forcedType: bool
helpLink: push
global: True
highstate_interval_hours:
description: How often every minion in the grid runs a scheduled state.highstate, in hours. Lower values keep minions closer in sync at the cost of more load; higher values reduce load but increase worst-case latency for non-pushed changes. The salt-minion health check restarts a minion if its last highstate is older than this value plus one hour.
forcedType: int
helpLink: push
global: True
advanced: True
debounce_seconds:
description: Trailing-edge debounce window in seconds. A push intent must be quiet for this long before the drainer dispatches. Rapid bursts of edits within this window coalesce into one dispatch.
forcedType: int
helpLink: push
global: True
advanced: True
drain_interval:
description: How often the push drainer checks for ready intents, in seconds. Small values lower dispatch latency at the cost of more background work on the manager.
forcedType: int
helpLink: push
global: True
advanced: True
batch:
description: "Host batch size for push orchestrations. A number (e.g. '10') or a percentage (e.g. '25%'). Limits how many minions run the push state at once so large fleets don't thundering-herd."
helpLink: push
global: True
advanced: True
regex: '^([0-9]+%?)$'
regexFailureMessage: Enter a whole number or a whole-number percentage (e.g. 10 or 25%).
batch_wait:
description: Seconds to wait between host batches in a push orchestration. Gives the fleet time to breathe between waves.
forcedType: int
helpLink: push
global: True
advanced: True
-1
View File
@@ -58,7 +58,6 @@ so-hydra:
- {{ ULIMIT.name }}={{ ULIMIT.soft }}:{{ ULIMIT.hard }}
{% endfor %}
{% endif %}
# Intentionally unless-stopped -- matches the fleet default.
- restart_policy: unless-stopped
- watch:
- file: hydraconfig
-1
View File
@@ -15,7 +15,6 @@ include:
so-idh:
docker_container.running:
- image: {{ GLOBALS.registry_host }}:5000/{{ GLOBALS.image_repo }}/so-idh:{{ GLOBALS.so_version }}
- restart_policy: unless-stopped
- name: so-idh
- detach: True
- network_mode: host
-1
View File
@@ -18,7 +18,6 @@ include:
so-influxdb:
docker_container.running:
- image: {{ GLOBALS.registry_host }}:5000/{{ GLOBALS.image_repo }}/so-influxdb:{{ GLOBALS.so_version }}
- restart_policy: unless-stopped
- hostname: influxdb
- networks:
- sobridge:
+4 -1
View File
@@ -20,8 +20,11 @@ so-kafka_so-status.disabled:
ensure_default_pipeline:
cmd.run:
- name: |
/usr/sbin/so-yaml.py replace /opt/so/saltstack/local/pillar/kafka/soc_kafka.sls kafka.enabled False;
set -e
/usr/sbin/so-yaml.py replace /opt/so/saltstack/local/pillar/kafka/soc_kafka.sls kafka.enabled False
/usr/sbin/so-config.py sync-yaml-mutation /opt/so/saltstack/local/pillar/kafka/soc_kafka.sls replace kafka.enabled False --note "kafka.disabled"
/usr/sbin/so-yaml.py replace /opt/so/saltstack/local/pillar/global/soc_global.sls global.pipeline REDIS
/usr/sbin/so-config.py sync-yaml-mutation /opt/so/saltstack/local/pillar/global/soc_global.sls replace global.pipeline REDIS --note "kafka.disabled"
{% endif %}
{# If Kafka has never been manually enabled, the 'Kafka' user does not exist. In this case certs for Kafka should not exist since they'll be owned by uid 960 #}
-1
View File
@@ -27,7 +27,6 @@ include:
so-kafka:
docker_container.running:
- image: {{ GLOBALS.registry_host }}:5000/{{ GLOBALS.image_repo }}/so-kafka:{{ GLOBALS.so_version }}
- restart_policy: unless-stopped
- hostname: so-kafka
- name: so-kafka
- networks:
-1
View File
@@ -16,7 +16,6 @@ include:
so-kibana:
docker_container.running:
- image: {{ GLOBALS.registry_host }}:5000/{{ GLOBALS.image_repo }}/so-kibana:{{ GLOBALS.so_version }}
- restart_policy: unless-stopped
- hostname: kibana
- user: kibana
- networks:
-1
View File
@@ -51,7 +51,6 @@ so-kratos:
- {{ ULIMIT.name }}={{ ULIMIT.soft }}:{{ ULIMIT.hard }}
{% endfor %}
{% endif %}
# Intentionally unless-stopped -- matches the fleet default.
- restart_policy: unless-stopped
- watch:
- file: kratosschema
+1 -1
View File
@@ -103,7 +103,7 @@ kratos:
config:
session:
lifespan:
description: Defines the length of a login session before it will timeout, and require a new login.
description: Defines the length of a login session.
global: True
helpLink: kratos
whoami:
+2 -3
View File
@@ -26,12 +26,12 @@ logstash:
manager:
- so/0011_input_endgame.conf
- so/0012_input_elastic_agent.conf.jinja
- so/0013_input_lumberjack_fleet.conf.jinja
- so/0013_input_lumberjack_fleet.conf
- so/9999_output_redis.conf.jinja
receiver:
- so/0011_input_endgame.conf
- so/0012_input_elastic_agent.conf.jinja
- so/0013_input_lumberjack_fleet.conf.jinja
- so/0013_input_lumberjack_fleet.conf
- so/9999_output_redis.conf.jinja
search:
- so/0900_input_redis.conf.jinja
@@ -69,5 +69,4 @@ logstash:
pipeline_x_batch_x_size: 125
pipeline_x_ecs_compatibility: disabled
dmz_nodes: []
latency_metrics: False
-1
View File
@@ -28,7 +28,6 @@ include:
so-logstash:
docker_container.running:
- image: {{ GLOBALS.registry_host }}:5000/{{ GLOBALS.image_repo }}/so-logstash:{{ GLOBALS.so_version }}
- restart_policy: unless-stopped
- hostname: so-logstash
- name: so-logstash
- networks:
@@ -1,4 +1,3 @@
{%- from 'logstash/map.jinja' import LOGSTASH_MERGED %}
input {
elastic_agent {
port => 5055
@@ -12,11 +11,6 @@ input {
}
}
filter {
{% if LOGSTASH_MERGED.get('latency_metrics', False) %}
ruby {
code => "event.set('[_tmp][logstash_from_agent]', Time.now().utc.iso8601(3));"
}
{% endif %}
if ![metadata] {
mutate {
rename => {"@metadata" => "metadata"}
@@ -0,0 +1,23 @@
input {
elastic_agent {
port => 5056
tags => [ "elastic-agent", "fleet-lumberjack-input" ]
ssl_enabled => true
ssl_certificate => "/usr/share/logstash/elasticfleet-lumberjack.crt"
ssl_key => "/usr/share/logstash/elasticfleet-lumberjack.key"
ecs_compatibility => v8
id => "fleet-lumberjack-in"
codec => "json"
}
}
filter {
if ![metadata] {
mutate {
rename => {"@metadata" => "metadata"}
}
}
}
@@ -1,26 +0,0 @@
{%- from 'logstash/map.jinja' import LOGSTASH_MERGED %}
input {
elastic_agent {
port => 5056
tags => [ "elastic-agent", "fleet-lumberjack-input" ]
ssl_enabled => true
ssl_certificate => "/usr/share/logstash/elasticfleet-lumberjack.crt"
ssl_key => "/usr/share/logstash/elasticfleet-lumberjack.key"
ecs_compatibility => v8
id => "fleet-lumberjack-in"
codec => "json"
}
}
filter {
{% if LOGSTASH_MERGED.get('latency_metrics', False) %}
ruby {
code => "event.set('[_tmp][logstash_from_fleet]', Time.now().utc.iso8601(3));"
}
{% endif %}
if ![metadata] {
mutate {
rename => {"@metadata" => "metadata"}
}
}
}
@@ -1,4 +1,3 @@
{%- from 'logstash/map.jinja' import LOGSTASH_MERGED %}
{%- set kafka_password = salt['pillar.get']('kafka:config:password') %}
{%- set kafka_trustpass = salt['pillar.get']('kafka:config:trustpass') %}
{%- set kafka_brokers = salt['pillar.get']('kafka:nodes', {}) %}
@@ -31,11 +30,6 @@ input {
}
}
filter {
{% if LOGSTASH_MERGED.get('latency_metrics', False) %}
ruby {
code => "event.set('[_tmp][logstash_from_kafka]', Time.now().utc.iso8601(3));"
}
{% endif %}
if ![metadata] {
mutate {
rename => { "@metadata" => "metadata" }
@@ -1,4 +1,4 @@
{%- from 'logstash/map.jinja' import LOGSTASH_REDIS_NODES, LOGSTASH_MERGED %}
{%- from 'logstash/map.jinja' import LOGSTASH_REDIS_NODES with context %}
{%- set REDIS_PASS = salt['pillar.get']('redis:config:requirepass') %}
{%- for index in range(LOGSTASH_REDIS_NODES|length) %}
@@ -18,10 +18,3 @@ input {
}
{% endfor %}
{% endfor -%}
filter {
{% if LOGSTASH_MERGED.get('latency_metrics', False) %}
ruby {
code => "event.set('[_tmp][logstash_from_redis]', Time.now().utc.iso8601(3));"
}
{% endif %}
}
@@ -1,11 +1,3 @@
{%- from 'logstash/map.jinja' import LOGSTASH_MERGED %}
{% if LOGSTASH_MERGED.get('latency_metrics', False) %}
filter {
ruby {
code => "event.set('[_tmp][logstash_to_elasticsearch]', Time.now().utc.iso8601(3));"
}
}
{% endif %}
output {
if "elastic-agent" in [tags] and "so-ip-mappings" in [tags] {
elasticsearch {
@@ -13,14 +13,7 @@ filter {
add_tag => "fleet-lumberjack-{{ GLOBALS.hostname }}"
}
}
{%- from 'logstash/map.jinja' import LOGSTASH_MERGED %}
{% if LOGSTASH_MERGED.get('latency_metrics', False) %}
filter {
ruby {
code => "event.set('[_tmp][fleet_to_logstash]', Time.now().utc.iso8601(3));"
}
}
{% endif %}
output {
lumberjack {
codec => json
@@ -1,17 +1,10 @@
{%- from 'logstash/map.jinja' import LOGSTASH_MERGED %}
{%- if grains.role in ['so-heavynode', 'so-receiver'] %}
{%- set HOST = GLOBALS.hostname %}
{%- else %}
{%- set HOST = GLOBALS.manager %}
{%- endif %}
{%- set REDIS_PASS = salt['pillar.get']('redis:config:requirepass') %}
{% if LOGSTASH_MERGED.get('latency_metrics', False) %}
filter {
ruby {
code => "event.set('[_tmp][logstash_to_redis]', Time.now().utc.iso8601(3));"
}
}
{% endif %}
output {
redis {
host => '{{ HOST }}'
-5
View File
@@ -86,8 +86,3 @@ logstash:
multiline: True
advanced: True
forcedType: "[]string"
latency_metrics:
description: Enable latency metrics within events processed by logstash. Useful for pinpointing log ingest delay.
forcedType: bool
global: False
advanced: True
-21
View File
@@ -1,21 +0,0 @@
{% from 'vars/globals.map.jinja' import GLOBALS %}
{% from 'global/map.jinja' import GLOBALMERGED %}
include:
- salt.minion
{% if GLOBALS.is_manager and GLOBALMERGED.push.enabled %}
salt_beacons_pushstate:
file.managed:
- name: /etc/salt/minion.d/beacons_pushstate.conf
- source: salt://manager/files/beacons_pushstate.conf.jinja
- template: jinja
- watch_in:
- service: salt_minion_service
{% else %}
salt_beacons_pushstate:
file.absent:
- name: /etc/salt/minion.d/beacons_pushstate.conf
- watch_in:
- service: salt_minion_service
{% endif %}
@@ -1,41 +0,0 @@
{% from 'global/map.jinja' import GLOBALMERGED %}
beacons:
pillar_db:
- interval: {{ GLOBALMERGED.push.drain_interval }}
- disable_during_state_run: True
inotify:
- disable_during_state_run: True
- coalesce: True
- files:
/opt/so/saltstack/local/salt/suricata/rules:
mask:
- close_write
- moved_to
- delete
recurse: True
auto_add: True
exclude:
- '\.sw[a-z]$':
regex: True
- '~$':
regex: True
- '/4913$':
regex: True
- '/\.#':
regex: True
/opt/so/saltstack/local/salt/strelka/rules/compiled:
mask:
- close_write
- moved_to
- delete
recurse: True
auto_add: True
exclude:
- '\.sw[a-z]$':
regex: True
- '~$':
regex: True
- '/4913$':
regex: True
- '/\.#':
regex: True
-2
View File
@@ -15,7 +15,6 @@ include:
- manager.elasticsearch
- manager.kibana
- manager.managed_soc_annotations
- manager.beacons
repo_log_dir:
file.directory:
@@ -232,7 +231,6 @@ surifiltersrules:
- user: 939
- group: 939
{% else %}
{{sls}}_state_not_allowed:
+3 -5
View File
@@ -31,13 +31,11 @@ sync_es_users:
- http: wait_for_kratos
- file: so-user.lock # require so-user.lock file to be missing
# we dont want this added too early in setup, so the onlyif gates on the
# /opt/so/state/setup-complete marker. The marker is written by
# mark_setup_complete in setup/so-functions just before the final setup
# highstate (and by an upgrade-path state for systems set up under the old gate).
# we dont want this added too early in setup, so we add the onlyif to verify 'startup_states: highstate'
# is in the minion config. That line is added before the final highstate during setup
so-user_sync:
cron.present:
- user: root
- name: 'PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin /usr/sbin/so-user sync &>> /opt/so/log/soc/sync.log'
- identifier: so-user_sync
- onlyif: "test -e /opt/so/state/setup-complete"
- onlyif: "grep -x 'startup_states: highstate' /etc/salt/minion"
-117
View File
@@ -1,117 +0,0 @@
#!/bin/bash
#
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
# Runs once per boot on managers (via so-boot-mine-update.service), before
# so-boot-highstate.service. Waits for the responsive minion set to settle, pushes
# mine.update, waits until every up minion has actually reported to the mine, then
# warms the master's per-minion pillar cache so the mine-backed node pillars (node
# IPs, ES/Redis/Logstash/hypervisor discovery -- some glob- and some pillar/grain-
# targeted) are complete before the boot highstate renders them. Otherwise a node
# that is up but not yet fully reported gets dropped from those pillars and torn
# out of the configs they build (e.g. so-elasticsearch ExtraHosts -> container recreate).
MAX_WAIT=${MINE_UPDATE_MAX_WAIT:-180} # hard backstop only
INTERVAL=10
STABLE_CHECKS=3 # up-count must hold steady this many polls
elapsed=0
prev=-1
stable=0
up=0
# Wait for the *reachable* minion set to settle rather than for every accepted
# key to report up: an operator may accept a minion's key and then intentionally
# power off that host, so requiring up >= accepted would never be satisfied and
# we'd always burn the full MAX_WAIT. Once the responsive count stops growing we
# stop waiting and run mine.update against whoever is up.
while [ "$elapsed" -lt "$MAX_WAIT" ]; do
up=$(/usr/bin/salt-run manage.up --out=json 2>/dev/null \
| python3 -c 'import sys,json; print(len(json.load(sys.stdin)))' 2>/dev/null)
up=${up:-0}
if [ "$up" -gt 0 ] && [ "$up" -eq "$prev" ]; then
stable=$((stable + 1))
[ "$stable" -ge "$STABLE_CHECKS" ] && break
else
stable=0
fi
prev=$up
sleep "$INTERVAL"
elapsed=$((elapsed + INTERVAL))
done
echo "so-boot-mine-update: ${up} minions up (settled after ${elapsed}s); running mine.update"
/usr/bin/salt '*' mine.update --out=txt
# A node that is up but has not yet re-reported network.ip_addrs to the mine is
# silently dropped from mine-backed pillars (elasticsearch:nodes, node_data, ...)
# when highstate recompiles them -- which e.g. removes it from so-elasticsearch
# ExtraHosts and forces a container recreate. After the broad mine.update above,
# wait until every up minion actually has network.ip_addrs in the mine, re-pushing
# mine.update to stragglers, before releasing the boot highstate. Bounded by the
# same MAX_WAIT backstop so a slow/down node never blocks boot indefinitely.
missing=""
while [ "$elapsed" -lt "$MAX_WAIT" ]; do
up_json=$(/usr/bin/salt-run manage.up --out=json 2>/dev/null)
mine_json=$(/usr/bin/salt-run mine.get '*' network.ip_addrs tgt_type=glob --out=json 2>/dev/null)
missing=$(printf '%s' "$up_json" | python3 -c '
import sys, json
up = set(json.load(sys.stdin) or [])
mine = {k for k, v in (json.loads(sys.argv[1]) or {}).items() if v}
print("\n".join(sorted(up - mine)))
' "$mine_json" 2>/dev/null)
if [ -z "$missing" ]; then
echo "so-boot-mine-update: mine complete for all up minions after ${elapsed}s"
break
fi
echo "so-boot-mine-update: mine missing up minion(s): $(echo $missing); re-running mine.update"
for m in $missing; do /usr/bin/salt "$m" mine.update --out=txt; done
sleep "$INTERVAL"
elapsed=$((elapsed + INTERVAL))
done
[ -n "$missing" ] && echo "so-boot-mine-update: WARNING ${MAX_WAIT}s backstop hit; up minion(s) still absent from mine: $(echo $missing); highstate may drop them from configs"
# The pillar/compound-targeted node pillars (elasticsearch:nodes, redis:nodes,
# logstash:nodes, hypervisor:nodes) resolve their target against the master's
# per-minion data cache (grains+pillar in .../minions/<id>/data.p), populated only
# when a minion's pillar is (re)compiled -- separately from the mine. A freshly
# booted node can be in the mine (glob/node_data sees it) yet absent from that
# cache, so it is dropped from those pillars and from the configs they build (e.g.
# so-elasticsearch ExtraHosts). Force a synchronous pillar refresh so the master
# caches every up node's pillar; refresh_pillar wait=True returns only once the
# pillar is recompiled (and thus cached for matching). Retry stragglers <= MAX_WAIT.
echo "so-boot-mine-update: warming master pillar cache for pillar/grain-targeted node pillars"
/usr/bin/salt '*' saltutil.refresh_pillar wait=True --out=txt
missing=""
while [ "$elapsed" -lt "$MAX_WAIT" ]; do
up_json=$(/usr/bin/salt-run manage.up --out=json 2>/dev/null)
cached_json=$(/usr/bin/salt-run cache.pillar tgt='*' --out=json 2>/dev/null)
missing=$(printf '%s' "$up_json" | python3 -c '
import sys, json
up = set(json.load(sys.stdin) or [])
cached = {k for k, v in (json.loads(sys.argv[1]) or {}).items() if v}
print("\n".join(sorted(up - cached)))
' "$cached_json" 2>/dev/null)
if [ -z "$missing" ]; then
echo "so-boot-mine-update: pillar cache warm for all up minions after ${elapsed}s"
break
fi
echo "so-boot-mine-update: pillar not yet cached for: $(echo $missing); refreshing"
for m in $missing; do /usr/bin/salt "$m" saltutil.refresh_pillar wait=True --out=txt; done
sleep "$INTERVAL"
elapsed=$((elapsed + INTERVAL))
done
[ -n "$missing" ] && echo "so-boot-mine-update: WARNING ${MAX_WAIT}s backstop hit; pillar not cached for: $(echo $missing); pillar-targeted pillars may drop them"
# Log what the mine-backed pillars render so the boot-time state is inspectable.
/usr/bin/salt-call saltutil.refresh_pillar >/dev/null 2>&1
sleep 2
for key in node_data elasticsearch:nodes; do
rendered=$(/usr/bin/salt-call --out=json pillar.get "$key" 2>/dev/null \
| python3 -c 'import sys,json; print(json.dumps(json.load(sys.stdin).get("local"), indent=2, sort_keys=True))' 2>/dev/null)
echo "so-boot-mine-update: ${key} rendered as:"
echo "${rendered:-null}"
done
exit 0
+448
View File
@@ -0,0 +1,448 @@
#!/usr/bin/env python3
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
"""
so-config.py writes SOC/onionconfig settings to Postgres.
so-yaml.py remains a YAML file editor. Call this tool when a pillar-backed
setting also needs to be reflected in the onionconfig database.
"""
import argparse
import json
import os
from pathlib import Path
import subprocess
import sys
import yaml
PILLAR_ROOT = Path(os.environ.get("SO_CONFIG_PILLAR_ROOT", "/opt/so/saltstack/local/pillar"))
DOCKER_CONTAINER = os.environ.get("SO_CONFIG_PG_CONTAINER", "so-postgres")
PG_DATABASE = os.environ.get("SO_CONFIG_PG_DATABASE", "securityonion")
PG_USER = os.environ.get("SO_CONFIG_PG_USER", "postgres")
DEFAULT_USER_ID = os.environ.get("SO_CONFIG_USER_ID", "so-config")
EXCLUDE_BASENAMES = {
"secrets.sls",
"auth.sls",
"top.sls",
}
EXCLUDE_PATH_FRAGMENTS = (
"/elasticsearch/nodes.sls",
"/redis/nodes.sls",
"/kafka/nodes.sls",
"/hypervisor/nodes.sls",
"/logstash/nodes.sls",
"/node_data/ips.sls",
"/postgres/auth.sls",
"/elasticsearch/auth.sls",
"/kibana/secrets.sls",
)
class SkipPath(Exception):
pass
def pg_str(value):
if value is None:
return "NULL"
return "'" + str(value).replace("'", "''") + "'"
def pg_jsonb(value):
return pg_str(json.dumps(value)) + "::jsonb"
def docker_psql(sql):
proc = subprocess.run(
["docker", "exec", "-i", DOCKER_CONTAINER,
"psql", "-U", PG_USER, "-d", PG_DATABASE,
"-tA", "-q", "-v", "ON_ERROR_STOP=1"],
input=sql.encode(),
capture_output=True,
check=False,
timeout=60,
)
if proc.returncode != 0:
sys.stderr.write(proc.stderr.decode(errors="replace"))
raise RuntimeError(f"docker exec psql failed with rc={proc.returncode}")
return proc.stdout.decode(errors="replace")
def schema_ready():
sql = """
SELECT to_regclass('public.settings') IS NOT NULL
AND to_regclass('public.audit_settings') IS NOT NULL;
"""
return docker_psql(sql).strip() == "t"
def cmd_wait_schema(args):
import time
deadline = time.time() + args.timeout
while time.time() <= deadline:
if schema_ready():
return 0
time.sleep(args.interval)
print("so-config: onionconfig schema is not ready", file=sys.stderr)
return 1
def upsert_setting(setting_id, value, *, node_id="", duplicated_from_id=None,
user_id=DEFAULT_USER_ID, note=None):
note = note or "so-config upsert"
sql = f"""
BEGIN;
WITH old_row AS (
SELECT value
FROM settings
WHERE setting_id = {pg_str(setting_id)}
AND node_id = {pg_str(node_id)}
FOR UPDATE
),
upserted AS (
INSERT INTO settings (setting_id, value, duplicated_from_id, node_id)
VALUES ({pg_str(setting_id)}, {pg_jsonb(value)}, {pg_str(duplicated_from_id)}, {pg_str(node_id)})
ON CONFLICT (setting_id, node_id) DO UPDATE
SET value = EXCLUDED.value,
duplicated_from_id = EXCLUDED.duplicated_from_id
RETURNING value
)
INSERT INTO audit_settings (setting_id, node_id, user_id, old_value, new_value, note)
SELECT {pg_str(setting_id)},
{pg_str(node_id)},
{pg_str(user_id)},
(SELECT value FROM old_row),
(SELECT value FROM upserted),
{pg_str(note)}
WHERE NOT EXISTS (SELECT 1 FROM old_row)
OR (SELECT value FROM old_row) IS DISTINCT FROM (SELECT value FROM upserted);
COMMIT;
"""
docker_psql(sql)
def delete_setting(setting_id, *, node_id="", user_id=DEFAULT_USER_ID, note=None):
note = note or "so-config delete"
sql = f"""
BEGIN;
WITH deleted AS (
DELETE FROM settings
WHERE setting_id = {pg_str(setting_id)}
AND node_id = {pg_str(node_id)}
RETURNING value
)
INSERT INTO audit_settings (setting_id, node_id, user_id, old_value, new_value, note)
SELECT {pg_str(setting_id)}, {pg_str(node_id)}, {pg_str(user_id)}, value, NULL::jsonb, {pg_str(note)}
FROM deleted;
COMMIT;
"""
docker_psql(sql)
def delete_setting_prefix(setting_id, *, node_id="", user_id=DEFAULT_USER_ID, note=None):
if not setting_id:
raise ValueError("setting_id prefix cannot be empty")
note = note or "so-config delete-prefix"
sql = f"""
BEGIN;
WITH deleted AS (
DELETE FROM settings
WHERE node_id = {pg_str(node_id)}
AND (
setting_id = {pg_str(setting_id)}
OR substring(setting_id from 1 for char_length({pg_str(setting_id)}) + 1) = {pg_str(setting_id + ".")}
)
RETURNING setting_id, value
)
INSERT INTO audit_settings (setting_id, node_id, user_id, old_value, new_value, note)
SELECT setting_id, {pg_str(node_id)}, {pg_str(user_id)}, value, NULL::jsonb, {pg_str(note)}
FROM deleted;
COMMIT;
"""
docker_psql(sql)
def purge_node(node_id, *, user_id=DEFAULT_USER_ID, note=None):
note = note or "so-config purge-node"
sql = f"""
BEGIN;
WITH deleted AS (
DELETE FROM settings
WHERE node_id = {pg_str(node_id)}
RETURNING setting_id, value
)
INSERT INTO audit_settings (setting_id, node_id, user_id, old_value, new_value, note)
SELECT setting_id, {pg_str(node_id)}, {pg_str(user_id)}, value, NULL::jsonb, {pg_str(note)}
FROM deleted;
COMMIT;
"""
docker_psql(sql)
def parse_value(value, value_file=None):
if value_file:
with open(value_file, "r") as fh:
value = fh.read()
parsed = yaml.safe_load(value)
if parsed is None and value == "":
return ""
return parsed
def parse_yaml_file(path):
with open(path, "rb") as fh:
raw = fh.read()
if b"{%" in raw or b"{{" in raw:
raise SkipPath(f"{path}: Jinja-templated files stay disk-only")
if not raw.strip():
return {}
parsed = yaml.safe_load(raw)
return parsed if parsed is not None else {}
def flatten(prefix, value):
if isinstance(value, dict):
for key, child in value.items():
child_id = f"{prefix}.{key}" if prefix else str(key)
yield from flatten(child_id, child)
else:
yield prefix, value
def classify_pillar_path(path):
norm = Path(path).resolve()
norm_str = str(norm)
if norm.name in EXCLUDE_BASENAMES:
raise SkipPath(f"{path}: excluded basename")
for fragment in EXCLUDE_PATH_FRAGMENTS:
if fragment in norm_str:
raise SkipPath(f"{path}: excluded path fragment {fragment}")
if norm.suffix != ".sls":
raise SkipPath(f"{path}: not an .sls file")
parent = norm.parent.name
stem = norm.stem
if parent == "minions":
if stem.startswith("adv_"):
return {"kind": "advanced", "setting_id": "advanced", "node_id": stem[4:]}
return {"kind": "normal", "node_id": stem}
section = parent
if stem == f"soc_{section}":
return {"kind": "normal", "node_id": ""}
if stem == f"adv_{section}":
return {"kind": "advanced", "setting_id": f"{section}.advanced", "node_id": ""}
raise SkipPath(f"{path}: not a SOC-managed pillar file")
def import_pillar_file(path, *, user_id=DEFAULT_USER_ID, note=None):
meta = classify_pillar_path(path)
note = note or f"so-config import-file {path}"
if meta["kind"] == "advanced":
with open(path, "r") as fh:
upsert_setting(meta["setting_id"], fh.read(), node_id=meta["node_id"],
user_id=user_id, note=note)
return 1
data = parse_yaml_file(path)
if not isinstance(data, dict):
raise SkipPath(f"{path}: top-level YAML is not a map")
count = 0
for setting_id, value in flatten("", data):
upsert_setting(setting_id, value, node_id=meta["node_id"],
user_id=user_id, note=note)
count += 1
return count
def iter_pillar_files(root):
root = Path(root)
if not root.is_dir():
return
for path in sorted(root.rglob("*.sls")):
if path.is_file():
yield path
def cmd_set(args):
upsert_setting(args.setting_id, parse_value(args.value, args.value_file),
node_id=args.node_id,
duplicated_from_id=args.duplicated_from_id,
user_id=args.user_id,
note=args.note)
return 0
def cmd_delete(args):
delete_setting(args.setting_id, node_id=args.node_id,
user_id=args.user_id, note=args.note)
return 0
def cmd_delete_prefix(args):
delete_setting_prefix(args.setting_id, node_id=args.node_id,
user_id=args.user_id, note=args.note)
return 0
def cmd_purge_node(args):
purge_node(args.node_id, user_id=args.user_id, note=args.note)
return 0
def cmd_import_file(args):
count = import_pillar_file(args.path, user_id=args.user_id, note=args.note)
print(f"imported {count} settings from {args.path}")
return 0
def cmd_import_minion(args):
count = 0
for name in (f"{args.node_id}.sls", f"adv_{args.node_id}.sls"):
path = PILLAR_ROOT / "minions" / name
if path.exists():
count += import_pillar_file(path, user_id=args.user_id, note=args.note)
print(f"imported {count} settings for node {args.node_id}")
return 0
def cmd_import_all(args):
count = 0
skipped = 0
for path in iter_pillar_files(args.root):
try:
count += import_pillar_file(path, user_id=args.user_id, note=args.note)
except SkipPath as exc:
skipped += 1
if args.verbose:
print(f"skip: {exc}", file=sys.stderr)
print(f"imported {count} settings, skipped {skipped} files")
if args.state_file:
with open(args.state_file, "w") as fh:
fh.write("ok\n")
return 0
def cmd_sync_yaml_mutation(args):
meta = classify_pillar_path(args.path)
note = args.note or f"so-config sync-yaml-mutation {args.operation} {args.path}"
if meta["kind"] == "advanced":
import_pillar_file(args.path, user_id=args.user_id, note=note)
return 0
if args.operation in ("add", "replace"):
upsert_setting(args.key, parse_value(args.value, args.value_file),
node_id=meta["node_id"],
user_id=args.user_id,
note=note)
elif args.operation == "remove":
delete_setting_prefix(args.key, node_id=meta["node_id"],
user_id=args.user_id, note=note)
else:
raise ValueError(f"unsupported operation: {args.operation}")
return 0
def build_parser():
parser = argparse.ArgumentParser(description=__doc__)
sub = parser.add_subparsers(dest="command", required=True)
p = sub.add_parser("wait-schema", help="wait for SOC-created onionconfig tables")
p.add_argument("--timeout", type=int, default=120)
p.add_argument("--interval", type=int, default=2)
p.set_defaults(func=cmd_wait_schema)
p = sub.add_parser("set", help="upsert one setting")
p.add_argument("setting_id")
p.add_argument("value", nargs="?", default="")
p.add_argument("--value-file")
p.add_argument("--node-id", default="")
p.add_argument("--duplicated-from-id")
p.add_argument("--user-id", default=DEFAULT_USER_ID)
p.add_argument("--note")
p.set_defaults(func=cmd_set)
p = sub.add_parser("delete", help="delete one setting")
p.add_argument("setting_id")
p.add_argument("--node-id", default="")
p.add_argument("--user-id", default=DEFAULT_USER_ID)
p.add_argument("--note")
p.set_defaults(func=cmd_delete)
p = sub.add_parser("delete-prefix", help="delete one setting and all child settings")
p.add_argument("setting_id")
p.add_argument("--node-id", default="")
p.add_argument("--user-id", default=DEFAULT_USER_ID)
p.add_argument("--note")
p.set_defaults(func=cmd_delete_prefix)
p = sub.add_parser("purge-node", help="delete all settings for one node")
p.add_argument("node_id")
p.add_argument("--user-id", default=DEFAULT_USER_ID)
p.add_argument("--note")
p.set_defaults(func=cmd_purge_node)
p = sub.add_parser("import-file", help="import one SOC-managed pillar file")
p.add_argument("path")
p.add_argument("--user-id", default=DEFAULT_USER_ID)
p.add_argument("--note")
p.set_defaults(func=cmd_import_file)
p = sub.add_parser("import-minion", help="import one minion's pillar files")
p.add_argument("node_id")
p.add_argument("--user-id", default=DEFAULT_USER_ID)
p.add_argument("--note")
p.set_defaults(func=cmd_import_minion)
p = sub.add_parser("import-all", help="import all SOC-managed local pillar files")
p.add_argument("--root", default=str(PILLAR_ROOT))
p.add_argument("--state-file")
p.add_argument("--user-id", default=DEFAULT_USER_ID)
p.add_argument("--note", default="so-config initial import")
p.add_argument("--verbose", action="store_true")
p.set_defaults(func=cmd_import_all)
p = sub.add_parser("sync-yaml-mutation",
help="mirror one so-yaml add/replace/remove mutation to onionconfig")
p.add_argument("path")
p.add_argument("operation", choices=("add", "replace", "remove"))
p.add_argument("key")
p.add_argument("value", nargs="?", default="")
p.add_argument("--value-file")
p.add_argument("--user-id", default=DEFAULT_USER_ID)
p.add_argument("--note")
p.set_defaults(func=cmd_sync_yaml_mutation)
return parser
def main(argv):
parser = build_parser()
args = parser.parse_args(argv)
try:
return args.func(args)
except SkipPath as exc:
print(f"skip: {exc}", file=sys.stderr)
return 2
except Exception as exc:
print(f"so-config: {exc}", file=sys.stderr)
return 1
if __name__ == "__main__":
sys.exit(main(sys.argv[1:]))
+178
View File
@@ -0,0 +1,178 @@
import importlib
import os
import tempfile
import unittest
from unittest.mock import patch
soconfig = importlib.import_module("so-config")
class TestSoConfigPathMapping(unittest.TestCase):
def test_classify_global_soc(self):
meta = soconfig.classify_pillar_path(
"/opt/so/saltstack/local/pillar/soc/soc_soc.sls")
self.assertEqual(meta["kind"], "normal")
self.assertEqual(meta["node_id"], "")
def test_classify_global_advanced(self):
meta = soconfig.classify_pillar_path(
"/opt/so/saltstack/local/pillar/soc/adv_soc.sls")
self.assertEqual(meta["kind"], "advanced")
self.assertEqual(meta["setting_id"], "soc.advanced")
self.assertEqual(meta["node_id"], "")
def test_classify_minion(self):
meta = soconfig.classify_pillar_path(
"/opt/so/saltstack/local/pillar/minions/h1_sensor.sls")
self.assertEqual(meta["kind"], "normal")
self.assertEqual(meta["node_id"], "h1_sensor")
def test_classify_minion_advanced(self):
meta = soconfig.classify_pillar_path(
"/opt/so/saltstack/local/pillar/minions/adv_h1_sensor.sls")
self.assertEqual(meta["kind"], "advanced")
self.assertEqual(meta["setting_id"], "advanced")
self.assertEqual(meta["node_id"], "h1_sensor")
def test_classify_skips_bootstrap(self):
with self.assertRaises(soconfig.SkipPath):
soconfig.classify_pillar_path(
"/opt/so/saltstack/local/pillar/secrets.sls")
class TestSoConfigImport(unittest.TestCase):
def test_flatten_keeps_lists_as_values(self):
flattened = dict(soconfig.flatten("", {
"host": {"mainip": "10.0.0.1"},
"suricata": {"pcap": {"enabled": True}},
"items": ["a", "b"],
}))
self.assertEqual(flattened["host.mainip"], "10.0.0.1")
self.assertEqual(flattened["suricata.pcap.enabled"], True)
self.assertEqual(flattened["items"], ["a", "b"])
def test_import_file_upserts_flattened_settings(self):
with tempfile.TemporaryDirectory() as tmp:
path = os.path.join(tmp, "h1_sensor.sls")
minions = os.path.join(tmp, "minions")
os.mkdir(minions)
path = os.path.join(minions, "h1_sensor.sls")
with open(path, "w") as fh:
fh.write("host:\n mainip: 10.0.0.1\nsuricata:\n enabled: true\n")
calls = []
with patch.object(soconfig, "upsert_setting",
side_effect=lambda *args, **kwargs: calls.append((args, kwargs))):
count = soconfig.import_pillar_file(path)
self.assertEqual(count, 2)
self.assertIn((("host.mainip", "10.0.0.1"), {"node_id": "h1_sensor", "user_id": "so-config", "note": f"so-config import-file {path}"}), calls)
self.assertIn((("suricata.enabled", True), {"node_id": "h1_sensor", "user_id": "so-config", "note": f"so-config import-file {path}"}), calls)
def test_import_advanced_file_upserts_raw_content(self):
with tempfile.TemporaryDirectory() as tmp:
minions = os.path.join(tmp, "minions")
os.mkdir(minions)
path = os.path.join(minions, "adv_h1_sensor.sls")
with open(path, "w") as fh:
fh.write("custom:\n raw: true\n")
calls = []
with patch.object(soconfig, "upsert_setting",
side_effect=lambda *args, **kwargs: calls.append((args, kwargs))):
count = soconfig.import_pillar_file(path)
self.assertEqual(count, 1)
self.assertEqual(calls[0][0], ("advanced", "custom:\n raw: true\n"))
self.assertEqual(calls[0][1]["node_id"], "h1_sensor")
class TestSoConfigSql(unittest.TestCase):
def test_schema_ready_checks_soc_tables(self):
captured = {}
with patch.object(soconfig, "docker_psql",
side_effect=lambda sql: captured.update({"sql": sql}) or "t\n"):
ready = soconfig.schema_ready()
self.assertTrue(ready)
self.assertIn("to_regclass('public.settings')", captured["sql"])
self.assertIn("to_regclass('public.audit_settings')", captured["sql"])
def test_set_writes_settings_and_audit(self):
captured = {}
with patch.object(soconfig, "docker_psql",
side_effect=lambda sql: captured.setdefault("sql", sql)):
soconfig.upsert_setting("host.mainip", "10.0.0.1",
node_id="h1_sensor", user_id="tester", note="unit")
self.assertIn("INSERT INTO settings", captured["sql"])
self.assertIn("INSERT INTO audit_settings", captured["sql"])
self.assertIn("'host.mainip'", captured["sql"])
self.assertIn("'h1_sensor'", captured["sql"])
self.assertIn("'tester'", captured["sql"])
def test_purge_node_audits_deleted_rows(self):
captured = {}
with patch.object(soconfig, "docker_psql",
side_effect=lambda sql: captured.setdefault("sql", sql)):
soconfig.purge_node("h1_sensor", user_id="tester", note="unit")
self.assertIn("DELETE FROM settings", captured["sql"])
self.assertIn("WHERE node_id = 'h1_sensor'", captured["sql"])
self.assertIn("INSERT INTO audit_settings", captured["sql"])
def test_delete_prefix_removes_children_and_audits(self):
captured = {}
with patch.object(soconfig, "docker_psql",
side_effect=lambda sql: captured.setdefault("sql", sql)):
soconfig.delete_setting_prefix("elasticfleet", node_id="h1_sensor",
user_id="tester", note="unit")
self.assertIn("DELETE FROM settings", captured["sql"])
self.assertIn("setting_id = 'elasticfleet'", captured["sql"])
self.assertIn("'elasticfleet.'", captured["sql"])
self.assertIn("INSERT INTO audit_settings", captured["sql"])
def test_sync_yaml_replace_uses_path_node_id(self):
with tempfile.TemporaryDirectory() as tmp:
minions = os.path.join(tmp, "minions")
os.mkdir(minions)
path = os.path.join(minions, "h1_sensor.sls")
open(path, "w").close()
calls = []
args = soconfig.build_parser().parse_args([
"sync-yaml-mutation", path, "replace", "suricata.enabled", "true"
])
with patch.object(soconfig, "upsert_setting",
side_effect=lambda *a, **kw: calls.append((a, kw))):
soconfig.cmd_sync_yaml_mutation(args)
self.assertEqual(calls[0][0], ("suricata.enabled", True))
self.assertEqual(calls[0][1]["node_id"], "h1_sensor")
def test_sync_yaml_remove_deletes_prefix(self):
with tempfile.TemporaryDirectory() as tmp:
minions = os.path.join(tmp, "minions")
os.mkdir(minions)
path = os.path.join(minions, "h1_sensor.sls")
open(path, "w").close()
calls = []
args = soconfig.build_parser().parse_args([
"sync-yaml-mutation", path, "remove", "elasticfleet"
])
with patch.object(soconfig, "delete_setting_prefix",
side_effect=lambda *a, **kw: calls.append((a, kw))):
soconfig.cmd_sync_yaml_mutation(args)
self.assertEqual(calls[0][0], ("elasticfleet",))
self.assertEqual(calls[0][1]["node_id"], "h1_sensor")
if __name__ == "__main__":
unittest.main()
@@ -1,381 +0,0 @@
#!/usr/bin/env python3
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
# Imports detection overrides (e.g. from so-detections-backup) into the so-detection
# index. Reads <publicId>.<ext> files (NDJSON, one override per line) from a source
# directory, looks up the matching detection by publicId+engine, validates each
# override against the same rules SOC enforces, dedupes against existing overrides
# (operational fields only), and appends new ones.
import argparse
import ipaddress
import json
import os
import re
import sys
from datetime import datetime
import requests
from requests.auth import HTTPBasicAuth
import urllib3
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
DEFAULT_INDEX = "so-detection"
AUTH_FILE = "/opt/so/conf/elasticsearch/curl.config"
ES_URL = "https://localhost:9200"
# Engines we know how to handle and the file extension the backup script writes.
ENGINES = {
"suricata": "txt",
}
# Standard Suricata variables that ship with Security Onion. Anything else
# referenced in an override is "custom" and the user needs to make sure it
# exists in SOC Config before the override will function.
BUILTIN_SURICATA_VARS = {
"$HOME_NET", "$EXTERNAL_NET",
"$HTTP_SERVERS", "$DNS_SERVERS", "$SQL_SERVERS", "$SMTP_SERVERS",
"$TELNET_SERVERS", "$AIM_SERVERS", "$DC_SERVERS", "$MODBUS_SERVER",
"$MODBUS_CLIENT", "$ENIP_CLIENT", "$ENIP_SERVER",
"$HTTP_PORTS", "$SHELLCODE_PORTS", "$ORACLE_PORTS", "$SSH_PORTS",
"$FTP_PORTS", "$FILE_DATA_PORTS",
}
VAR_PATTERN = re.compile(r"\$[A-Z_][A-Z0-9_]*")
# Canonical valid values, per securityonion-soc/model/detection.go.
SURICATA_OVERRIDE_TYPES = {"suppress", "threshold", "modify"}
SUPPRESS_TRACKS = {"by_src", "by_dst", "by_either"}
THRESHOLD_TRACKS = {"by_src", "by_dst", "by_both"}
THRESHOLD_TYPES = {"limit", "threshold", "both"}
STALE_WARNING = """\
WARNING: so-detections-backup does not remove backup files when overrides are
deleted via the Security Onion web UI. As a result, files in the source
directory may represent overrides that were intentionally deleted and should
NOT be re-imported.
Before continuing, verify that the source directory reflects the overrides you
actually want imported. Remove any files corresponding to overrides you previously deleted.
"""
def make_session(auth_file):
with open(auth_file, "r") as f:
for line in f:
if line.startswith("user ="):
creds = line.split("=", 1)[1].strip().replace('"', "")
user, _, password = creds.partition(":")
session = requests.Session()
session.auth = HTTPBasicAuth(user, password)
session.headers.update({"Content-Type": "application/json"})
session.verify = False
return session
raise RuntimeError(f"Could not find 'user =' line in {auth_file}")
def find_detection(session, index, public_id, engine):
query = {
"query": {"bool": {"must": [
{"term": {"so_detection.publicId": public_id}},
{"term": {"so_detection.engine": engine}},
]}},
"size": 2,
}
r = session.get(f"{ES_URL}/{index}/_search", json=query)
r.raise_for_status()
hits = r.json().get("hits", {}).get("hits", [])
if not hits:
return None, None, None
if len(hits) > 1:
# Shouldn't happen — publicId is unique per engine — but flag it.
print(f" WARN: {len(hits)} detections matched publicId={public_id} engine={engine}; using first")
hit = hits[0]
existing = hit["_source"].get("so_detection", {}).get("overrides") or []
return hit["_id"], hit["_index"], existing
def update_overrides(session, doc_index, doc_id, overrides):
body = {"doc": {"so_detection": {"overrides": overrides}}}
r = session.post(f"{ES_URL}/{doc_index}/_update/{doc_id}", json=body)
r.raise_for_status()
return r.json()
def dedupe_key(override):
"""Operational fields only, per Override.Equal() in detection.go.
Excludes timestamps and isEnabled so re-imports don't appear unique."""
t = override.get("type")
if t == "suppress":
return (t, override.get("track"), override.get("ip"))
if t == "threshold":
return (t, override.get("thresholdType"), override.get("track"),
override.get("count"), override.get("seconds"))
if t == "modify":
return (t, override.get("regex"), override.get("value"))
def _validate_suricata_ip(ip):
if not ip:
return "ip cannot be empty"
if ip.startswith("$"):
return None
if ip.startswith("[") and ip.endswith("]"):
for part in ip[1:-1].split(","):
err = _validate_single_ip(part.strip())
if err:
return f"invalid IP in list: {err}"
return None
return _validate_single_ip(ip)
def _validate_single_ip(ip):
try:
if "/" in ip:
ipaddress.ip_network(ip, strict=False)
else:
ipaddress.ip_address(ip)
except ValueError:
return f"invalid IP/CIDR {ip!r}"
return None
def validate_override(override, engine):
"""Mirror Override.Validate() from securityonion-soc/model/detection.go.
Returns None on success, an error string otherwise."""
t = override.get("type")
if not t:
return "override type is required"
if t not in SURICATA_OVERRIDE_TYPES:
return f"invalid type {t!r}: must be one of {sorted(SURICATA_OVERRIDE_TYPES)}"
has = {k: override.get(k) is not None for k in
("regex", "value", "thresholdType", "track", "ip", "count", "seconds", "customFilter")}
if t == "suppress":
if not has["ip"] or not has["track"]:
return "suppress requires 'ip' and 'track'"
if any(has[k] for k in ("regex", "value", "thresholdType", "count", "seconds", "customFilter")):
return "suppress has unnecessary fields"
if override["track"] not in SUPPRESS_TRACKS:
return f"invalid track {override['track']!r}: must be one of {sorted(SUPPRESS_TRACKS)}"
return _validate_suricata_ip(override["ip"])
if t == "threshold":
if not all(has[k] for k in ("thresholdType", "track", "count", "seconds")):
return "threshold requires 'thresholdType', 'track', 'count', 'seconds'"
if any(has[k] for k in ("regex", "value", "customFilter")):
return "threshold has unnecessary fields"
if override["thresholdType"] not in THRESHOLD_TYPES:
return f"invalid thresholdType {override['thresholdType']!r}: must be one of {sorted(THRESHOLD_TYPES)}"
if override["track"] not in THRESHOLD_TRACKS:
return f"invalid track {override['track']!r}: must be one of {sorted(THRESHOLD_TRACKS)}"
if not isinstance(override["count"], int) or override["count"] <= 0:
return f"count must be a positive integer, got {override['count']!r}"
if not isinstance(override["seconds"], int) or override["seconds"] <= 0:
return f"seconds must be a positive integer, got {override['seconds']!r}"
return None
if t == "modify":
if not has["regex"] or not has["value"]:
return "modify requires 'regex' and 'value'"
if any(has[k] for k in ("thresholdType", "track", "count", "seconds", "customFilter")):
return "modify has unnecessary fields"
try:
re.compile(override["regex"])
except re.error as e:
return f"invalid regex: {e}"
return None
def parse_overrides_file(path):
"""Parse a file written by so-detections-backup.py: NDJSON, one override
per line. Returns a list of (override_dict, line_number)."""
overrides = []
with open(path, "r") as f:
for i, line in enumerate(f, start=1):
line = line.strip()
if not line:
continue
overrides.append((json.loads(line), i))
return overrides
def describe(override):
"""Human-readable summary of the operational fields for a given override type."""
t = override.get("type")
if t == "suppress":
return f"type=suppress track={override.get('track')} ip={override.get('ip')}"
if t == "threshold":
return (f"type=threshold track={override.get('track')} "
f"thresholdType={override.get('thresholdType')} "
f"count={override.get('count')} seconds={override.get('seconds')}")
if t == "modify":
return f"type=modify regex={override.get('regex')!r}"
def collect_custom_vars(override):
found = set()
for value in override.values():
if isinstance(value, str):
for match in VAR_PATTERN.findall(value):
if match not in BUILTIN_SURICATA_VARS:
found.add(match)
return found
def parse_args():
p = argparse.ArgumentParser(
description="Import detection overrides into the so-detection index.",
)
p.add_argument("--source", "-s", required=True,
help="Source directory containing <publicId>.<ext> override files.")
p.add_argument("--engine", "-e", default="suricata", choices=list(ENGINES.keys()),
help="Detection engine (default: suricata).")
p.add_argument("--dry-run", "-n", action="store_true",
help="Print what would happen without writing to Elasticsearch.")
p.add_argument("--no-import-note", action="store_true",
help="Do not prepend '[Imported YYYY-MM-DD] ' to the override note.")
p.add_argument("--index", "-i", default=DEFAULT_INDEX,
help=f"Elasticsearch index to update (default: {DEFAULT_INDEX}).")
return p.parse_args()
def confirm_proceed(args):
"""Show the stale-backup warning. Dry-run prints it and continues. Real
runs require the user typing 'yes' at the prompt."""
print(STALE_WARNING)
if args.dry_run:
print("(dry-run: no acknowledgement required)\n")
return True
answer = input("Type 'yes' to acknowledge and continue: ").strip().lower()
print()
return answer == "yes"
def main():
args = parse_args()
if not os.path.isdir(args.source):
print(f"ERROR: source directory not found: {args.source}", file=sys.stderr)
sys.exit(1)
extension = ENGINES[args.engine]
files = sorted(f for f in os.listdir(args.source) if f.endswith(f".{extension}"))
if not files:
print(f"No *.{extension} files found in {args.source}")
sys.exit(0)
if not confirm_proceed(args):
print("Aborted.")
sys.exit(1)
session = make_session(AUTH_FILE)
today = datetime.now().strftime("%Y-%m-%d")
note_prefix = "" if args.no_import_note else f"[Imported {today}] "
counts = {"added": 0, "skipped_dedupe": 0, "skipped_not_found": 0, "invalid": 0, "error": 0}
custom_vars = set()
mode = "DRY-RUN" if args.dry_run else "IMPORT"
print(f"[{mode}] engine={args.engine} source={args.source} index={args.index}\n")
for filename in files:
public_id = os.path.splitext(filename)[0]
path = os.path.join(args.source, filename)
print(f"{public_id}:")
try:
new_overrides = parse_overrides_file(path)
except (json.JSONDecodeError, OSError) as e:
print(f" ERROR: could not parse {filename}: {e}")
counts["error"] += 1
continue
if not new_overrides:
print(" SKIP: empty file")
continue
try:
doc_id, doc_index, existing = find_detection(session, args.index, public_id, args.engine)
except requests.HTTPError as e:
print(f" ERROR: search failed: {e}")
counts["error"] += 1
continue
if doc_id is None:
print(f" WARN: no detection found for publicId={public_id} engine={args.engine}; skipping")
counts["skipped_not_found"] += len(new_overrides)
continue
existing_keys = {dedupe_key(o) for o in existing}
merged = list(existing)
added_this_file = 0
for override, line_no in new_overrides:
err = validate_override(override, args.engine)
if err:
print(f" INVALID (line {line_no}): {err}")
counts["invalid"] += 1
continue
custom_vars.update(collect_custom_vars(override))
key = dedupe_key(override)
if key in existing_keys:
print(f" SKIP (line {line_no}): duplicate of existing override [{describe(override)}]")
counts["skipped_dedupe"] += 1
continue
if note_prefix:
override = dict(override)
override["note"] = note_prefix + (override.get("note") or "")
merged.append(override)
existing_keys.add(key)
added_this_file += 1
print(f" ADD (line {line_no}): {describe(override)}")
if added_this_file == 0:
continue
if args.dry_run:
print(f" DRY-RUN: would update {doc_index}/{doc_id} "
f"({len(existing)} existing → {len(merged)} total)")
counts["added"] += added_this_file
continue
try:
update_overrides(session, doc_index, doc_id, merged)
print(f" UPDATED {doc_index}/{doc_id} ({len(existing)} → {len(merged)})")
counts["added"] += added_this_file
except requests.HTTPError as e:
print(f" ERROR: update failed: {e}")
counts["error"] += 1
print()
print("=" * 60)
print(f"Summary ({mode}):")
print(f" Overrides added: {counts['added']}")
print(f" Skipped (already present): {counts['skipped_dedupe']}")
print(f" Skipped (no detection): {counts['skipped_not_found']}")
print(f" Invalid (failed checks): {counts['invalid']}")
print(f" Errors: {counts['error']}")
if custom_vars:
print()
print("WARNING: detected custom Suricata variables in imported overrides:")
for v in sorted(custom_vars):
print(f" {v}")
print("If any of these are not already defined in SOC Config (Suricata variables),")
print("you must add them manually before the rules will function correctly.")
sys.exit(0 if counts["error"] == 0 and counts["invalid"] == 0 else 1)
if __name__ == "__main__":
main()
@@ -1,588 +0,0 @@
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
import importlib.util
import json
import os
import shutil
import sys
import tempfile
import unittest
from importlib.machinery import SourceFileLoader
from io import StringIO
from unittest.mock import MagicMock, patch
import requests
# The script has no .py extension; spec_from_file_location can't auto-detect a
# loader, so we hand it a SourceFileLoader explicitly. (load_module() is
# deprecated in 3.14 and slated for removal in 3.15.)
HERE = os.path.dirname(os.path.abspath(__file__))
SCRIPT = os.path.join(HERE, "so-detections-overrides-import")
_loader = SourceFileLoader("so_overrides_import", SCRIPT)
_spec = importlib.util.spec_from_loader("so_overrides_import", _loader)
soi = importlib.util.module_from_spec(_spec)
_loader.exec_module(soi)
class TestValidateSuppress(unittest.TestCase):
def test_valid(self):
self.assertIsNone(soi.validate_override(
{"type": "suppress", "track": "by_src", "ip": "1.2.3.4"}, "suricata"))
def test_valid_var(self):
self.assertIsNone(soi.validate_override(
{"type": "suppress", "track": "by_either", "ip": "$HOME_NET"}, "suricata"))
def test_valid_cidr(self):
self.assertIsNone(soi.validate_override(
{"type": "suppress", "track": "by_dst", "ip": "10.0.0.0/8"}, "suricata"))
def test_valid_bracket_list(self):
self.assertIsNone(soi.validate_override(
{"type": "suppress", "track": "by_src", "ip": "[1.2.3.4,10.0.0.0/8]"}, "suricata"))
def test_missing_ip(self):
err = soi.validate_override({"type": "suppress", "track": "by_src"}, "suricata")
self.assertIn("requires", err)
def test_missing_track(self):
err = soi.validate_override({"type": "suppress", "ip": "1.2.3.4"}, "suricata")
self.assertIn("requires", err)
def test_invalid_track(self):
err = soi.validate_override(
{"type": "suppress", "track": "by_both", "ip": "1.2.3.4"}, "suricata")
self.assertIn("invalid track", err)
def test_invalid_ip(self):
err = soi.validate_override(
{"type": "suppress", "track": "by_src", "ip": "not-an-ip"}, "suricata")
self.assertIn("invalid IP", err)
def test_unnecessary_field(self):
err = soi.validate_override(
{"type": "suppress", "track": "by_src", "ip": "1.2.3.4", "count": 5}, "suricata")
self.assertIn("unnecessary fields", err)
class TestValidateThreshold(unittest.TestCase):
def test_valid(self):
self.assertIsNone(soi.validate_override({
"type": "threshold", "track": "by_src",
"thresholdType": "limit", "count": 10, "seconds": 60,
}, "suricata"))
def test_valid_by_both(self):
self.assertIsNone(soi.validate_override({
"type": "threshold", "track": "by_both",
"thresholdType": "both", "count": 1, "seconds": 1,
}, "suricata"))
def test_track_by_either_invalid(self):
err = soi.validate_override({
"type": "threshold", "track": "by_either",
"thresholdType": "limit", "count": 10, "seconds": 60,
}, "suricata")
self.assertIn("invalid track", err)
def test_invalid_threshold_type(self):
err = soi.validate_override({
"type": "threshold", "track": "by_src",
"thresholdType": "bogus", "count": 10, "seconds": 60,
}, "suricata")
self.assertIn("invalid thresholdType", err)
def test_zero_count(self):
err = soi.validate_override({
"type": "threshold", "track": "by_src",
"thresholdType": "limit", "count": 0, "seconds": 60,
}, "suricata")
self.assertIn("count", err)
def test_negative_seconds(self):
err = soi.validate_override({
"type": "threshold", "track": "by_src",
"thresholdType": "limit", "count": 10, "seconds": -1,
}, "suricata")
self.assertIn("seconds", err)
def test_missing_field(self):
err = soi.validate_override({
"type": "threshold", "track": "by_src",
"thresholdType": "limit", "count": 10, # missing seconds
}, "suricata")
self.assertIn("requires", err)
def test_unnecessary_field(self):
err = soi.validate_override({
"type": "threshold", "track": "by_src",
"thresholdType": "limit", "count": 10, "seconds": 60,
"regex": "foo",
}, "suricata")
self.assertIn("unnecessary fields", err)
class TestValidateModify(unittest.TestCase):
def test_valid(self):
self.assertIsNone(soi.validate_override(
{"type": "modify", "regex": r"content:\"foo\"", "value": "content:bar"}, "suricata"))
def test_invalid_regex(self):
err = soi.validate_override(
{"type": "modify", "regex": "(unbalanced", "value": "x"}, "suricata")
self.assertIn("invalid regex", err)
def test_missing_value(self):
err = soi.validate_override({"type": "modify", "regex": "x"}, "suricata")
self.assertIn("requires", err)
def test_unnecessary_field(self):
err = soi.validate_override(
{"type": "modify", "regex": "x", "value": "y", "track": "by_src"}, "suricata")
self.assertIn("unnecessary fields", err)
class TestValidateMisc(unittest.TestCase):
def test_unknown_type(self):
err = soi.validate_override({"type": "suppresss", "track": "by_src", "ip": "1.2.3.4"}, "suricata")
self.assertIn("invalid type", err)
def test_missing_type(self):
err = soi.validate_override({"track": "by_src"}, "suricata")
self.assertIn("type is required", err)
class TestValidateIP(unittest.TestCase):
def test_plain_ipv4(self):
self.assertIsNone(soi._validate_suricata_ip("1.2.3.4"))
def test_plain_ipv6(self):
self.assertIsNone(soi._validate_suricata_ip("::1"))
def test_cidr(self):
self.assertIsNone(soi._validate_suricata_ip("10.0.0.0/8"))
def test_var(self):
self.assertIsNone(soi._validate_suricata_ip("$CONCOURSEWORKERS"))
def test_bracket_list(self):
self.assertIsNone(soi._validate_suricata_ip("[1.2.3.4, 10.0.0.0/8]"))
def test_bracket_list_bad_member(self):
err = soi._validate_suricata_ip("[1.2.3.4,nope]")
self.assertIn("invalid IP in list", err)
def test_empty(self):
self.assertIn("empty", soi._validate_suricata_ip(""))
def test_invalid(self):
self.assertIn("invalid", soi._validate_suricata_ip("999.999.999.999"))
class TestDedupeKey(unittest.TestCase):
def test_suppress(self):
a = {"type": "suppress", "track": "by_src", "ip": "1.2.3.4", "count": 99}
b = {"type": "suppress", "track": "by_src", "ip": "1.2.3.4"}
# count is irrelevant for suppress dedupe
self.assertEqual(soi.dedupe_key(a), soi.dedupe_key(b))
def test_suppress_differs_on_ip(self):
a = {"type": "suppress", "track": "by_src", "ip": "1.2.3.4"}
b = {"type": "suppress", "track": "by_src", "ip": "5.6.7.8"}
self.assertNotEqual(soi.dedupe_key(a), soi.dedupe_key(b))
def test_threshold(self):
a = {"type": "threshold", "track": "by_src", "thresholdType": "limit",
"count": 10, "seconds": 60, "ip": "ignored"}
b = {"type": "threshold", "track": "by_src", "thresholdType": "limit",
"count": 10, "seconds": 60}
self.assertEqual(soi.dedupe_key(a), soi.dedupe_key(b))
def test_threshold_differs_on_count(self):
a = {"type": "threshold", "track": "by_src", "thresholdType": "limit",
"count": 10, "seconds": 60}
b = {"type": "threshold", "track": "by_src", "thresholdType": "limit",
"count": 20, "seconds": 60}
self.assertNotEqual(soi.dedupe_key(a), soi.dedupe_key(b))
def test_modify(self):
a = {"type": "modify", "regex": "x", "value": "y"}
b = {"type": "modify", "regex": "x", "value": "y"}
self.assertEqual(soi.dedupe_key(a), soi.dedupe_key(b))
class TestDescribe(unittest.TestCase):
def test_suppress(self):
s = soi.describe({"type": "suppress", "track": "by_src", "ip": "1.2.3.4"})
self.assertIn("suppress", s)
self.assertIn("by_src", s)
self.assertIn("1.2.3.4", s)
def test_threshold_includes_count(self):
s = soi.describe({"type": "threshold", "track": "by_src",
"thresholdType": "limit", "count": 10, "seconds": 60})
self.assertIn("count=10", s)
self.assertIn("seconds=60", s)
def test_modify(self):
s = soi.describe({"type": "modify", "regex": "foo"})
self.assertIn("modify", s)
self.assertIn("foo", s)
class TestParseOverridesFile(unittest.TestCase):
def _write(self, content):
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with open(path, "w") as f:
f.write(content)
self.addCleanup(os.unlink, path)
return path
def test_single_line(self):
path = self._write('{"type":"suppress","track":"by_src","ip":"1.2.3.4"}')
result = soi.parse_overrides_file(path)
self.assertEqual(len(result), 1)
self.assertEqual(result[0][0]["type"], "suppress")
self.assertEqual(result[0][1], 1)
def test_ndjson(self):
path = self._write(
'{"type":"suppress","track":"by_src","ip":"1.2.3.4"}\n'
'{"type":"suppress","track":"by_dst","ip":"5.6.7.8"}\n'
)
result = soi.parse_overrides_file(path)
self.assertEqual(len(result), 2)
self.assertEqual(result[1][1], 2)
def test_empty(self):
path = self._write("")
self.assertEqual(soi.parse_overrides_file(path), [])
def test_blank_lines_skipped(self):
path = self._write('\n{"type":"suppress","track":"by_src","ip":"1.2.3.4"}\n\n')
result = soi.parse_overrides_file(path)
self.assertEqual(len(result), 1)
self.assertEqual(result[0][1], 2) # line number reflects original position
def test_invalid_raises(self):
path = self._write("not json")
with self.assertRaises(json.JSONDecodeError):
soi.parse_overrides_file(path)
class TestCollectCustomVars(unittest.TestCase):
def test_finds_custom(self):
v = soi.collect_custom_vars({"ip": "$CONCOURSEWORKERS"})
self.assertEqual(v, {"$CONCOURSEWORKERS"})
def test_filters_builtins(self):
v = soi.collect_custom_vars({"ip": "$HOME_NET"})
self.assertEqual(v, set())
def test_mixed(self):
v = soi.collect_custom_vars({"ip": "[$HOME_NET,$MYNET]"})
self.assertEqual(v, {"$MYNET"})
def test_non_string_fields_ignored(self):
v = soi.collect_custom_vars({"count": 10, "isEnabled": True})
self.assertEqual(v, set())
class TestMakeSession(unittest.TestCase):
def _write(self, content):
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "w") as f:
f.write(content)
self.addCleanup(os.unlink, path)
return path
def test_valid_auth_file(self):
path = self._write('user = "admin:secret"\n')
session = soi.make_session(path)
self.assertEqual(session.auth.username, "admin")
self.assertEqual(session.auth.password, "secret")
self.assertFalse(session.verify)
def test_missing_user_line(self):
path = self._write("# no user line here\n")
with self.assertRaises(RuntimeError):
soi.make_session(path)
class TestFindDetection(unittest.TestCase):
def _session_with_response(self, payload):
session = MagicMock()
response = MagicMock()
response.json.return_value = payload
response.raise_for_status.return_value = None
session.get.return_value = response
return session
def test_found(self):
session = self._session_with_response({"hits": {"hits": [{
"_id": "abc", "_index": "so-detection",
"_source": {"so_detection": {"overrides": [{"type": "suppress"}]}},
}]}})
doc_id, idx, existing = soi.find_detection(session, "so-detection", "2049201", "suricata")
self.assertEqual(doc_id, "abc")
self.assertEqual(idx, "so-detection")
self.assertEqual(len(existing), 1)
def test_not_found(self):
session = self._session_with_response({"hits": {"hits": []}})
doc_id, idx, existing = soi.find_detection(session, "so-detection", "x", "suricata")
self.assertIsNone(doc_id)
self.assertIsNone(idx)
self.assertIsNone(existing)
def test_no_overrides_field(self):
session = self._session_with_response({"hits": {"hits": [{
"_id": "abc", "_index": "so-detection",
"_source": {"so_detection": {}},
}]}})
_, _, existing = soi.find_detection(session, "so-detection", "x", "suricata")
self.assertEqual(existing, [])
def test_multiple_hits_warns(self):
session = self._session_with_response({"hits": {"hits": [
{"_id": "a", "_index": "i", "_source": {"so_detection": {"overrides": []}}},
{"_id": "b", "_index": "i", "_source": {"so_detection": {"overrides": []}}},
]}})
with patch("sys.stdout", new=StringIO()) as out:
doc_id, _, _ = soi.find_detection(session, "i", "x", "suricata")
self.assertEqual(doc_id, "a")
self.assertIn("WARN", out.getvalue())
class TestUpdateOverrides(unittest.TestCase):
def test_posts_to_update_endpoint(self):
session = MagicMock()
response = MagicMock()
response.raise_for_status.return_value = None
response.json.return_value = {"result": "updated"}
session.post.return_value = response
result = soi.update_overrides(session, "so-detection", "abc", [{"type": "suppress"}])
self.assertEqual(result, {"result": "updated"})
url = session.post.call_args[0][0]
self.assertIn("/_update/abc", url)
body = session.post.call_args[1]["json"]
self.assertEqual(body["doc"]["so_detection"]["overrides"], [{"type": "suppress"}])
class TestConfirmProceed(unittest.TestCase):
def test_dry_run_skips_prompt(self):
args = MagicMock(dry_run=True)
with patch("sys.stdout", new=StringIO()):
self.assertTrue(soi.confirm_proceed(args))
def test_yes_input(self):
args = MagicMock(dry_run=False)
with patch("sys.stdout", new=StringIO()):
with patch("builtins.input", return_value="yes"):
self.assertTrue(soi.confirm_proceed(args))
def test_yes_input_case_insensitive(self):
args = MagicMock(dry_run=False)
with patch("sys.stdout", new=StringIO()):
with patch("builtins.input", return_value="YES"):
self.assertTrue(soi.confirm_proceed(args))
def test_no_input_aborts(self):
args = MagicMock(dry_run=False)
with patch("sys.stdout", new=StringIO()):
with patch("builtins.input", return_value="no"):
self.assertFalse(soi.confirm_proceed(args))
def test_empty_input_aborts(self):
args = MagicMock(dry_run=False)
with patch("sys.stdout", new=StringIO()):
with patch("builtins.input", return_value=""):
self.assertFalse(soi.confirm_proceed(args))
class TestParseArgs(unittest.TestCase):
def test_defaults(self):
with patch.object(sys, "argv", ["cmd", "--source", "/some/path"]):
args = soi.parse_args()
self.assertEqual(args.source, "/some/path")
self.assertEqual(args.engine, "suricata")
self.assertFalse(args.dry_run)
self.assertFalse(args.no_import_note)
self.assertEqual(args.index, soi.DEFAULT_INDEX)
def test_all_options(self):
argv = ["cmd", "-s", "/x", "-e", "suricata", "-n",
"--no-import-note", "-i", "alt-index"]
with patch.object(sys, "argv", argv):
args = soi.parse_args()
self.assertEqual(args.source, "/x")
self.assertTrue(args.dry_run)
self.assertTrue(args.no_import_note)
self.assertEqual(args.index, "alt-index")
class TestMain(unittest.TestCase):
def setUp(self):
self.tmpdir = tempfile.mkdtemp()
self.addCleanup(shutil.rmtree, self.tmpdir, ignore_errors=True)
# Stub make_session so tests don't need /opt/so/conf/elasticsearch/curl.config.
p = patch.object(soi, "make_session", return_value=MagicMock())
p.start()
self.addCleanup(p.stop)
def _write_file(self, public_id, overrides, ext="txt"):
"""Write an NDJSON override file. Entries may be dicts or raw strings (for malformed input)."""
path = os.path.join(self.tmpdir, f"{public_id}.{ext}")
with open(path, "w") as f:
for o in overrides:
f.write(o if isinstance(o, str) else json.dumps(o))
f.write("\n")
return path
def _run_main(self, *extra_argv, input_response="yes"):
"""Run main() with stdout/stderr captured and input mocked. Returns (stdout, stderr, exit_code)."""
argv = ["cmd", "--source", self.tmpdir, *extra_argv]
out, err = StringIO(), StringIO()
with patch.object(sys, "argv", argv), \
patch("sys.stdout", new=out), \
patch("sys.stderr", new=err), \
patch("builtins.input", return_value=input_response):
with self.assertRaises(SystemExit) as cm:
soi.main()
return out.getvalue(), err.getvalue(), cm.exception.code
def test_source_dir_missing(self):
argv = ["cmd", "--source", "/no/such/path/here"]
err = StringIO()
with patch.object(sys, "argv", argv), patch("sys.stderr", new=err):
with self.assertRaises(SystemExit) as cm:
soi.main()
self.assertEqual(cm.exception.code, 1)
self.assertIn("source directory not found", err.getvalue())
def test_no_files_found(self):
out, _, code = self._run_main()
self.assertEqual(code, 0)
self.assertIn("No *.txt files found", out)
def test_user_aborts(self):
self._write_file("1001", [{"type": "suppress", "track": "by_src", "ip": "1.2.3.4"}])
out, _, code = self._run_main(input_response="no")
self.assertEqual(code, 1)
self.assertIn("Aborted", out)
def test_parse_error_increments_error(self):
# Malformed JSON line — parse_overrides_file raises JSONDecodeError.
self._write_file("1002", ["not json"])
out, _, code = self._run_main("--dry-run")
self.assertEqual(code, 1) # invalid+error → non-zero
self.assertIn("could not parse", out)
self.assertIn("Errors: 1", out)
def test_empty_file_skipped(self):
# Blank lines only — parse_overrides_file returns []; main reports "empty file" and continues.
path = os.path.join(self.tmpdir, "1003.txt")
with open(path, "w") as f:
f.write("\n\n")
out, _, code = self._run_main("--dry-run")
self.assertEqual(code, 0)
self.assertIn("empty file", out)
@patch.object(soi, "find_detection")
def test_search_http_error(self, mock_find):
mock_find.side_effect = requests.HTTPError("boom")
self._write_file("1004", [{"type": "suppress", "track": "by_src", "ip": "1.2.3.4"}])
out, _, code = self._run_main("--dry-run")
self.assertEqual(code, 1)
self.assertIn("search failed", out)
@patch.object(soi, "find_detection")
def test_no_detection_found(self, mock_find):
mock_find.return_value = (None, None, None)
self._write_file("1005", [{"type": "suppress", "track": "by_src", "ip": "1.2.3.4"}])
out, _, code = self._run_main("--dry-run")
self.assertEqual(code, 0)
self.assertIn("no detection found", out)
self.assertIn("Skipped (no detection): 1", out)
@patch.object(soi, "find_detection")
def test_all_duplicates_no_update(self, mock_find):
existing = [{"type": "suppress", "track": "by_src", "ip": "1.2.3.4"}]
mock_find.return_value = ("doc1", "so-detection", existing)
self._write_file("1006", [{"type": "suppress", "track": "by_src", "ip": "1.2.3.4"}])
out, _, code = self._run_main("--dry-run")
self.assertEqual(code, 0)
self.assertIn("SKIP", out)
self.assertNotIn("DRY-RUN: would update", out) # added_this_file == 0 branch
@patch.object(soi, "update_overrides")
@patch.object(soi, "find_detection")
def test_happy_path_full(self, mock_find, mock_update):
# Exercises: ADD, dedupe SKIP, INVALID, note prefix, UPDATE, custom-vars warning, exit=1 (invalid present)
existing = [{"type": "suppress", "track": "by_src", "ip": "9.9.9.9"}]
mock_find.return_value = ("doc1", "so-detection", existing)
mock_update.return_value = {"result": "updated"}
self._write_file("1007", [
{"type": "suppress", "track": "by_src", "ip": "1.2.3.4"}, # ADD
{"type": "suppress", "track": "by_src", "ip": "9.9.9.9"}, # SKIP (dupe of existing)
{"type": "suppress", "track": "bogus", "ip": "1.2.3.4"}, # INVALID
{"type": "suppress", "track": "by_src", "ip": "$CONCOURSEWORKERS"}, # ADD + custom var
])
out, _, code = self._run_main()
self.assertEqual(code, 1) # one invalid -> non-zero
mock_update.assert_called_once()
merged = mock_update.call_args[0][3]
self.assertEqual(len(merged), 3) # 1 existing + 2 new
new_notes = [o.get("note", "") for o in merged if o.get("ip") in ("1.2.3.4", "$CONCOURSEWORKERS")]
self.assertTrue(all(n.startswith("[Imported ") for n in new_notes))
self.assertIn("ADD", out)
self.assertIn("SKIP", out)
self.assertIn("INVALID", out)
self.assertIn("UPDATED", out)
self.assertIn("$CONCOURSEWORKERS", out)
@patch.object(soi, "update_overrides")
@patch.object(soi, "find_detection")
def test_no_import_note_preserves_note(self, mock_find, mock_update):
mock_find.return_value = ("doc1", "so-detection", [])
mock_update.return_value = {"result": "updated"}
self._write_file("1008", [
{"type": "suppress", "track": "by_src", "ip": "1.2.3.4", "note": "original"},
])
_, _, code = self._run_main("--no-import-note")
self.assertEqual(code, 0)
merged = mock_update.call_args[0][3]
self.assertEqual(merged[0]["note"], "original") # no prefix applied
@patch.object(soi, "find_detection")
def test_dry_run_skips_update(self, mock_find):
mock_find.return_value = ("doc1", "so-detection", [])
self._write_file("1009", [{"type": "suppress", "track": "by_src", "ip": "1.2.3.4"}])
with patch.object(soi, "update_overrides") as mock_update:
out, _, code = self._run_main("--dry-run")
self.assertEqual(code, 0)
mock_update.assert_not_called()
self.assertIn("DRY-RUN: would update", out)
@patch.object(soi, "update_overrides")
@patch.object(soi, "find_detection")
def test_update_http_error(self, mock_find, mock_update):
mock_find.return_value = ("doc1", "so-detection", [])
mock_update.side_effect = requests.HTTPError("nope")
self._write_file("1010", [{"type": "suppress", "track": "by_src", "ip": "1.2.3.4"}])
out, _, code = self._run_main()
self.assertEqual(code, 1)
self.assertIn("update failed", out)
if __name__ == "__main__":
unittest.main()
+30
View File
@@ -314,6 +314,24 @@ EOSQL
fi
}
function sync_minion_config_to_db() {
log "INFO" "Syncing minion config to onionconfig for $MINION_ID"
/usr/sbin/so-config.py import-minion "$MINION_ID" --note "so-minion $OPERATION"
if [ $? -ne 0 ]; then
log "ERROR" "Failed to sync minion config to onionconfig for $MINION_ID"
return 1
fi
}
function purge_minion_config_from_db() {
log "INFO" "Purging minion config from onionconfig for $MINION_ID"
/usr/sbin/so-config.py purge-node "$MINION_ID" --note "so-minion delete"
if [ $? -ne 0 ]; then
log "ERROR" "Failed to purge minion config from onionconfig for $MINION_ID"
return 1
fi
}
# Create the minion file
function ensure_socore_ownership() {
log "INFO" "Setting socore ownership on minion files"
@@ -1088,6 +1106,10 @@ case "$OPERATION" in
log "ERROR" "Failed to setup minion files for $MINION_ID"
exit 1
}
sync_minion_config_to_db || {
log "ERROR" "Failed to sync minion config to onionconfig for $MINION_ID"
exit 1
}
updateMineAndApplyStates || {
log "ERROR" "Failed to update mine and apply states for $MINION_ID"
exit 1
@@ -1108,12 +1130,20 @@ case "$OPERATION" in
log "ERROR" "Failed to setup VM minion files for $MINION_ID"
exit 1
}
sync_minion_config_to_db || {
log "ERROR" "Failed to sync VM minion config to onionconfig for $MINION_ID"
exit 1
}
log "INFO" "Successfully added VM minion $MINION_ID"
;;
"delete")
log "INFO" "Removing minion $MINION_ID"
remove_postgres_telegraf_from_minion
purge_minion_config_from_db || {
log "ERROR" "Failed to purge minion config from onionconfig for $MINION_ID"
exit 1
}
deleteMinionFiles || {
log "ERROR" "Failed to delete minion files for $MINION_ID"
exit 1
-232
View File
@@ -1,232 +0,0 @@
#!/opt/saltstack/salt/bin/python3
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
"""
so-push-drainer
===============
Scheduled drainer for the active-push feature. Runs on the manager every
drain_interval seconds (default 15) via a salt schedule in salt/schedule.sls.
For each intent file under /opt/so/state/push_pending/*.json whose last_touch
is older than debounce_seconds, this script:
* concatenates the actions lists from every ready intent
* dedupes by (state or __highstate__, tgt, tgt_type)
* dispatches a single `salt-run state.orchestrate orch.push_batch --async`
with the deduped actions list passed as pillar kwargs
* deletes the contributed intent files on successful dispatch
Reactor sls files (push_suricata, push_strelka, push_pillar) write intents
but never dispatch directly -- see plan
/home/mreeves/.claude/plans/goofy-marinating-hummingbird.md for the full design.
"""
import fcntl
import glob
import json
import logging
import logging.handlers
import os
import subprocess
import sys
import time
import salt.client
PENDING_DIR = '/opt/so/state/push_pending'
LOCK_FILE = os.path.join(PENDING_DIR, '.lock')
LOG_FILE = '/opt/so/log/salt/so-push-drainer.log'
HIGHSTATE_SENTINEL = '__highstate__'
def _make_logger():
logger = logging.getLogger('so-push-drainer')
logger.setLevel(logging.INFO)
if not logger.handlers:
os.makedirs(os.path.dirname(LOG_FILE), exist_ok=True)
handler = logging.handlers.RotatingFileHandler(
LOG_FILE, maxBytes=5 * 1024 * 1024, backupCount=3,
)
handler.setFormatter(logging.Formatter(
'%(asctime)s | %(levelname)s | %(message)s',
))
logger.addHandler(handler)
return logger
def _load_push_cfg():
"""Read the global:push pillar subtree via salt-call. Returns a dict."""
caller = salt.client.Caller()
cfg = caller.cmd('pillar.get', 'global:push', {})
return cfg if isinstance(cfg, dict) else {}
def _read_intent(path, log):
try:
with open(path, 'r') as f:
return json.load(f)
except (IOError, ValueError) as exc:
log.warning('cannot read intent %s: %s', path, exc)
return None
except Exception:
log.exception('unexpected error reading %s', path)
return None
def _dedupe_actions(actions):
seen = set()
deduped = []
for action in actions:
if not isinstance(action, dict):
continue
state_key = HIGHSTATE_SENTINEL if action.get('highstate') else action.get('state')
tgt = action.get('tgt')
tgt_type = action.get('tgt_type', 'compound')
if not state_key or not tgt:
continue
key = (state_key, tgt, tgt_type)
if key in seen:
continue
seen.add(key)
deduped.append(action)
return deduped
def _dispatch(actions, log):
pillar_arg = json.dumps({'actions': actions})
cmd = [
'salt-run',
'state.orchestrate',
'orch.push_batch',
'pillar={}'.format(pillar_arg),
'--async',
]
log.info('dispatching: %s', ' '.join(cmd[:3]) + ' pillar=<{} actions>'.format(len(actions)))
try:
result = subprocess.run(
cmd, check=True, capture_output=True, text=True, timeout=60,
)
except subprocess.CalledProcessError as exc:
log.error('dispatch failed (rc=%s): stdout=%s stderr=%s',
exc.returncode, exc.stdout, exc.stderr)
return False
except subprocess.TimeoutExpired:
log.error('dispatch timed out after 60s')
return False
except Exception:
log.exception('dispatch raised')
return False
log.info('dispatch accepted: %s', (result.stdout or '').strip())
return True
def main():
log = _make_logger()
if not os.path.isdir(PENDING_DIR):
# Nothing to do; reactors create the dir on first use.
return 0
try:
push = _load_push_cfg()
except Exception:
log.exception('failed to read global:push pillar; aborting drain pass')
return 1
if not push.get('enabled', True):
log.debug('push disabled; exiting')
return 0
debounce_seconds = int(push.get('debounce_seconds', 30))
os.makedirs(PENDING_DIR, exist_ok=True)
lock_fd = os.open(LOCK_FILE, os.O_CREAT | os.O_RDWR, 0o644)
try:
fcntl.flock(lock_fd, fcntl.LOCK_EX)
intent_files = [
p for p in sorted(glob.glob(os.path.join(PENDING_DIR, '*.json')))
if os.path.basename(p) != '.lock'
]
if not intent_files:
return 0
now = time.time()
ready = []
skipped = 0
broken = []
for path in intent_files:
intent = _read_intent(path, log)
if not isinstance(intent, dict):
broken.append(path)
continue
last_touch = intent.get('last_touch', 0)
if now - last_touch < debounce_seconds:
skipped += 1
continue
ready.append((path, intent))
for path in broken:
try:
os.unlink(path)
except OSError:
pass
if not ready:
if skipped:
log.debug('no ready intents (%d still in debounce window)', skipped)
return 0
combined_actions = []
oldest_first_touch = now
all_paths = []
for path, intent in ready:
combined_actions.extend(intent.get('actions', []) or [])
first = intent.get('first_touch', now)
if first < oldest_first_touch:
oldest_first_touch = first
all_paths.extend(intent.get('paths', []) or [])
deduped = _dedupe_actions(combined_actions)
if not deduped:
log.warning('%d intent(s) had no usable actions; clearing', len(ready))
for path, _ in ready:
try:
os.unlink(path)
except OSError:
pass
return 0
debounce_duration = now - oldest_first_touch
log.info(
'draining %d intent(s): %d action(s) after dedupe (raw=%d), '
'debounce_duration=%.1fs, paths=%s',
len(ready), len(deduped), len(combined_actions),
debounce_duration, all_paths[:20],
)
if not _dispatch(deduped, log):
log.warning('dispatch failed; leaving intent files in place for retry')
return 1
for path, _ in ready:
try:
os.unlink(path)
except OSError:
log.exception('failed to remove drained intent %s', path)
return 0
finally:
try:
fcntl.flock(lock_fd, fcntl.LOCK_UN)
finally:
os.close(lock_fd)
if __name__ == '__main__':
sys.exit(main())
+25 -1
View File
@@ -25,6 +25,7 @@ def showUsage(args):
print(' get [-r] - Displays (to stdout) the value stored in the given key. Requires KEY arg. Use -r for raw output without YAML formatting.', file=sys.stderr)
print(' remove - Removes a yaml key, if it exists. Requires KEY arg.', file=sys.stderr)
print(' replace - Replaces (or adds) a new key and set its value. Requires KEY and VALUE args.', file=sys.stderr)
print(' purge - Delete the YAML file from disk (no KEY arg).', file=sys.stderr)
print(' help - Prints this usage information.', file=sys.stderr)
print('', file=sys.stderr)
print(' Where:', file=sys.stderr)
@@ -53,7 +54,20 @@ def loadYaml(filename):
def writeYaml(filename, content):
file = open(filename, "w")
return yaml.safe_dump(content, file)
result = yaml.safe_dump(content, file)
file.close()
return result
def purgeFile(filename):
"""Delete a YAML file from disk. Idempotent; missing files are success."""
if os.path.exists(filename):
try:
os.remove(filename)
except Exception as e:
print(f"Failed to remove {filename}: {e}", file=sys.stderr)
return 1
return 0
def appendItem(content, key, listItem):
@@ -371,6 +385,15 @@ def get(args):
return 0
def purge(args):
"""purge YAML_FILE - delete the file from disk."""
if len(args) != 1:
print('Missing filename arg', file=sys.stderr)
showUsage(None)
return 1
return purgeFile(args[0])
def main():
args = sys.argv[1:]
@@ -388,6 +411,7 @@ def main():
"get": get,
"remove": remove,
"replace": replace,
"purge": purge,
}
code = 1
+28
View File
@@ -991,3 +991,31 @@ class TestLoadYaml(unittest.TestCase):
soyaml.loadYaml("/tmp/so-yaml_test-unreadable.yaml")
sysmock.assert_called_with(1)
self.assertIn("Error reading file", mock_stderr.getvalue())
class TestPurge(unittest.TestCase):
def test_purge_missing_arg(self):
# showUsage calls sys.exit(1); patch it like the other tests do.
with patch('sys.exit', new=MagicMock()):
with patch('sys.stderr', new=StringIO()) as mock_stderr:
rc = soyaml.purge([])
self.assertEqual(rc, 1)
self.assertIn("Missing filename", mock_stderr.getvalue())
def test_purge_existing_file(self):
filename = "/tmp/so-yaml_test_purge.yaml"
with open(filename, "w") as f:
f.write("key: value\n")
rc = soyaml.purge([filename])
self.assertEqual(rc, 0)
import os as _os
self.assertFalse(_os.path.exists(filename))
def test_purge_missing_file_idempotent(self):
filename = "/tmp/so-yaml_test_purge_missing.yaml"
import os as _os
if _os.path.exists(filename):
_os.remove(filename)
rc = soyaml.purge([filename])
self.assertEqual(rc, 0)
+22 -438
View File
@@ -188,6 +188,13 @@ airgap_update_dockers() {
fi
}
backup_old_states_pillars() {
tar czf /nsm/backup/$(echo $INSTALLEDVERSION)_$(date +%Y%m%d-%H%M%S)_soup_default_states_pillars.tar.gz /opt/so/saltstack/default/
tar czf /nsm/backup/$(echo $INSTALLEDVERSION)_$(date +%Y%m%d-%H%M%S)_soup_local_states_pillars.tar.gz /opt/so/saltstack/local/
}
update_registry() {
docker stop so-dockerregistry
docker rm so-dockerregistry
@@ -343,11 +350,10 @@ highstate() {
masterlock() {
echo "Locking Salt Master"
mv -v $TOPFILE $BACKUPTOPFILE
# Render the real top file only for the host running soup; every other
# minion gets an empty top (no states) while the master is upgrading.
echo "{% if grains['id'] == '$MINIONID' %}" > $TOPFILE
cat $BACKUPTOPFILE >> $TOPFILE
echo "{% endif %}" >> $TOPFILE
echo "base:" > $TOPFILE
echo " $MINIONID:" >> $TOPFILE
echo " - ca" >> $TOPFILE
echo " - elasticsearch" >> $TOPFILE
}
masterunlock() {
@@ -366,7 +372,6 @@ preupgrade_changes() {
[[ "$INSTALLEDVERSION" =~ ^2\.4\.21[0-9]+$ ]] && up_to_3.0.0
[[ "$INSTALLEDVERSION" == "3.0.0" ]] && up_to_3.1.0
[[ "$INSTALLEDVERSION" == "3.1.0" ]] && up_to_3.2.0
true
}
@@ -376,7 +381,6 @@ postupgrade_changes() {
[[ "$POSTVERSION" =~ ^2\.4\.21[0-9]+$ ]] && post_to_3.0.0
[[ "$POSTVERSION" == "3.0.0" ]] && post_to_3.1.0
[[ "$POSTVERSION" == "3.1.0" ]] && post_to_3.2.0
true
}
@@ -481,158 +485,6 @@ elasticsearch_backup_index_templates() {
tar -czf /nsm/backup/3.0.0_elasticsearch_index_templates.tar.gz -C /opt/so/conf/elasticsearch/templates/index/ .
}
elasticfleet_set_agent_logging_level_warn() {
. /usr/sbin/so-elastic-fleet-common
local current_agent_policies
if ! current_agent_policies=$(fleet_api "agent_policies?perPage=1000"); then
echo "Warning: unable to retrieve Fleet agent policies"
return 0
fi
# Only updating policies that are within Security Onion defaults and do not already have any user configured advanced_settings.
local policies_to_update
policies_to_update=$(jq -c '
.items[]
| select(has("advanced_settings") | not)
| select(
.id == "so-grid-nodes_general"
or .id == "so-grid-nodes_heavy"
or .id == "endpoints-initial"
or (.id | startswith("FleetServer_"))
)
' <<< "$current_agent_policies")
if [[ -z "$policies_to_update" ]]; then
return 0
fi
while IFS= read -r policy; do
[[ -z "$policy" ]] && continue
local policy_id policy_name policy_namespace
policy_id=$(jq -r '.id' <<< "$policy")
policy_name=$(jq -r '.name' <<< "$policy")
policy_namespace=$(jq -r '.namespace' <<< "$policy")
local update_logging
update_logging=$(jq -n \
--arg name "$policy_name" \
--arg namespace "$policy_namespace" \
'{name: $name, namespace: $namespace, advanced_settings: {agent_logging_level: "warning"}}'
)
echo "Setting elastic agent_logging_level to warning on policy '$policy_name' ($policy_id)."
if ! fleet_api "agent_policies/$policy_id" -XPUT -H 'kbn-xsrf: true' -H 'Content-Type: application/json' -d "$update_logging" >/dev/null; then
echo " warning: failed to update agent policy '$policy_name' ($policy_id)" >&2
fi
done <<< "$policies_to_update"
}
update_logstash_pipeline_name() {
local original_pipeline_name="$1"
local new_pipeline_name="$2"
echo "Checking for conflicting logstash defined_pipelines pillar value."
local LOGSTASH_FILE=/opt/so/saltstack/local/pillar/logstash/soc_logstash.sls
local MINIONDIR=/opt/so/saltstack/local/pillar/minions
for pillar_file in "$LOGSTASH_FILE" "$MINIONDIR"/*.sls; do
[[ -f "$pillar_file" ]] || continue
if grep -q "$original_pipeline_name$" "$pillar_file"; then
echo "Found conflicting defined_pipeline pillar value in $pillar_file. Updating to use the new logstash pipeline name."
sed -i "s#$original_pipeline_name\$#$new_pipeline_name#g" "$pillar_file"
chown socore:socore "$pillar_file"
fi
done
}
check_transform_health_and_reauthorize() {
. /usr/sbin/so-elastic-fleet-common
echo "Checking integration transform jobs for unhealthy / unauthorized status..."
local transforms_doc stats_doc installed_doc
if ! transforms_doc=$(so-elasticsearch-query "_transform/_all?size=1000" --fail --retry 3 --retry-delay 5 2>/dev/null); then
echo "Unable to query for transform jobs, skipping reauthorization."
return 0
fi
if ! stats_doc=$(so-elasticsearch-query "_transform/_all/_stats?size=1000" --fail --retry 3 --retry-delay 5 2>/dev/null); then
echo "Unable to query for transform job stats, skipping reauthorization."
return 0
fi
if ! installed_doc=$(fleet_api "epm/packages/installed?perPage=500"); then
echo "Unable to list installed Fleet packages, skipping reauthorization."
return 0
fi
# Get all transforms that meet the following
# - unhealthy (any non-green health status)
# - metadata has run_as_kibana_system: false (this fix is specific to transforms started prior to Kibana 9.3.3)
# - are not orphaned (integration is not somehow missing/corrupt/uninstalled)
local tmp_transforms tmp_stats tmp_installed
tmp_transforms=$(mktemp)
tmp_stats=$(mktemp)
tmp_installed=$(mktemp)
echo "$transforms_doc" > "$tmp_transforms"
echo "$stats_doc" > "$tmp_stats"
echo "$installed_doc" > "$tmp_installed"
local unhealthy_transforms
unhealthy_transforms=$(jq -c -n \
--slurpfile t "$tmp_transforms" \
--slurpfile s "$tmp_stats" \
--slurpfile i "$tmp_installed" '
($i[0].items | map({key: .name, value: .version}) | from_entries) as $pkg_ver
| ($s[0].transforms | map({key: .id, value: .health.status}) | from_entries) as $health
| [ $t[0].transforms[]
| select(._meta.run_as_kibana_system == false)
| select(($health[.id] // "unknown") != "green")
| {id, pkg: ._meta.package.name, ver: ($pkg_ver[._meta.package.name])}
]
| if length == 0 then empty else . end
| (map(select(.ver == null)) | map({orphan: .id})[]),
(map(select(.ver != null))
| group_by(.pkg)
| map({pkg: .[0].pkg, ver: .[0].ver, transformIds: map(.id)})[])
')
if [[ -z "$unhealthy_transforms" ]]; then
return 0
fi
local unhealthy_count
unhealthy_count=$(jq -s '[.[].transformIds? // empty | .[]] | length' <<< "$unhealthy_transforms")
echo "Found $unhealthy_count transform(s) needing reauthorization."
local total_failures=0
while IFS= read -r transform; do
[[ -z "$transform" ]] && continue
if jq -e 'has("orphan")' <<< "$transform" >/dev/null 2>&1; then
echo "Skipping transform not owned by any installed Fleet package: $(jq -r '.orphan' <<< "$transform")"
continue
fi
local pkg ver body resp
pkg=$(jq -r '.pkg' <<< "$transform")
ver=$(jq -r '.ver' <<< "$transform")
body=$(jq -c '{transforms: (.transformIds | map({transformId: .}))}' <<< "$transform")
echo "Reauthorizing transform(s) for ${pkg}-${ver}..."
resp=$(fleet_api "epm/packages/${pkg}/${ver}/transforms/authorize" \
-XPOST -H 'kbn-xsrf: true' -H 'Content-Type: application/json' \
-d "$body") || { echo "Could not reauthorize transform(s) for ${pkg}-${ver}"; continue; }
(( total_failures += $(jq 'map(select(.success != true)) | length' <<< "$resp" 2>/dev/null) ))
done <<< "$unhealthy_transforms"
rm -f "$tmp_transforms" "$tmp_stats" "$tmp_installed"
if [[ "$total_failures" -gt 0 ]]; then
echo "Some transform(s) failed to reauthorize."
fi
}
ensure_postgres_local_pillar() {
# Postgres was added as a service after 3.0.0, so the new pillar/top.sls
# references postgres.soc_postgres / postgres.adv_postgres unconditionally.
@@ -668,31 +520,6 @@ ensure_postgres_secret() {
chown socore:socore "$secrets_file"
}
rename_strelka_scan_lnk() {
echo "Renaming strelka pillar ScanLNK to ScanLnk."
local STRELKA_FILE=/opt/so/saltstack/local/pillar/strelka/soc_strelka.sls
local MINIONDIR=/opt/so/saltstack/local/pillar/minions
local OLD_KEY=strelka.backend.config.backend.scanners.ScanLNK
local NEW_KEY=strelka.backend.config.backend.scanners.ScanLnk
local TMP_VALUE_FILE
TMP_VALUE_FILE=$(mktemp)
for pillar_file in "$STRELKA_FILE" "$MINIONDIR"/*.sls; do
[[ -f "$pillar_file" ]] || continue
# Skip if ScanLNK doesn't exist
so-yaml.py get "$pillar_file" "$OLD_KEY" > "$TMP_VALUE_FILE" 2>/dev/null || continue
echo "Found 'ScanLNK' key in $pillar_file. Renaming to 'ScanLnk'."
so-yaml.py add "$pillar_file" "$NEW_KEY" "file:$TMP_VALUE_FILE"
so-yaml.py remove "$pillar_file" "$OLD_KEY"
done
rm -f "$TMP_VALUE_FILE"
}
fix_logstash_0013_lumberjack_pipeline_name() {
update_logstash_pipeline_name "so/0013_input_lumberjack_fleet.conf" "so/0013_input_lumberjack_fleet.conf.jinja"
}
up_to_3.1.0() {
ensure_postgres_local_pillar
ensure_postgres_secret
@@ -700,8 +527,7 @@ up_to_3.1.0() {
elasticsearch_backup_index_templates
# Clear existing component template state file.
rm -f /opt/so/state/esfleet_component_templates.json
rename_strelka_scan_lnk
fix_logstash_0013_lumberjack_pipeline_name
INSTALLEDVERSION=3.1.0
}
@@ -727,59 +553,11 @@ post_to_3.1.0() {
# file_roots of its own and --local would fail with "No matching sls found".
salt-call state.apply postgres.telegraf_users queue=True || true
# Update default agent policies to use logging level warn.
elasticfleet_set_agent_logging_level_warn || true
# Check for unhealthy / unauthorized integration transform jobs and attempt reauthorizations
check_transform_health_and_reauthorize || true
POSTVERSION=3.1.0
}
### 3.1.0 End ###
### 3.2.0 Scripts ###
bootstrap_so_soc_database() {
# init-db.sh is mounted into so-postgres at /docker-entrypoint-initdb.d/init-db.sh
# and runs automatically only on a fresh data directory. Hosts upgrading from
# 3.1.0 already have /nsm/postgres populated, so the so_soc bootstrap block
# added in 3.2 never fires. Re-run the script explicitly; it's idempotent.
echo "Bootstrapping so_soc database via init-db.sh."
# The postgres image has no USER directive, so `docker exec` defaults to
# root, and the container env intentionally omits POSTGRES_USER (the upstream
# entrypoint defaults it transiently during first-init only). Recreate both
# so psql inside init-db.sh resolves the connect user correctly.
local exec_cmd="docker exec -u postgres -e POSTGRES_USER=postgres so-postgres bash /docker-entrypoint-initdb.d/init-db.sh"
if ! /usr/sbin/so-postgres-wait; then
FINAL_MESSAGE_QUEUE+=("WARNING: so-postgres was not ready during the 3.2.0 upgrade; the so_soc database may not have been bootstrapped. Re-run manually: $exec_cmd")
return 0
fi
if ! $exec_cmd; then
FINAL_MESSAGE_QUEUE+=("WARNING: init-db.sh failed inside so-postgres during the 3.2.0 upgrade; the so_soc database may not have been bootstrapped. Re-run manually: $exec_cmd")
return 0
fi
echo "so_soc bootstrap complete."
}
up_to_3.2.0() {
fix_logstash_0013_lumberjack_pipeline_name
INSTALLEDVERSION=3.2.0
}
post_to_3.2.0() {
bootstrap_so_soc_database
# Including agent regen script here since it was missed in post_to_3.1.0
echo "Regenerating Elastic Agent Installers"
/sbin/so-elastic-agent-gen-installers
POSTVERSION=3.2.0
}
### 3.2.0 End ###
repo_sync() {
echo "Sync the local repo."
@@ -1031,9 +809,6 @@ verify_es_version_compatibility() {
local is_active_intermediate_upgrade=1
# supported upgrade paths for SO-ES versions
declare -A es_upgrade_map=(
["8.18.4"]="8.18.6 8.18.8 9.0.8"
["8.18.6"]="8.18.8 9.0.8"
["8.18.8"]="9.0.8"
["9.0.8"]="9.3.3"
)
@@ -1057,171 +832,6 @@ verify_es_version_compatibility() {
exit 160
fi
compatible_es_versions="$target_es_version"
for current_version in "${!es_upgrade_map[@]}"; do
# shellcheck disable=SC2076
if [[ " ${es_upgrade_map[$current_version]} " =~ " $target_es_version " ]]; then
compatible_es_versions+=" $current_version"
fi
done
# Check if the given ES version can directly upgrade to the target ES version. Used to assist with catching lagging nodes during the upgrade process
es_version_can_upgrade_to_target() {
local current_version="$1"
# shellcheck disable=SC2076
if [[ -n "$current_version" && " $compatible_es_versions " =~ " $current_version " ]]; then
return 0
fi
return 1
}
# Gather Elasticsearch cluster version info and verify that each node in the cluster is running a version compatible with the target ES version.
verify_searchnodes_es_target_compatibility() {
local retries=20
local retry_count=0
local delay=180
local expected_es_nodes searchnode_minions attempt
local searchnode_discovery_success=false
SEARCHNODE_ES_VERSIONS=""
for attempt in {1..3}; do
if searchnode_minions=$(set -o pipefail; salt-key --out=json --list=accepted 2> /dev/null | jq -r '.minions[]? | select(endswith("searchnode"))'); then
searchnode_discovery_success=true
break
fi
echo "Failed to retrieve grid searchnodes via salt-key... Retrying in 30 seconds. Attempt $attempt of 3."
sleep 30
done
if [[ "$searchnode_discovery_success" != "true" ]]; then
echo "Failed to retrieve grid searchnodes via salt-key."
return 1
fi
# Always add node running soup to expected es nodes
expected_es_nodes="${MINIONID%_*}"
while IFS= read -r searchnode_minion; do
[[ -z "$searchnode_minion" ]] && continue
expected_es_nodes+=$'\n'"${searchnode_minion%_searchnode}"
done <<< "$searchnode_minions"
while [[ $retry_count -lt $retries ]]; do
SEARCHNODE_ES_VERSIONS=$(so-elasticsearch-query _nodes/_all/version --retry 5 --retry-delay 10 --fail 2>&1)
local exit_status=$?
if [[ $exit_status -ne 0 ]]; then
echo "Failed to retrieve Elasticsearch versions from searchnodes... Retrying in $delay seconds. Attempt $((retry_count + 1)) of $retries."
((retry_count++))
sleep $delay
continue
fi
local all_searchnodes_compatible=true
while IFS=$'\t' read -r node current_version; do
[[ -z "$node" ]] && continue
if ! es_version_can_upgrade_to_target "$current_version"; then
echo "Searchnode $node is running Elasticsearch $current_version, which is not directly upgradable to Elasticsearch $target_es_version."
all_searchnodes_compatible=false
fi
done < <(echo "$SEARCHNODE_ES_VERSIONS" | jq -r '.nodes | to_entries[] | [.value.name, .value.version] | @tsv')
while IFS= read -r expected_es_node; do
[[ -z "$expected_es_node" ]] && continue
if ! echo "$SEARCHNODE_ES_VERSIONS" | jq -e --arg node "$expected_es_node" '.nodes | to_entries | any(.value.name == $node)' > /dev/null; then
echo "Searchnode $expected_es_node did not report an Elasticsearch version. It may be offline or still upgrading."
all_searchnodes_compatible=false
fi
done <<< "$expected_es_nodes"
if [[ "$all_searchnodes_compatible" == true ]]; then
echo "All Searchnodes are upgradable to Elasticsearch $target_es_version."
return 0
fi
echo "One or more Searchnodes cannot upgrade directly to Elasticsearch $target_es_version. Rechecking in $delay seconds. Attempt $((retry_count + 1)) of $retries."
((retry_count++))
sleep $delay
done
return 1
}
# Gather heavynode version info and verify that each node is running a version compatible with the target ES version.
verify_heavynodes_es_target_compatibility() {
local heavynode_minions attempt
local retries=20
local retry_count=0
local delay=180
local heavynode_discovery_success=false
HEAVYNODE_ES_VERSIONS=""
for attempt in {1..3}; do
if heavynode_minions=$(set -o pipefail; salt-key --out=json --list=accepted 2> /dev/null | jq -r '.minions[]? | select(endswith("heavynode"))'); then
heavynode_discovery_success=true
break
fi
echo "Failed to retrieve grid heavynodes via salt-key... Retrying in 30 seconds. Attempt $attempt of 3."
sleep 30
done
if [[ "$heavynode_discovery_success" != "true" ]]; then
echo "Failed to retrieve grid heavynodes via salt-key."
return 1
fi
if [[ -z "$heavynode_minions" ]]; then
echo "No heavynodes detected. Skipping heavynode Elasticsearch version compatibility check."
return 0
fi
while [[ $retry_count -lt $retries ]]; do
HEAVYNODE_ES_VERSIONS=$(salt -C 'G@role:so-heavynode' cmd.run 'set -o pipefail; so-elasticsearch-query / --retry 5 --retry-delay 10 | jq -er ".version.number"' shell=/bin/bash --out=json 2> /dev/null)
local exit_status=$?
if [[ $exit_status -ne 0 ]]; then
echo "Failed to retrieve Elasticsearch version from one or more heavynodes... Retrying in $delay seconds. Attempt $((retry_count + 1)) of $retries."
((retry_count++))
sleep $delay
continue
fi
local all_heavynodes_compatible=true
while IFS=$'\t' read -r node current_version; do
[[ -z "$node" ]] && continue
if ! es_version_can_upgrade_to_target "$current_version"; then
echo "Heavynode $node is running Elasticsearch $current_version, which is not directly upgradable to Elasticsearch $target_es_version."
all_heavynodes_compatible=false
fi
done < <(echo "$HEAVYNODE_ES_VERSIONS" | jq -r 'to_entries[] | [.key, .value] | @tsv')
while IFS= read -r heavynode_minion; do
[[ -z "$heavynode_minion" ]] && continue
if ! echo "$HEAVYNODE_ES_VERSIONS" | jq -se --arg minion "$heavynode_minion" 'add | has($minion)' > /dev/null; then
echo "Heavynode $heavynode_minion did not report an Elasticsearch version. It may be offline or still upgrading."
all_heavynodes_compatible=false
fi
done <<< "$heavynode_minions"
if [[ "$all_heavynodes_compatible" == true ]]; then
echo -e "\nAll heavynodes can upgrade to Elasticsearch $target_es_version."
return 0
fi
echo "One or more heavynodes cannot upgrade directly to Elasticsearch $target_es_version. Rechecking in $delay seconds. Attempt $((retry_count + 1)) of $retries."
((retry_count++))
sleep $delay
done
return 1
}
if [[ ! -f "$es_verification_script" ]]; then
create_intermediate_upgrade_verification_script "$es_verification_script"
fi
for statefile in "${es_required_version_statefile_base}"-*; do
[[ -f $statefile ]] || continue
@@ -1240,6 +850,10 @@ verify_es_version_compatibility() {
continue
fi
if [[ ! -f "$es_verification_script" ]]; then
create_intermediate_upgrade_verification_script "$es_verification_script"
fi
echo -e "\n##############################################################################################################################\n"
echo "A previously required intermediate Elasticsearch upgrade was detected. Verifying that all Searchnodes/Heavynodes have successfully upgraded Elasticsearch to $es_required_version_statefile_value before proceeding with soup to avoid potential data loss! This command can take up to an hour to complete."
if ! timeout --foreground 4000 bash "$es_verification_script" "$es_required_version_statefile_value" "$statefile"; then
@@ -1261,26 +875,6 @@ verify_es_version_compatibility() {
# shellcheck disable=SC2076 # Do not want a regex here eg usage " 8.18.8 9.0.8 " =~ " 9.0.8 "
if [[ " ${es_upgrade_map[$es_version]} " =~ " $target_es_version " || "$es_version" == "$target_es_version" ]]; then
if ! verify_searchnodes_es_target_compatibility || ! verify_heavynodes_es_target_compatibility; then
echo -e "\n!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!\n"
echo "One or more Searchnode(s)/Heavynode(s) cannot upgrade directly to Elasticsearch $target_es_version. This can happen with soups that include Elasticsearch upgrades being run in quick succession. Typically, this will resolve itself as the grid synchronizes. Please allow time for all Searchnodes/Heavynodes to have upgraded Elasticsearch to a compatible version with $target_es_version before running soup again to avoid potential data loss!"
if [[ -n "$HEAVYNODE_ES_VERSIONS" ]]; then
echo "Current heavynode Elasticsearch versions:"
echo "$HEAVYNODE_ES_VERSIONS" | jq '.'
fi
if [[ -n "$SEARCHNODE_ES_VERSIONS" ]]; then
echo "Current searchnode Elasticsearch versions:"
echo "$SEARCHNODE_ES_VERSIONS" | jq '.nodes | to_entries | map({(.value.name): .value.version}) | sort | add'
fi
echo -e "\n!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!\n"
exit 161
fi
# supported upgrade
return 0
else
@@ -1638,13 +1232,13 @@ main() {
echo "Verifying we have the latest soup script."
verify_latest_update_script
echo "Verifying Elasticsearch version compatibility before upgrading."
verify_es_version_compatibility
echo "Let's see if we need to update Security Onion."
upgrade_check
upgrade_space
echo "Verifying Elasticsearch version compatibility across the grid before upgrading."
verify_es_version_compatibility
echo "Checking for Salt Master and Minion updates."
upgrade_check_salt
set -e
@@ -1664,8 +1258,7 @@ main() {
echo "Applying $HOTFIXVERSION hotfix"
# since we don't run the backup.config_backup state on import we wont snapshot previous version states and pillars
if [[ ! "$MINION_ROLE" == "import" ]]; then
echo "Running so-config-backup script."
/sbin/so-config-backup
backup_old_states_pillars
fi
copy_new_files
create_local_directories "/opt/so/saltstack/default"
@@ -1721,8 +1314,8 @@ main() {
# since we don't run the backup.config_backup state on import we wont snapshot previous version states and pillars
if [[ ! "$MINION_ROLE" == "import" ]]; then
echo ""
echo "Running so-config-backup script."
/sbin/so-config-backup
echo "Creating snapshots of default and local Salt states and pillars and saving to /nsm/backup/"
backup_old_states_pillars
fi
echo ""
@@ -1758,9 +1351,6 @@ main() {
enable_highstate
echo "salt-call state.show_top"
salt-call state.show_top
echo ""
echo "Running a highstate. This could take several minutes."
set +e
@@ -1768,9 +1358,6 @@ main() {
highstate
set -e
echo "salt-call saltutil.running"
salt-call saltutil.running
stop_salt_master
masterunlock
@@ -1793,9 +1380,6 @@ main() {
# ensure the mine is updated and populated before highstates run, following the salt-master restart
update_salt_mine
echo "salt-call state.show_top"
salt-call state.show_top
highstate
check_saltmaster_status
postupgrade_changes
@@ -33,8 +33,11 @@ so-elastic-fleet-stop --force
status "Deleting Fleet Data from Pillars..."
so-yaml.py remove /opt/so/saltstack/local/pillar/minions/{{ GLOBALS.minion_id }}.sls elasticfleet
/usr/sbin/so-config.py sync-yaml-mutation /opt/so/saltstack/local/pillar/minions/{{ GLOBALS.minion_id }}.sls remove elasticfleet --note "so-elastic-fleet-reset"
so-yaml.py remove /opt/so/saltstack/local/pillar/global/soc_global.sls global.fleet_grid_enrollment_token_general
/usr/sbin/so-config.py sync-yaml-mutation /opt/so/saltstack/local/pillar/global/soc_global.sls remove global.fleet_grid_enrollment_token_general --note "so-elastic-fleet-reset"
so-yaml.py remove /opt/so/saltstack/local/pillar/global/soc_global.sls global.fleet_grid_enrollment_token_heavy
/usr/sbin/so-config.py sync-yaml-mutation /opt/so/saltstack/local/pillar/global/soc_global.sls remove global.fleet_grid_enrollment_token_heavy --note "so-elastic-fleet-reset"
status "Restarting Kibana..."
so-kibana-restart --force
-1
View File
@@ -34,7 +34,6 @@ make-rule-dir-nginx:
so-nginx:
docker_container.running:
- image: {{ GLOBALS.registry_host }}:5000/{{ GLOBALS.image_repo }}/so-nginx:{{ GLOBALS.so_version }}
- restart_policy: unless-stopped
- hostname: so-nginx
- networks:
- sobridge:
-2
View File
@@ -225,7 +225,6 @@ http {
limit_req zone=auth_throttle burst={{ NGINXMERGED.config.throttle_login_burst }} nodelay;
limit_req_status 429;
proxy_pass http://{{ GLOBALS.manager }}:4433;
proxy_set_header Connection "Close";
proxy_read_timeout 90;
proxy_connect_timeout 90;
proxy_set_header Host $host;
@@ -238,7 +237,6 @@ http {
location ~ ^/auth/.*?(whoami|logout|settings|errors|webauthn.js) {
rewrite /auth/(.*) /$1 break;
proxy_pass http://{{ GLOBALS.manager }}:4433;
proxy_set_header Connection "Close";
proxy_read_timeout 90;
proxy_connect_timeout 90;
proxy_set_header Host $host;
+1 -10
View File
@@ -3,14 +3,7 @@
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
{% set hypervisor = pillar.get('minion_id', '') %}
{% if not hypervisor|regex_match('^([A-Za-z0-9._-]{1,253})$') %}
{% do salt.log.error('delete_hypervisor_orch: refusing unsafe minion_id=' ~ hypervisor) %}
delete_hypervisor_invalid_minion_id:
test.fail_without_changes:
- name: delete_hypervisor_invalid_minion_id
{% else %}
{% set hypervisor = pillar.minion_id %}
ensure_hypervisor_mine_deleted:
salt.function:
@@ -27,5 +20,3 @@ update_salt_cloud_profile:
- sls:
- salt.cloud.config
- concurrent: True
{% endif %}
-37
View File
@@ -1,37 +0,0 @@
{% from 'global/map.jinja' import GLOBALMERGED %}
{% set actions = salt['pillar.get']('actions', []) %}
{% set BATCH = GLOBALMERGED.push.batch %}
{% set BATCH_WAIT = GLOBALMERGED.push.batch_wait %}
{% for action in actions %}
{% if action.get('highstate') %}
apply_highstate_{{ loop.index }}:
salt.state:
- tgt: '{{ action.tgt }}'
- tgt_type: {{ action.get('tgt_type', 'compound') }}
- highstate: True
- batch: {{ action.get('batch', BATCH) }}
- batch_wait: {{ action.get('batch_wait', BATCH_WAIT) }}
- kwarg:
queue: 2
{% else %}
refresh_pillar_{{ loop.index }}:
salt.function:
- name: saltutil.refresh_pillar
- tgt: '{{ action.tgt }}'
- tgt_type: {{ action.get('tgt_type', 'compound') }}
apply_{{ action.state | replace('.', '_') }}_{{ loop.index }}:
salt.state:
- tgt: '{{ action.tgt }}'
- tgt_type: {{ action.get('tgt_type', 'compound') }}
- sls:
- {{ action.state }}
- batch: {{ action.get('batch', BATCH) }}
- batch_wait: {{ action.get('batch_wait', BATCH_WAIT) }}
- kwarg:
queue: 2
- require:
- salt: refresh_pillar_{{ loop.index }}
{% endif %}
{% endfor %}
+1 -10
View File
@@ -12,14 +12,7 @@
{% if 'vrt' in salt['pillar.get']('features', []) %}
{% do salt.log.debug('vm_pillar_clean_orch: Running') %}
{% set vm_name = pillar.get('vm_name', '') %}
{% if not vm_name|regex_match('^([A-Za-z0-9._-]{1,253})$') %}
{% do salt.log.error('vm_pillar_clean_orch: refusing unsafe vm_name=' ~ vm_name) %}
vm_pillar_clean_invalid_name:
test.fail_without_changes:
- name: vm_pillar_clean_invalid_name
{% else %}
{% set vm_name = pillar.get('vm_name') %}
delete_adv_{{ vm_name }}_pillar:
module.run:
@@ -31,8 +24,6 @@ delete_{{ vm_name }}_pillar:
- file.remove:
- path: /opt/so/saltstack/local/pillar/minions/{{ vm_name }}.sls
{% endif %}
{% else %}
{% do salt.log.error(
+3 -3
View File
@@ -46,10 +46,10 @@ postgresinitdir:
- require:
- file: postgresconfdir
postgresinitdb:
postgresinitusers:
file.managed:
- name: /opt/so/conf/postgres/init/init-db.sh
- source: salt://postgres/files/init-db.sh
- name: /opt/so/conf/postgres/init/init-users.sh
- source: salt://postgres/files/init-users.sh
- user: 939
- group: 939
- mode: 755
+4 -4
View File
@@ -31,7 +31,7 @@ so-postgres:
- POSTGRES_DB=securityonion
# Passwords are delivered via mounted 0600 secret files, not plaintext env vars.
# The upstream postgres image resolves POSTGRES_PASSWORD_FILE; entrypoint.sh and
# init-db.sh resolve SO_POSTGRES_PASS_FILE the same way.
# init-users.sh resolve SO_POSTGRES_PASS_FILE the same way.
- POSTGRES_PASSWORD_FILE=/run/secrets/postgres_password
- SO_POSTGRES_USER={{ SO_POSTGRES_USER }}
- SO_POSTGRES_PASS_FILE=/run/secrets/so_postgres_pass
@@ -46,7 +46,7 @@ so-postgres:
- /opt/so/conf/postgres/postgresql.conf:/conf/postgresql.conf:ro
- /opt/so/conf/postgres/pg_hba.conf:/conf/pg_hba.conf:ro
- /opt/so/conf/postgres/secrets:/run/secrets:ro
- /opt/so/conf/postgres/init/init-db.sh:/docker-entrypoint-initdb.d/init-db.sh:ro
- /opt/so/conf/postgres/init/init-users.sh:/docker-entrypoint-initdb.d/init-users.sh:ro
- /etc/pki/postgres.crt:/conf/postgres.crt:ro
- /etc/pki/postgres.key:/conf/postgres.key:ro
- /etc/pki/tls/certs/intca.crt:/conf/ca.crt:ro
@@ -70,7 +70,7 @@ so-postgres:
- watch:
- file: postgresconf
- file: postgreshba
- file: postgresinitdb
- file: postgresinitusers
- file: postgres_super_secret
- file: postgres_app_secret
- x509: postgres_crt
@@ -78,7 +78,7 @@ so-postgres:
- require:
- file: postgresconf
- file: postgreshba
- file: postgresinitdb
- file: postgresinitusers
- file: postgres_super_secret
- file: postgres_app_secret
- x509: postgres_crt
@@ -17,7 +17,6 @@ psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" --dbname "$POSTGRES_DB" <<-E
END IF;
END
\$\$;
GRANT ALL ON SCHEMA public TO "$SO_POSTGRES_USER";
GRANT ALL PRIVILEGES ON DATABASE "$POSTGRES_DB" TO "$SO_POSTGRES_USER";
-- Lock the SOC database down at the connect layer; PUBLIC gets CONNECT
-- by default, which would let per-minion telegraf roles open sessions
+20
View File
@@ -0,0 +1,20 @@
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
{% from 'allowed_states.map.jinja' import allowed_states %}
{% if sls.split('.')[0] in allowed_states %}
# Deprecated: the old so_pillar schema has been replaced by SOC-owned
# onionconfig tables. SOC creates its schema on first startup.
postgres_schema_pillar_deprecated:
test.nop
{% else %}
{{sls}}_state_not_allowed:
test.fail_without_changes:
- name: {{sls}}_state_not_allowed
{% endif %}
+85 -18
View File
@@ -18,22 +18,38 @@ include:
{% set TG_OUT = TELEGRAFMERGED.output | upper %}
{% if TG_OUT in ['POSTGRES', 'BOTH'] %}
# docker_container.running returns as soon as the container starts, but on
# first-init docker-entrypoint.sh starts a temporary postgres with
# `listen_addresses=''` to run /docker-entrypoint-initdb.d scripts, then
# shuts it down before exec'ing the real CMD. A default pg_isready check
# (Unix socket) passes during that ephemeral phase and races the shutdown
# with "the database system is shutting down". Checking TCP readiness on
# 127.0.0.1 only succeeds after the final postgres binds the port.
postgres_wait_ready:
cmd.run:
- name: /usr/sbin/so-postgres-wait
- name: |
for i in $(seq 1 60); do
if docker exec so-postgres pg_isready -h 127.0.0.1 -U postgres -q 2>/dev/null; then
exit 0
fi
sleep 2
done
echo "so-postgres did not accept TCP connections within 120s" >&2
exit 1
- require:
- docker_container: so-postgres
- file: postgres_sbin
# Ensure the shared Telegraf database exists. init-db.sh only runs on a
# Ensure the shared Telegraf database exists. init-users.sh only runs on a
# fresh data dir, so hosts upgraded onto an existing /nsm/postgres volume
# would otherwise never get so_telegraf.
postgres_create_telegraf_db:
cmd.run:
- name: /usr/sbin/so-telegraf-postgres create_db
- name: |
if ! docker exec so-postgres psql -U postgres -tAc "SELECT 1 FROM pg_database WHERE datname='so_telegraf'" | grep -q 1; then
docker exec so-postgres psql -v ON_ERROR_STOP=1 -U postgres -c "CREATE DATABASE so_telegraf"
fi
- require:
- cmd: postgres_wait_ready
- file: postgres_sbin
# Provision the shared group role and schema once. Every per-minion role is a
# member of so_telegraf, and each Telegraf connection does SET ROLE so_telegraf
@@ -41,26 +57,68 @@ postgres_create_telegraf_db:
# on first write are owned by the group role and every member can INSERT/SELECT.
postgres_telegraf_group_role:
cmd.run:
- name: /usr/sbin/so-telegraf-postgres group_role
- name: |
docker exec -i so-postgres psql -v ON_ERROR_STOP=1 -U postgres -d so_telegraf <<'EOSQL'
DO $$
BEGIN
IF NOT EXISTS (SELECT FROM pg_catalog.pg_roles WHERE rolname = 'so_telegraf') THEN
CREATE ROLE so_telegraf NOLOGIN;
END IF;
END
$$;
GRANT CONNECT ON DATABASE so_telegraf TO so_telegraf;
CREATE SCHEMA IF NOT EXISTS telegraf AUTHORIZATION so_telegraf;
GRANT USAGE, CREATE ON SCHEMA telegraf TO so_telegraf;
CREATE SCHEMA IF NOT EXISTS partman;
CREATE EXTENSION IF NOT EXISTS pg_partman SCHEMA partman;
CREATE EXTENSION IF NOT EXISTS pg_cron;
-- Telegraf (running as so_telegraf) calls partman.create_parent()
-- on first write of each metric, which needs USAGE on the partman
-- schema, EXECUTE on its functions/procedures, and write access to
-- partman.part_config so it can register new partitioned parents.
GRANT USAGE, CREATE ON SCHEMA partman TO so_telegraf;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA partman TO so_telegraf;
GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA partman TO so_telegraf;
GRANT EXECUTE ON ALL PROCEDURES IN SCHEMA partman TO so_telegraf;
-- partman creates per-parent template tables (partman.template_*) at
-- runtime; default privileges extend DML/sequence access to them.
ALTER DEFAULT PRIVILEGES IN SCHEMA partman
GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO so_telegraf;
ALTER DEFAULT PRIVILEGES IN SCHEMA partman
GRANT USAGE, SELECT, UPDATE ON SEQUENCES TO so_telegraf;
-- Hourly partman maintenance. cron.schedule is idempotent by jobname.
SELECT cron.schedule(
'telegraf-partman-maintenance',
'17 * * * *',
'CALL partman.run_maintenance_proc()'
);
EOSQL
- require:
- cmd: postgres_create_telegraf_db
- file: postgres_sbin
{% set creds = salt['pillar.get']('telegraf:postgres_creds', {}) %}
{% for mid, entry in creds.items() %}
{% if entry.get('user') and entry.get('pass') %}
{% set u = entry.user %}
{% set p = entry.pass %}
{% set p = entry.pass | replace("'", "''") %}
postgres_telegraf_role_{{ u }}:
cmd.run:
- name: /usr/sbin/so-telegraf-postgres user
- env:
- ROLE_USER: {{ u | tojson }}
- ROLE_PASS: {{ p | tojson }}
- hide_output: True
- name: |
docker exec -i so-postgres psql -v ON_ERROR_STOP=1 -U postgres -d so_telegraf <<'EOSQL'
DO $$
BEGIN
IF NOT EXISTS (SELECT FROM pg_catalog.pg_roles WHERE rolname = '{{ u }}') THEN
EXECUTE format('CREATE ROLE %I WITH LOGIN PASSWORD %L', '{{ u }}', '{{ p }}');
ELSE
EXECUTE format('ALTER ROLE %I WITH PASSWORD %L', '{{ u }}', '{{ p }}');
END IF;
END
$$;
GRANT CONNECT ON DATABASE so_telegraf TO "{{ u }}";
GRANT so_telegraf TO "{{ u }}";
EOSQL
- require:
- file: postgres_sbin
- cmd: postgres_telegraf_group_role
{% endif %}
@@ -72,12 +130,21 @@ postgres_telegraf_role_{{ u }}:
{% set retention = salt['pillar.get']('postgres:telegraf:retention_days', 14) | int %}
postgres_telegraf_retention_reconcile:
cmd.run:
- name: /usr/sbin/so-telegraf-postgres retention
- env:
- RETENTION_DAYS: {{ retention }}
- name: |
docker exec -i so-postgres psql -v ON_ERROR_STOP=1 -U postgres -d so_telegraf <<'EOSQL'
DO $$
BEGIN
IF EXISTS (SELECT 1 FROM pg_catalog.pg_extension WHERE extname = 'pg_partman') THEN
UPDATE partman.part_config
SET retention = '{{ retention }} days',
retention_keep_table = false
WHERE parent_table LIKE 'telegraf.%';
END IF;
END
$$;
EOSQL
- require:
- cmd: postgres_telegraf_group_role
- file: postgres_sbin
{% endif %}
+7 -41
View File
@@ -7,29 +7,15 @@
. /usr/sbin/so-common
# Without pipefail, a pipeline's exit status is gzip's. A failed pg_dumpall would
# otherwise be masked by a successful gzip, silently producing a valid .gz that
# holds a truncated dump.
set -o pipefail
# Backups contain role password hashes and full chat data; keep them 0600.
umask 0077
TODAY=$(date '+%Y_%m_%d')
BACKUPDIR=/nsm/backup
BACKUPFILE="$BACKUPDIR/so-postgres-backup-$TODAY.sql.gz"
TMPFILE="$BACKUPFILE.tmp"
MAXBACKUPS=7
LOGFILE=/opt/so/log/postgres/backup.log
log() {
echo "$(date '+%Y-%m-%d %H:%M:%S') $*" >> "$LOGFILE"
}
mkdir -p "$BACKUPDIR"
# Remove any temp files left behind by a previously crashed run
rm -f "$BACKUPDIR"/so-postgres-backup-*.sql.gz.tmp
mkdir -p $BACKUPDIR
# Skip if already backed up today
if [ -f "$BACKUPFILE" ]; then
@@ -41,33 +27,13 @@ if ! docker ps --format '{{.Names}}' | grep -q '^so-postgres$'; then
exit 0
fi
# Always clean up the temp file on exit; the success path clears this trap
# after the atomic rename so the finished backup is not deleted.
trap 'rm -f "$TMPFILE"' EXIT
# Dump all databases and roles, compress
docker exec so-postgres pg_dumpall -U postgres | gzip > "$BACKUPFILE"
# Dump all databases and roles, compress. Write to a temp file so the final
# filename only ever appears for a complete, verified backup.
if ! docker exec so-postgres pg_dumpall -U postgres | gzip > "$TMPFILE"; then
log "ERROR: pg_dumpall/gzip failed; backup aborted"
exit 1
fi
# Verify the compressed stream is intact before publishing it
if ! gzip -t "$TMPFILE"; then
log "ERROR: backup failed gzip integrity check; backup aborted"
exit 1
fi
# Atomically publish the verified backup
mv "$TMPFILE" "$BACKUPFILE"
trap - EXIT
log "OK: wrote $BACKUPFILE"
# Retention cleanup (only reached after a successful backup). The glob is
# restricted to finished backups so an in-progress .tmp can never be counted.
NUMBACKUPS=$(find "$BACKUPDIR" -type f -name "so-postgres-backup-*.sql.gz" | wc -l)
# Retention cleanup
NUMBACKUPS=$(find $BACKUPDIR -type f -name "so-postgres-backup*" | wc -l)
while [ "$NUMBACKUPS" -gt "$MAXBACKUPS" ]; do
OLDEST=$(find "$BACKUPDIR" -type f -name "so-postgres-backup-*.sql.gz" -printf '%T+ %p\n' | sort | head -n 1 | awk -F" " '{print $2}')
OLDEST=$(find $BACKUPDIR -type f -name "so-postgres-backup*" -printf '%T+ %p\n' | sort | head -n 1 | awk -F" " '{print $2}')
rm -f "$OLDEST"
NUMBACKUPS=$(find "$BACKUPDIR" -type f -name "so-postgres-backup-*.sql.gz" | wc -l)
NUMBACKUPS=$(find $BACKUPDIR -type f -name "so-postgres-backup*" | wc -l)
done
-32
View File
@@ -1,32 +0,0 @@
#!/bin/bash
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
# Wait for the so-postgres container to accept TCP connections.
#
# docker_container.running returns as soon as the container starts, but on
# first-init docker-entrypoint.sh starts a temporary postgres with
# `listen_addresses=''` to run /docker-entrypoint-initdb.d scripts, then
# shuts it down before exec'ing the real CMD. A default pg_isready check
# (Unix socket) passes during that ephemeral phase and races the shutdown
# with "the database system is shutting down". Checking TCP readiness on
# 127.0.0.1 only succeeds after the final postgres binds the port.
#
# Usage: so-postgres-wait [iterations] [sleep_seconds]
# Default: 60 iterations, 2s sleep (~120s total).
ITERATIONS=${1:-60}
SLEEP_SECONDS=${2:-2}
for i in $(seq 1 "$ITERATIONS"); do
if docker exec so-postgres pg_isready -h 127.0.0.1 -U postgres -q 2>/dev/null; then
exit 0
fi
sleep "$SLEEP_SECONDS"
done
echo "so-postgres did not accept TCP connections within $((ITERATIONS * SLEEP_SECONDS))s" >&2
exit 1
@@ -1,110 +0,0 @@
#!/bin/bash
set -e
# Provision Telegraf state inside the so-postgres container.
# Usage: so-telegraf-postgres <subcommand>
# create_db Ensure the so_telegraf database exists.
# group_role Provision the so_telegraf group role, telegraf/partman schemas,
# pg_partman, pg_cron, and the hourly partman maintenance job.
# user Create or update a per-minion login role granted to so_telegraf.
# Env: ROLE_USER, ROLE_PASS.
# retention Reconcile partman retention on telegraf parents.
# Env: RETENTION_DAYS.
cmd="${1:?subcommand required}"
case "$cmd" in
create_db)
if ! docker exec so-postgres psql -U postgres -tAc \
"SELECT 1 FROM pg_database WHERE datname='so_telegraf'" | grep -q 1; then
docker exec so-postgres psql -v ON_ERROR_STOP=1 -U postgres \
-c "CREATE DATABASE so_telegraf"
fi
;;
group_role)
docker exec -i so-postgres psql -v ON_ERROR_STOP=1 -U postgres -d so_telegraf <<'EOSQL'
DO $$
BEGIN
IF NOT EXISTS (SELECT FROM pg_catalog.pg_roles WHERE rolname = 'so_telegraf') THEN
CREATE ROLE so_telegraf NOLOGIN;
END IF;
END
$$;
GRANT CONNECT ON DATABASE so_telegraf TO so_telegraf;
CREATE SCHEMA IF NOT EXISTS telegraf AUTHORIZATION so_telegraf;
GRANT USAGE, CREATE ON SCHEMA telegraf TO so_telegraf;
CREATE SCHEMA IF NOT EXISTS partman;
CREATE EXTENSION IF NOT EXISTS pg_partman SCHEMA partman;
CREATE EXTENSION IF NOT EXISTS pg_cron;
-- Telegraf (running as so_telegraf) calls partman.create_parent()
-- on first write of each metric, which needs USAGE on the partman
-- schema, EXECUTE on its functions/procedures, and write access to
-- partman.part_config so it can register new partitioned parents.
GRANT USAGE, CREATE ON SCHEMA partman TO so_telegraf;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA partman TO so_telegraf;
GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA partman TO so_telegraf;
GRANT EXECUTE ON ALL PROCEDURES IN SCHEMA partman TO so_telegraf;
-- partman creates per-parent template tables (partman.template_*) at
-- runtime; default privileges extend DML/sequence access to them.
ALTER DEFAULT PRIVILEGES IN SCHEMA partman
GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO so_telegraf;
ALTER DEFAULT PRIVILEGES IN SCHEMA partman
GRANT USAGE, SELECT, UPDATE ON SEQUENCES TO so_telegraf;
-- Hourly partman maintenance. cron.schedule is idempotent by jobname.
SELECT cron.schedule(
'telegraf-partman-maintenance',
'17 * * * *',
'CALL partman.run_maintenance_proc()'
);
EOSQL
;;
user)
: "${ROLE_USER:?ROLE_USER is required}"
: "${ROLE_PASS:?ROLE_PASS is required}"
# psql does not substitute :vars inside dollar-quoted strings, so the
# conditional CREATE/ALTER is built outside any DO block and dispatched
# with \gexec. format() handles identifier/literal quoting.
docker exec -i so-postgres psql \
-v ON_ERROR_STOP=1 \
-v role_user="$ROLE_USER" \
-v role_pass="$ROLE_PASS" \
-U postgres -d so_telegraf <<'EOSQL'
SELECT format(
CASE WHEN EXISTS (SELECT FROM pg_catalog.pg_roles WHERE rolname = :'role_user')
THEN 'ALTER ROLE %I WITH LOGIN PASSWORD %L'
ELSE 'CREATE ROLE %I WITH LOGIN PASSWORD %L'
END,
:'role_user',
:'role_pass'
) \gexec
GRANT CONNECT ON DATABASE so_telegraf TO :"role_user";
GRANT so_telegraf TO :"role_user";
EOSQL
;;
retention)
: "${RETENTION_DAYS:?RETENTION_DAYS is required}"
# \gset + \if guards against a missing pg_partman without using a DO
# block (psql :var substitution doesn't reach into dollar-quoted code).
docker exec -i so-postgres psql \
-v ON_ERROR_STOP=1 \
-v retention_days="$RETENTION_DAYS" \
-U postgres -d so_telegraf <<'EOSQL'
SELECT CASE WHEN EXISTS (SELECT 1 FROM pg_catalog.pg_extension WHERE extname = 'pg_partman')
THEN 'true' ELSE 'false' END AS has_partman \gset
\if :has_partman
UPDATE partman.part_config
SET retention = :'retention_days' || ' days',
retention_keep_table = false
WHERE parent_table LIKE 'telegraf.%';
\endif
EOSQL
;;
*)
echo "Unknown subcommand: $cmd" >&2
exit 1
;;
esac
+4 -6
View File
@@ -3,15 +3,12 @@
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
{% set hid = data['id'] %}
{% if hid|regex_match('^([A-Za-z0-9._-]{1,253})$')
and hid.endswith('_hypervisor')
and data['result'] == True %}
{% if data['id'].endswith('_hypervisor') and data['result'] == True %}
{% if data['act'] == 'accept' %}
check_and_trigger:
runner.setup_hypervisor.setup_environment:
- minion_id: {{ hid }}
- minion_id: {{ data['id'] }}
{% endif %}
{% if data['act'] == 'delete' %}
@@ -20,7 +17,8 @@ delete_hypervisor:
- args:
- mods: orch.delete_hypervisor
- pillar:
minion_id: {{ hid }}
minion_id: {{ data['id'] }}
{% endif %}
{% endif %}
+14 -26
View File
@@ -9,42 +9,30 @@ import logging
import os
import pwd
import grp
import re
log = logging.getLogger(__name__)
PILLAR_ROOT = '/opt/so/saltstack/local/pillar/minions/'
_VMNAME_RE = re.compile(r'^[A-Za-z0-9._-]{1,253}$')
def run():
vm_name = data.get('kwargs', {}).get('name', '')
if not _VMNAME_RE.match(str(vm_name)):
log.error("createEmptyPillar reactor: refusing unsafe vm_name=%r", vm_name)
return {}
log.info("createEmptyPillar reactor: vm_name: %s", vm_name)
vm_name = data['kwargs']['name']
logging.error("createEmptyPillar reactor: vm_name: %s" % vm_name)
pillar_root = '/opt/so/saltstack/local/pillar/minions/'
pillar_files = ['adv_' + vm_name + '.sls', vm_name + '.sls']
try:
# Get socore user and group IDs
socore_uid = pwd.getpwnam('socore').pw_uid
socore_gid = grp.getgrnam('socore').gr_gid
pillar_root_real = os.path.realpath(PILLAR_ROOT)
for f in pillar_files:
full_path = os.path.join(PILLAR_ROOT, f)
resolved = os.path.realpath(full_path)
if os.path.dirname(resolved) != pillar_root_real:
log.error("createEmptyPillar reactor: refusing path outside pillar root: %s", resolved)
continue
if os.path.exists(resolved):
continue
os.mknod(resolved)
os.chown(resolved, socore_uid, socore_gid)
os.chmod(resolved, 0o640)
log.info("createEmptyPillar reactor: created %s with socore:socore ownership and mode 0640", f)
full_path = pillar_root + f
if not os.path.exists(full_path):
# Create empty file
os.mknod(full_path)
# Set ownership to socore:socore
os.chown(full_path, socore_uid, socore_gid)
# Set mode to 644 (rw-r--r--)
os.chmod(full_path, 0o640)
logging.error("createEmptyPillar reactor: created %s with socore:socore ownership and mode 644" % f)
except (KeyError, OSError) as e:
log.error("createEmptyPillar reactor: Error setting ownership/permissions: %s", e)
logging.error("createEmptyPillar reactor: Error setting ownership/permissions: %s" % str(e))
return {}
+11 -33
View File
@@ -1,40 +1,18 @@
#!py
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
import logging
import re
remove_key:
wheel.key.delete:
- args:
- match: {{ data['name'] }}
log = logging.getLogger(__name__)
{{ data['name'] }}_pillar_clean:
runner.state.orchestrate:
- args:
- mods: orch.vm_pillar_clean
- pillar:
vm_name: {{ data['name'] }}
_VMNAME_RE = re.compile(r'^[A-Za-z0-9._-]{1,253}$')
def run():
name = data.get('name', '')
if not _VMNAME_RE.match(str(name)):
log.error("deleteKey reactor: refusing unsafe name=%r", name)
return {}
log.info("deleteKey reactor: deleted minion key: %s", name)
return {
'remove_key': {
'wheel.key.delete': [
{'args': [
{'match': name},
]},
],
},
'%s_pillar_clean' % name: {
'runner.state.orchestrate': [
{'args': [
{'mods': 'orch.vm_pillar_clean'},
{'pillar': {'vm_name': name}},
]},
],
},
}
{% do salt.log.info('deleteKey reactor: deleted minion key: %s' % data['name']) %}
-240
View File
@@ -1,240 +0,0 @@
# One pillar directory can map to multiple (state, tgt) actions.
# tgt is a raw salt compound expression. tgt_type is always "compound".
# Per-action `batch` / `batch_wait` override the orch defaults (25% / 15s).
# An action with `highstate: True` triggers state.highstate instead of
# state.apply -- see salt/orch/push_batch.sls.
#
# Notes:
# - `bpf` is a pillar-only dir (no state of its own) consumed by both
# zeek and suricata via macros, so a bpf pillar change re-applies both.
# - suricata/strelka/zeek/elasticsearch/redis/kafka/logstash etc. have
# their own pillar dirs AND their own state, so they map 1:1 (or 1:2
# in strelka's case, because of the split init.sls / manager.sls).
#
# Intentional omissions (these will log a "not in pillar_push_map.yaml"
# warning in push_pillar.sls and wait for the next scheduled highstate):
# - `data` and `node_data`: pillar-only data consumed by many states;
# handling them generically would amount to a fleetwide highstate.
# - `host`: soc_host describes mainint/mainip; a change is a re-IP and
# needs a coordinated procedure, not an immediate state push.
# - `hypervisor`: state changes touch libvirt and are disruptive; leave
# to the next scheduled highstate.
# - `sensor`: every field in soc_sensor.yaml is `readonly: True` or
# per-minion (`node: True`). Per-minion edits are persisted under
# pillar/minions/<id>.sls and are handled by Branch A of push_pillar.sls
# (per-minion highstate intent), not by this app-pillar map.
#
# The role sets here were verified line-by-line against salt/top.sls. If
# salt/top.sls changes how an app is targeted, update the corresponding
# compound here.
# firewall: the one pillar everyone touches. Applied everywhere intentionally
# because every host's iptables needs to know about every other host in the
# grid. Salt's firewall state is idempotent (file.managed + iptables-restore
# onchanges in salt/firewall/init.sls), so hosts whose rendered firewall is
# unchanged do a file comparison and no-op without touching iptables -- actual
# reload happens only on the hosts whose rules actually changed. Fleetwide
# blast radius is intentional and matches the pre-plan behavior via highstate.
# Adding N sensors in a burst coalesces into one dispatch via the drainer.
firewall:
- state: firewall
tgt: '*'
# backup: backup.config_backup runs on eval, standalone, manager, managerhype,
# managersearch (NOT import -- the backup pillar is included on import per
# pillar/top.sls but the backup state is not run there per salt/top.sls).
backup:
- state: backup.config_backup
tgt: 'G@role:so-eval or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-standalone'
# bpf is pillar-only (no state); consumed by both zeek and suricata as macros.
# Both states run on sensor_roles + so-import per salt/top.sls.
bpf:
- state: zeek
tgt: 'G@role:so-eval or G@role:so-heavynode or G@role:so-import or G@role:so-sensor or G@role:so-standalone'
- state: suricata
tgt: 'G@role:so-eval or G@role:so-heavynode or G@role:so-import or G@role:so-sensor or G@role:so-standalone'
# ca is applied universally.
ca:
- state: ca
tgt: '*'
# docker: universal. The docker state is in both the all-non-managers and
# all-managers branches of salt/top.sls.
docker:
- state: docker
tgt: '*'
# elastalert: eval, standalone, manager, managerhype, managersearch (NOT import).
elastalert:
- state: elastalert
tgt: 'G@role:so-eval or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-standalone'
# elastic-fleet-package-registry: manager_roles exactly.
elastic-fleet-package-registry:
- state: elastic-fleet-package-registry
tgt: 'G@role:so-eval or G@role:so-import or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-standalone'
# elasticsearch: 8 roles.
elasticsearch:
- state: elasticsearch
tgt: 'G@role:so-eval or G@role:so-heavynode or G@role:so-import or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-searchnode or G@role:so-standalone'
# elasticagent: so-heavynode only.
elasticagent:
- state: elasticagent
tgt: 'G@role:so-heavynode'
# elasticfleet: base state only on pillar change. elasticfleet.install_agent_grid
# is a deploy/enrollment step, not a config reload; leave it to the next highstate.
elasticfleet:
- state: elasticfleet
tgt: 'G@role:so-eval or G@role:so-fleet or G@role:so-import or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-standalone'
# global: fanout to a fleetwide highstate. The global pillar (soc_global.sls)
# carries cross-cutting settings (pipeline, url_base, imagerepo, mdengine, ...)
# that are consumed by virtually every state, so a targeted re-apply isn't
# meaningful. The drainer's batch/batch_wait throttling controls blast radius.
global:
- highstate: True
tgt: '*'
# healthcheck: eval, sensor, standalone only.
healthcheck:
- state: healthcheck
tgt: 'G@role:so-eval or G@role:so-sensor or G@role:so-standalone'
# hydra: manager_roles exactly.
hydra:
- state: hydra
tgt: 'G@role:so-eval or G@role:so-import or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-standalone'
# idh: so-idh only.
idh:
- state: idh
tgt: 'G@role:so-idh'
# influxdb: manager_roles exactly.
influxdb:
- state: influxdb
tgt: 'G@role:so-eval or G@role:so-import or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-standalone'
# kafka: standalone, manager, managerhype, managersearch, searchnode, receiver.
kafka:
- state: kafka
tgt: 'G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-receiver or G@role:so-searchnode or G@role:so-standalone'
# kibana: manager_roles exactly.
kibana:
- state: kibana
tgt: 'G@role:so-eval or G@role:so-import or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-standalone'
# kratos: manager_roles exactly.
kratos:
- state: kratos
tgt: 'G@role:so-eval or G@role:so-import or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-standalone'
# logrotate: universal (top-of-file '*' branch in salt/top.sls).
logrotate:
- state: logrotate
tgt: '*'
# logstash: 8 roles, no eval/import.
logstash:
- state: logstash
tgt: 'G@role:so-fleet or G@role:so-heavynode or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-receiver or G@role:so-searchnode or G@role:so-standalone'
# manager: manager_roles exactly. The manager state is also referenced under
# *_sensor / *_heavynode top.sls blocks via `sensor`, but the standalone
# `manager` state itself runs only on manager_roles.
manager:
- state: manager
tgt: 'G@role:so-eval or G@role:so-import or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-standalone'
# nginx: 10 specific roles. NOT receiver, idh, hypervisor, desktop.
nginx:
- state: nginx
tgt: 'G@role:so-eval or G@role:so-fleet or G@role:so-heavynode or G@role:so-import or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-searchnode or G@role:so-sensor or G@role:so-standalone'
# ntp: universal (top-of-file '*' branch in salt/top.sls).
ntp:
- state: ntp
tgt: '*'
# patch: universal. soc_patch carries the OS update schedule, applied via
# patch.os.schedule on every node (it's in both the all-non-managers and
# all-managers branches of salt/top.sls).
patch:
- state: patch.os.schedule
tgt: '*'
# postgres: manager_roles exactly.
postgres:
- state: postgres
tgt: 'G@role:so-eval or G@role:so-import or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-standalone'
# redis: 6 roles. standalone, manager, managerhype, managersearch, heavynode, receiver.
# (NOT eval, NOT import, NOT searchnode.)
redis:
- state: redis
tgt: 'G@role:so-heavynode or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-receiver or G@role:so-standalone'
# registry: manager_roles exactly.
registry:
- state: registry
tgt: 'G@role:so-eval or G@role:so-import or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-standalone'
# sensoroni: universal.
sensoroni:
- state: sensoroni
tgt: '*'
# soc: manager_roles exactly.
soc:
- state: soc
tgt: 'G@role:so-eval or G@role:so-import or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-standalone'
# stig: broad. Runs on standalone, manager, managerhype, managersearch,
# searchnode, sensor, receiver, fleet, hypervisor, desktop.
# NOT eval, NOT import, NOT heavynode, NOT idh (the *_idh block in
# salt/top.sls intentionally omits stig).
stig:
- state: stig
tgt: 'G@role:so-desktop or G@role:so-fleet or G@role:so-hypervisor or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-receiver or G@role:so-searchnode or G@role:so-sensor or G@role:so-standalone'
# strelka: sensor-side only on pillar change (sensor_roles). strelka.manager is
# intentionally NOT fired on pillar changes -- YARA rule and strelka config
# pillar changes are consumed by the sensor-side strelka backend, and re-running
# strelka.manager on managers is both unnecessary and disruptive. strelka.manager
# is left to the 2-hour highstate.
strelka:
- state: strelka
tgt: 'G@role:so-eval or G@role:so-heavynode or G@role:so-sensor or G@role:so-standalone'
# suricata: sensor_roles + so-import (5 roles).
suricata:
- state: suricata
tgt: 'G@role:so-eval or G@role:so-heavynode or G@role:so-import or G@role:so-sensor or G@role:so-standalone'
# telegraf: universal.
telegraf:
- state: telegraf
tgt: '*'
# versionlock: universal (top-of-file '*' branch in salt/top.sls).
versionlock:
- state: versionlock
tgt: '*'
# vm: libvirt-driver hypervisors only. Matched by the salt-cloud:driver:libvirt
# grain (compound supports nested grain matching via G@<key>:<subkey>:<value>).
# pillar/vm/soc_vm.sls write path is referenced at salt/_runners/setup_hypervisor.py:856.
vm:
- state: vm
tgt: 'G@salt-cloud:driver:libvirt'
# zeek: sensor_roles + so-import (5 roles).
zeek:
- state: zeek
tgt: 'G@role:so-eval or G@role:so-heavynode or G@role:so-import or G@role:so-sensor or G@role:so-standalone'
-176
View File
@@ -1,176 +0,0 @@
#!py
# Reactor invoked by the pillar_db beacon when SOC records settings changes in
# the so_soc.audit_settings table (see salt/_beacons/pillar_db.py). The beacon
# emits one event per new row carrying setting_id and node_id.
#
# Two branches, keyed on node_id:
# A) node_id populated -> the change is scoped to that one minion. Look up the
# app in pillar_push_map.yaml and write an intent that runs the app's mapped
# state(s) targeted to just that node.
# B) node_id empty -> grid-wide app change. Look up the app in
# pillar_push_map.yaml and write an intent with the entry's actions as-is.
#
# The app name is the first dotted segment of setting_id (e.g. "telegraf.output"
# -> "telegraf"), which matches the pillar_push_map.yaml keys 1:1.
#
# Reactors never dispatch directly. The so-push-drainer schedule picks up
# ready intents, dedupes across pending files, and dispatches orch.push_batch.
import fcntl
import json
import logging
import os
import time
from salt.client import Caller
import yaml
LOG = logging.getLogger(__name__)
PENDING_DIR = '/opt/so/state/push_pending'
LOCK_FILE = os.path.join(PENDING_DIR, '.lock')
MAX_PATHS = 20
# The pillar_push_map.yaml is shipped via salt:// but the reactor runs on the
# master, which mounts the default saltstack tree at this path.
PUSH_MAP_PATH = '/opt/so/saltstack/default/salt/reactor/pillar_push_map.yaml'
_PUSH_MAP_CACHE = {'mtime': 0, 'data': None}
def _load_push_map():
try:
st = os.stat(PUSH_MAP_PATH)
except OSError:
LOG.warning('push_pillar: %s not found', PUSH_MAP_PATH)
return {}
if _PUSH_MAP_CACHE['mtime'] != st.st_mtime:
try:
with open(PUSH_MAP_PATH, 'r') as f:
_PUSH_MAP_CACHE['data'] = yaml.safe_load(f) or {}
except Exception:
LOG.exception('push_pillar: failed to load %s', PUSH_MAP_PATH)
_PUSH_MAP_CACHE['data'] = {}
_PUSH_MAP_CACHE['mtime'] = st.st_mtime
return _PUSH_MAP_CACHE['data'] or {}
def _push_enabled():
try:
caller = Caller()
return bool(caller.cmd('pillar.get', 'global:push:enabled', True))
except Exception:
LOG.exception('push_pillar: pillar.get global:push:enabled failed, assuming enabled')
return True
def _write_intent(key, actions, path):
now = time.time()
try:
os.makedirs(PENDING_DIR, exist_ok=True)
except OSError:
LOG.exception('push_pillar: cannot create %s', PENDING_DIR)
return
intent_path = os.path.join(PENDING_DIR, '{}.json'.format(key))
lock_fd = os.open(LOCK_FILE, os.O_CREAT | os.O_RDWR, 0o644)
try:
fcntl.flock(lock_fd, fcntl.LOCK_EX)
intent = {}
if os.path.exists(intent_path):
try:
with open(intent_path, 'r') as f:
intent = json.load(f)
except (IOError, ValueError):
intent = {}
intent.setdefault('first_touch', now)
intent['last_touch'] = now
intent['actions'] = actions
paths = intent.get('paths', [])
if path and path not in paths:
paths.append(path)
paths = paths[-MAX_PATHS:]
intent['paths'] = paths
tmp_path = intent_path + '.tmp'
with open(tmp_path, 'w') as f:
json.dump(intent, f)
os.rename(tmp_path, intent_path)
except Exception:
LOG.exception('push_pillar: failed to write intent %s', intent_path)
finally:
try:
fcntl.flock(lock_fd, fcntl.LOCK_UN)
finally:
os.close(lock_fd)
def _app_from_setting(setting_id):
# setting_id is e.g. 'telegraf.output' -> 'telegraf', 'ntp.config.servers' -> 'ntp'
if not setting_id:
return None
return setting_id.split('.', 1)[0] or None
def _node_actions(entry, node_id):
# Copy the app's mapped actions but retarget each one to the single node.
# Preserves the state/highstate selection and any batch/batch_wait overrides.
actions = []
for action in entry:
if not isinstance(action, dict):
continue
node_action = dict(action)
node_action['tgt'] = node_id
node_action['tgt_type'] = 'glob'
actions.append(node_action)
return actions
def run():
if not _push_enabled():
LOG.info('push_pillar: push disabled, skipping')
return {}
# The pillar_db beacon nests its payload under data['data']; fall back to the
# top level so the reactor is robust to either shape.
event = data.get('data', data) # noqa: F821 -- data provided by reactor
setting_id = event.get('setting_id', '')
node_id = (event.get('node_id') or '').strip()
app = _app_from_setting(setting_id)
if not app:
LOG.debug('push_pillar: ignoring event with no app segment: setting_id=%s', setting_id)
return {}
push_map = _load_push_map()
entry = push_map.get(app)
if not entry:
LOG.warning(
'push_pillar: app "%s" is not in pillar_push_map.yaml; change will be '
'picked up at the next scheduled highstate (setting_id=%s)',
app, setting_id,
)
return {}
# Branch A: per-node change -> retarget the app's states to just that node.
if node_id:
actions = _node_actions(entry, node_id)
if not actions:
LOG.warning('push_pillar: no usable actions for app "%s" (setting_id=%s)', app, setting_id)
return {}
_write_intent(
'node_{}_{}'.format(node_id, app), actions,
'audit:{}@{}'.format(setting_id, node_id),
)
LOG.info('push_pillar: per-node intent updated for %s on %s (setting_id=%s)',
app, node_id, setting_id)
return {}
# Branch B: grid-wide app change -> use the map entry's actions as-is.
actions = list(entry) # copy to avoid mutating the cache
_write_intent('pillar_{}'.format(app), actions, 'audit:{}'.format(setting_id))
LOG.info('push_pillar: app intent updated for %s (setting_id=%s)', app, setting_id)
return {}
-96
View File
@@ -1,96 +0,0 @@
#!py
# Reactor invoked by the inotify beacon on rule file changes under
# /opt/so/saltstack/local/salt/strelka/rules/compiled/.
#
# Writes (or updates) a push intent at /opt/so/state/push_pending/rules_strelka.json
# and returns {}. The so-push-drainer schedule picks up ready intents, dedupes
# across pending files, and dispatches orch.push_batch. Reactors never dispatch
# directly -- see plan /home/mreeves/.claude/plans/goofy-marinating-hummingbird.md.
import fcntl
import json
import logging
import os
import time
from salt.client import Caller
LOG = logging.getLogger(__name__)
PENDING_DIR = '/opt/so/state/push_pending'
LOCK_FILE = os.path.join(PENDING_DIR, '.lock')
MAX_PATHS = 20
# Mirrors GLOBALS.sensor_roles in salt/vars/globals.map.jinja. Sensor-side
# strelka runs on exactly these four roles; so-import gets strelka.manager
# instead, which is not fired on pillar changes.
SENSOR_ROLES = ['so-eval', 'so-heavynode', 'so-sensor', 'so-standalone']
def _sensor_compound():
return ' or '.join('G@role:{}'.format(r) for r in SENSOR_ROLES)
def _push_enabled():
try:
caller = Caller()
return bool(caller.cmd('pillar.get', 'global:push:enabled', True))
except Exception:
LOG.exception('push_strelka: pillar.get global:push:enabled failed, assuming enabled')
return True
def _write_intent(key, actions, path):
now = time.time()
try:
os.makedirs(PENDING_DIR, exist_ok=True)
except OSError:
LOG.exception('push_strelka: cannot create %s', PENDING_DIR)
return
intent_path = os.path.join(PENDING_DIR, '{}.json'.format(key))
lock_fd = os.open(LOCK_FILE, os.O_CREAT | os.O_RDWR, 0o644)
try:
fcntl.flock(lock_fd, fcntl.LOCK_EX)
intent = {}
if os.path.exists(intent_path):
try:
with open(intent_path, 'r') as f:
intent = json.load(f)
except (IOError, ValueError):
intent = {}
intent.setdefault('first_touch', now)
intent['last_touch'] = now
intent['actions'] = actions
paths = intent.get('paths', [])
if path and path not in paths:
paths.append(path)
paths = paths[-MAX_PATHS:]
intent['paths'] = paths
tmp_path = intent_path + '.tmp'
with open(tmp_path, 'w') as f:
json.dump(intent, f)
os.rename(tmp_path, intent_path)
except Exception:
LOG.exception('push_strelka: failed to write intent %s', intent_path)
finally:
try:
fcntl.flock(lock_fd, fcntl.LOCK_UN)
finally:
os.close(lock_fd)
def run():
if not _push_enabled():
LOG.info('push_strelka: push disabled, skipping')
return {}
path = data.get('path', '') # noqa: F821 -- data provided by reactor
actions = [{'state': 'strelka', 'tgt': _sensor_compound()}]
_write_intent('rules_strelka', actions, path)
LOG.info('push_strelka: intent updated for path=%s', path)
return {}
-95
View File
@@ -1,95 +0,0 @@
#!py
# Reactor invoked by the inotify beacon on rule file changes under
# /opt/so/saltstack/local/salt/suricata/rules/.
#
# Writes (or updates) a push intent at /opt/so/state/push_pending/rules_suricata.json
# and returns {}. The so-push-drainer schedule picks up ready intents, dedupes
# across pending files, and dispatches orch.push_batch. Reactors never dispatch
# directly -- see plan /home/mreeves/.claude/plans/goofy-marinating-hummingbird.md.
import fcntl
import json
import logging
import os
import time
from salt.client import Caller
LOG = logging.getLogger(__name__)
PENDING_DIR = '/opt/so/state/push_pending'
LOCK_FILE = os.path.join(PENDING_DIR, '.lock')
MAX_PATHS = 20
# Mirrors GLOBALS.sensor_roles in salt/vars/globals.map.jinja. Suricata also
# runs on so-import per salt/top.sls, so that role is appended below.
SENSOR_ROLES = ['so-eval', 'so-heavynode', 'so-sensor', 'so-standalone']
def _sensor_compound_plus_import():
return ' or '.join('G@role:{}'.format(r) for r in SENSOR_ROLES) + ' or G@role:so-import'
def _push_enabled():
try:
caller = Caller()
return bool(caller.cmd('pillar.get', 'global:push:enabled', True))
except Exception:
LOG.exception('push_suricata: pillar.get global:push:enabled failed, assuming enabled')
return True
def _write_intent(key, actions, path):
now = time.time()
try:
os.makedirs(PENDING_DIR, exist_ok=True)
except OSError:
LOG.exception('push_suricata: cannot create %s', PENDING_DIR)
return
intent_path = os.path.join(PENDING_DIR, '{}.json'.format(key))
lock_fd = os.open(LOCK_FILE, os.O_CREAT | os.O_RDWR, 0o644)
try:
fcntl.flock(lock_fd, fcntl.LOCK_EX)
intent = {}
if os.path.exists(intent_path):
try:
with open(intent_path, 'r') as f:
intent = json.load(f)
except (IOError, ValueError):
intent = {}
intent.setdefault('first_touch', now)
intent['last_touch'] = now
intent['actions'] = actions
paths = intent.get('paths', [])
if path and path not in paths:
paths.append(path)
paths = paths[-MAX_PATHS:]
intent['paths'] = paths
tmp_path = intent_path + '.tmp'
with open(tmp_path, 'w') as f:
json.dump(intent, f)
os.rename(tmp_path, intent_path)
except Exception:
LOG.exception('push_suricata: failed to write intent %s', intent_path)
finally:
try:
fcntl.flock(lock_fd, fcntl.LOCK_UN)
finally:
os.close(lock_fd)
def run():
if not _push_enabled():
LOG.info('push_suricata: push disabled, skipping')
return {}
path = data.get('path', '') # noqa: F821 -- data provided by reactor
actions = [{'state': 'suricata', 'tgt': _sensor_compound_plus_import()}]
_write_intent('rules_suricata', actions, path)
LOG.info('push_suricata: intent updated for path=%s', path)
return {}
-1
View File
@@ -17,7 +17,6 @@ include:
so-redis:
docker_container.running:
- image: {{ GLOBALS.registry_host }}:5000/{{ GLOBALS.image_repo }}/so-redis:{{ GLOBALS.so_version }}
- restart_policy: unless-stopped
- hostname: so-redis
- user: socore
- networks:
-3
View File
@@ -21,9 +21,6 @@ so-dockerregistry:
- networks:
- sobridge:
- ipv4_address: {{ DOCKERMERGED.containers['so-dockerregistry'].ip }}
# Intentionally `always` (not unless-stopped) -- registry is critical infra
# and must come back up even if it was manually stopped. Do not homogenize
# to unless-stopped; see the container auto-restart section of the plan.
- restart_policy: always
- port_bindings:
{% for BINDING in DOCKERMERGED.containers['so-dockerregistry'].port_bindings %}
+1 -2
View File
@@ -3,7 +3,7 @@
{% set SCHEDULE = salt['pillar.get']('healthcheck:schedule', 30) %}
include:
- salt.minion
- salt
{% if CHECKS and ENABLED %}
salt_beacons:
@@ -23,4 +23,3 @@ salt_beacons:
- watch_in:
- service: salt_minion_service
{% endif %}
+3 -3
View File
@@ -17,7 +17,7 @@ engines:
to:
'KAFKA':
- cmd.run:
cmd: /usr/sbin/so-yaml.py replace /opt/so/saltstack/local/pillar/kafka/soc_kafka.sls kafka.enabled True
cmd: /usr/sbin/so-yaml.py replace /opt/so/saltstack/local/pillar/kafka/soc_kafka.sls kafka.enabled True && /usr/sbin/so-config.py sync-yaml-mutation /opt/so/saltstack/local/pillar/kafka/soc_kafka.sls replace kafka.enabled True --note "pillarWatch global.pipeline"
- cmd.run:
cmd: salt -C 'G@role:so-standalone or G@role:so-manager or G@role:so-managersearch or G@role:so-receiver or G@role:so-searchnode' saltutil.kill_all_jobs
- cmd.run:
@@ -28,7 +28,7 @@ engines:
to:
'REDIS':
- cmd.run:
cmd: /usr/sbin/so-yaml.py replace /opt/so/saltstack/local/pillar/kafka/soc_kafka.sls kafka.enabled False
cmd: /usr/sbin/so-yaml.py replace /opt/so/saltstack/local/pillar/kafka/soc_kafka.sls kafka.enabled False && /usr/sbin/so-config.py sync-yaml-mutation /opt/so/saltstack/local/pillar/kafka/soc_kafka.sls replace kafka.enabled False --note "pillarWatch global.pipeline"
- cmd.run:
cmd: salt -C 'G@role:so-standalone or G@role:so-manager or G@role:so-managersearch or G@role:so-receiver or G@role:so-searchnode' saltutil.kill_all_jobs
- cmd.run:
@@ -66,5 +66,5 @@ engines:
- cmd.run:
cmd: salt -C 'G@role:so-standalone or G@role:so-manager or G@role:so-managersearch or G@role:so-receiver' state.apply kafka.disabled,kafka.reset
- cmd.run:
cmd: /usr/sbin/so-yaml.py remove /opt/so/saltstack/local/pillar/kafka/soc_kafka.sls kafka.reset
cmd: /usr/sbin/so-yaml.py remove /opt/so/saltstack/local/pillar/kafka/soc_kafka.sls kafka.reset && /usr/sbin/so-config.py sync-yaml-mutation /opt/so/saltstack/local/pillar/kafka/soc_kafka.sls remove kafka.reset --note "pillarWatch kafka.reset"
interval: 10
-11
View File
@@ -1,11 +0,0 @@
reactor:
- 'salt/beacon/*/inotify//opt/so/saltstack/local/salt/suricata/rules':
- salt://reactor/push_suricata.sls
- 'salt/beacon/*/inotify//opt/so/saltstack/local/salt/suricata/rules/*':
- salt://reactor/push_suricata.sls
- 'salt/beacon/*/inotify//opt/so/saltstack/local/salt/strelka/rules/compiled':
- salt://reactor/push_strelka.sls
- 'salt/beacon/*/inotify//opt/so/saltstack/local/salt/strelka/rules/compiled/*':
- salt://reactor/push_strelka.sls
- 'salt/beacon/*/pillar_db/audit_settings':
- salt://reactor/push_pillar.sls
-8
View File
@@ -5,11 +5,3 @@ salt_bootstrap:
- source: salt://salt/scripts/bootstrap-salt.sh
- mode: 755
- show_changes: False
salt_sbin:
file.recurse:
- name: /usr/sbin
- source: salt://salt/tools/sbin
- user: 939
- group: 939
- file_mode: 755
+1 -1
View File
@@ -1,4 +1,4 @@
lasthighstate:
file.touch:
- name: /opt/so/log/salt/lasthighstate
- order: 9001
- order: last
+3 -19
View File
@@ -10,13 +10,12 @@
# software that is protected by the license key."
{% from 'allowed_states.map.jinja' import allowed_states %}
{% from 'global/map.jinja' import GLOBALMERGED %}
{% if sls in allowed_states %}
include:
- salt.minion
- salt.master.pyinotify
- salt.master.boot_mine_update
- salt.master.ext_pillar_postgres
- salt.master.pg_notify_pillar_engine
{% if 'vrt' in salt['pillar.get']('features', []) %}
- salt.cloud
- salt.cloud.reactor_config_hypervisor
@@ -65,21 +64,6 @@ engines_config:
- name: /etc/salt/master.d/engines.conf
- source: salt://salt/files/engines.conf
{% if GLOBALMERGED.push.enabled %}
reactor_pushstate_config:
file.managed:
- name: /etc/salt/master.d/reactor_pushstate.conf
- source: salt://salt/files/reactor_pushstate.conf
- watch_in:
- service: salt_master_service
{% else %}
reactor_pushstate_config:
file.absent:
- name: /etc/salt/master.d/reactor_pushstate.conf
- watch_in:
- service: salt_master_service
{% endif %}
# update the bootstrap script when used for salt-cloud
salt_bootstrap_cloud:
file.managed:
@@ -95,7 +79,7 @@ salt_master_service:
- file: checkmine_engine
- file: pillarWatch_engine
- file: engines_config
- order: 9002
- order: last
{% else %}
-29
View File
@@ -1,29 +0,0 @@
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
# Manages /etc/systemd/system/so-boot-mine-update.service, a manager-only
# Type=oneshot unit that pushes `salt '*' mine.update` once per boot, ordered
# before so-boot-highstate.service so mine-backed pillars (node IPs, ES/Redis/
# Logstash discovery) are fresh before the boot highstate renders them.
include:
- systemd.reload
so_boot_mine_update_unit_file:
file.managed:
- name: /etc/systemd/system/so-boot-mine-update.service
- source: salt://salt/service/so-boot-mine-update.service
- onchanges_in:
- module: systemd_reload
# Only enable once setup is complete. Until then the gate file is missing and
# the unit's own ConditionPathExists would no-op it anyway.
so_boot_mine_update_service:
service.enabled:
- name: so-boot-mine-update.service
- onlyif: test -e /opt/so/state/setup-complete
- require:
- file: so_boot_mine_update_unit_file
- module: systemd_reload
+24
View File
@@ -0,0 +1,24 @@
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
# Deprecated. SOC/onionconfig owns the settings database now; this state only
# removes the old so_pillar ext_pillar config if it was previously deployed.
{% from 'allowed_states.map.jinja' import allowed_states %}
{% if sls.split('.')[0] in allowed_states %}
ext_pillar_postgres_config_absent:
file.absent:
- name: /etc/salt/master.d/ext_pillar_postgres.conf
- watch_in:
- service: salt_master_service
{% else %}
{{sls}}_state_not_allowed:
test.fail_without_changes:
- name: {{sls}}_state_not_allowed
{% endif %}
@@ -0,0 +1,37 @@
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
# Deprecated. SOC/onionconfig owns the settings database now; this state only
# removes the old so_pillar notify engine and reactor config if previously
# deployed.
{% from 'allowed_states.map.jinja' import allowed_states %}
{% if sls.split('.')[0] in allowed_states %}
pg_notify_pillar_engine_module_absent:
file.absent:
- name: /etc/salt/engines/pg_notify_pillar.py
- watch_in:
- service: salt_master_service
pg_notify_pillar_engine_config_absent:
file.absent:
- name: /etc/salt/master.d/pg_notify_pillar_engine.conf
- watch_in:
- service: salt_master_service
pg_notify_pillar_reactor_config_absent:
file.absent:
- name: /etc/salt/master.d/so_pillar_reactor.conf
- watch_in:
- service: salt_master_service
{% else %}
{{sls}}_state_not_allowed:
test.fail_without_changes:
- name: {{sls}}_state_not_allowed
{% endif %}
-20
View File
@@ -1,20 +0,0 @@
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
pyinotify_module_package:
file.recurse:
- name: /opt/so/conf/salt/module_packages/pyinotify
- source: salt://salt/module_packages/pyinotify
- clean: True
- makedirs: True
pyinotify_python_module_install:
cmd.run:
- name: /opt/saltstack/salt/bin/python3.10 -m pip install pyinotify --no-index --find-links=/opt/so/conf/salt/module_packages/pyinotify/ --upgrade
- onchanges:
- file: pyinotify_module_package
- failhard: True
- watch_in:
- service: salt_minion_service
+1
View File
@@ -2,3 +2,4 @@
salt:
minion:
version: '3006.19'
check_threshold: 3600 # in seconds, threshold used for so-salt-minion-check. any value less than 600 seconds may cause a lot of salt-minion restarts since the job to touch the file occurs every 5-8 minutes by default
-31
View File
@@ -1,31 +0,0 @@
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
# Manages /etc/systemd/system/so-boot-highstate.service, a Type=oneshot
# RemainAfterExit=yes unit that runs `salt-call state.highstate` exactly once
# per system boot. Replaces the legacy `startup_states: highstate` minion
# config, which fired on every salt-minion service restart (causing a redundant
# highstate whenever a highstate itself restarted salt-minion).
include:
- systemd.reload
so_boot_highstate_unit_file:
file.managed:
- name: /etc/systemd/system/so-boot-highstate.service
- source: salt://salt/service/so-boot-highstate.service
- onchanges_in:
- module: systemd_reload
# Only enable once setup is complete. Until then the gate file is missing and
# the unit's own ConditionPathExists would no-op it anyway -- this just keeps
# `systemctl is-enabled` honest for the sync_es_users gate.
so_boot_highstate_service:
service.enabled:
- name: so-boot-highstate.service
- onlyif: test -e /opt/so/state/setup-complete
- require:
- file: so_boot_highstate_unit_file
- module: systemd_reload
+6 -47
View File
@@ -17,7 +17,6 @@ include:
- repo.client
- salt.mine_functions
- salt.minion.service_file
- salt.minion.boot_highstate
{% if GLOBALS.is_manager %}
- ca.signing_policy
{% endif %}
@@ -81,47 +80,21 @@ set_log_levels:
- "log_level: info"
- "log_level_logfile: info"
# startup_states: highstate caused a full highstate to run on every
# salt-minion service start, including the restart triggered when a highstate
# itself modified the minion config (beacons, mine, unit file). Replaced by
# so-boot-highstate.service (managed in salt.minion.boot_highstate), which
# runs once per system boot only. Strip the line from /etc/salt/minion on
# upgrade; both the commented and uncommented forms historically existed.
remove_startup_states:
file.line:
enable_startup_states:
file.uncomment:
- name: /etc/salt/minion
- match: 'startup_states: highstate'
- mode: delete
# Upgrade-path bridge: systems that already passed setup under the old gate
# (`grep -x 'startup_states: highstate' /etc/salt/minion`) get a /opt/so/state/setup-complete
# marker so so-boot-highstate.service can be enabled and the so-user_sync cron
# in sync_es_users.sls keeps installing. Setup-in-progress systems instead get
# the marker from `mark_setup_complete` in setup/so-functions at the right
# moment. `replace: false` means we never overwrite a marker once written.
mark_setup_complete_for_upgrades:
file.managed:
- name: /opt/so/state/setup-complete
- replace: false
- makedirs: True
- onlyif: "grep -qx 'startup_states: highstate' /etc/salt/minion"
- require_in:
- file: remove_startup_states
- service: so_boot_highstate_service
- regex: '^startup_states: highstate$'
- unless: pgrep so-setup
{% endif %}
# this has to be outside the if statement above since there are <requisite>_in calls to this state.
# uses watch (not listen) so the restart fires in-state and its result lands on this state's
# running entry; that is what lets wait_for_salt_minion_ready below detect any restart
# uniformly via onchanges, regardless of whether the trigger came from these files or from
# external watch_in's (e.g. beacons, master/pyinotify).
# this has to be outside the if statement above since there are <requisite>_in calls to this state
salt_minion_service:
service.running:
- name: salt-minion
- enable: True
- onlyif: test "{{INSTALLEDSALTVERSION}}" == "{{SALTVERSION}}"
- watch:
- listen:
- file: mine_functions
{% if INSTALLEDSALTVERSION|string == SALTVERSION|string %}
- file: set_log_levels
@@ -130,17 +103,3 @@ salt_minion_service:
- file: signing_policy
{% endif %}
- order: last
# block until the just-restarted salt-minion is back and can execute modules locally, so
# follow-on jobs and the next highstate iteration do not race the restart. onchanges +
# require on salt_minion_service catches every restart trigger uniformly because watch
# mod_watch results replace the service state's running entry. wait logic lives in
# /usr/sbin/so-salt-minion-wait (deployed by common_sbin from common/tools/sbin/).
wait_for_salt_minion_ready:
cmd.run:
- name: /usr/sbin/so-salt-minion-wait
- onchanges:
- service: salt_minion_service
- require:
- service: salt_minion_service
- order: last

Some files were not shown because too many files have changed in this diff Show More