Move onionconfig writes out of so-yaml

make so-yaml PG-canonical and add pillar-change reactor stack

Two coupled changes that together let so_pillar.* be the canonical config store, with config edits driving service reloads automatically:

so-yaml PG-canonical mode
- Adds /opt/so/conf/so-yaml/mode (and SO_YAML_BACKEND env override) with three values: dual (legacy), postgres (PG-only for managed paths), disk (emergency rollback). Bootstrap files (secrets.sls, ca/init.sls, *.nodes.sls, top.sls, ...) stay disk-only regardless via the existing SkipPath allowlist in so_yaml_postgres.locate.
- loadYaml/writeYaml/purgeFile now route to so_pillar.* in postgres mode: replace/add/get all read+write the database with no disk file ever appearing. PG failure is fatal in postgres mode (no silent fallback); dual mode preserves the prior best-effort mirror.
- so_yaml_postgres gains read_yaml(path), is_pg_managed(path), and is_enabled() so so-yaml can answer "is this path PG-managed and is PG up" without reaching into private helpers.
- schema_pillar.sls writes /opt/so/conf/so-yaml/mode = postgres after the importer succeeds, so flipping postgres:so_pillar:enabled flips so-yaml's behavior in lockstep with the schema being live.

pg_notify-driven change fan-out
- 008_change_notify.sql adds so_pillar.change_queue + an AFTER trigger on pillar_entry that enqueues the locator and pg_notifies 'so_pillar_change'. Queue is drained at-least-once so engine restarts don't lose events; pg_notify is just the wakeup signal.
- New salt-master engine pg_notify_pillar.py LISTENs on the channel, drains the queue with FOR UPDATE SKIP LOCKED, debounces bursts, and fires 'so/pillar/changed' events grouped by (scope, role, minion).
- Reactor so_pillar_changed.sls catches the tag and dispatches to orch.so_pillar_reload, which carries a DISPATCH map of pillar-path prefix -> (state sls, role grain set) so adding a new service to the auto-reload list is a one-line edit instead of a new reactor.
- Engine + reactor wiring is gated on the same postgres:so_pillar:enabled flag as the schema and ext_pillar config so the whole stack flips on/off together.

Tests: 21 new cases (112 total, all passing) covering mode resolution, PG-managed detection, and PG-canonical read/write/purge routing with the PG client stubbed.
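
The mode resolution described in the commit message could look roughly like the sketch below. It is an illustration only, not the code from so-yaml: the function name `resolve_mode`, the fallback to `dual` when nothing valid is configured, and the handling of unrecognized values are all assumptions. Only the file path, the `SO_YAML_BACKEND` variable, and the three mode names come from the commit message.

```python
import os

# Mode names and paths taken from the commit message.
VALID_MODES = ("dual", "postgres", "disk")
MODE_FILE = "/opt/so/conf/so-yaml/mode"

# Assumption: fall back to the legacy behavior when nothing valid is configured.
DEFAULT_MODE = "dual"


def resolve_mode(env=None, mode_file=MODE_FILE):
    """Hypothetical helper: pick the so-yaml backend mode.

    The SO_YAML_BACKEND environment variable overrides the mode file.
    Unrecognized values are ignored (an assumption, not confirmed behavior).
    """
    env = os.environ if env is None else env

    override = (env.get("SO_YAML_BACKEND") or "").strip().lower()
    if override in VALID_MODES:
        return override

    try:
        with open(mode_file, "r") as f:
            value = f.read().strip().lower()
    except OSError:
        # Mode file missing or unreadable: use the default.
        return DEFAULT_MODE

    return value if value in VALID_MODES else DEFAULT_MODE
```

Passing `env` and `mode_file` as parameters keeps the helper testable without touching the real environment or /opt/so, which matches the stubbed-client approach the commit's tests describe.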
2026-06-15 14:48:43 +02:00 · 2026-05-12 16:05:55 -04:00 · 2026-05-01 09:31:48 -04:00 · 2026-04-30 17:09:58 -04:00 · 2026-04-30 16:34:05 -04:00 · 2026-04-30 16:30:57 -04:00
132 changed files with 1176 additions and 4035 deletions
@@ -11,7 +11,6 @@ body:
        -
        - 3.0.0
        - 3.1.0
-        - 3.2.0
        - Other (please provide detail below)
    validations:
      required: true
@@ -1,17 +1,17 @@
-### 3.1.0-20260528 ISO image released on 2026/05/28
+### 3.0.0-20260331 ISO image released on 2026/03/31


 ### Download and Verify

-3.1.0-20260528 ISO image:  
-https://download.securityonion.net/file/securityonion/securityonion-3.1.0-20260528.iso
+3.0.0-20260331 ISO image:  
+https://download.securityonion.net/file/securityonion/securityonion-3.0.0-20260331.iso
 
-MD5: 9D6FF58DEEE24089D722C73169765B3E  
-SHA1: 2B8B816B6CEC3B7F96B3C5E040EBF502DD2C412F  
-SHA256: 62FAB57E247C843D6A04F0796D8162C732B65D82FC3E4A59D087135B9FD32912  
+MD5: ECD318A1662A6FDE0EF213F5A9BD4B07  
+SHA1: E55BE314440CCF3392DC0B06BC5E270B43176D9C  
+SHA256: 7FC47405E335CBE5C2B6C51FE7AC60248F35CBE504907B8B5A33822B23F8F4D5  

 Signature for ISO image:  
-https://github.com/Security-Onion-Solutions/securityonion/raw/3/main/sigs/securityonion-3.1.0-20260528.iso.sig
+https://github.com/Security-Onion-Solutions/securityonion/raw/3/main/sigs/securityonion-3.0.0-20260331.iso.sig

 Signing key:  
 https://raw.githubusercontent.com/Security-Onion-Solutions/securityonion/3/main/KEYS  
@@ -25,22 +25,22 @@ wget https://raw.githubusercontent.com/Security-Onion-Solutions/securityonion/3/

 Download the signature file for the ISO:  
 ```
-wget https://github.com/Security-Onion-Solutions/securityonion/raw/3/main/sigs/securityonion-3.1.0-20260528.iso.sig
+wget https://github.com/Security-Onion-Solutions/securityonion/raw/3/main/sigs/securityonion-3.0.0-20260331.iso.sig
 ```

 Download the ISO image:  
 ```
-wget https://download.securityonion.net/file/securityonion/securityonion-3.1.0-20260528.iso
+wget https://download.securityonion.net/file/securityonion/securityonion-3.0.0-20260331.iso
 ```

 Verify the downloaded ISO image using the signature file:  
 ```
-gpg --verify securityonion-3.1.0-20260528.iso.sig securityonion-3.1.0-20260528.iso
+gpg --verify securityonion-3.0.0-20260331.iso.sig securityonion-3.0.0-20260331.iso
 ```

 The output should show "Good signature" and the Primary key fingerprint should match what's shown below:
 ```
-gpg: Signature made Wed 27 May 2026 03:03:59 PM EDT using RSA key ID FE507013
+gpg: Signature made Mon 30 Mar 2026 06:22:14 PM EDT using RSA key ID FE507013
 gpg: Good signature from "Security Onion Solutions, LLC <info@securityonionsolutions.com>"
 gpg: WARNING: This key is not certified with a trusted signature!
 gpg:          There is no indication that the signature belongs to the owner.
@@ -1 +0,0 @@
-
@@ -1 +1 @@
-3.2.0
+3.1.0
@@ -1,142 +0,0 @@
-# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
-# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
-# https://securityonion.net/license; you may not use this file except in compliance with the
-# Elastic License 2.0.
-
-# Custom salt beacon that watches the SOC audit_settings table in postgres for
-# new settings changes and emits a beacon event per new row. This replaces the
-# inotify watch on /opt/so/saltstack/local/pillar -- instead of monitoring pillar
-# files on disk, we monitor the so_soc.audit_settings table that SOC writes to.
-#
-# Detection is poll-based with a monotonic `id` watermark persisted to
-# WATERMARK_FILE: each pass selects rows with id greater than the last id seen,
-# which makes it self-healing (a missed poll simply catches up on the next one).
-#
-# Each emitted event carries setting_id and node_id; the push_pillar reactor maps
-# setting_id -> app via pillar_push_map.yaml and writes a push intent, after which
-# the existing so-push-drainer / orch.push_batch pipeline takes over unchanged.
-
-import logging
-import os
-import subprocess
-
-log = logging.getLogger(__name__)
-
-WATERMARK_FILE = '/opt/so/state/pillar_db_watch.id'
-CONTAINER = 'so-postgres'
-DATABASE = 'so_soc'
-
-# Unaligned, tuples-only psql output with a field separator that cannot appear in
-# an id/setting_id/node_id, so we can split each row reliably.
-FIELD_SEP = '\x1f'
-
-
-def __virtual__():
-    return True
-
-
-def validate(config):
-    return True, 'valid'
-
-
-def _read_watermark():
-    # Returns the last processed id, or None if the watermark has not been seeded.
-    try:
-        with open(WATERMARK_FILE, 'r') as f:
-            return int((f.read() or '').strip())
-    except (IOError, ValueError):
-        return None
-
-
-def _write_watermark(value):
-    try:
-        os.makedirs(os.path.dirname(WATERMARK_FILE), exist_ok=True)
-        tmp = WATERMARK_FILE + '.tmp'
-        with open(tmp, 'w') as f:
-            f.write(str(int(value)))
-        os.rename(tmp, WATERMARK_FILE)
-    except OSError:
-        log.exception('pillar_db beacon: failed to persist watermark to %s', WATERMARK_FILE)
-
-
-def _query(sql):
-    # Run a query against so_soc inside the so-postgres container over the unix
-    # socket (trust auth, no password). Returns stdout on success, or None on any
-    # failure so the caller can no-op and retry on the next interval.
-    cmd = [
-        'docker', 'exec', CONTAINER,
-        'psql', '-U', 'postgres', '-d', DATABASE,
-        '-tA', '-F', FIELD_SEP, '-c', sql,
-    ]
-    try:
-        result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
-    except subprocess.TimeoutExpired:
-        log.warning('pillar_db beacon: psql timed out')
-        return None
-    except Exception:
-        log.exception('pillar_db beacon: failed to exec psql')
-        return None
-    if result.returncode != 0:
-        log.warning('pillar_db beacon: psql failed (rc=%s): %s',
-                    result.returncode, (result.stderr or '').strip())
-        return None
-    return result.stdout
-
-
-def beacon(config):
-    retval = []
-
-    watermark = _read_watermark()
-
-    # First run / missing watermark: seed to the current MAX(id) and emit nothing
-    # so we never replay the entire settings history into a fleetwide push.
-    if watermark is None:
-        seed = _query('SELECT COALESCE(MAX(id), 0) FROM audit_settings;')
-        if seed is None:
-            return retval  # postgres not ready yet; retry next interval
-        try:
-            _write_watermark(int((seed or '0').strip() or 0))
-        except ValueError:
-            log.warning('pillar_db beacon: could not parse MAX(id) seed: %r', seed)
-        return retval
-
-    rows = _query(
-        "SELECT id, setting_id, COALESCE(node_id, '') FROM audit_settings "
-        "WHERE id > %d ORDER BY id;" % watermark
-    )
-    if rows is None:
-        return retval
-
-    max_id = watermark
-    for line in rows.splitlines():
-        # Do NOT str.strip() the whole line: Python treats the \x1f field
-        # separator (and \x1c-\x1e) as whitespace, so stripping would eat an
-        # empty trailing node_id field and make the row look malformed.
-        if not line.strip():
-            continue
-        parts = line.split(FIELD_SEP)
-        if len(parts) < 3:
-            log.warning('pillar_db beacon: skipping malformed row: %r', line)
-            continue
-        try:
-            row_id = int(parts[0])
-        except ValueError:
-            log.warning('pillar_db beacon: skipping row with non-int id: %r', line)
-            continue
-        setting_id = parts[1]
-        node_id = parts[2]
-        retval.append({
-            'tag': 'audit_settings',
-            'id': row_id,
-            'setting_id': setting_id,
-            'node_id': node_id,
-        })
-        if row_id > max_id:
-            max_id = row_id
-
-    if max_id > watermark:
-        _write_watermark(max_id)
-        log.info('pillar_db beacon: emitted %d change(s), watermark %d -> %d',
-                 len(retval), watermark, max_id)
-
-    return retval
@@ -25,11 +25,9 @@ if [ ! -f $BACKUPFILE ]; then
  # Create empty backup file
  tar -cf $BACKUPFILE -T /dev/null

-  # Loop through all paths defined in global.sls, and append them to backup file if they exist
+  # Loop through all paths defined in global.sls, and append them to backup file
  {%- for LOCATION in BACKUPLOCATIONS %}
-  if [[ -d {{ LOCATION }} || -f {{ LOCATION }} ]]; then
  tar -rf $BACKUPFILE "${EXCLUSIONS[@]}" {{ LOCATION }}
-  fi
  {%- endfor %}

 fi
@@ -48,6 +48,13 @@ copy_so-yaml_manager_tools_sbin:
    - force: True
    - preserve: True

+copy_so-config_manager_tools_sbin:
+  file.copy:
+    - name: /opt/so/saltstack/default/salt/manager/tools/sbin/so-config.py
+    - source: {{UPDATE_DIR}}/salt/manager/tools/sbin/so-config.py
+    - force: True
+    - preserve: True
+
 copy_so-repo-sync_manager_tools_sbin:
  file.copy:
    - name: /opt/so/saltstack/default/salt/manager/tools/sbin/so-repo-sync
@@ -97,6 +104,13 @@ copy_so-yaml_sbin:
    - force: True
    - preserve: True

+copy_so-config_sbin:
+  file.copy:
+    - name: /usr/sbin/so-config.py
+    - source: {{UPDATE_DIR}}/salt/manager/tools/sbin/so-config.py
+    - force: True
+    - preserve: True
+
 copy_so-repo-sync_sbin:
  file.copy:
    - name: /usr/sbin/so-repo-sync
@@ -192,21 +192,8 @@ update_docker_containers() {
          echo "Unable to tag $image" >> "$LOG_FILE" 2>&1 
          exit 1
        }
-        # Push to the embedded registry via a registry-to-registry copy. Avoids
-        # `docker push`, which on Docker 29.x with the containerd image store
-        # represents freshly-pulled images as an index whose layer content
-        # isn't reachable through the push path. The local `docker tag` above
-        # is preserved so so-image-pull's `:5000` existence check still works.
-        # Pin to the digest already gpg-verified above so we copy exactly the
-        # bytes we approved.
-        local VERIFIED_REF
-        VERIFIED_REF=$(echo "$DOCKERINSPECT" | jq -r ".[0].RepoDigests[] | select(. | contains(\"$CONTAINER_REGISTRY\"))" | head -n 1)
-        if [ -z "$VERIFIED_REF" ] || [ "$VERIFIED_REF" = "null" ]; then
-          echo "Unable to determine verified digest for $image" >> "$LOG_FILE" 2>&1
-          exit 1
-        fi
-        docker buildx imagetools create --tag $HOSTNAME:5000/$IMAGEREPO/$image "$VERIFIED_REF" >> "$LOG_FILE" 2>&1 || {
-          echo "Unable to copy $image to embedded registry" >> "$LOG_FILE" 2>&1
+        docker push $HOSTNAME:5000/$IMAGEREPO/$image >> "$LOG_FILE" 2>&1 || {
+          echo "Unable to push $image" >> "$LOG_FILE" 2>&1 
          exit 1
        }
      fi
@@ -165,8 +165,6 @@ if [[ $EXCLUDE_FALSE_POSITIVE_ERRORS == 'Y' ]]; then
    EXCLUDED_ERRORS="$EXCLUDED_ERRORS|upgrading component template"  # false positive (elasticsearch index or template names contain 'error')
    EXCLUDED_ERRORS="$EXCLUDED_ERRORS|upgrading composable template" # false positive (elasticsearch composable template names contain 'error')
    EXCLUDED_ERRORS="$EXCLUDED_ERRORS|Error while parsing document for index \[.ds-logs-kratos-so-.*object mapping for \[file\]" # false positive (mapping error occuring BEFORE kratos index has rolled over in 2.4.210)
-    EXCLUDED_ERRORS="$EXCLUDED_ERRORS|No such container"            # false positive (telegraf trying to run stats on an old container)
-    EXCLUDED_ERRORS="$EXCLUDED_ERRORS|passwords do not match"       # false positive (automated hydra test)
 fi

 if [[ $EXCLUDE_KNOWN_ERRORS == 'Y' ]]; then
@@ -229,7 +227,7 @@ if [[ $EXCLUDE_KNOWN_ERRORS == 'Y' ]]; then
    EXCLUDED_ERRORS="$EXCLUDED_ERRORS|from NIC checksum offloading" # zeek reporter.log
    EXCLUDED_ERRORS="$EXCLUDED_ERRORS|marked for removal"           # docker container getting recycled
    EXCLUDED_ERRORS="$EXCLUDED_ERRORS|tcp 127.0.0.1:6791: bind: address already in use" # so-elastic-fleet agent restarting. Seen starting w/ 8.18.8 https://github.com/elastic/kibana/issues/201459
-    EXCLUDED_ERRORS="$EXCLUDED_ERRORS|TransformTask\] \[logs-(tychon|aws_billing|microsoft_defender_endpoint|armis|o365_metrics|microsoft_sentinel|snyk|cyera|island_browser).*user so_kibana lacks the required permissions \[(logs|metrics)-\1" # Known issue with integrations starting transform jobs that are explicitly not allowed to start as a system user. This error should not be seen on fresh ES 9.3.3 installs or after SO 3.1.0 with soups addition of check_transform_health_and_reauthorize()
+    EXCLUDED_ERRORS="$EXCLUDED_ERRORS|TransformTask\] \[logs-(tychon|aws_billing|microsoft_defender_endpoint|armis|o365_metrics|microsoft_sentinel|snyk).*user so_kibana lacks the required permissions \[(logs|metrics)-\1" # Known issue with integrations starting transform jobs that are explicitly not allowed to start as a system user. (installed as so_elastic / so_kibana)
    EXCLUDED_ERRORS="$EXCLUDED_ERRORS|manifest unknown"             # appears in so-dockerregistry log for so-tcpreplay following docker upgrade to 29.2.1-1
 fi

@@ -1,3 +1,5 @@
+{% import_yaml 'salt/minion.defaults.yaml' as SALT_MINION_DEFAULTS -%}
+
 #!/bin/bash
 #
 # Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
@@ -23,8 +25,7 @@ SYSTEM_START_TIME=$(date -d "$(</proc/uptime awk '{print $1}') seconds ago" +%s)
 LAST_HIGHSTATE_END=$([ -e "/opt/so/log/salt/lasthighstate" ] && date -r /opt/so/log/salt/lasthighstate +%s || echo 0)
 LAST_HEALTHCHECK_STATE_APPLY=$([ -e "/opt/so/log/salt/state-apply-test" ] && date -r /opt/so/log/salt/state-apply-test +%s || echo 0)
 # SETTING THRESHOLD TO ANYTHING UNDER 600 seconds may cause a lot of salt-minion restarts since the job to touch the file occurs every 5-8 minutes by default
-# THRESHOLD is derived from the global push highstate interval + 1 hour, so the minion-check grace period tracks the schedule automatically.
-THRESHOLD=$(( ({{ salt['pillar.get']('global:push:highstate_interval_hours', 2) }} + 1) * 3600 )) #within how many seconds the file /opt/so/log/salt/state-apply-test must have been touched/modified before the salt minion is restarted
+THRESHOLD={{SALT_MINION_DEFAULTS.salt.minion.check_threshold}} #within how many seconds the file /opt/so/log/salt/state-apply-test must have been touched/modified before the salt minion is restarted
 THRESHOLD_DATE=$((LAST_HEALTHCHECK_STATE_APPLY+THRESHOLD))

 logCmd() {
@@ -9,8 +9,7 @@
 prune_images:
  cmd.run:
    - name: so-docker-prune
-    - onlyif: command -v /usr/sbin/so-docker-prune >/dev/null 2>&1
-    - order: 9000
+    - order: last

 {% else %}

@@ -19,7 +19,6 @@ wait_for_elasticsearch:
 so-elastalert:
  docker_container.running:
    - image: {{ GLOBALS.registry_host }}:5000/{{ GLOBALS.image_repo }}/so-elastalert:{{ GLOBALS.so_version }}
-    - restart_policy: unless-stopped
    - hostname: elastalert
    - name: so-elastalert
    - user: so-elastalert
@@ -15,7 +15,6 @@ include:
 so-elastic-fleet-package-registry:
  docker_container.running:
    - image: {{ GLOBALS.registry_host }}:5000/{{ GLOBALS.image_repo }}/so-elastic-fleet-package-registry:{{ GLOBALS.so_version }}
-    - restart_policy: unless-stopped
    - name: so-elastic-fleet-package-registry
    - hostname: Fleet-package-reg-{{ GLOBALS.hostname }}
    - detach: True
@@ -52,16 +51,6 @@ so-elastic-fleet-package-registry:
      - {{ ULIMIT.name }}={{ ULIMIT.soft }}:{{ ULIMIT.hard }}
    {%   endfor %}
    {% endif %}
-
-wait_for_so-elastic-fleet-package-registry:
-  http.wait_for_successful_query:
-    - name: "http://localhost:8080/health"
-    - status: 200
-    - wait_for: 300
-    - request_interval: 15
-    - require:
-      - docker_container: so-elastic-fleet-package-registry
-
 delete_so-elastic-fleet-package-registry_so-status.disabled:
  file.uncomment:
    - name: /opt/so/conf/so-status/so-status.conf
@@ -16,7 +16,6 @@ include:
 so-elastic-agent:
  docker_container.running:
    - image: {{ GLOBALS.registry_host }}:5000/{{ GLOBALS.image_repo }}/so-elastic-agent:{{ GLOBALS.so_version }}
-    - restart_policy: unless-stopped
    - name: so-elastic-agent
    - hostname: {{ GLOBALS.hostname }}
    - detach: True
@@ -26,9 +26,7 @@ include:
 wait_for_elasticsearch_elasticfleet:
  cmd.run:
    - name: so-elasticsearch-wait
-{% endif %}

-{% if GLOBALS.role == "so-fleet" %}
 # Sync Elastic Agent artifacts to Fleet Node
 elasticagent_syncartifacts:
  file.recurse:
@@ -42,7 +40,6 @@ elasticagent_syncartifacts:
 so-elastic-fleet:
  docker_container.running:
    - image: {{ GLOBALS.registry_host }}:5000/{{ GLOBALS.image_repo }}/so-elastic-agent:{{ GLOBALS.so_version }}
-    - restart_policy: unless-stopped
    - name: so-elastic-fleet
    - hostname: FleetServer-{{ GLOBALS.hostname }}
    - detach: True
@@ -11,14 +11,24 @@ include:
  - elasticfleet.config

 # If enabled, automatically update Fleet Logstash Outputs
-{% if ELASTICFLEETMERGED.config.server.enable_auto_configuration %}
-{%   if grains.role not in ['so-import', 'so-eval']%}
+{% if ELASTICFLEETMERGED.config.server.enable_auto_configuration and grains.role not in ['so-import', 'so-eval'] %}
 so-elastic-fleet-auto-configure-logstash-outputs:
  cmd.run:
    - name: /usr/sbin/so-elastic-fleet-outputs-update
    - retry:
        attempts: 4
        interval: 30
+
+{# Separate from above in order to catch elasticfleet-logstash.crt changes and force update to fleet output policy #}
+so-elastic-fleet-auto-configure-logstash-outputs-force:
+  cmd.run:
+    - name: /usr/sbin/so-elastic-fleet-outputs-update --certs
+    - retry:
+        attempts: 4
+        interval: 30
+    - onchanges:
+        - x509: etc_elasticfleet_logstash_crt
+        - x509: elasticfleet_kafka_crt
 {% endif %}

 # If enabled, automatically update Fleet Server URLs & ES Connection
@@ -28,7 +38,6 @@ so-elastic-fleet-auto-configure-server-urls:
    - retry:
        attempts: 4
        interval: 30
-{% endif %}

 # Automatically update Fleet Server Elasticsearch URLs & Agent Artifact URLs
 so-elastic-fleet-auto-configure-elasticsearch-urls:
@@ -240,7 +240,7 @@ elastic_fleet_policy_create() {
        --arg DESC "$DESC" \
        --arg TIMEOUT $TIMEOUT \
        --arg FLEETSERVER "$FLEETSERVER" \
-            '{"name": $NAME,"id":$NAME,"description":$DESC,"namespace":"default","monitoring_enabled":["logs"],"inactivity_timeout":$TIMEOUT,"has_fleet_server":$FLEETSERVER,"advanced_settings":{"agent_logging_level": "warning"}}'
+            '{"name": $NAME,"id":$NAME,"description":$DESC,"namespace":"default","monitoring_enabled":["logs"],"inactivity_timeout":$TIMEOUT,"has_fleet_server":$FLEETSERVER}'
        )
    # Create Fleet Policy
    if ! fleet_api "agent_policies" -XPOST -H 'kbn-xsrf: true' -H 'Content-Type: application/json' -d "$JSON_STRING"; then
@@ -235,16 +235,6 @@ function update_kafka_outputs() {

 {% endif %}

-# Compare the current Elastic Fleet certificate against what is on disk
-POLICY_CERT_SHA=$(jq -r '.item.ssl.certificate' <<< $RAW_JSON | openssl x509 -noout -sha256 -fingerprint)
-DISK_CERT_SHA=$(openssl x509 -in /etc/pki/elasticfleet-logstash.crt -noout -sha256 -fingerprint)
-
-if [[ "$POLICY_CERT_SHA" != "$DISK_CERT_SHA" ]]; then
-    printf "Certificate on disk doesn't match certificate in policy - forcing update\n"
-    UPDATE_CERTS=true
-    FORCE_UPDATE=true
-fi
-
 # Sort & hash the new list of Logstash Outputs
 NEW_LIST_JSON=$(jq --compact-output --null-input '$ARGS.positional' --args -- "${NEW_LIST[@]}")
 NEW_HASH=$(sha256sum <<< "$NEW_LIST_JSON" | awk '{print $1}')
@@ -232,6 +232,7 @@ printf '%s\n'\
    "      grid_enrollment_general: '$GRIDNODESENROLLMENTOKENGENERAL'"\
    "      grid_enrollment_heavy: '$GRIDNODESENROLLMENTOKENHEAVY'"\
    "" >> "$pillar_file"
+/usr/sbin/so-config.py import-file "$pillar_file" --note "so-elastic-fleet-setup"

 #Store Grid Nodes Enrollment token in Global pillar
 global_pillar_file=/opt/so/saltstack/local/pillar/global/soc_global.sls
@@ -239,6 +240,7 @@ printf '%s\n'\
    "  fleet_grid_enrollment_token_general: '$GRIDNODESENROLLMENTOKENGENERAL'"\
    "  fleet_grid_enrollment_token_heavy: '$GRIDNODESENROLLMENTOKENHEAVY'"\
    "" >> "$global_pillar_file"
+/usr/sbin/so-config.py import-file "$global_pillar_file" --note "so-elastic-fleet-setup"

 # Call Elastic-Fleet Salt State
 printf "\nApplying elasticfleet state"
@@ -9,12 +9,9 @@
 {%   from 'elasticsearch/config.map.jinja' import ELASTICSEARCHMERGED %}
 {%   from 'elasticsearch/template.map.jinja' import ES_INDEX_SETTINGS, SO_MANAGED_INDICES %}
 {%   if GLOBALS.role != 'so-heavynode' %}
-{%     from 'elasticsearch/template.map.jinja' import ALL_ADDON_SETTINGS, ADDON_INDICES %}
+{%     from 'elasticsearch/template.map.jinja' import ALL_ADDON_SETTINGS %}
 {%   endif %}

-include:
-  - elasticsearch.enabled
-
 escomponenttemplates:
  file.recurse:
    - name: /opt/so/conf/elasticsearch/templates/component
@@ -38,20 +35,6 @@ so_index_template_dir:
      {%- endfor %}
    {%- endif %}

-{%  if GLOBALS.role != "so-heavynode" %}
-# Clean up legacy and non-SO managed templates from the elasticsearch/templates/addon-index/ directory
-addon_index_template_dir:
-  file.directory:
-    - name: /opt/so/conf/elasticsearch/templates/addon-index
-    - clean: True
-    {%- if ADDON_INDICES %}
-    - require:
-      {%- for index in ADDON_INDICES %}
-      - file: addon_index_template_{{index}}
-      {%- endfor %}
-    {%- endif %}
-{%  endif %}
-
 # Auto-generate index templates for SO managed indices (directly defined in elasticsearch/defaults.yaml)
 #   These index templates are for the core SO datasets and are always required
 {%  for index, settings in ES_INDEX_SETTINGS.items() %}
@@ -3958,13 +3958,10 @@ elasticsearch:
        - vulnerability-mappings
        - common-settings
        - common-dynamic-mappings
-        - logs-redis.log@package
-        - logs-redis.log@custom
        data_stream:
          allow_custom_routing: false
          hidden: false
-        ignore_missing_component_templates:
-        - logs-redis.log@custom
+        ignore_missing_component_templates: []
        index_patterns:
        - logs-redis.log*
        priority: 501
@@ -24,7 +24,6 @@ include:
 so-elasticsearch:
  docker_container.running:
    - image: {{ GLOBALS.registry_host }}:5000/{{ GLOBALS.image_repo }}/so-elasticsearch:{{ ELASTICSEARCHMERGED.version }}
-    - restart_policy: unless-stopped
    - hostname: elasticsearch
    - name: so-elasticsearch
    - user: elasticsearch
@@ -63,8 +63,7 @@
    { "set":             { "if": "ctx.event?.dataset != null && !ctx.event.dataset.contains('.')", "field": "event.dataset", "value": "{{event.module}}.{{event.dataset}}" } },
    { "split":           { "if": "ctx.event?.dataset != null && ctx.event.dataset.contains('.')", "field": "event.dataset", "separator": "\\.", "target_field": "dataset_tag_temp" } },
    { "append":          { "if": "ctx.dataset_tag_temp != null", "field": "tags", "value": "{{dataset_tag_temp.1}}"  } },
-    { "grok":            { "if": "ctx.http?.response?.status_code instanceof String", "field": "http.response.status_code", "patterns": ["%{NUMBER:http.response.status_code:long}(?:\\s+%{GREEDYDATA})?"], "ignore_failure": true } },
-    { "convert":         { "if": "ctx.http?.response?.status_code != null && !(ctx.http.response.status_code instanceof Number)", "field": "http.response.status_code", "type": "long", "ignore_failure": true } },
+    { "grok":            { "if": "ctx.http?.response?.status_code != null", "field": "http.response.status_code", "patterns": ["%{NUMBER:http.response.status_code:long} %{GREEDYDATA}"]} },
    { "set":             { "if": "ctx?.metadata?.kafka != null" , "field": "kafka.id", "value": "{{metadata.kafka.partition}}{{metadata.kafka.offset}}{{metadata.kafka.timestamp}}", "ignore_failure": true } },
    { "remove":          { "field": [ "message2", "type", "fields", "category", "module", "dataset", "dataset_tag_temp", "event.dataset_temp" ], "ignore_missing": true, "ignore_failure": true } },
    { "pipeline": { "name": "global@custom", "ignore_missing_pipeline": true, "description": "[Fleet] Global pipeline for all data streams" } }
@@ -177,84 +177,12 @@
                "description": "Extract IPs from Elastic Agent events (host.ip) and adds them to related.ip"
            }
        },
-        {
-            "script": {
-                "description": "Snapshot event.ingested into _tmp.event_ingested_pre_fleet before .fleet_final_pipeline-1 overwrites it with ES ingest time",
-                "lang": "painless",
-                "if": "ctx.event?.ingested != null && ctx.event?.created == null",
-                "ignore_failure": true,
-                "source": "ctx.putIfAbsent('_tmp', [:]); ctx._tmp.event_ingested_pre_fleet = ctx.event.ingested;"
-            }
-        },
        {
            "pipeline": {
                "name": ".fleet_final_pipeline-1",
                "ignore_missing_pipeline": true
            }
        },
-        {
-            "script": {
-                "description": "Calculate time from Elastic Agent to Logstash.",
-                "lang": "painless",
-                "if": "ctx._tmp?.logstash_from_agent != null",
-                "ignore_failure": true,
-                "source": "ZonedDateTime start = ctx._tmp.event_ingested_pre_fleet != null ? ZonedDateTime.parse(ctx._tmp.event_ingested_pre_fleet) : ZonedDateTime.parse(ctx['@timestamp']); ctx.event.putIfAbsent('ingestion', [:]); ctx.event.ingestion.latency_elasticagent_to_logstash = ChronoUnit.SECONDS.between(start, ZonedDateTime.parse(ctx._tmp.logstash_from_agent));"
-            }
-        },
-        {
-            "script": {
-                "description": "Calculate time from Logstash to Redis",
-                "lang": "painless",
-                "if": "ctx._tmp?.logstash_from_agent != null && ctx._tmp?.logstash_to_redis != null",
-                "ignore_failure": true,
-                "source": "ctx.event.putIfAbsent('ingestion', [:]); ctx.event.ingestion.latency_logstash_to_redis = ChronoUnit.SECONDS.between(ZonedDateTime.parse(ctx._tmp.logstash_from_agent), ZonedDateTime.parse(ctx._tmp.logstash_to_redis));"
-            }
-        },
-        {
-            "script": {
-                "description": "Calculate time message spends in redis queue (logstash delay in pulling event).",
-                "lang": "painless",
-                "if": "ctx._tmp?.logstash_to_redis != null && ctx._tmp?.logstash_from_redis != null",
-                "ignore_failure": true,
-                "source": "ctx.event.putIfAbsent('ingestion', [:]); ctx.event.ingestion.latency_redis_to_logstash = ChronoUnit.SECONDS.between(ZonedDateTime.parse(ctx._tmp.logstash_to_redis), ZonedDateTime.parse(ctx._tmp.logstash_from_redis));"
-            }
-        },
-        {
-            "script": {
-                "description": "Calculate time from Logstash to Elasticsearch (after read from Redis).",
-                "lang": "painless",
-                "if": "ctx._tmp?.logstash_from_redis != null",
-                "ignore_failure": true,
-                "source": "ctx.event.putIfAbsent('ingestion', [:]); ctx.event.ingestion.latency_logstash_to_elasticsearch = ChronoUnit.SECONDS.between(ZonedDateTime.parse(ctx._tmp.logstash_from_redis), metadata().now);"
-            }
-        },
-        {
-            "script": {
-                "description": "Calculate time from Elastic Agent to Kafka.",
-                "lang": "painless",
-                "if": "ctx._tmp?.logstash_from_kafka != null && ctx._tmp?.logstash_from_agent == null",
-                "ignore_failure": true,
-                "source": "ZonedDateTime start = ctx._tmp.event_ingested_pre_fleet != null ? ZonedDateTime.parse(ctx._tmp.event_ingested_pre_fleet) : ZonedDateTime.parse(ctx['@timestamp']); ctx.event.putIfAbsent('ingestion', [:]); ctx.event.ingestion.latency_elasticagent_to_kafka = ChronoUnit.SECONDS.between(start, ZonedDateTime.parse(ctx._tmp.logstash_from_kafka));"
-            }
-        },
-        {
-            "script": {
-                "description": "Calculate time message spends in Kafka queue (logstash delay in pulling event).",
-                "lang": "painless",
-                "if": "ctx._tmp?.logstash_from_kafka != null && ctx.metadata?.kafka?.timestamp != null && ctx._tmp?.logstash_from_agent == null",
-                "ignore_failure": true,
-                "source": "ctx.event.putIfAbsent('ingestion', [:]); ctx.event.ingestion.latency_kafka_queue = ChronoUnit.SECONDS.between(ZonedDateTime.ofInstant(Instant.ofEpochMilli(Long.parseLong(ctx.metadata.kafka.timestamp.toString())), ZoneId.of('UTC')), ZonedDateTime.parse(ctx._tmp.logstash_from_kafka));"
-            }
-        },
-        {
-            "script": {
-                "description": "Calculate time from Logstash to Elasticsearch (after read from Kafka).",
-                "lang": "painless",
-                "if": "ctx._tmp?.logstash_from_kafka != null && ctx._tmp?.logstash_from_agent == null",
-                "ignore_failure": true,
-                "source": "ctx.event.putIfAbsent('ingestion', [:]); ctx.event.ingestion.latency_kafka_to_elasticsearch = ChronoUnit.SECONDS.between(ZonedDateTime.parse(ctx._tmp.logstash_from_kafka), metadata().now);"
-            }
-        },
        {
            "remove": {
                "field": "event.agent_id_status",
@@ -274,8 +202,7 @@
                    "event.dataset_temp",
                    "dataset_tag_temp",
                    "module_temp",
-                    "datastream_dataset_temp",
-                    "_tmp"
+                    "datastream_dataset_temp"
                ],
                "ignore_missing": true,
                "ignore_failure": true
@@ -1,71 +0,0 @@
-{
-    "description": "zeek.ja4d",
-    "processors": [
-        {
-            "set": {
-                "field": "event.dataset",
-                "value": "ja4d"
-            }
-        },
-        {
-            "remove": {
-                "field": [
-                    "host"
-                ],
-                "ignore_failure": true
-            }
-        },
-        {
-            "json": {
-                "field": "message",
-                "target_field": "message2",
-                "ignore_failure": true
-            }
-        },
-        {
-            "rename": {
-                "field": "message2.ja4d",
-                "target_field": "hash.ja4d",
-                "ignore_missing": true,
-                "if": "ctx?.message2?.ja4d != null && ctx.message2.ja4d.length() > 0"
-            }
-        },
-        {
-            "rename": {
-                "field": "message2.client_mac",
-                "target_field": "host.mac",
-                "ignore_missing": true,
-                "if": "ctx?.message2?.client_mac != null && ctx.message2.client_mac.length() > 0"
-            }
-        },
-        {
-            "rename": {
-                "field": "message2.hostname",
-                "target_field": "host.hostname",
-                "ignore_missing": true,
-                "if": "ctx?.message2?.hostname != null && ctx.message2.hostname.length() > 0"
-            }
-        },
-        {
-            "rename": {
-                "field": "message2.requested_ip",
-                "target_field": "dhcp.requested_address",
-                "ignore_missing": true,
-                "if": "ctx?.message2?.requested_ip != null && ctx.message2.requested_ip.length() > 0"
-            }
-        },
-        {
-            "rename": {
-                "field": "message2.vendor_class_id",
-                "target_field": "zeek.ja4d.vendor_class_id",
-                "ignore_missing": true,
-                "if": "ctx?.message2?.vendor_class_id != null && ctx.message2.vendor_class_id.length() > 0"
-            }
-        },
-        {
-            "pipeline": {
-                "name": "zeek.common"
-            }
-        }
-    ]
-}
@@ -61,25 +61,15 @@
 {% if ALL_ADDON_SETTINGS_ORIG.keys() | length > 0 %}
 {%   for index in ALL_ADDON_SETTINGS_ORIG.keys() %}
 {%     do ALL_ADDON_SETTINGS_GLOBAL_OVERRIDES.update({index: salt['defaults.merge'](ALL_ADDON_SETTINGS_ORIG[index], PILLAR_GLOBAL_OVERRIDES, in_place=False)}) %}
-{#     Explicitly excluding addon indices from ES_INDEX_SETTINGS_ORIG
-         When manager.soc_managed_annotations runs, new entries are added to the salt/elasticsearch/defaults.yaml file to support 'revert to default' functionality.
-         Subsequent map renders will then incorrectly include 'integration X' in 'ES_INDEX_SETTINGS_ORIG' due to being in the defaults.yaml file. #}
-{%     if index in ES_INDEX_SETTINGS_ORIG.keys() %}
-{%       do ES_INDEX_SETTINGS_ORIG.pop(index) %}
-{%     endif %}
 {%   endfor %}
 {% endif %}

 {% set ES_INDEX_SETTINGS = {} %}
-{% macro create_final_index_template(DEFINED_SETTINGS, GLOBAL_OVERRIDES, FINAL_INDEX_SETTINGS, EXCLUDE_INDICES=[]) %}
+{% macro create_final_index_template(DEFINED_SETTINGS, GLOBAL_OVERRIDES, FINAL_INDEX_SETTINGS) %}

 {% do GLOBAL_OVERRIDES.update(salt['defaults.merge'](GLOBAL_OVERRIDES, ES_INDEX_PILLAR, in_place=False)) %}
 {% for index, settings in GLOBAL_OVERRIDES.items() %}

-{%   if index in EXCLUDE_INDICES %}
-{%     continue %}
-{%   endif %}
-
 {#   prevent this action from being performed on custom defined indices. #}
 {#   the custom defined index is not present in either of the dictionaries and fails to render. #}
 {%   if index in DEFINED_SETTINGS and index in GLOBAL_OVERRIDES %}
@@ -160,19 +150,10 @@
 {% endfor %}
 {% endmacro %}

-{# Exclude addon integrations from final ES_INDEX_SETTINGS #}
-{{ create_final_index_template(ES_INDEX_SETTINGS_ORIG, ES_INDEX_SETTINGS_GLOBAL_OVERRIDES, ES_INDEX_SETTINGS, ALL_ADDON_SETTINGS_ORIG.keys() | list ) }}
-
-{# Exclude SO managed indices, otherwise ALL_ADDON_SETTINGS will include pillar values
-  of core integrations without merging defaults, resulting in an overlapping, but bad index template being generated. #}
-{{ create_final_index_template(ALL_ADDON_SETTINGS_ORIG, ALL_ADDON_SETTINGS_GLOBAL_OVERRIDES, ALL_ADDON_SETTINGS, ES_INDEX_SETTINGS_ORIG.keys() | list ) }}
+{{ create_final_index_template(ES_INDEX_SETTINGS_ORIG, ES_INDEX_SETTINGS_GLOBAL_OVERRIDES, ES_INDEX_SETTINGS) }}
+{{ create_final_index_template(ALL_ADDON_SETTINGS_ORIG, ALL_ADDON_SETTINGS_GLOBAL_OVERRIDES, ALL_ADDON_SETTINGS) }}

 {% set SO_MANAGED_INDICES = [] %}
 {% for index, settings in ES_INDEX_SETTINGS.items() %}
 {%   do SO_MANAGED_INDICES.append(index) %}
 {% endfor %}
-
-{% set ADDON_INDICES = [] %}
-{% for index, settings in ALL_ADDON_SETTINGS.items() %}
-{%   do ADDON_INDICES.append(index) %}
-{% endfor %}
@@ -398,7 +398,6 @@ firewall:
                - elasticsearch_rest
                - docker_registry
                - influxdb
-                - postgres
                - sensoroni
                - yum
                - beats_5044
@@ -411,7 +410,6 @@ firewall:
              portgroups:
                - docker_registry
                - influxdb
-                - postgres
                - sensoroni
                - yum
                - beats_5044
@@ -429,7 +427,6 @@ firewall:
                - yum
                - docker_registry
                - influxdb
-                - postgres
                - sensoroni
            searchnode:
              portgroups:
@@ -440,7 +437,6 @@ firewall:
                - yum
                - docker_registry
                - influxdb
-                - postgres
                - elastic_agent_control
                - elastic_agent_data
                - elastic_agent_update
@@ -454,7 +450,6 @@ firewall:
                - yum
                - docker_registry
                - influxdb
-                - postgres
                - elastic_agent_control
                - elastic_agent_data
                - elastic_agent_update
@@ -464,7 +459,6 @@ firewall:
                - yum
                - docker_registry
                - influxdb
-                - postgres
                - elastic_agent_control
                - elastic_agent_data
                - elastic_agent_update
@@ -498,7 +492,6 @@ firewall:
              portgroups:
                - docker_registry
                - influxdb
-                - postgres
                - sensoroni
                - yum
                - elastic_agent_control
@@ -509,7 +502,6 @@ firewall:
                - yum
                - docker_registry
                - influxdb
-                - postgres
                - elastic_agent_control
                - elastic_agent_data
                - elastic_agent_update
@@ -618,7 +610,6 @@ firewall:
                - elasticsearch_rest
                - docker_registry
                - influxdb
-                - postgres
                - sensoroni
                - yum
                - beats_5044
@@ -631,7 +622,6 @@ firewall:
              portgroups:
                - docker_registry
                - influxdb
-                - postgres
                - sensoroni
                - yum
                - beats_5044
@@ -649,7 +639,6 @@ firewall:
                - yum
                - docker_registry
                - influxdb
-                - postgres
                - sensoroni
            searchnode:
              portgroups:
@@ -660,7 +649,6 @@ firewall:
                - yum
                - docker_registry
                - influxdb
-                - postgres
                - elastic_agent_control
                - elastic_agent_data
                - elastic_agent_update
@@ -674,7 +662,6 @@ firewall:
                - yum
                - docker_registry
                - influxdb
-                - postgres
                - elastic_agent_control
                - elastic_agent_data
                - elastic_agent_update
@@ -684,7 +671,6 @@ firewall:
                - yum
                - docker_registry
                - influxdb
-                - postgres
                - elastic_agent_control
                - elastic_agent_data
                - elastic_agent_update
@@ -716,7 +702,6 @@ firewall:
              portgroups:
                - docker_registry
                - influxdb
-                - postgres
                - sensoroni
                - yum
                - elastic_agent_control
@@ -727,7 +712,6 @@ firewall:
                - yum
                - docker_registry
                - influxdb
-                - postgres
                - elastic_agent_control
                - elastic_agent_data
                - elastic_agent_update
@@ -836,7 +820,6 @@ firewall:
                - elasticsearch_rest
                - docker_registry
                - influxdb
-                - postgres
                - sensoroni
                - yum
                - beats_5044
@@ -849,7 +832,6 @@ firewall:
              portgroups:
                - docker_registry
                - influxdb
-                - postgres
                - sensoroni
                - yum
                - beats_5044
@@ -867,7 +849,6 @@ firewall:
                - yum
                - docker_registry
                - influxdb
-                - postgres
                - sensoroni
            searchnode:
              portgroups:
@@ -877,7 +858,6 @@ firewall:
                - yum
                - docker_registry
                - influxdb
-                - postgres
                - elastic_agent_control
                - elastic_agent_data
                - elastic_agent_update
@@ -890,7 +870,6 @@ firewall:
                - yum
                - docker_registry
                - influxdb
-                - postgres
                - elastic_agent_control
                - elastic_agent_data
                - elastic_agent_update
@@ -900,7 +879,6 @@ firewall:
                - yum
                - docker_registry
                - influxdb
-                - postgres
                - elastic_agent_control
                - elastic_agent_data
                - elastic_agent_update
@@ -934,7 +912,6 @@ firewall:
              portgroups:
                - docker_registry
                - influxdb
-                - postgres
                - sensoroni
                - yum
                - elastic_agent_control
@@ -945,7 +922,6 @@ firewall:
                - yum
                - docker_registry
                - influxdb
-                - postgres
                - elastic_agent_control
                - elastic_agent_data
                - elastic_agent_update
@@ -1064,7 +1040,6 @@ firewall:
                - elasticsearch_rest
                - docker_registry
                - influxdb
-                - postgres
                - sensoroni
                - yum
                - beats_5044
@@ -1077,7 +1052,6 @@ firewall:
              portgroups:
                - docker_registry
                - influxdb
-                - postgres
                - sensoroni
                - yum
                - beats_5044
@@ -1089,7 +1063,6 @@ firewall:
              portgroups:
                - docker_registry
                - influxdb
-                - postgres
                - sensoroni
                - yum
                - beats_5044
@@ -1101,7 +1074,6 @@ firewall:
              portgroups:
                - docker_registry
                - influxdb
-                - postgres
                - sensoroni
                - yum
                - redis
@@ -1111,7 +1083,6 @@ firewall:
              portgroups:
                - docker_registry
                - influxdb
-                - postgres
                - sensoroni
                - yum
                - redis
@@ -1122,7 +1093,6 @@ firewall:
                - yum
                - docker_registry
                - influxdb
-                - postgres
                - elastic_agent_control
                - elastic_agent_data
                - elastic_agent_update
@@ -1159,7 +1129,6 @@ firewall:
              portgroups:
                - docker_registry
                - influxdb
-                - postgres
                - sensoroni
                - yum
                - elastic_agent_control
@@ -1170,7 +1139,6 @@ firewall:
                - yum
                - docker_registry
                - influxdb
-                - postgres
                - elastic_agent_control
                - elastic_agent_data
                - elastic_agent_update
@@ -1514,7 +1482,6 @@ firewall:
                - kibana
                - redis
                - influxdb
-                - postgres
                - elasticsearch_rest
                - elasticsearch_node
                - elastic_agent_control
@@ -1,5 +1,6 @@
 {% from 'vars/globals.map.jinja' import GLOBALS %}
 {% from 'docker/docker.map.jinja' import DOCKERMERGED %}
+{% from 'telegraf/map.jinja' import TELEGRAFMERGED %}
 {% import_yaml 'firewall/defaults.yaml' as FIREWALL_DEFAULT %}

 {# add our ip to self #}
@@ -55,4 +56,16 @@

 {% endif %}

+{# Open Postgres (5432) to minion hostgroups when Telegraf is configured to write to Postgres #}
+{% set TG_OUT = TELEGRAFMERGED.output | upper %}
+{% if TG_OUT in ['POSTGRES', 'BOTH'] %}
+{%   if role.startswith('manager') or role == 'standalone' or role == 'eval' %}
+{%     for r in ['sensor', 'searchnode', 'heavynode', 'receiver', 'fleet', 'idh', 'desktop', 'import'] %}
+{%       if FIREWALL_DEFAULT.firewall.role[role].chain["DOCKER-USER"].hostgroups[r] is defined %}
+{%         do FIREWALL_DEFAULT.firewall.role[role].chain["DOCKER-USER"].hostgroups[r].portgroups.append('postgres') %}
+{%       endif %}
+{%     endfor %}
+{%   endif %}
+{% endif %}
+
 {% set FIREWALL_MERGED = salt['pillar.get']('firewall', FIREWALL_DEFAULT.firewall, merge=True) %}
@@ -1,10 +1,3 @@
 global:
  pcapengine: SURICATA
  pipeline: REDIS
-  push:
-    enabled: true
-    highstate_interval_hours: 2
-    debounce_seconds: 30
-    drain_interval: 15
-    batch: '25%'
-    batch_wait: 15
@@ -59,41 +59,4 @@ global:
    description: Allows use of Endgame with Security Onion. This feature requires a license from Endgame.
    global: True
    advanced: True
-  push:
-    enabled:
-      description: Master kill-switch for the active push feature. When disabled, rule and pillar changes are picked up at the next scheduled highstate instead of being pushed immediately.
-      forcedType: bool
-      helpLink: push
-      global: True
-    highstate_interval_hours:
-      description: How often every minion in the grid runs a scheduled state.highstate, in hours. Lower values keep minions closer in sync at the cost of more load; higher values reduce load but increase worst-case latency for non-pushed changes. The salt-minion health check restarts a minion if its last highstate is older than this value plus one hour.
-      forcedType: int
-      helpLink: push
-      global: True
-      advanced: True
-    debounce_seconds:
-      description: Trailing-edge debounce window in seconds. A push intent must be quiet for this long before the drainer dispatches. Rapid bursts of edits within this window coalesce into one dispatch.
-      forcedType: int
-      helpLink: push
-      global: True
-      advanced: True
-    drain_interval:
-      description: How often the push drainer checks for ready intents, in seconds. Small values lower dispatch latency at the cost of more background work on the manager.
-      forcedType: int
-      helpLink: push
-      global: True
-      advanced: True
-    batch:
-      description: "Host batch size for push orchestrations. A number (e.g. '10') or a percentage (e.g. '25%'). Limits how many minions run the push state at once so large fleets don't thundering-herd."
-      helpLink: push
-      global: True
-      advanced: True
-      regex: '^([0-9]+%?)$'
-      regexFailureMessage: Enter a whole number or a whole-number percentage (e.g. 10 or 25%).
-    batch_wait:
-      description: Seconds to wait between host batches in a push orchestration. Gives the fleet time to breathe between waves.
-      forcedType: int
-      helpLink: push
-      global: True
-      advanced: True

@@ -58,7 +58,6 @@ so-hydra:
      - {{ ULIMIT.name }}={{ ULIMIT.soft }}:{{ ULIMIT.hard }}
    {%   endfor %}
    {% endif %}
-    # Intentionally unless-stopped -- matches the fleet default.
    - restart_policy: unless-stopped
    - watch:
      - file: hydraconfig
@@ -15,7 +15,6 @@ include:
 so-idh:
  docker_container.running:
    - image: {{ GLOBALS.registry_host }}:5000/{{ GLOBALS.image_repo }}/so-idh:{{ GLOBALS.so_version }}
-    - restart_policy: unless-stopped
    - name: so-idh
    - detach: True
    - network_mode: host
@@ -18,7 +18,6 @@ include:
 so-influxdb:
  docker_container.running:
    - image: {{ GLOBALS.registry_host }}:5000/{{ GLOBALS.image_repo }}/so-influxdb:{{ GLOBALS.so_version }}
-    - restart_policy: unless-stopped
    - hostname: influxdb
    - networks:
      - sobridge:
@@ -20,8 +20,11 @@ so-kafka_so-status.disabled:
 ensure_default_pipeline:
  cmd.run:
    - name: |
-        /usr/sbin/so-yaml.py replace /opt/so/saltstack/local/pillar/kafka/soc_kafka.sls kafka.enabled False;
+        set -e
+        /usr/sbin/so-yaml.py replace /opt/so/saltstack/local/pillar/kafka/soc_kafka.sls kafka.enabled False
+        /usr/sbin/so-config.py sync-yaml-mutation /opt/so/saltstack/local/pillar/kafka/soc_kafka.sls replace kafka.enabled False --note "kafka.disabled"
        /usr/sbin/so-yaml.py replace /opt/so/saltstack/local/pillar/global/soc_global.sls global.pipeline REDIS
+        /usr/sbin/so-config.py sync-yaml-mutation /opt/so/saltstack/local/pillar/global/soc_global.sls replace global.pipeline REDIS --note "kafka.disabled"
 {% endif %}

 {# If Kafka has never been manually enabled, the 'Kafka' user does not exist. In this case certs for Kafka should not exist since they'll be owned by uid 960 #}
@@ -27,7 +27,6 @@ include:
 so-kafka:
  docker_container.running:
    - image: {{ GLOBALS.registry_host }}:5000/{{ GLOBALS.image_repo }}/so-kafka:{{ GLOBALS.so_version }}
-    - restart_policy: unless-stopped
    - hostname: so-kafka
    - name: so-kafka
    - networks:
@@ -16,7 +16,6 @@ include:
 so-kibana:
  docker_container.running:
    - image: {{ GLOBALS.registry_host }}:5000/{{ GLOBALS.image_repo }}/so-kibana:{{ GLOBALS.so_version }}
-    - restart_policy: unless-stopped
    - hostname: kibana
    - user: kibana
    - networks:
@@ -51,7 +51,6 @@ so-kratos:
      - {{ ULIMIT.name }}={{ ULIMIT.soft }}:{{ ULIMIT.hard }}
    {%   endfor %}
    {% endif %}
-    # Intentionally unless-stopped -- matches the fleet default.
    - restart_policy: unless-stopped
    - watch:
      - file: kratosschema
@@ -103,7 +103,7 @@ kratos:
  config:
    session:
      lifespan: 
-        description: Defines the length of a login session before it will timeout, and require a new login.
+        description: Defines the length of a login session.
        global: True
        helpLink: kratos
      whoami:
@@ -26,12 +26,12 @@ logstash:
    manager:
      - so/0011_input_endgame.conf
      - so/0012_input_elastic_agent.conf.jinja
-      - so/0013_input_lumberjack_fleet.conf.jinja
+      - so/0013_input_lumberjack_fleet.conf
      - so/9999_output_redis.conf.jinja
    receiver:
      - so/0011_input_endgame.conf
      - so/0012_input_elastic_agent.conf.jinja
-      - so/0013_input_lumberjack_fleet.conf.jinja
+      - so/0013_input_lumberjack_fleet.conf
      - so/9999_output_redis.conf.jinja
    search:
      - so/0900_input_redis.conf.jinja
@@ -69,5 +69,4 @@ logstash:
    pipeline_x_batch_x_size: 125
    pipeline_x_ecs_compatibility: disabled
  dmz_nodes: []
-  latency_metrics: False

@@ -28,7 +28,6 @@ include:
 so-logstash:
  docker_container.running:
    - image: {{ GLOBALS.registry_host }}:5000/{{ GLOBALS.image_repo }}/so-logstash:{{ GLOBALS.so_version }}
-    - restart_policy: unless-stopped
    - hostname: so-logstash
    - name: so-logstash
    - networks:
@@ -1,4 +1,3 @@
-{%- from 'logstash/map.jinja' import LOGSTASH_MERGED %}
 input {
  elastic_agent {
    port => 5055
@@ -12,11 +11,6 @@ input {
  }
 }
 filter {
-  {% if LOGSTASH_MERGED.get('latency_metrics', False) %}
-  ruby {
-    code => "event.set('[_tmp][logstash_from_agent]', Time.now().utc.iso8601(3));"
-  }
-  {% endif %}
 if ![metadata] {
  mutate {
    rename => {"@metadata" => "metadata"}
@@ -0,0 +1,23 @@
+input {
+  elastic_agent {
+    port => 5056
+    tags => [ "elastic-agent", "fleet-lumberjack-input" ]
+    ssl_enabled => true
+    ssl_certificate => "/usr/share/logstash/elasticfleet-lumberjack.crt"
+    ssl_key => "/usr/share/logstash/elasticfleet-lumberjack.key"
+    ecs_compatibility => v8
+    id => "fleet-lumberjack-in"  
+    codec => "json"
+  }
+}
+
+
+filter {
+if ![metadata] {
+  mutate {
+    rename => {"@metadata" => "metadata"}
+  }
+}
+}
+
+
@@ -1,26 +0,0 @@
-{%- from 'logstash/map.jinja' import LOGSTASH_MERGED %}
-input {
-  elastic_agent {
-    port => 5056
-    tags => [ "elastic-agent", "fleet-lumberjack-input" ]
-    ssl_enabled => true
-    ssl_certificate => "/usr/share/logstash/elasticfleet-lumberjack.crt"
-    ssl_key => "/usr/share/logstash/elasticfleet-lumberjack.key"
-    ecs_compatibility => v8
-    id => "fleet-lumberjack-in"
-    codec => "json"
-  }
-}
-
-filter {
-  {% if LOGSTASH_MERGED.get('latency_metrics', False) %}
-  ruby {
-    code => "event.set('[_tmp][logstash_from_fleet]', Time.now().utc.iso8601(3));"
-  }
-  {% endif %}
-  if ![metadata] {
-    mutate {
-      rename => {"@metadata" => "metadata"}
-    }
-  }
-}
@@ -1,4 +1,3 @@
-{%- from 'logstash/map.jinja' import LOGSTASH_MERGED %}
 {%- set kafka_password = salt['pillar.get']('kafka:config:password') %}
 {%- set kafka_trustpass = salt['pillar.get']('kafka:config:trustpass') %}
 {%- set kafka_brokers = salt['pillar.get']('kafka:nodes', {}) %}
@@ -31,11 +30,6 @@ input {
    }
 }
 filter {
-  {% if LOGSTASH_MERGED.get('latency_metrics', False) %}
-  ruby {
-    code => "event.set('[_tmp][logstash_from_kafka]', Time.now().utc.iso8601(3));"
-  }
-  {% endif %}
  if ![metadata] {
    mutate {
      rename => { "@metadata" => "metadata" }
@@ -1,4 +1,4 @@
-{%- from 'logstash/map.jinja' import LOGSTASH_REDIS_NODES, LOGSTASH_MERGED %}
+{%- from 'logstash/map.jinja' import LOGSTASH_REDIS_NODES with context %}
 {%- set REDIS_PASS = salt['pillar.get']('redis:config:requirepass') %}

 {%- for index in range(LOGSTASH_REDIS_NODES|length) %}
@@ -18,10 +18,3 @@ input {
 }
 {%   endfor %}
 {% endfor -%}
-filter {
-  {% if LOGSTASH_MERGED.get('latency_metrics', False) %}
-  ruby {
-    code => "event.set('[_tmp][logstash_from_redis]', Time.now().utc.iso8601(3));"
-  }
-  {% endif %}
-}
@@ -1,11 +1,3 @@
-{%- from 'logstash/map.jinja' import LOGSTASH_MERGED %}
-{% if LOGSTASH_MERGED.get('latency_metrics', False) %}
-filter {
-  ruby {
-    code => "event.set('[_tmp][logstash_to_elasticsearch]', Time.now().utc.iso8601(3));"
-  }
-}
-{% endif %}
 output {
  if "elastic-agent" in [tags] and "so-ip-mappings" in [tags] {
    elasticsearch {
@@ -13,14 +13,7 @@ filter {
                    add_tag => "fleet-lumberjack-{{ GLOBALS.hostname }}"
          }
  }
-{%- from 'logstash/map.jinja' import LOGSTASH_MERGED %}
-{% if LOGSTASH_MERGED.get('latency_metrics', False) %}
-filter {
-  ruby {
-    code => "event.set('[_tmp][fleet_to_logstash]', Time.now().utc.iso8601(3));"
-  }
-}
-{% endif %}
+
 output { 
    lumberjack { 
        codec => json 
@@ -1,17 +1,10 @@
-{%- from 'logstash/map.jinja' import LOGSTASH_MERGED %}
 {%- if grains.role in ['so-heavynode', 'so-receiver'] %}
  {%- set HOST = GLOBALS.hostname %}
 {%- else %}
  {%- set HOST = GLOBALS.manager %}
 {%- endif %}
 {%- set REDIS_PASS = salt['pillar.get']('redis:config:requirepass') %}
-{% if LOGSTASH_MERGED.get('latency_metrics', False) %}
-filter {
-  ruby {
-    code => "event.set('[_tmp][logstash_to_redis]', Time.now().utc.iso8601(3));"
-  }
-}
-{% endif %}
+
 output {
 	redis {
 		host => '{{ HOST }}'
@@ -86,8 +86,3 @@ logstash:
    multiline: True
    advanced: True
    forcedType: "[]string"
-  latency_metrics:
-    description: Enable latency metrics within events processed by logstash. Useful for pinpointing log ingest delay.
-    forcedType: bool
-    global: False
-    advanced: True
@@ -1,21 +0,0 @@
-{% from 'vars/globals.map.jinja' import GLOBALS %}
-{% from 'global/map.jinja' import GLOBALMERGED %}
-
-include:
-  - salt.minion
-
-{% if GLOBALS.is_manager and GLOBALMERGED.push.enabled %}
-salt_beacons_pushstate:
-  file.managed:
-    - name: /etc/salt/minion.d/beacons_pushstate.conf
-    - source: salt://manager/files/beacons_pushstate.conf.jinja
-    - template: jinja
-    - watch_in:
-      - service: salt_minion_service
-{% else %}
-salt_beacons_pushstate:
-  file.absent:
-    - name: /etc/salt/minion.d/beacons_pushstate.conf
-    - watch_in:
-      - service: salt_minion_service
-{% endif %}
@@ -1,41 +0,0 @@
-{% from 'global/map.jinja' import GLOBALMERGED %}
-beacons:
-  pillar_db:
-    - interval: {{ GLOBALMERGED.push.drain_interval }}
-    - disable_during_state_run: True
-  inotify:
-    - disable_during_state_run: True
-    - coalesce: True
-    - files:
-        /opt/so/saltstack/local/salt/suricata/rules:
-          mask:
-            - close_write
-            - moved_to
-            - delete
-          recurse: True
-          auto_add: True
-          exclude:
-            - '\.sw[a-z]$':
-                regex: True
-            - '~$':
-                regex: True
-            - '/4913$':
-                regex: True
-            - '/\.#':
-                regex: True
-        /opt/so/saltstack/local/salt/strelka/rules/compiled:
-          mask:
-            - close_write
-            - moved_to
-            - delete
-          recurse: True
-          auto_add: True
-          exclude:
-            - '\.sw[a-z]$':
-                regex: True
-            - '~$':
-                regex: True
-            - '/4913$':
-                regex: True
-            - '/\.#':
-                regex: True
@@ -15,7 +15,6 @@ include:
  - manager.elasticsearch
  - manager.kibana
  - manager.managed_soc_annotations
-  - manager.beacons

 repo_log_dir:
  file.directory:
@@ -232,7 +231,6 @@ surifiltersrules:
    - user: 939
    - group: 939

-
 {% else %}

 {{sls}}_state_not_allowed:
@@ -31,13 +31,11 @@ sync_es_users:
      - http: wait_for_kratos
      - file: so-user.lock # require so-user.lock file to be missing

-# we dont want this added too early in setup, so the onlyif gates on the
-# /opt/so/state/setup-complete marker. The marker is written by
-# mark_setup_complete in setup/so-functions just before the final setup
-# highstate (and by an upgrade-path state for systems set up under the old gate).
+# we don't want this added too early in setup, so we add the onlyif to verify 'startup_states: highstate'
+# is in the minion config. That line is added before the final highstate during setup
 so-user_sync:
  cron.present:
    - user: root
    - name: 'PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin /usr/sbin/so-user sync &>> /opt/so/log/soc/sync.log'
    - identifier: so-user_sync
-    - onlyif: "test -e /opt/so/state/setup-complete"
+    - onlyif: "grep -x 'startup_states: highstate' /etc/salt/minion"
@@ -1,117 +0,0 @@
-#!/bin/bash
-#
-# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
-# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
-# https://securityonion.net/license; you may not use this file except in compliance with the
-# Elastic License 2.0.
-
-# Runs once per boot on managers (via so-boot-mine-update.service), before
-# so-boot-highstate.service. Waits for the responsive minion set to settle, pushes
-# mine.update, waits until every up minion has actually reported to the mine, then
-# warms the master's per-minion pillar cache so the mine-backed node pillars (node
-# IPs, ES/Redis/Logstash/hypervisor discovery -- some glob- and some pillar/grain-
-# targeted) are complete before the boot highstate renders them. Otherwise a node
-# that is up but not yet fully reported gets dropped from those pillars and torn
-# out of the configs they build (e.g. so-elasticsearch ExtraHosts -> container recreate).
-
-MAX_WAIT=${MINE_UPDATE_MAX_WAIT:-180}   # hard backstop only
-INTERVAL=10
-STABLE_CHECKS=3                          # up-count must hold steady this many polls
-elapsed=0
-prev=-1
-stable=0
-up=0
-
-# Wait for the *reachable* minion set to settle rather than for every accepted
-# key to report up: an operator may accept a minion's key and then intentionally
-# power off that host, so requiring up >= accepted would never be satisfied and
-# we'd always burn the full MAX_WAIT. Once the responsive count stops growing we
-# stop waiting and run mine.update against whoever is up.
-while [ "$elapsed" -lt "$MAX_WAIT" ]; do
-  up=$(/usr/bin/salt-run manage.up --out=json 2>/dev/null \
-    | python3 -c 'import sys,json; print(len(json.load(sys.stdin)))' 2>/dev/null)
-  up=${up:-0}
-  if [ "$up" -gt 0 ] && [ "$up" -eq "$prev" ]; then
-    stable=$((stable + 1))
-    [ "$stable" -ge "$STABLE_CHECKS" ] && break
-  else
-    stable=0
-  fi
-  prev=$up
-  sleep "$INTERVAL"
-  elapsed=$((elapsed + INTERVAL))
-done
-
-echo "so-boot-mine-update: ${up} minions up (settled after ${elapsed}s); running mine.update"
-/usr/bin/salt '*' mine.update --out=txt
-
-# A node that is up but has not yet re-reported network.ip_addrs to the mine is
-# silently dropped from mine-backed pillars (elasticsearch:nodes, node_data, ...)
-# when highstate recompiles them -- which e.g. removes it from so-elasticsearch
-# ExtraHosts and forces a container recreate. After the broad mine.update above,
-# wait until every up minion actually has network.ip_addrs in the mine, re-pushing
-# mine.update to stragglers, before releasing the boot highstate. Bounded by the
-# same MAX_WAIT backstop so a slow/down node never blocks boot indefinitely.
-missing=""
-while [ "$elapsed" -lt "$MAX_WAIT" ]; do
-  up_json=$(/usr/bin/salt-run manage.up --out=json 2>/dev/null)
-  mine_json=$(/usr/bin/salt-run mine.get '*' network.ip_addrs tgt_type=glob --out=json 2>/dev/null)
-  missing=$(printf '%s' "$up_json" | python3 -c '
-import sys, json
-up = set(json.load(sys.stdin) or [])
-mine = {k for k, v in (json.loads(sys.argv[1]) or {}).items() if v}
-print("\n".join(sorted(up - mine)))
-' "$mine_json" 2>/dev/null)
-  if [ -z "$missing" ]; then
-    echo "so-boot-mine-update: mine complete for all up minions after ${elapsed}s"
-    break
-  fi
-  echo "so-boot-mine-update: mine missing up minion(s): $(echo $missing); re-running mine.update"
-  for m in $missing; do /usr/bin/salt "$m" mine.update --out=txt; done
-  sleep "$INTERVAL"
-  elapsed=$((elapsed + INTERVAL))
-done
-[ -n "$missing" ] && echo "so-boot-mine-update: WARNING ${MAX_WAIT}s backstop hit; up minion(s) still absent from mine: $(echo $missing); highstate may drop them from configs"
-
-# The pillar/compound-targeted node pillars (elasticsearch:nodes, redis:nodes,
-# logstash:nodes, hypervisor:nodes) resolve their target against the master's
-# per-minion data cache (grains+pillar in .../minions/<id>/data.p), populated only
-# when a minion's pillar is (re)compiled -- separately from the mine. A freshly
-# booted node can be in the mine (glob/node_data sees it) yet absent from that
-# cache, so it is dropped from those pillars and from the configs they build (e.g.
-# so-elasticsearch ExtraHosts). Force a synchronous pillar refresh so the master
-# caches every up node's pillar; refresh_pillar wait=True returns only once the
-# pillar is recompiled (and thus cached for matching). Retry stragglers <= MAX_WAIT.
-echo "so-boot-mine-update: warming master pillar cache for pillar/grain-targeted node pillars"
-/usr/bin/salt '*' saltutil.refresh_pillar wait=True --out=txt
-missing=""
-while [ "$elapsed" -lt "$MAX_WAIT" ]; do
-  up_json=$(/usr/bin/salt-run manage.up --out=json 2>/dev/null)
-  cached_json=$(/usr/bin/salt-run cache.pillar tgt='*' --out=json 2>/dev/null)
-  missing=$(printf '%s' "$up_json" | python3 -c '
-import sys, json
-up = set(json.load(sys.stdin) or [])
-cached = {k for k, v in (json.loads(sys.argv[1]) or {}).items() if v}
-print("\n".join(sorted(up - cached)))
-' "$cached_json" 2>/dev/null)
-  if [ -z "$missing" ]; then
-    echo "so-boot-mine-update: pillar cache warm for all up minions after ${elapsed}s"
-    break
-  fi
-  echo "so-boot-mine-update: pillar not yet cached for: $(echo $missing); refreshing"
-  for m in $missing; do /usr/bin/salt "$m" saltutil.refresh_pillar wait=True --out=txt; done
-  sleep "$INTERVAL"
-  elapsed=$((elapsed + INTERVAL))
-done
-[ -n "$missing" ] && echo "so-boot-mine-update: WARNING ${MAX_WAIT}s backstop hit; pillar not cached for: $(echo $missing); pillar-targeted pillars may drop them"
-
-# Log what the mine-backed pillars render so the boot-time state is inspectable.
-/usr/bin/salt-call saltutil.refresh_pillar >/dev/null 2>&1
-sleep 2
-for key in node_data elasticsearch:nodes; do
-  rendered=$(/usr/bin/salt-call --out=json pillar.get "$key" 2>/dev/null \
-    | python3 -c 'import sys,json; print(json.dumps(json.load(sys.stdin).get("local"), indent=2, sort_keys=True))' 2>/dev/null)
-  echo "so-boot-mine-update: ${key} rendered as:"
-  echo "${rendered:-null}"
-done
-exit 0
@@ -0,0 +1,448 @@
+#!/usr/bin/env python3
+
+# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
+# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
+# https://securityonion.net/license; you may not use this file except in compliance with the
+# Elastic License 2.0.
+
+"""
+so-config.py writes SOC/onionconfig settings to Postgres.
+
+so-yaml.py remains a YAML file editor. Call this tool when a pillar-backed
+setting also needs to be reflected in the onionconfig database.
+"""
+
+import argparse
+import json
+import os
+from pathlib import Path
+import subprocess
+import sys
+
+import yaml
+
+
+PILLAR_ROOT = Path(os.environ.get("SO_CONFIG_PILLAR_ROOT", "/opt/so/saltstack/local/pillar"))
+DOCKER_CONTAINER = os.environ.get("SO_CONFIG_PG_CONTAINER", "so-postgres")
+PG_DATABASE = os.environ.get("SO_CONFIG_PG_DATABASE", "securityonion")
+PG_USER = os.environ.get("SO_CONFIG_PG_USER", "postgres")
+DEFAULT_USER_ID = os.environ.get("SO_CONFIG_USER_ID", "so-config")
+
+EXCLUDE_BASENAMES = {
+    "secrets.sls",
+    "auth.sls",
+    "top.sls",
+}
+EXCLUDE_PATH_FRAGMENTS = (
+    "/elasticsearch/nodes.sls",
+    "/redis/nodes.sls",
+    "/kafka/nodes.sls",
+    "/hypervisor/nodes.sls",
+    "/logstash/nodes.sls",
+    "/node_data/ips.sls",
+    "/postgres/auth.sls",
+    "/elasticsearch/auth.sls",
+    "/kibana/secrets.sls",
+)
+
+
+class SkipPath(Exception):
+    pass
+
+
+def pg_str(value):
+    if value is None:
+        return "NULL"
+    return "'" + str(value).replace("'", "''") + "'"
+
+
+def pg_jsonb(value):
+    return pg_str(json.dumps(value)) + "::jsonb"
+
+
+def docker_psql(sql):
+    proc = subprocess.run(
+        ["docker", "exec", "-i", DOCKER_CONTAINER,
+         "psql", "-U", PG_USER, "-d", PG_DATABASE,
+         "-tA", "-q", "-v", "ON_ERROR_STOP=1"],
+        input=sql.encode(),
+        capture_output=True,
+        check=False,
+        timeout=60,
+    )
+    if proc.returncode != 0:
+        sys.stderr.write(proc.stderr.decode(errors="replace"))
+        raise RuntimeError(f"docker exec psql failed with rc={proc.returncode}")
+    return proc.stdout.decode(errors="replace")
+
+
+def schema_ready():
+    sql = """
+SELECT to_regclass('public.settings') IS NOT NULL
+   AND to_regclass('public.audit_settings') IS NOT NULL;
+"""
+    return docker_psql(sql).strip() == "t"
+
+
+def cmd_wait_schema(args):
+    import time
+
+    deadline = time.monotonic() + args.timeout
+    while time.monotonic() <= deadline:
+        if schema_ready():
+            return 0
+        time.sleep(args.interval)
+    print("so-config: onionconfig schema is not ready", file=sys.stderr)
+    return 1
+
+
+def upsert_setting(setting_id, value, *, node_id="", duplicated_from_id=None,
+                   user_id=DEFAULT_USER_ID, note=None):
+    note = note or "so-config upsert"
+    sql = f"""
+BEGIN;
+WITH old_row AS (
+    SELECT value
+      FROM settings
+     WHERE setting_id = {pg_str(setting_id)}
+       AND node_id = {pg_str(node_id)}
+     FOR UPDATE
+),
+upserted AS (
+    INSERT INTO settings (setting_id, value, duplicated_from_id, node_id)
+    VALUES ({pg_str(setting_id)}, {pg_jsonb(value)}, {pg_str(duplicated_from_id)}, {pg_str(node_id)})
+    ON CONFLICT (setting_id, node_id) DO UPDATE
+       SET value = EXCLUDED.value,
+           duplicated_from_id = EXCLUDED.duplicated_from_id
+    RETURNING value
+)
+INSERT INTO audit_settings (setting_id, node_id, user_id, old_value, new_value, note)
+SELECT {pg_str(setting_id)},
+       {pg_str(node_id)},
+       {pg_str(user_id)},
+       (SELECT value FROM old_row),
+       (SELECT value FROM upserted),
+       {pg_str(note)}
+ WHERE NOT EXISTS (SELECT 1 FROM old_row)
+    OR (SELECT value FROM old_row) IS DISTINCT FROM (SELECT value FROM upserted);
+COMMIT;
+"""
+    docker_psql(sql)
+
+
+def delete_setting(setting_id, *, node_id="", user_id=DEFAULT_USER_ID, note=None):
+    note = note or "so-config delete"
+    sql = f"""
+BEGIN;
+WITH deleted AS (
+    DELETE FROM settings
+     WHERE setting_id = {pg_str(setting_id)}
+       AND node_id = {pg_str(node_id)}
+     RETURNING value
+)
+INSERT INTO audit_settings (setting_id, node_id, user_id, old_value, new_value, note)
+SELECT {pg_str(setting_id)}, {pg_str(node_id)}, {pg_str(user_id)}, value, NULL::jsonb, {pg_str(note)}
+  FROM deleted;
+COMMIT;
+"""
+    docker_psql(sql)
+
+
+def delete_setting_prefix(setting_id, *, node_id="", user_id=DEFAULT_USER_ID, note=None):
+    if not setting_id:
+        raise ValueError("setting_id prefix cannot be empty")
+    note = note or "so-config delete-prefix"
+    sql = f"""
+BEGIN;
+WITH deleted AS (
+    DELETE FROM settings
+     WHERE node_id = {pg_str(node_id)}
+       AND (
+             setting_id = {pg_str(setting_id)}
+          OR substring(setting_id from 1 for char_length({pg_str(setting_id)}) + 1) = {pg_str(setting_id + ".")}
+       )
+     RETURNING setting_id, value
+)
+INSERT INTO audit_settings (setting_id, node_id, user_id, old_value, new_value, note)
+SELECT setting_id, {pg_str(node_id)}, {pg_str(user_id)}, value, NULL::jsonb, {pg_str(note)}
+  FROM deleted;
+COMMIT;
+"""
+    docker_psql(sql)
+
+
+def purge_node(node_id, *, user_id=DEFAULT_USER_ID, note=None):
+    note = note or "so-config purge-node"
+    sql = f"""
+BEGIN;
+WITH deleted AS (
+    DELETE FROM settings
+     WHERE node_id = {pg_str(node_id)}
+     RETURNING setting_id, value
+)
+INSERT INTO audit_settings (setting_id, node_id, user_id, old_value, new_value, note)
+SELECT setting_id, {pg_str(node_id)}, {pg_str(user_id)}, value, NULL::jsonb, {pg_str(note)}
+  FROM deleted;
+COMMIT;
+"""
+    docker_psql(sql)
+
+
+def parse_value(value, value_file=None):
+    if value_file:
+        with open(value_file, "r") as fh:
+            value = fh.read()
+    parsed = yaml.safe_load(value)
+    if parsed is None and value == "":
+        return ""
+    return parsed
+
+
+def parse_yaml_file(path):
+    with open(path, "rb") as fh:
+        raw = fh.read()
+    if b"{%" in raw or b"{{" in raw:
+        raise SkipPath(f"{path}: Jinja-templated files stay disk-only")
+    if not raw.strip():
+        return {}
+    parsed = yaml.safe_load(raw)
+    return parsed if parsed is not None else {}
+
+
+def flatten(prefix, value):
+    if isinstance(value, dict):
+        for key, child in value.items():
+            child_id = f"{prefix}.{key}" if prefix else str(key)
+            yield from flatten(child_id, child)
+    else:
+        yield prefix, value
+
+
+def classify_pillar_path(path):
+    norm = Path(path).resolve()
+    norm_str = str(norm)
+
+    if norm.name in EXCLUDE_BASENAMES:
+        raise SkipPath(f"{path}: excluded basename")
+    for fragment in EXCLUDE_PATH_FRAGMENTS:
+        if fragment in norm_str:
+            raise SkipPath(f"{path}: excluded path fragment {fragment}")
+    if norm.suffix != ".sls":
+        raise SkipPath(f"{path}: not an .sls file")
+
+    parent = norm.parent.name
+    stem = norm.stem
+
+    if parent == "minions":
+        if stem.startswith("adv_"):
+            return {"kind": "advanced", "setting_id": "advanced", "node_id": stem[4:]}
+        return {"kind": "normal", "node_id": stem}
+
+    section = parent
+    if stem == f"soc_{section}":
+        return {"kind": "normal", "node_id": ""}
+    if stem == f"adv_{section}":
+        return {"kind": "advanced", "setting_id": f"{section}.advanced", "node_id": ""}
+
+    raise SkipPath(f"{path}: not a SOC-managed pillar file")
+
+
+def import_pillar_file(path, *, user_id=DEFAULT_USER_ID, note=None):
+    meta = classify_pillar_path(path)
+    note = note or f"so-config import-file {path}"
+
+    if meta["kind"] == "advanced":
+        with open(path, "r") as fh:
+            upsert_setting(meta["setting_id"], fh.read(), node_id=meta["node_id"],
+                           user_id=user_id, note=note)
+        return 1
+
+    data = parse_yaml_file(path)
+    if not isinstance(data, dict):
+        raise SkipPath(f"{path}: top-level YAML is not a map")
+
+    count = 0
+    for setting_id, value in flatten("", data):
+        upsert_setting(setting_id, value, node_id=meta["node_id"],
+                       user_id=user_id, note=note)
+        count += 1
+    return count
+
+
+def iter_pillar_files(root):
+    root = Path(root)
+    if not root.is_dir():
+        return
+    for path in sorted(root.rglob("*.sls")):
+        if path.is_file():
+            yield path
+
+
+def cmd_set(args):
+    upsert_setting(args.setting_id, parse_value(args.value, args.value_file),
+                   node_id=args.node_id,
+                   duplicated_from_id=args.duplicated_from_id,
+                   user_id=args.user_id,
+                   note=args.note)
+    return 0
+
+
+def cmd_delete(args):
+    delete_setting(args.setting_id, node_id=args.node_id,
+                   user_id=args.user_id, note=args.note)
+    return 0
+
+
+def cmd_delete_prefix(args):
+    delete_setting_prefix(args.setting_id, node_id=args.node_id,
+                          user_id=args.user_id, note=args.note)
+    return 0
+
+
+def cmd_purge_node(args):
+    purge_node(args.node_id, user_id=args.user_id, note=args.note)
+    return 0
+
+
+def cmd_import_file(args):
+    count = import_pillar_file(args.path, user_id=args.user_id, note=args.note)
+    print(f"imported {count} settings from {args.path}")
+    return 0
+
+
+def cmd_import_minion(args):
+    count = 0
+    for name in (f"{args.node_id}.sls", f"adv_{args.node_id}.sls"):
+        path = PILLAR_ROOT / "minions" / name
+        if path.exists():
+            count += import_pillar_file(path, user_id=args.user_id, note=args.note)
+    print(f"imported {count} settings for node {args.node_id}")
+    return 0
+
+
+def cmd_import_all(args):
+    count = 0
+    skipped = 0
+    for path in iter_pillar_files(args.root):
+        try:
+            count += import_pillar_file(path, user_id=args.user_id, note=args.note)
+        except SkipPath as exc:
+            skipped += 1
+            if args.verbose:
+                print(f"skip: {exc}", file=sys.stderr)
+    print(f"imported {count} settings, skipped {skipped} files")
+    if args.state_file:
+        with open(args.state_file, "w") as fh:
+            fh.write("ok\n")
+    return 0
+
+
+def cmd_sync_yaml_mutation(args):
+    meta = classify_pillar_path(args.path)
+    note = args.note or f"so-config sync-yaml-mutation {args.operation} {args.path}"
+
+    if meta["kind"] == "advanced":
+        import_pillar_file(args.path, user_id=args.user_id, note=note)
+        return 0
+
+    if args.operation in ("add", "replace"):
+        upsert_setting(args.key, parse_value(args.value, args.value_file),
+                       node_id=meta["node_id"],
+                       user_id=args.user_id,
+                       note=note)
+    elif args.operation == "remove":
+        delete_setting_prefix(args.key, node_id=meta["node_id"],
+                              user_id=args.user_id, note=note)
+    else:
+        raise ValueError(f"unsupported operation: {args.operation}")
+    return 0
+
+
+def build_parser():
+    parser = argparse.ArgumentParser(description=__doc__)
+    sub = parser.add_subparsers(dest="command", required=True)
+
+    p = sub.add_parser("wait-schema", help="wait for SOC-created onionconfig tables")
+    p.add_argument("--timeout", type=int, default=120)
+    p.add_argument("--interval", type=int, default=2)
+    p.set_defaults(func=cmd_wait_schema)
+
+    p = sub.add_parser("set", help="upsert one setting")
+    p.add_argument("setting_id")
+    p.add_argument("value", nargs="?", default="")
+    p.add_argument("--value-file")
+    p.add_argument("--node-id", default="")
+    p.add_argument("--duplicated-from-id")
+    p.add_argument("--user-id", default=DEFAULT_USER_ID)
+    p.add_argument("--note")
+    p.set_defaults(func=cmd_set)
+
+    p = sub.add_parser("delete", help="delete one setting")
+    p.add_argument("setting_id")
+    p.add_argument("--node-id", default="")
+    p.add_argument("--user-id", default=DEFAULT_USER_ID)
+    p.add_argument("--note")
+    p.set_defaults(func=cmd_delete)
+
+    p = sub.add_parser("delete-prefix", help="delete one setting and all child settings")
+    p.add_argument("setting_id")
+    p.add_argument("--node-id", default="")
+    p.add_argument("--user-id", default=DEFAULT_USER_ID)
+    p.add_argument("--note")
+    p.set_defaults(func=cmd_delete_prefix)
+
+    p = sub.add_parser("purge-node", help="delete all settings for one node")
+    p.add_argument("node_id")
+    p.add_argument("--user-id", default=DEFAULT_USER_ID)
+    p.add_argument("--note")
+    p.set_defaults(func=cmd_purge_node)
+
+    p = sub.add_parser("import-file", help="import one SOC-managed pillar file")
+    p.add_argument("path")
+    p.add_argument("--user-id", default=DEFAULT_USER_ID)
+    p.add_argument("--note")
+    p.set_defaults(func=cmd_import_file)
+
+    p = sub.add_parser("import-minion", help="import one minion's pillar files")
+    p.add_argument("node_id")
+    p.add_argument("--user-id", default=DEFAULT_USER_ID)
+    p.add_argument("--note")
+    p.set_defaults(func=cmd_import_minion)
+
+    p = sub.add_parser("import-all", help="import all SOC-managed local pillar files")
+    p.add_argument("--root", default=str(PILLAR_ROOT))
+    p.add_argument("--state-file")
+    p.add_argument("--user-id", default=DEFAULT_USER_ID)
+    p.add_argument("--note", default="so-config initial import")
+    p.add_argument("--verbose", action="store_true")
+    p.set_defaults(func=cmd_import_all)
+
+    p = sub.add_parser("sync-yaml-mutation",
+                       help="mirror one so-yaml add/replace/remove mutation to onionconfig")
+    p.add_argument("path")
+    p.add_argument("operation", choices=("add", "replace", "remove"))
+    p.add_argument("key")
+    p.add_argument("value", nargs="?", default="")
+    p.add_argument("--value-file")
+    p.add_argument("--user-id", default=DEFAULT_USER_ID)
+    p.add_argument("--note")
+    p.set_defaults(func=cmd_sync_yaml_mutation)
+
+    return parser
+
+
+def main(argv):
+    parser = build_parser()
+    args = parser.parse_args(argv)
+    try:
+        return args.func(args)
+    except SkipPath as exc:
+        print(f"skip: {exc}", file=sys.stderr)
+        return 2
+    except Exception as exc:
+        print(f"so-config: {exc}", file=sys.stderr)
+        return 1
+
+
+if __name__ == "__main__":
+    sys.exit(main(sys.argv[1:]))
@@ -0,0 +1,178 @@
+import importlib
+import os
+import tempfile
+import unittest
+from unittest.mock import patch
+
+
+soconfig = importlib.import_module("so-config")
+
+
+class TestSoConfigPathMapping(unittest.TestCase):
+
+    def test_classify_global_soc(self):
+        meta = soconfig.classify_pillar_path(
+            "/opt/so/saltstack/local/pillar/soc/soc_soc.sls")
+        self.assertEqual(meta["kind"], "normal")
+        self.assertEqual(meta["node_id"], "")
+
+    def test_classify_global_advanced(self):
+        meta = soconfig.classify_pillar_path(
+            "/opt/so/saltstack/local/pillar/soc/adv_soc.sls")
+        self.assertEqual(meta["kind"], "advanced")
+        self.assertEqual(meta["setting_id"], "soc.advanced")
+        self.assertEqual(meta["node_id"], "")
+
+    def test_classify_minion(self):
+        meta = soconfig.classify_pillar_path(
+            "/opt/so/saltstack/local/pillar/minions/h1_sensor.sls")
+        self.assertEqual(meta["kind"], "normal")
+        self.assertEqual(meta["node_id"], "h1_sensor")
+
+    def test_classify_minion_advanced(self):
+        meta = soconfig.classify_pillar_path(
+            "/opt/so/saltstack/local/pillar/minions/adv_h1_sensor.sls")
+        self.assertEqual(meta["kind"], "advanced")
+        self.assertEqual(meta["setting_id"], "advanced")
+        self.assertEqual(meta["node_id"], "h1_sensor")
+
+    def test_classify_skips_bootstrap(self):
+        with self.assertRaises(soconfig.SkipPath):
+            soconfig.classify_pillar_path(
+                "/opt/so/saltstack/local/pillar/secrets.sls")
+
+
+class TestSoConfigImport(unittest.TestCase):
+
+    def test_flatten_keeps_lists_as_values(self):
+        flattened = dict(soconfig.flatten("", {
+            "host": {"mainip": "10.0.0.1"},
+            "suricata": {"pcap": {"enabled": True}},
+            "items": ["a", "b"],
+        }))
+        self.assertEqual(flattened["host.mainip"], "10.0.0.1")
+        self.assertEqual(flattened["suricata.pcap.enabled"], True)
+        self.assertEqual(flattened["items"], ["a", "b"])
+
+    def test_import_file_upserts_flattened_settings(self):
+        with tempfile.TemporaryDirectory() as tmp:
+            # The file must live under a "minions" directory to classify as a per-minion pillar.
+            minions = os.path.join(tmp, "minions")
+            os.mkdir(minions)
+            path = os.path.join(minions, "h1_sensor.sls")
+            with open(path, "w") as fh:
+                fh.write("host:\n  mainip: 10.0.0.1\nsuricata:\n  enabled: true\n")
+
+            calls = []
+            with patch.object(soconfig, "upsert_setting",
+                              side_effect=lambda *args, **kwargs: calls.append((args, kwargs))):
+                count = soconfig.import_pillar_file(path)
+
+        self.assertEqual(count, 2)
+        self.assertIn((("host.mainip", "10.0.0.1"), {"node_id": "h1_sensor", "user_id": "so-config", "note": f"so-config import-file {path}"}), calls)
+        self.assertIn((("suricata.enabled", True), {"node_id": "h1_sensor", "user_id": "so-config", "note": f"so-config import-file {path}"}), calls)
+
+    def test_import_advanced_file_upserts_raw_content(self):
+        with tempfile.TemporaryDirectory() as tmp:
+            minions = os.path.join(tmp, "minions")
+            os.mkdir(minions)
+            path = os.path.join(minions, "adv_h1_sensor.sls")
+            with open(path, "w") as fh:
+                fh.write("custom:\n  raw: true\n")
+
+            calls = []
+            with patch.object(soconfig, "upsert_setting",
+                              side_effect=lambda *args, **kwargs: calls.append((args, kwargs))):
+                count = soconfig.import_pillar_file(path)
+
+        self.assertEqual(count, 1)
+        self.assertEqual(calls[0][0], ("advanced", "custom:\n  raw: true\n"))
+        self.assertEqual(calls[0][1]["node_id"], "h1_sensor")
+
+
+class TestSoConfigSql(unittest.TestCase):
+
+    def test_schema_ready_checks_soc_tables(self):
+        captured = {}
+        with patch.object(soconfig, "docker_psql",
+                          side_effect=lambda sql: captured.update({"sql": sql}) or "t\n"):
+            ready = soconfig.schema_ready()
+
+        self.assertTrue(ready)
+        self.assertIn("to_regclass('public.settings')", captured["sql"])
+        self.assertIn("to_regclass('public.audit_settings')", captured["sql"])
+
+    def test_set_writes_settings_and_audit(self):
+        captured = {}
+        with patch.object(soconfig, "docker_psql",
+                          side_effect=lambda sql: captured.setdefault("sql", sql)):
+            soconfig.upsert_setting("host.mainip", "10.0.0.1",
+                                    node_id="h1_sensor", user_id="tester", note="unit")
+
+        self.assertIn("INSERT INTO settings", captured["sql"])
+        self.assertIn("INSERT INTO audit_settings", captured["sql"])
+        self.assertIn("'host.mainip'", captured["sql"])
+        self.assertIn("'h1_sensor'", captured["sql"])
+        self.assertIn("'tester'", captured["sql"])
+
+    def test_purge_node_audits_deleted_rows(self):
+        captured = {}
+        with patch.object(soconfig, "docker_psql",
+                          side_effect=lambda sql: captured.setdefault("sql", sql)):
+            soconfig.purge_node("h1_sensor", user_id="tester", note="unit")
+
+        self.assertIn("DELETE FROM settings", captured["sql"])
+        self.assertIn("WHERE node_id = 'h1_sensor'", captured["sql"])
+        self.assertIn("INSERT INTO audit_settings", captured["sql"])
+
+    def test_delete_prefix_removes_children_and_audits(self):
+        captured = {}
+        with patch.object(soconfig, "docker_psql",
+                          side_effect=lambda sql: captured.setdefault("sql", sql)):
+            soconfig.delete_setting_prefix("elasticfleet", node_id="h1_sensor",
+                                           user_id="tester", note="unit")
+
+        self.assertIn("DELETE FROM settings", captured["sql"])
+        self.assertIn("setting_id = 'elasticfleet'", captured["sql"])
+        self.assertIn("'elasticfleet.'", captured["sql"])
+        self.assertIn("INSERT INTO audit_settings", captured["sql"])
+
+    def test_sync_yaml_replace_uses_path_node_id(self):
+        with tempfile.TemporaryDirectory() as tmp:
+            minions = os.path.join(tmp, "minions")
+            os.mkdir(minions)
+            path = os.path.join(minions, "h1_sensor.sls")
+            open(path, "w").close()
+
+            calls = []
+            args = soconfig.build_parser().parse_args([
+                "sync-yaml-mutation", path, "replace", "suricata.enabled", "true"
+            ])
+            with patch.object(soconfig, "upsert_setting",
+                              side_effect=lambda *a, **kw: calls.append((a, kw))):
+                soconfig.cmd_sync_yaml_mutation(args)
+
+        self.assertEqual(calls[0][0], ("suricata.enabled", True))
+        self.assertEqual(calls[0][1]["node_id"], "h1_sensor")
+
+    def test_sync_yaml_remove_deletes_prefix(self):
+        with tempfile.TemporaryDirectory() as tmp:
+            minions = os.path.join(tmp, "minions")
+            os.mkdir(minions)
+            path = os.path.join(minions, "h1_sensor.sls")
+            open(path, "w").close()
+
+            calls = []
+            args = soconfig.build_parser().parse_args([
+                "sync-yaml-mutation", path, "remove", "elasticfleet"
+            ])
+            with patch.object(soconfig, "delete_setting_prefix",
+                              side_effect=lambda *a, **kw: calls.append((a, kw))):
+                soconfig.cmd_sync_yaml_mutation(args)
+
+        self.assertEqual(calls[0][0], ("elasticfleet",))
+        self.assertEqual(calls[0][1]["node_id"], "h1_sensor")
+
+
+if __name__ == "__main__":
+    unittest.main()
@@ -1,381 +0,0 @@
-#!/usr/bin/env python3
-
-# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
-# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
-# https://securityonion.net/license; you may not use this file except in compliance with the
-# Elastic License 2.0.
-
-# Imports detection overrides (e.g. from so-detections-backup) into the so-detection
-# index. Reads <publicId>.<ext> files (NDJSON, one override per line) from a source
-# directory, looks up the matching detection by publicId+engine, validates each
-# override against the same rules SOC enforces, dedupes against existing overrides
-# (operational fields only), and appends new ones.
-
-import argparse
-import ipaddress
-import json
-import os
-import re
-import sys
-from datetime import datetime
-
-import requests
-from requests.auth import HTTPBasicAuth
-import urllib3
-
-urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
-
-DEFAULT_INDEX = "so-detection"
-AUTH_FILE = "/opt/so/conf/elasticsearch/curl.config"
-ES_URL = "https://localhost:9200"
-
-# Engines we know how to handle and the file extension the backup script writes.
-ENGINES = {
-    "suricata": "txt",
-}
-
-# Standard Suricata variables that ship with Security Onion. Anything else
-# referenced in an override is "custom" and the user needs to make sure it
-# exists in SOC Config before the override will function.
-BUILTIN_SURICATA_VARS = {
-    "$HOME_NET", "$EXTERNAL_NET",
-    "$HTTP_SERVERS", "$DNS_SERVERS", "$SQL_SERVERS", "$SMTP_SERVERS",
-    "$TELNET_SERVERS", "$AIM_SERVERS", "$DC_SERVERS", "$MODBUS_SERVER",
-    "$MODBUS_CLIENT", "$ENIP_CLIENT", "$ENIP_SERVER",
-    "$HTTP_PORTS", "$SHELLCODE_PORTS", "$ORACLE_PORTS", "$SSH_PORTS",
-    "$FTP_PORTS", "$FILE_DATA_PORTS",
-}
-
-VAR_PATTERN = re.compile(r"\$[A-Z_][A-Z0-9_]*")
-
-# Canonical valid values, per securityonion-soc/model/detection.go.
-SURICATA_OVERRIDE_TYPES = {"suppress", "threshold", "modify"}
-SUPPRESS_TRACKS = {"by_src", "by_dst", "by_either"}
-THRESHOLD_TRACKS = {"by_src", "by_dst", "by_both"}
-THRESHOLD_TYPES = {"limit", "threshold", "both"}
-
-STALE_WARNING = """\
-WARNING: so-detections-backup does not remove backup files when overrides are
-deleted via the Security Onion web UI. As a result, files in the source
-directory may represent overrides that were intentionally deleted and should
-NOT be re-imported.
-
-Before continuing, verify that the source directory reflects the overrides you
-actually want imported. Remove any files corresponding to overrides you previously deleted.
-"""
-
-
-def make_session(auth_file):
-    with open(auth_file, "r") as f:
-        for line in f:
-            if line.startswith("user ="):
-                creds = line.split("=", 1)[1].strip().replace('"', "")
-                user, _, password = creds.partition(":")
-                session = requests.Session()
-                session.auth = HTTPBasicAuth(user, password)
-                session.headers.update({"Content-Type": "application/json"})
-                session.verify = False
-                return session
-    raise RuntimeError(f"Could not find 'user =' line in {auth_file}")
-
-
-def find_detection(session, index, public_id, engine):
-    query = {
-        "query": {"bool": {"must": [
-            {"term": {"so_detection.publicId": public_id}},
-            {"term": {"so_detection.engine": engine}},
-        ]}},
-        "size": 2,
-    }
-    r = session.get(f"{ES_URL}/{index}/_search", json=query)
-    r.raise_for_status()
-    hits = r.json().get("hits", {}).get("hits", [])
-    if not hits:
-        return None, None, None
-    if len(hits) > 1:
-        # Shouldn't happen — publicId is unique per engine — but flag it.
-        print(f"  WARN: {len(hits)} detections matched publicId={public_id} engine={engine}; using first")
-    hit = hits[0]
-    existing = hit["_source"].get("so_detection", {}).get("overrides") or []
-    return hit["_id"], hit["_index"], existing
-
-
-def update_overrides(session, doc_index, doc_id, overrides):
-    body = {"doc": {"so_detection": {"overrides": overrides}}}
-    r = session.post(f"{ES_URL}/{doc_index}/_update/{doc_id}", json=body)
-    r.raise_for_status()
-    return r.json()
-
-
-def dedupe_key(override):
-    """Operational fields only, per Override.Equal() in detection.go.
-    Excludes timestamps and isEnabled so re-imports don't appear unique."""
-    t = override.get("type")
-    if t == "suppress":
-        return (t, override.get("track"), override.get("ip"))
-    if t == "threshold":
-        return (t, override.get("thresholdType"), override.get("track"),
-                override.get("count"), override.get("seconds"))
-    if t == "modify":
-        return (t, override.get("regex"), override.get("value"))
-
-
-def _validate_suricata_ip(ip):
-    if not ip:
-        return "ip cannot be empty"
-    if ip.startswith("$"):
-        return None
-    if ip.startswith("[") and ip.endswith("]"):
-        for part in ip[1:-1].split(","):
-            err = _validate_single_ip(part.strip())
-            if err:
-                return f"invalid IP in list: {err}"
-        return None
-    return _validate_single_ip(ip)
-
-
-def _validate_single_ip(ip):
-    try:
-        if "/" in ip:
-            ipaddress.ip_network(ip, strict=False)
-        else:
-            ipaddress.ip_address(ip)
-    except ValueError:
-        return f"invalid IP/CIDR {ip!r}"
-    return None
-
-
-def validate_override(override, engine):
-    """Mirror Override.Validate() from securityonion-soc/model/detection.go.
-    Returns None on success, an error string otherwise."""
-    t = override.get("type")
-    if not t:
-        return "override type is required"
-    if t not in SURICATA_OVERRIDE_TYPES:
-        return f"invalid type {t!r}: must be one of {sorted(SURICATA_OVERRIDE_TYPES)}"
-
-    has = {k: override.get(k) is not None for k in
-           ("regex", "value", "thresholdType", "track", "ip", "count", "seconds", "customFilter")}
-
-    if t == "suppress":
-        if not has["ip"] or not has["track"]:
-            return "suppress requires 'ip' and 'track'"
-        if any(has[k] for k in ("regex", "value", "thresholdType", "count", "seconds", "customFilter")):
-            return "suppress has unnecessary fields"
-        if override["track"] not in SUPPRESS_TRACKS:
-            return f"invalid track {override['track']!r}: must be one of {sorted(SUPPRESS_TRACKS)}"
-        return _validate_suricata_ip(override["ip"])
-
-    if t == "threshold":
-        if not all(has[k] for k in ("thresholdType", "track", "count", "seconds")):
-            return "threshold requires 'thresholdType', 'track', 'count', 'seconds'"
-        if any(has[k] for k in ("regex", "value", "customFilter")):
-            return "threshold has unnecessary fields"
-        if override["thresholdType"] not in THRESHOLD_TYPES:
-            return f"invalid thresholdType {override['thresholdType']!r}: must be one of {sorted(THRESHOLD_TYPES)}"
-        if override["track"] not in THRESHOLD_TRACKS:
-            return f"invalid track {override['track']!r}: must be one of {sorted(THRESHOLD_TRACKS)}"
-        if not isinstance(override["count"], int) or override["count"] <= 0:
-            return f"count must be a positive integer, got {override['count']!r}"
-        if not isinstance(override["seconds"], int) or override["seconds"] <= 0:
-            return f"seconds must be a positive integer, got {override['seconds']!r}"
-        return None
-
-    if t == "modify":
-        if not has["regex"] or not has["value"]:
-            return "modify requires 'regex' and 'value'"
-        if any(has[k] for k in ("thresholdType", "track", "count", "seconds", "customFilter")):
-            return "modify has unnecessary fields"
-        try:
-            re.compile(override["regex"])
-        except re.error as e:
-            return f"invalid regex: {e}"
-        return None
-
-
-def parse_overrides_file(path):
-    """Parse a file written by so-detections-backup.py: NDJSON, one override
-    per line. Returns a list of (override_dict, line_number)."""
-    overrides = []
-    with open(path, "r") as f:
-        for i, line in enumerate(f, start=1):
-            line = line.strip()
-            if not line:
-                continue
-            overrides.append((json.loads(line), i))
-    return overrides
-
-
-def describe(override):
-    """Human-readable summary of the operational fields for a given override type."""
-    t = override.get("type")
-    if t == "suppress":
-        return f"type=suppress track={override.get('track')} ip={override.get('ip')}"
-    if t == "threshold":
-        return (f"type=threshold track={override.get('track')} "
-                f"thresholdType={override.get('thresholdType')} "
-                f"count={override.get('count')} seconds={override.get('seconds')}")
-    if t == "modify":
-        return f"type=modify regex={override.get('regex')!r}"
-
-
-def collect_custom_vars(override):
-    found = set()
-    for value in override.values():
-        if isinstance(value, str):
-            for match in VAR_PATTERN.findall(value):
-                if match not in BUILTIN_SURICATA_VARS:
-                    found.add(match)
-    return found
-
-
-def parse_args():
-    p = argparse.ArgumentParser(
-        description="Import detection overrides into the so-detection index.",
-    )
-    p.add_argument("--source", "-s", required=True,
-                   help="Source directory containing <publicId>.<ext> override files.")
-    p.add_argument("--engine", "-e", default="suricata", choices=list(ENGINES.keys()),
-                   help="Detection engine (default: suricata).")
-    p.add_argument("--dry-run", "-n", action="store_true",
-                   help="Print what would happen without writing to Elasticsearch.")
-    p.add_argument("--no-import-note", action="store_true",
-                   help="Do not prepend '[Imported YYYY-MM-DD] ' to the override note.")
-    p.add_argument("--index", "-i", default=DEFAULT_INDEX,
-                   help=f"Elasticsearch index to update (default: {DEFAULT_INDEX}).")
-    return p.parse_args()
-
-
-def confirm_proceed(args):
-    """Show the stale-backup warning. Dry-run prints it and continues. Real
-    runs require the user typing 'yes' at the prompt."""
-    print(STALE_WARNING)
-    if args.dry_run:
-        print("(dry-run: no acknowledgement required)\n")
-        return True
-    answer = input("Type 'yes' to acknowledge and continue: ").strip().lower()
-    print()
-    return answer == "yes"
-
-
-def main():
-    args = parse_args()
-
-    if not os.path.isdir(args.source):
-        print(f"ERROR: source directory not found: {args.source}", file=sys.stderr)
-        sys.exit(1)
-
-    extension = ENGINES[args.engine]
-    files = sorted(f for f in os.listdir(args.source) if f.endswith(f".{extension}"))
-    if not files:
-        print(f"No *.{extension} files found in {args.source}")
-        sys.exit(0)
-
-    if not confirm_proceed(args):
-        print("Aborted.")
-        sys.exit(1)
-
-    session = make_session(AUTH_FILE)
-    today = datetime.now().strftime("%Y-%m-%d")
-    note_prefix = "" if args.no_import_note else f"[Imported {today}] "
-
-    counts = {"added": 0, "skipped_dedupe": 0, "skipped_not_found": 0, "invalid": 0, "error": 0}
-    custom_vars = set()
-
-    mode = "DRY-RUN" if args.dry_run else "IMPORT"
-    print(f"[{mode}] engine={args.engine} source={args.source} index={args.index}\n")
-
-    for filename in files:
-        public_id = os.path.splitext(filename)[0]
-        path = os.path.join(args.source, filename)
-        print(f"{public_id}:")
-
-        try:
-            new_overrides = parse_overrides_file(path)
-        except (json.JSONDecodeError, OSError) as e:
-            print(f"  ERROR: could not parse {filename}: {e}")
-            counts["error"] += 1
-            continue
-
-        if not new_overrides:
-            print("  SKIP: empty file")
-            continue
-
-        try:
-            doc_id, doc_index, existing = find_detection(session, args.index, public_id, args.engine)
-        except requests.HTTPError as e:
-            print(f"  ERROR: search failed: {e}")
-            counts["error"] += 1
-            continue
-
-        if doc_id is None:
-            print(f"  WARN: no detection found for publicId={public_id} engine={args.engine}; skipping")
-            counts["skipped_not_found"] += len(new_overrides)
-            continue
-
-        existing_keys = {dedupe_key(o) for o in existing}
-        merged = list(existing)
-        added_this_file = 0
-
-        for override, line_no in new_overrides:
-            err = validate_override(override, args.engine)
-            if err:
-                print(f"  INVALID (line {line_no}): {err}")
-                counts["invalid"] += 1
-                continue
-
-            custom_vars.update(collect_custom_vars(override))
-            key = dedupe_key(override)
-            if key in existing_keys:
-                print(f"  SKIP (line {line_no}): duplicate of existing override [{describe(override)}]")
-                counts["skipped_dedupe"] += 1
-                continue
-
-            if note_prefix:
-                override = dict(override)
-                override["note"] = note_prefix + (override.get("note") or "")
-
-            merged.append(override)
-            existing_keys.add(key)
-            added_this_file += 1
-            print(f"  ADD (line {line_no}): {describe(override)}")
-
-        if added_this_file == 0:
-            continue
-
-        if args.dry_run:
-            print(f"  DRY-RUN: would update {doc_index}/{doc_id} "
-                  f"({len(existing)} existing → {len(merged)} total)")
-            counts["added"] += added_this_file
-            continue
-
-        try:
-            update_overrides(session, doc_index, doc_id, merged)
-            print(f"  UPDATED {doc_index}/{doc_id} ({len(existing)} → {len(merged)})")
-            counts["added"] += added_this_file
-        except requests.HTTPError as e:
-            print(f"  ERROR: update failed: {e}")
-            counts["error"] += 1
-
-    print()
-    print("=" * 60)
-    print(f"Summary ({mode}):")
-    print(f"  Overrides added:           {counts['added']}")
-    print(f"  Skipped (already present): {counts['skipped_dedupe']}")
-    print(f"  Skipped (no detection):    {counts['skipped_not_found']}")
-    print(f"  Invalid (failed checks):   {counts['invalid']}")
-    print(f"  Errors:                    {counts['error']}")
-
-    if custom_vars:
-        print()
-        print("WARNING: detected custom Suricata variables in imported overrides:")
-        for v in sorted(custom_vars):
-            print(f"  {v}")
-        print("If any of these are not already defined in SOC Config (Suricata variables),")
-        print("you must add them manually before the rules will function correctly.")
-
-    sys.exit(0 if counts["error"] == 0 and counts["invalid"] == 0 else 1)
-
-
-if __name__ == "__main__":
-    main()
@@ -1,588 +0,0 @@
-# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
-# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
-# https://securityonion.net/license; you may not use this file except in compliance with the
-# Elastic License 2.0.
-
-import importlib.util
-import json
-import os
-import shutil
-import sys
-import tempfile
-import unittest
-from importlib.machinery import SourceFileLoader
-from io import StringIO
-from unittest.mock import MagicMock, patch
-
-import requests
-
-# The script has no .py extension; spec_from_file_location can't auto-detect a
-# loader, so we hand it a SourceFileLoader explicitly. (load_module() is
-# deprecated in 3.14 and slated for removal in 3.15.)
-HERE = os.path.dirname(os.path.abspath(__file__))
-SCRIPT = os.path.join(HERE, "so-detections-overrides-import")
-_loader = SourceFileLoader("so_overrides_import", SCRIPT)
-_spec = importlib.util.spec_from_loader("so_overrides_import", _loader)
-soi = importlib.util.module_from_spec(_spec)
-_loader.exec_module(soi)
-
-
-class TestValidateSuppress(unittest.TestCase):
-    def test_valid(self):
-        self.assertIsNone(soi.validate_override(
-            {"type": "suppress", "track": "by_src", "ip": "1.2.3.4"}, "suricata"))
-
-    def test_valid_var(self):
-        self.assertIsNone(soi.validate_override(
-            {"type": "suppress", "track": "by_either", "ip": "$HOME_NET"}, "suricata"))
-
-    def test_valid_cidr(self):
-        self.assertIsNone(soi.validate_override(
-            {"type": "suppress", "track": "by_dst", "ip": "10.0.0.0/8"}, "suricata"))
-
-    def test_valid_bracket_list(self):
-        self.assertIsNone(soi.validate_override(
-            {"type": "suppress", "track": "by_src", "ip": "[1.2.3.4,10.0.0.0/8]"}, "suricata"))
-
-    def test_missing_ip(self):
-        err = soi.validate_override({"type": "suppress", "track": "by_src"}, "suricata")
-        self.assertIn("requires", err)
-
-    def test_missing_track(self):
-        err = soi.validate_override({"type": "suppress", "ip": "1.2.3.4"}, "suricata")
-        self.assertIn("requires", err)
-
-    def test_invalid_track(self):
-        err = soi.validate_override(
-            {"type": "suppress", "track": "by_both", "ip": "1.2.3.4"}, "suricata")
-        self.assertIn("invalid track", err)
-
-    def test_invalid_ip(self):
-        err = soi.validate_override(
-            {"type": "suppress", "track": "by_src", "ip": "not-an-ip"}, "suricata")
-        self.assertIn("invalid IP", err)
-
-    def test_unnecessary_field(self):
-        err = soi.validate_override(
-            {"type": "suppress", "track": "by_src", "ip": "1.2.3.4", "count": 5}, "suricata")
-        self.assertIn("unnecessary fields", err)
-
-
-class TestValidateThreshold(unittest.TestCase):
-    def test_valid(self):
-        self.assertIsNone(soi.validate_override({
-            "type": "threshold", "track": "by_src",
-            "thresholdType": "limit", "count": 10, "seconds": 60,
-        }, "suricata"))
-
-    def test_valid_by_both(self):
-        self.assertIsNone(soi.validate_override({
-            "type": "threshold", "track": "by_both",
-            "thresholdType": "both", "count": 1, "seconds": 1,
-        }, "suricata"))
-
-    def test_track_by_either_invalid(self):
-        err = soi.validate_override({
-            "type": "threshold", "track": "by_either",
-            "thresholdType": "limit", "count": 10, "seconds": 60,
-        }, "suricata")
-        self.assertIn("invalid track", err)
-
-    def test_invalid_threshold_type(self):
-        err = soi.validate_override({
-            "type": "threshold", "track": "by_src",
-            "thresholdType": "bogus", "count": 10, "seconds": 60,
-        }, "suricata")
-        self.assertIn("invalid thresholdType", err)
-
-    def test_zero_count(self):
-        err = soi.validate_override({
-            "type": "threshold", "track": "by_src",
-            "thresholdType": "limit", "count": 0, "seconds": 60,
-        }, "suricata")
-        self.assertIn("count", err)
-
-    def test_negative_seconds(self):
-        err = soi.validate_override({
-            "type": "threshold", "track": "by_src",
-            "thresholdType": "limit", "count": 10, "seconds": -1,
-        }, "suricata")
-        self.assertIn("seconds", err)
-
-    def test_missing_field(self):
-        err = soi.validate_override({
-            "type": "threshold", "track": "by_src",
-            "thresholdType": "limit", "count": 10,  # missing seconds
-        }, "suricata")
-        self.assertIn("requires", err)
-
-    def test_unnecessary_field(self):
-        err = soi.validate_override({
-            "type": "threshold", "track": "by_src",
-            "thresholdType": "limit", "count": 10, "seconds": 60,
-            "regex": "foo",
-        }, "suricata")
-        self.assertIn("unnecessary fields", err)
-
-
-class TestValidateModify(unittest.TestCase):
-    def test_valid(self):
-        self.assertIsNone(soi.validate_override(
-            {"type": "modify", "regex": r"content:\"foo\"", "value": "content:bar"}, "suricata"))
-
-    def test_invalid_regex(self):
-        err = soi.validate_override(
-            {"type": "modify", "regex": "(unbalanced", "value": "x"}, "suricata")
-        self.assertIn("invalid regex", err)
-
-    def test_missing_value(self):
-        err = soi.validate_override({"type": "modify", "regex": "x"}, "suricata")
-        self.assertIn("requires", err)
-
-    def test_unnecessary_field(self):
-        err = soi.validate_override(
-            {"type": "modify", "regex": "x", "value": "y", "track": "by_src"}, "suricata")
-        self.assertIn("unnecessary fields", err)
-
-
-class TestValidateMisc(unittest.TestCase):
-    def test_unknown_type(self):
-        err = soi.validate_override({"type": "suppresss", "track": "by_src", "ip": "1.2.3.4"}, "suricata")
-        self.assertIn("invalid type", err)
-
-    def test_missing_type(self):
-        err = soi.validate_override({"track": "by_src"}, "suricata")
-        self.assertIn("type is required", err)
-
-
-class TestValidateIP(unittest.TestCase):
-    def test_plain_ipv4(self):
-        self.assertIsNone(soi._validate_suricata_ip("1.2.3.4"))
-
-    def test_plain_ipv6(self):
-        self.assertIsNone(soi._validate_suricata_ip("::1"))
-
-    def test_cidr(self):
-        self.assertIsNone(soi._validate_suricata_ip("10.0.0.0/8"))
-
-    def test_var(self):
-        self.assertIsNone(soi._validate_suricata_ip("$CONCOURSEWORKERS"))
-
-    def test_bracket_list(self):
-        self.assertIsNone(soi._validate_suricata_ip("[1.2.3.4, 10.0.0.0/8]"))
-
-    def test_bracket_list_bad_member(self):
-        err = soi._validate_suricata_ip("[1.2.3.4,nope]")
-        self.assertIn("invalid IP in list", err)
-
-    def test_empty(self):
-        self.assertIn("empty", soi._validate_suricata_ip(""))
-
-    def test_invalid(self):
-        self.assertIn("invalid", soi._validate_suricata_ip("999.999.999.999"))
-
-
-class TestDedupeKey(unittest.TestCase):
-    def test_suppress(self):
-        a = {"type": "suppress", "track": "by_src", "ip": "1.2.3.4", "count": 99}
-        b = {"type": "suppress", "track": "by_src", "ip": "1.2.3.4"}
-        # count is irrelevant for suppress dedupe
-        self.assertEqual(soi.dedupe_key(a), soi.dedupe_key(b))
-
-    def test_suppress_differs_on_ip(self):
-        a = {"type": "suppress", "track": "by_src", "ip": "1.2.3.4"}
-        b = {"type": "suppress", "track": "by_src", "ip": "5.6.7.8"}
-        self.assertNotEqual(soi.dedupe_key(a), soi.dedupe_key(b))
-
-    def test_threshold(self):
-        a = {"type": "threshold", "track": "by_src", "thresholdType": "limit",
-             "count": 10, "seconds": 60, "ip": "ignored"}
-        b = {"type": "threshold", "track": "by_src", "thresholdType": "limit",
-             "count": 10, "seconds": 60}
-        self.assertEqual(soi.dedupe_key(a), soi.dedupe_key(b))
-
-    def test_threshold_differs_on_count(self):
-        a = {"type": "threshold", "track": "by_src", "thresholdType": "limit",
-             "count": 10, "seconds": 60}
-        b = {"type": "threshold", "track": "by_src", "thresholdType": "limit",
-             "count": 20, "seconds": 60}
-        self.assertNotEqual(soi.dedupe_key(a), soi.dedupe_key(b))
-
-    def test_modify(self):
-        a = {"type": "modify", "regex": "x", "value": "y"}
-        b = {"type": "modify", "regex": "x", "value": "y"}
-        self.assertEqual(soi.dedupe_key(a), soi.dedupe_key(b))
-
-
-class TestDescribe(unittest.TestCase):
-    def test_suppress(self):
-        s = soi.describe({"type": "suppress", "track": "by_src", "ip": "1.2.3.4"})
-        self.assertIn("suppress", s)
-        self.assertIn("by_src", s)
-        self.assertIn("1.2.3.4", s)
-
-    def test_threshold_includes_count(self):
-        s = soi.describe({"type": "threshold", "track": "by_src",
-                          "thresholdType": "limit", "count": 10, "seconds": 60})
-        self.assertIn("count=10", s)
-        self.assertIn("seconds=60", s)
-
-    def test_modify(self):
-        s = soi.describe({"type": "modify", "regex": "foo"})
-        self.assertIn("modify", s)
-        self.assertIn("foo", s)
-
-
-class TestParseOverridesFile(unittest.TestCase):
-    def _write(self, content):
-        fd, path = tempfile.mkstemp(suffix=".txt")
-        os.close(fd)
-        with open(path, "w") as f:
-            f.write(content)
-        self.addCleanup(os.unlink, path)
-        return path
-
-    def test_single_line(self):
-        path = self._write('{"type":"suppress","track":"by_src","ip":"1.2.3.4"}')
-        result = soi.parse_overrides_file(path)
-        self.assertEqual(len(result), 1)
-        self.assertEqual(result[0][0]["type"], "suppress")
-        self.assertEqual(result[0][1], 1)
-
-    def test_ndjson(self):
-        path = self._write(
-            '{"type":"suppress","track":"by_src","ip":"1.2.3.4"}\n'
-            '{"type":"suppress","track":"by_dst","ip":"5.6.7.8"}\n'
-        )
-        result = soi.parse_overrides_file(path)
-        self.assertEqual(len(result), 2)
-        self.assertEqual(result[1][1], 2)
-
-    def test_empty(self):
-        path = self._write("")
-        self.assertEqual(soi.parse_overrides_file(path), [])
-
-    def test_blank_lines_skipped(self):
-        path = self._write('\n{"type":"suppress","track":"by_src","ip":"1.2.3.4"}\n\n')
-        result = soi.parse_overrides_file(path)
-        self.assertEqual(len(result), 1)
-        self.assertEqual(result[0][1], 2)  # line number reflects original position
-
-    def test_invalid_raises(self):
-        path = self._write("not json")
-        with self.assertRaises(json.JSONDecodeError):
-            soi.parse_overrides_file(path)
-
-
-class TestCollectCustomVars(unittest.TestCase):
-    def test_finds_custom(self):
-        v = soi.collect_custom_vars({"ip": "$CONCOURSEWORKERS"})
-        self.assertEqual(v, {"$CONCOURSEWORKERS"})
-
-    def test_filters_builtins(self):
-        v = soi.collect_custom_vars({"ip": "$HOME_NET"})
-        self.assertEqual(v, set())
-
-    def test_mixed(self):
-        v = soi.collect_custom_vars({"ip": "[$HOME_NET,$MYNET]"})
-        self.assertEqual(v, {"$MYNET"})
-
-    def test_non_string_fields_ignored(self):
-        v = soi.collect_custom_vars({"count": 10, "isEnabled": True})
-        self.assertEqual(v, set())
-
-
-class TestMakeSession(unittest.TestCase):
-    def _write(self, content):
-        fd, path = tempfile.mkstemp()
-        os.close(fd)
-        with open(path, "w") as f:
-            f.write(content)
-        self.addCleanup(os.unlink, path)
-        return path
-
-    def test_valid_auth_file(self):
-        path = self._write('user = "admin:secret"\n')
-        session = soi.make_session(path)
-        self.assertEqual(session.auth.username, "admin")
-        self.assertEqual(session.auth.password, "secret")
-        self.assertFalse(session.verify)
-
-    def test_missing_user_line(self):
-        path = self._write("# no user line here\n")
-        with self.assertRaises(RuntimeError):
-            soi.make_session(path)
-
-
-class TestFindDetection(unittest.TestCase):
-    def _session_with_response(self, payload):
-        session = MagicMock()
-        response = MagicMock()
-        response.json.return_value = payload
-        response.raise_for_status.return_value = None
-        session.get.return_value = response
-        return session
-
-    def test_found(self):
-        session = self._session_with_response({"hits": {"hits": [{
-            "_id": "abc", "_index": "so-detection",
-            "_source": {"so_detection": {"overrides": [{"type": "suppress"}]}},
-        }]}})
-        doc_id, idx, existing = soi.find_detection(session, "so-detection", "2049201", "suricata")
-        self.assertEqual(doc_id, "abc")
-        self.assertEqual(idx, "so-detection")
-        self.assertEqual(len(existing), 1)
-
-    def test_not_found(self):
-        session = self._session_with_response({"hits": {"hits": []}})
-        doc_id, idx, existing = soi.find_detection(session, "so-detection", "x", "suricata")
-        self.assertIsNone(doc_id)
-        self.assertIsNone(idx)
-        self.assertIsNone(existing)
-
-    def test_no_overrides_field(self):
-        session = self._session_with_response({"hits": {"hits": [{
-            "_id": "abc", "_index": "so-detection",
-            "_source": {"so_detection": {}},
-        }]}})
-        _, _, existing = soi.find_detection(session, "so-detection", "x", "suricata")
-        self.assertEqual(existing, [])
-
-    def test_multiple_hits_warns(self):
-        session = self._session_with_response({"hits": {"hits": [
-            {"_id": "a", "_index": "i", "_source": {"so_detection": {"overrides": []}}},
-            {"_id": "b", "_index": "i", "_source": {"so_detection": {"overrides": []}}},
-        ]}})
-        with patch("sys.stdout", new=StringIO()) as out:
-            doc_id, _, _ = soi.find_detection(session, "i", "x", "suricata")
-        self.assertEqual(doc_id, "a")
-        self.assertIn("WARN", out.getvalue())
-
-
-class TestUpdateOverrides(unittest.TestCase):
-    def test_posts_to_update_endpoint(self):
-        session = MagicMock()
-        response = MagicMock()
-        response.raise_for_status.return_value = None
-        response.json.return_value = {"result": "updated"}
-        session.post.return_value = response
-
-        result = soi.update_overrides(session, "so-detection", "abc", [{"type": "suppress"}])
-
-        self.assertEqual(result, {"result": "updated"})
-        url = session.post.call_args[0][0]
-        self.assertIn("/_update/abc", url)
-        body = session.post.call_args[1]["json"]
-        self.assertEqual(body["doc"]["so_detection"]["overrides"], [{"type": "suppress"}])
-
-
-class TestConfirmProceed(unittest.TestCase):
-    def test_dry_run_skips_prompt(self):
-        args = MagicMock(dry_run=True)
-        with patch("sys.stdout", new=StringIO()):
-            self.assertTrue(soi.confirm_proceed(args))
-
-    def test_yes_input(self):
-        args = MagicMock(dry_run=False)
-        with patch("sys.stdout", new=StringIO()):
-            with patch("builtins.input", return_value="yes"):
-                self.assertTrue(soi.confirm_proceed(args))
-
-    def test_yes_input_case_insensitive(self):
-        args = MagicMock(dry_run=False)
-        with patch("sys.stdout", new=StringIO()):
-            with patch("builtins.input", return_value="YES"):
-                self.assertTrue(soi.confirm_proceed(args))
-
-    def test_no_input_aborts(self):
-        args = MagicMock(dry_run=False)
-        with patch("sys.stdout", new=StringIO()):
-            with patch("builtins.input", return_value="no"):
-                self.assertFalse(soi.confirm_proceed(args))
-
-    def test_empty_input_aborts(self):
-        args = MagicMock(dry_run=False)
-        with patch("sys.stdout", new=StringIO()):
-            with patch("builtins.input", return_value=""):
-                self.assertFalse(soi.confirm_proceed(args))
-
-
-class TestParseArgs(unittest.TestCase):
-    def test_defaults(self):
-        with patch.object(sys, "argv", ["cmd", "--source", "/some/path"]):
-            args = soi.parse_args()
-        self.assertEqual(args.source, "/some/path")
-        self.assertEqual(args.engine, "suricata")
-        self.assertFalse(args.dry_run)
-        self.assertFalse(args.no_import_note)
-        self.assertEqual(args.index, soi.DEFAULT_INDEX)
-
-    def test_all_options(self):
-        argv = ["cmd", "-s", "/x", "-e", "suricata", "-n",
-                "--no-import-note", "-i", "alt-index"]
-        with patch.object(sys, "argv", argv):
-            args = soi.parse_args()
-        self.assertEqual(args.source, "/x")
-        self.assertTrue(args.dry_run)
-        self.assertTrue(args.no_import_note)
-        self.assertEqual(args.index, "alt-index")
-
-
-class TestMain(unittest.TestCase):
-    def setUp(self):
-        self.tmpdir = tempfile.mkdtemp()
-        self.addCleanup(shutil.rmtree, self.tmpdir, ignore_errors=True)
-        # Stub make_session so tests don't need /opt/so/conf/elasticsearch/curl.config.
-        p = patch.object(soi, "make_session", return_value=MagicMock())
-        p.start()
-        self.addCleanup(p.stop)
-
-    def _write_file(self, public_id, overrides, ext="txt"):
-        """Write an NDJSON override file. Entries may be dicts or raw strings (for malformed input)."""
-        path = os.path.join(self.tmpdir, f"{public_id}.{ext}")
-        with open(path, "w") as f:
-            for o in overrides:
-                f.write(o if isinstance(o, str) else json.dumps(o))
-                f.write("\n")
-        return path
-
-    def _run_main(self, *extra_argv, input_response="yes"):
-        """Run main() with stdout/stderr captured and input mocked. Returns (stdout, stderr, exit_code)."""
-        argv = ["cmd", "--source", self.tmpdir, *extra_argv]
-        out, err = StringIO(), StringIO()
-        with patch.object(sys, "argv", argv), \
-                patch("sys.stdout", new=out), \
-                patch("sys.stderr", new=err), \
-                patch("builtins.input", return_value=input_response):
-            with self.assertRaises(SystemExit) as cm:
-                soi.main()
-        return out.getvalue(), err.getvalue(), cm.exception.code
-
-    def test_source_dir_missing(self):
-        argv = ["cmd", "--source", "/no/such/path/here"]
-        err = StringIO()
-        with patch.object(sys, "argv", argv), patch("sys.stderr", new=err):
-            with self.assertRaises(SystemExit) as cm:
-                soi.main()
-        self.assertEqual(cm.exception.code, 1)
-        self.assertIn("source directory not found", err.getvalue())
-
-    def test_no_files_found(self):
-        out, _, code = self._run_main()
-        self.assertEqual(code, 0)
-        self.assertIn("No *.txt files found", out)
-
-    def test_user_aborts(self):
-        self._write_file("1001", [{"type": "suppress", "track": "by_src", "ip": "1.2.3.4"}])
-        out, _, code = self._run_main(input_response="no")
-        self.assertEqual(code, 1)
-        self.assertIn("Aborted", out)
-
-    def test_parse_error_increments_error(self):
-        # Malformed JSON line — parse_overrides_file raises JSONDecodeError.
-        self._write_file("1002", ["not json"])
-        out, _, code = self._run_main("--dry-run")
-        self.assertEqual(code, 1)  # invalid+error → non-zero
-        self.assertIn("could not parse", out)
-        self.assertIn("Errors:                    1", out)
-
-    def test_empty_file_skipped(self):
-        # Blank lines only — parse_overrides_file returns []; main reports "empty file" and continues.
-        path = os.path.join(self.tmpdir, "1003.txt")
-        with open(path, "w") as f:
-            f.write("\n\n")
-        out, _, code = self._run_main("--dry-run")
-        self.assertEqual(code, 0)
-        self.assertIn("empty file", out)
-
-    @patch.object(soi, "find_detection")
-    def test_search_http_error(self, mock_find):
-        mock_find.side_effect = requests.HTTPError("boom")
-        self._write_file("1004", [{"type": "suppress", "track": "by_src", "ip": "1.2.3.4"}])
-        out, _, code = self._run_main("--dry-run")
-        self.assertEqual(code, 1)
-        self.assertIn("search failed", out)
-
-    @patch.object(soi, "find_detection")
-    def test_no_detection_found(self, mock_find):
-        mock_find.return_value = (None, None, None)
-        self._write_file("1005", [{"type": "suppress", "track": "by_src", "ip": "1.2.3.4"}])
-        out, _, code = self._run_main("--dry-run")
-        self.assertEqual(code, 0)
-        self.assertIn("no detection found", out)
-        self.assertIn("Skipped (no detection):    1", out)
-
-    @patch.object(soi, "find_detection")
-    def test_all_duplicates_no_update(self, mock_find):
-        existing = [{"type": "suppress", "track": "by_src", "ip": "1.2.3.4"}]
-        mock_find.return_value = ("doc1", "so-detection", existing)
-        self._write_file("1006", [{"type": "suppress", "track": "by_src", "ip": "1.2.3.4"}])
-        out, _, code = self._run_main("--dry-run")
-        self.assertEqual(code, 0)
-        self.assertIn("SKIP", out)
-        self.assertNotIn("DRY-RUN: would update", out)  # added_this_file == 0 branch
-
-    @patch.object(soi, "update_overrides")
-    @patch.object(soi, "find_detection")
-    def test_happy_path_full(self, mock_find, mock_update):
-        # Exercises: ADD, dedupe SKIP, INVALID, note prefix, UPDATE, custom-vars warning, exit=1 (invalid present)
-        existing = [{"type": "suppress", "track": "by_src", "ip": "9.9.9.9"}]
-        mock_find.return_value = ("doc1", "so-detection", existing)
-        mock_update.return_value = {"result": "updated"}
-        self._write_file("1007", [
-            {"type": "suppress", "track": "by_src", "ip": "1.2.3.4"},                # ADD
-            {"type": "suppress", "track": "by_src", "ip": "9.9.9.9"},                # SKIP (dupe of existing)
-            {"type": "suppress", "track": "bogus",  "ip": "1.2.3.4"},                # INVALID
-            {"type": "suppress", "track": "by_src", "ip": "$CONCOURSEWORKERS"},      # ADD + custom var
-        ])
-        out, _, code = self._run_main()
-        self.assertEqual(code, 1)  # one invalid -> non-zero
-
-        mock_update.assert_called_once()
-        merged = mock_update.call_args[0][3]
-        self.assertEqual(len(merged), 3)  # 1 existing + 2 new
-        new_notes = [o.get("note", "") for o in merged if o.get("ip") in ("1.2.3.4", "$CONCOURSEWORKERS")]
-        self.assertTrue(all(n.startswith("[Imported ") for n in new_notes))
-
-        self.assertIn("ADD", out)
-        self.assertIn("SKIP", out)
-        self.assertIn("INVALID", out)
-        self.assertIn("UPDATED", out)
-        self.assertIn("$CONCOURSEWORKERS", out)
-
-    @patch.object(soi, "update_overrides")
-    @patch.object(soi, "find_detection")
-    def test_no_import_note_preserves_note(self, mock_find, mock_update):
-        mock_find.return_value = ("doc1", "so-detection", [])
-        mock_update.return_value = {"result": "updated"}
-        self._write_file("1008", [
-            {"type": "suppress", "track": "by_src", "ip": "1.2.3.4", "note": "original"},
-        ])
-        _, _, code = self._run_main("--no-import-note")
-        self.assertEqual(code, 0)
-        merged = mock_update.call_args[0][3]
-        self.assertEqual(merged[0]["note"], "original")  # no prefix applied
-
-    @patch.object(soi, "find_detection")
-    def test_dry_run_skips_update(self, mock_find):
-        mock_find.return_value = ("doc1", "so-detection", [])
-        self._write_file("1009", [{"type": "suppress", "track": "by_src", "ip": "1.2.3.4"}])
-        with patch.object(soi, "update_overrides") as mock_update:
-            out, _, code = self._run_main("--dry-run")
-        self.assertEqual(code, 0)
-        mock_update.assert_not_called()
-        self.assertIn("DRY-RUN: would update", out)
-
-    @patch.object(soi, "update_overrides")
-    @patch.object(soi, "find_detection")
-    def test_update_http_error(self, mock_find, mock_update):
-        mock_find.return_value = ("doc1", "so-detection", [])
-        mock_update.side_effect = requests.HTTPError("nope")
-        self._write_file("1010", [{"type": "suppress", "track": "by_src", "ip": "1.2.3.4"}])
-        out, _, code = self._run_main()
-        self.assertEqual(code, 1)
-        self.assertIn("update failed", out)
-
-
-if __name__ == "__main__":
-    unittest.main()
@@ -314,6 +314,24 @@ EOSQL
 	fi
 }

+function sync_minion_config_to_db() {
+	log "INFO" "Syncing minion config to onionconfig for $MINION_ID"
+	/usr/sbin/so-config.py import-minion "$MINION_ID" --note "so-minion $OPERATION"
+	if [ $? -ne 0 ]; then
+		log "ERROR" "Failed to sync minion config to onionconfig for $MINION_ID"
+		return 1
+	fi
+}
+
+function purge_minion_config_from_db() {
+	log "INFO" "Purging minion config from onionconfig for $MINION_ID"
+	/usr/sbin/so-config.py purge-node "$MINION_ID" --note "so-minion delete"
+	if [ $? -ne 0 ]; then
+		log "ERROR" "Failed to purge minion config from onionconfig for $MINION_ID"
+		return 1
+	fi
+}
+
 # Create the minion file
 function ensure_socore_ownership() {
 	log "INFO" "Setting socore ownership on minion files"
@@ -1088,6 +1106,10 @@ case "$OPERATION" in
 			log "ERROR" "Failed to setup minion files for $MINION_ID"
 			exit 1
 		}
+		sync_minion_config_to_db || {
+			log "ERROR" "Failed to sync minion config to onionconfig for $MINION_ID"
+			exit 1
+		}
 		updateMineAndApplyStates || {
 			log "ERROR" "Failed to update mine and apply states for $MINION_ID"
 			exit 1
@@ -1108,12 +1130,20 @@ case "$OPERATION" in
 			log "ERROR" "Failed to setup VM minion files for $MINION_ID"
 			exit 1
 		}
+		sync_minion_config_to_db || {
+			log "ERROR" "Failed to sync VM minion config to onionconfig for $MINION_ID"
+			exit 1
+		}
 		log "INFO" "Successfully added VM minion $MINION_ID"
 		;;

 	"delete")
 		log "INFO" "Removing minion $MINION_ID"
 		remove_postgres_telegraf_from_minion
+		purge_minion_config_from_db || {
+			log "ERROR" "Failed to purge minion config from onionconfig for $MINION_ID"
+			exit 1
+		}
 		deleteMinionFiles || {
 			log "ERROR" "Failed to delete minion files for $MINION_ID"
 			exit 1
@@ -1,232 +0,0 @@
-#!/opt/saltstack/salt/bin/python3
-
-# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
-# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
-# https://securityonion.net/license; you may not use this file except in compliance with the
-# Elastic License 2.0.
-
-"""
-so-push-drainer
-===============
-
-Scheduled drainer for the active-push feature. Runs on the manager every
-drain_interval seconds (default 15) via a salt schedule in salt/schedule.sls.
-
-For each intent file under /opt/so/state/push_pending/*.json whose last_touch
-is older than debounce_seconds, this script:
-  * concatenates the actions lists from every ready intent
-  * dedupes by (state or __highstate__, tgt, tgt_type)
-  * dispatches a single `salt-run state.orchestrate orch.push_batch --async`
-    with the deduped actions list passed as pillar kwargs
-  * deletes the contributed intent files on successful dispatch
-
-Reactor sls files (push_suricata, push_strelka, push_pillar) write intents
-but never dispatch directly -- see plan
-/home/mreeves/.claude/plans/goofy-marinating-hummingbird.md for the full design.
-"""
-
-import fcntl
-import glob
-import json
-import logging
-import logging.handlers
-import os
-import subprocess
-import sys
-import time
-
-import salt.client
-
-PENDING_DIR = '/opt/so/state/push_pending'
-LOCK_FILE = os.path.join(PENDING_DIR, '.lock')
-LOG_FILE = '/opt/so/log/salt/so-push-drainer.log'
-
-HIGHSTATE_SENTINEL = '__highstate__'
-
-
-def _make_logger():
-    logger = logging.getLogger('so-push-drainer')
-    logger.setLevel(logging.INFO)
-    if not logger.handlers:
-        os.makedirs(os.path.dirname(LOG_FILE), exist_ok=True)
-        handler = logging.handlers.RotatingFileHandler(
-            LOG_FILE, maxBytes=5 * 1024 * 1024, backupCount=3,
-        )
-        handler.setFormatter(logging.Formatter(
-            '%(asctime)s | %(levelname)s | %(message)s',
-        ))
-        logger.addHandler(handler)
-    return logger
-
-
-def _load_push_cfg():
-    """Read the global:push pillar subtree via salt-call. Returns a dict."""
-    caller = salt.client.Caller()
-    cfg = caller.cmd('pillar.get', 'global:push', {})
-    return cfg if isinstance(cfg, dict) else {}
-
-
-def _read_intent(path, log):
-    try:
-        with open(path, 'r') as f:
-            return json.load(f)
-    except (IOError, ValueError) as exc:
-        log.warning('cannot read intent %s: %s', path, exc)
-        return None
-    except Exception:
-        log.exception('unexpected error reading %s', path)
-        return None
-
-
-def _dedupe_actions(actions):
-    seen = set()
-    deduped = []
-    for action in actions:
-        if not isinstance(action, dict):
-            continue
-        state_key = HIGHSTATE_SENTINEL if action.get('highstate') else action.get('state')
-        tgt = action.get('tgt')
-        tgt_type = action.get('tgt_type', 'compound')
-        if not state_key or not tgt:
-            continue
-        key = (state_key, tgt, tgt_type)
-        if key in seen:
-            continue
-        seen.add(key)
-        deduped.append(action)
-    return deduped
-
-
-def _dispatch(actions, log):
-    pillar_arg = json.dumps({'actions': actions})
-    cmd = [
-        'salt-run',
-        'state.orchestrate',
-        'orch.push_batch',
-        'pillar={}'.format(pillar_arg),
-        '--async',
-    ]
-    log.info('dispatching: %s', ' '.join(cmd[:3]) + ' pillar=<{} actions>'.format(len(actions)))
-    try:
-        result = subprocess.run(
-            cmd, check=True, capture_output=True, text=True, timeout=60,
-        )
-    except subprocess.CalledProcessError as exc:
-        log.error('dispatch failed (rc=%s): stdout=%s stderr=%s',
-                  exc.returncode, exc.stdout, exc.stderr)
-        return False
-    except subprocess.TimeoutExpired:
-        log.error('dispatch timed out after 60s')
-        return False
-    except Exception:
-        log.exception('dispatch raised')
-        return False
-    log.info('dispatch accepted: %s', (result.stdout or '').strip())
-    return True
-
-
-def main():
-    log = _make_logger()
-
-    if not os.path.isdir(PENDING_DIR):
-        # Nothing to do; reactors create the dir on first use.
-        return 0
-
-    try:
-        push = _load_push_cfg()
-    except Exception:
-        log.exception('failed to read global:push pillar; aborting drain pass')
-        return 1
-
-    if not push.get('enabled', True):
-        log.debug('push disabled; exiting')
-        return 0
-
-    debounce_seconds = int(push.get('debounce_seconds', 30))
-
-    os.makedirs(PENDING_DIR, exist_ok=True)
-    lock_fd = os.open(LOCK_FILE, os.O_CREAT | os.O_RDWR, 0o644)
-    try:
-        fcntl.flock(lock_fd, fcntl.LOCK_EX)
-
-        intent_files = [
-            p for p in sorted(glob.glob(os.path.join(PENDING_DIR, '*.json')))
-            if os.path.basename(p) != '.lock'
-        ]
-        if not intent_files:
-            return 0
-
-        now = time.time()
-        ready = []
-        skipped = 0
-        broken = []
-        for path in intent_files:
-            intent = _read_intent(path, log)
-            if not isinstance(intent, dict):
-                broken.append(path)
-                continue
-            last_touch = intent.get('last_touch', 0)
-            if now - last_touch < debounce_seconds:
-                skipped += 1
-                continue
-            ready.append((path, intent))
-
-        for path in broken:
-            try:
-                os.unlink(path)
-            except OSError:
-                pass
-
-        if not ready:
-            if skipped:
-                log.debug('no ready intents (%d still in debounce window)', skipped)
-            return 0
-
-        combined_actions = []
-        oldest_first_touch = now
-        all_paths = []
-        for path, intent in ready:
-            combined_actions.extend(intent.get('actions', []) or [])
-            first = intent.get('first_touch', now)
-            if first < oldest_first_touch:
-                oldest_first_touch = first
-            all_paths.extend(intent.get('paths', []) or [])
-
-        deduped = _dedupe_actions(combined_actions)
-        if not deduped:
-            log.warning('%d intent(s) had no usable actions; clearing', len(ready))
-            for path, _ in ready:
-                try:
-                    os.unlink(path)
-                except OSError:
-                    pass
-            return 0
-
-        debounce_duration = now - oldest_first_touch
-        log.info(
-            'draining %d intent(s): %d action(s) after dedupe (raw=%d), '
-            'debounce_duration=%.1fs, paths=%s',
-            len(ready), len(deduped), len(combined_actions),
-            debounce_duration, all_paths[:20],
-        )
-
-        if not _dispatch(deduped, log):
-            log.warning('dispatch failed; leaving intent files in place for retry')
-            return 1
-
-        for path, _ in ready:
-            try:
-                os.unlink(path)
-            except OSError:
-                log.exception('failed to remove drained intent %s', path)
-
-        return 0
-    finally:
-        try:
-            fcntl.flock(lock_fd, fcntl.LOCK_UN)
-        finally:
-            os.close(lock_fd)
-
-
-if __name__ == '__main__':
-    sys.exit(main())
@@ -25,6 +25,7 @@ def showUsage(args):
    print('    get [-r]         - Displays (to stdout) the value stored in the given key. Requires KEY arg. Use -r for raw output without YAML formatting.', file=sys.stderr)
    print('    remove           - Removes a yaml key, if it exists. Requires KEY arg.', file=sys.stderr)
    print('    replace          - Replaces (or adds) a new key and set its value. Requires KEY and VALUE args.', file=sys.stderr)
+    print('    purge            - Deletes the YAML file from disk. Takes no KEY arg.', file=sys.stderr)
    print('    help             - Prints this usage information.', file=sys.stderr)
    print('', file=sys.stderr)
    print('  Where:', file=sys.stderr)
@@ -53,7 +54,20 @@ def loadYaml(filename):

 def writeYaml(filename, content):
    file = open(filename, "w")
-    return yaml.safe_dump(content, file)
+    with file:
+        result = yaml.safe_dump(content, file)
+    return result
+
+
+def purgeFile(filename):
+    """Delete a YAML file from disk. Idempotent; missing files are success."""
+    if os.path.exists(filename):
+        try:
+            os.remove(filename)
+        except OSError as e:
+            print(f"Failed to remove {filename}: {e}", file=sys.stderr)
+            return 1
+    return 0


 def appendItem(content, key, listItem):
@@ -371,6 +385,15 @@ def get(args):
    return 0


+def purge(args):
+    """purge YAML_FILE - delete the file from disk."""
+    if len(args) != 1:
+        print('Missing filename arg', file=sys.stderr)
+        showUsage(None)
+        return 1
+    return purgeFile(args[0])
+
+
 def main():
    args = sys.argv[1:]

@@ -388,6 +411,7 @@ def main():
        "get": get,
        "remove": remove,
        "replace": replace,
+        "purge": purge,
    }

    code = 1
@@ -991,3 +991,31 @@ class TestLoadYaml(unittest.TestCase):
                    soyaml.loadYaml("/tmp/so-yaml_test-unreadable.yaml")
                    sysmock.assert_called_with(1)
                    self.assertIn("Error reading file", mock_stderr.getvalue())
+
+
+class TestPurge(unittest.TestCase):
+
+    def test_purge_missing_arg(self):
+        # showUsage calls sys.exit(1); patch it like the other tests do.
+        with patch('sys.exit', new=MagicMock()):
+            with patch('sys.stderr', new=StringIO()) as mock_stderr:
+                rc = soyaml.purge([])
+                self.assertEqual(rc, 1)
+                self.assertIn("Missing filename", mock_stderr.getvalue())
+
+    def test_purge_existing_file(self):
+        filename = "/tmp/so-yaml_test_purge.yaml"
+        with open(filename, "w") as f:
+            f.write("key: value\n")
+        rc = soyaml.purge([filename])
+        self.assertEqual(rc, 0)
+        import os as _os
+        self.assertFalse(_os.path.exists(filename))
+
+    def test_purge_missing_file_idempotent(self):
+        filename = "/tmp/so-yaml_test_purge_missing.yaml"
+        import os as _os
+        if _os.path.exists(filename):
+            _os.remove(filename)
+        rc = soyaml.purge([filename])
+        self.assertEqual(rc, 0)
@@ -188,6 +188,13 @@ airgap_update_dockers() {
  fi
 }

+backup_old_states_pillars() {
+
+	tar czf "/nsm/backup/${INSTALLEDVERSION}_$(date +%Y%m%d-%H%M%S)_soup_default_states_pillars.tar.gz" /opt/so/saltstack/default/
+	tar czf "/nsm/backup/${INSTALLEDVERSION}_$(date +%Y%m%d-%H%M%S)_soup_local_states_pillars.tar.gz" /opt/so/saltstack/local/
+
+}
+
 update_registry() {
  docker stop so-dockerregistry
  docker rm so-dockerregistry
@@ -343,11 +350,10 @@ highstate() {
 masterlock() {
  echo "Locking Salt Master"
  mv -v $TOPFILE $BACKUPTOPFILE
-  # Render the real top file only for the host running soup; every other
-  # minion gets an empty top (no states) while the master is upgrading.
-  echo "{% if grains['id'] == '$MINIONID' %}" > $TOPFILE
-  cat $BACKUPTOPFILE >> $TOPFILE
-  echo "{% endif %}" >> $TOPFILE
+  echo "base:" > $TOPFILE
+  echo "  $MINIONID:" >> $TOPFILE
+  echo "    - ca" >> $TOPFILE
+  echo "    - elasticsearch" >> $TOPFILE
 }

 masterunlock() {
@@ -366,7 +372,6 @@ preupgrade_changes() {

    [[ "$INSTALLEDVERSION" =~ ^2\.4\.21[0-9]+$ ]] && up_to_3.0.0   
    [[ "$INSTALLEDVERSION" == "3.0.0" ]] && up_to_3.1.0
-    [[ "$INSTALLEDVERSION" == "3.1.0" ]] && up_to_3.2.0
    true
 }

@@ -376,7 +381,6 @@ postupgrade_changes() {

    [[ "$POSTVERSION" =~ ^2\.4\.21[0-9]+$ ]] && post_to_3.0.0
    [[ "$POSTVERSION" == "3.0.0" ]] && post_to_3.1.0
-    [[ "$POSTVERSION" == "3.1.0" ]] && post_to_3.2.0
    true
 }

@@ -481,158 +485,6 @@ elasticsearch_backup_index_templates() {
  tar -czf /nsm/backup/3.0.0_elasticsearch_index_templates.tar.gz -C /opt/so/conf/elasticsearch/templates/index/ .
 }

-elasticfleet_set_agent_logging_level_warn() {
-    . /usr/sbin/so-elastic-fleet-common
-
-    local current_agent_policies
-    if ! current_agent_policies=$(fleet_api "agent_policies?perPage=1000"); then
-        echo "Warning: unable to retrieve Fleet agent policies"
-        return 0
-    fi
-
-    # Only updating policies that are within Security Onion defaults and do not already have any user configured advanced_settings.
-    local policies_to_update
-    policies_to_update=$(jq -c '
-        .items[]
-        | select(has("advanced_settings") | not)
-        | select(
-            .id == "so-grid-nodes_general"
-            or .id == "so-grid-nodes_heavy"
-            or .id == "endpoints-initial"
-            or (.id | startswith("FleetServer_"))
-          )
-    ' <<< "$current_agent_policies")
-
-    if [[ -z "$policies_to_update" ]]; then
-        return 0
-    fi
-
-    while IFS= read -r policy; do
-        [[ -z "$policy" ]] && continue
-
-        local policy_id policy_name policy_namespace
-        policy_id=$(jq -r '.id' <<< "$policy")
-        policy_name=$(jq -r '.name' <<< "$policy")
-        policy_namespace=$(jq -r '.namespace' <<< "$policy")
-
-        local update_logging
-        update_logging=$(jq -n \
-            --arg name "$policy_name" \
-            --arg namespace "$policy_namespace" \
-            '{name: $name, namespace: $namespace, advanced_settings: {agent_logging_level: "warning"}}'
-        )
-
-        echo "Setting elastic agent_logging_level to warning on policy '$policy_name' ($policy_id)."
-        if ! fleet_api "agent_policies/$policy_id" -XPUT -H 'kbn-xsrf: true' -H 'Content-Type: application/json' -d "$update_logging" >/dev/null; then
-            echo "  warning: failed to update agent policy '$policy_name' ($policy_id)" >&2
-        fi
-    done <<< "$policies_to_update"
-}
-
-update_logstash_pipeline_name() {
-    local original_pipeline_name="$1"
-    local new_pipeline_name="$2"
-
-    echo "Checking for conflicting logstash defined_pipelines pillar value."
-    local LOGSTASH_FILE=/opt/so/saltstack/local/pillar/logstash/soc_logstash.sls
-    local MINIONDIR=/opt/so/saltstack/local/pillar/minions
-    for pillar_file in "$LOGSTASH_FILE" "$MINIONDIR"/*.sls; do
-        [[ -f "$pillar_file" ]] || continue
-        if grep -q "$original_pipeline_name$" "$pillar_file"; then
-            echo "Found conflicting defined_pipeline pillar value in $pillar_file. Updating to use the new logstash pipeline name."
-            sed -i "s#$original_pipeline_name\$#$new_pipeline_name#g" "$pillar_file"
-            chown socore:socore "$pillar_file"
-        fi
-    done
-}
-
-check_transform_health_and_reauthorize() {
-    . /usr/sbin/so-elastic-fleet-common
-
-    echo "Checking integration transform jobs for unhealthy / unauthorized status..."
-
-    local transforms_doc stats_doc installed_doc
-    if ! transforms_doc=$(so-elasticsearch-query "_transform/_all?size=1000" --fail --retry 3 --retry-delay 5 2>/dev/null); then
-        echo "Unable to query for transform jobs, skipping reauthorization."
-        return 0
-    fi
-    if ! stats_doc=$(so-elasticsearch-query "_transform/_all/_stats?size=1000" --fail --retry 3 --retry-delay 5 2>/dev/null); then
-        echo "Unable to query for transform job stats, skipping reauthorization."
-        return 0
-    fi
-    if ! installed_doc=$(fleet_api "epm/packages/installed?perPage=500"); then
-        echo "Unable to list installed Fleet packages, skipping reauthorization."
-        return 0
-    fi
-
-    # Get all transforms that meet the following
-    # - unhealthy (any non-green health status)
-    # - metadata has run_as_kibana_system: false (this fix is specific to transforms started prior to Kibana 9.3.3)
-    # - are not orphaned (integration is not somehow missing/corrupt/uninstalled)
-    local tmp_transforms tmp_stats tmp_installed
-    tmp_transforms=$(mktemp)
-    tmp_stats=$(mktemp)
-    tmp_installed=$(mktemp)
-
-    echo "$transforms_doc" > "$tmp_transforms"
-    echo "$stats_doc"      > "$tmp_stats"
-    echo "$installed_doc"  > "$tmp_installed"
-
-    local unhealthy_transforms
-    unhealthy_transforms=$(jq -c -n \
-        --slurpfile t "$tmp_transforms" \
-        --slurpfile s "$tmp_stats" \
-        --slurpfile i "$tmp_installed" '
-        ($i[0].items | map({key: .name, value: .version}) | from_entries) as $pkg_ver
-        | ($s[0].transforms | map({key: .id, value: .health.status}) | from_entries) as $health
-        | [ $t[0].transforms[]
-            | select(._meta.run_as_kibana_system == false)
-            | select(($health[.id] // "unknown") != "green")
-            | {id, pkg: ._meta.package.name, ver: ($pkg_ver[._meta.package.name])}
-          ]
-        | if length == 0 then empty else . end
-        | (map(select(.ver == null)) | map({orphan: .id})[]),
-          (map(select(.ver != null))
-           | group_by(.pkg)
-           | map({pkg: .[0].pkg, ver: .[0].ver, transformIds: map(.id)})[])
-    ')
-
-    if [[ -z "$unhealthy_transforms" ]]; then
-        return 0
-    fi
-
-    local unhealthy_count
-    unhealthy_count=$(jq -s '[.[].transformIds? // empty | .[]] | length' <<< "$unhealthy_transforms")
-    echo "Found $unhealthy_count transform(s) needing reauthorization."
-
-    local total_failures=0
-    while IFS= read -r transform; do
-        [[ -z "$transform" ]] && continue
-        if jq -e 'has("orphan")' <<< "$transform" >/dev/null 2>&1; then
-            echo "Skipping transform not owned by any installed Fleet package: $(jq -r '.orphan' <<< "$transform")"
-            continue
-        fi
-
-        local pkg ver body resp
-        pkg=$(jq -r '.pkg' <<< "$transform")
-        ver=$(jq -r '.ver' <<< "$transform")
-        body=$(jq -c '{transforms: (.transformIds | map({transformId: .}))}' <<< "$transform")
-
-        echo "Reauthorizing transform(s) for ${pkg}-${ver}..."
-        resp=$(fleet_api "epm/packages/${pkg}/${ver}/transforms/authorize" \
-                        -XPOST -H 'kbn-xsrf: true' -H 'Content-Type: application/json' \
-                        -d "$body") || { echo "Could not reauthorize transform(s) for ${pkg}-${ver}"; continue; }
-
-        (( total_failures += $(jq 'map(select(.success != true)) | length' <<< "$resp" 2>/dev/null) ))
-    done <<< "$unhealthy_transforms"
-
-    rm -f "$tmp_transforms" "$tmp_stats" "$tmp_installed"
-
-    if [[ "$total_failures" -gt 0 ]]; then
-        echo "Some transform(s) failed to reauthorize."
-    fi
-}
-
 ensure_postgres_local_pillar() {
  # Postgres was added as a service after 3.0.0, so the new pillar/top.sls
  # references postgres.soc_postgres / postgres.adv_postgres unconditionally.
@@ -668,31 +520,6 @@ ensure_postgres_secret() {
  chown socore:socore "$secrets_file"
 }

-rename_strelka_scan_lnk() {
-  echo "Renaming strelka pillar ScanLNK to ScanLnk."
-  local STRELKA_FILE=/opt/so/saltstack/local/pillar/strelka/soc_strelka.sls
-  local MINIONDIR=/opt/so/saltstack/local/pillar/minions
-  local OLD_KEY=strelka.backend.config.backend.scanners.ScanLNK
-  local NEW_KEY=strelka.backend.config.backend.scanners.ScanLnk
-  local TMP_VALUE_FILE
-  TMP_VALUE_FILE=$(mktemp)
-
-  for pillar_file in "$STRELKA_FILE" "$MINIONDIR"/*.sls; do
-    [[ -f "$pillar_file" ]] || continue
-    # Skip if ScanLNK doesn't exist
-    so-yaml.py get "$pillar_file" "$OLD_KEY" > "$TMP_VALUE_FILE" 2>/dev/null || continue
-    echo "Found 'ScanLNK' key in $pillar_file. Renaming to 'ScanLnk'."
-    so-yaml.py add "$pillar_file" "$NEW_KEY" "file:$TMP_VALUE_FILE"
-    so-yaml.py remove "$pillar_file" "$OLD_KEY"
-  done
-
-  rm -f "$TMP_VALUE_FILE"
-}
-
-fix_logstash_0013_lumberjack_pipeline_name() {
-    update_logstash_pipeline_name "so/0013_input_lumberjack_fleet.conf" "so/0013_input_lumberjack_fleet.conf.jinja"
-}
-
 up_to_3.1.0() {
  ensure_postgres_local_pillar
  ensure_postgres_secret
@@ -700,8 +527,7 @@ up_to_3.1.0() {
  elasticsearch_backup_index_templates
  # Clear existing component template state file.
  rm -f /opt/so/state/esfleet_component_templates.json
-  rename_strelka_scan_lnk
-  fix_logstash_0013_lumberjack_pipeline_name
+

  INSTALLEDVERSION=3.1.0
 }
@@ -727,59 +553,11 @@ post_to_3.1.0() {
  # file_roots of its own and --local would fail with "No matching sls found".
  salt-call state.apply postgres.telegraf_users queue=True || true

-  # Update default agent policies to use logging level warn.
-  elasticfleet_set_agent_logging_level_warn || true
-
-  # Check for unhealthy / unauthorized integration transform jobs and attempt reauthorizations
-  check_transform_health_and_reauthorize || true
-
  POSTVERSION=3.1.0
 }

 ### 3.1.0 End ###

-### 3.2.0 Scripts ###
-
-bootstrap_so_soc_database() {
-  # init-db.sh is mounted into so-postgres at /docker-entrypoint-initdb.d/init-db.sh
-  # and runs automatically only on a fresh data directory. Hosts upgrading from
-  # 3.1.0 already have /nsm/postgres populated, so the so_soc bootstrap block
-  # added in 3.2 never fires. Re-run the script explicitly; it's idempotent.
-  echo "Bootstrapping so_soc database via init-db.sh."
-  # The postgres image has no USER directive, so `docker exec` defaults to
-  # root, and the container env intentionally omits POSTGRES_USER (the upstream
-  # entrypoint defaults it transiently during first-init only). Recreate both
-  # so psql inside init-db.sh resolves the connect user correctly.
-  local exec_cmd="docker exec -u postgres -e POSTGRES_USER=postgres so-postgres bash /docker-entrypoint-initdb.d/init-db.sh"
-  if ! /usr/sbin/so-postgres-wait; then
-    FINAL_MESSAGE_QUEUE+=("WARNING: so-postgres was not ready during the 3.2.0 upgrade; the so_soc database may not have been bootstrapped. Re-run manually: $exec_cmd")
-    return 0
-  fi
-  if ! $exec_cmd; then
-    FINAL_MESSAGE_QUEUE+=("WARNING: init-db.sh failed inside so-postgres during the 3.2.0 upgrade; the so_soc database may not have been bootstrapped. Re-run manually: $exec_cmd")
-    return 0
-  fi
-  echo "so_soc bootstrap complete."
-}
-
-up_to_3.2.0() {
-  fix_logstash_0013_lumberjack_pipeline_name
-
-  INSTALLEDVERSION=3.2.0
-}
-
-post_to_3.2.0() {
-  bootstrap_so_soc_database
-
-  # Including agent regen script here since it was missed in post_to_3.1.0
-  echo "Regenerating Elastic Agent Installers"
-  /sbin/so-elastic-agent-gen-installers
-
-  POSTVERSION=3.2.0
-}
-
-### 3.2.0 End ###
-

 repo_sync() {
  echo "Sync the local repo."
@@ -1031,9 +809,6 @@ verify_es_version_compatibility() {
    local is_active_intermediate_upgrade=1
    # supported upgrade paths for SO-ES versions
    declare -A es_upgrade_map=(
-        ["8.18.4"]="8.18.6 8.18.8 9.0.8"
-	    ["8.18.6"]="8.18.8 9.0.8"
-	    ["8.18.8"]="9.0.8"
        ["9.0.8"]="9.3.3"
    )

@@ -1057,171 +832,6 @@ verify_es_version_compatibility() {
        exit 160
    fi

-    compatible_es_versions="$target_es_version"
-    for current_version in "${!es_upgrade_map[@]}"; do
-        # shellcheck disable=SC2076
-        if [[ " ${es_upgrade_map[$current_version]} " =~ " $target_es_version " ]]; then
-            compatible_es_versions+=" $current_version"
-        fi
-    done
-
-    # Check if the given ES version can directly upgrade to the target ES version. Used to assist with catching lagging nodes during the upgrade process
-    es_version_can_upgrade_to_target() {
-        local current_version="$1"
-        # shellcheck disable=SC2076
-        if [[ -n "$current_version" && " $compatible_es_versions " =~ " $current_version " ]]; then
-            return 0
-        fi
-
-        return 1
-    }
-
-    # Gather Elasticsearch cluster version info and verify that each node in the cluster is running a version compatible with the target ES version.
-    verify_searchnodes_es_target_compatibility() {
-        local retries=20
-        local retry_count=0
-        local delay=180
-        local expected_es_nodes searchnode_minions attempt
-        local searchnode_discovery_success=false
-        SEARCHNODE_ES_VERSIONS=""
-
-        for attempt in {1..3}; do
-            if searchnode_minions=$(set -o pipefail; salt-key --out=json --list=accepted 2> /dev/null | jq -r '.minions[]? | select(endswith("searchnode"))'); then
-                searchnode_discovery_success=true
-                break
-            fi
-
-            echo "Failed to retrieve grid searchnodes via salt-key... Retrying in 30 seconds. Attempt $attempt of 3."
-            sleep 30
-        done
-
-        if [[ "$searchnode_discovery_success" != "true" ]]; then
-            echo "Failed to retrieve grid searchnodes via salt-key."
-            return 1
-        fi
-
-        # Always add node running soup to expected es nodes
-        expected_es_nodes="${MINIONID%_*}"
-        while IFS= read -r searchnode_minion; do
-            [[ -z "$searchnode_minion" ]] && continue
-            expected_es_nodes+=$'\n'"${searchnode_minion%_searchnode}"
-        done <<< "$searchnode_minions"
-
-        while [[ $retry_count -lt $retries ]]; do
-            SEARCHNODE_ES_VERSIONS=$(so-elasticsearch-query _nodes/_all/version --retry 5 --retry-delay 10 --fail 2>&1)
-            local exit_status=$?
-
-            if [[ $exit_status -ne 0 ]]; then
-                echo "Failed to retrieve Elasticsearch versions from searchnodes... Retrying in $delay seconds. Attempt $((retry_count + 1)) of $retries."
-                ((retry_count++))
-                sleep $delay
-                continue
-            fi
-
-            local all_searchnodes_compatible=true
-            while IFS=$'\t' read -r node current_version; do
-                [[ -z "$node" ]] && continue
-                if ! es_version_can_upgrade_to_target "$current_version"; then
-                    echo "Searchnode $node is running Elasticsearch $current_version, which is not directly upgradable to Elasticsearch $target_es_version."
-                    all_searchnodes_compatible=false
-                fi
-            done < <(echo "$SEARCHNODE_ES_VERSIONS" | jq -r '.nodes | to_entries[] | [.value.name, .value.version] | @tsv')
-
-            while IFS= read -r expected_es_node; do
-                [[ -z "$expected_es_node" ]] && continue
-                if ! echo "$SEARCHNODE_ES_VERSIONS" | jq -e --arg node "$expected_es_node" '.nodes | to_entries | any(.value.name == $node)' > /dev/null; then
-                    echo "Searchnode $expected_es_node did not report an Elasticsearch version. It may be offline or still upgrading."
-                    all_searchnodes_compatible=false
-                fi
-            done <<< "$expected_es_nodes"
-
-            if [[ "$all_searchnodes_compatible" == true ]]; then
-                echo "All Searchnodes are upgradable to Elasticsearch $target_es_version."
-                return 0
-            fi
-
-            echo "One or more Searchnodes cannot upgrade directly to Elasticsearch $target_es_version. Rechecking in $delay seconds. Attempt $((retry_count + 1)) of $retries."
-            ((retry_count++))
-            sleep $delay
-        done
-
-        return 1
-    }
-
-    # Gather heavynode version info and verify that each node is running a version compatible with the target ES version.
-    verify_heavynodes_es_target_compatibility() {
-        local heavynode_minions attempt
-        local retries=20
-        local retry_count=0
-        local delay=180
-        local heavynode_discovery_success=false
-        HEAVYNODE_ES_VERSIONS=""
-
-        for attempt in {1..3}; do
-            if heavynode_minions=$(set -o pipefail; salt-key --out=json --list=accepted 2> /dev/null | jq -r '.minions[]? | select(endswith("heavynode"))'); then
-                heavynode_discovery_success=true
-                break
-            fi
-
-            echo "Failed to retrieve grid heavynodes via salt-key... Retrying in 30 seconds. Attempt $attempt of 3."
-            sleep 30
-        done
-
-        if [[ "$heavynode_discovery_success" != "true" ]]; then
-            echo "Failed to retrieve grid heavynodes via salt-key."
-            return 1
-        fi
-
-        if [[ -z "$heavynode_minions" ]]; then
-            echo "No heavynodes detected. Skipping heavynode Elasticsearch version compatibility check."
-            return 0
-        fi
-
-        while [[ $retry_count -lt $retries ]]; do
-            HEAVYNODE_ES_VERSIONS=$(salt -C 'G@role:so-heavynode' cmd.run 'set -o pipefail; so-elasticsearch-query / --retry 5 --retry-delay 10 | jq -er ".version.number"' shell=/bin/bash --out=json 2> /dev/null)
-            local exit_status=$?
-
-            if [[ $exit_status -ne 0 ]]; then
-                echo "Failed to retrieve Elasticsearch version from one or more heavynodes... Retrying in $delay seconds. Attempt $((retry_count + 1)) of $retries."
-                ((retry_count++))
-                sleep $delay
-                continue
-            fi
-
-            local all_heavynodes_compatible=true
-            while IFS=$'\t' read -r node current_version; do
-                [[ -z "$node" ]] && continue
-                if ! es_version_can_upgrade_to_target "$current_version"; then
-                    echo "Heavynode $node is running Elasticsearch $current_version, which is not directly upgradable to Elasticsearch $target_es_version."
-                    all_heavynodes_compatible=false
-                fi
-            done < <(echo "$HEAVYNODE_ES_VERSIONS" | jq -r 'to_entries[] | [.key, .value] | @tsv')
-
-            while IFS= read -r heavynode_minion; do
-                [[ -z "$heavynode_minion" ]] && continue
-                if ! echo "$HEAVYNODE_ES_VERSIONS" | jq -se --arg minion "$heavynode_minion" 'add | has($minion)' > /dev/null; then
-                    echo "Heavynode $heavynode_minion did not report an Elasticsearch version. It may be offline or still upgrading."
-                    all_heavynodes_compatible=false
-                fi
-            done <<< "$heavynode_minions"
-
-            if [[ "$all_heavynodes_compatible" == true ]]; then
-                echo -e "\nAll heavynodes can upgrade to Elasticsearch $target_es_version."
-                return 0
-            fi
-
-            echo "One or more heavynodes cannot upgrade directly to Elasticsearch $target_es_version. Rechecking in $delay seconds. Attempt $((retry_count + 1)) of $retries."
-            ((retry_count++))
-            sleep $delay
-        done
-
-        return 1
-    }
-
-    if [[ ! -f "$es_verification_script" ]]; then
-        create_intermediate_upgrade_verification_script "$es_verification_script"
-    fi
-
    for statefile in "${es_required_version_statefile_base}"-*; do
        [[ -f $statefile ]] || continue

@@ -1240,6 +850,10 @@ verify_es_version_compatibility() {
            continue
        fi

+        if [[ ! -f "$es_verification_script" ]]; then
+            create_intermediate_upgrade_verification_script "$es_verification_script"
+        fi
+
        echo -e "\n##############################################################################################################################\n"
        echo "A previously required intermediate Elasticsearch upgrade was detected. Verifying that all Searchnodes/Heavynodes have successfully upgraded Elasticsearch to $es_required_version_statefile_value before proceeding with soup to avoid potential data loss! This command can take up to an hour to complete."
        if ! timeout --foreground 4000 bash "$es_verification_script" "$es_required_version_statefile_value" "$statefile"; then
@@ -1261,26 +875,6 @@ verify_es_version_compatibility() {

    # shellcheck disable=SC2076 # Do not want a regex here eg usage " 8.18.8 9.0.8 " =~ " 9.0.8 "
    if [[ " ${es_upgrade_map[$es_version]} " =~ " $target_es_version " || "$es_version" == "$target_es_version" ]]; then
-        if ! verify_searchnodes_es_target_compatibility || ! verify_heavynodes_es_target_compatibility; then
-            echo -e "\n!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!\n"
-
-            echo "One or more Searchnode(s)/Heavynode(s) cannot upgrade directly to Elasticsearch $target_es_version. This can happen with soups that include Elasticsearch upgrades being run in quick succession. Typically, this will resolve itself as the grid synchronizes. Please allow time for all Searchnodes/Heavynodes to have upgraded Elasticsearch to a compatible version with $target_es_version before running soup again to avoid potential data loss!"
-
-            if [[ -n "$HEAVYNODE_ES_VERSIONS" ]]; then
-                echo "Current heavynode Elasticsearch versions:"
-                echo "$HEAVYNODE_ES_VERSIONS" | jq '.'
-            fi
-
-            if [[ -n "$SEARCHNODE_ES_VERSIONS" ]]; then
-                echo "Current searchnode Elasticsearch versions:"
-                echo "$SEARCHNODE_ES_VERSIONS" | jq '.nodes | to_entries | map({(.value.name): .value.version}) | sort | add'
-            fi
-
-            echo -e "\n!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!\n"
-
-            exit 161
-        fi
-
        # supported upgrade
        return 0
    else
@@ -1638,13 +1232,13 @@ main() {
  echo "Verifying we have the latest soup script."
  verify_latest_update_script

+  echo "Verifying Elasticsearch version compatibility before upgrading."
+  verify_es_version_compatibility
+
  echo "Let's see if we need to update Security Onion."
  upgrade_check
  upgrade_space

-  echo "Verifying Elasticsearch version compatibility across the grid before upgrading."
-  verify_es_version_compatibility
-
  echo "Checking for Salt Master and Minion updates."
  upgrade_check_salt
  set -e
@@ -1664,8 +1258,7 @@ main() {
    echo "Applying $HOTFIXVERSION hotfix"
    # since we don't run the backup.config_backup state on import we wont snapshot previous version states and pillars
    if [[ ! "$MINION_ROLE" == "import" ]]; then
-        echo "Running so-config-backup script."
-        /sbin/so-config-backup
+      backup_old_states_pillars
    fi
    copy_new_files
    create_local_directories "/opt/so/saltstack/default"
@@ -1721,8 +1314,8 @@ main() {
    # since we don't run the backup.config_backup state on import we wont snapshot previous version states and pillars
    if [[ ! "$MINION_ROLE" == "import" ]]; then
      echo ""
-      echo "Running so-config-backup script."
-      /sbin/so-config-backup
+      echo "Creating snapshots of default and local Salt states and pillars and saving to /nsm/backup/"
+      backup_old_states_pillars
    fi

    echo ""
@@ -1758,9 +1351,6 @@ main() {

    enable_highstate

-    echo "salt-call state.show_top"
-    salt-call state.show_top
-
    echo ""
    echo "Running a highstate. This could take several minutes."
    set +e
@@ -1768,9 +1358,6 @@ main() {
    highstate
    set -e

-    echo "salt-call saltutil.running"
-    salt-call saltutil.running
-
    stop_salt_master

    masterunlock
@@ -1793,9 +1380,6 @@ main() {
    # ensure the mine is updated and populated before highstates run, following the salt-master restart
    update_salt_mine

-    echo "salt-call state.show_top"
-    salt-call state.show_top
-
    highstate
    check_saltmaster_status
    postupgrade_changes
@@ -33,8 +33,11 @@ so-elastic-fleet-stop --force

 status "Deleting Fleet Data from Pillars..."
 so-yaml.py remove /opt/so/saltstack/local/pillar/minions/{{ GLOBALS.minion_id }}.sls elasticfleet
+/usr/sbin/so-config.py sync-yaml-mutation /opt/so/saltstack/local/pillar/minions/{{ GLOBALS.minion_id }}.sls remove elasticfleet --note "so-elastic-fleet-reset"
 so-yaml.py remove /opt/so/saltstack/local/pillar/global/soc_global.sls global.fleet_grid_enrollment_token_general
+/usr/sbin/so-config.py sync-yaml-mutation /opt/so/saltstack/local/pillar/global/soc_global.sls remove global.fleet_grid_enrollment_token_general --note "so-elastic-fleet-reset"
 so-yaml.py remove /opt/so/saltstack/local/pillar/global/soc_global.sls global.fleet_grid_enrollment_token_heavy
+/usr/sbin/so-config.py sync-yaml-mutation /opt/so/saltstack/local/pillar/global/soc_global.sls remove global.fleet_grid_enrollment_token_heavy --note "so-elastic-fleet-reset"

 status "Restarting Kibana..."
 so-kibana-restart --force
@@ -34,7 +34,6 @@ make-rule-dir-nginx:
 so-nginx:
  docker_container.running:
    - image: {{ GLOBALS.registry_host }}:5000/{{ GLOBALS.image_repo }}/so-nginx:{{ GLOBALS.so_version }}
-    - restart_policy: unless-stopped
    - hostname: so-nginx
    - networks:
      - sobridge:
@@ -225,7 +225,6 @@ http {
 			limit_req             zone=auth_throttle burst={{ NGINXMERGED.config.throttle_login_burst }} nodelay;
 			limit_req_status      429;
 			proxy_pass            http://{{ GLOBALS.manager }}:4433;
-			proxy_set_header      Connection "Close";
 			proxy_read_timeout    90;
 			proxy_connect_timeout 90;
 			proxy_set_header      Host $host;
@@ -238,7 +237,6 @@ http {
 		location ~ ^/auth/.*?(whoami|logout|settings|errors|webauthn.js) {
 			rewrite               /auth/(.*) /$1 break;
 			proxy_pass            http://{{ GLOBALS.manager }}:4433;
-			proxy_set_header      Connection "Close";
 			proxy_read_timeout    90;
 			proxy_connect_timeout 90;
 			proxy_set_header      Host $host;
@@ -3,14 +3,7 @@
 # https://securityonion.net/license; you may not use this file except in compliance with the
 # Elastic License 2.0.

-{% set hypervisor = pillar.get('minion_id', '') %}
-
-{% if not hypervisor|regex_match('^([A-Za-z0-9._-]{1,253})$') %}
-{%   do salt.log.error('delete_hypervisor_orch: refusing unsafe minion_id=' ~ hypervisor) %}
-delete_hypervisor_invalid_minion_id:
-  test.fail_without_changes:
-    - name: delete_hypervisor_invalid_minion_id
-{% else %}
+{% set hypervisor = pillar.minion_id %}

 ensure_hypervisor_mine_deleted:
  salt.function:
@@ -27,5 +20,3 @@ update_salt_cloud_profile:
    - sls:
      - salt.cloud.config
    - concurrent: True
-
-{% endif %}
@@ -1,37 +0,0 @@
-{% from 'global/map.jinja' import GLOBALMERGED %}
-{% set actions = salt['pillar.get']('actions', []) %}
-{% set BATCH = GLOBALMERGED.push.batch %}
-{% set BATCH_WAIT = GLOBALMERGED.push.batch_wait %}
-
-{% for action in actions %}
-{%   if action.get('highstate') %}
-apply_highstate_{{ loop.index }}:
-  salt.state:
-    - tgt: '{{ action.tgt }}'
-    - tgt_type: {{ action.get('tgt_type', 'compound') }}
-    - highstate: True
-    - batch: {{ action.get('batch', BATCH) }}
-    - batch_wait: {{ action.get('batch_wait', BATCH_WAIT) }}
-    - kwarg:
-        queue: 2
-{%   else %}
-refresh_pillar_{{ loop.index }}:
-  salt.function:
-    - name: saltutil.refresh_pillar
-    - tgt: '{{ action.tgt }}'
-    - tgt_type: {{ action.get('tgt_type', 'compound') }}
-
-apply_{{ action.state | replace('.', '_') }}_{{ loop.index }}:
-  salt.state:
-    - tgt: '{{ action.tgt }}'
-    - tgt_type: {{ action.get('tgt_type', 'compound') }}
-    - sls:
-      - {{ action.state }}
-    - batch: {{ action.get('batch', BATCH) }}
-    - batch_wait: {{ action.get('batch_wait', BATCH_WAIT) }}
-    - kwarg:
-        queue: 2
-    - require:
-      - salt: refresh_pillar_{{ loop.index }}
-{%   endif %}
-{% endfor %}
@@ -12,14 +12,7 @@
 {% if 'vrt' in salt['pillar.get']('features', []) %}

 {%   do salt.log.debug('vm_pillar_clean_orch: Running') %}
-{%   set vm_name = pillar.get('vm_name', '') %}
-
-{%   if not vm_name|regex_match('^([A-Za-z0-9._-]{1,253})$') %}
-{%     do salt.log.error('vm_pillar_clean_orch: refusing unsafe vm_name=' ~ vm_name) %}
-vm_pillar_clean_invalid_name:
-  test.fail_without_changes:
-    - name: vm_pillar_clean_invalid_name
-{%   else %}
+{%   set vm_name = pillar.get('vm_name') %}

 delete_adv_{{ vm_name }}_pillar:
  module.run:
@@ -31,8 +24,6 @@ delete_{{ vm_name }}_pillar:
    - file.remove:
      - path: /opt/so/saltstack/local/pillar/minions/{{ vm_name }}.sls

-{%   endif %}
-
 {% else %}

 {%   do salt.log.error(
@@ -46,10 +46,10 @@ postgresinitdir:
    - require:
      - file: postgresconfdir

-postgresinitdb:
+postgresinitusers:
  file.managed:
-    - name: /opt/so/conf/postgres/init/init-db.sh
-    - source: salt://postgres/files/init-db.sh
+    - name: /opt/so/conf/postgres/init/init-users.sh
+    - source: salt://postgres/files/init-users.sh
    - user: 939
    - group: 939
    - mode: 755
@@ -31,7 +31,7 @@ so-postgres:
      - POSTGRES_DB=securityonion
      # Passwords are delivered via mounted 0600 secret files, not plaintext env vars.
      # The upstream postgres image resolves POSTGRES_PASSWORD_FILE; entrypoint.sh and
-      # init-db.sh resolve SO_POSTGRES_PASS_FILE the same way.
+      # init-users.sh resolve SO_POSTGRES_PASS_FILE the same way.
      - POSTGRES_PASSWORD_FILE=/run/secrets/postgres_password
      - SO_POSTGRES_USER={{ SO_POSTGRES_USER }}
      - SO_POSTGRES_PASS_FILE=/run/secrets/so_postgres_pass
@@ -46,7 +46,7 @@ so-postgres:
      - /opt/so/conf/postgres/postgresql.conf:/conf/postgresql.conf:ro
      - /opt/so/conf/postgres/pg_hba.conf:/conf/pg_hba.conf:ro
      - /opt/so/conf/postgres/secrets:/run/secrets:ro
-      - /opt/so/conf/postgres/init/init-db.sh:/docker-entrypoint-initdb.d/init-db.sh:ro
+      - /opt/so/conf/postgres/init/init-users.sh:/docker-entrypoint-initdb.d/init-users.sh:ro
      - /etc/pki/postgres.crt:/conf/postgres.crt:ro
      - /etc/pki/postgres.key:/conf/postgres.key:ro
      - /etc/pki/tls/certs/intca.crt:/conf/ca.crt:ro
@@ -70,7 +70,7 @@ so-postgres:
    - watch:
      - file: postgresconf
      - file: postgreshba
-      - file: postgresinitdb
+      - file: postgresinitusers
      - file: postgres_super_secret
      - file: postgres_app_secret
      - x509: postgres_crt
@@ -78,7 +78,7 @@ so-postgres:
    - require:
      - file: postgresconf
      - file: postgreshba
-      - file: postgresinitdb
+      - file: postgresinitusers
      - file: postgres_super_secret
      - file: postgres_app_secret
      - x509: postgres_crt
@@ -17,7 +17,6 @@ psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" --dbname "$POSTGRES_DB" <<-E
        END IF;
    END
    \$\$;
-    GRANT ALL ON SCHEMA public TO "$SO_POSTGRES_USER";
    GRANT ALL PRIVILEGES ON DATABASE "$POSTGRES_DB" TO "$SO_POSTGRES_USER";
    -- Lock the SOC database down at the connect layer; PUBLIC gets CONNECT
    -- by default, which would let per-minion telegraf roles open sessions
@@ -0,0 +1,20 @@
+# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
+# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
+# https://securityonion.net/license; you may not use this file except in compliance with the
+# Elastic License 2.0.
+
+{% from 'allowed_states.map.jinja' import allowed_states %}
+{% if sls.split('.')[0] in allowed_states %}
+
+# Deprecated: the old so_pillar schema has been replaced by SOC-owned
+# onionconfig tables. SOC creates its schema on first startup.
+postgres_schema_pillar_deprecated:
+  test.nop
+
+{% else %}
+
+{{sls}}_state_not_allowed:
+  test.fail_without_changes:
+    - name: {{sls}}_state_not_allowed
+
+{% endif %}
@@ -18,22 +18,38 @@ include:
 {% set TG_OUT = TELEGRAFMERGED.output | upper %}
 {% if TG_OUT in ['POSTGRES', 'BOTH'] %}

+# docker_container.running returns as soon as the container starts, but on
+# first-init docker-entrypoint.sh starts a temporary postgres with
+# `listen_addresses=''` to run /docker-entrypoint-initdb.d scripts, then
+# shuts it down before exec'ing the real CMD. A default pg_isready check
+# (Unix socket) passes during that ephemeral phase and races the shutdown
+# with "the database system is shutting down". Checking TCP readiness on
+# 127.0.0.1 only succeeds after the final postgres binds the port.
 postgres_wait_ready:
  cmd.run:
-    - name: /usr/sbin/so-postgres-wait
+    - name: |
+        for i in $(seq 1 60); do
+          if docker exec so-postgres pg_isready -h 127.0.0.1 -U postgres -q 2>/dev/null; then
+            exit 0
+          fi
+          sleep 2
+        done
+        echo "so-postgres did not accept TCP connections within 120s" >&2
+        exit 1
    - require:
      - docker_container: so-postgres
-      - file: postgres_sbin

-# Ensure the shared Telegraf database exists. init-db.sh only runs on a
+# Ensure the shared Telegraf database exists. init-users.sh only runs on a
 # fresh data dir, so hosts upgraded onto an existing /nsm/postgres volume
 # would otherwise never get so_telegraf.
 postgres_create_telegraf_db:
  cmd.run:
-    - name: /usr/sbin/so-telegraf-postgres create_db
+    - name: |
+        if ! docker exec so-postgres psql -U postgres -tAc "SELECT 1 FROM pg_database WHERE datname='so_telegraf'" | grep -q 1; then
+          docker exec so-postgres psql -v ON_ERROR_STOP=1 -U postgres -c "CREATE DATABASE so_telegraf"
+        fi
    - require:
      - cmd: postgres_wait_ready
-      - file: postgres_sbin

 # Provision the shared group role and schema once. Every per-minion role is a
 # member of so_telegraf, and each Telegraf connection does SET ROLE so_telegraf
@@ -41,26 +57,68 @@ postgres_create_telegraf_db:
 # on first write are owned by the group role and every member can INSERT/SELECT.
 postgres_telegraf_group_role:
  cmd.run:
-    - name: /usr/sbin/so-telegraf-postgres group_role
+    - name: |
+        docker exec -i so-postgres psql -v ON_ERROR_STOP=1 -U postgres -d so_telegraf <<'EOSQL'
+        DO $$
+        BEGIN
+            IF NOT EXISTS (SELECT FROM pg_catalog.pg_roles WHERE rolname = 'so_telegraf') THEN
+                CREATE ROLE so_telegraf NOLOGIN;
+            END IF;
+        END
+        $$;
+        GRANT CONNECT ON DATABASE so_telegraf TO so_telegraf;
+        CREATE SCHEMA IF NOT EXISTS telegraf AUTHORIZATION so_telegraf;
+        GRANT USAGE, CREATE ON SCHEMA telegraf TO so_telegraf;
+        CREATE SCHEMA IF NOT EXISTS partman;
+        CREATE EXTENSION IF NOT EXISTS pg_partman SCHEMA partman;
+        CREATE EXTENSION IF NOT EXISTS pg_cron;
+        -- Telegraf (running as so_telegraf) calls partman.create_parent()
+        -- on first write of each metric, which needs USAGE on the partman
+        -- schema, EXECUTE on its functions/procedures, and write access to
+        -- partman.part_config so it can register new partitioned parents.
+        GRANT USAGE, CREATE ON SCHEMA partman TO so_telegraf;
+        GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA partman TO so_telegraf;
+        GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA partman TO so_telegraf;
+        GRANT EXECUTE ON ALL PROCEDURES IN SCHEMA partman TO so_telegraf;
+        -- partman creates per-parent template tables (partman.template_*) at
+        -- runtime; default privileges extend DML/sequence access to them.
+        ALTER DEFAULT PRIVILEGES IN SCHEMA partman
+            GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO so_telegraf;
+        ALTER DEFAULT PRIVILEGES IN SCHEMA partman
+            GRANT USAGE, SELECT, UPDATE ON SEQUENCES TO so_telegraf;
+        -- Hourly partman maintenance. cron.schedule is idempotent by jobname.
+        SELECT cron.schedule(
+          'telegraf-partman-maintenance',
+          '17 * * * *',
+          'CALL partman.run_maintenance_proc()'
+        );
+        EOSQL
    - require:
      - cmd: postgres_create_telegraf_db
-      - file: postgres_sbin

 {%   set creds = salt['pillar.get']('telegraf:postgres_creds', {}) %}
 {%   for mid, entry in creds.items() %}
 {%     if entry.get('user') and entry.get('pass') %}
 {%       set u = entry.user %}
-{%       set p = entry.pass %}
+{%       set p = entry.pass | replace("'", "''") %}

 postgres_telegraf_role_{{ u }}:
  cmd.run:
-    - name: /usr/sbin/so-telegraf-postgres user
-    - env:
-      - ROLE_USER: {{ u | tojson }}
-      - ROLE_PASS: {{ p | tojson }}
-    - hide_output: True
+    - name: |
+        docker exec -i so-postgres psql -v ON_ERROR_STOP=1 -U postgres -d so_telegraf <<'EOSQL'
+        DO $$
+        BEGIN
+            IF NOT EXISTS (SELECT FROM pg_catalog.pg_roles WHERE rolname = '{{ u }}') THEN
+                EXECUTE format('CREATE ROLE %I WITH LOGIN PASSWORD %L', '{{ u }}', '{{ p }}');
+            ELSE
+                EXECUTE format('ALTER ROLE %I WITH PASSWORD %L', '{{ u }}', '{{ p }}');
+            END IF;
+        END
+        $$;
+        GRANT CONNECT ON DATABASE so_telegraf TO "{{ u }}";
+        GRANT so_telegraf TO "{{ u }}";
+        EOSQL
    - require:
-      - file: postgres_sbin
      - cmd: postgres_telegraf_group_role

 {%     endif %}
@@ -72,12 +130,21 @@ postgres_telegraf_role_{{ u }}:
 {%   set retention = salt['pillar.get']('postgres:telegraf:retention_days', 14) | int %}
 postgres_telegraf_retention_reconcile:
  cmd.run:
-    - name: /usr/sbin/so-telegraf-postgres retention
-    - env:
-      - RETENTION_DAYS: {{ retention }}
+    - name: |
+        docker exec -i so-postgres psql -v ON_ERROR_STOP=1 -U postgres -d so_telegraf <<'EOSQL'
+        DO $$
+        BEGIN
+            IF EXISTS (SELECT 1 FROM pg_catalog.pg_extension WHERE extname = 'pg_partman') THEN
+                UPDATE partman.part_config
+                SET retention = '{{ retention }} days',
+                    retention_keep_table = false
+                WHERE parent_table LIKE 'telegraf.%';
+            END IF;
+        END
+        $$;
+        EOSQL
    - require:
      - cmd: postgres_telegraf_group_role
-      - file: postgres_sbin

 {% endif %}
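The inlined `postgres_wait_ready` loop in the hunk above is a bounded retry: probe, sleep, give up after a fixed number of attempts. A minimal sketch of that pattern follows. The `wait_for` helper and the `probe` stand-in are hypothetical names used only for illustration; they are not part of the repository, and the real probe would be the `docker exec so-postgres pg_isready -h 127.0.0.1 -U postgres -q` call shown in the state.

```shell
#!/bin/bash
# Hypothetical helper (assumption, not repository code): run a probe command
# up to <attempts> times, sleeping <delay> seconds after each failure.
wait_for() {
  local attempts=$1 delay=$2
  shift 2
  local i
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}

# Stand-in probe that only succeeds on its third call, imitating a service
# that needs a few seconds before it accepts TCP connections.
calls=0
probe() { calls=$((calls + 1)); [ "$calls" -ge 3 ]; }

wait_for 5 0 probe; rc_ok=$?      # succeeds within the attempt budget
wait_for 2 0 false; rc_fail=$?    # never succeeds, so the helper gives up
echo "rc_ok=$rc_ok calls=$calls rc_fail=$rc_fail"
```

With 60 attempts and a 2 second delay, the same shape gives the roughly 120 second budget that the state's error message reports.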

@@ -7,29 +7,15 @@

 . /usr/sbin/so-common

-# Without pipefail, a pipeline's exit status is gzip's. A failed pg_dumpall would
-# otherwise be masked by a successful gzip, silently producing a valid .gz that
-# holds a truncated dump.
-set -o pipefail
-
 # Backups contain role password hashes and full chat data; keep them 0600.
 umask 0077

 TODAY=$(date '+%Y_%m_%d')
 BACKUPDIR=/nsm/backup
 BACKUPFILE="$BACKUPDIR/so-postgres-backup-$TODAY.sql.gz"
-TMPFILE="$BACKUPFILE.tmp"
 MAXBACKUPS=7
-LOGFILE=/opt/so/log/postgres/backup.log

-log() {
-  echo "$(date '+%Y-%m-%d %H:%M:%S') $*" >> "$LOGFILE"
-}
-
-mkdir -p "$BACKUPDIR"
-
-# Remove any temp files left behind by a previously crashed run
-rm -f "$BACKUPDIR"/so-postgres-backup-*.sql.gz.tmp
+mkdir -p $BACKUPDIR

 # Skip if already backed up today
 if [ -f "$BACKUPFILE" ]; then
@@ -41,33 +27,13 @@ if ! docker ps --format '{{.Names}}' | grep -q '^so-postgres$'; then
  exit 0
 fi

-# Always clean up the temp file on exit; the success path clears this trap
-# after the atomic rename so the finished backup is not deleted.
-trap 'rm -f "$TMPFILE"' EXIT
+# Dump all databases and roles, compress
+docker exec so-postgres pg_dumpall -U postgres | gzip > "$BACKUPFILE"

-# Dump all databases and roles, compress. Write to a temp file so the final
-# filename only ever appears for a complete, verified backup.
-if ! docker exec so-postgres pg_dumpall -U postgres | gzip > "$TMPFILE"; then
-  log "ERROR: pg_dumpall/gzip failed; backup aborted"
-  exit 1
-fi
-
-# Verify the compressed stream is intact before publishing it
-if ! gzip -t "$TMPFILE"; then
-  log "ERROR: backup failed gzip integrity check; backup aborted"
-  exit 1
-fi
-
-# Atomically publish the verified backup
-mv "$TMPFILE" "$BACKUPFILE"
-trap - EXIT
-log "OK: wrote $BACKUPFILE"
-
-# Retention cleanup (only reached after a successful backup). The glob is
-# restricted to finished backups so an in-progress .tmp can never be counted.
-NUMBACKUPS=$(find "$BACKUPDIR" -type f -name "so-postgres-backup-*.sql.gz" | wc -l)
+# Retention cleanup
+NUMBACKUPS=$(find $BACKUPDIR -type f -name "so-postgres-backup*" | wc -l)
 while [ "$NUMBACKUPS" -gt "$MAXBACKUPS" ]; do
-  OLDEST=$(find "$BACKUPDIR" -type f -name "so-postgres-backup-*.sql.gz" -printf '%T+ %p\n' | sort | head -n 1 | awk -F" " '{print $2}')
+  OLDEST=$(find $BACKUPDIR -type f -name "so-postgres-backup*" -printf '%T+ %p\n' | sort | head -n 1 | awk -F" " '{print $2}')
  rm -f "$OLDEST"
-  NUMBACKUPS=$(find "$BACKUPDIR" -type f -name "so-postgres-backup-*.sql.gz" | wc -l)
+  NUMBACKUPS=$(find $BACKUPDIR -type f -name "so-postgres-backup*" | wc -l)
 done
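The hunk above removes `set -o pipefail` and the `gzip -t` check from the backup script. The removed comment explains the risk: without pipefail, a pipeline's exit status is gzip's. A minimal sketch of that behavior follows, assuming `false` as a stand-in for a failing `pg_dumpall`; the temp files and variable names are illustrative only.

```shell
#!/bin/bash
# Sketch only: `false` stands in for a pg_dumpall that fails immediately.
out_default=$(mktemp)
out_pipefail=$(mktemp)

# Default behavior: the pipeline reports gzip's status, so the failure of the
# first stage is masked.
( false | gzip > "$out_default" ); status_default=$?

# With pipefail, a non-zero status from any stage is propagated.
( set -o pipefail; false | gzip > "$out_pipefail" ); status_pipefail=$?

# gzip still wrote a well-formed (empty) stream, so an integrity check on the
# output file alone does not reveal the failed dump.
gzip -t "$out_default"; gzip_check=$?

echo "default=$status_default pipefail=$status_pipefail gzip_check=$gzip_check"
rm -f "$out_default" "$out_pipefail"
```

In this sketch only the pipefail variant reports the failure, which is why the removed version combined pipefail with a temp file and an atomic rename.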
@@ -1,32 +0,0 @@
-#!/bin/bash
-
-# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
-# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
-# https://securityonion.net/license; you may not use this file except in compliance with the
-# Elastic License 2.0.
-
-# Wait for the so-postgres container to accept TCP connections.
-#
-# docker_container.running returns as soon as the container starts, but on
-# first-init docker-entrypoint.sh starts a temporary postgres with
-# `listen_addresses=''` to run /docker-entrypoint-initdb.d scripts, then
-# shuts it down before exec'ing the real CMD. A default pg_isready check
-# (Unix socket) passes during that ephemeral phase and races the shutdown
-# with "the database system is shutting down". Checking TCP readiness on
-# 127.0.0.1 only succeeds after the final postgres binds the port.
-#
-# Usage: so-postgres-wait [iterations] [sleep_seconds]
-# Default: 60 iterations, 2s sleep (~120s total).
-
-ITERATIONS=${1:-60}
-SLEEP_SECONDS=${2:-2}
-
-for i in $(seq 1 "$ITERATIONS"); do
-  if docker exec so-postgres pg_isready -h 127.0.0.1 -U postgres -q 2>/dev/null; then
-    exit 0
-  fi
-  sleep "$SLEEP_SECONDS"
-done
-
-echo "so-postgres did not accept TCP connections within $((ITERATIONS * SLEEP_SECONDS))s" >&2
-exit 1
@@ -1,110 +0,0 @@
-#!/bin/bash
-set -e
-
-# Provision Telegraf state inside the so-postgres container.
-# Usage: so-telegraf-postgres <subcommand>
-#   create_db    Ensure the so_telegraf database exists.
-#   group_role   Provision the so_telegraf group role, telegraf/partman schemas,
-#                pg_partman, pg_cron, and the hourly partman maintenance job.
-#   user         Create or update a per-minion login role granted to so_telegraf.
-#                Env: ROLE_USER, ROLE_PASS.
-#   retention    Reconcile partman retention on telegraf parents.
-#                Env: RETENTION_DAYS.
-
-cmd="${1:?subcommand required}"
-
-case "$cmd" in
-  create_db)
-    if ! docker exec so-postgres psql -U postgres -tAc \
-        "SELECT 1 FROM pg_database WHERE datname='so_telegraf'" | grep -q 1; then
-      docker exec so-postgres psql -v ON_ERROR_STOP=1 -U postgres \
-        -c "CREATE DATABASE so_telegraf"
-    fi
-    ;;
-
-  group_role)
-    docker exec -i so-postgres psql -v ON_ERROR_STOP=1 -U postgres -d so_telegraf <<'EOSQL'
-DO $$
-BEGIN
-    IF NOT EXISTS (SELECT FROM pg_catalog.pg_roles WHERE rolname = 'so_telegraf') THEN
-        CREATE ROLE so_telegraf NOLOGIN;
-    END IF;
-END
-$$;
-GRANT CONNECT ON DATABASE so_telegraf TO so_telegraf;
-CREATE SCHEMA IF NOT EXISTS telegraf AUTHORIZATION so_telegraf;
-GRANT USAGE, CREATE ON SCHEMA telegraf TO so_telegraf;
-CREATE SCHEMA IF NOT EXISTS partman;
-CREATE EXTENSION IF NOT EXISTS pg_partman SCHEMA partman;
-CREATE EXTENSION IF NOT EXISTS pg_cron;
-- Telegraf (running as so_telegraf) calls partman.create_parent()
-- on first write of each metric, which needs USAGE on the partman
-- schema, EXECUTE on its functions/procedures, and write access to
-- partman.part_config so it can register new partitioned parents.
-GRANT USAGE, CREATE ON SCHEMA partman TO so_telegraf;
-GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA partman TO so_telegraf;
-GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA partman TO so_telegraf;
-GRANT EXECUTE ON ALL PROCEDURES IN SCHEMA partman TO so_telegraf;
-- partman creates per-parent template tables (partman.template_*) at
-- runtime; default privileges extend DML/sequence access to them.
-ALTER DEFAULT PRIVILEGES IN SCHEMA partman
-    GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO so_telegraf;
-ALTER DEFAULT PRIVILEGES IN SCHEMA partman
-    GRANT USAGE, SELECT, UPDATE ON SEQUENCES TO so_telegraf;
-- Hourly partman maintenance. cron.schedule is idempotent by jobname.
-SELECT cron.schedule(
-  'telegraf-partman-maintenance',
-  '17 * * * *',
-  'CALL partman.run_maintenance_proc()'
-);
-EOSQL
-    ;;
-
-  user)
-    : "${ROLE_USER:?ROLE_USER is required}"
-    : "${ROLE_PASS:?ROLE_PASS is required}"
-    # psql does not substitute :vars inside dollar-quoted strings, so the
-    # conditional CREATE/ALTER is built outside any DO block and dispatched
-    # with \gexec. format() handles identifier/literal quoting.
-    docker exec -i so-postgres psql \
-      -v ON_ERROR_STOP=1 \
-      -v role_user="$ROLE_USER" \
-      -v role_pass="$ROLE_PASS" \
-      -U postgres -d so_telegraf <<'EOSQL'
-SELECT format(
-  CASE WHEN EXISTS (SELECT FROM pg_catalog.pg_roles WHERE rolname = :'role_user')
-       THEN 'ALTER ROLE %I WITH LOGIN PASSWORD %L'
-       ELSE 'CREATE ROLE %I WITH LOGIN PASSWORD %L'
-  END,
-  :'role_user',
-  :'role_pass'
-) \gexec
-GRANT CONNECT ON DATABASE so_telegraf TO :"role_user";
-GRANT so_telegraf TO :"role_user";
-EOSQL
-    ;;
-
-  retention)
-    : "${RETENTION_DAYS:?RETENTION_DAYS is required}"
-    # \gset + \if guards against a missing pg_partman without using a DO
-    # block (psql :var substitution doesn't reach into dollar-quoted code).
-    docker exec -i so-postgres psql \
-      -v ON_ERROR_STOP=1 \
-      -v retention_days="$RETENTION_DAYS" \
-      -U postgres -d so_telegraf <<'EOSQL'
-SELECT CASE WHEN EXISTS (SELECT 1 FROM pg_catalog.pg_extension WHERE extname = 'pg_partman')
-            THEN 'true' ELSE 'false' END AS has_partman \gset
-\if :has_partman
-UPDATE partman.part_config
-SET retention = :'retention_days' || ' days',
-    retention_keep_table = false
-WHERE parent_table LIKE 'telegraf.%';
-\endif
-EOSQL
-    ;;
-
-  *)
-    echo "Unknown subcommand: $cmd" >&2
-    exit 1
-    ;;
-esac
@@ -3,15 +3,12 @@
 # https://securityonion.net/license; you may not use this file except in compliance with the
 # Elastic License 2.0.

-{% set hid = data['id'] %}
-{% if hid|regex_match('^([A-Za-z0-9._-]{1,253})$')
-   and hid.endswith('_hypervisor')
-   and data['result'] == True %}
+{% if data['id'].endswith('_hypervisor') and data['result'] == True %}

 {%   if data['act'] == 'accept' %}
 check_and_trigger:
  runner.setup_hypervisor.setup_environment:
-    - minion_id: {{ hid }}
+    - minion_id: {{ data['id'] }}
 {%   endif %}

 {%   if data['act'] == 'delete' %}
@@ -20,7 +17,8 @@ delete_hypervisor:
    - args:
      - mods: orch.delete_hypervisor
      - pillar:
-          minion_id: {{ hid }}
+          minion_id: {{ data['id'] }}
 {%   endif %}

 {% endif %}
+
@@ -9,42 +9,30 @@ import logging
 import os
 import pwd
 import grp
-import re
-
-log = logging.getLogger(__name__)
-
-PILLAR_ROOT = '/opt/so/saltstack/local/pillar/minions/'
-_VMNAME_RE = re.compile(r'^[A-Za-z0-9._-]{1,253}$')
-

 def run():
-  vm_name = data.get('kwargs', {}).get('name', '')
-  if not _VMNAME_RE.match(str(vm_name)):
-    log.error("createEmptyPillar reactor: refusing unsafe vm_name=%r", vm_name)
-    return {}
-
-  log.info("createEmptyPillar reactor: vm_name: %s", vm_name)
+  vm_name = data['kwargs']['name']
+  logging.error("createEmptyPillar reactor: vm_name: %s" % vm_name)
+  pillar_root = '/opt/so/saltstack/local/pillar/minions/'
  pillar_files = ['adv_' + vm_name + '.sls', vm_name + '.sls']

  try:
+    # Get socore user and group IDs
    socore_uid = pwd.getpwnam('socore').pw_uid
    socore_gid = grp.getgrnam('socore').gr_gid
-    pillar_root_real = os.path.realpath(PILLAR_ROOT)

    for f in pillar_files:
-      full_path = os.path.join(PILLAR_ROOT, f)
-      resolved = os.path.realpath(full_path)
-      if os.path.dirname(resolved) != pillar_root_real:
-        log.error("createEmptyPillar reactor: refusing path outside pillar root: %s", resolved)
-        continue
-      if os.path.exists(resolved):
-        continue
-      os.mknod(resolved)
-      os.chown(resolved, socore_uid, socore_gid)
-      os.chmod(resolved, 0o640)
-      log.info("createEmptyPillar reactor: created %s with socore:socore ownership and mode 0640", f)
+      full_path = pillar_root + f
+      if not os.path.exists(full_path):
+        # Create empty file
+        os.mknod(full_path)
+        # Set ownership to socore:socore
+        os.chown(full_path, socore_uid, socore_gid)
+        # Set mode to 640 (rw-r-----)
+        os.chmod(full_path, 0o640)
+        logging.error("createEmptyPillar reactor: created %s with socore:socore ownership and mode 640" % f)

  except (KeyError, OSError) as e:
-    log.error("createEmptyPillar reactor: Error setting ownership/permissions: %s", e)
+    logging.error("createEmptyPillar reactor: Error setting ownership/permissions: %s" % str(e))

  return {}
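The lines removed in the hunk above rejected any `vm_name` that did not match `^[A-Za-z0-9._-]{1,253}$` before building a pillar file path from it. A minimal shell sketch of the same allowlist check follows. The `is_safe_name` function is a hypothetical name for illustration, and the sketch covers only the pattern test, not the `realpath` comparison that the removed Python code also performed.

```shell
#!/bin/bash
# Hypothetical helper (assumption, not repository code): accept only names made
# of letters, digits, dot, underscore, and hyphen, 1 to 253 characters long.
is_safe_name() {
  local re='^[A-Za-z0-9._-]{1,253}$'
  [[ $1 =~ $re ]]
}

is_safe_name "vm01_sensor";      rc_plain=$?   # ordinary minion-style name
is_safe_name "../../etc/passwd"; rc_slash=$?   # contains a path separator
is_safe_name "";                 rc_empty=$?   # empty string
is_safe_name "a b; rm -rf /";    rc_shell=$?   # whitespace and shell metacharacters
echo "plain=$rc_plain slash=$rc_slash empty=$rc_empty shell=$rc_shell"
```

Because `/` is not in the allowed set, a name that passes this check cannot introduce a directory component when it is concatenated into `adv_<name>.sls` or `<name>.sls`.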
@@ -1,40 +1,18 @@
-#!py
-
 # Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
 # or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
 # https://securityonion.net/license; you may not use this file except in compliance with the
 # Elastic License 2.0.

-import logging
-import re
+remove_key:
+  wheel.key.delete:
+    - args:
+      - match: {{ data['name'] }}

-log = logging.getLogger(__name__)
+{{ data['name'] }}_pillar_clean:
+  runner.state.orchestrate:
+    - args:
+      - mods: orch.vm_pillar_clean
+      - pillar:
+          vm_name: {{ data['name'] }}

-_VMNAME_RE = re.compile(r'^[A-Za-z0-9._-]{1,253}$')
-
-
-def run():
-  name = data.get('name', '')
-  if not _VMNAME_RE.match(str(name)):
-    log.error("deleteKey reactor: refusing unsafe name=%r", name)
-    return {}
-
-  log.info("deleteKey reactor: deleted minion key: %s", name)
-
-  return {
-    'remove_key': {
-      'wheel.key.delete': [
-        {'args': [
-          {'match': name},
-        ]},
-      ],
-    },
-    '%s_pillar_clean' % name: {
-      'runner.state.orchestrate': [
-        {'args': [
-          {'mods': 'orch.vm_pillar_clean'},
-          {'pillar': {'vm_name': name}},
-        ]},
-      ],
-    },
-  }
+{% do salt.log.info('deleteKey reactor: deleted minion key: %s' % data['name']) %}
@@ -1,240 +0,0 @@
-# One pillar directory can map to multiple (state, tgt) actions.
-# tgt is a raw salt compound expression. tgt_type is always "compound".
-# Per-action `batch` / `batch_wait` override the orch defaults (25% / 15s).
-# An action with `highstate: True` triggers state.highstate instead of
-# state.apply -- see salt/orch/push_batch.sls.
-#
-# Notes:
-#   - `bpf` is a pillar-only dir (no state of its own) consumed by both
-#     zeek and suricata via macros, so a bpf pillar change re-applies both.
-#   - suricata/strelka/zeek/elasticsearch/redis/kafka/logstash etc. have
-#     their own pillar dirs AND their own state, so they map 1:1 (or 1:2
-#     in strelka's case, because of the split init.sls / manager.sls).
-#
-# Intentional omissions (these will log a "not in pillar_push_map.yaml"
-# warning in push_pillar.sls and wait for the next scheduled highstate):
-#   - `data` and `node_data`: pillar-only data consumed by many states;
-#     handling them generically would amount to a fleetwide highstate.
-#   - `host`: soc_host describes mainint/mainip; a change is a re-IP and
-#     needs a coordinated procedure, not an immediate state push.
-#   - `hypervisor`: state changes touch libvirt and are disruptive; leave
-#     to the next scheduled highstate.
-#   - `sensor`: every field in soc_sensor.yaml is `readonly: True` or
-#     per-minion (`node: True`). Per-minion edits are persisted under
-#     pillar/minions/<id>.sls and are handled by Branch A of push_pillar.sls
-#     (per-minion highstate intent), not by this app-pillar map.
-#
-# The role sets here were verified line-by-line against salt/top.sls. If
-# salt/top.sls changes how an app is targeted, update the corresponding
-# compound here.
-
-# firewall: the one pillar everyone touches. Applied everywhere intentionally
-# because every host's iptables needs to know about every other host in the
-# grid. Salt's firewall state is idempotent (file.managed + iptables-restore
-# onchanges in salt/firewall/init.sls), so hosts whose rendered firewall is
-# unchanged do a file comparison and no-op without touching iptables -- actual
-# reload happens only on the hosts whose rules actually changed. Fleetwide
-# blast radius is intentional and matches the pre-plan behavior via highstate.
-# Adding N sensors in a burst coalesces into one dispatch via the drainer.
-firewall:
-  - state: firewall
-    tgt: '*'
-
-# backup: backup.config_backup runs on eval, standalone, manager, managerhype,
-# managersearch (NOT import -- the backup pillar is included on import per
-# pillar/top.sls but the backup state is not run there per salt/top.sls).
-backup:
-  - state: backup.config_backup
-    tgt: 'G@role:so-eval or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-standalone'
-
-# bpf is pillar-only (no state); consumed by both zeek and suricata as macros.
-# Both states run on sensor_roles + so-import per salt/top.sls.
-bpf:
-  - state: zeek
-    tgt: 'G@role:so-eval or G@role:so-heavynode or G@role:so-import or G@role:so-sensor or G@role:so-standalone'
-  - state: suricata
-    tgt: 'G@role:so-eval or G@role:so-heavynode or G@role:so-import or G@role:so-sensor or G@role:so-standalone'
-
-# ca is applied universally.
-ca:
-  - state: ca
-    tgt: '*'
-
-# docker: universal. The docker state is in both the all-non-managers and
-# all-managers branches of salt/top.sls.
-docker:
-  - state: docker
-    tgt: '*'
-
-# elastalert: eval, standalone, manager, managerhype, managersearch (NOT import).
-elastalert:
-  - state: elastalert
-    tgt: 'G@role:so-eval or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-standalone'
-
-# elastic-fleet-package-registry: manager_roles exactly.
-elastic-fleet-package-registry:
-  - state: elastic-fleet-package-registry
-    tgt: 'G@role:so-eval or G@role:so-import or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-standalone'
-
-# elasticsearch: 8 roles.
-elasticsearch:
-  - state: elasticsearch
-    tgt: 'G@role:so-eval or G@role:so-heavynode or G@role:so-import or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-searchnode or G@role:so-standalone'
-
-# elasticagent: so-heavynode only.
-elasticagent:
-  - state: elasticagent
-    tgt: 'G@role:so-heavynode'
-
-# elasticfleet: base state only on pillar change. elasticfleet.install_agent_grid
-# is a deploy/enrollment step, not a config reload; leave it to the next highstate.
-elasticfleet:
-  - state: elasticfleet
-    tgt: 'G@role:so-eval or G@role:so-fleet or G@role:so-import or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-standalone'
-
-# global: fanout to a fleetwide highstate. The global pillar (soc_global.sls)
-# carries cross-cutting settings (pipeline, url_base, imagerepo, mdengine, ...)
-# that are consumed by virtually every state, so a targeted re-apply isn't
-# meaningful. The drainer's batch/batch_wait throttling controls blast radius.
-global:
-  - highstate: True
-    tgt: '*'
-
-# healthcheck: eval, sensor, standalone only.
-healthcheck:
-  - state: healthcheck
-    tgt: 'G@role:so-eval or G@role:so-sensor or G@role:so-standalone'
-
-# hydra: manager_roles exactly.
-hydra:
-  - state: hydra
-    tgt: 'G@role:so-eval or G@role:so-import or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-standalone'
-
-# idh: so-idh only.
-idh:
-  - state: idh
-    tgt: 'G@role:so-idh'
-
-# influxdb: manager_roles exactly.
-influxdb:
-  - state: influxdb
-    tgt: 'G@role:so-eval or G@role:so-import or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-standalone'
-
-# kafka: standalone, manager, managerhype, managersearch, searchnode, receiver.
-kafka:
-  - state: kafka
-    tgt: 'G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-receiver or G@role:so-searchnode or G@role:so-standalone'
-
-# kibana: manager_roles exactly.
-kibana:
-  - state: kibana
-    tgt: 'G@role:so-eval or G@role:so-import or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-standalone'
-
-# kratos: manager_roles exactly.
-kratos:
-  - state: kratos
-    tgt: 'G@role:so-eval or G@role:so-import or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-standalone'
-
-# logrotate: universal (top-of-file '*' branch in salt/top.sls).
-logrotate:
-  - state: logrotate
-    tgt: '*'
-
-# logstash: 8 roles, no eval/import.
-logstash:
-  - state: logstash
-    tgt: 'G@role:so-fleet or G@role:so-heavynode or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-receiver or G@role:so-searchnode or G@role:so-standalone'
-
-# manager: manager_roles exactly. The manager state is also referenced under
-# *_sensor / *_heavynode top.sls blocks via `sensor`, but the standalone
-# `manager` state itself runs only on manager_roles.
-manager:
-  - state: manager
-    tgt: 'G@role:so-eval or G@role:so-import or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-standalone'
-
-# nginx: 10 specific roles. NOT receiver, idh, hypervisor, desktop.
-nginx:
-  - state: nginx
-    tgt: 'G@role:so-eval or G@role:so-fleet or G@role:so-heavynode or G@role:so-import or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-searchnode or G@role:so-sensor or G@role:so-standalone'
-
-# ntp: universal (top-of-file '*' branch in salt/top.sls).
-ntp:
-  - state: ntp
-    tgt: '*'
-
-# patch: universal. soc_patch carries the OS update schedule, applied via
-# patch.os.schedule on every node (it's in both the all-non-managers and
-# all-managers branches of salt/top.sls).
-patch:
-  - state: patch.os.schedule
-    tgt: '*'
-
-# postgres: manager_roles exactly.
-postgres:
-  - state: postgres
-    tgt: 'G@role:so-eval or G@role:so-import or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-standalone'
-
-# redis: 6 roles. standalone, manager, managerhype, managersearch, heavynode, receiver.
-# (NOT eval, NOT import, NOT searchnode.)
-redis:
-  - state: redis
-    tgt: 'G@role:so-heavynode or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-receiver or G@role:so-standalone'
-
-# registry: manager_roles exactly.
-registry:
-  - state: registry
-    tgt: 'G@role:so-eval or G@role:so-import or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-standalone'
-
-# sensoroni: universal.
-sensoroni:
-  - state: sensoroni
-    tgt: '*'
-
-# soc: manager_roles exactly.
-soc:
-  - state: soc
-    tgt: 'G@role:so-eval or G@role:so-import or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-standalone'
-
-# stig: broad. Runs on standalone, manager, managerhype, managersearch,
-# searchnode, sensor, receiver, fleet, hypervisor, desktop.
-# NOT eval, NOT import, NOT heavynode, NOT idh (the *_idh block in
-# salt/top.sls intentionally omits stig).
-stig:
-  - state: stig
-    tgt: 'G@role:so-desktop or G@role:so-fleet or G@role:so-hypervisor or G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-receiver or G@role:so-searchnode or G@role:so-sensor or G@role:so-standalone'
-
-# strelka: sensor-side only on pillar change (sensor_roles). strelka.manager is
-# intentionally NOT fired on pillar changes -- YARA rule and strelka config
-# pillar changes are consumed by the sensor-side strelka backend, and re-running
-# strelka.manager on managers is both unnecessary and disruptive. strelka.manager
-# is left to the 2-hour highstate.
-strelka:
-  - state: strelka
-    tgt: 'G@role:so-eval or G@role:so-heavynode or G@role:so-sensor or G@role:so-standalone'
-
-# suricata: sensor_roles + so-import (5 roles).
-suricata:
-  - state: suricata
-    tgt: 'G@role:so-eval or G@role:so-heavynode or G@role:so-import or G@role:so-sensor or G@role:so-standalone'
-
-# telegraf: universal.
-telegraf:
-  - state: telegraf
-    tgt: '*'
-
-# versionlock: universal (top-of-file '*' branch in salt/top.sls).
-versionlock:
-  - state: versionlock
-    tgt: '*'
-
-# vm: libvirt-driver hypervisors only. Matched by the salt-cloud:driver:libvirt
-# grain (compound supports nested grain matching via G@<key>:<subkey>:<value>).
-# pillar/vm/soc_vm.sls write path is referenced at salt/_runners/setup_hypervisor.py:856.
-vm:
-  - state: vm
-    tgt: 'G@salt-cloud:driver:libvirt'
-
-# zeek: sensor_roles + so-import (5 roles).
-zeek:
-  - state: zeek
-    tgt: 'G@role:so-eval or G@role:so-heavynode or G@role:so-import or G@role:so-sensor or G@role:so-standalone'
@@ -1,176 +0,0 @@
-#!py
-
-# Reactor invoked by the pillar_db beacon when SOC records settings changes in
-# the so_soc.audit_settings table (see salt/_beacons/pillar_db.py). The beacon
-# emits one event per new row carrying setting_id and node_id.
-#
-# Two branches, keyed on node_id:
-#   A) node_id populated -> the change is scoped to that one minion. Look up the
-#      app in pillar_push_map.yaml and write an intent that runs the app's mapped
-#      state(s) targeted to just that node.
-#   B) node_id empty -> grid-wide app change. Look up the app in
-#      pillar_push_map.yaml and write an intent with the entry's actions as-is.
-#
-# The app name is the first dotted segment of setting_id (e.g. "telegraf.output"
-# -> "telegraf"), which matches the pillar_push_map.yaml keys 1:1.
-#
-# Reactors never dispatch directly. The so-push-drainer schedule picks up
-# ready intents, dedupes across pending files, and dispatches orch.push_batch.
-
-import fcntl
-import json
-import logging
-import os
-import time
-
-from salt.client import Caller
-import yaml
-
-LOG = logging.getLogger(__name__)
-
-PENDING_DIR = '/opt/so/state/push_pending'
-LOCK_FILE = os.path.join(PENDING_DIR, '.lock')
-MAX_PATHS = 20
-
-# The pillar_push_map.yaml is shipped via salt:// but the reactor runs on the
-# master, which mounts the default saltstack tree at this path.
-PUSH_MAP_PATH = '/opt/so/saltstack/default/salt/reactor/pillar_push_map.yaml'
-
-_PUSH_MAP_CACHE = {'mtime': 0, 'data': None}
-
-
-def _load_push_map():
-    try:
-        st = os.stat(PUSH_MAP_PATH)
-    except OSError:
-        LOG.warning('push_pillar: %s not found', PUSH_MAP_PATH)
-        return {}
-    if _PUSH_MAP_CACHE['mtime'] != st.st_mtime:
-        try:
-            with open(PUSH_MAP_PATH, 'r') as f:
-                _PUSH_MAP_CACHE['data'] = yaml.safe_load(f) or {}
-        except Exception:
-            LOG.exception('push_pillar: failed to load %s', PUSH_MAP_PATH)
-            _PUSH_MAP_CACHE['data'] = {}
-        _PUSH_MAP_CACHE['mtime'] = st.st_mtime
-    return _PUSH_MAP_CACHE['data'] or {}
-
-
-def _push_enabled():
-    try:
-        caller = Caller()
-        return bool(caller.cmd('pillar.get', 'global:push:enabled', True))
-    except Exception:
-        LOG.exception('push_pillar: pillar.get global:push:enabled failed, assuming enabled')
-        return True
-
-
-def _write_intent(key, actions, path):
-    now = time.time()
-    try:
-        os.makedirs(PENDING_DIR, exist_ok=True)
-    except OSError:
-        LOG.exception('push_pillar: cannot create %s', PENDING_DIR)
-        return
-
-    intent_path = os.path.join(PENDING_DIR, '{}.json'.format(key))
-    lock_fd = os.open(LOCK_FILE, os.O_CREAT | os.O_RDWR, 0o644)
-    try:
-        fcntl.flock(lock_fd, fcntl.LOCK_EX)
-
-        intent = {}
-        if os.path.exists(intent_path):
-            try:
-                with open(intent_path, 'r') as f:
-                    intent = json.load(f)
-            except (IOError, ValueError):
-                intent = {}
-
-        intent.setdefault('first_touch', now)
-        intent['last_touch'] = now
-        intent['actions'] = actions
-        paths = intent.get('paths', [])
-        if path and path not in paths:
-            paths.append(path)
-            paths = paths[-MAX_PATHS:]
-        intent['paths'] = paths
-
-        tmp_path = intent_path + '.tmp'
-        with open(tmp_path, 'w') as f:
-            json.dump(intent, f)
-        os.rename(tmp_path, intent_path)
-    except Exception:
-        LOG.exception('push_pillar: failed to write intent %s', intent_path)
-    finally:
-        try:
-            fcntl.flock(lock_fd, fcntl.LOCK_UN)
-        finally:
-            os.close(lock_fd)
-
-
-def _app_from_setting(setting_id):
-    # setting_id is e.g. 'telegraf.output' -> 'telegraf', 'ntp.config.servers' -> 'ntp'
-    if not setting_id:
-        return None
-    return setting_id.split('.', 1)[0] or None
-
-
-def _node_actions(entry, node_id):
-    # Copy the app's mapped actions but retarget each one to the single node.
-    # Preserves the state/highstate selection and any batch/batch_wait overrides.
-    actions = []
-    for action in entry:
-        if not isinstance(action, dict):
-            continue
-        node_action = dict(action)
-        node_action['tgt'] = node_id
-        node_action['tgt_type'] = 'glob'
-        actions.append(node_action)
-    return actions
-
-
-def run():
-    if not _push_enabled():
-        LOG.info('push_pillar: push disabled, skipping')
-        return {}
-
-    # The pillar_db beacon nests its payload under data['data']; fall back to the
-    # top level so the reactor is robust to either shape.
-    event = data.get('data', data)  # noqa: F821 -- data provided by reactor
-    setting_id = event.get('setting_id', '')
-    node_id = (event.get('node_id') or '').strip()
-
-    app = _app_from_setting(setting_id)
-    if not app:
-        LOG.debug('push_pillar: ignoring event with no app segment: setting_id=%s', setting_id)
-        return {}
-
-    push_map = _load_push_map()
-    entry = push_map.get(app)
-    if not entry:
-        LOG.warning(
-            'push_pillar: app "%s" is not in pillar_push_map.yaml; change will be '
-            'picked up at the next scheduled highstate (setting_id=%s)',
-            app, setting_id,
-        )
-        return {}
-
-    # Branch A: per-node change -> retarget the app's states to just that node.
-    if node_id:
-        actions = _node_actions(entry, node_id)
-        if not actions:
-            LOG.warning('push_pillar: no usable actions for app "%s" (setting_id=%s)', app, setting_id)
-            return {}
-        _write_intent(
-            'node_{}_{}'.format(node_id, app), actions,
-            'audit:{}@{}'.format(setting_id, node_id),
-        )
-        LOG.info('push_pillar: per-node intent updated for %s on %s (setting_id=%s)',
-                 app, node_id, setting_id)
-        return {}
-
-    # Branch B: grid-wide app change -> use the map entry's actions as-is.
-    actions = list(entry)  # copy to avoid mutating the cache
-    _write_intent('pillar_{}'.format(app), actions, 'audit:{}'.format(setting_id))
-    LOG.info('push_pillar: app intent updated for %s (setting_id=%s)', app, setting_id)
-    return {}
@@ -1,96 +0,0 @@
-#!py
-
-# Reactor invoked by the inotify beacon on rule file changes under
-# /opt/so/saltstack/local/salt/strelka/rules/compiled/.
-#
-# Writes (or updates) a push intent at /opt/so/state/push_pending/rules_strelka.json
-# and returns {}. The so-push-drainer schedule picks up ready intents, dedupes
-# across pending files, and dispatches orch.push_batch. Reactors never dispatch
-# directly -- see plan /home/mreeves/.claude/plans/goofy-marinating-hummingbird.md.
-
-import fcntl
-import json
-import logging
-import os
-import time
-
-from salt.client import Caller
-
-LOG = logging.getLogger(__name__)
-
-PENDING_DIR = '/opt/so/state/push_pending'
-LOCK_FILE = os.path.join(PENDING_DIR, '.lock')
-MAX_PATHS = 20
-
-# Mirrors GLOBALS.sensor_roles in salt/vars/globals.map.jinja. Sensor-side
-# strelka runs on exactly these four roles; so-import gets strelka.manager
-# instead, which is not fired on pillar changes.
-SENSOR_ROLES = ['so-eval', 'so-heavynode', 'so-sensor', 'so-standalone']
-
-
-def _sensor_compound():
-    return ' or '.join('G@role:{}'.format(r) for r in SENSOR_ROLES)
-
-
-def _push_enabled():
-    try:
-        caller = Caller()
-        return bool(caller.cmd('pillar.get', 'global:push:enabled', True))
-    except Exception:
-        LOG.exception('push_strelka: pillar.get global:push:enabled failed, assuming enabled')
-        return True
-
-
-def _write_intent(key, actions, path):
-    now = time.time()
-    try:
-        os.makedirs(PENDING_DIR, exist_ok=True)
-    except OSError:
-        LOG.exception('push_strelka: cannot create %s', PENDING_DIR)
-        return
-
-    intent_path = os.path.join(PENDING_DIR, '{}.json'.format(key))
-    lock_fd = os.open(LOCK_FILE, os.O_CREAT | os.O_RDWR, 0o644)
-    try:
-        fcntl.flock(lock_fd, fcntl.LOCK_EX)
-
-        intent = {}
-        if os.path.exists(intent_path):
-            try:
-                with open(intent_path, 'r') as f:
-                    intent = json.load(f)
-            except (IOError, ValueError):
-                intent = {}
-
-        intent.setdefault('first_touch', now)
-        intent['last_touch'] = now
-        intent['actions'] = actions
-        paths = intent.get('paths', [])
-        if path and path not in paths:
-            paths.append(path)
-            paths = paths[-MAX_PATHS:]
-        intent['paths'] = paths
-
-        tmp_path = intent_path + '.tmp'
-        with open(tmp_path, 'w') as f:
-            json.dump(intent, f)
-        os.rename(tmp_path, intent_path)
-    except Exception:
-        LOG.exception('push_strelka: failed to write intent %s', intent_path)
-    finally:
-        try:
-            fcntl.flock(lock_fd, fcntl.LOCK_UN)
-        finally:
-            os.close(lock_fd)
-
-
-def run():
-    if not _push_enabled():
-        LOG.info('push_strelka: push disabled, skipping')
-        return {}
-
-    path = data.get('path', '')  # noqa: F821 -- data provided by reactor
-    actions = [{'state': 'strelka', 'tgt': _sensor_compound()}]
-    _write_intent('rules_strelka', actions, path)
-    LOG.info('push_strelka: intent updated for path=%s', path)
-    return {}
@@ -1,95 +0,0 @@
-#!py
-
-# Reactor invoked by the inotify beacon on rule file changes under
-# /opt/so/saltstack/local/salt/suricata/rules/.
-#
-# Writes (or updates) a push intent at /opt/so/state/push_pending/rules_suricata.json
-# and returns {}. The so-push-drainer schedule picks up ready intents, dedupes
-# across pending files, and dispatches orch.push_batch. Reactors never dispatch
-# directly -- see plan /home/mreeves/.claude/plans/goofy-marinating-hummingbird.md.
-
-import fcntl
-import json
-import logging
-import os
-import time
-
-from salt.client import Caller
-
-LOG = logging.getLogger(__name__)
-
-PENDING_DIR = '/opt/so/state/push_pending'
-LOCK_FILE = os.path.join(PENDING_DIR, '.lock')
-MAX_PATHS = 20
-
-# Mirrors GLOBALS.sensor_roles in salt/vars/globals.map.jinja. Suricata also
-# runs on so-import per salt/top.sls, so that role is appended below.
-SENSOR_ROLES = ['so-eval', 'so-heavynode', 'so-sensor', 'so-standalone']
-
-
-def _sensor_compound_plus_import():
-    return ' or '.join('G@role:{}'.format(r) for r in SENSOR_ROLES) + ' or G@role:so-import'
-
-
-def _push_enabled():
-    try:
-        caller = Caller()
-        return bool(caller.cmd('pillar.get', 'global:push:enabled', True))
-    except Exception:
-        LOG.exception('push_suricata: pillar.get global:push:enabled failed, assuming enabled')
-        return True
-
-
-def _write_intent(key, actions, path):
-    now = time.time()
-    try:
-        os.makedirs(PENDING_DIR, exist_ok=True)
-    except OSError:
-        LOG.exception('push_suricata: cannot create %s', PENDING_DIR)
-        return
-
-    intent_path = os.path.join(PENDING_DIR, '{}.json'.format(key))
-    lock_fd = os.open(LOCK_FILE, os.O_CREAT | os.O_RDWR, 0o644)
-    try:
-        fcntl.flock(lock_fd, fcntl.LOCK_EX)
-
-        intent = {}
-        if os.path.exists(intent_path):
-            try:
-                with open(intent_path, 'r') as f:
-                    intent = json.load(f)
-            except (IOError, ValueError):
-                intent = {}
-
-        intent.setdefault('first_touch', now)
-        intent['last_touch'] = now
-        intent['actions'] = actions
-        paths = intent.get('paths', [])
-        if path and path not in paths:
-            paths.append(path)
-            paths = paths[-MAX_PATHS:]
-        intent['paths'] = paths
-
-        tmp_path = intent_path + '.tmp'
-        with open(tmp_path, 'w') as f:
-            json.dump(intent, f)
-        os.rename(tmp_path, intent_path)
-    except Exception:
-        LOG.exception('push_suricata: failed to write intent %s', intent_path)
-    finally:
-        try:
-            fcntl.flock(lock_fd, fcntl.LOCK_UN)
-        finally:
-            os.close(lock_fd)
-
-
-def run():
-    if not _push_enabled():
-        LOG.info('push_suricata: push disabled, skipping')
-        return {}
-
-    path = data.get('path', '')  # noqa: F821 -- data provided by reactor
-    actions = [{'state': 'suricata', 'tgt': _sensor_compound_plus_import()}]
-    _write_intent('rules_suricata', actions, path)
-    LOG.info('push_suricata: intent updated for path=%s', path)
-    return {}
@@ -17,7 +17,6 @@ include:
 so-redis:
  docker_container.running:
    - image: {{ GLOBALS.registry_host }}:5000/{{ GLOBALS.image_repo }}/so-redis:{{ GLOBALS.so_version }}
-    - restart_policy: unless-stopped
    - hostname: so-redis
    - user: socore
    - networks:
@@ -21,9 +21,6 @@ so-dockerregistry:
    - networks:
      - sobridge:
        - ipv4_address: {{ DOCKERMERGED.containers['so-dockerregistry'].ip }}
-    # Intentionally `always` (not unless-stopped) -- registry is critical infra
-    # and must come back up even if it was manually stopped. Do not homogenize
-    # to unless-stopped; see the container auto-restart section of the plan.
    - restart_policy: always
    - port_bindings:
      {% for BINDING in DOCKERMERGED.containers['so-dockerregistry'].port_bindings %}
@@ -3,7 +3,7 @@
 {% set SCHEDULE = salt['pillar.get']('healthcheck:schedule', 30) %}

 include:
-  - salt.minion
+  - salt

 {% if CHECKS and ENABLED %}
 salt_beacons:
@@ -23,4 +23,3 @@ salt_beacons:
    - watch_in: 
      - service: salt_minion_service
 {% endif %}
-
@@ -17,7 +17,7 @@ engines:
                to:
                  'KAFKA':
                  - cmd.run:
-                      cmd: /usr/sbin/so-yaml.py replace /opt/so/saltstack/local/pillar/kafka/soc_kafka.sls kafka.enabled True
+                      cmd: /usr/sbin/so-yaml.py replace /opt/so/saltstack/local/pillar/kafka/soc_kafka.sls kafka.enabled True && /usr/sbin/so-config.py sync-yaml-mutation /opt/so/saltstack/local/pillar/kafka/soc_kafka.sls replace kafka.enabled True --note "pillarWatch global.pipeline"
                  - cmd.run:
                      cmd: salt -C 'G@role:so-standalone or G@role:so-manager or G@role:so-managersearch or G@role:so-receiver or G@role:so-searchnode' saltutil.kill_all_jobs
                  - cmd.run:
@@ -28,7 +28,7 @@ engines:
                to:
                  'REDIS':
                  - cmd.run:
-                      cmd: /usr/sbin/so-yaml.py replace /opt/so/saltstack/local/pillar/kafka/soc_kafka.sls kafka.enabled False
+                      cmd: /usr/sbin/so-yaml.py replace /opt/so/saltstack/local/pillar/kafka/soc_kafka.sls kafka.enabled False && /usr/sbin/so-config.py sync-yaml-mutation /opt/so/saltstack/local/pillar/kafka/soc_kafka.sls replace kafka.enabled False --note "pillarWatch global.pipeline"
                  - cmd.run:
                      cmd: salt -C 'G@role:so-standalone or G@role:so-manager or G@role:so-managersearch or G@role:so-receiver or G@role:so-searchnode' saltutil.kill_all_jobs
                  - cmd.run:
@@ -66,5 +66,5 @@ engines:
                  - cmd.run:
                      cmd: salt -C 'G@role:so-standalone or G@role:so-manager or G@role:so-managersearch or G@role:so-receiver' state.apply kafka.disabled,kafka.reset
                  - cmd.run:
-                      cmd: /usr/sbin/so-yaml.py remove /opt/so/saltstack/local/pillar/kafka/soc_kafka.sls kafka.reset
+                      cmd: /usr/sbin/so-yaml.py remove /opt/so/saltstack/local/pillar/kafka/soc_kafka.sls kafka.reset && /usr/sbin/so-config.py sync-yaml-mutation /opt/so/saltstack/local/pillar/kafka/soc_kafka.sls remove kafka.reset --note "pillarWatch kafka.reset"
      interval: 10
@@ -1,11 +0,0 @@
-reactor:
-  - 'salt/beacon/*/inotify//opt/so/saltstack/local/salt/suricata/rules':
-    - salt://reactor/push_suricata.sls
-  - 'salt/beacon/*/inotify//opt/so/saltstack/local/salt/suricata/rules/*':
-    - salt://reactor/push_suricata.sls
-  - 'salt/beacon/*/inotify//opt/so/saltstack/local/salt/strelka/rules/compiled':
-    - salt://reactor/push_strelka.sls
-  - 'salt/beacon/*/inotify//opt/so/saltstack/local/salt/strelka/rules/compiled/*':
-    - salt://reactor/push_strelka.sls
-  - 'salt/beacon/*/pillar_db/audit_settings':
-    - salt://reactor/push_pillar.sls
@@ -5,11 +5,3 @@ salt_bootstrap:
    - source: salt://salt/scripts/bootstrap-salt.sh
    - mode: 755
    - show_changes: False
-
-salt_sbin:
-  file.recurse:
-    - name: /usr/sbin
-    - source: salt://salt/tools/sbin
-    - user: 939
-    - group: 939
-    - file_mode: 755
@@ -1,4 +1,4 @@
 lasthighstate:
  file.touch:
    - name: /opt/so/log/salt/lasthighstate
-    - order: 9001
+    - order: last
@@ -10,13 +10,12 @@
 #    software that is protected by the license key."

 {% from 'allowed_states.map.jinja' import allowed_states %}
-{% from 'global/map.jinja' import GLOBALMERGED %}
 {% if sls in allowed_states %}

 include:
  - salt.minion
-  - salt.master.pyinotify
-  - salt.master.boot_mine_update
+  - salt.master.ext_pillar_postgres
+  - salt.master.pg_notify_pillar_engine
 {%   if 'vrt' in salt['pillar.get']('features', []) %}
  - salt.cloud
  - salt.cloud.reactor_config_hypervisor
@@ -65,21 +64,6 @@ engines_config:
    - name: /etc/salt/master.d/engines.conf
    - source: salt://salt/files/engines.conf

-{% if GLOBALMERGED.push.enabled %}
-reactor_pushstate_config:
-  file.managed:
-    - name: /etc/salt/master.d/reactor_pushstate.conf
-    - source: salt://salt/files/reactor_pushstate.conf
-    - watch_in:
-      - service: salt_master_service
-{% else %}
-reactor_pushstate_config:
-  file.absent:
-    - name: /etc/salt/master.d/reactor_pushstate.conf
-    - watch_in:
-      - service: salt_master_service
-{% endif %}
-
 # update the bootstrap script when used for salt-cloud
 salt_bootstrap_cloud:
  file.managed:
@@ -95,7 +79,7 @@ salt_master_service:
      - file: checkmine_engine
      - file: pillarWatch_engine
      - file: engines_config
-    - order: 9002
+    - order: last

 {% else %}

@@ -1,29 +0,0 @@
-# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
-# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
-# https://securityonion.net/license; you may not use this file except in compliance with the
-# Elastic License 2.0.
-
-# Manages /etc/systemd/system/so-boot-mine-update.service, a manager-only
-# Type=oneshot unit that pushes `salt '*' mine.update` once per boot, ordered
-# before so-boot-highstate.service so mine-backed pillars (node IPs, ES/Redis/
-# Logstash discovery) are fresh before the boot highstate renders them.
-
-include:
-  - systemd.reload
-
-so_boot_mine_update_unit_file:
-  file.managed:
-    - name: /etc/systemd/system/so-boot-mine-update.service
-    - source: salt://salt/service/so-boot-mine-update.service
-    - onchanges_in:
-      - module: systemd_reload
-
-# Only enable once setup is complete. Until then the gate file is missing and
-# the unit's own ConditionPathExists would no-op it anyway.
-so_boot_mine_update_service:
-  service.enabled:
-    - name: so-boot-mine-update.service
-    - onlyif: test -e /opt/so/state/setup-complete
-    - require:
-      - file: so_boot_mine_update_unit_file
-      - module: systemd_reload
@@ -0,0 +1,24 @@
+# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
+# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
+# https://securityonion.net/license; you may not use this file except in compliance with the
+# Elastic License 2.0.
+
+# Deprecated. SOC/onionconfig owns the settings database now; this state only
+# removes the old so_pillar ext_pillar config if it was previously deployed.
+
+{% from 'allowed_states.map.jinja' import allowed_states %}
+{% if sls.split('.')[0] in allowed_states %}
+
+ext_pillar_postgres_config_absent:
+  file.absent:
+    - name: /etc/salt/master.d/ext_pillar_postgres.conf
+    - watch_in:
+      - service: salt_master_service
+
+{% else %}
+
+{{sls}}_state_not_allowed:
+  test.fail_without_changes:
+    - name: {{sls}}_state_not_allowed
+
+{% endif %}
@@ -0,0 +1,37 @@
+# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
+# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
+# https://securityonion.net/license; you may not use this file except in compliance with the
+# Elastic License 2.0.
+
+# Deprecated. SOC/onionconfig owns the settings database now; this state only
+# removes the old so_pillar notify engine and reactor config if previously
+# deployed.
+
+{% from 'allowed_states.map.jinja' import allowed_states %}
+{% if sls.split('.')[0] in allowed_states %}
+
+pg_notify_pillar_engine_module_absent:
+  file.absent:
+    - name: /etc/salt/engines/pg_notify_pillar.py
+    - watch_in:
+      - service: salt_master_service
+
+pg_notify_pillar_engine_config_absent:
+  file.absent:
+    - name: /etc/salt/master.d/pg_notify_pillar_engine.conf
+    - watch_in:
+      - service: salt_master_service
+
+pg_notify_pillar_reactor_config_absent:
+  file.absent:
+    - name: /etc/salt/master.d/so_pillar_reactor.conf
+    - watch_in:
+      - service: salt_master_service
+
+{% else %}
+
+{{sls}}_state_not_allowed:
+  test.fail_without_changes:
+    - name: {{sls}}_state_not_allowed
+
+{% endif %}
@@ -1,20 +0,0 @@
-# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
-# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
-# https://securityonion.net/license; you may not use this file except in compliance with the
-# Elastic License 2.0.
-
-pyinotify_module_package:
-  file.recurse:
-    - name: /opt/so/conf/salt/module_packages/pyinotify
-    - source: salt://salt/module_packages/pyinotify
-    - clean: True
-    - makedirs: True
-
-pyinotify_python_module_install:
-  cmd.run:
-    - name: /opt/saltstack/salt/bin/python3.10 -m pip install pyinotify --no-index --find-links=/opt/so/conf/salt/module_packages/pyinotify/ --upgrade
-    - onchanges:
-      - file: pyinotify_module_package
-    - failhard: True
-    - watch_in:
-      - service: salt_minion_service
@@ -2,3 +2,4 @@
 salt:
  minion:
    version: '3006.19'
+    check_threshold: 3600 # in seconds, threshold used for so-salt-minion-check. any value less than 600 seconds may cause a lot of salt-minion restarts since the job to touch the file occurs every 5-8 minutes by default
@@ -1,31 +0,0 @@
-# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
-# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
-# https://securityonion.net/license; you may not use this file except in compliance with the
-# Elastic License 2.0.
-
-# Manages /etc/systemd/system/so-boot-highstate.service, a Type=oneshot
-# RemainAfterExit=yes unit that runs `salt-call state.highstate` exactly once
-# per system boot. Replaces the legacy `startup_states: highstate` minion
-# config, which fired on every salt-minion service restart (causing a redundant
-# highstate whenever a highstate itself restarted salt-minion).
-
-include:
-  - systemd.reload
-
-so_boot_highstate_unit_file:
-  file.managed:
-    - name: /etc/systemd/system/so-boot-highstate.service
-    - source: salt://salt/service/so-boot-highstate.service
-    - onchanges_in:
-      - module: systemd_reload
-
-# Only enable once setup is complete. Until then the gate file is missing and
-# the unit's own ConditionPathExists would no-op it anyway -- this just keeps
-# `systemctl is-enabled` honest for the sync_es_users gate.
-so_boot_highstate_service:
-  service.enabled:
-    - name: so-boot-highstate.service
-    - onlyif: test -e /opt/so/state/setup-complete
-    - require:
-      - file: so_boot_highstate_unit_file
-      - module: systemd_reload
@@ -17,7 +17,6 @@ include:
  - repo.client
  - salt.mine_functions
  - salt.minion.service_file
-  - salt.minion.boot_highstate
 {% if GLOBALS.is_manager %}
  - ca.signing_policy
 {% endif %}
@@ -81,47 +80,21 @@ set_log_levels:
      - "log_level: info"
      - "log_level_logfile: info"

-# startup_states: highstate caused a full highstate to run on every
-# salt-minion service start, including the restart triggered when a highstate
-# itself modified the minion config (beacons, mine, unit file). Replaced by
-# so-boot-highstate.service (managed in salt.minion.boot_highstate), which
-# runs once per system boot only. Strip the line from /etc/salt/minion on
-# upgrade; both the commented and uncommented forms historically existed.
-remove_startup_states:
-  file.line:
+enable_startup_states:
+  file.uncomment:
    - name: /etc/salt/minion
-    - match: 'startup_states: highstate'
-    - mode: delete
-
-# Upgrade-path bridge: systems that already passed setup under the old gate
-# (`grep -x 'startup_states: highstate' /etc/salt/minion`) get a /opt/so/state/setup-complete
-# marker so so-boot-highstate.service can be enabled and the so-user_sync cron
-# in sync_es_users.sls keeps installing. Setup-in-progress systems instead get
-# the marker from `mark_setup_complete` in setup/so-functions at the right
-# moment. `replace: false` means we never overwrite a marker once written.
-mark_setup_complete_for_upgrades:
-  file.managed:
-    - name: /opt/so/state/setup-complete
-    - replace: false
-    - makedirs: True
-    - onlyif: "grep -qx 'startup_states: highstate' /etc/salt/minion"
-    - require_in:
-      - file: remove_startup_states
-      - service: so_boot_highstate_service
+    - regex: '^startup_states: highstate$'
+    - unless: pgrep so-setup

 {% endif %}

-# this has to be outside the if statement above since there are <requisite>_in calls to this state.
-# uses watch (not listen) so the restart fires in-state and its result lands on this state's
-# running entry; that is what lets wait_for_salt_minion_ready below detect any restart
-# uniformly via onchanges, regardless of whether the trigger came from these files or from
-# external watch_in's (e.g. beacons, master/pyinotify).
+# this has to be outside the if statement above since there are <requisite>_in calls to this state
 salt_minion_service:
  service.running:
    - name: salt-minion
    - enable: True
    - onlyif: test "{{INSTALLEDSALTVERSION}}" == "{{SALTVERSION}}"
-    - watch:
+    - listen:
      - file: mine_functions
 {% if INSTALLEDSALTVERSION|string == SALTVERSION|string %}
      - file: set_log_levels
@@ -130,17 +103,3 @@ salt_minion_service:
      - file: signing_policy
 {% endif %}
    - order: last
-
-# block until the just-restarted salt-minion is back and can execute modules locally, so
-# follow-on jobs and the next highstate iteration do not race the restart. onchanges +
-# require on salt_minion_service catches every restart trigger uniformly because watch
-# mod_watch results replace the service state's running entry. wait logic lives in
-# /usr/sbin/so-salt-minion-wait (deployed by common_sbin from common/tools/sbin/).
-wait_for_salt_minion_ready:
-  cmd.run:
-    - name: /usr/sbin/so-salt-minion-wait
-    - onchanges:
-      - service: salt_minion_service
-    - require:
-      - service: salt_minion_service
-    - order: last
Author	SHA1	Message	Date
Mike Reeves	a433e9524d	Move onionconfig writes out of so-yaml	2026-05-12 16:05:55 -04:00
Mike Reeves	3d11694d51	make so-yaml PG-canonical and add pillar-change reactor stack Two coupled changes that together let so_pillar.* be the canonical config store, with config edits driving service reloads automatically: so-yaml PG-canonical mode - Adds /opt/so/conf/so-yaml/mode (and SO_YAML_BACKEND env override) with three values: dual (legacy), postgres (PG-only for managed paths), disk (emergency rollback). Bootstrap files (secrets.sls, ca/init.sls, *.nodes.sls, top.sls, ...) stay disk-only regardless via the existing SkipPath allowlist in so_yaml_postgres.locate. - loadYaml/writeYaml/purgeFile now route to so_pillar.* in postgres mode: replace/add/get all read+write the database with no disk file ever appearing. PG failure is fatal in postgres mode (no silent fallback); dual mode preserves the prior best-effort mirror. - so_yaml_postgres gains read_yaml(path), is_pg_managed(path), and is_enabled() so so-yaml can answer "is this path PG-managed and is PG up" without reaching into private helpers. - schema_pillar.sls writes /opt/so/conf/so-yaml/mode = postgres after the importer succeeds, so flipping postgres:so_pillar:enabled flips so-yaml's behavior in lockstep with the schema being live. pg_notify-driven change fan-out - 008_change_notify.sql adds so_pillar.change_queue + an AFTER trigger on pillar_entry that enqueues the locator and pg_notifies 'so_pillar_change'. Queue is drained at-least-once so engine restarts don't lose events; pg_notify is just the wakeup signal. - New salt-master engine pg_notify_pillar.py LISTENs on the channel, drains the queue with FOR UPDATE SKIP LOCKED, debounces bursts, and fires 'so/pillar/changed' events grouped by (scope, role, minion). - Reactor so_pillar_changed.sls catches the tag and dispatches to orch.so_pillar_reload, which carries a DISPATCH map of pillar-path prefix -> (state sls, role grain set) so adding a new service to the auto-reload list is a one-line edit instead of a new reactor. 
- Engine + reactor wiring is gated on the same postgres:so_pillar:enabled flag as the schema and ext_pillar config so the whole stack flips on/off together. Tests: 21 new cases (112 total, all passing) covering mode resolution, PG-managed detection, and PG-canonical read/write/purge routing with the PG client stubbed.	2026-05-01 09:31:48 -04:00
Mike Reeves	23255f88e0	add so-yaml dual-write to so_pillar.* + purge verb Hooks every so-yaml.py write through a new so_yaml_postgres helper that mirrors disk YAML mutations into so_pillar.pillar_entry via docker exec psql. Disk remains canonical during the transition; PG mirror failures are logged only when a real write error occurs (skipped paths and postgres-unreachable cases stay silent so existing callers don't see new noise on stderr). Adds a `purge YAML_FILE` verb on so-yaml that deletes the file from disk and removes the matching pillar_entry rows. For minion files it also drops the so_pillar.minion row, which CASCADEs to pillar_entry + role_member. Designed for so-minion's delete path (replaces rm -f) so the audit log captures the deletion. setup/so-functions::generate_passwords + secrets_pillar generate secrets:pillar_master_pass and /opt/so/conf/postgres/so_pillar.key on fresh installs, and append the password to existing secrets.sls files on upgrade. - salt/manager/tools/sbin/so_yaml_postgres.py: locate(), write_yaml(), purge_yaml(), and a small CLI for diagnostics. Skips bootstrap and mine-driven paths via the same allowlist used by so-pillar-import. - salt/manager/tools/sbin/so-yaml.py: import the helper, hook writeYaml() to mirror after every disk write, add purgeFile() and the purge verb. - salt/manager/tools/sbin/so-yaml_test.py: 16 new tests covering the purge verb and the path-locator / write contract of so_yaml_postgres without contacting Postgres. All 91 tests pass. - setup/so-functions: generate_passwords adds PILLARMASTERPASS and SO_PILLAR_KEY; secrets_pillar writes pillar_master_pass and the pgcrypto master key file.	2026-04-30 17:09:58 -04:00
Mike Reeves	d30b52b327	add so-pillar-import — seeds so_pillar.* from on-disk pillar tree Idempotent importer that schema_pillar.sls runs once at end of postgres state on first install, and that so-minion can call per-minion on add / delete. UPSERTs into so_pillar.pillar_entry; the audit trigger handles versioning so re-runs without SLS edits produce no version bumps. Connects via docker exec so-postgres psql, so no DSN config is required at first-install time. Skips bootstrap files (secrets.sls, postgres/auth.sls, etc.), mine-driven nodes.sls files, and any file containing Jinja templates — those stay disk-authoritative and ext_pillar_first: False means they render before the PG overlay. Auto-syncs to /usr/sbin via the existing manager_sbin file.recurse.	2026-04-30 16:34:05 -04:00
Mike Reeves	3fad895d6a	add so_pillar schema + ext_pillar wiring (postsalt foundation) Lays the database-backed pillar foundation for the postsalt branch. Salt continues to read on-disk SLS first; the new ext_pillar config overlays values from the so_pillar.* schema in so-postgres. - salt/postgres/files/schema/pillar/00{1..7}_*.sql: idempotent DDL for scope/role/role_member/minion/pillar_entry/pillar_entry_history/ drift_log, secret pgcrypto helpers, RLS, pg_cron retention. - salt/postgres/schema_pillar.sls: applies the SQL files inside the so-postgres container after it's healthy, configures the master_key GUC, and runs so-pillar-import once. Gated on postgres:so_pillar:enabled feature flag (default false). - salt/salt/master/ext_pillar_postgres.{sls,conf.jinja}: drops /etc/salt/master.d/ext_pillar_postgres.conf with list-form ext_pillar queries (global/role/minion/secrets) and ext_pillar_first: False so bootstrap pillars on disk render before the PG overlay. - salt/postgres/init.sls + salt/salt/master.sls: include the new states. Both new state branches are guarded so a default install with the flag off is a no-op.	2026-04-30 16:30:57 -04:00
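For readers following the 3d11694d51 message, the so-yaml mode resolution it describes (a mode file at /opt/so/conf/so-yaml/mode, overridable by the SO_YAML_BACKEND environment variable, with the values dual, postgres, and disk) could look roughly like the sketch below. This is not the code from so-yaml.py. The function name `resolve_mode`, the fallback to `dual` when nothing is configured, and the handling of unrecognized values are all assumptions made for illustration.

```python
import os
from pathlib import Path

# Values and locations named in the commit message.
VALID_MODES = ("dual", "postgres", "disk")
MODE_FILE = Path("/opt/so/conf/so-yaml/mode")
ENV_VAR = "SO_YAML_BACKEND"

# Assumption: fall back to the legacy behavior when nothing is configured.
DEFAULT_MODE = "dual"


def resolve_mode(env=None, mode_file=MODE_FILE):
    """Return the backend mode: env override first, then the mode file.

    Hypothetical helper. Unrecognized values are ignored rather than
    treated as errors, which is an assumption of this sketch.
    """
    env = os.environ if env is None else env

    # 1. The environment variable overrides the mode file.
    candidate = env.get(ENV_VAR, "").strip().lower()
    if candidate in VALID_MODES:
        return candidate

    # 2. Otherwise read the mode file, if it exists and is readable.
    try:
        candidate = Path(mode_file).read_text().strip().lower()
    except OSError:
        return DEFAULT_MODE

    # 3. Fall back to the default for empty or unknown content.
    return candidate if candidate in VALID_MODES else DEFAULT_MODE
```

Under these assumptions, setting `SO_YAML_BACKEND=disk` for a single invocation would take precedence over a mode file containing `postgres`, which matches the "emergency rollback" use the commit message gives for disk mode.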
@@ -1 +1 @@
 .2.0
 .1.0