Move per-minion telegraf cred provisioning into so-minion

Simpler, race-free replacement for the reactor + orch + fan-out chain.

- salt/manager/tools/sbin/so-minion: expand add_telegraf_to_minion to
  reuse any existing password from the aggregate pillar (or generate a
  random 72-char one if none exists), write postgres.telegraf.{user,pass}
  into the minion's own pillar file, and update the aggregate pillar so
  postgres.telegraf_users can CREATE ROLE on the next manager apply.
  Every create<ROLE> function already calls this hook, so add / addVM /
  setup dispatches are all covered identically and synchronously.
- salt/postgres/auth.sls: strip the fanout_targets loop and the
  postgres_telegraf_minion_pillar_<safe> cmd.run block; both are now
  redundant. The state still manages the so_postgres admin user and
  writes the aggregate pillar for postgres.telegraf_users to consume.
- salt/reactor/telegraf_user_sync.sls: deleted.
- salt/orch/telegraf_postgres_sync.sls: deleted.
- salt/salt/master.sls: drop the reactor_config_telegraf block that
  registered the reactor via /etc/salt/master.d/reactor_telegraf.conf.
- salt/orch/deploy_newnode.sls: drop the manager_fanout_postgres_telegraf
  step and the require: it added to the newnode highstate. Back to its
  original 3/dev shape.
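
The naming that add_telegraf_to_minion relies on can be sketched as below. This is a minimal illustration, not the so-minion code itself: the helper names minion_safe, pg_user_for and aggregate_key_for are hypothetical, and the minion id in the usage line is made up. Only the tr pipeline and the so_telegraf_ / telegraf_ prefixes are taken from the diff.

```shell
# Hypothetical helpers mirroring the MINION_SAFE / PG_USER derivation in the diff.

# Map '.' and '-' to '_' and lowercase the id, as the diff's tr pipeline does.
minion_safe() {
    printf '%s' "$1" | tr '.-' '__' | tr '[:upper:]' '[:lower:]'
}

# Postgres role name for a minion.
pg_user_for() {
    printf 'so_telegraf_%s' "$(minion_safe "$1")"
}

# Key path of the minion's entry in the aggregate pillar.
aggregate_key_for() {
    printf 'postgres.auth.users.telegraf_%s' "$(minion_safe "$1")"
}

# Example with a made-up minion id:
pg_user_for "Sensor-01.example.com_sensor"; echo
```

Because the mapping is deterministic, the per-minion pillar and the aggregate pillar always agree on the role name for a given minion id.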

No more ephemeral postgres_fanout_minion pillar, no more async salt/key
reactor, no more so-minion setupMinionFiles race: the pillar write
happens inline inside setupMinionFiles itself.
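
The reuse-or-generate step that now runs inline can be sketched as follows. It is an illustration under assumptions, not the committed code: resolve_password and generate_password are hypothetical names, the existing value is passed in as an argument rather than read with so-yaml.py, the character set is narrowed to alphanumerics, and the random bytes come from a bounded read instead of an open-ended /dev/urandom stream.

```shell
# Hypothetical sketch of "reuse the stored password, else generate one".

# Emit a 72-character alphanumeric password.
# A bounded 1024-byte read keeps every pipeline stage finite.
generate_password() {
    head -c 1024 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z0-9' | head -c 72
}

# $1 is the password already stored for this minion ("" if none).
resolve_password() {
    if [ -n "$1" ]; then
        printf '%s' "$1"
    else
        generate_password
    fi
}

stored=""                              # first add: nothing stored yet
first=$(resolve_password "$stored")
again=$(resolve_password "$first")     # re-add: stored value is reused
[ "$first" = "$again" ] && echo "credential unchanged on re-add"
```

Checking for an existing value before generating is what keeps the credential stable when so-minion is rerun for the same minion.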
Mike Reeves
2026-04-21 15:34:15 -04:00
parent 1abfd77351
commit 5f28e9b191
6 changed files with 31 additions and 124 deletions
+31
@@ -542,6 +542,37 @@ function add_telegraf_to_minion() {
        log "ERROR" "Failed to add telegraf configuration to $PILLARFILE"
        return 1
    fi
    # Provision the per-minion postgres Telegraf credential so telegraf.conf
    # renders correctly on the minion's first highstate and postgres.telegraf_users
    # picks up the matching aggregate entry on the next manager apply.
    #
    # Writes:
    #   - postgres.telegraf.{user,pass} into the minion's own pillar file
    #     (distributed to only this minion via pillar/top.sls).
    #   - postgres.auth.users.telegraf_<safe>.{user,pass} into the aggregate
    #     pillar so postgres.telegraf_users CREATE ROLE finds it.
    #
    # An existing password is reused if the aggregate already has one (re-add),
    # so rerunning so-minion for the same minion keeps the cred stable.
    local MINION_SAFE
    MINION_SAFE=$(echo "$MINION_ID" | tr '.-' '__' | tr '[:upper:]' '[:lower:]')
    local PG_USER="so_telegraf_${MINION_SAFE}"
    local AGGREGATE=/opt/so/saltstack/local/pillar/postgres/auth.sls
    local PG_PASS=""
    if [[ -f "$AGGREGATE" ]]; then
        PG_PASS=$(so-yaml.py get -r "$AGGREGATE" "postgres.auth.users.telegraf_${MINION_SAFE}.pass" 2>/dev/null || true)
    fi
    if [[ -z "$PG_PASS" ]]; then
        PG_PASS=$(tr -dc 'A-Za-z0-9~!@#^&*()_=+[]|;:,.<>?-' < /dev/urandom | head -c 72)
    fi
    so-yaml.py replace "$PILLARFILE" postgres.telegraf.user "$PG_USER" >/dev/null
    so-yaml.py replace "$PILLARFILE" postgres.telegraf.pass "$PG_PASS" >/dev/null
    if [[ -f "$AGGREGATE" ]]; then
        so-yaml.py replace "$AGGREGATE" "postgres.auth.users.telegraf_${MINION_SAFE}.user" "$PG_USER" >/dev/null
        so-yaml.py replace "$AGGREGATE" "postgres.auth.users.telegraf_${MINION_SAFE}.pass" "$PG_PASS" >/dev/null
    fi
}
function add_influxdb_to_minion() {
-17
@@ -12,21 +12,6 @@
attempts: 36
interval: 5
# so-minion's setupMinionFiles rebuilds the new minion's pillar file from
# scratch, wiping any postgres.telegraf.* entries the reactor may have written
# on salt-key accept. Re-fan the cred here so the highstate below sees it.
# Idempotent via the unless: guard in postgres.auth.
manager_fanout_postgres_telegraf_{{NEWNODE}}:
  salt.state:
    - tgt: {{ MANAGER }}
    - sls:
      - postgres.auth
    - queue: True
    - pillar:
        postgres_fanout_minion: {{ NEWNODE }}
    - require:
      - salt: {{NEWNODE}}_update_mine
# we need to prepare the manager for a new searchnode or heavynode
{% if NEWNODE.split('_')|last in ['searchnode', 'heavynode'] %}
manager_run_es_soc:
@@ -45,5 +30,3 @@ manager_run_es_soc:
    - tgt: {{ NEWNODE }}
    - highstate: True
    - queue: True
    - require:
      - salt: manager_fanout_postgres_telegraf_{{NEWNODE}}
-28
@@ -1,28 +0,0 @@
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.

# Fired by salt/reactor/telegraf_user_sync.sls when salt-key accepts a new
# minion. Only provisions the per-minion pillar entry and DB role on the
# manager; the minion itself will pick up its telegraf config on its first
# highstate during onboarding, so there's no need to push the telegraf state
# from here.
#
# Target the manager via role grains — same pattern as orch/delete_hypervisor.sls.
# The reactor doesn't know the manager's minion id, and grains.master on the
# runner is a hostname, not a targetable id.
{% set FANOUT_MINION = salt['pillar.get']('postgres_fanout_minion', '') %}

manager_sync_telegraf_pg_users:
  salt.state:
    - tgt: 'G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-standalone or G@role:so-eval'
    - tgt_type: compound
    - sls:
      - postgres.auth
      - postgres.telegraf_users
    - queue: True
{% if FANOUT_MINION %}
    - pillar:
        postgres_fanout_minion: {{ FANOUT_MINION }}
{% endif %}
-48
@@ -49,54 +49,6 @@ postgres_auth_pillar:
pass: "{{ entry.pass }}"
{% endfor %}
- show_changes: False
{# Fan a specific minion's telegraf cred out to its own pillar file.
   Two triggers populate the target list:
     - grains.id (always) so the manager's own pillar is populated on every
       postgres.auth run — otherwise the manager's telegraf has no cred on
       a fresh install and can't write to its own postgres.
     - pillar postgres_fanout_minion (when the reactor fires on a new
       minion's salt-key accept).
   The `unless` guard keeps re-runs idempotent, so this is one so-yaml.py
   check per target, not per minion in the grid. Bulk backfill for
   already-accepted minions lives in soup. #}
{% set fanout_targets = [] %}
{% if grains.id %}
{%-   do fanout_targets.append(grains.id) %}
{% endif %}
{% set fanout_mid = salt['pillar.get']('postgres_fanout_minion') %}
{% if fanout_mid and fanout_mid not in fanout_targets %}
{%-   do fanout_targets.append(fanout_mid) %}
{% endif %}
{% for mid in fanout_targets %}
{%-   set safe = mid | replace('.','_') | replace('-','_') | lower %}
{%-   set key = 'telegraf_' ~ safe %}
{%-   set entry = telegraf_users.get(key) %}
{%-   if entry %}
postgres_telegraf_minion_pillar_{{ safe }}:
  cmd.run:
    - name: |
        set -e
        PILLAR_FILE=/opt/so/saltstack/local/pillar/minions/{{ mid }}.sls
        if [ ! -f "$PILLAR_FILE" ]; then
          echo '{}' > "$PILLAR_FILE"
          chown socore:socore "$PILLAR_FILE" 2>/dev/null || true
          chmod 640 "$PILLAR_FILE"
        fi
        /usr/sbin/so-yaml.py replace "$PILLAR_FILE" postgres.telegraf.user "$PG_USER"
        /usr/sbin/so-yaml.py replace "$PILLAR_FILE" postgres.telegraf.pass "$PG_PASS"
    - env:
      - PG_USER: '{{ entry.user }}'
      - PG_PASS: '{{ entry.pass }}'
    - unless: |
        [ "$(/usr/sbin/so-yaml.py get -r /opt/so/saltstack/local/pillar/minions/{{ mid }}.sls postgres.telegraf.user 2>/dev/null)" = '{{ entry.user }}' ]
    - require:
      - file: postgres_auth_pillar
{%-   endif %}
{% endfor %}
{% else %}
{{sls}}_state_not_allowed:
-18
@@ -1,18 +0,0 @@
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.

{# Fires on salt/key. Only act on successful key acceptance — not reauth. #}
{% if data.get('act') == 'accept' and data.get('result') == True and data.get('id') %}

{{ data['id'] }}_telegraf_pg_sync:
  runner.state.orchestrate:
    - args:
      - mods: orch.telegraf_postgres_sync
      - pillar:
          postgres_fanout_minion: {{ data['id'] }}

{% do salt.log.info('telegraf_user_sync reactor: syncing telegraf PG user for minion %s' % data['id']) %}
{% endif %}
-13
@@ -62,19 +62,6 @@ engines_config:
    - name: /etc/salt/master.d/engines.conf
    - source: salt://salt/files/engines.conf

reactor_config_telegraf:
  file.managed:
    - name: /etc/salt/master.d/reactor_telegraf.conf
    - contents: |
        reactor:
          - 'salt/key':
            - /opt/so/saltstack/default/salt/reactor/telegraf_user_sync.sls
    - user: root
    - group: root
    - mode: 644
    - watch_in:
      - service: salt_master_service

# update the bootstrap script when used for salt-cloud
salt_bootstrap_cloud:
  file.managed: