Two coupled changes that together let so_pillar.* be the canonical
config store, with config edits driving service reloads automatically:
so-yaml PG-canonical mode
- Adds /opt/so/conf/so-yaml/mode (and SO_YAML_BACKEND env override) with
three values: dual (legacy), postgres (PG-only for managed paths),
disk (emergency rollback). Bootstrap files (secrets.sls, ca/init.sls,
*.nodes.sls, top.sls, ...) stay disk-only regardless via the existing
SkipPath allowlist in so_yaml_postgres.locate.
- loadYaml/writeYaml/purgeFile now route to so_pillar.* in postgres
mode: replace/add/get all read+write the database with no disk file
ever appearing. PG failure is fatal in postgres mode (no silent
fallback); dual mode preserves the prior best-effort mirror.
- so_yaml_postgres gains read_yaml(path), is_pg_managed(path), and
is_enabled() so so-yaml can answer "is this path PG-managed and is
PG up" without reaching into private helpers.
- schema_pillar.sls writes /opt/so/conf/so-yaml/mode = postgres after
the importer succeeds, so flipping postgres:so_pillar:enabled flips
so-yaml's behavior in lockstep with the schema being live.
pg_notify-driven change fan-out
- 008_change_notify.sql adds so_pillar.change_queue + an AFTER trigger
on pillar_entry that enqueues the locator and pg_notifies
'so_pillar_change'. Queue is drained at-least-once so engine restarts
don't lose events; pg_notify is just the wakeup signal.
- New salt-master engine pg_notify_pillar.py LISTENs on the channel,
drains the queue with FOR UPDATE SKIP LOCKED, debounces bursts, and
fires 'so/pillar/changed' events grouped by (scope, role, minion).
- Reactor so_pillar_changed.sls catches the tag and dispatches to
orch.so_pillar_reload, which carries a DISPATCH map of pillar-path
prefix -> (state sls, role grain set) so adding a new service to
the auto-reload list is a one-line edit instead of a new reactor.
- Engine + reactor wiring is gated on the same postgres:so_pillar:enabled
flag as the schema and ext_pillar config so the whole stack flips
on/off together.
Tests: 21 new cases (112 total, all passing) covering mode resolution,
PG-managed detection, and PG-canonical read/write/purge routing with
the PG client stubbed.
Hooks every so-yaml.py write through a new so_yaml_postgres helper that
mirrors disk YAML mutations into so_pillar.pillar_entry via docker exec
psql. Disk remains canonical during the transition; PG mirror failures
are logged only when a real write error occurs (skipped paths and
postgres-unreachable cases stay silent so existing callers don't see
new noise on stderr).
Adds a `purge YAML_FILE` verb on so-yaml that deletes the file from
disk and removes the matching pillar_entry rows. For minion files it
also drops the so_pillar.minion row, which CASCADEs to pillar_entry +
role_member. Designed for so-minion's delete path (replaces rm -f) so
the audit log captures the deletion.
setup/so-functions::generate_passwords + secrets_pillar generate
secrets:pillar_master_pass and /opt/so/conf/postgres/so_pillar.key on
fresh installs, and append the password to existing secrets.sls files
on upgrade.
- salt/manager/tools/sbin/so_yaml_postgres.py: locate(), write_yaml(),
purge_yaml(), and a small CLI for diagnostics. Skips bootstrap and
mine-driven paths via the same allowlist used by so-pillar-import.
- salt/manager/tools/sbin/so-yaml.py: import the helper, hook
writeYaml() to mirror after every disk write, add purgeFile() and
the purge verb.
- salt/manager/tools/sbin/so-yaml_test.py: 16 new tests covering the
purge verb and the path-locator / write contract of so_yaml_postgres
without contacting Postgres. All 91 tests pass.
- setup/so-functions: generate_passwords adds PILLARMASTERPASS and
SO_PILLAR_KEY; secrets_pillar writes pillar_master_pass and the
pgcrypto master key file.
Idempotent importer that schema_pillar.sls runs once at end of postgres
state on first install, and that so-minion can call per-minion on add /
delete. UPSERTs into so_pillar.pillar_entry; the audit trigger handles
versioning so re-runs without SLS edits produce no version bumps.
Connects via docker exec so-postgres psql, so no DSN config is required
at first-install time. Skips bootstrap files (secrets.sls, postgres/
auth.sls, etc.), mine-driven nodes.sls files, and any file containing
Jinja templates — those stay disk-authoritative and ext_pillar_first:
False means they render before the PG overlay.
Auto-syncs to /usr/sbin via the existing manager_sbin file.recurse.
Lays the database-backed pillar foundation for the postsalt branch. Salt
continues to read on-disk SLS first; the new ext_pillar config overlays
values from the so_pillar.* schema in so-postgres.
- salt/postgres/files/schema/pillar/00{1..7}_*.sql: idempotent DDL for
scope/role/role_member/minion/pillar_entry/pillar_entry_history/
drift_log, secret pgcrypto helpers, RLS, pg_cron retention.
- salt/postgres/schema_pillar.sls: applies the SQL files inside the
so-postgres container after it's healthy, configures the master_key
GUC, and runs so-pillar-import once. Gated on
postgres:so_pillar:enabled feature flag (default false).
- salt/salt/master/ext_pillar_postgres.{sls,conf.jinja}: drops
/etc/salt/master.d/ext_pillar_postgres.conf with list-form ext_pillar
queries (global/role/minion/secrets) and ext_pillar_first: False so
bootstrap pillars on disk render before the PG overlay.
- salt/postgres/init.sls + salt/salt/master.sls: include the new states.
Both new state branches are guarded so a default install with the flag
off is a no-op.
The manager's /etc/salt/minion (written by so-functions:configure_minion)
has no file_roots, so salt-call --local falls back to Salt's default
/srv/salt and fails with "No matching sls found for 'postgres.telegraf_users'
in env 'base'". || true was silently swallowing the error, which meant the
DB roles for the pillar entries just populated by the so-telegraf-cred
backfill loop never actually got created.
Route through salt-master instead; its file_roots already points at the
default/local salt trees.
pillar/top.sls now references postgres.soc_postgres / postgres.adv_postgres
unconditionally, but make_some_dirs only runs at install time so managers
upgrading from 3.0.0 have no local/pillar/postgres/ and salt-master fails
pillar render on the first post-upgrade restart. Similarly, secrets_pillar
is a no-op on upgrade (secrets.sls already exists), so secrets:postgres_pass
never gets seeded and the postgres container's POSTGRES_PASSWORD_FILE and
SOC's PG_ADMIN_PASS would land empty after highstate.
Add ensure_postgres_local_pillar and ensure_postgres_secret to up_to_3.1.0
so the stubs and secret exist before masterlock/salt-master restart. Both
are idempotent and safe to re-run.
Exercises the FileNotFoundError and generic-exception branches added to
loadYaml in the previous commit, restoring 100% coverage required by
the build.
so-telegraf-cred was committed with mode 644, causing
`so-telegraf-cred add "$MINION_ID"` in so-minion's add_telegraf_to_minion
to fail with "Permission denied" and log "Failed to provision postgres
telegraf cred for <minion>". Mark it executable.
Also bail early in seed_creds_file if mkdir/printf/chmod fail, and in
so-yaml.py loadYaml surface a clear stderr message with the filename
instead of an unhandled FileNotFoundError traceback.
Swap the ~150-line Python implementation for a 48-line bash script that
delegates YAML mutation to so-yaml.py — the same helper so-minion and
soup already use. Same semantics: seed the creds pillar on first use,
idempotent add, silent remove.
SO minion ids are dot-free by construction (setup/so-functions:1884
strips everything after the first '.'), so using the raw id as the
so-yaml.py key path is safe.
The old flow had two writers for each per-minion Telegraf password
(so-minion wrote the minion pillar; postgres.auth regenerated any
missing aggregate entries). They drifted on first-boot and there was
no trigger to create DB roles when a new minion joined.
Split responsibilities:
- pillar/postgres/auth.sls (manager-scoped) keeps only the so_postgres
admin cred.
- pillar/telegraf/creds.sls (grid-wide) holds a {minion_id: {user,
pass}} map, shadowed per-install by the local-pillar copy.
- salt/manager/tools/sbin/so-telegraf-cred is the single writer:
flock, atomic YAML write, PyYAML safe_dump so passwords never
round-trip through so-yaml.py's type coercion. Idempotent add, quiet
remove.
- so-minion's add/remove hooks now shell out to so-telegraf-cred
instead of editing pillar files directly.
- postgres.telegraf_users iterates the new pillar key and CREATE/ALTERs
roles from it; telegraf.conf reads its own entry via grains.id.
- orch.deploy_newnode runs postgres.telegraf_users on the manager and
refreshes the new minion's pillar before the new node highstates,
so the DB role is in place the first time telegraf tries to connect.
- soup's post_to_3.1.0 backfills the creds pillar from accepted salt
keys (idempotent) and runs postgres.telegraf_users once to reconcile
the DB.
The reactor path is gone; so-minion now owns add/delete for new
minions. The backfill itself is unchanged — postgres.auth's up_minions
fallback fills the aggregate, postgres.telegraf_users creates the
roles, and the bash loop fans to per-minion pillar files — so the
pre-feature upgrade story still works end-to-end. Just refresh the
comment so it isn't misleading.
Paired with the add path in add_telegraf_to_minion: when a minion is
removed, drop its entry from the aggregate postgres pillar and drop the
matching so_telegraf_<safe> role from the database. Without this, stale
entries and DB roles accumulate over time.
Makes rotate-password and compromise-recovery both a clean delete+add:
so-minion -o=delete -m=<id>
so-minion -o=add -m=<id>
The first call drops the role and clears the aggregate pillar; the
second generates a brand-new password.
The cleanup is best-effort — if so-postgres isn't running or the DROP
ROLE fails (e.g., the role owns unexpected objects), we log a warning
and continue so the minion delete itself never gets blocked by postgres
state. Admins can mop up stray roles manually if that happens.
Simpler, race-free replacement for the reactor + orch + fan-out chain.
- salt/manager/tools/sbin/so-minion: expand add_telegraf_to_minion to
generate a random 72-char password, reuse any existing password from
the aggregate pillar, write postgres.telegraf.{user,pass} into the
minion's own pillar file, and update the aggregate pillar so
postgres.telegraf_users can CREATE ROLE on the next manager apply.
Every create<ROLE> function already calls this hook, so add / addVM /
setup dispatches are all covered identically and synchronously.
- salt/postgres/auth.sls: strip the fanout_targets loop and the
postgres_telegraf_minion_pillar_<safe> cmd.run block — it's now
redundant. The state still manages the so_postgres admin user and
writes the aggregate pillar for postgres.telegraf_users to consume.
- salt/reactor/telegraf_user_sync.sls: deleted.
- salt/orch/telegraf_postgres_sync.sls: deleted.
- salt/salt/master.sls: drop the reactor_config_telegraf block that
registered the reactor on /etc/salt/master.d/reactor_telegraf.conf.
- salt/orch/deploy_newnode.sls: drop the manager_fanout_postgres_telegraf
step and the require: it added to the newnode highstate. Back to its
original 3/dev shape.
No more ephemeral postgres_fanout_minion pillar, no more async salt/key
reactor, no more so-minion setupMinionFiles race: the pillar write
happens inline inside setupMinionFiles itself.
Two fixes on the postgres telegraf fan-out path:
1. postgres.auth cmd.run leaked the password to the console because
Salt always prints the Name: field and `show_changes: False` does
not apply to cmd.run. Move the user and password into the `env:`
attribute so the shell body still sees them via $PG_USER / $PG_PASS
but Salt's state reporter never renders them.
2. so-minion's addMinion -> setupMinionFiles sequence removes the
minion pillar file and rewrites it from scratch, which wipes the
postgres.telegraf.* entries the reactor may have already written on
salt-key accept. Add a postgres.auth fan-out step to
orch.deploy_newnode (the orch so-minion kicks off after
setupMinionFiles) and require it from the new minion's highstate.
Idempotent via the existing unless: guard in postgres.auth.
replace calls removeKey before addKey, so running `so-yaml.py replace`
on a new dotted key whose parent doesn't exist — e.g., postgres.auth
fanning postgres.telegraf.user into a minion pillar file that has
never carried any postgres.* keys — crashed with
KeyError: 'postgres'
from removeKey recursing into a missing parent dict.
Make removeKey a no-op when an intermediate key is absent so that:
- `remove` has the natural "remove if exists" semantics, and
- `replace` works for brand-new nested keys.
The empty-pillar case produced a telegraf.conf with `user= password=`
which libpq misparses ("password=" gets consumed as the user value),
yielding `password authentication failed for user "password="` on
every manager without a prior fan-out (fresh install, not the salt-key
path the reactor handles).
Two fixes:
- salt/postgres/auth.sls: always fan for grains.id in addition to any
postgres_fanout_minion from the reactor, so the manager's own pillar
is populated on every postgres.auth run. The existing `unless` guard
keeps re-runs idempotent.
- salt/telegraf/etc/telegraf.conf: gate the [[outputs.postgresql]]
block on PG_USER and PG_PASS being non-empty. If a minion hasn't
received its pillar yet the output block simply isn't rendered — the
next highstate picks up the creds once the fan-out completes, and in
the meantime telegraf keeps running the other outputs instead of
erroring with a malformed connection string.
postgres.auth was running an `unless` shell check per up-minion on every
manager highstate, even when nothing had changed — N fork+python starts
of so-yaml.py add up on large grids. The work is only needed when a
specific minion's key is accepted.
- salt/postgres/auth.sls: fan out only when postgres_fanout_minion
pillar is set (targets that single minion). Manager highstates with
no pillar take a zero-N code path.
- salt/reactor/telegraf_user_sync.sls: re-pass the accepted minion id
as postgres_fanout_minion to the orch.
- salt/orch/telegraf_postgres_sync.sls: forward the pillar to the
salt.state invocation so the state render sees it.
- salt/manager/tools/sbin/soup: for the one-time 3.1.0 backfill, drop
the per-minion state.apply and do an in-shell loop over the minion
pillar files using so-yaml.py directly. Skips minions that already
have postgres.telegraf.user set.
Every postgres.auth run was rewriting every minion pillar file via
two so-yaml.py replace calls, even when nothing had changed. Passwords
are only generated on first encounter (see the `if key not in
telegraf_users` guard) and never rotate, so re-writing the same values
on every apply is wasted work and noisy state output.
Add an `unless:` check that compares the already-written
postgres.telegraf.user to the one we'd set. If they match, skip the
fan-out entirely. On first apply for a new minion the key isn't there,
so the replace runs; on subsequent applies it's a no-op.
pillar/top.sls only distributes postgres.auth to manager-class roles,
so sensors / heavynodes / searchnodes / receivers / fleet / idh /
hypervisor / desktop minions never received the postgres telegraf
password they need to write metrics. Broadcasting the aggregate
postgres.auth pillar to every role would leak the so_postgres admin
password and every other minion's cred.
Fan out per-minion credentials into each minion's own pillar file at
/opt/so/saltstack/local/pillar/minions/<id>.sls. That file is already
distributed by pillar/top.sls exclusively to the matching minion via
`- minions.{{ grains.id }}`, so each minion sees only its own
postgres.telegraf.{user,pass} and nothing else.
- salt/postgres/auth.sls: after writing the manager-scoped aggregate
pillar, fan the per-minion creds out via so-yaml.py replace for every
up-minion. Creates the minion pillar file if missing. Requires
postgres_auth_pillar so the manager pillar lands first.
- salt/telegraf/etc/telegraf.conf: consume postgres:telegraf:user and
postgres:telegraf:pass directly from the minion's own pillar instead
of walking postgres:auth:users which isn't visible off the manager.
The so-postgres-backup script and its cron were living under
salt/backup/config_backup.sls, which meant the backup script and cron
were deployed independently of whether postgres was enabled/disabled.
- Relocate salt/backup/tools/sbin/so-postgres-backup to
salt/postgres/tools/sbin/so-postgres-backup so the existing
postgres_sbin file.recurse in postgres/config.sls picks it up with
everything else — no separate file.managed needed.
- Remove postgres_backup_script and so_postgres_backup from
salt/backup/config_backup.sls.
- Add cron.present for so_postgres_backup to salt/postgres/enabled.sls
and the matching cron.absent to salt/postgres/disabled.sls so the
cron follows the container's lifecycle.
- config.sls: postgresconfdir creates /opt/so/conf/postgres, so the
two subdirectories under it (postgressecretsdir, postgresinitdir)
don't need their own makedirs — require the parent instead.
- soc_postgres.yaml: helpLink for every annotated key now points to
'postgres' instead of the carried-over 'influxdb' slug.
The previous MANAGER resolution used pillar.get('setup:manager') with a
fallback to grains.get('master'). Neither works from the reactor:
setup:manager is only populated by the setup workflow (not by reactor
runs), and grains.master returns the minion's master-hostname setting,
not a targetable minion id.
Match the pattern used by orch/delete_hypervisor.sls: compound-target
whichever minion is the manager via role grain.
postgres_wait_ready requires docker_container: so-postgres, which is
declared in postgres.enabled. Running postgres.telegraf_users on its own
— as the reactor orch and the soup post-upgrade step both do — errored
because Salt couldn't resolve the require.
Include postgres.enabled from postgres.telegraf_users so the container
state is always in the render. postgres.enabled already includes
telegraf_users; Salt de-duplicates the circular include and the included
states are all idempotent, so repeated application is a no-op.
New minions run highstate as part of onboarding, which already applies
the telegraf state with the fresh pillar entry we just wrote. Pushing
telegraf a second time from the reactor is redundant.
- Remove the MINION-scoped salt.state block from the orch; keep only
the manager-side postgres.auth + postgres.telegraf_users provisioning.
- Stop passing minion_id as pillar in the reactor; the orch doesn't
reference it anymore.
salt/auth fires on every minion authentication — including every minion
restart and every master restart — so the reactor was re-running the
postgres.auth + postgres.telegraf_users + telegraf orchestration for
every already-accepted minion on every reconnect. The underlying states
are idempotent, so this was wasted work and log noise, not a correctness
issue.
Switch the subscription to salt/key, which fires only when the master
actually changes a key's state (accept / reject / delete). Match the
pattern used by salt/reactor/check_hypervisor.sls (registered in
salt/salt/cloud/reactor_config_hypervisor.sls) and add the result==True
guard so half-failed key operations don't trigger the orchestration.
Add Configuration-UI annotations for every postgres pillar key defined
in defaults.yaml, not just telegraf.retention_days:
- postgres.enabled — readonly; admin-visible but toggled via state
- postgres.telegraf.retention_days — drop advanced so user-tunable knobs
surface in the default view
- postgres.config.max_connections, shared_buffers, log_min_messages —
user-tunable performance/verbosity knobs, not advanced
- postgres.config.listen_addresses, port, ssl, ssl_cert_file, ssl_key_file,
ssl_ca_file, hba_file, log_destination, logging_collector,
shared_preload_libraries, cron.database_name — infra/Salt-managed,
marked advanced so they're visible but out of the way
No defaults.yaml change; value-side stays the same.
- firewall/map.jinja and postgres/telegraf_users.sls now pull the
telegraf output selector through TELEGRAFMERGED so the defaults.yaml
value (BOTH) is the source of truth and pillar overrides merge in
cleanly. pillar.get with a hardcoded fallback was brittle and would
disagree with defaults.yaml if the two ever diverged.
- Rename salt/postgres/files/pg_hba.conf.jinja to pg_hba.conf and drop
template: jinja from config.sls — the file has no jinja besides the
comment header.
The Telegraf backend selector lived at global.telegraf_output but it is
a Telegraf-scoped setting, not a cross-cutting grid global. Move both
the value and the UI annotation under the telegraf pillar so it shows
up alongside the other Telegraf tuning knobs in the Configuration UI.
- salt/telegraf/defaults.yaml: add telegraf.output: BOTH
- salt/telegraf/soc_telegraf.yaml: add telegraf.output annotation
- salt/global/defaults.yaml: remove global.telegraf_output
- salt/global/soc_global.yaml: remove global.telegraf_output annotation
- salt/vars/globals.map.jinja: drop telegraf_output from GLOBALS
- salt/firewall/map.jinja: read via pillar.get('telegraf:output')
- salt/postgres/telegraf_users.sls: read via pillar.get('telegraf:output')
- salt/telegraf/etc/telegraf.conf: read via TELEGRAFMERGED.output
- salt/postgres/tools/sbin/so-stats-show: update user-facing docs
No behavioral change — default stays BOTH.
state.apply takes a single mods argument; space-separated names are not
a list, so `state.apply postgres.auth postgres.telegraf_users` was only
applying postgres.auth and silently dropping the telegraf_users state.
Use comma-separated mods and add queue=True to match the rest of soup.
feature/postgres had rewritten the 3.1.0 upgrade block, dropping the
elastic upgrade work 3/dev landed for 9.0.8→9.3.3: elasticsearch_backup_index_templates,
the component template state cleanup, and the /usr/sbin/so-kibana-space-defaults
post-upgrade call. It also carried an older ES upgrade mapping
(8.18.8→9.0.8) that was superseded on 3/dev (9.0.8→9.3.3 for
3.0.0-20260331), and a handful of latent shell-quoting regressions in
verify_es_version_compatibility and the intermediate-upgrade helpers.
Adopt the 3/dev soup verbatim and only add the new Telegraf Postgres
provisioning to post_to_3.1.0 on top of so-kibana-space-defaults.
- Deliver postgres super and app passwords via mounted 0600 secret files
(POSTGRES_PASSWORD_FILE, SO_POSTGRES_PASS_FILE) instead of plaintext env
vars visible in docker inspect output
- Mount a managed pg_hba.conf that only allows local trust and hostssl
scram-sha-256 so TCP clients cannot negotiate cleartext sessions
- Restrict postgres.key to 0400 and ensure owner/group 939
- Set umask 0077 on so-postgres-backup output
- Validate host values in so-stats-show against [A-Za-z0-9._-] before SQL
interpolation so a compromised minion cannot inject SQL via a tag value
- Coerce postgres:telegraf:retention_days to int before rendering into SQL
- Escape single quotes when rendering pillar values into postgresql.conf
- Own postgres tooling in /usr/sbin as root:root so a container escape
cannot rewrite admin scripts
- Gate ES migration TLS verification on esVerifyCert (default false,
matching the elastic module's existing pattern)
Per-minion telegraf roles inherit CONNECT via PUBLIC by default and
could open sessions to the SOC database (though they have no readable
grants inside). Close the soft edge by revoking PUBLIC's CONNECT and
re-granting it to so_postgres only.
feature/postgres never shipped the original cron.present, so this
cleanup state is a no-op on every fresh install. The script itself
stays on disk for emergency use.
docker-entrypoint.sh runs the init-scripts phase with listen_addresses=''
(Unix socket only). The old pg_isready check passed there and then raced
the docker_temp_server_stop shutdown before the final postgres started.
pg_isready -h 127.0.0.1 only returns success once the real CMD binds
TCP, so downstream psql execs never land during the shutdown window.
Use CREATE TABLE IF NOT EXISTS and a WHERE-guarded create_parent() so
a Telegraf restart can re-run the templates safely after manual DB
surgery. Add an explicit tag_table_create_templates mirroring the
plugin default with IF NOT EXISTS for the same reason.
pg_partman 5.x's create_partition() creates a per-parent template
table inside the partman schema at runtime, which requires CREATE on
that schema. Also extend ALTER DEFAULT PRIVILEGES so the runtime-
created template tables are accessible to so_telegraf.
pg_partman 5.x splits p_parent_table on '.' and looks up the parts as
raw identifiers, so the literal must be 'schema.name' rather than the
double-quoted form quoteLiteral emits for .table.
Telegraf calls partman.create_parent() on first write of each metric,
which needs USAGE on the partman schema, EXECUTE on its functions and
procedures, and DML on partman.part_config.
- Telegraf's outputs.postgresql plugin uses Go text/template syntax,
not uppercase tokens. The {TABLE}/{COLUMNS}/{TABLELITERAL} strings
were passed through to Postgres literally, producing syntax errors
on every metric's first write. Switch to {{ .table }}, {{ .columns }},
and {{ .table|quoteLiteral }} so partitioned parents and the partman
create_parent() call succeed.
- Replace the \gexec "CREATE DATABASE ... WHERE NOT EXISTS" idiom in
both init-users.sh and telegraf_users.sls with an explicit shell
conditional. The prior idiom occasionally fired CREATE DATABASE even
when so_telegraf already existed, producing duplicate-key failures.
- Telegraf's partman template passed p_type:='native', which pg_partman
5.x (the version shipped by postgresql-17-partman on Debian) rejects.
Switched to 'range' so partman.create_parent() actually creates
partitions and Telegraf's INSERTs succeed.
- Added a postgres_wait_ready gate in telegraf_users.sls so psql execs
don't race the init-time restart that docker-entrypoint.sh performs.
- so-verify now ignores the literal "-v ON_ERROR_STOP=1" token in the
setup log. Dropped the matching entry from so-log-check, which scans
container stdout where that token never appears.
init-users.sh only runs on a fresh data dir, so upgrades onto an
existing /nsm/postgres volume never got so_telegraf. Pinning partman's
schema also makes partman.part_config reliably resolvable.
Every telegraf.* metric table is now a daily time-range partitioned
parent managed by pg_partman. Retention drops old partitions instead
of the row-by-row DELETE that so-telegraf-trim used to run nightly,
and dashboards will benefit from partition pruning at query time.
- Load pg_cron at server start via shared_preload_libraries and point
cron.database_name at so_telegraf so job metadata lives alongside
the metrics
- Telegraf create_templates override makes every new metric table a
PARTITION BY RANGE (time) parent registered with partman.create_parent
in one transaction (1 day interval, 3 premade)
- postgres_telegraf_group_role now also creates pg_partman and pg_cron
extensions and schedules hourly partman.run_maintenance_proc
- New retention reconcile state updates partman.part_config.retention
from postgres.telegraf.retention_days on every apply
- so_telegraf_trim cron is now unconditionally absent; script stays on
disk as a manual fallback
High-cardinality inputs (docker, procstat, kafka) trigger ALTER TABLE
ADD COLUMN on every new field name, and with all minions writing into
a shared 'telegraf' schema the metric tables hit Postgres's 1600-column
per-table ceiling quickly. Setting fields_as_jsonb and tags_as_jsonb on
the postgresql output keeps metric tables fixed at (time, tag_id,
fields jsonb) and tag tables at (tag_id, tags jsonb).
- so-stats-show rewritten to use JSONB accessors
((fields->>'x')::numeric, tags->>'host', etc.) and cast memory/disk
sizes to bigint so pg_size_pretty works
- Drop regex/regexFailureMessage from telegraf_output SOC UI entry to
match the convention upstream used when removing them from
mdengine/pcapengine/pipeline; options: list drives validation
Per-minion schemas cause table count to explode (N minions * M metrics)
and the per-minion revocation story isn't worth it when retention is
short. Move all minions to a shared 'telegraf' schema while keeping
per-minion login credentials for audit.
- New so_telegraf NOLOGIN group role owns the telegraf schema; each
per-minion role is a member and inherits insert/select via role
inheritance
- Telegraf connection string uses options='-c role=so_telegraf' so
tables auto-created on first write belong to the group role
- so-telegraf-trim walks the flat telegraf.* table set instead of
per-minion schemas
- so-stats-show filters by host tag; CLI arg is now the hostname as
tagged by Telegraf rather than a sanitized schema suffix
- Also renames so-show-stats -> so-stats-show
The psql invocation flag '-v ON_ERROR_STOP=1' used by the so-postgres
init script gets flagged by so-log-check because the token 'ERROR'
matches its error regex. Add to the exclusion list.
Telegraf's postgresql output stores tag values either as individual
columns on <metric>_tag or as a single JSONB 'tags' column, depending
on plugin version. Introspect information_schema.columns and build the
right accessor per tag instead of assuming one layout.
Introduces global.telegraf_output (INFLUXDB|POSTGRES|BOTH, default BOTH)
so Telegraf can write metrics to Postgres alongside or instead of
InfluxDB. Each minion authenticates with its own so_telegraf_<minion>
role and writes to a matching schema inside a shared so_telegraf
database, keeping blast radius per-credential to that minion's data.
- Per-minion credentials auto-generated and persisted in postgres/auth.sls
- postgres/telegraf_users.sls reconciles roles/schemas on every apply
- Firewall opens 5432 only to minion hostgroups when Postgres output is active
- Reactor on salt/auth + orch/telegraf_postgres_sync.sls provision new
minions automatically on key accept
- soup post_to_3.1.0 backfills users for existing minions on upgrade
- so-show-stats prints latest CPU/mem/disk/load per minion for sanity checks
- so-telegraf-trim + nightly cron prune rows older than
postgres.telegraf.retention_days (default 14)
Postgres module now queries Elasticsearch directly via HTTP
for the chat migration (bypasses RBAC that needs user context).
Pass esHostUrl, esUsername, esPassword alongside postgres creds.
Injects the postgres superuser password from secrets pillar so
SOC can run schema migrations as admin before switching to the
app user for normal operations.
Use format() with %L for SQL literal escaping instead of raw
string interpolation. Also ALTER ROLE if user already exists
to keep password in sync with pillar.
Removed postgres from soc/defaults.yaml (shared by all nodes)
and moved it entirely into defaults.map.jinja, which only injects
the config when postgres auth pillar exists (manager-type nodes).
Sensors and other non-manager nodes will not have a postgres module
section in their sensoroni.json, so sensoroni won't try to connect.
- Create vars/postgres.map.jinja for postgres auth globals
- Add POSTGRES_GLOBALS to all manager-type role vars
(manager, eval, standalone, managersearch, import)
- Add postgres module config to soc/defaults.yaml
- Inject so_postgres credentials from auth pillar into
soc/defaults.map.jinja (conditional on auth pillar existing)
Phase 1 of the PostgreSQL central data platform:
- Salt states: init, enabled, disabled, config, ssl, auth, sostatus
- TLS via SO CA-signed certs with postgresql.conf template
- Two-tier auth: postgres superuser + so_postgres application user
- Firewall restricts port 5432 to manager-only (HA-ready)
- Wired into top.sls, pillar/top.sls, allowed_states, firewall
containers map, docker defaults, CA signing policies, and setup
scripts for all manager-type roles
Add ulimits as a configurable advanced setting for every container,
allowing customization through the web UI. Move hardcoded ulimits
from elasticsearch and zeek into defaults.yaml and fix elasticsearch
ulimits that were incorrectly nested under the environment key.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Create /opt/so/conf/zeek/zkg directory and sync custom packages
from the manager via file.recurse. Bind mount the directory into
the so-zeek container so the entrypoint can install packages on
startup.
JA4 (BSD licensed) remains always enabled, but JA4+ variants (JA4S,
JA4D, JA4H, JA4L, JA4SSH, JA4T, JA4TS, JA4X) require a FoxIO license
and are now toggleable via the SOC UI. The toggle includes a license
agreement warning and defaults to disabled.
Full rebuild of all analyzer source-packages via pip download targeting
cp314/manylinux_2_17_x86_64 to match the so-soc Dockerfile base image
(python:3.14.3-slim).
Replaces cp313 wheels with cp314 for pyyaml and charset_normalizer,
and picks up certifi 2026.2.25 (from 2026.1.4).
The so-soc Dockerfile base image moved to python:3.14.3-slim but
analyzer source-packages still contained cp313 wheels for pyyaml and
charset_normalizer, causing pip install failures at container startup.
Replace all cp313 wheels with cp314 builds (pyyaml 6.0.3,
charset_normalizer 3.4.6) across all 14 analyzers and update the
CI python-test workflow to match.
Simplifies salt states, map files, and modules to only support
Oracle Linux 9, removing all Debian/Ubuntu/CentOS/Rocky/AlmaLinux/RHEL
conditional branches.
Security Onion now exclusively supports Oracle Linux 9. This removes
detection, setup, and update logic for Ubuntu, Debian, CentOS, Rocky,
AlmaLinux, and RHEL.
Consolidate version checks to use regex patterns for 2.4.21X and 3.x
versions. Add migrate_pcap_to_suricata to move pcap.enabled to
suricata.pcap.enabled in minion and pcap pillar files during upgrade.
Instead of exiting and requiring the user to rerun the script after
changing pcapengine to SURICATA, let the script continue to the
version check and upgrade.
- Warn users that undeleted Stenographer PCAP data will be inaccessible
and never automatically cleaned up if they switch to SURICATA without
deleting it first
- Require pcapengine to be set to SURICATA before allowing upgrade,
with clear messaging when the user declines to change it
- Skip upgrade if already running Security Onion 3.x.x
- Add interactive prompts to delete Stenographer PCAP data (with double confirmation) and change pcapengine to SURICATA
- Direct users to SOC Configuration UI instead of editing pillar files directly
- Consolidate TRANSITION and STENO cases to reduce repeated code
- configure global@custom ingest pipeline to run .fleet_final_pipeline-1 when available (heavynodes do not have this pipeline).
- Update global@custom pipeline to remove error message related to sending EA logs through logstash (https://github.com/elastic/kibana/issues/183959)
so-assistant-chat and so-assistant-session both had templates with a trailing dash that prevented the pattern from applying to the name of the indices.
The apiKey will be built off of the license rather than a new setting. The model is hardcoded for now at the AI Gateway level. We're going to use the investigationPrompt as a trigger for the feature being visible in the UI but by default will be blank for now.
⚠️ This category is solely for conversations related to Security Onion 2.4 ⚠️
If your organization needs more immediate, enterprise grade professional support, with one-on-one virtual meetings and screensharing, contact us via our website: https://securityonion.com/support
- type:dropdown
attributes:
label:Version
description:Which version of Security Onion 2.4.x are you asking about?
description:Which version of Security Onion are you asking about?
options:
-
- 2.4.10
@@ -30,6 +28,12 @@ body:
- 2.4.150
- 2.4.160
- 2.4.170
- 2.4.180
- 2.4.190
- 2.4.200
- 2.4.201
- 2.4.210
- 2.4.211
- Other (please provide detail below)
validations:
required:true
@@ -91,7 +95,7 @@ body:
attributes:
label:Hardware Specs
description:>
Does your hardware meet or exceed the minimum requirements for your installation type as shown at https://docs.securityonion.net/en/2.4/hardware.html?
Does your hardware meet or exceed the minimum requirements for your installation type as shown at https://securityonion.net/docs/hardware?
If your organization needs more immediate, enterprise grade professional support, with one-on-one virtual meetings and screensharing, contact us via our website: https://securityonion.com/support
- type:dropdown
attributes:
label:Version
description:Which version of Security Onion are you asking about?
options:
-
- 3.0.0
- 3.1.0
- Other (please provide detail below)
validations:
required:true
- type:dropdown
attributes:
label:Installation Method
description:How did you install Security Onion?
options:
-
- Security Onion ISO image
- Cloud image (Amazon, Azure, Google)
- Network installation on Oracle 9 (unsupported)
- Other (please provide detail below)
validations:
required:true
- type:dropdown
attributes:
label:Description
description:>
Is this discussion about installation, configuration, upgrading, or other?
options:
-
- installation
- configuration
- upgrading
- other (please provide detail below)
validations:
required:true
- type:dropdown
attributes:
label:Installation Type
description:>
When you installed, did you choose Import, Eval, Standalone, Distributed, or something else?
options:
-
- Import
- Eval
- Standalone
- Distributed
- other (please provide detail below)
validations:
required:true
- type:dropdown
attributes:
label:Location
description:>
Is this deployment in the cloud, on-prem with Internet access, or airgap?
options:
-
- cloud
- on-prem with Internet access
- airgap
- other (please provide detail below)
validations:
required:true
- type:dropdown
attributes:
label:Hardware Specs
description:>
Does your hardware meet or exceed the minimum requirements for your installation type as shown at https://securityonion.net/docs/hardware?
options:
-
- Meets minimum requirements
- Exceeds minimum requirements
- Does not meet minimum requirements
- other (please provide detail below)
validations:
required:true
- type:input
attributes:
label:CPU
description:How many CPU cores do you have?
validations:
required:true
- type:input
attributes:
label:RAM
description:How much RAM do you have?
validations:
required:true
- type:input
attributes:
label:Storage for /
description:How much storage do you have for the / partition?
validations:
required:true
- type:input
attributes:
label:Storage for /nsm
description:How much storage do you have for the /nsm partition?
validations:
required:true
- type:dropdown
attributes:
label:Network Traffic Collection
description:>
Are you collecting network traffic from a tap or span port?
options:
-
- tap
- span port
- other (please provide detail below)
validations:
required:true
- type:dropdown
attributes:
label:Network Traffic Speeds
description:>
How much network traffic are you monitoring?
options:
-
- Less than 1Gbps
- 1Gbps to 10Gbps
- more than 10Gbps
validations:
required:true
- type:dropdown
attributes:
label:Status
description:>
Does SOC Grid show all services on all nodes as running OK?
options:
-
- Yes,all services on all nodes are running OK
- No,one or more services are failed (please provide detail below)
validations:
required:true
- type:dropdown
attributes:
label:Salt Status
description:>
Do you get any failures when you run "sudo salt-call state.highstate"?
options:
-
- Yes,there are salt failures (please provide detail below)
- No,there are no failures
validations:
required:true
- type:dropdown
attributes:
label:Logs
description:>
Are there any additional clues in /opt/so/log/?
options:
-
- Yes,there are additional clues in /opt/so/log/ (please provide detail below)
- No,there are no additional clues
validations:
required:true
- type:textarea
attributes:
label:Detail
description:Please read our discussion guidelines at https://github.com/Security-Onion-Solutions/securityonion/discussions/1720 and then provide detailed information to help us help you.
placeholder:|-
STOP! Before typing, please read our discussion guidelines at https://github.com/Security-Onion-Solutions/securityonion/discussions/1720 in their entirety!
If your organization needs more immediate, enterprise grade professional support, with one-on-one virtual meetings and screensharing, contact us via our website: https://securityonion.com/support
validations:
required:true
- type:checkboxes
attributes:
label:Guidelines
options:
- label:I have read the discussion guidelines at https://github.com/Security-Onion-Solutions/securityonion/discussions/1720 and assert that I have followed the guidelines.
if:(github.event.comment.body == 'recheck' || github.event.comment.body == 'I have read the CLA Document and I hereby sign the CLA') || github.event_name == 'pull_request_target'
* Link the PR to the related issue, either using [keywords](https://docs.github.com/en/issues/tracking-your-work-with-issues/creating-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword) in the PR description, or [manually](https://docs.github.com/en/issues/tracking-your-work-with-issues/creating-issues/linking-a-pull-request-to-an-issue#manually-linking-a-pull-request-to-an-issue).
* **Pull requests should be opened against the `dev` branch of this repo**, and should clearly describe the problem and solution.
* **Pull requests should be opened against the current `?/dev` branch of this repo**, and should clearly describe the problem and solution.
* Be sure you have tested your changes and are confident they will not break other parts of the product.
Security Onion is a free and open Linux distribution for threat hunting, enterprise security monitoring, and log management. It includes a comprehensive suite of tools designed to work together to provide visibility into your network and host activity.
For organizations and enterprises requiring advanced capabilities, **Security Onion Pro** offers additional features designed for scale and efficiency:
so-checkin will run a full salt highstate to apply all salt states. If a highstate is already running, this request will be queued and so it may pause for a few minutes before you see any more output. For more information about so-checkin and salt, please see:
@@ -129,6 +129,9 @@ if [[ $EXCLUDE_STARTUP_ERRORS == 'Y' ]]; then
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|responded with status-code 503" # telegraf getting 503 from ES during startup
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|process_cluster_event_timeout_exception" # logstash waiting for elasticsearch to start
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|not configured for GeoIP" # SO does not bundle the maxminddb with Zeek
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|HTTP 404: Not Found" # Salt loops until Kratos returns 200, during startup Kratos may not be ready
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|Cancelling deferred write event maybeFenceReplicas because the event queue is now closed" # Kafka controller log during shutdown/restart
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|Redis may have been restarted" # Redis likely restarted by salt
fi
if [[ $EXCLUDE_FALSE_POSITIVE_ERRORS == 'Y' ]]; then
@@ -159,7 +162,9 @@ if [[ $EXCLUDE_FALSE_POSITIVE_ERRORS == 'Y' ]]; then
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|Error while parsing document for index \[.ds-logs-kratos-so-.*object mapping for \[file\]" # false positive (mapping error occuring BEFORE kratos index has rolled over in 2.4.210)
fi
if [[ $EXCLUDE_KNOWN_ERRORS == 'Y' ]]; then
@@ -175,7 +180,6 @@ if [[ $EXCLUDE_KNOWN_ERRORS == 'Y' ]]; then
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|salt-minion-check" # bug in early 2.4 place Jinja script in non-jinja salt dir causing cron output errors
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|monitoring.metrics" # known issue with elastic agent casting the field incorrectly if an integer value shows up before a float
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|repodownload.conf" # known issue with reposync on pre-2.4.20
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|missing versions record" # stenographer corrupt index
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|soc.field." # known ingest type collisions issue with earlier versions of SO
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|error parsing signature" # Malformed Suricata rule, from upstream provider
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|sticky buffer has no matches" # Non-critical Suricata error
@@ -222,6 +226,9 @@ if [[ $EXCLUDE_KNOWN_ERRORS == 'Y' ]]; then
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|Initialized license manager" # SOC log: before fields.status was changed to fields.licenseStatus
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|from NIC checksum offloading" # zeek reporter.log
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|marked for removal" # docker container getting recycled
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|tcp 127.0.0.1:6791: bind: address already in use" # so-elastic-fleet agent restarting. Seen starting w/ 8.18.8 https://github.com/elastic/kibana/issues/201459
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|TransformTask\] \[logs-(tychon|aws_billing|microsoft_defender_endpoint|armis|o365_metrics|microsoft_sentinel|snyk).*user so_kibana lacks the required permissions \[(logs|metrics)-\1" # Known issue with integrations starting transform jobs that are explicitly not allowed to start as a system user. (installed as so_elastic / so_kibana)
EXCLUDED_ERRORS="$EXCLUDED_ERRORS|manifest unknown" # appears in so-dockerregistry log for so-tcpreplay following docker upgrade to 29.2.1-1
fi
RESULT=0
@@ -268,6 +275,13 @@ for log_file in $(cat /tmp/log_check_files); do
description:Gateway for the default docker interface.
helpLink:docker.html
helpLink:docker
advanced:True
range:
description:Default docker IP range for containers.
helpLink:docker.html
helpLink:docker
advanced:True
ulimits:
description:|
Default ulimit settings applied to all containers via the Docker daemon. Each entry specifies a resource name (e.g. nofile, memlock, core, nproc) with soft and hard limits. Individual container ulimits override these defaults. Valid resource names include: cpu, fsize, data, stack, core, rss, nproc, nofile, memlock, as, locks, sigpending, msgqueue, nice, rtprio, rttime.
description:Enables or disables the ElastAlert 2 process. This process is critical for ensuring alerts arrive in SOC, and for outbound notification delivery.
helpLink:elastalert.html
forcedType:bool
helpLink:elastalert
alerter_parameters:
title:Custom Configuration Parameters
description:Optional configuration parameters made available as defaults for all rules and alerters. Use YAML format for these parameters, and reference the ElastAlert 2 documentation, located at https://elastalert2.readthedocs.io, for available configuration parameters. Requires a valid Security Onion license key.
global:True
multiline:True
syntax:yaml
helpLink:elastalert.html
helpLink:elastalert
forcedType:string
jira_api_key:
title:Jira API Key
description:Optional configuration parameter for Jira API Key, used instead of the Jira username and password. Requires a valid Security Onion license key.
global:True
sensitive:True
helpLink:elastalert.html
helpLink:elastalert
forcedType:string
jira_pass:
title:Jira Password
description:Optional configuration parameter for Jira password. Requires a valid Security Onion license key.
global:True
sensitive:True
helpLink:elastalert.html
helpLink:elastalert
forcedType:string
jira_user:
title:Jira Username
description:Optional configuration parameter for Jira username. Requires a valid Security Onion license key.
global:True
helpLink:elastalert.html
helpLink:elastalert
forcedType:string
smtp_pass:
title:SMTP Password
description:Optional configuration parameter for SMTP password, required for authenticating email servers. Requires a valid Security Onion license key.
global:True
sensitive:True
helpLink:elastalert.html
helpLink:elastalert
forcedType:string
smtp_user:
title:SMTP Username
description:Optional configuration parameter for SMTP username, required for authenticating email servers. Requires a valid Security Onion license key.
global:True
helpLink:elastalert.html
helpLink:elastalert
forcedType:string
files:
custom:
@@ -49,91 +50,131 @@ elastalert:
description:Optional custom Certificate Authority for connecting to an AlertManager server. To utilize this custom file, the alertmanager_ca_certs key must be set to /opt/elastalert/custom/alertmanager_ca.crt in the Alerter Parameters setting. Requires a valid Security Onion license key.
global:True
file:True
helpLink:elastalert.html
helpLink:elastalert
gelf_ca__crt:
description:Optional custom Certificate Authority for connecting to a Graylog server. To utilize this custom file, the graylog_ca_certs key must be set to /opt/elastalert/custom/graylog_ca.crt in the Alerter Parameters setting. Requires a valid Security Onion license key.
global:True
file:True
helpLink:elastalert.html
helpLink:elastalert
http_post_ca__crt:
description:Optional custom Certificate Authority for connecting to a generic HTTP server, via the legacy HTTP POST alerter. To utilize this custom file, the http_post_ca_certs key must be set to /opt/elastalert/custom/http_post2_ca.crt in the Alerter Parameters setting. Requires a valid Security Onion license key.
global:True
file:True
helpLink:elastalert.html
helpLink:elastalert
http_post2_ca__crt:
description:Optional custom Certificate Authority for connecting to a generic HTTP server, via the newer HTTP POST 2 alerter. To utilize this custom file, the http_post2_ca_certs key must be set to /opt/elastalert/custom/http_post2_ca.crt in the Alerter Parameters setting. Requires a valid Security Onion license key.
global:True
file:True
helpLink:elastalert.html
helpLink:elastalert
ms_teams_ca__crt:
description:Optional custom Certificate Authority for connecting to Microsoft Teams server. To utilize this custom file, the ms_teams_ca_certs key must be set to /opt/elastalert/custom/ms_teams_ca.crt in the Alerter Parameters setting. Requires a valid Security Onion license key.
global:True
file:True
helpLink:elastalert.html
helpLink:elastalert
pagerduty_ca__crt:
description:Optional custom Certificate Authority for connecting to PagerDuty server. To utilize this custom file, the pagerduty_ca_certs key must be set to /opt/elastalert/custom/pagerduty_ca.crt in the Alerter Parameters setting. Requires a valid Security Onion license key.
global:True
file:True
helpLink:elastalert.html
helpLink:elastalert
rocket_chat_ca__crt:
description:Optional custom Certificate Authority for connecting to PagerDuty server. To utilize this custom file, the rocket_chart_ca_certs key must be set to /opt/elastalert/custom/rocket_chat_ca.crt in the Alerter Parameters setting. Requires a valid Security Onion license key.
global:True
file:True
helpLink:elastalert.html
helpLink:elastalert
smtp__crt:
description:Optional custom certificate for connecting to an SMTP server. To utilize this custom file, the smtp_cert_file key must be set to /opt/elastalert/custom/smtp.crt in the Alerter Parameters setting. Requires a valid Security Onion license key.
global:True
file:True
helpLink:elastalert.html
helpLink:elastalert
smtp__key:
description:Optional custom certificate key for connecting to an SMTP server. To utilize this custom file, the smtp_key_file key must be set to /opt/elastalert/custom/smtp.key in the Alerter Parameters setting. Requires a valid Security Onion license key.
global:True
file:True
helpLink:elastalert.html
helpLink:elastalert
slack_ca__crt:
description:Optional custom Certificate Authority for connecting to Slack. To utilize this custom file, the slack_ca_certs key must be set to /opt/elastalert/custom/slack_ca.crt in the Alerter Parameters setting. Requires a valid Security Onion license key.
global:True
file:True
helpLink:elastalert.html
helpLink:elastalert
config:
scan_subdirectories:
description:Recursively scan subdirectories for rules.
forcedType:bool
advanced:True
global:True
helpLink:elastalert
disable_rules_on_error:
description:Disable rules on failure.
forcedType:bool
global:True
helpLink:elastalert.html
helpLink:elastalert
run_every:
minutes:
description:Amount of time in minutes between searches.
global:True
helpLink:elastalert.html
helpLink:elastalert
buffer_time:
minutes:
description:Amount of time in minutes to look through.
global:True
helpLink:elastalert.html
helpLink:elastalert
old_query_limit:
minutes:
description:Amount of time in minutes between queries to start at the most recently run query.
global:True
helpLink:elastalert.html
helpLink:elastalert
es_conn_timeout:
description:Timeout in seconds for connecting to and reading from Elasticsearch.
global:True
helpLink:elastalert.html
helpLink:elastalert
max_query_size:
description:The maximum number of documents that will be returned from Elasticsearch in a single query.
global:True
helpLink:elastalert.html
helpLink:elastalert
use_ssl:
description:Use SSL to connect to Elasticsearch.
forcedType:bool
advanced:True
global:True
helpLink:elastalert
verify_certs:
description:Verify TLS certificates when connecting to Elasticsearch.
forcedType:bool
advanced:True
global:True
helpLink:elastalert
alert_time_limit:
days:
description:The retry window for failed alerts.
global:True
helpLink:elastalert.html
helpLink:elastalert
index_settings:
shards:
description:The number of shards for elastalert indices.
global:True
helpLink:elastalert.html
helpLink:elastalert
replicas:
description:The number of replicas for elastalert indices.
global:True
helpLink:elastalert.html
helpLink:elastalert
logging:
incremental:
description:When incremental is false (the default), the logging configuration is applied in full, replacing any existing logging setup. When true, only the level attributes of existing loggers and handlers are updated, leaving the rest of the logging configuration unchanged.
forcedType:bool
advanced:True
global:True
helpLink:elastalert
disable_existing_loggers:
description:Disable existing loggers.
forcedType:bool
advanced:True
global:True
helpLink:elastalert
loggers:
'':
propagate:
description:Propagate log messages to parent loggers.
{# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
or more contributor license agreements. Licensed under the Elastic License 2.0; you may not use
this file except in compliance with the Elastic License 2.0. #}
{% import_json '/opt/so/state/esfleet_content_package_components.json' as ADDON_CONTENT_PACKAGE_COMPONENTS %}
{% import_json '/opt/so/state/esfleet_component_templates.json' as INSTALLED_COMPONENT_TEMPLATES %}
{% import_yaml 'elasticfleet/defaults.yaml' as ELASTICFLEETDEFAULTS %}
{% set CORE_ESFLEET_PACKAGES = ELASTICFLEETDEFAULTS.get('elasticfleet', {}).get('packages', {}) %}
{% set ADDON_CONTENT_INTEGRATION_DEFAULTS = {} %}
{% set DEBUG_STUFF = {} %}
{% for pkg in ADDON_CONTENT_PACKAGE_COMPONENTS %}
{% if pkg.name in CORE_ESFLEET_PACKAGES %}
{# skip core content packages #}
{% elif pkg.name not in CORE_ESFLEET_PACKAGES %}
{# generate defaults for each content package #}
{% if pkg.dataStreams is defined and pkg.dataStreams is not none and pkg.dataStreams | length > 0%}
{% for pattern in pkg.dataStreams %}
{# in ES 9.3.2 'input' type integrations no longer create default component templates and instead they wait for user input during 'integration' setup (fleet ui config)
title: generic is an artifact of that and is not in use #}
{% if pattern.title == "generic" %}
{% continue %}
{% endif %}
{% if "metrics-" in pattern.name %}
{% set integration_type = "metrics-" %}
{% elif "logs-" in pattern.name %}
{% set integration_type = "logs-" %}
{% else %}
{% set integration_type = "" %}
{% endif %}
{# on content integrations the component name is user defined at the time it is added to an agent policy #}
{% set component_name = pattern.title %}
{% set index_pattern = pattern.name %}
{# component_name_x maintains the functionality of merging local pillar changes with generated 'defaults' via SOC UI #}
{% set component_name_x = component_name.replace(".","_x_") %}
{# pillar overrides/merge expects the key names to follow the naming in elasticsearch/defaults.yaml eg. so-logs-1password_x_item_usages . The _x_ is replaced later on in elasticsearch/template.map.jinja #}
{# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
or more contributor license agreements. Licensed under the Elastic License 2.0; you may not use
this file except in compliance with the Elastic License 2.0. #}
{% import_json '/opt/so/state/esfleet_input_package_components.json' as ADDON_INPUT_PACKAGE_COMPONENTS %}
{% import_json '/opt/so/state/esfleet_component_templates.json' as INSTALLED_COMPONENT_TEMPLATES %}
{% import_yaml 'elasticfleet/defaults.yaml' as ELASTICFLEETDEFAULTS %}
{% set CORE_ESFLEET_PACKAGES = ELASTICFLEETDEFAULTS.get('elasticfleet', {}).get('packages', {}) %}
{% set ADDON_INPUT_INTEGRATION_DEFAULTS = {} %}
{% set DEBUG_STUFF = {} %}
{% for pkg in ADDON_INPUT_PACKAGE_COMPONENTS %}
{% if pkg.name in CORE_ESFLEET_PACKAGES %}
{# skip core input packages #}
{% elif pkg.name not in CORE_ESFLEET_PACKAGES %}
{# generate defaults for each input package #}
{% if pkg.dataStreams is defined and pkg.dataStreams is not none and pkg.dataStreams | length > 0 %}
{% for pattern in pkg.dataStreams %}
{# in ES 9.3.2 'input' type integrations no longer create default component templates and instead they wait for user input during 'integration' setup (fleet ui config)
title: generic is an artifact of that and is not in use #}
{% if pattern.title == "generic" %}
{% continue %}
{% endif %}
{% if "metrics-" in pattern.name %}
{% set integration_type = "metrics-" %}
{% elif "logs-" in pattern.name %}
{% set integration_type = "logs-" %}
{% else %}
{% set integration_type = "" %}
{% endif %}
{# on input integrations the component name is user defined at the time it is added to an agent policy #}
{% set component_name = pattern.title %}
{% set index_pattern = pattern.name %}
{# component_name_x maintains the functionality of merging local pillar changes with generated 'defaults' via SOC UI #}
{% set component_name_x = component_name.replace(".","_x_") %}
{# pillar overrides/merge expects the key names to follow the naming in elasticsearch/defaults.yaml eg. so-logs-1password_x_item_usages . The _x_ is replaced later on in elasticsearch/template.map.jinja #}
{% if pkg.es_index_patterns is defined and pkg.es_index_patterns is not none %}
{% for pattern in pkg.es_index_patterns %}
{% if pkg.dataStreams is defined and pkg.dataStreams is not none and pkg.dataStreams | length > 0 %}
{% for pattern in pkg.dataStreams %}
{% if "metrics-" in pattern.name %}
{% set integration_type = "metrics-" %}
{% elif "logs-" in pattern.name %}
@@ -74,44 +75,27 @@
{% if component_name in WEIRD_INTEGRATIONS %}
{% set component_name = WEIRD_INTEGRATIONS[component_name] %}
{% endif %}
{# create duplicate of component_name, so we can split generics from @custom component templates in the index template below and overwrite the default @package when needed
eg. having to replace unifiedlogs.generic@package with filestream.generic@package, but keep the ability to customize unifiedlogs.generic@custom and its ILM policy #}
{% set custom_component_name = component_name %}
{# duplicate integration_type to assist with sometimes needing to overwrite component templates with 'logs-filestream.generic@package' (there is no metrics-filestream.generic@package) #}
{% set generic_integration_type = integration_type %}
{# component_name_x maintains the functionality of merging local pillar changes with generated 'defaults' via SOC UI #}
{% set component_name_x = component_name.replace(".","_x_") %}
{# pillar overrides/merge expects the key names to follow the naming in elasticsearch/defaults.yaml eg. so-logs-1password_x_item_usages . The _x_ is replaced later on in elasticsearch/template.map.jinja #}
{% set integration_key = "so-" ~ integration_type ~ component_name_x %}
{# if its a .generic template make sure that a .generic@package for the integration exists. Else default to logs-filestream.generic@package #}
{% if ".generic" in component_name and integration_type ~ component_name ~ "@package" not in INSTALLED_COMPONENT_TEMPLATES %}
{# these generic templates by default are directed to index_pattern of 'logs-generic-*', overwrite that here to point to eg gcp_pubsub.generic-* #}
{% set index_pattern = integration_type ~ component_name ~ "-*" %}
{# includes use of .generic component template, but it doesn't exist in installed component templates. Redirect it to filestream.generic@package #}
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.