Compare commits

52 Commits

Author SHA1 Message Date
Mike Reeves 05f6503d61 Gate postgres telegraf fan-out on reactor-provided minion id
postgres.auth was running an `unless` shell check per up-minion on every
manager highstate, even when nothing had changed — N fork+python starts
of so-yaml.py add up on large grids. The work is only needed when a
specific minion's key is accepted.

- salt/postgres/auth.sls: fan out only when postgres_fanout_minion
  pillar is set (targets that single minion). Manager highstates with
  no pillar take a zero-N code path.
- salt/reactor/telegraf_user_sync.sls: re-pass the accepted minion id
  as postgres_fanout_minion to the orch.
- salt/orch/telegraf_postgres_sync.sls: forward the pillar to the
  salt.state invocation so the state render sees it.
- salt/manager/tools/sbin/soup: for the one-time 3.1.0 backfill, drop
  the per-minion state.apply and do an in-shell loop over the minion
  pillar files using so-yaml.py directly. Skips minions that already
  have postgres.telegraf.user set.
2026-04-21 10:05:08 -04:00
Mike Reeves a149ea7e8f Skip per-minion pillar fan-out when cred is already in place
Every postgres.auth run was rewriting every minion pillar file via
two so-yaml.py replace calls, even when nothing had changed. Passwords
are only generated on first encounter (see the `if key not in
telegraf_users` guard) and never rotate, so re-writing the same values
on every apply is wasted work and noisy state output.

Add an `unless:` check that compares the already-written
postgres.telegraf.user to the one we'd set. If they match, skip the
fan-out entirely. On first apply for a new minion the key isn't there,
so the replace runs; on subsequent applies it's a no-op.
2026-04-21 09:59:46 -04:00
Mike Reeves bb71e44614 Write per-minion telegraf creds to each minion's own pillar file
pillar/top.sls only distributes postgres.auth to manager-class roles,
so sensors / heavynodes / searchnodes / receivers / fleet / idh /
hypervisor / desktop minions never received the postgres telegraf
password they need to write metrics. Broadcasting the aggregate
postgres.auth pillar to every role would leak the so_postgres admin
password and every other minion's cred.

Fan out per-minion credentials into each minion's own pillar file at
/opt/so/saltstack/local/pillar/minions/<id>.sls. That file is already
distributed by pillar/top.sls exclusively to the matching minion via
`- minions.{{ grains.id }}`, so each minion sees only its own
postgres.telegraf.{user,pass} and nothing else.

- salt/postgres/auth.sls: after writing the manager-scoped aggregate
  pillar, fan the per-minion creds out via so-yaml.py replace for every
  up-minion. Creates the minion pillar file if missing. Requires
  postgres_auth_pillar so the manager pillar lands first.
- salt/telegraf/etc/telegraf.conf: consume postgres:telegraf:user and
  postgres:telegraf:pass directly from the minion's own pillar instead
  of walking postgres:auth:users which isn't visible off the manager.
2026-04-21 09:57:35 -04:00
Mike Reeves 84197fb33b Move postgres backup script and cron to the postgres states
The so-postgres-backup script and its cron were living under
salt/backup/config_backup.sls, which meant the backup script and cron
were deployed independently of whether postgres was enabled/disabled.

- Relocate salt/backup/tools/sbin/so-postgres-backup to
  salt/postgres/tools/sbin/so-postgres-backup so the existing
  postgres_sbin file.recurse in postgres/config.sls picks it up with
  everything else — no separate file.managed needed.
- Remove postgres_backup_script and so_postgres_backup from
  salt/backup/config_backup.sls.
- Add cron.present for so_postgres_backup to salt/postgres/enabled.sls
  and the matching cron.absent to salt/postgres/disabled.sls so the
  cron follows the container's lifecycle.
2026-04-21 09:42:41 -04:00
Mike Reeves 89a6e7c0dd Tidy config.sls makedirs and postgres helpLinks
- config.sls: postgresconfdir creates /opt/so/conf/postgres, so the
  two subdirectories under it (postgressecretsdir, postgresinitdir)
  don't need their own makedirs — require the parent instead.
- soc_postgres.yaml: helpLink for every annotated key now points to
  'postgres' instead of the carried-over 'influxdb' slug.
2026-04-21 09:39:58 -04:00
Mike Reeves a902f667ba Target manager by role grain in telegraf_postgres_sync orch
The previous MANAGER resolution used pillar.get('setup:manager') with a
fallback to grains.get('master'). Neither works from the reactor:
setup:manager is only populated by the setup workflow (not by reactor
runs), and grains.master returns the minion's master-hostname setting,
not a targetable minion id.

Match the pattern used by orch/delete_hypervisor.sls: compound-target
whichever minion is the manager via role grain.
2026-04-21 09:37:35 -04:00
Mike Reeves f72c30abd0 Have postgres.telegraf_users include postgres.enabled
postgres_wait_ready requires docker_container: so-postgres, which is
declared in postgres.enabled. Running postgres.telegraf_users on its own
— as the reactor orch and the soup post-upgrade step both do — errored
because Salt couldn't resolve the require.

Include postgres.enabled from postgres.telegraf_users so the container
state is always in the render. postgres.enabled already includes
telegraf_users; Salt de-duplicates the circular include and the included
states are all idempotent, so repeated application is a no-op.
2026-04-21 09:35:59 -04:00
Mike Reeves 37e9257698 Change so-postgres final_octet to 47 2026-04-21 09:33:47 -04:00
Mike Reeves 72105f1f2f Drop telegraf push from new-minion orch; highstate covers it
New minions run highstate as part of onboarding, which already applies
the telegraf state with the fresh pillar entry we just wrote. Pushing
telegraf a second time from the reactor is redundant.

- Remove the MINION-scoped salt.state block from the orch; keep only
  the manager-side postgres.auth + postgres.telegraf_users provisioning.
- Stop passing minion_id as pillar in the reactor; the orch doesn't
  reference it anymore.
2026-04-21 09:31:45 -04:00
Mike Reeves ee89b78751 Fire telegraf user sync on salt/key accept, not salt/auth
salt/auth fires on every minion authentication — including every minion
restart and every master restart — so the reactor was re-running the
postgres.auth + postgres.telegraf_users + telegraf orchestration for
every already-accepted minion on every reconnect. The underlying states
are idempotent, so this was wasted work and log noise, not a correctness
issue.

Switch the subscription to salt/key, which fires only when the master
actually changes a key's state (accept / reject / delete). Match the
pattern used by salt/reactor/check_hypervisor.sls (registered in
salt/salt/cloud/reactor_config_hypervisor.sls) and add the result==True
guard so half-failed key operations don't trigger the orchestration.
2026-04-20 19:54:06 -04:00
Mike Reeves 80bf07ffd8 Flesh out soc_postgres.yaml annotations
Add Configuration-UI annotations for every postgres pillar key defined
in defaults.yaml, not just telegraf.retention_days:

- postgres.enabled          — readonly; admin-visible but toggled via state
- postgres.telegraf.retention_days — drop advanced so user-tunable knobs
  surface in the default view
- postgres.config.max_connections, shared_buffers, log_min_messages —
  user-tunable performance/verbosity knobs, not advanced
- postgres.config.listen_addresses, port, ssl, ssl_cert_file, ssl_key_file,
  ssl_ca_file, hba_file, log_destination, logging_collector,
  shared_preload_libraries, cron.database_name — infra/Salt-managed,
  marked advanced so they're visible but out of the way

No defaults.yaml change; value-side stays the same.
2026-04-20 16:36:37 -04:00
Mike Reeves b69e50542a Use TELEGRAFMERGED for telegraf.output and de-jinja pg_hba.conf
- firewall/map.jinja and postgres/telegraf_users.sls now pull the
  telegraf output selector through TELEGRAFMERGED so the defaults.yaml
  value (BOTH) is the source of truth and pillar overrides merge in
  cleanly. pillar.get with a hardcoded fallback was brittle and would
  disagree with defaults.yaml if the two ever diverged.
- Rename salt/postgres/files/pg_hba.conf.jinja to pg_hba.conf and drop
  template: jinja from config.sls — the file has no jinja besides the
  comment header.
2026-04-20 16:06:01 -04:00
Mike Reeves 3ecd19d085 Move telegraf_output from global pillar to telegraf pillar
The Telegraf backend selector lived at global.telegraf_output but it is
a Telegraf-scoped setting, not a cross-cutting grid global. Move both
the value and the UI annotation under the telegraf pillar so it shows
up alongside the other Telegraf tuning knobs in the Configuration UI.

- salt/telegraf/defaults.yaml:    add telegraf.output: BOTH
- salt/telegraf/soc_telegraf.yaml: add telegraf.output annotation
- salt/global/defaults.yaml:      remove global.telegraf_output
- salt/global/soc_global.yaml:    remove global.telegraf_output annotation
- salt/vars/globals.map.jinja:    drop telegraf_output from GLOBALS
- salt/firewall/map.jinja:        read via pillar.get('telegraf:output')
- salt/postgres/telegraf_users.sls: read via pillar.get('telegraf:output')
- salt/telegraf/etc/telegraf.conf: read via TELEGRAFMERGED.output
- salt/postgres/tools/sbin/so-stats-show: update user-facing docs

No behavioral change — default stays BOTH.
2026-04-20 16:03:02 -04:00
Mike Reeves b6a3d1889c Fix soup state.apply args for postgres provisioning
state.apply takes a single mods argument; space-separated names are not
a list, so `state.apply postgres.auth postgres.telegraf_users` was only
applying postgres.auth and silently dropping the telegraf_users state.

Use comma-separated mods and add queue=True to match the rest of soup.
2026-04-20 14:40:32 -04:00
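The fix above can be sketched in shell. This is a minimal illustration, not the code in soup: `join_mods` is a hypothetical helper that only shows how several state names become the single comma-separated mods argument the message describes. The command is printed rather than executed.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of soup): join state names with commas so
# they form the single mods argument that state.apply expects.
join_mods() {
  local IFS=,
  echo "$*"
}

mods=$(join_mods postgres.auth postgres.telegraf_users)

# Print the command instead of running it, so the sketch has no side effects.
echo "salt-call state.apply ${mods} queue=True"
```

Passing the same names separated by spaces would, per the commit message, apply only the first one.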
Mike Reeves 1cb34b089c Restore 3/dev soup and add postgres users to post_to_3.1.0
feature/postgres had rewritten the 3.1.0 upgrade block, dropping the
elastic upgrade work 3/dev landed for 9.0.8→9.3.3: elasticsearch_backup_index_templates,
the component template state cleanup, and the /usr/sbin/so-kibana-space-defaults
post-upgrade call. It also carried an older ES upgrade mapping
(8.18.8→9.0.8) that was superseded on 3/dev (9.0.8→9.3.3 for
3.0.0-20260331), and a handful of latent shell-quoting regressions in
verify_es_version_compatibility and the intermediate-upgrade helpers.

Adopt the 3/dev soup verbatim and only add the new Telegraf Postgres
provisioning to post_to_3.1.0 on top of so-kibana-space-defaults.
2026-04-20 14:38:55 -04:00
Mike Reeves 1537ba5031 Merge remote-tracking branch 'origin/3/dev' into feature/postgres 2026-04-20 14:32:05 -04:00
Mike Reeves 8225d41661 Harden postgres secrets, TLS enforcement, and admin tooling
- Deliver postgres super and app passwords via mounted 0600 secret files
  (POSTGRES_PASSWORD_FILE, SO_POSTGRES_PASS_FILE) instead of plaintext env
  vars visible in docker inspect output
- Mount a managed pg_hba.conf that only allows local trust and hostssl
  scram-sha-256 so TCP clients cannot negotiate cleartext sessions
- Restrict postgres.key to 0400 and ensure owner/group 939
- Set umask 0077 on so-postgres-backup output
- Validate host values in so-stats-show against [A-Za-z0-9._-] before SQL
  interpolation so a compromised minion cannot inject SQL via a tag value
- Coerce postgres:telegraf:retention_days to int before rendering into SQL
- Escape single quotes when rendering pillar values into postgresql.conf
- Own postgres tooling in /usr/sbin as root:root so a container escape
  cannot rewrite admin scripts
- Gate ES migration TLS verification on esVerifyCert (default false,
  matching the elastic module's existing pattern)
2026-04-20 12:36:17 -04:00
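The host validation item in the list above could look roughly like the following. This is a sketch under assumptions: `is_safe_host` is a hypothetical name, and the actual check in so-stats-show may be written differently. Only the allow-list `[A-Za-z0-9._-]` comes from the commit message.

```shell
#!/usr/bin/env bash
# Hypothetical helper: accept a host value only if every character is in
# the allow-list [A-Za-z0-9._-]; an empty string is rejected as well.
is_safe_host() {
  local re='^[A-Za-z0-9._-]+$'
  [[ "$1" =~ $re ]]
}

# Made-up inputs for illustration: one ordinary hostname, one injection attempt.
for candidate in "sensor-01.example.com" "x'; DROP TABLE cpu; --"; do
  if is_safe_host "$candidate"; then
    echo "accept: $candidate"
  else
    echo "reject: $candidate"
  fi
done
```

Rejecting anything outside the allow-list before the value reaches SQL avoids having to reason about quoting rules at all.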
Mike Reeves 3f46caaf02 Revoke PUBLIC CONNECT on securityonion database
Per-minion telegraf roles inherit CONNECT via PUBLIC by default and
could open sessions to the SOC database (though they have no readable
grants inside). Close the soft edge by revoking PUBLIC's CONNECT and
re-granting it to so_postgres only.
2026-04-17 19:10:07 -04:00
Mike Reeves f3181b204a Remove so-telegraf-trim and update retention description
pg_partman drops old partitions hourly; row-DELETE retention is
obsolete and a confusing emergency fallback on partitioned tables.
2026-04-17 19:06:16 -04:00
Mike Reeves dd39db4584 Drop so_telegraf_trim cron.absent tombstone
feature/postgres never shipped the original cron.present, so this
cleanup state is a no-op on every fresh install. The script itself
stays on disk for emergency use.
2026-04-17 18:59:39 -04:00
Mike Reeves 759880a800 Wait for TCP-ready postgres, not the init-phase Unix socket
docker-entrypoint.sh runs the init-scripts phase with listen_addresses=''
(Unix socket only). The old pg_isready check passed there and then raced
the docker_temp_server_stop shutdown before the final postgres started.
pg_isready -h 127.0.0.1 only returns success once the real CMD binds
TCP, so downstream psql execs never land during the shutdown window.
2026-04-17 16:43:41 -04:00
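A readiness gate of this kind can be sketched as a retry loop. The helper name `wait_until_ready`, the `WAIT_INTERVAL` variable, and the attempt count are assumptions for illustration. The actual gate is a Salt state, which is not reproduced here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: retry a readiness probe up to ATTEMPTS times.
# Usage: wait_until_ready ATTEMPTS COMMAND [ARGS...]
# WAIT_INTERVAL (seconds between attempts, default 1) is an assumed knob.
wait_until_ready() {
  local attempts=$1
  shift
  local i
  for ((i = 0; i < attempts; i++)); do
    # Succeed as soon as the probe command exits 0.
    "$@" >/dev/null 2>&1 && return 0
    sleep "${WAIT_INTERVAL:-1}"
  done
  return 1
}

# Intended use, assuming pg_isready is on PATH (for example inside the
# so-postgres container). Probing 127.0.0.1 forces a TCP check, which the
# init-phase Unix-socket-only server cannot satisfy:
#   wait_until_ready 60 pg_isready -h 127.0.0.1 -p 5432
```

Taking the probe as arguments keeps the loop independent of Postgres, so the same helper works with any command that signals readiness through its exit status.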
Mike Reeves 31383bd9d0 Make Telegraf Postgres templates idempotent
Use CREATE TABLE IF NOT EXISTS and a WHERE-guarded create_parent() so
a Telegraf restart can re-run the templates safely after manual DB
surgery. Add an explicit tag_table_create_templates mirroring the
plugin default with IF NOT EXISTS for the same reason.
2026-04-17 15:43:50 -04:00
Mike Reeves 21076af01e Grant so_telegraf CREATE on partman schema
pg_partman 5.x's create_partition() creates a per-parent template
table inside the partman schema at runtime, which requires CREATE on
that schema. Also extend ALTER DEFAULT PRIVILEGES so the runtime-
created template tables are accessible to so_telegraf.
2026-04-17 15:34:19 -04:00
Mike Reeves f11e9da83a Mark time column NOT NULL before partman.create_parent
pg_partman 5.x requires the control column to be NOT NULL; Telegraf's
generated columns are nullable by default.
2026-04-17 15:27:06 -04:00
Mike Reeves 0fddcd8fe7 Pass unquoted schema.name to partman.create_parent
pg_partman 5.x splits p_parent_table on '.' and looks up the parts as
raw identifiers, so the literal must be 'schema.name' rather than the
double-quoted form quoteLiteral emits for .table.
2026-04-17 15:22:57 -04:00
Mike Reeves 927eba566c Grant so_telegraf access to partman schema
Telegraf calls partman.create_parent() on first write of each metric,
which needs USAGE on the partman schema, EXECUTE on its functions and
procedures, and DML on partman.part_config.
2026-04-17 15:13:08 -04:00
Mike Reeves af9330a9dd Escape Go-template placeholders from Jinja in telegraf.conf 2026-04-17 15:04:37 -04:00
Mike Reeves b3fbd5c7a4 Use Go-template placeholders and shell-guarded CREATE DATABASE
- Telegraf's outputs.postgresql plugin uses Go text/template syntax,
  not uppercase tokens. The {TABLE}/{COLUMNS}/{TABLELITERAL} strings
  were passed through to Postgres literally, producing syntax errors
  on every metric's first write. Switch to {{ .table }}, {{ .columns }},
  and {{ .table|quoteLiteral }} so partitioned parents and the partman
  create_parent() call succeed.
- Replace the \gexec "CREATE DATABASE ... WHERE NOT EXISTS" idiom in
  both init-users.sh and telegraf_users.sls with an explicit shell
  conditional. The prior idiom occasionally fired CREATE DATABASE even
  when so_telegraf already existed, producing duplicate-key failures.
2026-04-17 14:55:13 -04:00
Mike Reeves 5228668be0 Fix Telegraf→Postgres table creation and state.apply race
- Telegraf's partman template passed p_type:='native', which pg_partman
  5.x (the version shipped by postgresql-17-partman on Debian) rejects.
  Switched to 'range' so partman.create_parent() actually creates
  partitions and Telegraf's INSERTs succeed.
- Added a postgres_wait_ready gate in telegraf_users.sls so psql execs
  don't race the init-time restart that docker-entrypoint.sh performs.
- so-verify now ignores the literal "-v ON_ERROR_STOP=1" token in the
  setup log. Dropped the matching entry from so-log-check, which scans
  container stdout where that token never appears.
2026-04-17 13:00:12 -04:00
Mike Reeves 7d07f3c8fe Create so_telegraf DB from Salt and pin pg_partman schema
init-users.sh only runs on a fresh data dir, so upgrades onto an
existing /nsm/postgres volume never got so_telegraf. Pinning partman's
schema also makes partman.part_config reliably resolvable.
2026-04-17 10:51:08 -04:00
Mike Reeves d9a9029ce5 Adopt pg_partman + pg_cron for Telegraf metric tables
Every telegraf.* metric table is now a daily time-range partitioned
parent managed by pg_partman. Retention drops old partitions instead
of the row-by-row DELETE that so-telegraf-trim used to run nightly,
and dashboards will benefit from partition pruning at query time.

- Load pg_cron at server start via shared_preload_libraries and point
  cron.database_name at so_telegraf so job metadata lives alongside
  the metrics
- Telegraf create_templates override makes every new metric table a
  PARTITION BY RANGE (time) parent registered with partman.create_parent
  in one transaction (1 day interval, 3 premade)
- postgres_telegraf_group_role now also creates pg_partman and pg_cron
  extensions and schedules hourly partman.run_maintenance_proc
- New retention reconcile state updates partman.part_config.retention
  from postgres.telegraf.retention_days on every apply
- so_telegraf_trim cron is now unconditionally absent; script stays on
  disk as a manual fallback
2026-04-16 17:27:15 -04:00
Mike Reeves 9fe53d9ccc Use JSONB for Telegraf fields/tags to avoid 1600-column limit
High-cardinality inputs (docker, procstat, kafka) trigger ALTER TABLE
ADD COLUMN on every new field name, and with all minions writing into
a shared 'telegraf' schema the metric tables hit Postgres's 1600-column
per-table ceiling quickly. Setting fields_as_jsonb and tags_as_jsonb on
the postgresql output keeps metric tables fixed at (time, tag_id,
fields jsonb) and tag tables at (tag_id, tags jsonb).

- so-stats-show rewritten to use JSONB accessors
  ((fields->>'x')::numeric, tags->>'host', etc.) and cast memory/disk
  sizes to bigint so pg_size_pretty works
- Drop regex/regexFailureMessage from telegraf_output SOC UI entry to
  match the convention upstream used when removing them from
  mdengine/pcapengine/pipeline; options: list drives validation
2026-04-16 17:02:21 -04:00
Mike Reeves f7b80f5931 Merge branch '3/dev' into feature/postgres 2026-04-16 16:37:02 -04:00
Mike Reeves f11d315fea Fix soup 2026-04-16 16:35:24 -04:00
Mike Reeves 2013bf9e30 Fix soup 2026-04-16 16:20:25 -04:00
Mike Reeves a2ffb92b8d Fix soup 2026-04-16 16:19:53 -04:00
Mike Reeves 470b3bd4da Commingle Telegraf metrics into shared schema
Per-minion schemas cause table count to explode (N minions * M metrics)
and the per-minion revocation story isn't worth it when retention is
short. Move all minions to a shared 'telegraf' schema while keeping
per-minion login credentials for audit.

- New so_telegraf NOLOGIN group role owns the telegraf schema; each
  per-minion role is a member and inherits insert/select via role
  inheritance
- Telegraf connection string uses options='-c role=so_telegraf' so
  tables auto-created on first write belong to the group role
- so-telegraf-trim walks the flat telegraf.* table set instead of
  per-minion schemas
- so-stats-show filters by host tag; CLI arg is now the hostname as
  tagged by Telegraf rather than a sanitized schema suffix
- Also renames so-show-stats -> so-stats-show
2026-04-16 15:40:54 -04:00
Mike Reeves c124186989 so-log-check: exclude psql ON_ERROR_STOP flag
The psql invocation flag '-v ON_ERROR_STOP=1' used by the so-postgres
init script gets flagged by so-log-check because the token 'ERROR'
matches its error regex. Add to the exclusion list.
2026-04-15 19:45:42 -04:00
Mike Reeves d24808ff98 Fix so-show-stats tag column resolution
Telegraf's postgresql output stores tag values either as individual
columns on <metric>_tag or as a single JSONB 'tags' column, depending
on plugin version. Introspect information_schema.columns and build the
right accessor per tag instead of assuming one layout.
2026-04-15 19:28:10 -04:00
Mike Reeves cefbe01333 Add telegraf_output selector for InfluxDB/Postgres dual-write
Introduces global.telegraf_output (INFLUXDB|POSTGRES|BOTH, default BOTH)
so Telegraf can write metrics to Postgres alongside or instead of
InfluxDB. Each minion authenticates with its own so_telegraf_<minion>
role and writes to a matching schema inside a shared so_telegraf
database, keeping blast radius per-credential to that minion's data.

- Per-minion credentials auto-generated and persisted in postgres/auth.sls
- postgres/telegraf_users.sls reconciles roles/schemas on every apply
- Firewall opens 5432 only to minion hostgroups when Postgres output is active
- Reactor on salt/auth + orch/telegraf_postgres_sync.sls provision new
  minions automatically on key accept
- soup post_to_3.1.0 backfills users for existing minions on upgrade
- so-show-stats prints latest CPU/mem/disk/load per minion for sanity checks
- so-telegraf-trim + nightly cron prune rows older than
  postgres.telegraf.retention_days (default 14)
2026-04-15 14:32:10 -04:00
Mike Reeves 9ccd0acb4f Add ES credentials to postgres module config for migration
Postgres module now queries Elasticsearch directly via HTTP
for the chat migration (bypasses RBAC that needs user context).
Pass esHostUrl, esUsername, esPassword alongside postgres creds.
2026-04-10 11:41:33 -04:00
Mike Reeves 1ffdcab3be Add postgres adminPassword to SOC module config
Injects the postgres superuser password from secrets pillar so
SOC can run schema migrations as admin before switching to the
app user for normal operations.
2026-04-09 22:21:35 -04:00
Mike Reeves da1045e052 Fix init-users.sh password escaping for special characters
Use format() with %L for SQL literal escaping instead of raw
string interpolation. Also ALTER ROLE if user already exists
to keep password in sync with pillar.
2026-04-09 21:52:20 -04:00
Mike Reeves 55be1f1119 Only add postgres module config on manager nodes
Removed postgres from soc/defaults.yaml (shared by all nodes)
and moved it entirely into defaults.map.jinja, which only injects
the config when postgres auth pillar exists (manager-type nodes).
Sensors and other non-manager nodes will not have a postgres module
section in their sensoroni.json, so sensoroni won't try to connect.
2026-04-09 21:09:43 -04:00
Mike Reeves c1b1452bd9 Use manager IP for postgres hostUrl instead of container hostname
SOC connects to postgres via the host network, not the Docker
bridge network, so it needs the manager's IP address rather than
the container hostname.
2026-04-09 19:34:14 -04:00
Mike Reeves 2dfa83dd7d Wire postgres credentials into SOC module config
- Create vars/postgres.map.jinja for postgres auth globals
- Add POSTGRES_GLOBALS to all manager-type role vars
  (manager, eval, standalone, managersearch, import)
- Add postgres module config to soc/defaults.yaml
- Inject so_postgres credentials from auth pillar into
  soc/defaults.map.jinja (conditional on auth pillar existing)
2026-04-09 14:09:32 -04:00
Mike Reeves b87af8ea3d Add postgres.auth to allowed_states
Matches the elasticsearch.auth pattern where auth states use
the full sls path check and are explicitly listed.
2026-04-09 12:39:46 -04:00
Mike Reeves 46e38d39bb Enable postgres by default
Safe because postgres states are only applied to manager-type
nodes via top.sls and allowed_states.map.jinja.
2026-04-09 12:23:47 -04:00
Mike Reeves 61bdfb1a4b Add daily PostgreSQL database backup
- pg_dumpall piped through gzip, stored in /nsm/backup/
- Runs daily at 00:05 (4 minutes after config backup)
- 7-day retention matching existing config backup policy
- Skips gracefully if container isn't running
2026-04-09 10:29:10 -04:00
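The 7-day retention described above might be implemented along these lines. This is a sketch, not the contents of so-postgres-backup: the function name and the `so-postgres-backup-*.sql.gz` file pattern are hypothetical, and only the `/nsm/backup/` directory and the 7-day window come from the commit message.

```shell
#!/usr/bin/env bash
# Hypothetical helper: delete backup archives in DIR whose modification
# time is more than DAYS days in the past. The file name pattern is an
# assumption, not taken from the real script.
prune_old_backups() {
  local dir=$1 days=$2
  find "$dir" -maxdepth 1 -type f -name 'so-postgres-backup-*.sql.gz' \
    -mtime +"$days" -delete
}

# Intended use, matching the policy in the commit message:
#   prune_old_backups /nsm/backup 7
```

Restricting the match by name and `-maxdepth 1` keeps the prune from touching other files stored in the same directory.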
Mike Reeves 358a2e6d3f Add so-postgres to container image pull list
Add to both the import and default manager container lists so
the image gets downloaded during installation.
2026-04-09 10:02:41 -04:00
Mike Reeves 762e73faf5 Add so-postgres host management scripts
- so-postgres-manage: wraps docker exec for psql operations
  (sql, sqlfile, shell, dblist, userlist)
- so-postgres-start/stop/restart: standard container lifecycle
- Scripts installed to /usr/sbin via file.recurse in config.sls
2026-04-09 09:55:42 -04:00
Mike Reeves 868cd11874 Add so-postgres Salt states and integration wiring
Phase 1 of the PostgreSQL central data platform:
- Salt states: init, enabled, disabled, config, ssl, auth, sostatus
- TLS via SO CA-signed certs with postgresql.conf template
- Two-tier auth: postgres superuser + so_postgres application user
- Firewall restricts port 5432 to manager-only (HA-ready)
- Wired into top.sls, pillar/top.sls, allowed_states, firewall
  containers map, docker defaults, CA signing policies, and setup
  scripts for all manager-type roles
2026-04-08 10:58:52 -04:00
48 changed files with 1337 additions and 5 deletions
+20
@@ -38,6 +38,9 @@ base:
{% if salt['file.file_exists']('/opt/so/saltstack/local/pillar/elasticsearch/auth.sls') %}
- elasticsearch.auth
{% endif %}
{% if salt['file.file_exists']('/opt/so/saltstack/local/pillar/postgres/auth.sls') %}
- postgres.auth
{% endif %}
{% if salt['file.file_exists']('/opt/so/saltstack/local/pillar/kibana/secrets.sls') %}
- kibana.secrets
{% endif %}
@@ -60,6 +63,8 @@ base:
- redis.adv_redis
- influxdb.soc_influxdb
- influxdb.adv_influxdb
- postgres.soc_postgres
- postgres.adv_postgres
- elasticsearch.nodes
- elasticsearch.soc_elasticsearch
- elasticsearch.adv_elasticsearch
@@ -100,6 +105,9 @@ base:
{% if salt['file.file_exists']('/opt/so/saltstack/local/pillar/elasticsearch/auth.sls') %}
- elasticsearch.auth
{% endif %}
{% if salt['file.file_exists']('/opt/so/saltstack/local/pillar/postgres/auth.sls') %}
- postgres.auth
{% endif %}
{% if salt['file.file_exists']('/opt/so/saltstack/local/pillar/kibana/secrets.sls') %}
- kibana.secrets
{% endif %}
@@ -125,6 +133,8 @@ base:
- redis.adv_redis
- influxdb.soc_influxdb
- influxdb.adv_influxdb
- postgres.soc_postgres
- postgres.adv_postgres
- backup.soc_backup
- backup.adv_backup
- zeek.soc_zeek
@@ -144,6 +154,9 @@ base:
{% if salt['file.file_exists']('/opt/so/saltstack/local/pillar/elasticsearch/auth.sls') %}
- elasticsearch.auth
{% endif %}
{% if salt['file.file_exists']('/opt/so/saltstack/local/pillar/postgres/auth.sls') %}
- postgres.auth
{% endif %}
{% if salt['file.file_exists']('/opt/so/saltstack/local/pillar/kibana/secrets.sls') %}
- kibana.secrets
{% endif %}
@@ -158,6 +171,8 @@ base:
- redis.adv_redis
- influxdb.soc_influxdb
- influxdb.adv_influxdb
- postgres.soc_postgres
- postgres.adv_postgres
- elasticsearch.nodes
- elasticsearch.soc_elasticsearch
- elasticsearch.adv_elasticsearch
@@ -257,6 +272,9 @@ base:
{% if salt['file.file_exists']('/opt/so/saltstack/local/pillar/elasticsearch/auth.sls') %}
- elasticsearch.auth
{% endif %}
{% if salt['file.file_exists']('/opt/so/saltstack/local/pillar/postgres/auth.sls') %}
- postgres.auth
{% endif %}
{% if salt['file.file_exists']('/opt/so/saltstack/local/pillar/kibana/secrets.sls') %}
- kibana.secrets
{% endif %}
@@ -282,6 +300,8 @@ base:
- redis.adv_redis
- influxdb.soc_influxdb
- influxdb.adv_influxdb
- postgres.soc_postgres
- postgres.adv_postgres
- zeek.soc_zeek
- zeek.adv_zeek
- bpf.soc_bpf
+2
@@ -29,6 +29,8 @@
'manager',
'nginx',
'influxdb',
'postgres',
'postgres.auth',
'soc',
'kratos',
'hydra',
+1
@@ -32,3 +32,4 @@ so_config_backup:
- daymonth: '*'
- month: '*'
- dayweek: '*'
+14
@@ -54,6 +54,20 @@ x509_signing_policies:
- extendedKeyUsage: serverAuth
- days_valid: 820
- copypath: /etc/pki/issued_certs/
postgres:
- minions: '*'
- signing_private_key: /etc/pki/ca.key
- signing_cert: /etc/pki/ca.crt
- C: US
- ST: Utah
- L: Salt Lake City
- basicConstraints: "critical CA:false"
- keyUsage: "critical keyEncipherment"
- subjectKeyIdentifier: hash
- authorityKeyIdentifier: keyid,issuer:always
- extendedKeyUsage: serverAuth
- days_valid: 820
- copypath: /etc/pki/issued_certs/
elasticfleet:
- minions: '*'
- signing_private_key: /etc/pki/ca.key
+2
@@ -31,6 +31,7 @@ container_list() {
"so-hydra"
"so-nginx"
"so-pcaptools"
"so-postgres"
"so-soc"
"so-suricata"
"so-telegraf"
@@ -55,6 +56,7 @@ container_list() {
"so-logstash"
"so-nginx"
"so-pcaptools"
"so-postgres"
"so-redis"
"so-soc"
"so-strelka-backend"
+8
@@ -237,3 +237,11 @@ docker:
extra_hosts: []
extra_env: []
ulimits: []
'so-postgres':
final_octet: 47
port_bindings:
- 0.0.0.0:5432:5432
custom_bind_mounts: []
extra_hosts: []
extra_env: []
ulimits: []
+3
@@ -11,6 +11,7 @@
'so-kratos',
'so-hydra',
'so-nginx',
'so-postgres',
'so-redis',
'so-soc',
'so-strelka-coordinator',
@@ -34,6 +35,7 @@
'so-hydra',
'so-logstash',
'so-nginx',
'so-postgres',
'so-redis',
'so-soc',
'so-strelka-coordinator',
@@ -77,6 +79,7 @@
'so-kratos',
'so-hydra',
'so-nginx',
'so-postgres',
'so-soc'
] %}
+9
@@ -98,6 +98,10 @@ firewall:
tcp:
- 8086
udp: []
postgres:
tcp:
- 5432
udp: []
kafka_controller:
tcp:
- 9093
@@ -193,6 +197,7 @@ firewall:
- kibana
- redis
- influxdb
- postgres
- elasticsearch_rest
- elasticsearch_node
- localrules
@@ -379,6 +384,7 @@ firewall:
- kibana
- redis
- influxdb
- postgres
- elasticsearch_rest
- elasticsearch_node
- docker_registry
@@ -590,6 +596,7 @@ firewall:
- kibana
- redis
- influxdb
- postgres
- elasticsearch_rest
- elasticsearch_node
- docker_registry
@@ -799,6 +806,7 @@ firewall:
- kibana
- redis
- influxdb
- postgres
- elasticsearch_rest
- elasticsearch_node
- docker_registry
@@ -1011,6 +1019,7 @@ firewall:
- kibana
- redis
- influxdb
- postgres
- elasticsearch_rest
- elasticsearch_node
- docker_registry
+13
@@ -1,5 +1,6 @@
{% from 'vars/globals.map.jinja' import GLOBALS %}
{% from 'docker/docker.map.jinja' import DOCKERMERGED %}
{% from 'telegraf/map.jinja' import TELEGRAFMERGED %}
{% import_yaml 'firewall/defaults.yaml' as FIREWALL_DEFAULT %}
{# add our ip to self #}
@@ -55,4 +56,16 @@
{% endif %}
{# Open Postgres (5432) to minion hostgroups when Telegraf is configured to write to Postgres #}
{% set TG_OUT = TELEGRAFMERGED.output | upper %}
{% if TG_OUT in ['POSTGRES', 'BOTH'] %}
{% if role.startswith('manager') or role == 'standalone' or role == 'eval' %}
{% for r in ['sensor', 'searchnode', 'heavynode', 'receiver', 'fleet', 'idh', 'desktop', 'import'] %}
{% if FIREWALL_DEFAULT.firewall.role[role].chain["DOCKER-USER"].hostgroups[r] is defined %}
{% do FIREWALL_DEFAULT.firewall.role[role].chain["DOCKER-USER"].hostgroups[r].portgroups.append('postgres') %}
{% endif %}
{% endfor %}
{% endif %}
{% endif %}
{% set FIREWALL_MERGED = salt['pillar.get']('firewall', FIREWALL_DEFAULT.firewall, merge=True) %}
+1
@@ -59,4 +59,5 @@ global:
description: Allows use of Endgame with Security Onion. This feature requires a license from Endgame.
global: True
advanced: True
helpLink: influxdb
+26
@@ -490,6 +490,32 @@ up_to_3.1.0() {
post_to_3.1.0() {
  /usr/sbin/so-kibana-space-defaults

  # One-time backfill for minions that existed before the postgres Telegraf
  # feature shipped. Generate the aggregate pillar on the manager and create
  # the per-minion DB roles, then fan each minion's cred into its own pillar
  # file. Going forward the reactor handles each new salt-key accept with a
  # targeted fan-out, so a manager highstate no longer needs to iterate.
  echo "Provisioning Telegraf Postgres users for existing minions."
  salt-call --local state.apply postgres.auth,postgres.telegraf_users queue=True || true

  AGGREGATE_PILLAR=/opt/so/saltstack/local/pillar/postgres/auth.sls
  MINIONS_DIR=/opt/so/saltstack/local/pillar/minions
  if [[ -f "$AGGREGATE_PILLAR" && -d "$MINIONS_DIR" ]]; then
    for pillar_file in "$MINIONS_DIR"/*.sls; do
      [[ -f "$pillar_file" ]] || continue
      mid=$(basename "$pillar_file" .sls)
      [[ "$mid" == adv_* ]] && continue
      safe=$(echo "$mid" | tr '.-' '__' | tr '[:upper:]' '[:lower:]')
      existing_user=$(so-yaml.py get -r "$pillar_file" postgres.telegraf.user 2>/dev/null || true)
      [[ "$existing_user" == "so_telegraf_${safe}" ]] && continue
      user=$(so-yaml.py get -r "$AGGREGATE_PILLAR" "postgres.auth.users.telegraf_${safe}.user" 2>/dev/null || true)
      pass=$(so-yaml.py get -r "$AGGREGATE_PILLAR" "postgres.auth.users.telegraf_${safe}.pass" 2>/dev/null || true)
      [[ -z "$user" || -z "$pass" ]] && continue
      so-yaml.py replace "$pillar_file" postgres.telegraf.user "$user" >/dev/null
      so-yaml.py replace "$pillar_file" postgres.telegraf.pass "$pass" >/dev/null
    done
  fi

  POSTVERSION=3.1.0
}
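Reviewer note: the backfill loop above derives each role name from the minion id by mapping `.` and `-` to `_` and lowercasing. The sketch below isolates that mapping so it can be tried on its own. The function names `safe_minion_id` and `telegraf_role_for` are hypothetical and do not exist in soup; only the `tr` pipeline is taken from the diff.

```shell
#!/bin/bash
# Hypothetical helpers (not part of soup) that mirror the tr pipeline
# used in the backfill loop.
safe_minion_id() {
  printf '%s' "$1" | tr '.-' '__' | tr '[:upper:]' '[:lower:]'
}

telegraf_role_for() {
  printf 'so_telegraf_%s\n' "$(safe_minion_id "$1")"
}

telegraf_role_for 'Sensor-01.Example.com'   # prints so_telegraf_sensor_01_example_com
telegraf_role_for 'sensor_01.example-com'   # prints the same role name
```

The second call shows that the mapping is not injective: two minion ids that differ only in `.`, `-`, `_`, or letter case would share one role. Whether that can happen on a real grid depends on the naming rules enforced elsewhere, which this diff does not show.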
@@ -0,0 +1,28 @@
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
# Fired by salt/reactor/telegraf_user_sync.sls when salt-key accepts a new
# minion. Only provisions the per-minion pillar entry and DB role on the
# manager; the minion itself will pick up its telegraf config on its first
# highstate during onboarding, so there's no need to push the telegraf state
# from here.
#
# Target the manager via role grains — same pattern as orch/delete_hypervisor.sls.
# The reactor doesn't know the manager's minion id, and grains.master on the
# runner is a hostname, not a targetable id.
{% set FANOUT_MINION = salt['pillar.get']('postgres_fanout_minion', '') %}
manager_sync_telegraf_pg_users:
  salt.state:
    - tgt: 'G@role:so-manager or G@role:so-managerhype or G@role:so-managersearch or G@role:so-standalone or G@role:so-eval'
    - tgt_type: compound
    - sls:
      - postgres.auth
      - postgres.telegraf_users
    - queue: True
{% if FANOUT_MINION %}
    - pillar:
        postgres_fanout_minion: {{ FANOUT_MINION }}
{% endif %}
@@ -0,0 +1,90 @@
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
{% from 'allowed_states.map.jinja' import allowed_states %}
{% if sls in allowed_states %}
{% set DIGITS = "1234567890" %}
{% set LOWERCASE = "qwertyuiopasdfghjklzxcvbnm" %}
{% set UPPERCASE = "QWERTYUIOPASDFGHJKLZXCVBNM" %}
{% set SYMBOLS = "~!@#^&*()-_=+[]|;:,.<>?" %}
{% set CHARS = DIGITS~LOWERCASE~UPPERCASE~SYMBOLS %}
{% set so_postgres_user_pass = salt['pillar.get']('postgres:auth:users:so_postgres_user:pass', salt['random.get_str'](72, chars=CHARS)) %}
{# Per-minion Telegraf Postgres credentials. Merge currently-up minions with any #}
{# previously-known entries in pillar so existing passwords persist across runs. #}
{% set existing = salt['pillar.get']('postgres:auth:users', {}) %}
{% set up_minions = salt['saltutil.runner']('manage.up') or [] %}
{% set telegraf_users = {} %}
{% for key, entry in existing.items() %}
{%- if key.startswith('telegraf_') and entry.get('user') and entry.get('pass') %}
{%- do telegraf_users.update({key: entry}) %}
{%- endif %}
{% endfor %}
{% for mid in up_minions %}
{%- set safe = mid | replace('.','_') | replace('-','_') | lower %}
{%- set key = 'telegraf_' ~ safe %}
{%- if key not in telegraf_users %}
{%- do telegraf_users.update({key: {'user': 'so_telegraf_' ~ safe, 'pass': salt['random.get_str'](72, chars=CHARS)}}) %}
{%- endif %}
{% endfor %}
postgres_auth_pillar:
  file.managed:
    - name: /opt/so/saltstack/local/pillar/postgres/auth.sls
    - mode: 640
    - reload_pillar: True
    - contents: |
        postgres:
          auth:
            users:
              so_postgres_user:
                user: so_postgres
                pass: "{{ so_postgres_user_pass }}"
        {% for key, entry in telegraf_users.items() %}
              {{ key }}:
                user: {{ entry.user }}
                pass: "{{ entry.pass }}"
        {% endfor %}
    - show_changes: False
{# Fan a specific minion's telegraf cred out to its own pillar file. Only
runs when postgres_fanout_minion pillar is provided — otherwise this state
is a no-op. That keeps manager highstates from doing N so-yaml.py forks
when nothing changed. The reactor passes postgres_fanout_minion through
the orch on salt-key accept; soup handles bulk backfill separately. #}
{% set fanout_mid = salt['pillar.get']('postgres_fanout_minion') %}
{% if fanout_mid %}
{%- set safe = fanout_mid | replace('.','_') | replace('-','_') | lower %}
{%- set key = 'telegraf_' ~ safe %}
{%- set entry = telegraf_users.get(key) %}
{%- if entry %}
postgres_telegraf_minion_pillar_{{ safe }}:
  cmd.run:
    - name: |
        set -e
        PILLAR_FILE=/opt/so/saltstack/local/pillar/minions/{{ fanout_mid }}.sls
        if [ ! -f "$PILLAR_FILE" ]; then
          echo '{}' > "$PILLAR_FILE"
          chown socore:socore "$PILLAR_FILE" 2>/dev/null || true
          chmod 640 "$PILLAR_FILE"
        fi
        /usr/sbin/so-yaml.py replace "$PILLAR_FILE" postgres.telegraf.user '{{ entry.user }}'
        /usr/sbin/so-yaml.py replace "$PILLAR_FILE" postgres.telegraf.pass '{{ entry.pass }}'
    - unless: |
        [ "$(/usr/sbin/so-yaml.py get -r /opt/so/saltstack/local/pillar/minions/{{ fanout_mid }}.sls postgres.telegraf.user 2>/dev/null)" = '{{ entry.user }}' ]
    - require:
      - file: postgres_auth_pillar
{%- endif %}
{% endif %}
{% else %}
{{sls}}_state_not_allowed:
test.fail_without_changes:
- name: {{sls}}_state_not_allowed
{% endif %}
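Reviewer note: the `unless` guard in the fan-out state above is a compare-before-write check. It reads the stored user, and the two `replace` calls run only if it differs from the desired one. The sketch below shows the same pattern in isolation. It assumes a flat `key=value` file as a stand-in for the pillar YAML, and `set_if_changed` is a hypothetical name; the real state uses `so-yaml.py get` and `so-yaml.py replace`.

```shell
#!/bin/bash
# Hypothetical stand-in for the so-yaml.py get/replace pair: a flat
# key=value file instead of a YAML pillar file.
set_if_changed() {
  file=$1; key=$2; value=$3
  # Read the currently stored value (empty if the key is absent).
  current=$(grep -F -- "${key}=" "$file" 2>/dev/null | head -n 1 | cut -d= -f2-)
  if [ "$current" = "$value" ]; then
    echo "skipped"
    return 0
  fi
  # Drop any old entry for the key, append the new one, then swap files.
  { grep -v -F -- "${key}=" "$file" 2>/dev/null || true; printf '%s=%s\n' "$key" "$value"; } > "$file.tmp"
  mv "$file.tmp" "$file"
  echo "written"
}

demo=$(mktemp)
set_if_changed "$demo" postgres.telegraf.user so_telegraf_sensor01   # prints written
set_if_changed "$demo" postgres.telegraf.user so_telegraf_sensor01   # prints skipped
rm -f "$demo"
```

As in the state above, only the user name is compared. That is sufficient as long as passwords never rotate, which is what the commit message states.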
@@ -0,0 +1,111 @@
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
{% from 'allowed_states.map.jinja' import allowed_states %}
{% if sls.split('.')[0] in allowed_states %}
{% from 'postgres/map.jinja' import PGMERGED %}
# Postgres Setup
postgresconfdir:
  file.directory:
    - name: /opt/so/conf/postgres
    - user: 939
    - group: 939
    - makedirs: True

postgressecretsdir:
  file.directory:
    - name: /opt/so/conf/postgres/secrets
    - user: 939
    - group: 939
    - mode: 700
    - require:
      - file: postgresconfdir

postgresdatadir:
  file.directory:
    - name: /nsm/postgres
    - user: 939
    - group: 939
    - makedirs: True

postgreslogdir:
  file.directory:
    - name: /opt/so/log/postgres
    - user: 939
    - group: 939
    - makedirs: True

postgresinitdir:
  file.directory:
    - name: /opt/so/conf/postgres/init
    - user: 939
    - group: 939
    - require:
      - file: postgresconfdir

postgresinitusers:
  file.managed:
    - name: /opt/so/conf/postgres/init/init-users.sh
    - source: salt://postgres/files/init-users.sh
    - user: 939
    - group: 939
    - mode: 755

postgresconf:
  file.managed:
    - name: /opt/so/conf/postgres/postgresql.conf
    - source: salt://postgres/files/postgresql.conf.jinja
    - user: 939
    - group: 939
    - template: jinja
    - defaults:
        PGMERGED: {{ PGMERGED }}

postgreshba:
  file.managed:
    - name: /opt/so/conf/postgres/pg_hba.conf
    - source: salt://postgres/files/pg_hba.conf
    - user: 939
    - group: 939
    - mode: 640

postgres_super_secret:
  file.managed:
    - name: /opt/so/conf/postgres/secrets/postgres_password
    - user: 939
    - group: 939
    - mode: 600
    - contents_pillar: 'secrets:postgres_pass'
    - show_changes: False
    - require:
      - file: postgressecretsdir

postgres_app_secret:
  file.managed:
    - name: /opt/so/conf/postgres/secrets/so_postgres_pass
    - user: 939
    - group: 939
    - mode: 600
    - contents_pillar: 'postgres:auth:users:so_postgres_user:pass'
    - show_changes: False
    - require:
      - file: postgressecretsdir

postgres_sbin:
  file.recurse:
    - name: /usr/sbin
    - source: salt://postgres/tools/sbin
    - user: root
    - group: root
    - file_mode: 755
{% else %}
{{sls}}_state_not_allowed:
test.fail_without_changes:
- name: {{sls}}_state_not_allowed
{% endif %}
@@ -0,0 +1,19 @@
postgres:
  enabled: True
  telegraf:
    retention_days: 14
  config:
    listen_addresses: '*'
    port: 5432
    max_connections: 100
    shared_buffers: 256MB
    ssl: 'on'
    ssl_cert_file: '/conf/postgres.crt'
    ssl_key_file: '/conf/postgres.key'
    ssl_ca_file: '/conf/ca.crt'
    hba_file: '/conf/pg_hba.conf'
    log_destination: 'stderr'
    logging_collector: 'off'
    log_min_messages: 'warning'
    shared_preload_libraries: pg_cron
    cron.database_name: so_telegraf
@@ -0,0 +1,33 @@
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
{% from 'allowed_states.map.jinja' import allowed_states %}
{% if sls.split('.')[0] in allowed_states %}
include:
  - postgres.sostatus

so-postgres:
  docker_container.absent:
    - force: True

so-postgres_so-status.disabled:
  file.comment:
    - name: /opt/so/conf/so-status/so-status.conf
    - regex: ^so-postgres$

so_postgres_backup:
  cron.absent:
    - name: /usr/sbin/so-postgres-backup > /dev/null 2>&1
    - identifier: so_postgres_backup
    - user: root
{% else %}
{{sls}}_state_not_allowed:
test.fail_without_changes:
- name: {{sls}}_state_not_allowed
{% endif %}
@@ -0,0 +1,109 @@
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
{% from 'allowed_states.map.jinja' import allowed_states %}
{% if sls.split('.')[0] in allowed_states %}
{% from 'vars/globals.map.jinja' import GLOBALS %}
{% from 'docker/docker.map.jinja' import DOCKERMERGED %}
{% set SO_POSTGRES_USER = salt['pillar.get']('postgres:auth:users:so_postgres_user:user', 'so_postgres') %}
include:
  - postgres.auth
  - postgres.ssl
  - postgres.config
  - postgres.sostatus
  - postgres.telegraf_users

so-postgres:
  docker_container.running:
    - image: {{ GLOBALS.registry_host }}:5000/{{ GLOBALS.image_repo }}/so-postgres:{{ GLOBALS.so_version }}
    - hostname: so-postgres
    - networks:
      - sobridge:
        - ipv4_address: {{ DOCKERMERGED.containers['so-postgres'].ip }}
    - port_bindings:
      {% for BINDING in DOCKERMERGED.containers['so-postgres'].port_bindings %}
      - {{ BINDING }}
      {% endfor %}
    - environment:
      - POSTGRES_DB=securityonion
      # Passwords are delivered via mounted 0600 secret files, not plaintext env vars.
      # The upstream postgres image resolves POSTGRES_PASSWORD_FILE; entrypoint.sh and
      # init-users.sh resolve SO_POSTGRES_PASS_FILE the same way.
      - POSTGRES_PASSWORD_FILE=/run/secrets/postgres_password
      - SO_POSTGRES_USER={{ SO_POSTGRES_USER }}
      - SO_POSTGRES_PASS_FILE=/run/secrets/so_postgres_pass
      {% if DOCKERMERGED.containers['so-postgres'].extra_env %}
      {% for XTRAENV in DOCKERMERGED.containers['so-postgres'].extra_env %}
      - {{ XTRAENV }}
      {% endfor %}
      {% endif %}
    - binds:
      - /opt/so/log/postgres/:/log:rw
      - /nsm/postgres:/var/lib/postgresql/data:rw
      - /opt/so/conf/postgres/postgresql.conf:/conf/postgresql.conf:ro
      - /opt/so/conf/postgres/pg_hba.conf:/conf/pg_hba.conf:ro
      - /opt/so/conf/postgres/secrets:/run/secrets:ro
      - /opt/so/conf/postgres/init/init-users.sh:/docker-entrypoint-initdb.d/init-users.sh:ro
      - /etc/pki/postgres.crt:/conf/postgres.crt:ro
      - /etc/pki/postgres.key:/conf/postgres.key:ro
      - /etc/pki/tls/certs/intca.crt:/conf/ca.crt:ro
      {% if DOCKERMERGED.containers['so-postgres'].custom_bind_mounts %}
      {% for BIND in DOCKERMERGED.containers['so-postgres'].custom_bind_mounts %}
      - {{ BIND }}
      {% endfor %}
      {% endif %}
    {% if DOCKERMERGED.containers['so-postgres'].extra_hosts %}
    - extra_hosts:
      {% for XTRAHOST in DOCKERMERGED.containers['so-postgres'].extra_hosts %}
      - {{ XTRAHOST }}
      {% endfor %}
    {% endif %}
    {% if DOCKERMERGED.containers['so-postgres'].ulimits %}
    - ulimits:
      {% for ULIMIT in DOCKERMERGED.containers['so-postgres'].ulimits %}
      - {{ ULIMIT.name }}={{ ULIMIT.soft }}:{{ ULIMIT.hard }}
      {% endfor %}
    {% endif %}
    - watch:
      - file: postgresconf
      - file: postgreshba
      - file: postgresinitusers
      - file: postgres_super_secret
      - file: postgres_app_secret
      - x509: postgres_crt
      - x509: postgres_key
    - require:
      - file: postgresconf
      - file: postgreshba
      - file: postgresinitusers
      - file: postgres_super_secret
      - file: postgres_app_secret
      - x509: postgres_crt
      - x509: postgres_key

delete_so-postgres_so-status.disabled:
  file.uncomment:
    - name: /opt/so/conf/so-status/so-status.conf
    - regex: ^so-postgres$

so_postgres_backup:
  cron.present:
    - name: /usr/sbin/so-postgres-backup > /dev/null 2>&1
    - identifier: so_postgres_backup
    - user: root
    - minute: '5'
    - hour: '0'
    - daymonth: '*'
    - month: '*'
    - dayweek: '*'
{% else %}
{{sls}}_state_not_allowed:
test.fail_without_changes:
- name: {{sls}}_state_not_allowed
{% endif %}
@@ -0,0 +1,34 @@
#!/bin/bash
set -e
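# Create or update the application role used for SOC platform access.
# This script runs only on first database initialization, via docker-entrypoint-initdb.d.
# The password goes through format('%L'), but the shell first interpolates it into a
# single-quoted SQL literal, so it must not contain a single quote. The character set
# that postgres/auth.sls uses to generate it contains none.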
if [ -z "${SO_POSTGRES_PASS:-}" ] && [ -n "${SO_POSTGRES_PASS_FILE:-}" ] && [ -r "$SO_POSTGRES_PASS_FILE" ]; then
SO_POSTGRES_PASS="$(< "$SO_POSTGRES_PASS_FILE")"
fi
psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" --dbname "$POSTGRES_DB" <<-EOSQL
DO \$\$
BEGIN
IF NOT EXISTS (SELECT FROM pg_catalog.pg_roles WHERE rolname = '${SO_POSTGRES_USER}') THEN
EXECUTE format('CREATE ROLE %I WITH LOGIN PASSWORD %L', '${SO_POSTGRES_USER}', '${SO_POSTGRES_PASS}');
ELSE
EXECUTE format('ALTER ROLE %I WITH PASSWORD %L', '${SO_POSTGRES_USER}', '${SO_POSTGRES_PASS}');
END IF;
END
\$\$;
GRANT ALL PRIVILEGES ON DATABASE "$POSTGRES_DB" TO "$SO_POSTGRES_USER";
-- Lock the SOC database down at the connect layer; PUBLIC gets CONNECT
-- by default, which would let per-minion telegraf roles open sessions
-- here. They have no schema/table grants inside so reads fail, but
-- revoking CONNECT closes the soft edge entirely.
REVOKE CONNECT ON DATABASE "$POSTGRES_DB" FROM PUBLIC;
GRANT CONNECT ON DATABASE "$POSTGRES_DB" TO "$SO_POSTGRES_USER";
EOSQL
# Bootstrap the Telegraf metrics database. Per-minion roles + schemas are
# reconciled on every state.apply by postgres/telegraf_users.sls; this block
# only ensures the shared database exists on first initialization.
if ! psql -U "$POSTGRES_USER" -tAc "SELECT 1 FROM pg_database WHERE datname='so_telegraf'" | grep -q 1; then
psql -v ON_ERROR_STOP=1 -U "$POSTGRES_USER" -c "CREATE DATABASE so_telegraf"
fi
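Reviewer note: the first block of init-users.sh follows the `*_FILE` convention: if the password variable is empty and a readable secret file is named, the file's contents become the password. The sketch below expresses that rule as a small function. `resolve_secret` is a hypothetical name and is not part of the script; the condition it tests is the one in the diff.

```shell
#!/bin/bash
# Hypothetical helper: prints the inline value when it is set, otherwise
# the contents of the secret file (when one is named and readable).
resolve_secret() {
  # $1 = inline value (may be empty), $2 = path to a secret file (may be empty)
  if [ -z "$1" ] && [ -n "$2" ] && [ -r "$2" ]; then
    cat "$2"
  else
    printf '%s' "$1"
  fi
}

secret_file=$(mktemp)
printf 'example-not-a-real-password\n' > "$secret_file"
DEMO_PASS=$(resolve_secret "" "$secret_file")
echo "resolved ${#DEMO_PASS} characters"
rm -f "$secret_file"
```

Command substitution strips the trailing newline from the file, just as `$(< "$SO_POSTGRES_PASS_FILE")` does in the script, so a secret file written with a final newline still yields the bare password.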
@@ -0,0 +1,16 @@
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
#
# Managed by Salt — do not edit by hand.
# Client authentication config: only local (Unix socket) connections and TLS-wrapped TCP
# connections are accepted. Plain-text `host ...` lines are intentionally omitted so a
# misconfigured client with sslmode=disable cannot negotiate a cleartext session.
# Local connections (Unix socket, container-internal) use trust.
local all all trust
# TCP connections MUST use TLS (hostssl) and authenticate with SCRAM.
hostssl all all 0.0.0.0/0 scram-sha-256
hostssl all all ::/0 scram-sha-256
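Reviewer note: the pg_hba.conf above relies on the absence of `host` and `hostnossl` records, since a `host` record matches both TLS and non-TLS TCP connections. A check along the following lines could guard against a later edit reintroducing one. This is only a sketch: `hba_allows_cleartext` is a hypothetical function, and the rules are copied into a temporary file so the example is self-contained.

```shell
#!/bin/bash
# Hypothetical lint: succeeds (exit 0) if the file contains a record type
# that can match a non-TLS TCP connection.
hba_allows_cleartext() {
  grep -Eq '^[[:space:]]*(host|hostnossl)[[:space:]]' "$1"
}

hba=$(mktemp)
cat > "$hba" <<'EOF'
local   all all            trust
hostssl all all 0.0.0.0/0  scram-sha-256
hostssl all all ::/0       scram-sha-256
EOF

if hba_allows_cleartext "$hba"; then
  echo "cleartext TCP possible"
else
  echo "TLS-only for TCP"
fi
rm -f "$hba"
```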
@@ -0,0 +1,8 @@
{# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
https://securityonion.net/license; you may not use this file except in compliance with the
Elastic License 2.0. #}
{% for key, value in PGMERGED.config.items() %}
{{ key }} = '{{ value | string | replace("'", "''") }}'
{% endfor %}
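Reviewer note: the template above wraps every value in single quotes and doubles any embedded single quote, which is how postgresql.conf escapes a quote inside a quoted value. The sketch below applies the same rule in shell. `pg_conf_quote` is a hypothetical name used only for illustration.

```shell
#!/bin/bash
# Hypothetical helper: quote a value the way the Jinja template does,
# doubling each single quote and wrapping the result in single quotes.
pg_conf_quote() {
  printf "'%s'\n" "$(printf '%s' "$1" | sed "s/'/''/g")"
}

pg_conf_quote "256MB"        # prints '256MB'
pg_conf_quote "it's a test"  # prints 'it''s a test'
```

telegraf_users.sls applies the same doubling to role passwords before embedding them in SQL string literals.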
@@ -0,0 +1,13 @@
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
{% from 'postgres/map.jinja' import PGMERGED %}
include:
{% if PGMERGED.enabled %}
  - postgres.enabled
{% else %}
  - postgres.disabled
{% endif %}
@@ -0,0 +1,7 @@
{# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
https://securityonion.net/license; you may not use this file except in compliance with the
Elastic License 2.0. #}
{% import_yaml 'postgres/defaults.yaml' as PGDEFAULTS %}
{% set PGMERGED = salt['pillar.get']('postgres', PGDEFAULTS.postgres, merge=True) %}
@@ -0,0 +1,89 @@
postgres:
  enabled:
    description: Whether the PostgreSQL database container is enabled on this grid. Backs the assistant store and the Telegraf metrics database.
    forcedType: bool
    readonly: True
    helpLink: influxdb
  telegraf:
    retention_days:
      description: Number of days of Telegraf metrics to keep in the so_telegraf database. Older partitions are dropped hourly by pg_partman.
      forcedType: int
      helpLink: postgres
  config:
    max_connections:
      description: Maximum number of concurrent PostgreSQL connections.
      forcedType: int
      global: True
      helpLink: postgres
    shared_buffers:
      description: Amount of memory PostgreSQL uses for shared buffers (e.g. 256MB, 1GB). Raising this improves read cache hit rate at the cost of system RAM.
      global: True
      helpLink: postgres
    log_min_messages:
      description: Minimum severity of server messages written to the PostgreSQL log.
      options:
        - debug1
        - info
        - notice
        - warning
        - error
        - log
        - fatal
      global: True
      helpLink: postgres
    listen_addresses:
      description: Interfaces PostgreSQL listens on. Must remain '*' so clients on the docker bridge network can connect.
      global: True
      advanced: True
      helpLink: postgres
    port:
      description: TCP port PostgreSQL listens on inside the container. Firewall rules and container port mapping assume 5432.
      forcedType: int
      global: True
      advanced: True
      helpLink: postgres
    ssl:
      description: Whether PostgreSQL accepts TLS connections. Must remain 'on' because pg_hba.conf requires hostssl for TCP.
      global: True
      advanced: True
      helpLink: postgres
    ssl_cert_file:
      description: Path (inside the container) to the TLS server certificate. Salt-managed.
      global: True
      advanced: True
      helpLink: postgres
    ssl_key_file:
      description: Path (inside the container) to the TLS server private key. Salt-managed.
      global: True
      advanced: True
      helpLink: postgres
    ssl_ca_file:
      description: Path (inside the container) to the CA bundle PostgreSQL uses to verify client certificates. Salt-managed.
      global: True
      advanced: True
      helpLink: postgres
    hba_file:
      description: Path (inside the container) to the pg_hba.conf authentication file. Salt-managed; edit salt/postgres/files/pg_hba.conf.
      global: True
      advanced: True
      helpLink: postgres
    log_destination:
      description: Where PostgreSQL writes its server log. 'stderr' routes to the container log stream.
      global: True
      advanced: True
      helpLink: postgres
    logging_collector:
      description: Whether to run a separate logging collector process. Disabled because the docker log stream already captures stderr.
      global: True
      advanced: True
      helpLink: postgres
    shared_preload_libraries:
      description: Comma-separated list of extensions loaded at server start. Required for pg_cron, which drives pg_partman maintenance; do not remove.
      global: True
      advanced: True
      helpLink: postgres
    cron.database_name:
      description: Database pg_cron schedules jobs in. Must be so_telegraf so partman maintenance runs in the right database context.
      global: True
      advanced: True
      helpLink: postgres
@@ -0,0 +1,21 @@
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
{% from 'allowed_states.map.jinja' import allowed_states %}
{% if sls.split('.')[0] in allowed_states %}
append_so-postgres_so-status.conf:
  file.append:
    - name: /opt/so/conf/so-status/so-status.conf
    - text: so-postgres
    - unless: grep -q so-postgres /opt/so/conf/so-status/so-status.conf
{% else %}
{{sls}}_state_not_allowed:
test.fail_without_changes:
- name: {{sls}}_state_not_allowed
{% endif %}
@@ -0,0 +1,55 @@
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
{% from 'allowed_states.map.jinja' import allowed_states %}
{% if sls.split('.')[0] in allowed_states %}
{% from 'vars/globals.map.jinja' import GLOBALS %}
{% from 'ca/map.jinja' import CA %}
postgres_key:
  x509.private_key_managed:
    - name: /etc/pki/postgres.key
    - keysize: 4096
    - backup: True
    - new: True
    {% if salt['file.file_exists']('/etc/pki/postgres.key') -%}
    - prereq:
      - x509: /etc/pki/postgres.crt
    {%- endif %}
    - retry:
        attempts: 5
        interval: 30

postgres_crt:
  x509.certificate_managed:
    - name: /etc/pki/postgres.crt
    - ca_server: {{ CA.server }}
    - subjectAltName: DNS:{{ GLOBALS.hostname }}, IP:{{ GLOBALS.node_ip }}
    - signing_policy: postgres
    - private_key: /etc/pki/postgres.key
    - CN: {{ GLOBALS.hostname }}
    - days_remaining: 7
    - days_valid: 820
    - backup: True
    - timeout: 30
    - retry:
        attempts: 5
        interval: 30

postgresKeyperms:
  file.managed:
    - replace: False
    - name: /etc/pki/postgres.key
    - mode: 400
    - user: 939
    - group: 939
{% else %}
{{sls}}_state_not_allowed:
test.fail_without_changes:
- name: {{sls}}_state_not_allowed
{% endif %}
@@ -0,0 +1,157 @@
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
{% from 'allowed_states.map.jinja' import allowed_states %}
{% if sls.split('.')[0] in allowed_states %}
{% from 'vars/globals.map.jinja' import GLOBALS %}
{% from 'telegraf/map.jinja' import TELEGRAFMERGED %}
{# postgres_wait_ready below requires `docker_container: so-postgres`, which is
declared in postgres.enabled. Include it here so state.apply postgres.telegraf_users
on its own (from the reactor orch or from soup) still has that ID in scope. Salt
de-duplicates the circular include. #}
include:
- postgres.enabled
{% set TG_OUT = TELEGRAFMERGED.output | upper %}
{% if TG_OUT in ['POSTGRES', 'BOTH'] %}
# docker_container.running returns as soon as the container starts, but on
# first-init docker-entrypoint.sh starts a temporary postgres with
# `listen_addresses=''` to run /docker-entrypoint-initdb.d scripts, then
# shuts it down before exec'ing the real CMD. A default pg_isready check
# (Unix socket) passes during that ephemeral phase and races the shutdown
# with "the database system is shutting down". Checking TCP readiness on
# 127.0.0.1 only succeeds after the final postgres binds the port.
postgres_wait_ready:
  cmd.run:
    - name: |
        for i in $(seq 1 60); do
          if docker exec so-postgres pg_isready -h 127.0.0.1 -U postgres -q 2>/dev/null; then
            exit 0
          fi
          sleep 2
        done
        echo "so-postgres did not accept TCP connections within 120s" >&2
        exit 1
    - require:
      - docker_container: so-postgres
# Ensure the shared Telegraf database exists. init-users.sh only runs on a
# fresh data dir, so hosts upgraded onto an existing /nsm/postgres volume
# would otherwise never get so_telegraf.
postgres_create_telegraf_db:
  cmd.run:
    - name: |
        if ! docker exec so-postgres psql -U postgres -tAc "SELECT 1 FROM pg_database WHERE datname='so_telegraf'" | grep -q 1; then
          docker exec so-postgres psql -v ON_ERROR_STOP=1 -U postgres -c "CREATE DATABASE so_telegraf"
        fi
    - require:
      - cmd: postgres_wait_ready
# Provision the shared group role and schema once. Every per-minion role is a
# member of so_telegraf, and each Telegraf connection does SET ROLE so_telegraf
# (via options='-c role=so_telegraf' in the connection string) so tables created
# on first write are owned by the group role and every member can INSERT/SELECT.
postgres_telegraf_group_role:
  cmd.run:
    - name: |
        docker exec -i so-postgres psql -v ON_ERROR_STOP=1 -U postgres -d so_telegraf <<'EOSQL'
        DO $$
        BEGIN
          IF NOT EXISTS (SELECT FROM pg_catalog.pg_roles WHERE rolname = 'so_telegraf') THEN
            CREATE ROLE so_telegraf NOLOGIN;
          END IF;
        END
        $$;
        GRANT CONNECT ON DATABASE so_telegraf TO so_telegraf;
        CREATE SCHEMA IF NOT EXISTS telegraf AUTHORIZATION so_telegraf;
        GRANT USAGE, CREATE ON SCHEMA telegraf TO so_telegraf;
        CREATE SCHEMA IF NOT EXISTS partman;
        CREATE EXTENSION IF NOT EXISTS pg_partman SCHEMA partman;
        CREATE EXTENSION IF NOT EXISTS pg_cron;
        -- Telegraf (running as so_telegraf) calls partman.create_parent()
        -- on first write of each metric, which needs USAGE on the partman
        -- schema, EXECUTE on its functions/procedures, and write access to
        -- partman.part_config so it can register new partitioned parents.
        GRANT USAGE, CREATE ON SCHEMA partman TO so_telegraf;
        GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA partman TO so_telegraf;
        GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA partman TO so_telegraf;
        GRANT EXECUTE ON ALL PROCEDURES IN SCHEMA partman TO so_telegraf;
        -- partman creates per-parent template tables (partman.template_*) at
        -- runtime; default privileges extend DML/sequence access to them.
        ALTER DEFAULT PRIVILEGES IN SCHEMA partman
          GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO so_telegraf;
        ALTER DEFAULT PRIVILEGES IN SCHEMA partman
          GRANT USAGE, SELECT, UPDATE ON SEQUENCES TO so_telegraf;
        -- Hourly partman maintenance. cron.schedule is idempotent by jobname.
        SELECT cron.schedule(
          'telegraf-partman-maintenance',
          '17 * * * *',
          'CALL partman.run_maintenance_proc()'
        );
        EOSQL
    - require:
      - cmd: postgres_create_telegraf_db
{% set users = salt['pillar.get']('postgres:auth:users', {}) %}
{% for key, entry in users.items() %}
{% if key.startswith('telegraf_') and entry.get('user') and entry.get('pass') %}
{% set u = entry.user %}
{% set p = entry.pass | replace("'", "''") %}
postgres_telegraf_role_{{ u }}:
  cmd.run:
    - name: |
        docker exec -i so-postgres psql -v ON_ERROR_STOP=1 -U postgres -d so_telegraf <<'EOSQL'
        DO $$
        BEGIN
          IF NOT EXISTS (SELECT FROM pg_catalog.pg_roles WHERE rolname = '{{ u }}') THEN
            EXECUTE format('CREATE ROLE %I WITH LOGIN PASSWORD %L', '{{ u }}', '{{ p }}');
          ELSE
            EXECUTE format('ALTER ROLE %I WITH PASSWORD %L', '{{ u }}', '{{ p }}');
          END IF;
        END
        $$;
        GRANT CONNECT ON DATABASE so_telegraf TO "{{ u }}";
        GRANT so_telegraf TO "{{ u }}";
        EOSQL
    - require:
      - cmd: postgres_telegraf_group_role
{% endif %}
{% endfor %}
# Reconcile partman retention from pillar. Runs after role/schema setup so
# any partitioned parents Telegraf has already created get their retention
# refreshed whenever postgres.telegraf.retention_days changes.
{% set retention = salt['pillar.get']('postgres:telegraf:retention_days', 14) | int %}
postgres_telegraf_retention_reconcile:
  cmd.run:
    - name: |
        docker exec -i so-postgres psql -v ON_ERROR_STOP=1 -U postgres -d so_telegraf <<'EOSQL'
        DO $$
        BEGIN
          IF EXISTS (SELECT 1 FROM pg_catalog.pg_extension WHERE extname = 'pg_partman') THEN
            UPDATE partman.part_config
               SET retention = '{{ retention }} days',
                   retention_keep_table = false
             WHERE parent_table LIKE 'telegraf.%';
          END IF;
        END
        $$;
        EOSQL
    - require:
      - cmd: postgres_telegraf_group_role
{% endif %}
{% else %}
{{sls}}_state_not_allowed:
test.fail_without_changes:
- name: {{sls}}_state_not_allowed
{% endif %}
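Reviewer note: `postgres_wait_ready` in the file above is a bounded retry loop: try a probe up to 60 times, sleep 2 seconds between attempts, and fail if none succeed. The same shape can be written as a reusable function. `wait_until` is a hypothetical helper and not part of this change; the `docker exec ... pg_isready` probe in the usage comment is the one from the state.

```shell
#!/bin/bash
# Hypothetical helper: run a command up to N times with a fixed delay,
# returning 0 on the first success and 1 if every attempt fails.
wait_until() {
  attempts=$1; delay=$2; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Usage matching the state above (requires a running so-postgres container):
#   wait_until 60 2 docker exec so-postgres pg_isready -h 127.0.0.1 -U postgres -q
wait_until 3 0 true && echo "probe succeeded"
wait_until 2 0 false || echo "probe never succeeded"
```

The choice of probe matters more than the loop. As the comment in the state explains, probing over TCP on 127.0.0.1 avoids a false positive from the temporary init-time server, which listens only on the Unix socket.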
@@ -0,0 +1,39 @@
#!/bin/bash
#
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
. /usr/sbin/so-common
# Backups contain role password hashes and full chat data; keep them 0600.
umask 0077
TODAY=$(date '+%Y_%m_%d')
BACKUPDIR=/nsm/backup
BACKUPFILE="$BACKUPDIR/so-postgres-backup-$TODAY.sql.gz"
MAXBACKUPS=7
mkdir -p $BACKUPDIR
# Skip if already backed up today
if [ -f "$BACKUPFILE" ]; then
exit 0
fi
# Skip if container isn't running
if ! docker ps --format '{{.Names}}' | grep -q '^so-postgres$'; then
exit 0
fi
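# Dump all databases and roles, compress. If pg_dumpall fails, remove the
# partial file so the "already backed up today" check does not skip a retry.
docker exec so-postgres pg_dumpall -U postgres | gzip > "$BACKUPFILE"
if [ "${PIPESTATUS[0]}" -ne 0 ]; then
  rm -f "$BACKUPFILE"
  exit 1
fi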
# Retention cleanup
NUMBACKUPS=$(find $BACKUPDIR -type f -name "so-postgres-backup*" | wc -l)
while [ "$NUMBACKUPS" -gt "$MAXBACKUPS" ]; do
  OLDEST=$(find $BACKUPDIR -type f -name "so-postgres-backup*" -printf '%T+ %p\n' | sort | head -n 1 | awk -F" " '{print $2}')
  rm -f "$OLDEST"
  NUMBACKUPS=$(find $BACKUPDIR -type f -name "so-postgres-backup*" | wc -l)
done
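Reviewer note: the retention loop above deletes the oldest backup until at most `MAXBACKUPS` remain, choosing the oldest by modification time. The sketch below shows the same keep-newest-N idea with one deliberate difference: it orders files by name instead of mtime, which works here only because the file names embed a sortable `YYYY_MM_DD` date. `prune_backups` is a hypothetical function, and the example runs against a temporary directory, not `/nsm/backup`.

```shell
#!/bin/bash
# Hypothetical helper: keep the newest $2 backups in directory $1,
# ordering by file name (the names embed a YYYY_MM_DD date).
prune_backups() {
  dir=$1; max=$2
  count=$(( $(find "$dir" -type f -name 'so-postgres-backup*' | wc -l) ))
  while [ "$count" -gt "$max" ]; do
    oldest=$(find "$dir" -type f -name 'so-postgres-backup*' | sort | head -n 1)
    rm -f -- "$oldest"
    count=$((count - 1))
  done
}

demo_dir=$(mktemp -d)
for day in 01 02 03 04; do
  : > "$demo_dir/so-postgres-backup-2026_04_$day.sql.gz"
done
prune_backups "$demo_dir" 2
ls "$demo_dir"   # the 2026_04_03 and 2026_04_04 files remain
rm -rf "$demo_dir"
```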
@@ -0,0 +1,80 @@
#!/bin/bash
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
. /usr/sbin/so-common
usage() {
  echo "Usage: $0 <operation> [args]"
  echo ""
  echo "Supported Operations:"
  echo "  sql       Execute a SQL command, requires: <sql>"
  echo "  sqlfile   Execute a SQL file, requires: <path>"
  echo "  shell     Open an interactive psql shell"
  echo "  dblist    List databases"
  echo "  userlist  List database roles"
  echo ""
  exit 1
}
if [ $# -lt 1 ]; then
usage
fi
# Check for prerequisites
if [ "$(id -u)" -ne 0 ]; then
echo "This script must be run using sudo!"
exit 1
fi
COMMAND=$(basename $0)
OP=$1
shift
set -eo pipefail
log() {
echo -e "$(date) | $COMMAND | $@" >&2
}
so_psql() {
docker exec so-postgres psql -U postgres -d securityonion "$@"
}
case "$OP" in
  sql)
    [ $# -lt 1 ] && usage
    so_psql -c "$1"
    ;;
  sqlfile)
    [ $# -ne 1 ] && usage
    if [ ! -f "$1" ]; then
      log "File not found: $1"
      exit 1
    fi
    docker cp "$1" so-postgres:/tmp/sqlfile.sql
    docker exec so-postgres psql -U postgres -d securityonion -f /tmp/sqlfile.sql
    docker exec so-postgres rm -f /tmp/sqlfile.sql
    ;;
  shell)
    docker exec -it so-postgres psql -U postgres -d securityonion
    ;;
  dblist)
    so_psql -c "\l"
    ;;
  userlist)
    so_psql -c "\du"
    ;;
  *)
    usage
    ;;
esac
@@ -0,0 +1,10 @@
#!/bin/bash
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
. /usr/sbin/so-common
/usr/sbin/so-restart postgres $1
@@ -0,0 +1,10 @@
#!/bin/bash
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
. /usr/sbin/so-common
/usr/sbin/so-start postgres $1
@@ -0,0 +1,10 @@
#!/bin/bash
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
. /usr/sbin/so-common
/usr/sbin/so-stop postgres $1
@@ -0,0 +1,157 @@
#!/bin/bash
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
# Point-in-time host metrics from the Telegraf Postgres backend.
# Sanity-check tool for verifying metrics are landing before the grid
# dashboards consume them.
#
# Assumes Telegraf's postgresql output is configured with
# tags_as_foreign_keys = true, tags_as_jsonb = true, fields_as_jsonb = true,
# so metric tables are (time, tag_id, fields jsonb) and tag tables are
# (tag_id, tags jsonb).
. /usr/sbin/so-common
usage() {
cat <<EOF
Usage: $0 [host]
Shows the most recent CPU, memory, disk, and load metrics for each host
from the so_telegraf Postgres database. Without an argument, reports on
every host that has data. With a host, limits output to that one.
Requires: sudo, so-postgres running, telegraf.output set to
POSTGRES or BOTH.
EOF
exit 1
}
if [ "$(id -u)" -ne 0 ]; then
echo "This script must be run using sudo!"
exit 1
fi
case "${1:-}" in
-h|--help) usage ;;
esac
FILTER_HOST="${1:-}"
SCHEMA="telegraf"
# Host values are interpolated into SQL below. Hostnames are [A-Za-z0-9._-];
# any other character in a tag value or CLI arg is rejected to prevent a
# stored-tag (or CLI) → SQL injection via a compromised Telegraf writer.
HOST_RE='^[A-Za-z0-9._-]+$'
if [ -n "$FILTER_HOST" ] && ! [[ "$FILTER_HOST" =~ $HOST_RE ]]; then
echo "Invalid host filter: $FILTER_HOST" >&2
exit 1
fi
so_psql() {
docker exec so-postgres psql -U postgres -d so_telegraf -At -F $'\t' "$@"
}
if ! docker exec so-postgres psql -U postgres -lqt 2>/dev/null | cut -d\| -f1 | grep -qw so_telegraf; then
echo "Database so_telegraf not found. Is telegraf.output set to POSTGRES or BOTH?"
exit 2
fi
table_exists() {
local table="$1"
[ -n "$(so_psql -c "SELECT 1 FROM information_schema.tables WHERE table_schema='${SCHEMA}' AND table_name='${table}' LIMIT 1;")" ]
}
# Discover hosts from cpu_tag (every minion reports cpu).
if ! table_exists "cpu_tag"; then
echo "${SCHEMA}.cpu_tag not found. Has Telegraf written any rows yet?"
exit 0
fi
HOSTS=$(so_psql -c "
SELECT DISTINCT tags->>'host'
FROM \"${SCHEMA}\".cpu_tag
WHERE tags ? 'host'
ORDER BY 1;")
if [ -z "$HOSTS" ]; then
echo "No hosts found in ${SCHEMA}. Is Telegraf configured to write to Postgres?"
exit 0
fi
print_metric() {
so_psql -c "$1"
}
for host in $HOSTS; do
if ! [[ "$host" =~ $HOST_RE ]]; then
echo "Skipping host with invalid characters in tag value: $host" >&2
continue
fi
if [ -n "$FILTER_HOST" ] && [ "$host" != "$FILTER_HOST" ]; then
continue
fi
echo "===================================================================="
echo " Host: $host"
echo "===================================================================="
if table_exists "cpu"; then
print_metric "
SELECT 'cpu ' AS metric,
to_char(c.time, 'YYYY-MM-DD HH24:MI:SS') AS ts,
round((100 - (c.fields->>'usage_idle')::numeric), 1) || '% used'
FROM \"${SCHEMA}\".cpu c
JOIN \"${SCHEMA}\".cpu_tag t USING (tag_id)
WHERE t.tags->>'host' = '${host}' AND t.tags->>'cpu' = 'cpu-total'
ORDER BY c.time DESC LIMIT 1;"
fi
if table_exists "mem"; then
print_metric "
SELECT 'memory ' AS metric,
to_char(m.time, 'YYYY-MM-DD HH24:MI:SS') AS ts,
round((m.fields->>'used_percent')::numeric, 1) || '% used (' ||
pg_size_pretty((m.fields->>'used')::bigint) || ' of ' ||
pg_size_pretty((m.fields->>'total')::bigint) || ')'
FROM \"${SCHEMA}\".mem m
JOIN \"${SCHEMA}\".mem_tag t USING (tag_id)
WHERE t.tags->>'host' = '${host}'
ORDER BY m.time DESC LIMIT 1;"
fi
if table_exists "disk"; then
print_metric "
SELECT 'disk ' || rpad(t.tags->>'path', 12) AS metric,
to_char(d.time, 'YYYY-MM-DD HH24:MI:SS') AS ts,
round((d.fields->>'used_percent')::numeric, 1) || '% used (' ||
pg_size_pretty((d.fields->>'used')::bigint) || ' of ' ||
pg_size_pretty((d.fields->>'total')::bigint) || ')'
FROM \"${SCHEMA}\".disk d
JOIN \"${SCHEMA}\".disk_tag t USING (tag_id)
WHERE t.tags->>'host' = '${host}'
AND d.time = (SELECT max(d2.time)
FROM \"${SCHEMA}\".disk d2
JOIN \"${SCHEMA}\".disk_tag t2 USING (tag_id)
WHERE t2.tags->>'host' = '${host}')
ORDER BY t.tags->>'path';"
fi
if table_exists "system"; then
print_metric "
SELECT 'load ' AS metric,
to_char(s.time, 'YYYY-MM-DD HH24:MI:SS') AS ts,
(s.fields->>'load1') || ' / ' ||
(s.fields->>'load5') || ' / ' ||
(s.fields->>'load15') || ' (1/5/15m)'
FROM \"${SCHEMA}\".system s
JOIN \"${SCHEMA}\".system_tag t USING (tag_id)
WHERE t.tags->>'host' = '${host}'
ORDER BY s.time DESC LIMIT 1;"
fi
echo ""
done
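The allow-list check that guards the SQL interpolation above can be exercised on its own. The following is a minimal sketch, assuming the same `HOST_RE` pattern as the script; the `is_valid_host` helper name and the sample values are hypothetical and not part of this change.

```shell
#!/bin/bash
# Same allow-list pattern as the script: letters, digits, dot, underscore, hyphen.
HOST_RE='^[A-Za-z0-9._-]+$'

# is_valid_host is a hypothetical helper used only for this sketch.
# It returns 0 when the value is non-empty and matches the allow-list,
# i.e. when it is safe to place inside a single-quoted SQL literal.
is_valid_host() {
  [[ -n "$1" && "$1" =~ $HOST_RE ]]
}

# Sample values are illustrative: two plausible hostnames, one injection
# attempt, and one value containing whitespace.
for candidate in "so-manager" "sensor01.example.lan" "x' OR '1'='1" "bad host"; do
  if is_valid_host "$candidate"; then
    echo "accept: $candidate"
  else
    echo "reject: $candidate"
  fi
done
```

Rejecting anything outside the allow-list, rather than trying to escape quotes, keeps the check independent of how the value is later embedded in the query text.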
+18
@@ -0,0 +1,18 @@
# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
# or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
# https://securityonion.net/license; you may not use this file except in compliance with the
# Elastic License 2.0.
{# Fires on salt/key. Only act on successful key acceptance — not reauth. #}
{% if data.get('act') == 'accept' and data.get('result') == True and data.get('id') %}
{{ data['id'] }}_telegraf_pg_sync:
runner.state.orchestrate:
- args:
- mods: orch.telegraf_postgres_sync
- pillar:
postgres_fanout_minion: {{ data['id'] }}
{% do salt.log.info('telegraf_user_sync reactor: syncing telegraf PG user for minion %s' % data['id']) %}
{% endif %}
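For debugging, the orchestration this reactor fires could in principle be started by hand with `salt-run state.orchestrate` and the same pillar key. The sketch below only assembles and prints that command line, it does not run Salt. The `build_orch_cmd` helper and the example minion id are hypothetical, and the id check is an assumption borrowed from the hostname allow-list in the metrics script, not something Salt enforces.

```shell
#!/bin/bash
# Assumption: minion ids look like hostnames, so they can be embedded in the
# JSON pillar argument without escaping.
MINION_RE='^[A-Za-z0-9._-]+$'

# build_orch_cmd is a hypothetical helper: it prints the salt-run command that
# mirrors what the reactor passes to orch.telegraf_postgres_sync.
build_orch_cmd() {
  local minion_id="$1"
  if ! [[ "$minion_id" =~ $MINION_RE ]]; then
    echo "refusing unexpected minion id: $minion_id" >&2
    return 1
  fi
  printf "salt-run state.orchestrate orch.telegraf_postgres_sync pillar='{\"postgres_fanout_minion\": \"%s\"}'\n" "$minion_id"
}

# Example minion id is illustrative only.
build_orch_cmd "sensor01_sensor"
```

Printing the command instead of executing it keeps the sketch safe to run on a host without a Salt master.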
+13
@@ -62,6 +62,19 @@ engines_config:
- name: /etc/salt/master.d/engines.conf
- source: salt://salt/files/engines.conf
reactor_config_telegraf:
file.managed:
- name: /etc/salt/master.d/reactor_telegraf.conf
- contents: |
reactor:
- 'salt/key':
- /opt/so/saltstack/default/salt/reactor/telegraf_user_sync.sls
- user: root
- group: root
- mode: 644
- watch_in:
- service: salt_master_service
# update the bootstrap script when used for salt-cloud
salt_bootstrap_cloud:
file.managed:
+5
@@ -24,6 +24,11 @@
{% do SOCDEFAULTS.soc.config.server.modules.elastic.update({'username': GLOBALS.elasticsearch.auth.users.so_elastic_user.user, 'password': GLOBALS.elasticsearch.auth.users.so_elastic_user.pass}) %}
{% if GLOBALS.postgres is defined and GLOBALS.postgres.auth is defined %}
{% set PG_ADMIN_PASS = salt['pillar.get']('secrets:postgres_pass', '') %}
{% do SOCDEFAULTS.soc.config.server.modules.update({'postgres': {'hostUrl': GLOBALS.manager_ip, 'port': 5432, 'username': GLOBALS.postgres.auth.users.so_postgres_user.user, 'password': GLOBALS.postgres.auth.users.so_postgres_user.pass, 'adminUser': 'postgres', 'adminPassword': PG_ADMIN_PASS, 'dbname': 'securityonion', 'sslMode': 'require', 'assistantEnabled': true, 'esHostUrl': 'https://' ~ GLOBALS.manager_ip ~ ':9200', 'esUsername': GLOBALS.elasticsearch.auth.users.so_elastic_user.user, 'esPassword': GLOBALS.elasticsearch.auth.users.so_elastic_user.pass, 'esVerifyCert': false}}) %}
{% endif %}
{% do SOCDEFAULTS.soc.config.server.modules.influxdb.update({'hostUrl': 'https://' ~ GLOBALS.influxdb_host ~ ':8086'}) %}
{% do SOCDEFAULTS.soc.config.server.modules.influxdb.update({'token': INFLUXDB_TOKEN}) %}
{% for tool in SOCDEFAULTS.soc.config.server.client.tools %}
+1
@@ -1,5 +1,6 @@
telegraf:
enabled: False
output: BOTH
config:
interval: '30s'
metric_batch_size: 1000
+44
@@ -8,6 +8,14 @@
{%- set ZEEK_ENABLED = salt['pillar.get']('zeek:enabled', True) %}
{%- set MDENGINE = GLOBALS.md_engine %}
{%- set LOGSTASH_ENABLED = LOGSTASH_MERGED.enabled %}
{%- set TG_OUT = TELEGRAFMERGED.output | upper %}
{%- set PG_HOST = GLOBALS.manager_ip %}
{#- Per-minion telegraf creds are written into the minion's own pillar file
(/opt/so/saltstack/local/pillar/minions/<id>.sls) by postgres.auth on the
manager. Each minion only sees its own password — the aggregate map in
postgres:auth:users is manager-scoped. #}
{%- set PG_USER = salt['pillar.get']('postgres:telegraf:user', '') %}
{%- set PG_PASS = salt['pillar.get']('postgres:telegraf:pass', '') %}
# Global tags can be specified here in key="value" format.
[global_tags]
role = "{{ GLOBALS.role.split('-') | last }}"
@@ -72,6 +80,7 @@
# OUTPUT PLUGINS #
###############################################################################
{%- if TG_OUT in ['INFLUXDB', 'BOTH'] %}
# Configuration for sending metrics to InfluxDB
[[outputs.influxdb_v2]]
urls = ["https://{{ INFLUXDBHOST }}:8086"]
@@ -85,6 +94,41 @@
tls_key = "/etc/telegraf/telegraf.key"
## Use TLS but skip chain & host verification
# insecure_skip_verify = false
{%- endif %}
{%- if TG_OUT in ['POSTGRES', 'BOTH'] %}
# Configuration for sending metrics to PostgreSQL.
# options='-c role=so_telegraf' makes every connection SET ROLE to the shared
# group role so tables created on first write are owned by so_telegraf, and
# all per-minion members can INSERT/SELECT them via role inheritance.
# fields_as_jsonb/tags_as_jsonb keep metric tables at a fixed column count so
# high-cardinality inputs (docker, procstat, kafka) don't blow past the
# Postgres 1600-column-per-table limit.
[[outputs.postgresql]]
connection = "host={{ PG_HOST }} port=5432 user={{ PG_USER }} password={{ PG_PASS }} dbname=so_telegraf sslmode=verify-full sslrootcert=/etc/telegraf/ca.crt options='-c role=so_telegraf'"
schema = "telegraf"
tags_as_foreign_keys = true
tags_as_jsonb = true
fields_as_jsonb = true
# Every metric table is a daily time-range partitioned parent managed by
# pg_partman. Retention drops old partitions instead of row-by-row DELETEs.
{% raw %}
# pg_partman 5.x requires the control column (time) to be NOT NULL, so
# ALTER it before create_parent(). And create_parent() splits
# p_parent_table on '.' to look up raw identifiers, so the literal must
# be 'schema.name' (not '"schema"."name"' as .table|quoteLiteral emits).
# IF NOT EXISTS keeps the three templates idempotent so a Telegraf
# restart after any DB-side surgery re-runs them safely.
create_templates = [
'''CREATE TABLE IF NOT EXISTS {{ .table }} ({{ .columns }}) PARTITION BY RANGE ("time")''',
'''ALTER TABLE {{ .table }} ALTER COLUMN "time" SET NOT NULL''',
'''SELECT partman.create_parent(p_parent_table := {{ printf "%s.%s" .table.Schema .table.Name | quoteLiteral }}, p_control := 'time', p_type := 'range', p_interval := '1 day', p_premake := 3) WHERE NOT EXISTS (SELECT 1 FROM partman.part_config WHERE parent_table = {{ printf "%s.%s" .table.Schema .table.Name | quoteLiteral }})'''
]
tag_table_create_templates = [
'''CREATE TABLE IF NOT EXISTS {{ .table }} ({{ .columns }}, PRIMARY KEY (tag_id))'''
]
{% endraw %}
{%- endif %}
###############################################################################
# PROCESSOR PLUGINS #
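Because `PG_USER` and `PG_PASS` fall back to empty strings when the per-minion pillar has not been written yet, the rendered `connection` value can carry blank credentials. The sketch below shows one way to assemble the same keyword/value string in shell and refuse blank credentials. `pg_conninfo` is a hypothetical helper, and the host, user, and password values are placeholders, not values from this change.

```shell
#!/bin/bash
# pg_conninfo is a hypothetical helper. It prints a connection string with the
# same keywords as the template, or fails when user or password is empty.
pg_conninfo() {
  local host="$1" user="$2" pass="$3"
  if [[ -z "$user" || -z "$pass" ]]; then
    echo "missing telegraf Postgres credentials" >&2
    return 1
  fi
  printf "host=%s port=5432 user=%s password=%s dbname=so_telegraf sslmode=verify-full sslrootcert=/etc/telegraf/ca.crt options='-c role=so_telegraf'\n" \
    "$host" "$user" "$pass"
}

# Placeholder values for illustration only.
pg_conninfo "manager.example.lan" "so_telegraf_sensor01" "examplepass"

# Blank credentials are rejected instead of producing "user= password=".
pg_conninfo "manager.example.lan" "" "" 2>/dev/null || echo "skipped: credentials not yet provisioned"
```

The sketch does no quoting of the values, so it assumes credentials without spaces, quotes, or backslashes.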
+9
@@ -4,6 +4,15 @@ telegraf:
forcedType: bool
advanced: True
helpLink: influxdb
output:
description: Selects the backend(s) Telegraf writes metrics to. INFLUXDB keeps the current behavior; POSTGRES writes to the grid's Postgres instance; BOTH dual-writes for migration validation.
options:
- INFLUXDB
- POSTGRES
- BOTH
global: True
advanced: True
helpLink: influxdb
config:
interval:
description: Data collection interval.
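The three `output` options map onto the two `TG_OUT in [...]` guards in the Telegraf template. A minimal sketch of that mapping, assuming the value is upper-cased first as the template does with `| upper`; `telegraf_backends` is a hypothetical name used only here.

```shell
#!/bin/bash
# telegraf_backends is a hypothetical helper: given a telegraf.output value it
# prints the output plugins the template would render, space-separated.
telegraf_backends() {
  local out="${1^^}"   # upper-case the value (requires bash 4+)
  local backends=()
  case "$out" in
    INFLUXDB|BOTH) backends+=("influxdb_v2") ;;
  esac
  case "$out" in
    POSTGRES|BOTH) backends+=("postgresql") ;;
  esac
  echo "${backends[*]}"
}

for value in INFLUXDB postgres BOTH; do
  echo "$value -> $(telegraf_backends "$value")"
done
```

A value outside the three options renders neither output block, which the sketch reflects by printing an empty list.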
+5
@@ -68,6 +68,7 @@ base:
- backup.config_backup
- nginx
- influxdb
- postgres
- soc
- kratos
- hydra
@@ -95,6 +96,7 @@ base:
- backup.config_backup
- nginx
- influxdb
- postgres
- soc
- kratos
- hydra
@@ -123,6 +125,7 @@ base:
- registry
- nginx
- influxdb
- postgres
- strelka.manager
- soc
- kratos
@@ -153,6 +156,7 @@ base:
- registry
- nginx
- influxdb
- postgres
- strelka.manager
- soc
- kratos
@@ -181,6 +185,7 @@ base:
- manager
- nginx
- influxdb
- postgres
- strelka.manager
- soc
- kratos
+2
@@ -1,4 +1,5 @@
{% from 'vars/elasticsearch.map.jinja' import ELASTICSEARCH_GLOBALS %}
{% from 'vars/postgres.map.jinja' import POSTGRES_GLOBALS %}
{% from 'vars/sensor.map.jinja' import SENSOR_GLOBALS %}
{% set ROLE_GLOBALS = {} %}
@@ -6,6 +7,7 @@
{% set EVAL_GLOBALS =
[
ELASTICSEARCH_GLOBALS,
POSTGRES_GLOBALS,
SENSOR_GLOBALS
]
%}
+2
@@ -1,4 +1,5 @@
{% from 'vars/elasticsearch.map.jinja' import ELASTICSEARCH_GLOBALS %}
{% from 'vars/postgres.map.jinja' import POSTGRES_GLOBALS %}
{% from 'vars/sensor.map.jinja' import SENSOR_GLOBALS %}
{% set ROLE_GLOBALS = {} %}
@@ -6,6 +7,7 @@
{% set IMPORT_GLOBALS =
[
ELASTICSEARCH_GLOBALS,
POSTGRES_GLOBALS,
SENSOR_GLOBALS
]
%}
+3 -1
@@ -1,12 +1,14 @@
{% from 'vars/elasticsearch.map.jinja' import ELASTICSEARCH_GLOBALS %}
{% from 'vars/logstash.map.jinja' import LOGSTASH_GLOBALS %}
{% from 'vars/postgres.map.jinja' import POSTGRES_GLOBALS %}
{% set ROLE_GLOBALS = {} %}
{% set MANAGER_GLOBALS =
[
ELASTICSEARCH_GLOBALS,
-LOGSTASH_GLOBALS
+LOGSTASH_GLOBALS,
+POSTGRES_GLOBALS
]
%}
+3 -1
@@ -1,12 +1,14 @@
{% from 'vars/elasticsearch.map.jinja' import ELASTICSEARCH_GLOBALS %}
{% from 'vars/logstash.map.jinja' import LOGSTASH_GLOBALS %}
{% from 'vars/postgres.map.jinja' import POSTGRES_GLOBALS %}
{% set ROLE_GLOBALS = {} %}
{% set MANAGERSEARCH_GLOBALS =
[
ELASTICSEARCH_GLOBALS,
-LOGSTASH_GLOBALS
+LOGSTASH_GLOBALS,
+POSTGRES_GLOBALS
]
%}
+16
@@ -0,0 +1,16 @@
{# Copyright Security Onion Solutions LLC and/or licensed to Security Onion Solutions LLC under one
or more contributor license agreements. Licensed under the Elastic License 2.0 as shown at
https://securityonion.net/license; you may not use this file except in compliance with the
Elastic License 2.0. #}
{% import 'vars/init.map.jinja' as INIT %}
{%
set POSTGRES_GLOBALS = {
'postgres': {}
}
%}
{% if salt['file.file_exists']('/opt/so/saltstack/local/pillar/postgres/auth.sls') %}
{% do POSTGRES_GLOBALS.postgres.update({'auth': INIT.PILLAR.postgres.auth}) %}
{% endif %}
+2
@@ -1,5 +1,6 @@
{% from 'vars/elasticsearch.map.jinja' import ELASTICSEARCH_GLOBALS %}
{% from 'vars/logstash.map.jinja' import LOGSTASH_GLOBALS %}
{% from 'vars/postgres.map.jinja' import POSTGRES_GLOBALS %}
{% from 'vars/sensor.map.jinja' import SENSOR_GLOBALS %}
{% set ROLE_GLOBALS = {} %}
@@ -8,6 +9,7 @@
[
ELASTICSEARCH_GLOBALS,
LOGSTASH_GLOBALS,
POSTGRES_GLOBALS,
SENSOR_GLOBALS
]
%}
+11 -2
@@ -821,6 +821,7 @@ create_manager_pillars() {
soc_pillar
idh_pillar
influxdb_pillar
postgres_pillar
logrotate_pillar
patch_pillar
nginx_pillar
@@ -1053,6 +1054,7 @@ generate_passwords(){
HYDRAKEY=$(get_random_value)
HYDRASALT=$(get_random_value)
REDISPASS=$(get_random_value)
POSTGRESPASS=$(get_random_value)
SOCSRVKEY=$(get_random_value 64)
IMPORTPASS=$(get_random_value)
}
@@ -1355,6 +1357,12 @@ influxdb_pillar() {
" token: $INFLUXTOKEN" > $local_salt_dir/pillar/influxdb/token.sls
}
postgres_pillar() {
title "Create the postgres pillar file"
touch $adv_postgres_pillar_file
touch $postgres_pillar_file
}
make_some_dirs() {
mkdir -p /nsm
mkdir -p "$default_salt_dir"
@@ -1364,7 +1372,7 @@ make_some_dirs() {
mkdir -p $local_salt_dir/salt/firewall/portgroups
mkdir -p $local_salt_dir/salt/firewall/ports
-for THEDIR in bpf elasticsearch ntp firewall redis backup influxdb strelka sensoroni soc docker zeek suricata nginx telegraf logstash soc manager kratos hydra idh elastalert stig global kafka versionlock hypervisor vm; do
+for THEDIR in bpf elasticsearch ntp firewall redis backup influxdb postgres strelka sensoroni soc docker zeek suricata nginx telegraf logstash soc manager kratos hydra idh elastalert stig global kafka versionlock hypervisor vm; do
mkdir -p $local_salt_dir/pillar/$THEDIR
touch $local_salt_dir/pillar/$THEDIR/adv_$THEDIR.sls
touch $local_salt_dir/pillar/$THEDIR/soc_$THEDIR.sls
@@ -1832,7 +1840,8 @@ secrets_pillar(){
printf '%s\n'\
"secrets:"\
" import_pass: $IMPORTPASS"\
-" influx_pass: $INFLUXPASS" > $local_salt_dir/pillar/secrets.sls
+" influx_pass: $INFLUXPASS"\
+" postgres_pass: $POSTGRESPASS" > $local_salt_dir/pillar/secrets.sls
fi
}
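The `printf '%s\n'` pattern in `secrets_pillar` writes one YAML line per argument, so adding `postgres_pass` is a matter of appending one more argument before the redirect. A minimal sketch of the same pattern against a temporary file follows. `random_value` is a hypothetical stand-in for the installer's `get_random_value`, whose implementation is not shown in this diff, and `write_secrets_pillar` and the temp path are likewise illustrative.

```shell
#!/bin/bash
# random_value: hypothetical stand-in for get_random_value.
# Emits N alphanumeric characters (default 20) from /dev/urandom.
random_value() {
  local length="${1:-20}"
  LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c "$length"
}

# write_secrets_pillar: hypothetical helper that writes a secrets pillar with
# the same three keys as the updated secrets_pillar function.
write_secrets_pillar() {
  local target="$1"
  local import_pass influx_pass postgres_pass
  import_pass="$(random_value)"
  influx_pass="$(random_value)"
  postgres_pass="$(random_value)"
  # One argument per output line; the two-space indent nests keys under "secrets:".
  printf '%s\n' \
    "secrets:" \
    "  import_pass: $import_pass" \
    "  influx_pass: $influx_pass" \
    "  postgres_pass: $postgres_pass" > "$target"
}

tmpfile="$(mktemp)"
write_secrets_pillar "$tmpfile"
# Show only the key names so no generated secret is echoed.
cut -d: -f1 "$tmpfile"
rm -f "$tmpfile"
```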
+6
@@ -202,6 +202,12 @@ export influxdb_pillar_file
adv_influxdb_pillar_file="$local_salt_dir/pillar/influxdb/adv_influxdb.sls"
export adv_influxdb_pillar_file
postgres_pillar_file="$local_salt_dir/pillar/postgres/soc_postgres.sls"
export postgres_pillar_file
adv_postgres_pillar_file="$local_salt_dir/pillar/postgres/adv_postgres.sls"
export adv_postgres_pillar_file
logrotate_pillar_file="$local_salt_dir/pillar/logrotate/soc_logrotate.sls"
export logrotate_pillar_file
+2 -1
@@ -71,7 +71,8 @@ log_has_errors() {
grep -vE "remove_failed_vm.sls" | \
grep -vE "failed to copy: httpReadSeeker" | \
grep -vE "Error response from daemon: failed to resolve reference" | \
-grep -vE "log-.*-pipeline_failed_attempts" &> "$error_log"
+grep -vE "log-.*-pipeline_failed_attempts" | \
+grep -vE " -v ON_ERROR_STOP=1" &> "$error_log"
if [[ $? -eq 0 ]]; then
# This function succeeds (returns 0) if errors are detected