securityonion

mirror of https://github.com/Security-Onion-Solutions/securityonion.git synced 2026-06-21 01:44:16 +02:00

Author	SHA1	Message	Date
Josh Patterson	d0bea2ebcb	Restore grouped per-integration logging and retry 409s in fleet integration loader. elastic_fleet_load_integrations_dir now buffers each concurrent job's output (header + API response) to a per-job file and prints them in submission order after wait, restoring the readable serial-style output while keeping concurrent writes. Add --retry-all-errors to the integration create/update curl calls so transient 409 conflicts from concurrent writes to the same agent policy are retried (curl --retry alone does not retry 409).	2026-06-18 11:19:36 -04:00
Josh Patterson	62c01a9756	Merge remote-tracking branch 'origin/3/dev' into soupmod2	2026-06-18 09:53:44 -04:00
reyesj2	16149df71f	formatting	2026-06-16 18:21:28 -05:00
reyesj2	6a18f35020	add context to soup errors and optional soup debug log with xtrace output	2026-06-16 18:21:28 -05:00
Jason Ertel	aa58225e8f	Merge pull request #15974 from Security-Onion-Solutions/jertel/wip: es|ql defaults	2026-06-16 14:27:54 -04:00
Josh Patterson	8e33d0e1e9	Merge remote-tracking branch 'origin/3/dev' into soupmod2	2026-06-16 12:54:18 -04:00
reyesj2	3daed551df	use --fail flag without set -x, since elasticsearch can return a 404 on the template lookup	2026-06-16 11:17:04 -05:00
reyesj2	4456bde1c8	check if template exists without --fail flag	2026-06-16 10:45:53 -05:00
Jorge Reyes	4a6c675223	skip kibana backport if the template doesn't exist	2026-06-16 10:33:11 -05:00
reyesj2	a769d4c680	another unneeded default	2026-06-16 09:32:37 -05:00
reyesj2	f68e3e47a1	remove pillar merge	2026-06-16 09:19:10 -05:00
Jorge Reyes	b81257bf45	Merge pull request #15973 from Security-Onion-Solutions/reyesj2/dlm-support: Data stream lifecycle management support	2026-06-15 14:47:51 -05:00
reyesj2	1a423a2434	update message	2026-06-15 14:17:34 -05:00
reyesj2	95cae4c734	remove so-elasticsearch-indices-delete cron when using DLM	2026-06-15 13:32:45 -05:00
reyesj2	596471e140	using new annotation config	2026-06-15 13:31:53 -05:00
reyesj2	d10f21399c	remove comments	2026-06-15 13:31:23 -05:00
Jason Ertel	ae1ddf3817	es|ql defaults	2026-06-15 12:33:08 -04:00
Josh Patterson	1ee555957a	Speed up so-elastic-fleet-integration-upgrade. Fetch each agent policy once and extract integration name/package/version/id locally via a single jq pass instead of re-fetching the identical policy JSON 1+3N times. Memoize epm/packages latest-version lookups so each package is queried once instead of per (policy, integration). Dispatch the per-integration dry-run+upgrade as throttled background jobs (MAX_FLEET_JOBS) with flock-serialized output and a FAIL_FILE marker, mirroring elastic_fleet_load_integrations_dir. Behavior preserved: same elastic-defend-endpoints/fleet_server skips, same AUTO_UPGRADE_INTEGRATIONS default-package gating (moved into jq, using $defaults to avoid the jq $def keyword collision), and exit 1 on any failure so salt retries.	2026-06-12 15:23:43 -04:00
Josh Patterson	43f72c1f9f	Parallelize so-elasticsearch-templates-load template PUTs. Load component and index templates as throttled background jobs (max 10 concurrent) instead of sequential curl PUTs, matching the bounded-concurrency + flock-serialized-output pattern used by the fleet/ILM load scripts. Keeps a wait barrier between the component phase and the index phase so index templates never load before their referenced component templates. Failures are tracked via per-job marker files since counter increments can't escape background subshells.	2026-06-12 15:11:34 -04:00
Josh Brower	9031c1fd22	userid vs names	2026-06-12 11:18:59 -04:00
Josh Patterson	ae6a705ce1	Speed up so-elastic-fleet-integration-policy-load. Fetch each agent policy once per group instead of refetching the full policy (plus a fresh Kibana session cookie) for every integration file, and dispatch the create/update writes as throttled background jobs. Adds elastic_fleet_load_integrations_dir and elastic_fleet_throttle to so-elastic-fleet-common, reusing the bounded-concurrency pattern from so-elasticsearch-ilm-policy-load. Replaces the four serial loops in the loader with one call per agent policy.	2026-06-12 09:38:41 -04:00
reyesj2	c505160480	set default DLM retention 90d	2026-06-11 15:13:28 -05:00
reyesj2	d9f6cde4e1	remove global setting from data_retention annotation	2026-06-11 15:11:29 -05:00
Josh Patterson	b1273573ed	Fix jq $def keyword collision in optional-integrations-load. The agent-policy enumeration passed --argjson def, creating a jq variable $def. 'def' is a reserved keyword in jq and the deployed jq version rejects it, so the program failed to compile and in_use_integrations was left empty (silently disabling the in-use upgrade guard). Rename the arg to $defaults.	2026-06-11 15:50:53 -04:00
Josh Patterson	6c42c419e2	Serialize ILM policy-load output with flock to stop interleaving. A single printf per block was not actually one write() call, so concurrent jobs still occasionally interleaved their label and response lines. Hold an flock around just the printf (curl still runs in parallel) so each policy's block prints intact, keeping live completion-order streaming.	2026-06-11 15:42:41 -04:00
Josh Patterson	f23652397c	Speed up so-elastic-fleet-optional-integrations-load decision logic. Replace the per-package decision loop (which forked ~10 processes per package and rebuilt a growing JSON file on every add -> O(n^2)) with two jq passes: one prints the status messages, one builds the bulk install list. A vnum/needs() jq definition reproduces the previous version_conversion/compare_versions and excluded/subscription/installed/upgrade/in-use logic exactly. Also fetch each agent policy once and extract non-default package names locally instead of re-fetching the policy per integration (1+K -> 1 GET per policy). Install behavior is unchanged.	2026-06-11 13:57:56 -04:00
Josh Patterson	07d3b148b5	fix output	2026-06-11 13:37:26 -04:00
Josh Patterson	780d9faf0d	Parallelize so-elasticsearch-ilm-policy-load PUTs. Run the ~300 ILM policy PUTs concurrently (bounded to 10 in flight via a throttle gate) instead of one serial curl per policy. Adds a put_policy helper and waits for all background jobs before exiting. Preserves policy parity; only the scheduling changes. Drops the dead empty sid cookie arg (falls back to basic auth from curl.config as before).	2026-06-11 12:08:32 -04:00
reyesj2	4741cc92bd	fleet manager start kibana if it isn't already running and wait for healthy status	2026-06-10 17:52:08 -05:00
reyesj2	46655860e9	http	2026-06-10 17:27:23 -05:00
reyesj2	289ddda5e8	kibana health check for fleet scripts	2026-06-10 17:06:22 -05:00
Josh Patterson	83aaa76f98	allow full highstate on manager when locked	2026-06-10 16:34:10 -04:00
reyesj2	f905afbc6f	logging	2026-06-10 15:01:22 -05:00
reyesj2	bd5e77afc5	increase delay in so-elastic-fleet-package-upgrade attempts	2026-06-10 14:59:29 -05:00
reyesj2	944e773759	save exit until all packages have been attempted	2026-06-10 14:58:49 -05:00
reyesj2	cf456dc58c	reuse existing index templates	2026-06-09 23:21:43 -05:00
reyesj2	9aa9ea3255	Initial DLM support	2026-06-09 23:19:26 -05:00
Josh Patterson	448668a72e	Merge remote-tracking branch 'origin/3/dev' into nostartupstates	2026-06-09 14:02:00 -04:00
Josh Patterson	f088a27159	so-boot-mine-update: warm master pillar cache before highstate. A complete mine is not enough: elasticsearch:nodes, redis:nodes, logstash:nodes (tgt_type=pillar) and hypervisor:nodes (tgt_type=compound) resolve their target against the master's per-minion data cache (grains+pillar in data.p), which is populated only when a minion's pillar is recompiled -- separately from the mine. After a reboot a node can be in the mine (so node_data/glob sees it) yet absent from that cache, so it fails the elasticsearch:enabled:true pillar match and is dropped from elasticsearch:nodes -> so-elasticsearch ExtraHosts -> container recreate. After the mine-completeness wait, run salt '*' saltutil.refresh_pillar wait=True to synchronously cache every up node's pillar (the same lever deploy_newnode.sls uses), then verify with salt-run cache.pillar and retry stragglers, bounded by MINE_UPDATE_MAX_WAIT. Also log elasticsearch:nodes alongside node_data for inspection.	2026-06-09 13:52:19 -04:00
Josh Patterson	27c7702325	so-boot-mine-update: wait for a complete mine before highstate. Mine-backed pillars (node_data, elasticsearch:nodes, redis:nodes, logstash:nodes, hypervisor:nodes) include a node only if it returned an IP from the mine, and the configs they build are rebuilt fresh every highstate. After a manager reboot with a flushed mine, the first boot highstate could run before an up node re-reported network.ip_addrs, dropping it from e.g. so-elasticsearch ExtraHosts and forcing a container recreate. After the initial broad mine.update, poll until every currently-up minion actually has network.ip_addrs in the mine, re-pushing mine.update to stragglers, before releasing the boot highstate. Shares the existing MINE_UPDATE_MAX_WAIT backstop so a slow/down node never blocks boot, and still logs the rendered node_data for inspection.	2026-06-09 10:10:32 -04:00
Josh Patterson	8c306eb37d	so-boot-mine-update: log the rendered node_data content. Dump the actual rendered node_data pillar (pretty-printed JSON) to the journal instead of just a rendered/empty verdict, so the boot-time render attempt is fully inspectable. Empty renders print false/null and still emit the WARNING.	2026-06-09 09:49:19 -04:00
Josh Patterson	e536ffa363	so-boot-mine-update: render node_data after mine.update before highstate. After the boot-time mine.update, have the manager actually render the node_data pillar and log whether it came back populated. node_data: False makes salt/top.sls apply the bootstrap recovery branch instead of the manager's real config, so surfacing this in the journal makes the condition visible before so-boot-highstate runs. Best-effort and non-blocking: always exits 0 so highstate proceeds regardless.	2026-06-09 09:35:24 -04:00
Jorge Reyes	d7aa7ab228	Merge pull request #15961 from Security-Onion-Solutions/reyesj2/fleet-autoconfigure: respect elasticfleet enable_auto_configuration setting for so-elastic…	2026-06-08 15:09:58 -05:00
Jorge Reyes	fe0b68d24c	Merge pull request #15958 from Security-Onion-Solutions/reyesj2-patch-template: fix elasticsearch template generation issue	2026-06-08 15:07:49 -05:00
reyesj2	6ad345730b	respect elasticfleet enable_auto_configuration setting for so-elastic-fleet-urls-update	2026-06-08 15:02:57 -05:00
Josh Patterson	9580976ba2	Add manager boot-time grid mine.update oneshot before highstate. so-boot-mine-update.service is a manager-only Type=oneshot unit that runs once per boot after salt-master/salt-minion start and before so-boot-highstate.service. It pushes mine.update to all reachable minions so mine-backed pillars (node IPs, ES/Redis/Logstash discovery) are fresh before the boot highstate renders them. The helper waits for the responsive minion set to settle (plateau) rather than for every accepted key to report up, so an intentionally powered-off minion doesn't block the update; MAX_WAIT remains as a backstop.	2026-06-08 11:05:13 -04:00
reyesj2	ac907ba45f	fix elasticsearch template generation issue	2026-06-05 16:42:08 -05:00
Josh Patterson	cb3631da81	Move setup-complete marker from /opt/so/conf to /opt/so/state. The setup-complete marker is a runtime-state file, not config, so move it to /opt/so/state/setup-complete. Updates both writers (mark_setup_complete in setup/so-functions and the upgrade-path state in minion/init.sls) and the three readers (so-boot-highstate.service ConditionPathExists, boot_highstate.sls enable gate, and the so-user_sync cron gate).	2026-06-04 15:07:27 -04:00
Josh Patterson	f5d63f585e	Merge remote-tracking branch 'origin/3/dev' into nostartupstates	2026-06-04 09:19:01 -04:00
Josh Patterson	13f8be40b5	so-boot-highstate: wait for docker before running highstate. Add docker.service to After= and Wants= so the boot-time highstate starts after docker is up. Uses Wants (soft) so highstate still runs if docker fails to start.	2026-06-04 08:46:35 -04:00
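
Commits d0bea2ebcb and 43f72c1f9f describe running API writes as bounded background jobs, buffering each job's output to a per-job file so it can be printed in submission order, and tracking failures with marker files. The following is a minimal sketch of that pattern, not the repository's code: `MAX_JOBS`, `run_job` and `run_all` are hypothetical names, `run_job` stands in for a curl call, and the gate is a simple batch barrier, which may be cruder than the throttle the commits describe.

```shell
# Hypothetical sketch: MAX_JOBS, run_job and run_all are illustrative names,
# not functions from the repository.
MAX_JOBS=${MAX_JOBS:-10}

# Stand-in for one API call (the real scripts use curl). "bad" simulates a failure.
run_job() {
  echo "== $1 =="
  if [ "$1" = "bad" ]; then
    echo "error"
    return 1
  fi
  echo "ok"
}

# Run one job per argument with at most MAX_JOBS in flight, buffer each job's
# output in its own file, then print the buffers in submission order.
# Exit status is 1 if any job failed. The ( ) body keeps variables local.
run_all() (
  tmp=$(mktemp -d) || exit 1
  i=0
  in_flight=0
  for item in "$@"; do
    # A counter incremented inside a background subshell never reaches the
    # parent, so a failed job leaves a marker file instead.
    ( run_job "$item" >"$tmp/$i.out" 2>&1 || : >"$tmp/$i.fail" ) &
    i=$((i + 1))
    in_flight=$((in_flight + 1))
    # Simplified gate: once MAX_JOBS are running, wait for the whole batch.
    if [ "$in_flight" -ge "$MAX_JOBS" ]; then
      wait
      in_flight=0
    fi
  done
  wait
  n=0
  rc=0
  while [ "$n" -lt "$i" ]; do
    cat "$tmp/$n.out"
    if [ -e "$tmp/$n.fail" ]; then
      rc=1
    fi
    n=$((n + 1))
  done
  rm -rf "$tmp"
  exit "$rc"
)
```

Calling `run_all alpha bad gamma` prints the three blocks in that order, each intact, and returns a non-zero status so a caller can retry.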

11867 Commits
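
Commits 27c7702325 and 9580976ba2 describe polling for a condition while keeping a maximum wait as a backstop, so that a slow or down node never blocks boot. A minimal sketch of such a bounded poll is shown below. `poll_until` is a hypothetical helper, not the repository's code, and its argument order is an assumption made for illustration.

```shell
# Hypothetical helper, not the repository's code.
# Usage: poll_until MAX_WAIT INTERVAL CMD [ARGS...]
# Re-runs CMD until it succeeds. Gives up with status 1 once MAX_WAIT seconds
# of sleeping have accumulated, so a condition that never holds cannot block
# the caller forever. The ( ) body keeps variables local.
poll_until() (
  max_wait=$1
  interval=$2
  shift 2
  elapsed=0
  until "$@"; do
    if [ "$elapsed" -ge "$max_wait" ]; then
      exit 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  exit 0
)
```

A caller that must never block could log and carry on when the poll times out, for example `poll_until 120 5 some_check || echo "WARNING: timed out, continuing"`, where `some_check` is a placeholder for the caller's own readiness command.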