Adopt pg_partman + pg_cron for Telegraf metric tables

Every telegraf.* metric table is now a daily time-range partitioned
parent managed by pg_partman. Retention drops old partitions instead
of the row-by-row DELETE that so-telegraf-trim used to run nightly,
and dashboards will benefit from partition pruning at query time.

- Load pg_cron at server start via shared_preload_libraries and point
  cron.database_name at so_telegraf so job metadata lives alongside
  the metrics
- Telegraf create_templates override makes every new metric table a
  PARTITION BY RANGE (time) parent registered with partman.create_parent
  in one transaction (1 day interval, 3 premade)
- postgres_telegraf_group_role now also creates pg_partman and pg_cron
  extensions and schedules hourly partman.run_maintenance_proc
- New retention reconcile state updates partman.part_config.retention
  from postgres.telegraf.retention_days on every apply
- so_telegraf_trim cron is now unconditionally absent; script stays on
  disk as a manual fallback
This commit is contained in:
Mike Reeves
2026-04-16 17:27:15 -04:00
parent 9fe53d9ccc
commit d9a9029ce5
4 changed files with 41 additions and 9 deletions
+3 -9
View File
@@ -80,20 +80,14 @@ delete_so-postgres_so-status.disabled:
- name: /opt/so/conf/so-status/so-status.conf
- regex: ^so-postgres$
# Retention is now handled by pg_partman (hourly maintenance via pg_cron
# scheduled from postgres/telegraf_users.sls). The so-telegraf-trim script
# stays on disk for manual/emergency use but is no longer scheduled.
so_telegraf_trim:
{% if GLOBALS.telegraf_output in ['POSTGRES', 'BOTH'] %}
cron.present:
{% else %}
cron.absent:
{% endif %}
- name: /usr/sbin/so-telegraf-trim >> /opt/so/log/postgres/telegraf-trim.log 2>&1
- identifier: so_telegraf_trim
- user: root
- minute: '17'
- hour: '3'
- daymonth: '*'
- month: '*'
- dayweek: '*'
{% else %}