There are some amazing resources available to learn about [Prometheus](https://prometheus.io/). This post is a cheat sheet with examples that serve as a quick reference and a record of interesting things.

# PromQL Queries

## Use API

Use the _query_ endpoint for an instant query,

    curl -s 'http://prom.server:9090/api/v1/query?query=rate(node_cpu_seconds_total%5B1m%5D)' | jq .

Use the _query\_range_ endpoint for a query over a time range,

    curl -s -X POST -F 'query=rate(node_cpu_seconds_total[1m])' -F 'start=2023-02-22T04:33:46.167Z' -F 'end=2023-02-22T04:53:46.167Z' -F 'step=15s' 'http://prom.server:9090/api/v1/query_range' | jq .

Note how the first example needs URL (percent) encoding for characters such as `[` and `]`, while the second does not. The difference comes from how the request is sent, not from the endpoint: in the first example the query is part of the URL of a GET request, in the second the parameters travel as form fields in a POST body. Both endpoints accept GET and POST.

## CPU Usage

The raw counter of CPU seconds spent in each mode, per core,

    node_cpu_seconds_total

Use the [rate function](https://www.metricfire.com/blog/understanding-the-prometheus-rate-function/) to turn the counter into a per-second rate, that is, the fraction of each second a core spent in each mode,

    rate(node_cpu_seconds_total[1m])

Calculate the percentage of CPU usage per core per instance by summing the rates of all modes except the idle mode,

    (sum by (instance, cpu) (rate(node_cpu_seconds_total{mode!="idle"}[30s])))*100

The range (`[30s]` here) must be long enough to contain at least two samples, so it should be at least twice the scrape interval.

The above can be modified to `sum by (instance)`, but that gives a combined total for all cores, which can reach 100% multiplied by the number of cores.

An [alternative](https://www.robustperception.io/understanding-machine-cpu-usage/) that stays between 0% and 100% per instance averages the idle rate across cores and subtracts it from 100,

    100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[30s])) * 100)

## Memory Usage

Query for the percentage of memory in use, counting everything that is not free as used,

    100 - ((node_memory_MemFree_bytes / node_memory_MemTotal_bytes) * 100)

On Linux, `node_memory_MemFree_bytes` excludes memory held by buffers and cache that the kernel can reclaim, so this figure tends to look high. Substituting `node_memory_MemAvailable_bytes` gives a closer estimate of memory pressure.

# Further Reading

- [PromQL Cheat Sheet](https://promlabs.com/promql-cheat-sheet/)
- [Prometheus Metrics: A Practical Guide to Types, Uses, Functions, Exporters, and More](https://www.tigera.io/learn/guides/prometheus-monitoring/prometheus-metrics/)
- [Getting started with PromQL – Includes Cheatsheet!](https://sysdig.com/blog/getting-started-with-promql-cheatsheet/)
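
# Appendix: Encoding Queries in Python

Hand-encoding `[` as `%5B` in the _query_ curl example gets tedious for longer expressions. Below is a minimal sketch of letting Python's standard library do the percent encoding instead. The helper names `instant_query_url` and `range_query_url` are made up for illustration, and `prom.server:9090` is the same placeholder address used in the curl examples. The code only builds URLs and does not send any request.

```python
from urllib.parse import urlencode

# Placeholder address, same as in the curl examples; replace with your server.
PROM_URL = "http://prom.server:9090"


def instant_query_url(promql, base_url=PROM_URL):
    """Build a GET URL for /api/v1/query with the PromQL expression percent-encoded."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


def range_query_url(promql, start, end, step, base_url=PROM_URL):
    """Build a GET URL for /api/v1/query_range; all parameters are percent-encoded."""
    params = {"query": promql, "start": start, "end": end, "step": step}
    return f"{base_url}/api/v1/query_range?{urlencode(params)}"


if __name__ == "__main__":
    print(instant_query_url("rate(node_cpu_seconds_total[1m])"))
    # http://prom.server:9090/api/v1/query?query=rate%28node_cpu_seconds_total%5B1m%5D%29
    print(range_query_url(
        "rate(node_cpu_seconds_total[1m])",
        start="2023-02-22T04:33:46.167Z",
        end="2023-02-22T04:53:46.167Z",
        step="15s",
    ))
```

`urlencode` encodes more characters than the curl example did (parentheses become `%28` and `%29`, spaces become `+`), which is harmless because the server decodes them back. The resulting URLs can be passed to `curl -s` or any HTTP client.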