If this were YouTube, the thumbnail would look something like this
But seriously. Is there any way to monitor long-running Postgres queries?
Kinda!
Using sequences.
The Setup
We create two tables: one small and one big.
DROP TABLE IF EXISTS small;
CREATE TABLE small (
some_val int
);
INSERT INTO small
SELECT
gs
FROM
generate_series(1, 100) AS gs;
DROP TABLE IF EXISTS big;
CREATE TABLE big (
other_val int
);
INSERT INTO big
SELECT
gs % 100 + 1
FROM
generate_series(1, 50000000) AS gs;
CREATE INDEX ON big (other_val);
VACUUM ANALYZE small;
VACUUM ANALYZE big;
The Query
This is the slow query we'll try to monitor
SELECT
some_val,
(
SELECT
count(*)
FROM
big AS b
WHERE
b.other_val = s.some_val
)
FROM
small AS s;
The Technique
Create a dummy sequence
DROP SEQUENCE IF EXISTS my_sequence;
CREATE SEQUENCE my_sequence;
Find a way to incorporate a call to nextval for this sequence
nextval('my_sequence')
Easy enough.
SELECT
some_val,
(
SELECT
count(*)
FROM
big AS b
WHERE
b.other_val = s.some_val
),
nextval('my_sequence')
FROM
small AS s;
Now, while the slow query is running, we can open another psql session and run our 'monitor' in a \watch and see how quickly Postgres is chewing through rows. Sequences aren't transactional, which is why this works. When we see the values from nextval in our watch start shooting up, that's Postgres working its way through the query.
SELECT nextval('my_sequence');
\watch 1
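One wrinkle: the monitor above consumes a sequence value every time it runs, so it inflates the count a little. If that bothers you, here's a sketch of an alternative that peeks at the sequence's position without advancing it, plus an optional reset to run before each attempt (assuming the sequence was created with the default CACHE 1).
-- Optional: reset the counter so the next run starts from 1 again.
SELECT setval('my_sequence', 1, false);

-- Peek at the current position without consuming a value.
-- With the default CACHE 1 this tracks the most recently handed-out
-- number; with a larger cache it jumps in chunks.
SELECT last_value, is_called FROM my_sequence;
\watch 1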
Not Perfect
Unfortunately, this won't work for every query. Postgres needs to evaluate the call to nextval in the 'meat' of the execution for it to work, and Postgres won't always choose to do that. It won't work on this query, for example
SELECT
some_val,
count(*),
nextval('my_sequence')
FROM
small AS s INNER JOIN
big AS b ON
s.some_val = b.other_val
GROUP BY
some_val;
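One rough way to check up front whether a given query will cooperate is to look at EXPLAIN (VERBOSE) output and see which plan node's Output: line ends up carrying nextval. EXPLAIN without ANALYZE only plans the query, so it won't consume any sequence values. For the query above, nextval presumably lands on the aggregate node's output, which means the counter would barely move until the groups are spat out at the very end.
-- Cheap sanity check: where does the planner put nextval('my_sequence')?
EXPLAIN (VERBOSE, COSTS OFF)
SELECT
some_val,
count(*),
nextval('my_sequence')
FROM
small AS s INNER JOIN
big AS b ON
s.some_val = b.other_val
GROUP BY
some_val;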
More Queries
It's nice when it works, though. It can work for UPDATE statements
UPDATE
big
SET
other_val = other_val + 1 + nextval('my_sequence') * 0;
But you could get the math wrong doing it that way and end up changing the logic of the UPDATE. Better to use the RETURNING clause
UPDATE
big
SET
other_val = other_val + 1
RETURNING
nextval('my_sequence');
DELETE statements can work the same way
DELETE FROM
big
RETURNING
nextval('my_sequence');
INSERT statements can use RETURNING the same way as well
INSERT INTO big
SELECT
gs % 100
FROM
generate_series(1, 25000000) AS gs
RETURNING
nextval('my_sequence');
Bonus Trick
There's another way you can monitor INSERT statements that doesn't involve tacking on a RETURNING clause or modifying the INSERT statement in any way. This use case might be pretty rare, but I once wanted to monitor an INSERT statement and I couldn't easily modify the query. The trick was to add a dummy column to the table and make it use the sequence by default.
ALTER TABLE big ADD COLUMN dummy_value int;
ALTER TABLE big ALTER COLUMN dummy_value SET DEFAULT nextval('my_sequence');
INSERT INTO big
SELECT
gs % 100 + 1
FROM
generate_series(1, 25000000) AS gs;
This was in a non-prod environment, so adding a dummy column like that was perfectly OK in that case.
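If you do reach for the dummy-column trick, it's worth cleaning up afterwards (which also keeps the caveat below about forgotten sequences from coming true). A sketch:
-- Drop the column first; its DEFAULT depends on the sequence, so the
-- sequence can't be dropped while the default is still in place.
ALTER TABLE big DROP COLUMN dummy_value;
DROP SEQUENCE IF EXISTS my_sequence;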
Caveat City.
Here are just a few:
- It won't work for every query
- It won't work on read replicas
- Don't use this in prod. Think about how it would make you feel. Creating random sequences that you would forget to clean up. Running hand-edited queries that haven't gotten the test coverage the rest of your application gets. The answer is bad; it would make you feel bad.
- Even in lower environments you still may not have permission to create the necessary sequences