Celery tasks with psycopg: ProgrammingError the last operation didn't produce a result
Karim - Sept. 23, 2024
I'm working on a project in which I have:
- A PostgreSQL 16.2 database
- A Python 3.12 backend using psycopg 3.2.1 and psycopg_pool 3.2.2.
- Celery for handling asynchronous tasks.
The Celery tasks use the database pool through the following code:
```python
import os
from contextlib import contextmanager

from psycopg_pool import ConnectionPool

PG_USERNAME = os.getenv('PG_USERNAME')
if not PG_USERNAME:
    raise ValueError("Invalid postgres username")
PG_PASSWORD = os.getenv('PG_PASSWORD')
if not PG_PASSWORD:
    raise ValueError("Invalid postgres pass")
PG_HOST = os.getenv('PG_HOST')
if not PG_HOST:
    raise ValueError("Invalid postgres host")
PG_PORT = os.getenv('PG_PORT')
if not PG_PORT:
    raise ValueError("Invalid postgres port")

# Options used to prevent closed connections
# conn_options = "-c statement_timeout=1800000 -c tcp_keepalives_idle=30 -c tcp_keepalives_interval=30"

conninfo = f'host={PG_HOST} port={PG_PORT} dbname=postgres user={PG_USERNAME} password={PG_PASSWORD}'

connection_pool = ConnectionPool(
    min_size=4,
    max_size=100,
    conninfo=conninfo,
    check=ConnectionPool.check_connection,
    # options=conn_options,
)

@contextmanager
def get_db_conn():
    conn = connection_pool.getconn()
    try:
        yield conn
    finally:
        connection_pool.putconn(conn)
```
And an example Celery task would be:
```python
import traceback

import psycopg
from psycopg.rows import dict_row
from psycopg.types.json import Jsonb

@app.task(bind=True)
def example_task(self, id):
    with get_db_conn() as conn:
        try:
            with conn.cursor(row_factory=dict_row) as cursor:
                test = None
                cursor.execute('SELECT * FROM test WHERE id = %s', (id,))
                try:
                    test = cursor.fetchone()
                except psycopg.errors.ProgrammingError:
                    logger.warning('Test log msg')
                    conn.rollback()
                    return

                cursor.execute("UPDATE test SET status = 'running' WHERE id = %s", (id,))
                conn.commit()

                # Some processing...

                # Fetch another resource needed
                cursor.execute('SELECT * FROM test WHERE id = %s', (test['resource_id'],))
                cursor.fetchone()

                # Update the entry with the result
                cursor.execute("""
                    UPDATE test
                    SET status = 'done', properties = %s
                    WHERE id = %s
                """, (Jsonb(properties), id))
                conn.commit()
        except Exception as e:
            logger.exception(f'Error: {e}')
            conn.rollback()
            with conn.cursor(row_factory=dict_row) as cursor:
                # Update status to error with exception information
                cursor.execute("""
                    UPDATE test
                    SET status = 'error', error = %s
                    WHERE id = %s
                """, (Jsonb({'error': str(e), 'stacktrace': traceback.format_exc()}), id))
                conn.commit()
```
The code works most of the time, but sometimes, when multiple tasks of the same type are launched concurrently, I get errors of type `psycopg.ProgrammingError: the last operation didn't produce a result` on the second fetchone() call.
Meanwhile, on the database side I can see the following warning: `WARNING: there is already a transaction in progress`.
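From what I understand, PostgreSQL only emits this warning when a `BEGIN` arrives on a session that already has an open transaction. My task code never sends an explicit `BEGIN`, but here is a minimal repro of my own (an assumption about the mechanism, not something taken from the task code) of what provokes the warning with psycopg, which starts a transaction implicitly on the first statement:

```python
# Hypothetical repro of the server-side warning, reusing connection_pool
# from above: a non-autocommit psycopg connection opens a transaction
# implicitly on the first statement, so an explicit BEGIN on top of it
# is redundant.
with connection_pool.connection() as conn:
    conn.execute('SELECT 1')  # implicit BEGIN is sent before this statement
    conn.execute('BEGIN')     # server logs: WARNING: there is already a transaction in progress
```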
I suspect there might be some problem with the way I'm working with connections, but I cannot find where.
As far as I know, once get_db_conn() is called, that connection is not available to other tasks, so in theory multiple tasks cannot share the same connection, and therefore there should be no transaction already in progress when the second fetchone() call runs.
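For reference, psycopg_pool also ships a built-in equivalent of my helper: `ConnectionPool.connection()` checks a connection out for exclusive use and, on top of that, commits or rolls back the open transaction before handing the connection back, so it always returns to the pool in a clean state. A minimal sketch with the same pool:

```python
# Sketch using the pool's built-in context manager instead of get_db_conn():
# the connection is checked out exclusively, and on exit the transaction is
# committed (normal exit) or rolled back (exception) before the connection
# goes back to the pool.
with connection_pool.connection() as conn:
    with conn.cursor() as cursor:
        cursor.execute('SELECT count(*) FROM test')
        print(cursor.fetchone())
```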
The resource exists, as every other task can access it, so that's not the problem.
If neither the main target row of test nor the additional one selected via its test.resource_id foreign key is meant to be shared, lock them. Otherwise, concurrent workers are likely bumping into each other: picking up the same row for processing and altering its fields, and the fields of the row it's associated with through resource_id, at unpredictable points between consecutive steps of this operation.
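A minimal sketch of that locking, reusing the question's `test` table and `get_db_conn()` helper (column names are assumptions): `FOR UPDATE` holds a row lock until the transaction commits or rolls back, and `SKIP LOCKED` makes a competing worker see no row instead of blocking, so a duplicate task can simply bail out. Note the intermediate commit has to go, since committing releases the locks:

```python
from psycopg.rows import dict_row

@app.task(bind=True)
def example_task(self, id):
    with get_db_conn() as conn:
        with conn.cursor(row_factory=dict_row) as cursor:
            # Lock the target row; if another worker already holds it,
            # SKIP LOCKED returns no row and this task gives up.
            cursor.execute(
                'SELECT * FROM test WHERE id = %s FOR UPDATE SKIP LOCKED',
                (id,),
            )
            test = cursor.fetchone()
            if test is None:
                conn.rollback()
                return

            # Lock the associated resource row too, since it is also written to.
            cursor.execute(
                'SELECT * FROM test WHERE id = %s FOR UPDATE',
                (test['resource_id'],),
            )
            resource = cursor.fetchone()

            # ... processing with `test` and `resource` ...

            cursor.execute(
                "UPDATE test SET status = 'done' WHERE id = %s",
                (id,),
            )
            conn.commit()  # releases both row locks
```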