Skip Describe portal when executing prepared statements #2422

jackc · 2025-11-02T00:08:03Z

This is a proof of concept for skipping the Describe portal message when executing a prepared statement.

Currently, pgx always sends a Describe portal message when executing a prepared statement. It receives a RowDescription message in response. This is convenient as result sets always include a RowDescription first regardless of whether the query was executed with the simple protocol, the extended protocol without a prepared statement, or the extended protocol with a prepared statement.

But pgx always describes prepared statements when it creates them, so it already has the RowDescription. The only thing it lacks is the format (text or binary) of the result fields, as that is specified per execution. If pgx remembered the formats it requested when it sent the query, it could synthesize the complete RowDescription without needing to ask PostgreSQL to resend it.
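As a rough illustration of that synthesis, the sketch below copies a cached set of field descriptions and fills in the format codes that were requested at Bind time. The `FieldDescription` type and the `withResultFormats` helper are hypothetical stand-ins, not pgx's actual types or API. The format rules follow the PostgreSQL Bind message: zero codes means all text, one code applies to every column, otherwise one code per column.

```go
package main

import "fmt"

// FieldDescription is a simplified, hypothetical stand-in for the field
// metadata pgx caches when it describes a prepared statement.
type FieldDescription struct {
	Name        string
	DataTypeOID uint32
	Format      int16 // 0 = text, 1 = binary
}

// withResultFormats returns a copy of the cached fields with Format set
// from the result format codes that were sent in the Bind message.
// The cached slice is never modified.
func withResultFormats(cached []FieldDescription, resultFormats []int16) ([]FieldDescription, error) {
	out := make([]FieldDescription, len(cached))
	copy(out, cached)

	switch len(resultFormats) {
	case 0:
		// No codes: every column uses the default text format.
		for i := range out {
			out[i].Format = 0
		}
	case 1:
		// One code: it applies to every column.
		for i := range out {
			out[i].Format = resultFormats[0]
		}
	case len(cached):
		// One code per column.
		for i := range out {
			out[i].Format = resultFormats[i]
		}
	default:
		return nil, fmt.Errorf("got %d result format codes for %d columns", len(resultFormats), len(cached))
	}
	return out, nil
}

func main() {
	cached := []FieldDescription{
		{Name: "id", DataTypeOID: 23},
		{Name: "name", DataTypeOID: 25},
	}
	fields, err := withResultFormats(cached, []int16{1})
	if err != nil {
		panic(err)
	}
	// The copy carries the requested format; the cached description is untouched.
	fmt.Println(fields[0].Format, fields[1].Format, cached[0].Format) // prints "1 1 0"
}
```

Working on a copy rather than the cached description sidesteps questions about who else might be reading the cached value at the same time, at the cost of one slice copy per execution.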

This proof of concept adds a new method, *PgConn.ExecPreparedStatementDescription(), that tests this approach.

Here are results of one of the existing benchmarks adapted to use the new method along with the original method:

jack@glados ~/dev/pgx ±prepared-statements-skip-describe-portal » got -run=^$ -bench=PgConnExecPrepared -benchmem
goos: darwin
goarch: arm64
pkg: github.com/jackc/pgx/v5
cpu: Apple M3 Max
BenchmarkSelectRowsPgConnExecPrepared/1_rows/text-16            	   23263	     49916 ns/op	     104 B/op	      10 allocs/op
BenchmarkSelectRowsPgConnExecPrepared/1_rows/binary_-_mostly-16 	   24380	     48728 ns/op	     104 B/op	      10 allocs/op
BenchmarkSelectRowsPgConnExecPrepared/10_rows/text-16           	   21608	     55964 ns/op	     104 B/op	      10 allocs/op
BenchmarkSelectRowsPgConnExecPrepared/10_rows/binary_-_mostly-16         	   21546	     55750 ns/op	     104 B/op	      10 allocs/op
BenchmarkSelectRowsPgConnExecPrepared/100_rows/text-16                   	   10000	    112878 ns/op	     128 B/op	      12 allocs/op
BenchmarkSelectRowsPgConnExecPrepared/100_rows/binary_-_mostly-16        	   10000	    103770 ns/op	     129 B/op	      12 allocs/op
BenchmarkSelectRowsPgConnExecPrepared/1000_rows/text-16                  	    1762	    672274 ns/op	     441 B/op	      25 allocs/op
BenchmarkSelectRowsPgConnExecPrepared/1000_rows/binary_-_mostly-16       	    2073	    580714 ns/op	     474 B/op	      26 allocs/op
BenchmarkSelectRowsPgConnExecPreparedStatementDescription/1_rows/text-16 	   25375	     48632 ns/op	      32 B/op	       2 allocs/op
BenchmarkSelectRowsPgConnExecPreparedStatementDescription/1_rows/binary_-_mostly-16         	   24746	     48533 ns/op	      32 B/op	       2 allocs/op
BenchmarkSelectRowsPgConnExecPreparedStatementDescription/10_rows/text-16                   	   21912	     54659 ns/op	      40 B/op	       2 allocs/op
BenchmarkSelectRowsPgConnExecPreparedStatementDescription/10_rows/binary_-_mostly-16        	   22455	     53232 ns/op	      40 B/op	       2 allocs/op
BenchmarkSelectRowsPgConnExecPreparedStatementDescription/100_rows/text-16                  	   10000	    113159 ns/op	      64 B/op	       4 allocs/op
BenchmarkSelectRowsPgConnExecPreparedStatementDescription/100_rows/binary_-_mostly-16       	   10000	    104351 ns/op	      65 B/op	       4 allocs/op
BenchmarkSelectRowsPgConnExecPreparedStatementDescription/1000_rows/text-16                 	    1813	    677880 ns/op	     377 B/op	      17 allocs/op
BenchmarkSelectRowsPgConnExecPreparedStatementDescription/1000_rows/binary_-_mostly-16      	    2062	    575758 ns/op	     410 B/op	      18 allocs/op
PASS
ok  	github.com/jackc/pgx/v5	23.469s

There is a tiny improvement in runtime, on the order of a few hundred nanoseconds to about 1000 ns per query, which is close to the noise in these results. Per-query memory usage and allocations drop by 64 to 72 B/op and 8 allocs/op in every case, which is significant for this benchmark. Whether it is significant in the context of an application is another question.

It also reduces the amount of network traffic. The test TestConnExecPreparedStatementDescriptionNetworkUsage measures the bytes written to and read from the PostgreSQL server using the same query as the benchmark above when returning a single row.

The number of bytes written to the server differs by only 7 bytes: 54 without Describe and 61 with Describe. But the bytes received differ by 238 bytes: 153 without Describe and 391 with Describe. That is about 2.56x the received bytes.

The percentage change will vary significantly based on the number of columns in the result set, which determines the size of the RowDescription message, and the number of rows returned. If only one row is returned it is quite likely that RowDescription is bigger than the actual data. But if many rows are returned then the RowDescription cost is insignificant.
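To make that trade-off concrete, here is a back-of-the-envelope estimate built from the documented wire layouts of RowDescription and DataRow. The column names and value sizes are made up for illustration and do not correspond to the benchmark query.

```go
package main

import "fmt"

// rowDescriptionSize returns the wire size of a RowDescription message:
// 1 type byte + 4 length bytes + 2 field-count bytes, then per field a
// null-terminated name plus 18 bytes of fixed metadata (table OID,
// attribute number, type OID, type size, type modifier, format code).
func rowDescriptionSize(columnNames []string) int {
	size := 1 + 4 + 2
	for _, name := range columnNames {
		size += len(name) + 1 + 18
	}
	return size
}

// dataRowSize returns the wire size of a DataRow message: 1 type byte +
// 4 length bytes + 2 column-count bytes, then per column a 4-byte length
// followed by the value bytes. A negative length stands for NULL, which
// carries no value bytes.
func dataRowSize(valueLens []int) int {
	size := 1 + 4 + 2
	for _, n := range valueLens {
		size += 4
		if n > 0 {
			size += n
		}
	}
	return size
}

func main() {
	// Hypothetical result set: three columns with small values.
	columns := []string{"id", "first_name", "created_at"}
	values := []int{4, 12, 8}

	desc := rowDescriptionSize(columns)
	row := dataRowSize(values)
	for _, rows := range []int{1, 10, 100, 1000} {
		total := desc + rows*row
		fmt.Printf("%4d rows: RowDescription is %.1f%% of %d bytes\n",
			rows, 100*float64(desc)/float64(total), total)
	}
}
```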

Considerations for whether to move forward with this approach:

1. The biggest issue is the general increase in complexity. It is yet another code path that can be taken when executing queries.
2. This proof of concept doesn't consider whether it is safe to directly update the prepared statement description. There might be concurrency issues if someone is doing something with the statement description in another goroutine. Now I can't think of any reason why someone would do that, so presumably documenting that you can't mess with a statement description while it is being executed would be sufficient.
3. Batches would need to use this new approach as well.
4. It is perfectly valid to do this according to the documented PostgreSQL protocol. However, the PostgreSQL C library libpq doesn't exercise this path. It always sends the Describe portal message. See https://github.com/postgres/postgres/blob/master/src/interfaces/libpq/fe-exec.c#L1883-L1895. We may run into edge cases with PostgreSQL as no one else may be doing this. In addition, it may cause compatibility issues with semi-compatible databases like CockroachDB (CRDB).

analytically · 2025-11-15T22:11:04Z

An even faster implementation of readUntilRowDescription (7-11% faster):

Changes:

Reuse pgConn.fieldDescriptions buffer when capacity allows
Single-pass copy with format application (eliminates two-pass assign+modify)
Early format validation before network I/O (fail-fast)
Hoist format variable outside loop for single-format case

// readUntilRowDescription ensures the ResultReader's fieldDescriptions are loaded. It does not return an error as any
// error will be stored in the ResultReader.
func (rr *ResultReader) readUntilRowDescription(statementDescription *StatementDescription, resultFormats []int16) {
	for !rr.commandConcluded {
		// Peek before receive to avoid consuming a DataRow if the result set does not include a RowDescription message.
		// This should never happen under normal pgconn usage, but it is possible if SendBytes and ReceiveResults are
		// manually used to construct a query that does not issue a describe statement.
		msg, _ := rr.pgConn.peekMessage()
		if _, ok := msg.(*pgproto3.DataRow); ok {
			if statementDescription != nil {
				sourceFields := statementDescription.Fields
				fieldCount := len(sourceFields)

				if cap(rr.pgConn.fieldDescriptions) >= fieldCount {
					rr.fieldDescriptions = rr.pgConn.fieldDescriptions[:fieldCount]
				} else {
					rr.fieldDescriptions = make([]FieldDescription, fieldCount)
				}

				formatLen := len(resultFormats)
				switch {
				case formatLen == 0:
					// No format codes provided, default to text format
					for i := range sourceFields {
						rr.fieldDescriptions[i] = sourceFields[i]
						rr.fieldDescriptions[i].Format = pgtype.TextFormatCode
					}
				case formatLen == 1:
					// Single format code applies to all columns
					format := resultFormats[0]
					for i := range sourceFields {
						rr.fieldDescriptions[i] = sourceFields[i]
						rr.fieldDescriptions[i].Format = format
					}
				case formatLen == fieldCount:
					// One format code per column
					for i := range sourceFields {
						rr.fieldDescriptions[i] = sourceFields[i]
						rr.fieldDescriptions[i].Format = resultFormats[i]
					}
				default:
					// This should not occur if Bind validation is correct, but handle gracefully
					rr.concludeCommand(CommandTag{}, fmt.Errorf("result format codes length %d does not match field count %d", formatLen, fieldCount))
				}
			}
			return
		}

		// Consume the message
		msg, _ = rr.receiveMessage()
		if _, ok := msg.(*pgproto3.RowDescription); ok {
			return
		}
	}
}
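The buffer reuse in the first bullet can be looked at in isolation. The following is a minimal sketch of the "reuse when capacity allows" pattern with a hypothetical `resize` helper and a pared-down `FieldDescription` type. It is not the pgx implementation.

```go
package main

import "fmt"

// FieldDescription is a pared-down, hypothetical stand-in for illustration.
type FieldDescription struct {
	Name   string
	Format int16
}

// resize returns a slice of length n. It reuses buf's backing array when
// the capacity is sufficient and allocates a new array otherwise.
func resize(buf []FieldDescription, n int) []FieldDescription {
	if cap(buf) >= n {
		return buf[:n]
	}
	return make([]FieldDescription, n)
}

func main() {
	buf := make([]FieldDescription, 0, 8)

	small := resize(buf, 3)  // fits: shares buf's backing array
	large := resize(buf, 16) // does not fit: freshly allocated

	small[0].Name = "id"
	// The write is visible through buf because both share one array.
	fmt.Println(buf[:1][0].Name == "id", cap(small), len(large))
}
```

The trade-off is aliasing: a slice handed out this way is only valid until the buffer is reused, so anything that must outlive the next query would need its own copy.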

jackc · 2025-12-26T21:13:29Z

I just added support for skipping Describe to pgconn.Batch. I also restructured some of the row and field description handling including taking some of the optimizations that @analytically posted above.

As of this change, concern "2" (what happens if someone changes a prepared statement description concurrently) no longer applies, because the description is no longer shared.

This functionality is still not available in pgx.Batch. It would still need to be implemented in pgconn.Pipeline as pgx.Batch uses that instead of pgconn.Batch.

jackc · 2026-01-01T03:55:45Z

I've just pushed the implementation for skipping Describe portal with pipeline mode and wired up pgx to use the new functionality. It also includes a number of refactorings and cleanups.

I am inclined to merge this, but it would be good to have some additional people try it out first.

jackc mentioned this pull request Nov 2, 2025

Reduce network bandwidth with better statement names #2413

Open

jackc force-pushed the prepared-statements-skip-describe-portal branch from 3a0c941 to c801b0a Compare December 26, 2025 21:09

jackc added 15 commits December 27, 2025 09:59

POC for skipping Describe Portal when executing prepared statements

d7013fc

Skip test on CockroachDB

131d68a

Add ExecPreparedStatementDescription to pgconn.Batch

b5112ab

Refactor row / field description handling.

Make test compatible with CockroachDB

309af0c

Rename ExecPreparedStatementDescription to ExecStatement

2f983e3

Add Pipeline.SendQueryStatement

1dff5d0

Queue execution of prepared statement without describing portal in pipeline mode. Ugly and hacky but works for now.

Allow preloading ResultReader row values in pipeline mode

8a4a517

Refactor Pipeline.getResults()

a8d7ebe

Now delegates to specific methods for each request type.

Extract Pipeline.receiveMessage

01d8259

Further refactoring of pipeline result handling

c9d12bf

Extract combineFieldDescriptionsAndResultFormats

2891df6

Add test of pgx network usage before eliding Describe

3450975

Get field descriptions even when no rows

f2231d6

pgx uses ExecStatement instead of ExecPrepared

446d1f2

This reduces redundant protocol messages when the statement description is already known.

pgx batch uses SendQueryStatement

228c905

jackc force-pushed the prepared-statements-skip-describe-portal branch from edf0221 to 228c905 Compare January 1, 2026 03:52

jackc mentioned this pull request Jan 1, 2026

Feedback on Skip Describe portal when executing prepared statements #2460

Open
