
Conversation

@jgaalen jgaalen commented Dec 11, 2024

Description

Based on the idea and discussion here, this PR makes it work.

Motivation and Context

For every compressed request, JMeter always decompresses the data and stores the uncompressed data in responseData. This causes high memory usage per thread and adds CPU time for decompression even when the responseData is never used in an assertion, post-processor, or listener.

How Has This Been Tested?

First of all, the tests from the build succeeded (after they initially failed for deflate, which needed some extra care). Then I ran some of my scripts against various websites and all worked.
With debugging I verified that responseData is indeed only decompressed when accessed. I'd like to run a benchmark later under load and compare the impact on memory and CPU.

Screenshots (if appropriate):

Types of changes

Removed decompression from the HC4 and Java HTTP implementations and moved it into getResponseData(). This way, the body is only decompressed when the data is accessed.
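
Roughly, the change amounts to something like the sketch below (field and method names are illustrative, not the actual JMeter code): the sampler stores the raw bytes plus the Content-Encoding, and getResponseData() decodes on first access and caches the result.

```java
// Illustrative sketch of lazy decompression (hypothetical names, not the real JMeter classes).
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.InflaterInputStream;

public class LazyResponseData {
    private final byte[] rawResponseData;   // possibly compressed bytes as received from the wire
    private final String contentEncoding;   // e.g. "gzip", "deflate", or null for plain responses
    private byte[] decodedResponseData;     // cached result of the first decompression

    public LazyResponseData(byte[] rawResponseData, String contentEncoding) {
        this.rawResponseData = rawResponseData;
        this.contentEncoding = contentEncoding;
    }

    /** Decompresses on first access only; repeated calls reuse the cached result. */
    public synchronized byte[] getResponseData() {
        if (decodedResponseData == null) {
            decodedResponseData = decode(rawResponseData, contentEncoding);
        }
        return decodedResponseData;
    }

    private static byte[] decode(byte[] data, String encoding) {
        if (data == null || encoding == null) {
            return data; // nothing to decode
        }
        try {
            if ("gzip".equalsIgnoreCase(encoding) || "x-gzip".equalsIgnoreCase(encoding)) {
                try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
                    return in.readAllBytes();
                }
            }
            if ("deflate".equalsIgnoreCase(encoding)) {
                // deflate needed extra care in the build tests; a plain InflaterInputStream is shown here
                try (InflaterInputStream in = new InflaterInputStream(new ByteArrayInputStream(data))) {
                    return in.readAllBytes();
                }
            }
            return data; // unknown encodings are passed through unchanged
        } catch (IOException e) {
            throw new UncheckedIOException("Failed to decode response body (" + encoding + ")", e);
        }
    }
}
```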

Checklist:

  • My code follows the code style of this project.
  • I have updated the documentation accordingly.

@jgaalen jgaalen (Author) commented Dec 25, 2024

So I've done some benchmarking with this code, comparing it with default 5.6.3.

[benchmark results screenshot]

Results are surprisingly good! Somehow, the benefit shows up more in CPU than in memory.

I ran a benchmark, executing an identical JMX script against the same environment at the same time. Both runs used 100 threads at 300 requests per second. The script was a recording of 5 transactions on a generic website, with a running time of about 80s per thread group.

Ran it on 2 separate AWS EC2 instances, 2 CPU / 2 GB memory, max heap at 1500m.

The instance without the updated code averaged about 15% CPU during the run, while the one with the updated code, which only decompresses when the body is actually used (assertion/post-processor), averaged about 10%. The total memory usage of the VM was also about 10% lower, meaning the JVM didn't spike as high.

I think the memory impact is smaller because responseData and previousResult get overwritten by every next sampler, so only the latest responseData is held at any time. But not having to decompress all responses (for nothing), and not needing much heavier garbage collections, saves more on CPU usage.

@jgaalen jgaalen (Author) commented May 31, 2025

Any status update on this item? The improvement is impressive in both memory and CPU consumption. I've been using a forked version with this improvement for a long time, but it would be great if it could be merged into the JMeter codebase.

@vlsi vlsi added this to the 6.0 milestone Nov 7, 2025
@vlsi vlsi force-pushed the feature/lazy-response-decompression branch from 91a2eaf to 0b3ef6b on November 7, 2025 20:49
vlsi pushed a commit to vlsi/jmeter that referenced this pull request Nov 8, 2025
This commit addresses vlsi's architectural feedback on PR apache#6389 by
implementing a ResponseDecoder interface pattern to decouple decompression
logic from SampleResult.

Key Changes:
- Created ResponseDecoder interface in core module for pluggable decoders
- Implemented PlainResponseDecoder for uncompressed responses
- Created ResponseDecoderFactory in HTTP module with support for:
  * gzip/x-gzip compression (with relax mode support)
  * deflate compression (with relax mode support)
  * Brotli compression
- Modified SampleResult to:
  * Store raw (possibly compressed) response data
  * Track content encoding via new responseContentEncoding field
  * Lazily decompress on getResponseData() using registered decoders
  * Cache decompressed data to avoid repeated decompression
  * Use a registry pattern to avoid circular dependencies
- Updated HTTPHC4Impl to:
  * Disable automatic decompression (removed RESPONSE_CONTENT_ENCODING interceptor)
  * Extract and store Content-Encoding header value
  * Store raw compressed response data
- Updated HTTPJavaImpl to:
  * Remove inline gzip decompression
  * Extract and store Content-Encoding header value
  * Store raw compressed response data

Benefits:
- Decoupled architecture as suggested by vlsi
- Memory efficiency: only decompress when data is actually accessed
- CPU efficiency: avoid unnecessary decompression for responses that aren't read
- Maintains backward compatibility
- Supports all existing compression formats (gzip, deflate, brotli)

The implementation uses a registry pattern where the HTTP module registers
its decoders with SampleResult, avoiding circular dependencies between
core and protocol modules.
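
A minimal sketch of the interface-plus-registry idea described in that commit message (the ResponseDecoder name and registry come from the message above, but the signatures are assumptions, not the actual PR code):

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical signatures; the real interfaces in the PR may differ.
interface ResponseDecoder {
    /** @return true if this decoder handles the given Content-Encoding value */
    boolean supports(String contentEncoding);

    /** Decodes raw (possibly compressed) bytes into the plain response body. */
    byte[] decode(byte[] rawData) throws IOException;
}

// Core-module registry: protocol modules register their decoders at startup,
// so the core module never depends on protocol-specific compression classes.
final class ResponseDecoderRegistry {
    private static final List<ResponseDecoder> DECODERS = new CopyOnWriteArrayList<>();

    private ResponseDecoderRegistry() {}

    static void register(ResponseDecoder decoder) {
        DECODERS.add(decoder);
    }

    static byte[] decode(byte[] rawData, String contentEncoding) throws IOException {
        if (contentEncoding == null) {
            return rawData; // plain response, nothing to do
        }
        for (ResponseDecoder decoder : DECODERS) {
            if (decoder.supports(contentEncoding)) {
                return decoder.decode(rawData);
            }
        }
        return rawData; // no decoder registered for this encoding
    }
}
```

The HTTP module would register its gzip/deflate/Brotli decoders with the registry, and SampleResult would call the registry from getResponseData() on first access, caching the result as in the earlier sketch.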
@vlsi vlsi force-pushed the feature/lazy-response-decompression branch from 0b3ef6b to 2194a08 on November 26, 2025 07:23
@vlsi vlsi (Collaborator) commented Nov 27, 2025

There's an edge case: "Save response as MD5" + compressed response + SampleResult#getBodySizeAsLong.
It is not clear how to make that combination work with "delayed decompression".

When it comes to getBodySizeAsLong, I think it would be fine if we track the number of uncompressed bytes. However, md5(compressed) and md5(uncompressed) are different, so it is a breaking change.

Currently, JMeter computes the MD5 over the decompressed result, so if we compute the MD5 over the compressed data instead, the result will change.

I'm not sure how "save as MD5" is typically used; however, if users have assertions on MD5 values, those assertions would start failing if we checksum the compressed stream instead.
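
To see why this is a breaking change, a small stand-alone example (not JMeter code) comparing the two checksums:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.HexFormat;
import java.util.zip.GZIPOutputStream;

// md5(compressed bytes) differs from md5(decompressed bytes), so changing which
// stream is hashed would change every stored or asserted MD5 value.
public class Md5CompressionDemo {
    public static void main(String[] args) throws Exception {
        byte[] plain = "<html>hello</html>".getBytes(StandardCharsets.UTF_8);

        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(plain);
        }
        byte[] compressed = buf.toByteArray();

        MessageDigest md5 = MessageDigest.getInstance("MD5");
        System.out.println("md5(decompressed) = " + HexFormat.of().formatHex(md5.digest(plain)));
        System.out.println("md5(compressed)   = " + HexFormat.of().formatHex(md5.digest(compressed)));
    }
}
```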

@vlsi vlsi force-pushed the feature/lazy-response-decompression branch from 2194a08 to acad5df on November 27, 2025 11:37
@vlsi vlsi (Collaborator) commented Nov 28, 2025

I'm thinking along the lines of adding a Response Processing combobox:

[mock-up of the proposed Response Processing combobox]

Then the old "use MD5" option would map to "MD5 of decompressed", while users would be able to go for "MD5 of compressed" or even "fetch and discard".

Any thoughts?
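
If that direction is taken, the mapping could look roughly like the following enum (purely a sketch of the idea; the option names and the legacy-flag migration are assumptions, not part of this PR):

```java
// Hypothetical "Response Processing" options and how the legacy
// "Save response as MD5" flag might map onto them.
public enum ResponseProcessing {
    STORE_BODY,              // default: keep the (lazily decompressed) response body
    MD5_OF_DECOMPRESSED,     // what the old "use MD5" checkbox effectively does today
    MD5_OF_COMPRESSED,       // new option: hash the raw bytes and skip decompression entirely
    FETCH_AND_DISCARD;       // new option: read the stream but keep neither body nor checksum

    /** Maps the legacy boolean "Save response as MD5" setting onto the new options. */
    public static ResponseProcessing fromLegacyMd5Flag(boolean saveAsMd5) {
        return saveAsMd5 ? MD5_OF_DECOMPRESSED : STORE_BODY;
    }
}
```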

@vlsi vlsi force-pushed the feature/lazy-response-decompression branch 2 times, most recently from a4830d4 to 13ae5bd on November 30, 2025 18:29
@vlsi vlsi force-pushed the feature/lazy-response-decompression branch from 13ae5bd to df9c678 on December 1, 2025 17:08
@vlsi vlsi force-pushed the feature/lazy-response-decompression branch from df9c678 to e3a8535 on December 1, 2025 17:10