
Conversation

@sleipnir
Collaborator

@sleipnir sleipnir commented Dec 8, 2025

Hi everyone, I'd like to talk a little about what I'm currently working on here.

I'm almost finished implementing a new server-side adapter written entirely in Elixir on top of the thousand_island library, and the results so far are promising, as shown below.

Performance Metrics

| Metric | Cowboy | ThousandIsland | Improvement |
| --- | --- | --- | --- |
| Requests Processed | 1,475 | 3,115 | +111% (2.11x) |
| Total Time | 1.46s | 0.86s | -41% (1.71x faster) |
| Throughput | ~1,008 req/s | ~3,637 req/s | +261% |
| Minimum Latency | 618µs | 177µs | -71% (3.5x faster) |
| Maximum Latency | 89.1ms | 39.6ms | -56% (2.25x faster) |
| Average Latency | ~991µs | ~274µs | -72% (3.6x faster) |
| CPU Time (User) | 3.12s | 1.72s | -45% |
| CPU Time (System) | 0.50s | 0.48s | -4% |

Analysis

Advantages of ThousandIsland:

  • Dramatically higher throughput:
    Processed more than double the number of requests in the same test period.
  • Significantly lower latencies:
    • Minimum latency 3.5x lower
    • Maximum latency 2.25x lower
    • Average latency approximately 3.6x lower
  • CPU efficiency:
    Used 45% less CPU time (user time).
  • Reduced total execution time:
    Completed the test run in almost half the time.

The ThousandIsland adapter demonstrates substantially superior performance compared to Cowboy across all measured aspects. The implementation delivers:

  • Higher throughput (~3.6x)
  • Lower latencies (roughly 2.25-3.6x lower, depending on the metric)
  • Better CPU efficiency (~45% less user CPU time)

This is still a draft and needs a lot of refinement. I opened the PR just to document the work and share it with everyone.

Adriano Santos added 18 commits December 8, 2025 15:54
Move grpc_core/lib/grpc/http2 to grpc_core/lib/grpc/transport/http2
to better reflect that HTTP/2 is the transport layer for gRPC.

Changes:
- Rename GRPC.HTTP2.* to GRPC.Transport.HTTP2.*
- Update all imports and aliases in grpc_server and grpc_client
- Update all test files
Add detailed test coverage for HTTP/2 frame implementations in the frame/
directory, focusing on gRPC-specific use cases and edge cases.

These tests cover gRPC-specific HTTP/2 scenarios including trailers-only
responses, large message handling, connection keepalive, and flow control
patterns commonly used in gRPC implementations.
HTTP/2 (RFC 9113) requires that a HEADERS frame carrying the :status
pseudo-header be sent before any TRAILERS frame. The previous
implementation conditionally skipped HEADERS when a stream had already
received END_STREAM from the client, causing protocol errors.

This fix ensures that send_grpc_error ALWAYS sends HTTP/2 HEADERS
(with required :status and :content-type headers) before sending
TRAILERS (with grpc-status and grpc-message), regardless of the
stream state (half-closed remote or not).

This resolves the 'timeout_on_sleeping_server' interop test failure,
where the Gun client was rejecting error responses with the message:
'A required pseudo-header was not found'.
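
For illustration, a minimal sketch (not the actual send_grpc_error code) of the frame ordering this fix enforces for error responses; header values follow the gRPC-over-HTTP/2 spec:

# Illustration only: the response HEADERS block must be emitted before the
# trailers, which go out as a final HEADERS frame with END_STREAM set.
response_headers = [
  {":status", "200"},
  {"content-type", "application/grpc+proto"}
]

trailers = [
  # status 4 = DEADLINE_EXCEEDED, as in the timeout_on_sleeping_server test
  {"grpc-status", "4"},
  {"grpc-message", "Deadline expired"}
]

frames_to_send = [
  {:headers, response_headers, end_stream: false},
  {:headers, trailers, end_stream: true}
]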
@sleipnir
Collaborator Author

ThousandIsland Adapter Implementation Summary

This PR introduces a pure Elixir server adapter for gRPC using ThousandIsland, providing an alternative to the Cowboy adapter.

Implementation Checklist

  • Core HTTP/2 protocol implementation with state machine for stream lifecycle management
  • Full gRPC protocol support (unary, client streaming, server streaming, bidirectional streaming)
  • Async message-based architecture for non-blocking response sending
  • HTTP/2 frame compliance (HEADERS, DATA, TRAILERS with proper pseudo-header handling)
  • Deadline/timeout support with grpc-timeout header parsing (see the parsing sketch after this checklist)
  • Comprehensive test coverage across all packages
  • Interop test suite validation (18/18 tests passing)
  • Multiple client adapter support (Gun and Mint)
  • Error handling and graceful degradation
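
As a concrete illustration of the grpc-timeout handling referenced in the checklist, here is a minimal parsing sketch based on the gRPC spec (1-8 ASCII digits followed by a unit character); it is not the PR's actual parser:

defmodule GrpcTimeoutSketch do
  # Sketch only, not the PR's implementation. Unit characters per the gRPC spec:
  # H = hours, M = minutes, S = seconds, m = millis, u = micros, n = nanos.
  @unit_to_ms %{
    "H" => 3_600_000,
    "M" => 60_000,
    "S" => 1_000,
    "m" => 1,
    "u" => 1.0e-3,
    "n" => 1.0e-6
  }

  # No grpc-timeout header means no deadline.
  def to_ms(nil), do: :infinity

  def to_ms(header) when is_binary(header) do
    {digits, unit} = String.split_at(header, -1)
    round(String.to_integer(digits) * Map.fetch!(@unit_to_ms, unit))
  end
end

# GrpcTimeoutSketch.to_ms("1S")    #=> 1000
# GrpcTimeoutSketch.to_ms("200m")  #=> 200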

Test Coverage

grpc_core Package

  • 289 total tests (148 new HTTP/2 frame tests added)
  • 6 doctests
  • 100% pass rate (0 failures)

grpc_client Package

  • 191 total tests
  • 2 doctests
  • 100% pass rate (0 failures)
  • Coverage: Gun and Mint adapter integration, streaming scenarios, error handling

grpc_server Package

  • 260 total tests
  • 2 doctests
  • 100% pass rate (0 failures)
  • Coverage: ThousandIsland and Cowboy adapters, HTTP/2 connection management, stream lifecycle

Interop Test Suite

  • 18/18 tests passing with ThousandIsland adapter
  • Tested with the Gun client adapter
  • Tested with the Mint client adapter
  • All gRPC patterns validated:
    • empty_unary - Empty request/response
    • large_unary - Large payloads (10MB)
    • client_streaming - Client streaming with aggregation
    • server_streaming - Server streaming with multiple responses
    • ping_pong - Bidirectional streaming (alternating)
    • empty_stream - Bidirectional streaming with no messages
    • custom_metadata - Request/response metadata handling
    • status_code_and_message - Error status propagation
    • special_status_message - Unicode in error messages
    • unimplemented_service - Service not found (status 12, UNIMPLEMENTED)
    • unimplemented_method - Method not found (status 12, UNIMPLEMENTED)
    • cancel_after_begin - Early cancellation
    • cancel_after_first_response - Mid-stream cancellation
    • timeout_on_sleeping_server - Deadline exceeded (status 4, DEADLINE_EXCEEDED)

Total test count across all packages: 740+ tests

Architecture Overview

Handler Structure

GRPC.Server.Adapters.ThousandIsland.Handler
├── handle_connection/2 - Initial HTTP/2 setup
├── handle_data/3 - Process incoming HTTP/2 frames
└── handle_info/2 - Async message handling for responses
    ├── {:grpc_send_data, stream_id, data}
    └── {:grpc_send_trailers, stream_id, trailers}
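
A minimal sketch of how these callbacks fit together on top of ThousandIsland.Handler (which runs inside a GenServer, so the async response messages arrive in handle_info/2). This is illustrative only, elides all HTTP/2 frame encoding/decoding, and uses a placeholder module name rather than the PR's code:

defmodule ExampleGrpcHandler do
  use ThousandIsland.Handler

  @impl ThousandIsland.Handler
  def handle_connection(_socket, state) do
    # Initial HTTP/2 setup: read the connection preface, exchange SETTINGS, etc.
    {:continue, state}
  end

  @impl ThousandIsland.Handler
  def handle_data(_data, _socket, state) do
    # Parse incoming HTTP/2 frames, update per-stream state and dispatch
    # complete gRPC requests to the service implementation.
    {:continue, state}
  end

  # ThousandIsland.Handler keeps {socket, state} as the GenServer state, so the
  # async response messages from the service land here and are written in order.
  def handle_info({:grpc_send_data, _stream_id, encoded_frame}, {socket, state}) do
    ThousandIsland.Socket.send(socket, encoded_frame)
    {:noreply, {socket, state}}
  end

  def handle_info({:grpc_send_trailers, _stream_id, encoded_trailers}, {socket, state}) do
    ThousandIsland.Socket.send(socket, encoded_trailers)
    {:noreply, {socket, state}}
  end
end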

Connection State Machine

GRPC.Server.HTTP2.Connection
├── Stream lifecycle: idle → open → half_closed → closed
├── Frame processing: HEADERS, DATA, CONTINUATION, TRAILERS
├── Flow control: WINDOW_UPDATE, SETTINGS
├── Error handling: GOAWAY, RST_STREAM
└── State management:
    ├── headers_sent: boolean
    ├── end_stream_received: boolean
    └── stream removal after END_STREAM sent
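
A simplified sketch of the stream lifecycle transitions listed above (illustration only; the real connection module also tracks flow-control windows, HPACK state, headers_sent, and so on):

defmodule StreamLifecycleSketch do
  # A HEADERS frame opens an idle stream.
  def transition(:idle, {:recv, :headers}), do: :open

  # END_STREAM from the peer, or sent by us, half-closes the stream.
  def transition(:open, {:recv, :end_stream}), do: :half_closed_remote
  def transition(:open, {:send, :end_stream}), do: :half_closed_local

  # The second END_STREAM, or an RST_STREAM in either direction, fully closes it.
  def transition(:half_closed_remote, {:send, :end_stream}), do: :closed
  def transition(:half_closed_local, {:recv, :end_stream}), do: :closed
  def transition(_state, {_direction, :rst_stream}), do: :closed

  # Anything else leaves the stream state unchanged.
  def transition(state, _event), do: state
end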

Async Response Model

The ThousandIsland adapter relies on asynchronous message passing so that sending responses never blocks frame processing, which matters most for bidirectional streaming (see the sketch after the list below):

  1. Request Phase:

    • Client sends HEADERS + DATA frames
    • Server parses HTTP/2 frames into gRPC request
    • Stream marked as open in connection state
  2. Dispatch Phase:

    • Request dispatched to service implementation
    • Stream kept alive for async response messages
    • Service receives context with response PID
  3. Response Phase:

    • Service sends responses via async messages (send(pid, {:grpc_send_data, ...}))
    • Handler receives messages in handle_info/2
    • Connection state updated after each send operation
    • Proper ordering maintained through message queue
  4. Completion Phase:

    • Service sends TRAILERS with end_stream: true
    • Stream removed from connection state
    • Resources cleaned up gracefully
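
A hedged sketch of phases 2-4 from the dispatching side (the sketch referenced above); DispatchSketch and the encode_* helpers are hypothetical placeholders for the real dispatch and frame-encoding code:

defmodule DispatchSketch do
  def dispatch(stream_id, replies, handler_pid) do
    Task.start(fn ->
      # Dispatch phase: the service runs in its own process, so the handler
      # keeps parsing frames while responses are being produced.
      for reply <- replies do
        # Response phase: each reply is sent back asynchronously; the handler's
        # handle_info/2 writes it to the socket in arrival order.
        send(handler_pid, {:grpc_send_data, stream_id, encode_data(reply)})
      end

      # Completion phase: trailers carry grpc-status/grpc-message plus END_STREAM,
      # after which the handler drops the stream from its connection state.
      send(handler_pid, {:grpc_send_trailers, stream_id, encode_trailers(:ok)})
    end)
  end

  # Placeholder encoders; the real adapter emits proper HTTP/2 DATA and HEADERS frames.
  defp encode_data(reply), do: {:data_frame, reply}
  defp encode_trailers(status), do: {:trailers_frame, status}
end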

Compatibility

  • Elixir: 1.14.0+ (tested with 1.18.0)
  • HTTP/2 Clients: Gun (Erlang), Mint (Elixir) - both fully tested
  • gRPC Spec: Full compliance with gRPC-over-HTTP/2 specification
  • Existing APIs: Drop-in replacement for the Cowboy adapter (same configuration interface; see the example after this list)
  • Dependencies: ThousandIsland for HTTP/2 server capabilities
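
A hedged configuration example of what the drop-in replacement could look like. It assumes the adapter keeps being selected with the adapter: option (as the Cowboy adapter is today) and that the module is named GRPC.Server.Adapters.ThousandIsland, inferred from the handler name above; MyApp.Endpoint is a placeholder, and the exact option names may change before this PR is finalized:

children = [
  {
    GRPC.Server.Supervisor,
    endpoint: MyApp.Endpoint,
    port: 50_051,
    start_server: true,
    adapter: GRPC.Server.Adapters.ThousandIsland
  }
]

Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)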

Performance Characteristics

  • Pure Elixir: No NIFs or external dependencies beyond ThousandIsland
  • Message-based concurrency: Leverages BEAM's strengths for async I/O
  • Memory efficient: Stream state management with proper cleanup on completion/cancellation
  • Graceful degradation: Handles client disconnections and timeouts without crashes
  • Concurrent requests: Tested with 8 concurrent workers across 5 rounds

Benchmark Results

Performance Comparison (default configuration: 1,000 requests)

Cowboy Adapter:

  • Elapsed time: 0.923 seconds
  • Requests processed: 2,466
  • Throughput: ~2,672 req/s
  • Average latency: 0.37 ms
  • Min latency: 0.24 ms
  • Max latency: 54.67 ms
  • CPU time (user): 2.11s
  • CPU time (system): 0.32s

ThousandIsland Adapter:

  • Elapsed time: 0.842 seconds (8.8% faster)
  • Requests processed: 3,226 (30.8% more)
  • Throughput: ~3,831 req/s (43.4% higher)
  • Average latency: 0.26 ms (29.7% lower)
  • Min latency: 0.16 ms
  • Max latency: 51.88 ms
  • CPU time (user): 1.74s (17.5% less CPU)
  • CPU time (system): 0.46s

Summary:

  • ThousandIsland demonstrates significantly superior performance across all metrics
  • Processed 760 more requests in less time (3,226 vs 2,466)
  • Lower average latency (0.26ms vs 0.37ms) provides better user experience
  • Higher throughput (~3,800 req/s vs ~2,700 req/s) - 43% improvement
  • Lower CPU usage (1.74s vs 2.11s user time) indicates better efficiency
  • Both adapters maintain sub-millisecond average latency under load

Interop Test Stability

  • 18 tests × 5 rounds × 2 client adapters = 180 test executions
  • Average time per round: ~2-3 seconds
  • Total execution time: ~25 seconds for full suite
  • Stability: 100% success rate across all runs
  • Concurrency: 8 workers processing tests in parallel

Memory Characteristics

  • Streams properly cleaned up after completion
  • No memory leaks detected during extended test runs
  • Graceful handling of cancelled/timed-out streams

This implementation provides a solid foundation for pure Elixir gRPC servers, with excellent test coverage (740+ tests) and full protocol compliance. The ThousandIsland adapter is still experimental, but it is already capable of serving as a drop-in replacement for the Cowboy adapter.

@sleipnir sleipnir marked this pull request as ready for review December 11, 2025 04:30
Contributor

@polvalente polvalente left a comment


Blocking on the GRPC.Server.Stream opts bug

@@ -0,0 +1,259 @@
defmodule GRPC.Server.HTTP2.FrameTest do

stray file

@aseigo
Contributor

aseigo commented Dec 16, 2025

I have an umbrella app with different apps starting their own GRPC.Server.Supervisors in their own supervision trees (with different Endpoint modules), and it fails to start on this branch with the following error:

** (Mix) Could not start application discovery: Discovery.Application.start(:normal, []) returned an error: shutdown: failed to start child: GRPC.Server.Supervisor
    ** (EXIT) shutdown: failed to start child: GRPC.Server.StreamTaskSupervisor
        ** (EXIT) already started: #PID<0.871.0>

The Application module with the supervision tree:

defmodule Discovery.Application do
  @moduledoc false

  use Application

  @impl true
  def start(_type, _args) do
    children = [
      Discovery.Repo,
      Discovery.RateLimit,
      {
        GRPC.Server.Supervisor,
        endpoint: Discovery.Endpoint,
        port: Application.get_env(:discovery, :grpc_port),
        start_server: true,
        adapter_opts: [
          cred: GRPC.Credential.new(ssl: Application.get_env(:discovery, :ssl))
        ]
      }
    ]

    opts = [strategy: :one_for_one, name: Discovery.Supervisor]
    Supervisor.start_link(children, opts)
  end
end

There is a nearly identical one in another app in the same umbrella but with a different endpoint, port, etc. So there are multiple GRPC servers and multiple Server.Supervisors running in the same BEAM instance.

This works with current releases of the grpc library, as well as with the master branch, but fails to start with this branch. Is this a known issue, or an expected change?

@polvalente
Contributor

I have an umbrella app with different apps starting their own GRPC.Server.Supervisors in their own supervision trees (with different Endpoint modules), and it fails to start on this branch [...] Is this a known issue, or an expected change?

This is something we already mapped out during code review yesterday. This branch is very much a work in progress.

@aseigo
Contributor

aseigo commented Dec 16, 2025

This branch is very much a work in progress.

Understood, and that's fine. If early testing is not wanted / needed, just say so and I'll happily come back later in the process.

@polvalente
Contributor

This branch is very much a work in progress.

Understood, and that's fine. If early testing is not wanted / needed, just say so and I'll happily come back later in the process.

@aseigo thank you very much! I think testing, especially external testing, is gonna be important when we move from the current state of flux we're in.

For instance, we're trying to refactor the process structure to ensure correctness, but given how much that might impact performance, it may take a bit of trial and error.

We do appreciate your contributions a lot!

@aseigo
Contributor

aseigo commented Dec 16, 2025

is gonna be important when we move from the current state of flux we're in.

Feel free to ping me when you are at that point; I'll kick the tires and take a closer look at the implementation then. Cheers!
