
[CI Failure]: mi325_1: Entrypoints Integration Test (API Server) #29541

@AndreasKaratzas

Description

Name of failing test

pytest -v -s entrypoints/openai/test_collective_rpc.py; pytest -v -s entrypoints/openai --ignore=entrypoints/openai/test_chat_with_tool_reasoning.py --ignore=entrypoints/openai/test_oot_registration.py --ignore=entrypoints/openai/test_tensorizer_entrypoint.py --ignore=entrypoints/openai/correctness/ --ignore=entrypoints/openai/test_collective_rpc.py --ignore=entrypoints/openai/tool_parsers/; pytest -v -s entrypoints/test_chat_utils.py

Basic information

  • Flaky test
  • Can reproduce locally
  • Caused by external libraries (e.g. bug in transformers)

🧪 Describe the failing test

Failing Tests Summary:

test_abort_metrics_reset in test_metrics.py
Tests: Metrics reset after request abort with frontend multiprocessing disabled
Failure: AssertionError
Configuration: --disable-frontend-multiprocessing flag
Likely cause: Metrics tracking not properly resetting abort counts when frontend multiprocessing is disabled, possible state management issue in metrics collection
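For reference, a minimal sketch (not the actual test) of how this behavior can be probed against a locally running server: stream a completion, drop the connection mid-stream to abort it, then check that the request gauges on /metrics fall back to zero. The model name is a placeholder; the gauge names are the standard vLLM Prometheus metrics.

```python
import time

import requests

BASE = "http://localhost:8000"

def gauge(name: str) -> float:
    # Parse a single gauge value out of the Prometheus text exposition.
    for line in requests.get(f"{BASE}/metrics").text.splitlines():
        if line.startswith(name):
            return float(line.rsplit(" ", 1)[-1])
    return 0.0

# Start a streaming request and abort it by closing the connection early.
resp = requests.post(
    f"{BASE}/v1/completions",
    json={"model": "my-model", "prompt": "Hello", "max_tokens": 512, "stream": True},
    stream=True,
)
next(resp.iter_lines())  # read one chunk so the request is actually running
resp.close()             # client-side abort

time.sleep(1.0)  # give the server a moment to process the abort
assert gauge("vllm:num_requests_running") == 0.0
assert gauge("vllm:num_requests_waiting") == 0.0
```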

test_openapi_stateless[POST /tokenize] in test_openai_schema.py
Tests: OpenAPI schema validation for tokenize endpoint using schemathesis
Failure: SUBFAIL during schema validation
Configuration: Stateless endpoint validation with generated test cases
Likely cause: Schema mismatch between OpenAPI spec and actual tokenize endpoint behavior, possibly incorrect request/response format or missing field validation
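A quick way to narrow this down without schemathesis is to send the documented request shape by hand and compare the response fields against /openapi.json. A sketch, assuming a local server; the model name is a placeholder and the expected response fields reflect my reading of vLLM's tokenize schema:

```python
import requests

BASE = "http://localhost:8000"

# Pull the spec the schemathesis run validates against.
spec = requests.get(f"{BASE}/openapi.json").json()
tokenize_schema = spec["paths"]["/tokenize"]["post"]
print(sorted(tokenize_schema.get("responses", {})))  # e.g. ['200', '422']

resp = requests.post(
    f"{BASE}/tokenize",
    json={"model": "my-model", "prompt": "Hello, world!"},
)
resp.raise_for_status()
body = resp.json()
# Expected fields per the spec: token ids, their count, and the model length.
print(body.get("count"), body.get("tokens"), body.get("max_model_len"))
```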

test_mcp_tool_env_flag_enabled in test_response_api_mcp_tools.py
Tests: MCP (Model Context Protocol) tool functionality with environment flag
Failure: Test failure for openai/gpt-oss-20b model
Configuration: model=openai/gpt-oss-20b with MCP tools enabled
Likely cause: MCP tool integration not functioning correctly for gpt-oss-20b, possibly missing tool server initialization or incorrect tool call format

test_empty_file, test_embeddings, test_score in test_run_batch.py
Tests: Batch processing for empty files, embeddings generation, and score/rerank endpoints
Failure: AssertionError across multiple batch API endpoints
Configuration: Batch API with /score, /rerank, /v1/score, /v2/rerank endpoints
Likely cause: Batch processing API implementation issues with request formatting, response handling, or endpoint routing for score/rerank operations
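The batch runner consumes OpenAI-style batch JSONL (one request per line with custom_id/method/url/body). A sketch of an input file for the score endpoint, for anyone reproducing locally; the model name and texts are placeholders, and the body uses vLLM's text_1/text_2 score fields:

```python
import json

lines = [
    {
        "custom_id": f"score-{i}",
        "method": "POST",
        "url": "/v1/score",
        "body": {
            "model": "BAAI/bge-reranker-v2-m3",
            "text_1": "What is the capital of France?",
            "text_2": candidate,
        },
    }
    for i, candidate in enumerate(
        ["Paris is the capital of France.", "Berlin is in Germany."]
    )
]

with open("batch_input.jsonl", "w") as f:
    for line in lines:
        f.write(json.dumps(line) + "\n")

# Then run, e.g.:
#   python -m vllm.entrypoints.openai.run_batch \
#       -i batch_input.jsonl -o batch_output.jsonl --model BAAI/bge-reranker-v2-m3
```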

test_same_response_as_chat_completions in test_serving_tokens.py
Tests: Token serving consistency with chat completions API
Failure: Response mismatch between token serving and chat completions
Configuration: Comparing token-based and chat-based API responses
Likely cause: Token serving endpoint producing different output format or content than chat completions, inconsistent tokenization or response formatting

test_basic_audio_with_lora in test_transcription_validation.py and test_translation_validation.py
Tests: Audio transcription/translation with LoRA adapter loading
Failure: LoRA integration failure for speech models
Configuration: model=ibm-granite/granite-speech-3.3-2b with speech LoRA adapter
Likely cause: LoRA adapter loading failing for audio models, possibly incompatible adapter format or missing LoRA runtime initialization for audio modalities
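These tests go through the OpenAI-compatible /v1/audio/transcriptions route, selecting the LoRA adapter by its registered name. A sketch of the request shape; the adapter name "speech", adapter path, and audio file are placeholders, and it assumes the server was started with LoRA enabled:

```python
# Assumes a server started with something like:
#   vllm serve ibm-granite/granite-speech-3.3-2b \
#       --enable-lora --lora-modules speech=<adapter-path>
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

with open("sample.wav", "rb") as audio:
    result = client.audio.transcriptions.create(
        model="speech",  # LoRA adapter name, not the base model
        file=audio,
    )
print(result.text)
```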

test_single_chat_session_image_base64encoded_beamsearch in test_vision.py
Tests: Vision model inference with base64 encoded images using beam search
Failure: Beam search with vision inputs for Phi-3.5-vision-instruct
Configuration: n=2, beam_search=True, model=microsoft/Phi-3.5-vision-instruct, image_idx=3
Likely cause: Beam search implementation not correctly handling multimodal (vision) inputs, possible image tensor duplication issue or incorrect beam state management
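A sketch of the failing request shape: a chat completion carrying a base64 data-URL image with n=2 and beam search requested through extra_body (my understanding of how the vLLM sampling extension is passed; the image path and prompt are placeholders):

```python
import base64

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

with open("image.jpg", "rb") as f:
    b64 = base64.b64encode(f.read()).decode()

completion = client.chat.completions.create(
    model="microsoft/Phi-3.5-vision-instruct",
    n=2,
    max_tokens=64,
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
        ],
    }],
    extra_body={"use_beam_search": True},  # vLLM-specific sampling option
)
for choice in completion.choices:
    print(choice.message.content)
```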

test_single_request in test_vision_embeds.py
Tests: Vision embedding generation for geospatial model with custom inputs
Failure: Embedding pooling for Prithvi model with pixel_values and location_coords
Configuration: model=ibm-nasa-geospatial/Prithvi-EO-2.0-300M-TL-Sen1Floods11, runner=pooling, enable-mm-embeds
Likely cause: Custom vision embedding format not properly handled; possible TerraTorch implementation issue with pixel_values/location_coords tensor processing

ERROR tests in test_optional_middleware.py (7 tests)
Tests: API middleware for authentication and request ID headers
Failure: RuntimeError during test execution
Configuration: Various --api-key and --enable-request-id-headers configurations
Likely cause: Server fixture initialization failing (timeout or server startup failure), preventing all parameterized middleware tests from executing
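If the server itself starts fine, the two middlewares are easy to probe directly. A sketch, assuming the server was started with --api-key secret and --enable-request-id-headers; the X-Request-Id header name is an assumption on my part:

```python
import requests

BASE = "http://localhost:8000"

# Auth middleware: a protected route should reject a missing key ...
assert requests.get(f"{BASE}/v1/models").status_code == 401
# ... and accept the configured one.
ok = requests.get(f"{BASE}/v1/models",
                  headers={"Authorization": "Bearer secret"})
assert ok.status_code == 200

# Request-ID middleware: a caller-supplied ID should be echoed back.
rid = requests.get(f"{BASE}/v1/models",
                   headers={"Authorization": "Bearer secret",
                            "X-Request-Id": "test-123"})
assert rid.headers.get("X-Request-Id") == "test-123"
```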

ERROR tests in test_response_api_with_harmony.py (26 tests)
Tests: Harmony API integration for stateful responses with tools, streaming, code interpreter
Failure: RuntimeError during server initialization for all test variants
Configuration: model=openai/gpt-oss-20b with various Harmony API features
Likely cause: Server failing to start for the gpt-oss-20b model with Harmony features enabled; possibly missing dependencies, a model-loading timeout, or a Harmony API initialization failure that prevents the entire test module from executing

📝 History of failing test

AMD CI Buildkite build references:

  • 1041
  • 1077
  • 1088
  • 1109
  • 1111

CC List

No response
