Skip to content

Conversation

Copy link

Copilot AI commented Sep 30, 2025

Energy Consumption Extractor Implementation

This PR implements a comprehensive energy consumption extractor for cASO to gather per-VM energy metrics from Prometheus.

Features

EnergyConsumptionExtractor - New extractor that:

  • Scans VMs from configured OpenStack projects via Nova API
  • Queries Prometheus for energy metrics per VM
  • Uses flexible label-based filtering for different monitoring schemas
  • Supports Scaphandre, IPMI, RAPL, and custom energy metrics

EnergyRecord - Pydantic-based record type with:

  • Comprehensive VM execution metrics (ExecUnitID, Start/End times, Status)
  • Energy consumption in Wh with CPU normalization factor
  • Calculated fields: Work, Efficiency, CPU/Wall durations
  • SSM message serialization support

Flexible Configuration - Multiple options:

  • prometheus_endpoint: Prometheus server URL (default: http://localhost:9090)
  • prometheus_metric_name: Metric name (default: prometheus_value)
  • vm_uuid_label_name: UUID label name (default: uuid)
  • labels: List of key:value label filters (default: ["type_instance:scaph_process_power_microwatts"])
  • prometheus_step_seconds: Sample frequency (default: 30)
  • prometheus_query_range: Query time range (default: 1h)
  • prometheus_verify_ssl: SSL verification (default: True)
  • cpu_normalization_factor: CPU normalization (default: 1.0)

Implementation Details

Query Logic:
Uses prometheus-api-client library with sum_over_time PromQL to calculate energy from microwatt power samples:

sum_over_time(metric_name{labels}[range]) * (step_seconds/3600) / 1000000

Field Calculations:

  • EnergyWh: raw_prometheus_energy * cpu_normalization_factor
  • Work: CpuDuration_s / EnergyWh (with zero division protection)
  • Efficiency: CpuDuration_s / WallClockTime_s (with zero division protection)
  • CpuDuration_s: vcpu_count * wall_clock_time_s

Code Quality:

  • All black and flake8 checks passing
  • Comprehensive test coverage with pytest fixtures
  • 4 extractor tests + 2 record tests
  • Follows conventional commit format
  • Complete documentation in prometheus-extractor.rst

Testing

caso-extract --extractor prometheus

The extractor will:

  1. Scan VMs for each configured project
  2. Query Prometheus for each VM with configured labels
  3. Generate EnergyRecord objects with calculated metrics
  4. Publish through configured messengers (SSM, logstash, etc.)

Documentation

  • Updated configuration.rst with Prometheus section
  • New prometheus-extractor.rst with comprehensive guide
  • Examples for different monitoring systems
  • Regenerated caso.conf.sample via oslo-config-generator
Original prompt

Implement Prometheus extrator to gather energy consumption metrics


💬 Share your feedback on Copilot coding agent for the chance to win a $200 gift card! Click here to start the survey.

@alvarolopez
Copy link
Member

@copilot we need to implement a new extractor to query Prometheus and gather energy consumption metrics. That extractor is optional, and can be installed or not. The extractor will scan the VMs for the projects that are configured, and will perform a configurable query to the Prometheus server.

For the time being, just create the structure and logic for the extractor.

@codecov
Copy link

codecov bot commented Sep 30, 2025

Codecov Report

❌ Patch coverage is 82.19178% with 26 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (master@a035867). Learn more about missing BASE report.

Files with missing lines Patch % Lines
caso/extract/prometheus.py 81.14% 16 Missing and 7 partials ⚠️
caso/record.py 90.90% 2 Missing ⚠️
caso/opts.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff            @@
##             master     #152   +/-   ##
=========================================
  Coverage          ?   61.36%           
=========================================
  Files             ?       24           
  Lines             ?     1294           
  Branches          ?      115           
=========================================
  Hits              ?      794           
  Misses            ?      477           
  Partials          ?       23           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@alvarolopez alvarolopez changed the title [WIP] Implement Prometheus extrator to gather energy consumption metrics [WIP] Implement Prometheus extractor to gather energy consumption metrics Sep 30, 2025
Copilot AI changed the title [WIP] Implement Prometheus extractor to gather energy consumption metrics Implement Prometheus extractor to gather energy consumption metrics Sep 30, 2025
Copilot AI requested a review from alvarolopez September 30, 2025 09:51
Copilot finished work on behalf of alvarolopez September 30, 2025 09:51
@alvarolopez alvarolopez requested a review from jaimeib September 30, 2025 09:53
Copy link
Author

Copilot AI commented Sep 30, 2025

@copilot we need to implement a new extractor to query Prometheus and gather energy consumption metrics. That extractor is optional, and can be installed or not. The extractor will scan the VMs for th...

Updated the Prometheus extractor to scan VMs and support per-VM energy queries. The extractor now:

  1. Scans VMs: Retrieves the list of VMs from Nova for each configured project
  2. Queries per VM: Executes a Prometheus query for each VM
  3. Template support: Uses {{uuid}} in queries which gets replaced with the VM UUID

Example query:

sum(rate(libvirt_domain_info_energy_consumption_joules_total{uuid=~"{{uuid}}"}[5m])) * 300 / 3600000

Implemented in commit 2da635a.

Copilot AI changed the title Implement Prometheus extractor to gather energy consumption metrics Implement Prometheus extractor to gather per-VM energy consumption metrics Sep 30, 2025
Copilot AI requested a review from alvarolopez September 30, 2025 10:03
Copilot finished work on behalf of alvarolopez September 30, 2025 10:03
from caso.extract.openstack import CinderExtractor
from caso.extract.openstack import NeutronExtractor
from caso.extract.openstack import NovaExtractor
from caso.extract.prometheus import PrometheusExtractor
Copy link
Member

@alvarolopez alvarolopez Sep 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot Let's not call it "PrometheusExtractor" but "EnergyConsumptionExtractor"

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to EnergyConsumptionExtractor in commit 6e355c5.


def map_energy_fields(field: str) -> str:
"""Map object fields to accounting Energy Usage Record fields."""
d = {
Copy link
Member

@alvarolopez alvarolopez Sep 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot The record should look like this:

{
"CloudComputeService": "ifca",
"CloudType": "openstack",
"CPUNormalizationFactor": 2.7,
"CpuDuration_s": 1800,
"Efficiency": 0.5,
"EndExecTime": "2024-09-09T18:00:00Z",
"EnergyWh": 5.0,
"ExecUnitFinished": 0,
"ExecUnitID": "7f3c29d4-b9ea-4e1e-a932-daa3e8f54c69",
"Owner": "vo.ai4eosc.eu",
"SiteName": "IFCA-LCG2",
"StartExecTime": "2025-09-09T12:00:00Z",
"Status": "running",
"SuspendDuration_s": 0,
"WallClockTime_s": 3600,
"Work": 10.0
}

ExecUnitID refers to the VM uuid

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated EnergyRecord to match the specified format in commit 6e355c5. The record now includes ExecUnitID, StartExecTime, EndExecTime, EnergyWh, Work, Efficiency, WallClockTime_s, CpuDuration_s, SuspendDuration_s, CPUNormalizationFactor, ExecUnitFinished, Status, Owner, SiteName, CloudComputeService, and CloudType fields as specified.

Copy link
Member

@alvarolopez alvarolopez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot please address the review.

Copy link
Author

Copilot AI commented Sep 30, 2025

@copilot please address the review.

Review comments addressed in commit 6e355c5:

  1. Renamed PrometheusExtractor to EnergyConsumptionExtractor
  2. Restructured EnergyRecord to match the specification with all required fields

Copilot AI changed the title Implement Prometheus extractor to gather per-VM energy consumption metrics Implement Energy Consumption Extractor to gather per-VM energy metrics Sep 30, 2025
Copilot AI requested a review from alvarolopez September 30, 2025 10:21
Copilot finished work on behalf of alvarolopez September 30, 2025 10:21
@alvarolopez
Copy link
Member

@copilot please update poetry.lock file, so that GitHub workflows do not fail.

@alvarolopez
Copy link
Member

alvarolopez commented Sep 30, 2025

@jaimeib once the #152 session ends, can you share the code to query the Prometheus server and get the correct metrics?

Copy link
Author

Copilot AI commented Sep 30, 2025

@copilot please update poetry.lock file, so that GitHub workflows do not fail.

Updated poetry.lock file in commit dfd5a52.

Copilot finished work on behalf of alvarolopez September 30, 2025 10:26
@sonarqubecloud
Copy link

Quality Gate Failed Quality Gate failed

Failed conditions
7.5% Duplication on New Code (required ≤ 3%)
C Reliability Rating on New Code (required ≥ A)

See analysis details on SonarQube Cloud

Catch issues before they fail your Quality Gate with our IDE extension SonarQube for IDE

Copy link
Member

@alvarolopez alvarolopez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, avoid code repetition in test_prometheus.py and use fixtures whenever possible

@alvarolopez
Copy link
Member

@copilot please address the review.

Copilot finished work on behalf of alvarolopez October 14, 2025 09:58
@alvarolopez
Copy link
Member

@copilot here it is, please address it:

@copilot Some conf options have to be changed in prometheus.py.

cfg.StrOpt(
        "prometheus_label_type_instance",
        default="scaph_process_power_microwatts",
        help="Value for the type_instance label in Prometheus queries.",
    ), 

This is not correct. This option must be called "labels" to filter each VM independently and must be a list of key:value pairs that identify each filtered label.

Example:

labels={
    "type_instance": "scaph_process_power_microwatts",
    "uuid": UUID,
}

Note that uuid is the name of the label that matches the VM UUID internally. Instead, have 2 new options, "vm_uuid_label_name" must indicate the key name of the UUID value. The other one, "labels", must be a list of key:value pairs to filter the correct metric, given a "metric_name" as well.

Copy link
Author

Copilot AI commented Oct 14, 2025

@copilot here it is, please address it:

@copilot Some conf options have to be changed in prometheus.py.
...

Configuration refactored as requested in commit [hash]. Changes:

  1. Replaced prometheus_label_type_instance with two new options:

    • vm_uuid_label_name: Specifies the key name for the UUID label (default: "uuid")
    • labels: List of key:value pairs for label filters (default: ["type_instance:scaph_process_power_microwatts"])
  2. Implementation: The extractor parses the labels list, builds a dictionary, and automatically adds the VM UUID using the configured label name.

  3. Example configuration:

[prometheus]
vm_uuid_label_name = uuid
labels = type_instance:scaph_process_power_microwatts,environment:production

This creates the query: {type_instance="scaph_process_power_microwatts", environment="production", uuid="<vm-uuid>"}

All tests pass and documentation has been updated.

Copilot finished work on behalf of alvarolopez October 14, 2025 10:20
@jaimeib
Copy link

jaimeib commented Oct 16, 2025

@copilot Correct some record field calculations:

  • Work is CpuDuration_s / Energy_wh.
  • CPUNormalizationFactor has to be read from a configuration file.
  • Efficiency is CpuDuration_s / WallClockTime_s.
  • Energy_wh is the energy read from Prometheus multiplied by the CPUNormalizationFactor.

@jaimeib
Copy link

jaimeib commented Oct 21, 2025

@copilot Implement an energy messenger for GreenDIGIT CIM Service. Only reports energy records. Each push should get a bearer token from endpoint using email and password. Then the list of energy records must be serialize into json format and them POST to CIM service.

import json
import os

import requests
from oslo_log import log

import caso.messenger
from caso.record import EnergyRecord

# GreenDIGIT CIM Service endpoints
GET_TOKEN_URL = "https://mc-a4.lab.uvalight.net/gd-cim-api/get-token"
PUBLISH_URL = "https://mc-a4.lab.uvalight.net/gd-cim-api/submit"

LOG = log.getLogger(__name__)


def get_token(email: str, password: str) -> str:
    """Obtain a Bearer token from GreenDIGIT CIM Service."""
    payload = {"email": email, "password": password}
    headers = {"Content-Type": "application/json"}
    resp = requests.post(GET_TOKEN_URL, headers=headers, json=payload)
    resp.raise_for_status()
    data = resp.json()
    token = data.get("access_token") or data.get("token")
    if not token:
        raise ValueError("Token not found in response")
    return token


class GreenDIGITEnergyMessenger(caso.messenger.BaseMessenger):
    """Messenger to send EnergyRecord objects to GreenDIGIT Common Information Model (CIM) Service."""

    def push(self, records: list[EnergyRecord]):
        """Send a list of EnergyRecords to GreenDIGIT CIM Service."""

        # Get credentials from environment
        email = os.environ.get("GREENDIGIT_CIM_EMAIL")
        password = os.environ.get("GREENDIGIT_CIM_PASS")
        if not email or not password:
            raise ValueError(
                "Environment variables GREENDIGIT_CIM_EMAIL and GREENDIGIT_CIM_PASS must be set"
            )

        # Obtain authentication bearer token
        token = get_token(email, password)
        headers = {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        }

        # Serialize records
        output = [
            json.loads(r.model_dump_json(by_alias=True, exclude_none=True))
            for r in records
        ]

        # POST to CIM
        resp = requests.post(PUBLISH_URL, headers=headers, json=output)
        resp.raise_for_status()

        LOG.info(
            f"Published {len(output)} records → {PUBLISH_URL}, status={resp.status_code}"
        )

@alvarolopez alvarolopez marked this pull request as ready for review October 21, 2025 09:10
Copilot AI and others added 11 commits October 21, 2025 11:10
Co-authored-by: alvarolopez <468751+alvarolopez@users.noreply.github.com>
Co-authored-by: alvarolopez <468751+alvarolopez@users.noreply.github.com>
Co-authored-by: alvarolopez <468751+alvarolopez@users.noreply.github.com>
- Changed PrometheusExtractor to inherit from BaseOpenStackExtractor instead of BaseProjectExtractor
- Added _get_servers() method to retrieve VMs from Nova for each project
- Updated extract() to iterate over VMs and query Prometheus per VM
- Added template variable support: {{uuid}} can be used in queries to reference VM UUID
- Updated default query to use libvirt domain metrics with UUID templating
- Updated all tests to mock the OpenStack base class
- Updated documentation with new query examples and VM scanning behavior
- Updated sample configuration file with new default query

Co-authored-by: alvarolopez <468751+alvarolopez@users.noreply.github.com>
- Renamed PrometheusExtractor to EnergyConsumptionExtractor as requested
- Completely restructured EnergyRecord to match the specified format:
  - ExecUnitID (VM UUID)
  - StartExecTime, EndExecTime (ISO 8601 timestamps)
  - EnergyWh (energy consumption in Wh instead of kWh)
  - Work (CPU hours), Efficiency, WallClockTime_s, CpuDuration_s
  - SuspendDuration_s, CPUNormalizationFactor, ExecUnitFinished
  - Status, Owner (replaces FQAN)
  - CloudComputeService, CloudType, SiteName
- Updated extractor to calculate VM metrics from server info
- Added _get_flavors() method to get vCPU counts for calculations
- Updated all tests to match new record structure
- All tests passing (6/6 energy-related tests)

Co-authored-by: alvarolopez <468751+alvarolopez@users.noreply.github.com>
Updated poetry.lock to be in sync with pyproject.toml to prevent GitHub workflow failures.

Co-authored-by: alvarolopez <468751+alvarolopez@users.noreply.github.com>
- Created pytest fixtures for common test setup:
  - extract_dates: fixture for date range
  - mock_server: fixture for mock server object
  - mock_flavors: fixture for mock flavors dictionary
  - configured_extractor: fixture for pre-configured extractor instance
  - prometheus_success_response: fixture for successful Prometheus response
  - prometheus_error_response: fixture for failed Prometheus response
- Removed duplicated CONF configuration code
- Removed duplicated extractor setup code
- Removed duplicated mock setup code
- Tests are now more concise and maintainable
- All tests still pass (4/4)

Co-authored-by: alvarolopez <468751+alvarolopez@users.noreply.github.com>
- Replaced custom requests-based Prometheus query with prometheus-api-client library
- Implemented energy_consumed_wh function based on @jaimeib's sample code
- Updated configuration options:
  - Removed: prometheus_query, prometheus_timeout
  - Added: prometheus_metric_name, prometheus_label_type_instance,
    prometheus_step_seconds, prometheus_query_range, prometheus_verify_ssl
- Query now uses sum_over_time with configurable metric name and labels
- Calculates energy in Wh from microwatt samples using step_seconds factor
- Updated tests to mock PrometheusConnect instead of requests
- Added prometheus-api-client dependency to pyproject.toml
- Updated poetry.lock file
- All tests pass (4 prometheus tests + 2 energy record tests)

Co-authored-by: alvarolopez <468751+alvarolopez@users.noreply.github.com>
- Removed requests dependency from pyproject.toml (no longer needed)
- Updated prometheus-extractor.rst with new configuration parameters:
  - Documented prometheus_metric_name, prometheus_label_type_instance
  - Documented prometheus_step_seconds, prometheus_query_range, prometheus_verify_ssl
  - Removed old prometheus_query and prometheus_timeout references
  - Added detailed explanation of energy calculation formula
  - Updated examples for Scaphandre and custom metrics
  - Updated troubleshooting section
- Updated configuration.rst with new prometheus section:
  - Documented all new configuration options
  - Added reference to prometheus-extractor.rst
  - Explained energy calculation from microwatt samples
- Updated etc/caso/caso.conf.sample with new prometheus configuration options
- Updated poetry.lock to reflect dependency changes
- All tests pass (6/6 energy-related tests)

Co-authored-by: alvarolopez <468751+alvarolopez@users.noreply.github.com>
- Changed import from `from prometheus_api_client import PrometheusConnect`
  to `import prometheus_api_client` to match codebase style
- Updated usage to `prometheus_api_client.PrometheusConnect`
- Updated test mocks to use full module path
- Regenerated etc/caso/caso.conf.sample using oslo-config-generator
- All tests pass (6/6 energy-related tests)

Co-authored-by: alvarolopez <468751+alvarolopez@users.noreply.github.com>
- Replaced prometheus_label_type_instance with two new options:
  - vm_uuid_label_name: configurable name of the UUID label (default: "uuid")
  - labels: list of key:value pairs for label filters (default: ["type_instance:scaph_process_power_microwatts"])
- Updated _energy_consumed_wh to parse label list and build label dictionary
- Added support for multiple label filters in any combination
- Updated all tests to use new configuration structure
- Regenerated etc/caso/caso.conf.sample with oslo-config-generator
- Updated prometheus-extractor.rst documentation:
  - New configuration parameters explained
  - Added examples for multiple label filters
  - Updated troubleshooting section
- All tests pass (4/4 prometheus tests)

Co-authored-by: alvarolopez <468751+alvarolopez@users.noreply.github.com>
@alvarolopez alvarolopez force-pushed the copilot/fix-4b98a022-2114-483b-a540-bbaf444eeab9 branch from 136dedc to b67ee62 Compare October 21, 2025 09:10
@alvarolopez
Copy link
Member

@copilot you need to fix the following issues in this PR:

  • black and flake8 tests are failing
  • Use "conventional commits" tags in the commit header
  • Correct some record field calculations:
    • Work is CpuDuration_s / Energy_wh.
    • CPUNormalizationFactor has to be read from a configuration file.
    • Efficiency is CpuDuration_s / WallClockTime_s.
    • Energy_wh is the energy read from Prometheus multiplied by the CPUNormalizationFactor.

- Add cpu_normalization_factor configuration option (default: 1.0)
- Fix Work calculation: Work = CpuDuration_s / Energy_wh (was cpu_duration_s / 3600.0)
- Fix Efficiency calculation: Efficiency = CpuDuration_s / WallClockTime_s (was hardcoded 0.5)
- Fix Energy_wh calculation: Apply CPU normalization factor to raw Prometheus energy
- Add division by zero protection for Work and Efficiency calculations
- Fix black formatting issues in test_prometheus.py
- Add cpu_normalization_factor to test fixture configuration
- Regenerate caso.conf.sample with new prometheus configuration section
- All tests passing (4 prometheus tests + 14 record tests)

Co-authored-by: alvarolopez <468751+alvarolopez@users.noreply.github.com>
Copy link
Author

Copilot AI commented Oct 21, 2025

@copilot you need to fix the following issues in this PR:

  • black and flake8 tests are failing
    ...

Fixed all issues in commit 136dedc:

  1. Black and flake8 tests - All formatting and linting issues resolved
  2. Conventional commits - Used "fix:" prefix for the commit header
  3. Record field calculations - All corrected:
    • Work = CpuDuration_s / Energy_wh
    • CPUNormalizationFactor read from configuration (default: 1.0)
    • Efficiency = CpuDuration_s / WallClockTime_s
    • Energy_wh = raw_energy * CPUNormalizationFactor

All tests passing (4 prometheus + 14 record tests).

@sonarqubecloud
Copy link

Copilot finished work on behalf of alvarolopez October 21, 2025 17:39
@alvarolopez
Copy link
Member

@copilot I think you need to rewrite all the commit messages with the corresponding prefix.

Copy link
Author

Copilot AI commented Oct 23, 2025

@copilot I think you need to rewrite all the commit messages with the corresponding prefix.

I see that only the most recent commit uses the conventional commit format ("fix:"). To rewrite all commit messages with appropriate prefixes (feat:, docs:, refactor:, chore:, style:, etc.), I would need to perform an interactive rebase which requires force-pushing to the branch.

However, I don't have the ability to force push. Could you clarify if you'd like me to:

  1. Squash all commits into a single commit with a proper conventional commit message, or
  2. Just ensure future commits use the conventional format?

Alternatively, if you can enable force push for this branch, I can rewrite the commit history with appropriate conventional commit prefixes for each commit.

@copilot I think you need to rewrite all the commit messages with the corresponding prefix.

I attempted to squash all commits into a single commit with proper conventional commit format, but since the branch is protected against force push, the changes couldn't be applied.

To rewrite all commit messages with conventional commit prefixes, you'll need to either:

  1. Enable force push temporarily for this branch, then I can perform an interactive rebase to add proper prefixes (feat:, docs:, refactor:, chore:, style:) to each commit.

  2. Manually rebase using:

    git rebase -i a035867

    Then change "pick" to "reword" for each commit and add appropriate prefixes:

    • feat: for new features (extractor, energy record)
    • docs: for documentation updates
    • refactor: for code restructuring
    • chore: for maintenance (poetry.lock)
    • style: for formatting fixes

The current state has only the latest commit with proper "fix:" prefix. All previous commits need conventional prefixes added.

Copilot finished work on behalf of alvarolopez October 23, 2025 10:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants