Conversation

hylkevds
Collaborator

@hylkevds hylkevds commented May 28, 2025

  • enhancement

Release notes

[feature] Added metrics framework and Prometheus implementation.
- Enable by setting metrics_provider_class to MetricsProviderPrometheus.
- The metrics endpoint is available at http://localhost:9400/metrics by default.
- The port can be changed with metrics_endpoint_port. Set it to 0 to disable the HTTP endpoint.
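
As a rough illustration of the settings above, the following sketch derives the endpoint URL from a configuration. The class `MetricsEndpointSketch` and the helper `endpointUrl` are hypothetical and not part of Moquette; only the property names, the default port 9400 and the "0 disables the endpoint" rule come from the release notes.

```java
import java.util.Optional;
import java.util.Properties;

class MetricsEndpointSketch {
    // Default port as stated in the release notes.
    static final int DEFAULT_PORT = 9400;

    // Hypothetical helper: returns the metrics URL, or empty when the port is 0 (endpoint disabled).
    static Optional<String> endpointUrl(Properties config) {
        int port = Integer.parseInt(
            config.getProperty("metrics_endpoint_port", String.valueOf(DEFAULT_PORT)));
        if (port == 0) {
            return Optional.empty();
        }
        return Optional.of("http://localhost:" + port + "/metrics");
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        config.setProperty("metrics_provider_class", "MetricsProviderPrometheus");
        System.out.println(endpointUrl(config).orElse("endpoint disabled"));

        // Setting the port to 0 switches the HTTP endpoint off.
        config.setProperty("metrics_endpoint_port", "0");
        System.out.println(endpointUrl(config).orElse("endpoint disabled"));
    }
}
```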

What does this PR do?

To gain insights into the state of running software, the software can expose metrics, which are then gathered by a metrics framework.
A popular framework for this is Prometheus, but there are others.
This PR adds the ability for Moquette to expose metrics for such a system to gather, and provides an implementation for Prometheus specifically.

By default, metrics are not gathered, and the system is only a shim that has minimal impact on performance. The Prometheus implementation is a separate project that can be added to the classpath if desired. This minimises the impact for users who do not require metrics and does not change any dependencies for those use cases.

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files (and/or docker env variables)
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the Changelog if it's a feature or a fix that has to be reported

How to test this PR locally

  1. Start moquette with metrics_provider_class=MetricsProviderPrometheus
  2. Navigate to http://localhost:9400/metrics

hylkevds added 5 commits June 7, 2025 12:24
For monitoring a system it is not really important how many items are in
a queue. The important metric is how full the queue is. That is, how many
items the queue has, relative to the maximum size of the queue.
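
The relative measure described in this commit message can be sketched as follows. The class and method names are hypothetical and only illustrate the arithmetic; they are not taken from the PR.

```java
class QueueFillSketch {
    // Fraction of the queue in use: 0.0 means empty, 1.0 means full.
    static double fillRatio(int itemCount, int maxSize) {
        if (maxSize <= 0) {
            throw new IllegalArgumentException("maxSize must be positive");
        }
        return (double) itemCount / maxSize;
    }

    public static void main(String[] args) {
        // The same absolute count means very different things for different queue sizes.
        System.out.println(fillRatio(256, 1024)); // prints 0.25
        System.out.println(fillRatio(256, 256));  // prints 1.0
    }
}
```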
@hylkevds hylkevds requested a review from andsel June 14, 2025 22:09
@andsel
Collaborator

andsel commented Jun 17, 2025

Hi @hylkevds, thanks for this. It will take me some time to review, but I'll do it; I'm just a little bit slow.

Collaborator

@andsel andsel left a comment

Thanks @hylkevds for this PR.
I've done a first review pass on the client part of the code, where Moquette uses this.
It seems fair, but my big concern is: why use a custom API when we could leverage widely used metrics APIs and implementations, such as Dropwizard Metrics, which defines client APIs and provides various implementations, including one for Prometheus?

@hylkevds
Collaborator Author

Thanks @hylkevds for this PR. I've done a first review pass on the client part of the code, where Moquette uses this. It seems fair, but my big concern is: why use a custom API when we could leverage widely used metrics APIs and implementations, such as Dropwizard Metrics, which defines client APIs and provides various implementations, including one for Prometheus?

My main concern here is to not add any dependencies on any metrics libraries.
On one hand, there are too many to count and everyone has their favourite. Given that one major use case of Moquette is being embedded in other software (including on Android), whatever metrics system we choose will be the wrong one for most users.
On the other hand, this will keep moquette as small as possible, with as few dependencies as possible.

So instead of choosing any, I made a minimal shim, with the actual metrics collection being done by a separate plugin that could trivially be written for any metrics system that matches the system already used by whatever Moquette is embedded into.
The default (null) implementation doesn't consume any resources, so won't have a performance impact either.
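
The shim idea can be sketched roughly as below. All names here (`MetricsShimSketch`, `NullMetricsSketch`, `CountingMetricsSketch`, `BrokerSketch`) are hypothetical and heavily reduced; the real interface in the PR defines its own methods. The point is only that the broker code calls the interface unconditionally and the default implementation does nothing.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, reduced stand-in for the provider interface.
interface MetricsShimSketch {
    void messageReceived();
}

// Default no-op implementation: nothing is recorded and no state is kept.
class NullMetricsSketch implements MetricsShimSketch {
    @Override
    public void messageReceived() {
        // intentionally empty
    }
}

// Stand-in for a plugin that forwards to a real metrics system.
class CountingMetricsSketch implements MetricsShimSketch {
    final AtomicLong received = new AtomicLong();

    @Override
    public void messageReceived() {
        received.incrementAndGet();
    }
}

class BrokerSketch {
    private final MetricsShimSketch metrics;

    BrokerSketch(MetricsShimSketch metrics) {
        this.metrics = metrics;
    }

    void onMessage() {
        // No "is metrics enabled" check is needed at the call site.
        metrics.messageReceived();
    }

    public static void main(String[] args) {
        CountingMetricsSketch counting = new CountingMetricsSketch();
        BrokerSketch withMetrics = new BrokerSketch(counting);
        withMetrics.onMessage();
        withMetrics.onMessage();
        System.out.println(counting.received.get()); // prints 2

        // With the null implementation the same call is simply a no-op.
        new BrokerSketch(new NullMetricsSketch()).onMessage();
    }
}
```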

@hylkevds
Collaborator Author

Elaborating a bit more about the "custom API" thing: Metrics gathering has to be wired directly into the code. Regardless of which metrics library we choose, we, as authors of moquette, are the ones that define which metrics are gathered as Count, which as Histogram, etc.
The "normal" way to define the metrics would be to spread all the code that is currently in MetricsProviderPrometheus.java all over the Moquette code.
The only thing this Interface does is pull those bits of metrics-gathering code out of the Moquette classes into a separate class.

As far as I can tell, there is not yet a nice "SLF4J-API" equivalent for metrics. Dropwizard Metrics, OpenTelemetry and Prometheus are all the equivalent of JUL, Log4J or Logback. They can expose metrics in each other's formats, just as the logging frameworks can write log files in different formats, but if we choose one and the embedding application uses another, then two frameworks are active, each exposing its own metrics endpoint.

Even if there were a Metrics-API library, it might still be a good idea to make a central metrics-gathering class for moquette, to avoid spreading if (metrics) checks and metrics definitions all over Moquette.

@andsel
Collaborator

andsel commented Jul 29, 2025

My main concern here is to not add any dependencies on any metrics libraries.

~~I generally agree, but some dependencies are needed. Dropwizard Metrics (or Micrometer) are an API that has various implementations, it's more like a Logging API provided by Log4J or SLF4J.

I understand your concern, but Moquette would just depend on the API not the implementation.~~


After reading your last comment, I could agree with you. I'll move forward with the review.

Collaborator

@andsel andsel left a comment

Thanks @hylkevds for all your hard work on this!

I've left some comments and a suggestion to improve the readability of the MetricsManager init method.

import io.moquette.broker.config.IConfig;

/**
* Interface that a metrics implementation must implement.
Collaborator

Suggested change
- * Interface that a metrics implementation must implement.
+ * Interface that a metrics implementation must implement.
+ * It mainly defines methods that are used to track Moquette metrics.


/**
* Notify the metrics provider about the number and size of session queues. This will
* be called once.
Collaborator

If this has to be called only once, could we provide an abstract class for implementors that guarantees the single execution? Something like:

private boolean executed = false;

public void initSessionQueues(int queueCount, int queueSize) {
  if (executed) {
    throw new IllegalStateException("initSessionQueues must be called only once and was already executed");
  }
  // Record the first call so that a second call is rejected.
  executed = true;
  ...
}
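
A fuller sketch of this suggestion, assuming a template-method layout: the base class owns the guard and implementors only override a hook. The names `OnceOnlyProviderSketch`, `doInitSessionQueues` and `RecordingProviderSketch` are hypothetical and are not the names used in the PR. An `AtomicBoolean` is used here so that the check and the update happen in one step.

```java
import java.util.concurrent.atomic.AtomicBoolean;

abstract class OnceOnlyProviderSketch {
    private final AtomicBoolean executed = new AtomicBoolean(false);

    // Final, so that subclasses cannot bypass the single-execution guard.
    public final void initSessionQueues(int queueCount, int queueSize) {
        if (!executed.compareAndSet(false, true)) {
            throw new IllegalStateException(
                "initSessionQueues must be called only once and was already executed");
        }
        doInitSessionQueues(queueCount, queueSize);
    }

    // Hypothetical hook that concrete providers implement.
    protected abstract void doInitSessionQueues(int queueCount, int queueSize);
}

class RecordingProviderSketch extends OnceOnlyProviderSketch {
    int queueCount;
    int queueSize;

    @Override
    protected void doInitSessionQueues(int queueCount, int queueSize) {
        this.queueCount = queueCount;
        this.queueSize = queueSize;
    }

    public static void main(String[] args) {
        RecordingProviderSketch provider = new RecordingProviderSketch();
        provider.initSessionQueues(4, 1024);
        try {
            provider.initSessionQueues(4, 1024);
        } catch (IllegalStateException e) {
            System.out.println("second call rejected");
        }
    }
}
```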

*
* You may elect to redistribute this code under either of these licenses.
*/
package io.moquette.logging;
Collaborator

This is metering, not logging, so I would move it into the package io.moquette.metrics.

@@ -0,0 +1,76 @@
/*
* Copyright (c) 2012-2018 The original author or authors
Collaborator

Suggested change
- * Copyright (c) 2012-2018 The original author or authors
+ * Copyright (c) 2012-2025 The original author or authors

@@ -0,0 +1,69 @@
/*
* Copyright (c) 2012-2018 The original author or authors
Collaborator

Suggested change
- * Copyright (c) 2012-2018 The original author or authors
+ * Copyright (c) 2012-2025 The original author or authors

Comment on lines 101 to 108
public void deliveryComplete(IMqttDeliveryToken token) {
    // try {
    //     token.waitForCompletion(1_000);
    //     m_messages.offer(new ReceivedMessage(token.getMessage(), token.getTopics()[0]));
    // } catch (MqttException e) {
    //     e.printStackTrace();
    // }
}
Collaborator

Suggested change
- public void deliveryComplete(IMqttDeliveryToken token) {
-     // try {
-     //     token.waitForCompletion(1_000);
-     //     m_messages.offer(new ReceivedMessage(token.getMessage(), token.getTopics()[0]));
-     // } catch (MqttException e) {
-     //     e.printStackTrace();
-     // }
- }
+ public void deliveryComplete(IMqttDeliveryToken token) {
+ }

I would avoid copying commented-out code.

@@ -0,0 +1,90 @@
/*
* Copyright (c) 2012-2018 The original author or authors
Collaborator

Suggested change
- * Copyright (c) 2012-2018 The original author or authors
+ * Copyright (c) 2012-2025 The original author or authors

*/
private static MetricsProvider metricsProvider = new MetricsProviderNull();

public static void init(IConfig config) {
Collaborator

Is it possible to cover this with a test, using a test resource file in test/resources/META-INF/services?

Collaborator Author

Good question, I'll try.
Though it will be tested when the Prometheus implementation is built and tested.

Comment on lines 40 to 60
public static void init(IConfig config) {
    ServiceLoader<MetricsProvider> loader = ServiceLoader.load(MetricsProvider.class);
    String classname = config.getProperty(METRICS_PROVIDER_CLASS, "");

    MetricsProvider usedProvider = null;
    List<String> foundProviders = new ArrayList<>();
    for (MetricsProvider provider : loader) {
        foundProviders.add(provider.getClass().getName());
        if (!StringUtil.isNullOrEmpty(classname) && provider.getClass().getName().endsWith(classname)) {
            LOG.info("Using configured MetricsProvider: {}", provider.getClass().getName());
            usedProvider = provider;
            break;
        }
    }
    if (usedProvider == null) {
        LOG.info("No MetricsProvider configured, or no matching found, using NULL provider. Available providers: {}", foundProviders);
    } else {
        metricsProvider = usedProvider;
    }
    metricsProvider.init(config);
}
Collaborator

@andsel andsel Aug 2, 2025

Suggested change
- public static void init(IConfig config) {
-     ServiceLoader<MetricsProvider> loader = ServiceLoader.load(MetricsProvider.class);
-     String classname = config.getProperty(METRICS_PROVIDER_CLASS, "");
-     MetricsProvider usedProvider = null;
-     List<String> foundProviders = new ArrayList<>();
-     for (MetricsProvider provider : loader) {
-         foundProviders.add(provider.getClass().getName());
-         if (!StringUtil.isNullOrEmpty(classname) && provider.getClass().getName().endsWith(classname)) {
-             LOG.info("Using configured MetricsProvider: {}", provider.getClass().getName());
-             usedProvider = provider;
-             break;
-         }
-     }
-     if (usedProvider == null) {
-         LOG.info("No MetricsProvider configured, or no matching found, using NULL provider. Available providers: {}", foundProviders);
-     } else {
-         metricsProvider = usedProvider;
-     }
-     metricsProvider.init(config);
- }
+ public static void init(IConfig config) {
+     ServiceLoader<MetricsProvider> loader = ServiceLoader.load(MetricsProvider.class);
+     String classname = config.getProperty(METRICS_PROVIDER_CLASS, "");
+     List<MetricsProvider> foundProviders = new ArrayList<>();
+     loader.forEach(foundProviders::add);
+     Optional<MetricsProvider> usedProviderOpt = foundProviders.stream()
+         .filter(provider -> providerMatchClassname(provider, classname))
+         .findFirst();
+     if (usedProviderOpt.isPresent()) {
+         MetricsProvider usedProvider = usedProviderOpt.get();
+         LOG.info("Using configured MetricsProvider: {}", usedProvider.getClass().getName());
+         metricsProvider = usedProvider;
+     } else {
+         LOG.info("No MetricsProvider configured, or no matching found, using NULL provider. Available providers: {}",
+             foundProviders.stream().map(p -> p.getClass().getName()).collect(Collectors.toList()));
+     }
+     metricsProvider.init(config);
+ }
+ private static boolean providerMatchClassname(MetricsProvider provider, String classname) {
+     return !StringUtil.isNullOrEmpty(classname) && provider.getClass().getName().endsWith(classname);
+ }

Try to make the intention more explicit: searching for a provider whose class name matches the one configured in classname.

@hylkevds
Collaborator Author

hylkevds commented Aug 5, 2025

Your "init is only called once" comment actually touches on an important detail I wanted your opinion on:
Currently the MetricsManager uses a static reference to a MetricsProvider. This means there can be only one MetricsProvider, even if one were to try to start multiple Moquette instances in the same VM. The tests were also very messy in this regard, since they create all kinds of classes without calling the cleanup functions.

Do you think this is an issue, and if it is, what is your preferred solution?
In FROST I use the settings object for passing these semi-global objects.

@hylkevds
Collaborator Author

hylkevds commented Aug 6, 2025

I rewrote the initialisation; now each instance of Server gets its own MetricsProvider.
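
The difference from a static reference can be sketched as follows, with hypothetical names (`ProviderSketch`, `CountingProviderSketch`, `ServerSketch`) that do not correspond to the actual Moquette classes. Because the provider is an instance field rather than a static one, two servers in the same VM keep separate metrics.

```java
// Hypothetical, reduced provider interface.
interface ProviderSketch {
    void clientConnected();

    int connections();
}

class CountingProviderSketch implements ProviderSketch {
    private int connections;

    @Override
    public void clientConnected() {
        connections++;
    }

    @Override
    public int connections() {
        return connections;
    }
}

class ServerSketch {
    // Instance field, not static: each server owns its provider.
    private final ProviderSketch metrics;

    ServerSketch(ProviderSketch metrics) {
        this.metrics = metrics;
    }

    void clientConnected() {
        metrics.clientConnected();
    }

    ProviderSketch metrics() {
        return metrics;
    }

    public static void main(String[] args) {
        ServerSketch first = new ServerSketch(new CountingProviderSketch());
        ServerSketch second = new ServerSketch(new CountingProviderSketch());
        first.clientConnected();
        first.clientConnected();
        second.clientConnected();
        System.out.println(first.metrics().connections());  // prints 2
        System.out.println(second.metrics().connections()); // prints 1
    }
}
```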

@andsel
Collaborator

andsel commented Aug 6, 2025

Good! @hylkevds if you could resolve the conflicts then we are good to merge :-)

@hylkevds
Collaborator Author

hylkevds commented Aug 6, 2025

I've also managed to add tests using test/resources/META-INF/services, seems to work just fine!
Will now resolve the conflicts.
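
For readers unfamiliar with the mechanism: `java.util.ServiceLoader` discovers implementations through a file under META-INF/services named after the fully qualified interface, with one implementation class name per line. The sketch below uses hypothetical names (`SpiProviderSketch`, `DemoSpiProviderSketch`, `FallbackSpiProviderSketch`, `ServiceLookupSketch`) and mirrors the suffix matching discussed above; without a registration file on the classpath the loader finds nothing and the fallback is used.

```java
import java.util.ServiceLoader;

// Hypothetical service interface, standing in for MetricsProvider.
interface SpiProviderSketch {
    String name();
}

// Would be registered by listing its class name in
// META-INF/services/SpiProviderSketch (file named after the interface).
class DemoSpiProviderSketch implements SpiProviderSketch {
    @Override
    public String name() {
        return "demo";
    }
}

// Used when nothing is configured or nothing matches.
class FallbackSpiProviderSketch implements SpiProviderSketch {
    @Override
    public String name() {
        return "fallback";
    }
}

class ServiceLookupSketch {
    // Returns the first candidate whose class name ends with classname, else the fallback.
    static SpiProviderSketch select(Iterable<SpiProviderSketch> candidates, String classname) {
        for (SpiProviderSketch candidate : candidates) {
            if (!classname.isEmpty() && candidate.getClass().getName().endsWith(classname)) {
                return candidate;
            }
        }
        return new FallbackSpiProviderSketch();
    }

    public static void main(String[] args) {
        // ServiceLoader is Iterable, so it can be passed to select directly.
        ServiceLoader<SpiProviderSketch> loader = ServiceLoader.load(SpiProviderSketch.class);
        System.out.println(select(loader, "DemoSpiProviderSketch").name());
    }
}
```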

Collaborator

@andsel andsel left a comment

Left just a note on the naming, then I think we are done with this PR :-D

/**
* A metrics provider used for testing.
*/
public class MetricsProviderTest extends AbstractMetricsProvider {
Collaborator

Given that <Class>Test is the pattern used to name test suites, and this is a test fixture class, I would rename it to MetricsProviderMock or MetricsProviderDouble to avoid confusion.

Collaborator Author

Yes, I drew a blank while naming it. Fixed now!

@andsel andsel self-requested a review August 11, 2025 11:25
Collaborator

@andsel andsel left a comment

LGTM

@andsel andsel merged commit 20f41ef into moquette-io:main Aug 11, 2025
4 checks passed
@andsel
Collaborator

andsel commented Aug 11, 2025

Thanks a lot @hylkevds for all this hard work! :-D

@hylkevds hylkevds deleted the feature_metrics branch August 25, 2025 12:30