Standardizing APIs for RxInfer-based Agents #416

albertpod · 2025-01-31T13:01:31Z

albertpod
Jan 31, 2025
Maintainer

We're looking to establish standardized APIs for implementing Bayesian (AIF) agents using RxInfer. While there's ongoing research about the implementation details of planning functions, having a clear interface would help developers start building agents with our toolbox.

Current State

Currently, we have scattered examples like the mountain car and drone simulations that demonstrate agent capabilities, but they lack a developer-friendly API structure. The implementations mix concerns and require deep understanding of the internals.

Proposed API Structure

Here's a proposed high-level API structure for discussion:

Core Agent Interface

abstract type BayesianAgent end

struct RxInferAgent <: BayesianAgent
    model
    constraints
    state
    config
end

Key Operations

I propose standardizing these core operations:

Plan: Compute optimal actions based on current beliefs and goals

plan(agent::BayesianAgent, horizon::Int) -> Vector{Action}

Act: Execute a single action from the plan

act!(agent::BayesianAgent) -> Action

Learn: Update beliefs based on observations

learn!(agent::BayesianAgent, observation::Observation) -> Nothing

Slide: Prepare for the next timestep

slide!(agent::BayesianAgent) -> Nothing

Example Usage

Note: The following examples use an imperative loop structure for clarity. Our end goal is to provide a fully reactive implementation. These examples serve as a conceptual starting point to illustrate the core operations.

# Initialize agent
agent = RxInferAgent(
    model = my_model,
    constraints = my_constraints,
    state = initial_state,
    config = agent_config
)

# Main agent loop
for t in 1:n_steps
    # Plan future actions
    plan(agent, horizon=10)
    
    # Execute next action
    action = act!(agent)
    
    # Get environment feedback
    observation = environment.step(action)
    
    # Update agent's beliefs
    learn!(agent, observation)
    
    # Prepare for next timestep
    slide!(agent)
end

Open Questions

Configuration Options: What configuration options should be available?
Multi-Agent Support: How should we extend this API for multi-agent scenarios?

Request for Comments

We'd love to hear your thoughts on:

The overall API structure
Additional operations that should be included
Alternative approaches you've found effective
Use cases we should consider
Integration with RxEnvironments.jl

Please share your experiences and suggestions below!

Note: This is an initial proposal to start the discussion. All interfaces are subject to change based on community feedback and practical implementation experience.

FraserP117 · 2025-02-01T13:23:46Z

FraserP117
Feb 1, 2025
Collaborator

Thanks @albertpod. This is very exciting, I would like to help out with this. Standardizing the agent creation API will be an immensely valuable contribution!

I've begun some very tentative explorations in this direction by investigating how best to hook up an RxInfer model within an RxEnvironments agent/environment dyad. @wouterwln has been invaluable counsel to that end. You can access my repo of these explorations here. TL;DR on my musings is: yes, a standard imperative main loop is probably best for initial implementation and no, I have not managed RxInfer model integration yet.

I'll enumerate my thoughts and responses to your specifics as per their level of abstraction.

Global Considerations

The most abstract consideration that I would like to raise is where best to implement this agent API. Within RxInfer.jl (proper) or say RxEnvironments.jl? Should this even constitute a totally new package: "RxAgents.jl"; using RxInfer.jl and RxEnvironments.jl as dependencies?

Personally, I think that it might be wisest to implement this in a new "RxAgents.jl" package. An "RxAgents.jl" agent could then fairly seamlessly instantiate an RxEnvironments.jl RxEntity. With the subsequent specification of the environment "Entity" and the definition of the interface functions from RxEnvironments, the full dyad would be complete - regardless of the commitment to an imperative/reactive paradigm.

Core Agent Interface

I like this proposed structure. I'm afraid I don't have any other thoughts on this right now.

Key Operations

Plan: I like this, and I think this makes sense. I suspect the intent is to return a sequence of temporally-contiguous actions, the first of which is the action predicted for the next time step?: $[a_{t+1}, a_{t+2}, ... a_{T}]$.
Both Act and Learn seem like excellent ideas. No fruitful thoughts on these just yet.
Slide: I assume that slide will simply implement the same procedure as laid out by @ThijsvdLaar and @bertdv in this very helpful paper.

Open Questions:

Configuration Options: I don't have anything useful to say on this just yet. I'm not actually sure what this could refer to, other than functional-form and factorisation constraints - in (potential) addition to whether the agent is supposed to operate in the imperative/reactive context.
Multi-Agent Support: Perhaps we can avoid this problem (here) by simply letting RxEnvironments.jl handle it? As far as I understand, RxEnvironenments.jl already has the ability to handle "Entities" reactively.

Final Comments:

Regarding the overall structure, I have no enlightening comments. I do wonder how/if specific methods will have to change to deal with reactivity. I think it is best to focus on an imperative implementation first. I can't think of any additional operations that might be nice to include, other than to perhaps optionally record the agent's history of states in addition to its current state. I haven't found/used any alternative approaches though @kobus78 may well have done so in the course of his extensive implementations. Regarding integration with RxEnvironmnments, I am very much in favour of this. I think that the agent API could neatly constitute an RxEntity and it seems to me that the question of multiple agent support could be partitioned off to RxEnvironments.

I am very keen to assist with any aspect of this endeavour, going forward. I hope these thoughts are somewhat useful.

0 replies

bvdmitri · 2025-02-03T08:38:37Z

bvdmitri
Feb 3, 2025
Maintainer

Thanks for bringing this @albertpod! I also mostly agree with @FraserP117, but I would still make some practical changes to the for loop, as it looks quite odd to me.

Why the return value of the plan! function is not being used even though it is supposed to return Vector{Action} as described in the API ? Why does an agent act! on itself and why does it return an action? What is slide! supposed to mean to a newcomer? Why is environment.step used while everything else is a pure function? And why does preparation happen at the end?

Of course, I know the answers to these questions, but it would be better if they didn’t arise in the first place.

Here’s my proposed revision, which addresses these concerns:

agent = Agent()
environment = Environment()

# Main agent loop
for t in 1:n_steps
    # Prepare for the next time-step, can do nothing on the first iteration 
    # and "slide" on the next ones or do whatever it wants essentially
    prepare!(agent)

    # IMO `plan` should return "something", namely a "plan" or a series of actions
    actions = plan!(agent, horizon=10)
    
    # Execute next action (either first, or some other "picking" mechanism)
    act!(environment, agent, first(actions)) 
    
    # Get environment feedback, different agents observe different stuff
    observation = observe!(environment, agent)
    
    # Update agent's beliefs about the internal state of the world to be able to `act!` later on
    learn!(agent, observation)
end

which translates to the following core API:

Prepare: Do whatever it needs to prepare an agent, be it a slide function or something else

prepare!(agent::BayesianAgent) -> Nothing

Plan: Compute optimal actions based on current beliefs and goals

plan!(agent::BayesianAgent, horizon::Int) -> Vector{Action}

Act: Execute a single action

act!(environment::Environment, agent::BayesianAgent, action::Action) -> Nothing (or Success/Fail status?)

Observe Observe the current snapshot of the environment

observe!(environment::Environment, agent::BayesianAgent) -> Observation

Learn: Update beliefs based on observations

learn!(agent::BayesianAgent, observation::Observation) -> Nothing

I don't have a strong opinion on whenever to use ! at the end or not. We can remove them if most of us don't like it.

We also need some feedback from @wouterwln since he designed the API for RxEnvironments.jl and perhaps my proposal does not perfectly align with the API in RxEnvironments.jl.

0 replies

wouterwln · 2025-02-03T10:52:45Z

wouterwln
Feb 3, 2025
Maintainer

I'll try to structure my feedback as well as I can. I like the idea of the API, and I think indeed, as Fraser said, we should incorporate it in a separate package RxAgents.jl as much as we can. The idea of RxEnvironments.jl is that you wouldn't have to call observe! explicitly, and that RxEnvironments.jl does this for you. This does mean that you will get an additional get_latest_observation line or something, which is maybe exactly what you envision with observe. The API would look something like this:

last_action = 0.0
for t in 1:n_steps
    last_observation = get_latest_observation(agent)
    learn!(agent, observation, last_action)
    actions = plan(agent, n_steps - t)
    act!(agent, first(actions))
end

Now we have to think about some stuff. In my current POMDP implementation, I have the following structure:

@model function planning(p_A, p_B, y_current, y_future, T, p_s, u_current, goal_state)
    A ~ p_A
    B ~ p_B
    prev_state ~ p_s
    
    # Parameter inference step
    current_state ~ Transition(prev_state, B, u_current)
    y_current ~ Transition(current_state, A)
    previous_state = current_state

    # Planning step
    for i in 1:T
        u[i] ~ Categorical([0.2, 0.2, 0.2, 0.2, 0.2])
        s[i] ~ Transition(previous_state, B_future[i], u[i]) 
        y_future[i] ~ Transition(s[i], A_future[i])
        previous_state = s[i]
    end
    s[end] ~ goal_state
end

(Ignoring backwards messages from the planning stage towards A and B) this does parameter inference over A and B and planning in the same inference procedure (so learn! and plan simultaneously). We could incorporate this in our API as well, but I don't know how to name this. I think we can hide a lot of the boilerplate code though. Now, my inner loop looks a bit like this:


# Run inference with current A and B
result = infer(
    model = planning(
        p_A = A, 
        p_B = B,
        T = T - t,
        p_s = p_s,
        goal_state = Categorical([0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0]),
    ),
    data = (
        u_current = UnfactorizedData(prev_u),
        y_current = UnfactorizedData(current_observation),
        y_future = UnfactorizedData(fill(missing, T - t))
    ),
    constraints = constraints,
    initialization = init,
    iterations = 10
)

# Update beliefs about environment dynamics
A = last(result.posteriors[:A])
B = last(result.posteriors[:B])

# Get next state belief and action
p_s = last(result.posteriors[:current_state])
a = mode(first(last(result.posteriors[:u])))

And most of this is stuff we can hide. Let's discuss what we can hide and what we cannot.

The API for RxEnvironments is quite easy I would say. There is send!, which sends data from one place (the agent) to some other place (the environment) and which could trigger a response. So act!(agent, action) for me would be:

act!(agent, action) = send!(get_environment(agent), agent, action)

or something similar, at least a direct alias. This will trigger the environment to send an observation back to the agent which can be accessed either explicitly, or we can subscribe to it. In the end we can also do something like this:

function do_inference_and_planning(agent, observation)
     learn!(agent, observation)
     new_plan = plan(agent, goal)
     act!(agent, first(new_plan))
end

RxEnvironments.subscribe_to_observations!(agent, x -> do_inference_and_planning(agent, x))

And do some smart stuff with receding time horizons and such. RxEnvironments keeps these options open for you and I think we can write boilerplate on top of RxInfer and RxEnvironments which marries these interfaces. Let me know what you guys think.

1 reply

bvdmitri Feb 4, 2025
Maintainer

After reading this I do think we should indeed use the reactive API from RxEnvironments.jl more

apashea · 2025-02-04T03:52:39Z

apashea
Feb 4, 2025

Thank you for your work on this and acceptance of feedback. I am in much agreement with prior points, especially Albert's loop logic/ordering (the # Initialize agent cell). Clear differences between state inference, policy/planning, and learning in that order are great. Few key points about topics unstated, just so you get some feedback on other areas:

For loops, the notion is that the agent will impact the environment with its action, thus influencing the outcome of the next observation. I think everyone has this in mind, but given I've given feedback to Dmitry and others for some time about the discrete state-space development, so I'm just assuming being very, very plain by this point might be helpful here. Want to avoid staying in what could otherwise virtually be an HMM scenario with extra internal inference steps (unless a particular user were to opt for that); dynamic interaction with an environment is the attractor.
In MATLAB and pymdp, which each still bear the VFE/EFE formulation, state inference occurs via minimizing VFE while policy inference involves scoring each action/policy by its respective EFE, the info gain and pragmatic value terms ( policy , denoted pi, is the word for a sequence of actions in the literature and the textbook we all read and most predominant ActInf published literature), and then following the equation q(pi)=softmax(-G*gamma - F - ln(E)) (each policy's posterior probability is a softmax'ed function of its respective scoring of EFE weight by policy precision parameter gamma which itself might be modulated, VFE, and its prior E) for the final posterior over policies q(pi) . You will probably find interest from many people in ActInf if you were able to implement some of these core aspects of what they're actually studying and using for their research, if not make a good case for using GFE or even better (no pressure) to specify one's own scheme but this bearing in mind many researchers are out there who are doing their PhDs, post-docs, etc. in computational neuroscience, cognitive science, robotics, psychiatry/mental health, machine learning, etc. Many are into applying reinforcement learning to problems and finding ActInf to be the framework they want to use, so actually having some kind of environment.step() function will be rather clear to that subsection of folks, not so much the act!() sort of nomenclature ((of course it's julia and your own library, I'm indifferent on this particular point but am letting you know)) or wrapping up environments so tightly elsewhere that a person, say, programming an artificial behavioral experiment can't tweak and incrementally test their environment particularly easily without reading a chapter's worth or so of information. These sorts of folks I'm referring to are indeed scientific and can learn the math; they are not trained nor ready necessarily for making a course out of learning what is ultimately a new, very particular, in-development tool without it being a bit easier and more familiar to get into it. This is the case, I believe, for those still in education and training, already in industry and looking to ActInf for real-world applications, and elsewhere. You all should of course develop what you want to develop. I just want to provide some additional perspective, as someone having led the textbook study at the Institute, done large group workshops and worked on projects with others in pymdp, messages I receive regarding how to do ____ in the library, etc.
These sorts of topline functions/APIs for getting right into it is great. Meanwhile it's the flexibility of RxInfer, however, that will attract those who are wanting to move beyond MATLAB/pymdp as they are learning and studying common notions like hierarchical, hybrid (discrete/continuous), multi-timescale, and other interesting architectures which MATLAB/pymdp don't directly support without a significant amount of rigging if at all.
On the multi-agent question, I've personally just used list/dict and other iterable objects for storing multiple agents and then either iterating thru them or having loop conditions where a particular agent might receive an observation or react based on some attribute it and/or the environment has at time t, which are rather simple. I and some of the pymdp dev's have used networkx as an interesting way to link up agents in definitive networks, that's a bit of a stretch further than necessary though. I mean, multi-agent systems are still kind of "on the rise" so if you have something particularly clever or that meshes well with your framework, you might be able to trendset a bit on this front. Nonetheless looking at recently published papers on ActInf (and RL, generally) multi-agent schemes and tasks will give interesting food for thought.

I've given a fair amount of feedback on this so, just wanted to really share my thoughts on these other ends while I'm still available. Again, great work, will watch developments.

0 replies

LearnableLoopAI · 2025-02-06T12:47:41Z

LearnableLoopAI
Feb 6, 2025
Collaborator

I would love to see this standardization effort embrace an even wider scope. As I see it, the

AI field's main concern is Machine Learning. And
- Machine Learning's main concerns are the 3 pillars:
  - Supervised Learning (Regression, Classification),
  - Unsupervised Learning (Dimensionality reduction, Clustering), and
  - Sequential/Series Learning (Reinforcement Learning, Active Inference).

As 'traditional' machine learning seems to grow towards the bayesian approach, I would love to see our thinking also embrace the above so that the API will be generic enough, in both semantics and syntax, to cover all of the above. Indeed there are examples available already that showcase the applicability of RxInfer to the non-sequential areas. I think RxInfer could really become the tool for everyone, especially now that a Python implementation is also planned. Here is an example of what I mean. To consider the next data point in the case of:

Supervised Learning: the next state starts with provision: acquisition of another state/datapoint
Unsupervised Learning: the next state starts with provision: acquisition of another state/datapoint
Sequential/Series Learning: the next state starts with provision: transition from previous state/datapoint

To get a fuller picture of my thinking, please have a look at the following 2 posts, in particular the section 'Symbols/Nomenclature/Notation (KUF)' as well as the section 'MODELING' where the implementation happens:

https://learnableloop.com/posts/LitterModel_PORT.html
https://learnableloop.com/posts/ManagerEngagements_PORT.html

Best wishes with your valuable work!

2 replies

tims457 Apr 24, 2025

Can you clarify what you mean by a Python implementation is planned? Is this for RxInfer Server or for native modeling in Python? Was there an announcement?

bvdmitri Apr 25, 2025
Maintainer

Can you clarify what you mean by a Python implementation is planned? Is this for RxInfer Server or for native modeling in Python? Was there an announcement?

Hey! Currently we mean a client for RxInfer Server, which is already available https://lazydynamics.github.io/RxInferClient.py/
We announced it in twitter, Julia discourse and LinkedIn. Also available on the website RxInfer.com

Native modeling in Python is a more difficult task, but eventually it can also be implemented. We're currently busy with ReactiveMP refactoring. Once that is ready, we will decide on the next direction :)

albertpod · 2025-02-18T09:23:00Z

albertpod
Feb 18, 2025
Maintainer Author

Thanks to everyone for their input @LearnableLoopAI @bvdmitri @FraserP117 @wouterwln @apashea! I suggest spending some time during the next RxInfer meeting to devise the tasks and create the milestones associated with this discussion.

0 replies

FraserP117 · 2025-02-19T04:11:26Z

FraserP117
Feb 19, 2025
Collaborator

Thanks @albertpod, I'm really looking for ward to assisting with this. Yes I think that we can/should discuss the tasks and milestones for this project in the next RxInfer meeting. It's very possible to host the various milestones/tasks as an Institute page though I won't presume to know if that would be appropriate.

A Suggested Course For Development:

I would just like to offer something as an idea... I think it might be nice to have a running example/test case with which to experiment in creating the new agent API. The hope would be that we all get to focus/work on the same example, for ease of coordination/idea sharing/implementation.

I propose that we use the existing Active Inference Mountain Car agent as this example. This is probably the best and most widely recognised implementation of an Active Inference agent with RxInfer - a natural starting point.

RxEntities and RxEnvironments - Reiteration:

As I mentioned before, I can see RxEnvironmetns.jl splitting into RxEnvironments (proper) and the new "RxAgents.jl". Both would use the RxEntity type to instantiate an agent and an environment. This is only a suggestion, though I think a good case can be made for it.

My Experimentation:

I have made a totally new Jupyter Notebook in an attempt to satisfy the above points. In this notebook, I am recreating the existing Active Inference Mountain Car agent with RxEnvironments.jl. I think this example could constitute a central working example with which to design and create the burgeoning agent API. Please feel free to make branches and do what you will.

This is only a suggestion, perhaps there are other, better ways to start on this.

Note: I have only managed to implement the "naive" agent policy with RxEnvironmetns.jl so far but I can see a direction as to how we should implement the proper Active Inference agent. This will involve specific choices as to the "act", "plan", and "observe" functions - exactly the sorts of choices that this discussion is designed to address.

Many thanks again! Let me know what you think.

2 replies

bvdmitri Feb 19, 2025
Maintainer

Thanks @FraserP117! Very helpful, I will allocate time to go through your example hopefully this or next week

albertpod Feb 19, 2025
Maintainer Author

Good idea @FraserP117. I think that'll be very a nice demonstration, where instead of the OG code to implement AIF, we will be using plan_inf/act/observe, then assuming some parts of dynamics unknown going into plan_inf/act/observe/learn.

docxology · 2025-10-02T21:57:48Z

docxology
Oct 2, 2025

Hello all ~

Picking this back up after a discussion with @FraserP117 earlier today. We talked about how the Server-Client interface, is separable/distinct from the semantics of the Agent-Environment interface. Leaving the Server-Client topic aside, this motivated some discussion around a suitable level of constraint for a "standard API for RxInfer-based Agents" (what this thread is all about!).

So on that wavelength, I worked today here: https://github.com/docxology/RxInferExamples.jl/tree/main/research/agent . That is a generic system where there are folders for Agents and Environment, then generic configuration, composition, logging, and simulation methods. There are type checks for the interface, so that at the generic level, the methods are just testing for the state space typing-coherence (hence the "handle for the agent-environment" interface, is minimally opinionated). Here are the typing check methods.

For example HERE is Mountain Car Active Inference agent (where the @model is described), and HERE is the Mountain Car environment. Then HERE is an extremely thin orchestrator, which configures/points to which Agent + Environment to compose together and simulation. That simulation uses RxInfer methods/calls and outputs the simulation HERE.

So to return to this question of generic Agent API -- my feeling is that since there are an infinite variety (indeed an Infinite Jest!) of Agents and Environments (hence transfinite number of combinations). Hence, in terms of what level of abstraction and type/extent of opinionation in an open source infrastructure-grade synthetic intelligence package (#RxInfer #ActiveInference), what resonates with me, is that the "standardization" level is just "the agent and the environment should match (and/or it should be obvious what the set-theoretic relationship is between Environment output, Agent perception, Agent action, Environment input).

Within that meta-standardization situation, there could be further opinions (e.g. the Agent-Environment must match 1:1 with no excess, or have some match but it is OK if there are extra pieces overhanging or missing expected pieces). And of course those opinions cascade all the way down to the domain and model "Minute Particulars".

There are also many degrees of freedom in specifying the simulation orchestration, as @apashea and others raise in this thread. And that capacity of RxInfer to support custom and reactive scheduling/logistics on Graphs, is a key advantage of the package. I feel that having the interface well-specified and flexible (as I believe the approach presented here is), gives total flexibility in simulation design, deployment, and operation.

0 replies

wouterwln · 2025-10-03T09:50:06Z

wouterwln
Oct 3, 2025
Maintainer

@docxology @FraserP117 For the agent-environment interaction model, I think we should not impose tight constraints a priori, but provide the user with a toolbox that they can use to freely implement any additional constraints themselves. My first attempt at this was RxEnvironments.jl, but the design is convoluted to the point where it is extremely hard to learn (a typical example of feature-centric design over user-centric design). The philosophy and algorithm there, however, I think is correct (and kind of also what I described above here). My main point I guess is the following: We should disentangle any work (be it planning, parameter or state inference) from the "environment time". In the comment that opens this discussion, @albertpod has a tight coupling between his n_steps, the planning, state inference and the actions that are being sent to the environment (and possibly his environment loop). I think we should stay away from a protocol like this.

I've been working on a toolbox that would be a little bit easier to use, I'd love to get your thoughts on it. The (pseudo)code would look something like this:

observations = []

# Do state inference whenever an observation comes in
every(observations) do obs
    state_inference!(agent, obs)
end

# Make a plan every 50 milliseconds
every(50ms) do dt
    plan!(agent)
end

# every 3 seconds do parameter learning
every(3s) do dt
    learn!(agent)
end

# every 10ms do an update and present a new observation to the agent.
every(10ms) do dt
    update!(environment, dt)
    push!(observations, generate_observation(environment, agent)
end

# Run the sim for 10 seconds
for_next(10s) do
    update!()
end

The point here is to disentangle the different ongoing processes and scheduling them as independent jobs that run in relative isolation from each other (apart from sharing computational resources). I think an API like this would be expressive enough to capture most agent-environment interactions, while also allowing the user to build more constrained protocols (like 1:1)

WDYT?

1 reply

FraserP117 Oct 3, 2025
Collaborator

Excellent @wouterwln. I completely agree that we should focus on affording users with a configurable toolbox that can be used to implement/specify as wide an array of agent-to-env interfaces, as possible. I'll provide my thoughts regarding the general issue and then turn to your specific points.

General Thoughts/Orientation:

In the spirit of RxEnvironments.jl I also advocate for our avoiding complex and subtle discussions about what an "agent" or an "environment" is - considered in themselves - and instead, for framing the development of the implementation in terms of interacting "Entities". Some "Entities" may be RxInfer.jl models - perhaps the Generative Model of a wider AIF agent. This is the crux of the matter, as I see it: the affordance of a standardised interaction layer/protocol that can be used by an RxInfer.jl model/agent to interface with another "Entity" in a configurable, composable way. We'll have to support multi-rate and composable interfaces for the entities - I think. Obviously, we'd like to be asynchronous and event-driven, like RxInfer.jl itself.

This would be another module like RxEnvironments.jl (perhaps a refactor or re-implementation?) that also doesn't belong to RxInfer.jl per se but lives in the wider ReactiveBayes ecosystem. Incidentally, the question as to how we should best afford the interaction of multiple Active Inference agents happens to be the subject of my PhD, so this issue it very timely! I'm very keen to take up the development of this part of the ReactiveBayes ecosystem, potentially even as part of my "official" research capacity. I'm investigating the likes of Polynomial Functors and Compositional Active Inference, to this end - especially the machinery of "lenses".

Feedback - On Serialising Within-Agent Updates:

Just an aside first up: I like the native separation of the two characteristic belief-update time scales (state inference and learning). Model selection can perhaps come later?

Indeed, I think the immediate concern is overlapping updates. I'm also on board with the initial assumption/decision to serialise all within-agent mutations that participate in the agent’s sensor-motor loop - while keeping all between-agent interactions asynchronous and event-driven. That gives us safety inside each agent and more flexible affordances for compositionality between interfacing agents. Hence, state_inference!, plan!, and learn! must never overlap for a given agent. Two routes to this come to mind...

Option 1 - Per-Agent Lock:

Give each agent a ReentrantLock and do all heavy operations on an immutable snapshot, then perform a short, atomic commit under the lock. Something like:

struct Agent
    lock::ReentrantLock
    state::AgentState # includes version
end

function state_inference!(agent, obs)
    snap = agent.state # read-only snapshot
    new_state = infer_from(snap, obs)  # heavy compute (no lock)

    lock(agent.lock) do # short commit section
        if agent.state.version == snap.version
            agent.state = commit(new_state)  # swap whole state and bump version
        else
            # someone else committed first. Drop or recompute (policy-dependent)
        end
    end
end

This is a relatively simple option but we'd have to be extremely disciplined: never do long linear algebra ops under the lock and avoid taking multiple agent locks - avoid deadlock risk.

Option 2: Mailbox/FIFO Queue:

Instead of the above "dumb"/"blunt" option, route all “mutate the agent” calls through a single queue: one task processes messages FIFO, guaranteeing serialisation and order determinism within the agent's sensor-motor-loop. Again, something like:

abstract type Msg end
struct ObsMsg{Y} <: Msg; y::Y; end
struct PlanTick  <: Msg; end
struct LearnTick <: Msg; end

struct Agent
    inbox::Channel{Msg}
    state::AgentState
    ...
end

function sensor_motor_loop(agent::Agent)
    @async while true
        msg = take!(agent.inbox) # FIFO
        handle!(agent, msg) # all mutations happen HERE, one at a time...
    end
end

# producers (timers, sensors) just enqueue:
put!(agent.inbox, ObsMsg(obs))
put!(agent.inbox, PlanTick())
put!(agent.inbox, LearnTick())

This would be great for multi-rate agents/timers and a little more flexible than per-agent locks. It's more involved however. I probably favour this option, in any case.

Making This explicit in the Proposed DSL:

We could add two parameters to every(...) so users don’t have to invent their own guards - although perhaps we don't want to do this?

First, serialise = (:actor, agent) or (:lock, agent) (depending on the above choice). Hence, (:actor, agent) is the second, "mailbox serialisation" option. The scheduler enqueues the work to the agent’s single-consumer actor (FIFO). There's no overlap by construction. (:lock, agent) is the first option: "lock-and-commit serialisation". The scheduler runs the block now, the code then follows snapshot -> heavy compute (no lock) -> short atomic commit (under agent.lock).

Second, something like policy = :skip | :latest | :queue handles backpressure when ticks outrun compute. :skip drop a new compute call if one is already in operation (good for slow learn!). :latest - coalesce all pending triggers - after the current run finishes and run once with the newest input - great for for planning/control. :queue - bounded FIFO - process all triggers in order. Latency/memory can grow with this, however.

So, taking up your example code:

# Event-driven observations: process all
every(observations; serialise=(:actor, agent), policy=:queue) do obs
    state_inference!(agent, obs)
end

# Planning on a cadence: never backlog
every(50ms; serialise=(:lock, agent), policy=:latest) do
    plan!(agent)  # inside: snapshot -> compute -> short commit under agent.lock
end

# Slow parameter learning:
every(3s; serialise=(:actor, agent), policy=:skip) do
    learn!(agent)
end

# Environment at 10ms, produces observations
every(10ms) do dt
    update!(environment, dt)
    push!(observations, observe(environment; for=agent_id(agent)))
end

# Deterministic run under a SimulationClock
for_next(10s) do
    update!()  # advances the scheduler/clock
end

I suppose the only other thing is that we might like to expose the time delta with which to do each operation handled by every(...). Perhaps you had already envisioned that.

To conclude, I realise I've said nothing at all about the actual nature/design of the proposed entity-to-entity interface itself. That's surely "The Big Picture Issue" but I think these related issues need to be addressed first. There's more to say about handling (specifically) "environmental" entities, since - presumably? - an "environment" would be a shared resource and hence, would need its own specific means of internal serialisation/arbitration? This could potentially be mitigated by interpreting the environment as the global space of all entity interfaces. In any case, that's another interesting point that we'll have to consider.

I'm interested in all your thoughts, @docxology, @bvdmitri, @wouterwln and all others!

docxology · 2025-10-09T18:48:48Z

docxology
Oct 9, 2025

Thanks for the comments, I will provide some thoughts here.

@wouterwln I agree with " I think we should not impose tight constraints a priori, but provide the user with a toolbox that they can use to freely implement any additional constraints themselves.". I think there are two separate issues here:

The standardization of the schema/structure API of agent-environment interface (that is the title of the thread, and what my previous post and code described), or as @FraserP117 wisely describes, essentially the Entity-Entity interface.
The standardization of the order of operations for calling that interface.
Your post Wouter, seems to describe CompatHelper: bump compat for Rocket to 1, (keep existing compat) #2 (e.g. it is not describing the state spaces for the agent-environment interface). Whereas I focused on TagBot trigger issue #1 (with the understanding that interface schema in hand, there could then be operational flexibility for calling either side of the API in various orchestrated sequencing). The code you describe looks nice for top-level orchestration flow, however does not address the agent-environment interface at all (it is an important complementary aspect though, as I think we are all in agreement that the RxInferAnt should be able to specify agent-environment state spaces as well as operational order [including scheduled and reactive sequencings]).

As to @FraserP117 's Option's, which focus on this RxEnvironment-like question of (a)synchronicity with and among entities. I think in Option 1, you bring up this issue that any locking operation (even a very short one, especially long ones) would then beget secondary questions of what happens if "you knock on the door and it is locked". Option 2 does seem preferable, as this is the approach taken by performant distributed database software, such as Kafka.

Regardless, the approach to among-agent event passing (not to confuse the database engineer's concept of "message passing" among nodes, with the Bayesian graphical modeler's concept of [e.g. variational] message passing on factor graphs) -- that is a separable question from the API standard itself. It may be quite literally the distinction between Space and Time, where the API standard is (shared, composable) space, and the inter-entity handing describes the temporality of coordination. My focus is on the API schema itself, because then users could hook up RxInfer to arbitrary database approaches/backends/schedules.

Perhaps a new thread could be created, e.g. "Entity interface API" for the schema definition (which I have suggested above, should be a meta-standard of just checking for coherence, reporting on that, and enabling user opinions on how much coherence they require), and "Among-Entity Orchestration" for discussion separate from the API interface, how to enable different coordination schedules/approaches, whether the interface itself is simple (e.g. binary bits flowing bi-directionally), or more sophisticated (many fields of different types).

0 replies

ReactiveBayes

Standardizing APIs for RxInfer-based Agents #416

Uh oh!

albertpod Jan 31, 2025 Maintainer

Current State

Proposed API Structure

Core Agent Interface

Key Operations

Example Usage

Open Questions

Request for Comments

Replies: 10 comments · 6 replies

Uh oh!

FraserP117 Feb 1, 2025 Collaborator

Global Considerations

Core Agent Interface

Key Operations

Open Questions:

Final Comments:

Uh oh!

bvdmitri Feb 3, 2025 Maintainer

Uh oh!

wouterwln Feb 3, 2025 Maintainer

Uh oh!

bvdmitri Feb 4, 2025 Maintainer

Uh oh!

Uh oh!

apashea Feb 4, 2025

Uh oh!

LearnableLoopAI Feb 6, 2025 Collaborator

Uh oh!

tims457 Apr 24, 2025

Uh oh!

bvdmitri Apr 25, 2025 Maintainer

Uh oh!

albertpod Feb 18, 2025 Maintainer Author

Uh oh!

FraserP117 Feb 19, 2025 Collaborator

A Suggested Course For Development:

RxEntities and RxEnvironments - Reiteration:

My Experimentation:

Uh oh!

bvdmitri Feb 19, 2025 Maintainer

Uh oh!

Uh oh!

albertpod Feb 19, 2025 Maintainer Author

Uh oh!

docxology Oct 2, 2025

Uh oh!

wouterwln Oct 3, 2025 Maintainer

Uh oh!

FraserP117 Oct 3, 2025 Collaborator

General Thoughts/Orientation:

Feedback - On Serialising Within-Agent Updates:

Option 1 - Per-Agent Lock:

Option 2: Mailbox/FIFO Queue:

Making This explicit in the Proposed DSL:

Uh oh!

docxology Oct 9, 2025

albertpod
Jan 31, 2025
Maintainer

Replies: 10 comments 6 replies

FraserP117
Feb 1, 2025
Collaborator

bvdmitri
Feb 3, 2025
Maintainer

wouterwln
Feb 3, 2025
Maintainer

bvdmitri Feb 4, 2025
Maintainer

apashea
Feb 4, 2025

LearnableLoopAI
Feb 6, 2025
Collaborator

bvdmitri Apr 25, 2025
Maintainer

albertpod
Feb 18, 2025
Maintainer Author

FraserP117
Feb 19, 2025
Collaborator

bvdmitri Feb 19, 2025
Maintainer

albertpod Feb 19, 2025
Maintainer Author

docxology
Oct 2, 2025

wouterwln
Oct 3, 2025
Maintainer

FraserP117 Oct 3, 2025
Collaborator

docxology
Oct 9, 2025