This repository contains the full implementation of the project proposed in our research article:
"Renewable Energy Maximization for Pelagic Islands Network of Microgrids Through Battery Swapping Using Deep Reinforcement Learning"
"Abstract"
- The study proposes an energy management system of pelagic islands network microgrids (PINMGs) based on reinforcement learning (RL) under the effect of environmental factors. Furthermore, the day-ahead standard scheduling proposes an energy-sharing framework across islands by presenting a novel method to optimize the use of renewable energy (RE). Energy sharing across islands is critical for powering isolated islands that need electricity owing to a lack of renewable energy supplies to fulfill local demand. A two-stage cooperative multi-agent deep RL solution based on deep Q-learning (DQN) with central RL and island agents (IA) spread over several islands has been presented to tackle this difficulty. Because of its in-depth learning potential, deep RL-based systems effectively train and optimize their behaviors across several epochs compared to other machine learning or traditional methods. As a result, the centralized RL-based problem of scheduling charge battery sharing from resource-rich islands (SI) to load island networks (LIN) was addressed utilizing dueling DQN. Furthermore, due to its precise tracking, the case study compared the accuracy of various DQN approaches and further scheduling based on the dueling DQN. The need for LIN is also stochastic because of variable demand and charging patterns. Hence, the simulation results, including energy scheduling through the ship, are confirmed by optimizing RE consumption via sharing across several islands, and the effectiveness of the proposed method is validated by state and action perturbation to guarantee robustness.
Published in IEEE Access, 2023.
DOI: 10.1109/ACCESS.2023.3302895
Pelagic islands (isolated islands far from mainland grids) rely heavily on local renewable energy (RE) sources. However, due to unpredictable solar and wind conditions, some islands face shortages while others may have surplus energy. Establishing an effective and intelligent energy-sharing framework between these islands is critical to ensure a consistent and optimal power supply.
The objective of this project is to maximize the utilization of renewable energy across a network of isolated microgrids (PINMGs) by enabling cooperative energy sharing via battery swapping using ships.
We propose a two-stage cooperative Multi-Agent Deep Reinforcement Learning (MADRL) approach that consists of:
- Central Reinforcement Learning (CRL)
  - Manages global-level scheduling decisions.
  - Determines optimal energy transfer from Source Islands (SIs) to Load Island Networks (LINs).
- Island Agents (IAs)
  - Operate locally on each island.
  - Handle individual energy demand/supply dynamics using local policies.
- Dueling Deep Q-Networks (Dueling DQN) for improved convergence and policy evaluation.
- Environment Modeling includes stochastic demand, RE generation, and dynamic battery shipping delays.
- State and Action Perturbation Tests to validate robustness and generalizability.
Ships transport charged batteries from surplus-producing islands to demand-heavy islands. The cost and schedule of transport are factored into the optimization.
- A discrete environment with 12 states, in which the islands (acting as microgrids) sit at states that are initially unknown to the agent.
- Three island types:
  - Source Island (SI): Exports batteries.
  - Source Load Island (SLI): Exports batteries, also has significant loads.
  - Load Island (LIN): Only consumes batteries.
- Ship movement across grid cells with discrete actions.
- Battery collection and delivery logic.
- Time-series simulation of energy generation and consumption.
- Reinforcement Learning-ready environment with step and reset functions.
- TensorFlow-based DQN setup included as a test section.
The goal is to train an agent (ship) to:
- Efficiently transport batteries from source islands to the load island.
- Maximize delivery success within step limits.
- Minimize energy from non-renewable sources (penalized in rewards).
The dueling architecture helps separate the state value from the relative advantages of each action.
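
To make the value/advantage split concrete, here is a minimal sketch of how a dueling head is commonly recombined into Q-values. It assumes the mean-centred aggregation, Q(s, a) = V(s) + A(s, a) - mean(A), which is the usual choice for Dueling DQN; the README does not state which aggregation the repository uses, and `dueling_q_values` is a hypothetical helper name, not a function from this codebase.

```python
from statistics import fmean


def dueling_q_values(state_value, advantages):
    """Combine a scalar V(s) with per-action advantages A(s, a) into Q(s, a).

    Assumption: advantages are centred by their mean, so V(s) carries the
    overall value of the state and A(s, a) only ranks the actions.
    """
    mean_advantage = fmean(advantages)
    return [state_value + a - mean_advantage for a in advantages]


# Illustrative numbers only: one state with three candidate actions.
q_values = dueling_q_values(2.0, [1.0, 0.0, -1.0])
best_action = max(range(len(q_values)), key=q_values.__getitem__)
```

Because the same constant is subtracted from every advantage, the ranking of actions is unchanged; only the scale of the Q-values is anchored to the state value.
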
The ship_movement environment is implemented in the file Reward_function_MDP.py. It defines:
- reset(): Initializes the environment.
- step(action): Applies an action and returns (next_state, reward, done).
- action_space: A Gym-compatible discrete action space.
- state: An array of features representing the ship's status.
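
The reset/step contract described above can be illustrated with a toy stand-in. This is not the code in Reward_function_MDP.py: `ToyShipEnv`, the 3x4 grid layout, the island positions, and every reward value are assumptions made for illustration, and the sketch exposes a plain `n_actions` integer instead of a Gym action space so that it runs without extra dependencies.

```python
class ToyShipEnv:
    """Hypothetical miniature of a ship environment (not the repository's class)."""

    # Discrete actions mapped to (row, col) moves: up, down, left, right.
    MOVES = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}

    def __init__(self, rows=3, cols=4, source=(0, 3), load=(2, 0), max_steps=50):
        self.rows, self.cols = rows, cols
        self.source, self.load = source, load  # assumed island positions
        self.max_steps = max_steps
        self.n_actions = len(self.MOVES)
        self.reset()

    def reset(self):
        """Put the ship back at the origin with no battery on board."""
        self.pos = (0, 0)
        self.carrying = False
        self.steps = 0
        return self._state()

    def _state(self):
        return [self.pos[0], self.pos[1], int(self.carrying)]

    def step(self, action):
        """Apply one move and return (next_state, reward, done)."""
        d_row, d_col = self.MOVES[action]
        row = min(max(self.pos[0] + d_row, 0), self.rows - 1)  # stay on the grid
        col = min(max(self.pos[1] + d_col, 0), self.cols - 1)
        self.pos = (row, col)
        self.steps += 1

        reward, done = -1.0, False  # assumed per-step cost
        if self.pos == self.source and not self.carrying:
            self.carrying = True  # battery collected at the source island
            reward = 5.0
        elif self.pos == self.load and self.carrying:
            reward, done = 20.0, True  # battery delivered to the load island
        if self.steps >= self.max_steps:
            done = True  # step limit reached
        return self._state(), reward, done


env = ToyShipEnv()
state = env.reset()
for action in [3, 3, 3, 1, 1, 2, 2, 2]:  # right x3, down x2, left x3
    state, reward, done = env.step(action)
```

The scripted route first reaches the assumed source island, then the assumed load island, so the last call to `step` ends the episode with a delivery.
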
This project implements a Dueling Deep Q-Network (Dueling DQN) in TensorFlow to train an agent in a custom OpenAI Gym environment called ship_movement. The agent learns to optimize the battery pickup and delivery strategy of a simulated ship using reinforcement learning principles.
| Parameter | Value |
|---|---|
| Episodes | 5000 |
| Batch Size | 32 |
| Discount Factor | 0.98 |
| Learning Rate | 0.0001 |
| Epsilon (start) | 1.0 |
| Epsilon (end) | 0.0001 |
| Epsilon Decay | Exponential |
| Replay Buffer | 100,000 steps |
| Target Sync Freq | 500 steps |
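
The table gives the start and end values of epsilon and says the decay is exponential, but not the decay rate. The sketch below is one possible schedule under an explicit assumption: epsilon is decayed once per episode, with the rate chosen so that it reaches the end value on the last of the 5000 episodes. `epsilon_schedule` is a hypothetical name; the repository's own settings live in decay_settings.py and may differ.

```python
def epsilon_schedule(episode, eps_start=1.0, eps_end=0.0001, total_episodes=5000):
    """Exponentially decayed exploration rate for a zero-based episode index.

    Assumption: the per-episode factor is derived from the start/end values,
    so that epsilon hits eps_end at episode total_episodes - 1.
    """
    decay = (eps_end / eps_start) ** (1.0 / (total_episodes - 1))
    return max(eps_end, eps_start * decay ** episode)


first = epsilon_schedule(0)      # fully exploratory at the start
last = epsilon_schedule(4999)    # approximately eps_end at the final episode
```
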
Each episode includes the following steps:
- Reset the environment and initialize variables.
- Choose actions using the epsilon-greedy policy.
- Store experiences in the replay buffer.
- Sample random mini-batches to train the model.
- Periodically update the target network.
- Reduce epsilon over time to shift from exploration to exploitation.
- Save model weights if performance improves.
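
The action-selection, storage, and sampling steps listed above could look roughly like the following. It is a sketch rather than the repository's implementation: `ReplayBuffer` and `epsilon_greedy` are hypothetical names, and only the capacity (100,000) and batch size (32) are taken from the hyperparameter table.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size experience store; the oldest transitions are dropped first."""

    def __init__(self, capacity=100_000):
        self.memory = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size=32):
        """Draw a uniform random mini-batch without replacement."""
        return random.sample(self.memory, batch_size)

    def __len__(self):
        return len(self.memory)


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


# Illustrative usage with made-up transitions.
buffer = ReplayBuffer(capacity=3)
for t in range(5):
    buffer.store([t], 0, -1.0, [t + 1], False)
batch = buffer.sample(batch_size=2)
greedy_action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0)
```

Training would normally start only once the buffer holds at least one batch worth of transitions, and the target network would be synchronized on a fixed step interval (500 steps in the table above).
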
This repository contains implementations of various deep reinforcement learning (DRL) algorithms applied to a custom ship navigation environment. The goal is to train agents (using DQN, Double DQN, Dueling DQN, Actor-Critic) to navigate efficiently, possibly avoiding islands or minimizing cost over routes.
| File | Description |
|---|---|
| Dueling_DQN_with DDQN.py | Main training script implementing Dueling DQN with Double DQN extensions using TensorFlow. |
| TF_DQN_01.py | TensorFlow-based implementation of the standard DQN algorithm. |
| TF_DDQN_01.py | Implements Double DQN to reduce overestimation bias. |
| TF_Duling_DQN.py | Likely alternative or enhanced version of the Dueling DQN script. |
| tf_actor_critic.py | TensorFlow implementation of an Actor-Critic reinforcement learning agent. |
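
For readers unfamiliar with how Double DQN reduces overestimation bias, the sketch below shows the standard target computation: the online network chooses the next action and the target network evaluates it. This is the textbook formulation, not code copied from TF_DDQN_01.py; `double_dqn_target` is a hypothetical helper, and the default discount factor of 0.98 is the value from the hyperparameter table.

```python
def double_dqn_target(reward, done, next_q_online, next_q_target, gamma=0.98):
    """Bootstrapped target for a single transition under Double DQN.

    next_q_online: Q-values of the next state from the online network.
    next_q_target: Q-values of the next state from the target network.
    """
    if done:
        return reward  # no bootstrapping past a terminal state
    # Action selection uses the online network...
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    # ...while its value is read from the target network.
    return reward + gamma * next_q_target[best_action]


# Illustrative numbers: the two networks disagree about the best next action.
target = double_dqn_target(1.0, False, [0.2, 0.8], [0.5, 0.3])
```

With these numbers, standard DQN would bootstrap from the target network's own maximum (0.5), whereas Double DQN bootstraps from the value the target network assigns to the online network's choice (0.3), giving a smaller target.
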
| File | Description |
|---|---|
| Reward_function_MDP.py | Defines the ship_movement class, a custom Gym environment for training. |
| Island_Generator.py | Generates island map configurations or obstacles for the environment. |
| Modules.py | Contains helper functions and reusable logic modules. |
| decay_settings.py | Stores epsilon decay strategies and other hyperparameters. |
| price_dataset_file.py | Loads or simulates pricing/cost data, likely used for energy optimization tasks. |
| File | Description |
|---|---|
| Functionality Plot.py | Visualizes model performance, reward trends, or Q-values. |
| Ploting_and_Saving.py | Automates plotting and saving of training metrics like reward, loss, and epsilon. |
| Plot_RL_results.py | Standalone script for plotting saved .npy result files. |
| File | Description |
|---|---|
| Testing_Learning.py | Tests trained model performance in the custom environment. |
| Testing_learning_01.py | Variant testing script with a different configuration or episode limit. |
| Testing_Learning_02.py | Another evaluation scenario or experiment. |
| Name | Description |
|---|---|
| datasets/ | Folder for trajectory, pricing, or environment data used during training. |
| straight/ | Contains configurations for straight-line navigation environments. |
| Reward_RL_300_straight.rar | Compressed file with reward data for 300 episodes in the straight scenario. |
| straight.rar | Compressed version of the straight/ directory. |
| Name | Description |
|---|---|
| RandomModel/ | Stores random or baseline model variants. |
| Results/ | Output directory for saved weights, plots, and reward/loss history. |
| __pycache__/ | Auto-generated folder with cached Python bytecode (.pyc files). |
- Install requirements:

      git clone https://github.com/eagle-Ji/Deep-Reinforcement-Learning-for-Renewable-Energy-Maximization
      cd Deep-Reinforcement-Learning-for-Renewable-Energy-Maximization
      pip install -r requirements.txt

@article{amin2023renewable,
title={Renewable energy maximization for pelagic islands network of microgrids through battery swapping using deep reinforcement learning},
author={Amin, M Asim and Suleman, Ahmad and Waseem, Muhammad and Iqbal, Taosif and Aziz, Saddam and Faiz, Muhammad Talib and Zulfiqar, Lubaid and Saleh, Ahmed Mohammed},
journal={IEEE Access},
volume={11},
pages={86196--86213},
year={2023},
publisher={IEEE}
}



