docs/src/esn_tutorials/deep_esn.md (6 additions, 3 deletions)
@@ -2,7 +2,7 @@
Deep Echo State Network architectures started to gain some traction recently. In this guide, we illustrate how it is possible to use ReservoirComputing.jl to build a deep ESN.
-The network implemented in this library is taken from [^1]. It works by stacking reservoirs on top of each other, feeding the output from one into the next. The states are obtained by merging all the inner states of the stacked reservoirs. For a more in-depth explanation, refer to the paper linked above.
+The network implemented in this library is taken from [Gallicchio2017](@cite). It works by stacking reservoirs on top of each other, feeding the output from one into the next. The states are obtained by merging all the inner states of the stacked reservoirs. For a more in-depth explanation, refer to the cited paper.
## Lorenz Example
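For orientation, a minimal sketch of how the stacking described above might be expressed with this package. This is not part of the diff: the vector-of-reservoirs convention and the `RandSparseReservoir` constructor are assumptions based on some versions of ReservoirComputing.jl, and the sizes are illustrative.

```julia
using ReservoirComputing

# Illustrative sketch only: passing a vector of reservoirs is assumed
# to stack them, with each layer feeding its output into the next.
input_data = rand(3, 500)  # placeholder multivariate series (3 × T)

esn = ESN(input_data;
    reservoir = [RandSparseReservoir(99; radius = 1.1),
                 RandSparseReservoir(99; radius = 1.2),
                 RandSparseReservoir(99; radius = 1.4)])
```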
@@ -88,6 +88,9 @@ plot(p1, p2, p3; plot_title="Lorenz System Coordinates",
legendfontsize=12, titlefontsize=20)
```
-## Documentation
+## References
-[^1]: Gallicchio, Claudio, and Alessio Micheli. "_Deep echo state network (deepesn): A brief survey._" arXiv preprint arXiv:1712.04323 (2017).
docs/src/esn_tutorials/different_drivers.md (11 additions, 12 deletions)
@@ -4,7 +4,7 @@ While the original implementation of the Echo State Network implemented the mode
## Multiple Activation Function RNN
-Based on the double activation function ESN (DAFESN) proposed in [^1], the Multiple Activation Function ESN expands the idea and allows a custom number of activation functions to be used in the reservoir dynamics. This can be thought of as a linear combination of multiple activation functions with corresponding parameters.
+Based on the double activation function ESN (DAFESN) proposed in [Lun2015](@cite), the Multiple Activation Function ESN expands the idea and allows a custom number of activation functions to be used in the reservoir dynamics. This can be thought of as a linear combination of multiple activation functions with corresponding parameters.
@@ -14,7 +14,7 @@ where ``D`` is the number of activation functions and respective parameters chos
The method to call to use the multiple activation function ESN is `MRNN(activation_function, leaky_coefficient, scaling_factor)`. The arguments can be used as both `args` and `kwargs`. `activation_function` and `scaling_factor` have to be vectors (or tuples) containing the chosen activation functions and respective scaling factors (``f_1,...,f_D`` and ``\lambda_1,...,\lambda_D`` following the nomenclature introduced above). The `leaky_coefficient` represents ``\alpha`` and it is a single value.
-Starting with the example, the data used is based on the following function based on the DAFESN paper [^1].
+Starting with the example, the data used is based on the following function from the DAFESN paper [Lun2015](@cite).
@@ -87,7 +87,7 @@ In this example, it is also possible to observe the input of parameters to the m
## Gated Recurrent Unit
-Gated Recurrent Units (GRUs) [^2] have been proposed in more recent years with the intent of limiting notable problems of RNNs, like the vanishing gradient. This change in the underlying equations can be easily transported into the Reservoir Computing paradigm, by switching the RNN equations in the reservoir with the GRU equations. This approach has been explored in [^3] and [^4]. Different variations of GRU have been proposed [^5][^6]; this section is subdivided into different sections that go into detail about the governing equations and the implementation of them into ReservoirComputing.jl. Like before, to access the GRU reservoir driver, it suffices to change the `reservoir_diver` keyword argument for `ESN` with `GRU()`. All the variations that will be presented can be used in this package by leveraging the keyword argument `variant` in the method `GRU()` and specifying the chosen variant: `FullyGated()` or `Minimal()`. Other variations are possible by modifying the inner layers and reservoirs. The default is set to the standard version `FullyGated()`. The first section will go into more detail about the default of the `GRU()` method, and the following ones will refer to it to minimize repetitions. This example was run on Julia v1.7.2.
+Gated Recurrent Units (GRUs) [Cho2014](@cite) were proposed to mitigate notable problems of RNNs, such as the vanishing gradient. The change in the underlying equations carries over naturally to the reservoir computing paradigm by replacing the RNN equations in the reservoir with the GRU equations, an approach explored in [Wang2020](@cite) and [Sarli2020](@cite). Different variants of the GRU have also been proposed [Dey2017](@cite); the subsections below detail the governing equations and their implementation in ReservoirComputing.jl. As before, to access the GRU reservoir driver, it suffices to set the `reservoir_driver` keyword argument of `ESN` to `GRU()`. All the variants presented here can be selected through the `variant` keyword argument of `GRU()`: `FullyGated()` (the default) or `Minimal()`. Other variations are possible by modifying the inner layers and reservoirs. The first subsection covers the default `GRU()` in detail, and the following ones refer back to it to minimize repetition.
### Standard GRU
@@ -104,7 +104,7 @@ Going over the `GRU` keyword argument, it will be explained how to feed the desi
-`activation_function` is a vector with default values `[NNlib.sigmoid, NNlib.sigmoid, tanh]`. This argument controls the activation functions of the GRU, going from top to bottom. Changing the first element corresponds to changing the activation function for ``\mathbf{r}(t)`` and so on.
-`inner_layer` is a vector with default values `fill(DenseLayer(), 2)`. This keyword argument controls the ``\mathbf{W}_{\text{in}}``s going from top to bottom like before.
--`reservoir` is a vector with default value `fill(RandSparseReservoir(), 2)`. In a similar fashion to `inner_layer`, this keyword argument controls the reservoir matrix construction in a top to bottom order.
+-`reservoir` is a vector with default value `fill(RandSparseReservoir(), 2)`. Similarly to `inner_layer`, this keyword argument controls the reservoir matrix construction in a top to bottom order.
-`bias` is again a vector with default value `fill(DenseLayer(), 2)`. It is meant to control the ``\mathbf{b}``s, going as usual from top to bottom.
-`variant` controls the GRU variant. The default value is set to `FullyGated()`.
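Pulling the keyword arguments above together, a sketch of a `GRU` driver spelled out with the documented defaults. The explicit values are the defaults quoted in the list above; attaching the driver via `reservoir_driver` and the placeholder data are assumptions for illustration.

```julia
using ReservoirComputing
using NNlib: sigmoid

# Every value below is the documented default, written out explicitly.
grudriver = GRU(activation_function = [sigmoid, sigmoid, tanh],
    inner_layer = fill(DenseLayer(), 2),
    reservoir = fill(RandSparseReservoir(), 2),
    bias = fill(DenseLayer(), 2),
    variant = FullyGated())

input_data = rand(1, 300)  # placeholder series
esn = ESN(input_data; reservoir_driver = grudriver)
```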
@@ -161,7 +161,7 @@ This variation can be obtained by setting `variation=Minimal()`. The `inner_laye
To showcase the use of the `GRU()` method, this section will only illustrate the standard `FullyGated()` version. The full script for this example with the data can be found [here](https://github.com/MartinuzziFrancesco/reservoir-computing-examples/tree/main/change_drivers/gru).
-The data used for this example is the Santa Fe laser dataset [^7] retrieved from [here](https://web.archive.org/web/20160427182805/http://www-psych.stanford.edu/%7Eandreas/Time-Series/SantaFe.html). The data is split to account for a next step prediction.
+The data used for this example is the Santa Fe laser dataset [Hbner1989](@cite) retrieved from [here](https://web.archive.org/web/20160427182805/http://www-psych.stanford.edu/%7Eandreas/Time-Series/SantaFe.html). The data is split to set up a next-step prediction task.
-[^1]: Lun, Shu-Xian, et al. "_A novel model of leaky integrator echo state network for time-series prediction._" Neurocomputing 159 (2015): 58-66.
-[^2]: Cho, Kyunghyun, et al. "_Learning phrase representations using RNN encoder-decoder for statistical machine translation._" arXiv preprint arXiv:1406.1078 (2014).
-[^3]: Wang, Xinjie, Yaochu Jin, and Kuangrong Hao. "_A Gated Recurrent Unit based Echo State Network._" 2020 International Joint Conference on Neural Networks (IJCNN). IEEE, 2020.
-[^4]: Di Sarli, Daniele, Claudio Gallicchio, and Alessio Micheli. "_Gated Echo State Networks: a preliminary study._" 2020 International Conference on INnovations in Intelligent SysTems and Applications (INISTA). IEEE, 2020.
-[^5]: Dey, Rahul, and Fathi M. Salem. "_Gate-variants of gated recurrent unit (GRU) neural networks._" 2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS). IEEE, 2017.
-[^6]: Zhou, Guo-Bing, et al. "_Minimal gated unit for recurrent neural networks._" International Journal of Automation and Computing 13.3 (2016): 226-234.
-[^7]: Hübner, Uwe, Nimmi B. Abraham, and Carlos O. Weiss. "_Dimensions and entropies of chaotic intensity pulsations in a single-mode far-infrared NH3 laser._" Physical Review A 40.11 (1989): 6354.
docs/src/esn_tutorials/hybrid.md (6 additions, 3 deletions)
@@ -1,6 +1,6 @@
# Hybrid Echo State Networks
-Following the idea of giving physical information to machine learning models, the hybrid echo state networks [^1] try to achieve this results by feeding model data into the ESN. In this example, it is explained how to create and leverage such models in ReservoirComputing.jl.
+Following the idea of giving physical information to machine learning models, hybrid echo state networks [Pathak2018](@cite) try to achieve this result by feeding model data into the ESN. This example explains how to create and leverage such models in ReservoirComputing.jl.
## Generating the data
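A hypothetical sketch of the construction described above, feeding a knowledge-based model into the ESN. The `Hybrid` constructor, its argument order, and the `variation` keyword are assumptions drawn from older versions of the package, not confirmed API; check the current documentation before relying on them.

```julia
using ReservoirComputing

# `prior_model` stands in for any callable returning the knowledge-based
# model's trajectory over the requested time steps; here it is a stub.
prior_model(u0, tspan, tsteps) = zeros(3, length(tsteps))

u0 = [1.0, 0.0, 0.0]
tspan = (0.0, 100.0)
datasize = 1000

# Assumed constructor and keyword, for illustration only.
hybrid = Hybrid(prior_model, u0, tspan, datasize)
input_data = rand(3, datasize)  # placeholder for the Lorenz data
esn = ESN(input_data; variation = hybrid)
```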
@@ -94,6 +94,9 @@ plot(p1, p2, p3; plot_title="Lorenz System Coordinates",
legendfontsize=12, titlefontsize=20)
```
-## Bibliography
+## References
-[^1]: Pathak, Jaideep, et al. "_Hybrid forecasting of chaotic processes: Using machine learning in conjunction with a knowledge-based model._" Chaos: An Interdisciplinary Journal of Nonlinear Science 28.4 (2018): 041101.