Note that
and
.
Network states and biases:
| Layer | Activated values | Potentials | Activation function | Bias |
|---|---|---|---|---|
| Output | ![]() | ![]() | ![]() | ![]() |
| Context | ![]() | ![]() | ![]() | ![]() |
| Input | ![]() | | | |
Network weights:
| To | From | Weights |
|---|---|---|
| Output | Context | ![]() |
| Context | Context | ![]() |
| Context | Input | ![]() |
is the time-constant like variable.
For a typical RNN, set
.
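To make the role of the time-constant-like variable concrete, here is a minimal sketch of one forward step. The formulas are not visible in this text, so the update rule below is an assumption: a leaky-integrator form in which the new context potential mixes the old potential and the weighted input with factors `1 - 1/tau` and `1/tau`. The names `ctrnn_step`, `tau`, `W_ci`, `W_cc`, `W_oc`, `b_c`, and `b_o` are hypothetical, `tanh` is an assumed activation for both layers, and the output layer is assumed to have no leak.

```python
import math

def ctrnn_step(x, u_prev, c_prev, W_ci, W_cc, b_c, W_oc, b_o, tau):
    """One forward step of an assumed leaky-integrator RNN (see the text above)."""
    n_c = len(u_prev)
    u = []
    for i in range(n_c):
        # Weighted input to context unit i: bias + input + recurrent context.
        drive = b_c[i]
        drive += sum(W_ci[i][j] * x[j] for j in range(len(x)))
        drive += sum(W_cc[i][j] * c_prev[j] for j in range(n_c))
        # Leaky integration: with tau = 1 the old potential is discarded.
        u.append((1.0 - 1.0 / tau) * u_prev[i] + drive / tau)
    c = [math.tanh(v) for v in u]
    # Output layer reads the new context activations (no leak assumed here).
    u_out = [b_o[k] + sum(W_oc[k][i] * c[i] for i in range(n_c))
             for k in range(len(b_o))]
    y = [math.tanh(v) for v in u_out]
    return u, c, y
```

Under this assumed form, calling `ctrnn_step(..., tau=1.0)` ignores `u_prev` entirely, which is the behaviour of an ordinary recurrent layer, while larger values of `tau` make the potential change more slowly.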
Backward propagation of the delta-error can be written as follows:
Here I assume
if
.
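As an illustration of the backward recursion, the sketch below computes delta-errors for a scalar network (one input, one context, one output unit). Everything here is an assumption rather than the formulas of this note: the leaky-integrator update with `tau`, `tanh` activations, a squared-error loss, zero initial state, and a delta-error of zero beyond the last time step. The names `forward`, `backward_deltas`, and the keys of `p` are hypothetical.

```python
import math

def forward(xs, p, tau):
    """Run the assumed scalar leaky-integrator RNN over the input sequence xs."""
    u, c = 0.0, 0.0  # initial potential and activation, assumed zero
    cs, ys = [], []
    for x in xs:
        u = (1.0 - 1.0 / tau) * u + (p["w_cc"] * c + p["w_ci"] * x + p["b_c"]) / tau
        c = math.tanh(u)
        cs.append(c)
        ys.append(math.tanh(p["w_oc"] * c + p["b_o"]))
    return cs, ys

def backward_deltas(cs, ys, targets, p, tau):
    """Delta-errors dE/d(potential) for E = 0.5 * sum((y - d)^2)."""
    T = len(ys)
    # Output delta: error times the tanh derivative.
    d_out = [(ys[t] - targets[t]) * (1.0 - ys[t] ** 2) for t in range(T)]
    d_ctx = [0.0] * T
    nxt = 0.0  # delta-error beyond the last step, assumed zero
    for t in reversed(range(T)):
        # Paths through the activation: to the output now, to the context next step.
        through_act = p["w_oc"] * d_out[t] + p["w_cc"] * nxt / tau
        # Plus the direct leak path from this potential to the next one.
        d_ctx[t] = through_act * (1.0 - cs[t] ** 2) + (1.0 - 1.0 / tau) * nxt
        nxt = d_ctx[t]
    return d_out, d_ctx
```

In this assumed model the bias gradients follow directly from the deltas: `sum(d_out)` for the output bias and `sum(d_ctx) / tau` for the context bias. Comparing them with a finite-difference estimate of the loss is a quick way to check the recursion.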
Gradient of each parameter: