rlportfolio.policy.eiie module

class EIIE

Bases: Module

__init__(initial_features: int = 3, k_size: int = 3, conv_mid_features: int = 2, conv_final_features: int = 20, time_window: int = 50, device: str = 'cpu') → EIIE

Convolutional EIIE (ensemble of identical independent evaluators) policy network initializer.

Parameters:

initial_features – Number of input features.
k_size – Size of first convolutional kernel.
conv_mid_features – Number of channels in the intermediate convolutional layer.
conv_final_features – Number of channels in the final convolutional layer.
time_window – Size of time window used as agent’s state.
device – Device on which the neural network will run.

Note

Reference article: https://doi.org/10.48550/arXiv.1706.10059.
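The "identical independent evaluators" idea can be illustrated with a small, standard-library-only sketch. It is not rlportfolio's implementation: the function names, the kernel weights, and the ReLU-then-average scoring are assumptions made for illustration. The point it shows is that one shared set of weights (here a single kernel of length 3, matching the default k_size) scores every asset's time window independently.

```python
# Hypothetical illustration only: this is NOT rlportfolio's EIIE implementation.
from typing import List, Sequence


def conv1d_valid(series: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Slide `kernel` over `series` without padding (cross-correlation)."""
    k = len(kernel)
    return [
        sum(series[i + j] * kernel[j] for j in range(k))
        for i in range(len(series) - k + 1)
    ]


def evaluate_asset(window: Sequence[float], kernel: Sequence[float]) -> float:
    """Score one asset: convolve, apply ReLU, then average to a single number."""
    activations = [max(0.0, v) for v in conv1d_valid(window, kernel)]
    return sum(activations) / len(activations)


def evaluate_portfolio(
    windows: Sequence[Sequence[float]], kernel: Sequence[float]
) -> List[float]:
    """Apply the same evaluator (same kernel) to every asset independently."""
    return [evaluate_asset(w, kernel) for w in windows]


# Made-up weights that respond to upward price moves (kernel length 3).
shared_kernel = [-1.0, 0.0, 1.0]
windows = [
    [1.0, 1.1, 1.2, 1.3, 1.4],  # rising asset
    [1.4, 1.3, 1.2, 1.1, 1.0],  # falling asset
]
scores = evaluate_portfolio(windows, shared_kernel)
```

Because the evaluator is shared and each asset is scored on its own, reordering the assets simply reorders the scores.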

forward(observation: Tensor, last_action: Tensor) → Tensor

Policy network’s forward propagation. Computes the action this policy prefers for the given inputs.

Parameters:

observation – Environment observation.
last_action – Last action performed by agent.

Returns:

Action to be taken.
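As a rough sketch of what such an action can look like, the example below turns per-asset scores into portfolio weights with a softmax that includes a cash entry, following the general approach of the referenced article. Treat the details as assumptions: `scores_to_action`, the `cash_bias` argument, and placing cash at index 0 are hypothetical choices for this illustration, not the documented behavior of `forward`.

```python
# Hypothetical illustration only: not the library's forward() implementation.
import math
from typing import List, Sequence


def scores_to_action(
    asset_scores: Sequence[float], cash_bias: float = 0.0
) -> List[float]:
    """Softmax over [cash_bias, *asset_scores]; index 0 is the cash position."""
    logits = [cash_bias, *asset_scores]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Three made-up asset scores; the result has one extra entry for cash.
action = scores_to_action([0.2, 0.0, -0.1])
```

The resulting weights are all positive and sum to one, so they can be read directly as the fraction of the portfolio allocated to cash and to each asset.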

class EIIERecurrent

Bases: Module

__init__(initial_features: int = 3, rec_type: str = 'rnn', rec_num_layers: int = 20, rec_nonlinearity: str = 'tanh', rec_final_features: int = 20, portfolio_size: int = 11, device: str = 'cpu') → EIIERecurrent

Recurrent EIIE (ensemble of identical independent evaluators) policy network initializer.

Parameters:

initial_features – Number of input features.
rec_type – Type of recurrent layers. It can be “rnn” or “lstm”.
rec_num_layers – Number of recurrent layers.
rec_nonlinearity – Activation function used in the recurrent units. Can be “relu” or “tanh”. Only used if rec_type is “rnn”.
rec_final_features – Number of channels in the final recurrent layer.
portfolio_size – Number of assets in portfolio.
device – Device on which the neural network will run.

Note

Reference article: https://doi.org/10.48550/arXiv.1706.10059.
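To make the role of rec_nonlinearity concrete, here is a minimal scalar sketch of a plain (Elman) recurrent update, h_t = act(w_in * x_t + w_rec * h_{t-1} + bias), with the activation chosen by name. It is an illustration under stated assumptions, not the library's code: `rnn_final_state`, the scalar weights, and the sample return series are all hypothetical.

```python
# Hypothetical illustration only: a scalar Elman-style recurrence, not rlportfolio code.
import math
from typing import Callable, Dict, Sequence

ACTIVATIONS: Dict[str, Callable[[float], float]] = {
    "tanh": math.tanh,
    "relu": lambda v: max(0.0, v),
}


def rnn_final_state(
    series: Sequence[float],
    w_in: float,
    w_rec: float,
    bias: float = 0.0,
    nonlinearity: str = "tanh",
) -> float:
    """Run the recurrence over `series` and return the last hidden state."""
    act = ACTIVATIONS[nonlinearity]  # raises KeyError for unknown names
    hidden = 0.0
    for x in series:
        hidden = act(w_in * x + w_rec * hidden + bias)
    return hidden


# Made-up per-step returns for a single asset.
returns = [0.01, -0.02, 0.03]
tanh_state = rnn_final_state(returns, w_in=1.0, w_rec=0.5, nonlinearity="tanh")
relu_state = rnn_final_state(returns, w_in=1.0, w_rec=0.5, nonlinearity="relu")
```

With "tanh" the hidden state stays inside (-1, 1); with "relu" negative pre-activations are clipped to zero, so the second step above resets the state before the third step.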

forward(observation: Tensor, last_action: Tensor) → Tensor

Policy network’s forward propagation. Computes the action this policy prefers for the given inputs.

Parameters:

observation – Environment observation.
last_action – Last action performed by agent.

Returns:

Action to be taken.