rlportfolio.utils.general module

class RLDataset

Bases: IterableDataset

__init__(buffer: ReplayBuffer, batch_size: int, sample_bias: float = 1.0, from_start: bool = False) → RLDataset

Initializes reinforcement learning dataset.

Parameters:

buffer – Replay buffer to wrap as an iterable dataset.
batch_size – Sample batch size.
sample_bias – Probability of success of a trial in a geometric distribution. Only used if buffer is GeometricReplayBuffer.
from_start – If True, sequences are sampled starting from the beginning of the buffer; otherwise, they are sampled starting from the end. Only used if buffer is GeometricReplayBuffer.

Note

RLDataset is a subclass of PyTorch’s IterableDataset; see https://pytorch.org/docs/stable/data.html#torch.utils.data.IterableDataset
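To make the roles of sample_bias and from_start more concrete, here is a minimal sketch of geometric sampling over a buffer. It assumes the start of each batch is offset from one end of the buffer by a geometrically distributed number of steps. The helper name sample_start_index and its index arithmetic are hypothetical and are not taken from GeometricReplayBuffer itself.

```python
import math
import random


def sample_start_index(buffer_len, batch_size, sample_bias=1.0, from_start=False, rng=random):
    """Hypothetical helper: pick the first index of a contiguous batch."""
    if not 0.0 < sample_bias <= 1.0:
        raise ValueError("sample_bias must be in (0, 1]")
    max_start = buffer_len - batch_size  # last valid start index
    if max_start < 0:
        raise ValueError("batch_size is larger than the buffer")

    if sample_bias >= 1.0:
        # Success on the first trial: no offset from the chosen end.
        offset = 0
    else:
        # Inverse transform sampling of a geometric variable k >= 0
        # with P(k) = p * (1 - p)^k, where p = sample_bias.
        u = 1.0 - rng.random()  # uniform in (0, 1]
        offset = int(math.log(u) / math.log(1.0 - sample_bias))

    offset = min(offset, max_start)  # keep the batch inside the buffer
    return offset if from_start else max_start - offset
```

Under these assumptions, a sample_bias of 1.0 always picks the batch at the chosen end of the buffer, while smaller values spread the samples further away from that end.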

apply_action_noise(actions: Tensor, noise_model: str | None = None, epsilon: float = 0, alpha: float = 1.0) → Tensor

Applies noise to the portfolio distribution while respecting its constraints.

Parameters:

actions – Batch of agent actions.
noise_model – Name of the action noise model to use.
epsilon – Random noise parameter.
alpha – Alpha (concentration) parameter of the Dirichlet distribution.

Returns:

New batch of actions with applied noise.
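The noise models accepted by noise_model are not listed here, so the following is only a sketch of one way noise can respect portfolio constraints. It assumes the constraints are that weights are non-negative and sum to 1, and it mixes the action with a Dirichlet sample. The function names dirichlet_sample and mix_with_dirichlet are hypothetical, and the sketch works on a single action rather than a batch.

```python
import random


def dirichlet_sample(n, alpha=1.0, rng=random):
    """Draw one symmetric Dirichlet(alpha) sample by normalizing gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]


def mix_with_dirichlet(weights, epsilon=0.0, alpha=1.0, rng=random):
    """Assumed noise model: convex combination of the action and Dirichlet noise."""
    noise = dirichlet_sample(len(weights), alpha, rng)
    # A convex combination of two points on the simplex stays on the simplex.
    return [(1.0 - epsilon) * w + epsilon * z for w, z in zip(weights, noise)]
```

Because both the original weights and the Dirichlet sample lie on the probability simplex, the result is still a valid portfolio for any epsilon between 0 and 1.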

apply_parameter_noise(model: Module, epsilon: float = 0) → Module

Applies noise to the parameters of a PyTorch model. If the model is a portfolio optimization policy, the noise allows the agent to generate different actions and explore the action space.

Parameters:

model – PyTorch model to which parameter noise is added.
epsilon – Noise parameter.

Returns:

Copy of model with parameter noise.
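As a rough illustration of parameter noise, the sketch below copies a model and perturbs every parameter. It assumes zero-mean Gaussian noise scaled by epsilon; the distribution actually used by apply_parameter_noise is not stated here, and the name add_gaussian_parameter_noise is hypothetical.

```python
import copy

import torch
from torch import nn


def add_gaussian_parameter_noise(model: nn.Module, epsilon: float = 0.0) -> nn.Module:
    """Return a noisy copy of the model (assumed noise: N(0, epsilon^2))."""
    noisy_model = copy.deepcopy(model)  # leave the original model untouched
    with torch.no_grad():  # parameter edits should not be tracked by autograd
        for param in noisy_model.parameters():
            param.add_(torch.randn_like(param) * epsilon)
    return noisy_model
```

Returning a copy keeps the trained policy intact, so the noisy version can be used for exploration and then discarded.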

combine_portfolio_vector_memories(pvm_list: list[PortfolioVectorMemory], move_index: bool = True) → PortfolioVectorMemory

Combines multiple portfolio vector memories and creates a new one.

Parameters:

pvm_list – List of portfolio vector memories.
move_index – If True, moves the index pointer of the portfolio vector memory to the end, so it appends new experiences.

Returns:

Combined portfolio vector memory.

combine_replay_buffers(rb_list: list[SequentialReplayBuffer], new_type: type[SequentialReplayBuffer]) → SequentialReplayBuffer

Combines multiple replay buffers and creates a new one.

Parameters:

rb_list – List of replay buffers.
new_type – New replay buffer type. It can be SequentialReplayBuffer or GeometricReplayBuffer.

Note

After combining replay buffers, the position pointer is reset to 0, so it is advisable to avoid combining replay buffers if the integrity of the position pointer is important to the algorithm.

Returns:

Combined replay buffer.

numpy_to_torch(array: ndarray, type: dtype = torch.float32, add_batch_dim: bool = False, device: str = 'cpu') → Tensor

Transforms numpy array to torch tensor.

Parameters:

array – Numpy array to be transformed.
type – Data type of the torch tensor.
add_batch_dim – If True, adds a batch dimension to the tensor.
device – Torch tensor device.

Returns:

Torch tensor.
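The conversion can be sketched with standard PyTorch calls. The function name to_tensor is hypothetical, and prepending the batch dimension at position 0 is an assumption about what add_batch_dim does.

```python
import numpy as np
import torch


def to_tensor(array, dtype=torch.float32, add_batch_dim=False, device="cpu"):
    """Sketch of a numpy-to-torch conversion."""
    tensor = torch.from_numpy(array).to(dtype)  # cast to the requested dtype
    if add_batch_dim:
        tensor = tensor.unsqueeze(0)  # assumption: batch dimension is prepended
    return tensor.to(device)


# Usage: a (3, 4) float64 array becomes a (1, 3, 4) float32 tensor.
observation = to_tensor(np.zeros((3, 4)), add_batch_dim=True)
```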

polyak_average(net: Module, target_net: Module, tau: float = 0.01) → Module

Applies Polyak averaging to incrementally update the target network.

Parameters:

net – Trained neural network.
target_net – Target neural network.
tau – Update rate.

Returns:

Target neural network with new weights.
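A minimal sketch of a Polyak (soft) update follows. It assumes the common convention in which tau weights the trained network, so a small tau moves the target slowly. The name soft_update is hypothetical, and the sketch only covers parameters, not buffers.

```python
import torch
from torch import nn


def soft_update(net: nn.Module, target_net: nn.Module, tau: float = 0.01) -> nn.Module:
    """In-place soft update of target_net toward net."""
    with torch.no_grad():
        for param, target_param in zip(net.parameters(), target_net.parameters()):
            # target <- tau * net + (1 - tau) * target
            target_param.mul_(1.0 - tau).add_(param, alpha=tau)
    return target_net
```

With tau = 0.01, each call moves the target weights 1% of the way toward the trained weights, which is a common way to stabilize targets in actor-critic methods.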

torch_to_numpy(tensor: Tensor, squeeze: float = False) → ndarray

Transforms torch tensor to numpy array.

Parameters:

tensor – Tensor to be transformed.
squeeze – If True, numpy array will be squeezed, eliminating dimensions of size 1.

Returns:

Numpy array.
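The reverse conversion can be sketched in the same way. The name to_numpy is hypothetical, and detaching the tensor and moving it to the CPU first are assumptions about what a robust implementation would do.

```python
import torch


def to_numpy(tensor, squeeze=False):
    """Sketch of a torch-to-numpy conversion."""
    # detach() drops autograd history; cpu() is needed for tensors on other devices.
    array = tensor.detach().cpu().numpy()
    return array.squeeze() if squeeze else array


# Usage: a (1, 3, 1) tensor becomes a (3,) array when squeeze is True.
weights = to_numpy(torch.ones(1, 3, 1), squeeze=True)
```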