rlportfolio.utils.general module

class RLDataset

Bases: IterableDataset

__init__(buffer: ReplayBuffer, batch_size: int, sample_bias: float = 1.0, from_start: bool = False) RLDataset

Initializes the reinforcement learning dataset.

Parameters:
  • buffer – Replay buffer to wrap as an iterable dataset.

  • batch_size – Sample batch size.

  • sample_bias – Probability of success of a trial in a geometric distribution. Only used if buffer is GeometricReplayBuffer.

  • from_start – If True, will choose a sequence starting from the start of the buffer. Otherwise, it will start from the end. Only used if buffer is GeometricReplayBuffer.

Note

It is a subclass of PyTorch’s IterableDataset; see https://pytorch.org/docs/stable/data.html#torch.utils.data.IterableDataset
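
As a rough illustration of the IterableDataset pattern mentioned in the note, the sketch below wraps a plain Python list instead of a ReplayBuffer. The class name ToyBufferDataset and its uniform random sampling are assumptions made for this example only; they do not describe how RLDataset samples from its buffer.

```python
import random

from torch.utils.data import DataLoader, IterableDataset


class ToyBufferDataset(IterableDataset):
    """Hypothetical stand-in for RLDataset, backed by a plain list."""

    def __init__(self, experiences, batch_size):
        self.experiences = experiences
        self.batch_size = batch_size

    def __iter__(self):
        # Yield one sampled experience at a time; the DataLoader groups
        # them into batches.
        yield from random.sample(self.experiences, self.batch_size)


dataset = ToyBufferDataset(experiences=list(range(10)), batch_size=4)
loader = DataLoader(dataset, batch_size=4)
batch = next(iter(loader))  # tensor holding 4 sampled experiences
```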

apply_action_noise(actions: Tensor, noise_model: str | None = None, epsilon: float = 0, alpha: float = 1.0) Tensor

Apply noise to the portfolio distribution while respecting its constraints.

Parameters:
  • actions – Batch of agent actions.

  • noise_model – Name of the action noise model to use.

  • epsilon – Random noise parameter.

  • alpha – Alpha (concentration) parameter of the Dirichlet distribution.

Returns:

New batch of actions with applied noise.
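
One common way to perturb portfolio weights without breaking their constraints is to mix them with a Dirichlet sample, since a convex combination of two points on the simplex stays on the simplex. The sketch below assumes this mixing scheme; the function name dirichlet_action_noise_sketch is hypothetical, and the noise models actually accepted by noise_model are not listed in this reference.

```python
import torch
from torch.distributions import Dirichlet


def dirichlet_action_noise_sketch(actions, epsilon=0.0, alpha=1.0):
    """Assumed scheme: (1 - epsilon) * actions + epsilon * Dirichlet noise."""
    # One symmetric Dirichlet per row, with concentration alpha per asset.
    concentration = torch.full_like(actions, alpha)
    noise = Dirichlet(concentration).sample()
    # Both terms are non-negative and sum to 1 per row, so the mix does too.
    return (1.0 - epsilon) * actions + epsilon * noise


actions = torch.tensor([[0.5, 0.3, 0.2], [1.0, 0.0, 0.0]])
noisy_actions = dirichlet_action_noise_sketch(actions, epsilon=0.1, alpha=1.0)
```

With epsilon = 0 the actions are returned unchanged, and with epsilon = 1 they are replaced entirely by the Dirichlet sample.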

apply_parameter_noise(model: Module, epsilon: float = 0) Module

Apply noise to the parameters of a PyTorch model. If the model is a portfolio optimization policy, the noise allows the agent to generate different actions and explore the action space.

Parameters:
  • model – PyTorch model to add parameter noise.

  • epsilon – Noise parameter.

Returns:

Copy of model with parameter noise.
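
A minimal sketch of parameter noise follows, assuming additive Gaussian noise scaled by epsilon. The distribution used by the library is not stated in this reference, and parameter_noise_sketch is a hypothetical name. The original model is left untouched because the noise is applied to a deep copy, which matches the documented return value.

```python
import copy

import torch
from torch import nn


def parameter_noise_sketch(model, epsilon=0.0):
    """Return a copy of model whose parameters carry Gaussian noise."""
    noisy_model = copy.deepcopy(model)
    with torch.no_grad():
        for param in noisy_model.parameters():
            # Assumption: zero-mean Gaussian noise with std epsilon.
            param.add_(epsilon * torch.randn_like(param))
    return noisy_model


policy = nn.Linear(3, 3)
noisy_policy = parameter_noise_sketch(policy, epsilon=0.1)
```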

combine_portfolio_vector_memories(pvm_list: list[PortfolioVectorMemory], move_index: bool = True) PortfolioVectorMemory

Combines multiple portfolio vector memories and creates a new one.

Parameters:
  • pvm_list – List of portfolio vector memories.

  • move_index – If True, moves the index pointer of the portfolio vector memory to the end, so it appends new experiences.

Returns:

Combined portfolio vector memory.

combine_replay_buffers(rb_list: list[SequentialReplayBuffer], new_type: type[SequentialReplayBuffer]) SequentialReplayBuffer

Combines multiple replay buffers and creates a new one.

Parameters:
  • rb_list – List of replay buffers.

  • new_type – New replay buffer type. It can be SequentialReplayBuffer or GeometricReplayBuffer.

Note

After replay buffers are combined, the position pointer of the new buffer is reset to 0, so it is advisable to avoid combining replay buffers if the integrity of the position pointer is important to the algorithm.

Returns:

Combined replay buffer.

numpy_to_torch(array: ndarray, type: dtype = torch.float32, add_batch_dim: bool = False, device: str = 'cpu') Tensor

Transforms numpy array to torch tensor.

Parameters:
  • array – Numpy array to be transformed.

  • type – Type of torch tensor.

  • add_batch_dim – If True, adds a batch dimension to the tensor.

  • device – Torch tensor device.

Returns:

Torch tensor.
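
The conversion can be approximated with standard PyTorch calls, as sketched below. The function name numpy_to_torch_sketch is hypothetical, and placing the batch dimension at axis 0 is an assumption rather than documented behavior.

```python
import numpy as np
import torch


def numpy_to_torch_sketch(array, dtype=torch.float32, add_batch_dim=False, device="cpu"):
    """Convert a numpy array to a tensor of the given dtype and device."""
    tensor = torch.from_numpy(array).to(dtype)
    if add_batch_dim:
        # Assumption: the batch dimension is inserted at axis 0.
        tensor = tensor.unsqueeze(0)
    return tensor.to(device)


weights = np.array([0.2, 0.3, 0.5])  # float64 by default in numpy
batch = numpy_to_torch_sketch(weights, add_batch_dim=True)
```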

polyak_average(net: Module, target_net: Module, tau: float = 0.01) Module

Applies Polyak averaging to incrementally update the target network.

Parameters:
  • net – Trained neural network.

  • target_net – Target neural network.

  • tau – Update rate.

Returns:

Target neural network with new weights.
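
Polyak averaging is usually written as target ← (1 − tau) · target + tau · net, so a small tau moves the target network slowly toward the trained one. The sketch below assumes this convention, which is consistent with tau being described as an update rate; polyak_average_sketch is a hypothetical name, not the library function.

```python
import torch
from torch import nn


def polyak_average_sketch(net, target_net, tau=0.01):
    """Soft update: target <- (1 - tau) * target + tau * net."""
    with torch.no_grad():
        for param, target_param in zip(net.parameters(), target_net.parameters()):
            # Move each target weight a fraction tau toward the trained weight.
            target_param.mul_(1.0 - tau).add_(param, alpha=tau)
    return target_net


net = nn.Linear(2, 1)
target_net = nn.Linear(2, 1)
with torch.no_grad():
    for p in net.parameters():
        p.fill_(1.0)
    for p in target_net.parameters():
        p.fill_(0.0)

polyak_average_sketch(net, target_net, tau=0.1)  # target weights become 0.1
```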

torch_to_numpy(tensor: Tensor, squeeze: float = False) ndarray

Transforms torch tensor to numpy array.

Parameters:
  • tensor – Tensor to be transformed.

  • squeeze – If True, numpy array will be squeezed, eliminating dimensions of size 1.

Returns:

Numpy array.
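
The reverse conversion can be sketched as follows. The name torch_to_numpy_sketch is hypothetical, and the detach and CPU transfer steps are assumptions added so that the example also works for tensors that track gradients or live on another device.

```python
import torch


def torch_to_numpy_sketch(tensor, squeeze=False):
    """Convert a tensor to a numpy array, optionally dropping size-1 axes."""
    array = tensor.detach().cpu().numpy()
    if squeeze:
        array = array.squeeze()
    return array


actions = torch.tensor([[0.2, 0.3, 0.5]], requires_grad=True)
flat = torch_to_numpy_sketch(actions, squeeze=True)  # shape (3,)
```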