rlportfolio.utils.general module
- class RLDataset
Bases: IterableDataset
- __init__(buffer: ReplayBuffer, batch_size: int, sample_bias: float = 1.0, from_start: bool = False) RLDataset
Initializes reinforcement learning dataset.
- Parameters:
buffer – Replay buffer to be wrapped as an iterable dataset.
batch_size – Sample batch size.
sample_bias – Probability of success of a trial in a geometric distribution. Only used if buffer is GeometricReplayBuffer.
from_start – If True, will choose a sequence starting from the start of the buffer. Otherwise, it will start from the end. Only used if buffer is GeometricReplayBuffer.
Note
RLDataset is a subclass of PyTorch’s IterableDataset; see https://pytorch.org/docs/stable/data.html#torch.utils.data.IterableDataset
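The roles of sample_bias and from_start can be illustrated with a small standard-library sketch. This is not the library's implementation: the helper name `sample_start_index` and the clamping rule are assumptions made for illustration only. It draws an offset from a geometric distribution with success probability `sample_bias` and counts that offset from the start or from the end of the buffer.

```python
import math
import random


def sample_start_index(buffer_len, batch_size, sample_bias=1.0,
                       from_start=False, rng=random):
    """Hypothetical helper: choose where a batch of consecutive items begins."""
    if not 0.0 < sample_bias <= 1.0:
        raise ValueError("sample_bias must be in (0, 1]")
    max_offset = buffer_len - batch_size  # largest offset that still fits a batch
    if max_offset < 0:
        raise ValueError("batch_size is larger than the buffer")
    if sample_bias == 1.0:
        offset = 0  # the first trial always succeeds
    else:
        # Inverse-transform sampling of the number of failures
        # before the first success of a geometric distribution.
        u = 1.0 - rng.random()  # uniform in (0, 1]
        offset = int(math.log(u) / math.log(1.0 - sample_bias))
    offset = min(offset, max_offset)  # assumption: clamp to the valid range
    # Count the offset from the start, or backwards from the end.
    return offset if from_start else max_offset - offset
```

Under these assumptions, a sample_bias close to 1 concentrates batches at the chosen end of the buffer, while smaller values spread them further along it.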
- apply_action_noise(actions: Tensor, noise_model: str | None = None, epsilon: float = 0, alpha: float = 1.0) Tensor
Applies noise to the portfolio distribution while respecting its constraints.
- Parameters:
actions – Batch of agent actions.
noise_model – Name of the action noise model to use.
epsilon – Random noise parameter.
alpha – Alpha parameter of the Dirichlet distribution.
- Returns:
New batch of actions with applied noise.
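One way to add noise without leaving the set of valid portfolios is to blend each action with a Dirichlet sample. The sketch below is an illustration, not the noise model implemented by apply_action_noise: the function name `dirichlet_mix` and the blending rule are assumptions, and it assumes each action is a non-negative weight vector that sums to 1.

```python
import torch


def dirichlet_mix(actions, epsilon=0.0, alpha=1.0):
    """Hypothetical noise model: blend each action with a Dirichlet sample.

    actions: float tensor of shape (batch, n_assets), each row summing to 1.
    """
    # Symmetric Dirichlet with concentration alpha for every asset.
    concentration = torch.full_like(actions, alpha)
    noise = torch.distributions.Dirichlet(concentration).sample()
    # A convex combination of two points on the simplex stays on the simplex.
    return (1.0 - epsilon) * actions + epsilon * noise


actions = torch.tensor([[0.5, 0.3, 0.2], [1.0, 0.0, 0.0]])
noisy_actions = dirichlet_mix(actions, epsilon=0.1, alpha=1.0)
```

Because both terms are non-negative and sum to 1, the noisy actions remain valid portfolio weights for any epsilon in [0, 1].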
- apply_parameter_noise(model: Module, epsilon: float = 0) Module
Applies noise to the parameters of a PyTorch model. If the model is a portfolio optimization policy, the noise lets the agent generate different actions and explore the action space.
- Parameters:
model – PyTorch model to add parameter noise.
epsilon – Noise parameter.
- Returns:
Copy of model with parameter noise.
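The idea of perturbing a copy of the model can be sketched as follows. The noise distribution used by apply_parameter_noise is not stated here, so the Gaussian noise scaled by epsilon and the function name `add_gaussian_parameter_noise` are assumptions for illustration.

```python
import copy

import torch
from torch import nn


def add_gaussian_parameter_noise(model, epsilon=0.0):
    """Hypothetical sketch: return a noisy copy, leaving the original untouched."""
    noisy_model = copy.deepcopy(model)
    with torch.no_grad():  # the perturbation is not part of the autograd graph
        for param in noisy_model.parameters():
            # Assumption: zero-mean Gaussian noise with standard deviation epsilon.
            param.add_(epsilon * torch.randn_like(param))
    return noisy_model


policy = nn.Linear(4, 2)
noisy_policy = add_gaussian_parameter_noise(policy, epsilon=0.1)
```

Working on a deep copy keeps the trained policy intact, which matches the documented return value of a copy of the model.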
- combine_portfolio_vector_memories(pvm_list: list[PortfolioVectorMemory], move_index: bool = True) PortfolioVectorMemory
Combines a list of portfolio vector memories into a new one.
- Parameters:
pvm_list – List of portfolio vector memories.
move_index – If True, moves the index pointer of the portfolio vector memory to the end, so it appends new experiences.
- Returns:
Combined portfolio vector memory.
- combine_replay_buffers(rb_list: list[SequentialReplayBuffer], new_type: type[SequentialReplayBuffer]) SequentialReplayBuffer
Combines multiple replay buffers and creates a new one.
- Parameters:
rb_list – List of replay buffers.
new_type – New replay buffer type. It can be SequentialReplayBuffer or GeometricReplayBuffer.
Note
After replay buffers are combined, the position pointer of the new buffer is reset to 0. It is therefore advisable to avoid combining replay buffers if the integrity of the position pointer is important to the algorithm.
- Returns:
Combined replay buffer.
- numpy_to_torch(array: ndarray, type: dtype = torch.float32, add_batch_dim: bool = False, device: str = 'cpu') Tensor
Transforms numpy array to torch tensor.
- Parameters:
array – Numpy array to be transformed.
type – Data type of the torch tensor.
add_batch_dim – If True, adds a batch dimension to the tensor.
device – Torch tensor device.
- Returns:
Torch tensor.
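A roughly equivalent conversion in plain PyTorch might look like the sketch below. The helper name `to_tensor` is hypothetical, and placing the batch dimension at position 0 is an assumption, since the position is not stated above.

```python
import numpy as np
import torch


def to_tensor(array, dtype=torch.float32, add_batch_dim=False, device="cpu"):
    """Hypothetical stand-in illustrating the documented parameters."""
    tensor = torch.as_tensor(array, dtype=dtype, device=device)
    if add_batch_dim:
        tensor = tensor.unsqueeze(0)  # assumption: batch dimension comes first
    return tensor


observation = np.zeros((3, 5))  # NumPy defaults to float64
batch = to_tensor(observation, add_batch_dim=True)  # float32, shape (1, 3, 5)
```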
- polyak_average(net: Module, target_net: Module, tau: float = 0.01) Module
Applies Polyak averaging to incrementally update the target network.
- Parameters:
net – Trained neural network.
target_net – Target neural network.
tau – Update rate.
- Returns:
Target neural network with new weights.
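Polyak averaging is commonly written as target ← tau · net + (1 − tau) · target. The sketch below follows that common convention, which is assumed rather than confirmed to be the one used by polyak_average. The function name `soft_update` is hypothetical.

```python
import torch
from torch import nn


def soft_update(net, target_net, tau=0.01):
    """Sketch of a soft target update: target <- tau * net + (1 - tau) * target."""
    with torch.no_grad():  # the update must not be tracked by autograd
        for param, target_param in zip(net.parameters(), target_net.parameters()):
            target_param.mul_(1.0 - tau).add_(param, alpha=tau)
    return target_net


net = nn.Linear(2, 1)
target_net = nn.Linear(2, 1)
target_net = soft_update(net, target_net, tau=0.01)
```

With a small tau, the target network changes slowly, which is the usual reason for keeping a separate target network.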
- torch_to_numpy(tensor: Tensor, squeeze: float = False) ndarray
Transforms torch tensor to numpy array.
- Parameters:
tensor – Tensor to be transformed.
squeeze – If True, numpy array will be squeezed, eliminating dimensions of size 1.
- Returns:
Numpy array.
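The reverse conversion can be sketched in plain PyTorch as shown below. The helper name `to_numpy` is hypothetical, and the detach and CPU transfer steps are assumptions about what such a conversion typically needs, not a description of torch_to_numpy's internals.

```python
import torch


def to_numpy(tensor, squeeze=False):
    """Hypothetical stand-in illustrating the documented parameters."""
    # Detach from the autograd graph and move to host memory before converting.
    array = tensor.detach().cpu().numpy()
    # Optionally drop every dimension of size 1.
    return array.squeeze() if squeeze else array


action = torch.ones(1, 3, 1, requires_grad=True)
weights = to_numpy(action, squeeze=True)  # shape (3,)
```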