Your First Agent ================ Fist, It is necessary to instantiate an environment object. The environment makes use of a `Pandas `_ dataframe which contains the time series of price of stocks. .. code-block:: python import pandas as pd from rlportfolio.environment import PortfolioOptimizationEnv # dataframe with training data (market price time series) df_train = pd.read_csv("train_data.csv") environment = PortfolioOptimizationEnv( df_train, # data to be used 100000 # initial value of the portfolio ) Now, we instantiate the policy gradient algorithm to generate an agent that actuates in the created environment. Note that, in this step, we also define the policy of actions of the agent. .. code-block:: python from rlportfolio.algorithm import PolicyGradient from rlportfolio.policy import EI3 algorithm = PolicyGradient(environment, policy=EI3) Finally, the algorithm is used to train the agent for 10000 training steps. .. code-block:: python # train the algorithm for 10000 steps algorithm.train(10000) The algorithm also contains an interface which allows the agent to be tested in new market data. .. code-block:: python # dataframe with testing data (market price time series) df_test = pd.read_csv("test_data.csv") environment_test = PortfolioOptimizationEnv( df_test, # data to be used 100000 # initial value of the portfolio ) # test the agent in the test environment algorithm.test(environment_test) The test function will return a Python dictionary containing several performance metrics of the agent.