
Reinforcement Learning with TensorFlow
By :

As we already know, portfolio management is the continuous reallocation of funds across different multiple financial products (assets). In this work, the time is divided into equal length periods, where each period T = 30 minutes. At the beginning of each period, the trading agent reallocates the fund across different assets. The price of an asset fluctuates within a period, but four important price metrics are taken into consideration, which are good enough to characterize the price movement of an asset in the period. These price metrics are as follows:
For a continuous market (such as our test case), the opening price of an asset in a period t is its closing price in the previous period t-1. The portfolio consists of m assets. For a time period t, the closing prices of all the m assets create the price vector
. Thus,
element of
that is
is the closing price of the
asset in that
time period.
Similarly, we have vector...