GitHub Repository: AI4Finance-Foundation/FinRL
Path: blob/master/examples/Stock_NeurIPS2018_call_func_rolling_window_SB3.ipynb

Open In Colab

Deep Reinforcement Learning for Stock Trading from Scratch: Multiple Stock Trading

  • PyTorch Version

Content

Task Description

We train a DRL agent for stock trading. The task is modeled as a Markov Decision Process (MDP), and the objective is to maximize the (expected) cumulative return.
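Written out, with T denoting the length of a trading episode and γ ∈ (0, 1] the discount factor used by the RL algorithms (standard MDP notation, not anything specific to this notebook), the objective is

$$\max_{\pi}\;\mathbb{E}_{\pi}\!\left[\sum_{t=0}^{T-1}\gamma^{t}\, r(s_t, a_t, s_{t+1})\right].$$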

We specify the state-action-reward as follows:

  • State s: The state space represents the agent's perception of the market environment. Just as a human trader analyzes many sources of information, our agent passively observes many features and learns by interacting with the market environment (usually by replaying historical data).

  • Action a: The action space is the set of actions the agent is allowed to take at each state. For example, a ∈ {−1, 0, 1}, where −1, 0, 1 represent selling, holding, and buying one share. When an action operates on multiple shares, a ∈ {−k, ..., −1, 0, 1, ..., k}; e.g., "Buy 10 shares of AAPL" and "Sell 10 shares of AAPL" correspond to a = 10 and a = −10, respectively.

  • Reward function r(s, a, s′): The reward is the incentive for the agent to learn a better policy. For example, it can be the change in portfolio value when taking action a at state s and arriving at the new state s′, i.e., r(s, a, s′) = v′ − v, where v′ and v are the portfolio values at states s′ and s, respectively. A minimal code sketch of this state-action-reward loop is given right after this list.
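To make the loop concrete, here is a minimal, self-contained sketch of a single-stock version of this MDP. The class name SingleStockEnvSketch and its simplifications (one stock, no transaction costs, no technical indicators) are made up for illustration only; the environment actually used in this notebook is FinRL's StockTradingEnv, which applies the same reward definition (change in total asset value) across the 30 stocks.

import numpy as np

class SingleStockEnvSketch:
    """Toy one-stock trading MDP: a is a signed share count, r(s, a, s') = v' - v."""

    def __init__(self, prices, k_max=10, initial_cash=1_000_000.0):
        self.prices = np.asarray(prices, dtype=float)  # daily close prices
        self.k_max = k_max                             # max shares traded per step
        self.initial_cash = initial_cash
        self.reset()

    def reset(self):
        self.t = 0
        self.cash = self.initial_cash
        self.shares = 0
        return self._state()

    def _state(self):
        # state s: whatever the agent observes (here just price, cash, holdings)
        return np.array([self.prices[self.t], self.cash, self.shares])

    def _portfolio_value(self):
        return self.cash + self.shares * self.prices[self.t]

    def step(self, action):
        # action a in {-k_max, ..., k_max}: negative = sell, 0 = hold, positive = buy
        a = int(np.clip(action, -self.k_max, self.k_max))
        a = max(a, -self.shares)                 # cannot sell more shares than held
        price = self.prices[self.t]
        if a > 0:
            a = min(a, int(self.cash // price))  # cannot buy more than cash allows
        v = self._portfolio_value()              # portfolio value v at state s
        self.cash -= a * price
        self.shares += a
        self.t += 1                              # market advances to the next day
        v_next = self._portfolio_value()         # portfolio value v' at state s'
        reward = v_next - v                      # r(s, a, s') = v' - v
        done = self.t == len(self.prices) - 1
        return self._state(), reward, done

env = SingleStockEnvSketch(prices=[100.0, 101.5, 99.8, 102.3])
state = env.reset()
state, reward, done = env.step(10)   # "buy 10 shares of AAPL" is the action a = +10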

Market environment: the 30 constituent stocks of the Dow Jones Industrial Average (DJIA) index, taken as of the starting date of the testing period.

The data for this case study is obtained from the Yahoo Finance API. It contains Open-High-Low-Close (OHLC) prices and volume.
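For reference, the raw OHLCV data can be pulled with FinRL's YahooDownloader (imported in section 1.2 below); the date range in this sketch simply spans the train and trade dates that Part 2 uses.

from finrl.config_tickers import DOW_30_TICKER
from finrl.meta.preprocessor.yahoodownloader import YahooDownloader

# Download daily OHLCV bars for the 30 DJIA constituents from Yahoo Finance.
raw_df = YahooDownloader(
    start_date="2009-01-01",
    end_date="2022-11-01",
    ticker_list=DOW_30_TICKER,
).fetch_data()

print(raw_df.shape)
raw_df.head()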

Part 1. Install Python Packages

1.1. Install packages

## install required packages
!pip install swig
!pip install wrds
!pip install pyportfolioopt

## install finrl library
!pip install -q condacolab
import condacolab
condacolab.install()
!apt-get update -y -qq && apt-get install -y -qq cmake libopenmpi-dev python3-dev zlib1g-dev libgl1-mesa-glx swig
!pip install git+https://github.com/AI4Finance-Foundation/FinRL.git
[Installation log, abridged: pip installs swig, wrds (with psycopg2-binary), and pyportfolioopt; condacolab downloads Mambaforge and restarts the kernel; apt installs the requested system libraries (libgl1-mesa-glx, swig, etc.); finally FinRL 0.3.5 is built from GitHub together with its dependencies, including elegantrl 0.3.6, pyfolio 0.9.2, stable-baselines3 1.7.0, gym 0.21.0, ray 2.3.1, torch 2.0.0, and yfinance 0.2.14. pip closes with its usual warning about running as the root user.]

1.2. Import Packages

from finrl import config
from finrl import config_tickers
from finrl.agents.stablebaselines3.models import DRLAgent
from finrl.config import DATA_SAVE_DIR
from finrl.config import INDICATORS
from finrl.config import RESULTS_DIR
from finrl.config import TENSORBOARD_LOG_DIR
from finrl.config import TEST_END_DATE
from finrl.config import TEST_START_DATE
from finrl.config import TRAINED_MODEL_DIR
from finrl.config_tickers import DOW_30_TICKER
from finrl.main import check_and_make_directories
from finrl.meta.data_processor import DataProcessor
from finrl.meta.data_processors.func import calc_train_trade_data
from finrl.meta.data_processors.func import calc_train_trade_starts_ends_if_rolling
from finrl.meta.data_processors.func import date2str
from finrl.meta.data_processors.func import str2date
from finrl.meta.env_stock_trading.env_stocktrading import StockTradingEnv
from finrl.meta.preprocessor.preprocessors import data_split
from finrl.meta.preprocessor.preprocessors import FeatureEngineer
from finrl.meta.preprocessor.yahoodownloader import YahooDownloader
from finrl.plot import backtest_plot
from finrl.plot import backtest_stats
from finrl.plot import get_baseline
from finrl.plot import get_daily_return
from finrl.plot import plot_return
from finrl.applications.stock_trading.stock_trading_rolling_window import stock_trading_rolling_window

import sys
sys.path.append("../FinRL")
import itertools
/usr/local/lib/python3.9/site-packages/pyfolio/pos.py:26: UserWarning: Module "zipline.assets" not found; multipliers will not be applied to position notionals. warnings.warn(
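As in FinRL's other tutorials, the imported directory constants can be used to create the output folders for saved data, trained models, TensorBoard logs, and backtest results up front. This step is optional here; whether stock_trading_rolling_window creates these folders on its own is not shown in this notebook.

from finrl.config import DATA_SAVE_DIR, TRAINED_MODEL_DIR, TENSORBOARD_LOG_DIR, RESULTS_DIR
from finrl.main import check_and_make_directories

# Create FinRL's standard working directories if they do not already exist.
check_and_make_directories([DATA_SAVE_DIR, TRAINED_MODEL_DIR, TENSORBOARD_LOG_DIR, RESULTS_DIR])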

Part 2. Set Parameters and Run
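Before setting the parameters, it helps to picture what rolling_window_length does: the trading period is split into consecutive chunks of that many trading days, and before each chunk the agents are retrained on the data available up to it (the log below shows a fresh 22-row trade DataFrame being fetched and training restarting at each window). The helper below only illustrates that schedule; its name and its use of calendar business days are assumptions for the sketch, and the real bookkeeping is done inside stock_trading_rolling_window via the imported calc_train_trade_starts_ends_if_rolling.

import pandas as pd

def sketch_rolling_windows(train_start, trade_start, trade_end, window_len):
    """Illustrative only: cut the trading period into chunks of `window_len`
    business days; before each chunk the agent is retrained on data from
    `train_start` up to the chunk's first day. (Calendar business days are
    only an approximation of exchange trading days.)"""
    trade_days = pd.bdate_range(trade_start, trade_end)
    windows = []
    for i in range(0, len(trade_days), window_len):
        chunk = trade_days[i:i + window_len]
        windows.append({"train": (pd.Timestamp(train_start), chunk[0]),
                        "trade": (chunk[0], chunk[-1])})
    return windows

# With the dates used below this yields 4 windows of 22 business days each.
for w in sketch_rolling_windows("2009-01-01", "2022-07-01", "2022-11-01", 22):
    print("train", w["train"][0].date(), "->", w["train"][1].date(),
          "| trade", w["trade"][0].date(), "->", w["trade"][1].date())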

train_start_date = "2009-01-01" train_end_date = "2022-07-01" trade_start_date = "2022-07-01" trade_end_date = "2022-11-01" rolling_window_length = 22 # num of trading days in a rolling window if_store_actions = True if_store_result = True if_using_a2c = True if_using_ddpg = True if_using_ppo = True if_using_sac = True if_using_td3 = True stock_trading_rolling_window( train_start_date, train_end_date, trade_start_date, trade_end_date, rolling_window_length, if_store_actions=if_store_actions, if_using_a2c=if_using_a2c, if_store_result=if_store_result, if_using_ddpg=if_using_ddpg, if_using_ppo=if_using_ppo, if_using_sac=if_using_sac, if_using_td3=if_using_td3, )
流式输出内容被截断,只能显示最后 5000 行内容。 | std | 1.02 | | value_loss | 52.5 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 64 | | iterations | 10 | | time_elapsed | 316 | | total_timesteps | 20480 | | train/ | | | approx_kl | 0.018726377 | | clip_fraction | 0.228 | | clip_range | 0.2 | | entropy_loss | -41.6 | | explained_variance | -0.00599 | | learning_rate | 0.00025 | | loss | 10 | | n_updates | 90 | | policy_gradient_loss | -0.0229 | | reward | 1.8604985 | | std | 1.02 | | value_loss | 34 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 64 | | iterations | 11 | | time_elapsed | 350 | | total_timesteps | 22528 | | train/ | | | approx_kl | 0.017771121 | | clip_fraction | 0.201 | | clip_range | 0.2 | | entropy_loss | -41.7 | | explained_variance | -0.00452 | | learning_rate | 0.00025 | | loss | 102 | | n_updates | 100 | | policy_gradient_loss | -0.0176 | | reward | 2.4363315 | | std | 1.02 | | value_loss | 257 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 64 | | iterations | 12 | | time_elapsed | 380 | | total_timesteps | 24576 | | train/ | | | approx_kl | 0.021592125 | | clip_fraction | 0.24 | | clip_range | 0.2 | | entropy_loss | -41.7 | | explained_variance | -0.00462 | | learning_rate | 0.00025 | | loss | 13.1 | | n_updates | 110 | | policy_gradient_loss | -0.0218 | | reward | -0.36686477 | | std | 1.02 | | value_loss | 27.8 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 63 | | iterations | 13 | | time_elapsed | 417 | | total_timesteps | 26624 | | train/ | | | approx_kl | 0.016095877 | | clip_fraction | 0.171 | | clip_range | 0.2 | | entropy_loss | -41.8 | | explained_variance | 0.00607 | | learning_rate | 0.00025 | | loss | 63.8 | | n_updates | 120 | | policy_gradient_loss | -0.0175 | | reward | -6.2590113 | | std | 1.02 | | value_loss | 161 | ----------------------------------------- ---------------------------------------- | time/ | | | fps | 64 | | iterations | 14 | | time_elapsed | 447 | | total_timesteps | 28672 | | train/ | | | approx_kl | 0.02099569 | | clip_fraction | 0.204 | | clip_range | 0.2 | | entropy_loss | -41.9 | | explained_variance | 0.00587 | | learning_rate | 0.00025 | | loss | 18.1 | | n_updates | 130 | | policy_gradient_loss | -0.0176 | | reward | -1.5635415 | | std | 1.03 | | value_loss | 76.1 | ---------------------------------------- day: 3374, episode: 10 begin_total_asset: 1017321.61 end_total_asset: 4690150.25 total_reward: 3672828.63 total_cost: 440655.06 total_trades: 91574 Sharpe: 0.777 ================================= ------------------------------------------ | time/ | | | fps | 63 | | iterations | 15 | | time_elapsed | 480 | | total_timesteps | 30720 | | train/ | | | approx_kl | 0.01574407 | | clip_fraction | 0.252 | | clip_range | 0.2 | | entropy_loss | -41.9 | | explained_variance | 0.045 | | learning_rate | 0.00025 | | loss | 8.21 | | n_updates | 140 | | policy_gradient_loss | -0.0207 | | reward | -0.058135245 | | std | 1.03 | | value_loss | 20 | ------------------------------------------ ----------------------------------------- | time/ | | | fps | 64 | | iterations | 16 | | time_elapsed | 511 | | total_timesteps | 32768 | | train/ | | | approx_kl | 0.018864237 | | clip_fraction | 0.19 | | clip_range | 0.2 | | entropy_loss | -42 | | explained_variance | -0.0334 | | learning_rate | 0.00025 | | loss | 40.5 | | n_updates | 150 | | 
policy_gradient_loss | -0.0158 | | reward | 2.1892703 | | std | 1.03 | | value_loss | 80.4 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 64 | | iterations | 17 | | time_elapsed | 542 | | total_timesteps | 34816 | | train/ | | | approx_kl | 0.025924759 | | clip_fraction | 0.183 | | clip_range | 0.2 | | entropy_loss | -42 | | explained_variance | -0.0494 | | learning_rate | 0.00025 | | loss | 8.64 | | n_updates | 160 | | policy_gradient_loss | -0.0154 | | reward | -1.6194284 | | std | 1.03 | | value_loss | 19.1 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 63 | | iterations | 18 | | time_elapsed | 576 | | total_timesteps | 36864 | | train/ | | | approx_kl | 0.023486339 | | clip_fraction | 0.227 | | clip_range | 0.2 | | entropy_loss | -42 | | explained_variance | -0.00164 | | learning_rate | 0.00025 | | loss | 71 | | n_updates | 170 | | policy_gradient_loss | -0.0128 | | reward | -6.5787015 | | std | 1.03 | | value_loss | 175 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 63 | | iterations | 19 | | time_elapsed | 609 | | total_timesteps | 38912 | | train/ | | | approx_kl | 0.047546946 | | clip_fraction | 0.278 | | clip_range | 0.2 | | entropy_loss | -42 | | explained_variance | 0.0083 | | learning_rate | 0.00025 | | loss | 22.2 | | n_updates | 180 | | policy_gradient_loss | -0.00743 | | reward | 3.6853487 | | std | 1.03 | | value_loss | 88.3 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 63 | | iterations | 20 | | time_elapsed | 643 | | total_timesteps | 40960 | | train/ | | | approx_kl | 0.028585846 | | clip_fraction | 0.238 | | clip_range | 0.2 | | entropy_loss | -42.1 | | explained_variance | -0.018 | | learning_rate | 0.00025 | | loss | 12.6 | | n_updates | 190 | | policy_gradient_loss | -0.0166 | | reward | 2.84366 | | std | 1.03 | | value_loss | 35.7 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 63 | | iterations | 21 | | time_elapsed | 672 | | total_timesteps | 43008 | | train/ | | | approx_kl | 0.021615773 | | clip_fraction | 0.283 | | clip_range | 0.2 | | entropy_loss | -42.1 | | explained_variance | 0.0164 | | learning_rate | 0.00025 | | loss | 39.1 | | n_updates | 200 | | policy_gradient_loss | -0.0119 | | reward | 7.260352 | | std | 1.04 | | value_loss | 85.5 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 64 | | iterations | 22 | | time_elapsed | 703 | | total_timesteps | 45056 | | train/ | | | approx_kl | 0.023984132 | | clip_fraction | 0.174 | | clip_range | 0.2 | | entropy_loss | -42.2 | | explained_variance | -0.0214 | | learning_rate | 0.00025 | | loss | 10.8 | | n_updates | 210 | | policy_gradient_loss | -0.015 | | reward | 0.7453349 | | std | 1.04 | | value_loss | 27.4 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 63 | | iterations | 23 | | time_elapsed | 736 | | total_timesteps | 47104 | | train/ | | | approx_kl | 0.026311198 | | clip_fraction | 0.239 | | clip_range | 0.2 | | entropy_loss | -42.2 | | explained_variance | 0.0117 | | learning_rate | 0.00025 | | loss | 53.5 | | n_updates | 220 | | policy_gradient_loss | -0.0147 | | reward | -3.601917 | | std | 1.04 | | value_loss | 109 | ----------------------------------------- 
----------------------------------------- | time/ | | | fps | 64 | | iterations | 24 | | time_elapsed | 765 | | total_timesteps | 49152 | | train/ | | | approx_kl | 0.021329464 | | clip_fraction | 0.228 | | clip_range | 0.2 | | entropy_loss | -42.2 | | explained_variance | 0.0287 | | learning_rate | 0.00025 | | loss | 35.5 | | n_updates | 230 | | policy_gradient_loss | -0.0174 | | reward | -1.4932549 | | std | 1.04 | | value_loss | 69.7 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 64 | | iterations | 25 | | time_elapsed | 799 | | total_timesteps | 51200 | | train/ | | | approx_kl | 0.033834375 | | clip_fraction | 0.347 | | clip_range | 0.2 | | entropy_loss | -42.2 | | explained_variance | -0.0439 | | learning_rate | 0.00025 | | loss | 11.2 | | n_updates | 240 | | policy_gradient_loss | -0.0175 | | reward | -0.13293022 | | std | 1.04 | | value_loss | 31.1 | ----------------------------------------- {'batch_size': 128, 'buffer_size': 100000, 'learning_rate': 0.0001, 'learning_starts': 100, 'ent_coef': 'auto_0.1'} Using cpu device Logging to results/sac ----------------------------------- | time/ | | | episodes | 4 | | fps | 19 | | time_elapsed | 693 | | total_timesteps | 13500 | | train/ | | | actor_loss | 1.23e+03 | | critic_loss | 941 | | ent_coef | 0.175 | | ent_coef_loss | -80.9 | | learning_rate | 0.0001 | | n_updates | 13399 | | reward | -4.1185117 | ----------------------------------- ----------------------------------- | time/ | | | episodes | 8 | | fps | 19 | | time_elapsed | 1407 | | total_timesteps | 27000 | | train/ | | | actor_loss | 486 | | critic_loss | 378 | | ent_coef | 0.047 | | ent_coef_loss | -97.4 | | learning_rate | 0.0001 | | n_updates | 26899 | | reward | -6.0287046 | ----------------------------------- day: 3374, episode: 10 begin_total_asset: 1039580.61 end_total_asset: 4449383.64 total_reward: 3409803.03 total_cost: 3171.06 total_trades: 48397 Sharpe: 0.687 ================================= ----------------------------------- | time/ | | | episodes | 12 | | fps | 19 | | time_elapsed | 2123 | | total_timesteps | 40500 | | train/ | | | actor_loss | 201 | | critic_loss | 10.8 | | ent_coef | 0.0131 | | ent_coef_loss | -63.2 | | learning_rate | 0.0001 | | n_updates | 40399 | | reward | -5.6925883 | ----------------------------------- {'batch_size': 100, 'buffer_size': 1000000, 'learning_rate': 0.001} Using cpu device Logging to results/td3 ----------------------------------- | time/ | | | episodes | 4 | | fps | 24 | | time_elapsed | 545 | | total_timesteps | 13500 | | train/ | | | actor_loss | 16.7 | | critic_loss | 341 | | learning_rate | 0.001 | | n_updates | 10125 | | reward | -5.7216434 | ----------------------------------- ----------------------------------- | time/ | | | episodes | 8 | | fps | 21 | | time_elapsed | 1228 | | total_timesteps | 27000 | | train/ | | | actor_loss | 18.7 | | critic_loss | 21.4 | | learning_rate | 0.001 | | n_updates | 23625 | | reward | -5.7216434 | ----------------------------------- day: 3374, episode: 10 begin_total_asset: 1043903.24 end_total_asset: 5291054.90 total_reward: 4247151.66 total_cost: 1042.86 total_trades: 64106 Sharpe: 0.723 ================================= ----------------------------------- | time/ | | | episodes | 12 | | fps | 21 | | time_elapsed | 1923 | | total_timesteps | 40500 | | train/ | | | actor_loss | 20.6 | | critic_loss | 15.9 | | learning_rate | 0.001 | | n_updates | 37125 | | reward | -5.7216434 | ----------------------------------- hit end! 
hit end!
hit end!
hit end!
hit end!
hit end!
[*********************100%***********************]  1 of 1 completed
Shape of DataFrame:  (22, 8)
i: 2
{'n_steps': 5, 'ent_coef': 0.01, 'learning_rate': 0.0007}
Using cpu device
Logging to results/a2c
[A2C training log, iterations 100-10,000 (500-50,000 total timesteps, ~55-65 fps): entropy_loss drifting from -41 to -41.9, std ~1.00-1.03, explained_variance near 0, policy_loss and value_loss fluctuating without a clear trend]
day: 3352, episode: 10
begin_total_asset: 1005653.74
end_total_asset: 8785262.38
total_reward: 7779608.65
total_cost: 31168.83
total_trades: 45054
Sharpe: 0.834
=================================
{'batch_size': 128, 'buffer_size': 50000, 'learning_rate': 0.001}
Using cpu device
Logging to results/ddpg
[DDPG training log, episodes 4-12 (13,412-40,236 total timesteps, ~20-24 fps): actor_loss 12.5 -> -8.67, critic_loss 295 -> 7.57]
day: 3352, episode: 10
begin_total_asset: 1011382.29
end_total_asset: 6058882.63
total_reward: 5047500.34
total_cost: 1010.37
total_trades: 46928
Sharpe: 0.807
=================================
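Each "Sharpe" value in the episode summaries is an annualized Sharpe ratio computed from the daily changes of the account value over that training episode. Roughly, assuming 252 trading days per year and a zero risk-free rate (a sketch, not the environment's exact code):

import pandas as pd

def episode_sharpe(account_value: pd.Series) -> float:
    """Annualized Sharpe ratio of a daily account-value series (risk-free rate taken as 0)."""
    daily_return = account_value.pct_change(1).dropna()
    return (252 ** 0.5) * daily_return.mean() / daily_return.std()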
{'n_steps': 2048, 'ent_coef': 0.01, 'learning_rate': 0.00025, 'batch_size': 128}
Using cpu device
Logging to results/ppo
[PPO training log, iterations 1-25 (2,048-51,200 total timesteps, ~60-63 fps): approx_kl 0.014-0.028, clip_fraction 0.09-0.28, entropy_loss -41.2 -> -42.1, std 1.00 -> 1.04]
day: 3352, episode: 10
begin_total_asset: 988584.72
end_total_asset: 3416710.65
total_reward: 2428125.94
total_cost: 420148.99
total_trades: 89136
Sharpe: 0.598
=================================
{'batch_size': 128, 'buffer_size': 100000, 'learning_rate': 0.0001, 'learning_starts': 100, 'ent_coef': 'auto_0.1'}
Using cpu device
Logging to results/sac
[SAC training log, episodes 4-12 (13,412-40,236 total timesteps, ~18-19 fps): actor_loss 1.68e+03 -> 304, critic_loss 1e+04 -> 20.1, ent_coef 0.309 -> 0.0227]
day: 3352, episode: 10
begin_total_asset: 1005927.23
end_total_asset: 5294689.46
total_reward: 4288762.22
total_cost: 37988.65
total_trades: 61507
Sharpe: 0.700
=================================
{'batch_size': 100, 'buffer_size': 1000000, 'learning_rate': 0.001}
Using cpu device
Logging to results/td3
[TD3 training log, episodes 4-12 (13,412-40,236 total timesteps, ~20-24 fps): actor_loss 132 -> 37.3, critic_loss 6.19e+03 -> 101]
day: 3352, episode: 10
begin_total_asset: 1012427.98
end_total_asset: 5866237.13
total_reward: 4853809.15
total_cost: 1011.41
total_trades: 53632
Sharpe: 0.831
=================================
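Below, the rolling-window driver reaches the end of the current data slice ("hit end!"), downloads a fresh 22-row block, advances the counter to i: 3, and retrains all five agents (A2C, DDPG, PPO, SAC, TD3) on the extended history. A minimal sketch of how such windows can be cut from a long-format price DataFrame with a 'date' column; the helper name and window bounds are illustrative, not the notebook's actual function:

import pandas as pd

def rolling_windows(df: pd.DataFrame, first_trade_date: str, trade_days: int = 22):
    """Yield (train_df, trade_df) pairs: an expanding training slice plus the next
    `trade_days` trading dates, mirroring the i: 1, 2, 3, ... windows above."""
    dates = sorted(d for d in df["date"].unique() if d >= first_trade_date)
    for start in range(0, len(dates) - trade_days + 1, trade_days):
        trade_dates = dates[start:start + trade_days]
        train_df = df[df["date"] < trade_dates[0]]   # everything before the trade window
        trade_df = df[df["date"].isin(trade_dates)]  # the next 22 trading days
        yield train_df, trade_df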
hit end!
hit end!
hit end!
hit end!
hit end!
[*********************100%***********************]  1 of 1 completed
Shape of DataFrame:  (22, 8)
i: 3
{'n_steps': 5, 'ent_coef': 0.01, 'learning_rate': 0.0007}
Using cpu device
Logging to results/a2c
[A2C training log, iterations 100-3,600 (500-18,000 total timesteps, ~46-58 fps): entropy_loss ~ -41.2 to -41.8, std ~1.00-1.02, explained_variance near 0]
------------------------------------ | time/ | | | fps
| 58 | | iterations | 3700 | | time_elapsed | 314 | | total_timesteps | 18500 | | train/ | | | entropy_loss | -41.8 | | explained_variance | 1.19e-07 | | learning_rate | 0.0007 | | n_updates | 3699 | | policy_loss | -16.7 | | reward | 1.8698422 | | std | 1.02 | | value_loss | 0.374 | ------------------------------------- -------------------------------------- | time/ | | | fps | 58 | | iterations | 3800 | | time_elapsed | 323 | | total_timesteps | 19000 | | train/ | | | entropy_loss | -41.8 | | explained_variance | 1.19e-07 | | learning_rate | 0.0007 | | n_updates | 3799 | | policy_loss | 166 | | reward | -1.3664656 | | std | 1.02 | | value_loss | 19.5 | -------------------------------------- -------------------------------------- | time/ | | | fps | 58 | | iterations | 3900 | | time_elapsed | 332 | | total_timesteps | 19500 | | train/ | | | entropy_loss | -41.7 | | explained_variance | 0 | | learning_rate | 0.0007 | | n_updates | 3899 | | policy_loss | 43.9 | | reward | -1.1592114 | | std | 1.02 | | value_loss | 2.46 | -------------------------------------- ------------------------------------ | time/ | | | fps | 58 | | iterations | 4000 | | time_elapsed | 339 | | total_timesteps | 20000 | | train/ | | | entropy_loss | -41.8 | | explained_variance | 0 | | learning_rate | 0.0007 | | n_updates | 3999 | | policy_loss | -31.6 | | reward | 1.018338 | | std | 1.02 | | value_loss | 0.683 | ------------------------------------ -------------------------------------- | time/ | | | fps | 58 | | iterations | 4100 | | time_elapsed | 348 | | total_timesteps | 20500 | | train/ | | | entropy_loss | -41.8 | | explained_variance | 0 | | learning_rate | 0.0007 | | n_updates | 4099 | | policy_loss | -21.4 | | reward | 0.26098472 | | std | 1.02 | | value_loss | 0.295 | -------------------------------------- ------------------------------------- | time/ | | | fps | 58 | | iterations | 4200 | | time_elapsed | 356 | | total_timesteps | 21000 | | train/ | | | entropy_loss | -41.8 | | explained_variance | 5.96e-08 | | learning_rate | 0.0007 | | n_updates | 4199 | | policy_loss | 37.3 | | reward | 2.0496662 | | std | 1.02 | | value_loss | 1.24 | ------------------------------------- ------------------------------------- | time/ | | | fps | 59 | | iterations | 4300 | | time_elapsed | 362 | | total_timesteps | 21500 | | train/ | | | entropy_loss | -41.9 | | explained_variance | 5.96e-08 | | learning_rate | 0.0007 | | n_updates | 4299 | | policy_loss | 21.7 | | reward | 0.5919729 | | std | 1.03 | | value_loss | 0.614 | ------------------------------------- --------------------------------------- | time/ | | | fps | 58 | | iterations | 4400 | | time_elapsed | 373 | | total_timesteps | 22000 | | train/ | | | entropy_loss | -41.9 | | explained_variance | 0 | | learning_rate | 0.0007 | | n_updates | 4399 | | policy_loss | -59.5 | | reward | -0.44648832 | | std | 1.03 | | value_loss | 2.35 | --------------------------------------- -------------------------------------- | time/ | | | fps | 59 | | iterations | 4500 | | time_elapsed | 380 | | total_timesteps | 22500 | | train/ | | | entropy_loss | -41.9 | | explained_variance | -1.19e-07 | | learning_rate | 0.0007 | | n_updates | 4499 | | policy_loss | 75.7 | | reward | -1.7295737 | | std | 1.03 | | value_loss | 5.7 | -------------------------------------- ------------------------------------ | time/ | | | fps | 59 | | iterations | 4600 | | time_elapsed | 387 | | total_timesteps | 23000 | | train/ | | | entropy_loss | -41.9 | | explained_variance | 0 | | learning_rate | 
0.0007 | | n_updates | 4599 | | policy_loss | -194 | | reward | -2.21535 | | std | 1.03 | | value_loss | 37.3 | ------------------------------------ -------------------------------------- | time/ | | | fps | 58 | | iterations | 4700 | | time_elapsed | 398 | | total_timesteps | 23500 | | train/ | | | entropy_loss | -42 | | explained_variance | -0.0141 | | learning_rate | 0.0007 | | n_updates | 4699 | | policy_loss | -32.7 | | reward | 0.16243774 | | std | 1.03 | | value_loss | 1.75 | -------------------------------------- ------------------------------------ | time/ | | | fps | 59 | | iterations | 4800 | | time_elapsed | 404 | | total_timesteps | 24000 | | train/ | | | entropy_loss | -42 | | explained_variance | 0.168 | | learning_rate | 0.0007 | | n_updates | 4799 | | policy_loss | -61.7 | | reward | 0.961177 | | std | 1.03 | | value_loss | 2.67 | ------------------------------------ ------------------------------------ | time/ | | | fps | 59 | | iterations | 4900 | | time_elapsed | 412 | | total_timesteps | 24500 | | train/ | | | entropy_loss | -42 | | explained_variance | 0 | | learning_rate | 0.0007 | | n_updates | 4899 | | policy_loss | -54.1 | | reward | 3.000443 | | std | 1.03 | | value_loss | 2.52 | ------------------------------------ ------------------------------------- | time/ | | | fps | 59 | | iterations | 5000 | | time_elapsed | 422 | | total_timesteps | 25000 | | train/ | | | entropy_loss | -42 | | explained_variance | 0 | | learning_rate | 0.0007 | | n_updates | 4999 | | policy_loss | 75.7 | | reward | 0.7883484 | | std | 1.03 | | value_loss | 6.61 | ------------------------------------- --------------------------------------- | time/ | | | fps | 59 | | iterations | 5100 | | time_elapsed | 428 | | total_timesteps | 25500 | | train/ | | | entropy_loss | -42.1 | | explained_variance | 0 | | learning_rate | 0.0007 | | n_updates | 5099 | | policy_loss | 237 | | reward | -0.49083808 | | std | 1.03 | | value_loss | 39.1 | --------------------------------------- ------------------------------------- | time/ | | | fps | 59 | | iterations | 5200 | | time_elapsed | 437 | | total_timesteps | 26000 | | train/ | | | entropy_loss | -42.1 | | explained_variance | 0 | | learning_rate | 0.0007 | | n_updates | 5199 | | policy_loss | 152 | | reward | 2.7196112 | | std | 1.03 | | value_loss | 16.1 | ------------------------------------- -------------------------------------- | time/ | | | fps | 59 | | iterations | 5300 | | time_elapsed | 445 | | total_timesteps | 26500 | | train/ | | | entropy_loss | -42.1 | | explained_variance | -1.19e-07 | | learning_rate | 0.0007 | | n_updates | 5299 | | policy_loss | -317 | | reward | 0.59174556 | | std | 1.03 | | value_loss | 63.7 | -------------------------------------- -------------------------------------- | time/ | | | fps | 59 | | iterations | 5400 | | time_elapsed | 452 | | total_timesteps | 27000 | | train/ | | | entropy_loss | -42.1 | | explained_variance | 0 | | learning_rate | 0.0007 | | n_updates | 5399 | | policy_loss | -126 | | reward | 0.06384493 | | std | 1.03 | | value_loss | 9.43 | -------------------------------------- -------------------------------------- | time/ | | | fps | 59 | | iterations | 5500 | | time_elapsed | 461 | | total_timesteps | 27500 | | train/ | | | entropy_loss | -42.1 | | explained_variance | 1.19e-07 | | learning_rate | 0.0007 | | n_updates | 5499 | | policy_loss | -11.3 | | reward | -1.1629822 | | std | 1.03 | | value_loss | 0.213 | -------------------------------------- ------------------------------------ | 
time/ | | | fps | 59 | | iterations | 5600 | | time_elapsed | 469 | | total_timesteps | 28000 | | train/ | | | entropy_loss | -42.1 | | explained_variance | 0 | | learning_rate | 0.0007 | | n_updates | 5599 | | policy_loss | 91 | | reward | 1.35537 | | std | 1.03 | | value_loss | 5.83 | ------------------------------------ ------------------------------------- | time/ | | | fps | 59 | | iterations | 5700 | | time_elapsed | 476 | | total_timesteps | 28500 | | train/ | | | entropy_loss | -42.1 | | explained_variance | 5.96e-08 | | learning_rate | 0.0007 | | n_updates | 5699 | | policy_loss | -18.6 | | reward | -2.177703 | | std | 1.03 | | value_loss | 0.358 | ------------------------------------- -------------------------------------- | time/ | | | fps | 59 | | iterations | 5800 | | time_elapsed | 487 | | total_timesteps | 29000 | | train/ | | | entropy_loss | -42 | | explained_variance | -1.19e-07 | | learning_rate | 0.0007 | | n_updates | 5799 | | policy_loss | -36.6 | | reward | -2.1937134 | | std | 1.03 | | value_loss | 2.54 | -------------------------------------- -------------------------------------- | time/ | | | fps | 59 | | iterations | 5900 | | time_elapsed | 497 | | total_timesteps | 29500 | | train/ | | | entropy_loss | -42 | | explained_variance | 1.19e-07 | | learning_rate | 0.0007 | | n_updates | 5899 | | policy_loss | -94.3 | | reward | -1.7350562 | | std | 1.03 | | value_loss | 7.48 | -------------------------------------- day: 3330, episode: 10 begin_total_asset: 952508.66 end_total_asset: 4088694.53 total_reward: 3136185.87 total_cost: 3157.22 total_trades: 58734 Sharpe: 0.733 ================================= ------------------------------------- | time/ | | | fps | 59 | | iterations | 6000 | | time_elapsed | 507 | | total_timesteps | 30000 | | train/ | | | entropy_loss | -42 | | explained_variance | 0 | | learning_rate | 0.0007 | | n_updates | 5999 | | policy_loss | 9.33 | | reward | 1.7072018 | | std | 1.03 | | value_loss | 0.168 | ------------------------------------- -------------------------------------- | time/ | | | fps | 59 | | iterations | 6100 | | time_elapsed | 515 | | total_timesteps | 30500 | | train/ | | | entropy_loss | -42 | | explained_variance | 0.137 | | learning_rate | 0.0007 | | n_updates | 6099 | | policy_loss | 86.1 | | reward | 0.23781453 | | std | 1.03 | | value_loss | 5.84 | -------------------------------------- --------------------------------------- | time/ | | | fps | 59 | | iterations | 6200 | | time_elapsed | 522 | | total_timesteps | 31000 | | train/ | | | entropy_loss | -42 | | explained_variance | 1.19e-07 | | learning_rate | 0.0007 | | n_updates | 6199 | | policy_loss | 81.6 | | reward | -0.55448675 | | std | 1.03 | | value_loss | 4.51 | --------------------------------------- -------------------------------------- | time/ | | | fps | 59 | | iterations | 6300 | | time_elapsed | 532 | | total_timesteps | 31500 | | train/ | | | entropy_loss | -42.1 | | explained_variance | 0 | | learning_rate | 0.0007 | | n_updates | 6299 | | policy_loss | -123 | | reward | 0.53070265 | | std | 1.03 | | value_loss | 10.1 | -------------------------------------- -------------------------------------- | time/ | | | fps | 59 | | iterations | 6400 | | time_elapsed | 539 | | total_timesteps | 32000 | | train/ | | | entropy_loss | -42.2 | | explained_variance | 0 | | learning_rate | 0.0007 | | n_updates | 6399 | | policy_loss | -35.1 | | reward | -0.7190698 | | std | 1.04 | | value_loss | 0.746 | -------------------------------------- 
--------------------------------------- | time/ | | | fps | 59 | | iterations | 6500 | | time_elapsed | 547 | | total_timesteps | 32500 | | train/ | | | entropy_loss | -42.1 | | explained_variance | -1.19e-07 | | learning_rate | 0.0007 | | n_updates | 6499 | | policy_loss | -195 | | reward | -0.20805828 | | std | 1.04 | | value_loss | 24.3 | --------------------------------------- ------------------------------------- | time/ | | | fps | 59 | | iterations | 6600 | | time_elapsed | 557 | | total_timesteps | 33000 | | train/ | | | entropy_loss | -42.1 | | explained_variance | 0.0285 | | learning_rate | 0.0007 | | n_updates | 6599 | | policy_loss | -113 | | reward | -2.668644 | | std | 1.04 | | value_loss | 13.9 | ------------------------------------- ---------------------------------------- | time/ | | | fps | 59 | | iterations | 6700 | | time_elapsed | 563 | | total_timesteps | 33500 | | train/ | | | entropy_loss | -42.2 | | explained_variance | -0.603 | | learning_rate | 0.0007 | | n_updates | 6699 | | policy_loss | -39.6 | | reward | -0.083356254 | | std | 1.04 | | value_loss | 0.818 | ---------------------------------------- ------------------------------------- | time/ | | | fps | 59 | | iterations | 6800 | | time_elapsed | 572 | | total_timesteps | 34000 | | train/ | | | entropy_loss | -42.2 | | explained_variance | 0.0184 | | learning_rate | 0.0007 | | n_updates | 6799 | | policy_loss | 86.5 | | reward | 0.6618178 | | std | 1.04 | | value_loss | 5.35 | ------------------------------------- --------------------------------------- | time/ | | | fps | 59 | | iterations | 6900 | | time_elapsed | 581 | | total_timesteps | 34500 | | train/ | | | entropy_loss | -42.2 | | explained_variance | 0 | | learning_rate | 0.0007 | | n_updates | 6899 | | policy_loss | 56.7 | | reward | 0.052872755 | | std | 1.04 | | value_loss | 2.85 | --------------------------------------- ------------------------------------- | time/ | | | fps | 59 | | iterations | 7000 | | time_elapsed | 587 | | total_timesteps | 35000 | | train/ | | | entropy_loss | -42.3 | | explained_variance | 0 | | learning_rate | 0.0007 | | n_updates | 6999 | | policy_loss | 197 | | reward | 1.6442178 | | std | 1.04 | | value_loss | 26.3 | ------------------------------------- --------------------------------------- | time/ | | | fps | 59 | | iterations | 7100 | | time_elapsed | 597 | | total_timesteps | 35500 | | train/ | | | entropy_loss | -42.3 | | explained_variance | -0.0238 | | learning_rate | 0.0007 | | n_updates | 7099 | | policy_loss | 39.7 | | reward | -0.16224274 | | std | 1.04 | | value_loss | 1.43 | --------------------------------------- -------------------------------------- | time/ | | | fps | 59 | | iterations | 7200 | | time_elapsed | 605 | | total_timesteps | 36000 | | train/ | | | entropy_loss | -42.3 | | explained_variance | 0.03 | | learning_rate | 0.0007 | | n_updates | 7199 | | policy_loss | 139 | | reward | -0.1674491 | | std | 1.04 | | value_loss | 11.7 | -------------------------------------- ------------------------------------- | time/ | | | fps | 59 | | iterations | 7300 | | time_elapsed | 611 | | total_timesteps | 36500 | | train/ | | | entropy_loss | -42.4 | | explained_variance | -0.0288 | | learning_rate | 0.0007 | | n_updates | 7299 | | policy_loss | -406 | | reward | 2.2645469 | | std | 1.04 | | value_loss | 134 | ------------------------------------- -------------------------------------- | time/ | | | fps | 59 | | iterations | 7400 | | time_elapsed | 622 | | total_timesteps | 37000 | | train/ | | | 
entropy_loss | -42.4 | | explained_variance | 0.0351 | | learning_rate | 0.0007 | | n_updates | 7399 | | policy_loss | 73.6 | | reward | 0.30078474 | | std | 1.04 | | value_loss | 3.6 | -------------------------------------- -------------------------------------- | time/ | | | fps | 59 | | iterations | 7500 | | time_elapsed | 629 | | total_timesteps | 37500 | | train/ | | | entropy_loss | -42.3 | | explained_variance | 5.96e-08 | | learning_rate | 0.0007 | | n_updates | 7499 | | policy_loss | -8.08 | | reward | -0.3665664 | | std | 1.04 | | value_loss | 0.214 | -------------------------------------- --------------------------------------- | time/ | | | fps | 59 | | iterations | 7600 | | time_elapsed | 636 | | total_timesteps | 38000 | | train/ | | | entropy_loss | -42.4 | | explained_variance | -1.19e-07 | | learning_rate | 0.0007 | | n_updates | 7599 | | policy_loss | 42.9 | | reward | -0.79383886 | | std | 1.04 | | value_loss | 1.9 | --------------------------------------- ---------------------------------------- | time/ | | | fps | 59 | | iterations | 7700 | | time_elapsed | 647 | | total_timesteps | 38500 | | train/ | | | entropy_loss | -42.4 | | explained_variance | -1.19e-07 | | learning_rate | 0.0007 | | n_updates | 7699 | | policy_loss | -104 | | reward | -0.073217735 | | std | 1.05 | | value_loss | 10.3 | ---------------------------------------- -------------------------------------- | time/ | | | fps | 59 | | iterations | 7800 | | time_elapsed | 653 | | total_timesteps | 39000 | | train/ | | | entropy_loss | -42.4 | | explained_variance | 0 | | learning_rate | 0.0007 | | n_updates | 7799 | | policy_loss | -152 | | reward | -1.8329335 | | std | 1.05 | | value_loss | 16.7 | -------------------------------------- -------------------------------------- | time/ | | | fps | 59 | | iterations | 7900 | | time_elapsed | 661 | | total_timesteps | 39500 | | train/ | | | entropy_loss | -42.4 | | explained_variance | -1.19e-07 | | learning_rate | 0.0007 | | n_updates | 7899 | | policy_loss | 144 | | reward | -0.8008484 | | std | 1.05 | | value_loss | 15.8 | -------------------------------------- ---------------------------------------- | time/ | | | fps | 59 | | iterations | 8000 | | time_elapsed | 671 | | total_timesteps | 40000 | | train/ | | | entropy_loss | -42.3 | | explained_variance | 5.96e-08 | | learning_rate | 0.0007 | | n_updates | 7999 | | policy_loss | -8.53 | | reward | -0.031915538 | | std | 1.04 | | value_loss | 0.0835 | ---------------------------------------- ------------------------------------- | time/ | | | fps | 59 | | iterations | 8100 | | time_elapsed | 677 | | total_timesteps | 40500 | | train/ | | | entropy_loss | -42.4 | | explained_variance | 5.96e-08 | | learning_rate | 0.0007 | | n_updates | 8099 | | policy_loss | -69.3 | | reward | 0.8095603 | | std | 1.04 | | value_loss | 3.08 | ------------------------------------- -------------------------------------- | time/ | | | fps | 59 | | iterations | 8200 | | time_elapsed | 686 | | total_timesteps | 41000 | | train/ | | | entropy_loss | -42.3 | | explained_variance | 0 | | learning_rate | 0.0007 | | n_updates | 8199 | | policy_loss | -5.2 | | reward | -0.5655167 | | std | 1.04 | | value_loss | 0.69 | -------------------------------------- -------------------------------------- | time/ | | | fps | 59 | | iterations | 8300 | | time_elapsed | 695 | | total_timesteps | 41500 | | train/ | | | entropy_loss | -42.4 | | explained_variance | 0 | | learning_rate | 0.0007 | | n_updates | 8299 | | policy_loss | -29.8 | | reward | 
-1.5929188 | | std | 1.04 | | value_loss | 0.672 | -------------------------------------- --------------------------------------- | time/ | | | fps | 59 | | iterations | 8400 | | time_elapsed | 701 | | total_timesteps | 42000 | | train/ | | | entropy_loss | -42.4 | | explained_variance | 0 | | learning_rate | 0.0007 | | n_updates | 8399 | | policy_loss | -54.2 | | reward | -0.53150016 | | std | 1.05 | | value_loss | 9.5 | --------------------------------------- ------------------------------------- | time/ | | | fps | 59 | | iterations | 8500 | | time_elapsed | 711 | | total_timesteps | 42500 | | train/ | | | entropy_loss | -42.5 | | explained_variance | 0 | | learning_rate | 0.0007 | | n_updates | 8499 | | policy_loss | 237 | | reward | 2.7706447 | | std | 1.05 | | value_loss | 42.1 | ------------------------------------- ------------------------------------- | time/ | | | fps | 59 | | iterations | 8600 | | time_elapsed | 719 | | total_timesteps | 43000 | | train/ | | | entropy_loss | -42.5 | | explained_variance | 0 | | learning_rate | 0.0007 | | n_updates | 8599 | | policy_loss | -188 | | reward | 1.1153419 | | std | 1.05 | | value_loss | 21.7 | ------------------------------------- -------------------------------------- | time/ | | | fps | 59 | | iterations | 8700 | | time_elapsed | 730 | | total_timesteps | 43500 | | train/ | | | entropy_loss | -42.4 | | explained_variance | -1.19e-07 | | learning_rate | 0.0007 | | n_updates | 8699 | | policy_loss | -17.4 | | reward | -0.5148427 | | std | 1.05 | | value_loss | 0.297 | -------------------------------------- -------------------------------------- | time/ | | | fps | 59 | | iterations | 8800 | | time_elapsed | 740 | | total_timesteps | 44000 | | train/ | | | entropy_loss | -42.4 | | explained_variance | 1.19e-07 | | learning_rate | 0.0007 | | n_updates | 8799 | | policy_loss | 41.4 | | reward | 0.32814896 | | std | 1.05 | | value_loss | 1.6 | -------------------------------------- --------------------------------------- | time/ | | | fps | 59 | | iterations | 8900 | | time_elapsed | 746 | | total_timesteps | 44500 | | train/ | | | entropy_loss | -42.4 | | explained_variance | 0 | | learning_rate | 0.0007 | | n_updates | 8899 | | policy_loss | -47.2 | | reward | -0.17413093 | | std | 1.05 | | value_loss | 1.75 | --------------------------------------- -------------------------------------- | time/ | | | fps | 59 | | iterations | 9000 | | time_elapsed | 755 | | total_timesteps | 45000 | | train/ | | | entropy_loss | -42.4 | | explained_variance | 0 | | learning_rate | 0.0007 | | n_updates | 8999 | | policy_loss | 65.1 | | reward | 0.38266626 | | std | 1.05 | | value_loss | 6.64 | -------------------------------------- ------------------------------------- | time/ | | | fps | 59 | | iterations | 9100 | | time_elapsed | 764 | | total_timesteps | 45500 | | train/ | | | entropy_loss | -42.4 | | explained_variance | -1.19e-07 | | learning_rate | 0.0007 | | n_updates | 9099 | | policy_loss | 31.2 | | reward | 1.3317974 | | std | 1.05 | | value_loss | 0.927 | ------------------------------------- --------------------------------------- | time/ | | | fps | 59 | | iterations | 9200 | | time_elapsed | 770 | | total_timesteps | 46000 | | train/ | | | entropy_loss | -42.4 | | explained_variance | -0.0927 | | learning_rate | 0.0007 | | n_updates | 9199 | | policy_loss | 181 | | reward | -0.49035767 | | std | 1.05 | | value_loss | 21.3 | --------------------------------------- ------------------------------------- | time/ | | | fps | 59 | | iterations | 
9300 | | time_elapsed | 780 | | total_timesteps | 46500 | | train/ | | | entropy_loss | -42.5 | | explained_variance | 1.19e-07 | | learning_rate | 0.0007 | | n_updates | 9299 | | policy_loss | 148 | | reward | -8.756936 | | std | 1.05 | | value_loss | 30.5 | ------------------------------------- ------------------------------------- | time/ | | | fps | 59 | | iterations | 9400 | | time_elapsed | 788 | | total_timesteps | 47000 | | train/ | | | entropy_loss | -42.5 | | explained_variance | -0.066 | | learning_rate | 0.0007 | | n_updates | 9399 | | policy_loss | 40.5 | | reward | 0.5117786 | | std | 1.05 | | value_loss | 1.51 | ------------------------------------- ------------------------------------- | time/ | | | fps | 59 | | iterations | 9500 | | time_elapsed | 794 | | total_timesteps | 47500 | | train/ | | | entropy_loss | -42.5 | | explained_variance | 0 | | learning_rate | 0.0007 | | n_updates | 9499 | | policy_loss | 46.4 | | reward | 1.5631902 | | std | 1.05 | | value_loss | 1.37 | ------------------------------------- -------------------------------------- | time/ | | | fps | 59 | | iterations | 9600 | | time_elapsed | 804 | | total_timesteps | 48000 | | train/ | | | entropy_loss | -42.5 | | explained_variance | 5.96e-08 | | learning_rate | 0.0007 | | n_updates | 9599 | | policy_loss | 4.73 | | reward | -0.8106855 | | std | 1.05 | | value_loss | 0.346 | -------------------------------------- ------------------------------------- | time/ | | | fps | 59 | | iterations | 9700 | | time_elapsed | 811 | | total_timesteps | 48500 | | train/ | | | entropy_loss | -42.5 | | explained_variance | -1.19e-07 | | learning_rate | 0.0007 | | n_updates | 9699 | | policy_loss | 60.8 | | reward | 1.219504 | | std | 1.05 | | value_loss | 3.44 | ------------------------------------- -------------------------------------- | time/ | | | fps | 59 | | iterations | 9800 | | time_elapsed | 818 | | total_timesteps | 49000 | | train/ | | | entropy_loss | -42.5 | | explained_variance | 0.00147 | | learning_rate | 0.0007 | | n_updates | 9799 | | policy_loss | -19 | | reward | 0.36547118 | | std | 1.05 | | value_loss | 6.68 | -------------------------------------- ------------------------------------- | time/ | | | fps | 59 | | iterations | 9900 | | time_elapsed | 829 | | total_timesteps | 49500 | | train/ | | | entropy_loss | -42.5 | | explained_variance | -0.0611 | | learning_rate | 0.0007 | | n_updates | 9899 | | policy_loss | -14 | | reward | 1.2229353 | | std | 1.05 | | value_loss | 2.29 | ------------------------------------- -------------------------------------- | time/ | | | fps | 59 | | iterations | 10000 | | time_elapsed | 835 | | total_timesteps | 50000 | | train/ | | | entropy_loss | -42.6 | | explained_variance | 0 | | learning_rate | 0.0007 | | n_updates | 9999 | | policy_loss | -15.6 | | reward | 0.31784078 | | std | 1.05 | | value_loss | 0.296 | -------------------------------------- {'batch_size': 128, 'buffer_size': 50000, 'learning_rate': 0.001} Using cpu device Logging to results/ddpg ----------------------------------- | time/ | | | episodes | 4 | | fps | 23 | | time_elapsed | 556 | | total_timesteps | 13324 | | train/ | | | actor_loss | 20.3 | | critic_loss | 66.8 | | learning_rate | 0.001 | | n_updates | 9993 | | reward | -4.5011277 | ----------------------------------- ----------------------------------- | time/ | | | episodes | 8 | | fps | 21 | | time_elapsed | 1250 | | total_timesteps | 26648 | | train/ | | | actor_loss | 2.62 | | critic_loss | 9.79 | | learning_rate | 0.001 | | 
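The Sharpe figure printed with each episode summary is an annualized Sharpe ratio computed from the daily change of total account value. Below is a minimal sketch of that calculation, assuming 252 trading days per year and a zero risk-free rate; the helper name annualized_sharpe and the toy account values are illustrative, not taken from this notebook.

import pandas as pd

def annualized_sharpe(account_values: pd.Series) -> float:
    """Annualized Sharpe ratio from a series of daily total account values."""
    daily_returns = account_values.pct_change().dropna()
    return (252 ** 0.5) * daily_returns.mean() / daily_returns.std()

# Toy example with made-up account values.
values = pd.Series([1_000_000, 1_003_000, 998_500, 1_010_200, 1_015_000])
print(round(annualized_sharpe(values), 3))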
{'batch_size': 128, 'buffer_size': 50000, 'learning_rate': 0.001}
Using cpu device
Logging to results/ddpg

DDPG training log (Stable-Baselines3 logger), episodes 4 and 8 (13324 and 26648 total timesteps): fps 21–23, learning_rate 0.001, actor_loss 20.3 -> 2.62, critic_loss 66.8 -> 9.79, logged reward -4.5011277.

day: 3330, episode: 10
begin_total_asset: 965326.95
end_total_asset: 3940368.63
total_reward: 2975041.68
total_cost: 964.36
total_trades: 53280
Sharpe: 0.657
=================================

Episodes 12 and 16 (39972 and 53296 total timesteps): fps 20, actor_loss -3.63 -> -6.92, critic_loss 2.39 -> 1.51, logged reward unchanged at -4.5011277.

{'n_steps': 2048, 'ent_coef': 0.01, 'learning_rate': 0.00025, 'batch_size': 128}
Using cpu device
Logging to results/ppo

PPO training log (Stable-Baselines3 logger), iterations 1–14 (2048–28672 total timesteps): fps 63–70, learning_rate 0.00025, clip_range 0.2, approx_kl roughly 0.015–0.025, clip_fraction roughly 0.15–0.26, entropy_loss drifting from -41.2 to -41.7, std from 1.00 to 1.02.

day: 3330, episode: 10
begin_total_asset: 994554.41
end_total_asset: 4699503.39
total_reward: 3704948.98
total_cost: 439274.68
total_trades: 90096
Sharpe: 0.806
=================================

Iterations 15–25 (30720–51200 total timesteps): fps 63–64, approx_kl roughly 0.014–0.039, clip_fraction roughly 0.13–0.29, entropy_loss reaching -42.3, std reaching 1.04.
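Each training block above begins by printing the hyperparameter dictionary handed to the corresponding Stable-Baselines3 model. The following is only a hedged sketch of how such a dictionary is typically wired up through FinRL's DRLAgent wrapper; env_train and the 50,000-timestep budget are assumptions for illustration, and the notebook itself may drive training through a higher-level helper instead.

from finrl.agents.stablebaselines3.models import DRLAgent

PPO_PARAMS = {"n_steps": 2048, "ent_coef": 0.01, "learning_rate": 0.00025, "batch_size": 128}

# env_train is assumed to be the StockTradingEnv training environment built earlier in the notebook.
agent = DRLAgent(env=env_train)
model_ppo = agent.get_model("ppo", model_kwargs=PPO_PARAMS)
trained_ppo = agent.train_model(model=model_ppo, tb_log_name="ppo", total_timesteps=50_000)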
{'batch_size': 128, 'buffer_size': 100000, 'learning_rate': 0.0001, 'learning_starts': 100, 'ent_coef': 'auto_0.1'}
Using cpu device
Logging to results/sac

SAC training log (Stable-Baselines3 logger), episodes 4 and 8 (13324 and 26648 total timesteps): fps 18, learning_rate 0.0001, actor_loss 1.1e+03 -> 451, critic_loss 642 -> 27.5, ent_coef 0.169 -> 0.046, logged reward about -4.2.

day: 3330, episode: 10
begin_total_asset: 953106.81
end_total_asset: 7458866.64
total_reward: 6505759.83
total_cost: 8648.15
total_trades: 59083
Sharpe: 0.842
=================================

Episode 12 (39972 total timesteps): actor_loss 216, critic_loss 38.6, ent_coef 0.0127, logged reward -3.931381.
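In the SAC block, 'ent_coef': 'auto_0.1' tells Stable-Baselines3 to tune the entropy coefficient automatically, starting from 0.1, which is why ent_coef appears in the log and drifts over training (0.169 -> 0.046 -> 0.0127) instead of staying fixed. A minimal sketch of passing the same settings to SAC directly; env stands in for the Gym-style trading environment built earlier in the notebook.

from stable_baselines3 import SAC

SAC_PARAMS = {
    "batch_size": 128,
    "buffer_size": 100_000,
    "learning_rate": 0.0001,
    "learning_starts": 100,
    "ent_coef": "auto_0.1",  # automatic entropy tuning with an initial coefficient of 0.1
}

# env is assumed to be the training environment; "MlpPolicy" matches the continuous action space.
model_sac = SAC("MlpPolicy", env, **SAC_PARAMS)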
{'batch_size': 100, 'buffer_size': 1000000, 'learning_rate': 0.001}
Using cpu device
Logging to results/td3

TD3 training log (Stable-Baselines3 logger), episodes 4 and 8 (13324 and 26648 total timesteps): fps 22–25, learning_rate 0.001, actor_loss 91.6 -> 43.8, critic_loss 1.45e+03 -> 317, logged reward -3.5290053.

day: 3330, episode: 10
begin_total_asset: 972865.93
end_total_asset: 3563567.55
total_reward: 2590701.62
total_cost: 971.89
total_trades: 46620
Sharpe: 0.648
=================================

Episode 12 (39972 total timesteps): fps 21, actor_loss 34.3, critic_loss 54.4, logged reward unchanged at -3.5290053.
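For a quick side-by-side view of the five agents, the episode-10 summaries printed above can be collected into a small table. The numbers below are copied from the logs; note they are measured on the training data, so they reflect fit to the training period rather than out-of-sample performance. A sketch:

import pandas as pd

# Episode-10 training summaries copied from the Stable-Baselines3 logs above.
training_summary = pd.DataFrame(
    {
        "end_total_asset": [4088694.53, 3940368.63, 4699503.39, 7458866.64, 3563567.55],
        "total_trades": [58734, 53280, 90096, 59083, 46620],
        "Sharpe": [0.733, 0.657, 0.806, 0.842, 0.648],
    },
    index=["A2C", "DDPG", "PPO", "SAC", "TD3"],
)
print(training_summary.sort_values("Sharpe", ascending=False))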