Path: blob/main/notebooks/04/05-cryptocurrency-arbitrage.ipynb
663 views
4.5 Cryptocurrency arbitrage search
Installations and Imports
Preamble: Install Pyomo and a solver
The following cell sets and verifies a global SOLVER for the notebook. If run on Google Colab, the cell installs Pyomo and the HiGHS solver, while, if run elsewhere, it assumes Pyomo and HiGHS have been previously installed. It then sets to use HiGHS as solver via the appsi module and a test is performed to verify that it is available. The solver interface is stored in a global object SOLVER for later use.
Problem description
Cryptocurrency exchanges are web services that enable the purchase, sale, and exchange of cryptocurrencies. These exchanges provide liquidity for owners and establish the relative value of these currencies. Joining an exchange enables a user to maintain multiple currencies in a digital wallet, buy and sell currencies, and use cryptocurrencies for financial transactions.
In this notebook, we explore the efficiency of cryptocurrency exchanges by testing for arbitrage opportunities. An arbitrage exists if a customer can realize a net profit through a sequence of risk-free trades. The efficient market hypothesis assumes arbitrage opportunities are quickly identified and exploited by investors. As a result of their trading, prices reach a new equilibrium so that any arbitrage opportunities would be small and fleeting in an efficient market. The question here is whether it is possible, with real-time data and rapid execution, for a trader to profit from these fleeting arbitrage opportunities.
Additional libraries needed: NetworkX and CCXT
This notebook uses two additional libraries: NetworkX and CCXT.
NetworkX is a Python library for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks, which is installed by deafult on Google Colab.
CCXT is an open-source library for connecting to cryptocurrency exchanges. The CCXT library is not installed by default in Google Colab, so it must be installed with the following command.
We import all the libraries we need for this notebook in the following cell.
Available cryptocurrency exchanges
Here we use the ccxt library and list current exchanges supported by ccxt.
Representing an exchange as a directed graph
First, we need some terminology. Trading between two specific currencies is called a market, with each exchange hosting multiple markets. ccxt labels each market with a symbol common across exchanges. The market symbol is an upper-case string with abbreviations for a pair of traded currencies separated by a slash (). The first abbreviation is the base currency, the second is the quote currency. Prices for the base currency are denominated in units of the quote currency. As an example, refers to a market for the base currency Ethereum (ETH) quoted in units of the Bitcoin (BTC). The same market symbol can refer to an offer to sell the base currency (a 'bid') or to an offer to sell the base currency (an 'ask'). For example, ETH/BTC means you can buy units of BTC with one unit of ETH.
An exchange can be represented by a directed graph constructed from the market symbols available on that exchange. In such a graph currencies correspond to nodes on the directed graph. Market symbols correspond to edges in the directed graph, with the source indicating the quote currency and the destination indicating the base currency. The following code develops such a sample graph.
Exchange order book
The order book for a currency exchange is the real-time inventory of trading orders.
A bid is an offer to buy up to a specified amount of the base currency at the price not exceeding the 'bid price' in the quote currency. An ask is an offer to sell up to a specified amount of the base currency at a price no less than a value specified given in the quote currency.
The exchange attempts to match the bid to ask order at a price less than or equal to the bid price. If a transaction occurs, the buyer will receive an amount of base currency less than or equal to the bid volume and the ask volume, at a price less than or equal to the bid price and no less than the specified value.
The order book for currency exchange is the real-time inventory of orders. The exchange order book maintains a list of all active orders for symbols traded on the exchange. Incoming bids above the lowest ask or incoming asks below the highest bid will be immediately matched and transactions executed following the rules of the exchange.
The following cell reads and displays a previously saved order book. Cells at the end of this notebook demonstrate how to retrieve an order book from an exchange and save it as a Pandas DataFrame.
Modelling the arbitrage search problem as a graph
Our goal will be to find arbitrage opportunities, i.e., the possibility to start from a given currency and, through a sequence of executed trades, arrive back at the same currency with a higher balance than at the beginning. We will model this problem as a network one.
A bid appearing in the order book for the market symbol is an order from a prospective counter party to purchase an amount of the base currency at a bid price given in a quote currency . For a currency trader, a bid in the order book is an opportunity to convert the base currency into the quote currency .
The order book can be represented as a directed graph where the nodes correspond to individual currencies. A directed edge from node to node describes an opportunity for us to convert currency into units of currency . Let and denote the amounts of each currency held by us, and let denote the amount of currency exchanged for currency . Following the transaction we have the following changes to the currency holdings
where is a conversion coefficient equal to the price of expressed in terms of currency . The capacity of a trading along edge is specified by a relationship
Because the arcs in our graph correspond to two types of orders - bid and ask - we need to build a consistent way of expressing them in our , notation. Imagine now that we are the party that accepts the buy and ask bids existing in the graph.
For bid orders, we have the opportunity to convert the base currency into the quote currency , for which we will use the following notation:
An ask order for symbol is an order to sell the base currency at a price not less than the 'ask' price given in terms of the quote currency. The ask volume is the amount of base currency to be sold. For us, a sell order is an opportunity to convert the quoted currency into the base currency such that
The following cell creates a directed graph using data from an exchange order book. To distinguish between different order types, we will highlight the big orders with green color, and ask orders with red color.
First, we simply print the content of the order book as a list of arcs.
Next, we draw the graph itself.
Trading and arbitrage
With this unified treatment of bid and ask orders, we are ready to formulate the mathematical problem of finding an arbitrage opportunity. An arbitrage exists if it is possible to find a closed path and a sequence of transactions in the directed graph resulting in a net increase in currency holdings. Given a path
the path is closed if . The path has finite capacity if each edge in the path has a non-zero capacity. For a sufficiently small holding of currency (because of capacity constraints), a closed path with represents an arbitrage opportunity if
If all we care about is simply finding an arbitrage cycle, regardless of the volume traded, we can use one of the many shortest path algorithms from the networkx library. To convert the problem of finding a path meeting the above condition into a sum-of-terms to be minimized, we can take the negative logarithm of both sides to obtain the condition:
In other words, if we assign the negative logarithm as the weight of arcs in a graph, then our problem just becomes translated into the problem of searching for a cycle with a total sum of weights along it to be negative.
Find order books that have arbitrage opportunities
A simple cycle is a closed path in which no node appears twice. Simple cycles are distinct if they are not cyclic permutations (essentially, the same set of nodes but with a different start and end points) of each other. One could check for arbitrage opportunities by checking if there are any negative simple cycles in the graph.
However, searching for a negative-weight cycle through searching for an arbitrage opportunity can be a daunting task - a brute-force search over all simple cycles has complexity which is impractical for large-scale applications. A more efficient search based on the Bellman-Ford algorithm is embedded in the NetworkX function negative_edge_cycle that returns a logical True if a negative cycle exists in a directed graph.
The function negative_edge_cycle is fast, but it only indicates if there is a negative cycle or not, and we don't even know what kind of a cycle is it so it would be hard to use that information to perform an arbitrage.
Luckily, the networkx library includes the function find_negative_cycle that locates a single negative edge cycle if one exists. We can use this to demonstrate the existence of an arbitrage opportunity and to highlight that opportunity on the directed graph of all possible trades. The following cell reports the cycle found and the corresponding trading return measured in basis points (1 basis point = 0.01%). The same trading cycle is marked with thicker arcs in the graph.
Note this may or may not be the trading cycle with maximum return. There may be other cycles with higher or lower returns, and that allow higher or lower trading volumes.
Brute force search arbitrage with simple cycles
Not all arbitrage cycles are the same - some yield a higher relative return (per dollar invested) than the others, and some yield a higher absolute return (maximum amount of money to be made risk-free) than others. This is because the amount of money that flows through a negative cycle is upper bounded by the size of the smallest order in that cycle. Thus, if one is looking for the best possible arbitrage sequence of trades, finding just 'a cycle' might not be enough.
A crude way to search for a good arbitrage opportunity would be to enumerate all possible simple cycles in a graph and pick the one that's best according to whatever criterion we pick. A brute force search over for all simple cycles has order complexity, which is prohibitive for large order books. Nevertheless, we explore this option here to better understand the problem of finding and valuing arbitrage opportunities.
In the following cell, we compute the loss function for all simple cycles that can be constructed within a directed graph using the function simple_cycles from the networkx library to construct a dictionary of all distinct simple cycles in the order book. Each cycle is represented by an ordered list of nodes. For each cycle, the financial return is computed, and a histogram is constructed to show the distribution of potential returns. Several paths with the highest return are then overlaid on the graph of the order book.
Again, note that no account is taken of the trading capacity available on each path.
The next cell iterates over all simple cycles in a directed graph. This can take a long time for a large dense graph.
Next, we sort out the negative cycles from this list and present them along with their basis-points (1% is 100 basis points) return.
In the end, we draw an example arbitrage cycle on our graph to illustrate the route that the money must travel.
Pyomo model for arbitrage with capacity constraints
The preceding analysis demonstrates some practical limitations of relying on generic implementations of network algorithms:
First, more than one negative cycle may exist, so more than one arbitrage opportunity may exist, i.e. an optimal strategy consists of a combination of cycles.
Secondly, simply searching for a negative cycle using shortest path algorithms does not account for capacity constraints, i.e., the maximum size of each of the exchanges. For that reason, one may end up with a cycle on which a good `rate' of arbitrage is available, but where the absolute gain need not be large due to the maximum amounts that can be traded.
Instead, we can formulate the problem of searching for a maximum-gain arbitrage via linear optimization. Assume that we are given a directed graph where each edge is labeled with a 'multiplier' indicating how many units of currency will be received for one unit of currency , and a 'capacity' indicating how many units of currency can be converted to currency .
We break the trading process into steps indexed by , where currencies are exchanged between two adjacent nodes in a single step. We denote by the amount of currency traded from node to in step . In this way, we start with the amount at time and aim to maximize the amount at time . Denote by the set of nodes to which outgoing arcs from lead, and by the set of nodes from which incoming arcs lead.
A single transaction converts units of currency to currency . Following all transactions at event , the trader will hold units of currency where
For every edge , the sum of all transactions must satisfy
The objective of the optimization model is to find a sequence of currency transactions that increase the holdings of a reference currency. The solution is constrained by assuming that the trader cannot short sell any currency. The resulting model is
where the constraints model:
the initial amount condition,
the balance equations linking the state of the given node in the previous and subsequent time periods,
the fact that we cannot trade at time step more of a given currency than we had in this currency from time step . This constraint 'enforces' the time order of trades, i.e., we cannot trade in time period units which have been received in the same time period.
the maximum allowed trade volumes,
the non-negativity of the variables.
The following cell implements this optimization model using Pyomo.
Using this function, we are able to compute the optimal (absolute) return from an order book while respecting the order capacities and optimally using all the arbitrage opportunities inside it.
To track the evolution of the trades throughout time, the script in the following cell illustrates, for each currency (rows) the amount of money held in that currency at each of the time steps . It is visible from this scheme that the sequence of trades is not a simple cycle, but rather a more sophisticated sequence of trades which we would not have discovered with simple-cycle exploration alone, especially not when considering also the arc capacities.
To be even more specific, the following cell lists the sequence of transcations executed.
We next illustrate the arbitrage strategy in the graph by marking all the corresponding arcs thicker.
If we want to be even more precise about the execution of the trading strategy, we can formulate a printout of the list of orders that we, as the counter party to the orders stated in the order book, should issue for our strategy to take place.
In the end, we can illustrate the time-journey of our balances in different currencies using time-indexed bar charts.
Exercises for the reader
The previous notebook cells made certain assumptions that we need to consider. The first assumption was that there was at most one bid and one ask order between any pair of currencies in an exchange. This assumption was based on the number of orders we downloaded from the online database, but in reality, there may be more orders. How would the presence of multiple orders per pair affect our graph formulation? How can we adjust the MILO formulation to account for this?
Another assumption was that we only traded currencies within one exchange. However, in reality, we can trade across multiple exchanges. How can we modify the graph-based problem formulation to accommodate this scenario?
Further, we have assigned no cost to the number of trades required to implement the strategy produced by optimization. How can the modeled be modified to either minimize the number of trades, or to explicitly include trading costs?
Finally, a tool like this needs to operate in real time. How can this model be incorporated into, say, a streamlit application that could be used to monitor for arbitrage opportunities in real time?
Real Time Downloads of Order Books from an Exchange
The goal of this notebook was to show how network algorithms and optimization can be utilized to detect arbitrage opportunities within an order book that has been obtained from a cryptocurrency exchange.
The subsequent cell in the notebook utilizes ccxt.exchange.fetch_order_book to obtain the highest bid and lowest ask orders from an exchange for market symbols that meet the criteria of having a minimum in-degree for their base currencies.
The following cell can be used to download additional order book data sets for testing.
Bibliographic Notes
Crytocurrency markets are relatively new compared to other markets, and relatively few academic papers are available that specifically address arbitrage on those markets. Early studies, such as the following, reported periods of large, recurrent arbitrage opportunities that exist across exchanges, and that can persist for several days or weeks.
Makarov, I., & Schoar, A. (2020). Trading and arbitrage in cryptocurrency markets. Journal of Financial Economics, 135(2), 293-319.
Subsequent work reports these prices differentials do exist, but only at a fraction of the values previously reported, and only for fleeting periods of time.
Crépellière, T., & Zeisberger, S. (2020). Arbitrage in the Market for Cryptocurrencies. Available at SSRN 3606053. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3606053
The use of network algorithms to identify cross-exchange arbitrage has appeared in the academic literature, and in numerous web sites demonstrating optimization and network applications. Representative examples are cited below.
Peduzzi, G., James, J., & Xu, J. (2021, September). JACK THE RIPPLER: Arbitrage on the Decentralized Exchange of the XRP Ledger. In 2021 3rd Conference on Blockchain Research & Applications for Innovative Networks and Services (BRAINS) (pp. 1-2). IEEE. https://arxiv.org/pdf/2106.16158.pdf
Bruzgė, R., & Šapkauskienė, A. (2022). Network analysis on Bitcoin arbitrage opportunities. The North American Journal of Economics and Finance, 59, 101562. https://doi.org/10.1016/j.najef.2021.101562
Bruzgė, R., & Šapkauskienė, A. (2022). Dataset for Bitcoin arbitrage in different cryptocurrency exchanges. Data in Brief, 40, 107731.
The work in this notebook is related to materials found in the following web resources.
https://anilpai.medium.com/currency-arbitrage-using-bellman-ford-algorithm-8938dcea56ea
A more complete analysis of trading and exploiting arbitrage opportunities in decentralized finance markets is available in the following paper and thesis.
Byrne, S. An Exploration of Novel Trading and Arbitrage Methods within Decentralised Finance. https://www.scss.tcd.ie/Donal.OMahony/bfg/202021/StephenByrneDissertation.pdf
Levus, R., Berko, A., Chyrun, L., Panasyuk, V., & Hrubel, M. (2021). Intelligent System for Arbitrage Situations Searching in the Cryptocurrency Market. In CEUR Workshop Proceedings (pp. 407-440). http://ceur-ws.org/Vol-2917/paper32.pdf
In addition to the analysis of arbitrage opportunities, convex optimization may also have an important role in the developing of trading algorithms for crypocurrency exchanges.
Angeris, G., Agrawal, A., Evans, A., Chitra, T., & Boyd, S. (2021). Constant function market makers: Multi-asset trades via convex optimization. arXiv preprint arXiv:2107.12484. https://baincapitalcrypto.com/constant-function-market-makers-multi-asset-trades-via-convex-optimization/ and https://arxiv.org/pdf/2107.12484.pdf