Monday, March 16, 2020

Using R to correlate BTC unspent transaction output (UTXO) with price


Abstract
Bitcoin is often regarded as a medium of exchange because it combines an immutable blockchain ledger, a cryptographic security protocol, a predetermined supply free of centralized control, and permanent transactions.  It also carries weight as a form of currency: it can be transacted in very small increments, which makes the cryptocurrency practical as a global money.  One issue with using Bitcoin (BTC) is its price volatility.  While BTC is valued at $9,065 today, it has swung from a 2018 low of $3,232 to a 2019 high of $12,907.  Within the blockchain structure of BTC lies a feature, the unspent transaction output (UTXO), which acts as a record of the value passed from sender to receiver.  This paper attempts to correlate historic BTC unspent transaction outputs with price in order to assess their potential for predicting BTC price volatility.

A.     Bitcoin background
Bitcoin (BTC) is governed by rules which are purely abstract and based on mathematics.  These rules are oblivious to social conventions, irrespective of their nature (Surda, 2012).  Bitcoin can have an “enormous impact on liberating the users of Bitcoin from social norms they disagree with” (Surda, 2012).  Some liberal viewpoints also see BTC as a means to escape the binds of the state, by avoiding taxation or laundering money.  The economics of BTC and its price potential are swayed by societal norms and views of money.
As a form of money, Bitcoin can be thought of as digital gold or gold 2.0.  It possesses several features that are perceived to carry inherent value:  an immutable blockchain ledger, cryptographic security to uphold transactions, a supply which is predetermined and issued at a decreasing rate over time, a proof-of-work algorithm to verify transactions and charge fees, maintenance of the ledger by a network of computers, and support for multiple inputs and outputs per transaction.  It is also divisible to 8 decimal places.
B.     Digital signature of BTC
To track the transactions of each Bitcoin address, Bitcoin is designed with an architecture that avoids a potential problem known as double spending, in which the same funds are spent more than once.  This problem is solved using an accounting structure called unspent transaction output, or UTXO.  Every transaction in every block records its inputs and outputs through this structure.  When unspent outputs are spent, their value is split so that the correct amount, including fees, is distributed while the remaining value is returned to the sender as change (see figure 1).  Tracked across time, the UTXO count may serve as meaningful intelligence for understanding Bitcoin pricing.
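The change mechanism can be illustrated with a minimal R sketch.  The function name `spend_utxo` and the amounts are hypothetical, chosen only for illustration; real Bitcoin transactions also carry scripts and signatures, which are omitted here.

```r
# Hypothetical illustration: spend one unspent output (UTXO) and return change.
# Amounts are in satoshis (1 BTC = 100,000,000 satoshis) to avoid rounding error.
spend_utxo <- function(utxo_value, payment, fee) {
  if (payment + fee > utxo_value) {
    stop("UTXO value is too small to cover payment plus fee")
  }
  # The whole input is consumed; the value is split into new outputs and a fee.
  list(
    to_receiver      = payment,
    change_to_sender = utxo_value - payment - fee,
    miner_fee        = fee
  )
}

# Spend a 1 BTC output to pay 0.3 BTC with a 0.0001 BTC fee.
tx <- spend_utxo(utxo_value = 100000000, payment = 30000000, fee = 10000)
print(tx)
```

After this transaction the original output no longer counts as unspent, while the payment and the change each become a new unspent output.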
C.     Choice of analytical tool

R was chosen for its bevy of statistical packages and its popularity for evaluating markets within the finance industry (data mining, technical trading, and performance analysis).  R can also directly import real-time data from stock market indices (although such data was not available for Bitcoin).  R also makes it easy to create customizable charts and figures, including time series plots.
D.     Bitcoin datasets
Using Blockchain.com (2020) data, I gathered UTXO and USD price data for the preceding two years (March 13, 2018 to March 4, 2020).  Below is a sample of the raw data from both Excel sheets (Blockchain.com, 2020):
E.     Analysis and visualizations
Using Excel, I prepared the data by taking weekly averages of the UTXO counts and of the prices in US dollars.  The data was then combined into a single Excel sheet and imported into R.  I converted the date column, which R had imported as a factor, into dates.  Then I used base R functions to generate time series plots, from which R users could forecast price performance (Zhang, 2016).
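The preparation steps might look roughly like the sketch below.  It is not the original script: the data frame is synthetic, the column names (`date`, `utxo`, `price`) and the date format are assumptions, and the weekly averaging is done in R here rather than in Excel.

```r
# Synthetic stand-in for the imported sheet; column names are assumptions.
set.seed(1)
days <- seq(as.Date("2018-03-13"), by = "day", length.out = 28)
daily <- data.frame(
  date  = factor(format(days, "%m/%d/%Y")),   # dates arrive as a factor
  utxo  = round(runif(28, 40e6, 67e6)),
  price = round(runif(28, 3000, 13000), 2)
)

# Convert the factor column to Date (go through character first).
daily$date <- as.Date(as.character(daily$date), format = "%m/%d/%Y")

# Weekly averages: group days into 7-day bins counted from the first date.
daily$week <- as.integer(daily$date - min(daily$date)) %/% 7
weekly <- aggregate(cbind(utxo, price) ~ week, data = daily, FUN = mean)
weekly$week_start <- min(daily$date) + 7 * weekly$week

# Basic time series plots, one per variable.
plot(weekly$week_start, weekly$utxo, type = "l",
     xlab = "Week", ylab = "UTXO count")
plot(weekly$week_start, weekly$price, type = "l",
     xlab = "Week", ylab = "Price (USD)")
```

Converting through `as.character()` matters: calling `as.Date()` on a factor's integer codes would not recover the original dates.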

As the plots show, placing unspent transaction outputs (scale 40-67 million) and BTC dollar prices (scale $3,000-$13,000) on the same axes was impractical for showing visual correlation.  Changing the y-axis scale aesthetics did not yield adequate plots; doing so would have required more advanced plotting techniques.  I therefore sought to correlate the data using the statistical functions built into R.
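One possible workaround, which was not part of this analysis, is to standardize both series (subtract the mean, divide by the standard deviation) so they share a unitless scale.  The sketch below uses synthetic data that only mimics the ranges quoted above; the variable names are assumptions.

```r
# Synthetic weekly series; values only mimic the ranges quoted in the text.
set.seed(2)
weeks <- seq(as.Date("2018-03-13"), by = "week", length.out = 104)
utxo  <- seq(40e6, 67e6, length.out = 104) + rnorm(104, sd = 1e6)
price <- runif(104, 3000, 13000)

# scale() centers each series to mean 0 and rescales it to standard deviation 1.
utxo_z  <- as.numeric(scale(utxo))
price_z <- as.numeric(scale(price))

# Both standardized series now fit on one set of axes.
plot(weeks, utxo_z, type = "l", col = "blue",
     ylim = range(c(utxo_z, price_z)),
     xlab = "Week", ylab = "Standardized value")
lines(weeks, price_z, col = "red")
legend("topleft", legend = c("UTXO", "Price (USD)"),
       col = c("blue", "red"), lty = 1)
```

Standardizing discards the original units, so such a plot only shows whether the two series move together, not by how much in dollars or outputs.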
I evaluated the suitability of the continuous data by testing the assumptions of a correlation analysis.  First, I drew a scatter plot of UTXO against price to check for a linear relationship between them.  Then, using normality plots from the ggpubr library, I checked whether the data follow a normal distribution (CRAN, 2018):
Because the UTXO data follow a sigmoid shape rather than a normal distribution, a non-parametric, rank-based correlation test (Spearman or Kendall) is the appropriate choice.  Spearman’s correlation coefficient is defined as:
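These checks can be sketched in base R as follows.  The data are synthetic, and base `qqnorm()` stands in for the ggpubr normality plot (ggpubr's `ggqqplot()` is the analogous call) so that the example needs no extra packages.  The Shapiro-Wilk test is an added numerical check that the text above does not mention.

```r
# Synthetic data: a sigmoid-shaped UTXO series and a noisy price series.
set.seed(3)
utxo  <- 40e6 + 27e6 * plogis(seq(-4, 4, length.out = 104)) +
  rnorm(104, sd = 5e5)
price <- 3000 + 10000 * rank(utxo) / 104 + rnorm(104, sd = 1500)

# Linearity check: scatter plot of one variable against the other.
plot(utxo, price, xlab = "UTXO count", ylab = "Price (USD)")

# Normality check: points far from the reference line suggest non-normal data.
qqnorm(utxo, main = "Q-Q plot: UTXO")
qqline(utxo)
qqnorm(price, main = "Q-Q plot: price")
qqline(price)

# Shapiro-Wilk test: a small p-value is evidence against normality.
sw_utxo  <- shapiro.test(utxo)
sw_price <- shapiro.test(price)
print(c(utxo = sw_utxo$p.value, price = sw_price$p.value))
```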
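$$\rho = \frac{\sum_i (x'_i - m_{x'})(y'_i - m_{y'})}{\sqrt{\sum_i (x'_i - m_{x'})^2 \, \sum_i (y'_i - m_{y'})^2}}$$
where $x' = \mathrm{rank}(x)$ and $y' = \mathrm{rank}(y)$, and $m_{x'}$ and $m_{y'}$ are the means of the ranks.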
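As a minimal sketch, Spearman's rho can be computed by hand from the ranks and compared with R's built-in `cor.test()`.  The data are synthetic and the variable names are assumptions; this is not the data set analyzed here.

```r
# Synthetic, positively related data (continuous, so there are no ties).
set.seed(4)
x <- rnorm(50)
y <- x + rnorm(50)

# Rank-transform both variables.
xr <- rank(x)
yr <- rank(y)

# Spearman's rho: the Pearson formula applied to the ranks.
rho_manual <- sum((xr - mean(xr)) * (yr - mean(yr))) /
  sqrt(sum((xr - mean(xr))^2) * sum((yr - mean(yr))^2))

# Built-in rank-based test for comparison.
res <- cor.test(x, y, method = "spearman")
print(c(manual = rho_manual, builtin = unname(res$estimate)))
print(res$p.value)
```

The two estimates should agree, which confirms that the built-in test uses the same rank-based definition.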


The correlation coefficient between x and y is 0.6457 and the p-value is < 2.2e-16.  The test indicates a moderate positive correlation, signifying that BTC prices tend to increase with the number of BTC unspent transaction outputs.

F.     Summary

Bitcoin has historically shown swings of volatility from one year to the next.  UTXOs may become a unique indicator of buy/sell pressure in the market for Bitcoin exchanges.  BTC price increases are moderately correlated with the number of BTC unspent transaction outputs (UTXO).  The analysis would yield more reliable results if more historical UTXO/price data were used.  With real-time transaction data, the model could be extended to forecasting, extrapolating days, weeks, or even months ahead to show asset managers and industry analysts whether to hold a larger or smaller proportion of Bitcoin in their portfolios.  BTC would then not only provide an excellent medium of exchange, but also offer signals that could help mitigate the risk of investing in the cryptocurrency.



References
Blockchain.com.  (2020).  Blockchain charts.  Retrieved from https://www.blockchain.com/charts
Surda, P.  (2012, November 9).  Economics of Bitcoin: is Bitcoin an alternative to fiat currencies and gold?  Wirtschaftsuniversität Wien.  Retrieved from https://nakamotoinstitute.org/static/docs/economics-of-bitcoin.pdf
Grigg, I.  (2016).  The message is the medium.  Retrieved from https://steemit.com/eos/@iang/the-message-is-the-medium
Zhang, L-C.  (2016, May 13).  R in finance: introduction to R and its applications in finance.  Retrieved from https://www.researchgate.net/publication/302956522_R_in_Finance_Introduction_to_R_and_Its_Applications_in_Finance
CRAN.  (2018).  Comprehensive R Archive Network (CRAN).  Retrieved from https://cloud.r-project.org/doc/manuals/r-release/R-intro.html#Graphics