Bitcoin is a cryptography-based virtual currency that challenges many notions of conventional banking and government-regulated fiat currencies. It is a competitively cheap and fast means of transferring money to and from anyone in the world in a pseudonymous manner. Because little is known about users and their activities across bitcoin’s network, bitcoin transactions are an essential source of data to study. Unfortunately, studying bitcoin is hindered by the pseudonymous nature of its data and by the sheer number of transactions: by early 2017, more than 200 million transactions had been recorded on bitcoin’s blockchain, yet only a handful of visual analytics tools can aid in high-level analysis of transactions and users’ activities across the network.
A group of researchers has recently developed BitConduite, a novel system to help study patterns of activity and user profiles across bitcoin’s network. BitConduite is designed to identify individual entities (users across the network, who may be individuals or institutions and businesses such as cryptocurrency exchanges, e-commerce sites and cryptocurrency news networks) and to characterize them by analyzing the patterns of transactions they execute over time.
An Overview of BitConduite:
BitConduite’s approach relies on analyzing bitcoin transaction data at various levels of aggregation rather than analyzing raw data alone; thus, a data infrastructure that facilitates data processing is pivotal. Raw data is obtained from the Bitcoin Core client and stored in a MongoDB database. For further processing, a column-oriented MonetDB database is used, as it is well suited to the fast data aggregation needed to give BitConduite quick access to the data it uses.
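To make the idea of aggregation concrete, here is a minimal sketch of a per-day rollup of transactions. It is not BitConduite’s code: the in-memory SQLite database is only a stand-in for the MonetDB store, and the table layout, the sample rows and the daily_aggregates function are hypothetical names chosen for illustration.

```python
import sqlite3

# Hypothetical flattened transaction rows: (tx_id, day, amount_btc).
ROWS = [
    ("tx1", "2017-01-01", 0.5),
    ("tx2", "2017-01-01", 1.5),
    ("tx3", "2017-01-02", 2.0),
]


def daily_aggregates(rows):
    """Return (day, transaction_count, total_btc) tuples, ordered by day."""
    # SQLite is used here only as a stand-in for an analytical database.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE tx (tx_id TEXT, day TEXT, amount_btc REAL)")
        conn.executemany("INSERT INTO tx VALUES (?, ?, ?)", rows)
        cursor = conn.execute(
            "SELECT day, COUNT(*), SUM(amount_btc) "
            "FROM tx GROUP BY day ORDER BY day"
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for day, count, total in daily_aggregates(ROWS):
        print(day, count, total)
```

The same GROUP BY pattern extends to coarser levels, such as per week or per entity. A column-oriented store tends to answer this kind of query quickly because it reads only the columns involved.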
With this infrastructure, BitConduite can reveal the entities behind pseudonymous bitcoin transactions using the input heuristics presented by Reid and Harrigan in “An Analysis of Anonymity in the Bitcoin System”. To categorize entities according to similarity, the developers of BitConduite used k-means clustering with a predefined number of clusters.
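The two steps described above can be sketched roughly as follows. The input heuristic is commonly read as: addresses that appear together as inputs of one transaction are assumed to be controlled by the same entity. The sketch groups addresses on that assumption with a union-find structure, then runs a plain k-means (Lloyd’s algorithm) over per-entity feature vectors. The function names, the simplified transaction format and the choice of features are hypothetical, and this is not BitConduite’s actual implementation.

```python
from collections import defaultdict


def cluster_addresses(transactions):
    """Group addresses into entities using the multi-input heuristic.

    `transactions` is an iterable of lists, each holding the input
    addresses of one transaction (a simplified, hypothetical format).
    """
    parent = {}

    def find(addr):
        parent.setdefault(addr, addr)
        # Path halving keeps lookups cheap on long chains.
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]
            addr = parent[addr]
        return addr

    for inputs in transactions:
        if not inputs:
            continue
        root = find(inputs[0])  # also registers single-input addresses
        for other in inputs[1:]:
            other_root = find(other)
            if other_root != root:
                parent[other_root] = root

    entities = defaultdict(set)
    for addr in parent:
        entities[find(addr)].add(addr)
    return list(entities.values())


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, point in enumerate(points):
            labels[i] = min(
                range(k), key=lambda c: squared_distance(point, centroids[c])
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


if __name__ == "__main__":
    txs = [["a", "b"], ["b", "c"], ["d"], ["e", "f"]]
    print(cluster_addresses(txs))
    # Hypothetical per-entity features: (transaction count, mean amount).
    features = [(1, 1), (10, 10), (1, 2), (11, 10)]
    print(kmeans(features, k=2))
```

Seeding the centroids with the first k points keeps the sketch deterministic. A real analysis would more likely use a library implementation with better initialization, and would scale the features before clustering.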