Unstructured data: Alpha generation in the digital age

August 3, 2018

The internet is a vast repository of messy, unstructured data such as blogs, chat rooms, social media posts, images, and videos. This repository reflects the ever-changing and often irrational collective psychology of the financial markets. IBM estimates that today, as a civilization, we generate over 2.5 quintillion bytes of unstructured data per day. In fact, over 90% of the world's data has been generated in the last few years according to IBM, and the growth rate of data generation is projected to double by the year 2020.

The fundamental problem posed by the explosion in digital information, driven by rapidly accelerating technologies, is that it's becoming harder for humans to make sense of the world: the biological brain isn't growing as fast as the data. Within financial markets, information overload causes significant problems for investors looking to outperform. To outperform, investors need to generate alpha. Alpha, the intercept term in the Capital Asset Pricing Model (CAPM), measures the active return on an investment above what its exposure to a benchmark index would predict, and it comes from capturing market inefficiencies. The Efficient Market Hypothesis (EMH) holds that prices fully reflect available information; market inefficiencies are departures from that ideal, caused by errors in information processing and reasoning.
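To make the definition of alpha concrete, here is a minimal sketch of how it is commonly estimated: regress a portfolio's excess returns on the benchmark's excess returns, and read alpha off as the intercept and beta as the slope. The function name `capm_alpha`, the single constant `risk_free_rate`, and the sample returns are assumptions made for illustration only; they are not real market data or a description of any particular firm's method.

```python
from statistics import mean


def capm_alpha(portfolio_returns, benchmark_returns, risk_free_rate=0.0):
    """Estimate (alpha, beta) by ordinary least squares on excess returns.

    Illustrative assumption: one constant per-period risk-free rate.
    """
    if len(portfolio_returns) != len(benchmark_returns):
        raise ValueError("return series must have the same length")

    # Excess returns over the risk-free rate.
    y = [r - risk_free_rate for r in portfolio_returns]
    x = [r - risk_free_rate for r in benchmark_returns]

    x_bar, y_bar = mean(x), mean(y)
    covariance = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    variance = sum((xi - x_bar) ** 2 for xi in x)
    if variance == 0:
        raise ValueError("benchmark returns have zero variance")

    beta = covariance / variance    # sensitivity to the benchmark
    alpha = y_bar - beta * x_bar    # return not explained by the benchmark
    return alpha, beta


# Made-up monthly returns, for illustration only.
benchmark = [0.010, -0.020, 0.030, 0.000, 0.015]
portfolio = [0.018, -0.025, 0.044, 0.003, 0.022]
alpha, beta = capm_alpha(portfolio, benchmark, risk_free_rate=0.001)
print(f"alpha per period: {alpha:.4f}, beta: {beta:.2f}")
```

A positive alpha here means the portfolio earned more than its benchmark exposure alone would explain; with so few observations the estimate would be far too noisy to act on.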

Accordingly, information overload can introduce bias into investors' decision-making processes. Biased decisions based on limited awareness and understanding lead to underperformance caused by an increase in idiosyncratic risk exposure (e.g., errant Tweets from an influential politician on industry-specific legislation). Simply put, conventional investment methodologies that assume investors are rational aren't sufficiently accounting for the new types of risk exposure and opportunity introduced by the digital revolution.

Social media is a prime example. Every minute, Facebook users send roughly 31.25 million messages and watch 2.77 million videos. The sheer volume of data and the rate at which opinions compound make it impossible for even an army of human analysts to properly account for the effects of irrational crowd psychology on the market. That's where AI comes in. Machines have the ability to improve the quality of human investment decisions by continuously reading and understanding huge volumes of unstructured information and pushing insights that correct for bias.

However, it should be noted that there is no such thing as ‘general artificial intelligence.’ In order to generate accurate results, machines need to be seeded with domain-specific context through the use of human expert networks. Machines can then aggregate and distill unstructured data into useful classifications and measurements, which can be analyzed and modeled like any other data set.
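As a rough illustration of that idea, the sketch below seeds a very simple scorer with a small term list of the kind a domain expert might supply, then distills free-text posts into a label and a numeric score that can be aggregated like any other data. Everything in it is hypothetical: the `EXPERT_LEXICON` terms and weights, the function names, and the sample posts are invented for the example, and a keyword lookup stands in for the far richer language models a production system would presumably use.

```python
import re
from collections import Counter

# Hypothetical expert-supplied terms and weights (illustrative only).
EXPERT_LEXICON = {
    "recall": -1.0,
    "lawsuit": -1.0,
    "tariff": -0.5,
    "approval": 1.0,
    "upgrade": 0.5,
}


def score_post(text, lexicon=EXPERT_LEXICON):
    """Turn one unstructured post into a numeric score and a label."""
    tokens = re.findall(r"[a-z']+", text.lower())
    hits = [lexicon[t] for t in tokens if t in lexicon]
    score = sum(hits)
    if score > 0:
        label = "positive"
    elif score < 0:
        label = "negative"
    else:
        label = "neutral"
    return {"score": score, "label": label, "matched_terms": len(hits)}


def aggregate(posts, lexicon=EXPERT_LEXICON):
    """Distill many posts into measurements that can be modeled."""
    scored = [score_post(p, lexicon) for p in posts]
    mean_score = sum(s["score"] for s in scored) / len(scored) if scored else 0.0
    label_counts = Counter(s["label"] for s in scored)
    return {"mean_score": mean_score, "label_counts": dict(label_counts)}


# Invented sample posts, for illustration only.
posts = [
    "Regulator announces recall of flagship product",
    "Analyst upgrade follows drug approval",
    "Quiet day on the trading floor",
]
print(aggregate(posts))
```

The point of the design is the division of labor: the human expert decides which terms matter in a given domain, and the machine applies that context consistently across far more text than a person could read.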

The key to generating alpha in the digital age of markets is to leverage machine intelligence together with human expert networks to automate cognitive processes and better understand mountains of information in very specific ways. At Accrete, we confront the challenge of managing idiosyncratic risk head-on by building narrowly focused AI solutions, which enable us to leverage both human and machine neural networks to solve specific problems with more accurate and scalable results.

For more on generating alpha in the digital age, check out Nasdaq's interview with Accrete Founder and CEO, Prashant Bhuyan.