Let us start with the question “Why quantify?”. Quantification is simply the conversion of thoughts and ideas into numbers. Behavioral scientists and domain experts quantify to improve the precision of outcomes and decisions. Peter Drucker, who is often referred to as the founder of modern management, is widely credited with saying: “If you can’t measure it, you can’t manage it”. In the context of financial modelling, this can be rephrased as “You need quantitative measures to create a quantitative model”.
Analogous to the well-known ‘chicken first or egg first’ dilemma, which questions the relationship between cause and effect, the combined fields of finance and data science have witnessed an ongoing debate: “Data before model or model before data?”.
On one hand, there are proponents of the ‘data before model’ argument. A model can be defined as “a simplified description, especially a mathematical one, of a system or process to assist calculations and predictions” [ 1 ]. In other words, a model is a simplification of reality, and this framework allows us to simulate reality and forecast the future based on data inputs. Data is “facts and statistics collected together for reference or analysis” [ 2 ]. Extending this definition, one can argue that data forms the basis of any analysis and hence always comes first.
On the other hand, there are some who believe that data and model are coupled and equally important. The argument is as follows: even the most advanced models are not useful without sufficient and meaningful data; at the same time, data alone cannot be exploited without a germane quantitative model. Experts in finance have always striven to build models that improve financial decision-making; although the focus has been on building more elegant models, the volume of data powering these models was limited. Advances in technology have now made it possible to collect large volumes of data, allowing experts to develop more complex models. The emergence of data science has enabled us to process and extract insights from structured and unstructured datasets. Domain experts regularly calibrate models using large amounts of data to improve model accuracy and generate alpha.
With improving technology and a growing focus on AI/ML, financial models have been improving, and there is a constant hunt for better and unique data. “Alternative Data” is the label given to newly emerging sources of data that lie beyond conventional data sources. Alternative Data refers to any information on a firm, an industry trend, or customer behavior; examples include email receipts, employment data, geolocation, sentiment, and weather data. These are sources that have not conventionally been used in the Finance domain to evaluate a company or make an investment decision. The main motivation behind adopting Alternative Data is to gain an information advantage. A number of hedge fund managers and other institutional investment professionals are starting to use Alternative Data to gain insights into investment opportunities and create premium returns.
However, the adoption of Alternative Data poses certain challenges. Most alternative datasets are unstructured, large, complex, and less readily usable than traditional data, so they must pass through a combination of algorithms, models, and technological tools to be transformed into structured and valuable information. Another challenge is interpretability: analysts and other financial market participants need to understand how a model derives its output, and to ensure that when a model uses Alternative Data, its outputs make intuitive sense. A further constraint is the lack of history. Because these sources are new, most of them cover only a short historical span, which makes back-testing strategies and training machine learning models challenging.
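As a minimal sketch of the first challenge, the following Python snippet turns a few unstructured headlines into a structured daily sentiment score per ticker. The word lists, the sample headlines, and the function names `score_text` and `daily_sentiment` are hypothetical and chosen only for illustration; a real pipeline would likely rely on a trained language model rather than a simple word count.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy lexicon, for illustration only.
POSITIVE = {"beats", "growth", "upgrade", "record"}
NEGATIVE = {"misses", "lawsuit", "downgrade", "recall"}


def score_text(text):
    """Return (positive hits - negative hits) / total hits, or 0.0 if no hits."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return (pos - neg) / total if total else 0.0


def daily_sentiment(records):
    """Aggregate (date, ticker, headline) records into a mean score per (date, ticker)."""
    buckets = defaultdict(list)
    for date, ticker, headline in records:
        buckets[(date, ticker)].append(score_text(headline))
    return {key: mean(scores) for key, scores in buckets.items()}


# Made-up sample records standing in for an unstructured news feed.
headlines = [
    ("2024-01-02", "ABC", "ABC beats estimates on record growth"),
    ("2024-01-02", "ABC", "Analyst downgrade follows ABC product recall"),
    ("2024-01-02", "XYZ", "XYZ faces lawsuit over data breach"),
]

# Structured output: one numeric score per (date, ticker) pair.
table = daily_sentiment(headlines)
```

The resulting dictionary is keyed by date and ticker, so it can be joined to conventional price or fundamental data. Because each score traces back to specific words in specific headlines, a sketch like this also stays easy to interpret, though at the cost of ignoring context and negation.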
OptiRisk’s team is currently working on the ‘Handbook of Alternative Data in Finance’, which will feature an extensive overview and analysis of Alternative Data and its applications in the Finance domain.
Contributions from subject matter experts in Alternative Data are welcome. Some of the topics this Handbook will cover are:
- The Process of Using Alternative Data
- The Technologies and Economics of Acquiring Alternative Data
- Coupling Models with Alternative Data for Fintech
- How to Deal with Different Alternative Data sets
- Alternative Data Disrupting the Landscape of Financial Market