Lean Environment

One of the best rules anybody can learn about investing is to do nothing, absolutely nothing, unless there is something to do... I just wait until there is money lying in the corner, and all I have to do is go over there and pick it up... I wait for a situation that is like the proverbial ‘shooting fish in a barrel.’
— Jim Rogers
There's gold in 'em thar Big Data hill!

Lean Data: A better way to save bullets?

The principles of marksmanship have much to teach us about accuracy and precision in financial trading. Accuracy is commonly understood as the ability to hit the target, whereas precision is the repeatability of the shot. While saving bullets may not be a marksman’s top concern, compensating, accounting, or controlling for environmental factors that affect practical accuracy is an important aspect of good marksmanship. The effects of key environmental factors such as temperature, humidity, air density, precipitation and wind all need to be understood separately, and their combined effects estimated, by a good marksman. This of course assumes that the marksman already knows his own intrinsic accuracy based on his inherent ability, as well as the inherent mechanical accuracy of both his firearm and ammunition. After all, the intrinsic accuracy of the combination of rifle, ammunition and shooter is determined by the worst performing of these components. A stable, predictable (or even controlled) environment helps narrow the scope of variability so that a marksman can aim and fire a series of shots with the greatest accuracy and precision.

Accuracy vs. Precision.

Without precision, a high degree of accuracy is nearly impossible. In order to accurately strike a target, the shooter must adjust the aim to account for several variables. To the extent the repeatability of the firearm or weather conditions is poor, this adjustment becomes correspondingly uncertain – in other words, with poor precision the shooter simply will have to guess, and hope for the best. This is hardly the recipe for success.

In financial trading, precision comes from the ability to model the external macroeconomic environment and understand how it affects the overall performance of a wide range of trading strategies, e.g., trend-following, mean-reversion, statistical arbitrage or other quantitative strategies. Given the inherent constraints on the number of environmental variables that we can practicably monitor and track, it makes sense to concentrate our resources on a core set of key parameters with the widest applicability. Macroeconomic variables such as exchange rates, interest rates, inflation, unemployment, GDP, etc. would fall within this core set. We call this the “Lean Data” approach, as opposed to its better known counterpart, the “Big Data” approach, which might draw on many more types of variables beyond the macroeconomic ones, including weather, geological data or even consumer sentiment. As we shall see, more is not necessarily better, just as free is not necessarily useless.

Suppose that we have n Big Data variables from which a subset of k Lean Data variables can be chosen. As there are n!/(k!(n-k)!) ways to choose k elements from a set of n elements, we see that this combinatorial abundance can easily support an ecologically diverse niche along the HFT vs. Big Data continuum. Multiple trading firms with differentiated survival strategies can thus be accommodated.
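
Even modest values of n and k make the point. A minimal sketch (the particular n and k below are illustrative choices, not figures from the text):

```python
from math import comb

# Number of distinct k-variable "Lean Data" subsets that can be drawn
# from n "Big Data" candidate variables: n! / (k! * (n - k)!).
n, k = 20, 5  # illustrative values
print(comb(n, k))  # 15504 distinct subsets from just 20 candidate variables
```

With only 20 candidate variables and 5-variable subsets, there are already over fifteen thousand distinct Lean Data configurations, each a potential niche for a differentiated firm.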

The objective of a Lean Data approach to trading is to achieve statistical repeatability of a desirable outcome in the aggregate, based on executing specific strategies a large number of times within a stable and predictable environment. In other words, we aim for precision in financial trading, knowing that, as in marksmanship, precision naturally leads to accuracy. However, there is, as they say, no free lunch in finance. What we give up here for precision is the trade opportunities we are unsure about (i.e., false negatives), forgone on the basis of a quick cost-benefit analysis performed in real time as each trade opportunity arises. In times of uncertainty, we simply do not trade. While we are unable to predict the outcome of each specific trade, we can reasonably expect to achieve a positive result in the aggregate outcome of all executed trades. Trading strategies based on statistical arbitrage, or to a lesser extent, mean reversion, would exhibit this type of behavior.
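
This "unpredictable individually, repeatable in aggregate" behavior is just the law of large numbers at work. A minimal Monte Carlo sketch, assuming a hypothetical per-trade win probability of 0.52 and unit-sized wins and losses:

```python
import random

random.seed(0)  # fixed seed for reproducibility

def run_trades(n_trades, edge=0.52, win=1.0, loss=1.0):
    """Simulate trades whose individual outcomes are unpredictable
    but whose win probability carries a small positive edge."""
    pnl = 0.0
    for _ in range(n_trades):
        pnl += win if random.random() < edge else -loss
    return pnl

# A handful of trades can easily come out negative...
print(run_trades(10))
# ...but the aggregate over many trades is reliably positive
# (expected value: 100_000 * (2 * 0.52 - 1) = 4_000).
print(run_trades(100_000))
```

The per-trade outcome is a coin flip; only the aggregate is precise, which is exactly the repeatability the Lean Data approach targets.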

In times when good data for supporting precision is unavailable, the Lean Data approach has another trick up its sleeve. Compared with the highly accurate sniper rifle of the Big Data approach (where availability of good data is a prerequisite), the Lean Data approach can sometimes behave like a machine gun that may not be very accurate to start with. However, a machine gun using the same ammunition as the sniper rifle can be effective at a much greater range, because the accuracy requirements for effective use are lower: its rapid-fire capability covers a much larger target circle, striking with one or more hits, along with numerous misses, at random locations within that circle. Again, there is no free lunch in finance. What we trade away here for the greater range is tolerating a larger number of misses (i.e., false positives) arising from the lack of good data. Provided we control for the environmental factors, and correctly calculate in real time that the benefits of a few hits exceed the total costs of all the misses plus the cost of all the bullets, we can still expect an overall positive financial outcome. Trading strategies that make lots of small bets on large-impact (aka "Black Swan") events the market considers unlikely would fall under this category. In a more limited sense, a trend-following strategy with trailing stops that suffers through multiple stop-outs before finally catching a trend exhibits similar behavior.
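
The real-time cost-benefit calculation above reduces to a simple expected-value check; all numbers below are hypothetical, chosen only to illustrate the shape of the trade-off:

```python
# Expected value of a "machine gun" strategy: many cheap losing bets
# (the bullets) paid for by rare, large-impact hits.
p_hit = 0.02           # hypothetical probability that any single bet pays off
payoff = 80.0          # hypothetical gain on a hit
cost_per_bullet = 1.0  # premium lost on each miss

ev_per_bet = p_hit * payoff - (1 - p_hit) * cost_per_bullet
print(round(ev_per_bet, 2))  # 0.62 > 0: the few hits cover all the misses
```

The strategy stays viable only while p_hit * payoff exceeds the cumulative cost of the misses, which is why controlling for environmental factors, and knowing when that inequality fails, matters so much here.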

One can thus see that the Lean Data approach can support a broad class of well-known trading strategies, depending upon the availability of good data reflecting market conditions. By choosing to concentrate our resources on monitoring and tracking a core set of macroeconomic parameters, we ensure that our models and strategies are reasonably informed of the external environmental factors that affect trading performance in the aggregate.

Vertigo-Inducing Big Data.

Lean Data: Home Sweet Home.

HFT Environmental Hazards.

An interesting way of thinking about the Lean Data approach is to visualize it as an environment in which Scrat, the saber-tooth squirrel of the Ice Age films, is struggling to hold on to his prized acorn. The Big Data environment is vertigo-inducing and scary, whereas the High-Frequency Trading environment is hazardous and filled with piranhas. The small-world environment of Lean Data, while no Scratlantis, is nevertheless familiar, cozy and comforting to Scrat (and his prized acorn). But are macro models by themselves sufficient for driving Lean Data financial trading in a Big Data world without suffering too much of a handicap?

One plausible answer, we think, is to consider how the macro models might have already captured the compact structure of the macroeconomic world, and thus can support causal reasoning in a variety of trading models where the external environment exerts influence. Computer scientists generally believe that a sufficiently compact program that explains a complex world essentially captures reality. As Eric Baum, author of the book “What is Thought?”, explains: “the only way one can find an extremely short computer program that makes a huge number of decisions correctly in a vast and complex world is if the world actually has a compact underlying structure and the program essentially captures that structure.”

Therefore, if trading models constrain their reasoning and learning to deal only with meaningful quantities (i.e., those vetted by a diverse network of human economists and codified into macro models), their decisions and actions would correspond more closely to macroeconomic reality. Furthermore, if machines, like humans, understand the world through meaningful concepts and search only through meaningful possibilities, the load on computational resources becomes more manageable. In other words, human experts, through their collective research efforts, can provide the metaphorical DNA to the machines, giving them a “running start” and preempting the re-invention of the proverbial wheel, thus saving valuable computational resources. This is how we envision the Lean Data approach making an important contribution to financial trading: by demonstrating a knowledge paradigm in which machines and a network of human experts can synergistically collaborate.

An expert is someone who knows some of the worst mistakes that can be made in his subject and how to avoid them.
— Werner Heisenberg (“Physics and Beyond”, 1971)

References:

  1. Taleb, Nassim Nicholas (2010). The Black Swan: The Impact of the Highly Improbable (2nd Edition). Random House.
  2. Baum, Eric B. (2006). What is Thought? A Bradford Book.

All Else Being Equal

Not everything that counts can be counted, and not everything that can be counted counts.
— Albert Einstein (1879-1955)

Albert Einstein was equally prescient in matters beyond space-time when he uttered this quote with a humorous twist on the word ‘count’. Indeed, in the realm of high-frequency trading, not everything that counts can feasibly be counted within the ever-shorter tick-to-trade window available for a transaction. And not everything that can easily be counted in the Big Data universe actually counts in practical financial trading. Einstein’s quote seems to hint at the existence of a third possibility, where we count exactly what we need counted: no more, no less.

From an economic viewpoint, HFT and Big Data are both capital-resource intensive and will therefore most likely lead to evolutionary dead ends as time goes on, where the few remaining survivors face diminishing returns on their massive capital investments in state-of-the-art technology infrastructure for speed or for processing and storage capacity. For example, despite the massive investments in infrastructure, speed traders are trading fewer and fewer shares, from a high of 3.25 billion shares a day in 2009 down to 1.6 billion shares a day in 2012, according to a Bloomberg article, and they are also making less money on each trade. At the other end of the spectrum, Big Data driven trading firms are locked in an arms race to provision for the unprecedented volume of business data generated worldwide, estimated to double every 15 months and requiring over 10 petabytes of storage and 100 teraflops of computing power on tens of thousands of CPUs in privately-run data centers. But how does one avoid the fate of diminishing returns in the end game, i.e., these evolutionary dead ends?
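
To see what "doubling every 15 months" implies for provisioning, the growth compounds as 2^(t/15) for t in months. A back-of-the-envelope sketch (the 10 PB starting point echoes the figure in the text; the horizon is illustrative):

```python
# Storage demand under a doubling period of 15 months.
def volume(initial_pb, months, doubling_months=15):
    """Projected data volume in petabytes after `months` months."""
    return initial_pb * 2 ** (months / doubling_months)

# Starting from 10 petabytes, five years (60 months = 4 doublings) later:
print(volume(10, 60))  # 160.0 PB
```

A sixteen-fold increase in five years is the treadmill that makes this arms race so punishing.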

Outrun by a saber-tooth tiger and out-sized compared to a woolly mammoth, the caveman hit upon a new idea for survival...

The relevant question here is this: could there be a third alternative that can be profitably sustained alongside the fast-evolving ecology of capital-intensive HFT and Big Data? A wiser choice, we believe, is to consider a financial universe that is not characterized by massive capital investment in technology or data but instead distinguished by continual process innovation. How might this be possible? What is actually involved in making this trade-off?

Recall that the overarching goal here is to identify an ecologically diverse niche along the HFT vs. Big Data continuum that can accommodate multiple survivors with differentiated survival strategies. Both HFTs and Big Data driven trading firms are competing in their respective ecological niches, which are narrowing over time, where only the speediest, or the few that can process the largest volume of data, become the ultimate survivors. Under such scenarios, extinctions of firms are expected to be common in the long run.

Seeking an evolutionary pathway forward, driven not by speed or size but by continual process innovation.

Ceteris paribus, i.e., all else being equal, a higher ratio of conversion from raw data to harvested knowledge is certainly advantageous and should pay off handsomely in the long run. We hold constant the factors of outsourceable technology infrastructure, non-proprietary data sources, and available capital. We focus instead on maximizing the adoption and integration of tools and techniques within the trading platform, i.e., the elements that accrue to overall process innovation within the firm. In this way, we are making a long-term bet on the efficiency and yields made possible by the accrual of tools and techniques, thereby avoiding the fate of diminishing returns in the evolutionary end games. After all, sharper teeth and claws or bigger size had not helped the saber-tooth tiger or the woolly mammoth remain viable in the long run. It was the use of tools, an evolving brain, plus the fortuitous discovery of fire for making raw meat digestible that ultimately made the difference to the survival of early humans. But who would have guessed, a million years ago, that it would be Homo sapiens who now inherit the Earth?

According to Yuval Noah Harari, the author of Sapiens, humanity came from humble and marginal beginnings. On the savannas of earliest memory, human beings were third-rate scavengers, devising their earliest tools so as to better crack open bones for marrow. As Harari points out, these tools are testaments to both our ingenuity and our crippling weakness. The early humans needed that marrow because by the time they got to the carcass, the bones were all that was left. The humans didn't have the strength to compete with the lions who hunted it, nor with the hyenas and jackals who arrived once the lions had their fill. Our forebears came for that which the other predators discarded. Humanity began in the garbage; we’re but descendants of sly monkeys who worked an angle!

But what an angle that was! From our weakness on the open plain – our slow, bipedal gait, our perfunctory teeth – we accrued, over historic time, an array of strategies and tricks that has now enabled us to ascend to the top of the food chain.

So what does the end game look like going forward? A more efficient and higher-yielding data-to-knowledge processing pipeline would certainly be a key feature of any successful “lean data” driven trading firm, of which there will be many in such an ecologically diverse niche. All else being equal, i.e., where technology, data, and capital are held constant, we should expect to see a new breed of such “lean data” trading firms that know how to apply optimizing heuristics to move more agilely than Big Data driven trading firms while capturing trades over a longer time horizon than HFT firms can see. Evolution is a slow process. Only time will tell if our bid to bring domain knowledge into the search and discovery process for market inefficiencies will yield the promised riches of quantitative finance.

The market was like a coin with a small flaw that makes it slightly more likely to come up heads than tails (or tails than heads). Out of a hundred flips, it was likely to come up heads fifty-two times, rather than fifty. The key to success was discovering those hidden flaws, as many as possible. The law of large numbers that Thorp had used to beat the dealer and then earn a fortune on Wall Street dictated that such flaws, exploited in hundreds if not thousands of securities, could yield vast riches.
— Scott Patterson (“The Quants”, 2010)
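
The flawed-coin intuition in the quote above is easy to verify directly; the 52-in-100 head probability is Patterson's own illustrative figure:

```python
import random

random.seed(1)  # fixed seed for reproducibility

def flips(n, p=0.52):
    """Count heads in n flips of a coin flawed to favor heads with probability p."""
    return sum(random.random() < p for _ in range(n))

# Over 100 flips the 2% edge is routinely swamped by noise...
print(flips(100))
# ...but over a million flips the law of large numbers makes it
# unmissable: the count lands close to 520_000 heads.
print(flips(1_000_000))
```

One flawed coin barely matters; the same small flaw exploited across thousands of securities is what Thorp's law of large numbers turns into "vast riches."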

References:

  1. Harari, Yuval Noah (2015). Sapiens: A Brief History of Humankind. Harper.
  2. Patterson, Scott (2010). The Quants: How a New Breed of Math Whizzes Conquered Wall Street and Nearly Destroyed It (First Edition). Crown Business.

Numbers from Noise

When times are mysterious, serious numbers will speak to us always.
When times are mysterious, serious numbers will always be heard.
— Paul Simon (“When Numbers Get Serious”, 1983)
Follow the Chief: War Dance of the Sioux (by: Rudolf Cronau).

Prime numbers are the very atoms of arithmetic. Their importance to mathematics comes from their power to build all other numbers. Mastering these building blocks offers the mathematician the hope of discovering new ways of charting a course through the vast complexities of the mathematical world. Yet prime numbers remain the most mysterious objects studied by mathematicians. It is impossible for one looking through a list of prime numbers to predict when the next prime will appear. The list seems chaotic, random, and offers no clues as to how to determine the next number.

The list of primes is the heartbeat of mathematics, but its pulse is irregular. In Carl Sagan’s classic novel Contact, aliens use prime numbers to contact intelligent life on Earth by repeatedly beaming a radio signal through the cosmos with a sequence of prime numbers up to 907 to attract the attention of earthlings. Indeed, generations of mathematicians have sat listening to the rhythm of the ‘prime number drum’ as it beats out its sequence of numbers: two, three, five, seven, eleven, … Even after millennia of intense mathematical work, no one has been able to identify a pattern among these numbers.
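
That irregular pulse is easy to hear in code. A short sketch using the classic sieve of Eratosthenes, printing the gaps between consecutive primes:

```python
def primes_up_to(limit):
    """Return all primes <= limit via the sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            # Cross out every multiple of i starting at i*i.
            sieve[i * i :: i] = [False] * len(sieve[i * i :: i])
    return [i for i, is_prime in enumerate(sieve) if is_prime]

ps = primes_up_to(100)
print(ps[:5])  # [2, 3, 5, 7, 11]
gaps = [b - a for a, b in zip(ps, ps[1:])]
print(gaps)  # the irregular pulse: 1, 2, 2, 4, 2, 4, 2, 4, 6, 2, ...
```

The gap sequence never settles into a repeating rhythm, which is precisely the irregular heartbeat the mathematicians have been listening to.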

Dance of the primes: If there is music in them, it is eternal.

A tantalizing insight into the pattern was, however, produced almost 150 years ago by the great German mathematician Bernhard Riemann, who published a remarkable hypothesis which – if true – says that the primes have music in them. Marcus du Sautoy, in his delightful book "The Music of the Primes", describes a special three-dimensional landscape, i.e., Riemann’s treasure map of the primes, where the zeros all seem to be miraculously arranged in a straight line as far as the eye can see, and each point at sea level corresponds to a musical note. It is the combination of these notes, each at just the right volume, that gives rise to the music of the primes. If Riemann were right, the orchestra playing the music of the primes would be in perfect balance. It is as if each instrument plays its own pattern, but, by combining together so perfectly, the patterns cancel one another out, leaving just the formless ebb and flow of the primes.

Divine Inspiration: How simple keys and chords when combined rhythmically together create music.

Riemann looked at the image of the primes in the mirror that separated the world of numbers from his imaginary landscape, and saw the seemingly random arrangement of prime numbers on one side of the mirror transform into the strict regimented order of the zeroes on the other side of the mirror. Riemann could not have anticipated what was awaiting him on the other side of the looking glass. But what lay there completely transformed the task of understanding the mysteries of the primes. And mathematicians now had a new landscape to explore.

History doesn’t repeat itself, but it sure does rhyme.
— Mark Twain (1835-1910)

References:

  1. du Sautoy, Marcus (2012). The Music of the Primes: Searching to Solve the Greatest Mystery in Mathematics. Harper Perennial.
  2. Hofstadter, Douglas R. (1979). Gödel, Escher, Bach: an Eternal Golden Braid. Basic Books.