Startup Revenue Forecasting: Knowing When Things Have Really Changed

By Omri Goldberg, Data Scientist — Liquidity Capital

One of the key challenges in forecasting is detecting when a real and significant shift has occurred in a business’s core KPIs. Separating signal from noise becomes significantly harder when the quantity of interest is itself generated by a noisy process. Take forecasting a startup’s future revenue, a key area of focus at Liquidity: it is crucial to understand whether a shift has truly occurred or whether we are witnessing fluctuations around an existing trend. Whether the rate of revenue growth has truly accelerated or decelerated, whether a new product strategy is working well, and whether market conditions are closing in on an incumbent startup are all critical factors in Liquidity’s risk assessment.

Commonly, the manual process performed by investment analysts entails the usual sniffing and combing through of various spreadsheets and slides, sometimes trying to gently force the data and trends to fit a narrative. This is where we might “see” a change in a metric that did not really occur, or miss one that did because it was not communicated or accounted for. “Did things really start changing in April? Is the new pricing model kicking in already?” These are the questions and dilemmas we usually encounter.

The standard statistical tools out there do not outperform the judgment of an analyst by any means. They tend to smooth things out, which delays their response to change. These methods usually revolve around moving averages, sliding windows, and truncating or down-weighting past periods so that recent periods carry more weight.
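The delayed response of smoothing methods can be seen in a small example. The sketch below is purely illustrative: the revenue figures are made up, and `trailing_moving_average` is a hypothetical helper, not part of any model described here. It applies a four-month trailing average to a series that steps up from 100 to 150 and measures how long the smoothed series takes to reach the new level.

```python
def trailing_moving_average(values, window):
    """Mean of the last `window` observations at each step (fewer at the start)."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Hypothetical monthly revenue: flat at 100, then a step up to 150 in month 6.
shift_month = 6
revenue = [100.0] * shift_month + [150.0] * 6
smoothed = trailing_moving_average(revenue, window=4)

# First month in which the smoothed series fully reflects the new level.
first_full = next(i for i, v in enumerate(smoothed) if v >= 150.0)
lag = first_full - shift_month

print(smoothed[shift_month])  # 112.5: only a quarter of the jump shows up at first
print(lag)                    # 3: the average needs window - 1 more months to catch up
```

A longer window suppresses more noise but stretches this lag, which is the trade-off that makes these tools slow to confirm a real change.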

There are more sophisticated approaches based on linear regression, such as piecewise linear regression. These methods tend to be substantially more “alarmist” and flood the user with false signals. Using them will most likely result in never-ending discussions or pile on significant offline work.

To address this challenge, we developed a model that scans the actual data and finds possible breakpoints, that is, points where the pattern changes significantly. After finding the candidates, the model tests them and estimates the best scenario for the breakpoint. This breakpoint then undergoes a few stress tests to determine, based on statistical significance, whether it is an actual breakpoint.
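The scan, estimate, and test steps described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the model itself: it assumes a single breakpoint, fits a straight line to each side by ordinary least squares, picks the split with the lowest combined error, and summarizes the evidence with a Chow-style F statistic. The function names, the `min_segment` parameter, and the data are all hypothetical.

```python
def fit_line_sse(xs, ys):
    """Fit y = a + b*x by ordinary least squares; return the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def find_breakpoint(ys, min_segment=3):
    """Scan every admissible split and return (best_index, chow_f).

    best_index is the first observation of the second segment; chow_f compares
    the two-line fit at that split against a single line through all the data.
    """
    n = len(ys)
    xs = list(range(n))
    sse_pooled = fit_line_sse(xs, ys)
    best_idx, best_sse = None, float("inf")
    for k in range(min_segment, n - min_segment + 1):
        sse_split = fit_line_sse(xs[:k], ys[:k]) + fit_line_sse(xs[k:], ys[k:])
        if sse_split < best_sse:
            best_idx, best_sse = k, sse_split
    params = 2  # intercept and slope in each segment
    chow_f = ((sse_pooled - best_sse) / params) / (best_sse / (n - 2 * params))
    return best_idx, chow_f


# Hypothetical revenue: slow growth for 8 months, then a jump and faster growth.
noise = [0.5, -0.5] * 8
trend = [100 + 2 * t for t in range(8)] + [140 + 6 * (t - 8) for t in range(8, 16)]
revenue = [v + e for v, e in zip(trend, noise)]

break_idx, chow_f = find_breakpoint(revenue)
print(break_idx, round(chow_f, 1))
```

On this toy series the scan selects month 8, where the regime actually changes. Because the split is chosen to give the best fit, the resulting F statistic is biased upward, so a classical F-table threshold would be too lenient; additional checks on the candidate breakpoint are one way to guard against that.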

This iterative approach for identifying breakpoints is used as a feature in a comprehensive ML-based model for assessing the risk associated with providing financial facilities. When our risk assessment model detects a significant change or anomaly, it determines that the startup carries an increased level of risk.