
Mobile app data has quietly become a core measurement layer for understanding consumer behaviour, competitive dynamics and business performance across a wide range of sectors, from streaming and retail to travel, fintech and gaming. What was once treated as a niche alternative signal is now routinely used to inform investment decisions, company analysis and thematic research.
At the same time, the market for mobile data has become harder to navigate. Privacy changes, platform opacity and shifting app-store mechanics have made it more difficult to assess the quality, reliability and longevity of many datasets. The result is a growing gap between data that looks compelling in exploration and data that can be trusted once it is embedded in production investment workflows. As institutional use of mobile data matures, the key question is whether it is reliable, explainable and operationally fit for purpose over time.
From novelty to infrastructure
In its early years, mobile app data was often evaluated on novelty. Investors looked for correlations between app usage and business performance, tested them in back-tests, and moved quickly if results appeared promising. That approach worked when datasets were scarce and relatively few market participants were asking the same questions.
Today, that environment no longer exists. Mobile data is widely available, widely used and increasingly similar at the surface level. As a result, differentiation has shifted upstream, away from headline metrics and towards the way data is constructed, validated and governed.
This mirrors a broader pattern seen across institutional data markets. As datasets move from experimentation into routine use, they begin to resemble infrastructure rather than insight. At that point, buyers care less about novelty and more about consistency, transparency and failure modes.
The limits of aggregation
One of the most common sources of confusion in mobile data is the reliance on aggregated metrics without sufficient understanding of how those aggregates are formed.
“You can absolutely make a lot of money trading on aggregate data – just as you can win at blackjack without really knowing what you’re doing,” observes Jonathan Kay, Founder and CEO of Apptopia, a specialist in mobile consumer intelligence for institutional investors and equity analysts. “But over a large enough number of hands, the odds catch up with you. The real issue is confidence intervals. Aggregate data widens them. Sometimes you can live with that, but sometimes you can’t – and when you can’t, it becomes very expensive. There’s a big difference between making a small wager close to home and placing a $200 million investment decision. The higher the stakes, the more confidence you need in what the data is actually telling you.”
Aggregate figures such as sessions, monthly active users (MAU) or downloads can be useful directional indicators. However, they compress multiple behaviours into a single number, often masking very different underlying dynamics. The same aggregate outcome can result from growth, churn, substitution, frustration or promotional activity, each with distinct implications.
“Take a simple example,” says Kay. “Suppose you see 10,000 sessions on the JetBlue app in the US on a given day. It’s tempting to interpret that as 10,000 people engaging with the app. In reality, it could be 5,000 frustrated customers opening the app multiple times because flights were cancelled, or 10,000 new users responding to a promotion. Those scenarios have very different implications, yet they produce the same aggregate number.”
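Kay’s JetBlue scenario can be sketched in a few lines. The aggregate is simply the product of user count and sessions per user, so very different behaviours collapse into the same headline number. The figures below are hypothetical, chosen only to mirror the example:

```python
# Two hypothetical behaviours behind the same daily session total.
# Scenario A: 5,000 frustrated customers opening the app twice each.
# Scenario B: 10,000 new users opening the app once each.
scenarios = {
    "A: frustration": {"users": 5_000, "sessions_per_user": 2.0},
    "B: promotion":   {"users": 10_000, "sessions_per_user": 1.0},
}

def total_sessions(s: dict) -> int:
    # The aggregate collapses users x intensity into a single figure.
    return int(s["users"] * s["sessions_per_user"])

for name, s in scenarios.items():
    print(name, "sessions:", total_sessions(s), "users:", s["users"])

# Both scenarios report 10,000 sessions; only the disaggregate view
# (user counts, sessions per user) tells them apart.
```

The point is not the arithmetic but the information loss: once the two components are multiplied together, no amount of analysis on the aggregate alone can recover which scenario produced it.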
This compression becomes problematic as conditions change. Small shifts in user behaviour, platform mechanics or measurement assumptions may not break an aggregate signal outright, but can gradually erode its meaning. The data continues to update, dashboards continue to refresh, and models continue to run, even as the relationship between the metric and the underlying business reality weakens.
“Over time, these small ambiguities accumulate,” points out Kay. “None of them break the signal outright, but together they erode it. That leads to drift – the signal slowly becoming less reliable without ever failing catastrophically.”
This form of degradation is particularly dangerous because it is incremental rather than binary. There is no clear failure event that forces a reassessment. Instead, confidence intervals widen quietly over time.
Why disaggregation matters
Disaggregate, user-level data does not eliminate uncertainty, but it allows investors to ask more precise questions. Rather than observing only outcomes, it makes it possible to examine paths: how users arrive at those outcomes, how cohorts behave differently, and how changes propagate through a system.
This distinction becomes critical when markets mature. As widely observed signals lose power, the remaining informational edge often lies in understanding composition and movement rather than totals. Which users are leaving, which are arriving, where time is being reallocated, and how behaviour changes before headline metrics move.
Kay gives the example of Netflix. “For years, subscriber growth correlated strongly with the company’s performance. Then one quarter, subscriber growth stalled and the stock reacted violently. Many investors blamed the data providers, asking why their models had ‘broken’. In reality, the relationship hadn’t broken – it had changed. Subscriber count is a function of new subscribers minus churn. For years, growth dominated churn, so tracking new subscribers worked well. Over time, churn became more important, but many models didn’t adapt. The aggregate metric masked that shift. What disaggregate data gives you is visibility into the path, not just the outcome. And that path matters.”
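The decomposition Kay describes is straightforward to illustrate. The quarterly figures below are invented, not Netflix’s actual numbers; they show how a model that tracks only gross additions sees nothing change, even as net growth collapses:

```python
# Illustrative quarterly figures, in millions -- hypothetical, not
# Netflix's actual numbers. Net subscriber change decomposes into
# gross additions minus churn.
quarters = [
    ("growth regime", 9.0, 1.0),   # gross adds dominate churn
    ("churn rising",  9.0, 4.0),   # same gross adds, net growth slows
    ("growth stalls", 9.0, 9.0),   # churn catches up: net change is zero
]

net_changes = []
for label, gross_adds, churned in quarters:
    net = gross_adds - churned
    net_changes.append(net)
    print(f"{label}: net {net:+.1f}m (gross {gross_adds}m, churn {churned}m)")
```

Gross additions are identical in all three quarters, so a signal calibrated only on new subscribers would report a steady business right up to the moment the aggregate stalled.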
Importantly, disaggregate data also provides a foundation for better validation. When metrics are decomposed into observable components, it becomes easier to test assumptions, identify anomalies and detect when a model is no longer behaving as expected.
The role of panels
Most mobile datasets ultimately derive from panels: defined groups of users or devices that are observed continuously over time. Panels exist because it is impossible to observe all users globally, so providers work with samples and infer broader patterns.
“One major misconception concerns modelling versus observed data,” notes Kay. “Aggregate data is always modelled, because nobody observes everything. Investors are usually quite good at interrogating models – asking about overfitting, restatements, training data and so on. But when buyers move to user-level data, there’s a tendency to relax that scrutiny. Because the data is directly observed, people assume it’s inherently reliable, and focus almost entirely on panel size.”
The size of a dataset built on a panel matters, but only insofar as the underlying panel itself is of sufficient quality. Three characteristics are particularly important.
First, representativeness. A large panel is not necessarily a representative one. Panels sourced through incentives, geography or device type can skew heavily towards certain demographics. Without careful normalisation and benchmarking, such biases can materially distort outputs.
“Size is not the same as representativeness,” states Kay. “A large panel can still be heavily biased. For example, panels that pay users tend to skew towards lower-income demographics. You can have millions of panellists and still only represent a small fraction of the population you’re trying to measure.”
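One standard way to correct the kind of skew Kay describes is post-stratification: reweighting panellists so each demographic bracket counts in proportion to its population share. The shares and metrics below are hypothetical, purely to show the mechanics:

```python
# Illustrative post-stratification. A panel skewed towards lower-income
# users is reweighted to match a target population mix. All numbers
# are hypothetical.
panel_share      = {"low": 0.60, "mid": 0.30, "high": 0.10}  # panel mix
population_share = {"low": 0.30, "mid": 0.45, "high": 0.25}  # target mix

# Weight per bracket: how much each panellist should count.
weights = {k: population_share[k] / panel_share[k] for k in panel_share}

# Observed metric per bracket, e.g. average sessions per user.
observed = {"low": 12.0, "mid": 8.0, "high": 5.0}

raw_estimate = sum(panel_share[k] * observed[k] for k in observed)
weighted_estimate = sum(
    panel_share[k] * weights[k] * observed[k] for k in observed
)

print(f"raw: {raw_estimate:.2f}  weighted: {weighted_estimate:.2f}")
```

Because heavier app usage concentrates in the over-represented bracket here, the unweighted estimate overstates the population average; the reweighted figure is what a representative sample would have shown. The correction only works, of course, if the bracket memberships themselves are observed, which is one more reason panel transparency matters.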
Second, retention. Longitudinal analysis depends on observing the same users over time. If panel participants churn frequently, changes in behaviour may reflect sample turnover rather than genuine economic or competitive shifts.
Third, observability. Buyers need to understand which variables are directly observed and which are inferred or modelled. Direct observation does not remove the need for scrutiny; it changes the nature of the questions that should be asked.
Sophisticated buyers increasingly request visibility into these properties, rather than treating panel size alone as a proxy for quality.
“Internally, we track a much larger panel than we can reliably report on,” notes Kay, “because only a subset meets our retention thresholds. Without that stability, user-level analysis becomes meaningless.”
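A retention threshold of the kind Kay alludes to can be sketched as a simple filter: only panellists observed in enough periods of the trailing window are eligible for user-level reporting. The panel and threshold below are hypothetical:

```python
# Hypothetical panel: which months (1-6) each panellist was observed.
observations = {
    "u1": {1, 2, 3, 4, 5, 6},   # stable panellist
    "u2": {1, 2, 6},            # churns in and out of the panel
    "u3": {3, 4, 5, 6},         # joined mid-window but stayed
}

def reportable(obs: dict, window=range(1, 7), min_periods: int = 4) -> set:
    """Keep only users observed in at least min_periods of the window."""
    months = set(window)
    return {u for u, seen in obs.items() if len(seen & months) >= min_periods}

print(sorted(reportable(observations)))  # ['u1', 'u3']
```

Without such a filter, a shift in the metric for “u2”-type users would be indistinguishable from sample turnover, which is exactly the confusion longitudinal analysis needs to avoid.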
Understanding drift
One of the most under-discussed risks in alternative data is drift: the gradual change in what a dataset is actually measuring.
Drift can arise from many sources. Platform policies change. User behaviour evolves. Data collection methods are updated. Models are refined. Coverage expands or contracts. Any of these can alter the relationship between a metric and the real-world phenomenon it is intended to represent.
Crucially, drift often occurs at the level of derived metrics rather than raw variables. Core measures such as transaction counts or user volumes may remain relatively stable, while assumptions around pricing, weighting or conversion subtly change the output.
Managing drift therefore requires visibility into how outputs are built from inputs. Buyers who only see final aggregates are poorly positioned to detect when something has shifted. Those who understand the underlying variables, and how they combine, are better able to recognise when revalidation is required.
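One widely used check at the variable level is the population stability index (PSI), which compares the current distribution of an input against a baseline and flags when it has moved enough to warrant revalidation. The bucket shares and the 0.2 alert threshold below are illustrative conventions, not a provider-specific method:

```python
import math

def psi(baseline, current):
    """Population stability index between two bucketed distributions."""
    return sum(
        (c - b) * math.log(c / b)
        for b, c in zip(baseline, current)
        if b > 0 and c > 0
    )

# Hypothetical bucket shares of a raw variable (e.g. sessions per user).
baseline = [0.25, 0.25, 0.25, 0.25]
stable   = [0.24, 0.26, 0.25, 0.25]
drifted  = [0.10, 0.20, 0.30, 0.40]

print(f"stable:  {psi(baseline, stable):.4f}")   # near zero
print(f"drifted: {psi(baseline, drifted):.4f}")  # above the common 0.2 alert level
```

Run on raw inputs rather than derived outputs, a monitor like this catches exactly the failure mode described above: the components shifting while the headline aggregate still looks plausible.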
“Transparency around how outputs are built matters,” explains Kay. “Firms like YipitData, for example, provide a clear path from inputs to outputs. Even if a derived estimate becomes less accurate over time, understanding how it was constructed allows investors to adapt. Our approach is to operate as close to the variable level as possible. If we give investors reliable underlying variables, and they understand the business qualitatively, they are far better placed to detect and manage drift themselves.”
From exploration to production
The difference between exploratory use and production use is not simply one of scale. It is a shift in responsibility.
In production workflows, data feeds into models that drive real decisions. Errors propagate. Assumptions compound. Small degradations can have large downstream effects. As a result, operational questions become as important as analytical ones.
How are changes communicated? Is there versioning and revision history? Are users alerted when methodology changes materially? What monitoring exists to flag abnormal behaviour? When should a dataset be re-tested or temporarily withdrawn?
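Several of these operational questions reduce to bookkeeping that can be automated. The sketch below is a hypothetical consumer-side check, not any provider’s actual interface: a release record carries a methodology version, and the model refuses to run silently when that version no longer matches the one it was validated against:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical release metadata a data consumer might track. Field
# names are illustrative, not a real provider schema.
@dataclass(frozen=True)
class DatasetRelease:
    version: str
    methodology_version: int
    restated_from: Optional[str] = None  # prior release superseded, if any

VALIDATED_METHODOLOGY = 3  # version the production model was signed off on

def needs_revalidation(release: DatasetRelease) -> bool:
    # Flag material methodology changes and restatements alike.
    return (release.methodology_version != VALIDATED_METHODOLOGY
            or release.restated_from is not None)

print(needs_revalidation(DatasetRelease("2024.06", 3)))                # routine update
print(needs_revalidation(DatasetRelease("2024.07", 4)))                # methodology changed
print(needs_revalidation(DatasetRelease("2024.07.1", 4, "2024.07")))   # restatement
```

None of this replaces the provider’s own change communication, but it turns “are users alerted when methodology changes materially?” from a hope into a gate the pipeline enforces.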
These are not abstract concerns. They determine whether a dataset can be safely embedded into long-term processes rather than used opportunistically.
Raising the bar for buyers and providers
As mobile app data becomes more institutionalised, responsibility is shifting on both sides of the market.
Data providers are increasingly expected to supply not just metrics, but documentation, transparency and tools that allow clients to assess reliability over time. This includes clear explanations of methodology, guidance on appropriate use cases, and artefacts that support internal governance and risk management.
At the same time, buyers must ask better questions. Size alone is no longer a sufficient proxy for quality. Neither is novelty. Evaluating mobile data now requires an understanding of sampling, modelling, validation and drift, as well as the operational context in which the data will be used.
The evolution of mobile app data reflects a broader trend across alternative data markets. As datasets move from the margins into the core of institutional workflows, expectations rise. What once differentiated a provider may become table stakes. What once felt like an insight increasingly resembles infrastructure.
In that environment, the most valuable data is not necessarily the most eye-catching, but the most robust: data that can withstand scrutiny, adapt to change and continue to perform as conditions evolve.
For investors, the challenge is no longer finding mobile data. It is determining which data can be trusted when the easy correlations fade, and the real work begins.