Alpha Runs on Context
Alpha generation in investment management has always been a research problem. The best-performing funds have the best research. That has not changed. What has changed is what research means.

For decades, investment research meant analysts reading filings, building spreadsheets, calling management teams, and synthesizing information into a thesis. The process was manual, the output was a written document, and the competitive edge was the analyst's judgment. That model still works. It just cannot scale to the volume and velocity of information that modern markets generate.

The firms generating the most consistent alpha today are not the ones with the best analysts or the best models. They are the ones that have figured out how to systematically ingest, contextualize, and reason over their entire organizational knowledge base. Earnings transcripts. SEC filings. Internal research notes. Portfolio positioning data. Macroeconomic indicators. Credit agreements. Supply chain signals. Alternative data feeds. All of it. Simultaneously. For every position in the portfolio.

This is not a model problem. Every fund has access to the same foundation models. The model is a commodity. The edge is what you feed it.

The Token Economy of Real Research

There is a persistent misconception in the market that AI-powered investment research should be cheap. That a well-crafted prompt and a few retrieved documents can produce institutional-grade analysis. This is wrong.

Producing a single piece of high-quality investment research requires consuming tens of millions of tokens. Not thousands. Not hundreds of thousands. Tens of millions.

A proper analysis of a single equity position requires ingesting the company's last several years of quarterly filings, earnings call transcripts, sell-side research coverage, peer company filings for relative valuation, industry reports, management track records, credit facility terms, supply chain data, patent filings, regulatory submissions, and the fund's own prior research on the name. Each of these sources runs tens of thousands of tokens. Multiply by the number of relevant sources, and you are deep into the millions before the model has written a single sentence of analysis.
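The arithmetic is worth making concrete. A back-of-the-envelope budget, using illustrative per-document token counts (these figures are assumptions for the sketch, not measurements), lands in the millions per name and the tens of millions across a portfolio:

```python
# Back-of-the-envelope token budget for a single-name deep dive.
# All per-document counts below are illustrative assumptions.

SOURCES = {
    "quarterly_filings":    {"count": 12, "tokens_each": 60_000},  # ~3 years
    "earnings_transcripts": {"count": 12, "tokens_each": 15_000},
    "sell_side_reports":    {"count": 20, "tokens_each": 8_000},
    "peer_filings":         {"count": 5,  "tokens_each": 60_000},
    "industry_reports":     {"count": 4,  "tokens_each": 25_000},
    "credit_agreements":    {"count": 2,  "tokens_each": 40_000},
    "internal_research":    {"count": 10, "tokens_each": 5_000},
}

per_name = sum(s["count"] * s["tokens_each"] for s in SOURCES.values())
PORTFOLIO_SIZE = 30  # assumed number of positions

print(f"Context for one name: ~{per_name:,} tokens")
print(f"Portfolio-wide refresh: ~{per_name * PORTFOLIO_SIZE:,} tokens")
```

Even with these conservative assumptions, one name consumes roughly 1.6 million tokens of raw context, and a full-portfolio refresh crosses into the tens of millions before any analysis is written.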

Companies that try to produce investment research with minimal token consumption are optimizing for the wrong thing. They build systems that sound like research but lack the informational depth that drives conviction. A research note generated from a company's latest 10-K and a generic prompt is not research. It is a summary. Summaries do not generate alpha.

The economics are counterintuitive. Spending more on tokens, dramatically more, produces disproportionately better research. The marginal value of the next million tokens of context is not linear. It compounds. The fifteenth data source might be the one that reveals the supply chain risk that changes the thesis. You cannot know in advance which source matters most, so you ingest everything relevant and let the system reason across all of it.

The Real Work Is Plumbing

If context is the edge, then the systems that assemble context are the competitive infrastructure. And those systems are, by any honest assessment, boring. Data pipelines. Ingestion frameworks. Document parsers. Chunking strategies. Retrieval systems. Workflow orchestrators. Caching layers. Plumbing.

This is where most firms fail. Not because the plumbing is conceptually difficult, but because it is tedious, unglamorous, and requires deep attention to the specific data formats, access patterns, and quality requirements of investment workflows. A research pipeline that reliably ingests SEC EDGAR filings, parses XBRL financials, extracts footnotes from 10-Ks, normalizes earnings transcript formats across providers, maintains a living knowledge base of prior internal research, and orchestrates all of this into a coherent context package for each research task is a substantial piece of infrastructure. It is also the only piece that matters.
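The shape of that plumbing is mundane. A minimal sketch, with entirely hypothetical class and function names (a real implementation would wrap EDGAR clients, XBRL parsers, and transcript providers behind each ingestor), might look like this:

```python
# Minimal sketch of a context-assembly pipeline. All names here are
# hypothetical; the point is the structure: pluggable ingestors that
# each contribute documents to one context package per research task.

from dataclasses import dataclass, field

@dataclass
class Document:
    source: str    # e.g. "EDGAR", "transcripts", "internal"
    ticker: str
    doc_type: str  # e.g. "10-K", "earnings_call", "memo"
    text: str

@dataclass
class ContextPackage:
    ticker: str
    documents: list[Document] = field(default_factory=list)

    def token_estimate(self) -> int:
        # Rough heuristic: ~4 characters per token.
        return sum(len(d.text) // 4 for d in self.documents)

def assemble_context(ticker: str, ingestors) -> ContextPackage:
    """Run every registered ingestor and collect its documents."""
    package = ContextPackage(ticker=ticker)
    for ingest in ingestors:
        package.documents.extend(ingest(ticker))
    return package

# A toy ingestor standing in for a real EDGAR pipeline.
def edgar_ingestor(ticker: str) -> list[Document]:
    return [Document("EDGAR", ticker, "10-K", "Item 1. Business..." * 100)]

pkg = assemble_context("MSFT", [edgar_ingestor])
print(len(pkg.documents), pkg.token_estimate())
```

None of this is clever. The value is in the dozens of ingestors behind the interface, each handling the quirks of one real data source.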

The firms that treat this infrastructure as an afterthought, that focus on the model and the UI and treat data ingestion as a commodity problem, consistently produce lower-quality output. The quality of AI-generated investment research is bounded by the quality of the context assembly, not the capability of the model.

Investors Must Own These Systems

Here is the part that most technology organizations get wrong. They build investment research systems the way they build enterprise software: IT gathers requirements, engineering builds an application, and the end user gets a finished product. This does not work for investment research.

Investment research is a creative, iterative, judgment-intensive process. Every investor has a different analytical framework, different information priorities, different ways of building conviction. An IT-delivered application that encodes a fixed research workflow will always be too rigid. The investor will use it once, find it does not match how they actually think, and go back to their spreadsheet.

The only approach that works is giving investors the ability to build and customize these systems themselves. Not by writing code. By expressing their research process in natural language and having the system execute it.

An investor should be able to say: "For every name in the portfolio, pull the last eight quarters of earnings transcripts, the most recent proxy statement, all insider transactions in the past twelve months, our internal research notes, and the three most recent sell-side reports. Cross-reference management guidance against actual results. Flag any position where guidance accuracy has deteriorated over the last two years. Weight recent quarters more heavily." That description is a program. The system should execute it.
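One plausible intermediate representation the system might compile that instruction into is a declarative workflow spec. The field names and structure below are illustrative assumptions, not a real product's schema:

```python
# Hypothetical compiled form of the investor's natural-language
# instruction. Every field name here is an illustrative assumption.

workflow = {
    "scope": "portfolio",  # run for every name currently held
    "inputs": [
        {"source": "earnings_transcripts", "window": "last_8_quarters"},
        {"source": "proxy_statement",      "window": "most_recent"},
        {"source": "insider_transactions", "window": "past_12_months"},
        {"source": "internal_research_notes"},
        {"source": "sell_side_reports",    "limit": 3},
    ],
    "analysis": [
        "cross_reference(management_guidance, actual_results)",
        "flag(guidance_accuracy_trend < 0, lookback='2y')",
    ],
    "weighting": {"scheme": "recency", "direction": "recent_heavier"},
}

assert len(workflow["inputs"]) == 5
```

The investor never sees this representation. They see their own sentence, and they iterate on the sentence.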

This is natural language programming applied to investment research. The investor defines the logic, the priorities, the analytical framework. The system handles data ingestion, retrieval, orchestration, and execution. The investor iterates on the logic until the output matches their standards. They refine it. They extend it. They build a library of research workflows that encode their investment process.

IT's role changes fundamentally in this model. IT does not build the research application. IT builds the platform that makes it trivially easy to ingest new data sources, connect to internal and external systems, and create new research workflows through natural language. IT provides the ingestion layer, the data connectors, the execution infrastructure, and the governance framework. The investor provides the intelligence.

The Ingestion Problem

The hardest part of the platform is ingestion. Investment firms sit on enormous volumes of organizational knowledge locked in formats and systems that AI cannot easily consume. Proprietary research databases. Portfolio management systems. Risk platforms. CRM notes from management meetings. Compliance records. Trade blotters. Internal messaging threads. Investment committee memos.

Making this knowledge available to AI research systems requires building ingestion pipelines for each source. Handling authentication. Parsing heterogeneous formats. Maintaining freshness. Resolving entity references across systems (is "MSFT" in the trade blotter the same company as "Microsoft Corp" in the research note?). Preserving the access controls that govern who can see what.

This is the work that separates firms that talk about AI-powered research from firms that actually do it. It is not glamorous. It does not demo well. But it is the foundation on which everything else is built. Get the ingestion right, make it easy, make it comprehensive, and the rest follows. Get it wrong, and no amount of model sophistication will compensate.

Where This Goes

The convergence of investment research and alpha generation is not a trend. It is a structural shift. The firms that build the infrastructure to systematically ingest their organizational knowledge, empower their investors to create custom research systems through natural language, and invest in the token economy required for genuine analytical depth will compound their advantage over time. Every new data source ingested, every new workflow created, every iteration on the research logic makes the system more valuable and the alpha more durable.

The firms that try to shortcut this, that assume a chatbot and a few document uploads constitute an investment research platform, will find they have built an expensive summarization tool. Summaries are not research. Research generates alpha.