Why AI Infra
I wanted to work in AI Infra because it’s more fragmented and less winner-takes-all. Even if you fail as a seller, what you learn about GPUs and data centers is still useful to the buyers procuring them.
The world can buy cars from Japan or China, but every country needs to build its own roads. Same with data centers. Software is America-centric, and it’s not yet clear to me who the winner is, apart from the top 3 labs. Same with AI Agents.
Infrastructure also has its nasty downsides. Products tend to lack differentiation, margins are prone to being squeezed by competitors, and the business is very capex heavy: power, real estate and GPUs. Recall what happened to Cisco in the internet boom of the 1990s. While the infrastructure is being built, your sales go up 50% a year. Once it is built, growth doesn’t just stop at 50%; sales actually go down, because on a rate-of-change basis buyers no longer need new infrastructure. In 2000, a lot of companies with estimates of 50% to 70% growth for the next 2-3 years had businesses that were about to collapse. The Nasdaq fell roughly 78% from its March 2000 peak to its October 2002 low.
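To make the rate-of-change point concrete, here is a minimal sketch. It assumes a vendor's yearly sales equal the capacity its customers add that year (no replacement demand), and the installed-base numbers are hypothetical, not real Cisco or GPU data. The installed base keeps growing every year, yet sales growth flips from +50% to a steep decline as soon as the buildout slows.

```python
def annual_sales(installed_base):
    """Sales per year = capacity added that year (assumes no replacement demand)."""
    return [later - earlier for earlier, later in zip(installed_base, installed_base[1:])]


def growth_rates(series):
    """Year-over-year growth of a series, as fractions (0.5 means +50%)."""
    return [(later - earlier) / earlier for earlier, later in zip(series, series[1:])]


# Hypothetical installed base at each year-end, in arbitrary units.
installed_base = [0, 100, 250, 475, 575, 625]

sales = annual_sales(installed_base)   # what the vendor books each year
sales_growth = growth_rates(sales)     # what analysts extrapolate from

for year, (units, growth) in enumerate(zip(sales[1:], sales_growth), start=2):
    print(f"year {year}: sales={units}, growth={growth:+.0%}")
```

The customers never shrink their footprint here; they only add capacity more slowly, and that alone is enough to turn a +50% grower into a shrinking business.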
AI Infra for HFs
In 2023, I thought there were fewer than 10 HF players who were successful with neural networks. My colleagues thought it was still an experiment. Let’s not talk about the moment when a senior guy in trading told me that neural networks don’t work. I was dead wrong.
In 2025, I learnt that there are people who’ve been doing this for decades. The Chief Scientist of OpenAI is an ex-HF guy. Look at all the ICML/NeurIPS sponsors. Compare 2023 vs 2024. The number of quant HF sponsors almost doubled.
There is a firm called “B”. They are discreet. They work with a few high-profile quant HFs whom I prefer not to write about. Not Quadrature or TGS or RenTech, but you can assume they also have compute. Lucky for me, they had an HPC-for-finance guide which I liked, so I cold-called them. They told me that everyone wants more compute, and that their HF clients seem to have been making money this year and last. I also learned from a P. Decrem that HFs aren’t excited about 10k-GPU clusters anymore. They want more, but Nvidia is focusing on sovereign clients. I don’t know what I don’t know, but I suspect most are experimenting while some are making money.
I had a conversation with a neocloud* who told me they think only 40 companies in the world can “make money” from large clusters, and everyone is trying to get them.
Company B is small yet efficient: last year it made 75 million in revenue, paid its 10 employees 6 million in total salaries, and it is fully owned by a 60-year-old guy who I reckon is about to retire. Last year, he paid himself 3 million in dividends. I totally want to build a company like this.
Opportunities – There are two opportunities I see
First: build AI Infra, starting with HFs and then other niche players.
“B”: these guys are middlemen. I assume most quant HFs do not want to go through the hassle, so they ask B, who sources the GPU servers from an OEM like Dell to build the data centers. OEMs tend to charge a ~30% markup vs ODMs, but that reflects the specialization of the servers. As compute clusters grow larger, some players like Meta decide to build their own and source from an ODM like Quanta (who charge only a 1% markup). To do so, you need both the technical expertise for server configuration and the willingness to negotiate per component. Nitty-gritty infra stuff which some rich clients prefer to outsource.
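A rough sketch of why the markup gap matters at cluster scale. I am reading the ~30% and 1% figures as markups over the same component (bill of materials) cost, which is my assumption. The per-server component cost below is a hypothetical placeholder, not a quote. The server count assumes a 10k-GPU cluster built from 8-GPU servers.

```python
def cluster_price(bom_cost_per_server, n_servers, markup_pct):
    """Price paid for the servers, given a vendor markup (in percent) over component cost."""
    return bom_cost_per_server * n_servers * (100 + markup_pct) / 100


# Hypothetical inputs, for illustration only.
BOM_COST_PER_SERVER = 200_000   # assumed component cost of one 8-GPU server
N_SERVERS = 10_000 // 8         # 10k GPUs at 8 GPUs per server -> 1,250 servers

oem_price = cluster_price(BOM_COST_PER_SERVER, N_SERVERS, markup_pct=30)
odm_price = cluster_price(BOM_COST_PER_SERVER, N_SERVERS, markup_pct=1)

print(f"OEM route: {oem_price:,.0f}")
print(f"ODM route: {odm_price:,.0f}")
print(f"Gap:       {oem_price - odm_price:,.0f}")
```

Under these placeholder numbers the gap is what you pay for someone else to handle configuration and per-component negotiation. Whether that is worth it depends on whether you have the in-house expertise.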
My question is: how long does your competitive edge in trading last before your P&L gets competed away, just like systematic strategies in the 1990s? How do you ensure ROI on your models? Assuming your moat shrinks over time, will your edge naturally shift to managing CapEx smartly, just like the AI labs? This is not just about getting a good financing rate; it’s about sourcing, relationships, and technical expertise in networking. And that assumes everything is reliable.
Additional complexity comes from next-generation GPUs needing higher power density and liquid cooling. Existing data centers are not equipped for that. To make it harder, every chip manufacturer and every server assembler has a different manifold or a different way to get liquid to the cold plate.
The argument for neocloud* is that their rich clients want the newest chips and that the cloud is secure enough to make sense. I talked to “B” and he disagrees: HFs prefer to have their own data centers, even with older-generation chips, for security reasons. I suspect some HFs/AI labs are okay with H100s for now because the software stack is more robust. Will it change in 10 years? Someone like XTX is probably stuck with their 2023-2026 version of data centers.
Second: Nearly everything is a rounding error compared to GPU cost
AI in the Cloud, how to keep your models flying high and deliver ROI*
https://calv.info/openai-reflections
The second opportunity is working on a job that solves a new problem that arises with AI. In the case of AI, the underrated problem is managing cost. There are smart engineers building things like context caching or smarter prompt engineering to optimize the cost of LLMs. On the finance side, OpenAI is hiring people with compute/procurement knowledge who also know how to do ML forecasting. The trend is rather clear: you need both some kind of contextual knowledge (finance) and ML.
An interesting excerpt from an OpenAI guy
“We had to forecast out the load capacity requirements as part of the Codex launch, and doing this was the first time I'd really spent benchmarking any GPUs. The gist is that you should actually start from the latency requirements you need (overall latency, # of tokens, time-to-first-token) vs doing bottoms-up analysis on what a GPU can support. Every new model iteration can change the load patterns wildly.”
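Here is how I read the "start from latency" idea, as a minimal sketch and not OpenAI's actual method. You fix the latency target first, then pick the densest benchmarked operating point that still meets it, and only then count GPUs. The `LatencyTarget` class, the function names and every benchmark number below are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class LatencyTarget:
    time_to_first_token_s: float   # max acceptable time-to-first-token
    total_latency_s: float         # max acceptable end-to-end latency
    output_tokens: int             # tokens generated per request


def required_decode_rate(target):
    """Tokens/sec each request must sustain after the first token arrives."""
    return target.output_tokens / (target.total_latency_s - target.time_to_first_token_s)


def gpus_needed(peak_concurrent_requests, target, benchmark):
    """benchmark rows: (concurrent requests per GPU, tokens/sec per request, TTFT in seconds)."""
    rate = required_decode_rate(target)
    feasible = [
        per_gpu
        for per_gpu, tokens_per_s, ttft_s in benchmark
        if tokens_per_s >= rate and ttft_s <= target.time_to_first_token_s
    ]
    if not feasible:
        raise ValueError("no benchmarked operating point meets the latency target")
    # Pack as many requests per GPU as the latency target allows.
    return math.ceil(peak_concurrent_requests / max(feasible))


# Hypothetical measurements for one model on one GPU type at different batch sizes.
benchmark = [(8, 60.0, 0.3), (16, 45.0, 0.5), (32, 25.0, 0.9), (64, 12.0, 1.8)]
target = LatencyTarget(time_to_first_token_s=1.0, total_latency_s=21.0, output_tokens=500)

print(gpus_needed(1000, target, benchmark))
```

The benchmark table has to be re-measured for each model iteration, which is the excerpt's point about load patterns changing wildly.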
TCO Analysis of 10k Cluster
My bare metal chassis analysis of a 10k H100 GPU Cluster
Let me know what you think!