A Reflection of America

Love makes my heart warm

Money makes my heart calm

Humour makes my heart fun

Thinking makes my heart beat

-Ti Soe, 2025

Having achieved most of what I set out to do ten years ago, 2025 was the year I felt fulfilled. Meeting intellectually gifted peers who are great at orthogonal thinking, have drive, and have esoteric interests like Kafka, Chomsky, and Voltaire is a privilege that someone as extroverted as me can only dream of. I’ve always wanted to live an intellectually rich life, and my initial naivety about how macro trading can fulfil that led me here. A lot has changed. The 19-year-old Tiffany, who was inspired by Dalio, Thiel and Soros, understood the reality that governs the trading floor. Technological superiority is more important than intellectual thinking. Macro trading entered its prime in the 1990s, when governments tried to tame currency volatility by soft pegging currencies. With most currencies floating and some artificially held down, the golden days of philosopher-traders are gone.

I’ve always loved deep, intellectual thinking, and over the next year and a half, I’m fully committed to sharpening my mind and following my curiosity wherever it leads. This means a lot of reading. Just as much, I love doing the things I think about, and in that, I’ve always had a bit more courage than most of my peers. I’ve learned that not everything can be explained through pure rationality; some truths demand intuition, artistry, and context. It reminds me of Hofstadter’s “Truth is not always provable; what is provable is not necessarily true”. Courage, I’ve found, is rarer than genius. And strangely, having seen so much death at a young age has pushed me to live with greater intention and urgency.

Yet how many of these are just stories I am fooling myself with? (*insert the context of having been pleasantly intellectually challenged by another crazy math PhD whom I never met*) Having been in California, I’ve seen what too much thinking and too much freedom can result in: another computer science dropout modelling contemporary power structures while holding down an unrelated day job*. Inspiring both awe and disdain in equal measure: praised for originality, dismissed for imitation. And in the end, I can’t help but wonder: how much of it truly matters?

And it’s really funny, coming from me, because for 25 years I’ve been on the other side—where people in Asia told me that ideas don’t matter if there’s a high chance of failure. Not smart enough to beat the kids from developed countries for a scholarship—I won. Not a target school for the trading floor—I got the job through hustling and outperformed. Too unrealistic to chase intellectual ambitions—well, I’m doing it now.

Then why? Because pursuing high-expected-value, low-probability bets with a deterministic optimism brings me closer to my ideal self. And perhaps that’s what much of the Bay Area’s intellectual scene is about—a modern expression of America’s old Calvinistic impulse. A culture built on a sense of calling or vocation, born from the earliest religious refugees who came here seeking freedom to believe, to reason, and to build.

Girard spoke of two kinds of desire: the physical and the metaphysical. The former is a desire for utility; the latter, a desire for identity. The more I reflect, the more cautious I become about myself. I’m not as independent a thinker as I like to believe. I’m optimistic in Hong Kong, cautious in California*. How much of that is genuine reflection—and how much is simply the impulse to rebel? How many of my recent “wins” were coincidences rather than true independent thought? Still, to overcredit luck is to surrender agency.

Even with a framework for making decisions, I’ve come to learn that doing something different from the herd isn’t the same as thinking for oneself. Sometimes contrarianism is just another form of imitation. And so I wonder: how much of what I’ve done was driven not by reason, but by my desire for an identity? My desire to be the kind of person who thinks deeply and profits from it?

*Gray Mirror: A brief explanation of the Cathedral

*Hong Kong Executive: “I’ve seen many economic cycles—AI is another bubble.”
*Californian Founder: “I’ve seen many business cycles—this time is different.”

What I Learned from the Open Compute Project

My two focuses in October were to prepare for OCP and actively learn in the supercomputing club. I always believed that Luck = Opportunity + Preparation, and I try to learn as much as I can online before I join a conference. Most of the time, it leads to interesting conversations and useful opportunities.

I was also really lucky at UCSD - for the first time, an American company loaned our team a $250,000 server equipped with eight AMD GPUs for us to dismantle. That is unheard of in other parts of the world. It was also my first time seeing hardware components like NICs and PCIe up close.

I learned a couple of things from the Open Compute Project, but the most important was that there’s money on the table for a reason. Through talking with business leaders, I learned about the business and legal practices that make it difficult to simplify the supply chain.

Shortage

The surge in AI infrastructure demand has triggered aggressive purchasing by hyperscalers and “neoclouds,” leading to widespread component shortages. Based on my conversations at OCP, hard drives are sold out for two years, driving SSD prices up for everyone else. Generators are reportedly on three-year backorders. Delivering a fully equipped, PUE-efficient data centre on schedule now requires coordination across dozens of suppliers and vendors, each facing its own bottlenecks. Walking the OCP floor, one thing became clear: hyperscalers dominate this market, making up about 80% of vendors’ clientele.

A hyperscaler I spoke with was cautious about the risks, but noted that when they cancelled a data center contract, a competitor quickly filled the gap. Demand surged afterward, and in hindsight, they realised they should have accepted the contract.

Margins

OEM vs ODM. Original Equipment Manufacturers (OEMs) assemble components from different manufacturers in China. They sell completed systems with warranties and software via value-added resellers. Original Design Manufacturers (ODMs) like Quanta manufacture to customer specifications. With the AI infrastructure boom, Oracle and Meta have bypassed OEMs to work directly with ODMs. While OEMs have historically applied a markup of 30%, ODMs operate on thinner margins of 1%. However, ODMs require large contracts, making direct access challenging for smaller businesses.

Catching up in technology is incredibly difficult, but changing a business model is even harder. It seems every OEM has taken a page from Clayton Christensen’s Innovator’s Dilemma by taking the hit in margins to capture market share. Nvidia has also open-sourced the reference architecture for its Blackwell servers, further levelling the playing field. As a result, OEMs are beginning to look more like ODMs. I also learned that both ODMs and OEMs rely on many of the same Chinese component manufacturers for smaller parts, which narrows the differentiation even more.

Some component manufacturers

12V Bus Bar - Amphenol, Bizlink, Interplex, JPC, Lotes
Slide Rail - Fositek, Repon, Yuans Tech, Cheng Fwa, Kingslide
UQD/MQD - Auras, Danfoss, Envicool, Fositek, Foxconn, Lead Wealth, Lotes, Netonx, Nidec, Parker, Readore, Staubli

Lessons

Reflection

Recently, Sequoia talked about two things: Roelof mentioned that there are too many talented people chasing not-so-interesting problems (like it's 1999), and Doug mentioned that most of the money will be made in the application layer, selling directly to the consumer. It’s making me rethink my approach, but I’m dedicated to winning (outcome) and not dying (path dependent). In a world that's changing, the “taste” of your problem matters more than your current skillset. And taste depends on context; context depends on those who hold information.

UCSD Supercomputing Club

Book List

On Becoming a Person	Carl Rogers	It's a bit dry, but I like some of his concepts. Being Open = Becoming Yourself
Probable Impossibilities	Alan Lightman	Very beautiful book, a sense of art wielded with science.
The Plague	Albert Camus	Beautiful book, with deep philosophical ideas
Nicomachean Ethics	Aristotle
When Reason Goes on Holiday	Neven Sesardić
The Scout Mindset	Julia Galef
Profiles of the Future	Arthur C Clarke	Reality is quite far off from the prediction
Medieval Technology and Social Change	Lynn White Jr.	Didn't finish, didn't really like

Guns, Germs, and Steel	Jared Diamond	I find it okay
The Smartest Guys in the Room	Bethany McLean & Peter Elkind	It's a great reminder to not get too caught up in intellectual abstract ideas
The Copernican Revolution	Thomas S. Kuhn	Interesting read - people are scared of technology
The Daily Stoic	Ryan Holiday & Stephen Hanselman	Good reminder. Stoicism is helpful in hard times.
The Death of Ivan Ilyich	Leo Tolstoy	Very Deep. Tolstoy yearned for meaning in life. This is different from Camus.
The Brothers Karamazov	Fyodor Dostoevsky	Best book ever. One of the most defining books I've read
Gilead	Marilynne Robinson	Didn't finish
Feynman’s Rainbow	Leonard Mlodinow	Good teaching for life
Superforecasting	Philip E. Tetlock & Dan Gardner	How to be more rational

The Maniac	Benjamín Labatut	Sometimes too fictionalised, but interesting overall
Gödel, Escher, Bach	Douglas R. Hofstadter	Legendary book, captivating
A Mind at Play	Jimmy Soni & Rob Goodman	Claude Shannon is an abstract thinker
The Singularity Is Near	Ray Kurzweil	Predictions came true
The Great Illusion	Norman Angell	I don't like the writing style, but it's an interesting book written in the 1900s
The Sovereign Individual	William Rees-Mogg & James Dale Davidson	one of my favourites
Only the Paranoid Survive	Andrew S. Grove (Andy Grove)	great management book
Alchemy: The Magic of Original Thinking in a World of Mind-Numbing Conformity	Rory Sutherland	very, very good book for original thinkers
The Beginning of Infinity	David Deutsch	author is optimistic
Novelist as a Vocation	Haruki Murakami	wow, really beautiful
On the Edge: The Art of Risking Everything	Nate Silver	Didn't finish
Talent: How to Identify Energizers, Creatives, and Winners	Tyler Cowen & Daniel Gross	nothing too new
7 Powers: The Foundations of Business Strategy	Hamilton W. Helmer	great framework to think about
Impro: Improvisation and the Theatre	Keith Johnstone	iconic book
The Technological Republic: Hard Power, Soft Belief, and the Future of the West	Alex Karp	okay book
Up to March 2025
Measure What Matters: OKRs: The Simple Idea that Drives 10x Growth	John Doerr	Just okay
Invent and Wander	Walter Isaacson	Gifted by PT. I realized that I have obsessive curiosity. Listen to my intuition
Poor Charlie's Almanack	Stripe Press	Great tips on human psychology. Filled with wisdom
Things Hidden Since the Foundation of the World	Rene Girard	This is the hardest book I've read. But the concepts of "Scapegoating" and "Mimetic Desire" are mind-blowing
Dream Machine	Stripe Press	Inspiring. I'll never take for granted how hard it was to build computers again.
The Elephant in the Brain	Kevin Simler & Robin Hanson	Just okay. Recommended by Naval Ravikant
A Man for All Markets	Ed Thorp	I aspire to be Ed Thorp: curious, healthy, principled, and with A LOT of freedom
Am I being too subtle	Sam Zell	Love this book. Straight talking guy.
Exercised: Why Something We Never Evolved to Do Is Healthy and Rewarding	Daniel Lieberman	Learnt about the human body. More patient on my fitness journey
Don't Sweat the Small Stuff	Richard Carlson	Just okay.
The Power Law	Sebastian Mallaby	Really love this book. Concept fully ingrained in my major decision making
Up Close and All In	John Mack	Interesting history of Wall Street
Chip War	Chris Miller	Really love this book. Curious about eccentric neurodivergent CTOs making big bets like ASML
The Founders	Jimmy Soni	Stories of the eccentric, intense nerds of PayPal.
Between two shores	TL TSIM	Gifted by PT. Learned about the differences of East and Western culture
Ego is the Enemy	Ryan Holiday	Reminds me that Ego is dangerous
The Courage to Be Happy	Ichiro Kishimi	I prefer his former book more
Money Machine	Weijian Shan	Highly recommend! Read this and told myself "I'd never want to work with the gov/politics. Super complex"
Keeping at it	Paul Volcker	Fed Chairman Paul Volcker is our hero! Man of character. If I go through tough times, I read him
Think again	Adam Grant	just okay
Negotiation	Brian Tracy	Just okay
Never Split the Difference	Chris Voss	Love this book! Fave Negotiation book
Do Nothing	Celeste Headlee	Just okay, but a reminder to hyperactive Tiffany to chill.
How to lead	David Rubenstein	I like Jeff Bezos and Jamie Dimon's part.
Born to Run 2	Christopher McDougall	Gifted by PT. Interesting book on running and certainly inspires one to run
Atomic Habits	James Clear	Good reminder of incorporating good habits
Money Games	Weijian Shan	Highly recommend. Another book from Weijian. Met him in person. Love this detailed story on PE
Deep Work	Cal Newport	Attention span of Gen Z is terrible, mine included.
No Filter	Sarah Frier	Felt like I'm being pulled to entrepreneurship after reading this. Story of 0 to 1 of Instagram
No Rules Rules: Netflix	Reed Hastings	I want to work in this sort of culture. Netflix culture interests me, but I am not passionate about the industry
The Untethered Soul: The Journey Beyond Yourself	Michael Singer	Sam Altman recommended. Spiritual fulfillment
Hidden Potential	Adam Grant	Just okay
Option B	Adam Grant & Sheryl Sandberg	Just okay
Range	David Epstein	Just okay
Empire of Pain	Patrick Radden Keefe	Super interesting read on drug abuse in the US.
The Biggest Bluff	Maria Konnikova	I don't like it
Time to breathe	Bill M	Just okay. Bought it when I was stressed. By the time I started reading it, I wasn't stressed at all.
The Geek Way	Andrew McAfee	Felt like he's describing me - obsessive curiosity, creative self-starter who pushes the walls & limits
Ikigai	Héctor García	Just okay
More Money Than God: Hedge Funds and the Making of a New Elite	Sebastian Mallaby	Fun read
Man's Search for Meaning	Viktor Frankl	Inspiring read, it's all about purpose/meaning
How to Win Friends & Influence People	Dale Carnegie	I get it, but won't follow 100% of it. Good for people who want to learn EQ
The Man Who Solved the Market	Gregory Zuckerman	Jim Simons inspires me to pursue the unknown path - he too struggled. Be guided by beauty
Black Swan	Nassim Taleb	Inspires me to pursue "low probability, high EV" stuff
The Psychology of Money	Morgan Housel	Recommend to all young people
Market Wizards	Jack Schwager	Story of Ed Seykota & Stan Druckenmiller is interesting
New Market Wizards	Jack Schwager	Recommend to those doing markets
Hedge Fund Market Wizards	Jack Schwager	^
The Essays of Warren Buffett		I'm not into value investing
Four Thousand Weeks	Oliver Burkeman	You have a short life! Recommended to people in limbo
Hillbilly Elegy	J.D. Vance	Cried in the first chapter. Didn't know about that extreme, sadistic part of the US.
Flash Boys	Michael Lewis	Just okay
The Undoing Project	Michael Lewis	Interesting!
Thinking, Fast and Slow	Daniel Kahneman	Complex read for 20-year-old Tiff
Boomerang	Michael Lewis	Crazy cheap financing during the 2000s. "Madness in crowds is more common than in individuals"
What it takes	Steve Schwarzman	Inspiring entrepreneur
Shoe Dog	Phil Knight	wow, his business almost died multiple times
Total Recall	Arnold Schwarzenegger	Didn't finish it, not really my style and too long, haha!
How Will You Measure Your Life	Clayton Christensen	Tiffy, it's all about fulfillment and principles!
The Power of Now	Eckhart Tolle	Great reminder to focus on the present
Think and Grow Rich	Napoleon Hill	Just okay.
Can't Hurt Me	David Goggins	it's all about mindset
Grit	Angela Duckworth	just okay, good self reminder
The Unwinding of the Miracle	Julie Yip-Williams	underrated inspiring memoir. she had a crazy life. LIVE YOUR LIFE TO THE FULLEST
When Breath Becomes Air	Paul Kalanithi	Life is short, heartfelt book
What I Talk About When I Talk About Running	Haruki Murakami	Love this book - it's never too late to start running
Good to Great	Jim Collins	From PT List. Talent is all about 1 + 1 = 3
The Monk Who Sold His Ferrari	Robin Sharma	From PT List. Do the things that you fear most
Start with Why	Simon Sinek	Inspire others!
Never Eat Alone	Keith Ferrazzi	Great book to learn networking
The Courage to Be Disliked	Fumitake Koga	Great book to read at 19

When Progress Looks like Decline

Lessons from the Past
I’m interested in making AI infrastructure cheaper and more reliable, and I often think about how AI is similar to the Industrial Revolution. Electricity is a good comparison. It was a technology that changed everything, across every industry. Electricity improved manufacturing and daily life, but many of its benefits didn't show up in productivity measures. Some gains are hidden, and I think it's important to note that the same will be true of AI.

A Brief History
Before electricity, factories relied on centralised steam or water power. A single central power source drove the machines through belts and pulleys snaking across the floor, causing breakdowns and safety risks. Factories had to be located near water or coal. Electrification changed this. Instead of building a whole new factory, which required a large investment, a business could just expand its existing one. Growth shifted from large & discrete to small & incremental. Still, the adoption of electric motors took decades. In 1900, almost no factories used electric motors. By 1910, 20% did. By 1920, half. By 1930, 80% had switched (Source: Byrne Hobart).

When Progress Looks Like Decline
Even then, some benefits weren’t obvious in the data. When factories switched from gas lamps to electric light, measured GDP went down because light got cheaper. GDP only counts money spent, not improvements in quality. Faster trains, fewer breakdowns, and night shifts made factories more productive, but those gains didn’t show up as “productivity growth.” Labour hours and train miles were counted the same, however much more each one delivered. The real benefits only became clear decades later, once the economy had had time to reorganise around the new technology.
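
A toy calculation makes the measurement gap concrete. This is only a sketch: the prices and quantities below are made-up numbers chosen for illustration, not historical data. It shows how spending on light can fall even though far more light is being used.

```python
# Hypothetical numbers for illustration only; not historical data.
# A spending-based measure records price * quantity, nothing else.

def measured_spending(price_cents_per_unit: int, units_of_light: int) -> int:
    """Spending on light, in cents, as a spending-based measure would record it."""
    return price_cents_per_unit * units_of_light


# Gas lamps: light is expensive, so little of it is used.
gas_spending = measured_spending(price_cents_per_unit=10, units_of_light=100)

# Electric light: assume the price falls 10x and usage triples.
electric_spending = measured_spending(price_cents_per_unit=1, units_of_light=300)

print(f"gas lamps:      {gas_spending} cents for 100 units of light")
print(f"electric light: {electric_spending} cents for 300 units of light")
```

In this toy case the factory gets three times the light while recorded spending drops by 70%, which is the sense in which progress can look like decline.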

How Growth Changed the Market
Capital markets changed, too. Before electrification, growth required big investments, so companies issued bonds with predictable dividends. With a high dividend, you would not expect growth... Stocks were more uncertain. After electrification, growth became incremental. Companies could just reinvest profits to expand, so stocks became more lucrative.

What about AI
The narrative around AI shifts between singularity and doomsday, but I suspect the future will be in the middle. AI will probably make inequality worse — not just between countries, but within them. Globalisation narrowed the gap. AI may widen it. With falling birth rates and rising costs, that worries me. Angry young people with nothing to lose tend to start revolutions. But in the long run, I’m optimistic. AI will give individuals and small companies more leverage. It will make organisations leaner and create new kinds of work, especially creative work. Life will probably get more volatile, but also more rewarding, at least for those who can adapt.

Good for capitalists.

Good for consumers.

Harder for workers.

The Broader Intellectual Question

Back when I was an associate on the trading floor, I often heard older colleagues say, “It’s much harder for young people these days.” And there’s some truth to that. Many of my peers were pushed to compete from an early age—only to exhale with relief (and mild burnout) once they finally landed a job.

That experience isn’t universal, though. It didn’t apply to me. But in places like Hong Kong, the pressure to compete is real—so real that one of my pregnant ex-colleagues had to sign up her unborn child for a nursery.

I’ve always had a quiet scepticism toward competition. I was pretty influenced by Rene Girard's mimetic theory, and pursuing original ideas has worked well for me. Sure, competition makes you better at whatever you're competing in. But it often stifles creativity and independent thought. It works well for those on the right path, but not for those pursuing careers or relationships that do not fulfil them. Education is not a substitute for thinking hard about what you want.

If AI truly democratizes knowledge, will it amplify the value of the smartest people? Or will it reward the average person with more agency? And as AI accelerates technological cycles - driving nations, companies, and people to even more competition - will we ever pause to question the underlying assumptions we’ve inherited? Or will we be too busy optimising, scaling, and competing to notice we’ve walked straight into a high-tech version of the Malthusian trap?

Will AI lead to higher GDPs, more consumer choices, and faster innovation—yet leave the average person feeling more anxious and less fulfilled? In other words, what happens when progress, by every economic metric, starts to feel like decline in lived experience?

Risks and Realities of the Supply Chain

Markups

As I get deeper into the AI infrastructure industry, I realise how complex the supply chain is. The current supply chain for trading firms is Hedge fund -> Value Added Reseller + Data Centre -> OEM -> ODM. At each stage, there is roughly a 30% markup. The value-added reseller (VAR) buys servers and manages the entire project. Its value lies in knowing the right hardware to fully utilise the client's software model. Recently, I came across a VAR's latest AI server list, published in late 2024. I was curious why they were only offering older models like the R760xa from 2021, along with several servers dating back to 2019. None was liquid-cooled, and there were no signs of the latest XE models from Dell. My hypotheses are:

1. OEMs like Dell no longer prioritise VARs, since hyperscalers have created huge AI server backlogs.
2. The VAR lacks the networking and integration expertise needed for the newest servers.
3. With little competition, the VAR can get away with charging high margins despite weaker offerings.
4. Many clients care more about cost than cutting-edge hardware, and competitive pressure hasn’t yet squeezed trading profits.

I started to think - can I source from ODMs directly and deliver the same quality of product? So I decided to reach out to ODMs in Taiwan.
They replied initially, then stopped: they only entertain large customer orders.
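
To see why going straight to an ODM is tempting, it helps to compound the roughly 30% markup mentioned above. The sketch below is only illustrative: the number of stages that actually add a markup is an assumption (I try one to three), and the base cost of 100 is a placeholder rather than a real quote.

```python
def compounded_price(base_cost: float, markup: float, stages: int) -> float:
    """Price after the same fractional markup is applied at each stage."""
    return base_cost * (1.0 + markup) ** stages


def total_markup(markup: float, stages: int) -> float:
    """Overall markup over the base cost, as a fraction (0.69 means +69%)."""
    return (1.0 + markup) ** stages - 1.0


# Assumption: every stage adds about 30%; base cost 100 is a placeholder.
for stages in (1, 2, 3):
    price = compounded_price(100.0, 0.30, stages)
    extra = total_markup(0.30, stages)
    print(f"{stages} stage(s): price {price:.1f}, total markup {extra:.0%}")
```

Under these assumptions, three stacked markups more than double the price the buyer pays relative to the base cost, which is the gap I was hoping to close.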

Risks and Realities of AI Clusters

I spoke with a leader managing 3,000 Nvidia GPUs. They were trying to build a data centre locally, the first of its kind. The current process is inefficient - the entire contract is awarded to a local telecom company, which must meet strict criteria or face penalties. The telecom then acts as a project manager, coordinating with about 20 different vendors. Very little is handled in-house. I suspect this lack of expertise translates into operational inefficiency. I’m especially sceptical about their ability to network all these clusters effectively.

They then sell these GPU clusters to clients, mainly university research departments. They keep spare nodes but still face daily customer complaints. Building perfectly reliable hardware is very hard, yet clients tolerate it because there’s no better alternative. Meanwhile, China continues to struggle with procuring H800s needed for training.

Their biggest concern is cybersecurity. As she put it, “It only takes one disaster to ruin trust.” The data centre’s location is top secret and is part of the government’s critical infrastructure law.

The Hardware You Can't See

I’ve never seen an AI server in person, but I’m excited for the chance to. This year, I was lucky enough to be sponsored to attend the OCP Global Summit and join NeurIPS in San Diego.

I’m also really curious about how open source might help reduce costs over time. I remember how open-sourcing Android lowered phone manufacturing costs, and I’m hoping to see something similar happen here—though it’ll definitely take time. And for Meta, the benefit seems straightforward: lower costs for them as well. (Update, Nov 2025: I’m more skeptical about open-source hardware now.)

Building AI clusters takes more than GPUs. Networking and bandwidth are key to keeping them fast. Meta's open-source system consists of an isolated high-bandwidth network that connects all their GPUs and domain-specific accelerators. Bandwidth is expected to grow by 5x to 10x by 2030. Hopper GPUs had 900 GB/s NVLink; Blackwell is twice that. Each new GPU generation doubles compute throughput or more, which forces interconnect bandwidth to keep up so the chips aren't left waiting for data.
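
A rough back-of-the-envelope sketch of what the NVLink figures above mean in practice. It is an idealised model: it ignores latency, protocol overhead and contention, and the 90 GB payload is a hypothetical size picked only to make the arithmetic easy.

```python
HOPPER_NVLINK_GB_S = 900.0                      # figure quoted above
BLACKWELL_NVLINK_GB_S = 2 * HOPPER_NVLINK_GB_S  # "twice that"


def transfer_seconds(payload_gb: float, bandwidth_gb_s: float) -> float:
    """Ideal time to move a payload over a link at full bandwidth."""
    if bandwidth_gb_s <= 0:
        raise ValueError("bandwidth must be positive")
    return payload_gb / bandwidth_gb_s


payload_gb = 90.0  # hypothetical payload, e.g. a large batch of activations
for name, bandwidth in (("Hopper", HOPPER_NVLINK_GB_S),
                        ("Blackwell", BLACKWELL_NVLINK_GB_S)):
    millis = transfer_seconds(payload_gb, bandwidth) * 1000
    print(f"{name}: {millis:.0f} ms to move {payload_gb:.0f} GB")
```

If compute per chip doubles while the link stays the same, the transfer time becomes a larger share of each step, which is why the interconnect has to scale alongside the GPUs.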

To support this, AI labs need a high-performance, multi-tier, non-blocking network fabric that can manage traffic smoothly. Meta decided to open-source Catalina, a high-power rack capable of supporting up to 140kW. The other reference you can typically find on the internet is Bianca, but it's a fully integrated system from Nvidia.

Source: Meta Catalina Specification via OCP

Components within Meta's compute tray

1. GB200 High Performance Module (N) - Modular component that contains the CPU and GPU
2. Host Management Controller (N) - Control panel that monitors all parts, checks temperature, etc.
3. ConnectX-7 (N) - Network interface card that talks to your GPU and the data center fabric
4. Power Distribution Board (M) - Receives bulk power and distributes it to everything else
5. Data Center Secure Control Module (M) - Low-speed chips are moved to the DC-SCM, making the HPM less dense and cheaper. Allows upgrades without changing the whole HPM
6. E1.S NVMe backplane (M) - Interconnect board connecting SSDs to the main system
7. OSFP Carrier Board (M) - Acts as an interface between compute trays and the network fabric. Best for thermals and maintainability.
8. Front IO Board (M) - Interface board to keep the motherboard cleaner
9. CX7 OCP NIC 3.0 (Commodity) - Separate board for the CX7 NIC to allow easier upgrades and simplify cooling and signal integrity

N = Nvidia-designed
M = Meta-designed

Thermal Solution
Catalina uses a combination of liquid and fan cooling, with a suspected code name of "Channel Island". It uses a PG25-based liquid at a temperature of 10-12 degrees Celsius. There are thermal sensors within the baseboard management controller (BMC) that tolerate ±2°C. Channel Island can also detect leaks via sensors, contain leaks via mechanical design, and respond to leaks by shutting down power or turning off the supply once a leak is detected.

Within the compute tray, there are eight fans. These fans provide air cooling for the E1.S drives, front end CX7 OCP NICs, the DC-SCM, and the power conversion circuitry on the PDB. The thermal design is resilient and can continue to operate with a single fan rotor failure. The cold plate loop is used for liquid cooling on the high-powered components - like the GB200 and CX7 backend NIC modules.

Electricity
From reading SemiAnalysis, I was intrigued to learn that power is cheaper in some US states than in other parts of the world. Notably, costs are a third of Singapore's! Power demand is accelerating, and 50MW+ per facility will no longer be enough. Legacy data centres will no longer be relevant.
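
To get a feel for why 50MW stops being enough, here is a rough sketch that counts how many racks a facility can power. The 140kW rack is the Catalina figure and 1 MW is the 2030 rack projection quoted elsewhere in these notes; the PUE of 1.2 is a hypothetical overhead I picked for illustration, and the model ignores redundancy and everything other than power.

```python
import math


def racks_supported(facility_mw: float, rack_kw: float, pue: float = 1.0) -> int:
    """Whole racks a facility can power.

    pue is total facility power divided by IT power, so 1.0 means no overhead.
    """
    if rack_kw <= 0 or pue < 1.0:
        raise ValueError("rack_kw must be positive and pue at least 1.0")
    it_power_kw = facility_mw * 1000.0 / pue
    return math.floor(it_power_kw / rack_kw)


for rack_kw in (140.0, 1000.0):
    ideal = racks_supported(50.0, rack_kw)
    with_overhead = racks_supported(50.0, rack_kw, pue=1.2)  # hypothetical PUE
    print(f"{rack_kw:.0f} kW racks in 50 MW: {ideal} ideal, {with_overhead} at PUE 1.2")
```

At 140kW per rack a 50MW site holds a few hundred racks; at 1 MW per rack it holds only a few dozen, so the same building supports far fewer racks as density rises.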

There are other components beyond electricity and thermal, which are open-sourced by Meta. Currently, they utilise Quanta, a Taiwanese supplier, to build their bare metal.

Scaling hardware isn't straightforward

When I was working on deep learning at my previous job, I wasn't expecting to be interested in hardware. However, AI is different because the cost and quality of the infra make a significant impact on the model. So I did what I always did when I became curious: I went down the rabbit hole.

Introduction

The increase in machine learning usage - from natural language processing and graph neural networks to Monte Carlo tree search (AlphaGo) - has driven a massive surge in demand for computational power. Models continue to double in size, and racks are on track to reach 1 MW by 2030.

There are several problems arising from this. First, adding extra GPUs doesn't always help - total system performance scales sub-linearly with the number of GPUs, and adding more compute nodes tends to reduce overall system efficiency. Second, migrating hardware between different sub-domains is tough. Financial trading models and autonomous driving have diverse priorities in terms of latency, memory usage, and throughput. AI models are unlikely to converge across sub-domains, and existing algorithms continue to evolve. Third, an average server requires ~50 subcomponents, and as Jensen introduces new GPU models, parts of the supply chain get reworked.

As a result, scaling hardware blindly leads to diminishing returns — and without careful hardware-software co-design, most of that extra compute goes to waste.
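
One simple way to picture the sub-linear scaling described above is Amdahl's law. This is a textbook simplification I am borrowing for illustration, not a model taken from the sources for this section, and the 95% parallel fraction is a hypothetical value.

```python
def amdahl_speedup(n_gpus: int, parallel_fraction: float) -> float:
    """Amdahl's law: speedup when only part of the work can be parallelised."""
    if n_gpus < 1 or not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("need n_gpus >= 1 and a parallel fraction in [0, 1]")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_gpus)


def scaling_efficiency(n_gpus: int, parallel_fraction: float) -> float:
    """Speedup per GPU; 1.0 would be perfect linear scaling."""
    return amdahl_speedup(n_gpus, parallel_fraction) / n_gpus


# Hypothetical workload where 95% of the work parallelises cleanly.
for n in (1, 8, 64, 512):
    speedup = amdahl_speedup(n, 0.95)
    efficiency = scaling_efficiency(n, 0.95)
    print(f"{n:>3} GPUs: speedup {speedup:5.2f}x, efficiency {efficiency:.0%}")
```

Even with only 5% of the work stuck in serial or communication-bound steps, efficiency falls quickly as GPUs are added, which is the case for co-designing hardware and software rather than just buying more compute.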

Physical System Design*

1. Chips

Modern algorithms need lots of cores and fast memory. The easiest way to scale is with chiplets. Instead of building one big chip, you build smaller ones and connect them. It’s cheaper, more reliable, and scales better.

New players like Groq are rapidly entering the inference space, aiming to carve out market share. China is trying to move to local chips like Huawei's. To keep up, you need a modular system: something that can take new chips without fully starting over. Standards like Open Accelerator Infrastructure and Universal Baseboard make that possible.

2. Chassis - Tray

A chassis combines key components - such as processors, storage drives and memory - into a compute, storage or memory node. Traditionally, 19-inch racks have been the industry standard. 21-inch racks, supported by OCP, are becoming increasingly popular due to bigger AI workload needs.

3. Server - Compute*

The GB200 NVL72 racks consist of 18 compute trays and 9 NVSwitch trays. Meta has made their Catalina NVL72 system open-source. Catalina is Meta's next-generation AI/ML rack that supports large cluster training and inference use cases. The design focuses on achieving a fast time to market, alignment with industry references, and providing cutting-edge performance.

4. Pod Density - Compute Nodes

A pod is a series of compute nodes that work together as if they were a single computer. Even though the job is split across multiple physical machines, the software sees it as one machine. The pods are connected with a low-latency interconnect like NVSwitch. In the case of Meta's NVL72, each tray contains 2 CPUs and 4 GPUs, and there are 18 trays in a rack. Two racks are then connected to fit 72 accelerators per pod. Pod density is expected to increase in the future, but due to current power and liquid-cooling constraints, most data centres cannot support the rack density of NVL72 in one rack.
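
The tray arithmetic above can be sketched in a few lines. The counts (18 compute trays, 2 CPUs and 4 GPUs per tray) are the ones quoted in the text; the `TrayLayout` class and its field names are hypothetical, and the sketch counts trays rather than racks because how the trays are spread over one or two physical racks depends on the deployment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrayLayout:
    """Hypothetical helper: totals for a group of identical compute trays."""
    compute_trays: int
    cpus_per_tray: int
    gpus_per_tray: int

    @property
    def gpus(self) -> int:
        return self.compute_trays * self.gpus_per_tray

    @property
    def cpus(self) -> int:
        return self.compute_trays * self.cpus_per_tray


# Tray counts as quoted in the text above.
nvl72 = TrayLayout(compute_trays=18, cpus_per_tray=2, gpus_per_tray=4)
print(f"GPUs: {nvl72.gpus}, CPUs: {nvl72.cpus}")
```

With these inputs the 18 trays add up to the 72 accelerators that give NVL72 its name, alongside 36 CPUs.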

OCP also talked about a future where there can be more than two accelerators per high-performance module, and depending on advancements in fabric technology, that number could be 576 per rack!

Different types of clients also require different ratios of GPUs to other components, depending on the primary purpose of the hardware. SemiAnalysis wrote a great article on how to reduce bare metal cost by omitting less important units.

5. Networking

AI workloads often involve a huge amount of data moving between CPUs, GPUs, memory, storage and sometimes even across data centres. If all these transfers happened one after another, the network would become the bottleneck. Parallelism in networking means designing the network so that multiple data flows - CPU to GPU, GPU to memory, node to node, cluster to cluster - can run at the same time. This involves interconnects like NVLink and InfiniBand, as well as special architectures. As my friend Reynold says, "Networking is a whole set of different complex problems."

Different networks that co-exist for different purposes

Frontend Networking (normal Ethernet): Connects servers to the outside world

Backend Networking (InfiniBand/RoCE Ethernet or other high-performance fabrics): Connects all nodes and servers, moving a huge amount of data between GPUs in different servers with minimal delay

Scale-up Accelerator Interconnect (NVLink): Connects GPUs within a server so they can share memory and exchange data faster than over PCIe

Out-of-Band Networking for Manageability: If the other networks are overloaded or down, admins can still access the system via this network

*Source: Open Compute Project, SemiAnalysis

Reflections 2025

What do I want

I’ve always liked ideas and people. I like to surround myself with smart kids who think deeply. I particularly like creative, courageous, contrarian geeks. My main aim is to meet those people, and there are some industries that attract them – quant HFs and tech startups. I want my future job to involve working with or serving them. I want to be in the US/UK and then maybe Asia after. I want to work on building new things.

I realised that I have a strong appetite for taking risk. I also learned that I love multiple domains and can come up with original ideas. I love adrenaline, and I love being on the edge. A great life to me would be work that is like skiing and holidays that are like an onsen. I get obsessively curious at times, but there are very few ideas that can get my blood boiling for months. AI is one of them. Macro is another. Once it happens, I can’t help but be relentlessly resourceful. The only thing that would stop me is failing at it. So far, I’ve only failed once.

It is hard

Frankly, I care about the AI stuff, working with smart people and being competent. I always believe in choosing the right boat first and just saying yes. I am also sure I don’t want to do sales, because it’s time to *thrive* in a new skill. I think it suits my generalist, polymathic nature. I’m always interested in a broad range of things – René Girard, history, geopolitics, maths, philosophy, science – and hence my obsession with macro. But being competent is important, and you must deliver value. Profit is an important discipline for new ideas. I see myself in a BD/Finance role with ML skills, not a pure ML role.

I think the age of AI is great for people with high agency and polymaths. Not going to lie, the job market is intense. I’m 100% confident that this is the right decision in the long term for me, but oh boy, I am about to stomach a lot of pain. I literally thought about entering a swap contract with the people I love.

T: Hey, can I sleep on your sofa (or in your extra room) for one month if everything fails in the next 3 years? On the other hand, if I have X amount after 35 years old and no debt, how about a 1k USD cash gift or hotel voucher as an appreciation <3 ?

Frankly, I thought it was a great idea, because I’m motivated to give back to my mentors and friends (you know who you are), and they always see the upside in me. At the same time, protecting my liquidity and downside is important in this unpredictable, volatile environment.

Entry-level jobs have already vanished. We can talk for 3 hours about what this means for the future - more dissent, less peace, increasingly winner-takes-all. I am determined to win. I am also vulnerable, nervous and excited.

Principles

Inspired by Ray Dalio's book, which led to my pursuit of a macro career. I'm not the best writer, but I love ideas, and I love being competent. I think profit is an important discipline to pursue more ideas. I love thinking, and my main goal is to meet interesting, deep thinkers and make a bit of money along the way.

1. Momentum is everything - both negative and positive
When I was 10, life changed dramatically. Life became 20% worse, and then two years later it became 80% worse. When you are poor, you become stressed, which affects your relationships and makes you lonely; you fight more, and things tend to spiral downward. Failure can be very demoralising.

Success, on the other hand, is very motivating. You get a good job, you are competent, you get told you are talented, you feel more motivated to do a better job, you go home happy, you smile more, and you have better relationships with others. You can't give anything to others when your cup is empty, so it's important to take care of yourself.

The challenge is stopping the momentum when you are failing and recognising that positive momentum doesn't last forever. The key for me has been stoic philosophy at times of pain, and gratitude during the good times.

2. Use your past to your advantage

My childhood has moulded me to be both optimistic and paranoid at the same time. I firmly believe in Andy Grove's line, "Only the Paranoid Survive". I do not think paranoia equates to unhappiness, nor that optimism equates to happiness. I am very confident I can go through most kinds of adversity, but I know it is going to be painful, so I actively try to think hard about the future and take risks while I am young. I'm rather optimistic and open with people, so I like reaching out to strangers whom I admire. This means I'm naturally good at sales and at being relentlessly resourceful. It could also mean I fail to recognise red flags in others, but because I am very detailed about numbers, I rarely fall for financial scams.

Knowing this weakness, I save money during good times. I recently quit my job because I want to work in AI, and I knew the fear of negative personal cash flow would force me to act urgently. You can use both your strengths and weaknesses to your advantage.

3. Don't listen to others' advice on burnout
Burnout occurs when an individual experiences excessive emotional adversity in life that they cannot overcome. But if you do not feel emotion in that particular activity, then you may have found your biggest edge.

4. Have the courage to be original
I do not get embarrassed easily. I find it fun to reach out to people in interesting ways. I find it fun that they find me amusing, and I brush it off when they reject me.

I started with cold emails and LinkedIn, but found the response rate unsatisfactory. I decided to be different, so I cold-called CEOs and seniors. I found their numbers through LLMs or BBG. Seniors would pick up; some would kindly reject me. I've cold-called 5 CEOs, but frankly, I wasn't able to get past their assistants.

So I decided to send a bunch of them FedEx mail. I figured the fancy mail would spark their curiosity. The problem with this is that you can't track whether they have seen the mail or not.

There was a time when I would record videos of myself on Loom. I did this for ~10 companies, and most of them replied! I sometimes sent the video to their generic email address (info@company.com) if I couldn't find the individual's email. You can track whether someone has viewed it or not. Once, I printed a shirt, filmed myself and sent the video to every single member of a company's board. Both quickly replied within hours, and we learned some stuff from the conversation. They loved the creativity, but I was rejected for visa reasons.

Despite multiple rejections, I realised I can do it for months because I experience very little mental pain from the act of hustling. My biggest constraint is my finances, and I don't like burning money on my FedEx mail and t-shirts. I also don't like paying a premium for Loom.

Frankly speaking, these are all the things I did when I was younger. I am not sure if I would do it now. You get a lot of bandwidth for failure when you are a young person.

5. The world doesn't care about talent; they care about success.

The very first non-professor adult I met in Hong Kong was a chief executive who came to school to give a speech. I happened to be the only non-Chinese student with the best grades, so I got a chance to make a speech afterwards. I shared my life story; he came forward, gave me his name card and proactively told me to reach out to him. Over the course of 7+ years, we would meet regularly. I thought this was the norm in Hong Kong, that EVERYONE wants to talk to a student.

Turns out that man was unique. I was convinced that he was once a mildly eccentric, obsessively curious, well-read independent thinker. He had managed to assimilate into society, but he firmly believed that I was talented. 7 years later, after being exposed to a myriad of talents, I realised he saw a bit of himself in me. He told me to read more and that school wasn't that important. I read a lot of books during my time at university, and it was the best decision I've ever made.

Despite getting 4A* and winning Top in the World in AS Mathematics, I was quite shocked when multiple people convinced me I couldn't go to work in macro because I didn't go to a target school (I wasn't smart enough). My heart dropped when a senior trading guy told me that I should quit school and take out a loan to attend a better school. I didn't have the financial means. I didn't believe in loans.

Through a combination of hustle and luck, I got a fun job in macro, and then the opposite happened - most of the people at work called me special and talented. I was confused, but reality is probably in the middle. I wasn't as bad as people said at uni, nor as smart as people said at work. Do not let yourself be fooled by human mimesis.

The reality is that no one (except for a few like that guy) cares about your talent; they only care once you achieve success. If you are talented but not successful, don't worry. Focus on working hard and getting what you want. Results matter. If you are less talented but feel imposter syndrome, why worry? The world is filled with unsuccessful people with talent and no opportunity.

6. Understand your strengths and blind spots

Have high convictions about your unique strengths, but be humble about your blind spots. I am always curious about my limits and trust my instincts when it comes to doing what I want. I love adventures, am not a perfectionist, and tend to act with a strong sense of urgency. When it comes to my weakness, I tend to listen to others and control that inner voice in my head that tells me to "do it". I've learnt that strength and weakness are usually two sides of the same coin.

7. Find your own game when it comes to luck

Coming back to the cold mailing story, I find it more fun to do something different, but I've seen admirable friends who do the "same thing" and win by sheer willpower. The game of luck is different for each individual. Math has inspired one of my principles:
A fair six-sided die with faces numbered 1 to 6 has an expected value of 3.5 on every single throw. You can rate your life from 1 to 6.
How do you optimise for a 6/6?

Most people struggle to go after a 6/6 life when they already have a 4/6. Most will choose to settle and stay in their comfort zone, because rationally speaking, it is above expected value. However, in a world that is changing, the biggest risk is not taking risks at all. It's a personal decision, but having gone through ups and downs, I find living at rock bottom (or below expected value) stressful financially, but clear mentally. There is only one direction to strive for - up.

Assuming you work hard to develop mental resilience, you will find yourself having multiple chances to roll the dice. I like to leverage my biggest strength and roll the dice 100 times before eventually getting what I want. Life will be worse first, before it gets better, but leveraging your unique edge means it's never too bad! You will find the journey fulfilling, and you will gain a stronger belief in yourself.
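The dice principle can be sketched in a few lines, as a toy model only: it assumes every attempt is an independent throw of a fair six-sided die, which real life is not. The function names are mine and purely illustrative.

```python
from fractions import Fraction


def expected_value(sides: int = 6) -> Fraction:
    """Expected value of one throw of a fair die with faces 1..sides."""
    return Fraction(sum(range(1, sides + 1)), sides)


def prob_at_least_one_top(rolls: int, sides: int = 6) -> float:
    """Chance of hitting the top face at least once in `rolls` independent throws."""
    return 1 - (1 - 1 / sides) ** rolls


print(float(expected_value()))  # 3.5, the expected value quoted above

# How the odds of seeing at least one 6 change with the number of throws
# (toy assumption: each throw is independent and the die is fair).
for n in (1, 4, 10, 100):
    print(n, round(prob_at_least_one_top(n), 4))
```

A single throw gives a 1-in-6 chance of a 6, four throws already give better than even odds, and by 100 throws a 6 is close to certain. That is the arithmetic behind caring more about the number of rolls than about any single outcome.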

My mentality in life has always been to "roll the dice", but it's important to understand which phase of life you are currently in, hedge your biggest risks and make sure it doesn't kill you. For me, the fun is in the journey. I realise I enjoyed rolling the dice more than achieving the outcome.

I've seen instances where my smart friends took the more patient, less risky, slower method of "engineering the dice" to obtain a 6/6 life without going below EV. In the world of hedge funds, there's a story that Soros gets into a mess and knows when to get out, while Stan Druckenmiller never gets into the mess in the first place.

8. Aim to predict the future by understanding history

The Sovereign Individual is an unusually ambitious, thought-provoking book written in the 1990s. The authors try to predict the future by understanding how technology results in shifts in "megapolitics". It ties in nicely with Girard's concept of mimesis: people imitate each other's desires, and that mimesis drives cultural change.

I've always had the view that every generation is born into a particular culture at a particular time. Millennials complained about not being able to afford housing in their 20s. Gen Z never thought about it in the first place. As we enter a period of rapid change, it's essential to recognise that the methods of attaining wealth and a good life from previous generations may not apply to our own. Meanwhile, there are universal principles of human desire, greed and fear that remain constant.

Learn to think hard about the future.

9. Learn Emotional Intelligence from Abe Lincoln

Smart people often have intense, extreme personalities. If you optimise your life to work with great people and great problems, you will need to learn emotional intelligence. This means understanding your own emotions, having the ability not to act on them, and developing empathy for others. During the Civil War, Abraham Lincoln faced the immense pressure of leading a divided nation and managing conflict in his team. His capacity to regulate his emotions and maintain clarity and purpose in the most difficult circumstances is deeply inspiring to me.

10. Spend time on things that give you energy - be it friends or work

To live a fulfilling life, focus on activities and people that boost your energy. For me, creative work that challenges my mind is energising. Thoughtful conversations spark new ideas and keep me engaged. Humour is a key part of my life, bringing joy and lightness. I choose to spend time with brilliant friends who sharpen my thinking and share positive energy, creating a sense of mutual inspiration.

Opportunities in AI Infrastructure

Why AI Infra

I wanted to work in AI Infra because it’s more fragmented and less winner-takes-all. Even if you fail as a seller, the knowledge of GPUs and data centers will be useful to buyers procuring them.

The world can buy cars from Japan or China, but every country needs to build its own roads. It’s the same with data centers. Software is America-centric, and it’s not clear to me yet who the winner is, except the top 3 labs. Same with AI Agents.

Infrastructure also has its nasty downsides. Products tend to lack differentiation, the business is prone to margin squeezes from competitors, and it is very capex-heavy – power, real estate and GPUs. Recall what happened to Cisco in the internet boom of the 1990s. When you are building infrastructure, your sales go up 50% a year, but once it is built, sales don’t just stop growing at 50%; they fall, because on a rate-of-change basis the new infrastructure is no longer needed. In 2000, a lot of companies with growth estimates of +50% to +70% for the next 2-3 years had businesses that were about to collapse. The Nasdaq Composite fell roughly 78% from its March 2000 peak to its October 2002 trough.
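The rate-of-change point can be made concrete with a toy model. It assumes a vendor's annual sales equal the capacity its customers add that year; the installed-base numbers are hypothetical and are not Cisco's actual figures.

```python
def vendor_sales(installed_base: list[float]) -> list[float]:
    """Annual vendor sales, modelled as the capacity added each year."""
    return [after - before for before, after in zip(installed_base, installed_base[1:])]


def yoy_growth(series: list[float]) -> list[float]:
    """Year-over-year growth of a series, as fractions (0.5 means +50%)."""
    return [(after - before) / before for before, after in zip(series, series[1:])]


# Hypothetical installed base: +50% a year during the build-out, then a plateau.
installed = [100.0, 150.0, 225.0, 337.5, 360.0, 365.0]

sales = vendor_sales(installed)  # capacity sold each year
growth = yoy_growth(sales)       # growth rate of those sales

print(sales)
print([round(g, 2) for g in growth])
```

In this sketch the installed base never shrinks, yet vendor sales drop by 80% in the first year after the build-out slows. The customers still own the infrastructure; they have just stopped buying more of it.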

AI Infra for HFs

In 2023, I thought there were fewer than 10 HF players succeeding with neural networks. My colleagues thought it was still an experiment. Let’s not talk about the moment when a senior guy in trading told me that neural networks don’t work. I was dead wrong.

In 2025, I learnt that there are people who’ve been doing this for decades. The Chief Scientist of OpenAI is an ex-HF guy. Look at all the ICML/NeurIPS sponsors and compare 2023 with 2024: the number of quant HF sponsors almost doubled.

There is a firm called “B”. They are discreet. They work with a few high-profile quant HFs whom I prefer not to write about. Not Quadrature or TGS or RenTech, but you can assume they also had compute. Lucky for me, they had an HPC-for-finance guide which I liked, so I cold-called them. They told me that everyone wants more compute, and that their HF clients seem to have made money this year and last year. I also learned from a P. Decrem that HFs aren’t excited about 10k clusters anymore. They want more, but Nvidia is focusing on sovereign clients. I don’t know what I don’t know, but I suspect most are experimenting while some are making money.

I had a conversation with a neocloud* who told me that they think only 40 companies in this world can “make money” from large clusters and everyone is trying to get them.

Company B is small yet efficient - it made 75 million in revenue last year, paid its 10 employees 6 million in salary in total, and is fully owned by a 60-year-old guy who I reckon is about to retire. Last year, he paid himself 3 million in dividends. I totally want to build a company like this.
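A quick back-of-envelope on Company B, using only the four figures quoted above. The currency is not stated, and hardware and other costs are unknown, so no profit margin can be derived from this.

```python
# Figures as quoted in the text (currency unspecified).
revenue = 75_000_000
employees = 10
total_salaries = 6_000_000
owner_dividend = 3_000_000

revenue_per_employee = revenue / employees   # revenue generated per head
average_salary = total_salaries / employees  # simple mean, ignores seniority
salary_share = total_salaries / revenue      # salaries as a share of revenue
dividend_share = owner_dividend / revenue    # owner payout as a share of revenue

print(f"Revenue per employee: {revenue_per_employee:,.0f}")
print(f"Average salary:       {average_salary:,.0f}")
print(f"Salaries / revenue:   {salary_share:.0%}")
print(f"Dividend / revenue:   {dividend_share:.0%}")
```

That works out to 7.5 million of revenue per employee, with salaries at 8% of revenue. The low salary share is consistent with a middleman business, where most of the revenue presumably passes through to hardware suppliers.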

Opportunities – There are two opportunities I see

First, Build AI Infra starting with HFs and then other niche players.
“B” - these guys are middlemen. I assume most quant HFs do not want to go through the hassle, so they ask B, who sources the GPU servers from an OEM like Dell to build the data centers. OEMs tend to charge a ~30% markup vs ODMs, but that is due to the specialization of the servers. As compute clusters grow larger, some players like Meta decide to build their own and source from an ODM like Quanta (who charges only a 1% markup). To do so, you need both the technical expertise for server configuration and the willingness to negotiate per component. This is nitty-gritty infra work which some rich clients prefer to outsource.
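To get a feel for the OEM vs ODM gap, here is a minimal sketch. It assumes both markups apply to the same underlying component cost, which is my reading of the figures above. The base cost per server and the server count are hypothetical placeholders, not quotes from any vendor.

```python
def price_with_markup(base_cost: float, markup: float) -> float:
    """Price paid when a supplier adds `markup` (0.30 means 30%) on top of cost."""
    return base_cost * (1 + markup)


OEM_MARKUP = 0.30  # ~30%, as quoted in the text
ODM_MARKUP = 0.01  # 1%, as quoted in the text

BASE_COST_PER_SERVER = 250_000.0  # hypothetical component cost per GPU server
SERVERS = 1_000                   # hypothetical cluster size

oem_total = price_with_markup(BASE_COST_PER_SERVER, OEM_MARKUP) * SERVERS
odm_total = price_with_markup(BASE_COST_PER_SERVER, ODM_MARKUP) * SERVERS

print(f"OEM route: {oem_total:,.0f}")
print(f"ODM route: {odm_total:,.0f}")
print(f"Saving from going direct: {oem_total - odm_total:,.0f} "
      f"({1 - odm_total / oem_total:.1%} of the OEM bill)")
```

Under these assumptions the saving is about 22% of the OEM bill whatever the base cost is. That is the pool a middleman like B, or an in-house team with the right expertise, is competing for.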

My question is – how long does your competitive edge in trading last before your P&L gets competed away, just like systematic strategies in the 1990s? How do you ensure ROI on your models? Assuming your moat shrinks over time, will your edge naturally be in managing CapEx smartly, just like the AI labs? This is not just about getting a good financing rate; it’s about sourcing, relationships and technical expertise in networking, assuming everything is reliable.

The additional complexity comes with how next generation GPUs need more power density and liquid cooling. Existing data centers are not equipped for that. Making it harder, every chip manufacturer, every server assembler has a different manifold or different way to get liquid on the cold plate.

The argument for neoclouds* is that their rich clients want the newest chips and that the cloud is secure enough for them, so cloud makes sense. I talked to “B” and he disagrees: HFs prefer to have their own data centers with older-generation chips for security reasons. I suspect some HFs/AI labs are okay with H100s for now because the software stack is more robust. Will it change in 10 years? Someone like XTX is probably stuck with their 2023-2026 version of data centers.

Second: Nearly everything is a rounding error compared to GPU cost

AI in the Cloud, how to keep your models flying high and deliver ROI*
https://calv.info/openai-reflections

The second opportunity is working on a job that solves a new problem that arises with AI. In the case of AI, the underrated problem is managing cost. There are smart engineers building things like context caching or smarter prompt engineering to optimize the cost of LLMs. On the finance side, OpenAI is hiring people with compute/procurement knowledge who also know how to do ML forecasting. The trend is rather clear – you need both some kind of contextual knowledge (finance) and ML.

An interesting excerpt from an OpenAI guy:
“We had to forecast out the load capacity requirements as part of the Codex launch, and doing this was the first time I'd really spent benchmarking any GPUs. The gist is that you should actually start from the latency requirements you need (overall latency, # of tokens, time-to-first-token) vs doing bottoms-up analysis on what a GPU can support. Every new model iteration can change the load patterns wildly.”
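One way to read the excerpt is as a top-down sizing exercise: fix the latency target first, benchmark what a GPU delivers while meeting it, then size the fleet. The sketch below is my own illustration, not OpenAI's method. Every name and number in it is a hypothetical placeholder, and the per-GPU throughput would have to be measured for each model.

```python
import math
from dataclasses import dataclass


@dataclass
class LatencyTarget:
    """Hypothetical per-request latency requirement."""
    time_to_first_token_s: float
    total_latency_s: float
    output_tokens: int

    def tokens_per_s_per_request(self) -> float:
        """Decode speed each request needs to finish inside the latency budget."""
        decode_budget_s = self.total_latency_s - self.time_to_first_token_s
        return self.output_tokens / decode_budget_s


def gpus_needed(peak_requests_per_s: float,
                target: LatencyTarget,
                tokens_per_s_per_gpu_at_target: float,
                headroom: float = 0.3) -> int:
    """GPUs required at peak load.

    `tokens_per_s_per_gpu_at_target` must come from benchmarking the model
    while the latency target is being met; it is an input, not a spec-sheet number.
    """
    aggregate_tokens_per_s = peak_requests_per_s * target.output_tokens
    return math.ceil(aggregate_tokens_per_s * (1 + headroom) / tokens_per_s_per_gpu_at_target)


# All numbers below are placeholders for illustration.
target = LatencyTarget(time_to_first_token_s=0.5, total_latency_s=10.5, output_tokens=500)
print(target.tokens_per_s_per_request())  # decode speed each request must sustain
print(gpus_needed(peak_requests_per_s=20, target=target,
                  tokens_per_s_per_gpu_at_target=1_500))
```

Because every new model iteration can change the measured throughput, the benchmark input has to be refreshed each time, which is why the forecast is rerun per release rather than derived once from hardware specs.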

TCO Analysis of 10k Cluster

My bare metal chassis analysis of a 10k H100 GPU Cluster

Let me know what you think!

Shortage

Margins

Lessons

Reflection

Up to March 2025