Tiffany Soerianto

Cloud vs On-Premise

2026-05-16T23:10:28Z

Following up from the previous post, "What I learned from Open Compute Project"

I initially thought other clients should buy directly from ODMs instead of OEMs. I later realised the situation is more complex - there's money on the table for a reason. As for cloud versus on-premise, I am rather confused. Both sides claim cost advantages, and it really depends on the scale and context. When I spoke with people in the hedge fund industry, most opted for on-prem for security and control. "What is the cost of compute?" is a frequent question, but a complex one to answer. Peak FLOPS, utilisation rate, hardware depreciation, and desired throughput are some of the factors.

Some of the key insights I've learned are from Prag Mishra:
1. Flexibility: FLOP/$ doubles every two years for ML/AI-specific GPUs (source: Epoch AI). Cloud instances require a fixed-term contract, which might not be cost-effective. By keeping the flexibility to upgrade compute resources, you can roughly halve your cost per FLOP every two years.

2. Utilization: 100% utilization in the cloud is rare. The maximum compute per dollar for a cloud instance is 25 PFLOP/$ on a one-year upfront commitment. An on-prem GPU server, if utilized 60% of the time, can be fully depreciated over two years and match the maximum PFLOP/$ achievable with cloud instances.

3. Execution: On-prem is hard and risky. One missing piece - a generator stuck in backlog - can halt everything. Key components like HBM supply can't scale overnight. At the application layer, OpenAI needs compute that's ready and reliable to serve the customers at speed.

4. Cost of Performance: In July 2024, the cost of H100 instances in the cloud was $77/hr for an AWS p5.48xlarge with 8xH100s. In August 2025, it was $55/hr - much lower. Actual costs vary depending on the region.
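
To make points 2 and 4 concrete, here is a small sketch that turns the quoted instance prices into a price per GPU-hour, and then into a price per hour of useful work at a given utilisation. The function names are my own, and reusing the 60% figure from point 2 on the cloud price is purely illustrative, not a claim about real cloud utilisation.

```python
# Figures quoted above for an AWS p5.48xlarge (8x H100).
GPUS_PER_INSTANCE = 8
PRICE_JUL_2024 = 77.0  # USD per instance-hour
PRICE_AUG_2025 = 55.0  # USD per instance-hour


def price_per_gpu_hour(instance_price: float, gpus: int = GPUS_PER_INSTANCE) -> float:
    """List price of one GPU-hour on a multi-GPU instance."""
    return instance_price / gpus


def effective_price_per_useful_gpu_hour(
    instance_price: float, utilization: float, gpus: int = GPUS_PER_INSTANCE
) -> float:
    """Price per GPU-hour of useful work when the GPUs are busy only part of the time."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return price_per_gpu_hour(instance_price, gpus) / utilization


if __name__ == "__main__":
    for label, price in [("Jul 2024", PRICE_JUL_2024), ("Aug 2025", PRICE_AUG_2025)]:
        print(f"{label}: ${price_per_gpu_hour(price):.3f} per GPU-hour at list price")

    drop = 1 - PRICE_AUG_2025 / PRICE_JUL_2024
    print(f"Price drop between the two dates: {drop:.1%}")

    # Illustrative assumption: the same 60% utilisation used for on-prem in point 2.
    effective = effective_price_per_useful_gpu_hour(PRICE_AUG_2025, 0.6)
    print(f"Aug 2025 at 60% utilisation: ${effective:.2f} per useful GPU-hour")
```

The point of the second function is that idle hours are still billed, so the price per hour of useful work rises as utilisation falls.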

I've read many resources online on TCO, such as SemiAnalysis or the neoclouds themselves. I'm wary of some of the assumptions, especially the high utilisation rates, which might not reflect reality.

Example: OpenAI Training Estimates
The figures below draw on reports from The Information, The New York Times and Epoch AI. For GPT-4.5, Epoch AI estimated that OpenAI used between 40,000 and 100,000 H100s to train the model, over 90 to 165 days, at $2 per H100-hour. Based on these assumptions and a maximum utilization rate of 90%, the total training cost for a single model is estimated to range between $192 million and $890 million. According to Winsome, OpenAI plans to spend $350B by 2030 on compute infrastructure, with annual server bills at $85B.
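
The arithmetic behind this range can be sketched as below. This is my reading of the calculation, not a confirmed method: I assume the quoted days are useful compute time, so the billed wall-clock time is the days divided by the utilisation rate. Under that assumption the lower bound comes out at about $192M and the upper bound at about $880M, close to but slightly below the $890 million quoted, so the original estimate presumably used slightly different inputs.

```python
HOURS_PER_DAY = 24


def training_cost_usd(
    num_gpus: int, days: float, price_per_gpu_hour: float, utilization: float
) -> float:
    """Estimated training cost in USD.

    Assumption: `days` is useful compute time, so billed hours are
    days * 24 / utilization (idle time is still paid for).
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    billed_hours = days * HOURS_PER_DAY / utilization
    return num_gpus * billed_hours * price_per_gpu_hour


if __name__ == "__main__":
    # Inputs quoted above: 40,000-100,000 H100s, 90-165 days, $2 per H100-hour, 90% utilisation.
    low = training_cost_usd(40_000, 90, 2.0, 0.9)
    high = training_cost_usd(100_000, 165, 2.0, 0.9)
    print(f"Low estimate:  ${low / 1e6:.0f}M")
    print(f"High estimate: ${high / 1e6:.0f}M")
```

The spread is wide because GPU count and training duration multiply: the high case uses 2.5x the GPUs for about 1.8x as long.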

How would you split between on-prem vs cloud?

The decision will be made mainly on technical strengths and costs. Beware of online commentary on clouds - much of it reflects incentives rather than reality. I met an ex-OpenAI technical staff member who spoke highly of his 2023 experience with Azure vs the other cloud provider. From my experience, GPU memory and networking performance often outweigh FLOPs in selecting hardware. This is what makes Nvidia networking superior for training, and the ecosystem is working hard to catch up - Broadcom recently announced an 800G Ethernet NIC in a bid against Nvidia ConnectX.

CXL memory is another industry solution for memory constraints. After talking to Marvell at OCP, my understanding is that it's mostly used by hyperscalers. The software stack is still immature.

Assuming training occurs on bare-metal nodes, differences in hyperscaler software stacks are largely negligible. What truly matters is speed, reliability and efficiency.

Nothing is impossible when humans are involved

2026-04-21T03:10:23Z

Around 2023, I was invited to a dinner party with a bunch of young traders from a particular quant hedge fund. Almost immediately, I was drawn into a Socratic debate with a stranger (let's name him John). Enamored by his intelligence, I realized I hadn't had an interesting conversation rooted in ideas in years. I began to question what I was doing with my life. Inspired by that energy, I decided I wanted to work with these people.

The fund was elite, with an acceptance rate of less than 0.1%. I had studied accounting and finance at a non-target school. On paper, it was impossible. But nothing is impossible when humans are involved.

I took a huge bet, knowing that the fund was open to those with math talent. I pushed my “Top in the world mathematics award”. I knew John was impressed by my intelligence and my story of sneaking into a conference at 19 to find a trading job. I spent 300 hours of my weekends, on top of my 55-hour work week, grinding through math. I loved every minute of it. A few months later, I sent John a video: a theatrical pouring out of my filled notebooks. I told him I was applying, that I had nothing to lose, and that I'd cold-call his Head of Trading if I didn't get an interview.

I eventually got the interview and learned that John fought for me in the background. But I failed. The other candidates were simply better. My heart broke, and I walked away with a new mandate: In whatever I do next, I need a 10x competitive advantage.

I had watched AlphaGo as a university student, and was deeply inspired by Demis's dream of solving intelligence. AI was taking off. The fund was investing in CoreWeave. Other quants were heavily sponsoring ICML. My intuition told me AI had already worked in trading.
When I told a senior trader that XTX Markets had spent $1 billion on a data center, he laughed and called it a bubble.

I suspected he was wrong. I cold-called several CEOs whose numbers I found on Bloomberg. They were loosely involved in AI infra (Gerko, Thiel, Gray). Their assistants politely rejected me, but it only clarified my resolve. I had to go all in.

I stayed a bit longer for my boss, but asked for one month to step away from my regular work and experiment with building a CNN-based trading model. This was completely unheard of on the trading floor - a salesperson who studied finance working on PyTorch code. My experiment failed. But it showed me how great my boss was, and the strength of my reputation and my powers of persuasion. My honesty about quitting hurt my bonus, but it really doesn’t matter. In the long run, you should always leave a bit of money on the table.

A few months after I left, I found an obscure firm writing detailed research on AI Infra for finance. I looked into their securities filings and found out that this quiet company of 10 people was a business generating $100M USD in revenue. I cold-called the company. The person who picked up was a young junior who was more than happy to exchange knowledge about the opaque business. You never know how much you can learn just by cold-calling.

I saw an opportunity to build a business against the incumbents - supplying AI infra for hedge funds. I knew my edge was in distribution and I was very excited to learn about the product. AI infra is also a horizontal layer technology, which means I can be anywhere in the world in the long term. If I failed at that idea, I could use my knowledge for a job in an AI lab. This was the thesis I made with the limited information I had.

In Aug 2025, I cold-emailed the innovation bureau of Hong Kong, showing them my blog and my interests. They were excited that someone in Hong Kong knew something related to neural networks and immediately gave me a phone number. It was the director of the local GPU lab. I cold-called her and again, like many others, she was amused. She told me to just come and told her juniors to “teach Tiffany something”. I spent one month absorbing all the knowledge I could from engineers about the AI infra supply chain, data centers and testing Chinese GPUs. The feeling of hustling sometimes gives me a thrill - from bending the rules a little, from finding creative ways to get what you want and make things work.

There were tons of rejections. You just have to keep on trying, move on and forget about it. I’ve been doing this since I was 15. When people tell me I was brave for quitting my high-paying job, I am not sure I feel brave. I am just quite used to dealing with uncertainty. I suspect courage is easy when you have no choice but to act. Prior to my success with AISC HK, I spent two weeks in the US physically cold-mailing and FedEx-ing multiple CEOs and companies. For one of the companies, I printed a t-shirt with their logo, filmed myself and sent the video to every single board member. It’s not just chutzpah; you have to be different, thoughtful and prepared. I got an interview within hours; they were amused but didn’t want to sponsor my visa. In every call, I would always learn something and use that information to think for myself and roll the next dice.

I became more serious about my business idea and reached out to multiple conferences asking if they needed a student volunteer. Open Compute came back with an offer to pay for my flight and hotel as part of charity. I was lucky, once again. I showed up, talked to people and gave them my wild duck PDF. This time, I actually had people proactively reaching out to me and saying they wanted to help me or offer me a job, none more enthusiastically than a well-connected CEO called Dean N. I declined because I knew it wasn’t the right fit.

On the very last day of the conference, I got extremely lucky. I was unpacking my bags, putting my GEB book on the table, when two older men noticed my “energy” and asked me what the book was about. I yapped about Kurt Gödel for 10 minutes and energetically talked about how that book inspired AI research (I know you might disagree). Turns out they were both senior data center leaders from Intel.

Over the next 40 minutes, they rigorously challenged my idea. I learned about hidden business practices no one outside could possibly know. This business is far more opaque than I thought. The previous 19 people I met either said my idea would work or said nothing. I also learned about the dynamics of AI Infra that make margins incredibly difficult. I gave up the idea the next day - I would never be able to make a 10x product.

At NeurIPS, I felt aligned. "These are my people", I thought to myself. I was most impressed with a researcher at Reflection AI and two researchers from GDM. Their energy was palpable, and my instinct told me that I could really trust these guys, that I could work with them.

This was a room with 10,000 of the smartest people in the world. Once again, my heart wanted to do something that is statistically difficult to achieve yet gives me so much energy. I looked back at my finance career. All I could think of was the people, not the money, not the fancy food, just the people.

The math might not always add up, but the chemistry surely did.

A priori and fundamental truth

2026-04-13T16:29:06Z

There are broadly two categories of knowledge - a priori and a posteriori. A priori is knowledge from reasoning. A posteriori is knowledge from sensory observation. When people talk, I suspect most of their knowledge comes from a posteriori. As a result, we have the old adage that people don't remember what you say, but how you make them feel.

Yet few of us spend a lot of time thinking deeply about a priori - trying to understand fundamental truths. I suspect that in most human experiences, there is no such thing as fundamental truth. But we can still break open the deep-seated beliefs that drive our actions and thought processes.

For the longest time, I believed in the importance of talking simpler. You hear this from Feynman, from Dimon, from people in all walks of life. Yet as a result of talking simpler, I suspect I started thinking simpler. It made me overdeterministic in my thought process, finding plausible reasons behind my actions, my past successes and failures. It wasn’t until reading The Brothers Karamazov that I started to deeply question myself - unravelling the complexity of human biases, noise, and the contradictions that exist in our minds.

The shift that followed was fundamental. I now try to speak simpler, but think more complexly. I try to hold two opposing thoughts in my mind at once. At the end of the day, to do well is to articulate well. But articulating well has to be in conjunction with thinking hard.

My research on Go To Market

2026-04-13T04:17:40Z

Writing this blog to demonstrate my understanding of GTM, which is a tad different from macro sales. The big picture, however, hasn't changed; it's all about solving customers' problems and how you make them feel.

Traditionally, there are two kinds of sales: product-led growth (PLG) and traditional enterprise sales. I suspect that for Reflection AI, the latter is more important. The three questions one must ask: Who are you selling to, what are you selling, and how are you selling?

1. Figure out the Persona
In PLG, no developer wants to be on the phone, get cold-called, or even pick up a phone call from a random salesperson. In enterprise sales, focus on which persona you want to sell to. Does your product really resonate with them and drive amazing, incredible value? What is the actual business value, and how do you articulate it?

Are you aiming for the CIO or some functional leader, like VP of Research? Is it through a referral? Is it cold-outbounding? Some personas you can cold-mail, some personas you can't.

2. Focus on learnings and insights

Something a junior salesperson can do is thoughtfully reach out to people and focus on gaining insights and logos, rather than immediate deals. Having no commission targets probably makes sense at the start, in order to create a cohesive culture of trust. For an early-stage startup, a commission target set too early is likely to be either too high or too low. Ultimately, the first 20 hires will be the lifeblood of the culture and will represent your team as you grow the company worldwide.

3. Showcase Pilot for customers, focus on executive buy-in
Make sure executive buy-in is present in pilots. Focus on clear ROI and Results. What does good look like, and how do you measure it?

While LLMs often deliver noticeable productivity gains, these benefits can feel intangible - so it's critical to define and communicate clear results.

4. Large Enterprise is moving faster than you think
There is now pressure from boards on "What is your AI strategy?". When market share is at stake, large enterprises are moving with surprising velocity. Fear of revenue erosion is a powerful accelerant.

Friendship of Virtue

2026-03-29T02:30:00Z

“For me, to remember friendship is to recall conversations that it seemed a sin to break off: the ones that made the sacrifice of the following day a trivial one” - Hitchens

During my break in 2025, I did the things I never had the time and money for: travelling cheaply across Asia, learning to ski and pursuing my AI interests.

Yet, for all the excitement of 2025, nothing beats meeting Isaac* and Albert*. We spent countless hours lost in philosophy, history, politics, literature and science, followed by absurd jokes and complete nonsense.

A lot of friendships and connections depend upon a sort of shared language, not necessarily designed to exclude others, but to instantly bridge the gaps left by time. With Isaac, we would bond via slagging - a high-powered version of teasing where friends are jokingly cruel to each other. I would endlessly tease Isaac’s gigantic forehead and receding hairline, only for him to fire back at my own hairline and the pretentiousness of my latest intellectual comparison.

"How dare you, Tiff," he’d scoff after I tried to compare Shapiro to Hitchens. "You know absolutely nothing about Hitchens."

It is indeed true - I knew nothing about Hitchens beyond his identity as a debater. But I was quickly awed by his wit, intellect and sheer courage of independent thought. If one has to wince at one’s stupidity, I find a bit of relief in reminding myself of a man who once defaced a political poster in the Middle East with a four-letter word, only to realise too late that it was for a martyr. Hitchens nearly died because of his ignorance and ballsy defiance. I can’t help but be amused by his many adventures and reckless love for life.

Hitchens is one of the many “characters” my friends introduced to me. I am so grateful to have finally found friends who are much more well-read than I am. There is no greater joy than to have a good conversation where wide reading and original ideas finally meet.

Isaac

Isaac is a pseudonym for my friend. Isaac never lets the comfort of our friendship get in the way of his commitment to the truth. If my thinking is flawed, he will call me out. I suspect that’s why I feel so comfortable with him, beyond our shared intellectual interests and sense of humour. There’s a sense of comfort knowing that he will always put his values before our friendship, which in turn allows me to do the same with mine - my vivacious love of life, my joie de vivre.

Isaac is simply very rare. He is one of the few people I know who goes to certain lengths to protect his mind, although I suspect most of it is his natural proclivity. I was positively shell-shocked to learn that he spent his entire life, as a GenZ, never touching social media. No Facebook, no Twitter, not even LinkedIn. Isaac, to me, has spent his entire life protecting the independence of his mind.

You can really feel that discipline in our conversations - his razor-sharp rationale and his effortless erudition. Unsurprisingly, his circle is small. Yet he remains one of the most self-assured people I know - perfectly happy in the solitude of his own thoughts. He seems to have mastered something I often struggle with: the idea that a clear head is a virtue you shouldn’t trade just to escape the weight of being alone.

Conversations with Isaac are always exciting. Isaac has a knack for dismantling my hard questions. Our conversations are often provocative - my questions challenge his logic, and his answers challenge my perspective. Isaac has more than once dismantled my belief systems, and that is no easy feat.

One time, Isaac challenged my idea of “meaning”. I had spent a large portion of my life believing my life was meaningful. By the end of our 8-hour conversation, he had convinced me that there is no real meaning in life - that my sense of meaning is nothing but a fluff of emotions used to justify my own suffering. All along, I was fooling myself, perhaps to make myself feel better.

Isaac’s intent wasn’t to make me a nihilist, but to ensure that I stopped deceiving myself with narratives. I suppose that is what I loved most about our friendship - always learning, always exciting, always being challenged and the refusal to sugarcoat things.

One of my favourite cheeky memories involves provoking his distaste for Dostoevsky. While I was obsessed with The Brothers Karamazov, Isaac—being a staunch atheist—couldn't stand the religious weight of Crime and Punishment. I took great pleasure in bringing up “Dodo” every chance I got, just to watch him wince.

"If you insist on admiring him so much," he retorted, "the least you could do is learn to pronounce his name correctly." I laughed. In that moment, as always, he was honest to the core.

Albert

Albert is, by my definition, a lunatic. I say this with the absolute highest level of affection. There are many moments when I’ll be walking down the street, recall a snippet of one of our conversations, and just start laughing out loud. He is so novel, so interesting, so intense, so intelligent, so disagreeable. He is always disagreeing, including disagreeing with himself. Albert represents the embodiment of “holding two opposing thoughts in your mind without going crazy”. Sometimes I wonder: How on earth does this guy exist? And this coming from me, someone who has been noted more than once for her own unique charm and eccentricity. Our conversations are always a ride, and our first one remains an iconic memory etched into my mind.

I met Albert through a series of unlikely circumstances. In mid 2025, I was in the middle of a career "roll the dice" in California, pushing my luck to break into AI. My main motivation was the people; I was desperate to find a circle that was actually inspired by the abstract question of "how the mind works." I wasn't disappointed. Albert shared those motivations, though he was significantly more knowledgeable and—rightfully—a bit older than I was.

I’ve always told my friend John* when I’m about to do something crazy, and he encouraged me to join a chat group started by one of his old Oxbridge connections. It was mostly people yapping about ideas, often with a layer of status-signalling that I didn't take too seriously. But one day, I posted a thread about a meetup for an AI lab, and Albert liked it. I looked him up, saw he actually worked in the sector, and boldly reached out. I wanted to learn about AI infrastructure and sent him my blogs to show my interests. To my surprise, he was amused enough to hop on a call.

We talked about AI for maybe 10 minutes, only to follow it with one of the most engaging (albeit unhinged) conversations I have ever had on a random call with a stranger. For the next two hours, we covered everything from enlightenment and French prepa to science, academia and politics. He was incessantly challenging my conclusions, poking holes in my logic, and we ended the call with him complimenting my “balls”.

Feeling inspired, I followed up with a long, sentimental text about how grateful I was for my life, my family's poor upbringing, and how lucky I felt. I was promptly shell-shocked by his response. He sent back a wall of text explaining how I was absolutely fooling myself into thinking that my entrepreneurial spirit correlates with my upbringing —that I was just telling a story in my head. That I was overdeterministic.

This was from a stranger I had never even met in person. My heart dropped; I felt like total crap for about ten minutes. But then, a wave of amusement hit me. He was so right. "We are definitely going to be friends," I thought with glee.

And I am so glad we became friends. Albert has become one of my most treasured friends. He never lets friendship take precedence over his first love, which was and is logic. If one employs flawed thinking, it will be rubbed in; no, it will be emphasized. I suspect that is the very reason he is often accused of “mansplaining”, but it’s the same reason he is so dearly loved by his friends. He cares more about the truth than he does about being polite. He is always interested in how you think rather than in what you think. In a world of superficial small talk, that is a gift.

*Not their real names. Inspired by Camus and Newton.

Moving On

2026-03-18T19:00:00Z

I’ve spent roughly 8 months gaining deeper knowledge in AI Infra, starting from zero. My initial interest came from an opportunity I saw while I was on the trading floor, and it fits my taste - horizontal layer technology expanding globally, smart people, and revenue growth. I blogged about it, worked hard, read everything, cold-called, learned new information, updated my beliefs, hustled at conferences and learned more. This industry is far more opaque than I thought.

In early January, I connected with a researcher at a leading AI lab who saw something in me. I told him I had absolutely no talent for pioneering new models, and he asked me what I was interested in. I eventually landed an interview in late Feb with their AI infra team.

During the interview I learned new information and realised that this particular person is very different from researchers - they were looking for an industry insider rather than for high potential, culture fit, values and aptitude. I started thinking about Chris, an amazing, young, energetic researcher I met at NeurIPS, and really wanted to be closer to the fulcrum of scientific rigour. Given that I’d been kamikaze-ing for 8 months, I took a break by reading literature and philosophy and changed my mind about what’s next.

My fundamental passion is the people I surround myself with - abstract creative thinkers who think about the possibilities they can weave with their technical prowess. It’s time for me to move on from AI Infra and read heavily about open-source LLMs.

I suppose that's the tough part of being a student again - ultimately, the "what" is knowledge held by insider experts, and all you can do is showcase how you think and why the way you think is valuable.

How will the miracle happen today?

2026-03-10T16:04:03Z

In my first year of university, I was invited to make a speech at an award ceremony. PT was one of the industry mentors invited to the event - a senior person from a global firm who sat on various government boards in Hong Kong. After my speech, he handed me his namecard, shared his phone number and told me to reach out to him.

Being me, I treated him like a bunch of atoms that decay every 7 years (a normal human). I was 18 years old, and he was probably around 60 years old. For the next few years, I would visit his office every 3-6 months, and we’d sit for an hour discussing business, life and intellectual topics until he retired. Despite the “me too” climate of those years, he never seemed to care about optics. I think my lack of pretentiousness, persistent questioning and child-like curiosity amused him. I can’t imagine being that high up and having everyone treat you like a god; it seems exhausting.

I thought it was common sense to treat senior figures with utmost privacy. After all, they have something to lose. Turns out it wasn’t common sense.

PT believed in giving back, spending his personal time going to universities in hope of inspiring the next generation. His message was to read more books. I took the advice to heart and started reading. What began as a practical pursuit - gathering knowledge for life - slowly evolved into something deeper: a life of knowledge.

Years later, he said that he had begun to feel his university visits were futile until he met me. He spoke with frustration about another student mentee who snapped a selfie with him, only for him to see her showing it off on Facebook the next day. He told me he almost gave up on Gen Z. When I asked him what kept him going, he simply said: “I am a long-distance marathon runner”.

Having known PT for 7 years now, I could see how this mentorship meant a lot to both of us. For him, he felt a strong sense of fulfilment. To me, it felt like the world cared.

PT, I learned, was not an odd one out. It never ceases to amaze me that the kindness of strangers can be so dependable. As I travelled around the world, moved to a new country and searched for job opportunities, I asked myself, “How will the miracle happen today?”

My first job offer
Around 2018, I was offered my first internship. How? At the same award ceremony, my accounting professor heard my speech and started talking about it to her colleague, who, I later learned, proactively reached out to multiple friends asking for internship opportunities. A couple of weeks later, my professor asked “Do you want a job?”. I was shocked at the kindness of strangers and said yes.

After I finished that internship, PT called in his head of HR and told her they should hire me as an intern. I didn’t hesitate. I looked him straight in the eye and told him I didn’t want to do accounting. I wanted Markets.

I told him I wanted to understand how the world works, inspired by the Ray Dalio book he recommended. He laughed. “Tiffany”, he said, “you realize that investment banking has some of the smartest people in the world, right?”

Years later, I learned that it isn’t really true. But at the time, the odds were heavily stacked against me. I was determined anyway.

Why? Because I am driven by a visceral awareness of how short life is. Knowing my mother, my grandfather, and my uncle all faced cancer before the age of fifty, my fifteen-year-old self became convinced that my own clock might stop there, too. (Though, with the trajectory of AI, I hope it changes). That continues to be the driving factor of my life till today, which is why I am so action-oriented.

The journey

I did everything - I researched online, cold-messaged >100 people on LinkedIn and struck up conversations with peers and leaders at recruiting events. Most people told me to go to a good grad school. By the end of the recruiting season, I had been automatically rejected by most companies and had exactly one interview: Bank of America. Statistically, the odds of landing a role with only one shot are near zero. I knew I had to create my own luck. I had recently read “The Uses of Adversity” in The New Yorker online. I figured if Sidney Weinberg could go from janitor to Goldman Sachs CEO by knocking on doors during the Gilded Age - a time defined by its staggering wealth gap - I had no excuse to sit still.

In 2019, I found a wealth management conference online. I registered using my student email. I suspect they saw "University" and assumed I was a professor. I showed up at the W Hotel Hong Kong as a nineteen-year-old. It was technically legal, but we all know I wasn’t supposed to be there. I struck up conversations with a bunch of industry people. When they asked me what I was doing there, I sheepishly told them I wanted to learn about investments. Some immediately assumed I came from wealth and gave me namecards. I was being pitched “secure gold banks” and crypto. It was a nerve-wracking but hilarious memory.

I reached out to all ten of the name cards I collected and was honest in the email that I was looking for a job in global markets. Through sheer luck, one portfolio manager found my energy contagious, my knowledge proficient, and my audacity amusing. He excitedly shared this encounter with some of his friends, as lunch-time gossip. One of his friends told her husband, Mark, who happened to be a head at Goldman Sachs Hong Kong. I met Mark before my last interview round with Bank of America. He was amused and encouraging, and said, “If BofA doesn't work out, you’ll always have a place at Goldman.”

That signal of confidence meant everything to a 20-year-old. But life has a cruel way of reinforcing its brevity. Mark was my second closest mentor, after PT. A week before we were set to catch up after my internship, Mark died in a hiking accident, leaving behind a six-month-old son. It was a tragedy I couldn’t have imagined. I never got to properly thank this wonderful American. I promised myself then that I would pay for his son’s education one day. I also promised to thank the people who make an impact while they are still here. That is why I am writing essays.

The Rationality of Hustle
While I rely on luck, I don't fully believe in miracles. My decision to pursue markets was rooted in simple first-principles thinking: if I was smart enough to get into Oxford, and banks hire people from Oxford, I didn't suddenly become “stupid” over the course of six months.

Yet, I was consistently told to just accept that the system is rigged. While I accepted the things I couldn’t control, I was tenacious about the things I could. Even in the middle of the hustle, I never stopped being grateful for the miracles of my life. If anything, I am driven to do my best because I am aware of the countless people born in a different era who wish they had the opportunities I did - happily unmarried at 25 with a free-spirited Californian attitude and an American accent. I am grateful that I got to move to a high-trust society like Hong Kong, met a boyfriend who loves a passionate woman like me (which would have been unlikely in Indonesia), and built a circle of friendship with AI researchers - all born in different countries, all connected by a shared English language, because we were born in the age of the internet.

In the end, moving through the world with joy—and a fearless, independent streak—has always been my style. But the credit goes to tools that made the world small enough for me to conquer. To that, I can only say THANK YOU!

The Effect of Test-time Compute on Data Center Dynamics

2026-02-09T18:43:36Z

One of my recent side projects.

Download TestTIMECOMPUTE.pdf

Are numbers an illusion?

2026-01-18T05:03:07Z

Hold up a finger. Could this finger be a different color? Could it be slightly longer? Could it be crooked? But could it ever be anything other than one finger? The number is obligatory. The number is something the finger essentially has.

Machine Learning Infrastructure

2025-11-22T03:51:52Z

Compute is a key lever for AI progress. I wanted to work on the business side of AI Infra, but that itself encompasses many layers - business development, operations, finance, technical program management, etc. Since it really comes down to the right role opening up at the right time, I figured it’s better to build a solid big-picture understanding of how everything fits together, and then go deep on the specifics when the opportunity comes.

Having an entrepreneurial mindset, plus backgrounds in finance and data science, sets me up well for system-level thinking. I think managing costs, hardware/software optimisation, and actually executing are the things that matter most, and I wanted a skill set that lines up with that. I love adventures, and the idea of taking on a role that involves navigating ambiguity in a fast-moving environment really excites me.

The goal of this blog is to cover:

High-level aspects of how modern machine learning infrastructure works
Hardware advancements that accelerate deep learning workloads
Industry insights.

It builds on themes from my previous blog posts—including supply chain dynamics, neocloud, TCO analysis, hardware fundamentals, and OCP reflections*:

Machine Learning Infrastructure

A full ML platform usually has two parallel data pipelines:

Real-time pipeline: Handles data that arrives continuously and needs low-latency processing
Batch pipeline: Processes large historical datasets on a schedule (hourly, daily, etc.) and is usually used for training.

Real-time: Apache Kafka receives continuous events. An event can be a log record, a click, etc. Flink consumes these events and performs ETL (Extract, Transform and Load) into the real-time feature store. The prediction service then uses the latest model and features to make instant predictions.

Batch: The data lake stores large volumes of historical data, which is then processed by Spark ETL. The output goes to the batch feature store, which supplies features and labels for training and features for batch inference. Batch prediction jobs periodically run predictions over large datasets, and the output is saved back to the data lake.
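A minimal sketch of the two paths, assuming plain Python dictionaries as stand-ins for Kafka, Flink, Spark and the feature stores. The event fields (user, ts) and the "clicks" feature are hypothetical. The point it tries to show is that both pipelines can share the same feature logic, so the real-time and batch stores agree.

```python
from collections import defaultdict

def new_store():
    """Hypothetical in-memory feature store: one feature row per user."""
    return defaultdict(lambda: {"clicks": 0, "last_ts": None})

def update_features(store, event):
    """Real-time path: fold a single event into the store as it arrives."""
    row = store[event["user"]]
    row["clicks"] += 1
    row["last_ts"] = event["ts"]

def build_batch_features(history):
    """Batch path: recompute features from the full historical log in one pass."""
    store = new_store()
    for event in sorted(history, key=lambda e: e["ts"]):
        update_features(store, event)
    return store

# Hypothetical click events
events = [
    {"user": "a", "ts": 1},
    {"user": "b", "ts": 2},
    {"user": "a", "ts": 3},
]

realtime_store = new_store()
for event in events:          # events handled one at a time, as they stream in
    update_features(realtime_store, event)

batch_store = build_batch_features(events)  # same log, processed on a schedule
print(dict(realtime_store) == dict(batch_store))
```

Reusing one function for both paths is a design choice in this sketch, not something the diagram requires; real systems often implement the two paths separately.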

Why do you need batch inference?
Some predictions are too computationally intensive to run on demand. Even though the end user might not request the prediction directly, the system does. For example, a financial system loads risk rankings for reporting.
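A rough sketch of that idea, assuming a made-up scoring rule in place of a real risk model. The position fields and the function names are hypothetical. A scheduled job scores everything once, and the report only reads the stored results.

```python
def risk_score(position):
    """Stand-in for an expensive model call (hypothetical scoring rule)."""
    return round(position["notional"] * position["volatility"], 2)

def run_batch_predictions(positions):
    """Scheduled job: score every position once and keep the results."""
    return {p["id"]: risk_score(p) for p in positions}

def risk_ranking(predictions):
    """Report path: rank from precomputed scores, with no model call."""
    return sorted(predictions, key=predictions.get, reverse=True)

# Hypothetical book of positions
positions = [
    {"id": "bond-1", "notional": 1_000_000, "volatility": 0.02},
    {"id": "fx-7", "notional": 500_000, "volatility": 0.10},
    {"id": "eq-3", "notional": 200_000, "volatility": 0.30},
]

nightly = run_batch_predictions(positions)  # would be written to the data lake
ranking = risk_ranking(nightly)             # what the reporting system loads
print(ranking)
```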

A Reflection of America

2025-11-11T21:29:23Z

Love makes my heart warm

Money makes my heart calm

Humour makes my heart fun

Thinking makes my heart beat

-Ti Soe, 2025

Having achieved most of what I set out to do ten years ago, 2025 was the year I felt fulfilled. Meeting intellectually gifted peers who are great at orthogonal thinking, have drive, and have esoteric interests like Kafka, Chomsky, and Voltaire is a privilege that someone as extroverted as me can only dream of. I’ve always wanted to live an intellectually rich life, and my initial naivety about how macro trading can fulfil that led me here. A lot has changed. The 19-year-old Tiffany, who was inspired by Dalio, Thiel and Soros, understood the reality that governs the trading floor. Technological superiority is more important than intellectual thinking. Macro trading entered its prime in the 1990s, when governments tried to tame currency volatility by soft pegging currencies. With most currencies floating and some artificially held down, the golden days of philosopher-traders are gone.

I’ve always loved deep, intellectual thinking, and over the next year and a half, I’m fully committed to sharpening my mind and following my curiosity wherever it leads. This means a lot of reading. Just as much, I love doing the things I think about, and in that, I’ve always had a bit more courage than most of my peers. I’ve learned that not everything can be explained through pure rationality; some truths demand intuition, artistry, and context. It reminds me of Hofstadter’s “Truth is not always provable; what is provable is not necessarily true”. Courage, I’ve found, is rarer than genius. And strangely, having seen so much death at a young age has pushed me to live with greater intention and urgency.

Yet how many of these are just stories I am telling to fool myself? (*insert the context of having been pleasantly intellectually challenged by another crazy math PhD whom I never met*) Having been in California, I’ve seen what too much thinking and too much freedom can result in - another comsci dropout modelling contemporary power structures while holding down an unrelated day job*. Inspiring both awe and disdain in equal measure: praised for originality, dismissed for imitation. And in the end, I can’t help but wonder: how much of it truly matters?

And it’s really funny, coming from me, because for 25 years I’ve been on the other side—where people in Asia told me that ideas don’t matter if there’s a high chance of failure. Not smart enough to beat the kids from developed countries for a scholarship—I won. Not a target school for the trading floor—I got the job through hustling and outperformed. Too unrealistic to chase intellectual ambitions—well, I’m doing it now.

Then why? Because pursuing high-expected-value, low-probability bets with a deterministic optimism brings me closer to my ideal self. And perhaps that’s what much of the Bay Area’s intellectual scene is about—a modern expression of America’s old Calvinistic impulse. A culture built on a sense of calling or vocation, born from the earliest religious refugees who came here seeking freedom to believe, to reason, and to build.

Girard spoke of two kinds of desire: the physical and the metaphysical. The former is a desire for utility; the latter, a desire for identity. The more I reflect, the more cautious I become about myself. I’m not as independent a thinker as I like to believe. I’m optimistic in Hong Kong, cautious in California*. How much of that is genuine reflection—and how much is simply the impulse to rebel? How many of my recent “wins” were coincidences rather than true independent thought? Still, to overcredit luck is to surrender agency.

Even with a framework for making decisions, I’ve come to learn that doing something different from the herd isn’t the same as thinking for oneself. Sometimes contrarianism is just another form of imitation. And so I wonder: how much of what I’ve done was driven not by reason, but by my desire for an identity? My desire to be the kind of person who thinks deeply and profits from it?

*Gray Mirror: A brief explanation of the Cathedral

*Hong Kong Executive: “I’ve seen many economic cycles—AI is another bubble.”
*Californian Founder: “I’ve seen many business cycles—this time is different.”

What I learned from Open Compute Project

2025-11-10T17:39:00Z

My two focuses in October were to prepare for OCP and actively learn in the supercomputing club. I always believed that Luck = Opportunity + Preparation, and I try to learn as much as I can online before I join a conference. Most of the time, it leads to interesting conversations and useful opportunities.

I was also really lucky at UCSD - for the first time, an American company loaned our team a $250,000 server equipped with eight AMD GPUs for us to dismantle. That is unheard of in other parts of the world. It was also my first time seeing hardware components like NICs and PCIe up close.

There are a couple of things I learned from the Open Compute Project. But the most important thing I learned was that there’s money on the table for a reason. Through talking with business leaders, I learned about business/legal practices that made it difficult to simplify the supply chain.

Shortage

The surge in AI infrastructure demand has triggered aggressive purchasing by hyperscalers and “neoclouds,” leading to widespread component shortages. Based on my conversations at OCP, hard drives are sold out for two years, driving SSD prices up for everyone else. Generators are reportedly on three-year backorders. Delivering a fully equipped, PUE-efficient data centre on schedule now requires coordination across dozens of suppliers and vendors - each facing its own bottlenecks. Walking the OCP floor, one thing became clear: hyperscalers dominate this market, making up about 80% of vendors’ clientele.

A hyperscaler I spoke with was cautious about the risks, but noted that when they cancelled a data center contract, a competitor quickly filled the gap. Demand surged afterward, and in hindsight, they realised they should have accepted the contract.

Margins

OEM vs ODM. Original Equipment Manufacturers (OEMs) assemble components from different manufacturers in China. They sell completed systems with warranties and software via value-added resellers. Original Design Manufacturers (ODMs) like Quanta manufacture to customer specifications. With the AI infrastructure boom, Oracle and Meta have bypassed OEMs to work directly with ODMs. While OEMs have historically applied a markup of around 30%, ODMs operate on much thinner margins of around 1%. However, ODMs require large contracts, making direct access challenging for smaller businesses.
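The two figures above are quoted in different terms: a markup is measured against cost, a margin against selling price. A small sketch, assuming the textbook definitions of both (the figures themselves come from my conversations, and the function names are mine), puts them on the same footing:

```python
def margin_from_markup(markup):
    """Gross margin (share of price) implied by a markup (share of cost)."""
    return markup / (1 + markup)

def markup_from_margin(margin):
    """Markup (share of cost) implied by a gross margin (share of price)."""
    return margin / (1 - margin)

oem_margin = margin_from_markup(0.30)  # ~30% markup quoted for OEMs
odm_markup = markup_from_margin(0.01)  # ~1% margin quoted for ODMs

print(f"OEM: 30% markup is a {oem_margin:.1%} margin")
print(f"ODM: 1% margin is a {odm_markup:.2%} markup")
```

Either way the gap is large; the conversion only makes sure the comparison is like for like.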

Catching up in technology is incredibly difficult, but changing a business model is even harder. It seems every OEM has taken a page from Clayton Christensen’s Innovator’s Dilemma by taking the hit in margins to capture market share. Nvidia has also open-sourced the reference architecture for its Blackwell servers, further levelling the playing field. As a result, OEMs are beginning to look more like ODMs. I also learned that both ODMs and OEMs rely on many of the same Chinese component manufacturers for smaller parts, which narrows the differentiation even more.

Some component manufacturers

12V Bus Bar - Amphenol, Bizlink, Interplex, JPC, Lotes
Slide Rail - Fositek, Repon, Yuans Tech, Cheng Fwa, Kingslide
UQD/MQD - Auras, Danfoss, Envicool, Fositek, Foxconn, Lead Wealth, Lotes, Netonx, Nidec, Parker, Readore, Stäubli


Reflection

Recently, Sequoia talked about two things - Roelof mentioned that there are too many talented people chasing not-so-interesting problems (like it's 1999), and Doug mentioned that most of the money will be made in the application layer, selling directly to the consumer. It’s making me rethink my approach, but I’m dedicated to winning (outcome) and not dying (path dependent). In a world that's changing, the “taste” of your problem matters more than your current skillset. And taste depends on context; context depends on those who hold information.

UCSD Supercomputing Club

Book List

2025-10-31T06:45:00Z

On Becoming A Person	Carl Rogers	It's a bit dry, but I like some of his concepts. Being Open = Becoming Yourself
Probable Impossibilities	Alan Lightman	Very beautiful book, a sense of art wielded with science.
The Plague	Albert Camus	Beautiful book, with deep philosophical ideas
Nicomachean Ethics	Aristotle
When Reason Goes on Holiday	Neven Sesardić
The Scout Mindset	Julia Galef
Profiles of the Future	Arthur C Clarke	Reality is quite far off from the prediction
Medieval Technology and Social Change	Lynn White Jr.	Didn't finish, didn't really like

Guns, Germs, and Steel	Jared Diamond	I find it okay
The Smartest Guys in the Room	Bethany McLean & Peter Elkind	It's a great reminder to not get too caught up in intellectual abstract ideas
The Copernican Revolution	Thomas S. Kuhn	Interesting read - people are scared of technology
The Daily Stoic	Ryan Holiday & Stephen Hanselman	Good reminder. Stoicism is helpful in hard times.
The Death of Ivan Ilyich	Leo Tolstoy	Very Deep. Tolstoy yearned for meaning in life. This is different from Camus.
The Brothers Karamazov	Fyodor Dostoevsky	Best book ever. One of the most defining books I've read
Gilead	Marilynne Robinson	Didn't finish
Feynman’s Rainbow	Leonard Mlodinow	Good teaching for life
Superforecasting	Philip E. Tetlock & Dan Gardner	How to be more rational

The Maniac	Benjamín Labatut	Sometimes too fictionalised, but interesting overall
Godel Escher Bach	Douglas R. Hofstadter	Legendary book, captivating
A Mind at Play	Jimmy Soni & Rob Goodman	Claude Shannon is an abstract thinker
The Singularity Is Near	Ray Kurzweil	Predictions came true
The Great Illusion	Norman Angell	I don't like the writing style, but it's an interesting book written in the 1900s
The Sovereign Individual	William Rees-Mogg & James Dale Davidson	one of my favourites
Only the Paranoid Survive	Andrew S. Grove (Andy Grove)	great management book
Alchemy: The Magic of Original Thinking in a World of Mind-Numbing Conformity	Rory Sutherland	very, very good book for original thinkers
The Beginning of Infinity	David Deutsch	author is optimistic
Novelist as a Vocation	Haruki Murakami	wow, really beautiful
On the Edge: The Art of Risking Everything	Nate Silver	Didn't finish
Talent: How to Identify Energizers, Creatives, and Winners	Tyler Cowen & Daniel Gross	nothing too new
7 Powers: The Foundations of Business Strategy	Hamilton W. Helmer	great framework to think about
Impro: Improvisation and the Theatre	Keith Johnstone	iconic book
Technological Republic: Hard Power, Soft Belief and Future of West	Alex Karp	okay book
Up to March 2025
Measure What Matters: OKRs: The Simple Idea that Drives 10x Growth	John Doerr	Just okay
Invent and Wander	Walter Isaacson	Gifted by PT. I realized that I have obsessive curiosity. Listen to my intuition
Poor Charlie's Almanack	Stripe Press	Great tips on human psychology. Filled with wisdom
Things Hidden Since the Foundation of the World	Rene Girard	This is the hardest book I've read. But the concepts of "Scapegoating" and "Mimetic Desire" are mindblowing
Dream Machine	Stripe Press	Inspiring. I'll never take for granted how hard it was to build computers again.
The Elephant in the Brain	Kevin S & Robin	Just okay. Recommended by Naval Ravikant
A man for all markets	Ed Thorp	I aspire to be ed thorp, curious, healthy, principled and ALOT of freedom
Am I being too subtle	Sam Zell	Love this book. Straight talking guy.
Exercised: Why something we never evolved to do is healthy and rewarding	Daniel Lieberman	Learnt about the human body. More patient on my fitness journey
Dont Sweat the Small stuff	Richard C	Just okay.
The Power Law	Sebastian Mallaby	Really love this book. Concept fully ingrained in my major decision making
Up Close and Personal	John Mack	Interesting history of wall street
Chip War	Chris Miller	Really love this book. Curious on eccentric neurodivergent CTOs making big bets like ASML
The Founders	Jimmy Soni	Stories of eccentric, intense nerds of Paypal.
Between two shores	TL TSIM	Gifted by PT. Learned about the differences of East and Western culture
Ego is the Enemy	Ryan Holiday	Reminds me that Ego is dangerous
Courage to be happy	Ichiro	I prefer his former book more
Money Machine	Weijian Shan	Highly recommend! Read this and told myself " I'd never want to work with the gov/politics. Super complex"
Keeping at it	Paul Volcker	Fed Chairman Paul Volcker is our hero! Man of character. If i go through tough times, i read him
Think again	Adam Grant	just okay
Negotiation	Brian tracy	Just okay
Never Split the Difference	Chris Voss	Love this book! Fave Negotiation book
Do Nothing	Celeste	Just okay, but reminder to hyperactive tiffany to chill.
How to lead	David Rubenstein	I like Jeff Bezos and Jamie Dimon's part.
Born to Run 2	Chris Mc Dougall	Gifted by PT. Interesting book on running and certainly inspires one to run
Atomic Habits	James Clear	Good reminder of incorporating good habits
Money Games	Weijian Shan	Highly recommend. Another book from Weijian. Met him in person. Love this detailed story on PE
Deep Work	Cal Newport	Attention span of Gen Z is terrible, mine included.
No Filter	Sarah Frier	Felt like im being pulled to entrepreneurship after reading this. Story of 0 to 1 of Instagram
No Rules Rules : Netflix	Reed Hastings	I want to work in this sort of culture. Netflix culture interests me, but am not passionate about the industry
The Untethered Soul: Journey beyond oneself	Michael Singer	Sam Altman recommended. Spiritual Fulfillment
Hidden Potential	Adam Grant	Just okay
Option B	Adam Grant & Sheryl Sandberg	Just okay
Range	David Epstein	Just okay
Empire of Pain	Patrick Keefe	Super interesting read on drugs abuse in US.
Biggest Bluff	Maria K	I dont like
Time to breathe	Bill M	Just okay. Bought it when i was stressed. By the time i start reading it, i wasnt stressed at all.
The Geek Way	Andrew McAfee	Felt like he's describing me - obsessive curiosity, creative self starter who pushes the walls & limits
Ikigai	Hector	Just okay
More Money than God - Hedge Funds & The Making of Elite	Sebastian Mallaby	Fun read
Man's Search for Meaning	Viktor Frankl	Inspiring read, it's all about purpose/meaning
How to win friends & influence people	Dale Carnegie	I get it, but wont follow 100% of it. Good for people who wants to learn EQ
Man who solved the market	Gregory Z	Jim Simons inspires me to pursue the unknown path - he too struggled. Be guided by beauty
Black Swan	Nassim Taleb	Inspires me to pursue " low probability, high EV" stuff
Psychology of Money	Morgan Housel	Recommend to all young people
Market Wizards	Jack S	Story of Ed Seykota & Stan Druckenmiller is interesting
New Market Wizards	Jack S	Recommend to those doing markets
Hedge Fund Market Wizards	Jack S	^
The Essays of Warren Buffett		I'm not into value investing
Four Thousand Weeks	Oliver	You have a short life! Recommended to people in limbo
Hillbilly Elegy	Jd Vance	Cried in first chapter. Didnt know that extreme sadistic part of US.
Flashboys	Michael Lewis	Just okay
The undoing project	Michael Lewis	Interesting!
Thinking fast and slow	Daniel K	Complex read for 20 year old Tiff
Boomerang	Michael Lewis	Crazy cheap financing during 2000s. "Madness in crowds is more common than in individuals"
What it takes	Steve Schwarzman	Inspiring entrepreneur
Shoe Dog	Phil Knight	wow, his business almost died multiple times
Total Recall	Arnold Sch	Didnt finish it, not really my style and too long, haha!
How will you measure your life	Clayton	Tiffy, its all about fulfillment and principles!
The power of now	Eck T	Great reminder to focus on present
Think and Grow Rich	Napoleon	Just okay.
Cant hurt me	David Goggins	its all about mindset
Grit	Angela Duckworth	just okay, good self reminder
The Unwinding of the Miracle	Julie Yip-Williams	underrated inspiring memoir. she had a crazy life. LIVE YOUR LIFE TO THE FULLEST
When Breath becomes Air	Paul	Life is short, heartfelt book
What i talk about when i talk about running	Murakami	Love this book - it's never too late to start running
Good to Great	Jim Collins	From PT List. Talent is all about 1 + 1 = 3
The Monk who sold his Ferrari	Robin Sharma	From PT List. Do the things you fear most
Start with Why	Simon Sinek	Inspire others!
Never Eat Alone	Keith Ferrazzi	Great book to learn networking
Courage to be disliked	Fumitake Koga	Great book to read at 19

When Progress Looks like Decline

2025-09-26T22:58:49Z

Lessons from the Past
I’m interested in making AI infrastructure cheaper and more reliable, and I often think about how AI is similar to the Industrial Revolution. Electricity is a good comparison. It was a technology that changed everything, across every industry. Electricity improved manufacturing and daily life, but many of its benefits didn't show up in productivity measures. Some gains are hidden, and I think it's important to note that the same will be true of AI.

A Brief History
Before electricity, factories relied on centralised steam or water power. One central power source drove belts and pulleys snaking across the floor, causing breakdowns and safety risks. Factories had to be located near water or coal. Electrification changed this. Instead of building a whole new factory, which required a large investment, a business could just expand its existing one. Growth shifted from large & discrete to small & incremental. Adoption of electric motors took decades. In 1900, almost no factories used electric motors. By 1910, 20% did. By 1920, half. By 1930, 80% had switched (Source: Byrne Hobart).

When Progress Looks Like Decline
Even then, some benefits weren’t obvious in the data. When factories switched from gas lamps to electric light, measured GDP went down because light got cheaper. GDP only counts money spent, not improvements in quality. Faster trains, fewer breakdowns, and night shifts made factories more productive, but those gains didn’t show up as “productivity growth.” A labour hour or a train mile was counted the same as before, regardless of how much more it delivered. The real benefits only became clear decades later, after the economy had time to reorganise around the new technology.

How Growth Changed the Market
Capital markets changed, too. Before electrification, growth required big investments, so companies issued bonds with predictable dividends. With a high dividend, you would not expect growth... Stocks were more uncertain. After electrification, growth became incremental. Companies could just reinvest profits to expand, so stocks became more lucrative.

What about AI
The narrative around AI shifts between singularity and doomsday, but I suspect the future will be in the middle. AI will probably make inequality worse — not just between countries, but within them. Globalisation narrowed the gap. AI may widen it. With falling birth rates and rising costs, that worries me. Angry young people with nothing to lose tend to start revolutions. But in the long run, I’m optimistic. AI will give individuals and small companies more leverage. It will make organisations leaner and create new kinds of work, especially creative work. Life will probably get more volatile, but also more rewarding, at least for those who can adapt.

Good for capitalists.

Good for consumers.

Harder for workers.

The Broader Intellectual Question

Back when I was an associate on the trading floor, I often heard older colleagues say, “It’s much harder for young people these days.” And there’s some truth to that. Many of my peers were pushed to compete from an early age—only to exhale with relief (and mild burnout) once they finally landed a job.

That experience isn’t universal, though. It didn’t apply to me. But in places like Hong Kong, the pressure to compete is real—so real that one of my pregnant ex-colleagues had to sign up her unborn child for a nursery.

I’ve always had a quiet scepticism toward competition. I was pretty influenced by Rene Girard's mimetic theory, and pursuing original ideas has worked well for me. Sure, competition makes you better at whatever you're competing in. But it often stifles creativity and independent thought. It works well for those on the right path, but not for those pursuing careers or relationships that don't fulfil them. Education is not a substitute for thinking hard about what you want.

If AI truly democratizes knowledge, will it amplify the value of the smartest people? Or will it reward the average person with more agency? And as AI accelerates technological cycles - driving nations, companies, and people to even more competition- will we ever pause to question the underlying assumptions we’ve inherited? Or will we be too busy optimising, scaling, and competing to notice we’ve walked straight into a high-tech version of the Malthusian trap?

Will AI lead to higher GDPs, more consumer choices, and faster innovation—yet leave the average person feeling more anxious and less fulfilled? In other words, what happens when progress, by every economic metric, starts to feel like decline in lived experience?

Risk and Realities of Supply Chain

2025-08-20T07:52:00Z

Markups

As I get deeper into the AI Infrastructure industry, I realise how complex the supply chain is. The current supply chain for trading firms is Hedge fund -> Value-Added Reseller (VAR) + Data Centre -> OEM -> ODM. At each stage, there is roughly a 30% markup. The VAR buys servers and manages the entire project. Its value lies in knowing the right hardware to fully utilise the client's software model. Recently, I came across a VAR's latest AI server list, published in late 2024. I was curious why they were only offering older models like the R760xa from 2021, along with several servers dating back to 2019. None is liquid-cooled, and there were no signs of the latest XE models from Dell. My hypotheses are:

1. OEMs like Dell no longer prioritise VAR, since hyperscalers have created huge AI server backlogs.
2. The VAR lacks the networking and integration expertise needed for the newest servers.
3. With little competition, VAR can get away with charging high margins despite weaker offerings.
4. Many clients care more about cost than cutting-edge hardware, and competitive pressure hasn’t yet squeezed trading profits.
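To get a feel for what a roughly 30% markup at each stage does to the final price, here is a quick sketch. The base cost of 100 is a hypothetical figure in arbitrary units, and counting three marked-up hops (ODM, OEM, VAR) is my assumption about how the chain above is priced.

```python
def end_price(base_cost, markup, stages):
    """Price after `stages` intermediaries each add the same proportional markup."""
    return base_cost * (1 + markup) ** stages

BASE_COST = 100.0  # hypothetical build cost, arbitrary units
MARKUP = 0.30      # rough per-stage markup from the text

for stages in range(4):
    print(f"{stages} stage(s): {end_price(BASE_COST, MARKUP, stages):.1f}")
```

Under these assumptions the buyer pays a bit more than twice the build cost, since the markups compound rather than add.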

I started to think - can I source from ODMs directly and deliver the same quality of product? So I decided to reach out to ODMs in Taiwan.
They replied initially and stopped - they only entertain large customer orders.

Risks and Realities of AI Clusters

I spoke with a leader managing 3,000 Nvidia GPUs. They were trying to build a data centre, the first-of-its-kind, locally. The current process is inefficient - the entire contract is awarded to a local telecom company, which must meet strict criteria or face penalties. The telecom then acts as a project manager, coordinating with about 20 different vendors. Very little is handled in-house. I suspect this lack of expertise translates into operational inefficiency. I’m especially sceptical about their ability to network all these clusters effectively.

They then sell these GPU clusters to clients, mainly university research departments. They keep spare nodes but still face daily customer complaints. Building perfectly reliable hardware is very hard, yet clients tolerate it because there’s no better alternative. Meanwhile, China continues to struggle with procuring H800s needed for training.

Their biggest concern is cybersecurity. As she put it, “It only takes one disaster to ruin trust.” The data centre’s location is top secret and is part of the government’s critical infrastructure law.

The Hardware You Can't See

2025-08-14T10:59:00Z

I’ve never seen an AI server in person, but I’m excited for the chance to. This year, I was lucky enough to be sponsored to attend the OCP Global Summit and join NeurIPS in San Diego.

I’m also really curious about how open source might help reduce costs over time. I remember how open-sourcing Android lowered phone manufacturing costs, and I’m hoping to see something similar happen here—though it’ll definitely take time. And for Meta, the benefit seems straightforward: lower costs for them as well. (Update, Nov 2025: I’m more skeptical about open-source hardware now.)

Building AI clusters takes more than GPUs. Networking and bandwidth are key to keeping them fast. Meta's open-source system consists of an isolated high-bandwidth network that connects all their GPUs and domain-specific accelerators. Bandwidth is expected to grow by 5x to 10x by 2030. Hopper GPUs had 900GB/s NVLink; Blackwell has twice that. Each new GPU generation at least doubles compute throughput, which forces interconnect bandwidth to keep up so that the chips aren't left waiting for data.
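A back-of-the-envelope sketch of why the bandwidth doubling matters. The 90 GB payload is a hypothetical figure, and the calculation assumes an ideal link with no latency or protocol overhead, so real transfers will be slower.

```python
def transfer_seconds(payload_gb, bandwidth_gb_per_s):
    """Ideal transfer time: payload size divided by link bandwidth."""
    return payload_gb / bandwidth_gb_per_s

PAYLOAD_GB = 90.0  # hypothetical amount of data to move between GPUs

# NVLink bandwidth per GPU in GB/s, as quoted in the text
links = {"Hopper NVLink": 900.0, "Blackwell NVLink": 1800.0}

times = {name: transfer_seconds(PAYLOAD_GB, bw) for name, bw in links.items()}
for name, seconds in times.items():
    print(f"{name}: {seconds * 1000:.0f} ms")
```

If compute per GPU doubles while the link stays the same, the share of each step spent waiting on this transfer grows, which is the pressure described above.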

To support this, AI Labs need a high-performance, multi-tier, non-blocking network fabric that can manage traffic smoothly. Meta decided to open-source Catalina, a high-power rack capable of supporting up to 140kW. The other reference you can typically find on the internet is Bianca, but it's a total integrated system from Nvidia.

Source: Meta Catalina Specification via OCP

Components within Meta's compute tray

1. GB200 High Performance Module (N) - Modular component that contains the CPU and GPU
2. Host Management Controller (N) - Control panel that monitors all parts, checks temperature, etc.
3. ConnectX-7 (N) - Network interface card that links the GPUs to the data center fabric
4. Power Distribution Board (M) - Receives bulk power and distributes it to the other components
5. Data Center Secure Control Module (M) - Low-speed chips are moved to the DC-SCM, making the HPM less dense and cheaper. Allows upgrades without changing the whole HPM
6. E1.S NVMe backplane (M) - Interconnect board connecting SSDs to the main system
7. OSFP Carrier Board (M) - Acts as an interface between compute trays and the network fabric. Best for thermals and maintainability.
8. Front IO Board (M) - Interface board that keeps the motherboard cleaner
9. CX7 OCP NIC 3.0 (Commodity) - Separate board for the CX7 NIC to allow easier upgrades and to simplify cooling and signal integrity

N= Nvidia-designed
M= Meta-designed

Thermal Solution
Catalina uses a combination of liquid and fan cooling, with a suspected code name of "Channel Island". It uses a PG25-based coolant at a temperature of 10-12 degrees Celsius. Thermal sensors feed the baseboard management controller (BMC), with a tolerance of ±2°C. Channel Island can also detect leaks via sensors, contain leaks through its mechanical design, and respond to leaks by shutting down power or turning off the supply once detected.

Within the compute tray, there are eight fans. These fans provide air cooling for the E1.S drives, front end CX7 OCP NICs, the DC-SCM, and the power conversion circuitry on the PDB. The thermal design is resilient and can continue to operate with a single fan rotor failure. The cold plate loop is used for liquid cooling on the high-powered components - like the GB200 and CX7 backend NIC modules.

Electricity
From reading SemiAnalysis, I was intrigued to learn that power is cheaper in some US states than in other parts of the world. Notably, costs are about a third of Singapore's! Power demand is accelerating, and 50MW+ per facility will no longer be enough. Legacy data centres will no longer be relevant.
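To put the 50MW figure next to the 140kW Catalina rack mentioned earlier, here is a rough sketch of how many such racks a facility could power. The PUE value of 1.2 is a hypothetical assumption, and the calculation ignores networking, storage and redundancy, so treat the output as an upper bound.

```python
import math

def racks_supported(facility_mw, rack_kw, pue=1.0):
    """Whole racks a facility can power after cooling and overhead (PUE)."""
    it_power_kw = facility_mw * 1000 / pue  # power left for IT equipment
    return math.floor(it_power_kw / rack_kw)

ideal = racks_supported(50, 140)               # no overhead at all
with_overhead = racks_supported(50, 140, 1.2)  # hypothetical PUE of 1.2

print(ideal, with_overhead)
```

A few hundred racks per 50MW site is not many once racks move toward higher power, which is one way to read the claim that such facilities will no longer be enough.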

There are other components beyond electricity and thermal, which are open-sourced by Meta. Currently, they utilise Quanta, a Taiwanese supplier, to build their bare metal.

Scaling hardware isn't straightforward

2025-08-14T05:04:10Z

When I was working on deep learning at my previous job, I wasn't expecting to be interested in hardware. However, AI is different because the cost and quality of the infra make a significant impact on the model. So I did what I always do when I become curious: I went down the rabbit hole.

Introduction

The growth in machine learning usage, from natural language processing and graph neural networks to Monte Carlo tree search (AlphaGo), has driven a massive surge in demand for computational power. Models continue to double in size, and racks are on track to reach 1 MW by 2030.

There are several problems arising from this. First, adding extra GPUs doesn't always help - total system performance scales sub-linearly with the number of GPUs, so adding more compute nodes tends to reduce overall system efficiency. Second, migrating hardware between different sub-domains is tough. Financial trading models and autonomous driving have different priorities in terms of latency, memory usage, and throughput. AI models are unlikely to converge across sub-domains, and existing algorithms continue to evolve. Third, an average server requires ~50 subcomponents, and as Jensen introduces new GPU models, parts of the supply chain get reworked.

As a result, scaling hardware blindly leads to diminishing returns — and without careful hardware-software co-design, most of that extra compute goes to waste.
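One common way to picture sub-linear scaling is Amdahl's law. The sketch below is an illustration rather than a model of any real cluster: it assumes a fixed fraction of each training step (communication, synchronisation) cannot be parallelised, and the 95% parallel fraction is a made-up number.

```python
def amdahl_speedup(n_gpus: int, parallel_fraction: float) -> float:
    """Ideal speedup on n_gpus when only parallel_fraction of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_gpus)


def scaling_efficiency(n_gpus: int, parallel_fraction: float) -> float:
    """Speedup per GPU: 1.0 means perfectly linear scaling."""
    return amdahl_speedup(n_gpus, parallel_fraction) / n_gpus


if __name__ == "__main__":
    # Hypothetical workload where 95% of each step parallelises cleanly.
    for n in (1, 8, 64, 512):
        s = amdahl_speedup(n, 0.95)
        e = scaling_efficiency(n, 0.95)
        print(f"{n:4d} GPUs: speedup {s:6.2f}x, efficiency {e:6.1%}")
```

Under this assumption the speedup can never exceed 1 / 0.05 = 20x however many GPUs are added, which is why shrinking the serial part through hardware-software co-design matters more than adding nodes.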

Physical System Design*

1. Chips

Modern algorithms need lots of cores and fast memory. The easiest way to scale is with chiplets. Instead of building one big chip, you build smaller ones and connect them. It’s cheaper, more reliable, and scales better.

New players like Groq are rapidly entering the inference space, aiming to carve out market share. China is trying to move to local chips like Huawei's. To keep up, you need a modular system - something that can take new chips without fully starting over. OCP standards like Open Accelerator Infrastructure (OAI) and the Universal Baseboard (UBB) make that possible.

2. Chassis - Tray

A chassis combines key components - such as processors, storage drives and memory - into a compute, storage or memory node. Traditionally, 19-inch racks have been the industry standard. 21-inch racks, supported by OCP, are becoming increasingly popular due to bigger AI workload needs.

3. Server - Compute*

A GB200 NVL72 rack consists of 18 compute trays and 9 NVSwitch trays. Meta has open-sourced its Catalina NVL72 system. Catalina is Meta's next-generation AI/ML rack that supports large cluster training and inference use cases. The design focuses on achieving a fast time to market, alignment with industry references, and providing cutting-edge performance.

4. Pod Density - Compute Nodes

A pod is a series of compute nodes that work together as if they were a single computer. Even though the job is split across multiple physical machines, the software sees it as one machine. The nodes in a pod are connected with a low-latency interconnect like NVSwitch. In the case of Meta's NVL72, each tray contains 2 CPUs and 4 GPUs, and there are 18 trays in a rack. Two racks are then connected to fit 72 accelerators per pod. Pod density is expected to increase in the future, but due to current power and liquid-cooling constraints, most data centres cannot support the density of an NVL72 in a single rack.

OCP also talked about a future where there can be more than two accelerators per high-performance module, and depending on advancements in fabric technology, that number could be 576 per rack!

Different types of clients also require different ratios of GPUs to other components, depending on the primary purpose of the hardware. Semianalysis wrote a great article on how to improve bare metal cost by omitting less important units.

5. Networking

AI workloads often involve a huge amount of data moving between CPUs, GPUs, memory, storage and sometimes even across data centres. If all these transfers happened one after another, the network would become a bottleneck. Parallelism in networking means designing the network so that multiple data flows - CPU to GPU, GPU to memory, node to node, cluster to cluster - can run at the same time. This involves interconnects like NVLink and InfiniBand, and specialised network architectures. As my friend Reynold says, "Networking is a whole set of different complex problems."

Different networks that co-exist for different purposes

Frontend Networking (standard Ethernet): Connects servers to the outside world

Backend Networking (InfiniBand/RoCE Ethernet or other high-performance fabrics): Connects all nodes and servers, and moves huge amounts of data between GPUs in different servers with minimal delay

Scale-up Accelerator Interconnect (NVLink): Connects GPUs within a server so they can share memory and exchange data faster than over PCIe

Out-of-Band Networking for Manageability: If the main networks are overloaded or down, admins can still access the system through this network.

*Source: Open Compute Society, Semi Analysis

Reflections 2025

2025-08-05T06:09:10Z

What do I want

I’ve always liked ideas and people. I like to surround myself with smart kids who think deeply. I particularly like creative, courageous, contrarian geeks. My main aim is to meet those people, and there are some industries that attract them – quant HFs and tech startups. I want my future job to involve working with or servicing them. I want to be in the US/UK and then maybe Asia after. I want to work on building new things.

I realised that I have a strong appetite for taking risk. I also learned that I love multiple domains and can come up with original ideas. I love adrenaline, and I love being on the edge. A great life to me would be work that is like skiing and holidays that are like onsen. I get obsessively curious at times, but there are very few ideas that can get my blood boiling for months. AI is one of them. Macro is another. Once it happens, I can’t help but be relentlessly resourceful. The only thing that would stop me is failing at it. So far, I’ve only failed once.

It is hard

Frankly, I care about the AI stuff, working with smart people and being competent. I always believe in choosing the right boat first and just saying yes. I am also sure I don’t want to do sales, because it’s time to *thrive* in a new skill. I think it suits my generalist, polymathic nature. I’m always interested in a broad range of things – René Girard, history, geopolitics, maths, philosophy, science – hence my obsession with macro. But being competent is important, and you must deliver value. Profit is an important discipline for new ideas. I see myself in a BD/Finance role with ML skills, not just a pure ML one.

I think the age of AI is great for people with high agency and polymaths. Not going to lie, the job market is intense. I’m 100% confident that this is the right decision in the long term for me, but oh boy, I am about to stomach a lot of pain. I literally thought about entering a swap contract with the people I love.

T: Hey, can I sleep on your sofa (or in your extra room) for one month if everything fails in the next 3 years? On the other hand, if I have X amount after 35 years old and no debt, how about a 1k USD cash gift or hotel voucher as an appreciation <3 ?

Frankly, I thought it was a great idea, because I’m motivated to give back to my mentors and friends (you know who you are), and they always see the upside in me. At the same time, protecting my liquidity and downside is important in this unpredictable, volatile environment.

Entry-level jobs have already vanished. We can talk for 3 hours about what this means for the future - more dissent, less peace, an increasingly winner-takes-all world. I am determined to win. I am also vulnerable, nervous and excited.

Principles

2025-08-03T09:03:16Z

Inspired by Ray Dalio's book, which led to my pursuit of a macro career. I'm not the best writer, but I love ideas, and I love being competent. I think profit is an important discipline to pursue more ideas. I love thinking, and my main goal is to meet interesting, deep thinkers and make a bit of money along the way.

1. Momentum is everything - both negative and positive
When I was 10, life changed dramatically. Life became 20% worse, and then two years later it became 80% worse. When you are poor, you become stressed, which affects your relationships, makes you lonely, you fight more, and things have a tendency to spiral downward. Failure can be very demoralising.

Success, on the other hand, is very motivating. You get a good job, you are competent, you get told you are talented, you feel more motivated to do a better job, you go home happy, you smile more, and you have better relationships with others. You can't give anything to others when your cup is empty, so it's important to take care of yourself.

The challenge is stopping the momentum when you are failing and recognising that positive momentum doesn't last forever. The key for me has been stoic philosophy at times of pain, and gratitude during the good times.

2. Use your past to your advantage

My childhood has moulded me to be both optimistic and paranoid at the same time. I firmly believe in Andy Grove's line, "Only the Paranoid Survive". I do not think paranoia equates to unhappiness, nor that optimism equates to happiness. I am very confident I can go through most kinds of adversity, but I know it is going to be painful, so I actively try to think hard about the future and take risks while I am young. I'm rather optimistic and open with people, so I like reaching out to strangers whom I admire. This means I'm naturally good at sales and at being relentlessly resourceful. This could also mean I fail to recognise red flags in others, but because I am very detailed about numbers, I rarely fall for financial scams.

Knowing this weakness, I save money during good times. I recently quit my job because I want to work in AI, and I knew the fear of negative personal cash flow would force me to act urgently. You can use both your strengths and weaknesses to your advantage.

3. Don't listen to others' advice on burnout
Burnout occurs when an individual experiences excessive emotional adversity in life that they cannot overcome. But if you do not feel emotion in that particular activity, then you may have found your biggest edge.

4. Have the courage to be original
I do not get embarrassed easily. I find it fun to reach out to people in interesting ways. I find it fun that they find me amusing, and I brush it off when they reject me.

I started with cold emails and LinkedIn, but found the return rate unsatisfactory. I decided to be different, so I cold-called CEOs and seniors. I found their numbers through LLMs or BBG. Seniors would pick it up; some would kindly reject me. I've cold-called 5 CEOs, but frankly, I wasn't able to get past their assistants.

So I decided to send a bunch of them FedEx mail. I figured the fancy mail would spark their curiosity. The problem with this is that you can't track whether they see the mail or not.

There was a time when I would record myself on Loom. I did this for ~10 companies, with most of them replying! I sometimes sent it to their generic email address (info@company.com) if I couldn't find the individual's email. You can track whether someone has viewed it or not. Once, I printed a shirt, filmed myself and sent it to every single member of their board. Both quickly replied within hours, and I learned a few things from the conversation. They loved the creativity, but I was rejected for visa reasons.

Despite multiple rejections, I realised I can do it for months because I experience very little mental pain from the act of hustling. My biggest constraint is my finances, and I don't like burning money on my FedEx mail and t-shirts. I also don't like paying a premium for Loom.

Frankly speaking, these are all the things I did when I was younger. I am not sure if I would do it now. You get a lot of bandwidth for failure when you are a young person.

5. The world doesn't care about talent; they care about success.

The very first non-professor adult I've met in Hong Kong was a chief executive who came to school to give a speech. I happened to be the only non-Chinese with the best grades, so I got a chance to make a speech afterwards. I shared my life story, he came forward, gave me his name card and proactively told me to reach out to him. Over the course of 7+ years, we would meet regularly. I thought this was the norm in Hong Kong, that EVERYONE wants to talk to a student.

Turns out that man was unique. I was convinced that he was once a mildly eccentric, obsessively curious, well-read independent thinker. He's managed to assimilate into society, but he firmly believes that I was talented. 7 years later, after being exposed to a myriad of talents, I realised he saw a bit of himself in me. He told me to read more and that school wasn't that important. I read a lot of books during my time in University, and it was the best decision I've ever made.

Despite getting 4A* and winning Top in the World in AS Mathematics, I was quite shocked when multiple people convinced me I couldn't go to work in macro because I didn't go to a target school (I wasn't smart enough). My heart dropped when a senior trading guy told me that I should quit school and take out a loan to attend a better school. I didn't have the financial means. I didn't believe in loans.

Through a mix of hustling and luck, I got a fun job in macro, and then the opposite happened - most of the people at work called me special and talented. I was confused, but reality is probably in the middle. I wasn't as bad as people said at Uni, nor as smart as people say at work. Do not let yourself be fooled by human mimesis.

The reality is that no one (except for a few like that guy) cares about your talent; they only care once you achieve success. If you are talented but not successful, don't worry. Focus on working hard and getting what you want. Results matter. If you are less talented but feel imposter syndrome, why worry? The world is filled with unsuccessful people with talent and no opportunity.

6. Understand your strengths and blind spots

Have high convictions about your unique strengths, but be humble about your blind spots. I am always curious about my limits and trust my instincts when it comes to doing what I want. I love adventures, am not a perfectionist, and tend to act with a strong sense of urgency. When it comes to my weakness, I tend to listen to others and control that inner voice in my head that tells me to "do it". I've learnt that strength and weakness are usually two sides of the same coin.

7. Find your own game when it comes to luck

Coming back to the cold mailing story, I find it more fun to do something different, but I've seen admirable friends who do the "same thing" and win by sheer willpower. The game of luck is different for each individual. Math has inspired one of my principles -
A die with six sides and a value from 1 to 6 has an expected value of 3.5 on every single throw. You can rate your life from 1 to 6.
How do you optimise for a 6/6?

Most people struggle to reach for a 6/6 life when they already have a 4/6 life. Most will choose to settle and stay in their comfort zone, because rationally speaking, it is above expected value. However, in a world that is changing, the biggest risk is not taking risks at all. It's a personal decision, but having gone through ups and downs, I find living at rock bottom (or below expected value) stressful financially, but clear mentally. There is only one direction to strive for - up.

Assuming you work hard to develop mental resilience, you will find yourself having multiple chances to roll the dice. I like to leverage my biggest strength and roll the dice 100 times before eventually getting what I want. Life will be worse first, before it gets better, but leveraging your unique edge means it's never too bad! You will find the journey fulfilling, and you will gain a stronger belief in yourself.

My mentality in life has always been to "roll the dice", but it's important to understand which phase of life you are currently in, hedge your biggest risks and make sure it doesn't kill you. For me, the fun is in the journey. I realise I enjoyed rolling the dice more than achieving the outcome.

I've seen instances where my smart friends took the more patient, less risky, slower method of "engineering the dice" to obtain a 6/6 life without going below EV. In the world of hedge funds, there's a saying that Soros gets into the mess and knows when to get out, while Stan never gets into the mess in the first place.

8. Aim to predict the future by understanding history

The Sovereign Individual is an unusually ambitious, thought-provoking book written in the 1990s. The authors try to predict the future by understanding how technology results in shifts in "megapolitics". It ties in nicely with Girard's concept of mimesis: people imitate each other's desires, and that mimesis drives cultural change.

I've always had a view that every generation is born into a particular culture at a particular time. The Millennials complained about not being able to afford housing in their 20s. Gen Z never thought about it in the first place. As we enter a period of rapid change, it's essential to recognise that the methods of attaining wealth and a good life from previous generations may not apply to our own. Meanwhile, there are universal principles of human desire, greed and fear that remain constant.

Learn to think hard about the future.

9. Learn Emotional Intelligence from Abe Lincoln

Smart people often have intense, extreme personalities. If you optimise your life to work with great people and great problems, you will need to learn emotional intelligence. This means understanding your emotions, having the ability not to act on them, and developing empathy for others. During the Civil War, Abraham Lincoln faced the immense pressure of leading a divided nation and managing conflict within his team. His capacity to regulate his emotions and maintain clarity and purpose in the most difficult circumstances is deeply inspiring to me.

10. Spend time on things that give you energy - be it friends or work

To live a fulfilling life, focus on activities and people that boost your energy. For me, creative work that challenges my mind is energising. Thoughtful conversations spark new ideas and keep me engaged. Humour is a key part of my life, bringing joy and lightness. I choose to spend time with brilliant friends who sharpen my thinking and share positive energy, creating a sense of mutual inspiration.

Opportunities in AI Infrastructure

2025-08-01T08:56:01Z

Why AI Infra

I wanted to work in AI Infra because it's more fragmented and less winner-takes-all. Even if you fail as a seller, the knowledge of GPUs and data centers will be useful to buyers procuring them.

The world can buy cars from Japan or China, but every country needs to build its own roads. Same with data centers. Software is America-centric, and it’s not clear to me yet who the winner is, except for the top 3 labs. Same with AI Agents.

Infrastructure also has its nasty downsides. Products tend to lack differentiation, the business is prone to margin squeezes from competitors, and it is very capex heavy – power, real estate and GPUs. Recall what happened with Cisco in the internet boom of the 1990s. While the infrastructure is being built, your sales go up +50% a year. Once it is built, sales growth doesn't just slow from +50%, it turns negative, because on a rate-of-change basis customers no longer need new infrastructure. In the early 2000s, a lot of companies with estimates of +50% to +70% growth for the next 2-3 years had businesses that were about to collapse. The Nasdaq fell by roughly 78% from its March 2000 peak to its October 2002 low.
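The rate-of-change point can be made concrete with a toy model. This is a simplified sketch with invented numbers: it assumes a vendor's yearly sales equal the net new capacity its customers add, and it ignores replacement and upgrade demand.

```python
def installed_base(start: float, yearly_growth: list[float]) -> list[float]:
    """Customers' installed infrastructure at the end of each year (hypothetical units)."""
    levels = [start]
    for g in yearly_growth:
        levels.append(levels[-1] * (1.0 + g))
    return levels


def vendor_sales(levels: list[float]) -> list[float]:
    """Simplifying assumption: vendor sales in a year = net new capacity added that year."""
    return [later - earlier for earlier, later in zip(levels, levels[1:])]


if __name__ == "__main__":
    # Made-up build-out: three years of +50%, then the build-out is nearly finished.
    levels = installed_base(100.0, [0.5, 0.5, 0.5, 0.1, 0.0])
    sales = vendor_sales(levels)
    for year, s in enumerate(sales, start=1):
        print(f"year {year}: installed base {levels[year]:7.2f}, vendor sales {s:7.2f}")
```

In this toy run the installed base never shrinks, yet vendor sales drop sharply as soon as customer growth slows, and reach zero when the build-out stops.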

AI Infra for HFs

In 2023, I thought there were fewer than 10 HF players who were successful with neural networks. My colleagues thought it was still an experiment. Let’s not talk about the moment when a senior guy in trading told me that neural networks don’t work. I was dead wrong.

In 2025, I learnt that there are people who’ve been doing this for decades. The Chief Scientist of OpenAI is an ex-HF guy. Look at all the ICML/NeurIPS sponsors. Compare 2023 vs 2024. The number of quant HF sponsors almost doubled.

There is a firm called “B”. They are discreet. They work with a few high-profile quant HFs whom I prefer not to write about. Not Quadrature or TGS or RenTech, but you can assume they also have compute. Lucky for me, they had an HPC for finance guide which I liked, so I cold-called them. They told me that everyone wants more compute, and that HF clients seem to have been making money this year and last year. I also learned from a P.Decrem that HFs aren’t excited about 10k clusters anymore. They want more, but Nvidia is focusing on sovereign clients. I don’t know what I don’t know, but I suspect most are experimenting while some are making money.

I had a conversation with a neocloud* who told me that they think only 40 companies in this world can “make money” from large clusters and everyone is trying to get them.

Company B is small yet efficient - it made 75 million in revenue last year, paid its 10 employees 6 million in total salary, and is fully owned by a 60-year-old guy who I reckon is about to retire. Last year, he paid himself 3 million in dividends. I totally want to build a company like this.

Opportunities – There are two opportunities I see

First, build AI Infra, starting with HFs and then other niche players.
“B” - these guys are middlemen. I assume most quant HFs do not want to go through the hassle, so they ask B, who sources the GPU servers from an OEM like Dell to build the data centers. OEMs tend to charge a ~30% markup versus ODMs, but that is due to the specialization of the servers. As compute clusters grow larger, some players like Meta decide to build their own and source from an ODM like Quanta (who charge only a ~1% markup). To do so, you need both the technical expertise for server configuration and the willingness to negotiate per component. This is nitty-gritty infra stuff which some rich clients prefer to outsource.

My question is – how long does your competitive edge in trading last before your P&L gets competed away, just like systematic strategies in the 1990s? How do you ensure ROI on your models? Assuming your moat gets reduced over time, will your edge naturally be in managing CapEx smartly, just like the AI Labs guys? This is not just about getting a good financing rate; it’s about sourcing, relationships and technical expertise in networking, assuming everything is reliable.

The additional complexity comes with how next generation GPUs need more power density and liquid cooling. Existing data centers are not equipped for that. Making it harder, every chip manufacturer, every server assembler has a different manifold or different way to get liquid on the cold plate.

The argument for neoclouds* is that their rich clients want the newest chips and that the cloud is secure enough, so renting makes sense. I talked to “B” and he disagrees: HFs prefer to have their own data centers with older-generation chips for security reasons. I suspect some HFs/AI Labs are okay with H100s for now because the software stack is more robust. Will it change in 10 years? Someone like XTX is probably stuck with their 2023-2026 version of data centers.

Second: Nearly everything is a rounding error compared to GPU cost

AI in the Cloud, how to keep your models flying high and deliver ROI*
https://calv.info/openai-reflections

The second opportunity is working on a job that solves a new problem that arises with AI. In the case of AI, the underrated problem is managing cost. There are smart engineers building things like context caching or smarter prompt engineering to optimize the cost of LLMs. On the finance side, OpenAI is hiring people with compute/procurement knowledge who also know how to do ML forecasting. The trend is rather clear – you need both some kind of contextual knowledge (finance) and ML.

An interesting excerpt from an OpenAI guy
“We had to forecast out the load capacity requirements as part of the Codex launch, and doing this was the first time I'd really spent benchmarking any GPUs. The gist is that you should actually start from the latency requirements you need (overall latency, # of tokens, time-to-first-token) vs doing bottoms-up analysis on what a GPU can support. Every new model iteration can change the load patterns wildly.”
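One way to read the "start from the latency requirements" advice is sketched below. It is not the method used at OpenAI, and every name and number is hypothetical: the per-GPU throughput is assumed to have been benchmarked while the latency targets (time-to-first-token, overall latency) were being met, rather than taken from a peak spec sheet.

```python
import math


def gpus_needed(
    peak_requests_per_s: float,
    output_tokens_per_request: float,
    tokens_per_s_per_gpu_at_slo: float,
    headroom: float = 0.3,
) -> int:
    """GPUs required to serve peak token demand while staying within the latency target.

    tokens_per_s_per_gpu_at_slo is a benchmarked figure (hypothetical here): the
    throughput one GPU sustains while still meeting the latency requirement.
    headroom is spare capacity for load spikes and shifting load patterns.
    """
    token_demand = peak_requests_per_s * output_tokens_per_request
    return math.ceil(token_demand * (1.0 + headroom) / tokens_per_s_per_gpu_at_slo)


if __name__ == "__main__":
    # Hypothetical launch forecast: 50 requests/s, 400 output tokens each,
    # 1,500 tokens/s per GPU measured at the latency target.
    print(gpus_needed(50, 400, 1500))
```

Because each model iteration can change the load pattern, the benchmarked throughput would need to be re-measured per model rather than reused.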

TCO Analysis of 10k Cluster

My bare metal chassis analysis of a 10k H100 GPU Cluster

Let me know what you think!

Letter

2025-07-08T04:00:00Z

I was lucky I had a pretty amazing boss for my first job.

Download Tiffany_RecLetter__1_.docx

Total Cost of Ownership Analysis

2025-07-01T19:00:00Z

TCO Analysis of 10k Cluster

This is my bare-metal chassis analysis for a 10,000-GPU H100 cluster. It’s a helpful way to understand every part of a server. The goal was to compare costs when buying directly from ODMs versus OEMs, using typical markups of about 1% and 30% respectively. With the fast-changing AI market, these numbers may continue to shift.
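The effect of the two markups can be sketched in a few lines. The 1% and 30% figures come from the text above; the pre-markup server cost and the 8-GPUs-per-server layout are placeholder assumptions for illustration, not numbers from the analysis.

```python
import math

ODM_MARKUP = 0.01  # ~1%, from the text
OEM_MARKUP = 0.30  # ~30%, from the text


def purchase_price(base_cost: float, markup: float) -> float:
    """Price paid for hardware with a given pre-markup cost."""
    return base_cost * (1.0 + markup)


def fleet_saving(base_cost_per_server: float, n_gpus: int, gpus_per_server: int) -> float:
    """Total saving from buying every server at the ODM markup instead of the OEM markup."""
    n_servers = math.ceil(n_gpus / gpus_per_server)
    per_server = (purchase_price(base_cost_per_server, OEM_MARKUP)
                  - purchase_price(base_cost_per_server, ODM_MARKUP))
    return n_servers * per_server


if __name__ == "__main__":
    # Hypothetical: $250,000 pre-markup cost per 8-GPU server, 10,000 GPUs in total.
    saving = fleet_saving(250_000, 10_000, 8)
    print(f"ODM vs OEM saving: ${saving:,.0f}")
```

This only captures the sticker-price gap. It leaves out the engineering time, per-component negotiation and support that the OEM markup partly pays for.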

Notes:

The cost estimates are based on data from PytorchAtoms. I prorated the figures to a 10,000-GPU setup and performed several cross-checks to ensure internal consistency.

The server-component data is from 2024, while the operational assumptions reflect conditions as of July 2025. I used a weighted average cost of capital (WACC) of 9.1%.

A range of 7%–10% seems reasonable, given a current U.S. risk-free rate of approximately 4.3%. The 90% utilization rate is sourced from SemiAnalysis, though it may be somewhat optimistic.

Most of the underlying figures come from PytorchAtoms and SemiAnalysis. The electricity cost assumption of $0.087 per kWh is based on North Dakota rates.
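To see how these assumptions interact, here is a rough cost-per-GPU-hour sketch. It is not the spreadsheet behind this post: the WACC (9.1%), utilization (90%) and electricity price ($0.087/kWh) come from the notes above, while the capex per GPU, the 4-year life and the all-in power draw are placeholder assumptions.

```python
HOURS_PER_YEAR = 8760


def capital_recovery_factor(rate: float, years: int) -> float:
    """Annuity factor: constant yearly payment per $1 of capex at a given discount rate."""
    return rate / (1.0 - (1.0 + rate) ** -years)


def cost_per_gpu_hour(capex_per_gpu: float, wacc: float, life_years: int,
                      power_kw_per_gpu: float, usd_per_kwh: float,
                      utilization: float) -> float:
    """Yearly capital charge plus electricity, spread over the hours actually used."""
    annual_capital = capex_per_gpu * capital_recovery_factor(wacc, life_years)
    # Simplification: power is assumed to be drawn all year, even when idle.
    annual_power = power_kw_per_gpu * HOURS_PER_YEAR * usd_per_kwh
    return (annual_capital + annual_power) / (HOURS_PER_YEAR * utilization)


if __name__ == "__main__":
    # Hypothetical inputs: $40,000 all-in capex per GPU, 4-year life, 1.4 kW per GPU.
    for utilization in (0.9, 0.6):
        cost = cost_per_gpu_hour(40_000, 0.091, 4, 1.4, 0.087, utilization)
        print(f"utilization {utilization:.0%}: ${cost:.2f} per GPU-hour")
```

Since utilization sits in the denominator, dropping from 90% to 60% raises the cost per useful GPU-hour by half, which is why the 90% assumption deserves scrutiny.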

My top 5 favourite books of all time

2025-06-11T06:40:46Z

At 17, I cultivated a reading habit on the advice of a cherished mentor. Like many other Gen Zs, I grew up immersed in YouTube binges and movies. Embracing reading was one of the best decisions of my life, bringing me wisdom, clarity, and joy that many others overlook. I discovered a deep affinity for the intellectual world, gravitating toward books with original ideas that remain relevant for decades. Below are my top five books of all time, followed by my complete reading list.

My favourite books and why:

1. Impro by Keith Johnstone

Most people overthink decisions, paralysed by what’s “correct.” Keith Johnstone’s Impro says: stop thinking, start acting. Written in 1979, this book is engaging and timeless. The chapter on status was interesting. It breaks down behaviours mechanistically, and it's a great framework that everyone can learn from. I first discovered it after reading it here.

2. The Sovereign Individual

Originally published in 1997, the book makes a shocking number of predictions that came true 28 years later. Big ideas sound crazy until they’re obvious. This book challenges me to think harder about the future by understanding the past. It encourages strategic thinking by understanding megapolitics - large-scale forces shaping civilisations.

3. Zero to One by Peter Thiel

Copying others takes you from one to n; building what doesn’t exist takes you from zero to one. An absolutely timeless book that challenges you to be bold and follow your original ideas. The book resonates with J.C.R. Licklider’s 1960s vision of a world where computers eliminate geographical barriers to creativity and collaboration. Fast forward 50 years, and societal pressures for conformity still hinder the potential of the individual. This book inspires me to be authentic and question the status quo. There are no rules, only consequences.

4. The Hard Thing About Hard Things by Ben Horowitz
Contrary to expectations, the most difficult things at work are not the tasks, but navigating human conflicts and the emotional toll that comes with a high-pressure environment. School doesn't teach you that, but this book will. Learn to separate what’s best for the mission from what feels good. Its raw honesty makes it valuable for leaders and entrepreneurs.

5. Principles by Ray Dalio
I stumbled upon this book by pure chance. I was a kid from Indonesia, freshly arrived in Hong Kong, who knew nothing about the financial markets. This book inspired me to pursue financial markets and reap the rewards of the intellectual thrill. It also piqued my interest in meditation, and my subsequent experience in Vipassana was one of my fondest memories. The book is a reminder to carve out your own principles, follow your integrity and learn from mistakes.

Clouds

2025-04-10T13:52:00Z

My interest in GPUs began in 2023 when Nvidia unveiled plans for the Blackwell chip. I was struck by reports that each new chip was slashing inference costs by up to 30x. I recalled a key insight from Andy Grove’s Only the Paranoid Survive, where he described the exponential decline of Intel’s memory business due to disruptive technological shifts.

I was curious about how this leap in computing power was going to affect my business. I was shocked to learn that one of the largest players in electronic FX trading (a hedge fund called XTX) had the biggest private cloud usage of A100s in the world (according to this report in 2023). Back then, no one I knew was using it in trading. I dug further and discovered other secretive hedge funds – the likes of TGS investments – constructing massive 500,000 sqft data centers.

But why build when you can rent?

Neoclouds are GPU-based infrastructure tailored for AI and ML. Similar to the production of traditional cars vs electric cars (Toyota vs Tesla), traditional hyperscalers (AWS) and neoclouds are built differently from the bottom up. Traditional hyperscalers, however, have a big edge in financing.

Interestingly, the business is more fragmented than I initially thought, with different companies pursuing different strategies. Unlike direct competitors on the software side (Windsurf vs Cursor), different clients of the infrastructure business are willing to accept certain trade-offs - whether it's premium pricing in exchange for a single-tenant offering, or a preference for a particular software stack.

One of my favourite resources to learn more about this industry is semianalysis.com

According to Dylan & team, there are nine major factors differentiating the best cloud providers, ranging from:

1. Security
2. Lifecycle and Technical Expertise
3. Slurm and Kubernetes Offerings
4. Reliability/SLA
5. NCCL/RCCL Networking Performance
6. Storage
7. Active/Passive Health Checks and Monitoring
8. Pricing and Consumption Model
9. Technical Partnerships

It's more complicated than you think

Building a world-class neocloud is more complicated than you think. It’s not a simple act of combining GPUs, networking them and supplying power. It’s an extremely complicated system with hundreds or even thousands of potential failure points for a single workload, and the pioneers have put automated systems and active burn-ins in place to provide better customer service. The software stack is increasingly integral to the success of cloud architecture. The neocloud business is rapidly expanding, with the total addressable market projected to increase from $33 billion in 2023 to $260 billion by 2030.*
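For a sense of what that projection implies, the compound annual growth rate can be backed out from the two endpoints quoted above. This is a small sketch that assumes smooth compounding between 2023 and 2030.

```python
def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Constant yearly growth rate that turns start_value into end_value over `years`."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    # $33bn in 2023 to $260bn in 2030 is 7 years of compounding.
    cagr = implied_cagr(33.0, 260.0, 2030 - 2023)
    print(f"Implied TAM growth: {cagr:.1%} per year")
```

That works out to roughly 34% a year, sustained for seven years.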

Booms and Busts

Technological revolutions often spark economic booms, but history shows that they can also lead to devastating busts when optimism outpaces fundamentals. The mobile and cloud computing waves of the 2000s and 2010s drove sustained growth without major collapses, thanks to measured scaling and broad market adoption. In contrast, the dot-com and telecom crash of 2000-2002 serves as a stark warning for industries reliant on heavy financing and infrastructure.

There are two historical busts that I think we can draw critical lessons from. The first is overcapacity in servers and fiber optics in the 1990s. Companies like Global Crossing and WorldCom overbuilt fiber optic networks, anticipating demand that never fully materialised. By 2001, only 5% of the laid fiber was operational, leading to plummeting prices.

Second is the collapse of Japanese semiconductor manufacturers after the introduction of the 1985 Plaza Accord. This agreement, aimed at correcting trade balances, strengthened the Japanese yen, making exports costlier. By the 1990s, Japan’s global share of semiconductors dropped from over 50% to 20%.

The AI boom holds transformative potential, but history warns against unchecked optimism. Leading neocloud providers are proactively mitigating risks through strategic measures: “take or pay” contracts ensure revenue stability, while partnerships with deep-pocketed clients and well-funded AI startups reduce exposure to customer defaults. By combining long-term optimism and short-term paranoia, leading neoclouds can position themselves to avoid the pitfalls of past tech waves.

* Source: Nebius Group

Why I name this blog finetti

2025-03-30T07:15:00Z

While I enjoy reading across many subjects, science has always held a special place in my heart. One day, I stumbled into Bruno. Bruno de Finetti was an Italian mathematician who famously stated that "probability does not exist" in an objective sense; instead, probability exists only as a mental construct representing how likely a person thinks an event will happen based on their information and beliefs.

I think it's a great reminder of life - no matter how data-driven we are, our probability judgment is filtered through our subjective beliefs.

About Me

2025-03-25T05:12:00Z

I love adventures, deep thinking and reading. Previously, I worked in macro at an American investment bank. I still love understanding how the world works, and macro will always have a special place in my heart.

Like many others, I first stumbled into AI after watching AlphaGo. This year, I decided to pivot in my career and follow my intellectual curiosity.

Some of my interests:

Economic Growth.
Books and ideas. Big fan of press.stripe.com
Business Philosophy
Math Puzzles that make you think

X: tiff_soerianto
Email: soerianto.tiffany@gmail.com
