Anurag's Link Blog

Collection of interesting ideas and snippets I've found around the web.


How Cursor indexes codebases ft Merkle Trees

cursor.com tech

Cursor builds its first view of a codebase using a Merkle tree, which lets it detect exactly which files and directories have changed without reprocessing everything. The Merkle tree features a cryptographic hash of every file, along with hashes of each folder that are based on the hashes of its children.
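As a rough sketch of the idea (not Cursor's actual implementation), a Merkle hash over a file tree can be computed recursively. The in-memory `workspace` dict, the `merkle_hash` helper, and the `file`/`dir` prefixes are assumptions made for illustration only.

```python
import hashlib

def merkle_hash(node):
    """Hash a file (bytes) or a directory (dict mapping name -> node)."""
    if isinstance(node, bytes):
        # Leaf: hash the file content (prefix keeps files and dirs distinct).
        return hashlib.sha256(b"file\0" + node).hexdigest()
    h = hashlib.sha256(b"dir\0")
    for name in sorted(node):  # sorted so the result ignores dict order
        child = merkle_hash(node[name])
        h.update(name.encode("utf-8") + b"\0" + child.encode("ascii") + b"\0")
    return h.hexdigest()

def snapshot(ws):
    """Record the hashes we want to compare before and after an edit."""
    return {
        "root": merkle_hash(ws),
        "src": merkle_hash(ws["src"]),
        "readme": merkle_hash(ws["README.md"]),
    }

# Hypothetical workspace: two source files and a README.
workspace = {
    "src": {"a.py": b"print('a')\n", "b.py": b"print('b')\n"},
    "README.md": b"hello\n",
}

before = snapshot(workspace)
workspace["src"]["a.py"] = b"print('edited')\n"  # a small client-side edit
after = snapshot(workspace)

# Only the edited file's ancestors change; the README hash is untouched.
changed = sorted(k for k in before if before[k] != after[k])
print(changed)
```

Editing one file changes its own hash, the hash of `src`, and the root hash, while unrelated entries keep their old hashes.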

Small client-side edits change only the hashes of the edited file itself and the hashes of the parent directories up to the root of the codebase. Cursor compares those hashes to the server's version to see exactly where the two Merkle trees diverge. Entries whose hashes differ get synced. Entries that match are skipped. Any entry missing on the client is deleted from the server, and any entry missing on the server is added. The sync process never modifies files on the client side.
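The sync rules in this paragraph can be sketched as a recursive comparison of two hash trees. This is a minimal illustration under assumed data shapes: each node is a dict with a `hash` and, for directories, a `children` dict. The hash strings and the `diff_trees` name are made up for the example.

```python
def diff_trees(client, server, path=""):
    """Yield (action, path) pairs that bring the server in line with the client."""
    if client is None:
        yield ("delete", path)  # missing on the client: remove from the server
        return
    if server is None:
        yield ("add", path)     # missing on the server: upload it
        return
    if client["hash"] == server["hash"]:
        return                  # identical subtree: skip without descending
    c_kids = client.get("children")
    s_kids = server.get("children")
    if c_kids is None or s_kids is None:
        yield ("sync", path)    # a file whose content hash differs
        return
    for name in sorted(set(c_kids) | set(s_kids)):
        child_path = f"{path}/{name}" if path else name
        yield from diff_trees(c_kids.get(name), s_kids.get(name), child_path)

# Placeholder hashes; a real tree would hold SHA-256 digests.
client = {"hash": "r2", "children": {
    "README.md": {"hash": "m1"},
    "src": {"hash": "s2", "children": {
        "a.py": {"hash": "a2"},
        "c.py": {"hash": "c1"},
    }},
}}
server = {"hash": "r1", "children": {
    "README.md": {"hash": "m1"},
    "src": {"hash": "s1", "children": {
        "a.py": {"hash": "a1"},
        "b.py": {"hash": "b1"},
    }},
}}

actions = list(diff_trees(client, server))
print(actions)
# [('sync', 'src/a.py'), ('delete', 'src/b.py'), ('add', 'src/c.py')]
```

Every action targets the server, which matches the note that the sync never modifies files on the client side.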

The Merkle tree approach significantly reduces the amount of data that needs to be transferred on each sync. In a workspace with fifty thousand files, just the filenames and SHA-256 hashes add up to roughly 3.2 MB. Without the tree, you would move that data on every update. With the tree, Cursor walks only the branches where hashes differ.
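The 3.2 MB figure is consistent with roughly 64 bytes per file. The source does not break the number down, so the split below (a 32-byte raw SHA-256 digest plus an assumed 32-byte average filename) is only one plausible way to arrive at it.

```python
FILES = 50_000
HASH_BYTES = 32       # size of a raw SHA-256 digest
AVG_NAME_BYTES = 32   # assumed average filename length, not from the source

total_bytes = FILES * (HASH_BYTES + AVG_NAME_BYTES)
total_mb = total_bytes / 1_000_000
print(total_bytes, total_mb)  # 3200000 3.2
```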

When a file changes, Cursor splits it into syntactic chunks. These chunks are converted into the embeddings that enable semantic search. Creating embeddings is the expensive step, which is why Cursor does it asynchronously in the background.

Most edits leave most chunks unchanged. Cursor caches embeddings by chunk content. Unchanged chunks hit the cache, and agent responses stay fast without paying that cost again at inference time. The resulting index is fast to update and light to maintain.
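A content-addressed cache along these lines might look like the sketch below. Everything here is a stand-in: `fake_embed` replaces a real embedding model, `split_chunks` splits on blank lines instead of doing syntactic chunking, and the `EmbeddingCache` class is hypothetical.

```python
import hashlib

class EmbeddingCache:
    """Cache embeddings keyed by the SHA-256 of the chunk text."""

    def __init__(self, embed_fn):
        self._embed_fn = embed_fn
        self._store = {}
        self.misses = 0  # number of times the expensive embed_fn actually ran

    def embed(self, chunk):
        key = hashlib.sha256(chunk.encode("utf-8")).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._embed_fn(chunk)
        return self._store[key]

def fake_embed(chunk):
    # Placeholder for a real (slow, costly) embedding call.
    return [float(len(chunk)), float(sum(chunk.encode("utf-8")) % 997)]

def split_chunks(source):
    # Stand-in for syntactic chunking: split on blank lines.
    return [c for c in source.split("\n\n") if c.strip()]

cache = EmbeddingCache(fake_embed)
v1 = "def a():\n    return 1\n\ndef b():\n    return 2\n\ndef c():\n    return 3\n"
v2 = v1.replace("return 2", "return 20")  # edit one function only

for chunk in split_chunks(v1):
    cache.embed(chunk)
misses_first = cache.misses

for chunk in split_chunks(v2):
    cache.embed(chunk)
misses_second = cache.misses - misses_first

print(misses_first, misses_second)  # 3 1
```

After the edit, only the changed chunk is embedded again; the other two are served from the cache.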

The indexing pipeline above uploads every file when a codebase is new to Cursor. New users inside an organization don't need to go through that entire process though.

When a new user joins, the client computes the Merkle tree for a new codebase and derives a value called a similarity hash (simhash) from that tree. This is a single value that acts as a summary of the file content hashes in the codebase.

The client uploads the simhash to the server. The server then uses it as a vector to search in a vector database composed of all the other current simhashes for all other indexes in Cursor in the same team (or from the same user) as the client. For each result returned by the vector database, we check whether it matches the client similarity hash above a threshold value. If it does, we use that index as the initial index for the new codebase.
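The excerpt does not say how Cursor derives its simhash, so the sketch below uses the classic 64-bit simhash construction as an assumption. A linear scan stands in for the vector database, and the `0.8` threshold, the index names, and the file-hash strings are all hypothetical.

```python
import hashlib

BITS = 64

def simhash(file_hashes):
    """Classic simhash: each input votes +1 or -1 on every bit position."""
    counts = [0] * BITS
    for fh in file_hashes:
        digest = hashlib.sha256(fh.encode("utf-8")).digest()
        value = int.from_bytes(digest[:8], "big")
        for i in range(BITS):
            counts[i] += 1 if (value >> i) & 1 else -1
    return sum(1 << i for i in range(BITS) if counts[i] > 0)

def similarity(a, b):
    """Fraction of bit positions on which two simhashes agree."""
    return 1.0 - bin(a ^ b).count("1") / BITS

def pick_initial_index(client_hash, existing, threshold=0.8):
    """Return the id of the most similar index at or above the threshold, else None."""
    best_id, best_score = None, threshold
    for index_id, candidate in existing.items():
        score = similarity(client_hash, candidate)
        if score >= best_score:
            best_id, best_score = index_id, score
    return best_id

# Hypothetical file content hashes for a teammate's already indexed codebase.
team_files = [f"file-{i}" for i in range(200)]
# The new user's checkout differs in a single file.
new_user_files = team_files[:-1] + ["file-local-change"]
# A codebase with no files in common.
other_files = [f"other-{i}" for i in range(200)]

existing = {"team-repo": simhash(team_files), "other-repo": simhash(other_files)}
chosen = pick_initial_index(simhash(new_user_files), existing)
print(chosen)
```

Because the simhash summarizes the whole set of file hashes, two checkouts that share almost all files agree on nearly every bit, while unrelated codebases agree on only about half.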

This copy happens in the background. In the meantime, the client is allowed to make new semantic searches against the original index being copied, resulting in a very quick time-to-first-query for the client.

Added on February 12, 2026

Brandon Sanderson: Art vs AI

When one of my favorite fiction authors talks about AI, I gotta take notes.

I do think that part of the reason I dislike AI is because it is too focused on the product and not the process. Yes, the message is journey before destination. It is always journey before destination, but there's a specific take on it this time.

Maybe someday the language models will be able to write books better than I can. But here's the thing, using those models in such a way absolutely misses the point because it looks at art only as a product. Why did I write White Sand Prime? It wasn't to produce a book to sell. I knew at the time that I wasn't going to write a book that was going to sell. It was for the satisfaction of having written a novel and feeling the accomplishment in learning how to do it. I tell you right now, if you've never finished a project on this level, it's one of the most sweet and beautiful and transcendent moments in my life was holding that manuscript, thinking to myself, I did it.

[...]

This is the difference between Data from Star Trek and a large language model. At least the ones operating right now. Data created art because he wanted to grow. He wanted to become something. He wanted to understand. Art is the means by which we become what we want to be. The purpose of writing all those books in my earlier years wasn't to produce something I could sell. It was to turn me into someone who could create great art. It took an amateur and it made him a professional. I think this is why I rebel against the AI art product so much because they steal the opportunity for growth from us.

[...]

The difference is that the books aren't the product. They aren't the art. Not completely. And this is the point. The book, the painting, the film script is not the only art. It's important, but in a way, it's a receipt. It's a diploma. The book you write, the painting you create, the music you compose is important and artistic, but it's also a mark of proof that you have done the work to learn because in the end of it all, you are the art. The most important change made by an artistic endeavor is the change it makes in you. The most important emotions are the ones you feel when writing that story and holding the completed work. I don't care if the AI can create something that is better than what we can create because it cannot be changed by that creation.

Added on February 9, 2026

For the sake of it

m.youtube.com non-tech

Why one should create art even if it achieves nothing for you, even if you're bad at it.

The choreographer Merce Cunningham once said, "You have to love dancing to stick to it. It gives you nothing back, no manuscripts to store away, no paintings to show on walls and maybe hang in museums, no poems to be printed and sold, nothing but that single fleeting moment when you feel alive."

Added on February 9, 2026

How to survive as a SaaS company in times of AI Pt2

nmn.gl tech

Adapt to the customer, not the other way around

The times of asking customers to change how they work are gone. Now, SaaS vendors that differentiate by being ultra customizable win the hearts of customers.

How? It’s the most powerful secret to increase usage. We’ve all heard the classic SaaS problem where the software is sold at the beginning of the year, but no one actually ends up using it because of how inflexible it is and the amount of training needed.

And if a SaaS is underutilized, it gets noticed. And that leads to churn.

This is the case with one of my customers. They have a complex SaaS for maintenance operations. But it turns out this was not being used at the technician level because they found the UI too complex.

How I’m solving this is essentially a whitelabelled vibe-coding platform with in-built distribution and secure deployments. When they heard of my solution they were immediately onboard. Their customer success teams quickly coded a very specific mobile webapp for the technicians to use and deployed it in a few days.

Now, the IC technician is exposed to just those parts of the SaaS that they care about i.e. creating maintenance work orders. The executives get what they want too, vibe coding custom reports exactly the way they want vs going through complicated BI config. They are able to build exactly what they want and feel like digital gods while doing it.

Usage for that account was under 35%, and is now over 70%. They are now working closely with me to vibe code new “micro-apps” that work according to all of their customer workflows. And the best part? This is all on top of their existing SaaS which works as a system of record and handles security, authentication, and supports lock-in by being a data and a UI moat.

This is exactly what I’m building: a way for SaaS companies to let their end-users vibe code on top of their platform (More on that below). My customers tell me it’s the best thing they’ve done for retention, engagement, and expansion in 2026 – because when your users are building on your platform, they’re not evaluating your competitors.

Added on February 9, 2026

How to survive as a SaaS company in times of AI

No source URL tech

How to survive

1. Be a System of Record

If the entire company’s workflows operate on your platform, i.e. you’re a line-of-business SaaS, you are integrated into their existing team already. They know your UI and rely on you day to day.

For example, to create a data visualization I won’t seek any SaaS. I’ll just code one myself using many of the popular vibe coding tools (my team actually did that and it’s vastly more flexible than what we’d get off-the-shelf).

Being a “System of Record” means you’re embedded so deeply that there’s no choice but to win. My prediction is that we’ll see more SaaS companies go from the application layer to offering their robust SoR as their primary selling point.

Added on February 9, 2026

AI and cheaply accessed information

Loved the T-Rex analogy.

There’s a concept in behavioral science called the “effort heuristic.” It’s the idea that we tend to value information more if we worked for it. The more effort something requires, the more meaning we assign to the result. When all knowledge is made effortless, it’s treated as disposable. There’s no awe, no investment, no delight in the unexpected—only consumption.

(I'm reminded of the scene in Jurassic Park when the tour Jeep pulls up to the Tyrannosaurus rex exhibit. Doctor Grant says “The T-Rex doesn't want to be fed. It wants to hunt.”)

Added on February 8, 2026

Why do Senior Engineers let bad projects fail?

lalitm.com tech

Firstly, software companies have an inherent bias for action. They value speed and shipping highly. Concerns, by definition, slow things down and mean people have to look at things which they hadn’t budgeted for. And so unless your concern is big enough to overcome the “push for landing”, there’s little chance for any meaningful change to come from you saying something. In fact, it’s very likely that you’ll be largely ignored.

Related to this, even if the team does take your concern seriously, you have to be careful not to do it too often. Once or twice, you might be seen as someone who is upholding “quality”. But do it too often and you quickly move to being seen as a “negative person”, someone who is constantly a problem maker, not a problem “fixer”. You rarely get credit for the disasters you prevented. Because nothing happened, people forget about it quickly.

There’s also the problem that every time you push back, you are potentially harming someone’s promotion packet or a VP’s “pet project.” You are at risk of burning bridges and creating “enemies”, at least of a sort. Having a few people who disagree with you in a big company is the cost of doing business, but if you have too many, it starts affecting your main work too.

Finally, there is also the psychological impact. There is one of you and hundreds of engineers working in spaces that your expertise might help with. Your attention is finite, but the capacity for a large company to generate bad ideas is infinite. Speaking from experience, getting too involved in stopping these quickly can make you very cynical about the state of the world. And this is really not a good place to be.

Added on February 8, 2026

The leverage of Enterprise SaaS

nmn.gl tech

Enterprise SaaS platforms have spent years (and millions) solving these problems: role-based access control, encryption at rest and in transit, penetration testing, compliance certifications, incident response procedures. Your customers may not consciously value this — until something breaks.

The challenge is that security is invisible when it works. You need to communicate this value proactively: remind customers that the “simple” tool they could vibe-code themselves would require them to also handle auth, permissions, backups, uptime, and compliance.

Added on February 5, 2026

What does a day in your life look like?

jasmi.news tech

Jasmine Sun went to Shenzhen, China and asked a Chinese AI researcher a few questions. They seem a bit too driven.

“What does a day in your life look like?” we asked. “I wake up and I check Twitter.”

“Do you have to work 996?” “No,” he laughed. “It’s 007 now.” (Midnight to midnight, seven days a week.)

“Do you guys worry about AI safety?” “We don’t think about risks at all.”

“Based,” said Aadil.

Added on January 31, 2026

Losing interest in the chase

bylinebyline.com non-tech

I’ve been thinking about obsessions and how they materialize. Things we want, achievements we need, people we admire, attention we crave. I only just realized that a fixation is almost always a sign that the call is coming from inside the house. It’s never actually about the thing. Or maybe it is, but not entirely. Here’s what I mean: Pining for a certain accolade is likely less about the accolade and more about a gaping hole inside that an achievement would supposedly fill. A salve for a scar. An ointment for an insecurity. Maybe it helps, maybe it’s worth it, but it will never satiate without acknowledging the real thing that’s screaming. The one that’s urging the running and chasing.

Added on January 30, 2026

Carney's Carnage

cbc.ca non-tech

In 1978, the Czech dissident Václav Havel, later president, wrote an essay called The Power of the Powerless. And in it, he asked a simple question: How did the communist system sustain itself?

And his answer began with a greengrocer. Every morning, this shopkeeper places a sign in his window: "Workers of the world, unite!" He doesn't believe it. No one does. But he places the sign anyway to avoid trouble, to signal compliance, to get along. And because every shopkeeper on every street does the same, the system persists.

Not through violence alone, but through the participation of ordinary people in rituals they privately know to be false.

Havel called this "living within a lie." The system's power comes not from its truth but from everyone's willingness to perform as if it were true. And its fragility comes from the same source: when even one person stops performing — when the greengrocer removes his sign — the illusion begins to crack.

Added on January 30, 2026

Make me care

No source URL non-tech

If you have done something cool, or you have studied something for a long time, or you have thought something interesting, and you are writing it up, and you are at a loss how to get started, try to extract out the key phrase:

What do you find yourself ranting about to people repeatedly? What does the Wikipedia entry miss that frustrates you? How would the world be different if this were not true? If you were telling a friend in a rush why you were excited to write this down, what would you say? Just say that! Just… start with the interesting part first.

When writing, your first job is this:

First, make me care.

Added on January 30, 2026

Gwern on Writing

gwern.net non-tech

If we want to hook the reader, provoke their curiosity about this anomaly. Boil it down to a single sentence: “Venice is interesting because it was an empire with no farms.” And there we have our title: “Empires Without Farms”. An apparent paradox, which intrigues the reader, and starts them thinking about what empires they know of but had never thought about their lack of agriculture, and whether that is true, and if it is, how could it have been true, what did they eat and why didn’t they lose wars if they didn’t grow all their own food…?

Added on January 30, 2026

Markets vs Irrationality

If Tesla were valued fairly, it would probably be worth something to the tune of $5B. But I’ll never bet against it, because the markets can remain irrational for longer than I can remain solvent.

Added on January 29, 2026

Leveraging LLMs ft Karpathy

x.com tech

LLMs are exceptionally good at looping until they meet specific goals and this is where most of the "feel the AGI" magic is to be found. Don't tell it what to do, give it success criteria and watch it go. Get it to write tests first and then pass them. Put it in the loop with a browser MCP. Write the naive algorithm that is very likely correct first, then ask it to optimize it while preserving correctness. Change your approach from imperative to declarative to get the agents looping longer and gain leverage.
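One way to read "write the naive algorithm first, then optimize while preserving correctness" is to keep the naive version around as an oracle and use agreement with it as the success criterion. The sketch below is my own illustration, not from the post: the maximum-subarray problem, the function names, and the random-test parameters are all arbitrary choices.

```python
import random

def max_subarray_naive(xs):
    """O(n^2) reference: try every contiguous slice. Slow but easy to trust."""
    best = xs[0]
    for i in range(len(xs)):
        total = 0
        for j in range(i, len(xs)):
            total += xs[j]
            best = max(best, total)
    return best

def max_subarray_fast(xs):
    """O(n) candidate (Kadane's algorithm) that must match the reference."""
    best = current = xs[0]
    for x in xs[1:]:
        current = max(x, current + x)
        best = max(best, current)
    return best

def agrees_with_naive(candidate, trials=500, seed=0):
    """Success criterion: the candidate matches the oracle on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-10, 10) for _ in range(rng.randint(1, 12))]
        if candidate(xs) != max_subarray_naive(xs):
            return False
    return True

print(agrees_with_naive(max_subarray_fast))
```

The agent is handed `agrees_with_naive` as the goal rather than step-by-step instructions, and it can keep iterating on the fast version until the check passes.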

Added on January 28, 2026

The DoorDash problem: AI agents vs Web 2.0

But if people stop using the apps and websites and start sending agents instead, that business really starts to break down. Because DoorDash and all the other service providers make their money by having a direct relationship with customers they can monetize in lots of different ways. It’s basic stuff like promotions, deals and discounts, ads for other stuff, their own subscriptions like DashPass and Uber One, and whatever other ideas they might have to make money in the future.

But AI doesn’t care about any of that stuff — if you ask for a car to the airport, an AI might just open Uber and Lyft and always pick the cheapest ride. These big App Store era services might just become commodity databases of information competing on price alone, which might not actually be sustainable, even if it might be the future. In fact, this past May at the Google I/O developer conference, Google DeepMind CEO Demis Hassabis said that he thinks we might not need to render web pages at all in an agent-first world.

Added on January 26, 2026

Sooner or later we all have to do things we do not want to.

collabfund.com non-tech

Retired United States Navy Admiral William McRaven echoed a similar sentiment in his book, The Wisdom of the Bullfrog, writing,

“I found in my career that if you take pride in the little jobs, people will think you worthy of the bigger jobs.”

He illustrated this point with a story from early in his career when rather than being assigned to lead a mission, he was tasked with building a float that would represent the Navy SEALs (often referred to as “frogmen”) in the Fourth of July parade.

After receiving the assignment, McRaven was admittedly dejected. In his mind, he had joined the Navy SEALs to lead missions, not build parade floats. But a seasoned team member offered him a quiet piece of advice, saying:

“Sooner or later we all have to do things we do not want to. But if you are going to do it, do it right. Build the best damn Frog Float you can.”

McRaven took the message to heart, pouring himself into the task and the float went on to win first prize in its category.

Added on January 21, 2026

Wikipedia's guide on identifying articles by LLMs

Folks at Wikipedia made this awesome guide to detecting LLM writing in articles. Most of it is high level, but once you've read enough AI-generated stuff, you can see a similar pattern.

1. Undue emphasis on significance, legacy, and broader trends

Words to watch: stands/serves as, is a testament/reminder, a vital/significant/crucial/pivotal/key role/moment, underscores/highlights its importance/significance, reflects broader, symbolizing its ongoing/enduring/lasting, contributing to the, setting the stage for, marking/shaping the, represents/marks a shift, key turning point, evolving landscape, focal point, indelible mark, deeply rooted, ...

LLM writing often puffs up the importance of the subject matter by adding statements about how arbitrary aspects of the topic represent or contribute to a broader topic. There is a distinct and easily identifiable repertoire of ways that it writes these statements.

Eg: The Statistical Institute of Catalonia was officially established in 1989, marking a pivotal moment in the evolution of regional statistics in Spain. [...]

Added on January 21, 2026

Normalization of Risks in critical systems/AI

Johann describes the “Normalization of Deviance” phenomenon, where repeated exposure to risky behaviour without negative consequences leads people and organizations to accept that risky behaviour as normal.

This was originally described by sociologist Diane Vaughan as part of her work to understand the 1986 Space Shuttle Challenger disaster, caused by a faulty O-ring that engineers had known about for years. Plenty of successful launches led NASA culture to stop taking that risk seriously.

Johann argues that the longer we get away with running these systems in fundamentally insecure ways, the closer we are getting to a Challenger disaster of our own.

Added on January 12, 2026

Most technical problems are people problems

Tech debt projects are always a hard sell to management, because even if everything goes flawlessly, the code just does roughly what it did before. This project was no exception, and the optics weren't great. I did as many engineers do and "ignored the politics", put my head down, and got it done. But, the project went long, and I lost management's trust in the process.

I realized I was essentially trying to solve a people problem with a technical solution.

Added on January 12, 2026