Pranav—Is Manus all hype?
So we’re apparently seeing a “second DeepSeek moment” from China: the makers of Manus claim to be the first to have created “agentic” AI.
What does that mean? Well, most AI, in theory, is passive. It can generate text in response to prompts. People have built in some extra capabilities as well: your run-of-the-mill AI (familiarity breeds contempt, I guess) can surf the net, create Excel sheets, create PDFs, and manage a lot of other nifty little tricks. But agentic AI goes a little further.
Here’s one way of understanding things: when you give your regular old ChatGPT a prompt, it works on that prompt and tries to one-shot its way into an answer — any answer. It can’t really plan things, or chew on a problem by itself. It can’t really interact with any tools outside its own ecosystem. It’s a fantastic piece of software, but it’s a good distance away from an intelligence that works anything like ours.
Now think of your AI reasoning models. They’re a little different. Those models aren’t just telling you the first thing that comes to their “minds,” so to speak. They seem to hold a “thought”, work on it, refine it, and then give you an output. It’s almost as though they can observe their own thoughts.
The “deep research” models are even more interesting. They do what these reasoning models can: they can observe their thoughts and work on them, in a sense. But on top of that, they can also perform (a small set of) actions they choose. They can search the internet where required, almost at their own discretion, to hone their answers. Once you give them a task, they strategise, build answers, run multiple searches, structure the whole thing, and only then give it to you. So, instead of blurting out their outputs in one shot, they’re doing something far more sophisticated. It’s almost as if they have a mind of their own, where they’re working out a complex task before giving you an answer. Deep Research is an agent.
Here’s one way to think of it: I’m a lousy football player. When I’m on the football field, I can’t think. I have a terrible sense of how the field looks. I can’t plan ahead or strategise my way around obstacles. I basically just point myself in a direction and run, without thinking of what I’m doing or why. If, by some misfortune, the ball comes to me, I’m caught off guard. That’s the point at which I start looking around, frantically, for where everyone else is. Then, I just pick some direction and kick. There’s no depth or nuance, whatsoever, to what I’m doing.
And so, on the football field, I’m not really an agent.
Lionel Messi, I’m certain, is nothing like me. He has a profoundly well-developed footballing mind. He has a constant map of how the field looks and where all the players are. He’s probably always thinking through countless options: who can he pass to, where can he push through the defense, what are all his scoring angles, and hundreds of other things I can’t even imagine. He constantly, subconsciously, combs through his options, simulates what they can do for him, and only then takes action. It’s all this background processing that makes his football look so effortless.
Messi is 100% an agent. He’s our top agent, in fact.
So, in a very rough sense, the difference between Messi and me is not unlike the difference between Deep Research and GPT-2.
Now, the holy grail is a general-purpose sort of agentic AI. Imagine an AI that can do whatever you or I can do on a computer. You can hand it a fairly complex task: say, planning a trip to Turkey end-to-end. It’s an involved job, but also one that, to a great extent, requires little more than a computer. You ask it to plan an itinerary, buy all your tickets, make hotel reservations, apply for a visa, maintain a full-blown Excel sheet with all the information you require, download all your travel documents, place an order for the forex you need, and so on. For this, it has to go to different websites and portals, all of which have to be navigated differently, and carry out a detailed, multi-step process.
Now, mind you, you aren’t directing any of these individual steps. The AI “knows” what it needs to do. It creates its own task list, and executes each task one-by-one, without any outside help.
That’s pretty mind-blowing, isn’t it?
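If you like seeing ideas as code, here’s a very rough sketch of that plan-then-execute loop. To be clear, this is a toy and says nothing about how any real product is built: the planner is a hard-coded stub where a real agent would ask a language model, and every name in it (`plan`, `TOOLS`, `run_agent`, and the three pretend tools) is made up for illustration.

```python
from typing import Callable

# Hypothetical "tools": stand-ins for the websites and portals
# a real agent would have to navigate. They only return strings.
def search_flights(goal: str) -> str:
    return f"found flights for: {goal}"

def book_hotel(goal: str) -> str:
    return f"reserved hotel for: {goal}"

def update_sheet(goal: str) -> str:
    return f"logged details for: {goal}"

TOOLS: dict[str, Callable[[str], str]] = {
    "search_flights": search_flights,
    "book_hotel": book_hotel,
    "update_sheet": update_sheet,
}

def plan(goal: str) -> list[str]:
    """Stub planner (an assumption): a real agent would generate this list itself."""
    return ["search_flights", "book_hotel", "update_sheet"]

def run_agent(goal: str) -> list[str]:
    """Build a task list for the goal, then execute each task in order."""
    log = []
    for step in plan(goal):
        result = TOOLS[step](goal)  # look up the tool by name and call it
        log.append(f"{step}: {result}")
    return log

if __name__ == "__main__":
    for line in run_agent("trip to Turkey"):
        print(line)
```

The point is only the shape: you supply the goal once, and the loop decides the steps and works through them without you directing each one.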
That’s sort of what China’s Manus AI claims to do. It can apparently plan for and execute many different tasks. To do so, it creates momentary “virtual computers”, where it can access a browser and play around with it. According to early reports, it isn’t perfect, but it’s surprising what all it can do.
Now, here’s the thing. This probably isn’t another DeepSeek moment. DeepSeek was trying to create models from scratch. Manus has found a clever way to stitch together different bits of AI to achieve a great deal of agent-like behaviour. Look under the hood, though, and it reportedly runs on existing models like Claude 3.7. It’s not really a new AI; it’s a clever piece of software that uses AI.
So, on one hand, Manus is all hype. On the other hand, it tells us what is possible with the technology we already have. It seems like the AI hype-train was pushing people to create better and better fundamental models, without really thinking of all the ways in which the old models could be useful to us. I mean, I genuinely don’t think we’ve exhausted any more than a minor subset of all the possible uses we could put GPT-3 to, and we’ve now sprinted so far ahead that we perhaps never will.
I guess if there’s one thing to take away from all of this, it’s just that you’ve gotta stop and smell the roses. We’re looking at a profound shift in human capabilities. Might as well stop the hustle and just look, for a moment, at the magic that’s unfolding right before our eyes.
Bhuvan—On venture capital
If you follow financial news even sparingly, it's hard to miss headlines about Masayoshi Son. I write a lot, but I've never used the word "buccaneer"—and if there's one person who embodies the spirit of the word, it's Son. Venture capitalists, in general, are crazy people who make crazy investments and say crazy things. However, Son makes your average VC look Amish.
The dude's been taking risks like he's permanently on drugs—really good drugs. I was reading this brilliant article by Laleh Khalili reviewing Gambling Man: The Wild Ride of Japan's Masayoshi Son by Lionel Barber. Here are a few crazy things Son has done:
- Borrowed over $10 billion to buy Vodafone's Japanese business in what was the largest leveraged buyout in Asia.
- Lost $11.5 billion on WeWork.
- Blew billions through Vision Fund 1, which apparently generated single-digit returns.
- Despite the failure of Vision Fund 1, still convinced others to invest in Vision Fund 2.
- Lost his fortune and regained it several times over.
And then, here's a banger:
The protagonist of Dostoevsky’s novel The Gambler stakes his life and the woman he loves on just one more game, one more throw of the dice, driven by ‘self-intoxication, by your own fantasy ... some terrible pleasure – luck, victory, power’. Son seems compulsively driven to invest larger and larger sums so he can call himself the biggest, most significant, most visionary investor in the world. ‘Bill Gates just started Microsoft and Mark Zuckerberg started Facebook,’ Barber quotes Son as saying. ‘I am involved in a hundred businesses, and I control the entire ecosystem. These are not my peers. The right comparison for me is Napoleon or Genghis Khan or Emperor Qin. I am not a CEO. I am building an empire.’
Talk about megalomania.
That's it for today. If you liked this, give us a shout by tagging us on Twitter.