Uncovering AI Agent Dynamics

Using two agent companies to dissect the field

Prerit Das
June 15, 2024

Return the Fund 🚀

The frontier of tech-focused VC research

In today’s edition:

A quick peek at two AI agent companies that represent industry trends and theses
Why each company matters, and the takeaways from their stances
A brief breakdown of AI agent infrastructure, with an analogy for the key difficulties that (investable) infra-layer companies need to solve

AUTONOMOUS EMPLOYEES

Faktory

Customer-facing agent suppliers are the most straightforward players in this space. Companies like Faktory create specialized AI agents and sell them directly to end users, typically other businesses. (PS: a deeper explanation of agent fundamentals is at the bottom of this edition.)

Faktory offers a series of pre-tested, trained, and tool-equipped “employee” skeletons for customers to “hire” and onboard. Thomas Petersen, CEO of Faktory, is on a mission to build fully autonomous organizations powered by Faktory’s suite of workers.

“We aim to construct the largest, most intelligent, and dependable AI workforce in the world,” Petersen said in an email.

Why this matters

Is agent creation a sustainable business model? Or is it too indefensible to survive long term, given the pace of open-source development? Every investor must grapple with these two questions when analyzing agentic software prospects from first principles.

Our take: by and large, packaging agents for customers isn’t a strong enough unique value proposition (UVP) to fend off competitors. That said, there are some mountainous hurdles to building performant and cost-effective specialized agents. The tail risk of an agent acting as a human never would is enough to dissuade companies from offloading critical tasks to AI today.

In the early days of agents, the primary difficulty was teaching LLMs to consistently generate outputs parsable by agent orchestration software. Now, the struggle is in teaching agents to think and behave like a top human employee. Companies that solve this problem creatively with tuned models, multi-model architectures, parallel processing, etc. have the potential to capture a significant share of the agent market early.

We’re still in the earliest stages of this market, and the demand for tech that works is through the roof. Look out for app-layer agent creation companies that have a unique, differentiated, and proven underlying mechanism for solving orchestrational and/or behavioral challenges. Steer clear of hype-driven, jargonistic companies duplicating or repackaging viral demos.

Quick facts

Founded by Thomas Petersen, Faktory’s CEO. Headquartered in New York.
Backed by Tribe, Kinetic, and Collab+Currency.
Currently fundraising at the seed stage.

BROWSER AUTOMATION

Browserbase

Behind every agent is its infrastructure—that which allows it to operate in the world. A necessary ability for agents is interacting with the internet. This is harder than it seems.

The most straightforward approach is pulling the HTML content of a page and parsing it as necessary. This is how most agents in 2023 conducted research and summarized articles.

The trouble is that many modern websites have complex renderings and dynamic JavaScript content. From a developer’s perspective, there are two ways to scrape the web.

1. As stated, make an HTTP request to the site and retrieve its HTML content.
   Pros: simple, cheap, runnable anywhere.
   Cons: often misses content, can’t capture rendered or dynamic information.
2. Use a tool like Selenium to programmatically control a real browser instance, navigating it as a human would.
   Pros: doesn’t miss content, can dynamically navigate websites.
   Cons: computationally expensive, heavyweight, hard to deploy.
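To make the weakness of the first approach concrete, here is a minimal sketch in Python using only the standard library. The page below is made up for illustration: its price is only written into the page by JavaScript, so a scraper that reads the raw HTML never sees it. The names PAGE, VisibleText, and static_text are our own, not part of any product mentioned here.

```python
from html.parser import HTMLParser

# Hypothetical page: the headline is in the HTML, but the price is only
# written into the page by JavaScript after it loads in a real browser.
PAGE = """
<html><body>
  <h1>Acme Widget</h1>
  <div id="price"></div>
  <script>
    document.getElementById("price").textContent = "$19.99";
  </script>
</body></html>
"""


class VisibleText(HTMLParser):
    """Collects text outside <script>/<style>, as a simple scraper would."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside a script or style tag.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def static_text(html):
    """Return the text a static HTML parse can see (no JavaScript is run)."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return parser.chunks


text = static_text(PAGE)
print(text)  # the price never shows up, because no script was executed
```

A browser-driven approach (option 2) would execute the script and see the price, at the cost of running a full browser for every page.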

Browserbase is a SaaS cloud-hosted implementation of option 2, giving developers access to comprehensive autonomous browsing capabilities without the hassle of web drivers, expensive processing, and deployment limitations.

In a nutshell: Browserbase empowers agents with comprehensive internet surfing abilities.

Why this matters

Earlier we discussed the defensibility of app-layer agent creation companies, which sit at the top of the stack and sell directly to customers. Browserbase is an example of agent infrastructure: companies that don’t distribute agents to customers themselves, but instead empower other companies to do so.

Infrastructure companies sell shovels to gold miners; thus, their deals are highly coveted by investors (more on investor dynamics in our LLM Infrastructure edition). That said, it’s not a foolproof business model. The best infra companies take a creative approach to tackling complex, headachy problems, like autonomously browsing the internet.

Quick facts

Recently raised $6.5 million led by AI Grant, Basecase Capital, and Kleiner Perkins.
Founded by Paul Klein IV, based in San Francisco.

DEEPER UNDERSTANDING

My Agent and His Issues

An agent is an AI system that can take action in the real world, beyond simply generating text. Does the AI itself interact with the world? No. LLMs underpin AI agents, and LLMs are solely text generation models.

Orchestration software carefully prompts LLMs to return structured outputs indicating how to act. The software then deterministically executes the LLM’s command.
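Here is a rough sketch of that loop in Python, under some loud assumptions: fake_llm is a stand-in for a real model call, and the JSON command format and the ACTIONS table are invented for illustration. The point is only the division of labor: the model produces text, and ordinary code decides what that text is allowed to trigger.

```python
import json


def fake_llm(prompt):
    """Stand-in for a real model call; returns a structured command as text."""
    return '{"action": "add", "args": {"a": 2, "b": 3}}'


# The only actions the orchestration software is willing to execute.
ACTIONS = {
    "add": lambda a, b: a + b,
    "echo": lambda text: text,
}


def run_step(prompt, llm=fake_llm):
    """Ask the model for a command, then execute it deterministically."""
    raw = llm(prompt)                   # the model only ever produces text
    command = json.loads(raw)           # the software parses that text
    handler = ACTIONS[command["action"]]  # an unknown action raises KeyError
    return handler(**command["args"])   # plain code performs the action


result = run_step("What is 2 + 3? Reply with a JSON command.")
print(result)
```

Nothing the model says is run directly. If it names an action outside ACTIONS, the step fails instead of improvising.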

Think of a ship’s captain. He, himself, does not steer his ship. He issues commands to his crew—heading and power—who then execute his order without a thought required.

Sticking with the analogy of a ship, let’s think about some of the aforementioned struggles with AI agents.

Difficulties

The parsability issue, in the ship analogy, is akin to crew members not understanding what the captain is saying. There is a reason why crews are (literally) militaristic about communication. If the captain tells his crew, “I don’t know, go that way. Speed? Uhh, kind of fast, but not too fast,” disaster would ensue.

LLMs are prone to hallucination and defaulting to taught mannerisms—being friendly, all-pleasing, etc. This is not conducive to the structured and militaristic communication style required for consistent agent outputs.
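One common mitigation, sketched below in Python, is for the crew to refuse vague orders and ask again. This is an illustration under our own assumptions, not any particular vendor's method: the order format (heading and speed), parse_order, get_order, and the scripted replies are all hypothetical.

```python
import json

ALLOWED_SPEEDS = {"slow", "half", "full"}


def parse_order(raw):
    """Return (heading, speed), or raise ValueError if the order is unusable."""
    try:
        order = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not JSON: {exc}") from exc
    if not isinstance(order, dict):
        raise ValueError("order must be a JSON object")
    heading, speed = order.get("heading"), order.get("speed")
    is_int = isinstance(heading, int) and not isinstance(heading, bool)
    if not is_int or not 0 <= heading < 360:
        raise ValueError("heading must be an integer from 0 to 359")
    if not isinstance(speed, str) or speed not in ALLOWED_SPEEDS:
        raise ValueError("speed must be one of: slow, half, full")
    return heading, speed


def get_order(llm, prompt, max_attempts=3):
    """Re-prompt until the reply parses; also return the rejection reasons."""
    errors = []
    for _ in range(max_attempts):
        raw = llm(prompt)
        try:
            return parse_order(raw), errors
        except ValueError as exc:
            errors.append(str(exc))
            prompt = f"{prompt}\nYour last reply was rejected ({exc}). Reply with JSON only."
    raise RuntimeError(f"no usable order after {max_attempts} attempts")


# Scripted stand-in for a model: one chatty reply, then a clean one.
replies = iter([
    "I don't know, go that way. Kind of fast, but not too fast!",
    '{"heading": 270, "speed": "half"}',
])
order, errors = get_order(lambda prompt: next(replies), "Set a course due west.")
print(order, len(errors))
```

Retries cost time and tokens, which is part of why models tuned to get the format right on the first try are attractive.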

Deep fine-tuning and pre-trained models specialized for agent workflows are the most obvious solutions to this problem. Many companies, including Google, Apple, and Meta, are trying to solve this problem for on-device agentic control.

Abstracted yet low-level control over tools is another issue solvable by infrastructure companies (e.g., Browserbase’s UVP). In the ship analogy, this is akin to the machinery used by crew members to allocate power, steer, etc.

If the captain has to teach crew members how to use knobs and buttons every time he issues a command, the ship will go nowhere. And if he can’t control steering finely enough, the ship will not end up where intended.
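In code, the idea looks roughly like a toolbox that is described once and then called by name with precise parameters. The Python sketch below is an assumption-laden illustration: Toolbox, set_rudder, and set_throttle are hypothetical names, and a real infrastructure product would expose far richer controls.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Tool:
    name: str
    description: str
    run: Callable[..., str]


class Toolbox:
    """Registers tools once so the agent never has to be re-taught them."""

    def __init__(self):
        self._tools = {}

    def register(self, name, description):
        def decorator(fn):
            self._tools[name] = Tool(name, description, fn)
            return fn
        return decorator

    def manual(self):
        """One line per tool, suitable for including in a prompt."""
        return "\n".join(f"{t.name}: {t.description}" for t in self._tools.values())

    def call(self, name, **kwargs):
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name].run(**kwargs)


toolbox = Toolbox()


@toolbox.register("set_rudder", "Turn the rudder to an angle in degrees (-35 to 35).")
def set_rudder(degrees: float) -> str:
    clamped = max(-35.0, min(35.0, degrees))  # fine control, within safe limits
    return f"rudder at {clamped:+.1f} degrees"


@toolbox.register("set_throttle", "Set engine power as a percentage (0 to 100).")
def set_throttle(percent: float) -> str:
    clamped = max(0.0, min(100.0, percent))
    return f"throttle at {clamped:.0f}%"


print(toolbox.manual())
print(toolbox.call("set_rudder", degrees=12.5))
print(toolbox.call("set_throttle", percent=140))
```

The captain gets a short manual up front and fine-grained knobs afterward. The messy machinery behind each knob stays hidden.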

Multi-agent unified systems (multiple specialized captains) and creative orchestration structures are potential differentiated solutions.
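As a toy illustration of the "multiple specialized captains" idea, the Python sketch below routes each task to one of several specialist agents. Everything here is hypothetical: the agents are plain functions, and the keyword-matching router is a deliberately naive stand-in for whatever routing logic a real system would use.

```python
def research_agent(task):
    return f"[research] gathering sources for: {task}"


def coding_agent(task):
    return f"[coding] preparing a patch for: {task}"


def support_agent(task):
    return f"[support] drafting a reply for: {task}"


# Each specialist is paired with the keywords that suggest its kind of work.
SPECIALISTS = {
    "research": (research_agent, {"find", "summarize", "compare"}),
    "coding": (coding_agent, {"fix", "bug", "test", "refactor"}),
    "support": (support_agent, {"refund", "customer", "ticket"}),
}


def route(task, default="support"):
    """Pick the specialist whose keywords overlap the task the most."""
    words = set(task.lower().split())
    best, best_hits = default, 0
    for name, (_, keywords) in SPECIALISTS.items():
        hits = len(words & keywords)
        if hits > best_hits:
            best, best_hits = name, hits
    return best


def dispatch(task):
    """Route the task, then hand it to the chosen specialist."""
    name = route(task)
    agent, _ = SPECIALISTS[name]
    return name, agent(task)


print(dispatch("Summarize and compare these two filings"))
print(dispatch("Fix the failing test"))
```

In practice the router is often itself a model, and the specialists differ in prompts, tools, or tuning rather than in a keyword list.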

As it stands, the technology is not quite ready for mass adoption. But the demand is, and history has repeatedly shown what happens next.

WHO WE ARE

Return the Fund 🚀

One startup a week poised for 10x growth; market deep dives with actionable insights for builders and investors.
Technical breakdowns of advanced new tech to empower informed decision-making.
Find your next prospect, product, job, customer, or partner. 🤝
Written by undercover pioneers of the field; trusted by builders and investors from Silicon Valley to NYC. 💸

Last week, we peeked at a few of our favorite alt investment tech startups.

Alternative Investments Spotlight

Overviewing some of the coolest and hottest alternative investment startups bringing unique asset class opportunities to everyday individuals.

returnthefund.vc/p/alt-investments-spotlight

As you know, we’re hell-bent on uncovering future unicorns cruising under the radar. Preeminent companies are lean, quiet, and driven before reaching their watershed moments. By the time people start talking about them, it’s too late.

In a nutshell—we pitch you startups like you’re an esteemed VC. If you’re interested in them as a partner, product, or prospect, we’ll make a warm intro. Humbly, our network knows no bounds!

We’ll also intuitively break down advanced tech so you can stay ahead of trends and critically analyze hype in the news and in your circles (regardless of your technical prowess).

Periodically, we’ll propose niche market opportunities. These are tangible ways to extract alpha from the private markets.

You won’t get editions from us very often. Weekly at best. Two reasons:

We’re balancing full-time VC/startup work with Return the Fund.
We prioritize depth, insight, and value. This is not a daily news publication… We hope that when you do get an email from us, it’s dense, valuable, actionable, and worth saving.

Thanks for reading today’s RTF. Let us know what you thought of this edition by answering the question below, and feel free to reach out to us at [email protected]. 🤝

Psst: None of our company picks are ever sponsored. All research and opinions are completely held by the Return the Fund team.
