OpenAI's Agent, a clickbait crisis in AI research, and more

This week we have exciting news from OpenAI, which published a new agent; we look at a trend of expertly crafted clickbait in AI research; and we review one augmentation technique that has received a lot of attention recently.

OpenAI Operator: An agent that will not suck?

OpenAI presented their new agent, Operator, which takes control of a web browser and uses keyboard and mouse input, paired with a new model, to traverse the web and interact with web pages to solve the tasks you give it.

Do you also have a feeling of déjà vu? The agent looks very similar to the one Anthropic released a few months ago.

With a hefty price tag of €200/month for a Pro account, we weren't able to test it yet. But from what we read, it performs quite well. If this agent holds up to the hype, it will give all of us a little AI assistant we can offload some of our tasks to: buying concert tickets, finding the cheapest supermarket for this week's groceries; maybe we can even ask it to clean up our email inboxes?

Although the price tag is currently quite hefty and Operator is only available to users in the United States, AI tends to get cheaper at a very fast rate. We can't wait to try this one out soon!

A clickbait crisis in AI research

If you've been keeping up with AI research papers since the ChatGPT boom, you've probably noticed a hard-to-miss trend: titles crafted to hook you instantly and play on your FOMO, making us readers feel like we are missing out on the "next big thing" in AI research. For instance, we have seen:

  1. More Agents is All You Need
  2. Were RNNs All We Needed?
  3. Don’t Do RAG: When Cache-Augmented Generation is All You Need for Knowledge Tasks

Not every paper with a catchy title offers groundbreaking insights and impact, and that is fine. However, we as readers need to look beyond the hooks and read AI news more critically, with an eye toward details that often get buried.

CAG as an example of that clickbait crisis: Old Concept, New Hype?

One example is a paper by researchers from National Taiwan University with the catchy title "Don't Do RAG: When Cache-Augmented Generation is All You Need for Knowledge Tasks".

The title did pay off: suddenly we saw mentions of this paper and technique all over the place. But how groundbreaking is this new technique, really? Prompt caching is already available through Anthropic's API and is common practice in our experience. And while unloading your entire knowledge base into a few-million-token context window might look like a good idea, it can actually degrade your model's performance [1][2]. For smaller tasks and a smaller knowledge base this approach can work very well, but calling it "all you need" is too strong a claim to make.
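
To make that concrete, here is a minimal sketch of the cache-augmented setup using Anthropic's prompt caching: the whole knowledge base goes into the system prompt once and is marked for caching, and every question is answered against it without any retrieval step. This is roughly how we would reproduce the idea; the model name, file path, and `ask` helper are our own placeholders, not taken from the paper.

```python
# Minimal sketch of "cache-augmented generation" via Anthropic's prompt caching.
# Assumes the `anthropic` Python SDK and ANTHROPIC_API_KEY in the environment;
# the model name and knowledge-base path below are illustrative placeholders.
import anthropic

client = anthropic.Anthropic()

# Load the entire (small) knowledge base up front instead of retrieving
# relevant chunks per query, as a RAG pipeline would.
knowledge_base = open("docs/knowledge_base.txt").read()

def ask(question: str) -> str:
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",  # placeholder model name
        max_tokens=512,
        system=[
            {
                "type": "text",
                "text": "Answer questions using only the reference material below.",
            },
            {
                "type": "text",
                "text": knowledge_base,
                # Marks this block for caching: the first request writes the
                # cache at full input-token price; later requests within the
                # cache lifetime read it at a reduced rate.
                "cache_control": {"type": "ephemeral"},
            },
        ],
        messages=[{"role": "user", "content": question}],
    )
    return response.content[0].text

print(ask("What is our refund policy?"))
print(ask("Who do I contact about enterprise pricing?"))  # served from cache
```

The first call pays full price to populate the cache; follow-up questions reuse the cached knowledge base at a reduced input-token rate, which is the whole economic argument for CAG on small corpora.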

What we’re reading
