Six Minutes That Will Change Everything

I need your mind to be open to the possibilities

Alphabet released the video below and its market capitalization immediately jumped by $90 billion. It’s a six-minute demonstration that, unless you’re entirely devoid of imagination, will absolutely rock you back on your heels.

Before I have you watch it (and you absolutely must), I want to explain why it’s so important.

What I’m showing you is Google’s Gemini AI technology. What distinguishes it from Bard and ChatGPT and everything else that’s come before is its multi-modality. Meaning, it’s the first generative AI model to be simultaneously trained on text, images, audio and video. Prior large language models have been either visual or textual but Gemini is natively multi-modal.

Here’s a diagram from the unit at Alphabet that created it, Google DeepMind (you can read their technical paper here if you want to go deeper):

That universality is what makes this demo so startling. It’s closer to being like interacting with a person than anything we’ve seen yet, because it effortlessly (I assume) switches back and forth between words and pictures as it answers the questions, prompts and challenges being issued. It can tell you what’s going on in a drawing and then change its opinion in real-time as the drawing is changing. It can explain your ideas or come up with its own when asked to. It can start by observing and describing an image and then begin serving you up additional information about what is being pictured, from the realms of science, history, comedy (the cat joke was purrrrfect), etc.

Other LLMs and generative AI projects that have come before have been trained on either visual or textual data separately and then bolted together in order to do both. Gemini was simultaneously trained and, as such, seems to have a better command of the combinations. Multi-modal will now be the future of generative AI. You won’t see any mainstream products going backwards from here on out. That is the thing Google’s DeepMind just accomplished in full public view last week.

OK, now watch:

Admit it, this AI stuff is now further along than you thought it was. ChatGPT was first released to the public on November 30th, 2022. It’s just one year later and yet it feels as though we’ve jumped five years into the future. The rate of progress here feels remarkable to me.

I’m not a technologist, though. So maybe I am impressed because I don’t work in AI and this stuff isn’t second nature to me. How can we tell that this is a huge leap and not just another iteration people have gotten overly excited by?

Well, there is a benchmark that technologists are using in order to rank and score various AI models. It’s called the MMLU test, which stands for Massive Multitask Language Understanding. There are other benchmarks but this seems to be the one that is now a standard system for understanding the degree of advancement taking place with each new LLM version.

Anyway, I’ll let the CEO and co-founder of Google’s DeepMind unit, Demis Hassabis, tell you how powerful Gemini is in his own words:

We've been rigorously testing our Gemini models and evaluating their performance on a wide variety of tasks. From natural image, audio and video understanding to mathematical reasoning, Gemini Ultra’s performance exceeds current state-of-the-art results on 30 of the 32 widely-used academic benchmarks used in large language model (LLM) research and development.

With a score of 90.0%, Gemini Ultra is the first model to outperform human experts on MMLU (massive multitask language understanding), which uses a combination of 57 subjects such as math, physics, history, law, medicine and ethics for testing both world knowledge and problem-solving abilities.

Demis Hassabis, Google DeepMind Co-Founder and CEO

At the dawning of the Magnificent 7 era (back when we were still calling the largest technology stocks FAANG), I wrote a post at The Reformed Broker called “Just own the damn robots,” which explained the rationale behind overweighting the tech sector in a retirement portfolio. It holds up pretty well, reading it back now six years later. The gist of it is “If I’m going to be replaced by this stuff, I may as well make money from it.”

We’re only a year into the mainstreaming of AI technology; some would argue it’s still the top of the first inning. The labor market is still extremely tight - last month the US economy created 200,000 new jobs and the unemployment rate actually ticked down to 3.7%. Just about everyone who wants to work can find a job. There are still close to 9 million openings if you believe the JOLTS data we got last week.

But when you watch the six-minute video above, it’s not hard to imagine a large change coming to the labor force in the near future. It’s not that AI will put everyone out of work all at once. It’s more that the companies and people who become adept at employing this technology to become more efficient internally and delight their clients externally are going to win. Those who do not are going to lose. And yes, some things people are currently doing for a living are going to go away. First you’ll see tasks automated and then the roles themselves will be eliminated.

Historically, new waves of technology have created new jobs and have led to people doing things they could not have previously imagined just a few years earlier. But there are also jobs that are automated away and, for some types of workers, there is an adjustment period that could last awhile. It’s a very uncomfortable feeling, seeing the future coming like this and not knowing what to do about it.

The purpose of this note is to show you what’s coming now so that you can start thinking about how you’ll harness it to improve your own career. I also wanted to put the technology sector’s dominance over the rest of the stock market into perspective. Let’s not act like there’s no reason for it happening.

Alphabet is now worth $1.7 trillion, one of the largest publicly traded firms in the world. The company has nine separate businesses with over a billion users each - from YouTube to Gmail to Android to Search. They’ll be deploying Gemini AI technology across these platforms and others in the coming year. It’s going to be available in three “sizes” or levels of strength - Ultra, Pro and Nano. Everyone from scientists to schoolchildren will be interacting with it, possibly on a daily basis. The coming year will be a whirlwind of change. If you’re going to be reading me, I need your mind to be open to the possibilities. That’s good enough for now.

Josh