
Sam was interviewed by Logan Bartlett after the 4o launch event

Today I listened to Sam's interview with Logan Bartlett after the launch event. The podcast can be found at this link: https://castbox.fm/vd/701477381. There were some interesting thoughts, so I'm sharing the transcript here:

4o First Experience

  • Logan: You made an announcement earlier today about the multimodal 4o, right? That "o" in the name.
  • Sam: Oh, that stands for "Omni," meaning all-encompassing.
  • Logan: It applies to text, speech, and vision. Can you talk about why that's important?
  • Sam: Because I think it's an incredible way to use a computer. We've had the idea of controlling computers by voice for a long time, with things like Siri, but they've never felt natural to me. This one feels different for a bunch of reasons: what it can do, the speed, the ability to mix in other modalities, the naturalness of its tone, the fact that you can make it talk faster or in a different voice. That fluidity and flexibility is why I can't believe how much I enjoy using it.
  • Logan: Spike Jonze (the director) would be proud. Do you have any favorite use cases?
  • Sam: I've only been using it for about a week, but one surprising use case is when I'm heads-down working with my phone on the desk, using it as a second channel without having to switch windows or change what I'm doing. Normally I'd stop what I'm doing, switch to another tab, and Google something; now I can just ask and get an answer immediately without changing what I'm looking at on my computer. It's a really cool experience.

Technical Implementation

  • Logan: What made it actually possible to achieve this? Is it architectural changes or more compute power?
  • Sam: It's a combination of everything we've learned over the past few years. We've been working on audio models, vision models, and trying to tie them together. We've also been looking for more efficient ways to train our models. It's not like we unlocked something entirely new all at once, but rather put a lot of puzzle pieces together.
  • Logan: Do you think it's necessary to develop an on-device model to reduce latency and improve usability?
  • Sam: For video, network latency might be hard to deal with. For example, I've always thought it would be amazing to put on a pair of AR glasses and have a world rendered for you in real time, changing as you look around; network latency makes that much harder. But for this technology, a delay of two or three hundred milliseconds feels incredibly fast, faster than human response times in many situations.
  • Logan: Are you referring to images in the video?
  • Sam: Sorry, I mean generating video, not input video.
  • Logan: Got it, so currently it can handle live video input, right?
  • Sam: Frame by frame.
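
Sam's "frame by frame" answer is worth unpacking: today's multimodal models typically consume video as a sequence of still images rather than as a native video stream. Below is a minimal sketch of what that looks like against the public Chat Completions API; this is not OpenAI's internal pipeline, and the file name, sampling rate, and prompt are my own illustrative assumptions.

```python
import base64

import cv2  # OpenCV, used here only to decode the video into frames
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment


def sample_frames(path: str, every_n: int = 30) -> list[str]:
    """Decode a video file and return every Nth frame as a base64 JPEG."""
    frames, video = [], cv2.VideoCapture(path)
    index = 0
    while True:
        ok, frame = video.read()
        if not ok:  # end of stream
            break
        if index % every_n == 0:
            ok, jpeg = cv2.imencode(".jpg", frame)
            if ok:
                frames.append(base64.b64encode(jpeg.tobytes()).decode("utf-8"))
        index += 1
    video.release()
    return frames


frames = sample_frames("clip.mp4")  # hypothetical input file
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe what happens in this clip."},
            # Each sampled frame is sent as an ordinary image, so the model
            # sees a sequence of stills, i.e. the video "frame by frame".
            *[
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{f}"}}
                for f in frames
            ],
        ],
    }],
)
print(response.choices[0].message.content)
```

Sampling every 30th frame keeps the request small (about one frame per second for 30 fps video); a real-time assistant would presumably stream frames continuously instead.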

GPT-5

  • Logan: You recently mentioned that the next big release for ChatGPT might not be GPT-5, and it feels like you're taking a more incremental approach to model development. Is that a fair way to understand it, that there won't be one big version release in the future, but...
  • Sam: We actually don't know yet. One thing I've learned is that AI and surprises don't go well together. The traditional tech-company way of releasing products may not fit us; we might release GPT-5 in a different way, or call it something else. I don't think we've figured out how to name and brand these things yet. The naming from GPT-1 to GPT-4 made sense to me, but GPT-4 has obviously improved a lot since then. There's also the idea that there might be an underlying "virtual brain" that can think harder in certain situations, or it could be different models, but users may not care whether they're different. So we still don't know how to market these products.

Computational demand

  • Logan: Does this mean that the demand for computing power required for model advancement may be less than before?
  • Sam: I think we will always use the maximum computing power we can get. We're finding incredible efficiency improvements, which is very important. The cool thing released today is obviously the voice mode, but perhaps the most important thing is that we've made the model so efficient that we can offer it for free. The best model in the world is now available to anyone who downloads ChatGPT; that represents a significant efficiency improvement over GPT-4 and GPT-4 Turbo, and we still have more room for improvement.

Change the world

  • Logan: I've heard you say that ChatGPT itself hasn't changed the world, but it has changed people's expectations of the world.
  • Sam: Yes, I think if you look at any economic indicator you choose, you won't find much evidence that ChatGPT has truly boosted productivity or had other impacts. There might be some impact in customer support, but if you look at global GDP, can you see the impact of ChatGPT's release? Probably not.
  • Logan: Do you think there will ever come a day when we can pinpoint ChatGPT's impact on GDP?
  • Sam: I don't know if you'll ever be able to say that one model did it, but I do think if we look at the charts over the next few decades, something will have changed.

The next year

  • Logan: What applications or areas do you think are most promising in the next 12 months?
  • Sam: I'm obviously biased because it relates to what we're doing here, but I think coding is going to be a very big application area.
  • Logan: This ties into the "bitter lesson"; you recently spent some time talking about the difference between deeply specialized models and general models that can do real reasoning.
  • Sam: I bet on the general model being important.
  • Logan: In your view, which is more important: focusing on a particular dataset and all the ensembles built for a very narrow task, or the capability for general reasoning?
  • Sam: If the model can do general reasoning and discover new things, then when it needs to deal with a new type of data, you can feed it in and the model will handle it. It doesn't work the other way around: I don't think putting together a bunch of specialized models gets you general reasoning.
  • Logan: So, what are the implications for building specialized models?
  • Sam: I think a better way to say it is that the most important thing is to figure out real reasoning capabilities, and then we can use it to do all sorts of things.

The next 2 years

  • Logan: What do you think will be the primary mode of communication between humans and AI in two years?
  • Sam: Natural language seems great. I'm very interested in the overall approach of designing the future so that humans and AIs use the same things. That's why I'm more interested in humanoid robots than other forms of robotics: the world is already really well-suited for humans, and I don't want it reconfigured for something more efficient. Similarly, I like the idea of interacting with AIs in human-optimized language, with them maybe talking to each other the same way. I think it's a direction worth pushing.

Model personalization

  • Logan: You mentioned recently that models might become commoditized over time, but that what matters most is personalizing the model to each individual. Is that right?
  • Sam: I'm not sure, but it seems like a reasonable thing to me.
  • Logan: So aside from personalization, do you think what ultimately wins the end-user market is just ordinary business execution, things like interfaces and ease of use?
  • Sam: Those are certainly going to be important; they always have been. I can imagine some other things too, like marketplace or network effects, say, agents that want to talk to each other, or app stores from different companies. But I think the normal rules of business usually apply. Whenever there's a new technology, you're tempted to think they don't, and that almost always turns out to be wrong. All the traditional ways of creating value are still going to be very important here.

Open-source models

  • Logan: What's your reaction when you see open-source models catching up on benchmarks and all that?
  • Sam: I think it's great. Like a lot of other technologies, open-source models are going to have their place, hosted models are going to have their place, and that's fine.

AI Infrastructure

  • Logan: I don't want to ask about specifics, but there have been media reports, which The Wall Street Journal found credible, that you're looking to raise a large amount of capital to drive investment into semiconductor manufacturing, even though companies like TSMC and NVIDIA are already expanding aggressively to meet demand for AI infrastructure. You've recently said that the world needs more AI infrastructure, and then even more than that. Are you seeing something on the demand side that calls for more AI infrastructure than TSMC and NVIDIA are currently providing?
  • Sam: First, I'm confident we'll find ways to dramatically reduce the cost of delivering current systems. Second, I'm confident that as we do so, demand will increase dramatically. Third, I believe that building larger and better systems will increase demand further. We should want to live in a world where intelligent computing power is abundant, where people can use it for all kinds of things without having to ration it, whether they want it to read and respond to all their email or to cure cancer. The answer is that we want it to do both at once, and I just want to make sure we have enough resources for that.

Physical devices

  • Logan: I don't need you to comment on your own efforts, though you can if you wish. But regarding physical device assistants like Humane and Limitless, what do you think they're getting wrong? Why has user adoption been lower than expected?
  • Sam: I think it's still early. I've been an early adopter of many kinds of computing devices; I used and loved the Compaq TC1000 in college, which was far from the iPad but headed in the right direction. Then I bought a Treo; I was a very uncool college student carrying the old Palm Treo, hardly a trendy product at the time, but eventually we got the iPhone. These devices look like a very promising direction, but they'll take a few more iterations.

AI companies of the future

  • Logan: You recently mentioned that many companies built on GPT-4 will be overwhelmed by future versions of GPT. Could you elaborate on that? And second, which AI-first companies can survive GPT's progress?
  • Sam: I've come to the conclusion that the only viable framework is this: you can build a business that bets on the model getting much better, or one that bets on it not getting much better. If you do a bunch of work to make some use case barely work with GPT-4, and then GPT-5 comes out and just crushes it while doing everything else well too, you'll regret all the effort you put in. But if you build something that works across the board rather than being super focused on one use case, then when GPT-5, or whatever it's called, comes out and is better across the board, you benefit from that. Most of the time you're not building an AI business; you're building a business, and AI is just the technology you happen to be using. It's like the early App Store days, when there were apps filling obvious gaps; eventually Apple solved those problems, so you didn't need a flashlight app anymore because it became part of the operating system. Then companies like Uber used the smartphone to build very defensible long-term businesses, and I think you should go after the latter.
  • Logan: I can think of a lot of existing companies leveraging your technology that fit this framework. Are there new, Uber-like concepts, examples of companies or interesting things, that fall into that category?
  • Sam: I would actually bet on new companies winning in many of these cases. A really common example is people trying to build an AI doctor, an AI diagnostician. People say they don't want to start a company in this area because Mayo Clinic or some big company will do it, but I think it's actually going to be a new company doing this kind of thing.
  • Logan: What advice do you have for CEOs who want to proactively prepare for these types of disruptions?
  • Sam: I would say that betting on the models getting better and cheaper every year is necessary, but not sufficient to guarantee you win. Large companies that take years to implement change, you can beat them, but every startup that's paying attention will do the same, so you still need to figure out your long-term defensibility. The competitive landscape right now is more open than it has been in a long time; there's a lot of new stuff you can do. But even with more ways to do it, you can't skip the hard work of building and maintaining value.

The Future of Work

  • Logan: Can you imagine a job or a role that will become mainstream in five years because of the existence of AI, but right now might be niche or doesn't exist at all?
  • Sam: That's a good question; I haven't been asked that before. People always ask what jobs will go away, but the new jobs are the more interesting question. Let me think about it. I can come up with lots of things that don't seem particularly interesting or important; what I'm looking for are big new areas that maybe fifty or a hundred million people would work in. The overall category is probably new forms of art, entertainment, and human-connection work. I don't know what the job titles would be, but I think this could become a massive new area. Whether we'll get there in five years, I don't know, but I think there will be high demand for incredible human-to-human experiences.

OpenAI Valuation

  • Logan: OpenAI's valuation was recently reported at around $90 billion. Aside from AGI, are there one or two things you think could make OpenAI a trillion-dollar company?
  • Sam: If we can keep improving our technology at the current rate, keep using it to build good products, and maintain the growth of revenue, I'm not sure about the exact figure, but I think we'll be fine.
  • Logan: Is the current monetization model what you think could create a trillion-dollar equity value?
  • Sam: The subscription model for ChatGPT is working well for us, which was unexpected. I wouldn't have bet on it doing this well, but it is.
  • Logan: Do you think that after AGI, whatever that means, we could ask the AGI if the monetization model should be different?
  • Sam: Yes.

Company Structure

  • Logan: November last year arguably exposed some shortcomings in the current OpenAI structure, but I don't think we need to revisit that; you've talked about it many times. What I'd like to ask is: what do you think is the appropriate structure for the future?
  • Sam: I think we're almost ready to discuss this. We've been having various conversations and brainstorming sessions, and we hope to formally address this issue within this year.
  • Logan: When Larry Summers and Bret Taylor were added to the board, I was waiting for my call, but it never came.

AI Expectations

  • Logan: Regarding AI expectations and monetization, I found an interesting point you once made: that manual work would be affected first, then white-collar work, and finally creative work. The actual situation seems somewhat reversed. Are there other things that went against your expectations, where you thought it would go one way but it turned out completely different?
  • Sam: The one you mentioned is the biggest surprise. I didn't expect AI to become proficient at legal work so early, because I thought that was very precise and complex work. But undoubtedly the biggest surprise is that reversal across physical, cognitive, and creative labor.

AGI

  • Logan: You've talked a lot about AGI and explained why you don't like this term. Can you elaborate on this perspective?
  • Sam: I no longer think of it as a sudden event. You know, when you start any company there are always many naive ideas, especially in such a rapidly changing field. When we started, my naive idea was that we would go through a phase without AGI and then suddenly have AGI, a real discontinuity. I still think a real discontinuity is possible, but overall it looks more like a continuous exponential curve, where what matters is the rate of progress each year. You and I might not agree on the specific month, or even year, at which we'd call something AGI. We could propose other tests that we'd both accept, but even that is harder than it sounds. GPT-4 clearly hasn't reached the threshold that almost anyone would call AGI, and I don't think our next big model will either, but I can imagine we might only be one, two, or a few ideas plus some scaling away from reaching a new stage. I think it's important to stay vigilant.
  • Logan: Is there a more modern Turing test, call it the Bartlett test, where once a model crosses that threshold, you'd consider it AGI?
  • Sam: I think it's a really important thing when it can do research better than any individual OpenAI researcher, or even better than all of OpenAI researchers combined. That seems like it could or should be a discontinuity.
  • Logan: Does that feel close?
  • Sam: Maybe not, but I wouldn't rule it out.
  • Logan: What do you think the biggest hurdle is to getting to AGI? It sounds like you think there's enough continuity in the scaling laws to carry us for a few more years?
  • Sam: I think the biggest hurdle is new research. One thing I've had to learn in moving from internet software to AI is that research doesn't run on timelines the way engineering does. Usually that means it takes longer and might not work at all, but sometimes it works far faster than anyone expected.
  • Logan: Can you elaborate on that point? Progress doesn't happen in a linear fashion?
  • Sam: I think the best way to explain it is through historical examples. I might get some of the numbers wrong, but I'm pretty sure no one will correct me. For example, the neutron was first proposed around 1920 and first detected in the early 1930s; work on the atomic bomb began in the late 1930s and was finished by 1945. Going from not knowing neutrons existed to building an atomic bomb, something that shattered all our intuitions about physics, happened very quickly. There are many other examples, like the Wright brothers: around 1901 they thought powered flight was another fifty years away, and by 1903 they had done it. Some things we anticipate never happen, or take much longer than we imagine, but sometimes progress is very fast.

Interpretability

  • Logan: Where are we at on the spectrum when it comes to the issue of interpretability? And how important is it going to be for AI in the long term?
  • Sam: There are different types of interpretability, such as whether I understand the mechanisms of each layer in the network, and whether I can spot logical flaws in the output. I’m excited about the work being done in this direction at OpenAI and elsewhere. I think interpretability as a broader field looks promising and exciting.
  • Logan: I won't push you for a definitive answer; I assume you'll make an announcement when you're ready. Do you think interpretability will be a requirement for mainstream AI adoption? Maybe just internally within corporations?
  • Sam: GPT-4 is not fully there yet, but it's a fair question.

Regulation

  • Logan: You might get asked about, or even accused of, certain things; maybe that's too strong a word, but people have concerns. One of them, amid the excitement around AGI, is the worry about OpenAI having a monopoly on AGI and making unilateral decisions, which leads to discussions about governance structures, where perhaps there should be elected leadership rather than you making the decisions.
  • Sam: I think heavily regulating models at current capability levels is a mistake, but when models can pose significant catastrophic risks to the world, some form of oversight is good. There's a subtle balance in setting those thresholds and testing standards, but there's a lot of space between choking off the enormous benefits of this technology and leaving people free to train models at home. If we had international rules like those for nuclear weapons, I think that would be a good thing.
  • Logan: On this question, some investors think this kind of regulation is bad, that the regulation itself carries risks. How do you weigh that against the inherent risks of AI?
  • Sam: I think they haven't really confronted the possibility of AGI. The people who loudly oppose AI regulation were, not long ago, completely denying that AGI was possible. I understand their point of view, that regulation is generally harmful to technology, look at Europe's tech industry, but I think we're approaching a threshold beyond which we might all feel differently.
  • Logan: Do you think open-source models themselves inherently pose some kind of danger?
  • Sam: Not the current ones, but I can imagine there might be one in the future.
  • Logan: You've said that safety, in a way, is a wrong framework because it's more about what we're willing to accept, like aviation safety.
  • Sam: Safety is not a binary concept. You're willing to get on an airplane because you think they're safe, even though you know that occasionally they crash. What we call airline safety standards are the result of discussions, and people have different opinions about them. Overall, they've become extremely safe, but that doesn't mean there won't be accidents.
  • Logan: Likewise, medicine has side effects and some people will have bad reactions; social media has its downsides too. There's an implicit trade-off in any safety standard. Is there anything in the safety paradigm that would make you change your approach?
  • Sam: We have something called the preparedness framework, which defines risk categories and levels and the different actions we'll take at each.

AGI Acceleration

  • Logan: I had Eliezer Yudkowsky on the show, and he talked a lot about his views. Do you think a fast takeoff scenario is possible?
  • Sam: It's certainly possible, and it might not even require modifying the current transformer architecture. It's not what I consider the most likely path, but I don't rule it out. I think things will stay more continuous, even as they accelerate. I don't think we'll go to bed one night with pretty good AI and wake up the next day with real superintelligence. But even a takeoff that happens over a year or a few years is fast in a sense. There's also the question of whether, once you have very powerful AGI, society changes the next day, the next year, or over the next decade. My guess is that for the most part it won't be a next-day or next-year thing, but over a decade the world will look very different. I think societal inertia works in our favor here.

Equity

  • Logan: I think people are also suspicious about certain questions, and the ones you seem reluctant to answer are about Elon, your equity, and the board situation in November. Those are probably your three least favorite questions, and you've been asked them many times. Which do you dislike answering the most?
  • Sam: I don't hate these questions; there's just nothing new to say.
  • Logan: I won't ask specifically about equity because I think you've answered it many times, although people still seem to dislike hearing "enough money" as the answer.
  • Sam: If I were going to make a trillion dollars and then give it away, that would probably fit people's expectations; it's the usual pattern, since most people who make a lot of money end up giving it away. I'm just doing it a different way.

Motivation

  • Logan: Aside from equity, what motivates you to pursue AGI? I think most people would feel that even with some higher mission, you can still get paid for it. So what gets you up every day now and makes you feel it's worth it?
  • Sam: I often tell people that I am willing to make a lot of compromises and sacrifices in my life right now because I think this is going to be the most exciting, important, and best thing I will ever get to work on. It's a crazy time, and I'm enjoying it, knowing it won't last forever. One day, I'll retire to a farm and look back fondly on these times, but also think, "Oh, those days were stressful, long, and intense." But it's also very cool, and I can't believe this is happening to me. It's amazing.

Unique Moment

  • Logan: Is there a moment that felt the least real to you? Like, for example, not being able to leave your city? You did a podcast with Bill Gates. I imagine your speed dial has a lot of interesting people on it. If I were to grab your phone right now, would there be a ton of celebrities in there? Over the last few years, has there been a moment that felt unique and surreal?
  • Sam: There are almost daily moments where something happens and I think, "Wow, if I had more space to reflect on this, it would feel crazy," but it's kind of like being a fish in water. After what happened in November, I received 10 to 20 text messages from presidents and prime ministers around the world, and that wasn't even the strangest part. The strange part was that all of this happened and it felt completely normal to me as I was replying. We were in this intense state for four and a half days, barely sleeping or eating, yet high-energy, clear, and focused, while physically in a weird adrenaline-filled state. It all happened the week before Thanksgiving, which was extremely crazy, but it resolved on Tuesday night. Then on Wednesday, the day before Thanksgiving, my partner Ollie and I drove up to Napa Valley, stopping at a restaurant along the way for food. During the drive I realized I hadn't eaten in days, and everything started to feel normal again. I ordered four main courses and two milkshakes just for myself, and it felt incredibly satisfying. While I was eating, one of the presidents messaged again, apologizing, and suddenly I realized how many people had messaged me and that it hadn't felt strange. What was strange was realizing that all of this had happened, that it should have been an incredibly strange experience, and yet it wasn't. That was a striking moment for me. I think humans adapt to almost anything much more than we realize; you can get used to any new normal, good or bad. This is something I've learned over the past few years, and I'm still surprised by it. It speaks to the extraordinary adaptability of humans, which is good for us, because we're facing a huge transformation.

Unique human characteristics

  • Logan: When you think about models becoming increasingly intelligent, what do you think will always remain uniquely human?
  • Sam: Many years from now, humans will still care deeply about other humans. I read things online where people say everyone will fall in love with ChatGPT, have ChatGPT girlfriends and such, but I bet they won't. I think our long-standing obsession with each other, in ways big and small, will continue. I don't think we'll primarily enjoy watching robots play soccer.

Operating OpenAI

  • Logan: In running this company, OpenAI, you drew on a lot of the rules and frameworks that you built at YC about how to run businesses, and you broke some of them. On hiring, would you hire different kinds of people than you would for a traditional internet company or B2B software company?
  • Sam: Researchers are very different from product engineers.
  • Logan: Researchers are unique, but would OpenAI recruit different types of executives?
  • Sam: I generally don't rely entirely on external recruitment for executives; I think it's important to promote from within, but also bring in some senior talent. This is important for what we're doing because what we do is quite different from what other companies do.

Important decision-making

  • Logan: In the process of OpenAI, which decision do you think was the most important? How did you make that decision?
  • Sam: It's hard to point to just one, but we decided to adopt iterative deployment rather than building AGI in secret and then releasing it all at once, which felt like a very important decision. Another key decision was betting on language models; we decided this would be what we would focus on, which seemed like a big bet at the time, but in hindsight, it seems obvious, though it didn't feel that way back then.
  • Logan: What’s the story behind the decision to bet on language models?
  • Sam: We had other projects, such as robotics and video games, but someone started working on language models, and Ilya was very convinced about this direction. We worked on GPT-1, GPT-2, studied scaling laws, expanded to GPT-3, and then we made a big bet, deciding this would be our main focus.

Two ways to use AI

  • Logan: You recently mentioned two ways to use AI: cloning yourself and cloning your smartest employee. That's a subtle distinction, can you elaborate?
  • Sam: Within the next five years, you might want to know whether you're texting me or my AI assistant. If it's the assistant, it will triage the information and you'll get a response from me later; if it's me, you'll know I'm handling it personally. I think there's value in separating the two: I don't want AI to be just an extension of myself, but an independent entity.
  • Logan: In music and other creative fields, it's becoming easy to replicate Drake's or Taylor Swift's voice. We may need some form of verification to confirm the authenticity of such works.
  • Sam: There does need to be some kind of verification. And to the earlier point, it's like at OpenAI, where different people handle different tasks rather than everything running through one person; I'd rather AI work as an independent entity like that than as an extension of myself.

Future Education

  • Logan: In terms of the education system, what changes do you think should be made to prepare for the future?
  • Sam: The biggest change is to allow, and even require, people to use these tools. In some cases we want people to do things the old way to build understanding, but overall, in real life you will have a calculator, so you need to understand it and be proficient with it. If you never use a calculator in math class, you'll perform poorly at work. Similarly, we shouldn't train people not to use AI, because it will be an important part of doing valuable work in the future.

The Shape of the Future

  • Logan: You’ve mentioned that AGI is just a point on the continuum of intelligence, and you believe progress might continue. Do you ever pause to reflect on or imagine what the future might look like?
  • Sam: Often. I don't picture flying cars or Star Wars future cities, but instead I imagine a world where one person can do the work of hundreds or thousands, and where we can figure out all of science. That would be pretty cool.