It's Crunch Time: Ajeya Cotra on RSI & AI-Powered AI Safety Work, from the 80,000 Hours Podcast
Ajeya Cotra discusses AI timelines, recursive self-improvement, and a possible crunch time in which AI rapidly accelerates its own development. She and Rob Wiblin also examine transparency, warning systems, and using AI to help align more capable successors.
Watch Episode Here
Listen to Episode Here
Show Notes
This cross-post from the 80,000 Hours podcast features Ajeya Cotra in conversation with Rob Wiblin about AI timelines, recursive self-improvement, and the “crunch time” window when AI could rapidly accelerate its own development. Ajeya explains why widespread, compounding automation may face fewer bottlenecks than many expect, and what that could mean for the world by 2050. They also discuss transparency, early warning systems, and the emerging strategy of using each generation of AI to align and control its successors.
LINKS:
Sponsors:
AvePoint:
AvePoint is building the control layer for AI agents so you can securely govern, audit, and recover every action at scale. Design trusted agentic outcomes from day one at https://avpt.co/tcr
VCX:
VCX, by Fundrise, is the public ticker for private tech, giving everyday investors access to high-growth private companies in AI, space, defense tech, and more. Learn how to invest at https://getvcx.com
Claude:
Claude is the AI collaborator that understands your entire workflow, from drafting and research to coding and complex problem-solving. Start tackling bigger problems with Claude and unlock Claude Pro’s full capabilities at https://claude.ai/tcr
Tasklet:
Build your own Cognitive Revolution monitoring agent in one click.
Try it for free and use code COGREV for 50% off your first month at https://tasklet.ai
CHAPTERS:
(00:00) About the Episode
(04:17) AI inside safety plans
(06:49) AGI growth expectations (Part 1)
(18:32) Sponsors: AvePoint | VCX
(20:54) AGI growth expectations (Part 2)
(21:56) Disagreement and measurement (Part 1)
(37:28) Sponsors: Claude | Tasklet
(41:19) Disagreement and measurement (Part 2)
(41:19) Transparency before takeoff
(58:00) Redirecting AI labor
(01:09:06) Defense plans and limits
(01:25:26) Pausing versus redirecting
(01:35:22) Open Phil and compute
(01:50:16) Bottlenecks and preparation
(01:57:42) From research to grants
(02:13:03) Burnout and sabbatical
(02:23:22) What EA offered
(02:36:43) EA and religion
(02:47:56) Next career steps
(02:57:35) EA's future niche
(03:08:15) Episode Outro
(03:12:11) Outro
PRODUCED BY:
SOCIAL LINKS:
Website: https://www.cognitiverevolution.ai
Twitter (Podcast): https://x.com/cogrev_podcast
Twitter (Nathan): https://x.com/labenz
LinkedIn: https://linkedin.com/in/nathanlabenz/
YouTube: https://youtube.com/@CognitiveRevolutionPodcast
Spotify: https://open.spotify.com/show/6yHyok3M3BjqzR0VB5MSyk
Transcript
This transcript is automatically generated; we strive for accuracy, but errors in wording or speaker identification may occur. Please verify key details when needed.
Introduction
Hello, and welcome back to the Cognitive Revolution!
Today's episode is a cross-post from the 80,000 Hours podcast, hosted by Rob Wiblin, featuring a conversation with Ajeya Cotra, who previously led technical AI safety grantmaking at Open Philanthropy (now Coefficient Giving) and now works on risk assessment at METR.
For years, AI insiders have recognized Ajeya as one of the most rigorous thinkers about the AI future, and she recently validated that judgment by coming in #3 out of more than 400 participants in the AI Digest 2025 AI Forecasting Survey.
For comparison, I was proud to land in the top 5% at #23.
In this conversation, Ajeya takes Rob through her expectations for the next few years as AI crosses critical thresholds, recursive self-improvement intensifies, and we enter what she describes as "crunch time" – a potentially short window in which AI is powerful enough to dramatically accelerate AI R&D, but not yet totally beyond human control.
As a preview, I'll warn you that even the accelerationists may suffer some future shock from this conversation, because Ajeya thinks it's actually quite plausible that we might find no insurmountable bottlenecks to widespread and compounding automation, and that if so, the world of 2050 could look as different to us as our world would look to hunter-gatherers of 10,000 years ago.
So, what's the plan to make sure such a mind-boggling transformation goes well for humans?
Ajeya advocates for transparency measures and early warning systems designed to make sure that superintelligence doesn't happen in secret, but aside from that … she reports that all frontier developers are gradually converging on a strategy of using each generation of AIs to attempt to align, understand, and control their successors.
Now, as regular listeners know, I signed the Future of Life Institute's October 2025 petition calling for a Ban on Superintelligence, not because I think this approach is forever destined to fail, but simply because I worry that we don't yet understand AIs well enough to bet on a good outcome from a recursive self-improvement powered intelligence explosion.
And yet, I do agree with Ajeya's advice – almost regardless of the kind of work you're doing, you should be adopting AI as aggressively as possible, both to maintain an accurate understanding of the situation at any given moment in time, and increasingly because you won't be able to keep up without it. It's a mad, mad world we'll soon be living in, but I would go as far as to say that even Pause AI campaigners ought to be using AI intensively.
If all that weren't enough for you to process, you should also know that the situation has recently accelerated yet again.
On March 5, just about 2 weeks after this episode was originally published, Ajeya posted an article on her Substack, Planned Obsolescence, called "I underestimated AI capabilities (again)" – in which she reports that the predictions she made in January 2026, which were the backdrop for this conversation, were already starting to be met, in just the first couple months of the year.
And more recently, we've of course learned of Anthropic's new Mythos model, which, despite the fact that Anthropic has never emphasized benchmark scores as much as other model developers, shows major gains on many benchmarks, and has reportedly found zero-day exploits in every major operating system and web browser, among many other software projects.
The bottom line is that crunch time is arguably here now, so if you've been watching and waiting for AI to get serious before deciding what to do about it, I would suggest getting off the sidelines sooner rather than later. If you need help figuring out what to do, you might consider applying for free 1:1 career advising from 80,000 Hours.
As always, I want to thank Rob and the 80,000 Hours team for allowing me to cross-post this episode. They've been delivering incredible alpha for years, and the nearer the singularity becomes, the more prescient they look.
With that, I hope you enjoy this essential conversation about AI timelines and crunch time strategy, with Ajeya Cotra and host Rob Wiblin, from the 80,000 Hours podcast.
Main Episode
Ajeya Cotra: If you look at public communications from at least OpenAI, Anthropic, and Google DeepMind, in all of their stated safety plans you see this element of: as AIs get better and better, they're going to incorporate the AIs themselves into their safety plans more and more. How to create a setup where we use control techniques and alignment techniques and interpretability to the point where we feel good about relying on their outputs is a crucial step to figure out, because either it bottlenecks our progress, because we're checking on everything all the time and slowing things down, or it doesn't bottleneck our progress but we hand the AIs the power to take over.
Rob Wiblin: Today I'm speaking with Ajeya Cotra. Ajeya is a senior advisor at Open Philanthropy, where in 2024 she led their technical AI safety grantmaking. More generally, she's been doing AI-related research and strategy since 2018, and has become very influential in AI circles for her work on timelines, capability evaluations, and threat modeling. Thanks so much for coming back on the show, Ajeya.
Ajeya Cotra: Thank you so much for having me.
Rob Wiblin: Doing this interview gave me a chance to go back and listen to the interview we recorded, I guess, two and a half years ago. And I have to say, you were very on the ball. There were a lot of issues that came up in that conversation that you were bringing to people's attention which, in the subsequent two and a half years, have come to seem like a much, much bigger deal. You talked about METR evaluating autonomous capabilities, a line of research that's gone on to become super influential and, I think, very widely read in policy circles. You talked about using probes to monitor and shut down dangerous conversations, something that's now a pretty standard practice and maybe one of the most useful outputs from mechanistic interpretability. You talked about the importance of using chain of thought and scratchpads to monitor what AIs are doing and why, which is still probably the dominant technique. You talked about the growing situational awareness of AI models and the resulting possibility of deceptive alignment, now a completely mainstream topic. You talked about how, when you train models not to engage in bad behaviour, they don't necessarily just learn to become honest; they also learn to hide their misbehaviour better, something that research has since borne out really does happen, and it's a big concern. You talked about how you expected models to get schemier as they get smarter, especially once we inserted reinforcement learning back into the mix, which has definitely happened. And you talked a bunch about sycophancy: how you thought models might end up just flattering people rather than giving accurate information, because that's something we enjoy. So I feel like, I mean, you didn't come up with all these ideas or anything like that, but you were ahead of the curve, and maybe we'll get some ahead-of-the-curve ideas in this conversation as well.
Ajeya Cotra: Hopefully. Thank you.
Rob Wiblin: So you think that a key driver of disagreements about kind of everything to do with AI is people's different views on how likely AGI is to speed up science and technology, and I guess physical infrastructure and manufacturing. Why is that?
Ajeya Cotra: Yeah. I think a thing I've been noticing as the concept of AGI has become more and more mainstream is that it's also become more and more watered down. Last year I was on a panel about the future of AI at DealBook in New York, and it was me and one or two other folks who think about things from a safety perspective, and then a number of venture capitalists and technologists. The moderator asked at the very beginning of the panel whether we thought it was more likely than not that by 2030 we would get AGI, defined as AIs that can do everything humans can do. And seven or eight hands went up, not including mine, because my timelines are somewhat longer than that. But then he asked a follow-up question a couple of questions later about whether we thought AI would create more jobs or destroy more jobs over the following ten years. So 2030 was five years away, and seven out of ten people thought we would have AGI by 2030, but then it turned out that eight out of ten people, not including me, thought AI would create more jobs than it destroyed over the next ten years. And I was a little confused. I was like: why is it that you think we will have AI that can do absolutely everything the best human experts can do in five years, but that it will actually end up creating more jobs than it destroys in the following ten years?
Rob Wiblin: There's a certain tension in that view.
Ajeya Cotra: And when I poked some people later in the panel about that seeming tension, they really quickly backed off and said, you know, "What does AGI really mean?" The moderator had defined it as this very extreme thing, but they were like, "We kind of already have AGI. People keep moving the goalposts. We keep making cool new products, and people aren't accepting that it's AGI; they aspire to something higher." And I thought that was funny, because the old-school singularitarian futurist definition of AGI is this very extreme thing, but I think VCs have an instinct to call something much milder AGI, like "GPT-5 is AGI." And so I think this creates a situation where people feel like they've gotten a lot of evidence that AGI isn't a very big deal and doesn't change much, because we already have AGI, or we're going to have it next year, or we got it two years ago, and, look around us, nothing much is changing. So I think there's this expectation where, whether or not we get AGI in the next few years, a lot of people are starting to not really care about that question. They still expect the next 25 years and the next 50 years to play out kind of like the last 25 years or the last 50 years: there was a lot of technological change between 2000 and 2025, but it's a moderate amount of change. And they kind of expect that by 2050 there will be a similar amount of change as there was between 2000 and 2025. Even if they think we're going to get AGI in 2030, they think AGI is just what's going to drive that sort of continued mild improvement. Whereas I think there's a pretty good chance that by 2050 the world will look as different from today as today does from the hunter-gatherer era: ten thousand years of progress rather than 25 years of progress, driven by AI automating all intellectual activity.
Rob Wiblin: Yeah. I guess you've hinted at the fact that there is an enormously wide range of views on this. But can you give us a sense of just how large the spectrum is, and what the picture looks like on either end?
Ajeya Cotra: Yeah. So I would say the standard mainstream view, if you ask a normal person on the street what 2050 will look like, or if you ask a standard mainstream economist, is that the population is a little bit bigger, we have somewhat better technologies, maybe they have a few pet technologies they're most interested in, maybe we have slightly better medicine, people live slightly longer. It's an amount of change that's extremely manageable. On the far extreme from there, on the other side, is the view described in If Anyone Builds It, Everyone Dies. In that worldview, at some point, probably pretty unpredictably, we sort of crack the code to extreme superintelligence: we invent a technology that rather suddenly goes from being, you know, GPT-5 and GPT-6 and so on, to being so much smarter than us that we're like cats or mice or ants compared to this thing's intelligence. And then that thing can really immediately have extreme impacts on the physical world, the canonical example being inventing nanotechnology: the ability to precisely manufacture things that are really, really tiny, that can replicate themselves really quickly and do all sorts of useful things, or inventing space probes that travel close to the speed of light, and things like that. There's a whole spectrum in between, where people think we are going to get to a world where we have technologies approaching their physical limits, spaceships approaching the speed of light, and self-replicating entities that replicate as quickly as bacteria while also doing useful things for us, but we're going to have to go through intermediate stages before getting there. But something that unites all of the people who are AI futurists and concerned about AI x-risk is that they think in the coming decades we're likely to get this level of extreme technological progress, driven by AI.
Rob Wiblin: How strong is the correlation between how much someone expects AI or AGI to speed up science research in particular, and I guess physical industry as well, and how likely they think it is to go poorly, or how nervous they are about the whole prospect?
Ajeya Cotra: I think it's a very strong correlation. I've often found that reasonable people who are AI accelerationists tend to think that the default course of how AI is developed and deployed in the world is very, very slow and gradual, and they think we should cut some red tape to make it go at a little bit more of a reasonable pace. And people who are worried about x-risk think that the default course of AI is this extremely explosive thing, where it overturns society on all dimensions at once in, you know, maybe a year, or maybe five years, or maybe six months, or maybe a week, and they're saying we should slow it down to take ten years, maybe. Meanwhile, the accelerationists think that by default, diffusing and capturing the benefits of AI will take fifty years or a hundred years, and they want to speed it up to take thirty-five years, you know?
Rob Wiblin: It's quite interesting, I guess: people who radically differ in their policy prescriptions might be aiming for the same level of speed. Maybe they both want this transition to take ten years or twenty years; that's what both of them want. But their baselines are so different that they're pushing in completely opposite directions. What's your kind of modal expectation? What do you think is the most likely impact for it to have?
Ajeya Cotra: I think that probably in the early 2030s we are going to see what Ryan Greenblatt calls "top-human-expert-dominating AI," which is an AI system that can do tasks you can do remotely from a computer better than any human expert. So it's better at remote virology tasks than the best virologists, better at remote software engineering tasks than the best software engineers, and so on for all the different domains. And by that time, I feel like probably the world has already accelerated and changed, and narrower and weaker AI systems have already penetrated in a bunch of places, and we're looking at a pretty different world. But at that point I think things can go much, much faster, because I think top-human-expert-dominating AIs in the cognitive domain could probably use human physical labour to build robotic physical actuators for themselves. Whether the AIs have already taken over and are acting on their own, or whether humans are still in control of the AIs, I think automating the physical world would be a goal they would have. And I have pretty wide uncertainty on exactly how hard that will be. But whenever I check in on the field of robotics, I actually feel like robotics is progressing pretty quickly, and it's taking off for the same reasons that cognitive AI is taking off: large models, lots of data, imitation; large scale is helping robotics a lot. So I imagine that you can pretty quickly, maybe within a year, maybe within a couple of years, get to the point where these superhuman AIs are controlling a bunch of physical actuators that allow them to close the loop of making more of themselves: doing all the work required to run the factories that print out the chips that then run the AIs, doing all the repair work on that, and gathering the raw materials for that.
Rob Wiblin: So you're saying you expect that in the 2030s it won't just be that these AI models are capable of automating, you know, computer-based R&D, but they'll also be able to lead on the project of building fabricators that produce the chips they run on. And so that's another kind of positive feedback loop. Yeah.
Ajeya Cotra: I really recommend the post "Three Types of Intelligence Explosion" by Tom Davidson on Forethought, where he makes the point that we talk a lot about the promise and the danger of AIs automating AI R&D, automating the process of making better AIs. But that's only one feedback loop required to fully close the loop of making more AIs, because there we're talking about software that makes the transformer architecture slightly more efficient, or gathers better data to train the AIs on. But AIs are also running on chips, which are printed in these, you know, chip factories for NVIDIA, and those factories have machines that are built by other machines that are built by other machines, and ultimately go down to raw materials. And I think something we don't talk about very much, because it'll happen afterward, is how hard it would be for the AIs to automate that entire stack, the full stack, and not just the software stack.
Rob Wiblin: So the range of expectations that exists among sensible, thoughtful people who've engaged with this, on how much, at peak, AGI is going to speed up economic growth: it ranges from people who say it will speed up economic growth by 0.3 percentage points, so a 15% increase or something on current rates of economic growth, and I'd be very happy if it was that good, to people who say that at peak the economy will be growing at a thousand percent a year, or higher than that, thousands of percent a year. So it's a hundredfold or a thousandfold or a ten-thousandfold disagreement, basically, on the likely impact this is going to have. It's an almost unfathomable degree of disagreement among people where it's not as if they've thought about this independently and haven't had a chance to talk: they've spoken about it, they've shared their reasons, and they don't change their minds, and they still disagree by a thousandfold on the impact. And you've made it part of your mission in life over the last couple of years to have really sincere, intellectually engaged, curious conversations with people across the full spectrum. Why do you think this disagreement is able to be maintained?
Ajeya Cotra: Yeah. I feel like at the end of the day, the different parties tend to lean on two different, pretty simple priors, or simple outside views, that are kind of different outside views. I would say that the group that expects things to be a lot slower tends to lean on: well, for the last 100 or 150 years in frontier economies, we've seen 2% growth. And think of the technological change that has occurred over the last 100 or 150 years: we went from electricity being just an idea to everywhere being electrified; we had the washing machine, the television, the radio; computers happened in this period of time. None of these show up as an uptick in economic growth. And there's this stylized fact that mainstream economists really like to cite, which is that new technology is the engine that sustains 2% growth, and in the absence of that new technology, growth would have slowed. So they're kind of like: this is how new technologies always are. People think they're going to lead to a productivity boom, but you never see them in the statistics. You didn't see the radio, you didn't see the television, you didn't see the computer, you didn't see the internet, and you're not going to see AI. AI might be really cool; it might be the next thing that lets us keep chugging along. That's one perspective, an outside view they keep returning to. And there's also maybe a somewhat more generalized thing, which is that things are just always hard and slow, just way harder and slower than you think. You know, what was it? Murphy's... not Murphy's law.
Rob Wiblin: "Anything that can go wrong will go wrong." I think this is our experience in our personal lives: it's awfully hard to achieve things at work, things that to other people might seem straightforward. And they're like, "Why haven't you finished this yet?" And you're like, "Well, I could give you a very long list."
Ajeya Cotra: Or Hofstadter's law: it always takes longer than you think, even when you take Hofstadter's law into account. Or the programmer's credo, this is my favourite one: "We do these things not because they are easy, but because we thought they would be easy." So there's this whole cloud of: it's naivete to think that things can go crazy fast. If you write down a story that seems perfect and unassailable for how things will be super easy and fast, there are all sorts of bottlenecks and all sorts of drag factors you inevitably failed to account for in that story. That's kind of that perspective. And then I think the alternative perspective leans a lot on much longer-term economic history. If you attempt to assign reasonable GDP measures to the last ten thousand years of human history, you see acceleration. The growth rate was not always 2% per year at the frontier; 2% per year is actually blisteringly fast compared to what it was in, say, 3000 BC, which was maybe 0.1% per year. So the growth rate has already multiplied many fold, maybe an order of magnitude, maybe two. People in the slower camp tend to feel like the exercise of doing long-run historical data is just too fraught to rely upon. But people in both camps do agree that the Industrial Revolution happened, and the Industrial Revolution accelerated growth rates a lot: we went from growth rates well below 1% to 2%-a-year growth rates. And people in the faster camp tend to lean on the long run, and on models that say the reason we had accelerating growth in the long run was a feedback loop where more people can try out more ideas and discover more innovations, which then leads to food production being more efficient, which then leads to a larger supportable population, and then you can rinse and repeat, and you get super-exponential population growth. That perspective says that if you can slot in AIs to replace not just the cognitive but the cognitive and the physical, the entire package, and close the full loop of AIs doing everything needed to make more AIs, or AIs and robots doing everything needed to make more AIs and robots, then there's no reason to think that 2% is some sort of physical law of the universe. They can grow as fast as their physical constraints allow them to grow, which are not necessarily the same as the constraints that keep human-driven growth at 2%.
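To make the feedback loop Ajeya describes concrete, here is a minimal, purely illustrative simulation. The parameter values and the linear "population tracks output, innovation tracks population" assumptions are made up for demonstration; this is not a calibrated economic model, just a sketch of why this kind of loop produces super-exponential rather than steady 2% growth.

```python
# A toy, purely illustrative feedback loop: more people (or AIs) -> more ideas ->
# faster output growth -> a larger supportable population of researchers.
# All parameter values are hypothetical; this is not a calibrated economic model.

def simulate_feedback_loop(idea_rate=0.002, pop_per_output=1.0, years=2000):
    output = 1.0          # aggregate output, arbitrary units
    records = []
    for year in range(years):
        population = pop_per_output * output   # population scales with what output can support
        growth_rate = idea_rate * population   # more researchers -> faster innovation
        output *= (1.0 + growth_rate)
        records.append((year, output, growth_rate))
        if growth_rate > 10.0:                 # stop once growth explodes past 1000%/year
            break
    return records

if __name__ == "__main__":
    records = simulate_feedback_loop()
    step = max(1, len(records) // 10)
    for year, output, growth in records[::step]:
        print(f"year {year:4d}: output {output:14.2f}, growth {growth:9.2%}")
    final_year, _, final_growth = records[-1]
    if final_growth > 10.0:
        print(f"growth rate exceeds 1000%/year around year {final_year}")
```

The point is purely structural: nothing in the loop pins growth at 2% a year, so once the supply of "researchers" can itself be manufactured, the binding constraint is whatever physical limit eventually bites, not the structure of the feedback.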
Rob Wiblin: So that's the justification they provide for their perspectives, in broad strokes. But why is it that even after communicating this at great length to one another, they don't converge on uncertainty, or on saying it'll be something in the middle because there are competing factors? They just continue to be reasonably confident about quite different narratives, I guess, about how things will go.
Ajeya Cotra: Yeah, I'm honestly not sure. I think maybe one part of it is that... so I guess, you know, I'm partial to the "things will be crazier" side, so I'm not sure I'll be able to give a perfectly balanced account. But one thing I've noticed about people who think it'll be slower is that their worldview kind of has a built-in error theory of people who think things will go faster. So the worldview is not just "things will keep ticking along," but "everyone always thinks there will be some big new revolution..."
Rob Wiblin: That makes things... everyone's always expecting things to speed up.
Ajeya Cotra: And they've always been wrong. So there's that dynamic, which, from their point of view, I think is totally reasonable. It's like: even if there isn't some super knock-down argument in the terms of your interlocutor, where you can point to a mistake that they'll accept, or even if you look at the story and think it's kind of plausible, you still have this strong prior that someone could have made the same argument about television, someone could have made the same argument about computers, and none of these played out. So I think that's a big factor. I also think these are complicated ideas and there hasn't been that much dialogue, and there could be more, including more dialogue that tries to ground things in near-term observations. But yeah, I think a big part of it is they have an error theory built in that makes it so that the object-level conversation about, OK, here's how the AI could make the robots, and here's how the robots could bootstrap into more robots, and so on, doesn't feel very legitimate or interesting to them. They sort of have a story where that type of thinking always leads to a bias towards expecting things to go faster than they actually will, because it's hard for that kind of thinking to account for all the drag factors and all the bottlenecks. Whereas I think on the other side, people who think things will go faster feel like everyone is always kind of blanket assuming that there are going to be bottlenecks. And then they bring up specific bottlenecks, and those specific bottlenecks, when you look into them, might slow things down from some sort of absolute peak of a thousand percent growth, but they're not reasons to think that 2% is where the ceiling is, or even that 10% is where the ceiling is. So they also have this kind of error theory of the bottlenecks objection.
Rob Wiblin: So it's incredibly decision-relevant to figure out who is right here. I think for almost all of the parties to this conversation, if they completely changed their view, if the people who thought it was going to be a thousand percent decided it was going to be 0.3 percent, that would probably change what they're working on; they'd probably see it as a decisive consideration against everything they were doing previously. And vice versa: if people came to think there would be a thousand percent speedup, then they'd probably be a whole lot more nervous and interested in different kinds of projects. So how can we potentially get more of a heads up ahead of time about which way things are going to go? It seems like sharing theoretical arguments hasn't been persuasive to people. Are there any kinds of empirics that we could collect as early as possible?
Ajeya Cotra: So one thing that I think will not address all of this, but is a step in the right direction, is really characterizing how and why and whether AI is speeding up software and AI R&D. So METR came out with an uplift RCT, which I think was the first of its kind, or at least the largest and highest quality, where they had software developers split into two groups: one group was allowed to use AI, the other group was disallowed from using AI, and they studied how quickly those developers solved issues, tasks on their to-do list. And it actually turned out that in this case AI slowed down their performance, which I thought was interesting. I don't expect that to remain true, but I'm glad we're starting to collect this data now, and I'm glad we're starting to cross-check between benchmark-style evaluations, where AIs are given a bunch of tasks and scored in an automated way, and evidence we can get about actual in-context, real-world speedups. So I really want a lot more evidence about that, of all kinds: big uplift RCTs; it would be great if companies were internally conducting RCTs on their own rollouts of internal products, to see whether teams that get the latest AI product earlier are more productive than teams that don't; even self-report, which I think has a lot of limitations, is still something we should be gathering. So I guess my high-level formula would be: look at the places where adoption has penetrated the most, and start to measure speedup in actual output variables. I think it would be really cool if there was a solar panel manufacturing plant that had really adopted AI, and we started to see how much more quickly they could manufacture solar panels, or how much better they could make solar panels.
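As a rough illustration of what an uplift RCT's headline number might look like, here is a simplified sketch with hypothetical completion times. The estimator (a ratio of geometric means with a bootstrap confidence interval) and the data are assumptions chosen for demonstration, not METR's actual methodology or results.

```python
# Minimal sketch of estimating "uplift" from an RCT on task completion times.
# Hypothetical data and a simplified estimator; not METR's actual methodology.
import math
import random

def geometric_mean(times):
    return math.exp(sum(math.log(t) for t in times) / len(times))

def speedup_estimate(ai_times, no_ai_times):
    # >1 means the AI-allowed arm finished tasks faster on (geometric) average
    return geometric_mean(no_ai_times) / geometric_mean(ai_times)

def bootstrap_ci(ai_times, no_ai_times, n_boot=10_000, alpha=0.05):
    rng = random.Random(0)
    estimates = sorted(
        speedup_estimate(rng.choices(ai_times, k=len(ai_times)),
                         rng.choices(no_ai_times, k=len(no_ai_times)))
        for _ in range(n_boot)
    )
    lo = estimates[int(alpha / 2 * n_boot)]
    hi = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

if __name__ == "__main__":
    # Hypothetical per-task completion times in minutes
    ai_allowed    = [95, 120, 60, 150, 80, 200, 110, 70]
    ai_disallowed = [90, 100, 75, 130, 85, 160, 95, 65]
    est = speedup_estimate(ai_allowed, ai_disallowed)
    lo, hi = bootstrap_ci(ai_allowed, ai_disallowed)
    print(f"estimated speedup: {est:.2f}x (95% CI {lo:.2f}-{hi:.2f})")
    # A value below 1.0 corresponds to the surprising slowdown result Ajeya describes.
```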
Rob Wiblin: Yeah. Is it possible to do this at the chip manufacturing level? I guess maybe that's the most difficult manufacturing there is, more or less, so we might think you'd get more of an early heads up if you did something more straightforward, like solar panels. But we'd really like to be monitoring across all kinds of different manufacturing: how much difference is any of this making?
Ajeya Cotra: I think the most important thing, or the thing I ultimately care about, is the AI stack: chip design, chip manufacturing, manufacturing the equipment that manufactures chips, and then of course the software piece of it too. The software piece is the earliest piece, but I think we should be monitoring degree of AI adoption, self-reported AI acceleration, RCTs, anything we can get our hands on, for the entire stack. Because the moment when the AI futurists think things are likely to be going much, much faster sort of coincides with when AI has fully automated the process of making more AI. So that's really something to watch out for. But on a separate track, you also want to just be looking at the earliest power users, no matter where they are, just because you can get insight that transfers to these domains.
Rob Wiblin: Is there anything else we can do?
Ajeya Cotra: I don't know. I'm really curious about this.
Rob Wiblin: Am I right that last year you put out a request for proposals, when you were at Open Phil, looking to fund people who had ideas for how we would resolve this question?
Ajeya Cotra: Yeah, I put out a pair of requests for proposals in late 2023. One of them was on building difficult, realistic benchmarks for AI agents. At the time very few people were working with AI agents, and only a couple of agentic benchmarks had come out, including METR's benchmark that I discussed on the show last time. And I was really excited about it; it felt like the moment to move on from giving LLMs multiple-choice tests to giving them real tasks, like "book me a flight," or "make this piece of software work: write tests, run the tests, iterate until the thing actually works." That was a very new idea at the time, but the time was right for it, and there were a lot of academic researchers excited about moving into the space. So we got a lot of applications for that arm of our request for proposals, and we funded a bunch of cool benchmarks, including Cybench, a cyber offense benchmark that's used in a lot of standard evaluations now. But then we also had this other arm, which was basically types of evidence other than benchmarks: surveys, RCTs, all the things we talked about. We got much less interest in that. And I think it just reflects that it's harder to think of good ways to measure things outside of benchmarks, even though everyone agrees benchmarks have major weaknesses and consistently overestimate real-world performance, because benchmarks are clean and contained and the real world is messy and open-ended. But one thing I'm excited about that came out of the second RFP is that the Forecasting Research Institute is running this panel called LEAP, a longitudinal expert panel on AI, where they take one or two hundred AI experts, economists, and superforecasters and have them answer a bunch of granular questions about where AI is going to be in the next six months, the next year, the next five years: benchmark scores, but also things like, will companies report that they're slowing down hiring because of AI? Will an AI be able to plan an event in the real world? These kinds of things. So I'm very excited about that. And honestly, having people make subjective predictions, explain how those predictions are connected to their longer-run worldviews, and then checking over time who's right might be the most flexible tool we have. So I'm very excited to see where LEAP goes. But it is challenging to get indicators that are clearly early warnings, so that we can actually do something about it if the people who are more concerned are right, but that are also clearly valid and not easy to dismiss on the other side as just not realistic enough to matter.
Rob Wiblin: So as part of this, you've been thinking about, I guess, one way this could really go wrong: the companies developing cutting-edge AI may begin to see internally how much it's helping them, and that perhaps it's speeding them up enormously, but they may decide not to share that information with the rest of the world, and...
Ajeya Cotra: They may decide not to release those products. If there's one company that's well ahead of the others, then, like in AI 2027, it was depicted that the company that was ahead in the AI race was so far ahead of its competitors that it could afford to just keep its best stuff internal and only release less good products to the rest of the world.
Rob Wiblin: It could afford it in the sense that it didn't need to make money by selling the product.
Ajeya Cotra: Its competitors were far enough behind that they couldn't undercut it or compete with it by releasing a better product. In the story, the company in the lead, OpenBrain, is basically just releasing products that are slightly better than the state of the art of its competitors.
Rob Wiblin: They're so far ahead that they can just choose to always have their product be somewhat better; they can release whatever level of their own internal machines would be best to put out externally. Yeah, OK. But I guess it would be unfortunate if there are people who do know this, but the broader world doesn't get a heads up, so that we could have known six months or a year earlier in what direction things were going, but that was kept secret. I mean, maybe the leading AI company would prefer to keep it secret, but the rest of us would probably prefer that the government has some idea what's going on. So you've been thinking about what sort of transparency requirements could be put in place that would require the companies to release information that would give the rest of us clues as to where things are going. What sort of transparency requirements could those be?
Ajeya Cotra: Yeah. So I think there's a whole spectrum of evidence about AI capabilities, where on the one hand the easiest to test but least informative is benchmark results. And companies do release benchmark results when they release models right now. So, you know, Claude Opus 4 was released, and it has a model card that says it has this score on this hacking benchmark, this score on the software engineering benchmark, and so on, as part of a report about whether it's dangerous; GPT-5 had the same thing. I think it's great that they do that. But in my ideal world, they would release their highest internal benchmark scores at some calendar-time cadence. So every three months they would say: we've achieved this score on this hacking benchmark, this score on the software engineering benchmark, this score on an autonomy benchmark. And that's because, as you said, danger could manifest from purely internal deployment: if they have an AI agent that's sufficiently good at AI R&D, they could use it to go much faster internally, and then other capabilities, and therefore other risks, might come online much faster than people were previously expecting. So it's not ideal to have your report card for the model come out only when you release it to the public, unless there's some sort of guarantee that you're not sitting on a product that's substantially more powerful than the public product. So maybe it's fine to release your model card and system card along with the product, if you also separately have a guarantee that you won't have too much of a gap between the internal and the external. So that's on the end of things that are currently discussed; it's how I would tweak information that's currently reported to be somewhat more helpful for this concern. But then there's a bunch of other stuff that is not currently reported that I think would be really great to know: stuff like how much, and how, they are using AI systems internally. So one thing I'm very interested in: companies will sometimes report, kind of to brag, the percentage of lines of code that are written by their AI systems. Various CEOs have said things like, "Internally, 90% of our lines of code are written by AIs." I think it'd be great to have systematic reporting of those kinds of metrics, but they aren't the ideal metric I'd be interested in. One thing I'm interested in is: what fraction of pull requests to your internal codebase were mostly written by AI and mostly reviewed by AI? So humans are, for the most part, not involved on either side of that equation. I'd be very interested in watching that number climb, because I think it's an indication both of AI capabilities and of how much deference they're giving to AIs. And eventually, if things are going to go crazy fast, the AIs have to be doing most things, including most management and approval and review, because if humans have to do that stuff, then things can only go so fast, you know? So I really want to track how much higher-level decision-making authority is being given to the AIs in practice inside the companies.
Yeah, and I think there are probably a bunch of other things that we could send basically as a survey: how much do you use AIs for this type of thing or that type of thing? How much speedup do you subjectively think you get? If you're running any internal RCTs, I would of course love to know the results of that.
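As a sketch of the pull-request metric Ajeya describes, here is a minimal example that computes the share of PRs that were both mostly written and mostly reviewed by AI, per quarter. The record format and field names are hypothetical; real internal reporting would need an agreed way of attributing authorship and review to AI.

```python
# Sketch of one of the internal metrics discussed: the share of pull requests that were
# both mostly written by AI and mostly reviewed by AI, tracked per quarter.
# The PR record format here is hypothetical, just to illustrate the calculation.
from dataclasses import dataclass

@dataclass
class PullRequest:
    quarter: str                 # e.g. "2026-Q1"
    ai_authored_fraction: float  # share of the diff generated by AI tools
    ai_reviewed_fraction: float  # share of review/approval decisions made by AI

def ai_end_to_end_share(prs, threshold=0.5):
    """Per quarter: fraction of PRs where AI did most of the writing AND most of the reviewing."""
    by_quarter = {}
    for pr in prs:
        total, hits = by_quarter.get(pr.quarter, (0, 0))
        hit = pr.ai_authored_fraction > threshold and pr.ai_reviewed_fraction > threshold
        by_quarter[pr.quarter] = (total + 1, hits + (1 if hit else 0))
    return {q: hits / total for q, (total, hits) in by_quarter.items()}

if __name__ == "__main__":
    sample = [
        PullRequest("2026-Q1", 0.9, 0.2),
        PullRequest("2026-Q1", 0.7, 0.8),
        PullRequest("2026-Q2", 0.95, 0.9),
        PullRequest("2026-Q2", 0.6, 0.7),
    ]
    print(ai_end_to_end_share(sample))  # watching this number climb is the signal described above
```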
Rob Wiblin: What about just requirements that, insofar as they're training future generations of AI models, they have to reveal to at least some people in the government how those models are performing on normal capability evals? So they can see the line going up even if the models aren't being released as products for whatever reason. And if the benchmarks start coming in far above previous expectations, that could lead them to sound the alarm.
Ajeya Cotra: Yeah, I think that is a good thing to do. But I sort of don't think that benchmarks alone will actually lead anyone to sound the alarm, because the thing with benchmarks is that they saturate.
Rob Wiblin: They always have that S-curve shape.
Ajeya Cotra: They always have the S-curve shape. And the benchmarks we have right now are harder than the previous generation of benchmarks, but it's still far from the case that I feel confident that if your AI gets a 100% score on all these benchmarks, then it's a threat to the world and could take over the world. I still think the benchmarks we have right now are well below that. So what's probably going to happen is that these benchmarks are going to get saturated, then there's going to be a next generation of benchmarks people make, and those benchmarks are going to tick up and then get saturated. So I think we need some kind of real-world measure before we can start sounding the alarm. And the ultimate real-world measure is actually just observed productivity, right? If they are seeing internally that they're discovering insights faster than they were before, that's a very late but also very clear signal. And that's the point at which they should definitely sound the alarm, and we should sort of know what's happening.
Rob Wiblin: Yeah. How is this idea being received by the companies? On the one hand, it seems like transparency requirements are the regulatory instrument that the companies have objected to the least; it's the one they've been most willing to tolerate. On the other hand, the whole message of this is: we don't trust you to share information with the rest of the world, and we think you might screw us over, basically, by rushing ahead and deliberately concealing that. I could imagine that could be a little bit offensive to them, or at least, if that is their plan, then they probably want to find some excuse for not having this kind of oversight.
Ajeya Cotra: Yeah. I think the response just tends to differ based on the actual information being asked for. Benchmark scores they already release; like I said, they release them at the point of releasing a product, which I think is fine for now, but I would like to move to a regime where they release benchmark scores at some sort of fixed cadence even if they don't have a product release. Benchmark scores are not considered sensitive information. But this other stuff, which I think is a lot more informative on the margin, is much more fraught, right? They don't necessarily want to share with the world the rate at which they're gaining algorithmic insights, because you want to maintain some mystery about that for competitive reasons. It's risky for you if it's a little bit too fast, because then competitors will start paying more attention to you, trying to copy you, and trying to find out what's going on. It's also risky for you if it's too slow, because then that's kind of embarrassing.
Rob Wiblin: Investors lose heart.
Ajeya Cotra: Yeah, investors lose heart. And another thing I didn't mention earlier is that I would really like them to be reporting their most concerning misalignment-related safety incidents. Like, has it ever been the case that, in real-life use within the company, the model lied about something important and covered up the logs? I really want to know that. But of course, reporting that is very embarrassing for companies. So one thing that might help here is that there are a number of companies now, so perhaps they could report their individual data to some sort of third-party aggregator that then reports out an anonymized, overall industry aggregate score. But I don't think that solves all the issues, because there are few enough of them that people would be able to guess. So I think there are a lot of competitive challenges and IP sensitivity challenges and just PR challenges to overcome here with some of the more penetrating internal information. But I think it's important enough to the public interest that we should try to find a way to navigate that.
Rob Wiblin: Yeah. So it's not unusual for government agencies to be able to basically demand commercially sensitive information from companies for regulatory or governance purposes. I actually worked at one when I was in the Australian government: I was at the Productivity Commission, which had extraordinary subpoena powers to demand almost any documents from any company in the country. It was a rarely used power, but it wasn't the only agency that had that capability.
Ajeya Cotra: And what kinds of things would you ask them for?
Rob Wiblin: Well, I never actually saw this power being used; I guess people were just proud of the fact that we had that authority. But I think you would usually do it for competition reasons: trying to tell whether companies are potentially colluding, or whether there's an insufficient degree of market competition and there would be a reason to intervene. And I would imagine almost certainly there are government agencies in the US that have a similar remit. And so if they actually could keep that kind of information secret, then maybe the companies would be more happy to share it with people who specialized, basically, in reading and comprehending this data and figuring out what to do with it.
Ajeya Cotra: Yeah, I think that could be a solution, but I'm a little skeptical. I think that releasing this information publicly is probably a lot better than releasing it just to a government body, basically because we're building the plane of AI safety research as we're flying it. It's not like there's a box-checking exercise that some government agency, which is often understaffed, especially with technical staff, could do. It's more that we want this information out there in the open, and then we want people to do some involved analysis of it. And our sense of what information we even want is probably going to be shifting over time, and it'll probably go better if there's a robust external scientific conversation about what indicators we want to see, what they would mean, and when we should trigger alarm. If that's all being routed through governments, with ten people, or even fifty people, who have to deal with it, I think it would be very hard for them to interpret the evidence quickly enough and well enough, be confident enough to sound the alarm, and then have people actually listen to them. If I imagine sounding the alarm on something like the intelligence explosion, I picture it having to be a society-wide conversation, kind of like sounding the alarm about COVID. Or, something I have in my mind is when Joe Biden had that disastrous debate performance that led to weeks of conversation that ultimately led to him being removed from the ticket. It would have been very hard, I think, for a small, narrow group of people entrusted with the authority to make the same thing happen.
Rob Wiblin: Because I guess you want common knowledge, and you want lots of attention focused on the issue, rather than just some technocrats being aware.
Ajeya Cotra: As well as the opportunity for a bunch of technical experts who may not be paying that much attention now, because maybe they think this stuff is all science fiction, to jump in at that moment and offer their takes. And I think it would be very powerful if someone like Arvind Narayanan, who's known for being very skeptical of these stories, actually looked at the data and changed his mind and said, "Oh yeah, this is happening now, and it's dangerous." It's very hard to get those kinds of common-knowledge dynamics if everything is just sent to governments. That said, of course, I think sending things to governments is better than not sending it anywhere, so I also think that's good.
Rob Wiblin: So inasmuch as plan A would be for them to share this information such that anyone in the public can find out, I guess they'll probably resist any legislation imposing this to some extent, and partially for the legitimate reason that it is probably going to be frustrating for them. Inasmuch as people are trying to set priorities for what sort of asks to make and which fights to pick, how high on the list would this be for you?
Ajeya Cotra: I laid out a whole spectrum of ideal information-sharing practices, and I don't think going all or nothing on that whole package is a top-priority fight to pick. But the algorithm of thinking really hard about what pieces of information we would want in order to know for ourselves if the intelligence explosion was happening, and then getting the highest-value items on that list, or the biggest bang-for-buck items on that list: to me that feels very high priority. And I think that's the strategy that people working on AI-safety-related legislation have landed on. So the RAISE Act in New York and SB 53 in California are both quite transparency oriented, and both oriented around, for example, whistleblower protections, which are an important policy plank underlying transparency.
Rob Wiblin: Do you think that information about an emerging intelligence explosion might just leak out to the public anyway, because staff at the companies would feel uncomfortable with it proceeding in secret?
Ajeya Cotra: I think that's very plausible. I still think that information that leaks in the form of rumours at, like, San Francisco tech parties doesn't have the ability to impact policy and decision making all the way in DC or London or Brussels in the same way as information that is clearly unrefuted and very salient and sort of official. So I think the AI safety scene in the Bay Area has benefited from having close social ties to people who work at AI companies, giving us a sense of what might be coming around the corner. But that's not something you can use to really pull an alarm or advocate for very costly actions. So I think it isn't really enough; we need more.
Rob Wiblin: So let's imagine that, via whatever mechanism, society does get a heads up that we are starting to see the early stages of an intelligence explosion. What would we do with that heads up?
Ajeya Cotra: Yeah. So I think one just extremely important factor is: at that point in time, how good are AI systems at everything besides AI R&D? So the alarm has sounded, and we learn that AI has fully or almost fully automated R&D at the leading AI lab, perhaps all the AI labs. This is causing those labs to go way faster than they were going with mostly human-driven progress in the previous era. So at that point in time, whatever AI progress you thought was going to be made by default in the next ten years, or the next twenty years, or the next thirty years, might be made in a year or two, or even six months, depending on how much AI is speeding everything up. So at this stage, AIs might not be that dangerous, but we might be about to move very quickly from the point in time where they're not so dangerous to the point in time where they have sort of godlike abilities. And I think what we want to do as a society, if we gain confidence that we're at the starting point of this intelligence explosion, is to redirect as much of that AI labour as we can from further AI R&D to things that could help protect us from future generations of AIs, both in terms of AI takeover risk and also in terms of a wide range of other problems that might be created for society by increasingly powerful AI. And at that point it's still not in the narrow, selfish interests of whichever company is in the lead to do that, because if they were to slow down unilaterally, then someone behind them could catch up. But hopefully, if the alarm has sounded and we have a clear picture that we have six months or twelve months or eighteen months until radical superintelligence, then this might be a window of opportunity to coordinate to use AIs for protective activities instead of further AI capability acceleration.
Rob Wiblin: So the challenge we have is that AI is becoming much smarter very, very quickly, and we feel very nervous about that. But the opportunity that's created is that we have AIs — a lot more labour, and much smarter potential researchers than we had before. So why don't we turn that new resource towards solving this problem that, at the moment, we don't really know how to fix? It's a little bit like — I think some people who are not too worried about AI look at society as a whole, look at history, and say: well, technology has enabled us to do all kinds of more destructive things, but we don't particularly feel like we're in a more precarious situation now, or at much greater personal risk now, than in 1900 or in 1800, because advances in destructive technology have been offset by advances in safety-increasing technology, and on balance things have probably gotten safer. And so the idea is: it's going to be a vertiginous time, but perhaps we could pull off the same trick in this crunch time period.
Ajeya Cotra: Yeah. And I think a lot of people who are more concerned about AI risk are very dismissive of this plan. It sort of sounds like a crazy plan — really flying by the seat of your pants, expecting the thing that's creating the problem to solve the problem. But in a sense, I do think humanity has repeatedly used general purpose technologies that created problems to solve those problems. Take automobiles, something as mundane as that: cars created the opportunity for there to be carjackings and drive-by shootings, and they empowered bad actors in various ways. But of course, if the police and law enforcement have cars as well, that is a balance. When you imagine a future with some crazy new advanced technology and you imagine all the problems it creates, it can be hard to imagine, with the same level of detail and fidelity, all the responses to those problems that are also enabled by that technology. So you could imagine someone worrying about the rise of fast vehicles and neglecting to think about all the ways the bad things they cause could be kept in check by people using vehicles for law enforcement and similar. And similarly with computers: you can hack things with computers, but computers also enable you to do a lot of automated monitoring for that kind of hack, and automated vulnerability discovery, and different kinds of law enforcement — you couldn't imagine a police force not using computers. So I do think the basic principle is sound: if you're worried about problems created by a technology, one of the first things on your mind should be how you can use that new technology to solve those problems. But I think this is an especially narrow window to get right. You're not imagining cars creating broad-based rapid acceleration of all sorts of new technologies, and potentially just a twelve-month or two-year or six-year window before everything goes totally crazy. So I do think it's important not to blow through that window — to monitor as we're approaching it and to monitor how long we have. But yeah, I'm fundamentally fairly optimistic about trying to use early transformative AI systems — early systems that automate a lot of things — to automate the process of controlling and aligning and managing risks from the next generation of systems, who then automate the process of managing those risks from the generation after, and so on.
Rob Wiblin: Yeah, it's interesting that you say this approach has often been dismissed, because I feel it's very in vogue now — I hear about this proposal every couple of days; someone presents it or I read something about it in one guise or another. I guess one reason why in past years it might have felt unpopular is that people were mostly focused on the issue of misaligned AI. They were concerned about an AI that has it in for you and would like to take over if it had the opportunity. And that's maybe the worst application of this out of all of them, because you're asking the AI to align itself, but you don't know whether it's assisting you or trying to undermine you. You could try to make that work — people have suggested proposals where you could try to get useful, honest work out of an AI that doesn't want to help you — but it's a lot easier to see how you could solve problems other than alignment. Like, if you assume the alignment part is something we feel we've got a good handle on, there's still a huge list of other problems being created during the intelligence explosion — like the fact that AI, if people get access to it, could invent other kinds of destructive technologies that we don't yet have good countermeasures for. In that case it's just clear how the AI could help you figure out what the countermeasures ought to be.
Ajeya Cotra: So I don't think I agree with this. I do think misalignment — the prospect that these early transformative AIs are misaligned — is a huge obstacle to this plan that needs to be shored up and specifically addressed. But I don't think it necessarily bites harder for getting the AIs to do alignment research than for getting the AIs to do anything else helpful, because if they have it out for you, they don't necessarily want to help you shore up your civilization's defenses. So if you're imagining trying to get a misaligned AI to help you with biodefense: if it's misaligned, then it, for example, wants the option of threatening you with a bioweapon in its arsenal in the future. It would have a similar incentive to do a bad job at that as it would to do a bad job at alignment research. So in general, there's one big concern, which is: will the AIs that we're trying to use at that point in time have motivations that give them incentives to undermine the work we're trying to get them to do? And they certainly would have incentives to undermine alignment research if they were misaligned. But I think they would also have incentives to undermine efforts to make ourselves more rational and thoughtful — AI for epistemics — because if we're more rational and thoughtful, then maybe we'll realize they're probably misaligned, and that would be bad for them. They would also have an incentive to undermine our d/acc-style defensive efforts, because that would make it harder for them to take over.
Rob Wiblin: That makes sense. I think the distinction I was drawing is that for people who thought the alignment problem was extremely hard to solve, and that we were way off track to solving it, the idea of getting the AI to solve the problem is kind of self-contradictory — because I wouldn't trust the AI at all; anything it proposed, I would assume was sabotaging us. Whereas if you're on the side of thinking the alignment problem is actually the easier part — a relatively straightforward technical problem that we're on track to solve — but there's this laundry list of ten other issues, then it's very obvious: we'll have this brilliant AGI, so why don't we just use it to solve all the other things? And also, I'm inclined to trust it and believe it.
Ajeya Cotra: Yeah. I do think that if you're not worried about alignment at this early stage, everything becomes easier; it becomes an even more attractive strategy and path. But I think the canonical "using AI for AI safety" or "using AI for defense" plan does imagine that we're not sure at the beginning that they're aligned. We may not be highly confident that they're extremely misaligned and fully power-seeking and looking to take over at every opportunity, but we're not imagining that we know with confidence we can trust them. So figuring out how to create a setup where we use control techniques and alignment techniques and interpretability and whatever other tools are at our disposal, to get to the point where we feel good about relying on their outputs, is a crucial step to figure out. Because otherwise it either bottlenecks our progress — because we're checking everything all the time and slowing things down — or it doesn't bottleneck our progress, but we hand the AIs the power to take over.
Rob Wiblin: So which specific problems arising from the intelligence explosion are you envisaging wanting to get the AGI to help us out with?
Ajeya Cotra: Yeah. One obvious one is just AI alignment: how can we ensure that these AIs we're using to help us right now, and the future generations of AIs that they help us create, and the future generations that those AIs help us create — that that whole chain is motivated to help humans, is honest, is basically doing what we say, and is steerable? That is sort of the foundation of everything else. But then there are also other things that are not really about AIs at all, that are just about broad societal defenses. So if we think the advent of extremely powerful AI will create a flood of new cyber vulnerabilities that are quickly discovered in a bunch of critical systems — weapons systems, the power grid, and so on — can we preemptively use those same AIs that are good at finding those vulnerabilities to find and patch them before bad actors can use the AIs to find them? Another thing is biodefense. You had my colleague Andrew on your podcast recently, who talked about his ambitious plan to rapidly scale up detection of novel pathogens, rapidly scale up medical countermeasures when they're detected, and rapidly scale up the manufacturing of PPE and clean rooms and things like that. If we have AI systems that are good at that kind of research problem — and maybe at that point we also have robots, so a lot of that manufacturing itself can be automated and can go a lot faster than if humans had to do it — that would be a big boon to biodefense. And then there are some somewhat more speculative things. You can think of this as a kind of defense — a psychological defense, maybe — but there's stuff around: can we use AIs to make our collective decision making a lot smarter, a lot wiser, a lot better? Can we make it so that we're better at finding truth together? Can we make it so that we're better at coming to compromise policy solutions that leave lots of people happy?
Rob Wiblin: And how do you ensure that advances in AI don't lead to a war between the US and China — that kind of thing?
Ajeya Cotra: Yeah, that too. But even more mundanely, stuff like: over the last ten or fifteen years, social media has led to a degradation of political discourse. Could AI tools help you find the policy, from among the vast space of possible policies, that a large number of people actually like and can credibly put trust in, and so on?
Rob Wiblin: So I interviewed Will MacAskill and Tom Davidson from Forethought earlier in the year, and the organization has a long list of what they call grand challenges, which they suspect are probably all amenable to this kind of AGI labour during crunch time. I think other ones are: ensuring that society doesn't end up locked into particular values prematurely, cutting off our ability for further reflection and changing our minds; the potential use of AI or AGI — inasmuch as it's very steerable and follows instructions — in kind of power grabs by the people operating it; I guess space governance — this question of, if we actually do start to be able to use resources in space, how would we share them? How would we divide them such that, in particular, there's not conflict ahead of time because people anticipate that once you start grabbing resources in space you're on track to become overwhelmingly dominant? There's epistemic disruption, which you mentioned; I guess new competitive pressures — concerns that you can end up in a sort of Malthusian situation if you have competition between many different AIs — and possibly some others that are missing here. But I guess we don't know which of these are going to loom large at that time. Some of them might feel like they've kind of been addressed, or perhaps we were hallucinating issues that aren't so severe. But yeah, there are many different ways we could potentially apply it.
Ajeya Cotra: Yeah, I agree. I think all of those problems that Tom and Will highlighted seem like real problems to me. Maybe my approach would be, from our current vantage point, to lump a lot of that under AI for helping us think better and helping us find solutions that we're mutually happy with — AI for coordination, compromise, negotiation, truth-seeking, that cluster of things. Because take something like the question of space governance — how do we divide up the resources of space? There are some existing factions with an existing distribution of power; no one really wants the sort of destruction that comes from everybody racing as hard as possible to get there first; but there's a complicated space of negotiated options beyond that. And I think AIs could potentially help a lot with that sort of thing.
Rob Wiblin: So you said in your notes that you think this approach is basically what all of the frontier AI companies say their safety plan is, more or less. Is that right?
Ajeya Cotra: Yeah, I would think so. If you look at public communications from at least OpenAI, Anthropic, and Google DeepMind, this sort of jumps out, more or less, in these different cases. In all of their stated safety plans you see this element of: as AIs get better and better, they're going to incorporate the AIs themselves into their safety plans more and more. Some are more explicit than others about expecting some sort of specific crunch time that occurs when AI is rapidly accelerating AI R&D, but everybody is picturing AIs playing a heavy role in the safety of future AIs.
Rob Wiblin: Yeah. What assumptions are necessary for this approach to make sense? Or what kinds of setups could actually just make it a bad plan?
Ajeya Cotra: Yeah. I think fundamentally you need it to be the case that there exists a window of opportunity — before AIs are uncontrollably powerful or have created unacceptable levels of risk — where they are really capable and really change the game for AI safety research. And there has to be some meaningful window of time where you can notice as you're approaching it, and where, even by default, without a crazy slowdown, it lasts at least six months or a year. If you think instead that once your AI hits upon some generality threshold it becomes crazy superintelligent within a matter of days or weeks, this plan doesn't work, because you probably wouldn't even notice before it's too late. And then there can also be unlucky orderings of capabilities where this plan wouldn't work. You could have AIs that are really specifically good at AI R&D and really not good at anything else — not even AI safety research that's very similar to AI R&D. Maybe the only thing they're good at is making it so that future generations of AIs have better sample efficiency and can learn new things more efficiently. Then you could have a period of six months or a year where you know this is happening and you have these AIs, but you're still hurtling towards a highly general superintelligence without being able to use these AIs for anything else, because they're just not good at anything else.
Rob Wiblin: There's something a bit self-contradictory about that, because an AI that's extremely smart but can only improve the sample efficiency of the next model is, in a sense, not very troubling in itself — because it doesn't have general capabilities, that kind of model isn't going to be able to take over or invent other technologies. It's only at the point that it has the broader capabilities, the broader agency, that it's actually able to make problems. But I guess you're saying you could have a long lead-up where that's all it can do, and then at the last stage —
Ajeya Cotra: Yeah, and then at the last stage it might go back to the first scenario I talked about, where the narrow AIs that are just savants at AI R&D hit upon an algorithm, almost in a blind search. Imagine something like AlphaFold: it's brilliant at figuring out how proteins fold, but it isn't broadly aware. You could imagine such AIs, or an algorithmic search process, hitting upon an architecture or a training strategy that then can go foom really quickly. And so in this lead-up you're like: yep, AI is accelerating AI R&D, it's crunch time, we have six months left, we have three months left — but these AIs are not the AIs you can use for anything useful.
Rob Wiblin: Yeah. I guess many of the problems that we'd like it to help with are social issues, political issues, philosophical issues in some cases. What do you think are the chances that — I mean, the companies are working harder to make them good at coding and at AI research than any other particular thing, and I guess those are more concrete, measurable problems than solving philosophical questions. So it seems like it really is a live risk that the balance of capabilities will unfortunately end up being pretty disadvantageous for this plan.
Ajeya Cotra: Yeah, I think the further afield you go from work that looks like doing ML research and software engineering, the greater a penalty there will probably be. The AIs currently are much better at helping my friends who do ML research all day than at helping me, where I do weird thinking and go on these kinds of podcasts and write emails to people and make grant decisions and stuff like that — it's much worse at that stuff. You can see already that it's got a very specialized skill profile. Fortunately, I do think that at least for AI safety, there's a big chunk of AI safety research that does look very similar to ML research. And my friends who are getting big speedups from AI are safety researchers, and they're doing the kinds of work — control, alignment, et cetera — that I think will be some of the most important things you want these AIs to be helping with at the very beginning. But stuff like AI for epistemics, AI for moral philosophy, AI for negotiation, AI for policy design — all that stuff just may not be that good, doesn't necessarily have to be good by default. And that's a big concern with the plan.
Rob Wiblin: I guess another worry would be that the AI models end up being able to cause trouble before they're capable enough to figure out solutions. A classic case there would be: imagine we put a lot of effort — I guess it would be a bit stupid to do this — but we put a lot of effort into training an AI model that's extremely good at developing new viruses or new bacteria, basically changing diseases to make them worse. I mean, there are people who are using AI to develop new viruses — I guess they're using it to develop medical treatments, but that sort of stuff can then be repurposed for other things. But if that sort of highly specialized model arrives first, before you end up with a model that has a sufficient understanding of all of society and biology and medicine to figure out what the good countermeasures are, then we need a different approach than this one.
Ajeya Cotra: Yeah. In general, I think of AIs doing defensive labour as a prediction about the world that you want to be thinking about as you make your plans — it's not a guarantee. And in many cases the answer will be to specialize now in doing the kinds of things that might be hardest for the AIs to do then. I think stuff like building a bunch of physical infrastructure to stockpile PPE and vaccines and things like that is a prime candidate for something that just inherently takes a long lead time, and that the AIs might not be that advantaged at at the point when they're good at doing the scary things it's meant to protect against.
Rob Wiblin: Yeah, that was going to be another concern of mine: inasmuch as the AIs are very helpful, you might imagine they're very helpful at the idea-generation, strategizing stage, but they might still be quite bad at actually running a business or actually figuring out how to do all the manufacturing. So they could come up with a great strategy for countering new bioweapons — "Here's the widget that you should use; go and make ten billion of them" — and then you ask, "Can you help us with that?" and it's like, "No, I'm not very good at that."
Ajeya Cotra: Yeah, good luck. I think in general you should expect AIs to be much better at things where there are tighter feedback loops, where you can recognize success after a short period of time. That's one of the reasons why they're really, really good at coding: you can train them on this very hard-to-fake signal of "did the code run after you did whatever you did with it?" And in general, idea generation versus actually executing on a one-year plan has some of this element: you can read a white paper and be like, "Huh, yeah, that's pretty good," push the thumbs-up button, and generate an AI that's pretty good at generating white papers that you think are neat and probably would work. But it's much harder to train the AI to run the team of thousands of humans and robots that are actually executing on the plan.
Rob Wiblin: Why is the crunch time aspect — or, you know, the intelligence explosion taking off — actually even relevant to when we would want to start doing this? Because you might just think: if AI can help us do research or do work to solve any of these problems, then as soon as it's able to do that, we want to do it — whether or not an intelligence explosion is kicking off.
Ajeya Cotra: To some extent that's right. I think the reason I focus so much on the intelligence explosion is twofold. One is that at that point I think we might have a pretty short clock to figure out a bunch of stuff — the default trajectory might look like twelve months to extremely powerful, uncontrollable superintelligence that can easily take over the world. So it changes our calculus: you might want to focus on very short-term things rather than things with long lead times, at least at crunch time if not before. The other thing is that I think crunch time can help alleviate some of the challenges we've been talking about with AIs not being good at the full spectrum of things we want them to be good at. Because sort of by definition, at that point AIs are really good at further AI R&D. And one of the things we could do with AIs that are good at AI R&D, at least in most cases, is to try to direct their AI R&D towards filling out the skill profile of AIs — getting them to be good at some of the types of things we want them to be good at that they aren't so good at right now. So at that point you might have much more capability at your disposal, and it might be much more worth putting in the effort to fine-tune and scaffold and do all these other things to make your AI that's good at moral philosophy, or your AI that's good at biodefense.
Rob Wiblin: So you're thinking about this strategy not just as a description of what other organizations should potentially work on, or of what AI companies are already planning to do, but also, I guess, because you think maybe this should influence what Open Philanthropy plans to do over the coming years. And potentially Open Philanthropy's best play might be to have billions of dollars waiting at this relevant crunch time and then disburse them incredibly quickly, buying a whole lot of compute to get AIs to solve these problems.
Ajeya Cotra: Yeah. I mean, just like how right now eighty-plus percent of our grant money goes to salaries — to pay humans to think about stuff and do research and policy analysis and advocacy and all these other things — so too, in a few years, it might be the case that AIs are better than most of our human grantees, and our money should mostly be going to buying API credits or renting GPU time to get the AIs to do a similar distribution of activities.
Rob Wiblin: So an alternative approach to this would be that at the point we get a heads up that an intelligence explosion is beginning to take place, we do everything we can to pause at that stage — to slow down, basically to arrest that process — so that rather than having to rush, in three or six months, to get the AIs to fix all of these issues, we buy ourselves a bunch more time. Why not adopt that as the primary approach instead?
Ajeya Cotra: Yeah. So I think the plan I described is compatible with pausing right at the brink of an intelligence explosion. In fact, I would hope that we do that, because I think by default, having twelve months to get everything in order is just not enough time. But I think of it as doing two things. One is making the pause less binary. If you think of the default path as almost a hundred percent of AI labour going into further rounds of making AIs better and making more AIs and making more chips and so on, and you think of a pause or a stop as zero percent of the world's AI labour going towards those activities, there's a whole spectrum between zero and a hundred percent. And then I think of it as doing another thing, which is answering the question of what you do in the pause — which is that you do all this protective stuff, and you have these AIs around to do it with. And once you have that frame of making the pause less binary and thinking really hard about what you do during a pause, I think you might often end up thinking: oh, it's worth going a little bit further with AI capabilities, because — especially if we tilt the capabilities in a certain direction — we might at the end of that get AIs that are much better than they are right now at biodefense, while still not being uncontrollable, still not being that scary. And you can imagine a bunch of little pauses and little redirections and so on during that whole period. And I would hope that at some point in the period we do activities like policy coordination and so on that give us longer in this sweet spot of AIs that are powerful enough to help with a lot of stuff, but not so powerful that we've already lost the game.
Rob Wiblin: So yeah, we should probably clarify: although you think this is among our best bets, in an ideal world do you think we would go substantially slower through all of this? Because as good a plan as this might be, we'll really be white-knuckling it, and not be confident that it's necessarily going to work.
Ajeya Cotra: Yeah. So I think that if a really clear early warning sign triggers — telling us we're about to enter into this intelligence explosion, fast-takeoff phase where we go in the space of twelve months from AI R&D automation to vastly superhuman AI — then I would vote at that time for shifting that trajectory to be ten times longer, or even longer than that: trying to make that transition as a society in ten years instead of one year, or twenty years instead of one year. I still wouldn't — and this is maybe a bit of a quibble — I still wouldn't advocate for pausing, then hanging out for ten years, then unpausing, because I actually think slowly inching our way up is better than pause, then unpause, then a jump. But going back to what we said about how your default expectations of trajectories influence what you think should happen: I think the default is going through this in something like one year, and I would certainly rather it be ten or fifteen or twenty years. But I think the frame of using AIs to solve our problems applies regardless of whether you're white-knuckling it in one year, or maybe eking out an extra two months, or you've managed to get the consensus and the common knowledge that allows the world to step through it in ten years.
Rob Wiblin: Yeah, I guess inasmuch as we're slowing down to do something, this is a big part of the thing we're slowing down to do. So this is a big part of the companies' plan for technical alignment. If this doesn't work out, why do you think it's most likely to have failed for them?
Ajeya Cotra: I think if it fails, it's probably most likely to fail because they just didn't actually do a big redirection from using AIs for further AI capabilities to putting a lot of energy towards using them for AI safety. They say this is their plan, but they don't really have any quantitative claims about what fraction of their AI labour — or their human labour, for that matter — is going to go towards safety versus further acceleration at that stage. And they'll be facing tremendous pressure at that point from their competitors to stay ahead. So my guess is that unless they have much more robust commitments than they have right now, they probably just won't be directing that much of their AI labour. So if they have a hundred thousand really smart human-equivalents, maybe only a hundred of them are working on AI safety — which is maybe still more than they had before in human labour, but not that much compared to how quickly things are going.
Rob Wiblin: You're saying unless they have really strong commitments — but I guess other mechanisms would be that it's legally required at that point: the government basically insists that most of the compute go towards this, or at least that most of it not go towards recursive self-improvement. Or I guess the companies could reach some sort of agreement where they say, "Well, we would all like to spend more of our compute on this kind of thing, so we're going to have some contract where we all spend fifty percent of our compute on it" — and then they don't lose relative position in particular.
Ajeya Cotra: Yeah. I mean, I think that particular contract is probably going to run into big —
Rob Wiblin: Antitrust issues. It'd be a little illegal. But yeah, maybe we could carve out an exception to antitrust for this one. I guess a different mechanism, inasmuch as the government is taking a massive interest, is that they could help to try to coordinate this one.
Ajeya Cotra: Yeah, I think that's a possibility. I do think it's a bit tough. This is not the kind of thing it's super easy to make laws about, because it's really not a box-checking exercise. What do you actually do when you write the legislation saying half the compute must be spent on safety rather than capabilities? What do you count as safety research? And how are you enforcing this? Do you have auditors in there asking "What are you working on? What are you working on?" to all the team leads at the companies, and checking off that it's fifty percent safety? I can imagine stuff like that, but I think it would require extremely technically deep regulators that we just don't really have right now.
Rob Wiblin: I thought you might say that the most likely reason for this to fail was that it just turned out that alignment is incredibly hard — you get egregious misalignment even at relatively low levels of intelligence, and we don't really figure out how to fix that early enough to get useful work out of them.
Ajeya Cotra: Yeah, I think that's a possibility. I don't think it's the most likely way it fails, on my views. I think the most likely way it fails is the one I described: they just don't go super hard on it. But I think it's also plausible that they're trying to get the AIs to help with alignment, and the AIs are just misaligned and the control procedures and other things are ineffective — so they deliberately only help with further AI R&D and don't help with alignment and safety and biodefense and all these other things you'd want them to help with. I would hope that at that stage the transparency regime is strong enough that that fact is broadcast really widely, and then that could inspire a change in policy that causes us to slow down. But even in that world, it's a bad world even if we do slow down a lot, because we're just on our own — we have to do this stuff without the AIs' help, because we can't get them to help us. But I'm actually reasonably bullish about control techniques getting early AIs — that are not super galaxy-brained superintelligences — to be helpful for a range of stuff that they're good at.
Rob Wiblin: Another way they could end up not making that much of an effort is if the window is relatively brief, it just takes a long time to get projects off the ground, and they haven't really planned ahead. So they end up debating it back and forth, and by the time they've figured out that they actually do want to do this — I mean, I suppose it's nominally in these various papers, but I wonder whether they're actually thinking ahead about how this would feel, and whether they'll have the decision-making capability to decide to redirect enormous resources towards this other effort.
Ajeya Cotra: Yeah. I do think anything that requires a large corporation to be super discontinuous in something it's doing is facing big headwinds as a plan. So I would hope that they're smoothly increasing the amount of internal inference compute that's going towards safety as the AIs get better and better, so that the jump doesn't have to be huge at that final stage. And that is something that — if we could elicit honest reports without creating perverse incentives — I'd want to know about: how much human labour is going to safety versus capabilities, how much internal AI inference is going to safety versus capabilities, how much fine-tuning effort is going to safety versus capabilities. I think they have a much better shot if they're stepping it up over time on some kind of schedule.
Rob Wiblin: OK. So that's the AI companies, who I guess we're imagining would mostly be focused on this strategy for AI technical alignment. But you've been thinking about this more in the context of Open Philanthropy and what niche it could fill. What would Open Phil need to do if it were, you know, dumping billions of dollars onto this plan — if that became its mainline strategy?
Ajeya Cotra: Yeah. I think for now the biggest thing we need to do is very similar to the biggest thing I think society needs to do to prepare for the intelligence explosion, which is really trying to track where we're at right now in terms of how useful AIs are for the work we do and the work our grantees do. I think pushing ourselves to automate ourselves, and pushing our grantees to automate themselves, and tracking: how good is AI at the stuff Forethought does? How good is AI at the stuff Redwood Research or Apollo does? How good is AI at the stuff our policy grantees do? And just socializing within ourselves that, hey, it's probably a big deal when the AIs start to get really good at any given thing we're funding, and once we start to see signs of life there, we should be prepared to potentially go really big on that. And like you said earlier, I do think crunch time isn't a hundred percent a special thing — we absolutely shouldn't be waiting until crunch time to do anything at all. It's just the prediction that crunch time is the point when a lot of things that were hard to automate before become easier to automate. So if it turns out, for example, that AI is really good at math research — which I think is plausible — then maybe we should be trying to deliberately shift our technical grantmaking towards more mathy kinds of technical grantmaking, because that is an area where you can churn through a lot more; it's just so much more tractable. So I think just having a function that is looking out for these things, and is maybe poking Open Phil and Open Phil's grantees to consider shifting their work towards more easily automatable things, to consider repeatedly testing whether their work can be automated, is a big thing. And then down the line, I could imagine something like even just having separate accounting for the rest of our grantmaking versus grantmaking that is going towards paying for AIs for our grantees. You know, we already pay for ChatGPT Pro subscriptions and ChatGPT API credits for tons and tons of grantees. I think just making it a bit more salient in our minds: what fraction of our giving is going towards that, and do we endorse its size? Is there any place where we should be going bigger, and are we on track? Is the percentage climbing the way we think it should be? Does that seem in line with the way AI capabilities are climbing? If we think crunch time is going to start in six years, are we on track to have inference compute be a large fraction of our spending at that time?
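To make the separate-accounting idea concrete, here's a minimal sketch in Python of the kind of tracking Ajeya gestures at. Every number, date, and function name here is hypothetical — it's just one way to check whether the AI-labour share of spending is climbing towards an assumed crunch-time target, not anything Open Phil actually runs.

```python
# Minimal sketch (all numbers and names hypothetical): track what fraction of
# grant spending goes to AI labour (API credits, GPU time) and check whether
# it's climbing toward where we'd want it to be by an assumed crunch-time year.

def ai_spend_fraction(ai_spend: float, total_spend: float) -> float:
    """Share of total giving that pays for AI labour rather than human salaries."""
    return ai_spend / total_spend

def on_track(current_year: int, start_year: int, crunch_year: int,
             current_fraction: float, target_fraction_at_crunch: float) -> bool:
    """Compare the observed fraction to a straight-line ramp from ~0 at start_year
    up to the target fraction at the assumed crunch-time year."""
    progress = (current_year - start_year) / (crunch_year - start_year)
    expected = progress * target_fraction_at_crunch
    return current_fraction >= expected

# Hypothetical example: $4M of $200M annual giving pays for AI labour today,
# and we guess crunch time might start in six years, at which point we'd want
# inference compute to be, say, half of spending.
frac = ai_spend_fraction(4e6, 200e6)
print(f"current AI-labour share: {frac:.1%}")
print("on track?", on_track(current_year=2025, start_year=2024, crunch_year=2031,
                            current_fraction=frac, target_fraction_at_crunch=0.5))
```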
Rob Wiblin: If I think about this kind of psychologically: I could imagine, if I were leading Open Philanthropy, or I guess if I were one of the donors being advised, and we did have these transparency requirements and we did start getting a sense that an intelligence explosion might be kicking off — I could imagine dithering for a long time rather than deciding to commit billions of dollars towards this. Because there's only a particular amount of money, only a particular size of endowment, and I think I would be very scared that we were going too early, or that this is a bad idea, or that we're going to have egg on our face afterwards — because it turned out there were some early signs of an intelligence explosion, but it's not really going to work out, and then we've spent ten billion dollars and we have nothing left to show for it. You'd feel really bad if you made that mistake. Does that sound like a plausible way for things to go?
Ajeya Cotra: Oh, totally. I mean, I think it's just a very natural institutional dynamic. Even beyond being scared of making a mistake on this front, it's that organizations have particular ways they do things, and there are processes. Right now Open Phil's process for grantmaking looks like: usually someone fairly junior gets an opportunity coming across their desk, either through one of our open calls or through some contact they have; that junior person pulls together some materials to convince their manager it's a good fit; and then that manager convinces someone higher up that it's a good fit. You can have two or three or sometimes four layers of information cascading up the decision-making process that we have in place as an org, and then it's approved. And if the right thing to do is to spend a billion dollars on some particular strain of work that's super automatable — you wouldn't trust some random junior person to make that call. You might need a different process for that, and I don't know what that process would look like, but I think that would be one thing to figure out.
Rob Wiblin: I guess for this sort of incredible scaling of funding and effort to take place, you're going to be incredibly bottlenecked on people — there won't be that many more people involved. So it would have to be the AIs not just doing the object-level work, but also deciding what problems to work on, managing the projects, and overseeing other AIs — basically taking up the entire org hierarchy. Is that the picture that you're envisaging?
Ajeya Cotra: Yeah. So I think there are two possibilities here. One possibility is that by the time it's the right move to dump a bunch of money on crunch-time AI labour, Open Phil itself has already been largely automated. That's actually an easy world, because in that world we just have a visceral sense that AIs are really helpful — maybe we've slowed down our junior hiring and all our program associates are AIs right now, and we are totally transformed as an organization. So the evidence and the conviction to pull the trigger might be easier to achieve, and then we actually have a bunch of labourers: maybe we have a thousand people on the AI team instead of the forty-five we have now, and they can figure out all this stuff much more quickly. But I think the concerning possibility is that there's jaggedness: maybe AI is extremely good at math, and extremely good at technical AI safety, and at certain specific kinds of manufacturing that could be really useful for, say, a PPE play — but we haven't automated ourselves, because it's not that good at doing our jobs, because there wasn't much of that stuff in the training data. We're just not, like, well —
Rob Wiblin: Suited to absorbing the —
Ajeya Cotra: — AI labour, yeah. And it messes things up in a way where, in software or manufacturing, you can put it in a setup where you catch those mistakes, but it's harder on the Open Phil side — you need humans to do that. So we're not very automated; we don't have a visceral sense of "it's time now, this is the moment, the AIs are really, really good, we've got to go big." But it's still the right thing to do to pour a bunch of money into AI labour on these few verticals that are heavily automated.
Rob Wiblin: I think we've maybe been burying the lede a little bit here on what the biggest challenge is for an external group like Open Phil to implement this plan: will you even be given access to the very best models that are being trained? And at this crunch time, when there's a crunch on demand for compute, will you actually have enough chips — will anyone be willing to sell to you — for you to do this kind of work? Can you go into that?
Ajeya Cotra: Yeah. So I think there are two challenges here to getting access to enough labour as an external group. One is whether they will even sell to you. Like I said earlier, in AI 2027 and a lot of stories of the intelligence explosion, you get to a point where one company has pulled far enough ahead of its competitors that it keeps its best internal systems to itself and only releases systems that are considerably worse than its internal frontier — just good enough to stay ahead of its competitors' released products. There can be a growing gap between how intelligent the best internal systems are and how intelligent the best externally accessible systems are, and the AI company may deliberately choose not to sell to willing customers because it wants to keep its secrets to itself. The other possibility is that they might be willing to sell to you, but the price might just be way too steep, because the opportunity cost of using that compute to sell to you — for whatever you want to do with it — is training further, more powerful AIs, and they might be willing to pay quite a lot for that. So I think both are challenges. The second one is in some sense more straightforward to address: you try to hedge against this possibility by having some portion of your portfolio really exposed to compute prices. In the extreme case, maybe that looks like just having GPUs yourself that in peacetime you rent out to other people doing commercial activity, but during crunch time you redirect to doing AI labour — although in that case you'll furthermore have to figure out how to get the latest AI models onto those chips that you own, so you might have to cut deals to make that happen. But in less extreme cases, you might just purchase a bunch of Nvidia, or a bunch of liquid public stocks that are exposed to AI, to make it more likely that you can afford AI capabilities at the time.
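As a rough illustration of why that hedge helps, here's a minimal sketch with made-up numbers. The function name, the five-times price run-up, and the portfolio split are all assumptions chosen for illustration, not figures from the conversation.

```python
# Minimal sketch (made-up numbers): how holding compute-exposed assets can
# partially offset a run-up in the price of AI labour at crunch time.

def compute_budget_after_runup(cash: float, hedge_value: float,
                               price_multiplier: float, hedge_beta: float) -> float:
    """Units of today's compute the portfolio can buy after prices run up.

    hedge_beta is how strongly the hedge asset tracks compute prices
    (1.0 = moves one-for-one with the price of compute).
    """
    hedge_after = hedge_value * (1 + hedge_beta * (price_multiplier - 1))
    return (cash + hedge_after) / price_multiplier

# Suppose compute prices 5x. An unhedged $10B endowment then buys only $2B worth
# of today's compute; moving $6B into an asset that tracks compute prices
# one-for-one preserves much more purchasing power (~$6.8B in today's terms).
print(compute_budget_after_runup(cash=10e9, hedge_value=0.0, price_multiplier=5, hedge_beta=1.0))
print(compute_budget_after_runup(cash=4e9, hedge_value=6e9, price_multiplier=5, hedge_beta=1.0))
```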
Rob Wiblin: So there could be a huge run-up in the price of GPUs or compute at this time, but you can partly hedge against that possibility by having most of your investments be in Nvidia or other companies that sell GPUs, so that if their price goes up, you benefit on the investment side, and that helps to offset the increasing price. OK. And then on the software side, there's a question of whether you have access to the very best models being trained. On the one hand, there's the story you can imagine where the companies are very close together, the models are roughly the same, margins are very low, and they're very keen to put out models as soon as possible in order to remain competitive. On the other hand, you could have one leader that starts to keep things all secret. Do you have a particular take on which of these scenarios is more likely to come about?
Ajeya Cotra: yeah i think that at least at the beginning part of crunch time like when the AIS are are just starting to automate a lot of AI RND my my bet is that things will at that point be relatively commercial relatively open. the leading few companies are within you know a month of each other in their capability frontier. or maybe it's like hard to say who's in the lead because like one company specializes in like one like you know their their model is like a little spiky on like pre training and another company 's model is a little spiky on software engineering or something like that. and i think that the the reason i think that is is basically just because it's kind of what like a naive econ one O one model would predict would happen. it seems like these companies don't have big moats and it also seems like what we've seen happen over the last few years. like it?
Rob Wiblin: It kind of describes the present day more or —
Ajeya Cotra: — less, it describes the present day. And that's a change from a few years ago, where I do think OpenAI had way more of a lead, and it seemed more plausible that there would be a monopoly or a duopoly. But there are reasons to push in the other direction, which is basically that if you have a superexponential feedback loop — a bunch of actors growing at an increasingly rapid rate, first at two percent, then at four percent, then at eight percent — and they don't interact with one another, you do get a winner-take-all dynamic: if they're growing on the same growth curve but one gets to a particular milestone first, then the leader gets more and more powerful and wealthy relative to the laggards. This is in contrast to exponential growth, where if everyone is growing at two percent forever, then the ratios between more and less wealthy nations or companies stay fixed. So there is a reason to think that specifically around the time of the intelligence explosion, gaps will begin to grow again. But I think probably around the start it will most likely be the case that you can buy AI labour if you can afford it — you can buy API credits, you can go on chatgpt.com. And then I have a lot of uncertainty about how it evolves from there.
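To make the contrast concrete, here's a minimal sketch with purely illustrative numbers. The fixed-rate case matches Ajeya's "everyone grows at two percent forever" point, where the leader-to-laggard ratio never moves; the feedback case is one simple way to model growth that speeds up with an actor's own size, where the same head start keeps compounding.

```python
# Minimal sketch (illustrative numbers, not a forecast): compare how the gap
# between a leader and a laggard evolves under plain exponential growth versus
# a superexponential feedback loop where capability feeds back into growth.

def exponential(x0, rate, steps):
    """Everyone grows at a fixed rate forever: ratios between actors stay constant."""
    x = x0
    for _ in range(steps):
        x *= (1 + rate)
    return x

def superexponential(x0, k, steps):
    """Feedback loop: an actor's growth rate rises with its own size, so a small
    head start compounds into a widening lead (winner-take-all dynamic)."""
    x = x0
    for _ in range(steps):
        x *= (1 + k * x)
    return x

STEPS = 40
HEAD_START = 1.2  # the leader starts 20% ahead; purely illustrative

exp_ratio = exponential(HEAD_START, 0.02, STEPS) / exponential(1.0, 0.02, STEPS)
sup_ratio = superexponential(HEAD_START, 0.02, STEPS) / superexponential(1.0, 0.02, STEPS)

print(f"fixed-rate exponential: leader/laggard ratio stays at {exp_ratio:.2f}x")
print(f"size-dependent feedback: leader/laggard ratio grows to {sup_ratio:.2f}x")
```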
Rob Wiblin: Yeah. What do you think is the chance that the leading company will try to keep the level that they're reaching secret?
Ajeya Cotra: I think it depends a lot on the competitive landscape they face. Basically, if the other companies are really far behind, then I think there's a pretty strong incentive and reason to keep your capabilities secret, because you give up quarterly profits — but maybe you don't care about that, because you're running on investment money anyway. And if you can get your AI to help you make better AI, to help you make better AI, and so on, you could emerge with a superintelligence that might give you power that rivals nation states, or the ability to decisively control how the future goes. And that might be very attractive to a sort of power-seeking company. I do think it involves forgoing short-term profits, though, which matters if competitors are close at your heels and your investors are breathing down your neck to deliver quarterly earnings.
Rob Wiblin: And I guess you can't go and tell all of your investors, "Oh, don't worry, we have a superintelligence," because then word will get out.
Ajeya Cotra: Well, and also, in this case your plan is to screw over the investors: your plan is to create a superintelligence and take over the world, not to pay them back. Maybe they won't like that. There's a mismatch in incentives between the investors and the CEO, and the CEO is being a bad agent to their principal. So basically, the more things look like an efficient competitive market with very little slack, the more the leading company will be forced to provide access to the rest of us.
Rob Wiblin: To what extent do you imagine the companies would be enthusiastically bought in on assisting with this plan? So this strategy is their predominant approach to AI technical safety, and I think even the optimists agree there are other issues that society is going to have to deal with — in fact, the companies say this all the time: that we're going to need a new social contract, that it's going to upend everything, that it's going to be a big deal. I imagine that, inasmuch as they're nervous about the effects the technology is going to have, they'd be very happy if someone came to them with a pre-prepared plan for how we're going to deploy all of this compute in order to solve all these other problems.
Ajeya Cotra: Yeah, I think it's unclear. They certainly have some incentive to be into this, but there are two alternative uses of AI labour that might be more attractive to them. One is power-seeking for themselves: just building up an enormous AI lead over everyone else and then bursting onto the scene with an incredible amount of power and the ability to challenge the US government or nation states. That might be attractive to some people — I think it would be a very evil strategy to pursue, but it's definitely in the water. The other thing is more mundane: it's just using these AIs to make normal goods and services — the products and the media content and the other services that people most want to pay money for. In a short-term sense, it's very similar to how right now we don't spend a huge fraction of society's GDP on biodefense and cyberdefense —
Rob Wiblin: And these other —
Ajeya Cotra: — things, and moral philosophy. That's just not what people want to pay for. And AI is just a thing that accelerates the creation of products and services people want to pay for, and this isn't very high on the list.
Rob Wiblin: I guess most people are not looking to become dictator of the world or to take on huge amounts of power. But the kinds of people who end up leading very risky technology projects are not typical people — they're somewhat more ambitious than the typical person. So I suppose we can't rule that out as a possibility.
Ajeya Cotra: Yeah.
Rob Wiblin: So a possible challenge would be that even if you have an enormous amount of compute, there might just be only so fast you can go, because you require some sort of sequential step — some step that is just bottlenecked in time. I guess people talk about cases where you have to do an experiment that just actually takes a certain amount of time to play out. But more generally, at least with LLMs, for example, they produce one token after another, and having twice as much compute doesn't necessarily allow you to complete an answer twice as fast without limit. How much is that an issue here, inasmuch as we're trying to solve problems in a very short calendar time?
Ajeya Cotra: Yeah, I think that is likely to come up, especially for physical defenses — like manufacturing PPE or scaling up the ability to rapidly create medical countermeasures — and also for social and policy things. So I can imagine that AIs could be very helpful in figuring out what kind of agreement between the US and China would be mutually beneficial and how we could enforce it. But the way human decision making works still probably requires humans from the US and China to come together and talk about it — have a conference or convening — and come to a decision that they ratify and feel good about. And that could be a bottleneck.
Rob Wiblin: Yeah. Are there any other examples of similar bottlenecks? I guess in terms of solving theoretical problems, I suppose you can speed things up enormously by having many different instances of the same model try to brainstorm different solutions and then have them evaluate one another, and that allows you to have many different efforts in parallel.
Ajeya Cotra: I do think that for deep theoretical problems you can speed things up by having efforts going in parallel, but the right solution that's out there somewhere involves multiple leaps, where it's hard to think of the next insight without having the foundation of the earlier insight. So really, even if you have a hundred AIs working in parallel, what will happen is that one of them comes up with the first step of the insight, and then everyone works in parallel on finding the next insight — but you still need to go three or four steps in sequence.
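One way to make that intuition concrete is an Amdahl's-law-style calculation — a framing I'm adding here, not one used in the conversation. The sketch below, with made-up numbers, shows how the inherently sequential chain of insights caps the speedup no matter how many parallel instances you run.

```python
# Minimal sketch (illustrative numbers): why parallel AI instances can't speed
# things up without limit when insights must land in sequence. serial_fraction
# is the share of the work that has to happen one step after another, building
# on the previous insight; the rest can be brainstormed in parallel.

def speedup(n_parallel_workers: int, serial_fraction: float) -> float:
    """Overall speedup from n workers when part of the work is inherently sequential."""
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_parallel_workers)

# If a quarter of the work is a chain of insights that must happen in order,
# even unlimited parallel instances can never exceed a 4x speedup.
for n in (1, 10, 100, 10_000):
    print(f"{n} workers -> {speedup(n, serial_fraction=0.25):.2f}x")
```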
Rob Wiblin: What sort of stuff do we need to be doing in advance? For example, setting up planning meetings ahead of time for diplomats between the US and China? Do we actually need to do that at a very early stage, in anticipation that eventually we might have a deal they'd want to ratify? I guess that sounds a bit crazy, but are there other examples of things you need to do before this all kicks off?
Ajeya Cotra: Yeah. I think in general you want to be thinking about what the AIs at the time would be most comparatively disadvantaged in. They'll have all these advantages over us: they'll understand the situation much better at that point in time than we do now, they'll be able to think faster, move faster, and so on. But what we can contribute now are things that just inherently take a long lead time to set up. That might include physical infrastructure, like the bio infrastructure that my colleague Andrew is working on building out. It might also include social consensus. I think it takes some amount of time for an idea to be socialized in society, to have it as an accessible concept that maybe we should try to create some sort of treaty between the US and China to allow AI to progress somewhat slower than it might naturally, and use a bunch of AI compute to solve all these problems. That kind of thing takes years to become something that's in people's toolkit, in the water, such that they actually think to have the AIs go down that path and figure out the details.
Rob Wiblin: So what should people be doing if they think this kind of makes sense, or it's something they'd want to contribute to? Are there other organizations that should similarly be planning ahead and thinking about how this might look for them? Or could individuals be thinking about how they could adopt this approach for their own particular projects?
Ajeya Cotra: Yeah. In terms of other organizations, I think it would be especially great for government entities to be thinking about adopting AI. I know there are a number of random little pieces of red tape that make it harder for governments to adopt AIs than for anyone in industry. We might end up in a situation where the regulatees, the industry people, have fast cars and the regulators have horses and buggies because of this differential adoption gap. And more broadly, if your company is not already going maximally hard on adopting AI for your own use case, and you work on defenses, AI safety, moral philosophy, all these good things, it's probably worth having a team that's just on the lookout for how you could adopt AI as soon as it becomes actually useful for you.
Rob Wiblin: Let's talk a bit about the career journey you've been on since we last did an interview, two and a half years ago. Back then you were doing general AI research and strategy for Open Phil; this was in 2023. Then in 2024 you started leading the technical AI safety grantmaking, and towards the end of that year you decided to take four months off and take a sabbatical. Tell us about all of that.
Ajeya Cotra: Yeah. So I had been at Open Phil for more than six years before I made my first grant. I was involved in some grantmaking conversations earlier, but the first grant I actually led on was somewhere in mid or late 2023, and I had joined Open Phil in 2016. So it was kind of interesting: if you took the outside view and said, this is a philanthropy that's giving away money, my work there was very strange, because it was mostly thinking about these heady topics and then writing long reports about them that I published on LessWrong. I always felt a little like maybe I should dip into grantmaking, because that is our core product in some sense; it's what we do. But I had always been drawn away by deeper intellectual projects. So even though I vaguely had the thought that I should do grantmaking, it never really happened for me until, actually, I think the thing that pushed me head first into grantmaking was the FTX collapse. So actually, sorry, my first grant must have been in 2022 instead of 2023, because at that point there were hundreds and hundreds of people who had been promised grants by the FTX Foundation where their grant wasn't going to go through, or they were worried it was going to be clawed back, or it was partially not going through. And Open Phil put out this emergency call for proposals for people who had been affected by the crash. I had some thoughts and takes on technical research, and the organization also just needed surge capacity for this emergency influx of grantmaking. So in a matter of maybe six weeks or so, I made something like fifty different grants after not having made any grants at all. That was a really interesting experience. There were elements of it I really liked, but there was also something about the way you made grants where you just really couldn't dig into any particular thing very much, especially in the context of something like the FTX emergency; you had to be making these decisions really quickly. But I felt I had thoughts about how grantmaking, at least in the technical AI safety space, could be done with more inside-view justification for the research directions we were funding than we'd had previously. And so in early-to-mid 2023 I tried to go down that path.
Rob Wiblin: OK, so in 2022 you did this huge burst of grantmaking, trying to help a bunch of refugees from the FTX Foundation, basically. But then you would have noticed that there was probably no overarching strategy behind all the grants you were making, and you were like, we need a bigger-picture idea of what we're actually trying to push on and why.
Ajeya Cotra: Yeah. So I was focused on grants to technical researchers. These were often academics, sometimes AI safety nonprofits, and they would be working on, often, interpretability or some kind of adversarial robustness. They seemed like reasonable research bets. But I felt kind of unsatisfied, and I think this is going to be a theme of my career. I felt unsatisfied that the theory of change hadn't really been ground out and spelled out: how this type of interpretability research would lead to this type of technique or ability, and then how that could fit into a plan to prevent AI takeover in this way, and similarly for any of the other research streams we were funding. This had actually been the big thing that deterred me from getting involved in Open Phil's technical AI safety grantmaking for a long time, even though I was one of the few people on staff outside that team who thought about technical AI safety. It was because, in the end, it seemed like most grant decisions in this 2015-to-2022 period turned on heuristics like: this person is a cool researcher and they care about AI safety. Which is totally reasonable. But I wanted to have more of a story: this line of research is addressing this critical problem, this is why we think it's plausibly likely to succeed, and this is what it would mean if it succeeded. We never really had that kind of very built-out strategy, because it's very hard; it's a lot to invest in building out a strategy like that. But having been thrown head first into grantmaking with the FTX crisis, I was like, maybe I do want to try to take on the AI safety grantmaking portfolio, which at the time didn't have a leader, because all the people who had worked on that portfolio had left by that point, some to go to the FTX Foundation, actually. So it was a portfolio that had been somewhat orphaned within the organization, and it was clearly a very important thing. And I was like, maybe we could approach it in this way that was novel for us in this area: really try to form our own inside views about the priorities of different technical research directions and really connect how they would address the problems we most cared about.
Rob Wiblin: It sounds like you find it unpleasant, or anxiety-inducing, to make grants where you don't have a deep understanding of, not so much what the money is being spent on, but where you don't have a personal opinion about whether it's likely to bear fruit. Is that right?
Ajeya Cotra: Yeah. I think it's a bit nebulous what the standard is that I hold myself to. But for my research projects, when I think about timelines, or how AI could lead to takeover, or how quickly the world could change if we had AGI, I think I can often, with months of effort, get to the point where I can anticipate, and have a reasonable response to and a reasonable back-and-forth with, a very wide range of intelligent criticisms for why my conclusion might be totally wrong and off base. I feel like I know what the skeptics who are more doomy than me will say, and I know what the skeptics who are less doomy than me will say, and I could have an intelligent conversation that goes on for a long while with either side. That is the standard I aspired to get to on why we supported certain grants. I could do that with some of our grants, but I wanted the program to get to the point where, if somebody came to me and said, "Hasn't interpretability actually just not seen much success over the last four years? What do you make of that?", I wanted to be at reflective equilibrium on my answers to questions like that, and to be able to say something that went a bit beyond "yes, but on the outside view we should support a range of things." That is something that is emotionally unsatisfying to me if it's a big element of my work.
Rob Wiblin: It's maybe worth explaining why it is that Open Phil doesn't aspire to get to that level of confidence with most of its grants. Why is that?
Ajeya Cotra: I think it just takes a long time. There are two things. It takes a lot of effort, and then even if you put in that effort, you don't want to fully back your own inside view, and I wouldn't endorse that either. So it's this one-two punch: developing your views about exactly how interpretability or adversarial robustness or control or corrigibility fits into everything is a ton of work. You have to talk to a ton of people, you have to write up a bunch of stuff, and in the meantime you're not getting money out the door while you're doing all this, right? And then having done all that work, where are you going to end up? You're going to end up in a place where there are reasonable views on both sides, it's a complicated issue, we probably want to hedge our bets and defer to different people with different amounts of the pot, and so on. So I think people have a reaction that's very reasonably like: OK, we're going to end up in a place where we've thought it through, it was a lot of work, it's still very uncertain, and we still want to spread our bets. So why not short-circuit all that, spread our bets, and lean on advisors? I have sympathy for that; hopefully I represented that perspective reasonably well. But I just feel like, in my experience, having done the homework really qualitatively changes the details of the decisions you make, in ways that can be really high impact. One thing I'm able to do, having gone through the whole rigmarole of forming views, is work with researchers to find the most awesome version of their idea by the lights of my goals, pitch them on that, and sort of co-create grant opportunities. And I think there are other, more nebulous benefits beyond that, which I maybe won't be great at defending, but I really like operating that way.
Rob Wiblin: So in 2024 you actually took on responsibility for this whole portfolio?
Ajeya Cotra: In late 2023, I guess.
Rob Wiblin: Yeah, 2023. But your personal philosophy of how to operate is somewhat in tension with how Open Phil as a whole tends to operate?
Ajeya Cotra: Just in tension with making a large volume of grants in the short term, I think, yeah.
Rob Wiblin: So what did you end up doing in the role?
Ajeya Cotra: I ended up pursuing a compromise. One thing that just comes with the territory of this role is that there were grantees we had made grants to in the past that were up for renewal, and part of the responsibility of being the person in charge of this program area is that you investigate those renewals and decide whether to keep the grantees on or not. For those grants, I tried to follow what a canonical Open Phil decision-making process would be. So I pursued kind of a barbell strategy for a while. On the one hand, there were either renewals or people who knew us and reached out to ask us to consider grants, where I wouldn't hold myself to the standard of really understanding and defending the proposal on the technical merits, but would lean more on heuristics: this person seems aligned with the goal of reducing AI takeover risk, this person has a broadly good research track record, and so on; and I'd try to make those grants relatively quickly. But then I would also be trying to develop a different funding program, or some grants I really wanted to bet on, where I would try to hold myself to that standard and really write down why I thought it was a good thing to pursue. And it turned out that the second thing basically became making a bet, from late 2023 to mid 2024, on AI agent capability benchmarks and other ways of gaining evidence about AIs' impact on the world.
Rob Wiblin: So sort of the stuff we were talking about earlier, where you're trying to get an early heads-up about whether the AIs are going to be really effective agents. In 2023 we were really unsure how that was going to go. Agents in general have been a bit disappointing, or haven't progressed as much as I expected, or probably as much as you expected. But at that point it seemed like maybe by now they'd be operating computers completely as well as humans, and you really wanted to know if that was the future we were heading for.
Ajeya Cotra: Yeah. So I launched this request for proposals. Open Phil had done technical safety requests for proposals before, but this was by far the narrowest and most deeply justified technical RFP we had put out at that time. We said: we are looking for benchmarks that test agents, not just models that are chatbots; these are the properties we think a really great benchmark would have; and these are examples of benchmarks we think are good and not so good. We had a whole application form that was, in some sense, guiding people, trying to elicit the information about their benchmark that we thought would be most important for determining whether or not it was really informative. And mostly this was: be way more realistic, have way harder tasks than existing benchmarks; even if you think your tasks are hard enough, they're probably not hard enough. There was a lot of push in that direction. So it was a very opinionated, very detailed, and very narrow RFP. We ended up making $25 million of grants through that, and then another $2-3 million through the companion RFP, which was broader: all kinds of information, from RCTs to surveys, about AIs' impact on the world. And I'm pretty happy with how that turned out. It was, as you would expect, a lot of effort poured into one direction. If you were skeptical of this high-effort approach to grantmaking, you could argue that I could have put in way less effort and funded twice the volume of grants across ten different areas, picking up the low-hanging fruit in all of them.
Rob Wiblin: So I guess halfway through 2024 you started feeling pretty burnt out, or like you wanted to take a bit of a break. Why was that?
Ajeya Cotra: Yeah. Right around when I switched from doing mostly research to doing grantmaking, and especially when I was trying to ramp up this program area with its more inside-view, more understanding-oriented approach to AI safety research, Holden, who had been running the AI team up to that point, decided to step away and left the organization. And he was my manager. I had a working relationship with Holden that involved a lot of arguing and discussing the substance of what I was working on. When he left, leadership was stretched more thin, because someone in leadership was gone, and I think the people who remained on the leadership team didn't have as much context and fluency with all this AI stuff as Holden did. So when I wrote up this big memo saying we should do AI safety grantmaking in a more understanding-oriented way and develop inside views, and here's why I think that would be good, what I wanted was for my manager, or leadership, to argue with me about the object level, and for there to be some shared view within the organization about how much this was a good idea, what the pros and cons were, and how much we wanted to bet on it. But I think that was just kind of unrealistic, given the other priorities on their plate and their level of context in this area. So I ended up having to approach it in a more transactional way with the organization. Rather than "let's talk about whether this is a good idea," it was more like, "Well, I want to do it this way," and they were like, "Yeah, we don't know if that's the best way to do things, and we have some skepticism, but you can do that if you want." So I felt kind of lonely. And this is something I learned about myself over the course of trying to run this program and then going on sabbatical and reflecting on it: I really like to be plugged into the central brain of the organization I'm part of. I didn't feel like I had a path to do that. What I had a path to do was stand up this thing, which I tried to do, but it just felt like tough going.
Rob Wiblin: It sounds like you were a bit on your own.
Ajeya Cotra: Yeah, I felt a bit on my own, and I'm not a very entrepreneurial person. I'm ambitious in some ways, but I just really have a high need for constantly talking to other people. I tried to achieve that sense of team by hiring people under me to help me with this vision, but I think I was not very good at hiring and management. Partly it was because the vision was pretty nebulous, and I probably needed to spend more cycles working out the kinks in it by myself, really solidifying what it is and what the realistic version of an understanding-oriented technical AI safety program looks like. So it was very hard to hire, because you kind of had to hire someone who really resonated with that off the bat, even though it wasn't a very well-defined thing. That took a lot of energy. And then with the people I was managing, I have always struggled, and in this case still struggled, with perfectionism in management. I have this long history of trying to get people to serve as writers who write up my ideas, and it never works for me, because they don't do it just the way I want, and I'm myself a pretty fast writer. So working with a writer as their editor, and getting their writing output to be something I'm satisfied with, often ends up taking more time than doing it myself. I found the same happened to some extent with grantmakers: at one point we had a number of people spending part of their time working on the benchmarks RFP, and I think it's possible I would have just moved through the grants faster if it had been only me working on it, which is a bit tough. I think this is a weakness or challenge a lot of new managers go through, and I was going through it at the same time as feeling that the feedback and engagement I got from above me was much less than it had been before, and that I had to prove this new way of doing things. I thought, and still think, that there was a lot to the arguments I was making, but it was not a wild success when I took a swing at it by myself.
Rob Wiblin: So September last year you decided to step away and just take some time away from work, after eight years of working very hard full time. What did you end up doing with that time?
Ajeya Cotra: It was a mix of things. I just did a lot of life stuff and invested more in that: I found a new group house to move into, or started a new group house, which was cool. I did more of just trying to take care of myself. I started an exercise habit; I'm off that exercise habit now, again, so we'll see. And then I did a lot of reflecting on why this work situation ended up being so hard for me, and on my journey through my career as a whole and what the patterns are when things are hard for me. I also just jumped in and helped with some random projects going on. The Curve conference, which brings together AI skeptics and AI safety people and people on all sides of the issue of AI's impact on society, was having its first iteration while I was on sabbatical, so I was able to get involved with that and try to be more helpful than I could have been if I'd had a full-time job, which was really cool. I did some writing; most of it hasn't been published, but it was still good for me to do. But yeah, it went by really fast, honestly. There was a lot of stuff to think about and a lot to do.
Rob Wiblin: Yeah, what sorts of reflections did you have on your career so far, and your motivation, and what had been difficult in 2023 and 2024?
Ajeya Cotra: Yeah. In terms of 2023 and 2024 specifically, I really do feel like I want to be an advisor and a helper to the kind of central organization, and I had been that in many ways over the previous six years. So the transition to being more entrepreneurial, more like I have a little startup making grants in my area and the organization is investing money in me but not necessarily a lot of attention, and I didn't necessarily have a path to make arguments that then influenced things in a cross-cutting way: that was hard. So it was interesting to learn about myself that if I don't have that, I will still gravitate towards trying to meddle in everything else that's going on, and if I don't have a productive path to meddle, I'll feel sad. That was one big thing. Another big thing is just: how much depth do I want? I have a drive to really get to the bottom of something; I'm always thinking about the counterargument and the counterargument to the counterargument. Even when I was very young, I really liked math tutoring, and I really liked math in general, because you could just dig and dig and get to an answer. And that's just inherently an uneasy fit with grantmaking; grantmakers are basically investing like foxes.
Rob Wiblin: Yes, like the venture capital that Open Phil is engaged in, in a way.
Ajeya Cotra: Yeah. So that was also interesting to reflect on. And like I said, somewhat strangely, for my first six or seven years at Open Phil I actually did do rather deep research, even though we were a grantmaking organization; I just wasn't doing grantmaking.
Rob Wiblin: Yes. Is that in part because Holden really wanted this deep research? He wanted to more deeply understand the ideas, both personally and because he thought it was healthy for the organization?
Ajeya Cotra: Yeah, I think that's right. He had a lot of drive and demand for really figuring out timelines, really figuring out takeoff speeds, and exactly what our threat models are for whether AI could take over the world, and for building all of that up. I think he has a lot of the same instinct I have: it's just really good to do your homework, to have the response to the top ten counterarguments and the responses to those responses, and to really know your stuff. So he was the driver of a lot of the work I did. And if you'd rerolled the dice and Open Phil had been run by different leadership, it's probably pretty unlikely we would have gone as deep as we did into doing our own AI strategy thinking, because the thought would have been: well, we should fund a place like FHI, or now Forethought, to do that stuff instead of us.
Rob Wiblin: In your notes you said that you spent a fair bit of time in this period reflecting on what it had been that you liked about effective altruism, as an ecosystem and as a mentality, and what things you didn't like so much about it. Tell us about that.
Ajeya Cotra: Yeah. So I guess it's been a long time since you've talked about effective altruism on the show, so let me open with what it even is. It's this movement, or idea, that you should think explicitly and seriously and quantitatively about how you can do the most good with your career or with the money you're donating, and that different career paths and different charities you could donate to can differ by orders of magnitude in how much good they do. So if you're working on reducing climate change, it could be orders of magnitude more helpful to work on researching green technologies than to work on getting people to turn off their lights more, to conserve electricity in their personal use. And there's this ethos that if you're really taking this seriously, and you really care about helping the world, you stop and think and you do the math, in the same way that if you or your spouse had cancer, you would do the research, figure out which treatments had which side effects and which success rates, and ask a lot of questions of the doctor. There's this ethos that that's what it looks like when you take something seriously. A lot of people, when they're doing good in the world, do what makes them feel instinctively good, and there's a whole other approach where you respect the intellectual depth of the problem. I was really drawn to this. I fell head first into the EA rabbit hole when I was thirteen, so it's been more than half my life that I've been extremely involved in this community and this way of thinking. And there were maybe three big things that I really, really liked about this approach. One is just that EAs sort of challenge themselves to care about people and beings that are very different from them, very far away from them in time and space. Even for the most quote-unquote vanilla EA cause area, global poverty: the vast majority of money given by individuals in rich countries to alleviate poverty goes to helping other individuals in rich countries, even though money could go much, much further overseas, in countries where people have a much lower standard of living. The reason people donate locally is that they feel more affinity for people who are closer and more similar to them. EA also has a lot of strains that challenge people to extend care to animals, to future generations that may live thousands or millions of years in the future, and to artificial intelligence too, if it can be something that has consciousness and can feel pain. That was really appealing to me. But there was also a way of going about doing things that was very appealing to me: they were very nerdy, very intellectual, really thinking things through, and almost innovating methodologically on how we can figure out which charities are better than which other charities. There were lots of interesting arguments thrown around about this. And they were very transparent.
There was just this culture of open debate and admitting your mistakes. GiveWell, an early pillar of the EA movement, had a mistakes page on its website where it just discussed mistakes it had made. They were very honest and high-integrity in an interesting way that doesn't obviously follow from caring about other beings more. For example, GiveWell refused to do donation matching, because donation matching is usually a scam: the big donor would have given that much anyway, even if you hadn't made your donation. So that whole package was really attractive to me. It hit a lot of psychological buttons for me at once, and it really felt like my people and the way I wanted to live my life.
Rob Wiblin: So there's the being more compassionate to a wider range of beings, which I guess is still the case and probably still something you like about the effective altruism approach. But there was also going into enormous intellectual depth and really debating things out. And then there's also the very high integrity about honesty: not allowing any chicanery whatsoever.
Ajeya Cotra: Not even the hint of chicanery. An extremely fastidious and exacting level of integrity that other movements, even other pretty high-integrity movements, weren't aspiring to.
Rob Wiblin: Even beyond what people were even asking for.
Ajeya Cotra: But you just proactively say, "By the way, did you know donation matching is a scam? That's why we're not doing it," even though we would get more donations to help poor people. It's interesting that that was such a natural part of the early EA movement, even though you're sort of giving up on impact, you know?
Rob Wiblin: Yeah, it's not necessarily implied. I guess it's a practical question whether it is or not. So as things evolved, you found that the second one, the intellectual depth, was now lacking from your job. Were there other things that were changing that made you less enthusiastic?
Ajeya Cotra: Yeah. The intellectual depth was very much there in other parts of the EA ecosystem, especially AI safety, thinking through exactly how you would control early transformative AI systems and things like that. And like I said, my heart was always pulled towards those kinds of questions, even though I worked at a grantmaking organization.
Rob Wiblin: Yeah, it feels like on some level you really were a more natural grant recipient: you should have gotten something to really go deep on some questions.
Ajeya Cotra: Yeah. I mean, I think if I had graduated college in 2022 instead of 2016... In 2016, when I graduated college, I went to GiveWell, and a big part of why I went there at the time was that they had the most intellectual depth on the question of what the best charities are. If I had graduated college in 2022, I probably would have done MATS, which is this program to upskill in ML and AI safety research, and then tried to join an AI safety group. So I think I'm naturally drawn to actually doing the research, in some sense. In that sense it was a sort of mundane issue: my job, especially after Holden left and the demand for that kind of research evaporated a bit at the leadership level... if I were to start over again, I probably wouldn't have applied to join Open Phil; I probably would have applied to join an AI safety group. But then there's the third thing, this extremely, almost comically high level of integrity that I really, really liked, which was also eroding over the years. When I think about why, I think that when a lot of the focus of the EA movement was on convincing really smart people to donate differently, being unusually high integrity was actually a really valuable and powerful asset. Obviously people like me, and the very wealthy people who were early GiveWell donors, really liked that GiveWell had a mistakes page and really liked that whole ethos and that whole package. It helped them trust that the recommendations were actually real recommendations, that they weren't being spun something or sold something, like in the rest of the charity recommendation ecosystem. But when you move away from that being your primary method of change, when instead you've attracted quite a lot of funders and you're now trying to use that money and the talent you've attracted to achieve things in the world, maybe things that involve a lot of politics, then being extremely transparent can be very challenging, especially because donors want privacy, or, if you're running a political campaign, you don't want your opponents to know exactly your strategy and the ways you think you might have made mistakes. This is just not how most of the real world works, you know?
Rob Wiblin: Yeah, it's not the case that the world's most impactful organizations are consistently incredibly transparent, or even incredibly high integrity.
Ajeya Cotra: Yeah. And so there was this tension. I felt like I should only care about the goals of EA; what EA told me, and it kind of made sense to me, was that the point here is to help others as much as possible, and the point is not to conform to an aesthetic or do things in the way that feels cleanest or prettiest. But at the same time, I think I was to some extent kidding myself about how much of my own motivation and attraction to the concept came from just the goals, just pillar one, the altruism, versus pillars two and three: the intellectual depth and intellectual creativity, and this crazy high level of openness and transparency, having absolutely nothing to hide, letting all comers come. As a fact about my psychology, the latter two things were actually really important for my motivation, and over time they became smaller and smaller features of what it was like to do EA, to pursue EA goals in my career.
Rob Wiblin: Yeah, I guess we should say, for people who don't know, that over this period the environment Open Phil was operating in became a lot more challenging and a lot more hostile. For years it had been funding all kinds of AI-related stuff, but as AI became a much bigger industry, it became apparent what sorts of concerns different people had. Its work in some ways started to clash with very large commercial interests, potentially, and also with alternative ideologies that had different ideas about how things ought to be regulated or how things ought to go. And so we're now in a world where there were people who would sit down and think, "How can I fuck with Open Phil? What can I do to give these guys a terrible day? What have they published that we could spread that will be embarrassing for them?" In that kind of environment, where people literally want to cause trouble for you, it's a lot less attractive to be maximally forthcoming about all of your internal deliberations and why you made all of your decisions. All of us would potentially be a bit more conservative in that kind of environment.
Ajeya Cotra: Yeah. I mean, even before the latest round of AI policy heating up that started in 2023, Open Phil compromised a lot on its initial wild ambitions for transparency. At the beginning there was this idea that we would publish on the grants we decided not to make, and explain why we decided not to make them when people came to us for grants.
Rob Wiblin: Most organizations don't do that.
Ajeya Cotra: There's a reason most organizations don't do that. For our earliest two program officer hires, we have a whole blog post we wrote about their strengths and weaknesses as candidates, the alternatives we considered, and how confident we were that it would work out. And we stopped doing that. So there was a level of transparency that, in my heart, I still want, but it's absolutely insane. And then I think the adversarial pressure you mentioned makes it so that Open Phil, as an organization that funds a lot of this ecosystem, has a lot to lose. If we go down, a large number of helpful projects have a much harder time getting funding. We have to be a lot more risk averse than many of our grantees, even though those grantees are also facing an adversarial environment. The way many of them navigate it is to fight back, explain their perspective, and define themselves in the public sphere. My instinct is to do more of that, to say more and respond, but it's harder to do that from Open Phil's position, for a number of reasons.
Rob Wiblin: So over the years a lot of people, usually critics, have said that effective altruism has some things in common with religious movements. To what extent have you found that to be the case, and to what extent not?
Ajeya Cotra: Yeah. I think EA aspires to be, and very much succeeds at being, a lot more truth-seeking than the world's religions, and than a lot of other communities and movements in the world. So in that sense there's a disanalogy that's extremely important. But I do think it's not a bad analogy in some ways, because for people who are really deeply involved in the EA community, it provides a map of the good life; it's a vision of what it means to be good and have a good life. It's unlike a political movement in that it doesn't just have a set of policy prescriptions for the world, but like many religious movements, it intersects with politics: there are people who approach political questions, like whether you should ban gestation crates for pigs, through the lens of their commitment to EA. And it's not just a community, not just a social club. People get solace and friendship from their local community of EAs, like people do from their local church community, but it is more than that. It is trying to say something about the sweep of the world and your place in it, and what it means to live a good and meaningful life, and it intersects with politics and community and a bunch of other things while not being exactly the same as any of them.
Rob Wiblin: Yeah, I would think a key way that it's not like a religion is that in many respects it feels more like a business to me, or a startup, or an organization with quite a functional goal. Or I guess that's a different aspect of it. Some people like the ideas, they like the blog posts, and they don't engage with the community whatsoever, and for them it's going to be a different experience. And there are people who like the community. Actually, there are many people who participate in the community, who would say "I'm involved in effective altruism," but who are not that interested in the projects, or necessarily even in the effort of helping people. So people sample the aspects they like. But for many of the people who work in organizations staffed by others who would say "I'm really into effective altruism," it's much more pragmatic, I would say.
Ajeya Cotra: Yeah, I think that is how it ends up manifesting for a lot of people. But I don't think that's really what EA is. I think it's a mistake to collapse EA into a set of three or four goals in the world: reducing suffering of animals in factory farms, plus improving quality of life for poor people in developing countries, plus AI safety. In some ways people think of EA as an umbrella for those three things, and then those three things are basically professional communities pursuing fairly well-defined goals. But I think EA is more like a way of looking at the world, a way of thinking about the good. You can take an EA approach to cause areas that are in some sense more parochial than the big three EA cause areas. You can absolutely take an EA approach to US policy from the perspective of the welfare of US citizens, doing rigorous cost-effectiveness analysis of which policies actually help and don't help, and a lot of people do. And then there is EA as a generator of new cause areas that could get added to the canon. I think right now there's a bunch of fertile ground around whether EA could be a force that helps society prepare for radical change from advanced AI, where AI safety is one big important thing, but there might be a range of other issues, and you might want to prioritize some of those based on your values and your sense of how things will play out.
Rob Wiblin: So you wrote in your notes that, at least from your personal point of view, EA wasn't enough like a religion, or it wasn't as much like a religion as you might personally have liked. Explain that.
Ajeya Cotra: I think I'm someone who just really benefits from structure and from emotional motivation and reinforcement. I also very much tend to socially conform a little bit, or to try to achieve the ideal of the community I'm in. And the ideal of my corner of the EA community is, like you said, to have a really impactful job, do a really good job at it, and work a lot of hours at it. That's the message you get from the community, and that's what I try to do. But I personally would have liked a bit more of a spiritual angle to the community. If you read my colleague Joe Carlsmith's blog, I get some of that: existential reflection about our morality and our values, and this crazy thing that so many EAs believe, that in a matter of a decade or two we might be in an utterly transformed world that might be, relative to this vantage point, utopic or dystopic, and just grappling with that. I think if there had been an EA church, where every Sunday someone who's really good and thoughtful about these issues spoke about them and led a discussion, that would have been very enriching for my life and probably ultimately made me higher impact. But that's just not how the EA community is structured, and it's sort of deliberately not structured that way, because for the professional community aspect of EA, you really want to not care whether people believe the deepest teachings and philosophical orientation. You really want to just be like: if you're doing great AI safety research...
Rob Wiblin: Great, do great AI safety research.
Ajeya Cotra: So the incentives of a professional community pull against what I might personally want here.
Rob Wiblin: Yeah. It sounds like you think that, while it might have been more appealing to you, it's not necessarily actually better for things to go in that direction? I mean, for me personally, I kind of like the more professional, more limited aspect of it, because you kind of just want to be able to go home and not have to think about this stuff all the time.
Ajeya Cotra: Whereas I want to go home and think about it in a different way. I already go home and think about my work all day; I frequently have insomnia where I think about my work. And instead of thinking about the next Google Doc I need to write or the next email I need to send, I would like to be, I don't know, spiritually marinating. Yeah, exactly.
Rob Wiblin: Yeah. I mean, I guess people have a range of views, but it's clear why many people have not embraced that, or have been keen for a stronger division from this sort of thing, which can be very stressful.
Ajeya Cotra: And it can be very dangerous and culty; there are a lot of reasons to worry about it. But I do think there is just a large contingent of EAs who are like me in wanting some sort of spiritual grounding. Joe Carlsmith's blog is extremely popular with hardcore EAs. It's not a generically popular blog, or it's reasonably popular, but there are a number of people who are like, "Oh wow, this is really nourishing something in me that I didn't realize I needed."
Rob Wiblin: You wonder whether there's an age thing here a little bit as well. I feel like when I was younger, I noticed that the older people were less interested in this aspect of it. And now I'm in the older class, and I'm like, well, I have my family to provide nourishment, and that's absorbing a lot of time and energy that I don't have left for attending church or whatever else it might be.
Ajeya Cotra: That's kind of interesting. Do you feel like you had some sort of spiritual hole that was filled specifically by having a child, or were you always just not that interested in this?
Rob Wiblin: Yeah, I mean, I think of myself as a deeply unspiritual person, so something like that wasn't really an itch I needed scratched. Earlier on I was maybe more interested in the social scene, to make good friends and meet people. Having made more friends who I think of as like-minded and who I have a lot of common interests with, that's not as interesting anymore either. I've already got my friends, and now I'm just going to ride it out.
Ajeya Cotra: I actually thought of myself as an extremely unspiritual person; I had a lot of disdain for spirituality when I was twenty. So for me the age thing has gone the other way: I want more and more of a religion-shaped thing in my life as I age. When I think about why, I think it's because when I was twenty, I had unrealistic aspirations for my worldly projects. By that point I'd already been an EA for six or seven years, but I was just starting to try to do EA things in the world, and I had this sense that this is obviously correct, this is obviously great, everyone who's good and reasonable will get on board with it, and we'll just solve poverty and solve factory farming. I wouldn't have exactly said that, but I had that inner vibe, and I would go around being like, "Have you heard the good word about EA?" And as I've actually done things in the real world, I'm like: everything is very hard and slow, and the feeling of doing my job, which involves writing these Google Docs and sending these emails, is just not automatically connected to my higher aspirations. There's a long grind and there's a lot of failure. So I have an increasing demand for some separate thing that is specifically trying to reorient me mentally towards the bigger picture.
Rob Wiblin: Yeah, for me the bottom line there is that working on this stuff can be quite stressful and quite tiring, and I want to completely check out, stop thinking about it, and just be with people and talk about other issues. Like I said, different strategies.
Ajeya Cotra: I think I probably want some of both. I now live in a group house with a couple of little kids, which is really great, and it's good for that. But I find, unfortunately, that it takes a lot to pull my mind away; even when I watch TV, I'm thinking about other stuff in the background.
Rob Wiblin: Yeah. So I think during your sabbatical you considered going independent and becoming a writer or researcher, just doing your own thing. But in the end you decided to come back to Open Phil, at least for a while. Why was that?
Ajeya Cotra: Yeah. So towards the end of the sabbatical I was planning on taking some time to just start a Substack and write about a bunch of stuff, including a lot of the stuff about EA that we were discussing and a lot of stuff about AI, and see where it went. At that time I honestly didn't have a super strong impact case for it. I didn't think it was crazy that it would be the highest-impact thing to do, but the reason I was doing it was just because I wanted to, not that I could really defend that it was the highest-impact thing. At that moment, after having gone through this whole journey, I was like, maybe I have more room in my life for making a career decision on the basis of something other than just impact. The reason I decided to stay was that, while I was out, Open Phil was conducting a search for a new director to lead our GCR work, so all our AI work and our biorisk work. This was the position Holden was in when he left in 2023. Both of the top two candidates seemed really good to me, and I felt like someone new coming in could probably really use help from someone who isn't running any given program area, doesn't have a big team to worry about, and can just help that person develop contacts and figure out their strategy. And it could be an opportunity for me to see if I could get the feeling of plugging in again that I had been missing for a while.
Rob Wiblin: And then how did it go?
Ajeya Cotra: I think it went really well. Our director of GCR is Emily Oehlsen, who's also the president of Open Philanthropy. I've been spending most of this year, 2025, just helping her in various ways: trying to understand what we've funded and what's come of it, what the AI worldview is, what we think is going to happen with AI, how that's informing our strategy, and what the strategies of the various sub-teams are. And I work really, really well with her. I had actually been lonely at Open Phil almost the entire time I'd been there, even though it got worse in 2023, because while Holden was really great at giving me a lot of bandwidth, which I'm really grateful for, and at talking about object-level stuff with me, Holden never ran a ship where he'd say, "I'm doing this bigger project, can you help me with this piece of it? Here's how it fits in." Holden was always more like a research PI: I was doing my own research project, and he would talk to me about it a bunch and was interested in the results, but it was not integrated into a whole. Emily really does operate in a more integrated way, where I'm doing stuff and I know she needs to know the answer and is going to do something with it, which is very cool and very novel for me as a way to work. It's something I always thought I would want, and indeed it's really, really great. She's an extremely caring and thoughtful manager for me, who's really good at eliciting work out of me. And I notice that I work more than I did right before I went on sabbatical, and it feels less hard, so that's a sign that things are working.
Rob Wiblin: So you're trying to decide what to do next: whether to stay at Open Phil or go into something less meta, or maybe something that will allow you to go into even more depth. How are you using the stuff you've learned about yourself over the last few years to inform that decision?
Ajeya Cotra: Yeah. So besides Open Phil, which is still a top candidate, I'm talking to two technical research orgs about potentially finding a fit there. One is Redwood Research, the other is METR. Redwood Research works on basically futurism-inspired technical AI safety research — they're best known for pioneering the AI control agenda. And METR I think of as trying to be the world's early warning system for an intelligence explosion: they're measuring all the different things we want to be tracking to see if we're on the cusp of AIs rapidly accelerating AI R&D, or acquiring other capabilities that let them take over. Both of these missions are very close to my heart. They're both narrower than Open Phil, where I could, if I wanted, dip my toes in absolutely everything that might help with making AI go well — but in exchange they would let me go deep in a way that I think would probably be more satisfying for me, all else equal. In terms of how I'm using what I've learned — and this is so cliché, and if a 20-year-old version of me were watching this she'd roll her eyes — your extremely local environment, the literal person you're reporting to, matters a huge amount. The two or three people you're going to be talking to most in your job, or just features like how much you're talking to people versus working on your own, can make a transformative difference. And I found it interesting to reflect on that. I said all that stuff earlier about how EA has become a lot less transparent and a lot less about prioritizing maximal integrity at all costs, and that does still bother me. And actually the moral foundations of EA — sort of utilitarian thinking — you can go down a long rabbit hole where it's very suspect in many ways, and we've talked about this in some previous episodes. But both of those things bother me a lot more when I'm also in a working environment that's locally...
Rob Wiblin: Hard for me.
Ajeya Cotra: You know, and it's not that those issues aren't issues. But the salience of those kinds of heady, big-picture things versus extremely micro things — like what it feels like when you have a one-on-one with your manager — I think I had been underrating the mundane and the micro in how I'd been thinking about my career up to now. So I'm trying to do trials: I'm actually in the middle of a work trial with METR as we're filming this episode. And that's what I'm paying attention to: how does the rhythm of the work feel? How do the people feel?
Rob Wiblin: Yeah. I guess other generalizable observations are that Open Phil's environment changed over the years — you were there for eight years, or I guess nine years now, right? The constraints that Open Phil was labouring under in 2023 were very different than in 2016. So unsurprisingly, it might have been a good fit for you to start with, but that doesn't necessarily mean it will be a good fit forever. And also there was a leadership change at Open Phil — the person you were reporting to changed. Very often when that occurs, you see some other people leave as well, because they were in their roles primarily because of a very good working relationship with that person, or because they had strategic alignment with that person. And I suppose the CEO changing could potentially have been a trigger for you to think, well, maybe this isn't so great anymore, and I should proactively start looking for something else.
Ajeya Cotra: Yeah, I think that's possible. It was sort of true for me in both directions. Holden very much was a huge part of why I wanted to work at GiveWell rather than work in a number of other potential places or do earning to give, which I thought I was going to do at first. And when he left, that coincided with a difficult period for me. Now, with Emily in the position he was in before, it's again pretty dramatically changed what my work is and how it feels. So it does seem like a big, transformative thing. And if you're in an organization where there's a leadership change, I think it should probably be a trigger to think about — even if you don't leave — what might be different about your role, your place, and what you're doing, based on the different style, or the different constraints and strengths and weaknesses, of new leadership.
Rob Wiblin: It sounds like taking four months off was also a good call. You were reasonably unhappy — I guess it stopped that, and it could have gotten worse if you hadn't done it — and it gave you breathing room to make good decisions.
Ajeya Cotra: Yeah, I think that's right. I'm very glad that I took the sabbatical. I'm also glad that I didn't leave. A salient alternative for me, at the time I decided to take four months off, was to just leave and figure out what I wanted to do next. And I think it was good, both for my impact and for my personal growth and satisfaction, that I came back. I helped Emily, and now I'm doing a proper job search — whereas at the time I left for my sabbatical, it was more about healing and reflecting, not searching for a role in a focused way.
Rob Wiblin: Coming back to effective altruism for a bit: as I said, we basically almost don't talk about effective altruism on the show anymore. It was a much bigger feature in the earlier years. The biggest reason for that, I suppose, is that we're now more AI focused, and AI is an issue that so many people are concerned about regardless of their broader moral values or moral commitments — so EA just doesn't feel as relevant. You don't have to be concerned about shrimp, or about beings very far away in time, to think it would be really good to do AI technical safety research, or to think about what governance challenges AI is going to create. And of course EA is a controversial idea — I think at its core it actually is quite a controversial idea. Many people, even fully understanding it, would simply not agree with its prescriptions for how resources ought to be allocated. So why bring along all of that baggage when it's not actually decision relevant for most people? Do you think we should talk about it more, or is that just a sensible evolution?
Ajeya Cotra: Yeah. I mean, I think it kind of depends on the show's goals. My take is that it's correct and good that you don't need to buy into the whole EA package, with all of its baggage, to worry about misaligned AI taking over the world and do technical AI safety research to prevent that, to worry about AI-driven misuse and do research and policy to prevent that, and to just generally worry about AI disruption and think about that. So I think there should be, and there is, a healthy, thriving "AI is going to be a big deal" ecosystem that does not take EA as a premise. But at the same time, I think EA thinking and EA values probably do still have a lot to add. In the age of AI disruption, I think it's going to be EAs, for the most part, who are thinking seriously about whether AIs themselves are moral patients, whether they should have protections and rights, and how to navigate that thoughtfully against trade-offs with safety and other goals. It's going to be EAs, by and large, who take most seriously the possibility that AI disruption could be so disruptive that we end up locked into a certain set of societal values — that we gain the technological ability to shape the future for millions or billions of years — and who are thinking about how that should go. There are a lot of degrees of extremity to the AI worldview: even if you accept that AI is going to disrupt everything in the next ten or twenty years, the people thinking hardest about the most intense disruptions are going to be disproportionately EAs, because EA thinking challenges you to try to engage in that kind of very far-seeing, rigorous speculation — even though there are a lot of challenges with that and it's very hard to know the future. I think EAs are the ones that try hardest to peek ahead anyway.
Rob Wiblin: Yeah, I think digital sentience — worrying about AIs themselves suffering — is a good example. I would definitely make the prediction that effective altruism will loom large in the group of people working on that. For someone who's not altruistic, who isn't motivated by social impact, it's a bit unclear why you would go into that area. It's not particularly lucrative; it's not, at least yet, particularly respected; it's not super easy to make progress; and it's unconventional. I think most people, most of the time in their career, want to do something that's acceptable and that their parents will be proud of, and it's a lot less clear that digital sentience is going to provide the kind of esteem or prestige — or safety and comfort — that many people want in a career. So it's maybe natural that people who are altruistically motivated, and also intellectually a bit eclectic, willing to be avant-garde, are going to be more...
Ajeya Cotra: Intellectually avant-garde — tolerant of quite a lot of philosophical reasoning and speculation. In a sense, I think this might be what a healthy EA community is: an engine that incubates cause areas at a stage when they're not very respected, they're extremely speculative, the methodology isn't firm yet, and you kind of just have to be extremely altruistic and extremely willing to do unconventional things — and then matures those cause areas to the point where they can stand on their own, while also being a thing that many EAs work on. And I think digital sentience, and maybe the other things on Will and Tom's list — like space governance, and thinking about value lock-in, and stuff like that — are other candidates for EA to incubate the way it incubated worrying about AI takeover, basically.
Rob Wiblin: Yeah, I feel a lot less strongly in the case of the value lock-in thing, because many of the mechanisms there would just be — I guess you would get a power grab by people, or a power grab by AIs, or somehow it undermines democracy or deliberation in a way that makes it hard for society to adapt over time. I think people are worried about that regardless — both people involved in effective altruism and people who would be very sceptical of it.
Ajeya Cotra: I think there are some versions of the value lock-in concern that go through something else overtly scary and bad happening — like one person getting all of the power, and that's how that person's values get locked in, and that's how we get value lock-in. But I think there's a whole spectrum of things that are sort of like social media plus plus: in this distributed way, this technology has made us meaner to each other and worse at thinking, and has allowed individuals to live in information bubbles of their own creation. You can imagine AIs getting way better at creating a curated information bubble for each individual person — one that allows them to continue believing whatever they started out believing, with superintelligent help preventing them from changing their mind. And this might be something you think of as an important social problem for the long-run future, even if it doesn't happen via one person getting all the power. Power is still relatively distributed, but large fractions of society are sort of impervious to changing their minds.
Rob Wiblin: So it's interesting that, in thinking about what niche EA can fill that others won't, the thing you were pointing to was not primarily actually altruism — though I guess that is a factor in going into something like digital sentience. Perhaps it's actually a research methodology, or a research instinct: being willing to sit in that very uncomfortable space between just making stuff up and having firm conclusions you can stand by because you've taken particular measurements. It feels like, for some reason, that's one of the most distinctive aspects of people who are passionate about effective altruism — being willing to try really hard to make informed speculation about how things will go, and neither just having it be a good story nor being so conservative that you're not willing to actually make hot predictions.
Ajeya Cotra: Yeah, absolutely. And I think even the tamest of EA cause areas, like global health and development, has a huge dose of this. If you look at GiveWell's cost-effectiveness analyses, they have to grapple with questions like: how does the value of doubling one's income, if you make a very low amount of money, compare to a certain risk of death, or to the value of avoiding a certain painful disease you could have? They have to try to get their answers from surveys and weird studies people have done — it's just not very rigorous in the end — and they have to form their judgments and spell out their judgments. I think the willingness to tackle questions like this and just say, "Well, here's our answer, and there's a lot to argue with," is very emblematic of EA organizations, including all the best AI safety EA organizations, like Redwood Research.
Rob Wiblin: Yeah. I guess more standard ways to approach those questions would be to just pick one slightly arbitrarily and then be really committed to it, or to be kind of irritated at being asked the question and to say that there's absolutely no way of knowing, or that there's no fact of the matter here whatsoever. And effective altruism is trying to be somewhere — I don't know whether it's somewhere in the middle — but yeah.
Ajeya Cotra: And within EA there's kind of a spectrum in terms of where in the middle you want to land — where everyone's kind of looking at the people more speculative than them and thinking they're just building castles on sand, that this is not the way to do things; and looking at the people less speculative than them and thinking that's just the streetlight effect, that they're ignoring the most important considerations and not working in the most important area.
Rob Wiblin: Yeah. So I guess for people who do have that mindset, an important message would be that they should take advantage of the fact that they have this unique — or at least reasonably rare — mentality, and go into roles that other people probably won't fill, because they feel too uncomfortable, or because, I guess, they could reasonably think it's misguided. But other people aren't necessarily going to do this stuff.
Ajeya Cotra: Yeah, I think that's right. And I think it's interesting to consider: if you imagine EA as one piece of the world's response to crazy changes like AI, there's actually a case that EA should be heavily indexed on research. I think the community has gone back and forth on how it thinks about this. At first, people were just naturally attracted to research, so there was a huge glut of people who wanted to be researchers. Then there was a big push, including from 80K and others, to consider operations roles and policy roles and other things that aren't just research — and I think that was a good move at the time. But if we think about what EA's comparative advantage relative to the world is, maybe that suggests that some of the people who are doing operations and doing policy — but who in their hearts just want to be a weird truth-teller thinking speculative thoughts — should consider going back and doing that again.
Rob Wiblin: My guest today has been Ajeya Cotra. Thanks so much for coming on the 80,000 Hours podcast again, Ajeya.
Ajeya Cotra: Thanks so much for having me.