When the War Machine Decides: Algorithms, Secrets, and Accountability in Modern Conflict, with Brianna Rosen

In this probing discussion with Senior Fellow Arthur Holland Michel, Brianna Rosen, senior fellow at Just Security and the University of Oxford, discusses what we know (and what we don't) about Israel's use of AI in the war in Gaza and explains the fraught relationship between algorithmic decisions, transparency, and accountability. She also looks back at the last two decades of the U.S. drone strike program for clues about what the future of AI warfare might mean for justice and human rights.

ARTHUR HOLLAND MICHEL: Hello. My name is Arthur Holland Michel, and I am a senior fellow at Carnegie Council for Ethics in International Affairs. This episode of the Carnegie Council podcast is brought to you in collaboration with the Peace Research Institute Oslo as part of its RegulAIR project. RegulAIR is a multiyear research initiative about the integration of drones and other emerging technologies into everyday life.

I am very excited to be joined today by Brianna Rosen. Brianna is a strategy and policy fellow at the University of Oxford and a senior fellow at Just Security, where she writes about artificial intelligence (AI), drones, and military accountability.

Hi, Brianna. It is great to have you on.

BRIANNA ROSEN: Hi, Arthur. Thanks so much for having me on the podcast. It is a pleasure to be here with you today, having long admired your own work on this topic. I am sure we are in for a fascinating conversation.

ARTHUR HOLLAND MICHEL: Briefly could you introduce yourself to our listeners? Tell us a little about your professional background and how you became interested in these topics.

BRIANNA ROSEN: Absolutely. I have been working on technology and war for the past 15 years, first in the think tank world, then at the White House National Security Council during the Obama administration, and now at the University of Oxford and Just Security.

One of my primary areas of focus has been the U.S. drone program and the policies, principles, and laws that should govern it. My research on military AI has organically grown out of this, and I should say that I come at this issue from a normative perspective but as someone who has spent many years in the policy world and is fully cognizant of the pressures and tradeoffs that policymakers face.

ARTHUR HOLLAND MICHEL: I want to get to your work on drones later because it is so rich and fascinating, but I actually want to start a little closer to the present and talk about AI. In particular you have been tracking in your writing over recent months a very interesting case of the use of AI in the war in Gaza. For those listeners who maybe are not aware of what technology is being deployed in that conflict and what it is being used for, can you give us an overview of what might be going on as we understand it?

BRIANNA ROSEN: We don’t know precisely what is going on, and that is a large part of the problem because there is very little transparency about how Israel is using AI in the current war in Gaza.

What we do know is this: Israel is using AI-based systems such as Gospel (though it is not the only one) to rapidly and automatically extract intelligence information to produce more targets more quickly. This is not merely speculation. I want to emphasize that. The Israel Defense Forces (IDF) itself has said it is using AI to accelerate targeting, and the facts bear this out.

In the first two months of the current conflict, Israel attacked roughly 25,000 targets, more than four times as many as in previous wars in Gaza. In the past Israel would run out of targets, that is, known combatants or legitimate military objectives, but that is not a barrier to killing anymore. AI is acting as a force multiplier by removing resource constraints and allowing the IDF to identify more targets including junior operatives who normally would not be targeted due to the minimal impact of their deaths on military objectives. So AI is increasing the tempo of operations and expanding the pool of potential targets, making target verification and other precautionary applications much harder to fulfill, all of which increases the risk that civilians will be misidentified and mistakenly targeted.

In addition to increasing the pace and scale of war, it is not clear how accurate Israeli AI systems are. All of the well-documented problems with AI in the domestic context, from underlying biases in the algorithms to the problem of hallucination, are likely to persist here.

To be clear, we do not know the full extent to which the horrific civilian death toll in Gaza can be attributed to AI, but Israel has by its own admission used AI to generate significantly more targets more rapidly than in previous conflicts. For example, the former head of the IDF said that in the past Israel used AI to generate 50 targets a year, and during Operation Guardian of the Walls in 2021 that increased to 100 targets a day with 50 percent being actioned. That number is likely exponentially higher now, and the result has been widespread civilian harm as more than 31,000 Palestinians have been killed in Gaza.

ARTHUR HOLLAND MICHEL: That is a very fair overview of what we do know. I was wondering if you could also run us through what we don’t know. What are the holes in our understanding and knowledge of what is going on with the use of AI in this conflict that would be significant to fill and would be important for us as the public to get a better sense of the real implications of the technology?

BRIANNA ROSEN: There are a number of things that we don’t know, particularly in the context of Israel’s use of AI in Gaza and some of which I highlighted in my article in Just Security in December, where I outlined a number of questions that policymakers and the general public should be asking about Israel’s use of AI in this conflict, just basic things, like: What AI systems does Israel use in military targeting? In particular, how were these systems developed? What safeguards has Israel put in place to prevent errors, hallucination, misuse, and even corruption of AI-based targeting systems? Can Israeli officials fully explain how AI targeting outputs were generated? What level of confidence exists in the traceability and explainability of results?

There is a whole list of questions that have not been answered and that have not even been asked as far as I know, and these have to be asked, not just about what Israel is doing but also what the United States and other countries are doing. We know that these countries have promised to keep a “human in the loop” with these types of operations, but we don’t really know what that means in practice. We don’t know how general principles for responsible AI are operationalized on the line. What type and level of human review is there in these types of interactions? Is it just a perfunctory rubber-stamping of machine decisions, or are these outcomes vetted carefully at each level? All of this type of information is lacking in the current public debate.

ARTHUR HOLLAND MICHEL: There is so much to unpack there. One of the points that I want to pick up on is this notion of explainability and traceability, the questions that you noted relating to whether those using these AI systems in the conflict understand how those systems came to achieve the outputs that they have achieved in, say, recommending a target.

Can you say a little more about why explainability and traceability in these AI systems are important in the context of conflict?

BRIANNA ROSEN: Explainability and traceability are important for transparency and accountability, which are already in extremely short supply in war, and AI is going to reduce both of these further. The difference with AI is that the lack of transparency that we see in current military operations is not going to just concern the public but actually policymakers themselves and those who are charged with overseeing these types of problems.

This gets to the problem of explainability with AI, where it is difficult to explain or even fully understand how an AI system generated a specific targeting outcome. That means that policymakers as well as the public will likely have difficulty understanding how specific targeting decisions were made. This problem is exacerbated when you consider the vast interagency process that feeds into these operations and the fact that each of these agencies might be relying on different algorithms and procedures. All of this feeds into a lack of accountability because when no one fully understands how a particular targeting decision was made, it is very likely that no one will be held accountable for it.

Of course, the classified and opaque nature of these systems further perpetuates a lack of transparency and accountability. Not only do we not know what types of algorithms military and intelligence agencies are using, but we do not know what the data sets are that are feeding into that. We do not know how they were developed, what assumptions were made, what potential biases exist, and how precisely they are being used in day-to-day operations. There are ways—and we can talk about them—states can be more transparent, but that level of transparency is not something that we are seeing now.

ARTHUR HOLLAND MICHEL: Is the explainability element also an issue in the sense that if a soldier using one of these AI systems does not fully understand how it works effectively and why it is suggesting a particular target as opposed to another, that will make them less effective in their ability to make judicious judgments on the battlefield about, to put it bluntly, when to pull the trigger and when not to?

BRIANNA ROSEN: I think that is absolutely a risk because no single person is going to fully understand how this technology works.

Your comment points to another risk, which is that soldiers on the field may not even question the results that machines are giving us. There is a kind of infallibility with machine-generated outputs where we think that because these systems are so intelligent and so advanced they must be giving us the right answer, but we know—and there is lots of data in the domestic case to underscore this—that that is very often not the case, that algorithms often get it wrong, make up information, or hallucinate. Apply that to the context of life-and-death situations of military AI in war, and it becomes very concerning if soldiers, policymakers, and the public are overconfident in the results that the machines are producing and not questioning those outputs in a critical way in real time amidst the fog of war on the battlefield. That is a very serious concern.

ARTHUR HOLLAND MICHEL: One of the arguments we have heard a lot in the context of military artificial intelligence is that it will likely reduce civilian harm in conflict because AI systems will be more accurate, more precise, and less fallible and unpredictable than human soldiers. I was wondering how you reconcile that argument with all of the points you have made thus far in the discussion. Also, how do we reconcile it with what we are actually seeing happen as we speak in the war in Gaza, because there seems to be a pretty big gap there perhaps between theory and reality?

BRIANNA ROSEN: I am skeptical about the claim that AI will reduce civilian harm in war. That is certainly not what we have seen in this conflict. When more than 31,000 people have been killed there is no counterfactual I can imagine in this context where AI has saved lives. The reality is that more children have been killed in four months in Gaza than in four years in all wars around the world combined. That is not a reduction in civilian harm.

What is happening in Gaza confirms a view that I have long held, which is that AI has the potential to reduce civilian harm in principle but it probably won’t in practice. That largely depends on the ways in which AI is used. If AI is used to accelerate the pace and scope of killing, more civilians will die, but if policymakers leverage the speed at which AI systems operate to introduce tactical pauses on the battlefield, to more carefully review targets, to ensure that precautionary measures are prioritized, to improve humanitarian assistance flows, and to consider alternative options to the use of military force, AI could in theory reduce civilian harm, but as you said, that is in theory and not what we are seeing play out in practice.

Ultimately I see the military AI debates as following the same flawed logic of the drone debates—just because the technology is theoretically more accurate or precise does not mean we will see less wrongful killing. In fact, the technology is only as good as the intelligence and data on which it is based, which is often outdated, incomplete, or flawed.

Finally, I want to emphasize that if this is how democracies are using military AI right now, just imagine what will happen in authoritarian regimes. That is an imminent risk that we all have to face.

ARTHUR HOLLAND MICHEL: Is one way of putting it that AI only has the potential to reduce civilian harm in warfare if those using it have harm reduction as their primary goal?

BRIANNA ROSEN: I think that is right. I will take it a step further: it is not only having civilian harm reduction as a primary goal but actually implementing in practice a number of lessons on civilian harm that the United States and its allies have learned from more than 20 years of the war on terror. There has to be the political will to implement those lessons, and there has to be the contextual knowledge of how those lessons apply on the actual battlefield. And then we have to see those lessons being translated down to soldiers on the battlefield, to the lower levels of command, so that it is instilled in the military culture and so that it is in every aspect of military operations. I think absolutely there has to be a political will, but it has to be even more than that.

ARTHUR HOLLAND MICHEL: This is a good opportunity to turn back to the past and precursor in some sense to many of the issues that we are talking about in terms of AI, which is the drone wars, the use of drones for targeted killing, especially outside of active conflict zones. What were the main lessons from, by this point, more than 20 years of these programs by the United States?

BRIANNA ROSEN: I am really glad you raised this point, Arthur, because drones are a good entry point for understanding what is happening now. Of course, drones are also AI-enabled.

When drones started to be used in the first half of the Obama administration and there was an uptick in drone use, there was a big push to ensure that the technology was used in ways that were “legal, ethical, and wise.” The idea was that precision technology was going to save lives and not kill more people.

The drone debates, as you recall, Arthur, largely focused on the issue of “riskless” war: Was it ethical to kill someone remotely without putting your own forces at risk? But that debate obscured the real problem with drone warfare, which was that by lowering the cost of resorting to force, force actually became more ubiquitous both in terms of geography and temporal scope, hence the moniker “perpetual” or “forever” war. Now, more than two decades later, the United States is in fact still conducting drone strikes around the world outside of recognizable war zones without articulating any point at which this armed conflict against al-Qaeda and associated forces will end. There is no end in sight.

The second major problem with the drone program was that no matter how precise or surgical the technology was, when it was based on faulty or outdated intelligence combined with cognitive biases, it often resulted in significant civilian harm, such as the U.S. drone strike in Kabul in August 2021 that killed ten civilians including seven children.

With AI I see similar problems, but the potential for harm is much greater. Here again the military AI debates are focused on the wrong question, in my view—the issue of autonomy and keeping a human in the loop, leading some groups to call for a ban on lethal autonomous weapon systems.

Everyone agrees that a human should be kept in the loop, and the focus on lethal autonomous weapons systems has obscured the risks of military AI that are already here. We are already seeing military AI contribute to significant civilian harm with humans fully in the loop, as we have seen in Gaza. If the drone program lowered the threshold for the use of force, military AI is increasing the pace and scale of killing.

The problems with the drone program that I have outlined are magnified here—human bias, the reliance on outdated or faulty intelligence, and overly permissive targeting criteria. All this implies that algorithms may misidentify targets potentially more often than we think, but because a machine has produced the results, as I said earlier, the risk is that humans will have greater confidence in it and be less likely to question targeting decisions, leading to more killing in war.

We are now in a situation where states are more likely to resort to force at greater speed and scale where there are fewer incentives built into the system for reflection and restraint. This should be of great concern to all of us.

I was reading an article in Bloomberg last week entitled, “Don’t Fear AI in War, Fear Autonomous Weapons.” That is precisely the wrong mentality. That is like saying, “Don’t fear the drone program; just fear the drones.” It is nonsensical and very dangerous.

ARTHUR HOLLAND MICHEL: I have heard often those who advocate for the use of AI in, say, targeting describing the very slow, tedious, and recursive process that has traditionally been used for selecting targets for drone strikes, as though that is a bad thing and that using AI to speed up that process will be a win tactically and strategically but also for things like precision and accuracy.

Is there an argument to be made, perhaps even from your direct experience, that this vast interagency process that you referred to earlier which leads up to the use of drones in strikes is necessary, that one can only achieve some level of precision and discrimination in targeting, flawed as that might be, by doing things ponderously and cautiously and bringing in a variety of different sources and perspectives? In that sense is there some contradiction in the desire to use AI specifically for the purpose of doing things more quickly?

BRIANNA ROSEN: We need to balance speed with safety. This is a real challenge that U.S. policymakers are facing with AI now. Even with that rigorous and complex interagency process for the drone program that you described there are countless instances of the drone program mistakenly targeting civilians and getting it wrong. Even having that process in place—and I want to stress this—is not a safeguard against civilian harm. There are still many flaws in that process that I have written about in the past as well, but not having that process is a complete disaster.

In the U.S. context we have a complex interagency process that feeds into drone operations. It involves multiple intelligence agencies, the Department of Defense, the State Department, the National Geospatial-Intelligence Agency, and the White House, just to name a few.

My concern with AI is that this process becomes even more complex than it was with the drone program because you are imagining that each of these agencies is relying on a different set of internal guidelines, procedures, and safeguards for AI use as well as different data sets and algorithms. That presents an interoperability challenge for one, but more importantly there are different levels of transparency and accountability for each of these agencies. The Department of Defense is obviously going to be much more transparent about what it is doing with AI and targeting than the Central Intelligence Agency (CIA), even though the CIA is most likely playing a large role.

What this also means is that when you have a finished intelligence product going to senior policymakers (say, a judgment for the president about whether Russia is going to invade Ukraine), it becomes very difficult to trace the role of AI in making that judgment based on multiple data streams from different agencies using AI in different ways, and, as AI is used for more complex tasks from all of these different agencies and at multiple levels of analysis, the risk of an error compounding over time increases. That is a big concern. Of course, another concern is that if malign actors were able to somehow poison AI systems or data sets, that introduces a whole other set of issues that I do not think the Biden administration has fully grappled with yet.

Policymakers are very aware that they need to balance these competing concerns of speed and safety. AI is clearly not going to be banned and it is going to be used in military operations, but policymakers have not fully grasped how we are going to do this in the same streamlined way that we did with the drone program, as imperfect as that was.

ARTHUR HOLLAND MICHEL: This might be a good moment to pivot toward the future. Of course, in large part it is a big unknown, but do you have any predictions for how you see the issues that we have been discussing today evolving in the years ahead?

BRIANNA ROSEN: We are currently witnessing the migration of drone warfare from the counterterrorism realm into interstate conflict as hybrid warfare becomes more the norm. This is a trend that has been well underway for some time in Ukraine and elsewhere. I think it is likely to accelerate with advances in AI. We know that drones are becoming cheaper, smaller, and increasingly autonomous, allowing states to wage asymmetric war more effectively. Drones, for example, have been critical to Ukraine’s ability to gain battlefield advantages against Russia and to continue to fight that war.

The U.S. military is also seeking to address its pacing challenge with China, in part through the Replicator program, which aims to field thousands of expendable autonomous weapons systems across multiple domains within roughly the next year. The commander of United States Central Command just told the Senate Armed Services Committee last week that cheap drones are “one of the top threats facing the United States” as well as drone “swarms.” Advances in AI are playing a huge role in that, allowing drones to become more autonomous and to communicate with each other.

I think we are likely to see the integration of AI into other areas of defense, not just in targeting but the whole spectrum of military and intelligence operations. We are likely to see applications such as swarming technology become more prevalent in the very near future. My biggest concern is that regulation and AI governance frameworks have not kept pace with the speed and acceleration of the technology.

ARTHUR HOLLAND MICHEL: On that last note, there have been discussions about regulating at the very least lethal fully autonomous weapons but also more recently talks of some kind of international framework or universally agreed-upon rules for the use of AI, not just lethal autonomous weapons but the types of AI we have been talking about today.

What are your hopes? What do you anticipate in terms of the likelihood that any of those discussions will bear fruit, that we will actually see any type of international agreement on these technologies in the years ahead?

BRIANNA ROSEN: I have to be honest and say that I am not incredibly optimistic about the prospects for certainly global AI governance or AI regulation in the near term. We have seen a lot of different regulatory frameworks emerging from different areas of the globe, from the United States to the European Union and United Kingdom to China and Russia. That brings a lot of regulatory fragmentation.

Despite all of the efforts to hold these types of global summits and talks on AI, all we have been able to agree on so far have been general principles. The real challenge in the next few years will be how do we operationalize those principles and how do we implement them in practice, and even at the level of principle there are disagreements.

There have long been many concerns about autonomous weapons systems and calls for greater regulation. States have been discussing this for a decade at the UN Convention on Certain Conventional Weapons. Even last November the UN secretary-general was tasked with preparing a report on autonomous weapons systems, which many hope will be a first step toward negotiating a new treaty on the use of the weapons or even an outright ban, but the United States, Russia, and India have said: “We don’t need a new treaty on these types of weapons. They are regulated already under international humanitarian law.” The prospects for an outright ban seem incredibly slim.

What we can do and what we absolutely must do now is start developing the domestic policy frameworks and legal frameworks that are needed to control this type of technology, particularly in the United States, which is one of the leaders in this space, and then try to build international consensus and get other countries to follow.

ARTHUR HOLLAND MICHEL: It is a little hard to be optimistic after a discussion like this, fascinating though it all is. Is there anything that you see happening at the moment that gives you some level of hope? Is there anything that you are optimistic about? At the very least, is there anything that gets you out of bed in the morning? The challenges seem so vast that one might—and I certainly have felt this—feel the temptation to give up in a way, and yet we do come to the desk every day. You continue to write about these things. What does give you a little hope or optimism? What keeps you going?

BRIANNA ROSEN: I am hopeful that after more than two decades of drone warfare, academics, civil society, and the general public will be better placed to scrutinize government claims about the use of AI in war, push back when needed, and propose appropriate safeguards. On balance I think we are more forensic now in identifying what the problems are with this technology than we have been in the past, although that is clearly not always the case.

I think we have to be more proactive and future-oriented and think about solutions. We need to think about how we can operationalize those general principles for responsible AI that the White House, Department of Defense, and others have put forward. How can we ensure that these are incorporated into every aspect of military operations from intelligence collection and analysis to targeting decisions and military planning? How can we develop the same type of policy guidance that we developed on direct action during the Obama administration? How can we develop something similar, even though it is not a perfect model, for the use of military AI?

More broadly, how can we ensure that there is more public debate, not just on killer robots or lethal autonomous weapons systems but on the use of AI more broadly? This is an element that I think has been missing up to this point. I was shocked when I was writing on this in December that almost no one was talking about the use of AI in targeting in Gaza. The Biden administration is not focused on this issue. The public is not focused on this issue, and we cannot address these problems unless we have more public debate.

We also have to realize that we have been asking the wrong questions. We should not be just focusing on the lethal autonomous weapons systems. The most immediate threat is not the AI apocalypse where machines are taking over the world but humans leveraging AI today to establish new patterns of violence and domination over each other. AI is already at war, and now we need to ensure that AI governance catches up with that. In the words of Martin Luther King, “We are confronted with the fierce urgency of now and there is such a thing as being too late.”

ARTHUR HOLLAND MICHEL: For my part, one of the things that certainly keeps me going and gets me out of bed in the morning is the work of figures such as yourself and the intensity, curiosity, and bravery that goes into attacking these very, very complex and formidable questions every day. For that, I thank you. I also thank you for the fascinating discussion today.

For all our listeners, I highly encourage you to go look at Brianna’s wonderful work on these topics and to continue following her work in the years ahead as these issues become all the more pressing. With that, thank you, and I hope we can have you back at the Council sometime very soon.

BRIANNA ROSEN: Thank you so much, Arthur, and thank you also to the Carnegie Council for providing a crucial platform for having these types of public debates on issues that will be critical to the future of war. Thanks so much for having me.

Carnegie Council for Ethics in International Affairs is an independent and nonpartisan nonprofit. The views expressed within this podcast are those of the speakers and do not necessarily reflect the position of Carnegie Council.
