CIA Adopts Microsoft’s Generative AI Model for Sensitive Data Analysis

James Purtilo discusses Microsoft's air-gapped generative AI for intelligence agencies, enhancing data security and computing power.

May 16, 2024

America’s spy agencies will be deploying generative artificial intelligence (AI) to analyze sensitive data. It was announced last week that Microsoft’s generative AI model for the intelligence community will address the security issues posed by large language models (LLMs), which are typically connected to the Internet, by “air-gapping” the tools in a cloud-based environment that is isolated from the Internet.

This will be the first major LLM to be separated from the Internet, yet it will retain much of its computing power. Generative AI can analyze massive amounts of data and recognize patterns far faster than humans can. The CIA began using a generative AI tool last year for unclassified purposes, but more sensitive national security information needs to be isolated from the public Internet.

“This is the first time we’ve ever had an isolated version – when isolated means it’s not connected to the internet – and it’s on a special network that’s only accessible by the U.S. government,” William Chappell, Microsoft’s chief technical officer for strategic missions and technology, told Bloomberg.

Generative AI and the IC

Chappell told Bloomberg that the new AI tool could theoretically be accessed by as many as 10,000 members of the intelligence community (IC) who require access to Top Secret data. The tool went live last Thursday and will enter a testing and accreditation phase before it can go into broader use by the intelligence community.

“Generative AI can help the intelligence services to process data faster and discover links between different data points,” technology industry analyst Roger Entner of Recon Analytics told ClearanceJobs. “One of the biggest areas should be the processing of countless phone calls, emails, and other data transmissions that the services collect and need to make sense of.”

Air-gapped platform

The AI platform was developed so that it can read files but not learn from them in any way that would impact its output. The data also can’t be accessed from the Internet.

“Keeping it air-gapped and away from the internet is the only way we can envision the IC using generative AI technology,” explained Dr. James Purtilo, associate professor of computer science at the University of Maryland.

“Except for the sensitivity of the domain, and thus the danger of spilling important tells to its other users, it is fair to assume that Microsoft’s LLM would be used in all the ordinary ways we use such tech today – assist preparation of reports, answer general questions, search for information and so on,” Purtilo told ClearanceJobs. “The workflow often looks just like what happens in corporate America and thus are fair game for streamlining with emerging tools.”

However, one concern even with an isolated model is the potential for data spillage between protected IC projects.

“As typically configured, these models learn from prompting over time, so one can envision that the sharing of a model will also inadvertently share information outside of a compartmentalized project,” Purtilo continued. “The answer to one user’s prompt might be based on another user’s interactions which were never intended to telegraph that certain data were known.”
