
Artificial intelligence is driving significant changes in corporate productivity, from code completion in GitHub Copilot to chatbots that mine internal knowledge bases for instant answers. Each new agent must authenticate to other services, quietly inflating the population of non-human identities (NHIs) across the enterprise cloud.
That population already dwarfs the human one inside most businesses. Many companies now juggle at least 45 machine identities for every human user. Service accounts, CI/CD bots, containers, and AI agents all need secrets, most commonly API keys, tokens, or certificates, to connect securely to other systems. GitGuardian's State of Secrets Sprawl 2025 report reveals the cost of this sprawl: over 23.7 million secrets surfaced on public GitHub in 2024 alone. And rather than improving the situation, repositories with Copilot enabled leaked secrets 40% more often.
NHIs are not people
Unlike human users logging in to a system, NHIs rarely come with a policy that requires credential rotation, tightly scoped privileges, or decommissioning of unused accounts. Left unchecked, they weave a dense, opaque web of high-risk connections that attackers can exploit long after everyone has forgotten those secrets exist.
AI adoption, particularly of large language models (LLMs) and retrieval-augmented generation (RAG), has dramatically increased the pace and volume of the sprawl that drives this risk.
Consider an LLM-powered internal support chatbot. When asked how to connect to the development environment, the bot may retrieve a Confluence page that contains valid credentials. The chatbot can unwittingly disclose secrets to anyone who asks the right question, and its conversation logs can leak the same information to anyone with access to them. Worse, in this scenario the LLM is actively instructing developers to use a plaintext credential, so security problems compound quickly.
But the situation is not hopeless. In fact, with the right governance model around NHIs and secrets management, developers can actually innovate and ship faster.
Five practical controls to reduce AI-related NHI risk
Organizations looking to contain the risks of AI-driven NHIs should focus on these five practical areas:
1. Audit and clean up data sources
2. Centralize existing NHI management
3. Prevent secrets leakage from LLM deployments
4. Improve logging security
5. Limit AI data access
Let’s take a closer look at each of these areas.
Audit and clean up data sources
The first LLMs were confined to the specific datasets they were trained on, making them novelties with limited usefulness. Retrieval-augmented generation (RAG) changed this by allowing LLMs to reach into additional data sources as needed. Unfortunately, if those sources contain secrets, the identities tied to them are now at risk of abuse.
Data sources such as project management platforms like Jira, communication platforms like Slack, and knowledge bases like Confluence were never built with AI, or secrets, in mind. If someone pastes a plaintext API key into a ticket or page, there is no safeguard to warn them that this is dangerous. With the right prompts, a chatbot wired to those sources becomes a secrets-disclosure engine.
The only sure way to prevent an LLM from leaking secrets hiding in its data sources is to eliminate those secrets, or at least revoke the access they grant. An invalid credential poses no immediate risk in an attacker's hands. Ideally, you remove every instance of a secret before the AI ever ingests it. Fortunately, tools and platforms like GitGuardian can take much of the pain out of this process, as in the sketch below.
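As a concrete starting point, content exported from a wiki or ticketing system can be scanned before it is indexed into a RAG pipeline. The following is a minimal sketch, assuming GitGuardian's ggshield CLI is installed and authenticated, and that the exported pages sit in a local directory named confluence_export/ (a hypothetical path).

    import subprocess
    import sys

    # Hypothetical directory holding pages exported from a knowledge base.
    EXPORT_DIR = "confluence_export/"

    # ggshield's "secret scan path" command scans files on disk for known secret
    # patterns; --recursive walks subdirectories, and the exit code is non-zero
    # when secrets are detected.
    result = subprocess.run(
        ["ggshield", "secret", "scan", "path", "--recursive", EXPORT_DIR],
    )

    if result.returncode != 0:
        print("Secrets detected in exported content; revoke and remove them before indexing.")
        sys.exit(1)

    print("No secrets detected; content can be passed to the RAG indexing step.")

Running a check like this as a gate in the ingestion job means a leaked key is caught, revoked, and cleaned up before the model can ever retrieve it.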
Centralize existing NHI management
The quote "If you can't measure it, you can't improve it" is commonly attributed to Lord Kelvin. It rings especially true for non-human identity governance. Without an inventory of all the service accounts, bots, agents, and pipelines you already have, there is little hope of applying effective rules and scopes to the new NHIs that agentic AI brings with it.
One thing all of these non-human identities have in common is that they hold secrets. However you define an NHI, the authentication mechanism is the same: a secret. Building the inventory through this lens keeps the focus on the proper storage and management of those secrets, which is far from a new concern.
There are many tools that can make this achievable, such as HashiCorp Vault, CyberArk, or AWS Secrets Manager. Once everything is centrally managed and accounted for, you can move away from long-lived credentials scattered across systems toward rotation and scoping that are automated and enforced by policy.
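In practice, centralization means services and AI agents fetch credentials from the vault at runtime instead of carrying their own copies. Here is a minimal sketch using AWS Secrets Manager via boto3; the secret name prod/chatbot/jira-token is hypothetical, and the same pattern applies with a HashiCorp Vault or CyberArk client.

    import boto3

    # The agent holds no long-lived credential of its own; its IAM role grants it
    # permission to read exactly one secret from the central store.
    client = boto3.client("secretsmanager", region_name="us-east-1")

    # Hypothetical secret name; rotation happens in Secrets Manager, not in code.
    response = client.get_secret_value(SecretId="prod/chatbot/jira-token")
    jira_token = response["SecretString"]

    # The token is used immediately and never written to disk, config files, or logs.

Because the credential lives only in the secrets manager, rotating or revoking it is a single operation that does not require redeploying the agent.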
Prevent secrets leakage from LLM deployments
The Model Context Protocol (MCP) is an emerging standard for how agentic AI accesses services and data sources. Previously, if you wanted your AI system to reach a resource, you had to wire the integration yourself and figure it out as you went. MCP introduces a protocol that lets AI connect to service providers through standardized interfaces. This simplifies things and reduces the chance that developers hard-code credentials just to make an integration work.
In one of the more surprising findings from GitGuardian's security researchers, 5.2% of all MCP servers they could identify contained at least one hard-coded secret. That is noticeably higher than the 4.6% rate of exposed secrets observed across all public repositories.
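One habit that removes most of this exposure is keeping credentials out of the server code entirely and injecting them through the environment, which a secrets manager or the MCP host configuration can populate. This is a minimal sketch; SERVICE_API_TOKEN is a hypothetical variable name.

    import os

    # Read the token from the environment at startup rather than embedding it in source.
    # If the variable is missing, fail fast instead of falling back to a hard-coded value.
    token = os.environ.get("SERVICE_API_TOKEN")
    if token is None:
        raise RuntimeError("SERVICE_API_TOKEN is not set; refusing to start without a credential.")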
As with any other technology you deploy, an ounce of prevention early in the software development lifecycle saves a pound of incident response later. Catching hardcoded secrets while they are still in a feature branch means they are never merged and shipped to production. Adding secret detection to the developer workflow via Git hooks or code editor extensions means plaintext credentials never even reach the shared repository.
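For teams already using ggshield, its built-in pre-commit integration is the straightforward route; the Python hook below is a minimal sketch of the same idea, wrapping ggshield's pre-commit scan so a commit is blocked whenever a secret shows up in the staged changes.

    #!/usr/bin/env python3
    # Minimal .git/hooks/pre-commit sketch: block commits that contain detectable secrets.
    import subprocess
    import sys

    # "ggshield secret scan pre-commit" scans the staged changes and exits non-zero
    # when a secret is detected, which causes git to abort the commit.
    result = subprocess.run(["ggshield", "secret", "scan", "pre-commit"])
    sys.exit(result.returncode)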
Improve logging security
LLMs are black boxes that take requests and give probabilistic answers. While you cannot tune the underlying vectorization directly, you can check whether the output matches what you expected. To tune their systems and improve AI agents, engineers and machine learning teams log everything: the initial prompt, the retrieved context, and the generated response.

If any step in that logged pipeline exposes a secret, there are now multiple copies of the same leaked credential, likely sitting in a third-party tool or platform. Most teams store logs in cloud buckets with little in the way of fine-grained access controls.
The safest path is to add a sanitization step before logs are saved or shipped to a third party. This takes engineering effort, but here again tools like GitGuardian's ggshield can help, with secrets scanning that can be invoked programmatically from scripts. If the secret is scrubbed before it is stored, the risk drops significantly.
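One lightweight way to add that sanitization step is a logging filter that redacts anything secret-shaped before a record ever reaches a handler. The sketch below uses a few illustrative regular expressions; a production pipeline would lean on a purpose-built scanner such as ggshield rather than a hand-rolled pattern list.

    import logging
    import re

    # Illustrative patterns only: AWS-style access key IDs, GitHub tokens, and generic
    # "key=value"-style assignments. A real deployment should use a dedicated detector.
    SECRET_PATTERNS = [
        re.compile(r"AKIA[0-9A-Z]{16}"),
        re.compile(r"ghp_[A-Za-z0-9]{36}"),
        re.compile(r"(?i)(api[_-]?key|token|password)\s*[:=]\s*\S+"),
    ]

    class RedactSecretsFilter(logging.Filter):
        """Replace secret-looking substrings with a placeholder before the log is emitted."""

        def filter(self, record: logging.LogRecord) -> bool:
            message = record.getMessage()
            for pattern in SECRET_PATTERNS:
                message = pattern.sub("[REDACTED]", message)
            record.msg, record.args = message, None
            return True

    logger = logging.getLogger("rag-pipeline")
    logger.addHandler(logging.StreamHandler())
    logger.addFilter(RedactSecretsFilter())

    # Example: the retrieved context accidentally contains a credential-like string.
    logger.warning("retrieved context: api_key=sk_live_not_a_real_key_1234567890")

The filter runs before any handler sees the record, so the redacted version is what ends up in the console, the log file, or the observability platform.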
Limit AI data access
Does your LLM need access to your CRM? This is a tricky, highly situational question. For an internal sales tool locked down behind SSO, it might be fine if letting reps quickly search their notes improves their delivery. For a customer service chatbot on the front page of your website, the answer is almost certainly no.
The same access principles you apply everywhere else must apply to the AI you deploy: follow the principle of least privilege when setting permissions. The temptation to give AI agents full access to everything in the name of speed is strong. Too little access defeats the purpose of a RAG model; too much access invites abuse and security incidents. A simple gate like the one sketched below can make the boundary explicit.
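One way to make the trade-off concrete is to put a per-deployment allowlist in front of the retriever, rather than granting the model blanket access to every data source. The sketch below is purely illustrative; the source names and the fetch_documents helper are hypothetical.

    # Hypothetical allowlist: an internal sales assistant may read the CRM and the sales
    # wiki, while the public-facing support bot is limited to published documentation.
    ALLOWED_SOURCES = {
        "internal-sales-assistant": {"crm", "sales-wiki"},
        "public-support-bot": {"public-docs"},
    }

    def retrieve(agent_name: str, source: str, query: str) -> list[str]:
        """Only query a data source if this agent is explicitly allowed to reach it."""
        if source not in ALLOWED_SOURCES.get(agent_name, set()):
            raise PermissionError(f"{agent_name} is not permitted to query {source}")
        return fetch_documents(source, query)

    def fetch_documents(source: str, query: str) -> list[str]:
        # Placeholder for the actual retrieval call against the named data source.
        return []

Keeping the allowlist in configuration, outside the model and its prompts, means the scope of each agent can be reviewed and tightened without touching the AI itself.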
Raise developer awareness
It was not on the list we started with, but all of this guidance is useless if it never reaches the right people. The people on the front lines need guidance and guardrails that help them work more efficiently and more safely. We wish there were a magical technical solution to offer here, but the truth is that building and deploying AI safely requires getting humans on the same page with the right processes and policies.
If you are on the development side of the house, we encourage you to share this article with your security team and get their thoughts on how to safely build AI in your organization. If you are a security professional reading this, consider sharing it with your developers and DevOps teams to start the conversation, because AI is here to stay.
Securing machine identities means more secure AI deployments
The next stage of AI adoption belongs to organizations that treat non-human identities with the same rigor and care as human users. Continuous monitoring, lifecycle management, and robust secrets governance must become standard operating procedure. By building a secure foundation now, businesses can confidently scale their AI initiatives and unlock the full promise of intelligent automation without sacrificing security.