What Is Shadow AI Auditing? How to Detect and Manage Unauthorized AI Tools Used Within Your Organization

Shadow AI auditing is the process of detecting, assessing, and managing AI tools that employees use without organizational approval. This article gives IT managers and information security officers concrete internal audit procedures and a method for building a governance framework around shadow AI, which carries risks of data leakage and compliance violations.

A shadow AI audit is an internal audit process that detects and assesses AI tools employees are using for work without organizational approval, and brings them under management within a defined set of usage rules. Use of tools such as ChatGPT and Gemini through personal accounts continues to spread, and with it the risk of data breaches and compliance violations. This article walks IT managers and information security officers through the process step by step: preparation, detection, assessment, and building a governance framework. It also covers data protection considerations such as Thailand's PDPA (Personal Data Protection Act) and presents a practical approach that can be put into operation on the ground.

Shadow AI is a collective term for AI tools that employees use for work purposes outside the awareness of IT and security departments. Because of their convenience, these tools spread rapidly on the front lines, and can easily become unnoticed channels through which confidential data flows to external AI systems. This section first examines why this issue is qualitatively different from conventional problems.

Defining Shadow AI and Its Differences from Traditional Shadow IT

Shadow IT refers to SaaS applications and personal devices used without IT department approval. Shadow AI is a subset of this, but with one critical distinction: data entered into these tools is passed to the AI and may be used for training or response generation. It is not merely a matter of tools being outside of management control—confidential information and personal data pasted into prompts can leak to external models, and once that data has been shared, it cannot be retrieved.

Shadow AI also presents a troublesome combination of characteristics: an extremely low barrier to entry, since anyone can use it instantly and for free; quality and hallucination risks (the generation of plausible but incorrect information) that arise when outputs are used directly in business decisions; and the difficulty of visibility, since everything can be done through a single browser. It must be treated as a new category of risk that cannot be adequately captured by the conventional framing of "unauthorized applications."

The Background Behind the Rapid Increase of Shadow AI Within Organizations

Several converging factors lie behind the rapid proliferation of shadow AI. ① Generative AI has become widespread, and employees have grown accustomed to using it in their personal lives. ② Official company AI tools are either not yet in place or are difficult to use, leading employees to turn to personal tools as a workaround. ③ These tools are free, immediately available, and fully functional on a smartphone, making the barrier to adoption effectively zero. ④ There is pressure to deliver results quickly.

In other words, the reality is less "it's prohibited but people use it anyway" and more "it's so convenient there's no reason not to use it." Multiple AI governance surveys have repeatedly noted that in many organizations, employees are using tools such as ChatGPT and Claude through personal accounts or personal devices, and IT departments have no visibility into the actual extent of this usage. Many companies now stand at a crossroads: leave it alone and assume good faith, or make it visible and bring it under management.

Risks of Inaction: PDPA Violations, Data Leakage, and AI Governance Breakdown

The risks of leaving shadow AI unaddressed can be grouped into three broad categories. The first is data breach: if customers' personal information, trade secrets, or source code are entered into an external AI, that data may leak outside the organization. The second is regulatory violation: data protection laws, including Thailand's PDPA, impose restrictions on the unauthorized disclosure of personal data and its transfer outside the country. If an employee passes personal data to an AI without authorization, this could constitute a violation leading to administrative sanctions, penalties, and reputational damage. The third is governance breakdown: without records of who entered what into which AI, it becomes impossible to fulfill audit and accountability obligations, and quality risks—such as erroneous outputs making their way into business processes—cannot be managed.

In all three cases, the cost of responding rises sharply once a breach or violation has actually occurred. The core problem is that these risks build up where no one can see them.

What Should Be Prepared Before Starting an Audit? Prerequisites and Framework Setup

Before jumping straight into looking for tools, decide who audits, with what authority, and to what extent. Rushing into detection while the structure and scope are still vague tends to invite pushback from the field and waste effort.

Forming the Audit Team and Clarifying Authority

The audit team should not be confined to a single department. The basic composition includes Information Security/IT, Legal and Compliance (from a data protection perspective, such as PDPA), representatives from each business unit, and an executive sponsor who provides the authority behind the team's permissions.

In terms of authority, formalize—before the audit begins—access rights such as reviewing logs and network monitoring, conducting employee interviews, and accessing tool inventories. At the same time, care is needed because excessive employee monitoring can create separate legal issues around labor and privacy. Role assignments should be documented explicitly using frameworks such as RACI (Responsible, Accountable, Consulted, Informed), leaving no ambiguity about who makes the final call. Obtaining executive approval in advance significantly affects the organization's ability to carry out subsequent corrective measures (such as prohibiting use or providing alternative tools).
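
As a minimal sketch of how the RACI assignment could be written down and checked, the example below stores a matrix as plain data and verifies that each activity has exactly one Accountable role and at least one Responsible role. The activity names and role assignments are hypothetical and only illustrate the idea of leaving no ambiguity about who makes the final call.

```python
from collections import Counter

# Hypothetical activities and role assignments, for illustration only.
# R = Responsible, A = Accountable, C = Consulted, I = Informed.
RACI = {
    "Review proxy and network logs": {
        "IT/Security": "R", "Legal/Compliance": "C",
        "Business unit rep": "I", "Executive sponsor": "A",
    },
    "Conduct employee interviews": {
        "IT/Security": "C", "Legal/Compliance": "C",
        "Business unit rep": "R", "Executive sponsor": "A",
    },
    "Decide on prohibition or alternatives": {
        "IT/Security": "R", "Legal/Compliance": "C",
        "Business unit rep": "C", "Executive sponsor": "A",
    },
}


def validate_raci(matrix):
    """Return a list of problems; an empty list means no ambiguity was found."""
    problems = []
    for activity, assignments in matrix.items():
        counts = Counter(assignments.values())
        if counts["A"] != 1:
            problems.append(
                f"{activity}: expected exactly one Accountable role, found {counts['A']}"
            )
        if counts["R"] == 0:
            problems.append(f"{activity}: no Responsible role assigned")
    return problems


print(validate_raci(RACI))
```

Keeping the matrix as data means the same check can be rerun whenever roles change between audit cycles.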

Reviewing the Current State of AI Usage Policies and Setting Standards

First, confirm whether a current AI usage policy exists. In practice, many organizations have yet to establish a policy specifically addressing AI. Two frameworks that can serve as references for building standards are the NIST AI RMF (a voluntary U.S.-originated framework consisting of four functions: Govern, Map, Measure, and Manage) and ISO/IEC 42001 (an internationally recognized AI management system standard for which certification can be obtained). The two frameworks have significant overlap in their control items, and many organizations adopt the NIST AI RMF as their operational model while targeting ISO/IEC 42001 for external certification.

At this stage, there is no need to produce a finished policy. Provisionally establish the decision criteria for "what to permit, what to permit conditionally, and what to prohibit," based on the combination of data sensitivity and tool characteristics. These criteria will be refined in later steps as detection findings are cross-referenced against them.
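
The provisional criteria can be captured as a small lookup table before any formal policy exists. The sketch below is one possible shape, not a recommended rule set: the data classes follow the categories used in this article, while the tool profiles (`enterprise_contract`, `consumer_plan`, `unknown_terms`) and every decision in the table are hypothetical placeholders to be refined against detection findings.

```python
# Data classes follow the article's categories; the tool profiles and the
# decisions below are hypothetical starting points, not recommendations.
DATA_CLASSES = ("public", "confidential", "personal", "regulated")
TOOL_PROFILES = ("enterprise_contract", "consumer_plan", "unknown_terms")

PROVISIONAL_CRITERIA = {
    ("public", "enterprise_contract"): "permit",
    ("public", "consumer_plan"): "permit",
    ("public", "unknown_terms"): "conditional",
    ("confidential", "enterprise_contract"): "permit",
    ("personal", "enterprise_contract"): "conditional",
    ("regulated", "enterprise_contract"): "conditional",
}


def decide(data_class, tool_profile):
    """Look up the provisional decision for a data class and tool profile."""
    if data_class not in DATA_CLASSES:
        raise ValueError(f"unknown data class: {data_class}")
    if tool_profile not in TOOL_PROFILES:
        raise ValueError(f"unknown tool profile: {tool_profile}")
    # Any combination not listed explicitly falls back to the strictest outcome.
    return PROVISIONAL_CRITERIA.get((data_class, tool_profile), "prohibit")


print(decide("public", "consumer_plan"))
print(decide("personal", "consumer_plan"))
```

Defaulting unlisted combinations to "prohibit" is a design choice made here for safety; an organization could equally default to "conditional" and review case by case.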

Defining the Audit Scope: Target Departments, Tools, and Data Range

Define the audit scope along three axes. ① Target departments: A company-wide simultaneous rollout is burdensome, so begin with departments that handle sensitive data (such as Sales, Development, HR, and Finance). ② Target tools: Include not only chat-based AI, but also SaaS with embedded AI features, browser extensions, and code assistance tools. ③ Target data: Identify where personal information, trade secrets, and regulated data reside.

Overreaching on scope will cause the audit to drag on indefinitely. It is more practical to focus the first cycle on high-risk areas and expand the scope iteratively. Additionally, define the audit timeline and deliverables (tool inventory, risk assessment, and remediation plan) from the outset. Articulating the goals in writing helps avoid the outcome of "we went through the process but nothing was decided."

Step 1: How to Assess the State of AI Tool Usage Within the Organization

The fundamental approach to detection is to combine "network and endpoint logs" with "interviews with people." This is because technical detection alone tends to miss activity conducted via personal devices, while interviews alone tend to result in underreporting.

Leveraging Network Traffic Analysis and Proxy Logs

The foundation of technical detection is aggregating access to known AI service domains from proxy, firewall, and SWG (Secure Web Gateway) logs. If a CASB or SASE solution is in place, SaaS usage can be visualized more broadly. DNS logs and on-endpoint application detection serve as complementary measures.

However, there are limitations. Usage from personal devices, mobile data connections, or home networks does not pass through the corporate network and therefore cannot be captured. Additionally, SSL inspection—which examines the contents of encrypted communications—requires careful consideration of privacy and a proper legal basis. The accuracy of detection ultimately depends on how well the list of known AI service domains is maintained and kept up to date. Since new services emerge continuously, the list cannot simply be created once and left as-is.
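
The log aggregation described above can be sketched as follows. This is a minimal example under stated assumptions: the proxy log is assumed to be a CSV export with `timestamp`, `dept`, and `host` columns (real proxy and SWG formats vary), and the domain list is a short illustrative sample that would need ongoing maintenance. Counting by department rather than by individual is a deliberate choice, in line with the caution about excessive employee monitoring.

```python
import csv
import io
from collections import Counter

# Illustrative, incomplete sample; a real list needs continuous upkeep.
AI_DOMAINS = {"chatgpt.com", "chat.openai.com", "gemini.google.com", "claude.ai"}


def matches_ai_domain(host, domains=AI_DOMAINS):
    """True if host equals a listed domain or is a subdomain of one."""
    host = host.lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in domains)


def count_ai_access(log_file):
    """Count requests to listed AI domains, grouped by (department, host)."""
    hits = Counter()
    for row in csv.DictReader(log_file):
        if matches_ai_domain(row["host"]):
            hits[(row["dept"], row["host"].lower())] += 1
    return hits


# Hypothetical log excerpt in the assumed CSV layout.
SAMPLE_LOG = """timestamp,dept,host
2025-01-06T09:00:00,Sales,chatgpt.com
2025-01-06T09:05:00,Sales,chatgpt.com
2025-01-06T09:07:00,Finance,example.com
2025-01-06T09:10:00,Development,claude.ai
"""

hits = count_ai_access(io.StringIO(SAMPLE_LOG))
for (dept, host), n in sorted(hits.items()):
    print(dept, host, n)
```

Matching on the exact domain or a dot-prefixed suffix avoids false positives from unrelated hosts that merely end with the same characters.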

Gathering Usage Data Through Employee Surveys and Interviews

Surveys and interviews fill the blind spots that technical detection cannot reach. The most important step here is to state clearly at the outset that the purpose is not to punish, but to understand actual usage and improve the working environment. If employees sense a punitive tone, responses will quickly skew toward underreporting.

Use anonymous surveys to gather information on "which AI tools are being used, for which tasks, and with what types of data," then conduct on-site interviews to surface candid feedback—such as frustrations with officially approved tools. In practice, many respondents report that they "did not know it was prohibited," and this should be treated not as grounds for enforcement, but as a signal of training needs. To improve response rates, a message from senior leadership can be effective. Designing the exercise as an effort to understand the realities of the workplace—rather than a survey aimed at catching violations—ultimately leads to a more accurate picture of actual usage.
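
One simple way to combine the two sources is to compare the set of tools seen in logs with the set reported in the survey. The sketch below assumes both have already been reduced to sets of service names; the sample values are hypothetical.

```python
def compare_sources(seen_in_logs, reported_in_survey):
    """Split tools by which detection source surfaced them."""
    return {
        "both": seen_in_logs & reported_in_survey,
        # Seen on the network but not reported: possible underreporting.
        "logs_only": seen_in_logs - reported_in_survey,
        # Reported but not seen: possibly used on personal devices or networks.
        "survey_only": reported_in_survey - seen_in_logs,
    }


# Hypothetical inputs for illustration.
seen_in_logs = {"chatgpt.com", "claude.ai"}
reported_in_survey = {"chatgpt.com", "gemini.google.com"}

result = compare_sources(seen_in_logs, reported_in_survey)
for source, tools in result.items():
    print(source, sorted(tools))
```

The "survey_only" group is the blind spot that technical detection alone cannot reach, and the "logs_only" group indicates where the survey wording or tone may need adjusting.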

Automated Detection of AI Services Using SaaS Management Tools

SaaS management tools (such as SSPM and CASB) can automatically inventory AI services in use by drawing on information about contracts, billing, OAuth integrations, and browser extensions. Particularly valuable is the identification of external AI applications connected via OAuth to corporate accounts, as these can serve as pathways for accessing internal data without proper authorization. Browser extension management status and IdP (identity provider) login histories should also be cross-referenced.

Unlike manual inventories, the key advantage of these tools is their ability to detect continuously. They are most effective when incorporated as a mechanism for "ongoing monitoring" after the first audit cycle is complete. Since many tools are available, selection should be based on compatibility with the organization's IT environment—including whether an IdP or MDM is already in place. To avoid making tool adoption an end in itself, it is advisable to define what needs to be detected before choosing a solution.
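
To make the review of OAuth-connected applications concrete, the sketch below filters an exported list of grants for unapproved apps that hold data-access scopes. The record layout, app names, and scope labels are all hypothetical and do not correspond to any specific vendor's API or export format.

```python
# Hypothetical export of OAuth grants; field names and scope labels are
# illustrative and not tied to any particular identity provider.
GRANTS = [
    {"app": "Notes Summarizer AI", "scopes": ["files.read", "mail.read"], "approved": False},
    {"app": "Enterprise Chat Assistant", "scopes": ["files.read"], "approved": True},
    {"app": "Calendar Helper", "scopes": ["calendar.read"], "approved": False},
]

# Scopes treated as granting access to internal data (an assumption).
SENSITIVE_SCOPES = {"files.read", "files.write", "mail.read"}


def flag_grants(grants):
    """Return (app, sensitive scopes) for unapproved apps with data access."""
    flagged = []
    for grant in grants:
        risky = sorted(SENSITIVE_SCOPES.intersection(grant["scopes"]))
        if not grant["approved"] and risky:
            flagged.append((grant["app"], risky))
    return flagged


print(flag_grants(GRANTS))
```

Running a filter like this on a schedule is one way to turn a one-off inventory into the ongoing monitoring described above.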

Step 2: How to Evaluate and Classify Detected AI Tools

Detected tools should be scored along a "risk axis" and classified into one of three tiers: approved, conditionally approved, or prohibited. Rather than blanket prohibition or unchecked permissiveness, drawing distinctions based on risk level is the practical approach.

Four Axes of Risk Assessment: Data Confidentiality, Terms of Use, Security, and Compliance

In short, risk is determined by the product of "what data is entered" and "how the tool handles that data."

| Evaluation Axis | Key Checkpoints |
| --- | --- |
| Data Sensitivity | Classification of data being input (public / confidential / personal information / regulated data) |
| Terms of Service & Data Handling | Whether inputs are used for model training, retention period, overseas storage, opt-out availability |
| Security | Authentication (SSO, MFA), encryption, access controls, vendor track record |
| Compliance | Personal data requirements such as PDPA, industry regulations, contractual confidentiality obligations |

Score each axis as high, medium, or low, then aggregate the scores to gauge overall risk level. One area requiring particular attention is that free, consumer-tier plans may be designed in ways that allow inputs to be used for training. Even for the same tool, data handling practices can differ depending on the subscription plan, so evaluation should extend to "which plan is being used."
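
The scoring step can be sketched as a small function. The four axes come from the table above; the point values (1 to 3) and the thresholds used to derive the overall level are assumptions chosen for illustration, and an organization would calibrate its own.

```python
# Point values and thresholds are illustrative assumptions.
LEVEL_POINTS = {"low": 1, "medium": 2, "high": 3}
AXES = ("data_sensitivity", "terms_and_data_handling", "security", "compliance")


def overall_risk(ratings):
    """Sum per-axis ratings and map the total (4 to 12) to an overall level."""
    missing = [axis for axis in AXES if axis not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    total = sum(LEVEL_POINTS[ratings[axis]] for axis in AXES)
    if total >= 10:
        level = "high"
    elif total >= 7:
        level = "medium"
    else:
        level = "low"
    return total, level


# Hypothetical rating of a consumer-tier chat tool.
example = {
    "data_sensitivity": "high",
    "terms_and_data_handling": "high",
    "security": "medium",
    "compliance": "low",
}
print(overall_risk(example))
```

Because the same tool can behave differently by subscription plan, it may be useful to rate each tool-and-plan pair as a separate entry.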

Three-Tier Classification Criteria: Approved, Conditionally Approved, and Prohibited

In short, tools with a track record of securely handling confidential data are classified as approved; those that are questionable but difficult to replace are conditionally approved; and those that are clearly unacceptable are prohibited.

| Classification | Criteria | Examples |
| --- | --- | --- |
| Approved | Enterprise contracts guarantee no use of data for training and ensure data protection; SSO integration is available | AI services under an official contract |
| Conditionally Approved | Acceptable when use cases and data types are restricted (e.g., no entry of personal information) | General chat AI used exclusively for non-confidential tasks |
| Prohibited | Use of confidential data for training, or its storage overseas, cannot be ruled out; terms of service are opaque | Free AI tools of unknown origin |

This classification is not fixed and will be reviewed as contracts change or alternative solutions become available. Most importantly, whenever a "prohibited" designation is assigned, an "approved alternative" must always be presented alongside it. If a prohibition is issued without offering an alternative, employees will revert to shadow AI.
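
The rule that every prohibition must come with an approved alternative can be checked mechanically. The sketch below assumes classifications are kept as a simple mapping; the tool names are hypothetical.

```python
# Hypothetical classification results for illustration.
CLASSIFICATIONS = {
    "Enterprise Chat Assistant": {"tier": "approved", "alternative": None},
    "General Chat AI (personal account)": {"tier": "conditional", "alternative": None},
    "Free Translator X": {"tier": "prohibited", "alternative": "Enterprise Chat Assistant"},
    "Unknown Summarizer Y": {"tier": "prohibited", "alternative": None},
}


def prohibited_without_alternative(classifications):
    """List prohibited tools whose alternative is missing or not approved."""
    approved = {
        name for name, c in classifications.items() if c["tier"] == "approved"
    }
    return sorted(
        name
        for name, c in classifications.items()
        if c["tier"] == "prohibited" and c["alternative"] not in approved
    )


print(prohibited_without_alternative(CLASSIFICATIONS))
```

Any tool this check returns is a candidate for employees reverting to shadow AI, so it is worth resolving before the prohibition is announced.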

Visualizing and Recording Tool Usage with an AI BOM

An AI BOM (AI Bill of Materials) is an inventory, or "parts list," of the models, data sources, dependencies, and external AI services used by a system. As a deliverable of a shadow AI audit, register every detected AI tool in the AI BOM and record its use cases, data types, risk classification, and approval status.

This serves as the foundation for ongoing governance. New tools should be added as they emerge, and the BOM should be leveraged as an audit trail. Frameworks such as the NIST AI RMF also identify this kind of inventory (asset register) management as a common requirement. By centralizing information in a single register, it becomes possible to eliminate the very condition at the root of shadow AI: the state in which no one has a complete picture of the whole. Conversely, without such a register, no amount of individual detection will result in cumulative, sustainable management.
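
A register entry could look like the following sketch. The fields for use case, data types, risk level, and approval status mirror what this section says should be recorded; the `last_reviewed` field and the exact field names are assumptions added for illustration, and there is no single standard schema implied here.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class AIBomEntry:
    tool: str
    use_case: str
    data_types: list
    risk_level: str        # "high" / "medium" / "low"
    approval_status: str   # "approved" / "conditionally approved" / "prohibited"
    last_reviewed: date    # assumed field, to support periodic re-audits


def register(bom, entry):
    """Add or update an entry keyed by tool name; return True if the tool is new."""
    is_new = entry.tool not in bom
    bom[entry.tool] = entry
    return is_new


bom = {}
# Hypothetical entry for illustration.
first = AIBomEntry(
    tool="General Chat AI",
    use_case="Drafting non-confidential emails",
    data_types=["public"],
    risk_level="medium",
    approval_status="conditionally approved",
    last_reviewed=date(2025, 1, 15),
)
print(register(bom, first))
print(len(bom))
```

Keying the register by tool name keeps one authoritative record per tool, so a later audit updates the existing entry rather than creating a duplicate.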

Step 3: How to Implement Governance Frameworks and Corrective Measures

Rather than stopping at assessment, embed the results into a sustainable operational framework built on three pillars: policy, technical controls, and education. An audit is not a one-time event; it is an operational process that must be continuously cycled.

Developing and Revising AI Usage Policies and Communicating Them Organization-Wide

An AI usage policy should include a list of permitted tools, rules on what data may or may not be entered by data classification, prohibited actions, how violations will be handled, and the process for requesting approval. The key to making the policy effective is to move beyond abstract principles and provide concrete examples—specifying which AI tool may or may not receive which data for which task.

Communication should not end with distribution. Pair it with training, comprehension checks, and integration into new employee onboarding. Establish a revision cycle in advance, since the tools themselves and vendor terms of service are constantly changing. One important caution: an overly strict policy will drive employees back to shadow AI. Drawing a realistic line that employees can reasonably follow will ultimately result in higher compliance rates.

Introducing AI Guardrails and Technical Access Controls

It is important not to rely on rules alone, but to reinforce them with technology. Specifically: ① Use CASB or SWG to control access to unapproved AI services (blocking or alerting). ② Use DLP (Data Loss Prevention) to detect and block the transmission of confidential data to external AI services. ③ Provide enterprise AI with SSO and MFA, making the official route easy to use. ④ Log prompts and outputs within an internal AI platform to enable monitoring.
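
To illustrate the idea behind the DLP control in item ②, the sketch below scans prompt text for two simple patterns before it would be sent to an external service. This is a deliberately simplified stand-in: commercial DLP products use far richer detection, and both patterns here (an email address and a confidentiality marker) are assumptions chosen for the example.

```python
import re

# Simplified, illustrative patterns; real DLP rules are far more extensive.
PATTERNS = {
    "email_address": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "confidential_marker": re.compile(r"\b(?:confidential|internal only)\b", re.IGNORECASE),
}


def scan_prompt(text):
    """Return the sorted names of all patterns found in the prompt text."""
    return sorted(name for name, pattern in PATTERNS.items() if pattern.search(text))


def should_block(text):
    """Block the prompt if any pattern matches."""
    return bool(scan_prompt(text))


print(scan_prompt("Summarize this thread for jane.doe@example.com"))
print(should_block("What is a Secure Web Gateway?"))
```

A check like this could alert rather than block during a transition period, which fits the principle of guiding users toward the official route instead of only stopping them.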

An indispensable principle here is to always pair "prohibitive technical controls" with "the provision of legitimate alternatives." Simply blocking access will cause employees to look for workarounds. Technical controls only function effectively when accompanied by a design that naturally guides users toward a safe and convenient official route. Blocking and enabling are not opposites—they must be designed as two wheels of the same vehicle.

AI Literacy Training and Continuous Monitoring Mechanisms

In terms of education, communicate why it is dangerous (with concrete data leakage scenarios), what constitutes confidential information, and how to use AI safely. Do not treat training as a one-time event — conduct it on a regular basis. On the monitoring side, continuously detect new AI services (by periodically re-running Step 1), keep the AI BOM up to date, and have an incident response workflow in place.

Examples of KPIs for measuring progress include the adoption rate of approved tools, trends in the number of unauthorized access incidents, and training completion rates. Use these to run a PDCA cycle of audit → remediation → re-audit. One final operational point: it is important to start from the premise that shadow AI can never be reduced to zero. The goal is not eradication, but rather keeping risk within an acceptable range on an ongoing basis.
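
The three example KPIs can be computed from a handful of counts. The sketch below assumes those counts are already available from the detection and training records; the definitions (for instance, adoption rate as approved-tool users divided by all AI users) and the sample figures are assumptions for illustration.

```python
def rate(numerator, denominator):
    """Return a ratio, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def kpi_snapshot(approved_tool_users, all_ai_users, trained, headcount,
                 unauthorized_by_period):
    """Compute the three example KPIs for one reporting period."""
    if len(unauthorized_by_period) >= 2:
        change = unauthorized_by_period[-1] - unauthorized_by_period[-2]
    else:
        change = 0
    return {
        "approved_tool_adoption_rate": rate(approved_tool_users, all_ai_users),
        "training_completion_rate": rate(trained, headcount),
        # Negative values mean fewer unauthorized access events than last period.
        "unauthorized_access_change": change,
    }


# Hypothetical figures for two consecutive audit periods.
snapshot = kpi_snapshot(60, 80, 90, 120, [40, 25])
print(snapshot)
```

Since shadow AI is not expected to reach zero, the trend in unauthorized access is usually more informative than its absolute value.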

Common Failure Patterns in Shadow AI Audits and How to Avoid Them

The two typical failures are "an immediate blanket ban" and "one audit and done." Both ultimately drive shadow AI further underground, into places that are even harder to see.

The Pitfalls of a "Prohibition-First" Approach That Triggers Pushback from the Field

If a ban is imposed without providing any alternative tools, employees will push back with "we can't get our work done like this" and start using personal devices or home environments in secret. In other words, the problem is driven further out of sight, making the ban counterproductive.

The countermeasures are: ① provide approved alternative tools at the same time as the ban; ② ensure employees genuinely understand why the ban is in place; ③ gather input on operational needs from the front lines and reflect them in official tools; and ④ allow a transition period. The key is to shift the message from "don't use this" to "please use this instead." A design that pits security against productivity will almost inevitably break down somewhere. Building the framework on the premise that both can coexist is the most direct path to lasting adoption.

The Problem of Treating the Audit as a One-Time Exercise

AI services emerge one after another, and the ways employees use them continue to evolve. As a result, the findings of a one-time audit quickly become outdated. The remedy is to run detection on a regular schedule (e.g., quarterly), maintain the AI BOM as a living ledger, periodically revise policies, and continuously assess the risk of new services. In short, the audit should be embedded as an "operational process" rather than a "project," with a defined owner, frequency, and deliverables on a standing basis. Frameworks such as the NIST AI RMF are also premised on continuous Govern and Measure activities.
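
A recurring audit cycle can be supported by two simple checks: which detected tools are not yet in the register, and which registered tools are overdue for review. The sketch below assumes a 90-day interval as a stand-in for "quarterly" and represents the register as a mapping from tool name to last review date; both choices and the sample data are hypothetical.

```python
from datetime import date, timedelta

# Assumed stand-in for a quarterly review cycle.
REVIEW_INTERVAL = timedelta(days=90)


def unregistered(detected_tools, last_reviewed_by_tool):
    """Tools found by detection that have no register entry yet."""
    return sorted(set(detected_tools) - set(last_reviewed_by_tool))


def overdue(last_reviewed_by_tool, today):
    """Registered tools whose last review is older than the interval."""
    return sorted(
        tool
        for tool, reviewed in last_reviewed_by_tool.items()
        if today - reviewed > REVIEW_INTERVAL
    )


# Hypothetical register state and detection result.
register_dates = {
    "General Chat AI": date(2025, 1, 15),
    "Enterprise Chat Assistant": date(2025, 6, 1),
}
detected = ["General Chat AI", "New Image Generator"]
today = date(2025, 7, 1)

print(unregistered(detected, register_dates))
print(overdue(register_dates, today))
```

Passing `today` as a parameter rather than reading the system clock inside the function keeps the check reproducible when it is rerun for an audit trail.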

One final point worth reaffirming: the purpose of a shadow AI audit is not enforcement for its own sake, but rather the ongoing work of building an environment where AI can be used safely. By establishing a recurring cycle of visibility → assessment → governance → education, organizations can keep risk in check while genuinely reaping the benefits of AI adoption.

Author & Supervisor

Yusuke Ishihara

Started programming at age 13 with MSX. After graduating from Musashi University, worked on large-scale system development including airline core systems and Japan's first Windows server hosting/VPS infrastructure. Co-founded Site Engine Inc. in 2008. Founded Unimon Inc. in 2010 and Enison Inc. in 2025, leading development of business systems, NLP, and platform solutions. Currently focuses on product development and AI/DX initiatives leveraging generative AI and large language models (LLMs).