When Zero Trust Meets AI Training: The Zscaler GDPR Data Processing Controversy


TL;DR: Zscaler's CEO boasted about training AI models on "half a trillion daily transactions" from customer logs, triggering GDPR concerns. Despite corporate damage control, fundamental questions remain about data processing transparency, legal bases, and whether cybersecurity vendors can transform from processors to controllers without explicit consent.

The Spark That Lit the Fire

In August 2025, cybersecurity giant Zscaler found itself at the center of a data protection storm after CEO Jay Chaudhry referred to "trillions" of Zscaler's transaction-level logs being used to train its AI models. Those remarks spread online, prompting consternation about the potential impact on the firm's zero-trust promise.

The controversy began when privacy advocates noticed statements from Zscaler's earnings calls where leadership claimed they leverage their massive data pipeline—"over 500 billion transactions per day and hundreds of trillions of signals every day"—for AI model training. For a company whose entire value proposition rests on "Zero Trust" principles, this raised uncomfortable questions about what exactly was being trusted.

Source: "Zscaler's Commitment to Responsible AI" (Zscaler blog post describing how the company trains its AI models).

The GDPR Challenge

The situation escalated when a privacy-conscious individual filed a formal GDPR Article 15 data subject access request, demanding transparency about how Zscaler processes personal data for AI training purposes. The request was methodical and legally precise, asking for:

  • Categories of personal data processed for AI training
  • Legal basis under GDPR Article 6
  • Recipients and retention periods
  • Automated decision-making information
  • Copies of personal data undergoing such processing

The individual's follow-up request cut to the heart of the matter: "For avoidance of doubt, under GDPR Zscaler cannot discharge its obligations by referring me to my employer if Zscaler itself processes data as a controller for AI development or related purposes."

Source: Zscaler Data Processing Agreement (DPA), the contractual terms between Zscaler, Inc. and its customers for data management and protection.

Zscaler's Defensive Response

In a response from Zscaler CISO Sam Curry, the company stressed its commitment to responsible AI. "Zscaler does not use customer data to train its AI models," Curry wrote. "Each customer owns their proprietary information or personal data ... in the Zscaler logs. We only use data or metadata that does not contain customer or personal data for AI model training."

But this response immediately raised more questions than it answered. The company's blog post explained: "Think of it like water flowing through pipes: while the content of the water belongs entirely to each customer, the knowledge of how the water moves—its pressure, velocity, and patterns—can inform the system without ever extracting the water itself."

The metaphor, while poetic, sidesteps crucial technical and legal details about what constitutes "metadata" and whether it truly contains no personal data.
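
To make the metaphor concrete, here is a minimal sketch of what a proxy transaction record might look like, using hypothetical field names rather than Zscaler's actual log schema. Notice how fields that sound like harmless "pipe metrics" can still single out a natural person, alone or in combination.

```python
# Hypothetical proxy transaction record (illustrative field names only,
# not Zscaler's actual log schema).
transaction = {
    # "Content" -- the water itself, in the metaphor
    "request_body": "<uploaded document bytes>",
    # "Metadata" -- how the water moves through the pipes
    "timestamp": "2025-08-14T09:21:03Z",
    "bytes_sent": 48213,
    "protocol": "HTTPS",
    "url": "https://linkedin.com/in/john-smith-12345",
    "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
}

# The legal problem: several "metadata" fields can identify a person,
# alone or combined with other records.
POTENTIALLY_PERSONAL = {"url", "user_agent", "timestamp"}
for field in sorted(POTENTIALLY_PERSONAL & transaction.keys()):
    print(f"{field} may be personal data under GDPR: {transaction[field]}")
```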

Source: Zscaler (ZS) Q4 2024 earnings call transcript, The Motley Fool (period ended June 30, 2024).

The Technical Reality Gap

Here's where Zscaler's explanations become problematic from a GDPR perspective. Independent reporting interpreted the CEO's remarks as saying Zscaler leverages transactional logs — including structured and unstructured elements and full URLs — as training material for internal AI models.

Full URLs are inherently personal data under GDPR when they can identify or relate to individuals. Consider these examples:

  • https://linkedin.com/in/john-smith-12345
  • https://company.com/employee-portal?user=jane.doe
  • https://medical-site.com/patient-dashboard?id=patient123

If Zscaler's AI models are trained on such URLs—even in aggregated form—they're processing personal data. The company's claim that they only use "metadata that does not contain customer or personal data" becomes legally questionable when that metadata includes potentially identifying information.
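
A minimal sketch of the kind of screening that claim would require: flagging URLs whose path or query string appears to identify a person before the record is reused for anything. The patterns and parameter names below are illustrative assumptions, not a complete PII detector.

```python
import re
from urllib.parse import parse_qsl, urlsplit

# Query parameter names that often carry identifiers (assumed list).
SUSPECT_PARAMS = ("user", "id", "email", "patient", "employee")
# Rough pattern for profile-style path slugs such as /in/john-smith-12345.
NAME_SLUG = re.compile(r"/in/[a-z]+-[a-z]+-\d+", re.IGNORECASE)

def looks_personal(url: str) -> bool:
    """Heuristic check: does this URL appear to relate to an individual?"""
    parts = urlsplit(url)
    if NAME_SLUG.search(parts.path):
        return True
    return any(
        marker in key.lower()
        for key, _ in parse_qsl(parts.query)
        for marker in SUSPECT_PARAMS
    )

for url in (
    "https://linkedin.com/in/john-smith-12345",
    "https://company.com/employee-portal?user=jane.doe",
    "https://example.com/pricing",
):
    print(f"{url} -> personal? {looks_personal(url)}")
```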


The Controller vs. Processor Problem

This controversy illuminates a fundamental shift in cloud security relationships. When Zscaler acts as a security service processor for its customers, it operates under strict contractual limitations. The EDPB's view, however, is that when examining whether a controller conducted an appropriate assessment, supervisory authorities should consider "whether the controller has assessed some non-exhaustive criteria, such as the source of the data and whether the AI model is the result of an infringement of the GDPR".

The key legal question: If Zscaler repurposes customer log data for its own AI development, does it transform from a processor to a controller for that processing? If so, it needs:

  1. A separate legal basis under GDPR Article 6
  2. Transparent privacy notices about AI training
  3. Data subject rights mechanisms for the AI processing
  4. Legitimate interest assessments if relying on Article 6(1)(f)

Zscaler's Data Processing Agreement (DPA) reportedly lacks provisions for AI model training, suggesting this processing wasn't contemplated in the original customer agreements.


The Broader Industry Implications

Zscaler isn't unique. Security researchers and frustrated administrators reacted because the phrasing used in earnings calls and media reports — mention of "proprietary logs," "full URLs," and "complete logs" — reads to many like an admission of training on high-fidelity customer records.

This reflects a broader trend where cybersecurity vendors are pivoting to AI-powered services, often leveraging the massive data flows they already process. The regulatory landscape is struggling to keep pace:

  • The EDPB has emphasized that safeguards can help satisfy the balancing test for legitimate interest processing
  • The CNIL affirms that training AI models on personal data sourced from public content can be lawful under the GDPR's legitimate interest basis, provided certain conditions are met
  • The EDPB opinion also stresses that controllers deploying AI models must carry out an appropriate assessment of whether the model was developed lawfully

The Transparency Deficit

What makes this case particularly concerning is the apparent lack of proactive transparency. Customers and data subjects weren't informed about AI training uses until after public controversy erupted. Article 13 GDPR requires that data subjects be informed of their rights, including the right to object (which applies where processing is based on legitimate interests). In some cases, to satisfy the fairness principle, it may be appropriate to provide a specific notification to data subjects and give them the opportunity to object before processing is carried out.

What This Means for Organizations

For companies using Zscaler and similar services:

Immediate Actions:

  • Review your vendor contracts for AI training clauses (see the triage sketch after this list)
  • Understand whether vendors are processing your data as controllers for AI purposes
  • Assess your own GDPR obligations for vendor data processing
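
As a starting point for that contract review, here is a minimal triage sketch that scans exported DPA text for model-training language. The term list and the file name vendor_dpa.txt are assumptions, and anything it flags still needs review by qualified counsel.

```python
# Assumed, non-exhaustive list of terms worth escalating in a vendor DPA.
AI_TRAINING_TERMS = (
    "train", "machine learning", "model development",
    "artificial intelligence", "aggregated data", "telemetry",
)

def flag_ai_clauses(dpa_text: str) -> list[str]:
    """Return clauses (assumed blank-line separated) mentioning AI/training."""
    return [
        clause.strip()
        for clause in dpa_text.split("\n\n")
        if any(term in clause.lower() for term in AI_TRAINING_TERMS)
    ]

# Hypothetical usage against an exported copy of the contract text.
with open("vendor_dpa.txt") as f:
    for clause in flag_ai_clauses(f.read()):
        print("REVIEW:", clause[:120])
```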

Strategic Considerations:

  • Explicit model-training clauses: Prohibit any use of customer-identifiable data for vendor or third-party model training unless explicitly consented to in writing
  • Implement vendor auditing procedures for AI development activities
  • Consider data residency and sovereignty implications
GDPR & ISO 27001 Compliance Assessment Tool
Comprehensive tool for security leaders to evaluate GDPR and ISO 27001 compliance and prioritize remediation efforts

The Road Ahead

While the CNIL's guidance provides welcome clarity on how legitimate interest can support GDPR compliance during AI training, it does not attempt to resolve adjacent legal or strategic questions. The Zscaler controversy highlights the urgent need for:

  1. Clearer regulatory guidance on processor-to-controller transitions in AI contexts
  2. Industry standards for transparency in AI training by service providers
  3. Better contractual frameworks that anticipate AI development uses
  4. Technical solutions for privacy-preserving AI training (one sketch follows below)
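
On point 4, one illustration of what "privacy-preserving" could mean in practice: reducing each URL to coarse, non-identifying features before it ever reaches a training set. The feature choices below are assumptions, not a proven anonymization scheme; formal techniques such as differential privacy go further.

```python
from urllib.parse import urlsplit

def to_training_features(url: str) -> dict:
    """Keep only coarse traffic features; drop paths, query values, fragments."""
    parts = urlsplit(url)
    return {
        # Keep a registrable-domain-style suffix rather than the full host.
        "domain_suffix": ".".join(parts.hostname.split(".")[-2:]),
        "path_depth": len([seg for seg in parts.path.split("/") if seg]),
        "has_query": bool(parts.query),
        "scheme": parts.scheme,
    }

print(to_training_features("https://company.com/employee-portal?user=jane.doe"))
# {'domain_suffix': 'company.com', 'path_depth': 1, 'has_query': True, 'scheme': 'https'}
```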

Conclusion

The Zscaler case represents more than a single company's misstep; it's a preview of the regulatory challenges facing the entire cybersecurity industry as AI becomes central to service delivery. Zero trust underpins Zscaler's entire value proposition. But when it comes to AI training transparency, the industry may need to rebuild that trust from the ground up.


The controversy also demonstrates the power of individual GDPR rights. A single, well-crafted data subject access request exposed gaps that could affect millions of users worldwide. As AI deployment accelerates, such scrutiny will likely intensify.

Organizations should prepare now: audit your vendor relationships, understand the data flows, and ensure your AI-powered security tools don't become compliance liabilities. In the age of AI-driven cybersecurity, trust must be not just zero—it must be earned through transparency, legal compliance, and respect for fundamental privacy rights.


This analysis is based on publicly available information and should not be considered legal advice. Organizations should consult qualified data protection counsel for specific compliance guidance.
