Key Takeaways
Google unveiled a dedicated AI Vulnerability Reward Program on Monday, offering security researchers up to $30,000 for discovering critical flaws in its artificial intelligence systems while explicitly excluding jailbreaks and content-related vulnerabilities from eligibility.
The program offers a base reward of up to $20,000 for the most severe vulnerabilities affecting Google's highest-tier AI products. Quality and novelty multipliers can push individual payouts to the $30,000 maximum, using the same reporting framework as Google's traditional Vulnerability Reward Program.
The program focuses on security over content issues
The standalone initiative arrives two years after Google expanded its existing VRP to cover AI products. In launching the new AI-specific program, Google clarified which attacks qualify for rewards and which do not.
Direct prompt injection, jailbreaks, and alignment issues are explicitly excluded from bounty eligibility, regardless of their sophistication or novelty.
"Simply put, we don't believe a Vulnerability Reward Program is the right format for addressing content-related issues," Google security engineering managers Jason Parsons and Zak Bennett said in a blog post announcing the program.
The executives explained that solving content-related problems requires long-term efforts and analyzing trends across large volumes of reports, which conflicts with the company's "goal of providing timely rewards to individual researchers."
Google researchers noted in December that there may be an infinite number of possible jailbreaks for any particular model, and fully mitigating them may be completely infeasible.
The company encourages researchers to report content-related issues through in-product feedback channels rather than the bug bounty program.
Eight categories of qualifying vulnerabilities
Google established a clear hierarchy of security flaws eligible for rewards, ranked from most to least severe:
Rogue actions command the highest rewards. These attacks modify a victim's account or data with clear security impact. An example includes indirect prompt injection attacks that cause Google Home to perform unauthorized actions, such as unlocking a smart lock without user knowledge.
Sensitive data exfiltration vulnerabilities that leak personally identifiable information or other sensitive details without user approval also qualify for substantial rewards. This could involve an AI system being manipulated to summarize private emails and send those summaries to an attacker-controlled account.
Phishing enablement covers persistent, cross-user HTML injection on Google-branded sites lacking user-generated content warnings that present convincing phishing attack vectors.
Model theft vulnerabilities that allow attackers to exfiltrate complete, confidential model parameters qualify for bounty rewards.
Context manipulation attacks enable repeatable, persistent, and hidden manipulation of a victim's AI environment with minimal victim interaction. Google cited an example of an attacker sending a calendar invite that causes an AI product to store false information, leading to unconfirmed future actions based on corrupted data.
Access control bypasses with limited security impact allow attackers to bypass access controls and steal data that is otherwise inaccessible but not security-sensitive, such as Google's campus lunch menus.
Unauthorized product usage involves enabling Google server-side features on a user's account without payment or proper authorization.
Cross-user denial of service attacks that cause persistent denial of service for AI products or specific features in victim accounts round out the list of qualifying vulnerabilities. The program prohibits volumetric DoS attacks and requires researchers to demonstrate the attack on accounts other than their own.
Three-tier product classification affects payouts
Google organized its AI products into three reward tiers based on sensitivity and usage.
Flagship products include Google Search, Gemini Apps across web and mobile platforms, and Google Workspace core applications such as Gmail, Drive, Meet, Calendar, Docs, Sheets, Slides, and Forms.
Standard tier covers AI features in high-sensitivity products including AI Studio, Jules, and Google Workspace non-core applications like NotebookLM and Appsheet.
The "Other" category encompasses remaining AI integrations across Google's product portfolio.
The tier system significantly affects potential earnings. A rogue action vulnerability in a flagship product pays $20,000, while the same flaw in a standard product pays $15,000 and $10,000 in other products. Cross-user denial of service bugs, at the low end, earn between $500 and Google credit depending on product tier.
Growing investment in security research
Google paid out nearly $12 million to more than 600 researchers through its broader VRP in 2024, compared to $10 million in 2023. Since establishing its first bug bounty program in 2010, the company has awarded over $65 million in total rewards.
The new AI-specific focus comes as AI integration deepens across Google's product ecosystem and researchers continue identifying novel attack vectors. Security experts anticipate robust participation in the program as the attack surface for AI systems expands.
Read more: