California Generative AI: Training Data Transparency
Complete compliance reference
Summary
California AB 2013 requires developers of generative AI (GenAI) systems or services released on or after January 1, 2022 and made available to Californians to post documentation on their website, on or before January 1, 2026, describing the data used to train the system or service. The required documentation includes: (1) a high-level summary of the datasets, including their sources or owners, the point in time the data was scraped (if applicable), and the types, sizes, and formats of the data; (2) whether the datasets include any data protected by copyright, trademark, or patent, or any other proprietary information; (3) whether the datasets include personal information or aggregate consumer information as defined under the CCPA; (4) whether the data was purchased or licensed; and (5) whether the developer modified the data. AB 2013 does not specify a per-violation penalty cap. Compliance failures could give rise to claims under California's Unfair Competition Law (UCL) and to CCPA or Attorney General enforcement actions.
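One way to track the five documentation items listed above is a per-dataset record with a completeness check, run before the summary is published. The sketch below is illustrative only: the class name, field names, and the `missing_fields` helper are hypothetical and are not prescribed by AB 2013, and the sample dataset is invented.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class DatasetDisclosure:
    """One training dataset's entry in the public summary (hypothetical schema)."""
    name: str
    # Item (1): high-level summary of the dataset
    sources_or_owners: Optional[str] = None
    point_in_time_scraped: Optional[str] = None  # only "if applicable"
    data_types: Optional[str] = None
    size: Optional[str] = None
    formats: Optional[str] = None
    # Item (2): copyright, trademark, patent, or other proprietary data
    contains_ip_protected_data: Optional[bool] = None
    # Item (3): personal or aggregate consumer information (CCPA definitions)
    contains_personal_information: Optional[bool] = None
    contains_aggregate_consumer_information: Optional[bool] = None
    # Item (4): purchased or licensed
    purchased_or_licensed: Optional[bool] = None
    # Item (5): modified by the developer
    modified_by_developer: Optional[bool] = None


# Fields that may legitimately stay blank (assumption: scrape date is
# the only conditional item in the summary above).
OPTIONAL_FIELDS = {"point_in_time_scraped"}


def missing_fields(entry: DatasetDisclosure) -> List[str]:
    """Return the names of required fields that are still unanswered."""
    return [
        f.name
        for f in fields(entry)
        if f.name not in OPTIONAL_FIELDS and getattr(entry, f.name) is None
    ]


# Invented example entry, partially filled in.
draft = DatasetDisclosure(
    name="example-web-corpus",
    sources_or_owners="Public web pages",
    data_types="text",
    size="1.2 TB",
    formats="HTML, plain text",
    contains_ip_protected_data=True,
    purchased_or_licensed=False,
)
print(missing_fields(draft))
```

Using `None` for "not yet answered" keeps an explicit `False` (for example, "not purchased or licensed") distinct from a question nobody has addressed, which is the gap a pre-publication check needs to catch.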
Key obligations
- transparency → developer (Cal. Civ. Code § 3111)
Publicly post on the developer's website a high-level summary of the training datasets used for any generative AI system or service released on or after January 1, 2022 and made available to Californians.
Deadline: on or before January 1, 2026
Sources
Last verified: April 25, 2026