California Generative AI: Training Data Transparency
Compliance reference — obligations, penalties, applicability, and primary sources.
Last verified April 25, 2026
Summary
California AB 2013 requires developers of generative AI (GenAI) systems or services made available to Californians on or after January 1, 2022 to publicly post documentation on their website describing the data used to train the system or service. The posting deadline is January 1, 2026.
Required training data documentation includes: (1) a high-level summary of the datasets, including sources or owners, the point in time the data was scraped (if applicable), types of data, sizes, and formats; (2) whether the datasets include any data protected by copyright, trademark, or patent, or any other proprietary information; (3) whether the datasets include personal information or aggregate consumer information as defined under the CCPA; (4) whether the data was purchased or licensed; and (5) whether the data was modified by the developer.
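The five documentation items above can be tracked as a simple per-dataset record. The following is a minimal sketch, not a statutory schema: the class name, field names, and helper functions are hypothetical, and the two dates are taken from the summary above. It flags which items are still unanswered for a dataset and whether a system's availability date falls within the applicability window.

```python
from dataclasses import dataclass, fields
from datetime import date
from typing import List, Optional

# Dates taken from the summary above.
APPLICABILITY_START = date(2022, 1, 1)
POSTING_DEADLINE = date(2026, 1, 1)


@dataclass
class DatasetDisclosure:
    """Hypothetical record of the documentation items for one training dataset."""
    # (1) High-level summary
    sources_or_owners: Optional[str] = None
    scraped_as_of: Optional[str] = None  # only applicable to scraped data
    data_types: Optional[str] = None
    size: Optional[str] = None
    formats: Optional[str] = None
    # (2) Copyright, trademark, patent, or other proprietary data
    includes_ip_protected_data: Optional[bool] = None
    # (3) Personal or aggregate consumer information (CCPA definitions)
    includes_personal_or_aggregate_consumer_info: Optional[bool] = None
    # (4) Purchased or licensed
    purchased_or_licensed: Optional[bool] = None
    # (5) Modified by the developer
    modified_by_developer: Optional[bool] = None


# Fields that may legitimately stay empty (assumption: "if applicable" items).
OPTIONAL_FIELDS = {"scraped_as_of"}


def missing_items(disclosure: DatasetDisclosure) -> List[str]:
    """Return the names of required fields that have not been filled in."""
    return [
        f.name
        for f in fields(disclosure)
        if f.name not in OPTIONAL_FIELDS and getattr(disclosure, f.name) is None
    ]


def in_scope(made_available: date) -> bool:
    """True if the availability date is on or after January 1, 2022."""
    return made_available >= APPLICABILITY_START


draft = DatasetDisclosure(
    sources_or_owners="Example Corp web crawl",  # illustrative value
    data_types="text",
    size="1 TB",
    formats="JSONL",
    includes_ip_protected_data=True,
)
print(missing_items(draft))
print(in_scope(date(2023, 5, 1)))
```

Keeping every field optional with a default of None lets a partially completed record be created early and checked repeatedly as answers come in, rather than failing at construction time.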
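AB 2013 does not specify a per-violation penalty or a penalty cap. Compliance failures could nonetheless give rise to claims under California's Unfair Competition Law (UCL) and to enforcement actions by the Attorney General (AG) or under the CCPA.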
Key obligations
Specific compliance requirements derived from the primary source. Each item links to the relevant statutory section where applicable.
- Transparency (Role: developer; Cal. Civ. Code § 3111)
Publicly post on the developer's website a high-level summary of training datasets used for any generative AI system or service made available to Californians on or after January 1, 2022.
Deadline: by January 1, 2026
Sources
Every fact above is sourced from the official primary source. Independent verification is recommended before acting on this information.
- Official: leginfo.legislature.ca.gov (Cal. Civ. Code § 3111)
- leginfo.legislature.ca.gov
- oag.ca.gov