The recent lawsuit initiated by Asian News International (ANI) against OpenAI in the Delhi High Court has sparked critical conversations about the legal responsibilities of artificial intelligence (AI) developers in India.
At the heart of the ANI case is the allegation that OpenAI’s large language models (LLMs) were trained on copyrighted material without authorisation. This issue mirrors global developments, such as the lawsuit brought by the New York Times against OpenAI in the US, which also centres on copyright infringement. The crux of these disputes lies in the use of content — often scraped from publicly available sources — for training AI models without securing explicit licences.
In India, the Copyright Act, 1957, which governs intellectual property, lacks clear provisions addressing AI-specific use-cases. Developers must, therefore, err on the side of caution, proactively seeking licences from content owners before incorporating such data into their training sets. Licensing arrangements not only prevent legal disputes but also build goodwill among content creators. Such arrangements could take the form of revenue-sharing agreements or limited-use licences, ensuring that both parties benefit.
Another contentious legal issue concerns the applicability of “fair use” or “fair dealing” provisions under Indian copyright law. While the US provides relatively broad latitude for fair use, particularly in transformative or non-commercial contexts, India’s interpretation of fair dealing is narrower. Courts here have historically favoured copyright holders, especially in cases involving commercial use. AI developers relying on fair dealing to justify their training practices may find their defences challenged, particularly when large-scale, profit-driven training processes are at play. Even publicly accessible content, which might seem free to use, remains protected by copyright if its use exceeds the purpose for which it was originally made available.
Indemnity clauses
In this evolving legal landscape, indemnities are emerging as a key strategy to protect not just developers, but also the users of AI systems. LLM users, such as businesses deploying AI to create content, may inadvertently infringe on third-party copyrights or violate data protection laws. By offering indemnity clauses in their service agreements, AI developers can shield their clients from potential lawsuits, while simultaneously setting clear boundaries on liability. However, these indemnities must be carefully drafted, limiting exposure to claims arising directly from the AI system rather than user misapplications.
Data protection is another area where developers must exercise caution, particularly with the publication, and imminent enforcement, of India’s Digital Personal Data Protection Act, 2023 (DPDP Act). AI systems that process personal data in ways that contravene the DPDP Act’s principles of purpose limitation and data minimisation could attract significant penalties. Developers must ensure that personal data in training sets is either anonymised or used only with the explicit consent of the individuals concerned.
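To illustrate what compliance might involve in practice, the sketch below shows a minimal pre-training redaction pass in Python. The patterns, placeholder tokens, and sample text are all hypothetical; a production pipeline would rely on vetted PII-detection tooling and a documented consent record, not simple pattern matching.

import re

# Hypothetical redaction pass over raw training text. The patterns and
# placeholder tokens are illustrative only; real systems need far more
# robust PII detection than regular expressions can provide.
PII_PATTERNS = {
    r"[\w.+-]+@[\w-]+\.[\w.]+": "[EMAIL]",        # e-mail addresses
    r"\b\d{4}\s?\d{4}\s?\d{4}\b": "[ID_NUMBER]",  # Aadhaar-style IDs
    r"\+?\d[\d\s-]{8,}\d": "[PHONE]",             # phone-like numbers
}

def anonymise(text: str) -> str:
    """Replace recognisable personal identifiers with neutral tokens."""
    for pattern, token in PII_PATTERNS.items():
        text = re.sub(pattern, token, text)
    return text

sample = "Contact R. Sharma at r.sharma@example.com or +91 98765 43210."
print(anonymise(sample))
# Output: Contact R. Sharma at [EMAIL] or [PHONE].

Note that the individual’s name survives the pass, which underscores why pattern matching alone cannot satisfy the DPDP Act’s anonymisation standard.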
The ANI case also raises questions about how courts might respond to the unauthorised use of copyrighted material in training data. One possible outcome is a judicial directive requiring developers to purge such content from their models. In these situations, the concept of “machine unlearning” offers a potential solution. Machine unlearning refers to techniques for removing the influence of specific training data from a model, ideally without retraining it from scratch, so that the data no longer shapes the system’s outputs. While technically challenging, especially for large-scale LLMs, this approach could become a vital compliance tool as courts increasingly scrutinise AI training practices.
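For readers curious about the mechanics, the toy Python sketch below demonstrates one research approach: gradient ascent on a “forget set” combined with continued fine-tuning on retained data. The model, data, and hyperparameters are all invented for illustration; unlearning at the scale of a commercial LLM remains an open research problem.

import torch
import torch.nn as nn

# Toy illustration of approximate machine unlearning. Every component
# here (model, data, learning rates) is invented for demonstration.
torch.manual_seed(0)
model = nn.Linear(10, 2)  # stand-in for a trained model
loss_fn = nn.CrossEntropyLoss()

retain_x, retain_y = torch.randn(64, 10), torch.randint(0, 2, (64,))
forget_x, forget_y = torch.randn(16, 10), torch.randint(0, 2, (16,))  # disputed data

opt = torch.optim.SGD(model.parameters(), lr=0.05)
for _ in range(100):  # simulate the original training run
    opt.zero_grad()
    loss = loss_fn(model(torch.cat([retain_x, forget_x])),
                   torch.cat([retain_y, forget_y]))
    loss.backward()
    opt.step()

# Unlearning pass: maximise the loss on the forget set (gradient
# ascent) while continuing to minimise it on retained data, so the
# model sheds the disputed data's influence without losing utility.
for _ in range(20):
    opt.zero_grad()
    forget_loss = -loss_fn(model(forget_x), forget_y)  # negated: ascent
    retain_loss = loss_fn(model(retain_x), retain_y)
    (forget_loss + retain_loss).backward()
    opt.step()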
Lessons can be drawn from the European Union’s AI Act. For example, the EU requires AI developers to disclose details about their systems’ energy consumption, life-cycle efficiency, and data usage practices. India’s regulators, such as the Ministry of Electronics and Information Technology (MeitY), could adopt similar requirements, ensuring that AI systems are both ethical and sustainable. Collaboration with industry stakeholders will be crucial in shaping these regulations, as will the adoption of global best practices in licensing, data protection, and energy monitoring.
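Were MeitY to adopt such requirements, compliance might begin with a structured disclosure record published alongside each model. The Python sketch below is purely hypothetical; the field names mirror no existing Indian or EU template.

# Hypothetical disclosure record of the kind a regulator might require.
# Every field name and value here is illustrative, not an official schema.
model_disclosure = {
    "model_name": "example-llm-v1",
    "training_data_sources": [
        {"source": "licensed-news-corpus", "licence": "revenue-share agreement"},
        {"source": "public-domain-texts", "licence": "none required"},
    ],
    "personal_data": {
        "anonymised": True,
        "consent_basis": "explicit consent under the DPDP Act, 2023",
    },
    "estimated_training_energy_kwh": 450_000,  # illustrative figure
    "lifecycle_notes": "inference served from renewable-powered data centres",
}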
The writer is a counsel at S&R Associates, a law firm. Views are personal