Pitfalls of AI-generated content

Though experiments and projects in Generative AI have been under way across the world, it is OpenAI’s ChatGPT that has shown the world the power of large language models (LLMs), which can churn out human-like answers.

It opened the floodgates, with Google announcing a limited release of Bard and a host of Generative AI bots appearing, such as Perplexity.ai, ChatPDF, AgentGPT, YouChat, and ChatSonic. The list is getting longer, with developers trying out newer features and introducing dedicated sector-specific platforms.

There are a host of other models that churn out art pieces, make videos and PowerPoint presentations, and create unique pictures and music, among other things.

The natural language processing models are ‘trained’ on books or other content that is fed into them. The quality of the output depends on the quality of the content fed into them and on what they learn from each query you ask.

Just as some portals compare the prices of electronic goods across e-commerce platforms, there are portals such as Nat.Dev that let you compare the answers from at least three different platforms to the same query. You can choose the best of them.

Stirring a debate

The enormous volume of content generated, which is virtually overwhelming users and disrupting a wide variety of sectors, has stirred a worldwide debate on who owns the content generated by the machines, whether the content generated by Generative AI is authentic, and what will happen to the copyrights of authors whose books are being fed into the machines.

While students are using it extensively for their theses and project work, IT developers are deploying it to write code, and companies are using it to draft business and marketing strategies. Though its use cases are many and varied, the issues around content have triggered a global debate.

Those who are using the ‘output’ (content generated by Generative AI) need to be careful. If you use the output as it is, you may be caught. If an AI model can generate content for you, another AI model can tell whether it was written by a human or a machine, for machines leave a pattern or a fingerprint in the output. The software tools used to check for plagiarism are also being equipped with AI, which can spot machine-produced content in no time.

Also, remember that another person in some remote part of the world might have asked the same question, and the machine might have generated the same or a very similar answer.

One of the two could get caught for plagiarism.

Another big challenge with AI-generated content is that you will never know whether it is quoting verbatim from one or more books while generating the output. In such cases, you would be using someone else’s work, exposing yourself to copyright violations. You can imagine what would happen if it used content from books that were full of assumptions and falsehoods.

AI models are also posing a serious challenge to original authors and research organisations, whose books and content are freely used for training these AI models.

Another important problem with the ‘output’ is that there is scope for bias, as the output is completely dependent on the content the model was trained on. The AI models, which also constantly learn from the queries (which include sensitive content) that users ask, might accidentally surface sensitive information, violating the IP rights of others.

When ChatGPT was asked whether one can use the output as one’s own, it insisted that one needs to “attribute the source of the information to OpenAI”.

The same holds if one submits one’s own text for correction. “If I were to correct your text, it would still be important to attribute the source of the corrections to OpenAI. It also respects the intellectual property of the creators of the AI model,” it asserts.

Conflicting stand

ChatGPT’s views on the use of content look conflicting. On one hand, it says the text generated by ChatGPT is not subject to traditional IP protections because it is not the original work of any human author. On the other, it cautions that you could be subject to IP infringement claims by the owners of that material (used by ChatGPT for the generation of content).

For now, you may not face any challenges from OpenAI with regard to IP. “We assign to you all its right, title, and interest in and to output. You can use the content for any purpose, including for publication,” it says.

It, however, comes with riders of ‘comply with terms’ and ‘applicable law’. You must carefully read the terms and conditions set out for the AI models that you use to create content.

These issues should concern national governments and academic institutions. Policies governing copyright issues around AI-generated content might evolve sooner or later.

While the US Copyright Office recently kicked off an initiative to discuss the copyright law and policy issues raised by AI technologies, the European Data Protection Board (EDPB), a European Union body, set up a task force to consider ‘possible’ enforcement actions against the use of ChatGPT. Italy has announced restrictions on its use. Sooner or later, other countries might come out with their own rules on the use of AI-generated content.

Until national policies evolve, one should be very cautious about using the content.

Mushtaq Bilal, a social media influencer who is introducing several AI-based content tools to research students, has rightly asked users not to outsource their thinking to ChatGPT.

You can, instead, outsource your labour, asking it to handle tasks such as suggesting structures and reviewing drafts, which will reduce the drudgery in writing.
