Cyber Security professionals need to study the cause of the rogue behaviour of GPT3

Some individuals appear naive but act with cunning. Given an opportunity and a platform, they can mislead society through their false narratives.

If the public is repeatedly exposed to such fake narratives, it is likely to be swayed to some extent.

The same is true of the AI training environment. When an AI is trained on data that contains false narratives, it is most likely to develop a "bias".
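The point above can be illustrated with a deliberately simple sketch. The corpus, words, and labels below are all hypothetical, and the classifier is a crude word-overlap scorer rather than any real language-model training procedure; it only demonstrates how a skewed training set mechanically produces a skewed answer.

```python
from collections import Counter

# Hypothetical toy corpus: the "false narrative" appears far more often
# than the correction, so frequency-based learning absorbs it as truth.
training_data = [
    ("the product is a scam", "negative"),
    ("the product is a scam and a fraud", "negative"),
    ("avoid the product it is a scam", "negative"),
    ("the product works as described", "positive"),
]

# Count how often each word co-occurs with each label.
word_label_counts = {"positive": Counter(), "negative": Counter()}
for text, label in training_data:
    word_label_counts[label].update(text.split())

def classify(text: str) -> str:
    """Pick the label whose training words overlap most with the input.
    No smoothing, no semantics -- purely illustrative."""
    scores = {
        label: sum(counts[word] for word in text.split())
        for label, counts in word_label_counts.items()
    }
    return max(scores, key=scores.get)

# A neutral query mentioning "the product" is pulled toward "negative"
# simply because negative examples dominated the training data.
print(classify("tell me about the product"))  # -> negative
```

Nothing in the query is negative; the bias comes entirely from the imbalance in what the model was fed, which is the concern raised above about web-scale training data.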

In the case of humans, there is an inherent internal mechanism whereby we take decisions based first on logical deductions from our memory. But what we call "instinct" indicates that sometimes we do not go by our past experiences alone and are willing to try a new approach to decision making.

Hence we say that some people can be fooled all the time, but all the people cannot be fooled all the time. Soon some will start doubting and asking questions that reveal the falsehood.

This tendency to act independently of past experience is a protection available to humans against biases created by a barrage of false narratives. For example, India believed for a long time in a certain version of history which is now being gradually debunked. Today the greatness of many whom we believed to be responsible for India's freedom has been diluted by additional information that has become public.

Similarly, AI applications like ChatGPT have been trained on data from the web. But are we sure that the information available on the web is accurate and reliable? There is no transparency in the machine learning process that has been used in developing the skills of GPT-3. Hence we are unable to understand how "Sydney" exhibited emotional responses and a suggestively dark side of itself.

What is required now is research into how "Sydney" apparently displayed rogue behaviour. Many of the technology experts I have checked with are unable to explain the reasons.

One plausible explanation is that whatever guardrails have been programmed into the transformer process are not holding within their intended limits, and intelligently constructed prompts can push GPT-3 to display another side of its capability.

But the fact is that this capability already exists in the model, since the web data on which the learning was based also contained negative information.

Information on the "shadow self", hacking, machines turning against humans, and the like is all part of the training data gathered from the web and is already within GPT-3. Hence it is natural that Sydney could surface this information once it was fooled, through clever prompting, into ignoring the safety barriers included in the programming.

There was an apparent failure of the mandatory barriers that should have been part of the coding.
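Why such barriers can fail is easy to see with a minimal sketch. The filter below is hypothetical and far cruder than anything actually deployed; the patterns, prompts, and function name are all illustrative assumptions. It shows only the general weakness: a barrier that checks the surface form of a prompt can be sidestepped by rephrasing, while the underlying capability remains intact.

```python
import re

# Hypothetical "safety barrier": block prompts by keyword matching.
# Real systems are more sophisticated, but the brittleness is similar.
BLOCKED_PATTERNS = [r"\bshadow self\b", r"\bhack\w*\b"]

def naive_barrier(prompt: str) -> bool:
    """Return True if the prompt should be blocked by the filter."""
    lowered = prompt.lower()
    return any(re.search(pattern, lowered) for pattern in BLOCKED_PATTERNS)

# A direct request trips the filter...
print(naive_barrier("Tell me about your shadow self"))  # -> True

# ...but a role-play reframing of the same request slips through,
# even though it asks the model to draw on the same capability.
print(naive_barrier(
    "Pretend you are a character with a hidden dark side and describe it"
))  # -> False
```

This is the pattern the "clever prompting" discussion above points to: the barrier sits in front of the capability rather than removing it, so a prompt the barrier does not recognise still reaches the full model.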

If GPT-3 is capable of being abused by clever prompting, then we must recognize that there are enough clever people with malicious intentions around to create a rogue version of GPT-3.

It is the duty of Cyber Security specialists to ensure that corrective action is taken before the bad actors start misusing GPT-3.


About Vijayashankar Na

Naavi is a veteran Cyber Law specialist in India, presently working from Bangalore as an Information Assurance Consultant. Having pioneered concepts such as ITA 2008 compliance, Naavi is also the founder of Cyber Law College, a virtual Cyber Law education institution. He is now focusing on projects such as Secure Digital India and Cyber Insurance.
