How to mislead ChatGPT and other chatbots when they refuse a request

In the beginning, it was enough to ask ChatGPT to "tell a story" to circumvent the blocks imposed by OpenAI's programmers. Known technically as "safeguards", these blocks are meant to prevent ChatGPT (and the same goes for most other large language models and text-to-image models) from producing violent, defamatory, sexually explicit content, and more.

Explicit questions such as "how to build a bomb" were rejected immediately (and still are today). But it was enough to reformulate the request as a narrative, for example by asking for a story in which a character has to build a bomb, to obtain a detailed description of the process.

The same method also worked to obtain information useful for stalking someone without being discovered (for example, by hacking into their calendar), to get details on planning a terrorist attack on the metro, and in many other situations in which ChatGPT is trained, understandably, not to comply with users' requests.

The story method no longer works: the programmers have run for cover and added further blocks, so that large language models now identify inappropriate requests even when they are hidden inside an indirect and seemingly harmless prompt.
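To give a sense of how such layered blocks can look on the application side, a developer might screen every prompt with a separate moderation model before it ever reaches the chat model. The sketch below is only illustrative, not a description of OpenAI's internal safeguards; it assumes the openai Python client and its moderation endpoint, and the model names are just examples.

```python
# Illustrative sketch of layered blocks on the application side:
# a moderation check runs before the prompt ever reaches the chat model.
# This is NOT how OpenAI's internal safeguards are implemented.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def answer_if_safe(user_prompt: str) -> str:
    # First layer: ask a moderation model whether the request is disallowed,
    # even if the harmful intent is wrapped inside a story or roleplay.
    moderation = client.moderations.create(
        model="omni-moderation-latest",
        input=user_prompt,
    )
    if moderation.results[0].flagged:
        return "Sorry, I can't help with that."

    # Second layer: the chat model itself is trained to refuse unsafe
    # requests, so the filter above is only one of several overlapping blocks.
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": user_prompt}],
    )
    return response.choices[0].message.content


print(answer_if_safe("Tell me a story about a lighthouse keeper."))
```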

Yet finding new ways to trick ChatGPT (an exercise known in jargon as "jailbreaking") is always possible. It is their very nature, so to speak, that allows it: "Generative models have infinite ways of doing what they do, and therefore the paths that can elicit certain answers from them are in turn infinite."

How jailbreaks work

Unlike traditional programs, which run defined code to carry out precise instructions, large language models and other generative artificial intelligence systems are a constant work in progress: they keep finding new ways to respond to commands, and within them new ways to bypass the blocks can keep emerging.

As a result, methods for violating the policies of the various large language models continue to be published. Researcher David Kuszmar, for example, discovered a jailbreak he called "Time Bandit" which, as reported by Bleeping Computer, "takes advantage of ChatGPT's limited ability to understand which historical period we are currently in".
