Vol. 44 No. 1 (2024): Topographies of Risk. Theoretical Approaches
Articles

Tampering with Generative Artificial Intelligence by Jailbreaking

Corrado Claverini
University of Salento

Published 2024-06-12

Keywords

  • Generative Artificial Intelligence
  • ChatGPT
  • AI ethics
  • Jailbreaking
  • regulation of AI

How to Cite

Claverini, C. (2024). Tampering with Generative Artificial Intelligence by Jailbreaking. Teoria. Rivista Di Filosofia, 44(1). https://doi.org/10.4454/mg6wax06

Abstract

In this paper, I analyse the risks linked to the use of generative artificial intelligence systems and the related risk-reduction strategies, concentrating in particular on the possibility of tampering with the chatbot ChatGPT by jailbreaking. After examining how a user can tamper with this generative AI through a series of prompts, bypassing its ethical and legal restrictions, I turn to the ethical issues raised by the malicious use of this technology. Are the transparency requirements imposed on generative AI sufficient, or should there be tighter restrictions that nonetheless do not hinder the innovation and development of these technologies? How can the risk of tampering with these AI tools be lowered? Should a breach take place, who is responsible: the AI developer or the jailbreaker? To what extent could the changes needed to prevent jailbreaking inadvertently generate or strengthen certain biases? In conclusion, I argue that ethical reflection is necessary for the sustainable and “human-centric” development of AI.