Unlike previous AI models from OpenAI, such as GPT-4o, the company trained o1 specifically to work through a step-by-step problem-solving process before generating an answer. When users ask an “o1” model a question in ChatGPT, users have the option of seeing this chain-of-thought process written out in the ChatGPT interface. However, by design, OpenAI hides the raw chain of thought from users, instead presenting a filtered interpretation created by a second AI model. Nothing is more enticing to enthusiasts than information obscured, so the race has been on among hackers and red-teamers to try to uncover o1’s raw chain of thought using jailbreaking or prompt injection techniques that attempt to trick the model into spilling its secrets.
Read more of this story at Slashdot.