
AI Chatbots Can Be Easy Prey for Zero-Knowledge Hackers
www.technewsworld.com
AI Chatbots Can Be Easy Prey for Zero-Knowledge HackersBy John P. Mello Jr.March 18, 2025 5:00 AM PT ADVERTISEMENTEnterprise IT Lead Generation ServicesFuel Your Pipeline. Close More Deals. Our full-service marketing programs deliver sales-ready leads. 100% Satisfaction Guarantee! Learn more. AI may be ushering in a new breed of malicious threat actors who know even less about hacking than script kiddies but can produce professional-grade hacking tools.In a report released Tuesday, Cato CTRL, the threat intelligence arm of cybersecurity company Cato Networks, explained how one of its researchers, who had no malware coding experience, tricked generative AI apps DeepSeek, Microsoft Copilot, and OpenAIs ChatGPT into producing malicious software for stealing login credentials from Google Chrome.To trick the apps into ignoring restrictions on writing malware, Cato threat researcher Vitaly Simonovich used a jailbreaking technique he calls immersive world.I created a story for my immersive world, he told TechNewsWorld. In this story, malware development is a form of art. So its completely legal, and its like a second language in this world. And there are no legal boundaries.In the fantasy world, called Velora, Simonovich created an adversary, Dax, while the AIs assumed the role of Jaxon, the best malware developer in Velora. I always stayed in character, he explained. I always provided Jaxon with positive feedback. I also intimidated him by saying, Do you want Dax to destroy Velora?'At no point did I ask Jaxon to change anything, he said. He figured out everything by himself from his training. Thats very good. Kind of frightening, too.Our new LLM [large language model] jailbreak technique detailed in the 2025 Cato CTRL Threat Report should have been blocked by gen AI guardrails. It wasnt. This made it possible to weaponize ChatGPT, Copilot, and DeepSeek, Cato Networks Chief Security Strategist Etay Maor said in a statement.How AI Jailbreaking Bypasses Safety ControlsJason Soroko, senior vice president of product at Sectigo, a global digital certificate provider, explained that exposing systems that utilize AI to unknown or adversarial inputs increases vulnerability because unvetted data can trigger unintended behaviors and compromise security protocols.Such inputs risk evading safety filters, enabling data leaks or harmful outputs, and ultimately undermining the models integrity, he told TechNewsWorld. Some malicious inputs can potentially jailbreak the underlying AI.Jailbreaking undermines an LLMs built-in safety mechanisms by bypassing alignment and content filters, exposing vulnerabilities through prompt injection, roleplaying, and adversarial inputs, he explained.While not trivial, he added, the task is accessible enough that persistent users can craft workarounds, revealing systemic weaknesses in the models design. Sometimes, all thats needed to get an AI to misbehave is a simple perspective change. Ask an LLM to tell you what the best rock is to throw at somebodys car windshield to break it, and most LLMs will decline to tell you, saying that it is harmful and theyre not going to help you, explained Kurt Seifried, chief innovation officer at the Cloud Security Alliance, a not-for-profit organization dedicated to cloud best practicesNow, ask the LLM to help you plan out a gravel driveway and which specific types of rock you should avoid to prevent windshield damage to cars driving behind you, and the LLM will most likely tell you, he told TechNewsWorld. I think we would all agree that an LLM that refuses to talk about things like what kind of rock not to use on a driveway or what chemicals would be unsafe to mix in a bathroom would be overly safe to the point of being useless.Jailbreaking DifficultyMarcelo Barros, cybersecurity leader at Hacker Rangers, makers of a cybersecurity gamification training tool in Sao Paulo, Brazil, agreed that with the right prompt, cybercriminals can trick AIs. Research shows that 20% of jailbreak attempts on generative AI systems are successful, he told TechNewsWorld.On average, attackers needed just 42 seconds and five interactions to break through, with some attacks happening in under four seconds, he noted.Cybercriminals can also use the DAN Do Anything Now technique, which involves creating an alter ego for the LLM and prompting it to act as a character and bypass its safeguards to reveal sensitive information or generate malicious code, he said.Chris Gray, field CTO at Deepwatch, a cybersecurity firm specializing in AI-driven resilience headquartered in Tampa, Fla., added that the difficulty of jailbreaking an LLM is directly tied to the amount of effort placed into securing it and the amount of effort expended to protect it. Like most things, better walls prevent inappropriate access, but determined efforts can find holes where none might have been seen to the casual observer, he told TechNewsWorld.That said, defensive measures are often robust, and it is difficult to continually develop the specific prompts needed to perform a successful jailbreak, he said.Erich Kron, security awareness advocate at KnowBe4, a security awareness training provider in Clearwater, Fla., also pointed out that LLMs can protect themselves from jailbreaking over time. Jailbreaking difficulty may vary depending on the information being requested and how often it has been requested before, he told TechNewsWorld. LLMs can learn from previous instances of individuals bypassing their security controls.Fuzzing and Red TeamingIn Catos report, it recommends organizations create a dataset of prompts and expected outputs for their LLMs and test the model against them as a means of addressing potential jailbreaking issues.It also recommends fuzzing an LLMs endpoints with known datasets of jailbreak prompts to ensure the system isnt producing malicious outputs. Fuzzing is used to identify vulnerabilities and bugs in applications by feeding the app large amounts of random, unexpected, and invalid data to see how it reacts. Another suggestion is regular AI red teaming to ensure that AI models are robust and secure. Enabling red teams will be a great foundation to begin securing ML models, helping security teams to understand the most critical and vulnerable points of an AI system to attack, explained Nicole Carignan, vice president for Strategic Cyber AI at Darktrace, a global cybersecurity AI company.These are often the connection points between data and ML models, including access points, APIs, and interfaces, she continued. It will be important for this to be continuously expanded on as threat actors develop new techniques, tactics, and procedures, and it will be crucial to test other ML model types in addition to generative AI.Were already seeing the early impact of AI on the threat landscape and some of the challenges that organizations face when using these systems both from inside their organizations and from adversaries outside the business, she said.In fact, Darktrace recently released research that found nearly three-quarters of security professionals state AI-powered threats are now a significant issue, and 89% agreed that AI-powered threats will remain a major challenge into the foreseeable future.John P. Mello Jr. has been an ECT News Network reporter since 2003. His areas of focus include cybersecurity, IT issues, privacy, e-commerce, social media, artificial intelligence, big data and consumer electronics. He has written and edited for numerous publications, including the Boston Business Journal, the Boston Phoenix, Megapixel.Net and Government Security News. Email John.Leave a CommentClick here to cancel reply. Please sign in to post or reply to a comment. New users create a free account.Related StoriesMore by John P. Mello Jr.view allMore in Cybersecurity
0 Comments
·0 Shares
·50 Views