Anthropic unveils new framework to block harmful content from AI models
www.infoworld.com
Anthropic has showcased a new security framework, designed to reduce the risk of harmful content generated by its large language models (LLM), a move that could have far-reaching implications for enterprise tech companies.Large language modelsundergo extensive safety training to prevent harmful outputs but remain vulnerable to jailbreaks inputs designed to bypass safety guardrails and elicit harmful responses, Anthropic said in a statement.
0 Σχόλια ·0 Μοιράστηκε ·93 Views