TL;DR: Researchers at Anthropic, the UK AI Security Institute and the Alan Turing Institute discovered that just 250 malicious documents are enough to slip a “backdoor” into any large language model—no matter its size or how much data it’s been trained on. Even a tiny 600M-parameter model is just as vulnerable as a huge 13B-parameter one once those poisoned samples are in play.
On a lighter note, there’s a Diwali special going on: snag 20% off all live AI courses with coupon code AI20 at Krishnaik’s site. Check out the live classes here: https://www.krishnaik.in/liveclasses or https://www.krishnaik.in/liveclass2/ultimate-rag-bootcamp?id=7, and hit up +91 91115 33440 or +91 84848 37781 if you’ve got questions!
Watch on YouTube
Top comments (0)