TechCrunch Minute: How Anthropic found a trick to get AI to give you answers it's not supposed to

Alex Wilhelm

Updated 5 April 2024 at 0:29 pm·1-min read

Video: https://www.youtube.com/watch?v=ZaUlJWL9Sds

If you build it, people will try to break it. Sometimes even the people building stuff are the ones breaking it. Such is the case with Anthropic and its latest research, which demonstrates an interesting vulnerability in current LLM technology. More or less, if you keep at a question, you can break guardrails and wind up with large language models telling you stuff that they are designed not to, like how to build a bomb.

Of course, given progress in open-source AI technology, you can spin up your own LLM locally and just ask it whatever you want, but for more consumer-grade stuff this is an issue worth pondering. What's fun about AI today is how quickly it is advancing, and how well, or not, we're doing as a species at understanding what we're building.

If you'll allow me the thought, I wonder if we're going to see more questions and issues of the type that Anthropic outlines as LLMs and other new AI model types get smarter and larger. Which is perhaps repeating myself. But the closer we get to more generalized artificial intelligence, the more it should resemble a thinking entity, and not a computer that we can program, right? If so, we might have a harder time nailing down edge cases, to the point where that work becomes infeasible. Anyway, let's talk about what Anthropic recently shared.
