Hacker News

I'm sure if someone inspected the logs of what I've written to various LLMs they'd think they can extrapolate all sorts of personal characteristics about me, but I'm also a person who plays around with things, tries to find limits and whatever...

If you looked at my LLM interaction logs you would probably assume that I have an unhealthy obsession with pirates and a napalm fetish.

In reality, I use the "can I get it to tell me how to make napalm" thing as a quick "acid test" of the extent and strength of censorship controls, and I simply find asking LLMs to "talk like a pirate" amusing. I've also found occasions where doing nothing more than instructing the LLM to talk like a pirate will bypass its built-in inhibitions against things like giving instructions for making napalm.



Now explain that to the police. And to the court.



