This isn’t an external directive; Anthropic was founded with the mission of creating safe, reliable AI systems. The same people wouldn’t keep working there if the company didn’t stand by its acceptable use policy and other internal standards.
I'm saying that a system's capability to reason about novel situations is in tension with guaranteeing it never produces harmful outputs. These are contradictory design constraints.