@niplav Ok I imagine adversarial selection pressure means that you're just telling the AI to not do X, and the more you punish it, the worse the problem gets, but I don't know why that would happen.
@Paradox you're selecting both against the behavior *and* against your ability to detect/effectively deinforce the behavior
Similar to antibiotic-resistant bacteria, where we select against bacteria and against our ability to defeat bacteria
@niplav Ohhh yeah.
Cuz when you use antibiotics too much, the bacteria that survive are the ones that can resist them. That's kinda just evolution. If you don't get rid of a thing entirely, whatever's left eventually becomes harder to get rid of. So what's the alternative here?
@Paradox good Q! Best option I know is to kill it on the first try
Otherwise 🤷
@Paradox this one ^ i endorse far more
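A rough toy sketch of the dynamic discussed above, not anything from the thread itself: every name and number here (`simulate`, `stealth`, the population size, the mutation scale) is made up for illustration. Each individual either does X or doesn't, and has a `stealth` value, assumed to be the chance that doing X goes undetected. Only detected X gets removed, and survivors are copied with a little mutation.

```python
import random

def simulate(generations=10, pop_size=500, seed=0):
    """Toy model: punish only *detected* X and track what is left."""
    rng = random.Random(seed)
    # Each individual is (does_x, stealth); stealth is a hypothetical
    # probability that doing X escapes detection.
    pop = [(rng.random() < 0.5, rng.random()) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        doers = [s for does_x, s in pop if does_x]
        frac_x = len(doers) / len(pop)
        mean_stealth = sum(doers) / len(doers) if doers else float("nan")
        history.append((frac_x, mean_stealth))
        # Selection step: non-doers always survive; doers survive only
        # when they evade detection (with probability `stealth`).
        survivors = [(d, s) for d, s in pop if not d or rng.random() < s]
        # Refill the population from survivors, with small mutation.
        pop = []
        for _ in range(pop_size):
            d, s = rng.choice(survivors)
            s = min(1.0, max(0.0, s + rng.gauss(0, 0.02)))
            pop.append((d, s))
    return history

history = simulate()
for gen, (frac_x, mean_stealth) in enumerate(history):
    print(f"gen {gen}: doing X = {frac_x:.2f}, mean stealth = {mean_stealth:.2f}")
```

Under these assumptions the share doing X should fall while the ones still doing it get harder to catch, which is the "selecting against your ability to detect" half of the point. It says nothing about how real training behaves.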