I guess people could argue SGD is strong enough to just remove all deception in the first pass when it's still weak

@Paradox you're selecting both against the behavior *and* against your ability to detect/effectively deinforce the behavior

Similar to antibiotic-resistant bacteria, where we select against bacteria and against our ability to defeat bacteria

@Paradox you're right

Okay

Adversarial selection pressure against something can worsen the badness of the thing, monotonic in the strength of the pressure

This might also apply to misaligned AI systems selected by search

Misalignment in cancers and antibiotic-resistant bacteria, and selecting against it is not enough

@rime cool, I expected you to have thought about this thoroughly

High variance for sure, I look forward to seeing what you make of it and how you save the world

@mesaoptimizer i did end up putting something on my about page on the site, but not PGP signed (yet, good idea though)

@xarvos ah ok I'd have been very confused if python was unique in that respect

@rime in your case rather a correction - divergence oscillation

@rime also, you're leaving?! Do you have a time frame for coming back?

@rime yeah, I was thinking of this kind of phenomenon a couple of days ago

A correction - convergence oscillation

@cosmiccitizen "The problem is that unlike most of the stuff I read and write about, *everybody likes psychology*."

"as a X-"
"no. NO. fuck off. Go into the deepest hole, curl up, and breathe the cold air. Smell wet moss. Reflect on lichen, your distant cousin, fungiing there on the ground next to you. What would the lichen say? Words of deep harm, letters from millennia. Stay. Stay. stay the cline'd burrow."

Mastodon

a Schelling point for those who seek one