A purely functional definition of algorithm similarity (the number of identical outputs on the same inputs) ignores a "continuity"-ish consideration: if A₁ gives the same answer as A₂ on many inputs, but on slightly perturbed inputs the two give radically different outputs, I'd call those two algorithms very dissimilar.
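
A minimal sketch of the distinction, assuming you can sample inputs and a perturbation function (all names here are illustrative, not from the thread):

```python
def agreement(a1, a2, inputs):
    """Fraction of inputs on which the two algorithms produce the same output."""
    inputs = list(inputs)
    return sum(a1(x) == a2(x) for x in inputs) / len(inputs)

def perturbed_agreement(a1, a2, inputs, perturb):
    """Same score, but measured on slightly perturbed copies of the inputs.
    A pair can score 1.0 on agreement() and near 0 here, a difference the
    purely functional metric never sees."""
    return agreement(a1, a2, [perturb(x) for x in list(inputs)])

# Example: agree on every integer, diverge under a small shift.
a1 = lambda x: round(x)
a2 = lambda x: int(x)
print(agreement(a1, a2, range(10)))                               # 1.0
print(perturbed_agreement(a1, a2, range(10), lambda x: x + 0.6))  # 0.0
```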

@niplav In cases like evolutionary programming, I like to preserve diversity by tracking how correlated the algorithms are with each other, relative to their size.

You can always improve a solution by memorizing one more example, but that degrades an "uncorrelated new correct examples per bit of complexity" metric, because memorization is an inefficient use of those bits.

But two algorithms that are each correct on 50% of inputs yet perfectly anti-correlated? You want that pair in a population, because the next step is combining them at very low cost.
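
A sketch of what tracking that might look like; the correctness-vector framing and the choice of Pearson correlation are my guess at the idea, not code from the thread:

```python
import numpy as np

def correctness(algo, inputs, targets):
    """1 where the algorithm answers correctly, 0 where it does not."""
    return np.array([float(algo(x) == y) for x, y in zip(inputs, targets)])

def correctness_correlation(a1, a2, inputs, targets):
    """Pearson correlation of the two correctness vectors.
    Near +1: a redundant pair. Near -1: complementary errors; two algorithms
    each right on 50% of cases but perfectly anti-correlated jointly cover
    100% once a cheap per-input selector picks between them."""
    v1 = correctness(a1, inputs, targets)
    v2 = correctness(a2, inputs, targets)
    return float(np.corrcoef(v1, v2)[0, 1])
```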

@niplav For floats or bitsets, you could also adjust the correlation at one input (between the two algorithms) based on the correlations at nearby inputs: a weighted average or something.
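
One way to read that, for scalar inputs, is a kernel-weighted average of the per-input scores; the Gaussian weighting here is just one arbitrary choice for illustration:

```python
import numpy as np

def smoothed_scores(inputs, scores, bandwidth=1.0):
    """Replace each per-input agreement/correlation score with a distance-weighted
    average of the scores at nearby inputs (Gaussian weights), so a disagreement
    at one point also lowers the score of its neighbourhood."""
    inputs = np.asarray(inputs, dtype=float)
    scores = np.asarray(scores, dtype=float)
    out = np.empty_like(scores)
    for i, x in enumerate(inputs):
        weights = np.exp(-((inputs - x) ** 2) / (2.0 * bandwidth ** 2))
        out[i] = np.average(scores, weights=weights)
    return out
```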

I thought about comparing execution traces, but with genetic programming you can have the same solution implemented as two gigantic piles of mud that look nothing alike and share nothing in their intermediate states. If your algorithms are naturally more similar, though, maybe that doesn't matter.

@niplav On computing paradigms: this doesn't really answer your question, but do look at the Binary Lambda Calculus if you're looking to standardize on one for some "computer-sciencey" purpose.

It's a good size metric for comparing different programs: if somebody asked "What is the Kolmogorov complexity of the universe?", a code-golfed Standard Model in BLC sounds like a great answer.
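
For a sense of how BLC measures size, here is a small sketch of the bit length of the standard encoding over de Bruijn terms (the tuple representation of terms is my own):

```python
def blc_bits(term):
    """Bit length of the standard Binary Lambda Calculus encoding:
    abstraction -> '00' + body, application -> '01' + function + argument,
    de Bruijn variable with 1-based index n -> n ones followed by a zero."""
    tag = term[0]
    if tag == "var":   # ("var", n)
        return term[1] + 1
    if tag == "lam":   # ("lam", body)
        return 2 + blc_bits(term[1])
    if tag == "app":   # ("app", fn, arg)
        return 2 + blc_bits(term[1]) + blc_bits(term[2])
    raise ValueError(f"unknown term: {term!r}")

# The identity function λx.x encodes as 0010, i.e. 4 bits.
assert blc_bits(("lam", ("var", 1))) == 4
```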

Just a note that it exists if you hadn't seen it.

@mira Thanks! These seem interesting but not quite what I was asking for :-) (though you probably weren't optimising for that anyway)
