i think bags are fascinating. they've got so much potential as an efficient data structure.
a bag is a list where you "forget" the original order of elements. and the simplest way to forget is to just sort the list.
and just that -- sorting the list -- could already result in performance gains, thanks to improved branch prediction. see this question: https://stackoverflow.com/questions/11227809/why-is-processing-a-sorted-array-faster-than-processing-an-unsorted-array
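here's a tiny sketch of "forgetting by sorting" (the function name is mine): two lists represent the same bag exactly when they agree after normalizing with sort.

```haskell
import Data.List (sort)

-- two lists represent the same bag iff they're equal once we forget order,
-- i.e. once both are normalized by sorting
sameBag :: Ord a => [a] -> [a] -> Bool
sameBag xs ys = sort xs == sort ys
```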
but i think that's just the tip of the iceberg. one thing you can do with bags is take advantage of the isomorphism:
bag (a + b) ≅ bag a × bag b
meaning that if you have a bag of "a or b", it's equivalent to having one bag of a, and one bag of b.
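concretely, with plain lists standing in for bags, the two directions of the isomorphism are basically partitionEithers and a merge (a hedged sketch; merge . split only gives back the original list up to reordering, but that's exactly the information a bag forgets anyway):

```haskell
import Data.Either (partitionEithers)

-- bag (a + b) -> bag a × bag b
split :: [Either a b] -> ([a], [b])
split = partitionEithers

-- bag a × bag b -> bag (a + b)
merge :: ([a], [b]) -> [Either a b]
merge (as, bs) = map Left as ++ map Right bs
```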
you can use this to optimize the storage of sum types in bags. instead of storing one tag per element, you store one sub-bag per tag (for each tag that actually appears in the overall bag). or you go further and factor out whole prefixes. ultimately bags are just finite tries*, with naturals at the leaves (because bag unit ≅ nat).
(*-not an infinite "lazy trie", which should be called a cotrie, imo)
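here's a minimal sketch of that trie-style representation (the class and all the names are mine, not a standard library): the instance for () stores only a count (bag () ≅ nat), and the instance for Either stores one sub-bag per tag (bag (a + b) ≅ bag a × bag b).

```haskell
{-# LANGUAGE TypeFamilies #-}

class HasBag a where
  data Bag a
  emptyBag  :: Bag a
  insertBag :: a -> Bag a -> Bag a
  bagToList :: Bag a -> [a]      -- reconstruct the elements (in some arbitrary order)

-- bag () ≅ nat: all we need to remember is how many units are in the bag
instance HasBag () where
  newtype Bag () = UnitBag Int
  emptyBag = UnitBag 0
  insertBag () (UnitBag n) = UnitBag (n + 1)
  bagToList (UnitBag n) = replicate n ()

-- bag (a + b) ≅ bag a × bag b: one sub-bag per tag instead of one tag per element
instance (HasBag a, HasBag b) => HasBag (Either a b) where
  data Bag (Either a b) = SumBag (Bag a) (Bag b)
  emptyBag = SumBag emptyBag emptyBag
  insertBag (Left x)  (SumBag as bs) = SumBag (insertBag x as) bs
  insertBag (Right y) (SumBag as bs) = SumBag as (insertBag y bs)
  bagToList (SumBag as bs) = map Left (bagToList as) ++ map Right (bagToList bs)
```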
reconstructing the original elements from the trie representation is certainly possible. but we can go one step further and specialize our algorithms to this trie representation. e.g. if we're processing all the elements of a bag over a sum type, and the first thing we do is pattern match on the value, well, we can probably eliminate the pattern match by just projecting to the correct sub-bag of elements, via this trie representation.
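for example (a sketch, with the sub-bags as plain lists and the names mine): doubling every Left in a flat list branches on every element, while doing it on the split representation is just a projection onto the first sub-bag.

```haskell
-- bag (Int + b), stored as one sub-bag per tag
data SumBag b = SumBag [Int] [b]

-- flat representation: one pattern match per element
doubleLefts :: [Either Int b] -> [Either Int b]
doubleLefts = map (\e -> case e of
                           Left x  -> Left (x * 2)
                           Right y -> Right y)

-- split representation: no per-element branching, just project and map
doubleLefts' :: SumBag b -> SumBag b
doubleLefts' (SumBag ls rs) = SumBag (map (* 2) ls) rs
```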
i wanna be optimistic and say that ghc can probably do these optimizations, but i haven't tested it.
(though to some extent, the branches we can eliminate here by doing this are also the branches that the CPU would be able to predict if you process everything after sorting. but you're still gonna get some gains by eliminating the pattern match altogether.)
bonus bag fact #1:
"bag" is the question mark exponential in linear logic,
?x
which for the producer means "i get to produce x as many times as i want"
and for the consumer means "i need to consume as many x as i'm given (i don't get to choose how many that is)".
i say this because the operations on ?x are effectively those of a commutative monoid, together with a function x -> ?x ... and the "canonical" example of that structure is the free commutative monoid on x, which is bag x.
(note that linear logic doesn't by itself guarantee that ?x is the "free" commutative monoid on x. the exponential rules don't guarantee uniqueness, so you can have multiple non-equivalent ? exponentials. bag x is just the canonical choice, imo, where that interpretation makes sense.)
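here's a small sketch of that reading (FreeCMonoid and the function names are mine, and this is just the canonical bag-x instance, not something linear logic forces on you): the producer side is the commutative-monoid structure plus the unit map x -> ?x, and the consumer side is a fold that handles however many copies it's given.

```haskell
import qualified Data.Map.Strict as M

-- bag x as the free commutative monoid on x, stored as element -> multiplicity
newtype FreeCMonoid a = FreeCMonoid (M.Map a Int)

-- the commutative monoid structure: "produce nothing" and "produce both"
instance Ord a => Semigroup (FreeCMonoid a) where
  FreeCMonoid m <> FreeCMonoid n = FreeCMonoid (M.unionWith (+) m n)

instance Ord a => Monoid (FreeCMonoid a) where
  mempty = FreeCMonoid M.empty

-- the unit map x -> ?x: produce a single copy
single :: a -> FreeCMonoid a
single x = FreeCMonoid (M.singleton x 1)

-- the consumer side: consume as many x as the producer made,
-- by folding the whole bag into some monoid of results
consume :: Monoid m => (a -> m) -> FreeCMonoid a -> m
consume f (FreeCMonoid m) =
  M.foldMapWithKey (\x n -> mconcat (replicate n (f x))) m
```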