Algebraic Types are not Scary (blog.aiono.dev)

37 points by Bogdanp 2 days ago | 13 comments

ekidd 3 hours ago [-]

"Sum types" (aka "Rust enums") are something that I've really come to love for representing the idea "There are 3 possible cases here, here's the data we need in each case, and please remember that these cases are strictly mutually exclusive."

Making this super easy and short to express makes a lot of designs clearer.

There's also an interesting tradeoff here between sum types and abstract interfaces:

- Abstract interfaces: "We have an unknown number of cases/subclasses, but they each support this fixed set of methods." The set of implementations is open, but the set of operations is harder to update.

- Sum types (aka Rust enums): "We know all the cases, but there may be many different methods." The set of cases is harder to extend, but the set of methods manipulating them is completely open.

A good example of the latter pattern is actually LLVM! LLVM's intermediate representation is essentially a small assembly language with a fixed set of operations (and an escape hatch). But because the set of basic instructions is small and known, it's possible to write dozens and dozens of optimization passes that each manipulate that small set of instructions.

And a surprising number of programs work well with an LLVM-like structure.

So in a language like Rust, which supports both abstract interfaces (via traits) and sum types (via enums), the tradeoff is whether you think you have many different types of values that you'll manipulate in a limited number of ways, or a small number of types of values that you'll manipulate in many different ways.
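A minimal sketch of this tradeoff in Haskell, which the thread uses for its other snippets. All names here (`Expr`, `eval`, `simplify`, `Describable`) are hypothetical and chosen only for illustration. The sum type has a closed set of cases, so new "passes" can be added freely; the type class has a fixed operation, so new implementations can be added freely.

```haskell
-- Sum type: the set of cases is closed and known up front.
data Expr
  = Lit Int
  | Add Expr Expr
  | Mul Expr Expr
  deriving (Eq, Show)

-- One "pass" over the fixed set of cases: evaluation.
eval :: Expr -> Int
eval (Lit n)   = n
eval (Add a b) = eval a + eval b
eval (Mul a b) = eval a * eval b

-- A second, independent pass added without touching the type:
-- drop additions of 0 and multiplications by 1.
simplify :: Expr -> Expr
simplify (Add a b) =
  case (simplify a, simplify b) of
    (Lit 0, b') -> b'
    (a', Lit 0) -> a'
    (a', b')    -> Add a' b'
simplify (Mul a b) =
  case (simplify a, simplify b) of
    (Lit 1, b') -> b'
    (a', Lit 1) -> a'
    (a', b')    -> Mul a' b'
simplify e = e

-- Abstract interface: one fixed operation, an open set of implementations.
class Describable a where
  describe :: a -> String

instance Describable Expr where
  describe = show

instance Describable Bool where
  describe True  = "yes"
  describe False = "no"
```

Adding a fourth constructor to `Expr` would force every pass to be revisited, while adding a method to `Describable` would force every instance to be revisited; that asymmetry is the tradeoff described above.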

(Oh, and the metaphor of "algebraic types" goes deeper than you might think; there are some super interesting data structures based on taking the "derivatives" of simple algebraic types.)

GhosT078 55 minutes ago [-]

This tradeoff sounds similar to the choice to use "tagged types" versus "variant records" in Ada. Ada has provided variant records since 1983 and tagged types since 1995 (and both with a very nice syntax).

wk_end 6 minutes ago [-]

According to this [0] tutorial, variant records in Ada have the downside that you only get an error at runtime if you use them incorrectly, i.e. access data from one branch of the sum type when the actual value is on another. That's a pretty huge drawback.

[0] https://learn.adacore.com/courses/intro-to-ada/chapters/more...

phkahler 2 hours ago [-]

Sum type sounds like the old Variant from VB.

We use something similar in Solvespace. Where one might have used polymorphism, there is one class with a type enum and a bunch of members that can be used for different things depending on the type. Definitely some disadvantages (what does valA hold in this type?) but also some advantages from the consistency.

ulrikrasmussen 2 hours ago [-]

The lack of sum types should really be considered a weird omission rather than a strange fancy language feature. I suspect that their omission from popular languages is mostly due to the historical belief that object-orientation was the way forward and that "everything is an object", leading many programmers to represent things that are decidedly not objects using abstractions designed for objects. An object is a stateful thing which exists over a period of time (sometimes unbounded) and whose identity is characterized by its observable behavior. Abstractions for objects consist of interfaces which expose only a slice of the observable behavior to consumers. On the other hand, a value is really just that; a value. It does not have state, it does not start or stop existing, and its identity is defined only by its attributes. You can use objects to model values (poorly), but you often end up doing a lot of extra work to stop your "fake value" objects from behaving like objects, e.g. by being careful to make all fields immutable and implementing proper deep equality. Abstractions for values are not naturally expressed using interfaces because interfaces force the implementation to be tied to the instance of an object, but since values do not have a lifetime this restriction is a huge disadvantage and often leads to clunky and inflexible abstractions. For example, consider the Comparable interface in Java which tells you that an object is modeling a value which can be compared to other values of the same type. It would be awfully nice if List<A> could implement this interface, but it cannot because doing so will mean that you can only create lists of things (of some type A) which have a total order defined on them.

However, if you consider programming to not only be about expressing what you can do to stateful objects but also expressing values and their operations, then algebraic data types and traits/modules/type classes become the natural basic vocabulary as opposed to classes and interfaces. When dealing with first-order algebraic data types, products (i.e. records) and coproducts (i.e. sum types) are unavoidable. A list is one of the simplest algebraic data types which use both products and coproducts:

data List a = Nil | Cons a (List a)

One trait of lists is that lists of totally ordered things are themselves totally ordered using lexicographic ordering, and in Haskell this is expressed as a type class instance:

instance Ord a => Ord (List a) where ...

Crucially, the type class is removed from the definition of what a list is, which enables us to still talk about lists of e.g. functions which do not have an order on them. These lists do not have the Ord trait, but they are still lists.
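For illustration, the elided instance body could be filled in roughly as follows. This is a sketch, not necessarily what the commenter had in mind; the `Eq` instance is needed because `Ord` requires it, and the helpers `fromList`, `lengthL` and `functions` are hypothetical additions.

```haskell
data List a = Nil | Cons a (List a)

-- Ord has Eq as a superclass, so equality has to be defined first.
instance Eq a => Eq (List a) where
  Nil       == Nil       = True
  Cons x xs == Cons y ys = x == y && xs == ys
  _         == _         = False

-- Lexicographic ordering: compare heads, fall through to tails on a tie.
instance Ord a => Ord (List a) where
  compare Nil         Nil         = EQ
  compare Nil         (Cons _ _)  = LT
  compare (Cons _ _)  Nil         = GT
  compare (Cons x xs) (Cons y ys) =
    case compare x y of
      EQ    -> compare xs ys
      other -> other

-- Hypothetical helper: build a List from a built-in list.
fromList :: [a] -> List a
fromList = foldr Cons Nil

-- Works for any element type, ordered or not.
lengthL :: List a -> Int
lengthL Nil         = 0
lengthL (Cons _ xs) = 1 + lengthL xs

-- Functions have no Ord instance, yet this is still a valid list.
functions :: List (Int -> Int)
functions = Cons (+ 1) (Cons (* 2) Nil)
```

Here `lengthL functions` type-checks, while `compare functions functions` would be rejected at compile time because `Int -> Int` has no `Ord` instance.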

Another important distinction between traits and interfaces is that some behavior of value types consists of simply identifying special values. For example, the Monoid type class in Haskell is

    class Monoid a where
      mempty :: a
      mappend :: a -> a -> a

It is impossible to express Monoid as an interface because one of the features of a monoid is that you have a special element `mempty`, but consumers cannot get this value without first having another value of type `a` on their hands.
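A small sketch of that point, using the `Monoid` class from Haskell's standard library (which, in current versions of base, also requires a `Semigroup` instance). The names `Total`, `combineAll` and `emptyTotal` are hypothetical. Note how `mempty` is obtained purely from the expected type, with no value of that type in hand.

```haskell
-- A hypothetical wrapper whose monoid operation is addition.
newtype Total = Total Int
  deriving (Eq, Show)

instance Semigroup Total where
  Total a <> Total b = Total (a + b)

instance Monoid Total where
  mempty = Total 0

-- Works for any monoid; for an empty list the result is just mempty.
combineAll :: Monoid a => [a] -> a
combineAll = foldr mappend mempty

-- No Total value is supplied here: the type annotation alone selects
-- which mempty is returned.
emptyTotal :: Total
emptyTotal = combineAll []
```

An object-style interface method would need a receiver to be called on, which is exactly what is missing when the input list is empty.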

Many languages now have proper support for both objects and values in that they have better primitives for defining both products and coproducts, but I still think that most mainstream languages are missing proper primitives for defining traits.

epolanski 12 minutes ago [-]

> It is impossible to express Monoid as an interface because one of the features of a monoid is that you have a special element `mempty`, but consumers cannot get this value without first having another value of type `a` on their hands.

Can't you trivially do so in TypeScript?

- define a Magma<A> interface `concat<A>(a:A, b: A) => A`,

- define a Semigroup<A>, same implementation of `Magma<A>` could even just point to it,

- define Monoid<A> as Semigroup<A> and `empty` property (albeit I prefer the term `unit`, empty is very misleading).

Now you can implement as many monoids as you want for strings, functions, lists and use those instances with APIs that require a Monoid. E.g. you could have a `Foldable` interface that requires a `Monoid` one, so if you have a Foldable for some A you can define foldMap.

Not sure what the practical differences would be.

After all Haskell's monoids, same as in TypeScript, are contracts, not laws. There are no guarantees that monoid properties (left and right identity) hold for every element of type `a`.
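To keep to one language in this thread, here is a rough Haskell transliteration of the approach described above: the monoid becomes an ordinary record of functions that is passed around explicitly, much like a TypeScript object implementing `Monoid<A>`. The names `MonoidDict`, `unit`, `combine` and `foldMapWith` are hypothetical, and this is only a sketch of the idea.

```haskell
-- The "interface" is a plain record, separate from the values it describes.
data MonoidDict a = MonoidDict
  { unit    :: a
  , combine :: a -> a -> a
  }

stringMonoid :: MonoidDict String
stringMonoid = MonoidDict { unit = "", combine = (++) }

sumMonoid :: MonoidDict Int
sumMonoid = MonoidDict { unit = 0, combine = (+) }

-- foldMap with the instance passed by hand instead of resolved by the compiler.
foldMapWith :: MonoidDict m -> (a -> m) -> [a] -> m
foldMapWith dict f = foldr (\x acc -> combine dict (f x) acc) (unit dict)
```

In this encoding the unit is reachable without a value of type `a`, because it lives on the separate instance record rather than on the values themselves. The practical difference from a type class is mainly that the caller chooses and passes the instance explicitly.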

spicyusername 25 minutes ago [-]

I always miss them so much in languages that don't have them.

It is just such a common situation to be in where you have something that can only be X, Y, or Z, where X, Y, or Z are separate types.

And then being able to pattern match over it to guarantee all cases are handled. Just perfect.
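As a sketch of what that looks like in Haskell (the `Payment` type and its cases are hypothetical): each case carries its own data, and GHC can warn about a forgotten case when compiled with `-Wincomplete-patterns`, which is included in `-Wall`.

```haskell
-- Exactly one of three cases, each with its own payload.
data Payment
  = Cash
  | Card String          -- last four digits
  | Transfer String Int  -- bank name, reference number

-- Removing any branch below triggers an incomplete-patterns warning
-- under -Wincomplete-patterns.
describe :: Payment -> String
describe payment = case payment of
  Cash              -> "cash"
  Card digits       -> "card ending in " ++ digits
  Transfer bank ref -> "transfer via " ++ bank ++ " (ref " ++ show ref ++ ")"
```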

tromp 38 minutes ago [-]

> In OCaml for instance, unit corresponds to void.

It makes more sense that Void should be void of values, i.e. correspond to the empty set, as it does in Haskell [1].

[1] https://hackage.haskell.org/package/void-0.6.1/docs/Data-Voi...
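A short sketch of the distinction, using `Data.Void` as shipped in current versions of Haskell's base library. The helper names `unitValues`, `fromVoid` and `dropVoid` are hypothetical.

```haskell
import Data.Void (Void, absurd)

-- The unit type has exactly one value: it plays the role of 1.
unitValues :: [()]
unitValues = [()]

-- Void has no values: it plays the role of 0. A function out of Void
-- needs no cases, because it can never actually be called.
fromVoid :: Void -> Int
fromVoid = absurd

-- 0 + a = a: the Left branch of Either Void a is unreachable.
dropVoid :: Either Void a -> a
dropVoid (Left v)  = absurd v
dropVoid (Right x) = x
```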

raxxorraxor 2 days ago [-]

> Algebraic Types are Just Elementary School Algebra

My math prof said exactly the same while torturing students with questions about proofs concerning their arcane sets of arbitrary numbers and whether they can be considered a field or ring or a group or everything at the same time.

Sure, just some + and *...

And sure, for a programmer it is mostly about which operations are defined on the type. But with just a few tweaks here and there you can transform a tool into a torture device...

Jokes aside, I think this is a good explanation about the concepts and parallels.

HelloNurse 1 day ago [-]

> their arcane set of arbitrary numbers and if they can be considered a field or ring or a group or everything at the same time.

A toolbag of abstract theory and tools that can be straightforwardly applied to any "arcane set" is the value that upgrading from arithmetic to algebra provides.

zk108 3 hours ago [-]

...and a monad is just a monoid in the category of endofunctors of some fixed category

Just kidding, but algebraic types are a great abstraction paradigm.

zokier 2 hours ago [-]

ADTs are just structs and tagged unions. Tbh I don't understand why there is such fetishization of them considering how mundane they are.

aiono 55 minutes ago [-]

That's the point I tried to make. Throughout my bachelor's education there was not a single mention of sum types, even though we learned about the visitor pattern. Instead of having to model sum types with object hierarchies, the language should provide a straightforward way to represent them, since they are a very basic concept. I think that while things have improved a lot compared to the past, the concept of sum types is still not as well known as objects, for instance. Today all mainstream languages have added direct support for sum types, but awareness of them is lacking.
