This is almost a new post on the Function SOLID series. It should be about the Liskov Substitution Princible, but before we talk about it it’s important to understand the concept of subtyping.
Subtyping in Dynamic languages like Clojure or in languages that do not have hierarchical typing like Haskell seems strange. But subtyping is not only a concept about object-oriented programming languages – it’s about restrictions, and the concept of variance. I’m going to try to explain both in this post
To begin with, we can say that a supertype is more generic than a subtype, but that’s not all. In truth, is all about properties that you can prove about a specific data. For example, if we use Clojure as a starting language you could say that the coll?
is a property of maps and vectors, but vector?
is a property that only applies to vectors. In practice, this means that coll?
is a superclass of vector?
.
(coll? []) ; => true (coll? {}) ; => true (vector? []) ; => true (vector? {}) ; => false
Now let’s change the language: in Haskell you can use type classes as a restriction on which generic type you can pass to functions. So for example, Eq
is typeclass that implements equality (allows two objects of the same type to be compared for equality and inequality), and Ord
is a typeclass that allows types to be compared if they are less than or greater than. So a type that implements Ord
will have to implement Eq
too – it is a subtype of Eq
.
The thing is that typing doesn’t apply only to structures, but also to behaviors – and this is a little bit more complex. The idea is the same – a code is a subtype of other if it have the same properties (or, in “mathematical lingo”, if you can prove properties about that behavior) and also adds more of its own. In this sense, map
is a supertype of mapv
, but there are more rules (and this is the point where things become weird), so let’s start with a simpler example:
Suppose you want to check if you can apply map
to some value. You could use coll?
or vector?
. So, if some code does the checking with vector?
, it is a behavior supertype of one that uses coll?
.
And why is that? Suppose you have the following code:
(defn greet-person [person] (str "Hello, " (:name person))) (defn greet-senior [person] (when (>= (:age person) 18) (str "Hello, sr. " (:name person))))
It is easy to see that the map {:name "Anderson"}
is a supertype of {:name "Anderson, :age 22}
. Being a supertype, it is accepted in less places than the subtype (because both obey the same property: #(contains % :name)
, but the subtype obeys another property: #(number? (:age %))
.
The same is true for coll?
and vector?
– for the purpose of seeing that we can map
a function to the values, coll?
will accept more inputs than vector?
, because some type that have the property vector?
also have the property coll?
:
(defn map-maybe [maybe-collection] (when (coll? maybe-collection) (map inc maybe-collection))) (map-maybe [1 2 3]) ; => (2 3 4) (map-maybe '(1 2 3)) ; => (2 3 4) (defn map-maybe2 [maybe-collection] (when (vector? maybe-collection) (map inc maybe-collection))) (map-maybe2 [1 2 3]) ; => (2 3 4) (map-maybe2 '(1 2 3)) ; => nil
Now, the things are reversed on the output – seq
is a supertype of vec
, because the output of seq
is accepted in less places than the output of “vectoror, again,
seqreturns data that have property
coll?, but
vecreturns data that have both property
coll?and
vector?`:
(map-maybe2 (vector '(1 2 3)) ; => (2 3 4) (map-maybe2 (seq '(1 2 3))) ; => nil
This is probably one of the most counter-intuitive rules of typing: the return of a function is covariant – it means that if, outside of the function, a supertype is expected, the function can return a subtype without breaking anything; but the function arguments are contravariant – if, outside of the function, there’s a subtype, the function cannot expect a supertype.
These rules become more evident when you add high-order functions: a vector? -> coll?
is more specific than a vector? -> vector?
or from a coll? -> vector?
, and these two are more specific than coll? -> vector?
. It is easy to see that the last function will be accepted in more places than the first one.
Now everything falls down if you use mutability! For example, an atom
of coll?
is not a subclass of an atom
of vector?
, nor the opposite, so they are invariant. The reason for that is:
; This satisfies `vector?`: (def atom-of-vector (atom [])) ; If this is a subtype of an atom of coll?, ; then this should be possible as a "downcast" (def atom-of-coll atom-of-vectors) ; Then you could change its value to: (reset! atom-of-coll {}) ; But now, `atom-of-vector` have a map...
And then you effectively broke the rules of subtyping. There’s more to these rules, and I’ll talk more about it on the post about Liskov Substitution Principle.