The magical wonders of programming with pure data

I’ll start with a disclaimer – if you’re developing in ClojureScript because of performance, you’re doing the wrong thing. Although a Clojure code can be really close to Java performance, ClojureScript simply can’t match JS performance – specially because in Javascript, the fundamental data structures like arrays, sets, and objects are implemented in a faster language, and as of today, no “user space” data structure can get even close the “native” ones. And, let’s not forget that JS is slower than Java.

So, if you want to beat JS performance in ClojureScript, you have to make use of what ClojureScript offers you – the REPL, the fast experimentation, and the wonderful experience of programming with data. So, this is a tale of a huge performance improvement that I’m currently working in Chlorine.

Rendering EDN

In Chlorine and Clover, when you evaluate something, the result will be rendered in a tree-like way at the console. This works quite well for Clojure, because UNREPL makes structures that are too big small, and you can query for more data. So far, so good.

In ClojureScript, Clojerl, Babashka, etc, things are not so good. Big data structures will be rendered fully on the view. They can lock or crash your editor, they can occupy all your memory, etc. The reason for that is that tree structures are hard – when you’re rendering the “parent”, you don’t know what the children will be. Currently, in Chlorine, rendering (range 80000) locks my editor for 4 seconds until it calculates everything, layouts all data, etc… and I wanted to change this.

Reagent vs Helix vs React

I was always intrigued on how much performance hit I get when I use Reagent’s Hiccup data structure instead of React objects – after all, there must be some performance problems, right? After all, that’s the promise of Helix – to not pay this performance hit because you’re closer to what React wants.

So, first things first: profiling Javascript inside Electron IS A NIGHTMARE. I decided to install React Devtools extension on Electron (inside the Atom editor) but that got me into so much trouble, false-positives, wrong profiles that I ended up deciding against it and simply used the “performance” tab to do my profiling. There, I could see batching.cljs hogging my performance, so the next move was easy – move away from Reagent and use Helix.

Except… that Helix uses some macros and I ended up doing the code in React. My testcase was simple: render a vector of 80000 elements and see how it would perform, obviously without all the bells and whistles that Chlorine offers today (otherwise this experiment would be waaay longer). And that’s where things get surprising: with Reagent, I was hitting 1800ms of scripting, and about 120ms of rendering. With React… 1650ms of scripting, and about 200ms of rendering. I decided to do more benchmarks and probably because of OS caching, warm-up, or whatever, the results got even closer, with Reagent sometimes performing better than React – but still too slow.
(more…)

The state of JS in 2021

So… I recently found out that I had no space left on my SSD. When I tried to debug what was consuming my space, my projects folder were the culprit. How? I mean, I take photos, store them in RAW format, so how source code can be consuming more than my photos?

Turns out that by running rm -Rf (find -name 'node_modules'), (FISH shell) I was able to free 20gb of space!

So… let’s see how bizarre is the Javascript ecosystem in 2021.

Strangely big libraries

Sometimes, when I have to generate projects, they ask me to install yo. Yo is just a CLI to run generators – it does not include any generator. Yo occupies 89mb of space in my disk. Installing the react-webpack (manually by running npm i generator-react-webpack, because otherwise yo will try to install globally) means that your node_modules now occupies 180mb.

Let’s thing a little bit about this number: Windows XP, minimalist install without any add-on, requires 1.5gb of space. So a SINGLE GENERATOR occupies 12% of a full operating system.

So, create-react-app is more gentle – it just installs under 5.3mb. Generating a react app creates a node_modules folder with… 242mb. One of the dependencies is global-modules, that is just a single file with an if to define how to call things. It’s also only used in a single file – react-dev-utils/printHostingInstructions.js – in the folllowing code:
(more…)

The power of Pathom

Most people don’t know the power of Pathom. Most people that I know of think that Pathom is about graphs.

They are wrong.

I mean, yes, in the end you’ll have a graph, with dependencies, but that’s not the point. The point of Pathom is the ability to transform your code into a soup or attributes. It’s also probably the best usage of qualified keywords for me, if not the only one that justifies the downsides (I’ll not enter in details here – just know that having to convert from qualified keywords to unqualified multiple times is not fun at all).

The name “soup of attributes” may not be a beautiful one, but believe me – it’s incredible. The idea is quite simple – instead of trying to handle all possible conversions from multiple sources to multiple destinations, you just define which attributes can be computed in terms of others, and Pathom does the rest. As always, things are better with examples, so let’s go.

I had to work on a system that somehow had strange rules – it needed to generate a bunch of text files for different companies. Each company expected a different file name and different fields. To generate the file, we had to accumulate data that came from a payload, from an external system that we called via REST API, and also from some data we had on our database. To make things worse, some companies would expect some of the data that was returned from REST on the filename, and there were also some state changes – like, if a file was already processed, the company would send us a return file, and we had to read some content of this file, move this file to another directory, renaming the file in the process.
(more…)

Your editor as a query

Some time ago I found out this post: https://petevilter.me/post/datalog-typechecking/. It stroked me as really interesting – because it was an idea to make some IDE not depend on some internal API that only works for a single editor, but somehow make a tooling that’s query able by datalog. This datalog Read more…

Static Typing – the dangers of incomplete info

Ok, so I’m going to use this post – Making Illegal States Unrepresentable – and I’ll add my experience to it. For people that don’t know F# (or that don’t want to check all the post to see what’s the point), the idea is that he’s trying to construct a type that will only be valid if a user does at least an e-mail address or a postal contact. Then, he ends with the following type (I’m “inventing” a way to represent this type that’s close to Scala, but easier to read for people that don’t know Scala or Haskell or F#):

type Contact {
  Name: String
  AND Contact: ContactInfo
}

type ContactInfo {
  EmailOnly: EmailInfo
  OR PostOnly: PostalInfo
  OR EmailAndPost: EmailInfo with PostalInfo
}

// Types EmailInfo and PostalInfo have to be defined also

Then, he uses 13 lines to construct a ContactInfo, and another 12 to update a contact info. He ends up concluding that these complicated types are necessary because the logic is complicated. And that’s where we start to disagree.
(more…)

Implementing shadow.remote API

Since version 0.8.0 of Chlorine, there’s a new way to evaluate ClojureScript code: that’s the Shadow-CLJS Remote API. It is basically a new REPL (not nREPL, no Socket REPL) over WebSockets to try to solve problems when translating other REPLs to ClojureScript. So, to understand why these problems exist, I’ll first introduce the difference between ClojureScript and Clojure.

On Clojure, you’re always inside a JVM. This means that compilation happens on the same JVM that your REPL, and your code is running. If you practice REPL-Driven Development, even your tests are running on the same JVM. In practical terms, it means that when you fire up your REPL, you already have everything ready to run code, compile code, and evaluate forms.

On ClojureScript, the compiler is written in Clojure – that means it’s running on the JVM. So, to produce Javascript code you don’t need a Javascript environment – and that’s when things become confusing, because when exactly will you run the REPL? Let’s try from another angle: if you start the REPL on compilation time, you can’t evaluate code (because there’s no Javascript generated, nor any Javascript engine running). If you start the REPL when you run the compiled code, this REPL can become unusable if you stop the Javascript environment, and also you have to coordinate lots of state and translations between formats.
(more…)

Reagent Mastermind

One of these days, a friend of mine posted about his experience writing the “Mastermind” game in React (in portuguese only). The game is quite simple:

  1. You have six possible colors
  2. A combination of 4 colors is chosen randomly (they can be repeated – for example, blue,blue,blue,blue is a valid combination) – you have to guess that number
  3. You have up to 10 guesses of 4 colors. For each color on the right position, you win a “black” point. For each color in the wrong position, you win a “white” point
  4. If you can’t guess on the 10th try, you loose.

So, first, we’ll create a shadow-cljs app – create a package.json file, fill it with {"name": "Mastermind"}, then run npm install shadow-cljs. Traditional stuff.

Then, we’ll create the shadow-cljs.edn file. It’ll only contain a single target (:browser), opening up a dev-http server so we can serve our code, and we’ll add reagent library dependency. I also added the material-ui dependency, but you don’t really need it for the code. Now, running npx shadow-cljs watch browser will start a webserver at port 3000, and we can start to develop things.
(more…)

Please, use the right terms for “typing”…

Names matter. That’s one of the things that I learned, working multiple years on programming software. When you have something called “integration”, for example, that means different things for different people, that’s a recipe for disaster.

That’s why I get quite triggered when people use Untyped / Typed, Weak / Strong typed, Uni / Multi typed, to describe programming languages – because it means different things to different people, or are simply wrong. So let’s dive into it:

Untyped languages or languages without types are quite rare: Assembly is the most interesting example (there’s literally no types in Assembly: everything is a memory address, byte, word, etc). Some virtual machines’ bytecodes are also untyped. In this case, most languages are typed. Ruby, for example, have types, and you can dispatch different behavior depending on the type of the caller (because it’s a class-based object-oriented language). Clojure is also typed, because you can do type-based polymorphism with protocols using defprotocol and defrecord for example. You can also use deftype, and if there’s a specific command to define types, how can we call it “untyped”?

Weak / Strong typed is a term that’s hard to define. Some people say that it’s when you have pointers, because you can bypass the typing at all; others, when the compiler erases typing at run-time; some say that’s when the language tries to coerce and implicitly convert from one type to another, to try to figure out what you want…

Now, a language that allows you to bypass typing because you have pointers is called “memory unsafe”; if the compiler erases typing information, is called “type erasure”; coercion of course is called “coercion”, and can be “automatic” or “manual” coercion (for example, most languages try to coerce integers to decimals at some point); even the ability to convert between types is controversial: on Ruby and Scala, operators are also methods of the class (and, on Ruby, you can rewrite methods on classes that already exist, so automatic conversion is a user-defined feature) – so, please use the right term.

Unityped means a language that only have one type. I believe only some markup languages enter on this classification. Calling a language that have multiple types as unityped because someone imagined that all these multiple types can somewhat collapse into a single one is, at minimum, strange; and also, again, we do have a term for languages like Javascript, Ruby, Python, Smalltalk, and Clojure.
(more…)

ClojureScript vs clojure.core.async

I’m going to make a somewhat bold statement: core.async does not work with ClojureScript. And, in this post, I’m going to show some examples why this is true, at least for the current versions of core.async.

So let’s start by understanding a little bit about the runtime: Javascript is a single-threaded runtime that implicitly runs an event-loop. So, for example, when you ask to read a file, you can do it synchronously or asynchronously. If you decide to run in that asynchronously, it means that as soon as you issue the fs.readFile command, you need to register a callback and the control is returned to the “main thread”. It’ll keep running until it runs out of commands to execute, then the runtime will wait the result from the callback; when it returns, the function that you registered will be called with the file contents. When the function ends, the JS runtime will await to see if there’s any other pending call, and it’ll exit if there’s nothing else to do.

The same thing happens in browser environment, but in this case the callbacks are events from the DOM: like clicking on buttons or listening for changes in some elements. The same rules apply here: the runtime is single threaded and when something happens it will first execute everything that needs to be executed, then it will be called back with the event that happened.

So maybe we can change these callbacks with core.async channels right? But the answer is no, because core.asyncs go blocks will not run in different threads (because, again, the runtime is single-threaded). Instead, it creates a state machine and it’ll control of when each of these go blocks will be called, at what time, eventually replacing the event-loop that Javascript environment already have.
(more…)