Recently I wrote about Jank, the programming language that brings Clojure to the native world using LLVM. I also mentioned that Jank is not yet ready for production usage – and that is just true; it still has many bugs that prevent us from using it for more complex problems, and maybe even some simple ones.

That doesn’t mean we can’t try it, and see how bright the future might be.

At the end of my last post I showed an example involving Ruby code. Later, after many problems trying to make something more complex, I was finally able to hack up a solution that bypasses some of the bugs that Jank still has.

And let me tell you, even at the pre-alpha stage the language is in, I can already see some very interesting things – the most important one being the “interactive development” of the native extension – or, if you want to use the Ruby term, monkey patching native methods.

Let’s start with some very simple code: a Ruby method that’s supposed to be written in a native language. I’m going to start with C++ because it’s the main way of doing that. For now, don’t worry about the details – it’s just a class definition with a method that prints “Hello, world!” on the screen:

#include <iostream>
#include <ruby.h>

static VALUE hello(VALUE self) {
  std::cout << "Hello, world!\n";
  return Qnil;
}

void define_methods() {
  VALUE a_class = rb_define_class("Jank", rb_cObject);
  rb_define_method(a_class, "hello", RUBY_METHOD_FUNC(hello), 0);
}

extern "C" void Init_jank_impl() {
  define_methods();
}

We should be able to port this directly to Jank by just translating from C++ to a Clojure-like syntax – should being the keyword here, because we can’t. There are a bunch of different bugs that prevent us from doing that right now:

  1. We can’t define the hello function, because we have no way to add type signatures to Jank functions. To bypass that, we have to define the method in C++, using cpp/raw, which brings us to
  2. Using cpp/raw to define C++ functions doesn’t work – the generated code will somehow duplicate the function definition, so things won’t compile. We can solve that with the same technique that C headers use – #ifndef and #define guards to avoid duplication
  3. Unfortunately, Jank doesn’t actually understand the callback signature. It expects something that matches the signature exactly, but for Ruby callbacks (of native functions) the C API sometimes uses type aliases or preprocessor macros. We can also solve that, but we need to “cast” the C function with the “concrete” signature to the one with the “abstract” signature so that Jank will be happier.
  4. Finally, Jank doesn’t understand C preprocessor macros. So in some cases (for example, converting a Ruby string to a C one) we’ll also need to write an “intermediate function” to work around the issue.

First experiments

With all that out of the way, we can actually do some experiments:

(ns ruby-ext)

(cpp/raw "
#ifndef JANK_HELLO_FN
#define JANK_HELLO_FN
#include <iostream>
#include <jank/c_api.h>
#include <ruby.h>

static VALUE jank_hello(VALUE self) {
  std::cout << \"Hello, world\\n\";
  return Qnil;
}

static auto convert_ruby_fun(VALUE (*fun)(VALUE)) {
  return RUBY_METHOD_FUNC(fun);
}

#endif
")

(defn init-ext []
  (let [class (cpp/rb_define_class "Jank" cpp/rb_cObject)]
    (cpp/rb_define_method class "hello" (cpp/convert_ruby_fun cpp/jank_hello) 0)))

And this works. But obviously, that’s not what we want. We want Ruby to call a Jank function as the method implementation. To do that, we can use the C API to call back into Jank, so let’s add a function and change our jank_hello code to call it:

(defn hello []
  (println "Hello, from JANK!"))

(cpp/raw "
...
static VALUE jank_hello(VALUE self) {
  rb_ext_ractor_safe(true);
  auto const the_function(jank_var_intern_c(\"ruby-ext\", \"hello\"));
  jank_call0(jank_deref(the_function));
  return Qnil;
}
...
")

Compiling the Jank code

This actually works, and we get a Hello, from JANK! appearing in the console! But to make that work, we need to compile the Jank code, and then link that together with our C++ “glue code” before we can use that in Ruby. But to compile for Jank, we actually need to include the Ruby headers and the ruby library directory, otherwise it won’t work – but knowing which flags to use can be a challenge. Luckily Ruby can help with that:

ruby -e "require 'rbconfig'; puts '-I' + RbConfig::CONFIG['rubyhdrdir'] + ' -I' + RbConfig::CONFIG['rubyarchhdrdir'] + ' -L' + RbConfig::CONFIG['libdir'] + ' -l' + RbConfig::CONFIG['RUBY_SO_NAME']"

That will print the flags you need. Then save your file as ruby_ext.jank, and compile it with jank <flags-from-last-cmd> --module-path . compile-module ruby_ext. Unfortunately, the output (the ruby_ext.o file) goes to a different directory depending on lots of different things (at least on my machine) – so my build script first deletes the target directory, then uses a wildcard in the extconf.rb file (the file that generates the Makefile for a Ruby extension) so that it picks up whatever is under the target directory. Then, finally, we can build the final Ruby gem:

# extconf.rb
require 'mkmf'

dir_config('jank', ['.'])
$CPPFLAGS += " -I/usr/local/lib/jank/0.1/include"
RbConfig::MAKEFILE_CONFIG['CC'] = 'clang++'
RbConfig::MAKEFILE_CONFIG['CXX'] = 'clang++'
RbConfig::MAKEFILE_CONFIG['LDSHARED'] = 'clang++ -shared'

with_ldflags("
  -L/usr/local/lib/jank/0.1/lib/
  target/*/ruby_ext.o
  -lclang-cpp
  -lLLVM
  -lz
  -lzip
  -lcrypto
  -l jank-standalone
".gsub("\n", " ")) do
  create_makefile('jank_impl')
end

// jank_impl.cpp
#include <ruby.h>
#include <jank/c_api.h>

using jank_object_ref = void*;
extern "C" jank_object_ref jank_load_clojure_core_native();
extern "C" jank_object_ref jank_load_clojure_core();
extern "C" jank_object_ref jank_var_intern_c(char const *, char const *);
extern "C" jank_object_ref jank_deref(jank_object_ref);
extern "C" void jank_load_ruby_ext();

extern "C" void Init_jank_impl() {
  auto const fn{ [](int const argc, char const **argv) {
    jank_load_clojure_core_native();
    jank_load_clojure_core();

    jank_load_ruby_ext();
    auto const the_function(jank_var_intern_c("ruby-ext", "init-ext"));
    jank_call0(jank_deref(the_function));

    return 0;
  } };

  jank_init(0, NULL, true, fn);
}

And compile with rm target -Rf && jank <flags> --module-path . compile-module ruby_ext && ruby extconf.rb && make. This surprisingly works! I say “surprisingly” because, remember, this way of using the language is not officially supported!
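Since the flag lookup and the build steps are easy to get wrong by hand, they could be gathered in a small Ruby helper. This is only a sketch: the method names ruby_flags and build_steps are made up for illustration, and nothing is executed, the commands are just printed so you can inspect them first.

```ruby
require 'rbconfig'

# Build the -I/-L/-l flags Jank needs to find the Ruby headers and library.
# `config` defaults to the running interpreter's RbConfig::CONFIG.
def ruby_flags(config = RbConfig::CONFIG)
  [
    "-I#{config['rubyhdrdir']}",
    "-I#{config['rubyarchhdrdir']}",
    "-L#{config['libdir']}",
    "-l#{config['RUBY_SO_NAME']}"
  ].join(' ')
end

# The build steps from this post, in order (hypothetical helper).
def build_steps(flags = ruby_flags)
  [
    'rm -Rf target',
    "jank #{flags} --module-path . compile-module ruby_ext",
    'ruby extconf.rb',
    'make'
  ]
end

# Print the full command line instead of running it.
puts build_steps.join(' && ')
```

Printing instead of executing keeps the sketch safe to run anywhere; piping the output to a shell would be one way to actually build.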

Refactoring the glue code away

Now, all this “glue code” is a problem – it’s quite tedious to remember to write the C code, then make that C code usable from Ruby. But Lisps have macros, and Jank is no exception – so let’s define a defrubymethod macro that accepts a list of parameter+type pairs and generates all the boilerplate for us:

(cpp/raw "
#ifndef JANK_CONVERSIONS
#define JANK_CONVERSIONS
static jank_object_ref convert_from_value(VALUE value) {
  return jank_box(\"unsigned long *\", (void*) &value);
}
#endif
")

(defmacro defrubymethod [name params & body]
  (let [rb-fun-name (replace-substring (str name) "-" "_")
        cpp-code (str "#ifndef " rb-fun-name "_dedup\n#define " rb-fun-name "_dedup\n"
                      "static VALUE " rb-fun-name "_cpp("
                      (->> params (map (fn [[param type]] (str type " " param)))
                           (str/join ", "))
                      ") {\n"
                      "  auto const the_function(jank_var_intern_c(\"" *ns* "\", \"" name "\"));\n"
                      "  jank_call" (count params) "(jank_deref(the_function), "
                      (->> params (map (fn [[param]] (str "convert_from_value(" param ")"))) (str/join ", "))
                      ");\n"
                      "  return Qnil;\n}\n#endif")]
    `(do
       (cpp/raw ~cpp-code)
       (defn ~name ~(mapv first params) ~@body))))

We need to implement replace-substring too, because str/replace is still not available in Jank. Anyway, now we can create Ruby methods like this:

(defrubymethod hello-world [[self VALUE]]
  (println "HELLO, WORLD!"))

(defn init-ext []
  (let [class (cpp/rb_define_class "Jank" cpp/rb_cObject)]
    (cpp/rb_define_method class "hello_world" (cpp/convert_ruby_fun cpp/hello_world_cpp) 0)))

Way easier, and no boilerplate. But now, it’s “superpower” time – because Jank is dynamic, we can actually reimplement hello-world while the code is running and Ruby will use the new implementation – it doesn’t matter that the implementation is native!
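If the string concatenation inside defrubymethod is hard to picture, here is a rough re-creation of just that half of the macro in plain Ruby, so you can print the C++ it would emit. The glue_code method is a hypothetical stand-in written for this post, not part of Jank or of the extension, and it assumes the namespace prints as ruby-ext.

```ruby
# Rough Ruby re-creation of the string-building half of defrubymethod.
# `params` is a list of [name, c_type] pairs, e.g. [["self", "VALUE"]].
def glue_code(namespace, name, params)
  cpp_name  = name.tr('-', '_') # hello-world -> hello_world
  signature = params.map { |param, type| "#{type} #{param}" }.join(', ')
  arguments = params.map { |param, _type| "convert_from_value(#{param})" }.join(', ')

  # The #ifndef/#define pair is the same de-duplication guard used earlier.
  <<~CPP
    #ifndef #{cpp_name}_dedup
    #define #{cpp_name}_dedup
    static VALUE #{cpp_name}_cpp(#{signature}) {
      auto const the_function(jank_var_intern_c("#{namespace}", "#{name}"));
      jank_call#{params.size}(jank_deref(the_function), #{arguments});
      return Qnil;
    }
    #endif
  CPP
end

puts glue_code('ruby-ext', 'hello-world', [%w[self VALUE]])
```

The printed function is named hello_world_cpp, which is the same name init-ext passes to rb_define_method above.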

A REPL (kinda – more like a REP)

Of course, we also have a problem with this approach: Jank doesn’t actually have a REPL API (yet), so the easiest way to solve this is to add another method to our Ruby class that will evaluate something in Jank. This might sound confusing, and it kinda is – we’re registering, in Jank, a method that will be called by Ruby; this method will call some C++ code, which will delegate to Jank to evaluate some Jank code. The trick here is to be able to pass Ruby parameters to Jank too, and to make this a reality, here is what I decided to do:

  1. Ruby will pass a string containing Jank code; that string can refer to some pre-defined variables p0, p1, etc – each is a “parameter” that we can pass
  2. To be able to “bind” these variables, we’ll create a Jank function called __eval-code that takes these parameters. We will then call __eval-code with the “boxed” Ruby parameters (“boxing” is a way for Jank to receive any arbitrary C++ value)
  3. To be able to inspect these values, we’ll also create an rb-inspect function that calls inspect on the “unboxed” Ruby value

(defn define-eval-code [code num-of-args]
  (let [params (->> num-of-args
                    range
                    (mapv #(str "p" %)))
        code (str "(defn __eval-code [" params "] " code "\n)")]
    (eval (read-string code))))

(cpp/raw "
...
static const char * inspect(VALUE obj) {
  ID method = rb_intern(\"inspect\");
  VALUE ret = rb_funcall(obj, method, 0);
  return StringValueCStr(ret);
}

static VALUE eval_jank(int argc, VALUE *argv, VALUE self) {
  try {
    const char *code = StringValueCStr(argv[0]);
    auto const the_function(jank_var_intern_c(\"ruby-ext\", \"define-eval-code\"));
    jank_call2(jank_deref(the_function), jank_string_create(code), jank_integer_create(argc));

    auto const the_function2(jank_var_intern_c(\"ruby-ext\", \"__eval-code\"));
    std::vector<jank_object_ref> arguments;
    for(int i = 0; i < argc; i++) {
      arguments.push_back(jank_box(\"unsigned long *\", (void*) &argv[i]));
    }

    // Hack: 14 fixed slots. Indexing past arguments.size() is out of bounds
    // whenever argc < 14.
    jank_call1(jank_deref(the_function2),
      jank_vector_create(argc,
        arguments[0],
        arguments[1],
        arguments[2],
        arguments[3],
        arguments[4],
        arguments[5],
        arguments[6],
        arguments[7],
        arguments[8],
        arguments[9],
        arguments[10],
        arguments[11],
        arguments[12],
        arguments[13]
    ));
  } catch(jtl::ref<jank::error::base> e) {
    std::cerr << \"ERROR! \" << *e << \"\\n\";
  }
  return Qnil;
}

static auto convert_ruby_fun_var1(VALUE (*fun)(int, VALUE*, VALUE)) {
  return RUBY_METHOD_FUNC(fun);
}
...")

(defn rb-inspect [boxed]
  (let [unboxed (cpp/* (cpp/unbox cpp/VALUE* boxed))]
    (cpp/inspect unboxed)))

(defn init-ext []
  (let [class (cpp/rb_define_class "Jank" cpp/rb_cObject)]
    (cpp/rb_define_method class "hello_world" (cpp/convert_ruby_fun cpp/hello_world_cpp) 0)
    (cpp/rb_define_method class "eval_jank" (cpp/convert_ruby_fun_var1 cpp/eval_jank) -1)))

This… was a lot of code. But here’s what it does: define-eval-code basically concatenates the source for (defn __eval-code [[p0 p1 p2 p3]] ... ), for example, if we pass 3 parameters (self will always be p0). Then we have the C++ implementation of eval_jank – it gets the first parameter, converts it to a const char*, and passes that to the define-eval-code function we created previously; this gets evaluated, and creates (or patches!) the __eval-code function. Now, the problem is how to pass parameters – I hacked the solution a bit by creating a C++ vector, “boxing” everything, and then arbitrarily creating a Jank vector from 14 slots; and that’s it.

Notice also that jank_box is passing the type as unsigned long * instead of VALUE *. This is because, again, Jank doesn’t support type aliases for now – so VALUE * gets resolved in Jank to unsigned long *, but the type name passed as a string from C++ doesn’t get resolved. Also, be aware of the space between long and * – Jank needs that too, otherwise you won’t be able to unbox the variable.

Now that we have everything in order, we can finally test some superpowers!

Ruby console

Ruby has a REPL, and that’s what we’ll use. After compiling everything, run irb and then you can test the code:

irb(main):001> jank = Jank.new
=> #<Jank:0x00007324fec7da08>
irb(main):002> jank.hello_world
HELLO, WORLD!
=> nil
irb(main):003> jank.eval_jank '(prn (rb-inspect p1))', [1, 2, 3]
"[1, 2, 3]"
=> nil
irb(main):004> jank.eval_jank '(defn hello-world [_] (println "Another Implementation!") )'
=> nil
irb(main):005> jank.hello_world
Another Implementation!
=> nil

Yes – in line 004, we reimplemented hello-world, and now Ruby happily uses the new version! Supposedly, this could be used to implement the whole code interactively – but without proper nREPL support, we can’t – some stuff simply doesn’t work (yet, probably) like cpp/VALUE – it depends on Ruby headers being present, the compilation flags, etc. Maybe in the future we can have better support for shared libraries, who knows?
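Why does the redefinition take effect at all? The C++ stub never holds on to the Jank function itself: it interns the var by name and derefs it on every call, so whatever the var points to right now is what runs. Here is a toy model of that indirection in plain Ruby. The vars hash and register_stub are hypothetical stand-ins for Jank's var table and for the stub that Ruby registers once; none of it is real Jank API.

```ruby
# Toy model: `vars` plays the role of Jank's var table, and the returned
# lambda plays the role of the C++ stub that Ruby registers exactly once.
def register_stub(vars, name)
  # The implementation is looked up again on every single call.
  ->(*args) { vars.fetch(name).call(*args) }
end

vars = { 'hello-world' => ->(_self) { 'HELLO, WORLD!' } }
hello_world = register_stub(vars, 'hello-world')

puts hello_world.call(nil)

# Re-defining the var swaps the behaviour without touching the stub.
vars['hello-world'] = ->(_self) { 'Another Implementation!' }
puts hello_world.call(nil)
```

If the stub captured the function once instead of looking it up by name each time, the second call would still run the old implementation.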

And here is the very interesting part – this is not limited to Ruby. Any language that you can extend via a C API can be used with Jank, which is almost any language – Python, Node.js, and maybe even WASM in the future. The future is very bright, and with some interesting and clever macros, we might have an amazing new choice for writing language extensions!

And… another thing

Most of the time, when writing extensions, one of the worst things is the need to convert between the native types and the language ones. In Ruby, everything is a VALUE – in Jank, everything is a jank_object_ref. But again, Jank has macros – can we use them to transparently convert these types? And in a way that’s fast, relies on no reflection, etc? Turns out we can – we can change defrubymethod to receive a parameter before our body that will be the “return type”. We will then implement some Jank->Ruby conversions, and vice-versa, and transparently convert, box and unbox stuff, etc. So as a “teaser” here’s one of the first versions of this conversion:

(defn ruby->jank [type var-name]
  (case type
    Keyword (str "to_jank_keyword(" var-name ")")
    String (str "to_jank_str(" var-name ")")
    Integer (str "jank_integer_create(rb_num2long_inline( " var-name "))")
    Double (str "jank_real_create(NUM2DBL(" var-name "))")
    Boolean (str "((" var-name " == Qtrue) ? jank_const_true() : jank_const_false())")
    VALUE (str "jank_box(\"unsigned long *\", (void*) &" var-name ")")))

(def jank->ruby
  '{String "  auto str_obj = reinterpret_cast<jank::runtime::obj::persistent_string*>(ret);
    return rb_str_new2(str_obj->data.c_str());"
    Integer "  auto int_obj = reinterpret_cast<jank::runtime::obj::integer*>(ret);
    return LONG2NUM(int_obj->data);"
    Double "  auto real_obj = reinterpret_cast<jank::runtime::obj::real*>(ret);
    return DBL2NUM(real_obj->data);"
    Boolean "  auto bool_obj = reinterpret_cast<jank::runtime::obj::boolean*>(ret);
    return bool_obj->data ? Qtrue : Qfalse;"
    Keyword "  auto kw_obj = reinterpret_cast<jank::runtime::obj::keyword*>(ret);
    auto kw_str = kw_obj->to_string();
    auto kw_name = kw_str.substr(1);
    return ID2SYM(rb_intern(kw_name.c_str()));"
    Nil "  return Qnil;"})

You might be asking: why does it return a string, all the time? That’s because we generate a C++ string in our macro – this string, which originally was just a way to define a C++ function with the right signature, pass control to Jank, and return Qnil to Ruby, will now also convert parameters before sending them to Jank, which means we’ll have the correct type information on Jank’s side.
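To see what those conversion strings add up to, here is the ruby->jank lookup ported to plain Ruby, used to build the argument list of the generated jank_call. The expressions are copied from the table above; the method names ruby_to_jank and call_arguments are made up for this sketch, and how the real macro stitches the pieces together may well differ.

```ruby
# Ruby port of the ruby->jank lookup: type name -> C++ conversion expression.
def ruby_to_jank(type, var)
  case type
  when 'Keyword' then "to_jank_keyword(#{var})"
  when 'String'  then "to_jank_str(#{var})"
  when 'Integer' then "jank_integer_create(rb_num2long_inline( #{var}))"
  when 'Double'  then "jank_real_create(NUM2DBL(#{var}))"
  when 'Boolean' then "((#{var} == Qtrue) ? jank_const_true() : jank_const_false())"
  when 'VALUE'   then "jank_box(\"unsigned long *\", (void*) &#{var})"
  else raise ArgumentError, "no conversion for #{type}"
  end
end

# Join one conversion per [name, type] pair (hypothetical helper).
def call_arguments(params)
  params.map { |var, type| ruby_to_jank(type, var) }.join(', ')
end

puts call_arguments([%w[self VALUE], %w[p1 Integer], %w[p2 Integer]])
```

Each parameter is converted at the C++ boundary, so by the time the Jank function runs, p1 and p2 are already Jank integers rather than boxed VALUEs.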

With all this, we can actually create some Jank code without any explicit conversion, that behaves like Jank, and will seamlessly work in Ruby too:

(defrubymethod sum-two-and-convert-to-str [[self VALUE] [p1 Integer] [p2 Integer]] String
  (str (+ p1 p2)))

(defn init-ext []
  (let [class (cpp/rb_define_class "Jank" cpp/rb_cObject)]
    ...
    (cpp/rb_define_method class "sum_and_to_s" (cpp/convert_ruby_fun2 cpp/sum_two_and_convert_to_str_cpp) 2)
    ...))

And then you can run with this:

Jank.new.sum_and_to_s(12, 15)
# => "27"

As soon as Jank stabilizes, this might actually be the best way to create language extensions! If you want to check out the work so far, see this repository with all the code in this post (and probably more in the future). Just install Jank, run ./build.sh, and cross your fingers 🙂