In the first post of this series we decided to compile a Clojure library to a native shared library (with the GraalVM native-image command), and we used “isolates” so that a “global variable” in Clojure (and Java) would appear as if it were “local” in Ruby. In the second post we decided to make things actually local: instead of “making a resolver then storing it in a global variable” we decided to “make a resolver and return it to Ruby”. To abstract the nature of “callbacks” we used a CFunctionPointer on the Java side, and we sent the “block” from Ruby to Java as an “opaque object”, using GraalVM’s VoidPointer.

Now we need to do the opposite: return a “resolver” to Ruby, build a “List of Resolvers” on the Ruby side, and send it to the Clojure side somehow. These Clojure/Java objects (a Resolver, and a List of Resolvers) will also be “opaque objects”, but only in C (meaning we’ll receive them as void*, because they can’t be represented in C-land, nor in Ruby-land). In Java-land (and Clojure, by extension) they need to retain their types. The object GraalVM provides for this task is an ObjectHandle.

We’ll also fix a lot of memory leaks, so let’s move on!

We already had a preview of an ObjectHandle in the last post:

public final class LibPatao {
  // ... lots of code here ...
  @CEntryPoint(name = "gen_resolver")
  public static ObjectHandle gen_resolver(IsolateThread thread, Callback callback, VoidPointer arg1, CCharPointer name, CCharPointer params) {
    AdaptedCallback adaptedCallback = new AdaptedCallback(callback, arg1);

    Object resolver = big_duck.core.gen_resolver(
      adaptedCallback,
      CTypeConversion.toJavaString(name),
      CTypeConversion.toJavaString(params)
    );

    // We have a global "storage" that we can get with "ObjectHandles.getGlobal()"
    // We then "create" a `ObjectHandle` and that will be what C will "see" as the
    // opaque object we need
    return ObjectHandles.getGlobal().create(resolver);
  }
}

The creation of an ObjectHandle is very easy, as shown above. But there’s a problem: an ObjectHandle is “stored” by the global ObjectHandles object, and it stays there to guarantee that the object won’t be garbage-collected by Java (which would make it inaccessible from C). Because we created said handle, we also need a way to “free” it: releasing that memory is now a job shared by GraalVM (which owns the object) and C (which has to say when it’s done with it).

So, to recap: if we have a complex object in C, we pass it to GraalVM as a VoidPointer, and the responsibility for cleaning up its memory lies with the C code. If we have a complex object in Java, we pass it to C as an ObjectHandle, and the responsibility for cleaning up its memory lies with the Java code. Which means we also need to expose a “destroy_handle” or something similar, so that C can tell Java when the memory can be freed:

public final class LibPatao {
  // ... lots of code here ...
  @CEntryPoint(name = "destroy_handle")
  public static void destroy_handle(IsolateThread thread, ObjectHandle handle) {
    ObjectHandles.getGlobal().destroy(handle);
  }
}

Very simple code, and it does the job. We just need to remember to call it from the C side when we don’t need the resolver anymore (spoiler alert: it’ll be part of the “free” implementation of the ResolverImpl class).
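
To make the ownership rule concrete, here’s a minimal sketch of what a handle table does, written in plain Ruby so it runs anywhere. HandleTable and its methods are hypothetical names of mine that only mimic the create/get/destroy shape used above; this is not GraalVM’s implementation:

```ruby
# Hypothetical stand-in for GraalVM's global ObjectHandles, in plain Ruby.
# It only models the ownership rule: an object stays reachable until
# its handle is explicitly destroyed.
class HandleTable
  def initialize
    @objects = {}     # handle (Integer) => object
    @next_handle = 1
  end

  # Stores the object and returns an opaque number, the way C sees a void*.
  def create(object)
    handle = @next_handle
    @next_handle += 1
    @objects[handle] = object
    handle
  end

  # Looks the object up again, with its type intact.
  def get(handle)
    @objects.fetch(handle) { raise KeyError, "unknown or destroyed handle: #{handle}" }
  end

  # Drops the reference so the object can be garbage-collected.
  def destroy(handle)
    @objects.delete(handle)
    nil
  end

  # How many objects the table is still keeping alive.
  def live_count
    @objects.size
  end
end

table = HandleTable.new
resolver_handle = table.create({ name: "full_name" })
table.get(resolver_handle)      # the original Hash comes back
table.destroy(resolver_handle)  # forgetting this line is the leak
```

As long as the table holds the entry, the garbage collector has to keep the object alive, which is exactly why every create needs a matching destroy.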

But there’s one more thing: in the last two posts, every time we converted a Java string to a C string, we used the method CTypeConversion.toCString to create a “string holder”, and then we used holder.get() to actually get the C string. This… also leaks memory.

And that’s, unfortunately, how far my knowledge on the subject goes. Supposedly we need to call holder.close() to free the memory, but I didn’t find anything in the GraalVM documentation on how to get back to the “holder” when all we have is a char* in C, so that we can call close on it. (Did I mention how incredibly bad the GraalVM documentation is? If I didn’t… well, it’s worse than you can imagine, to the point where I had to use ChatGPT to write parts of the code, otherwise I wouldn’t have gotten anywhere.) So what I did is a hack, but it works: when you convert a char* to a Ruby string (using rb_str_new2), the Ruby API copies the contents of the char* to a memory area that Ruby manages, so the original char* can safely be freed afterwards.

What I did was start a thread, sleep for 100ms, then free the memory. This means we have 100ms to get the char* from GraalVM and make a Ruby string before the char* is deallocated. That’s more than enough in practice, but it’s absolutely not the right way to handle this by any metric:

public final class LibPatao {
  // ... again, lots of code here ...
  private static CCharPointer to_string(String origin) {
    CTypeConversion.CCharPointerHolder holder = CTypeConversion.toCString(origin);
    new Thread(() -> {
      try {
        Thread.sleep(100); // 0.1 seconds
      } catch (InterruptedException e) {
        // Keep the interrupted status instead of swallowing it
        Thread.currentThread().interrupt();
      } finally {
        // Release the C string even if the sleep was interrupted
        holder.close();
      }
    }).start();
    return holder.get();
  }
}

Now, everywhere that used CTypeConversion.toCString needs to use this to_string instead, and that’s it. No more leaked strings (as long as the C side copies the char* within those 100ms).
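
The whole hack depends on that copy, so here’s a small sketch of the “copy first, free later” idea. I’m assuming Ruby’s Fiddle library purely as a stand-in for the C extension (the gem itself uses rb_str_new2), and the buffer is allocated by hand instead of coming from GraalVM:

```ruby
require "fiddle"

# Assumption: Fiddle is only a stand-in here so the sketch runs without
# compiling an extension; it is not what the gem in this series uses.
message = "hello from C"
size = message.bytesize + 1 # room for the trailing NUL byte

# Allocate a C buffer (playing the role of the char* we get from GraalVM)
# and fill it, including the terminator.
address = Fiddle.malloc(size)
c_string = Fiddle::Pointer.new(address, size)
c_string[0, size] = message + "\0"

# Reading the pointer copies the bytes into a Ruby-managed String,
# similar in spirit to rb_str_new2 in the C API.
ruby_copy = c_string.to_s

# The C buffer can now be released; the Ruby copy does not depend on it.
Fiddle.free(address)

puts ruby_copy
```

Once the copy exists, nothing on the Ruby side points at the C buffer anymore, which is why freeing it (immediately here, after 100ms in the hack) is safe.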

Collections of Resolvers

There’s only one thing missing right now: a way to make a “collection of resolvers” from C, so that we can query things. As a simple way to explain it, Pathom (the library we’re exposing) expects a Vector of Resolver objects, and uses it to create a query, which is basically a function. We can call this function (the query), it will call each resolver needed to compute the answer, and we finally get a HashMap as the result.

Unfortunately, there’s no way to create a Vector (or even a simple Java list) from C with GraalVM, so I’m going to expose two functions: one that returns an “empty vector” and a second one that adds an element to any vector. These two functions will be exposed on our “final class” and called from C. Conveniently, Clojure has a Java API to define vectors and to “conj” elements onto them, which lets us expose the functions directly in Java without also defining Clojure functions for these operations. The disadvantage is that vectors are immutable, so every “push” from C produces a new vector, and the old one also needs to be “freed”:

public final class LibPatao {
  // ... again, lots of code here ...
  @CEntryPoint(name = "push_resolver")
  public static ObjectHandle push_resolver(IsolateThread thread, ObjectHandle vector_ptr, ObjectHandle resolver_ptr) {
    IPersistentCollection vector = ObjectHandles.getGlobal().get(vector_ptr);
    Object resolver = ObjectHandles.getGlobal().get(resolver_ptr);
    // conj returns a NEW vector; the old handle stays valid until it's destroyed
    return ObjectHandles.getGlobal().create(RT.conj(vector, resolver));
  }

  @CEntryPoint(name = "empty_vec")
  public static ObjectHandle empty_vec(IsolateThread thread) {
    return ObjectHandles.getGlobal().create(PersistentVector.EMPTY);
  }
}

In this case, PersistentVector.EMPTY is an empty vector (it’s a constant because, remember, vectors are immutable), and RT.conj “conjoins” an element onto a vector, returning a new vector with that element appended. Now we need to consume that from Ruby. To do this, we’ll create a Collection class from the C API (we’re not calling it CollectionImpl because we won’t need to subclass and “normalize” anything on the Ruby side). For the first time, we’ll show the full implementation of this class, so it becomes clear how allocating data structures works, how the “destructor” works, and how to handle lots of the edge cases.

The first thing we need is a C struct to store C data that will come from GraalVM’s side (the “Vector” that we exposed in the Java API):

typedef struct { 
  void* collection; 
} CollectionWrapper;

This is a void* because, in C, the whole Vector object is completely opaque: we can only interact with it by calling the APIs we exposed in our library. When we allocate data for a new Ruby Collection object, we set this collection field to the result of empty_vec, and when we free said object, we call the destroy_handle function to clean up memory:

static void collection_class_free(CollectionWrapper *data) {
  // We destroy the handle
  destroy_handle(global_thread, data->collection);
  data->collection = NULL;
  // And free the struct we allocated
  free(data);
  data = NULL;
}

static VALUE collection_class_allocate(VALUE klass) {
  // We allocate memory for the struct
  CollectionWrapper *data = malloc(sizeof(CollectionWrapper));
  // and we associate the collection to the empty vector
  data->collection = empty_vec(global_thread);
  // finally, we "wrap" the object with the structure.
  VALUE obj = Data_Wrap_Struct(klass, NULL, collection_class_free, data);
  // and return the object
  return obj;
}

This handles the allocation and deallocation of the Ruby object. Please notice that collection_class_allocate is not a constructor – meaning, it’s not the same as defining an initialize method – this code runs before the initialization, and for this class, we don’t actually need an initialize. Now, we’ll define the code for push:

// We'll define the "Collection" class variable here
// so we can use it on this function:
VALUE CollectionImplClass;

static VALUE collection_push(VALUE self, VALUE resolver) {
  // These two lines will get the struct that's wrapped by this object:
  CollectionWrapper *data;
  Data_Get_Struct(self, CollectionWrapper, data);

  // And this here will get data from the struct that's wrapped
  // by the "resolver" object:
  ResolverWrapper *resolver_data;
  Data_Get_Struct(resolver, ResolverWrapper, resolver_data);

  // Next, we ask the GraalVM side for a new "collection" that
  // includes the resolver. Again, it's a void* because it's opaque for us
  void* new_collection = push_resolver(global_thread, data->collection, resolver_data->resolver);

  // And finally, we'll do the same code as the allocator here, so that
  // we can create a new Collection object with this new_collection
  // that came from GraalVM. This also means that Collection
  // is immutable: when we `push` things, we get back a new
  // Collection, we don't change the current one:
  CollectionWrapper *new_data = malloc(sizeof(CollectionWrapper));
  new_data->collection = new_collection;

  // We wrap this new_data into a Collection, and we reuse the "free" function
  return Data_Wrap_Struct(CollectionImplClass, NULL, collection_class_free, new_data);
}

Because a Clojure Vector is immutable, we’ll also make Collection immutable. That simplifies some things in our implementation, and also allows for easier cleanup of Vector objects on the GraalVM side. But there’s a catch: in Ruby, when we call collection.push(resolver), the resolver argument is a Ruby resolver object, and what we want is the “opaque representation” of a Resolver from the GraalVM side. So we need to “unwrap” the ResolverWrapper struct, get the void* out of it, and then send push_resolver both the void* that represents a Clojure Vector and the void* that represents a Clojure Resolver object.
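
From the Ruby side, the contract looks like the sketch below. FakeCollection is a hypothetical pure-Ruby stand-in that I’m using only to show the “push returns a new collection” behaviour; the real Collection keeps a GraalVM handle instead of an Array:

```ruby
# Hypothetical pure-Ruby stand-in for Patao::Collection, used only to show
# the "push returns a new collection" contract described above.
class FakeCollection
  attr_reader :items

  def initialize(items = [])
    # Keep a private, frozen copy so nobody can mutate it in place.
    @items = items.dup.freeze
  end

  # Like RT.conj on a Clojure vector: the receiver is left untouched
  # and a brand new collection is returned.
  def push(item)
    FakeCollection.new(@items + [item])
  end
end

empty = FakeCollection.new
one   = empty.push(:name_resolver)
two   = one.push(:full_name_resolver)

p empty.items
p one.items
p two.items
```

One consequence worth remembering: calling push and ignoring the return value does nothing useful, because the receiver never changes.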

Finally, we wrap up everything:

void Init_patao_impl() {
  graal_create_isolate(NULL, &global_isolate, &global_thread);
  VALUE PataoModule = rb_define_module("Patao");
  // definition of other classes
  CollectionImplClass = rb_define_class_under(PataoModule, "Collection", rb_cObject);
  rb_define_alloc_func(CollectionImplClass, collection_class_allocate);
  rb_define_method(CollectionImplClass, "push", collection_push, 1);
}

And that will create our Collection class in Ruby. For the full implementation, please check the patao_impl.c code on the GitLab page. Now there’s only one thing left to do: we need to wrap up the “QueryImp” object in C. I’m not going to show the implementation because it’s “more of the same”: we have a struct in C, wrap that around the EqlImpl object, handle calling functions from C, this whole thing. But there’s a catch: when we define resolvers, we want to pass a Ruby Array of Resolver objects, but the GraalVM side expects a Collection of Resolvers. And the same problem as before can happen here: the Collection doesn’t keep any Ruby reference to the Resolver objects pushed into it, so each Resolver can go out of scope, get garbage collected, and we’ll crash the interpreter…

… so we need to keep references to resolvers:

module Patao
  class Eql < EqlImpl
    def initialize(resolvers)
      self.resolvers = resolvers
      super()
    end

    def resolvers=(resolvers)
      # We need to store resolvers, not the colls below.
      # The reason is, an Array keeps the record of each
      # individual Resolver, but we never actually did
      # any code to say a "Collection" will keep the record
      # of anything, really.
      @resolvers = resolvers.dup.freeze
      colls = @resolvers.inject(Collection.new) do |coll, resolver|
        coll.push(resolver)
      end
      @eql = gen_eql(colls)
    end
  end
end
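
The bookkeeping part of resolvers= can be tried in isolation. This is a minimal sketch, assuming a hypothetical ResolverList class and a plain Array as the accumulator where the real code uses Collection and gen_eql:

```ruby
# Hypothetical stand-in for Patao::Eql that keeps only the bookkeeping.
class ResolverList
  attr_reader :resolvers, :pushed

  def initialize(resolvers)
    # Same idea as in Eql#resolvers=: keep our own frozen copy so every
    # resolver stays referenced for as long as this object lives.
    @resolvers = resolvers.dup.freeze
    # Fold the list into an accumulator, one "push" at a time.
    @pushed = @resolvers.inject([]) { |acc, resolver| acc + [resolver] }
  end
end

mine = [:name, :full_name]
list = ResolverList.new(mine)
mine << :something_else # the caller's array changes...
p list.resolvers        # ...but the stored copy does not
```

The dup.freeze pair is what makes the stored list independent of whatever the caller does with the original Array afterwards.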

With this… we can finally use Pathom from Ruby!

# A resolver
name = Patao::Resolver.new(
  'name',
  output: [:user__name, :user__sn],
  input: []
) do |inputs|
  {
    user__name: 'Maurício',
    user__sn: 'Szabo'
  }
end

# A different resolver
full_name = Patao::Resolver.new(
  'full_name',
  output: [:user__full_name],
  input: [:user__name, :user__sn]
) do |i|
  {
    user__full_name: "#{i[:user__name]} #{i[:user__sn]}"
  }
end

# The query object
eql = Patao::Eql.new([name, full_name])

# Querying a user with the default arguments
p eql.query([:user__full_name])
# => {user__full_name: "Maurício Szabo"}

# And now supplying a different surname
p eql.query({user__sn: "Santos"}, [:user__full_name])
# => {user__full_name: "Maurício Santos"}