In the first post of this series we decided to compile a Clojure library to a native shared library (with the GraalVM native-image command), and we used “isolates” so that a “global variable” in Clojure (and Java) would appear to be “local” in Ruby. Then, in the second post, we decided to make things local: instead of “making a resolver then storing it in a global variable”, we decided to “make a resolver and return it to Ruby”. To abstract the nature of “callbacks” we used a CFunctionPointer on the Java side, and we sent the “block” from Ruby as an “opaque object” to Java, using GraalVM’s VoidPointer object.
Now we need to do the opposite: we need to return a “resolver” to Ruby, then build a “list of resolvers” on the Ruby side and send it to the Clojure side somehow. These Clojure/Java objects (a Resolver, and a list of Resolvers) will also be “opaque objects”, but only in C: we’ll receive them as void* because they can’t be represented in C-land or in Ruby-land, while in Java-land (and, by extension, in Clojure) they need to retain their types. The object GraalVM provides for this task is an ObjectHandle.
We’ll also fix a lot of memory leaks, so let’s move on!
We already had a preview of an ObjectHandle in the last post:
    public final class LibPatao {
        // ... lots of code here ...

        @CEntryPoint(name = "gen_resolver")
        public static ObjectHandle gen_resolver(IsolateThread thread, Callback callback, VoidPointer arg1,
                                                CCharPointer name, CCharPointer params) {
            AdaptedCallback adaptedCallback = new AdaptedCallback(callback, arg1);
            Object resolver = big_duck.core.gen_resolver(
                adaptedCallback,
                CTypeConversion.toJavaString(name),
                CTypeConversion.toJavaString(params)
            );
            // We have a global "storage" that we can get with "ObjectHandles.getGlobal()"
            // We then "create" a `ObjectHandle` and that will be what C will "see" as the
            // opaque object we need
            return ObjectHandles.getGlobal().create(resolver);
        }
    }
Creating an ObjectHandle is very easy, as shown above. But there’s a problem: an ObjectHandle is “stored” by the global ObjectHandles object, and it stays there to guarantee that the underlying object won’t be garbage-collected by Java (and, as a consequence, become inaccessible from C). Because we created the handle, we also need a way to “free” it, so memory management is now a manual job, shared between the GraalVM side and C.
So, to recap: if we have a complex object in C, we pass it to GraalVM as a VoidPointer, and the responsibility for cleaning up that object’s memory lies with the C code. But if we have a complex object in Java, we pass it to C as an ObjectHandle, and the responsibility for cleaning up that object’s memory lies with the Java code. This means we also need to expose a “destroy_handle” function or something similar, so that the memory can be freed:
    public final class LibPatao {
        // ... lots of code here ...

        @CEntryPoint(name = "destroy_handle")
        public static void destroy_handle(IsolateThread thread, ObjectHandle handle) {
            ObjectHandles.getGlobal().destroy(handle);
        }
    }
Very simple code, and it does the job. We just need to remember to call it from the C side when we don’t need the resolver anymore (spoiler alert: it’ll be part of the “free” implementation in the ResolverImpl class).
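To make the ownership rule concrete, here is a minimal pure-Ruby model of a handle table. It is only an analogy for what the global ObjectHandles object does for us, not GraalVM’s implementation: the HandleTable class and all of its method names are hypothetical, made up for this sketch. The table keeps a strong reference to each object until destroy is called, which is why forgetting to call destroy_handle leaks memory.

```ruby
# A toy model of a handle table (hypothetical, not GraalVM's real code).
# The "C side" only ever sees the integer handle; the object stays on our side.
class HandleTable
  def initialize
    @objects = {}     # handle => object; this reference keeps the object alive
    @next_handle = 1
  end

  # Registers an object and returns an opaque integer handle for it.
  def create(object)
    handle = @next_handle
    @next_handle += 1
    @objects[handle] = object
    handle
  end

  # Looks up the object behind a handle; raises KeyError for unknown handles.
  def get(handle)
    @objects.fetch(handle)
  end

  # Drops the reference, so the object can be garbage-collected again.
  def destroy(handle)
    @objects.delete(handle)
    nil
  end

  # Number of handles that were created but not destroyed yet.
  def live_count
    @objects.size
  end
end

table = HandleTable.new
handle = table.create({ name: 'resolver' })
table.get(handle)     # the original object, reached through the opaque handle
table.destroy(handle) # without this call, the entry stays in the table forever
```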
But there’s one more thing: in the last two posts, every time we converted Java strings to C, we used the method CTypeConversion.toCString to create a “string holder”, and then we used holder.get() to actually get the C string. This… also leaks memory.
And that’s, unfortunately, as far as my knowledge on the subject goes. Supposedly we need to call holder.close() to free the memory, but I didn’t find anything in the GraalVM documentation on how to get the “holder” back from a char* in C so that we can call close on it. (Did I mention how incredibly bad the GraalVM documentation is? If I didn’t… well, it’s worse than you can imagine, to the point that I had to use ChatGPT to write parts of the code, otherwise I would not have been able to get anywhere.) So what I did is a hack, but it works: when you convert a char* to a Ruby string (using rb_str_new2), the Ruby API copies the contents of the char* to a memory area that Ruby manages, so after that you can safely “free” the char*.
What I did was start a thread, sleep for 100ms, then free the memory. This means we have 100ms to get the char* from GraalVM and make a Ruby string before the char* is deallocated. In practice that’s more than enough time, but it’s absolutely not the right way to handle this by any metric:
    public final class LibPatao {
        // ... again, lots of code here ...

        private static CCharPointer to_string(String origin) {
            CTypeConversion.CCharPointerHolder holder = CTypeConversion.toCString(origin);
            new Thread(() -> {
                try {
                    Thread.sleep(100); // 0.1 seconds
                    holder.close();
                } catch (InterruptedException e) {
                    // Ignored: if the sleep is interrupted, the holder is never closed
                }
            }).start();
            return holder.get();
        }
    }
Now, every place that used CTypeConversion.toCString needs to use this to_string instead, and that’s it. No more memory leaks.
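The whole hack rests on one fact: once Ruby has copied the bytes into its own String, the original C buffer is no longer needed. Below is a minimal sketch of that copy-then-free behaviour. It uses Ruby’s Fiddle standard library as a stand-in for the C extension, which is an assumption made purely for illustration: the real code calls rb_str_new2 from C, and the copy_and_free helper is hypothetical.

```ruby
require 'fiddle'

# Copies a NUL-terminated C string into a Ruby String, then frees the C buffer.
# `copy_and_free` is a hypothetical helper: it plays the role of calling
# rb_str_new2 and then releasing the char* on the C side.
def copy_and_free(pointer)
  ruby_string = pointer.to_s  # copies the bytes up to the first NUL into Ruby-managed memory
  Fiddle.free(pointer.to_i)   # the C buffer can be released right away
  ruby_string
end

# Simulate a char* handed over by native code: malloc a buffer and fill it.
source = "hello\0"
buffer = Fiddle::Pointer.malloc(source.bytesize)
buffer[0, source.bytesize] = source

copied = copy_and_free(buffer)
# `copied` is still a valid Ruby String here, even though the C buffer is gone.
```

Because the copy happens synchronously, the buffer only needs to outlive that single call.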
Collections of Resolvers
There’s only one thing missing right now: a way to make a “collection of resolvers” from C, so that we can query things. To put it simply, Pathom (the library we’re exposing) expects a Vector of Resolver objects, which is used to create a query – basically a function. We can call this function (the query), which calls each resolver it needs and finally returns a HashMap as the result. Unfortunately, there’s no way to create a Vector (or even a simple Java list) from C with GraalVM, so I’m going to expose two functions: one that returns an “empty vector” and another that adds an element to any vector. These two functions will be exposed on our “final class” and called from C. Amazingly, Clojure has a Java API to define vectors and to “conj” (append) elements onto them (the disadvantage being that vectors are immutable, so when we “push” things from C we also need to “free” the old vector). This Java API allows us to expose the functions directly in Java, without needing to also define Clojure functions for these operations:
    public final class LibPatao {
        // ... again, lots of code here ...

        @CEntryPoint(name = "push_resolver")
        public static ObjectHandle push_resolver(IsolateThread thread, ObjectHandle vector_ptr, ObjectHandle resolver_ptr) {
            IPersistentCollection vector = ObjectHandles.getGlobal().get(vector_ptr);
            Object resolver = ObjectHandles.getGlobal().get(resolver_ptr);
            return ObjectHandles.getGlobal().create(RT.conj(vector, resolver));
        }

        // Exported to C as "empty_vec" (the name given to @CEntryPoint)
        @CEntryPoint(name = "empty_vec")
        public static ObjectHandle nth(IsolateThread thread) {
            ObjectHandle h = ObjectHandles.getGlobal().create(PersistentVector.EMPTY);
            return h;
        }
    }
In this case, PersistentVector.EMPTY is an empty vector (it’s a constant because, remember, Vectors are immutable), and RT.conj appends an element to a vector, returning a new one. Now we need to consume that from Ruby. To do this, we’ll create a Collection class from the C API (we’re not calling it CollectionImpl this time because we won’t need to subclass and “normalize” anything on the Ruby side). For the first time, we’ll show the full implementation of this class, so it becomes clear how allocating data structures works, how the “destructor” works, and how to handle a lot of the edge cases.
The first thing we need is a C struct to store the C data that comes from GraalVM’s side (the “Vector” that we exposed in the Java API):
    typedef struct {
      void* collection;
    } CollectionWrapper;
This is a void* because, in C, the whole Vector object is completely opaque: we can only interact with it by calling the APIs we exposed in our library. When we allocate data for a new Ruby Collection object, we’ll set this collection field to the result of empty_vec, and when we free said object, we’ll call the destroy_handle function to clean up memory:
    static void collection_class_free(CollectionWrapper *data) {
      // We destroy the handle
      destroy_handle(global_thread, data->collection);
      data->collection = NULL;
      // And free the struct we allocated
      free(data);
      data = NULL;
    }

    static VALUE collection_class_allocate(VALUE klass) {
      // We allocate memory for the struct
      CollectionWrapper *data = malloc(sizeof(CollectionWrapper));
      // and we associate the collection to the empty vector
      data->collection = empty_vec(global_thread);
      // finally, we "wrap" the object with the structure.
      VALUE obj = Data_Wrap_Struct(klass, NULL, collection_class_free, data);
      // and return the object
      return obj;
    }
This handles the allocation and deallocation of the Ruby object. Please notice that collection_class_allocate is not a constructor, meaning it’s not the same as defining an initialize method: this code runs before the initialization, and for this class we don’t actually need an initialize. Now, we’ll define the code for push:
    // We'll define the "Collection" class variable here
    // so we can use it on this function:
    VALUE CollectionImplClass;

    static VALUE collection_push(VALUE self, VALUE resolver) {
      // These two lines will get the struct that's wrapped by this object:
      CollectionWrapper *data;
      Data_Get_Struct(self, CollectionWrapper, data);

      // And this here will get data from the struct that's wrapped
      // by the "resolver" object:
      ResolverWrapper *resolver_data;
      Data_Get_Struct(resolver, ResolverWrapper, resolver_data);

      // Finally, we'll return a new "collection" from the GraalVM
      // side. Again, it's a void* because it's opaque for us
      void* new_collection = push_resolver(global_thread, data->collection, resolver_data->resolver);

      // And finally, we'll do the same code as the allocator here, so that
      // we can create a new Collection object with this new_collection
      // that came from GraalVM. This also means that Collection
      // is immutable: when we `push` things, we get back a new
      // Collection, we don't change the current one:
      CollectionWrapper *new_data = malloc(sizeof(CollectionWrapper));
      new_data->collection = new_collection;
      // We wrap this new_data into a Collection, and we reuse the "free" function
      return Data_Wrap_Struct(CollectionImplClass, NULL, collection_class_free, new_data);
    }
Because a Clojure Vector is immutable, we’ll also make Collection immutable. That simplifies some things in our implementation, and also allows for easier cleanup of Vector objects on GraalVM’s side. But there’s a catch: in Ruby, when we call collection.push(resolver), the resolver argument is a Ruby resolver object, and what we want is the “opaque representation” of a Resolver from the GraalVM side. So we need to “unwrap” the ResolverWrapper struct, get the void* out of it, and send both the void* that represents a Clojure Vector and the void* that represents a Clojure Resolver object.
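The Ruby-visible behaviour of this design can be sketched in pure Ruby. The FakeCollection class below is hypothetical and only models the semantics (the real Collection is implemented in C and holds a GraalVM handle, not an Array): push never changes the receiver, it returns a new collection.

```ruby
# A pure-Ruby stand-in for the C-backed Collection (hypothetical, for illustration).
# It mimics the semantics only: `push` returns a new object and leaves `self` alone.
class FakeCollection
  attr_reader :items

  def initialize(items = [])
    @items = items.dup.freeze
  end

  # Mirrors what push_resolver does: builds a new collection with one more element.
  def push(item)
    FakeCollection.new(@items + [item])
  end
end

empty = FakeCollection.new
one = empty.push(:name_resolver)
two = one.push(:full_name_resolver)

empty.items # still [], the original collection is untouched
two.items   # [:name_resolver, :full_name_resolver]

# Because push returns the new collection, folding an Array into a
# collection is a natural fit for inject:
all = [:a, :b, :c].inject(FakeCollection.new) { |coll, item| coll.push(item) }
```

The practical consequence is that the return value of push must always be kept: calling push and discarding the result leaves you with the old collection.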
Finally, we wrap up everything:
    void Init_patao_impl() {
      graal_create_isolate(NULL, &global_isolate, &global_thread);
      VALUE PataoModule = rb_define_module("Patao");

      // definition of other classes

      CollectionImplClass = rb_define_class_under(PataoModule, "Collection", rb_cObject);
      rb_define_alloc_func(CollectionImplClass, collection_class_allocate);
      rb_define_method(CollectionImplClass, "push", collection_push, 1);
    }
And that will create our Collection class in Ruby. For the full implementation, please check the patao_impl.c code on the GitLab page. Now there’s only one thing left to do: we need to wrap up the “QueryImp” object in C. I’m not going to show the implementation because it’s “more of the same”: we have a struct in C, wrap that around the EqlImpl object, handle calling functions from C, this whole thing. But there’s a catch: when we define resolvers, we want to give a Ruby Array of Resolver objects, but GraalVM expects a Ruby Collection of Resolvers. The same problem as before can happen here: if the Collection goes out of scope (and then each Resolver goes out of scope too), it’ll be garbage-collected, and we’ll crash the interpreter…
… so we need to keep references to resolvers:
    module Patao
      class Eql < EqlImpl
        def initialize(resolvers)
          self.resolvers = resolvers
          super()
        end

        def resolvers=(resolvers)
          # We need to store resolvers, not the colls below.
          # The reason is, an Array keeps the record of each
          # individual Resolver, but we never actually did
          # any code to say a "Collection" will keep the record
          # of anything, really.
          @resolvers = resolvers.dup.freeze
          colls = @resolvers.inject(Collection.new) do |coll, resolver|
            coll.push(resolver)
          end
          @eql = gen_eql(colls)
        end
      end
    end
With this… we can finally use Pathom from Ruby!
    # A resolver
    name = Patao::Resolver.new(
      'name',
      output: [:user__name, :user__sn],
      input: []
    ) do |inputs|
      { user__name: 'Maurício', user__sn: 'Szabo' }
    end

    # A different resolver
    full_name = Patao::Resolver.new(
      'full_name',
      output: [:user__full_name],
      input: [:user__name, :user__sn]
    ) do |i|
      { user__full_name: "#{i[:user__name]} #{i[:user__sn]}" }
    end

    # The query object
    eql = Patao::Eql.new([name, full_name])

    # Querying a user with the default arguments
    p eql.query([:user__full_name])
    # => {user__full_name: "Maurício Szabo"}

    # And now supplying a different surname
    p eql.query({user__sn: "Santos"}, [:user__full_name])
    # => {user__full_name: "Maurício Santos"}