My work on a next-generation proof of concept for ZMQ bindings for Node.js #189

@rolftimmermans

TL;DR: I'm opening this issue to start a discussion about the future of zeromq.js and my work on a proof of concept for a next-generation version. Please let me know your opinion!

Background

We have recently taken a system into production that uses zeromq.js. This works well, but I encountered a number of issues with this library that prompted me to think about solutions. Unfortunately, some problems did not appear to be fixable without a massive overhaul of the internals. I will briefly describe the main problems.

Problem 1: Sent messages are queued, but not with ZMQ

Zeromq.js uses an internal queuing mechanism for sent messages. Essentially it pushes all messages onto a JavaScript array and sends them out as soon as the socket is ready for writing. This works nicely except when it doesn't:

  • If writing is not possible, messages are queued indefinitely, eventually exhausting all system memory if messages continue to be sent.
  • Because writes only happen when the socket is writable, write timeouts have no effect: the socket just never enters the writable state. If the application waits for a message to be sent before continuing (by adding a callback), it will be stuck without an obvious way to recover.
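
The first failure mode can be illustrated without ZMQ at all. The following is a minimal sketch, not code from zeromq.js: `BoundedSendQueue`, `tryPush` and the `highWaterMark` parameter are hypothetical names chosen for illustration. It contrasts a plain array, which accepts messages forever, with a queue that refuses new messages once a limit is reached, so the caller finds out that the peer is not keeping up.

```javascript
// Hypothetical illustration; not part of zeromq.js or libzmq.

// Unbounded: every message is accepted, even if nothing is ever sent.
const unbounded = []
for (let i = 0; i < 1000; i++) unbounded.push(`msg ${i}`)

// Bounded: refuse messages beyond a high water mark.
class BoundedSendQueue {
  constructor(highWaterMark) {
    this.highWaterMark = highWaterMark
    this.items = []
  }

  // Returns true if the message was queued, false if the queue is full.
  tryPush(msg) {
    if (this.items.length >= this.highWaterMark) return false
    this.items.push(msg)
    return true
  }

  // Removes and returns the oldest message, or undefined if empty.
  shift() {
    return this.items.shift()
  }
}

const bounded = new BoundedSendQueue(3)
let rejected = 0
for (let i = 0; i < 1000; i++) {
  if (!bounded.tryPush(`msg ${i}`)) rejected++
}

console.log(unbounded.length) // 1000
console.log(bounded.items.length, rejected) // 3 997
```

With a bound in place the application can decide what to do when the queue is full (wait, drop, or report an error) instead of silently consuming memory.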

Related issues: #152 #185 #124

From #152 (comment) I understand that this queueing was built to improve performance, which means that matching the performance of the current library should be a goal of any new version.

Problem 2: Messages that are received are read automatically

Readable sockets follow the event emitter pattern. This is common in Node.js APIs, so it makes sense for most people. There is one important drawback, though: messages are read automatically.

  • Sequential processing of messages is not possible – all available messages will flow in automatically.
  • Read timeouts have no effect, which may break REQ/REP scenarios in particular.
  • An application cannot reliably signal that it cannot process any more messages.

It appears that a call to pause()/resume() offers a suitable workaround, however:

  • A call to pause() also disables writes, which may not be intentional.
  • A call to pause() will always dispatch all messages that can be read from the ZMQ queue without blocking before it actually stops reading from the socket.
  • A user has to explicitly call pause()/resume() to approximate features that are offered by libzmq by default, which is not a very friendly developer experience.
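
The difference between automatic (push-style) delivery and explicit (pull-style) reads can be sketched with Node's built-in `events` module. This is an illustration only: `MessageQueue`, `deliver` and `receive` are hypothetical names, not the zeromq.js API, and no sockets are involved.

```javascript
const { EventEmitter } = require("events")

// Push style: the emitter decides when the handler runs. All three messages
// are delivered immediately, whether or not the application is ready.
const emitter = new EventEmitter()
const pushed = []
emitter.on("message", msg => pushed.push(msg))
for (const msg of ["a", "b", "c"]) emitter.emit("message", msg)
console.log(pushed) // [ 'a', 'b', 'c' ]

// Pull style (hypothetical MessageQueue): messages stay queued until the
// application explicitly asks for the next one.
class MessageQueue {
  constructor() {
    this.messages = []
    this.waiting = []
  }

  // Called by the transport when a message arrives.
  deliver(msg) {
    const resolve = this.waiting.shift()
    if (resolve) resolve(msg)
    else this.messages.push(msg)
  }

  // Resolves with the next message, waiting if none is queued yet.
  receive() {
    if (this.messages.length > 0) return Promise.resolve(this.messages.shift())
    return new Promise(resolve => this.waiting.push(resolve))
  }
}

const queue = new MessageQueue()
for (const msg of ["a", "b", "c"]) queue.deliver(msg)

// Takes exactly one message; the other two remain queued.
async function consumeOne() {
  return await queue.receive()
}
```

In the pull-style variant the consumer sets the pace, which is what makes sequential processing and read timeouts meaningful.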

Related issues: #139

Other potential improvements

There were some other things we encountered while working on the library that we believed could be improved:

  • This library includes many ZMQ options, but some are not (yet) added. Adding an option requires changes to JS & C++ code. It would be nice to be able to add options without having to change the library itself. This allows users to polyfill options available in their version of ZMQ, before they are added to the library.
  • The future of Node.js native addons seems to be N-API (possibly with a C++ wrapper). This is still experimental though.
  • It would be nice to support promises instead of callbacks. Node.js has great support for promises in all supported versions.
  • It would be nice to support async iterators for incoming messages. This is only natively supported with an experimental flag in Node.js version 8. It can be used with Babel or TypeScript on any supported Node.js version.
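
To make the first bullet more concrete, here is one way user-defined options could work. This is a sketch under stated assumptions, not the API of zeromq.js or of the proof of concept: `FakeSocket`, `getOption`, `setOption` and `defineOption` are hypothetical names, and `FakeSocket` stores values in a `Map` instead of calling into libzmq. The numeric IDs are the values of ZMQ_LINGER and ZMQ_SNDHWM in zmq.h.

```javascript
// Hypothetical stand-in for a native socket with generic option accessors.
class FakeSocket {
  constructor() {
    this.options = new Map()
  }

  getOption(id) {
    return this.options.get(id)
  }

  setOption(id, value) {
    this.options.set(id, value)
  }
}

// Defines a named property that forwards to the generic accessors, so a
// user can add an option without touching the library internals.
function defineOption(SocketClass, name, id) {
  Object.defineProperty(SocketClass.prototype, name, {
    get() { return this.getOption(id) },
    set(value) { this.setOption(id, value) },
  })
}

defineOption(FakeSocket, "linger", 17)            // ZMQ_LINGER
defineOption(FakeSocket, "sendHighWaterMark", 23) // ZMQ_SNDHWM

const socket = new FakeSocket()
socket.linger = 0
socket.sendHighWaterMark = 1000
console.log(socket.getOption(17), socket.getOption(23)) // 0 1000
```

The point of the design is that only the generic accessors need native code; named options become a thin JavaScript layer that users can extend themselves.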

My work on a proof of concept

Given the nature of problems 1 & 2, I set out to rewrite the internals and design a better API that is suitable for honouring high water marks and timeouts for both send & receive operations.

Because I was now effectively rewriting the entire library I also decided to address the additional improvements mentioned above. Especially the move to N-API is non-trivial and influences most of the native code.

To summarize, the goals of my experiment were:

  • Design new promise-based API that solves usability issues with high water marks and timeouts.
  • Attempt to use Node.js experimental N-API which offers an outlook of binary compatibility across Node.js versions once it stabilises. (This prompted several patches to the C++ node-addon-api wrapper and potential improvements to the N-API itself).
  • Allow easy definition of new socket types and options by consumers of the library, without changes to the internals.
  • Match performance of the current library & reduce the interdependency between JS and C++ code (in practice this boils down to having all socket & context logic in C++).

What is included and working

  • Support for all socket types and socket options up to ZMQ 4.2.
  • Support for ZMQ 4.0+.
  • Test suite for all code written so far – covering at least all scenarios in the current test suite (except for features still missing, see below).
  • Successful builds on macOS & Linux.

What is still missing

The following features have not yet been added. My plan is to implement them over the next weeks (or years):

  • A proxy class
  • A curve keypair helper function
  • Monitoring sockets for events
  • Support for prebuilt binaries
  • Draft sockets and associated (new) APIs
  • Thorough documentation of the API
  • Windows support
  • Electron support
  • Real world testing

So, how do I use it?

To build the proof of concept version you need a local installation of libzmq with development headers. For example, on Debian/Ubuntu you can run apt-get install libzmq3-dev, or on macOS you can run brew install zeromq.

Then clone the repository and run the tests with yarn test or npm test.

Install with npm install zeromq-ng.

Next, have a look at the examples in the README. To give you a taste of the API, this is how you can use it:

const zmq = require("zeromq-ng")

const push = new zmq.Push
const pull = new zmq.Pull

async function produce() {
  while (!push.closed) {
    /* Queues a message and resolves the promise on success.

       This is equivalent to zmq_msg_send() in the sense that it can "block",
       but it does not actually block the internal Node.js event queue.

       If the message cannot be queued or times out the promise is rejected. */

    await push.send("some work")
    await new Promise(resolve => setTimeout(resolve, 500))
  }
}

async function consume() {
  while (!pull.closed) {
    /* Waits for a new message to be received. Any timeout will cause the
       promise to be rejected. */

    const [msg] = await pull.receive()
    console.log("work: %s", msg.toString())
  }

  /* An alternative to this loop is to use async iterators:

     for await (const [msg] of pull) {
       console.log("work: %s", msg.toString())
     }
  */
}

async function run() {
  await push.bind("tcp://127.0.0.1:3000")
  pull.connect("tcp://127.0.0.1:3000")

  await Promise.all([produce(), consume()])
}

run().catch(console.error)

If you made it this far reading this post, thank you for your attention!

What next?

I'd like to invite you to have a look at https://github.com/rolftimmermans/zeromq-ng and give your feedback.

Specifically, I'd like to discuss two scenarios:

  1. I publish this as a new library that exists separately from zeromq.js
  2. This can be the beginning of a next major (API incompatible) release of zeromq.js

I'd be completely happy with either.

In addition to the above, I am looking for feedback from users who have experience with one or more of the problems mentioned above (or even other problems!). It would be nice if you could give your opinion on the API or even test the proof of concept.

I am also in the process of writing and collecting benchmarks and will post the results soon. Based on what I have tested so far I'm expecting similar performance in general.
