Tag Archives: rxcpp

Batching Data by Time or Count Using Reactive Extensions (Rx)

Motivation

The time-series database InfluxDB provides an HTTP API to write data. Data points (measurements) are inserted via a line protocol, which allows batching of data by writing multiple points in one HTTP request.
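To make the batching idea concrete, here is a minimal sketch of how several points could be joined into one request body. The `point` struct and the `to_line`/`to_batch` helpers are hypothetical names for illustration only, and the sketch assumes a single numeric field and no tags; escaping of spaces and commas is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical point type for illustration; not part of any InfluxDB client.
struct point {
    std::string measurement;
    std::string field;
    double value;
    std::int64_t timestamp_ns;
};

// Formats one point as "<measurement> <field>=<value> <timestamp>".
// Escaping of special characters is deliberately left out of this sketch.
std::string to_line(point const& p) {
    assert(!p.measurement.empty() && !p.field.empty());
    return p.measurement + " " + p.field + "=" + std::to_string(p.value)
         + " " + std::to_string(p.timestamp_ns);
}

// Joins the lines with '\n'; the result would be the body of one write request.
std::string to_batch(std::vector<point> const& points) {
    std::string body;
    for (auto const& p : points) {
        if (!body.empty()) body += '\n';
        body += to_line(p);
    }
    return body;
}
```

The open question the rest of this post deals with is when to cut such a batch: after a fixed number of points, after a fixed time, or whichever comes first.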

While experimenting with a simple InfluxDB C++ client, I wanted to create an asynchronous fire-and-forget API, so that the data points can be sent over HTTP without blocking the instrumented C++ code. Several “readymade” options to implement concurrency in this scenario are available.

A simple PAIR of ZeroMQ sockets would do the job, but I’d have to implement batching separately. Thus, I turned my attention to a higher-level abstraction: Rx.

Rx Window Operator

Quickly looking through the cross-language Reactive Extensions site, I found the right operator: Window.

Luckily, this operator has been implemented in RxCpp, so I proceeded with the experiment.

Batching Design

Figure: Batching using the Rx Window operator (CC BY 3.0 reactivex.io; see footnote 1)

The window operator takes an observable sequence of data and splits it into windows (batches) of observables. To batch requests, each window is aggregated into a single value once the window has emitted its last item (via other aggregating Rx operators).
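Stripped of the asynchronous part, windowing by count is just slicing a sequence. The following is a plain C++ sketch of that idea, not the RxCpp operator; `window_by_count` is a hypothetical helper, and unlike Rx windows, which are themselves observables, it works on an already complete vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Splits `values` into consecutive windows of at most `max_count` items.
std::vector<std::vector<int>> window_by_count(std::vector<int> const& values,
                                              std::size_t max_count) {
    assert(max_count > 0);
    std::vector<std::vector<int>> windows;
    for (int v : values) {
        // open a new window when there is none yet or the current one is full
        if (windows.empty() || windows.back().size() == max_count) {
            windows.emplace_back();
        }
        windows.back().push_back(v);
    }
    return windows;
}
```

For example, five values with a cap of two would yield windows of sizes 2, 2 and 1.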

A Toy Problem

To validate the approach, the following problem is set:

Given a stream of integers, append the integers into a series of strings, either every second, or every N integers.

String appends with integer-to-string conversions in C++ will be done via the {fmt} library.
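Before reaching for Rx, it helps to see what the problem asks for in plain C++. The sketch below is a hand-rolled batcher under some loudly stated assumptions: the class and its members are hypothetical, time is passed in explicitly instead of being read from a clock, and the age of a batch is measured from its first value, which is not necessarily how the Rx operator places its time windows.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical batcher: closes a batch by count or by age, whichever comes first.
class batcher {
public:
    using ms = std::chrono::milliseconds;

    batcher(ms max_age, std::size_t max_count)
        : max_age_(max_age), max_count_(max_count) {
        assert(max_count > 0);
    }

    // Appends `v` at time `now`; an overdue batch is closed first.
    void push(int v, ms now) {
        tick(now);
        if (count_ == 0) opened_ = now;  // first value opens the batch
        buffer_ += std::to_string(v);
        if (++count_ == max_count_) flush();
    }

    // Closes a non-empty batch whose age has reached max_age.
    void tick(ms now) {
        if (count_ > 0 && now - opened_ >= max_age_) flush();
    }

    std::vector<std::string> const& batches() const { return batches_; }

private:
    void flush() {
        batches_.push_back(std::move(buffer_));
        buffer_.clear();  // a moved-from string is valid; make it empty again
        count_ = 0;
    }

    ms max_age_;
    std::size_t max_count_;
    ms opened_{0};
    std::size_t count_ = 0;
    std::string buffer_;
    std::vector<std::string> batches_;
};
```

Even this small version has to juggle two triggers and a piece of mutable state, and somebody still has to call `tick` from a timer. That bookkeeping is what the Rx operator takes over.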

Batching in One Line of Code

A stream of numbers batched either by time or count:

auto values = rxcpp::observable<>::range(1, 1'000'000)
    .window_with_time_or_count(std::chrono::seconds(1), 100'000);

Note that there is an almost one-to-one translation into a C# version:

var values = Observable.Range(1, 1000000)
               .Window(TimeSpan.FromSeconds(1), 100000);

This indicates the power of the Rx abstraction across languages. The Rx website organizes its documentation by operator, which makes it easy to translate Rx code from one language to another.

Aggregating the Batches

In order to do something useful with the batched data, the Scan operator is used to gather the data in a string buffer, and after the last value has been received, the string buffer is assembled into a string and processed:

values.subscribe(
    [](rxcpp::observable<int> window) {
        // append the number to the buffer
        window.scan(
            std::make_shared<fmt::MemoryWriter>(),
            [](std::shared_ptr<fmt::MemoryWriter> const& w, int v)
        {
            *w << v;
            return w;
        })

        // what if the window is empty? Provide at least one empty value
        .start_with(std::make_shared<fmt::MemoryWriter>())

        // take the last value
        .last()

        // print something fancy
        .subscribe([](std::shared_ptr<fmt::MemoryWriter> const& w) {
            fmt::print(
                "Len: {} ({}...)\n",
                w->size(),
                w->str().substr(0, 42)
            );
        });            
    }
);
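The difference between a running accumulation (Scan) and taking only its final value (Last) can be illustrated without Rx. This is a minimal sketch on a plain vector; `scan_to_strings` is a hypothetical helper, and it copies the accumulated string at every step, whereas the code above shares one buffer through a `shared_ptr` to avoid exactly that.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Emits every intermediate accumulation of the window, like a running fold.
std::vector<std::string> scan_to_strings(std::vector<int> const& window) {
    std::vector<std::string> out;
    std::string acc;
    for (int v : window) {
        acc += std::to_string(v);
        out.push_back(acc);  // one output per input, as with Scan
    }
    assert(out.size() == window.size());
    return out;
}
```

For the window 1, 2, 3 the accumulations are "1", "12" and "123"; only the final element is the finished batch. For an empty window there is no element at all, which matters in the next section.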

The Tale of Two Bugs

In the initial (non-TDD) spike, the batching seemed to work. However, something caught my attention (the code bites back):

[window 0] Create window
Count in window: 170306
Len: 910731 (123456789101112131415161718192021222324252...)

The window wasn’t capped at 100’000. This could have been either a misunderstanding or a bug, so I filed a hesitant issue #277. As it turned out, it was indeed a bug, which was then fixed in no time. However, the first bug had hidden another one: the spike implementation started to crash at the end. When all the windows were capped by count and not by time, the last window was empty, as all values fit exactly into 10 batches.

The Last operator rightly caused an exception due to an empty sequence. Obviously, there’s no last value in an empty sequence. Rubber ducking and a hint from Kirk Shoop fixed the issue by utilizing the StartWith operator to guarantee that the sequence is never empty. An empty string buffer can be ignored more easily downstream.
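The second bug and its fix can be reproduced on plain vectors. This is a sketch of the idea only; `last_of` and `start_with` are hypothetical free functions that mimic the behavior described above, not the RxCpp operators.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Mimics Last: there is no last value in an empty sequence, so it throws.
std::string last_of(std::vector<std::string> const& seq) {
    if (seq.empty()) throw std::out_of_range("last_of() on an empty sequence");
    return seq.back();
}

// Mimics StartWith: prepends a seed so that the result is never empty.
std::vector<std::string> start_with(std::string seed, std::vector<std::string> seq) {
    seq.insert(seq.begin(), std::move(seed));
    assert(!seq.empty());
    return seq;
}
```

With the seed in front, a non-empty window still yields its final accumulation, while an empty window yields the empty seed, which is easy to skip downstream.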

Active Object

The active object pattern was applied to implement a fire-and-forget asynchronous API. An Rx Subject bridges the function call and the “control-inverted” observable:

struct async_api {
    //...
    rxcpp::subjects::subject<line> subj;
    //...

    async_api(...)
    {
        auto incoming_requests = subj
            .get_observable()
            .map([](auto line) {
                return line.to_string();
            });

        incoming_requests
            .window_with_time_or_count(
                window_max_ms,
                window_max_lines,
                // schedule window caps on a new thread
                rxcpp::synchronize_new_thread()
            )
            .subscribe(...)
        ;
    }

    // fire-and-forget
    void insert(line const& line)
    {
        subj
            .get_subscriber()
            .on_next(line);
    }
};

In order not to block the caller (which would be the default behavior), the observable watches the values from each window on a new thread. Here, scheduling on a thread pool (currently missing in RxCpp) would probably be beneficial.

While this implementation might not be an optimal one, the declarative nature of Rx, once the basics are understood, allows one to “make it work and make it right” pretty quickly by composing the right operators.
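For comparison, this is roughly what a hand-written active object looks like with nothing but the standard library. It is a sketch under assumptions, not the implementation used in the client: the class name and members are hypothetical, it has no batching at all, and it processes one line at a time on a single worker thread.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

// Hypothetical active object: callers enqueue, one worker thread processes.
class active_object {
public:
    using handler = std::function<void(std::string const&)>;

    explicit active_object(handler h)
        : handler_(std::move(h)), worker_([this] { run(); }) {
        assert(handler_);
    }

    // Drains what is still queued, then stops the worker.
    ~active_object() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        ready_.notify_one();
        worker_.join();
    }

    // Fire-and-forget: enqueue and return immediately.
    void insert(std::string line) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back(std::move(line));
        }
        ready_.notify_one();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            ready_.wait(lock, [this] { return done_ || !queue_.empty(); });
            if (queue_.empty()) return;  // done_ is set and nothing is left
            std::string line = std::move(queue_.front());
            queue_.pop_front();
            lock.unlock();
            handler_(line);  // runs on the worker thread, outside the lock
            lock.lock();
        }
    }

    handler handler_;
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<std::string> queue_;
    bool done_ = false;
    std::thread worker_;  // declared last so it starts after the other members
};
```

Adding batching by time or count to this class would mean more state and a timer; with Rx, it is one more operator in the chain.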

Code

The runnable code of the example can be found on GitHub: C++ version.

To show how similar the high-level code can be between different languages, I’ve “ported” the example to C# (see footnote 2).

  1. Source: reactivex.io License: (CC BY 3.0)
  2. The C# version appears to run faster on my Windows machine while solving the same toy problem.

Deterministic Testing of Concurrent Behavior in RxCpp

A Retrospective

After getting inspired by The Reactive Manifesto, it is hard not to get excited about Reactive Extensions. Such excitement has led to a series of hello-world articles and some code examples. While Reactive Extensions take over the programming world in C#, Java and JavaScript, it seems that the world of C++ is slow to adopt RxCpp.

The new ReactiveX Tutorial link list is a great place to start learning and grokking. This article is an attempt to bring RxCpp closer to C++ developers who might not yet see how a reactive programming model might help them write better, more robust code.

Testing concurrency with RxCpp

A previous article showed how to test ViewModels in C# by parameterizing the ViewModels with a scheduler. In a UI setting, the scheduler usually involves some kind of synchronization with the GUI thread. Testing keystrokes arriving at a certain speed would require some effort to simulate events, probably leading to brittle tests. With the scheduler abstraction, the concurrent behavior of a component is decoupled from physical time, and thus can be tested repeatedly and very fast. This was the C# test:

(new TestScheduler()).With(scheduler =>
{
    var ticker = new BackgroundTicker(scheduler);

    int count = 0;
    ticker.Ticker.Subscribe(_ => count++);
    count.Should().Be(0);

    // full control of the time without waiting for 1 second
    scheduler.AdvanceByMs(1000);
    count.Should().Be(1);
});
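What "full control of the time" means can be sketched with a toy virtual-time scheduler. This is an illustration of the idea only, assuming a hypothetical `virtual_scheduler` class with time counted in milliseconds; it is neither the C# TestScheduler nor the RxCpp test scheduler.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <utility>

// Hypothetical virtual-time scheduler: time only moves when advance_by is called.
class virtual_scheduler {
public:
    using action = std::function<void()>;

    long now() const { return now_ms_; }

    // Runs `a` once, `delay_ms` after the current virtual time.
    void schedule(long delay_ms, action a) {
        assert(delay_ms >= 0);
        queue_.emplace(now_ms_ + delay_ms, std::move(a));
    }

    // Runs `a` every `period_ms`, first after one full period.
    void schedule_periodic(long period_ms, action a) {
        assert(period_ms > 0);
        schedule(period_ms, [this, period_ms, a] {
            a();
            schedule_periodic(period_ms, a);  // re-arm for the next period
        });
    }

    // Moves virtual time forward, running everything that becomes due.
    void advance_by(long ms) {
        long const target = now_ms_ + ms;
        while (!queue_.empty() && queue_.begin()->first <= target) {
            auto it = queue_.begin();
            now_ms_ = it->first;
            action a = std::move(it->second);
            queue_.erase(it);
            a();
        }
        now_ms_ = target;
    }

private:
    long now_ms_ = 0;
    std::multimap<long, action> queue_;  // due time -> action, earliest first
};
```

A ticker scheduled with a period of 1000 ms on such a scheduler does not fire until the test advances the virtual clock by 1000 ms, and the test itself finishes without any real waiting.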

Show Me The Code

Without further ado: the C++ version is not very far from the C# version. In a simple test, we can parameterize a sequence of integer values arriving at specified intervals (a ticker) with a coordination (for why it is a coordination and not a scheduler, see the RxCpp developer manual):

auto seq = rxcpp::observable<>::interval(
            std::chrono::milliseconds(1),
            some_scheduler
);

The deterministic test scheduler API is currently available through a worker created on the test scheduler:

auto sc = rxcpp::schedulers::make_test();
auto worker = sc.create_worker();
auto test = rxcpp::identity_same_worker(worker);

The rest should read like English:

int count = 0;

WHEN("one subscribes to an observable sequence on the scheduler") {
  auto seq = rxcpp::observable<>::interval(
              std::chrono::milliseconds(1),
              test // on the test scheduler
             ).filter([](int i) { return i % 2; });

  seq.subscribe([&count](int){
    count++;
  });

  THEN("the sequence is not run at first") {
    worker.sleep(2 /* ms */);

    CHECK(count == 0);

    AND_WHEN("the test scheduler is advanced manually") {

      THEN("the sequence is run as expected") {
        worker.advance_by(8 /* ms */);
        CHECK(count == 5);
      }
    }
  }
}

The full test can be seen on GitHub, and is built on Travis CI.

RxCpp 2

RxCpp 2 and API

The last article on RxCpp was based on a now obsolete version of RxCpp. The key contributor to the library, Kirk Shoop, has kindly provided a rewrite based on the newer, 2.0 API of the library: see the pull request, upon which this article is based.

Since the first article, the project has been enriched with somewhat more readable GIVEN/WHEN/THEN-style tests using Catch (see footnote 1).

Still Ticking: Scheduler and Coordination in RxCpp 2

The previous articles give examples of managing periodic events, such as ticker ticks and measurements in C++. The following example creates an event loop that will be used for coordinated output of various events to the console:

auto scheduler = rxcpp::schedulers::make_same_worker(
    rxcpp::schedulers::make_event_loop().create_worker()
);

auto coordination = rxcpp::identity_one_worker(scheduler);

One such sequence of events is some kind of measurement (see footnote 2):

auto measure = rxcpp::observable<>::interval(
        // when to start
        scheduler.now() + std::chrono::milliseconds(250),
        // measurement frequency
        std::chrono::milliseconds(250),
        coordination)
    // take Hz values instead of a counter
    .map([&FM](int) { return FM.Hz(); });

auto measure_subscription = measure
    .subscribe([](int val) {
        std::cout << val << std::endl;
    });

Why didn’t it tick?

If this code were the end of the main program, there wouldn’t be any observable ticks, as all the objects would be destroyed before the first scheduled event. To see the code in action, we shall wait for some condition that will change when we’re done. This step is not necessary if there’s a GUI toolkit event loop that keeps objects alive, but it has to be simulated for a console example.

To demonstrate the subscription change and wait for some time, we’ll wait twice for an atomic variable to become zero:

std::atomic<long> pending(2);

...

// after all subscriptions defined
while (pending) {
    // wait for ticker and measure to finish (needs <thread> and <chrono>)
    std::this_thread::sleep_for(std::chrono::milliseconds(1000));
}
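Polling an atomic is fine for a console demo. As a possible alternative, the main thread could block on a condition variable until the counter reaches zero. The sketch below assumes a hand-rolled `countdown` class (a hypothetical name, not part of RxCpp) whose `done` would be called where the example decrements `pending`:

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical countdown: wait() returns once done() has been called n times.
class countdown {
public:
    explicit countdown(long n) : pending_(n) { assert(n >= 0); }

    // Called by each participant when it is finished.
    void done() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (pending_ > 0 && --pending_ == 0) zero_.notify_all();
    }

    // Blocks the caller until the counter has reached zero.
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        zero_.wait(lock, [this] { return pending_ == 0; });
    }

    long pending() {
        std::lock_guard<std::mutex> lock(mutex_);
        return pending_;
    }

private:
    std::mutex mutex_;
    std::condition_variable zero_;
    long pending_;
};
```

The main thread would then call `wait()` once instead of looping, and wake up as soon as the last participant reports in.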

Tick and Stop

The other ticker will have another period, will only tick 10 times, and then decrement the pending counter:

auto ticker = rxcpp::observable<>::interval(
    scheduler.now() + std::chrono::milliseconds(500),
    std::chrono::milliseconds(500),
    coordination);

ticker
    .take(10)
    .subscribe([](int val) {
        std::cout << "tick " << val << std::endl;
    },[&](){
        --pending; // take completed the ticker
    });

Now we can schedule the termination of the measurement subscription (which also decrements pending) partway through the 10-tick run, after 2 seconds. This scheduling is done on the same scheduler that is running all the subscriptions:

scheduler.create_worker().schedule(scheduler.now() + std::chrono::seconds(2), 
    [&](const rxcpp::schedulers::schedulable&) {
        std::cout << "Canceling measurement ..." << std::endl;
        measure_subscription.unsubscribe(); // cancel measurement
        --pending; // signal measurement canceled
    });

The result:

63
tick 1
63
61
tick 2
63
61
tick 3
63
62
Canceling measurement ...
tick 4
tick 5
tick 6
tick 7
tick 8
tick 9
tick 10

Thanks, Kirk & other library contributors!

Code @ github

Next: deterministic testing of concurrent behavior

  1. i.e. create.cpp
  2. Observe the convergence of the API towards the C# version.

A C++ Background Ticker, now with Rx.cpp

Finally, Rx.cpp

Some time ago I wrote that I didn’t have enough patience to recreate the background ticker example in C++ using Rx.cpp. Since then the Rx.cpp project seems to have grown out of the spike phase, and even has a native NuGet package. It has also gone multiplatform (Windows, OSX and Linux): observe the green Travis-CI Button.

Update: new blog post, discussing RxCpp v2 and testing using the test scheduler.

A simple console ticker

As in .NET, Reactive Extensions provide a simple way to process streams of data asynchronously, while keeping the concurrency-related code declarative and thus readable. Here’s a simple ticker in the console which runs asynchronously to the main thread:

auto scheduler = std::make_shared<rxcpp::EventLoopScheduler>();
auto ticker = rxcpp::Interval(std::chrono::milliseconds(250), scheduler);

rxcpp::from(ticker)
	.where([](int val) { return val % 2 == 0; })
	.take(10)
	.subscribe([](int val) {
		std::cout << "tick " << val << std::endl;
	});

std::cout << "starting to tick" << std::endl;

resulting in something like:

starting to tick
tick 0
tick 2
tick 4
tick 6
tick 8
...

where the underlying interval fires once every 250 milliseconds; since only even values pass the filter, a tick is printed every 500 milliseconds.

Throwing away code

The PPL example simulated polling a sensor and printing the value. It had an error-prone and buggy ad-hoc implementation of an active object, ticking at predefined intervals. This can now be happily thrown away, as Rx allows cleaner concurrency control and testability using schedulers, and implements a timed sequence: Interval.

Preconditions

FrequencyMeter FM;
auto scheduler = std::make_shared<rxcpp::EventLoopScheduler>();

The scheduler will be used for all subscriptions.

The tickers

The first one:

auto measure = rxcpp::Interval(std::chrono::milliseconds(250),scheduler);
auto measure_subscription = rxcpp::from(measure)
	.subscribe([&FM](int val) {
		std::cout << FM.Hz() << std::endl;
	});

where measure_subscription is a rxcpp::Disposable for subscription lifetime control.

And the other one:

auto ticker = rxcpp::Interval(std::chrono::milliseconds(500), scheduler);
rxcpp::from(ticker)
	.take(10)
	.subscribe([](int val) {
		std::cout << "tick " << val << std::endl;
	});

where you can observe the LINQ-style filter take.

Managing subscriptions

In the PPL example, one could start and stop the ticker. In Rx.cpp, this can be modeled simply by disposable subscriptions. Hence, after some kind of sleeping, the measurement can be stopped:

std::this_thread::sleep_for(std::chrono::milliseconds(2000)); // needs <thread> and <chrono>
std::cout << "Canceling measurement ..." << std::endl;
measure_subscription.Dispose(); // cancel measurement

Resulting in similar output:

60
63
tick 0
61
62
tick 1
63
63
tick 2
62
60
tick 3
Canceling measurement ...
tick 4
tick 5
tick 6
tick 7
tick 8
tick 9

Restarting measurement can be done by creating a new subscription.
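The start/stop/restart cycle can be sketched without Rx as well. The following toy subject hands out a token on subscription; the class and its members are hypothetical names for illustration, and the sketch is single-threaded, unlike the scheduler-driven examples above.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <utility>

// Hypothetical subject: subscribe returns a token that controls the lifetime.
class int_subject {
public:
    using observer = std::function<void(int)>;
    using token = int;

    token subscribe(observer o) {
        assert(o);
        observers_.emplace(next_id_, std::move(o));
        return next_id_++;
    }

    // Stops delivery to the given subscription; unknown tokens are ignored.
    void unsubscribe(token t) { observers_.erase(t); }

    void on_next(int v) {
        // iterate over a copy so observers may unsubscribe from a callback
        auto snapshot = observers_;
        for (auto& kv : snapshot) kv.second(v);
    }

private:
    token next_id_ = 0;
    std::map<token, observer> observers_;
};
```

Stopping the measurement corresponds to `unsubscribe` with its token, and restarting it is simply another `subscribe` call; other subscribers are not affected.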

Why not simply signals/slots?

Almost quoting the Intro to Rx book, the advantages of using Rx over (at least) simple implementations of a signal/slot mechanism are:

  • Better maintainability due to readable, composable, declarative code
  • Scheduler abstraction allowing for fast, deterministic, clock-independent tests of concurrency concerns
  • Declarative concurrency through the same scheduler abstraction
  • LINQ-like composition and filtering of event streams
  • Easy subscription control via disposables
  • Completion and exception handling built into the Observer concept
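The last point is the one a plain signal/slot connection usually lacks. Here is a minimal sketch of the Observer contract, with a hypothetical `observer` struct and `run` function that are not the RxCpp types: a sequence delivers any number of values and then ends with exactly one of error or completion.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical observer: three callbacks instead of a single slot.
struct observer {
    std::function<void(int)> on_next;
    std::function<void(std::exception_ptr)> on_error;
    std::function<void()> on_completed;
};

// Pushes each transformed value to `o`; ends with on_error or on_completed.
// In this sketch, an exception thrown by on_next is routed to on_error too.
void run(std::vector<int> const& values,
         std::function<int(int)> const& transform,
         observer const& o) {
    assert(o.on_next && o.on_error && o.on_completed);
    try {
        for (int v : values) o.on_next(transform(v));
    } catch (...) {
        o.on_error(std::current_exception());
        return;  // no completion after an error
    }
    o.on_completed();
}
```

A subscriber thus learns not only about values, but also that the stream is over and why, without any side channel.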

Code

@GitHub

Corrections, suggestions and comments are welcome!

Update 26.6.2014: There’s been a new release of Rx.cpp on NuGet, and Kirk Shoop pushed a pull request, upgrading the project and the API usage to Rx.cpp 2.0.0. There have been some changes, and there are some interesting patterns, which should be blogged about in the near future.