How to use Rebus in a web application

2015-08-21 by mookid8000

If you’re building web applications, you will often encounter situations where delegating work to some asynchronous background process can be beneficial. Let’s take a very simple example: Sending emails!

We’re in the cloud

Let’s say we’re hosting the website in Azure, as an Azure Web Site, and we need to send an email at the end of some web request. The most obvious way of doing that could be to new up an SmtpClient with our SendGrid credentials, and then just construct a MailMessage and send it.

While this solution is simple, it’s not good, because it makes it impossible for us to have higher uptime than SendGrid (or whichever email provider you have picked). In fact, every time we add some kind of synchronous call to the outside from our system, we impose their potential availability problems on ourselves.

We can do better 🙂

Let’s make it asynchronous

Now, instead of sending the email directly, let’s use Rebus in our website and delegate the integration to external systems, SendGrid included, to a message handler in the background.

This way, at the end of the web request, we just do this:

await _bus.Send(new SendEmail(recipient, subject, body));

and then our work handling the web request is over. Now we just need to have something handle the SendEmail message.
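The post never shows the SendEmail message itself. A minimal sketch of what it could look like, assuming nothing more than the three constructor arguments used above (the property names are my own guess, not taken from the original code):

```csharp
using System;

// Hypothetical message class: a plain, immutable data carrier with no behavior.
public class SendEmail
{
    public SendEmail(string recipient, string subject, string body)
    {
        if (recipient == null) throw new ArgumentNullException(nameof(recipient));

        Recipient = recipient;
        Subject = subject;
        Body = body;
    }

    public string Recipient { get; }
    public string Subject { get; }
    public string Body { get; }
}
```

Keeping the message a dumb data object means the web tier and the backend only need to share this one type.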

Where to host the backend?

We could configure Rebus to use either Azure Service Bus or Azure Storage Queues to transport messages. If we do that, we can host the backend anywhere, including as a process running on a 386 with a 3G modem in the attic of an abandoned building somewhere, but I’ve got a way that’s even cooler: Just host it in the web site!

This way, we can have the backend be subject to the same auto-scaling and whatnot we might have configured for the web site, and if we’re a low traffic site, we can even get away with hosting it on the Free tier.

Moreover, our backend can be Git-deployed together with the website, which makes for a super-smooth experience.

How to do it?

It’s a good idea to consider the backend a separate application, even though we chose to deploy it together with the web site as though the two were one. This is just a simple example of how processes and applications are really orthogonal concepts – in general, it’s limiting to attempt to enforce a 1-to-1 relationship between processes and applications(*).

What we should do is to have a 1-to-1 relationship between IoC container instances and applications, because that’s what IoC containers are for: To function as a container of one, logical application. In this case that means that we’ll spin up one Windsor container (or whichever IoC container is your favorite) for the web application, and one for the backend. In an OWIN Startup configuration class, it might look like this:

public class Startup
{
    public void Configuration(IAppBuilder app)
    {
        ConfigureWebApi(app);

        ConfigureBackend(app);
    }

    static void ConfigureWebApi(IAppBuilder app)
    {
        var httpConfiguration = new HttpConfiguration();
        WebApiConfig.Register(httpConfiguration);

        var webContainer = new WindsorContainer()
            .Install(new ApiHandlerInstaller())
            .Install(new WebRebusInstaller());

        httpConfiguration.UseWindsorContainer(webContainer);

        app.RegisterForDisposal(webContainer, "Windsor container for the web app");

        app.UseWebApi(httpConfiguration);
    }

    void ConfigureBackend(IAppBuilder app)
    {
        var backendContainer = new WindsorContainer()
            .Install(new RebusHandlerInstaller())
            .Install(new BackendRebusInstaller());

        app.RegisterForDisposal(backendContainer, "Windsor container for the background jobs");
    }
}

In the code sample above, UseWindsorContainer is an extension method on HttpConfiguration, and RegisterForDisposal is an extension method on IAppBuilder. UseWindsorContainer replaces Web API’s IHttpControllerActivator with a WindsorCompositionRoot like the one Mark Seemann wrote, and RegisterForDisposal looks like this:

public static void RegisterForDisposal(this IAppBuilder appBuilder, IDisposable disposable, string description = null)
{
    var descriptionOfDisposable = description ?? disposable.ToString();

    var context = new OwinContext(appBuilder.Properties);
    var token = context.Get<CancellationToken>("host.OnAppDisposing");
    if (token == CancellationToken.None)
    {
        Trace.TraceInformation("{0} could not be registered for disposal because the cancellation token could not be found", descriptionOfDisposable);
        return;
    }

    Trace.TraceInformation("Registering for disposal: {0}", descriptionOfDisposable);

    token.Register(() =>
    {
        Trace.TraceInformation("Disposing {0}", descriptionOfDisposable);

        disposable.Dispose();
    });
}

which is how you make something be properly disposed when an OWIN-based application shuts down. Moreover, I’m using Windsor’s installer mechanism to register stuff in the containers.
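The mechanism RegisterForDisposal relies on is plain CancellationToken.Register: callbacks registered on a token run when its source is cancelled. Here is a small sketch of just that mechanism, with a CancellationTokenSource standing in for the host's shutdown signal (ToyResource and ShutdownDemo are made-up names for illustration):

```csharp
using System;
using System.Threading;

// Hypothetical disposable that remembers whether it was disposed.
public sealed class ToyResource : IDisposable
{
    public bool Disposed { get; private set; }

    public void Dispose()
    {
        Disposed = true;
    }
}

public static class ShutdownDemo
{
    // Returns whether the resource was disposed by the "shutdown" signal.
    public static bool DisposeOnShutdown()
    {
        using (var shutdown = new CancellationTokenSource())
        {
            var resource = new ToyResource();

            // same idea as token.Register(...) in RegisterForDisposal above
            shutdown.Token.Register(() => resource.Dispose());

            // simulates the host signalling that the app is going down
            shutdown.Cancel();

            return resource.Disposed;
        }
    }
}
```

In the real application the token comes from the OWIN host, so nothing in your own code ever cancels it.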

Rebus configuration

The next thing to do is to make sure that I configure Rebus correctly – since I have two separate applications, I will also treat them as such when I set up Rebus. This means that my web tier will have a one-way client, because it only needs to be able to bus.Send, whereas the backend will have a fuller configuration.

The one-way client might be configured like this:

public class WebRebusInstaller : IWindsorInstaller
{
    public void Install(IWindsorContainer container, IConfigurationStore store)
    {
        var azureServiceBusConnectionString = ConfigurationManager
            .ConnectionStrings["azureServiceBusConnectionString"]
            .ConnectionString;

        Configure.With(new CastleWindsorContainerAdapter(container))
            .Transport(t => t.UseAzureStorageQueuesAsOneWayClient(azureServiceBusConnectionString))
            .Routing(r => r.TypeBased().MapAssemblyOf<SendEmail>("backend"))
            .Start();
    }
}

thus registering an IBus instance in the container which is capable of sending SendEmail messages, which will be routed to the queue named backend.

And then, the backend might be configured like this:

public class BackendRebusInstaller : IWindsorInstaller
{
    public void Install(IWindsorContainer container, IConfigurationStore store)
    {
        var azureServiceBusConnectionString = ConfigurationManager
            .ConnectionStrings["azureServiceBusConnectionString"]
            .ConnectionString;

        Configure.With(new CastleWindsorContainerAdapter(container))
            .Transport(t => t.UseAzureStorageQueues(azureServiceBusConnectionString, "backend"))
            .Start();
    }
}

Email message handler

The only thing left is to write the SendEmailHandler:

public class SendEmailHandler : IHandleMessages<SendEmail>
{
    public async Task Handle(SendEmail message)
    {
        // MailMessage and SmtpClient are both IDisposable, so dispose both
        using (var mail = CreateMailMessage(message))
        using (var client = GetSmtpClient())
        {
            await client.SendMailAsync(mail);
        }
    }

    MailMessage CreateMailMessage(SendEmail message)
    {
         //...
    }

    SmtpClient GetSmtpClient()
    {
         //...
    }
}

Conclusion

Hosting a Rebus endpoint inside an Azure Web Site can be compelling for several reasons, one of them being that deploying web and backend together as one cohesive unit becomes trivial.

I’ve done this several times myself, in web sites with multiple processes, including sagas and timeouts stored in SQL Azure, and basically everything Rebus can do – and so far, it has worked flawlessly.

Depending on your requirements, you might need to flick on the “Always On” setting in the portal, so that the site keeps running even though it’s not serving web requests.

Final words

If anyone has experiences doing something similar to this in Azure or with another cloud service provider, I’d be happy to hear about it 🙂

(*) This 1-to-1-ness is, in my opinion, a thing that the microservices community does not mention enough. I like to think about processes and applications much the same way as Udi Dahan describes it.

Another Rebus extension example

2015-08-13 by mookid8000

In the previous post I showed how Rebus’ subscription storage could report itself as “centralized”, which would provide a couple of benefits regarding configuration. In the post before the previous post, I showed how Rebus could be extended in order to execute message handlers inside a System.Transactions.TransactionScope, which was easy to do, without touching a single bit in Rebus core.

This time, I’ll combine the stuff I talked about in the two posts, and show how the Azure Service Bus transport can hook itself into the right places in order to function as a multicast-enabled transport, i.e. a transport that natively supports publish/subscribe and thus can relieve Rebus of some of its work.

Again, let’s take it from the outside and in – I would love it if Rebus could be configured to use Azure Service Bus simply by doing something like this:

var activator = new BuiltinHandlerActivator();

Configure.With(activator)
    .Transport(t => t.UseAzureServiceBus(connectionString, "my_input_queue"))
    .Start();

to configure each endpoint, and then await bus.Subscribe<SomeMessage>() and await bus.Publish(new SomeMessage("woohoo it works!!")) in order to subscribe to messages and publish them, and then just sit back and see published messages flow to their subscribers without any additional work.

Now, what would it require to make that work?

Subscriptions must be stored somewhere

No matter which kind of technology you use to move messages around, and no matter how complex the logic it supports, I bet it somehow builds on some kind of message queue building block, i.e. a thing into which you may put messages meant for some specific recipient, and out of which that one recipient can get its messages. Everything else the transport can do with the messages, like routing, filtering, fan-out, etc., is implemented as logic that hooks into places and does stuff, but it will always end up with messages going into message queues.

Since Rebus is made to be able to run right on top of some pretty basic queueing systems, like MSMQ, and then has some of its functions going on in “user space” (like pub/sub messaging), it has an abstraction for something that persists subscriptions: ISubscriptionStorage – this is the way a publisher “remembers” which queues to send to when it publishes a message.

When a transport offers its own pub/sub mechanism (and Azure Service Bus does that via topics and subscriptions) it means that it effectively works as a centralized implementation of ISubscriptionStorage (as described in the previous post) – and in fact, it turns out that that is the proper place to hook in in order to take advantage of the native pub/sub mechanism.

Let’s do it 🙂

How to replace the subscription storage

Similar to what I showed in the previous Rebus extension example, the UseAzureServiceBus function above is an extension method – it uses Injectionist to register an instance of the transport, and then it sets up resolvers to use the transport as the primary ITransport and ISubscriptionStorage implementations – it looks like this:

public static AzureServiceBusTransportSettings UseAzureServiceBus(this StandardConfigurer<ITransport> configurer, string connectionStringNameOrConnectionString, string inputQueueAddress)
{
    var connectionString = GetConnectionString(connectionStringNameOrConnectionString);
    var settingsBuilder = new AzureServiceBusTransportSettings();

    // register instance that implements ITransport and ISubscriptionStorage
    configurer
        .OtherService<AzureServiceBusTransport>()
        .Register(c =>
        {
            var transport = new AzureServiceBusTransport(connectionString, inputQueueAddress);

            if (settingsBuilder.PrefetchingEnabled)
            {
                transport.PrefetchMessages(settingsBuilder.NumberOfMessagesToPrefetch);
            }

            if (settingsBuilder.AutomaticPeekLockRenewalEnabled)
            {
                transport.AutomaticallyRenewPeekLock();
            }

            return transport;
        });

    // resolve ISubscriptionStorage by forwarding to the transport instance
    configurer
        .OtherService<ISubscriptionStorage>()
        .Register(c => c.Get<AzureServiceBusTransport>());

    // resolve ITransport by forwarding to the transport instance
    configurer.Register(c => c.Get<AzureServiceBusTransport>());

    return settingsBuilder;
}

Now, the Azure Service Bus transport just needs to perform some meaningful actions as the subscription storage it is now claiming to be. Let’s take a look at

How to be a subscription storage

Subscription storages need to implement this interface:

public interface ISubscriptionStorage
{
    Task<string[]> GetSubscriberAddresses(string topic);

    Task RegisterSubscriber(string topic, string subscriberAddress);

    Task UnregisterSubscriber(string topic, string subscriberAddress);

    bool IsCentralized { get; }
}

where you already know that the IsCentralized property must return true, indicating that subscribers can register themselves directly. And then, because it’s Azure Service Bus, we just need to

- ensure the given topic exists, and
- create a subscription on the given topic with ForwardTo set to the subscriber’s input queue

in order to start receiving the subscribed-to events. And that is in fact what the Azure Service Bus transport is doing now 🙂
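For contrast, here is what a bare-bones, non-native implementation of the same interface might look like. This is only a sketch: the interface is redeclared from the listing above so the code compiles on its own, ToyInMemorySubscriptionStorage is a made-up name, and it is not how the Azure Service Bus transport (or any storage shipped with Rebus) is implemented.

```csharp
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

// Redeclared from the listing above so the sketch is self-contained.
public interface ISubscriptionStorage
{
    Task<string[]> GetSubscriberAddresses(string topic);
    Task RegisterSubscriber(string topic, string subscriberAddress);
    Task UnregisterSubscriber(string topic, string subscriberAddress);
    bool IsCentralized { get; }
}

public class ToyInMemorySubscriptionStorage : ISubscriptionStorage
{
    // topic -> set of subscriber addresses (the byte value is unused)
    readonly ConcurrentDictionary<string, ConcurrentDictionary<string, byte>> _subscribers
        = new ConcurrentDictionary<string, ConcurrentDictionary<string, byte>>();

    public Task<string[]> GetSubscriberAddresses(string topic)
    {
        ConcurrentDictionary<string, byte> addresses;
        var result = _subscribers.TryGetValue(topic, out addresses)
            ? addresses.Keys.ToArray()
            : new string[0];
        return Task.FromResult(result);
    }

    public Task RegisterSubscriber(string topic, string subscriberAddress)
    {
        var addresses = _subscribers.GetOrAdd(topic, _ => new ConcurrentDictionary<string, byte>());
        addresses[subscriberAddress] = 0;
        return Task.CompletedTask;
    }

    public Task UnregisterSubscriber(string topic, string subscriberAddress)
    {
        ConcurrentDictionary<string, byte> addresses;
        if (_subscribers.TryGetValue(topic, out addresses))
        {
            byte ignored;
            addresses.TryRemove(subscriberAddress, out ignored);
        }
        return Task.CompletedTask;
    }

    // Assumption: state that lives inside one process cannot be shared, so it is not centralized.
    public bool IsCentralized { get { return false; } }
}
```

The point of the comparison is that a native pub/sub transport gets to skip all of this bookkeeping and let the broker remember who subscribes to what.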

Only thing left for this to work, is this:

Rebus must publish in the right way

When Rebus publishes a message, it goes through the following sequence:

1. Asks the subscription storage for subscribers of the given topic
2. For each subscriber: Asks the transport to send the message to that subscriber

Now, since the Azure Service Bus transport is both subscription storage and transport, we’ll take advantage of this fact by having GetSubscriberAddresses return only one single – fake! – subscriber address, of the form subscribers/<topic>(*).

And then, when the transport detects a destination address starting with the subscribers/ prefix (which cannot be a valid Azure Service Bus queue name), the transport will use a TopicClient for the right topic to publish the message instead of the usual QueueClient.
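The trick can be shown with a toy model that has no Azure dependency at all. Everything in the sketch below is hypothetical (ToyMulticastTransport, ToyPublisher, and the Log list exist only for illustration); it merely mirrors the publish sequence and the prefix check described above, with the TopicClient/QueueClient calls replaced by log entries.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for a transport that is also its own subscription storage.
public class ToyMulticastTransport
{
    const string MagicPrefix = "subscribers/";

    // Records what happened so the sketch can be inspected.
    public List<string> Log { get; } = new List<string>();

    // Subscription storage role: hand back one single fake address per topic.
    public Task<string[]> GetSubscriberAddresses(string topic)
    {
        return Task.FromResult(new[] { MagicPrefix + topic });
    }

    // Transport role: decide between topic and queue by looking at the address.
    public Task Send(string destinationAddress, string message)
    {
        if (destinationAddress.StartsWith(MagicPrefix, StringComparison.Ordinal))
        {
            var topic = destinationAddress.Substring(MagicPrefix.Length);
            Log.Add($"publish to topic '{topic}': {message}");
        }
        else
        {
            Log.Add($"send to queue '{destinationAddress}': {message}");
        }
        return Task.CompletedTask;
    }
}

public static class ToyPublisher
{
    // The two-step publish sequence: look up subscribers, then send to each of them.
    public static async Task Publish(ToyMulticastTransport transport, string topic, string message)
    {
        var subscribers = await transport.GetSubscriberAddresses(topic);

        foreach (var subscriber in subscribers)
        {
            await transport.Send(subscriber, message);
        }
    }
}
```

The publisher code stays completely unaware of topics; only the transport knows that the address it was handed is not a real queue.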

Conclusion

Rebus (since 0.90.8) can take advantage of Azure Service Bus’ native support for publish/subscribe, which means that you need not worry about routing of events or configuring any other kind of subscription storage.

(*) Azure Service Bus does not support “,”, “+”, and other characters in topics, and these characters can often be found in .NET type names (which Rebus likes to use as topics), so the Azure Service Bus transport will normalize topics by replacing these illegal characters. In fact, all non-digit non-letter characters will be replaced by “_” (and, as the example shows, letters are lowercased), causing "System.String, mscorlib" to become a topic named "system_string__mscorlib".
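A normalization like the one described in this footnote could be sketched as follows. This is my own approximation inferred from the single example given, not the transport's actual code, and TopicNames.Normalize is a made-up name; note that char.IsLetterOrDigit also lets non-ASCII letters through, which a real implementation may need to handle differently.

```csharp
using System.Linq;

public static class TopicNames
{
    // Lowercases letters and digits and replaces every other character with '_'.
    public static string Normalize(string topic)
    {
        var characters = topic
            .Select(c => char.IsLetterOrDigit(c) ? char.ToLowerInvariant(c) : '_')
            .ToArray();

        return new string(characters);
    }
}
```

With this sketch, TopicNames.Normalize("System.String, mscorlib") yields "system_string__mscorlib": the dot, the comma, and the space each turn into one underscore.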

Using Azure to host the node.js backend of your Xamarin-based iOS app

2014-04-22 by mookid8000

TL;DR: This is how to tell Azure to host your node.js web service even though your repository contains .NET solution files and/or projects: Create an app setting called Project and set its value to the directory where your node app resides, e.g. src\Server – this is the argument that will make Azure decide to host your node.js stuff and not the .NET stuff.

This is not new – it’s how you pick a .NET project to host as well when your repository contains multiple hostable .NET projects, but when it’s .NET you point to a project file instead of a directory.

Do NOT be fooled by GitHub issues mentioning setting the SCM_SCRIPT_GENERATOR_ARGS app setting to --node – doing that will not work together with the Project setting!

I wrote this blog post because it took me quite a while to figure this seemingly obvious thing out, mostly because I got fooled by various red herrings around the net referring to the aforementioned Kudu setting.

Long version 🙂

Lately, I’ve been tinkering a bit with Xamarin Studio, trying to get a little bit into building a simple iPhone app.

My app needs a backend, which will function as a mediator between the iPhone app and another backend, allowing me to

- tailor the service to my iPhone app, trimming the API for my needs, and
- not worry too much about having to update the iPhone app every time something changes on the real backend (the “backend-backend”…)

This is a sound way of putting these things together IMHO, and since the backend for my iPhone app will mostly function as a mediator/proxy it was kind of an obvious opportunity to get to play a little bit with node.js as well 😉 (because JavaScript is fun, and because of its asynchronous nature)

But – alas – when I Git-deployed my Azure web site, I got the following message in the Azure portal:

and clicking the log revealed that Azure was kind of confused by the fact that my repository did not contain an obvious single candidate for something to build & deploy – it had multiple (the Xamarin solution Client.sln and my initial Web API-based dummy server solution Server.sln):

When a Git-deployed Azure web site has multiple things that can potentially be built & deployed, you’ll usually create a Project app setting and point it to the web project that you want to host in that particular web site, e.g. setting it to src\Something.Web\Something.Web.csproj if that is a web site.

That’s also what we need to do here! – just by pointing to the directory where your node app’s package.json resides – in this case it’s src\Server2 (which was the most awesomest name I could come up with for my server no. 2….) It’s that simple 😉

Ways of scaling out with Rebus #2: Azure Service Bus

2013-12-30 by mookid8000

Scaling out your application is easy with Azure Service Bus, because Azure Service Bus by design lends itself well to the competing consumers pattern as described by Gregor Hohpe and Bobby Woolf in the Enterprise Integration Patterns book.

So, in order to make this post a little longer, I will tell a little bit about how Rebus makes use of Azure Service Bus. And then I’ll tell you how to scale it 🙂

Rebus and queue transactions

When Rebus is configured to use Azure Service Bus to transport messages like this:

Configure.With(yourContainerAdapter)
    .Logging(l => l.NLog())
    .Transport(t => t.UseAzureServiceBus(connectionString, "some_input_queue", "error_queue"))
    .CreateBus()
    .Start();

the bus will not use Azure Service Bus queues for its input queue and error queue, as you might think.

This is because Rebus will go to great lengths to promise you that a message can be received, and 0 to many messages sent – in one single queue transaction!

This means that the underlying transport layer must somehow be capable of receiving and sending messages atomically – and in a way that can be either committed or rolled back.

And since Azure Service Bus has limited transactional capabilities that do NOT allow for sending messages to multiple queues transactionally, we had to take a different approach with Rebus.

So, how does Rebus actually use Azure Service Bus?

What Azure Service Bus DOES support though, is receiving and sending atomically within one single topic.

So when Rebus starts up with Azure Service Bus, it will ensure that a topic exists with the name “Rebus”, which will be used to publish all messages that are sent.

And then, for each logical input queue – let’s call it “some_input_queue” – there will be a subscription for that queue by the same name, and that subscription will be configured with a SqlFilter that filters the received messages on a specific message property that holds the name of the intended recipient’s input queue. The filter will then ensure that only the intended messages are received for that endpoint.
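To illustrate the idea of one shared topic with a filtered subscription per logical queue, here is a toy in-memory model. It has nothing to do with the real Azure Service Bus client API: ToyTopic and ToyMessage are invented for this sketch, and the string comparison in Publish stands in for the SqlFilter on the recipient property.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical message; Recipient stands in for the property the filter inspects.
public class ToyMessage
{
    public string Recipient { get; set; }
    public string Body { get; set; }
}

// Toy model of one shared topic with one filtered subscription per logical input queue.
public class ToyTopic
{
    readonly Dictionary<string, List<ToyMessage>> _subscriptions
        = new Dictionary<string, List<ToyMessage>>();

    // The subscription gets the same name as the logical input queue.
    public void AddSubscription(string inputQueueName)
    {
        _subscriptions[inputQueueName] = new List<ToyMessage>();
    }

    // Every message goes to the same topic; the filter decides who gets to see it.
    public void Publish(ToyMessage message)
    {
        foreach (var subscription in _subscriptions)
        {
            if (subscription.Key == message.Recipient)
            {
                subscription.Value.Add(message);
            }
        }
    }

    public IReadOnlyList<ToyMessage> Received(string inputQueueName)
    {
        return _subscriptions[inputQueueName];
    }
}
```

From the outside, each subscription behaves like a private queue, even though all traffic flows through a single topic.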

So – how to scale it?

Easy peasy – in the Azure portal, go to this section of your cloud service:

[Screenshot from the Azure portal]

and go crazy with this bad boy:

[Screenshot from the Azure portal]

and – there you have it! – that is how you can scale out your work with Rebus in Azure 🙂

One thing, though – when you’re doing some serious number crunching, depending on the granularity of your messages of course, you may be bitten by the fact that the lease on an Azure Service Bus BrokeredMessage expires after 60 seconds – if that is the case, Rebus has a fairly non-intrusive way of letting you renew the lease, which you can read more about on the “more about the Azure Service Bus transport” page on the Rebus wiki.

In the next post, I’ll delve into how to scale your Rebus workers if you’re using RabbitMQ.

Ways of scaling out with Rebus #1

2013-12-28 by mookid8000

Introduction

When you’re working with messaging, and you’re in need of processing messages that take a fair amount of time to process, you’re probably in need of some kind of scaling-out strategy. An example that I’ve been working with lately, is image processing: By some periodic schedule, I would have to download and render a number of SVG templates and pictures, and that number would be thousands and thousands.

Since processing each image would have no effect on the processing of the next image, the processing of images is an obvious candidate for some kind of parallelisation, which just happens to be pretty easy when you’re initiating all work with messages.

Rudimentary scaling: Increase number of threads

One way of “scaling out” your work with Rebus is to increase the number of worker threads that the bus creates internally. If you check out the documentation about the Rebus configuration section, you can see that it’s simply a matter of doing something like this:

<rebus inputQueue="my_worker" errorQueue="error" workers="30" />

Increasing the number of worker threads provides a simple and easy way to parallelise work, as long as your server can handle it. Each CLR thread will have 1 MB of RAM reserved for its stack, and will most likely require additional memory to do whatever work it does, so you’ll probably have to perform a few measurements or trial runs in order to locate a sweet spot where memory consumption and CPU utilization are good.
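What a higher worker count buys you can be shown without Rebus at all: several threads competing for items from one shared queue. The sketch below is a generic illustration under that assumption (ToyWorkerPool is a made-up helper and says nothing about how Rebus manages its workers internally).

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

public static class ToyWorkerPool
{
    // Handles every item in 'work' on 'workerCount' dedicated threads
    // and returns the number of items that were handled.
    public static int Run(IEnumerable<int> work, int workerCount, Action<int> handler)
    {
        using (var queue = new BlockingCollection<int>())
        {
            var processed = 0;
            var threads = new List<Thread>();

            for (var i = 0; i < workerCount; i++)
            {
                var thread = new Thread(() =>
                {
                    // each worker competes with the others for the next item
                    foreach (var item in queue.GetConsumingEnumerable())
                    {
                        handler(item);
                        Interlocked.Increment(ref processed);
                    }
                });
                thread.Start();
                threads.Add(thread);
            }

            foreach (var item in work)
            {
                queue.Add(item);
            }

            // lets the workers' loops end once the queue has been drained
            queue.CompleteAdding();
            threads.ForEach(t => t.Join());

            return processed;
        }
    }
}
```

Timing a call such as ToyWorkerPool.Run(items, 30, RenderImage) against the same call with a single worker is one quick way to look for the sweet spot mentioned above (RenderImage being whatever your handler does).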

If you’re in need of some serious processing power though, you’ll most likely hit the roof pretty quickly – but you’re in luck, because your messaging-based app lends itself well to being distributed to multiple machines, although there are a few things to consider depending on the type of transport you’re using.

In the next posts, I’ll go through examples of how you can distribute your work and scale out your application when you’re using Rebus together with Azure Service Bus, RabbitMQ, SQL Server, and finally with MSMQ. Happy scaling!

Install ElasticSearch on Ubuntu VMs in Azure

2013-11-03 by mookid8000

Since ElasticSearch is hot sh*# these days, and my old hacker friend Thomas Ardal wrote a nifty guide on how to install it on Windows VMs in Azure, I thought I might as well supplement with a guide on how to do the same thing, only on Ubuntu VMs in Azure….

So, in this guide I’ll take you through the steps necessary to set up three Ubuntu VMs in Azure and install an ElasticSearch node on each of them, and finally connect the nodes into a search cluster… here goes:

First, create a new virtual network

Unless you intend to add your new Ubuntu VMs to an existing virtual network, you should use the “New” button and go and create a new virtual network. You can just fill in the name and leave all other options at their defaults.

Create virtual machines

Now, go and create a new virtual machine from the gallery.

Select the latest Ubuntu from the list.

Give your virtual machine a sensible name – in this case, since this is the third machine in my ElasticSearch cluster, I’m calling it “elastica3”. For all three machines, I’ve created a user account called “mhg” on the machine so I can SSH to it.

On the first machine, be sure to create a new cloud service that you can use to load balance requests among the machines. When adding the subsequent machines, remember to select the existing cloud service. In this case, since it’s balancing among “elastica1”, “elastica2”, and “elastica3”, I’m calling the cloud service “elastica”.

Moreover, it’s important that you add the machines to the same availability set! This way, Azure will ensure that the machines are unlikely to crash/be disconnected/fail at the same time by putting the machines in different fault domains.

When the first machine was added, the public port 22 on the cloud service “elastica” got automatically mapped to port 22 on the machine. When adding the subsequent machines, select another public port to map to 22 so that you can SSH to each individual machine from the outside. I chose 23 and 24 for the two other machines.

SSH to each machine

Open up a terminal and

ssh mhg@elastica.cloudapp.net -p22

in order to SSH to the first machine, logging in as “mhg”. In this example, I’m using the (default) port 22 which I will replace with 23 and 24 in order to SSH to the other two machines.

Update apt-get

On each machine, I start out by running a

sudo apt-get update

in order to download the most recent apt-get package lists.

Install Java

Now, on each machine I install Java by going

sudo apt-get install openjdk-7-jre-headless -y

and at this point I usually feel inspired to go grab myself a cup of coffee… 😉

Download and install ElasticSearch

And, finally, we’re ready to install ElasticSearch – go to the download page and copy the URL of the DEB package. At the time of writing this, the most recent DEB package is https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.5.deb which I download and install on each machine like this:

wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.5.deb
sudo dpkg -i elasticsearch-0.90.5.deb
sudo service elasticsearch start

Configure ElasticSearch cluster

In order to be able to edit the configuration file, I

sudo apt-get install emacs

and go

sudo emacs /etc/elasticsearch/elasticsearch.yml

By default, ElasticSearch will use UDP to dynamically discover an existing cluster which it will automatically join. On Azure though, we must explicitly specify which nodes go into our cluster. In order to do this, uncomment the line

discovery.zen.ping.multicast.enabled: false

to disable UDP discovery, and then add the full list of the IP addresses of your machines on the following line:

discovery.zen.ping.unicast.hosts: ["10.0.0.4", "10.0.0.5", "10.0.0.6"]

In my case, the IPs assigned to the VMs were 10.0.0.4 through 10.0.0.6. You can use ifconfig on each machine if you’re in doubt which IP was assigned (or you can check it out via the Azure Portal).

After saving each file, remember to

sudo service elasticsearch restart

for ElasticSearch to pick up the changes.

Check it out

Now, on any of the three machines, try running the following curl command:

curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'

which should yield something like this:

{
  "cluster_name" : "elasticsearch",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0
}
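If you would rather check the cluster from code than by eyeballing the response, something like the following could interpret it. It is only a sketch: ClusterHealth.IsHealthy is a made-up helper, it assumes the response has the "status" and "number_of_nodes" fields shown above, and it uses System.Text.Json, which is newer than the tools used elsewhere in this guide. Fetching the JSON (with curl, HttpClient, or whatever you prefer) is left out.

```csharp
using System.Text.Json;

public static class ClusterHealth
{
    // True when the cluster reports "green" and has the expected number of nodes.
    public static bool IsHealthy(string healthJson, int expectedNodes)
    {
        using (var document = JsonDocument.Parse(healthJson))
        {
            var root = document.RootElement;
            var status = root.GetProperty("status").GetString();
            var nodes = root.GetProperty("number_of_nodes").GetInt32();

            return status == "green" && nodes == expectedNodes;
        }
    }
}
```

Checking the node count as well as the status catches the case where a node never joined: a one-node cluster with no indices can still report green.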

Finally, let’s make the cluster accessible from the outside….

Set up load balancing among the three VMs

Go to the first VM on the “Endpoints” tab and add a new endpoint.

Remember to check the option that you want to create a new load-balanced set. Just go with the defaults when asked about how the load balancer should probe the endpoints.

Last thing is to add an endpoint to the two other VMs, selecting the existing load-balanced set.

When this step is completed, you should be able to visit your cloud service URL (in my case it was http://elastica.cloudapp.net:9200) and see something like this:

{
    ok: true,
    status: 200,
    name: "Machine Teen",
    version: {
        number: "0.90.5",
        build_hash: "c8714e8e0620b62638f660f6144831792b9dedee",
        build_timestamp: "2013-09-17T12:50:20Z",
        build_snapshot: false,
        lucene_version: "4.4"
    },
    tagline: "You Know, for Search"
}

So, is it usable yet?

Not sure, actually – I haven’t had time to investigate how to properly set up an authorization mechanism so as to make my cluster accessible only to specific applications.

If anyone knows how to do that on Azure, please don’t hesitate to enlighten me 🙂