Enterprise Sitecore xDB High Availability with MongoDB

As you may already know, Sitecore v7.5 and above replaces the SQL Server based DMS technology previously used for analytics with a new approach that captures analytics data in the NoSQL database MongoDB. Sitecore has a cloud xDB offering available in some geographies (alas, it hasn't come to Australia yet!), but we recently delivered an xDB implementation for a client who needed to house their analytics databases on their local hosting infrastructure. This meant we had to consider how to meet, with Mongo, some of the enterprise requirements we had been pretty comfortable dealing with in DMS on SQL Server. In particular, our high availability and security requirements had to be addressed slightly differently.

This is the first of two posts outlining some of what we learnt along the way; the second deals with configuring Sitecore and MongoDB for enterprise security requirements.

High Availability 

MongoDB has a couple of technologies built into the product that you can use to deliver high availability -- replica sets and sharding.  The two meet slightly different goals, and you can deploy replica sets on their own, or replica sets with sharding, as required (you could deploy sharding by itself, but that doesn't really address high availability requirements!).

Replica sets

Replica sets provide the capability to keep a mirror copy of data on more than one machine, similar in concept to synchronous mirroring in SQL Server.  In a replica set there is only ever one primary node that accepts writes, and these writes are replicated to each of the configured replicas.  If the primary node goes offline, the remaining nodes perform an election to choose a new primary.  By default the primary node is also responsible for fulfilling queries, although it is possible to balance reads across the secondary replicas if required.
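
If you did want to spread reads across the secondaries, MongoDB's standard readPreference connection string option is one way to do it. This is just a sketch (the server names are placeholders, the replica set name is the one we configure later in this post, and whether secondary reads suit your Sitecore analytics workload is something to verify):

mongodb://server1,server2,server3/[database]?replicaSet=sitecoreReplSet&readPreference=secondaryPreferred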

Replica sets need a minimum of three nodes to provide high availability.  In our particular case, we chose to host these all in a single data center and rely on underlying replication built into the hosting platform to support DR capability.

Fig 1 - Entire replica set in a single location

There is another possible approach which would allow replica sets to provide redundancy across different geographic locations.

Fig 2 - Geographically distributed replica set

This approach would let us achieve high availability of analytics data across multiple geographically dispersed data center locations, and allow automated failover in the case of a primary center outage, without storing any sensitive data outside of the organisation's control.  This is done by having one replica set node at each data center location, with an arbiter node running in a third location -- a cloud server running on AWS or Azure would be a perfect choice.  An arbiter, like a witness in a SQL Server mirrored setup, maintains a quorum for electing a new primary if one of the replica set members is lost, but does not store any data itself.
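
If you take this approach, the arbiter is added from the mongo shell using the built-in rs.addArb() helper once the replica set exists -- the host name below is just a placeholder:

rs.addArb("arbiter.example.com:27017")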

We are running our Mongo instances on Linux boxes, so we performed the following steps to set up the replica set:

  • First we created a keyfile, which the Mongo replica set nodes use to authenticate with each other, using the following command (where [num] is any number you choose):

openssl rand -base64 [num] > mongo-replSet-keyfile

  • The generated keyfile was then given the appropriate file system permissions:

chmod 700 mongo-replSet-keyfile

  • We then updated the following settings in /etc/mongod.conf (a YAML-format equivalent is sketched after this list):

replSet=sitecoreReplSet
keyFile=/usr/local/mongodb/etc/mongo-replSet-keyfile

  • The generated keyfile and config settings were replicated on each server in the replica set.
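
The settings above use the older INI-style config file format. If your mongod instead uses the newer YAML config format (supported from MongoDB 2.6 onwards), the equivalent would look something like this sketch, using the same path and replica set name as above:

security:
  keyFile: /usr/local/mongodb/etc/mongo-replSet-keyfile
replication:
  replSetName: sitecoreReplSet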

With the base configuration in place, we then logged into the Mongo shell to set up the replica set.

  • On the first server we issued the following command to initialise the replica set:

rs.initiate()

  • We then added each of the other nodes in the set:

rs.Add("server2")
rs.Add("server3")
The naming convention used when adding the servers is important, because these values will be used for clients to connect to an replica set member.  They must be in a format that is addressable from clients (e.g. FQDN if needed).
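
As an aside, rs.initiate() can also be given an explicit configuration document that declares all members, with fully addressable host names and ports, in one step. The host names below are placeholders for illustration:

rs.initiate({
  _id: "sitecoreReplSet",
  members: [
    { _id: 0, host: "mongo1.example.internal:27017" },
    { _id: 1, host: "mongo2.example.internal:27017" },
    { _id: 2, host: "mongo3.example.internal:27017" }
  ]
})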

Sharding

Sharded clusters partition data across a number of nodes, facilitating horizontal scaling to improve performance.  In particular this helps in high-write environments, because a replica set can only ever have one primary node accepting writes, even though reads can be scaled out.
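
For reference, sharding is enabled per database and per collection through a mongos router using MongoDB's standard shell helpers. The database and collection names below are purely illustrative (not necessarily the Sitecore analytics collection names), as is the choice of a hashed shard key:

sh.enableSharding("analytics")
sh.shardCollection("analytics.Interactions", { _id: "hashed" })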

Adding sharding as well as replica sets represents a much more complex environment, and based on advice from Mongo we didn't feel that this level of complexity was warranted given our expected load.

Sitecore configuration

The Sitecore connection strings for each of the Mongo databases then need to be updated to list every member of the replica set; the C# Mongo driver has the intelligence to work out which replica is the currently elected primary.  This is done using the following format:

mongodb://server1,server2,server3/[database]?replicaSet=sitecoreReplSet

Here the three replica set server names are listed comma delimited, [database] is the Mongo database for the given connection string you are setting, and sitecoreReplSet is the replica set name you configured during setup.  When Sitecore uses this connection string to access the replica set, it tries each of the servers listed to retrieve the status of the replica set (equivalent to calling rs.status() from Mongo) and issues its commands against the member reported as being in the PRIMARY state.
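
In Sitecore's App_Config/ConnectionStrings.config that typically ends up looking something like the sketch below -- the connection string names are the standard xDB ones, but the server names and Mongo database names are placeholders for our setup, so adjust them to match your own:

<add name="analytics" connectionString="mongodb://server1,server2,server3/analytics?replicaSet=sitecoreReplSet" />
<add name="tracking.live" connectionString="mongodb://server1,server2,server3/tracking_live?replicaSet=sitecoreReplSet" />
<add name="tracking.history" connectionString="mongodb://server1,server2,server3/tracking_history?replicaSet=sitecoreReplSet" />
<add name="tracking.contact" connectionString="mongodb://server1,server2,server3/tracking_contact?replicaSet=sitecoreReplSet" />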
