Database replication vs. clustering

nhpr · February 2012

I work for a hosting company, so I deal with a few database configs everyday. For sites that need high availability, they all seem to choose replication (though most of the servers I work with are smaller business customers). Prior to coming into my current position, I had only worked with clustering for failover/high availability. Since I feel silly setting up these things for customers without having a full understanding of why I'm doing them, I was wondering if anyone could point me towards something that (relatively) succinctly explains why someone would choose replication over clustering? All the articles I seem to be finding in Google aren't comparative.

EDIT: I just remembered that the site ServerFault exists. Sure enough: http://serverfault.com/questions/288031/mysql-multi-master-replication-vs-mysql-cluster

GAngel · February 2012

As an aside with the rise of virtualization the need for clustering isn't as high as it used to be.

RobertKaucher · February 2012

If you would like a deeper understanding I suggest that you take a look at this book: http://www.amazon.com/Microsoft-Availability-Clustering-Database-Mirroring/dp/0071498133/ref=sr_1_1?ie=UTF8&qid=1328492246&sr=8-1

Not, that being said, there are a lot of new features coming out in SQL Server 2012 that you might want to have a handle on.

nhpr · February 2012

RobertKaucher wrote: »

If you would like a deeper understanding I suggest that you take a look at this book: http://www.amazon.com/Microsoft-Availability-Clustering-Database-Mirroring/dp/0071498133/ref=sr_1_1?ie=UTF8&qid=1328492246&sr=8-1

Not, that being said, there are a lot of new features coming out in SQL Server 2012 that you might want to have a handle on.

No, there aren't! I'm a Linux sys. admin.

powerfool · February 2012

Well, technology has changed. I have used SQL Server (7, 2000, 2005, and 200

along with log shipping. I have also used SAN snapshots and replication for my needs.

A big reason for using clustering is that is what Microsoft pushed in their products for the longest time. It also made sense because many companies had their users local to the data and they needed high availability on-site. Third-party products were typically used for replication, likes SANs, and products like Double-Take.

Virtualization and its features, like vMotion/Live Migration and Fault Tolerance add to that level of high availability. Getting data to remote data centers is now the primary concern. Another driver of this is the move to a tapeless backup environment. Since you cannot get the non-existent tapes off-site, you have to replicate.

Microsoft Exchange is another good example of a product following this path. I am currently the Exchange SME/Lead Engineer for a large MS Exchange 2003 environment that uses four production active/active/active/passive clusters that are used for high availability and we rely on SAN snapshots and replication along with four matching DR clusters. I am documenting our Exchange 2010 deployment and migration, currently, and the product builds in replication to its core and does not use HA clustering technologies inherent in the previous versions of Windows Server.

Also, there will always be an ebb and flow with technology. For the past two decades there has been some branding of a remote infrastructure as technology has evolved, and then it falls back to local infrastructure. The first big push was with Citrix Winframe/Metaframe and then with web-based applications, or the Application Service Provider (ASP) market. Now virtualization has facilitated this lasted push, whether you call it Cloud Computing or Infrastructure/Platform as a Service (IaaS or PaaS)... Maybe the will be the last big push... I don't know. But, it is rare for a technology to change things in one push. We may go back to the clustering model in the future... it is hard to tell.

EDIT: And after all of that, I finally realized you are talking about MySQL. Anyhow, that doesn't change much... it's just a different product. The main difference is that there isn't tight integration between MySQL and any one specific OS. But the replication trend certainly is influencing other systems, like MySQL and PostgreSQL.

ptilsen · February 2012

Clustering can typically provide better availability while replication is good for providing faster availability across different locations. Replication differs from clustering in that the additional availability that can be provided by replication is generally as secondary benefit and generally comes with a performance impact during node downtime.

Also, they are not typically mutually exclusive. You can replicate a clustered database to provide both good performance in different locations and high availability.

Here, ASCII drawings.

Clustering:
Clients ..|
............|
............|
Node A ---- Node B
_____Storage____

Replication:

Clients ..|............................|Clients
............|............................|
Node A ..|

|..NodeB
Storage

Storage

Everyone · February 2012

GAngel wrote: »

As an aside with the rise of virtualization the need for clustering isn't as high as it used to be.

No, that just shifts where clustering occurs. You may not cluster your guests O/S's, but you'll cluster your hosts.

I've found that a lot of people don't understand clustering, and are therefore afraid of it.

I've stepped in places where someone that didn't really know what they were doing tried to setup clustering, and it didn't work right at all. A failover took 30 minutes if it even worked at all. It made the rest of the shop leery of clustering. Then I fixed it, failovers took less than 30 seconds as they should, and restored some faith in it for the others.

powerfool wrote:

I am documenting our Exchange 2010 deployment and migration, currently, and the product builds in replication to its core and does not use HA clustering technologies inherent in the previous versions of Windows Server.

That's not true. Exchange 2010 DAGs are really sort of a hybrid. They use both database replication and Windows clustering. When you create a DAG, Exchange does the clustering setup for you behind the scenes, but it still sits on top of Windows clustering. It is easier than it ever was before. I know you said you work with some Exchange 2003 clusters now, but have you ever set one up? Those were 10x more difficult to setup by comparison.

AlexNguyen · February 2012

GAngel wrote: »

As an aside with the rise of virtualization the need for clustering isn't as high as it used to be.

Even if your hosts are clustered, you still need to cluster your guests for zero downtime maintenance reasons (e.g. patches installation).

AlexNguyen · February 2012

In "simple" words, clustering is for HA (High Availibility) and database replication is for BCP (Business Continuity Plan) or DRP (Disaster Recovery Plan).

Forsaken_GA · February 2012

Bunch of microsoft geeks here

Since he mentioned he's a linux admin, i'm going to assume he's from that world, and since he's working for a hosting company, I'm going to assume he's mostly dealing with mysql databases.

Long story short - replication is simply having another copy of the data available, for whatever purposes. If the master fails, the slave is not automatically going to take over, unless you're doing master-to-master replication, and mysql load balancing. You can certainly manually fail your application over to use the slave, but you're going to have some times getting the database consistent again, especially if the master had data it hadn't yet replicated to the slave. The most common use I've seen for mysql replication is to split the performance barrier. Some folks use the slave for all reads and only the master for writes (vbulletin can do this, for example). I've also seen folks use the slave for pulling reports and monitoring data out of, etc. In my previous job, the security events were written to on the master, which then replicated it out, and all of the analysts tools were reading from slave nodes (so if replication broke, it took everyone down).

Clustering, on the other hand, is all about availability. When data is written, it gets written to all nodes at the same time. A cluster is all about high availability. The application knows of and interfaces into the cluster, not the individual servers within it. So servers can go down for whatever reason, and the application won't miss a beat. When the node comes back online, it's data will be brought back to consistency with it's other members.

Everyone · February 2012

Forsaken_GA wrote: »

Clustering, on the other hand, is all about availability. When data is written, it gets written to all nodes at the same time. A cluster is all about high availability.

No, what you're describing here is replication. Replication writes all storage points, it can be at the same time, or it can be delayed depending on how it is configured. Clustering uses shared storage. Only the active node accesses the data. Even in Active/Active clustering, only 1 node accesses data at a time, as a cluster group can only be active on one node.

An example of replication and clustering being used together would be a Geo-Cluster. Replication occurs at the storage level, data is written simultaneously to Site A and Site B. While each site physically has its own storage, the cluster see's it as being the same storage and can use it as shared storage. This allows the active node to fail over between sites even though they are geographically separated.

Forsaken_GA · February 2012

Everyone wrote: »

No, what you're describing here is replication. Replication writes all storage points, it can be at the same time, or it can be delayed depending on how it is configured. Clustering uses shared storage. Only the active node accesses the data. Even in Active/Active clustering, only 1 node accesses data at a time, as a cluster group can only be active on one node.

Not in the mysql world. In the mysql world, replication is an asynchronous operation, whereas clustering is a synchronous operation. Essentially, in the mysql world, for the terminology, it's all replication, it's just implementation that differs. You can do most of the function of an NDB cluster using multi-master replication with mysql, primarily the high availability, but it adds a very large element of complexity to configuration and maintenance.

Again, I'm speaking specifically to a mysql database environment, not a generalization. Since the OP specifically mentioned linux and hosting, I'd bet dollars to donuts he's in a mysql environment. Oracle and IBM do things very differently.

Everyone · February 2012

Forsaken_GA wrote: »

Not in the mysql world. In the mysql world, replication is an asynchronous operation, whereas clustering is a synchronous operation. Essentially, in the mysql world, for the terminology, it's all replication, it's just implementation that differs. You can do most of the function of an NDB cluster using multi-master replication with mysql, primarily the high availability, but it adds a very large element of complexity to configuration and maintenance.

Again, I'm speaking specifically to a mysql database environment, not a generalization. Since the OP specifically mentioned linux and hosting, I'd bet dollars to donuts he's in a mysql environment. Oracle and IBM do things very differently.

Database replication is not clustering, so the part I bolded there is more accurate.. Yes things are done very differently depending on what vendor you go with. MySQL, Postgre, Sybase, Oracle, SQL, etc. You can mix database replication with clustering. I'm not a database guy, but I've seen it done quite frequently.

erpadmin · February 2012

I have never worked at a shop that did not use an active/passive cluster for production databases--you just don't put a production database on a single point of failure...ever. Replication is usually handled by the SAN setups, but that's a really expensive solution. At the host level, clustering (active/passive) is usually what gets done. One word DBAs will pretty much learn on day 1 is "redundancy..."

And while mysql is owned by Oracle, you can run an Oracle (proper) database on Linux.

I've only seen mySQL databases that are tiny....and I mean at most a few hundred MB. Oracle will never let mySQL become an enterprise product, for very obvious reasons.

Forsaken_GA · February 2012

Everyone wrote: »

Database replication is not clustering, so the part I bolded there is more accurate.. Yes things are done very differently depending on what vendor you go with. MySQL, Postgre, Sybase, Oracle, SQL, etc. You can mix database replication with clustering. I'm not a database guy, but I've seen it done quite frequently.

I understand what you're saying, and I'm not disagreeing with you. However, what MySQL calls clustering (specifically, NDBCluster) is actually replication, as you're pointing out, but they call it a cluster. And if you're in the hosting world, and you're talking databases and replication vs. cluster, that's what you're up against. Since I believe that's where the OP is coming from, that's how I answered the question.

Not if he's not dealing with mysql, then what I'm posting is completely irrelevant. If he is, then the semantical differentiation only serves to confuse the issue hehe

Forsaken_GA · February 2012

erpadmin wrote: »

I have never worked at a shop that did not use an active/passive cluster for production databases--you just don't put a production database on a single point of failure...ever. Replication is usually handled by the SAN setups, but that's a really expensive solution. At the host level, clustering (active/passive) is usually what gets done. One word DBAs will pretty much learn on day 1 is "redundancy..."

Sadly enough, I have seen more than one company that generates seven figures of revenue do exactly that, have a single point of failure. One of them was so bad that if the master database died, the only choice in order to keep everyone working was to fail all services over to the DR site. Others had no DR site and simply had to endure the outage. When I worked for a hosting company, I only ever saw one company that was willing to do the investment for database redundancy, and their revenue was only in the high six figures a month. So each company has different goals.

And while mysql is owned by Oracle, you can run an Oracle (proper) database on Linux. I've only seen mySQL databases that are tiny....and I mean at most a few hundred MB. Oracle will never let mySQL become an enterprise product, for very obvious reasons.

Well of course you can run Oracle on Linux. However, my experience has been that it's companies that keep their infrastructure in house that are willing to make that kind of investment. Companies that are going to provide services with a hosting company (this doesn't necessarily apply to those that are colocating instead of actually leasing infrastructure, but I've seen that as well) tend to be cost conscious, and the additional investment in hardware and licensing costs for one of the commerical vendors tends to be off putting, so they play dice and skimp on redundancy. Whether they've done a proper cost benefit analysis to determine if the downtime is cheaper than investing in a proper solution, I don't know, all I do know is that there are alot of companies that make alot of money that are gunshy when it comes to capex. A few of the smarter ones hedged and instead of investing in redundancy, modularized things by seperating their database servers, and making sure that the dependencies were limited. So if something went down, it only affected a subset of their business, as opposed to a total outage. It's a crapload cheaper to spin up additional database servers running mysql and split up your data stores than it is to invest in a properly redundant Oracle cluster.

As far as mysql database sizes go, I've seen them from anywhere from a few hundred k for someones Wordpress blog, to 80 gigs worth of the enterprises entire production data store (and the 80 gig one had endured enough replication faults that the data between the slave and master was hopelessly out of sync and they had to accept some data discrepencies by skipping some transactions and praying it wasn't anything important).

If you've worked in operations for any length of time you understand that there's a world of difference between best practices and the investment that requires and daily operational reality.

erpadmin · February 2012

Forsaken_GA wrote: »

Sadly enough, I have seen more than one company that generates seven figures of revenue do exactly that, have a single point of failure. One of them was so bad that if the master database died, the only choice in order to keep everyone working was to fail all services over to the DR site. Others had no DR site and simply had to endure the outage. When I worked for a hosting company, I only ever saw one company that was willing to do the investment for database redundancy, and their revenue was only in the high six figures a month. So each company has different goals.

If you're a small shop, then you're IT budget is going to be small. If you're a large shop, then you will allocate funds for an acceptable level of redundancy. If you run mission-critial applications as a front end to a mission critical database, then an active-passive cluster is what one goes with.

Forsaken_GA wrote: »

If you've worked in operations for any length of time you understand that there's a world of difference between best practices and the investment that requires and daily operational reality.

I'm fortunate enough that my daily operational reality is that we implement best practice solutions. That means working in a shop with SAN replication, with two 2-node clusters for High Availability. The days of being a MacGuyver type of tech are over...there are standards and methodologies that should be followed...even for the cost-conscious. I've been doing IT (professionally) for going on 15 years...I've been in operations for most of that time, except my brief stint as a desktop support analyst. Even as a JOAT, I was involved in operations. But it wasn't until I dealt with ERPs that I stopped dealing with IT issues by the seat of my pants, like MacGuyver....I buckled down and learned methodologies and best practices that I make sure are followed in any in-house shop I work at.

Forsaken_GA · February 2012

erpadmin wrote: »

If you're a small shop, then you're IT budget is going to be small. If you're a large shop, then you will allocate funds for an acceptable level of redundancy. If you run mission-critial applications as a front end to a mission critical database, then an active-passive cluster is what one goes with.

You're speaking in absolutes.

Understand, I *agree* with you, and I'm not here saying that skimping is ok. If the choice is mine, I'm going best practice all the time. Reality is that different companies have different goals, and that size is no indication of competence. I seriously doubt I'm the only one who's ever worked for a company or done work for a company that is so inept that the only reason they survive the magnitude of their mistakes is because of their size. On the flipside of that, I've seen smaller shops that understand that doing it right the first time leads to better returns in the long run, and play the long game, instead of only trying to maximize their results for the quarter.

I'm fortunate enough that my daily operational reality is that we implement best practice solutions. That means working in a shop with SAN replication, with two 2-node clusters for High Availability. The days of being a MacGuyver type of tech are over...there are standards and methodologies that should be followed...even for the cost-conscious. I've been doing IT (professionally) for going on 15 years...I've been in operations for most of that time, except my brief stint as a desktop support analyst. Even as a JOAT, I was involved in operations. But it wasn't until I dealt with ERPs that I stopped dealing with IT issues by the seat of my pants, like MacGuyver....I buckled down and learned methodologies and best practices that I make sure are followed in any in-house shop I work at.

And I applaud you for it, and I've gotten myself to a point where I can get away with the same. Where I take issue is with the representation that this is the way it is

While this may be typical for you, it is not typical for the industry as a whole. Just look at all the security breaches that occurred last year for an overview that having a large budget and/or revenue stream does not automatically equate to a company that invests in following best practices.

it_consultant · February 2012

When cost is an issue, which it is for a SMB business, doing an active/passive cluster is unobtainable. For these folks either a snapshot backup (on VMWARE, Hyper-V, etc) or an image backup is the most they can afford. A DB replica is more affordable, since it doesn't require shared storage, but unlike MS Exchange where the system will understand a failure and redirect clients, I don't see that MSSQL replicas offer that ability. If your main DB server went down and you had a replica, you either have to reconfigure all your hooks to go to the replica or fix your main server and sync back the replica.

Database replication vs. clustering

Comments