Availability, Durability and Reliability

This concept relates to the support you get when you're working with cloud computing, and it also relates to your own infrastructure. So while you're building your application, while you're building your infrastructure, you typically have three different measurements, three different metrics: availability, durability and reliability. They are very different, and most people confuse them. It's good that you're taking this course, because I will explain and demystify everything for you.

Let me give you an example. Let's say an application, just some application, not very nicely written, is down six minutes out of every hour. So that means 6 divided by 60, because we have 60 minutes in an hour, and that's 10%. Right? So the application is available 90% of the time, because 10% of the time it's down. And the reliability means how long that application can run at most without interruption. So the reliability would be less than an hour, because it never reaches a full hour of uptime: on average, it's down six minutes in every hour. So availability and reliability are kind of related, but they are not the same thing.
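
To make that arithmetic concrete, here is a minimal sketch in Python. The function names and the outage timeline are hypothetical, chosen only for illustration, and "reliability" is modeled the informal way it is described above: the longest stretch the application stays up without interruption. It assumes the outages do not overlap.

```python
def availability(outages, period_minutes):
    """Fraction of the period the application was up.

    `outages` is a list of (start, end) pairs in minutes; this sketch
    assumes they do not overlap.
    """
    downtime = sum(end - start for start, end in outages)
    return 1 - downtime / period_minutes


def longest_uninterrupted_run(outages, period_minutes):
    """Longest stretch of continuous uptime within the period, in minutes."""
    longest = 0.0
    cursor = 0.0  # end of the most recent outage seen so far
    for start, end in sorted(outages):
        longest = max(longest, start - cursor)
        cursor = max(cursor, end)
    # Account for the uptime after the last outage.
    return max(longest, period_minutes - cursor)


# Hypothetical hour with one six-minute outage in the middle.
outages = [(27, 33)]
print(f"availability: {availability(outages, 60):.0%}")
print(f"longest run:  {longest_uninterrupted_run(outages, 60)} minutes")
```

With this made-up timeline the application is 90% available, yet it never runs for more than 27 minutes in a row, which is why the two metrics have to be tracked separately.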

Reliability becomes important when you have some long-running process. Let's say it's a cron job that takes two hours to complete. If your system is never up for even one hour straight, that job will never finish. Right? Availability is typically what you notice when you go to a website and it's not available. We typically see availability in the range of four nines, which is considered good, and five nines, which is considered "military grade." That's what we built at DocuSign, where we built our own solution. We didn't use a private cloud, or a public cloud, or any cloud at all at DocuSign. And the IT guys, the IT ops team at DocuSign, built five nines, which is amazing.
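
As a rough illustration of what "four nines" and "five nines" mean in practice, the sketch below converts a number of nines into the downtime it allows per year, and checks whether a long job fits into the longest uninterrupted uptime. The function names are hypothetical, and the year is taken as 365 days.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def availability_from_nines(nines):
    """Four nines -> 0.9999, five nines -> 0.99999, and so on."""
    return 1 - 10 ** -nines


def allowed_downtime_minutes(nines, period_minutes=MINUTES_PER_YEAR):
    """Downtime budget, in minutes, for the given number of nines."""
    return period_minutes * 10 ** -nines


def job_can_complete(job_minutes, longest_uptime_minutes):
    """A job only finishes if it fits in one uninterrupted stretch of uptime."""
    return job_minutes <= longest_uptime_minutes


for nines in (3, 4, 5):
    minutes = allowed_downtime_minutes(nines)
    print(f"{nines} nines: {minutes:.2f} minutes of downtime per year")

# The two-hour cron job on a system that never stays up a full hour
# (54 minutes is an assumed figure for the example).
print(job_can_complete(job_minutes=120, longest_uptime_minutes=54))
```

Four nines works out to roughly 52.6 minutes of downtime per year and five nines to roughly 5.3 minutes, while the two-hour job never completes on a system whose longest uptime is under an hour.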

And then there's durability, which is about whether you lose the data. Durability is independent of availability and reliability, because it's implemented through a different mechanism. Let me give you another example. S3: if you go to their documentation, they quote a durability of eleven nines, which is a very, very big number. It's pretty close to 100%. So what does that mean? It means that the data, once you put it into S3, is pretty much indestructible. It doesn't matter if there is a tsunami in one location or another; they have so many redundancies built in, so many backups of backups of backups, that it's pretty much guaranteed. Right? So it's very close to 100% with eleven nines.
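
To get a feel for how large eleven nines is, here is a small sketch. It assumes the durability figure applies per object per year and that losses are independent, which is a simplification; the function names and the object count are hypothetical.

```python
def expected_annual_loss(object_count, durability_nines=11):
    """Expected number of objects lost per year at the given durability."""
    return object_count * 10 ** -durability_nines


def years_per_lost_object(object_count, durability_nines=11):
    """Average number of years between losing a single object."""
    return 1 / expected_annual_loss(object_count, durability_nines)


stored = 10_000_000  # assumed number of stored objects
print(f"expected losses per year: {expected_annual_loss(stored):.6f}")
print(f"about one lost object every {years_per_lost_object(stored):,.0f} years")
```

Under these assumptions, storing ten million objects means losing a single object about once every 10,000 years on average.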

I recently uploaded my old archives from my old drives to S3, and I will just get rid of those drives. So that's great durability. But then the availability is just four nines. Well, that's still high for availability, but compared to the durability it's a way, way smaller number. So why is that? Well, if you try to request an object from S3, there is some time in the year, under an hour at four nines, when it might not be available. The data is still there; it's not like the data is gone. It's just not available at that particular minute. So if you wait a few minutes or a few hours, eventually it will be available. So that's the difference between durability and availability, and then you also have reliability.
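
Because the data is still there and only temporarily unreachable, a client can simply wait and try again. The sketch below shows one common way to do that, retrying with exponential backoff. It is not tied to any real S3 client: `TemporarilyUnavailable`, `fetch_with_retry` and the `fetch` callable are all hypothetical stand-ins for whatever your storage client provides.

```python
import time


class TemporarilyUnavailable(Exception):
    """Hypothetical error: the object exists but cannot be served right now."""


def fetch_with_retry(fetch, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call `fetch()` until it succeeds, doubling the wait after each failure.

    `fetch` is any zero-argument callable that returns the object or raises
    TemporarilyUnavailable. `sleep` is injectable so the waits can be skipped
    in tests.
    """
    for attempt in range(attempts):
        try:
            return fetch()
        except TemporarilyUnavailable:
            if attempt == attempts - 1:
                raise  # out of attempts: report the outage to the caller
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")


# Usage with a fake fetch that fails twice, then succeeds.
calls = {"count": 0}


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] <= 2:
        raise TemporarilyUnavailable("try again later")
    return b"archive bytes"


print(fetch_with_retry(flaky_fetch, sleep=lambda seconds: None))
```

The retry only makes sense because of the durability guarantee: an availability problem goes away if you wait, while a durability problem would not.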

And, yeah, a third example: EC2 has a little bit less availability than S3. So S3 has more availability. If you're interested in learning more, go and check the standard service level agreement for each particular AWS service. The availability that they guarantee might differ from service to service.