Cloud Conversations: AWS EBS, Glacier and S3 Overview
Simple Storage Service (S3), Glacier and Elastic Block Store (EBS) are part of the AWS cloud storage portfolio of services
Amazon Web Services (AWS) recently added EBS Optimized support for enhanced bandwidth EC2 instances (read more here). This industry trends and perspective cloud conversation is the first (looking at EBS) in a three-part series companion to the AWS EBS optimized post found here. Part II is here (closer look at S3) and part III is here (tying it all together).
For those not familiar, Simple Storage Service (S3), Glacier and Elastic Block Store (EBS) are part of the AWS cloud storage portfolio of services. There are several other storage and data related services, including database offerings (SQL and NoSQL based) for little data. Other offerings include compute, data management, application and networking services for different needs, as shown in the following image.
AWS Services Console via www.amazon.com
Simple Storage Service (S3) is commonly used in the context of cloud storage and object storage accessed via its S3 API. S3 can be used externally from outside AWS as well as within or via other AWS services, for example with Elastic Compute Cloud (EC2), including via the Amazon Storage Gateway (read more here and about EC2 here). Glacier is the AWS cold or deep storage service for inactive data and is a companion to S3 that you can read more about here.
S3 is well suited for both big and little data repositories of objects ranging from backup to archive to active video images and much more. In fact, if you are using some of the different AaaS or SaaS services including backup or file and video sharing, those may be using S3 as their back-end storage repository. For example, NetFlix leverages various AWS capabilities as part of its data and applications infrastructure (read more here).
AWS Basics
AWS consists of multiple regions, each containing multiple availability zones from which data and applications are served.
Note that objects stored in a region never leave that region unless you move them. For example, data stored in EU West never leaves Ireland, and data in US East never leaves Virginia.
AWS does support user-controlled movement of data between regions for business continuance (BC), high availability (HA) and disaster recovery (DR). Read more here at the AWS Security and Compliance site and also in this AWS white paper.
What About EBS?
That brings us to Elastic Block Store (EBS), which is used by EC2 (read more about EC2 and instances here) as storage for cloud and virtual machines or compute instances. EBS can be thought of as primary storage, with S3 used as a persistent backing store or target for holding snapshots. You can provision and allocate EBS volumes in the different data centers of the various AWS availability zones. As part of allocating your EBS volume you indicate the type, either standard or provisioned IOPS, and you can pair it with the new EBS Optimized capability. EBS Optimized enables instances that support the feature to have better IO performance to storage.
The following image shows an EC2 instance with EBS volumes (standard and provisioned IOPS) along with S3 storage and snapshots. In the following example the instance and volumes are being served via the AWS US East region (Northern Virginia) using availability zone US East 1a. In addition, EBS Optimized is shown being used in the example to increase bandwidth or throughput performance between storage and the compute instance.
Using the above as a basis, you can build on it to leverage multiple availability zones or regions for HA, BC and DR combined with application, network load balancing and other capabilities. Note that EBS volumes are protected for durability by being spread across different servers and storage in an availability zone. Additional protection is provided by using snapshots combined with S3. Additional BC and DR or HA protection can be accomplished by replicating data across availability zones.
The above is an example of tying various components and services together, for example using different AWS availability zones, instances, EBS, S3 and other tools including those from third parties. Here is a link to a free chapter download from Cloud and Virtual Data Storage Networking (CRC Press) pertaining to data protection, BC and DR (available at Amazon here and Kindle here).
In addition, here is an AWS white paper pertaining to using their services for BC, HA and DR.
EBS volumes are created ranging in size from 1 GByte to 1 TByte in space capacity, with multiple volumes being mapped or attached to an EC2 instance. EBS volumes appear as a virtual disk drive for block storage. From the EC2 instance and guest operating system you can format, mount and use the EBS volumes as any other block disk drive with your favorite tools and file systems. In addition to space capacity, EBS volumes are also provisioned with standard IO (e.g. disk based) performance or high performance Provisioned IOPS (e.g. SSD) for thousands of IOPS per instance. AWS states that a standard EBS volume should support about 100 IOPS on average, with about 2,000 IOPS for a provisioned IOPS volume. If you need more than 2,000 IOPS, the AWS recommendation is to use multiple provisioned IOPS volumes with data spread across them. Following is an example of AWS EBS volumes seen via the EC2 management interface.
AWS EC2 and EBS configuration status
Note that there is a 10 to 1 ratio of provisioned IOPS to space capacity in GBytes. If you try to play a game of provisioning 1,000 IOPS on a 10 GByte EBS volume to keep your costs down, you are out of luck. Thus to get 1,000 IOPS you would need to allocate at least a 100 GByte EBS volume, for which you will be billed for the actual space used on a monthly pro-rated basis. The following is an example of provisioning an AWS EBS volume using provisioned IOPS in the US East region in the 1a availability zone.
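As a side note on the sizing rules mentioned above (the 10 to 1 ratio and the roughly 2,000 IOPS per provisioned volume), here is a minimal Python sketch of the arithmetic. It does not call any AWS API. The function name and constants are hypothetical, and the limits are simply the figures quoted in this post, so treat them as assumptions that AWS may change.

```python
import math

# Figures quoted in the text above; assumptions, not authoritative AWS limits.
MAX_IOPS_PER_GBYTE = 10      # 10 to 1 ratio of provisioned IOPS to GBytes
MAX_IOPS_PER_VOLUME = 2000   # approximate ceiling for one provisioned IOPS volume
MIN_VOLUME_GBYTES = 1        # smallest EBS volume size mentioned above


def plan_provisioned_iops(target_iops):
    """Return (volume_count, iops_per_volume, min_gbytes_per_volume).

    Spreads the target evenly across as few volumes as the per-volume
    ceiling allows, then sizes each volume to satisfy the IOPS-to-GByte ratio.
    """
    if target_iops <= 0:
        raise ValueError("target_iops must be positive")
    volume_count = math.ceil(target_iops / MAX_IOPS_PER_VOLUME)
    iops_per_volume = math.ceil(target_iops / volume_count)
    min_gbytes = max(
        MIN_VOLUME_GBYTES,
        math.ceil(iops_per_volume / MAX_IOPS_PER_GBYTE),
    )
    return volume_count, iops_per_volume, min_gbytes


if __name__ == "__main__":
    for target in (1000, 2000, 5000):
        count, iops, gbytes = plan_provisioned_iops(target)
        print(f"{target} IOPS: {count} volume(s), {iops} IOPS each, "
              f"at least {gbytes} GBytes each")
```

For the 1,000 IOPS case discussed above this gives one volume of at least 100 GBytes. Beyond 2,000 IOPS it splits the load across several volumes, which you would then stripe or otherwise spread data across at the operating system level.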
Amazon Web Services (AWS) recently added EBS Optimized support for enhanced bandwidth EC2 instances (read more here). This industry trends and perspective cloud conversation is the second (looking at S3) in a three-part series companion to the AWS EBS optimized post found here. Part I is here (closer look at EBS) and part III is here (tying it all together).
For those not familiar, Simple Storage Service (S3), Glacier and Elastic Block Store (EBS) are part of the AWS cloud storage portfolio of services. With S3, you specify a region where a bucket is created that will contain objects that can be written, read, listed and deleted. You can create multiple buckets in a region, each holding an unlimited number of objects ranging from 1 byte to 5 TBytes in size per object. Each object has a unique, user or developer assigned key. In addition to being tied to an AWS region, S3 buckets and objects are provisioned using different levels of availability, durability, SLAs and costs (view S3 SLAs here).
Cost will vary depending on the AWS region being used, along with whether Standard or Reduced Redundancy Storage (RRS) is selected. Standard S3 storage is designed for 99.999999999% durability (how many copies exist) and 99.99% availability (how often it can be accessed) on an annual basis, and can sustain two data centers becoming unavailable.
As its name implies, S3 RRS trades a lower fee for a lower level of durability: it has an annual durability of 99.99% and availability of 99.99%, and can sustain a single data center loss. In the following figure, durability is how many copies of data exist spread across different servers and storage systems in various data centers and availability zones.
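To put those percentages into perspective, the following is a small Python sketch that converts an annual durability figure into an average expected object loss, and an annual availability figure into minutes per year. The function names are hypothetical, and treating availability as a simple downtime budget is a simplifying assumption rather than how the S3 SLA is formally measured. Percentages are passed as strings so that `Fraction` keeps the arithmetic exact.

```python
from fractions import Fraction

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def expected_annual_object_loss(object_count, durability_percent):
    """Average number of objects expected to be lost per year."""
    loss_rate = 1 - Fraction(durability_percent) / 100
    return object_count * loss_rate


def unavailable_minutes_per_year(availability_percent):
    """Minutes per year outside the stated availability (simplified view)."""
    return MINUTES_PER_YEAR * (1 - Fraction(availability_percent) / 100)


if __name__ == "__main__":
    # Standard S3 figures quoted in the text above.
    loss = expected_annual_object_loss(10_000, "99.999999999")
    print(f"10,000 objects: about one lost every {1 / loss:,} years on average")
    minutes = unavailable_minutes_per_year("99.99")
    print(f"99.99% availability: about {float(minutes):.2f} minutes per year")
```

The same two functions can be called with the RRS figures to compare what the lower price gives up.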
What would you put in RRS vs. Standard S3 storage?
Items that need some level of persistence but can be refreshed, recreated or restored from some other place or pool of storage, such as thumbnails, static content or read caches. Other items would be those for which you could tolerate some downtime while waiting for data to be restored, recovered or rebuilt from elsewhere in exchange for a lower cost.
Different AWS regions can be chosen for regulatory compliance requirements, performance, SLAs, cost and redundancy, with authentication mechanisms and encryption in transit (SSL and HTTPS) to make sure data is kept secure. Various rights and access can be assigned to objects, including making them public or private. In addition to logical data protection (security, identity and access management (IAM), encryption, access control), policies also apply to determine the level of durability and availability or accessibility of buckets and objects. Other attributes of buckets and objects include life-cycle management policies and logging of activity to the items. Also part of the objects is metadata containing information about the data being stored, shown in a generic example below.
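As a rough stand-in for a generic example, here is a toy in-memory Python model of a bucket and its objects. It is not the AWS SDK or the S3 API, and the class and method names (`StoredObject`, `Bucket`, `list_keys`) are hypothetical. The `x-amz-meta-` prefix for user-defined metadata and the `STANDARD` and `REDUCED_REDUNDANCY` storage class values follow S3 naming, while everything else is simplified for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    key: str                          # unique within its bucket
    data: bytes
    storage_class: str = "STANDARD"   # or "REDUCED_REDUNDANCY"
    acl: str = "private"              # or "public-read"
    user_metadata: dict = field(default_factory=dict)

    def headers(self):
        """Metadata roughly as it would travel in HTTP-style headers."""
        result = {
            "Content-Length": str(len(self.data)),
            "x-amz-storage-class": self.storage_class,
        }
        for name, value in self.user_metadata.items():
            result["x-amz-meta-" + name.lower()] = value
        return result


@dataclass
class Bucket:
    name: str
    region: str                       # a bucket is created in one region
    objects: dict = field(default_factory=dict)

    def put(self, obj):
        self.objects[obj.key] = obj   # write (overwrites the same key)

    def get(self, key):
        return self.objects[key]      # read

    def list_keys(self, prefix=""):
        return sorted(k for k in self.objects if k.startswith(prefix))

    def delete(self, key):
        self.objects.pop(key, None)   # deleting a missing key is a no-op


if __name__ == "__main__":
    bucket = Bucket("example-bucket", "us-east-1")
    bucket.put(StoredObject("thumbs/cat.jpg", b"abc",
                            storage_class="REDUCED_REDUNDANCY",
                            user_metadata={"Source": "camera"}))
    print(bucket.list_keys("thumbs/"))
    print(bucket.get("thumbs/cat.jpg").headers())
```

The point of the model is the shape of the data: a region holds buckets, a bucket maps keys to objects, and each object carries its data plus attributes such as storage class, access rights and metadata.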
Access to objects is via standard REST and SOAP interfaces with an Application Programming Interface (API). For example, default access is via HTTP, along with a BitTorrent interface, with optional support via various gateways, appliances and software tools.
The above figure via Cloud and Virtual Data Storage Networking (CRC Press) shows a generic example applicable to AWS services including S3 being accessed in different ways.
AWS S3 buckets and objects in different regions
The following figure is a generic example of how cloud and object storage are accessed using different tools, hardware, software and APIs along with gateways. AWS is an example of what is shown in the following figure as a Cloud Service, with S3, EBS or Glacier as cloud storage. Common example API commands are also shown, which will vary by different vendors, products or solution definitions or implementations. While the Amazon S3 API, which is REST and HTTP based, has become an industry de facto standard, there are other APIs including CDMI (Cloud Data Management Interface), developed by SNIA, which has gained ISO accreditation.
Cloud and object storage access example via Cloud and Virtual Data Storage Networking
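To make the REST and HTTP based access described above a little more concrete, here is a minimal Python sketch that maps the basic object operations (write, read, list, delete) to an HTTP method and URL. The helper name `s3_request_line` is hypothetical, and the host name pattern is an assumption based on the virtual-hosted style of S3 addressing. The sketch only builds the request line. It sends nothing and leaves out authentication, which real requests to private buckets require.

```python
from urllib.parse import quote

# Basic object operations and the HTTP method each one maps to.
OPERATION_TO_METHOD = {
    "write": "PUT",
    "read": "GET",
    "list": "GET",      # a GET on the bucket itself lists its objects
    "delete": "DELETE",
}


def s3_request_line(operation, bucket, key="", region="us-east-1"):
    """Return (http_method, url) for a basic operation on a bucket or object."""
    if operation not in OPERATION_TO_METHOD:
        raise ValueError(f"unknown operation: {operation}")
    if operation != "list" and not key:
        raise ValueError("an object key is required for this operation")
    # Assumed virtual-hosted style host name: <bucket>.s3.<region>.amazonaws.com
    host = f"{bucket}.s3.{region}.amazonaws.com"
    # quote() percent-encodes spaces and similar characters but keeps "/".
    path = "/" if operation == "list" else "/" + quote(key)
    return OPERATION_TO_METHOD[operation], f"https://{host}{path}"


if __name__ == "__main__":
    print(s3_request_line("write", "example-bucket", "thumbs/cat 1.jpg"))
    print(s3_request_line("list", "example-bucket"))
```

Gateways, appliances and software tools that speak the S3 API are in effect issuing requests of this shape on your behalf, with signing, retries and multipart handling layered on top.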
Additional reading and related items:
- Cloud conversations: AWS EBS optimized instances
- Cloud conversations: AWS EBS, Glacier and S3 overview (Part I EBS)
- Cloud conversations: AWS EBS, Glacier and S3 overview (Part III)
- Cloud conversations: AWS Government Cloud (GovCloud)
- Cloud conversations: Gaining cloud confidence from insights into AWS outages
- AWS (Amazon) storage gateway, first, second and third impressions
- Cloud conversations: Public, Private, Hybrid what about Community Clouds?
- Amazon cloud storage options enhanced with Glacier
- Amazon Web Services (AWS) and the NetFlix Fix?
- Cloud conversation, Thanks Gartner for saying what has been said
- Cloud and Virtual Data Storage Networking via Amazon.com
- Seven Databases in Seven Weeks