Understanding Azure Storage
As the name suggests, Microsoft Azure Storage is Microsoft's cloud storage solution. It provides highly scalable storage for objects, file shares, messages, and non-relational tables. Azure Storage is:
Highly available and durable. Azure Storage keeps customers' data safe and available in the event of hardware failures or even natural disasters. Data can optionally be replicated across datacenters or geographical regions for added protection, so it remains highly available.
Secure. Azure Storage encrypts data and provides fine-grained control over who has access to it.
Scalable. Azure Storage is designed to meet the scalability and performance requirements of applications.
Azure Storage can be accessed from anywhere in the world over HTTP or HTTPS. It can be used from any programming language through a well-defined, mature REST API, and it also supports scripting with PowerShell or the Azure CLI.
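To make the REST access a little more concrete, here is a minimal Python sketch of how the default public endpoint for each service is formed, following the `https://<account>.<service>.core.windows.net` pattern. The account, container, and blob names are made up, and the helpers `service_endpoint` and `blob_url` are illustrative names, not part of any SDK.

```python
from urllib.parse import quote

# The four data services that expose a REST endpoint per storage account.
SERVICES = ("blob", "file", "queue", "table")


def service_endpoint(account: str, service: str) -> str:
    """Return the default public HTTPS endpoint of one service."""
    if service not in SERVICES:
        raise ValueError(f"unknown service: {service!r}")
    return f"https://{account}.{service}.core.windows.net"


def blob_url(account: str, container: str, blob_name: str) -> str:
    """Return the URL of a blob; '/' in the blob name is kept as a path separator."""
    base = service_endpoint(account, "blob")
    return f"{base}/{quote(container)}/{quote(blob_name)}"


# Hypothetical account and blob, for illustration only.
print(blob_url("myaccount", "images", "2024/cat photo.png"))
```

A request to such a URL still has to be authorized, unless the container allows anonymous access.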
There are other ways to access Azure Storage as well. One is the Azure portal, and another is Storage Explorer, a visual tool from Microsoft that you install on your local system, connect to your subscription, and then use to access your storage.
As discussed above, storage accounts provide high availability through redundancy. Azure replicates the data in a storage account according to the redundancy option chosen when the account is created. There are four different replication options. They are –
- LRS (Locally Redundant Storage) – A low-cost replication strategy in which the data is replicated within the same data center
- ZRS (Zone Redundant Storage) – Data is replicated synchronously across three availability zones in the same region for high availability and durability
- GRS (Geo Redundant Storage) – Data is replicated to a second region. The copy in the secondary region cannot be read unless a failover to that region occurs
- RA-GRS (Read Access Geo Redundant Storage) – Similar to GRS, as the data is replicated across regions, but the copy in the secondary region can also be read.
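The trade-offs between these four options can be summarized in a small lookup. The sketch below is only an illustration of the list above: the `Redundancy` class and the `pick_redundancy` heuristic are hypothetical helpers, and a real decision would also weigh cost and compliance requirements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Redundancy:
    name: str
    scope: str                 # where the copies live
    secondary_readable: bool   # can the replica be read directly?


OPTIONS = {
    "LRS": Redundancy("LRS", "single data center", False),
    "ZRS": Redundancy("ZRS", "three zones in one region", False),
    "GRS": Redundancy("GRS", "two regions", False),
    "RA-GRS": Redundancy("RA-GRS", "two regions", True),
}


def pick_redundancy(survive_datacenter_outage: bool = False,
                    survive_region_outage: bool = False,
                    read_from_secondary: bool = False) -> Redundancy:
    """Pick the first option that satisfies the stated requirements.

    This is a simplified heuristic written for this article, not an official rule.
    """
    if read_from_secondary:
        return OPTIONS["RA-GRS"]
    if survive_region_outage:
        return OPTIONS["GRS"]
    if survive_datacenter_outage:
        return OPTIONS["ZRS"]
    return OPTIONS["LRS"]


print(pick_redundancy(survive_region_outage=True).name)
```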
Azure Storage Data Services
Azure Storage provides four kinds of data services, each suited to specific requirements. I will explain them in more detail, but they are:
- Blobs – This data service is designed for handling huge amounts of unstructured data and is ideal for images, videos, audio, backup files, log files, archives, and other large files. Blobs can be accessed globally over HTTP/HTTPS, as well as through the Azure portal, the REST API, PowerShell, and the CLI. Azure Blob storage offers different access tiers, which allow you to store blob data in the most cost-effective manner. They are –
  - Hot – Used for frequently accessed data. Access costs are the lowest, but storage costs are the highest
  - Cool – Has lower storage costs than the Hot tier, but higher than the Archive tier. Data in this tier should be stored for at least 30 days. Examples include short-term backups and large datasets that are kept for analysis while more data is being collected
  - Archive – This tier has the lowest storage costs, but the highest access costs and the highest data access latency. To access data in the Archive tier, we need to change its tier to either Hot or Cool. This process is called rehydration, and it can take up to 15 hours
- Files – The Files data service stores highly available file shares that are accessed using the SMB (Server Message Block) protocol. SMB allows multiple VMs to access, read, and write the same files. As with the other storage data services, Files can also be accessed from anywhere in the world over HTTP/HTTPS.
Applications that already rely on file shares are easier to migrate to Azure using Azure Files. Since a file share can be used from different VMs, common tools and utilities can be stored in it.
- Queues – As the name suggests, Queues are used to store and retrieve millions of messages, each up to 64 KB in size, for asynchronous processing.
- Tables – This data service stores structured, non-relational (NoSQL) data with a schemaless design, which makes it easy to adapt as requirements change. It is fast and cost-effective. A storage account may contain any number of tables, up to the storage account capacity limit. Tables can be used to serve web applications and can evolve along with them.
It suits data that does not require complex joins, foreign keys, or stored procedures.
The Table data service is now also available as part of Azure Cosmos DB, through its Table API, which adds features such as throughput-optimized tables and global distribution.
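To illustrate what "schemaless" means for Tables, here is a toy in-memory stand-in written in Python. Real table entities are identified by a `PartitionKey` and a `RowKey`; everything else here (the `InMemoryTable` class, its methods, and the sample data) is a hypothetical sketch and not the Azure SDK.

```python
class InMemoryTable:
    """Toy model of a table: entities keyed by (PartitionKey, RowKey)."""

    def __init__(self):
        self._rows = {}

    def upsert(self, entity: dict) -> None:
        """Insert or replace an entity; no fixed set of columns is enforced."""
        key = (entity["PartitionKey"], entity["RowKey"])
        self._rows[key] = dict(entity)

    def get(self, partition_key: str, row_key: str) -> dict:
        """Point lookup by the two keys."""
        return self._rows[(partition_key, row_key)]

    def query_partition(self, partition_key: str) -> list:
        """Return all entities of one partition, ordered by RowKey."""
        items = sorted(self._rows.items(), key=lambda kv: kv[0])
        return [entity for (pk, _), entity in items if pk == partition_key]


table = InMemoryTable()
# Two entities in the same partition with different properties.
table.upsert({"PartitionKey": "customers", "RowKey": "001", "Name": "Asha"})
table.upsert({"PartitionKey": "customers", "RowKey": "002",
              "Email": "b@example.com", "Age": 41})
print(table.query_partition("customers"))
```

Because no schema is declared up front, a new property can be added to later entities without migrating the existing ones.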
Storage Account Kinds
We will now discuss the different kinds of storage accounts. Each kind provides different features and has a different pricing model, so we need to compare the features and pricing of the account kinds before choosing one.
I have listed them below; let us go through them one at a time.
- General-purpose v2 accounts: Basic storage account type for blobs, files, queues, and tables. Recommended for most scenarios using Azure Storage
- General-purpose v1 accounts: Legacy account type for blobs, files, queues, and tables. Use general-purpose v2 accounts instead when possible
- Block blob storage accounts: Blob-only storage accounts with premium performance characteristics. Recommended for scenarios with high transaction rates, using smaller objects, or requiring consistently low storage latency
- FileStorage (preview) storage accounts: Files-only storage accounts with premium performance characteristics. Recommended for enterprise or high-performance scale applications. This kind is in preview and is only available when the Premium performance tier is chosen
- Blob storage accounts: Blob-only storage accounts. As the name suggests, they do not provide files, queues, or tables storage.
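The list above can be condensed into a lookup from account kind to the data services it supports. In this Python sketch the keys are the `kind` identifiers Azure uses for these account types, while the `kinds_supporting` helper is a hypothetical convenience function.

```python
# Which data services each storage account kind supports, per the list above.
ACCOUNT_KINDS = {
    "StorageV2": {"blob", "file", "queue", "table"},         # general-purpose v2
    "Storage": {"blob", "file", "queue", "table"},           # general-purpose v1 (legacy)
    "BlockBlobStorage": {"blob"},                            # premium block blobs
    "FileStorage": {"file"},                                 # premium files
    "BlobStorage": {"blob"},                                 # blob-only
}


def kinds_supporting(*services: str) -> list:
    """Return the account kinds that support every requested service."""
    needed = set(services)
    return [kind for kind, supported in ACCOUNT_KINDS.items() if needed <= supported]


print(kinds_supporting("blob", "queue"))
```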
Securing Storage Accounts
It is very important to discuss the security aspects of storage accounts and how we can protect our data. We can secure Azure Storage by implementing authorization and data encryption.
Azure Active Directory (Azure AD) integration for blob and queue data. We can authenticate and authorize requests to the Blob and Queue services with Azure AD credentials using role-based access control (RBAC).
Azure AD authorization over SMB for Azure Files (preview). We can use Azure AD identities to access Azure Files, as it supports identity-based authorization over SMB (Server Message Block) through Azure AD.
Authorization with Shared Key. Azure Blob, Queue, Table, and Files also support authorization with Shared Key. Two keys are created for each storage account, a primary and a secondary, and either one grants complete access. Each request is signed with the key, and the signature is passed in the request's Authorization header.
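The signing step can be sketched in Python as follows. Shared Key uses HMAC-SHA256 over a "string to sign" with the Base64-decoded account key, and the result goes into a header of the form `SharedKey <account>:<signature>`. The real string to sign is assembled from the HTTP verb, several headers, and the canonicalized resource; the short string used here is a simplified placeholder, and the keys are dummy values.

```python
import base64
import hashlib
import hmac


def sign_request(string_to_sign: str, account_key_b64: str) -> str:
    """Return the Base64 HMAC-SHA256 signature of the string to sign."""
    key = base64.b64decode(account_key_b64)
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def authorization_header(account: str, string_to_sign: str, account_key_b64: str) -> str:
    """Build the value of the Authorization header for a Shared Key request."""
    return f"SharedKey {account}:{sign_request(string_to_sign, account_key_b64)}"


# Dummy keys for illustration; real keys come from the storage account.
primary_key = base64.b64encode(b"primary-demo-key").decode("ascii")
secondary_key = base64.b64encode(b"secondary-demo-key").decode("ascii")

# Simplified placeholder, not the full canonical string a real request needs.
to_sign = "GET\n/myaccount/images/cat.png"
print(authorization_header("myaccount", to_sign, primary_key))
```

Since either key produces a valid signature, one key can be regenerated while clients keep working with the other.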
Authorization using shared access signatures (SAS). A shared access signature (SAS) is a time-bound string containing a security token that is appended to the URI of a storage resource. It grants access to the resource for a limited period of time, after which the signature expires and access is no longer granted.
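The idea of a time-bound, signed token appended to a URI can be sketched in Python. Real SAS tokens do carry `sp` (permissions), `se` (expiry), and `sig` (signature) query parameters, but they include further fields and a precisely specified string to sign. The version below is a deliberately simplified imitation, and `make_sas_like_url` and `is_valid` are hypothetical helpers that would not be accepted by Azure.

```python
import base64
import hashlib
import hmac
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlencode, urlsplit

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def _signature(key: bytes, resource_url: str, permissions: str, expiry: str) -> str:
    # Toy string to sign; the real SAS format defines more fields.
    to_sign = "\n".join([resource_url, permissions, expiry])
    digest = hmac.new(key, to_sign.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def make_sas_like_url(resource_url: str, key: bytes, permissions: str,
                      expiry: datetime) -> str:
    """Append permissions, expiry (UTC), and a signature to the resource URL."""
    se = expiry.strftime(TIME_FORMAT)
    sig = _signature(key, resource_url, permissions, se)
    return resource_url + "?" + urlencode({"sp": permissions, "se": se, "sig": sig})


def is_valid(url: str, key: bytes, now: datetime) -> bool:
    """Accept the URL only if the signature matches and it has not expired."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    resource_url = f"{parts.scheme}://{parts.netloc}{parts.path}"
    sp, se, sig = query["sp"][0], query["se"][0], query["sig"][0]
    expected = _signature(key, resource_url, sp, se)
    expiry = datetime.strptime(se, TIME_FORMAT).replace(tzinfo=timezone.utc)
    return hmac.compare_digest(sig, expected) and now < expiry


key = b"demo-signing-key"  # dummy key for illustration
issued = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
url = make_sas_like_url("https://myaccount.blob.core.windows.net/images/cat.png",
                        key, "r", issued + timedelta(hours=1))
print(is_valid(url, key, issued), is_valid(url, key, issued + timedelta(hours=2)))
```

Because the expiry and permissions are covered by the signature, a holder of the URL cannot extend its lifetime or widen its permissions without the key.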
Anonymous access to containers and blobs. We can also configure containers and blobs for anonymous access, which means that anyone can read them without authorization.
Apart from the above, encryption also helps protect data and meet organizational security policies. With Storage Service Encryption (SSE), data is encrypted at rest: encryption and decryption happen automatically when data is written or read. SSE is enabled in all performance tiers, both standard and premium.
Another type of encryption is client-side encryption, in which we encrypt the data programmatically using the storage client libraries. The library encrypts the data before sending it and decrypts it when it is read back.