CodeCloudy

Azure | .Net | JQuery | Javascript | Umbraco

Understanding Windows Azure Tables comparing with Relational SQL Tables (on-premise/SQL Azure)

Table of Contents

  • Background
  • A comparison of Relational SQL Table and Azure Storage Table
  • Azure Storage Table Concept
  • Understanding Entity
  • Understanding Properties by example
  • Partitions
  • Key things to remember

Background

Windows Azure Storage is a storage option provide by Windows Azure. Using this technology will bring all the benefits of Windows Azure such as High Availability, Scalability, Security, etc. Azure tables were specially designed to store non-relational data that needs some structure, but highly scalable. All the values of Azure tables are stored as Key-Value pairs which makes it a highly scalable storage option.

When someone starts to learn about windows azure table storage, the first thing that will come to his mind will be “Why do we need this when we have Sophisticated Relational SQL Server Tables?”

Therefore, the best way to understand about Windows Azure Tables will be, start thinking about a relational SQL table and learn the differences.

A comparison of Relational SQL Table and Azure Storage Table

Relation SQL table Azure Storage table
Can have Relationships between tables There are no relationships between tables. Follows NoSQL Concepts
Has Rows of values Considered as Entities
Strictly defined Columns & enforces data integrity There is no Concept of Columns, instead values are stored as key-value pairs
Each row has fixed amount of columns Each entity can have Zero or more properties
Each data/value will have a data type which corresponds to the column definition Each data/value are independent & can have different data types
No build-in System columns 3 System properties for each entity (PartitionKey, RowKey, Timestamp)
Merging data from multiple sources that has different formats – may need to be converted programmatically Have the option of saving data in a different formats.Example:Source 1: 01/01/2001Source 2: “2001” (only the year as a String)
Ability to Query – TSQL Limited Query options
Can apply indexes to any column Can’t do that. Is only available for PartitionKey & RowKey built-in
Geographically redundant price (as of 2014-Mar-09) – $9.99 Upto 1GB per Month Cheaper – $0.095 per GB per Month
When comparing only on-premise SQL More Availability
SQL Azure Max Capacity (as of 2014-Mar-09) – 150 GB 200TB

Azure Storage Table Concept

In an Azure storage table a collection of Properties (key-values pairs) will form an Entity. Those entities will be meaningfully grouped in to partitions using PartitionKeys. Ultimately, the collection of entities that were grouped in to partitions will form the Azure Storage Table.

Understanding Entity

An entity has 3 system properties:

  1. PartitionKey – means of partitioning entities for load balancing with the rule of thumb “Data that need to be Queried together, must be kept together”
  2. RowKey – This is a key that helps to uniquely identify an entity that belongs to a partition. So, RowKey together with PartitionKey will make an Entity unique in a given azure table.
  3. TimeStamp – When you first create an entity, the Date & Time of creation will be recorded in this property. And later if you do any change to that entity, the value will get updated with the last updated date & time.

Those system properties are built-in indexed and it is recommended to use these properties as identity columns so it is efficient. Non-system values cannot be indexed.

The paritionKey together with RowKey will form unique entities:


Understanding Properties by example

Example 1: No fixed Schema for storing values

In a relation SQL table all those values that is entered under column “Age” will be bound to the limitations defined in the column definition/design such as Data type, max length, unique key, etc.

Now let’s try to store those values in an Azure Storage Table. Now, those values are not bound to a particular column. (There is no concept of columns)

Each and every values are stored as key-value pairs. Unlike in relational SQL tables, those values can be of different data types as well.

This design has enable some new scenarios that were not possible before.

For example, let’s say we store date values in relational SQL table we have to store them in a standard format. But in SQL Azure table, we can store the date as string for some entities if needed.

Example 2:

There is a requirement to create a simple consolidated search engine of products that are referred from two different ecommerce stores. Store 1 stores Product Expiry Dates as DateTime data type (1900-01-01 00:00:00). On the other hand, Store 2 stores only Year and Month, but as Strings (“1900/June”). If we are doing this using relational SQL, we will have to assume a default date for Store 2 (may be logically incorrect according to the product type) and programmatically convert each and every date stored either as a separate column or do it when querying. But, we can push products from both the stores to Azure tables without any of these effort. (Store 1 values stored as DateTime, Store 2 values stored as String)

Moreover, if we take a particular entity, it can have Zero or more properties.

Example 3

In the above example,

  • First entity will have Zero properties
  • Third entity will have 3 properties (“First”,”Last”,”Birthdate”)
  • Second entity will have 4 properties (“First”,”Last”,”Birthdate”,”Sport”); the property “Sport” only belongs to the second entity and will not be applicable for other entities unless other entities have explicitly defined a property with the same name.
  • PartitionKey together with RowKey will form unique entity in Azure storage table. Although first & second entities have the RowKey “001” since they have different paritionKeys they will be unique inside the table.

Properties supports following data types:

  • Binary
  • Bool
  • DateTime
  • Double
  • GUID
  • Int32
  • Int64
  • String

Partitions

Azure storage tables are partitioned by the system property called “PartitionKey”.

Entities that were partitioned with the same ParitionKey will belong to the same load balancing group. On other words, anything within the same partition will live in the same server (same storage machine), so they can be queried/accessed faster.

So, as a rule of thumb when we design for Azure Storage tables we need to remember that “Data that need to be queried together, must be kept together in the same partition”.

Since, we cannot index non-system properties, the key idea of azure table is that design it in a way so that we make use of the concept of partitioning for load balancing. So, the art of choosing how to partition entities will depend on the requirements, transactions, and mainly on what kind of Queries that we are going to use on the azure table.

Following are some examples of how developers have used azure tables and how they have paritioned according to different requirements:

In this part of the table, we can we the entities have been partitioned using their document name.

The different versions of the same document are stored using different RowKeys.

In this case, all the versions of a particular document will be stored together. So, retrieving the versions of the same document will be very fast.

As you can see ParitionKey or RowKey doesn’t have to be a GUID or Integer they can be any data type. In the following example, Name is considered as the RowKey.

Following is a good example that describes how independent that values are stored for each entity as properties.

Key things to remember

In summary, when we design for azure storage tables with the concept of partitions, following are some key concerns:

  • First of all, know what you cannot do with Azure storage tables. It is not a replacement for relational SQL tables. In different cases, it can reside independently or it can work together with relational SQL tables to form a system. The latter case is the obvious.
  • Know the transactions; it is important to know what are the entities that are going to be updated together.
  • Know your Queries. It is important to know what are the entities that are going to be queried/accessed together.
  • Therefore, depending on the above concerns, we have to partition data.
  • Sometimes, if we design our tables to server some key queries, it may not possibly cater some other queries, which we will have to look for workarounds. In Relational SQL we have really good guides and best practices. In contrast, there is less direct guidance on designing such NoSQL based tables. So, it’s more of an Art!
Leave a comment »

How can store json data in azure tables?

There is no direct Art for designing noSQL schema. It basically will depend on the requirement, mainly on how we are going to use it and query it. You can store the Json file in a blob as .js file if you just wan’t access it programatically. Or else if you want to make use of the azure table features and query capabilities, then you may try to convert json in to azure properties. So, as said it depends on how you want to use it and the limitations of azure tables.

Leave a comment »

Questions from the Windows Azure Storage – Session 01 – Sri Lanka .NET Forum

I must say we had really good participants who were really interested and keen on learning new technologies. There are some of the interesting questions that they asked.

Question: Is SQL Azure and Azure Storage are same?

Reply: Not Really. SQL Azure is simply the Cloud version of the on-premise SQL Server but with some limitations.

Can we use Visual Stuido 2012 to create azure applications?

Yes you can. But, azure releases new feature frequently. so most of the visual studio tools will be included with the latest Visual Studio versions. So, its always better to use the latest.

Question: How can store json data in azure tables?

Reply: There is no direct Art for designing noSQL schema. It basically will depend on the requirement, mainly on how we are going to use it and query it. You can store the Json file in a blob as .js file if you just wan’t access it programatically. Or else if you want to make use of the azure table features and query capabilities, then you may try to convert json in to azure properties. So, as said it depends on how you want to use it and the limitations of azure tables.

Answers for Other Questions:

https://codecloudy.wordpress.com/2014/03/07/does-azure-tables-internally-use-relational-sql-to-store-its-values/

https://codecloudy.wordpress.com/2014/03/07/can-we-have-indexes-for-properties-other-than-the-system-properties-partitionkey-rowkey/

Some of the Questions were really interesting. And i will discuss some of those in details in my future forum Sessions.

Leave a comment »

Can we have indexes for properties other than the system properties PartitionKey & RowKey?

No. You cannot have indexes for non-system properties. The key idea of azure table is that design it in a way so that we make use of the concept of partitioning for load balancing. So, the art of choosing how to partition entities will depend on the requirements, transactions, and mainly on what kind of Queries that we are going to use on the azure table.

System properties

  1. PartitionKey – means of partitioning entities for load balancing with the rule of thumb “Data that need to be Queried together, must be kept together”
  2. RowKey – This is a key that helps to uniquely identify an entity that belongs to a partition. So, RowKey together with PartitionKey will make an Entity unique in a given azure table.
  3. TimeStamp – When you first create an entity, the Date & Time of creation will be recorded in this property. And later if you do any change to that entity, the value will get updated with the last updated date & time.

 

uniqueKeyOfEntity

 

Leave a comment »

Does Azure tables internally use relational SQL to store its values?

Storage Emulator which comes with the Azure SDK, will simulate the Azure Cloud Storage environment in the local Development environment. And it uses sqlExpress/localdb with NTFS files systems to simulate this environment. It creates this resources the first time when we run a Cloud Project using visual studio.

StorageEmulatorArchitecture

But this is only for the simulation by the Storage Emulator.

Note: In earlier azure versions after we start the storage emulator some temporary databases will be created in the sqlExpress. With new versions of SQL tools sqlExpress doesn’t run as service anymore. I couldn’t see the databases using Visual Studio Server Explorer. But, i found those temporary database files (mdf & log) in the user profile folder (C:\User\UserName\).

DevelopmentStorageDB

The actual Azure storage will store data in a completely different way without using SQL Server architecture.

BasicAzureStorageInternalArchitecture

this is a very basic view of the internal architecture. So, we have Front Ends which will authenticate & authorize incoming requests and then will route those requests to partition layer. Partition servers in the partition layer will manage all the partitions in the server.

Leave a comment »

Windows Azure Storage – Session 01 – Sri Lanka .NET Forum

Areas Covered:

 

  • —Introduction to Windows Azure Storage
  • —How to get a free account for Azure
  • —How Azure storage enables you to build highly scalable applications
  • —NoSQL with Azure storage & Comparison of Relational SQL
  • —Hands on azure portal
  • —Azure storage tools
  • —Demo on Azure storage, emulator, access programatically

Material of the Session can be downloaded via the following link:

https://onedrive.live.com/redir?resid=CBB792500B0D7760!207&authkey=!AIG9Tk5qer-wPBg&ithint=folder%2c.zip

The above link is also stored using windows azure storage blobs which we discussed in the session.

In future sessions we will try to cover more with azure technologies such as how to start with cloud applications, push notifications with Windows 8 & Windows Mobile, access control services, SQL Azure, Azure Virtual Machines, Web sties, Mobile Services, Cloud services, Service Bus, etc.

Hope you gained something from the First Session on Azure. Cheers!

Leave a comment »

Error SQL71564 – When Exporting your Database to Azure (bacpac)

When you use the Export Data-tier feature to create a backup of you database in order to upload it to windows azure you might get this error

Error SQL71564 –

The database principal owns a schema in the database, and cannot be dropped

Solution: to drop the user.

Sql Script to remove the schema from the user:

   ALTER AUTHORIZATION ON SCHEMA::db_owner TO db_owner

you can specify the required schema instead of db_owner

Done!

 

Leave a comment »

Visual Studio 2012 RC does not have setup and deployment project template

The basic setup and deployment project template that was with the visual studio 2010 was not included in the visual studio 2012 RC.

instead it had a third party add-in that need to be installed called “InstallShield”.

even though this had many feature, it was so expensive. And i was trying to contact the support team for some clarification about the feature of the product in order to purchase it. but, i didn’t get a proper reply from them.

in order to try it, first you need to download it. and Restart visual studio 2012 RC. and there will be a new template available.

 

Requirement was to create a setup file that will:

  • create a database and run some sql scripts – configuration setting need to given as parameters in the wizard in deployment time
  • create a website and a web service – configuration setting need to given as parameters in the wizard in deployment time
  • if something fails, rollback all
  • Need to update the website and web service with user enterd sql connection settings.

 

Possible alternatives:

  • InstallShield: expensive
  • InstallAware: less expensive than InstallShield
  • Web Deploy 3.0: Can only be useful for web deployment only. But, still manual work need to be done in the installation. Can’t do it as a wizard installation.
  • MsBuild: is a command line tool, which visual studio uses for its operations in background. Could do all the things we need, but we need to create some GUI to get the parameters that need to be configured and some transaction handling c# code.

still investigating possible options. i will update this post with what we ended up doing.

cheers!

 

Leave a comment »

“Available Memory” in Task Manager

 

The “Available Memory” counter in Task Manager is actually the sum of the following three internal memory lists:

  • The Standby list.
  • The Free list.
  • The Zero list.

Available Bytes is the amount of physical memory available to processes

running on the computer, in bytes. It is calculated by summing space on
the Zeroed, Free, and Standby memory lists. Free memory is ready for
use; Zeroed memory is pages of memory filled with zeros to prevent later
processes from seeing data used by a previous process. Standby memory is
memory removed from a process’s working set (its physical memory) on
route to disk, but is still available to be recalled.

ref:

http://support.microsoft.com/kb/312628

http://support.microsoft.com/kb/184063

Leave a comment »

Sort a list of strings numerically using LINQ

Lets take a case where you need to sort an array of numbers that are strings.

And the list need be sorted numerically.

One thing to note is that this is useful where you cannot convert them to int list.

 
typical way:
yourStringList.OrderBy(x => x.yourSortAttribute)
this will not sort it numerically. because they are strings.

so you need to give a comparer called SemiNumericComparer.
yourStringList.OrderBy(x => x.yourSortAttribute, new SemiNumericComparer())
 
Your have write a custom IComparer<T> class to overcome this issue. Add this class to you class library:

/// <summary>
/// Semi Numeric Comparer
/// </summary>
public class SemiNumericComparer : IComparer<string>
{
/// <summary>
/// Compares the specified s1.
/// </summary>
/// <param name=”s1″>The s1.</param>
/// <param name=”s2″>The s2.</param>
/// <returns>the specified s1.</returns>
public int Compare(string s1, string s2)
{
if (Convert.ToInt32(s1, CultureInfo.InvariantCulture) > Convert.ToInt32(s2, CultureInfo.InvariantCulture))
{
return 1;
}

if (Convert.ToInt32(s1, CultureInfo.InvariantCulture) < Convert.ToInt32(s2, CultureInfo.InvariantCulture))
{
return -1;
}

if (Convert.ToInt32(s1, CultureInfo.InvariantCulture) == Convert.ToInt32(s2, CultureInfo.InvariantCulture))
{
return 0;
}

return string.Compare(s1, s2, true, CultureInfo.InvariantCulture);
}
}

this should work now. Cheers! :)
for more information about the IComparer: http://msdn.microsoft.com/en-us/library/8ehhxeaf.aspx
Leave a comment »