Posts Tagged ‘Cloud’

Creating Excel Interactive View July 24th, 2014

Vinod Kumar

I have been wanting to write on this topic for ages but seem to have missed out for one reason or another. How many times have you seen a web page with a bunch of tables that are simply boring to read? The tables might sometimes offer sorting, but they lack any striking visualization, to say the least. So I am going to borrow a table from a Wikipedia page about the population of India. There are a number of tables there, and the one of interest is the literacy rate table. The rough table looks like this:

State/UT Code | India/State/UT | Literate Persons (%) | Males (%) | Females (%)
01 | Jammu and Kashmir | 86.61 | 87.26 | 85.23
02 | Himachal Pradesh | 83.78 | 90.83 | 76.60
03 | Punjab | 76.6 | 81.48 | 71.34
04 | Chandigarh | 86.43 | 90.54 | 81.38

Well, this is as boring as it can get, even when pasted as-is on this blog. Now here is the trick we are going to use, called Excel Interactive View. As the name suggests, we are going to use the power of Excel to turn this mundane table into some fancy charts for analysis. All it takes is a couple of scripts added alongside the HTML table, and we are done. It is really as simple as that. So let me add the complete table with the script added. Just click on the button provided above to see the magic:

State/UT Code | India/State/UT | Literate Persons (%) | Males (%) | Females (%)
01 | Jammu and Kashmir | 86.61 | 87.26 | 85.23
02 | Himachal Pradesh | 83.78 | 90.83 | 76.60
03 | Punjab | 76.6 | 81.48 | 71.34
04 | Chandigarh | 86.43 | 90.54 | 81.38
05 | Uttarakhand | 79.63 | 88.33 | 70.70
06 | Haryana | 76.64 | 85.38 | 66.77
07 | Delhi | 86.34 | 91.03 | 80.93
08 | Rajasthan | 67.06 | 80.51 | 52.66
09 | Uttar Pradesh | 69.72 | 79.24 | 59.26
10 | Bihar | 63.82 | 73.39 | 53.33
11 | Sikkim | 82.20 | 87.29 | 76.43
12 | Arunachal Pradesh | 66.95 | 73.69 | 59.57
13 | Nagaland | 80.11 | 83.29 | 76.69
14 | Manipur | 79.85 | 86.49 | 73.17
15 | Mizoram | 91.58 | 93.72 | 89.40
16 | Tripura | 87.75 | 92.18 | 83.15
17 | Meghalaya | 75.48 | 77.17 | 73.78
18 | Assam | 73.18 | 78.81 | 67.27
19 | West Bengal | 77.08 | 82.67 | 71.16
20 | Jharkhand | 67.63 | 78.45 | 56.21
21 | Odisha | 72.9 | 82.40 | 64.36
22 | Chhattisgarh | 71.04 | 81.45 | 60.59
23 | Madhya Pradesh | 70.63 | 80.53 | 60.02
24 | Gujarat | 79.31 | 87.23 | 70.73
25 | Daman and Diu | 87.07 | 91.48 | 79.59
26 | Dadra and Nagar Haveli | 77.65 | 86.46 | 65.93
27 | Maharashtra | 83.2 | 89.82 | 75.48
28 | Andhra Pradesh | 67.66 | 75.56 | 59.74
29 | Karnataka | 75.60 | 82.85 | 68.13
30 | Goa | 87.40 | 92.81 | 81.84
31 | Lakshadweep | 92.28 | 96.11 | 88.25
32 | Kerala | 93.91 | 96.02 | 91.98
33 | Tamil Nadu | 80.33 | 86.81 | 73.86
34 | Puducherry | 86.55 | 92.12 | 81.22
35 | Andaman and Nicobar Islands | 86.27 | 90.11 | 81.84
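
The Excel Interactive View script tags themselves are not reproduced in this post, so they are left out here. What they attach to, though, is an ordinary HTML table. As a minimal sketch, assuming you hold the rows in Python and want to emit that table, something like the following would do. The `table_id` value and the function name are hypothetical, and only three rows are included for brevity.

```python
from html import escape

# Header and a few rows copied from the literacy table above.
HEADER = ["State/UT Code", "India/State/UT", "Literate Persons (%)",
          "Males (%)", "Females (%)"]
ROWS = [
    ("02", "Himachal Pradesh", "83.78", "90.83", "76.60"),
    ("03", "Punjab", "76.6", "81.48", "71.34"),
    ("04", "Chandigarh", "86.43", "90.54", "81.38"),
]

def render_table(header, rows, table_id="literacy"):
    """Return a plain HTML <table>; table_id is a hypothetical id."""
    head = "".join(f"<th>{escape(h)}</th>" for h in header)
    body = "\n".join(
        "  <tr>" + "".join(f"<td>{escape(c)}</td>" for c in row) + "</tr>"
        for row in rows
    )
    return f'<table id="{escape(table_id)}">\n  <tr>{head}</tr>\n{body}\n</table>'

html_table = render_table(HEADER, ROWS)
print(html_table)
```

The button and script from the Excel Interactive View documentation would then be placed next to this table in the page.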

So how cool is this Excel visualisation? I am sure you will want to build or use this capability in your webpages or internal sites in your organizations too. I hope you learnt something really interesting.

If you want to learn more about using this feature with your datasets and web pages, read the Excel Interactive View documentation.

PS: The data comes from Wikipedia, and I have just used a snapshot to show the feature. So please don’t read too much into the data itself; look at the Excel view capabilities.



Managed Databases on Cloud July 18th, 2014

Vinod Kumar

Recently my good friend and colleague Govind wrote about what customers look for when it comes to the cloud and working with Azure. The fundamental tenets that customers look at for the cloud, be it PaaS, SaaS or IaaS, have been around:

  1. Reduced Maintenance headaches
  2. SLA backed for HA/DR
  3. Performance
  4. Synchronization with on-prem
  5. Security
  6. Backups
  7. No worry about hardware

and a few more. But for the most part, the above covers the questions we get. In a recent conversation, I had to outline some of the options for a customer’s backup requirements, which I thought was worth sharing here. I am looking at this from an Azure standpoint:

For IaaS:

  1. You will need to use SQL Server Agent and build maintenance plans that can be automated. This can be scripted (PowerShell, T-SQL or others) and done for all workloads.
  2. From SQL Server 2012 SP1 CU2 onwards, we can use the Backup to URL option, wherein a SQL Server instance running on an Azure VM points its backups at blob storage. I wrote about this a while back and you can try the same.
  3. SQL Server 2014 also supports encrypted backups to blob storage, and the same article covers that too.
  4. Also from SQL Server 2014, we have the option to configure managed, automated backups. These take backups to blob storage automatically, either at a predefined time or based on the workload pattern. See the product documentation for details.
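
To make the Backup to URL option a little more concrete, here is a minimal sketch that only builds the T-SQL text for such a backup. The database, storage account, container and credential names are all placeholders I made up, and the sketch does no input sanitising, so treat it as an illustration of the statement shape rather than a ready-made script.

```python
def backup_to_url_sql(database, storage_account, container, credential):
    """Build a T-SQL BACKUP DATABASE ... TO URL statement.

    All argument values are placeholders chosen by the caller; the
    credential must already exist on the SQL Server instance.
    """
    url = (f"https://{storage_account}.blob.core.windows.net/"
           f"{container}/{database}.bak")
    return (
        f"BACKUP DATABASE [{database}]\n"
        f"TO URL = '{url}'\n"
        f"WITH CREDENTIAL = '{credential}', COMPRESSION;"
    )

# Hypothetical names, for illustration only.
statement = backup_to_url_sql("SalesDb", "mystorageacct", "backups",
                              "AzureBackupCred")
print(statement)
```

The generated statement could then be scheduled through a SQL Server Agent job, in line with the first option in the list.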

For PaaS:

  1. Since consistency is already taken care of in the Azure world, we don’t have to worry about it.
  2. For the Basic, Standard and Premium editions, there are SLAs for point-in-time recovery of 7, 14 and 35 days respectively. You can read more about this in the documentation. I highly recommend using PowerShell scripts to automate this, if you plan to use them.
  3. In the past, I have also seen customers use the Database Copy functionality to keep a copy of their database in a ready-to-use state every couple of days. This gives them an opportunity to go back to that version immediately without any problems, so it is also a viable option if you would like to use it. Since point-in-time restores are available, I am more inclined to use those for cold standby and restores. Having said that, we can still use Database Copy to create copies of our production databases for Dev and Test environments.
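
As a quick way to reason about the retention windows quoted above, the sketch below checks whether a desired restore point still falls inside the window for a given edition. The 7, 14 and 35 day values are simply the ones from this post and may have changed since; the function name and the sample dates are my own.

```python
from datetime import datetime, timedelta

# Point-in-time retention in days per edition, as quoted in the list above.
RETENTION_DAYS = {"Basic": 7, "Standard": 14, "Premium": 35}

def can_restore_to(tier, restore_point, now):
    """Return True if restore_point falls inside the tier's retention window."""
    window = timedelta(days=RETENTION_DAYS[tier])
    return now - window <= restore_point <= now

now = datetime(2014, 7, 18, 12, 0)
ten_days_ago = now - timedelta(days=10)

basic_ok = can_restore_to("Basic", ten_days_ago, now)        # outside 7 days
standard_ok = can_restore_to("Standard", ten_days_ago, now)  # inside 14 days
print(basic_ok, standard_ok)
```

If the restore point you need is older than the window for your edition, a periodic Database Copy is the fallback described in the last item.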

These are my customer notes, and I plan to publish such notes from time to time here on my blog. Since we are talking about Azure, I am sure some of these capabilities and SLAs will change over time. So please keep an eye on the documentation for the latest values.



Big Data – Big Hype yet Big Opportunity February 14th, 2012

Vinod Kumar

“Big Data” seems to be the buzzword everywhere, and the number of blogs on this very topic has been growing exponentially. Let me take a step back to understand what to expect. Even at India TechEd 2012 we plan to cover this very topic under the Architect track. Personally, I am really excited to see this topic discussed from multiple angles. As budding architects, there are tons of things to look out for. Refer to my previous post on the Architecture track coming your way. So at TechEd India we will have speakers discuss the problem statements and the possible solutions, with recommendations on architecture. In this blog post, I will touch on only some of them; I am not going to steal the awesome content they are lining up :).

Where does Big Data fit? Datasets that exceed the boundaries and size of normal processing capabilities forcing you to take non-traditional approaches.

Fundamental Problem

I had been wanting to write about this topic for a while, and I discovered that the SQL community is anyway running T-SQL Tuesday on this very topic. Now, with the announcements at SQL PASS and Microsoft’s investments in this space, this is a huge deal.

When we talk about Big Data we are fundamentally looking at 3 basic dimensions:

  1. Large Data (in the range of petabytes to exabytes and more)
  2. Complex Data (Write once – read many times, Dynamic Schema data)
  3. Unstructured data (Text mining, Images, Videos, Logs)

And these are the same problems we currently have in the industry when it comes to database and data store systems. Systems dealing with RFID tags, web logs, sensors, medical images, telecom and public sector databases are all grappling with this problem.

Where to start?

Hadoop started as a way to quickly process Web log files. Web 2.0 sites were finding that they were accumulating logs that contained valuable click information and user behavior data. As an alternative to parsing log data and storing it in a relational database, Hadoop emerged as a way to keep the log files in their original format and allow processing and analysis.
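
The idea of processing raw log files in place can be sketched in plain Python as a map step followed by a reduce step. This is only an in-memory illustration of the MapReduce pattern, not Hadoop code; the log layout and the sample lines are hypothetical.

```python
from collections import defaultdict

# Hypothetical log lines in a simplified "ip method path" layout.
LOG_LINES = [
    "10.0.0.1 GET /home",
    "10.0.0.2 GET /products",
    "10.0.0.1 GET /home",
    "10.0.0.3 GET /cart",
]

def map_phase(line):
    """Emit one (path, 1) pair per log line."""
    _ip, _method, path = line.split()
    yield path, 1

def reduce_phase(pairs):
    """Sum the counts for each path."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

pairs = (pair for line in LOG_LINES for pair in map_phase(line))
hits = reduce_phase(pairs)
print(hits)
```

In a real cluster the map and reduce steps run on many machines over files that stay in their original format, which is the point made above.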

Though the basic concept is simple and powerful, let me link to a basic explanation in the post Pinal Dave wrote today. He takes a stab at demystifying the basics of Hadoop, Pig, Hive and MapReduce. Feel free to read more on them:

  1. Pig – A high-level language that lets non-programmers use Hadoop
  2. Hive – An SQL query implementation for Hadoop
  3. HBase – A key/value store for Hadoop

One other resource I would like to point to in this context is Cloudera, for its learning resources. Cloudera is a for-profit company that produces integrated, tested, and commercially supported Hadoop releases. Look at some of the other extensions they support; some of the new releases make for an interesting read.

  • Hue – Hadoop user interface
  • Sqoop – tool to import relational data
  • Flume – tool to import nonrelational data
  • Oozie – workflow engine and many more.

Relational or DW Database Obsolete?

Personally, I don’t think we are talking about this-or-that Boolean approach here. There is something that makes these concepts of Hadoop interesting and viable for organizations to start considering. Let me call out some of them (not exhaustive though)-

  1. Hadoop clusters can be on x86 commodity hardware
  2. No need to build cubes for predictive analysis of large data
  3. Relational DBs have their own limits in scale-out and scale-up scenarios
  4. Adding scale-out capacity is easy with Hadoop

With this steady stream of data, is this what the industry is also looking for? Check the McKinsey Global Institute paper, Big Data: The next frontier for innovation, competition, and productivity; the numbers are mind-blowing.

  • 1.5 million more data-savvy managers in the US alone
  • 140,000-190,000 deep analytical talent positions
  • €250 billion Potential annual value to Europe’s public sector
  • 15 out of 17 sectors in the US have more data stored per company than the US Library of Congress

Read the whitepaper; there are many more statistics that make Big Data seem really big. Now take examples of big data patterns at sites like Facebook or Twitter, with millions of data streams coming in every minute, and you want some analytics. Does this Big Data architecture qualify here? Or do you need different architectural choices? Well, don’t forget to tune into our India TechEd Architecture track for the details :).

Microsoft Integration Points

From Microsoft, you are going to see a lot of work happen here, since this is all about data. Integrations with applications like Excel, PowerPivot, Power View, SQL Server Analysis Services and SQL Server Reporting Services are some of what we have seen in the recent past at SQL PASS. More about this can be read on the MS Big Data home site.

Channel-9 Video: Lynn Langit and Dave Nielsen discuss "Big Data" in the Cloud

MSR Research Paper on Big Data – gives a nice read

Another Research Paper: Big Data and Cloud Computing: New Wine or just New Bottles?

What we can see is that, as we get to know this more recent phenomenon of Big Data, even the cloud seems to embrace it with both hands. You are going to see some serious integration across the platform, and it is a great sign for us:

  1. Connectors for Hadoop, integrating it with SQL Server and SQL Server Parallel Data Warehouse.
  2. An ODBC driver for Hive, permitting any Windows application to access and run queries against the Hive data warehouse.
  3. For developers, the addition of a JavaScript layer to the Hadoop ecosystem is very compelling.
  4. An Excel Hive Add-in, which enables the movement of data directly from Hive into Excel or PowerPivot.

Where to start

I highly recommend using the Apache Hadoop on Windows wiki; please bookmark it. Within the Microsoft ecosystem, there are 3 other interesting reference pages you don’t want to miss.

On-Premise Deployment of Apache Hadoop for Windows

Windows Azure Deployment of Apache Hadoop for Windows

Windows Azure Deployment of Hadoop on the Elastic Map Reduce (EMR) Portal

This forms a great ecosystem from on-premise to the cloud. As part of this whole bundle of links, I couldn’t resist linking to Rob Farley, who has been kind enough to point out that Big Data now features in 24 Hours of PASS too. Nice timing to talk more and more about Big Data.


Personally, I see tons of learning around Big Data coming our way, and 2012 will start the same kind of conversation that we started about BI around 2005. So get prepared for some Big Hype, Big Challenges, Big Insights and a Big Year of Big Data coming your way.



Cloud Computing – Trends watch December 26th, 2011

Vinod Kumar

Cloud Computing is surely a buzzword, and it is catching on in the IT industry slowly but surely. As part of my work, I meet a lot of Indian customers asking for more when it comes to consuming and evaluating various cloud implementation strategies. Yes, cost is one of the dimensions on which this is evaluated, but the crux is: have you designed for the cloud? I will spare the thought of designing for the cloud for a different discussion some other day. But let me call out some of the trends I am seeing that are worth investing our time in when it comes to learning the cloud phenomenon in the coming year !!!

Image source: Dilbert.com

Cloud as Disaster Site

This is quite a viable option to think about, as most companies are looking at options to move their online backups to the cloud, where storage is cheaper and comes with redundancy. The traditional DR site existed only for business continuity and needed dedicated processes and replication infrastructure to be maintained at a different site. Not to mention the human resources, electricity, A/C, infrastructure and other costs of maintaining a dead weight. So this trend will be something to look out for in the future, especially when ingress traffic is free with most cloud providers :).

A word of caution here: look at the SLA the cloud providers give you, even for storage. Storage may be cheap, but in case of a disaster, look at the recovery time requirements you have agreed with your customers. Restoring TBs of backups will take considerable time, so don’t make assumptions in your recovery strategy !!!
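
To put a rough number on that caution, here is a back-of-the-envelope sketch of how long it takes just to pull a backup down from the cloud. The 5 TB size, the 100 Mbps link and the 80% usable-bandwidth factor are all assumptions picked for illustration; plug in your own figures.

```python
def restore_hours(backup_tb, bandwidth_mbps, efficiency=0.8):
    """Estimate the hours needed to download a backup.

    backup_tb: size in terabytes (decimal, 10**12 bytes).
    bandwidth_mbps: link speed in megabits per second.
    efficiency: assumed fraction of the link that is actually usable.
    """
    bits = backup_tb * 10**12 * 8
    seconds = bits / (bandwidth_mbps * 10**6 * efficiency)
    return seconds / 3600

# Assumed example: 5 TB of backups over a 100 Mbps link.
hours = restore_hours(backup_tb=5, bandwidth_mbps=100)
print(f"{hours:.1f} hours, about {hours / 24:.1f} days")
```

Under these assumptions the download alone takes close to six days, before any actual database restore starts, which is worth comparing against the recovery time you have promised.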

Enterprise moves with Hybrid-Private Cloud

The more I get an opportunity to talk with enterprise CxOs, the more I hear these terms coming up. Yes, the investments have already been made within the enterprises and these cannot go anywhere. So using the existing storage and extending future storage needs to the cloud can be a viable option that execs always want to contemplate and discuss. The whole concept of limitless (based on costing) storage that hardly requires upgrades or replacements, with no additional capital investment, means quite a lot to these CxOs; CFOs especially love to see the ROI here.

Once these decisions are made, next comes the need to seamlessly integrate your on-premise environments with the cloud infrastructure. This will become a critical part of any application design moving forward. We cannot live in a monolithic model moving forward; hybrid is here, and here to stay for a long time.

With compute easy to get on the cloud, it is sometimes the storage that will need to transform, from local storage to SAN to private cloud storage to public cloud storage. These are challenges to keep in mind, but they are not far from implementation, so get ready.

Bootstrap to Cloud

As more and more applications get designed to move to the cloud, there are many more administrative tasks that organizations are contemplating moving to the cloud to reduce maintenance overheads. The tools and steps required to migrate or move these applications will be something we need to be aware of and understand holistically. For IT admins, the need to maintain VMs on the public cloud and keep them up-to-date and running is a trend not to miss. Yes, pure-play cloud-enabled apps are taking shape for newer applications, but the legacy applications will stay, hence VMs might be something that cannot be avoided.

As much as applications migrate away from on-premise to the cloud, keep in mind the tools needed to bring them back to your environment at any time (if required).

Big Data – trends not to forget

As more and more people get stung by this not very well understood industry phenomenon, implementations from the NoSQL world and non-relational databases are a trend that will hit the market sooner rather than later. Though I have been a big relational DB fan for a long time, I can clearly see that the Big Data story is something I will need to bite into in the coming year without any doubt. Do we have a choice? Nope !!!


As 2012 nears, just as explained in “Crossing the Chasm” by Geoffrey Moore, the concepts of adoption remain the same. There will be industry early adopters, others on the way to moving in the coming year, and a vast majority still contemplating a move, first in a sandwich mode with private cloud and then slowly to the new era of completely public cloud infrastructure.

If you really ask me, there is more to this than meets the eye, and we will get a lot to learn as companies make these moves. There is anyway an opportunity in everything new that comes to the market, and the cloud phenomenon is here to stay as we move into 2012 !!!



Database Consolidation Considerations June 3rd, 2011

Vinod Kumar

Following the post on multi-tenancy, interesting comments have come my way asking me to write more on such topics. Well, thanks for reading and dropping a line. In this post, I thought I would write about another interesting yet very common topic which I get a chance to discuss with customers: consolidation of servers. Though consolidation can be driven by very specific business reasons, it does need some thought before implementing.

There are a significant number of manageability, availability, performance, scalability, and political considerations when deciding between dedicated (physical/virtual), instance level, database level or schema level consolidation. Fortunately, most of these are covered in the SQL Server 2008 Consolidation Guidance technical article. The article comes complete with excellent explanations and decision trees, and though it’s primarily focused on the decision between virtualization vs. instance level vs. database level – it is noted in the article that “Other possibilities include further optimizations on an existing approach such as schema-level consolidation. The key decision factors are similar to the higher-level consolidation options mentioned previously, so this paper will focus only on those.”

As mentioned before, business might see Consolidation from (not in any specific order):

  1. High-Availability – Instead of maintaining multiple redundant HA databases, sometimes consolidating gives all the applications the advantage of a better server with a common HA option (like clustering).
  2. Centralized Management – Customers look at this as an opportunity to consolidate all the departmental applications and, more so, to consolidate the DBAs inside their organization.
  3. Cost Saving – This is most likely the first thing that the business sees as an opportunity. Ultimately, they want to maximize the utilization of the hardware they have bought or, better still, get one beefy server to host tens of application databases and move away from outdated hardware.
  4. Risk management – As discussed above, centralization also standardizes the way DB code is developed, managed, deployed and maintained. It also becomes easier to service systems, implement processes and automate system administration.

Other Consolidation Considerations

While all the above reasons are valid, there are more that get missed between the lines.

  1. Operational Cost – Consolidation on newer hardware means fewer servers, so power savings can also be achieved. And moving to virtual also means you are saving on hosting costs.
  2. Increased uptime – Server consolidation makes it more economical to provide high-availability configurations and dedicated support staff. Organizations also benefit from implementing better storage management and service continuity solutions.
  3. Predictable Performance – Moving to more standardized systems means we can assure more predictable performance and, behind the scenes, implement isolation of resources per application. A DBA can also go ahead and implement techniques like compression without any change in application code.
  4. Integration Benefits - A consolidated platform provides for easier, more consistent and cheaper systems integration, which improves data consistency and reduces the complexity of tasks such as extract, transform, and load (ETL) operations across systems.

Technical Considerations

For code already written, migrating to a solution different from the one it was originally designed for may require rework by developers. For example, if they have already written code in database A that accesses database B objects with 3-part name identifiers, they would have to make accommodating changes. Additionally, any database users with a database-wide role (db_datareader, db_ddladmin, etc.) in databases that are consolidation candidates might have to be changed.
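
One way to size that rework up front is to scan existing SQL code for 3-part names. The sketch below uses a simple regular expression as a heuristic; it is not a SQL parser, it will miss quoted identifiers and names with spaces, and the sample query and object names are made up.

```python
import re

# Matches database.schema.object, with optional [brackets] around each part.
# The lookarounds keep 2-part and 4-part names from being reported.
PART = r"\[?(\w+)\]?"
THREE_PART = re.compile(rf"(?<![\w.\]]){PART}\.{PART}\.{PART}(?![\w.\[])")

def find_three_part_names(sql):
    """Return (database, schema, object) tuples referenced in the SQL text."""
    return THREE_PART.findall(sql)

# Hypothetical query living in database A that reaches into DatabaseB.
sample = """
SELECT o.OrderId, c.Name
FROM dbo.Orders AS o
JOIN DatabaseB.dbo.Customers AS c ON c.Id = o.CustomerId
JOIN [DatabaseB].[sales].[Regions] AS r ON r.Id = c.RegionId
"""
refs = find_three_part_names(sample)
print(refs)
```

Running something like this over stored procedures and views gives a first list of cross-database references that would need changing after consolidation.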

Though schema consolidation within a single database is nice for logical groupings of data that benefit from being kept in sync [especially during a recovery] or must maintain referential integrity, the thought of combining unrelated entities raises all kinds of management and political questions.

Unrelated applications might have differing availability, maintenance, and isolation requirements, and combining them at a database level complicates this. And if the data is financially sensitive, the process of auditing or tracking becomes harder than ever.

Finally, it is great to collapse multiple databases from different applications, and if these are in-house applications you are good. If they are third-party applications, then application compatibility needs to be checked. Certifying the application against the database version, x86 vs. x64 architecture, OS version and so on are all extra activities to be handled, not to mention dependent services.

Misc Considerations

Well, planning is the name of the game. I loved seeing this hidden somewhere in the documentation: a checklist. It is worth a read and consideration, though it is not complete; you will need to create one for your own infrastructure needs.

But there are tons of other things that also come to mind -

  • Special security context of databases
  • Limitations or dependencies that prevent consolidation (Agent jobs, Maintenance Plans, SQLMail, ETL or others)
  • Third-party add-on dependencies of the application
  • OLTP vs. OLAP features and frequency of use
  • Dependencies on server / instance names (hard-coded inside the application)
  • How many databases for a single app and proximity of all the dependent DBs
  • Data growth rate, data retention policies followed by archival
  • Backup windows and special backup technologies
  • Peak usage / low usage time windows of each server
  • Replication frequency, duration, and volume
  • Specific connectivity requirements (protocols, SSL and others)
  • Internet / public access vs. internal only
  • SLAs to business units for uptime

In the 60-page whitepaper on Consolidation Using SQL Server 2008, the option to collapse multiple databases into a single database [via multiple schema management] is essentially written off, citing challenges with security, object naming conflicts, performance issues, etc. It leaves off with an ominous warning that schema-level consolidation should be “used carefully”, but I’d just avoid it altogether for unrelated applications. And you have more detail above on why I say so.

HomePage for SQL Server Consolidation and Virtualization

If you did read this far, feel free to drop a comment with suggestions, if any. Obviously, your experiences are also unique and valuable.
