Posts Tagged ‘Architecture’

Managed Databases on Cloud July 18th, 2014

Vinod Kumar

Recently my good friend and colleague Govind wrote about this topic on what are customers looking forward to when it comes to Cloud and working with Azure. The fundamental tenants that customers look at for cloud be it PaaS, SaaS or IaaS has been around:

  1. Reduced Maintenance headaches
  2. SLA backed for HA/DR
  3. Performance
  4. Synchronization with on-prem
  5. Security
  6. Backups
  7. No worry about hardware

and a few more. But for most parts the above fits the quizzing we get into. In a recent conversation, I had to outline some of the options when it comes to backup requirements with the customer which I thought is worth a share here. I am looking at this from an Azure standpoint:

For IaaS:

  1. You will need to use SQL Server Agent and build your maintenance plans that can automated. This can be scripted (powershell, TSQL or others) and done for all workloads.
  2. For SQL Server 2008 R2 CU2 onwards, we can use Backup to URL option wherein backups from Azure VM – SQL box we can point backups to a blob storage. I wrote about this a while back and you can try the same – http://blogs.extremeexperts.com/2014/04/14/sql-server-2014-backup-to-azure-blob/
  3. SQL Server 2014 also supports Encrypted backups to Blob and the same article shows the same. http://blogs.extremeexperts.com/2014/04/08/sql-server-2014-encrypted-backups/
  4. Also from SQL Server 2014 we have option to use Managed Automated backups configured. This will take backups automatically to  blob on a predefined time or based on workload pattern. Documentation for this can be found at: http://msdn.microsoft.com/en-us/library/dn449496.aspx

For PaaS:

  1. Since we already make sure of consistency in the Azure world, we dont have to worry on this.
  2. For Basic, Standard and Premium editions there are SLA for Point-In-Time recovery which is 7, 14 and 35 days respectively. You can read more about this at: http://msdn.microsoft.com/en-us/library/azure/jj650016.aspx . I highly recommend to use Powershell scripts to automate this, if you plan to use the them.
  3. In the past, I have also seen customers use Database Copy functionality to keep a copy of their database in a ready to use state every couple of days. This gives them an opportunity to go back to that version immediately without any problems. This is also an viable option if you like to use. http://blogs.msdn.com/b/sqlazure/archive/2010/08/25/10054109.aspx – Since point-in-time restores are available, I am more inclined to use that for cold standby and restores. Having said that, we can still use that feature for creating a copy for Dev, Test environments from our prod servers for testing.

These are my customer notes and I plan to start publishing these customer notes from time to time here in my blog. Since we are talking about Azure, I am sure some more additional capabilities and SLA’s can change over a period of time. So please keep an eye on the documentation for the latest values.

Continue reading...


 

SQL Server: Columnstore Index on Temp Tables June 20th, 2014

Vinod Kumar

Recently I was talking to a customer during performance testing and they had an interesting twist to using Columnstore Index. Since they were on SQL Server 2012, I had suggested them to use Non-Clustered Columnstore index for their workload. He started to quiz me with a number of questions and I thought it would be appropriate to point them to some resources I had already written. So I took the liberty to send them the two articles:

  1. SQL Server 2012: ColumnStore Characteristics
  2. SQL Server 2012: ColumnStore Part II

The fundamentals and concepts have not changed but where worth a mention and read. Having said that, the counter question after about an hour of their research and reading the contents. “Will it work with Temp Tables? We are doing a reporting workload and are using Temp tables extensively.”

I thought, wow, interestingly I have not written on it till date and is worth a mention though. The easy answer is “YES”, it is quite possible. Why didnt you try was my answer :)

Columnstore Index on Temp Tables

To drive home the concept, in the below example I have created a simple temp table called as #Temp_Account. When I query the table, the execution plan shows as “RowStore” under storage as shown below.

The next logical step is to create our Non-Clustered Columnstore Index. I am using the standard syntax as shown below.

If we try to insert into this temp table, it is still readonly because non-clustered columnstore index has been defined in the table.

Msg 35330, Level 15, State 1, Line 19
INSERT statement failed because data cannot be updated in a table that has a nonclustered columnstore index. Consider disabling the columnstore index before issuing the INSERT statement, and then rebuilding the columnstore index after INSERT has completed.

After index creation, if we make a simple query we can see the change in Execution Plan. The storage is now showing as “ColumnStore” as shown in the execution plan below.

The above query is simple in nature and hence the execution mode is “Row”. In future posts I will show how this can be changed to Batch mode.

Continue reading...


 

DAX: Date and Time functions June 26th, 2013

Vinod Kumar

We have been looking at various DAX related content in this blog in the past few weeks, I thought let me start off by bringing out some of the commonly used DAX functions to light. Many are most useful and seem to work just like normal Excel functions. In this blog we will look at Date and time functions available with DAX.

The Date and Time functions convert string or number to a datetime value or convert a datetime value to a number representing the different part of the date time, such as year, month, day of a week, hour, minute.

The datetime argument can be an explicit datetime value, a column of datetime data type or an expression that evaluates to a datetime value. The result depends on the locale and date time setting which affects whether the datetime value is interpreted as mm-dd-yyyy, or as dd-mm-yyyy.

The value of the datetime argument can be in any of the proper datetime formats. Here are some example datetime formats:

  • "August 1, 2012"
  • "Sep 12, 2012"
  • "3-4-2012"
  • "2012-03-04"
  • "03/04/2012"

DATE Function

Syntax: DATE (yyyy, mm, dd)

Return type: DateTime

DATE function generates a date value from the separately provided value of year, month, and date. The year, month, and date arguments are rounded to integer if non- integer values are presented. The arguments can have explicit values, or expression evaluates to the value. The resulting date is generated using the following logic:

If yyyy is less than 1900,
     Resulting year = 1900 + yyyy
Else
     Resulting year = yyyy
If mm is between 1 to 12,
     Resulting month = mm
Else
     Resulting month = beginning of the resulting year + mm
If dd is between 1 and end of the resulting month
     Resulting date = dd
Else
     Resulting date = beginning of the date of the resulting year + dd

Examples:

=DATE(08,13,32)

The above example returns February 1, 1909.

=DATE(2010,1,-7)

The above example returns December 24, 2009

DATEVALUE Function

Syntax: DATEVALUE (date_text)

Return type: DateTime

DATEVALUE function converts date value represented in a text string to date value in DateTime type. The resulting date depending on the locale setting which controls how to interpret certain date format

Example:

=DATEVALUE ("8/1/2012")

The result in US will be August 1, 2012. The result in India will be January 8, 2012.

TIME Function

TIMEVALUE Function

Similar to DATE and DATEVALUE functions, TIME function converts the provided numbers for hour, minute, and second to a time value, TIMEVALUE converts a string time value to a time value of datetime type.

Syntax: TIME (hour, minute, second) and TIMEVALUE (time_text)

Return data type: datetime

DAY Function

MONTH Function

YEAR Function

WEEKDAY Function

WEEKNUM Function

Syntax: DAY(datetime)

MONTH (datetime)

YEAR (datetime)

WEEKDAY (datetime, <return_type>)

WEEKNUM (datetime, <return_type>)

Return data type: integer

These functions return the day number of the month (1-31), the month number of the year (1-12), the year, the day number within the week (0-6 or 1-7), or the week number within the year (1-53) from the datetime value passed to the argument. The return_type argument for the WEEKDAY function specifies when a week starts and whether the first day needs to return 0 or 1. If not specified, 1 is the default value.

  • 1 Week begins on Sunday (1) and ends on Saturday (7)
  • 2 Week begins on Monday (1) and ends on Sunday (7)
  • 3 Week begins on Monday (0) and ends on Sunday (6)

The return type for WEEKNUM function specifies whether a week starts on Sunday or Monday.

  • 1 Week begins on Sunday. Weekdays are numbered 1 through 7
  • 2 Week begins on Monday. Weekdays are numbered 1 through 7

EDATE Function

Syntax: EDATE (start_date, numberofMonths)

Return data type: datetime

EDATE returns the datetime before or after the number of months from the start_date. Useful for calculating the maturity date or due date that needs to be on the same day as the start_date. When the resulting day goes beyond the end of the month, the end of the month is returned.

Example:

=EDATE("3-4-2013", 3)

It returns June 4, 2013.

=EDATE("March 30, 2013", -1)

It returns February 28, 2013.

EOMONTH Function

Syntax: EOMONTH ( start_date, numberOfMonths)

Return data type: datetime

EOMONTH returns the datetime of the end of the month before or after the specified number of months from the start_date. Useful for calculating due date that needs to be the end of the month.

Example:

=EOMONTH("3/4/2013", 1)

It returns April 30, 2013.

HOUR Function

MINUTE Function

SECOND Function

Syntax: HOUR (datetime)

MINUTE (datetime)

SECOND (datetime)

Return data type: Integer

These functions return the number of hours (0-23), minute (0-59), and second (0-59) from the input datetime value.

NOW Function

Syntax: NOW()

Return data type: datetime

NOW returns the current date and time. Only updated when we refresh PowerPivot data. This is similar to Excel function.

TODAY Function

Syntax: TODAY ()

Return data type: datetime

TODAY returns the current date. Similar to NOW function, once the value is populated, it will not change until you refresh the PowerPivot workbook. This functions similar to Excel function.

YEARFRAC Function

Syntax: YEARFRAC(start_date, end_date, <basis>)

Return data type: decimal

YEARFRAC returns a fraction of the days between the start_date and the end_date to the total number of days in a year. <Basis> arguments defines the type of day count

Basis Description

  • 0 US (NASD) 30/360
  • 1 Actual/actual
  • 2 Actual/360
  • 3 Actual/365
  • 4 European 30/360

Hope you learnt the typical use of various functions with DAX. In future posts, we will take a look at other functions and how they work with DAX. Do let me know if you want me to take any other functions next.

Continue reading...


 

DAX Concepts Explained June 24th, 2013

Vinod Kumar

In this week, let me take an opportunity to write more about the DAX concepts. The previous posts include – Installing SQL Server 2012 Analysis Services–Tabular mode and Introduction to DAX in Excel.

To help understand how DAX functions work, particularly the ones that are used in the complex calculation scenarios, we need to spend some time on some of the key concepts that are the foundation for DAX functionality. This blog post talks about the terminologies used with DAX and how they function inside Excel or SQL Server Analysis Services.

Tabular database

Up to the current SQL Server 2012 release, DAX only works against tabular SSAS database. The tabular database can be a PowerPivot workbook inside Excel, a sandbox database loaded to SSAS instance running in SharePoint integration mode, or a tabular database deployed from a BISM model. In SQL Server 2008 R2 PowerPivot release, the entire tabular database needs to be stored in memory or error would be returned. In SQL Server 2012, paging capability is enabled so the data can be paged to disk if there is not enough memory for the entire database.

Table

Table stores data about a dimension (entity) or the fact data. Often a table gets data from an external data source, such as SQL Server relational database. For PowerPivot running in Excel, a table can also be a local table saved in the Excel file and linked to the PowerPivot workbook through the PowerPivot linked table feature or a table copied/pasted from a regular Excel workbook in a different Excel file.

Table Relationship

Tabular database supports many-to-one relationship between two tables. In tabular database, there can be only one relationship between any two tables. Typically two tables are related through the primary key and the foreign key constraint. From SQL Server 2012, we support multiple relationships between two tables, therefore some of the relationships would be based on non-key columns as well. When multiple relationships exist between two tables, only one relationship is active at any time and that relationship is used by default to navigate between the related tables. In order to use other relationship when evaluate a DAX expression, USERELATIONSHIP function is needed. For example, the following expression calculates the sales amount grouped by Ship Date.

=CALCULATE( FactInternetSales[TotalSalesAmount], USERELATIONSHIP( DimTime[TimeKey], FactInternetSales[ShipDateKey]))

Proper table relationship eliminates the need to maintain redundant data in multiple tables, and ensures DAX calculation returns the correct result. Table relationship also enables performing calculations involving looking up data in the related table by following the relationship. For instance, in the product table, if we create a calculated column with the following definition

=SUMX(RELATEDTABLE(‘FactInternetSales’),’FactInternetSales'[SalesAmount])

For each product in the product table, we will go to the internet sales table, get the subset of rows that contain sales data for the corresponding product, sum the value in the SalesAmount column in the matching rows and return the calculation result to the calculated column for the current product in the product table.

Column

Column is represented as ‘table name'[column name]. The data type for a column can be text, float, whole number, currency, date, and Boolean. These datatypes are same as what we discussed at Introduction to DAX in Excel.

Unique column

By definition, unique column does not allow duplicate value and null value. The uniqueness is enforced by setting the cardinality of the column to one. This property does not stop a null row from being added to the query results to represent the unknown member when necessary.

When table relationship is created, it internally sets the column on the one side of the one-to-many relationship to be unique, hence preventing duplicate value and blank value. When the relationship is deleted, the unique flag is removed from the column.

There is no way to mark a column as unique in PowerPivot for Excel add-in. In BISM, under Reporting Properties, one can set Row Identifier property for a column to be True. Setting a column as Row Identifier would mark the corresponding attribute as the key attribute of the dimension, which implies uniqueness as well. Only one column in a given table can be the Row Identifier (key attribute).

Row Context

Row context refers to the current row in the current table that is being evaluated. This applies primarily to the calculated column. For a complex calculation, row context may change as we switch from one DAX function to another DAX function. Row context controls the value that is returned for the column in the current function.

Here are some examples of calculated column definitions using AdventureWorksDW database as the data source to show the row context and row context change.

Calculated column in FactInternetSales table:

=’FactInternetSales'[SalesAmount] – ‘FactInternetSales'[TotalProductCost]

The above formula calculates the profit for each row in the FactInternetSales table. The row context is the current row being evaluated. As the calculation moves to the next row in the same table, the row context moves to the next row as well.

Calculated column in DimProduct table:

=SUMX( RELATEDTABLE( FactInternetSales), FactInternetSales[SalesAmount])

This calculates the total sales amount for each product in the product table. The calculation involves two row contexts, the first one is for the current row in the ‘DimProduct’ table. The second row context is the row in the ‘FactInternetSales’ table that is being scanned when we are inside the RELATEDTABLE function to retrieve the rows that have the same product key.

Filter Context

Filter context consists of members in row label, column label, vertical slicer, horizontal slicer and report filter in the PivotTable. It is used to evaluate the calculated measure. Filter context filters out rows in the selected workbook tables so the calculation is carried out on the selected rows after the filter.

When CALCULATE function is evaluated, if there is a row context associated, the row context is promoted to filter context, the SetFilter arguments passed to CALCULATE function will then be applied on top of the query context to produce the final context before we evaluate the expression passed to CALCULATE function.

I will try to explain these functions in future posts. This is just a start. In this post, let us just learn the concepts.

BLANK

BLANK in DAX is the same as ‘blank’ in Excel and ’empty cell value’ in MDX. PowerPivot table does not include blank row. For instance, COUNTROWS(Employee) includes the number of actual employees, not including one extra row for "null" or "unknown" employee.

A blank row will be added into the query result when we encounter referential integrity violations or null foreign keys. ALL() function will return all rows in the result including the blank row if added.

ALLNOBLANKROW will return the same result as ALL except the blank row will only appear if there is a blank row in the data – it will not appear if it is only present as the result of a referential-integrity violation or a null foreign key.

When we aggregate values (SUM, COUNT, AVERAGE, etc.) and all the values being aggregated are BLANK, the result will be BLANK. In arithmetic operations of addition and multiplication, we treat BLANK as zero when some operant is non blank value, for instance, BLANK+3+4+BLANK=7. When all the operant is BLANK, the result is BLANK, i.e. BLANK + BLANK = BLANK. In arithmetic division operation, when the nominator or the denominator is BLANK, the result is BLANK.

This behavior is different from Excel where BLANK is explicitly converted to zero when being calculated. The reason DAX do not convert BLANK value to zero is to avoid displaying members that do not have value. For instance, Cities table has 200 cities, for a particular week, only five cities have sales, when build PivotTable to show the sales for that week, and place cities on the row labels, we want to show five rows of data, instead of 200 rows with 195 of them being zero. We refer to this behavior as non-empty behavior, the same concept as in MDX.

NULL

Null has different semantic from BLANK. When arithmetic operation involves NULL operant, the result is always NULL. For example, NULL + 3 = NULL. NULL is not used in DAX. When we encounter a NULL value at import time, we convert it to BLANK.

These were some of the basic concepts we need to know when it comes to DAX. In next blog posts, we will look at the various functions available in DAX and learn more about this feature.

Continue reading...


 

SQL Server 2012: ColumnStore Part II June 5th, 2013

Vinod Kumar

I wrote about ColumnStore Indexes before and those were around the basic restrictions and things to consider while working with ColumnStore Index with SQL Server 2012. In this blog post let me take a moment to look at the boundaries, use with SSMS and other diagnostics information. Obviously some of the restrictions have been relaxed for next version of SQL Server 2014. That will be for a different blog post.

Datatype restrictions

In SQL Server 2012, columnstore index creation is supported only for the following data types:

  1. char, varchar, nchar and nvarchar (except varchar(max) and nvarchar(max))
  2. decimal (and numeric)
  3. int, bigint, smallint, and tinyint
  4. float, real
  5. bit
  6. money and smallmoney

New DDL

There is now a new keyword COLUMNSTORE in the CREATE INDEX DDL which is required to create a columnstore index. More about it can be read from MSDN.

CREATE COLUMNSTORE INDEX <Name> ON <tableName>(<Cols>)

Index hint

We can force an index hint to use the columnstore index in a query if required:

… FROM mytable WITH (INDEX (mycsindex)) …

Query hint

We can either use a table hint to force the use of a different index or we can use a new query hint: IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX (MSDN). This new hint will prevent the use of any nonclustered columnstore indexes in the query. Below is an example of using the hint to prevent use of any nonclustered columnstore index in a query:

SELECT DISTINCT (SalesTerritoryKey)
FROM dbo.FactResellerSalesCopy
OPTION (IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX);

CATALOG VIEWS and DMVs 

There are a number catalog views which will show different values for ColumnStore. I am listing a few here:

sys.indexes:

type : 6

type_desc: NONCLUSTERED COLUMNSTORE

sys.partitions:

data_compression: 3

data_compression_desc: COLUMNSTORE

sys.index_columns:

key_ordinal: 0 (not a key column; columnstore index does not have a search key like a row-based index)
is_descending_key: 0 (default value; not a key column and index is not "sorted" like a row-based index)
is_included_column: 1 for every column mentioned in the create columnstore index ddl.

sys.dm_db_index_operational_stats:

Columns
nonleaf_insert_count,
nonleaf_delete_count,
nonleaf_update_count,
nonleaf_allocation_count,
nonleaf_page_merge_count,
tree_page_latch_wait_count,
tree_page_latch_wait_in_ms,
tree_page_io_latch_wait_count,
tree_page_io_latch_wait_in_ms,
Leaf_page_merge_count,
page_compression_attempt_count,
page_compression_success_count – Has a value of 0 for columnstore index.

Showplan Enhancement

Three additional properties have been added to the showplan data for columnstore indexes:

  1. Storage – row or columnstore
  2. EstimatedExecutionMode : row or batch
  3. ActualExecutionMode : row or batch

SSMS Enhancements for Columnstore

I am outlining some of the visible enhancements for columnstore with SQL Server 2012 Management Studio. Firstly, Columnstore indexes are represented by a different icon than traditional row based indexes.

image

The next UI changes is for creating a New Columnstore Index.

image

The third part is around adding columnstore columns as we create the index.

image

Final Words

These are some of the basic enhancements with SQL Server 2012 I thought was worth sharing. Do let me know if you have used columnstore index in your production environments and what are the scenarios you found using this type of index useful.

Continue reading...