Monday, 20 April 2009

Table and Index Partitioning in SQL Server 2008 Enterprise Edition

Introduction

SQL Server 2008 Enterprise enables you to horizontally partition individual tables and indexes so that data storage is spread over multiple drives. Horizontal partitioning is the process of dividing the data in a table or index by rows, so that some of the rows are contained in one location, some in a second location, some in a third, and so on. For example, with SQL Server 2008 Enterprise, you can partition large tables by date, so that each partition stores rows that correspond to a specific time period.
Partitioning has multiple benefits:
• Firstly, it enables you to make efficient use of the available disk capacity by controlling the placement of data from a single table or index across multiple disks. Partitions are mapped to filegroups, which in turn contain the files that hold the actual data.
• Secondly, partitioning enables flexible management of larger tables, for example by enabling you to backup a just subset of the rows in a table. Because partitions are mapped to filegroups, you can use SQL Server backup to backup just the filegroups that contain changed data.
• Thirdly, partitioning can make the loading new data a much faster procedure using the SWITCH option, which we’ll discuss shortly.
• Finally, and perhaps most significantly, partitioning can improve query response times for users.

Partition switching

Faster data loading can be achieved when you use the SWITCH clause of the ALTER TABLE statement. This enables you to swap a populated table or partition with an empty table or partition, provided that the they both have the same design structure, including the number, order, naming, and nullability of columns. For example, you could load new data by switching a populated table into a partitioned table and switching a named partition – say, partition 1 - out of that table. All that actually happens with this operation is that the metadata of the table and partition, including the table name and the partition name, are switched. No data actually needs to be moved, and as a result the process is very quick. You can also use the ALTER PARTITION FUNCTION statement with the MERGE and SPLIT clauses to merge two partitions together and to split an existing partition.

Performance improvements

Partitioning improves query response times on partitioned tables when users issues queries that filter rows by date, as is very often the case in large databases such as data warehouses. This is because SQL Server only needs to search through the partition or partitions that contain the relevant rows, which is more efficient. You can also limit lock escalation to the partition rather than the table, reducing the risk of blocking and deadlocking.
SQL Server 2008 offers a considerable improvement in performance over SQL Server 2005 for queries against partitioned tables that run on multiprocessor systems. In SQL Server 2005, if a user issued a query against a partitioned table and the query touched only one of the partitions in that table, all available threads could be allocated to that query, resulting in fast response times. If, however, the query touched more than one partition, then only one thread could be allocated per partition in the query, resulting in comparatively slower response times, particularly in cases where the majority of the data resided in just one of the partitions. This behaviour was by design and was actually intended to provide improved performance in the special case of a query touching only one partition, but from the users’ point of view, it sometimes meant that execution times seemed unpredictable. SQL Server 2008 addresses this issue with improved partitioned table parallelism. Queries that touch a single partition behave in the same way as they do in SQL Server 2005, and all available threads are allocated to the query. Queries that touch multiple partitions now have all available threads allocated to them, with the threads being distributed in a round-robin fashion. For example, on a server that has four cores, a query that touches one partition will have all four threads allocated to it. A query that touches two partitions will have all four allocated again, with two available per partition. In SQL Server 2005, this second query would only have had one thread available per partition. This new behaviour can result in drastic improvements in performance, with some queries running ten to fifteen times faster.

Summary

Table and index partitioning is a powerful feature that offers many benefits to administrators who manage large databases. SQL Server 2008’s enhanced partitioning delivers impressive performance on multi-core servers, enabling you to get more out of your existing hardware.

No comments: