Archive for November 22nd, 2012

SQL Server: Database Page Basics November 22nd, 2012

Vinod Kumar

As you have been reading multiple posts around the database fundamentals topic in the past few weeks. Here is yet another post that will discuss around the basics of SQL Server Database Pages concept. I have seen there are many junior DBA’s who come from other platforms and ask what is a page, why is it restrictive, Is there a way to configure, are there optimizations I need to be aware etc. For most part the answer to this question is – there is hardly anything we can configure as parameters, but there is a reason why one must learn why SQL Server uses these defaults, what are the nuances we need to be aware and how can we understand the inner working. In this post, we just take a look at the basics of what constitute a database page and what are the various functions of each of these page types. This is not exhaustive, but still the learning can never stop right :).

  1. Logical pages are 8KB in size by design and default.
  2. The pages are numbered sequentially from 0 to N in each file, while the size of the file decides the number of pages that it contains.
  3. SQL Server can identify any page using the database-id, file-id and page-number combination. Many times there are error messages with this nomenclature for reference.
  4. With all math done, it means SQL Server has 128 pages per MB.
  5. When new space is allocated to a file because of expanding file, the first page of the newly created space is page# is N + 1.
  6. Though shrinking removes pages, SQL Server also ensures page numbers within a file are always contiguous.

SPACE ALLOCATIONS

  1. Space in a database file is managed in units called extents.
  2. Extent is made up of 8 logically contiguous pages therefore having a capacity of 64KB.
  3. 96 byte of each page is allocated for header information such as what type of page, amount of free space on page, object owing it etc.
  4. Post the header information is the data rows placed serially.
  5. Bottom of the page contains a row offset – there is one entry for every row inside that page and its offset from the start of the page.
  6. Row Offsets are in reverse sequence from the sequence of the rows.
  7. Maximum amount of data that one can store in a single row is 8060 bytes. Whenever we are able to fit the rows in this limit – it is called In-Row Data.
    1. From SQL 2008, the page restriction has been relaxed with an concept called as Row-Overflow Data. Read more from MSDN.
  8. SQL Server doesn’t allocate entire extents to tables which have small amounts of data.
  9. Extents are of 2 Types:
    1. Uniform extents: As name suggests it is owned by a single object; all eight pages in the extent can be used only by this single object
    2. Mixed extents: These are shared between objects. Up to eight objects can share an extent (a.k.a each object using a single page)
  10. Whenever a new table or index needs allocation it is given a mixed extents; when the table or index grows to eight pages, future allocations can use uniform extents. Between versions of SQL Server the # of pages post which uniform extents are given may vary.
  11. If a table or index needs more space and is still less than 8 pages total, SQL Server must find a mixed extent with space available, else if it is 8 pages or larger, uniform extents are located.
  12. When there is no mixed extents with free space, a new extent marked as mixed extent is allocated and SGAM updated accordingly.
  13. Basic Page Layouts:
    1. Page 0 is the File Header
    2. Page 1 is the Page Free Space (PFS)
    3. Page 2 is GAM
    4. Page 3 is SGAM
    5. Page 6 is Differential Changed Map (DCM)
    6. Page 7 is Bulk Changed Map (BCM)
  14. 2 special types of pages to record which extents have been allocated and what type it is being use for:
    1. Global Allocation Map (GAM) pages (always on page 2)

      1. These pages record which extents have been allocated for any type of use.
      2. GAM has a bit to indicate value 0 means in use and 1 means free extent.
      3. After header and other overhead are accounted, there are 8000 bytes or 64000 bits to cover 64000 extents.
      4. Each GAM page covers 4 GB of data and a GAM page exists in a file for every 4 GB of file size.
    2. Shared Global Allocation Map (SGAM) pages (always on page 3)
      1. Record which extents are currently used as mixed events and have at least one unused page.
      2. Similar to GAM, it covers 64000 extents every 4GB of data.
      3. Has a bit for each extent in the interval it covers: 1 if the extent is being used a mixed extent and has free pages, 0 if the extent isn’t being used a mixed extent or it’s a mixed extent with no free pages

Index Allocation Maps (IAM)

  1. Any structure of data or index needs an IAM page.
  2. IAM pages keep track of the extents in a 4 GB section of a database file.
  3. An allocation unit is a set of pages belonging to a single partition in a table or index. The pages of can be of three storage types:
    1. Pages with regular in-row data
    2. Pages with Large Object (LOB) data
    3. Pages with row-overflow data
  4. IAM page contains a 96 byte page header, followed by IAM header which contains 8 page-pointer slots.
  5. IAM pages contain a set of 8 bits that map a range of extents onto a file. The IAM header has the address of the first extent in the range mapped by the IAM.
  6. A bit in the IAM bitmap represents an extent in the range: 1 means extent is allocated to the object owning the IAM, 0 means the extent isn’t allocated to the object owning the IAM.
  7. IAMs are allocated as needed for each object. Each of the IAM covers a possible range of 512,000 pages.

Misc. Notes

  1. PFS (page 1) – keeps track of how each particular page in a file is used.
  2. PFS also tracks if the page is empty, 1 to 50 percent full, 51 to 80 percent full, 81 to 95 percent full or 96 to 100 percent full.
  3. There is a PFS page approximately 8,000 pages in size after the first PFS page.
  4. Differential Changed Map (DCM page 6) – keeps track of which extents in a file have been modified since the last full database backup.
  5. Bulk Changed Map (BCM page 7)- is used when a page in a file is in a minimally or bulk-logged operation.
  6. Like the GAM and SGAM pages, DCM and BCM pages have 1 bit for each extent in the section of the file they represent.
  7. There is a GAM page 64,000 extents after the first GAM page on page 2 and another SGAM page 64,000 extents after the first SGAM page on page 3.
  8. Log files donot have the concepts of pages, they are a series of log entries written sequentially.
  9. In the previous versions the text, ntext and image data are stored in separate pages and are not inside the Data pages.

I think I have touched most parts of how database pages exist inside SQL Server. Though there are even more subtle nuances on the page functioning, we will keep it outside of this post for later. Please refer to MSDN BOL for more information on the page architectures.

Continue reading...