casinos

SQL Server: Transaction Logs and Recovery November 12th, 2012

Vinod Kumar

In my day-today customer interactions I meet a lot young, smart DBA’s who are still stepping into the ranks of making their applications robust. As part of their daily activities they are getting guided by some seniors who impose some of their views based on their past experience. Unfortunately, in this process the fundamentals are lost and as we build our career we need to understand the basics. One of the topics I explain almost every customer is about how Transaction Logs work, why they keep growing, how to make it cycle/reuse, what happens after backup etc. So I plan to write such posts in the future too, these are lists for reference making learning fun and easy. I am writing this post in the style of notes taken in a class, hope that will make reading easy and leaning faster :).

Transaction Log Basics

  1. Transaction log records changes made to the database and stores enough information to allow SQL Server to recover the database.
  2. Recovery is the process of reconciling the data files and the log.
  3. Any changes to the data that the log indicates have been committed must appear in the data files and any changes that are not marked as committed must not appear in the data files
  4. Log also stores information on how to roll back transactions if a ROLLBACK TRAN is received from the client or a deadlock generates an internal ROLLBACK.
  5. One or more transaction log files are associated with a database when the database is created.
  6. Log record is labeled with a Log Sequence Number (LSN) which is guaranteed to be unique.
  7. Log entries part of the same transaction are linked so that they can be located easily for undo or redo activities.
  8. SQL Server keeps track of the current position in the log by the LSN and when a page is changed the LSN corresponding to the log entry for that change is written into the header of the data page.
  9. Dirty pages can be written to the disk only when the LSN on the page is less than or equal to the LSN for the last record written to the log.
  10. The Buffer Manager guarantees that the transaction log will be written before changes to the database are written (write-ahead logging).
  11. Buffer Manager also guarantees that log pages are written in a specific order making it clear which log blocks must be processed after a system failure regardless of when the failure occurred.
  12. Log records for a transaction are written to disk before the commit transaction is sent to the client process. Actual changed data might not have been written out to data pages yet.
  13. Though log writes are asynchronous, at commit time thread must wait for the write to complete to the point of writing the commit record in the transaction log. Data page writes are completely asynchronous.
  14. Logging involves "demarcating" the begin and end of each transaction with information about changes made to the data in between. At the end, its marked with a commit record.
  15. An aborted transaction undoes the changes made to the original data and these changes are written to the log and marked as "compensation log records".
  16. There are two types of recovery (goal for both is to make sure log and data are in sync):
    1. Restart Recovery – runs every time SQL Server is started. Runs on each database (in parallel) and the SQL Server error log tells you how many transactions were rolled forward and back. This is also called as Crash Recovery.
    2. Restore Recovery – also known as Media Recovery is run by request when a restore operation is requested via the backup route.
  17. Both the above types have to deal with 2 important situations:
    1. When transactions are recorded as committed in the log but not written to the data file.
    2. When changes to the data file don’t correspond to committed transactions. This can happen when a implicit indirect or explicit checkpoint forced this case.

Phases of Recovery

Now that I have set some context to the basic units of transactional logs and why they are used. We need to also understand the stages of recovery process.

  1. Phase 1: Analysis Phase – Forward pass starting at the last checkpoint record in the transaction log. Determines and constructs a Dirty Page Table (DPT) consisting of pages that might have been dirty when SQL Server stopped. Active Transactions Table is also built consisting of uncommitted transactions
  2. Phase 2: Redo Phase - Returns the database to the state it was in at the time SQL Server was stopped. Starting point for this forward pass is the start of the oldest uncommitted transaction. The minimum LSN in the DPT is the first page that is expected to be redone. But all logged operations starting at the oldest open transaction need to be redone for necessary locks to be acquired
  3. Phase 3: Undo Phase – Uses the list of Active Transactions found in Phase 1 to roll each transaction back individually. Any transaction that was not committed is undone
  4. Fast Recovery Phase – available in Enterprise and Developer edition allows for a database to be online as soon as redo phase is finished

Page LSNs and Recovery

  1. Every database page has a LSN in the page header that reflects the location in the transaction log of the last log entry that modified a row on this page.
  2. Also keeps track of the LSN which was on the data page before the change recorded by the latest log record.
  3. During redo, LSN of each log record is compared to the LSN of the data page that the log entry modified.
    1. If it’s equal to the previous page LSN in the log record, the operation indicated in the log entry is redone.
    2. If it’s equal or higher than the actual LSN for the log record, REDO operation is skipped.
  4. A transaction log cannot be truncated prior to the point of the earliest transaction that is still open, no matter how many checkpoints have occurred.

Virtual Log Files

  1. A transaction log for any database is maintained as a set of virtual log files (VLFs) who size is determined internally by SQL Server by the total size of all the log files and the growth increment used when enlarging the log.
  2. Log file is grown in units of entire VLFs and be shrunk only to a VLF boundary:
    1. When a log file is first created it always has between 2 and 16 VLFs.
    2. If its 1 MB or less – log file size is divided by minimum VLF size 31 * 8KB to determine number of VLFs.
    3. If its between 1 and 64 MB – log is split into 4 VLFs.
    4. If greater than 64 MB but less than 1 GB 8 VLFs are created.
    5. If greater than 1 GB, 16 VLFs are created.

      The above rules of thumb can be different depending on the version of SQL Server you are working.

  3. 4 states for a VLF:
    1. Active – active portion of the log begins at the minimum LSN representing an active transaction. The active portion of the log ends at the last LSN written
    2. Recoverable – portion of the log preceding the oldest active transaction is needed only to maintain a sequence of log backups for restoring the database to a former state
    3. Reusable – If transaction log backups are not being maintained or if log was already backed up, VLFs before the oldest active transaction are not needed and can be reused
    4. Unused – One or more VLFs at the physical end of the log files might not have been used yet if not enough log activity has taken place. Or if earlier VLFs have been marked as reusable and then reused
  4. Automatic Truncation – understand when the Log backups are not existent:
    1. You have configured the database to truncate a log on a regular basis by setting the recovery model to SIMPLE.
    2. You have never taken a full database backup.
    3. All log records prior to the oldest active transaction are invalidated and all VLFs not containing any part of the active log are marked as reusable. This process doesn’t shrink the physical file it is used in a cyclic manner.
  5. Automatic Shrinkage – A database is truncated to allow for shrinking. If autotruncate and autoshrink option is set, the log is physically shrunk at regular intervals. This is something not recommended, though this option exists.

Database Recovery Models

As we learn about transaction logs, we cannot miss understanding the recovery models and the effects of the same. There are 3 modes of recovery models:

  1. FULL Recovery Model
    1. Provides the least risk of losing work in case of a damaged data file.
    2. All operations are fully logged – in addition to logging every row changed via DML, the entire row is written to the transaction log using BCP or BULK INSERT operation.
    3. Also fully logs CREATE INDEX operations.
    4. Use this if you need to recover to a Point-in-time from your backup.
  2. BULK_LOGGED Recovery Model
    1. Allows you to restore a database completely in case of media failure and gives the best performance and least log space usage for certain bulk operation.
    2. Minimal logging compared to FULL recovery model.
    3. Minimal logging consists of logging information that is required to roll back the transaction without supporting point-in-time recovery.
    4. Every data file in SQL has a special page called Bulk Changed Map (BCM) page also called Minimally Logged Map (ML Map) page. Each bit on a BCM page represents an extent and if the bit is 1 it means that this extent has been changed by a minimally logged bulk operation since the last transaction log backup.
    5. BCM map is located on page 7 of a 4GB span and all the bits are reset to 0 every time a log backup occurs.
  3. SIMPLE Recovery Model
    1. Offers the simplest backup and restore strategy.
    2. Transaction log is truncated whenever a checkpoint occurs (which happens regularly).
    3. Only backups that can be made are ones that don’t require log backups.

Choosing the right recovery model is very important for functioning of your databases.

Backing Up and Restoring a Database

SQL Server has four main types of backups:

  1. Full Backup – Copies all pages from a database onto a backup device.
  2. Differential Backup – Copies only the extents that were changed since the last full backup was made. Done by examining the bits on the DCM pages for each data file in the database. When any page in an extent is changed, the corresponding bit in the DCM page is set to 1
  3. Log Backup – Copies all the log records that have been written to the transaction log since the last full or log backup was made
  4. File and Filegroup Backup – these backups are intended to increase flexibility in scheduling and media handling compared to full backups, in particular for very large databases. Useful when database contains varying update characteristics or priority levels

Thanks for reading this far. Hope the basics are clear now. Given these pointers, feel free to explore more into MSDN documentation.

Share this article

Tags: , , , , , , , ,

This entry was posted on Monday, November 12th, 2012 at 08:30 and is filed under Technology. You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback from your own site.


21 Responses to “SQL Server: Transaction Logs and Recovery”

  1. Prasanna says:

    Sir Nice Post…!
    One question is,Can we tune transaction logs?? Because every data modification made in SQL Server must be logged. Second, no transaction can complete until all of the related data modification are written to the transaction log.

    • Vinod Kumar says:

      Thanks Prasanna. When you ask for tuning is concerned it is the writes only. So it is recommended that we have it on a faster drive. And you are correct, the transaction is not complete till it is written to the transaction log. Writing to the data file can be done later.

  2. sybanva says:

    Vinod,
    Can you please also let me know about LSN?

    What is LSN and how SQL server generates unique LSN?

    • Vinod Kumar says:

      LSN’s have been explained above already. LSN is “Log Sequence Number” and it is internal of SQL Server to generate unique LSN – it is SQL Server’s mode of generating some identifier unique enough to mark and track all transactions. Thanks again for visiting the blog and dropping a line.

  3. sybanva says:

    Hi Deepak,

    I have one basic question, Does tran log in SQL server store the DML only?

    I heard with some one in case of select statement also , itstore the stats for select statement in tran log ( time consumed, database etc)?

    I have little doubt on this part as main motive of tran log is to make the prepare the database recovery and there is basically no role of select statement in Restart Recovery(or you can say for any kind of recovery).

    Let me put in this way,

    begin tran
    select * from test where id=5
    commit

    Will it store any thing in tran log ?

    Please clarify. Thanks,

    • Vinod Kumar says:

      You are right. The select statement doesn’t need to be logged because there is no data modifications. SQL will use locking mechanisms to give consistency and concurrency of data.

      Read about concurrency on the guest blog post: http://blog.sqlauthority.com/2012/11/15/sql-server-concurrency-basics-guest-post-by-vinod-kumar/ .. That will give you an idea what happens.

      The database recovery doesnt need to track on selects actually. The above example code will not write into logs in the default isolation level.

      Hope it makes it clear.

      • Vinod Kumar says:

        When in doubt, tryout the code and see.

        DBCC log (< >, 4) will give you the transaction log entries and you can check it for yourself.

        • sybanva says:

          I have tested and found nothing in tran log for select statement.

          But I have doubts on below steps:

          =======================================

          Monitor your DB log usage

          Issue some 10 SELECT statements (some long running queries that involve considerable IO cost )

          Monitor your DB log usage

          use dbcc log and its properties including flags.

          One more important thing: SELECT has three different uses in SQL Server. Every SELECT (or SELECT INTO or ASSIGNMENT of VALUES using SELECT) are also logged.

          OR

          Switch on Activity Monitor and verify the WRITES on to LOG file of your database

          ====================================

          Please let me know your thoughts, Thanks.

          • Vinod Kumar says:

            Activity monitor is not something I am suggesting. Use DBCC Log to check what is written – if you want to see.

            SELECT which is only selecting data – nothing is written based on my previous comments. Now with INTO clause you WILL have writes as it is writing into some structure. That log writes can be on permanent table or temp table.

  4. [...] fun when it comes to SQL Server and learning the basics again can be more fun. I did write about Transaction Logs and recovery over my blogs and the concept of simplifying the basics is a challenge. In the real world we always [...]

  5. Devi Prasad says:

    One more good contribution to community .. Enjoyed reading this post and have been sharing this link with all my colleagues since I read. Thumbs up for this article! Keep rolling and rocking..

  6. Gopalakrishnan Arthanarisamy says:

    Very good and 100% explanation of Transaction Log & Its Recovery Process. Good Job Vinod….Wish You Happy Deepavali.

    Gopalakrishann Arthanarisamy,
    Unisys, Bangalore.

  7. Manoj Pandey says:

    One stop shop for me to understand all about Transaction Logs… very informative. Good job Vinod :)

  8. Sandip pani says:

    Yes Vinod, One liners are easy to refer too and for me at least this is like reference to refresh the fundamentals about Transaction Log.

    Thank you Sir.

    • Vinod Kumar says:

      Glad to hear it is of use. Feel free to spread it around too. I talk about these to my customers almost every other day and thought it will be great to share :)/

  9. manasdash says:

    Probably the best one liner collection about transaction log & recovery. a must and nice read.

Leave a Reply



 

Email
Print