New version, Decode and Find, and documentation

I’ve just release a new version of Internals Viewer available here.

This release has a few bug fixes in it and also the Decode and Find tool window has been added, which I’ll explain below.

I’ve also created some documentation on the CodePlex site. It covers the basic functionality of the add-in as is available here.

image

The Decode and Find window was available in Version 1 of Internals Viewer. It allows a value to be encoded to a particular type and then searched for in the Page Viewer.

To open it click the icon on the Page Viewer toolbar button:

image

It will also give a breakdown of how certain types are structured, for example smalldatetime:

image

In this version it’s an SSMS tool window, so it can be docked or tabbed, just like the Find and Replace window.

image

In the latest release the Decode and Find window will automatically link to the active Page Viewer window, and can also be used if no Page Viewer is active.

Internals Viewer Update – Latest Build 20081102.1

I’ve just uploaded the latest build of Internals Viewer.

Now it’s got the following features:

  • Integration with SSMS (SQL Server Management Studio) 2005 and 2008
    • The application is installed as a SSMS add-in
    • Internals information integrated into the Object Explorer
    • Transaction Log viewer integrated into the Query Results
  • Allocation Map
    • Displays the physical layout of tables and indexes
    • Displays PFS status
    • Overlay pages in the Buffer Pool
  • Page Viewer
    • Displays Data Pages including forwarding records and sparse columns
    • Displays allocation pages (IAM, GAM, SGAM, DCM, and BCM pages)
    • LOB data types and row overflow decoding

(New to this release)

If you’re interested in how any of it works all of the source code is available from CodePlex.

Also, if you liked the SSMS Allocation Add-in please vote for it as part of the SQL Heroes competition here.

Internals Viewer…the next version

I’ve been working on the next version of SQL Internals Viewer (now called Internals Viewer).

There are a couple of things that are new with it. First it’s a SQL Server Management Studio add-in, and second it will be open source on CodePlex.

At the moments it’s far from finished, so current builds are an early preview.

The project page is here:

http://www.codeplex.com/InternalsViewer

The latest build can be downloaded here:

http://www.codeplex.com/InternalsViewer/Release/ProjectReleases.aspx

SSMS Integration

The next version will be fully integrated with SSMS.

Extra internals information is added to the Object Explorer:

image

The Transaction Log Viewer is now available as a tab the Query Results pane (to get it working click on the Internals Viewer – Display Transaction Log menu item)

image

For the time being SQL Internals Viewer 1.0 will be the only standalone version. Once the add-in is developed I’ll create a new standalone version that will share the same code.

Allocation Add-in – now with fragmentation information!

The latest version of the SSMS Allocation Add-In now includes fragmentation information for each table:

fragmentation

0% – 10% is green, 11% – 20% is orange, and 20%+ is red.

The release also includes a much quicker routine for creating the allocation map as the previous version was very slow for VLDBs.

It can be downloaded here.

Allocation Add-in Version 0.2

Version 0.2 of the Allocation Add-in has just been released.

It’s available from CodePlex here.

This new version includes support for SQL Server 2008 RTM and has a couple of new features.

allocationinfo_fitscan

The Allocation Map can now be set so that the entire file is rendered to fit the screen without the need for scrolling. This is the default view. Fit map is rendered asynchronously and is also cached.

 newbuttons

There’s also the option to toggle on and off the table and allocation map.

Allocation Information Add-in

First of all, my apologies. I’ve been busy with work and other stuff so I haven’t had much spare time to update this blog or SQL Internals Viewer.

One thing I have been working on is a new free add-in for SQL Server Management Studio called the Allocation SQL Server Management Studio Add-in (maybe it needs a better name...)

Recently I’ve been working on refactoring a large database. I wanted a quick way of seeing in which tables the majority of data was stored in, but it’s not that easy without using sp_spaceused or querying DMVs.

The add-in makes space allocation easy to see by creating a new screen in SSMS (both 2005 and 2008) that has a table and displays things like table size and row numbers, and also the percentage of the database the table uses.

allocinfo

As you can see from the screenshot it includes an Allocation Map similar to SQL Internals Viewer. I see the possible* future of the add-in as a version of the Internals Viewer that is more geared towards DBA use. I’m going to add in fragmentation information soon.

The add-in is my entry to the SQL Heroes competition. I’ve created the project in Codeplex (http://www.codeplex.com/) so you can view and download the source code.

I’m warming to Codeplex, and releasing the source of SQL Internals Viewer is something I’ve had in mind for a while, so it’ll probably go in there too.

Release

The release 0.1 of the add-in can be downloaded here

 

*I use possible as the add-in could also turn into Internals Viewer 2. Which would you prefer – standalone as it is now or integrated with SSMS?

Page compression - internals and examples

First of all, apologies for the delay in writing this. I meant to blog about page compression as soon as I added it in to Internals Viewer but I didn’t get round to it.

Going over it again for this blog post and testing things out with RC0 I found a few bugs, so along with this comes a new release, version 1.02. It can be downloaded here.

Syntax:  ALTER TABLE table REBUILD WITH (DATA_COMPRESSION = PAGE)

Page compression uses the following techniques to save space:

  • Row Compression (covered here)
  • Prefix compression
  • Dictionary compression

Applying Page compression to a table or index doesn’t necessarily mean that all of the techniques are used. Row compression will always be used if page compression is used, but prefix and dictionary compression are only used if SQL Server determines that space can be saved by using them.

For this post I’ve created an example script that can be used with SQL Server 2008 (RC0) and SQL Internals Viewer. The script will create and populate two tables, one called FirstNames_PrefixOnly and one called FirstNames_PrefixAndDictionary. It can be downloaded here.

Both page compression techniques work by cutting down on repetition in the page. Prefix compression identifies common prefixes on a per-column basis. SQL Server identifies a common prefix and then it assumes that all values start with that prefix. Records only need to store the differences from the prefix.

Here’s an example. All of the following names start with Antoni.  The prefix is stored as Antoni so only the suffixes need to be stored, saving space.

Antonia
Antoni
e
Antoni
etta
Antoni
na
Antonio

Dictionary compression is applied after prefix compression and identifies common patterns in the data. If a particular pattern occurs more that once space can be saved by adding the pattern to the dictionary and in its place adding a reference back to the dictionary entry. Prefixes are specific to each column, but the dictionary can be used across columns.

Following on from the example of first names beginning with Antoni, there may be repetition in the table:

Antonietta
Antonietta
Antonietta
Antonietta
Antonina
Antonina
Antonina
Antonina

Prefix compression (in red) has already identified common prefixes. Dictionary compression indentifies the common pattern (etta and na), adds them to the dictionary and replaces them with a reference to the entry.

So this...

Data:       Antonietta
            Antonietta
            Antonina
            Antonina

Becomes this...

Prefix:      Antoni
Dictionary:  [0]etta, [1]na
Data:        [0]
             [0]
             [1]
             [1]

There’s more information on this on the Storage Engine Blog and also in Books Online.

Compression Info structure

If a table or index uses Page compression and it uses Prefix or Dictionary compression it will have something called a Compression Info (CI) structure just after the page header starting at byte 96.

The CI structure contains:

  • A header describing the CI structure and what it contains
  • The Anchor Record (for Prefix compression)
  • The Dictionary (if used)

In SQL Internals Viewer if a page has a CI structure an additional set of options will appear underneath the Offset table. Selecting one of the three items will colourise and decode it in the Page Viewer.


 
Header

The CI header is made of the following parts:

  • Status Bits – 1 byte
    • If bit 1 is 1 the CI has an anchor record
    • If bit 2 is 1 the CI has a dictionary
  • Page Mod Count – 2 byte short (I’m not sure what this does)
  • Length – Length of the anchor record (Only present if there is a dictionary) – 2 byte short
  • Size – Size of the CI structure – 2 byte short

Prefix Compression - Anchor Record

The anchor record is a record that uses the new row compression record format. The anchor record defines the prefix, if it exists, for each column.

For more on the new row compression record format see the previous post about row compression.

The screenshot above shows the anchor record for the first page of the example FirstNames_PrefixOnly table.

0x416E6E6162656C6C61 decodes to Annabella – this is the prefix for column 1.

 
 
In the example table, NameId 461 is Annemarie. The first name field is made up of two things. The first is the data offset. This is 1 or 2 bytes (depending on if the first bit is set) that determines at what offset the subsequent data starts. 03 decodes to 3 and 65 6D 61 72 69 65 is emarie. This means the prefix is used for the first three bytes, and then the rest of the data is taken from the field.

Annabella (Prefix from the anchor record)
123
emarie (Data from the row)
Annemarie

The record for Annabella (NameId 441) doesn’t use any space and the prefix provides all of the data:


 
When the prefix isn’t used at all the data offset is set to 0, which means the entire anchor prefix is ignored:

 
Dictionary Compression

The Dictionary is stored in the Compression Info structure.


 
The dictionary structure is as follows:

  • Entry count – 2 bytes
    • This defines the number of entries in the dictionary
  • Dictionary entry offset array – 2 * Entry count bytes
    • They array defines the end offset of each dictionary entry in a very similar way to the column offset array in a standard record with variable length fields.
  • Dictionary entries
    • Defined by the offset array

The example script creates a table called FirstNames_PrefixAndDictionary. The table has repetition in it which makes it suitable for dictionary compression.

How does a field refer back to the dictionary? The row structure discussed in the previous row compression post has a CD Array that contains information about each column in a record stored as 4-bit integers, two to a byte.  Row compression covers values 0-10 for the CD array, above 10 and is used to mark that the value in the field is a symbol, a reference to the dictionary item. The size of the symbol is determined by the CD array value minus 11.

Here’s an example:

This record has two columns, so the CD Array is one byte (0xC3) that is split out into 3 (2 bytes short) and 12 (1 byte symbol). The 1 byte symbol value is 4 so we need to replace this with the data in dictionary entry 4.

Because this column has a prefix the first byte is the offset of the data, and the subsequent bytes need to be appended at the offset. This column has a prefix of 41 64 65 6C 61 69 64 61  (Adelaida).

Dictionary 4 has the value 04 65 which means 0x65 (e) is appended to the prefix at byte 4, this is 41 64 65 6C 65. This decodes to Adele. It's as simple as that!

SQL Internals Viewer - New version with sparse column support

I've just released a new version of SQL Internals Viewer that has support for 2008 sparse columns, a feature introduced in SQL Server 2008 CTP6.

There are also a few bug fixes and minor changes.

It's available to download from http://www.sqlinternalsviewer.com/

Thanks to Kalen for the help with the sparse vector complex header info.

SQL Internals Viewer 1.0 Released

I’m pleased to announce that SQL Internals Viewer 1.0 has been released. It can be downloaded from www.sqlinternalsviewer.com.

I've also put up the first part of a user guide that covers the main window and Allocation Map. The second part will follow shortly which will cover the Page Viewer. The user guide is available here.

If you've got an existing version installed it will need to be removed through Add/Remove Programs or Programs and Features in Control Panel.

New features

Encode and Find is a new feature in the Page Viewer that allows you to encode a value to a particular data type and then search for it in the page. It can be accessed using the Page – Encode and Find menu item or the button on the toolbar.

Encode and Find

There’s also a new feature on the hex viewer so that once you’ve found the data you are looking for you can select the record that it is contained in. This can be done by right clicking on the byte and selecting Select Record.

SQL Server 2008 Page and Row Compression

The Page Viewer can display the new SQL Server 2008 Page and Row compression row structures, including the CI (Compression Information) structure.

2008 Compression

At the moment the application only supports data pages.

Key

There is a new improved version of the Key for the allocation map.

Key

Clicking on an item on the Key will highlight it on the Allocation Map and fade the other items. Clicking on it again will clear the select.

Improvements

There have been several bug fixes and performance fixes, including improvements to the load times for databases.

Clicking on the Allocation Map will open the page in the current Page Viewer. To open the page in a new Page Viewer hold down the shift button and click on the page. There are more details of the new changes in the User Guide.

SQL Server 2008 Support

There is still work to be done to get Internals Viewer working will all of the latest features of SQL Server 2008, including Page and Row compression on indexes and sparse columns. It’s something I’m working on at the moment and I hope to release in the next few months.

Row compression – internal structure

The CTP6 release of SQL Server 2008 includes row and page compression. It’s a feature that will only be in the Developer and Enterprise edition of SQL Server, so I wasn’t sure whether I should (or could!) add it in to SQL Internals Viewer. I had a look into it and thought it would be worth putting it in as it’s one of those things where it’ll be very useful to understand how it works.

So far I’ve only got row compression covered in SQL Internals Viewer, but I thought I’d better get down what I’ve found out so far.

There’s more on the new SQL Server 2008 compression on the SQL Server Storage Engine Blog here.

Row compression has a completely different row format. The size of a field is determined by the minimum amount of space needed to store it.  The Books Online topic ‘Row Compression Implementation’ has a good run down of how space is saved with different data types.

It’s very easy to add row compression to a table. The syntax is:

ALTER TABLE table REBUILD WITH (DATA_COMPRESSION = ROW)

The standard row structure is covered in Kalen Delaney’s Inside Microsoft SQL Server 2005: The Storage Engine, and there’s also an overview here (Storage Engine Blog).

Compressed row structure

(There may be mistakes in this as it’s currently undocumented, please let me know if anything needs correcting)

Compressed row example

Status Bits A
1st byte

This looks the same as a normal row, although there may be differences.

Number of columns 1 or 2 byte integer
2nd/2nd-3rd byte

This is the first instance where space can be saved. If the number of columns can fit into one byte (0-254) one byte will be used, if not two bytes are used. If the first (high-order) bit is 1 on the first byte this indicates a second byte is used.

CD Array
Next (Number of columns/2) + (Number of columns%2) bytes

I’m guessing CD stands for Column Description or Compression Description. It’s an array of 4-bit (nibble) integers, stored 2 per byte.  Every column in the row has a CD Array entry that determines if it is null, empty, stored ‘short’ (and if so the size) or stored ‘long’.
Short and long are the equivalents to the difference between fixed and variable length storage in the standard row format. Short CD Array entries represent fixed length storage (defined by the CD Array entry), but they use the optimal amount or storage. Long fields are similar to variable length fields, they have an entry in a row offset array, and these too use optimized storage.

Possible values for the CD array are:

0 – Null
1 – Empty
2 – 1 byte short
3 – 2 bytes short
4 – 3 bytes short
5 – 4 bytes short
6 – 5 bytes short
7 – 6 bytes short
8 – 7 bytes short
9 – 8 bytes short
10 - Long

If a field is a BIT data type the value of the CD Array is used as the value.

Row compression essentially turns every compressible field into a variable length field. It seems that the distinction between long and short columns is used so the extra overhead (column offset array entry) is only used when necessary. Below 9 bytes the CD array can be used to store the length. Above 8 bytes and an extra two bytes are used for the offset array entry.

Short Column Data
Next ∑ (short bytes in CD Array)

Unknown
?

Number of variable length columns
Next 2 bytes

Column offset array
Next (2 * Number of variable length columns) bytes

Each 2-byte integer defines the end offset of the variable length field

Long Column Data
The long/variable length fields with the offsets defined in the offset array.

Here’s an example in Internals Viewer:

No compression:

Row with no compression

With compression:

Row with compression

This only covers data records. I've still got to look into indexes and after that page compression (which when used also uses row compression). I'll also try to blog on how the data is actually stored and how to decode it.

Hopefully everything will be covered in version 1.0 of SQL Internals Viewer.

More Posts Next page »