10 March 2007 14:45 tonyrogerson

How to create a corrupt database using BULK INSERT/ UPDATE and BCP - SQL Server as a HEX editor.

This entry shows you how to use a combination of BULK INSERT, UPDATE and BCP queryout to create a database corruption which you should then be using to test your database consistency checking processing in your environments.

How do you know that the process checking and reporting database consistency is working? You really need to have a corrupt database to fully test your procedures (that’s how I found they’d broken the database consistency check task in 2005 SP2).

I wasn’t sure what to write first a blog entry on how you use BULK INSERT to load raw data or this one so as the topic on checkdb is in the press at the moment so to speak I’ll go with this one.

This post is really specific to SQL Server 2005, but, with a little effort I could make this work for SQL Server 2000; unfortunately my inclination to do that is somewhat lacking with there being so much juicy stuff to talk about in 2005.

Onward; we need to create our test database and sample table, here is the script ->

--  Create the target test database

CREATE DATABASE TestCorruption ON  PRIMARY

( NAME = N'TestCorruption', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\TestCorruption.mdf' , SIZE = 5MB , MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB )

 LOG ON

( NAME = N'TestCorruption_log', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\TestCorruption_log.LDF' , SIZE = 560KB , MAXSIZE = 2048GB , FILEGROWTH = 10%)

go

ALTER DATABASE TestCorruption SET RECOVERY SIMPLE

GO

ALTER DATABASE TestCorruption SET PAGE_VERIFY CHECKSUM

GO

 

 

USE TestCorruption

go

 

--  Create a test table that we will corrupt later

CREATE TABLE testtable (

    mydata1  char(17)    not null,

    id  int not null,

    mydata2  char(10)    not null,

    mydata3  char(10)    not null,   

    mydata4  char(6000)  not null   --  Forces one row per page

)

go

 

--  Enter some 'data pages'

INSERT testtable ( mydata1, id, mydata2, mydata3, mydata4 ) VALUES( 'HERE IS YOUR DATA', 99999999, 'SEPERATOR', 'TONYR', 'FILLER' )

INSERT testtable ( mydata1, id, mydata2, mydata3, mydata4 ) VALUES( 'FRY A PAGE', 99999999, 'SEPERATOR', 'TONYR', 'FILLER' )

INSERT testtable ( mydata1, id, mydata2, mydata3, mydata4 ) VALUES( 'LEAVE A GOOD ONE', 99999999, 'SEPERATOR', 'TONYR', 'FILLER' )

go

 

--  Create an index that allows you to find the data easily within the page structure, hence mydata1 + mydata3

--  you will see HEREISYOURDATATONYR in the page data later

CREATE INDEX idx ON testtable( mydata1, mydata3 )

go

 

CHECKPOINT      --  Writes dirty pages out to the MDF

GO

 

So we now have a test database that shouldn’t have any corruption in it; we now need to use BULK INSERT to get the data back into SQL Server so we can mess about with the page structures.

We are using a completely separate database to hold the loaded data; the table dbpages contains two columns – int (will hold the page number and uses the IDENTITY property) and page_data which is varbinary(max) – note you cannot define binary(8192) which is what we are really after (hey ho).

--  Detach the database so it can be bulk inserted

USE master

GO

sp_detach_db TestCorruption

GO

 

--  Create a seperate database to hold imported data structures

CREATE DATABASE TestCorruption_Load

go

USE TestCorruption_Load

go

DROP TABLE dbpages

go

CREATE TABLE dbpages (

    page_number int not null identity primary key clustered,

    page_data varbinary(max) not null

)

GO

 

Now, we can read the MDF file for our target database, at the end of the day its just data – 8192 bytes per row to be loaded, we create a format file as below...

9.0

1

1     SQLBINARY   0 8192 "" 2 page_data ""

 

Note, page_data is the second column in the table hence we map that to column (2) in the format file. We are not loading anything into the page_number column – that is populated via the IDENTITY property.

BULK INSERT dbpages

    FROM 'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\TestCorruption.mdf'

    WITH (

        BATCHSIZE = 1000,

        CODEPAGE = 'raw',

        DATAFILETYPE = 'native',

        ORDER( page_number ),

        TABLOCK,

        FORMATFILE = 'c:\temp\TestCorruption_page.fmt'

    )

 

Crazy J I’ve loaded a database into a database Lol. The fun now begins; because the data is now in SQL Server we can query it! Let’s try and find some data to corrupt.

Corruption 1 – Non Clustered leaf value not same as data page.

The first corruption is to break a non clustered index by changing the value it holds so it’s different from the data page.

SELECT page_number, CHARINDEX( 'HERE IS YOUR DATA', page_data ), SUBSTRING( CAST( page_data as varchar(max) ), CHARINDEX( 'HERE IS YOUR DATA', page_data ), 100 )

FROM dbpages

WHERE page_data LIKE '%HERE IS YOUR DATA%'

 

You get three rows returned – one is the data page, one is the leaf level of the index and the third (I believe) is one of nodes of the index (probably the root page).

Now modify the index row to force a corruption –

UPDATE dbpages

    SET page_data = CAST( STUFF( page_data, CHARINDEX( 'HERE IS YOUR DATA', page_data ) - 1, 27, CAST( 'THISDATAISRUBBISH' AS varbinary(max) ) ) as varbinary( max ) )

FROM dbpages

WHERE page_data LIKE '%HERE IS YOUR DATATONYR%'

Here I am changing the index value from the column mydata1 from HERE IS YOUR DATA to THISDATAISRUBBISH thereby creating a corruption between the index data and the data page itself.

In a real world system this means that if your query uses this index then the row you are trying to look up e.g. HERE IS YOUR DATA will not be found because the index does not have that value in it, instead it’s corrupt. Now, if you are using SQL Server 2005 and you have the CHECKSUM database property set and the page does have a CHECKSUM on it then you will get a consistency failure and your query will fail, otherwise (and this will always be the case in SQL 2000) you won’t get an error just no row returned (ouch!).

Corruption 2 – Database Page is completely fried

Find our data page that we are going to trash...

SELECT page_number, CHARINDEX( 'FRY A PAGE', page_data ), SUBSTRING( CAST( page_data as varchar(max) ), CHARINDEX( 'FRY A PAGE', page_data ), 100 )

FROM dbpages

WHERE page_data LIKE '%FRY A PAGE%'

  AND page_data NOT LIKE '%THISDATAISRUBBISH%'  --  NOT THE INDEX PAGE

  AND page_data NOT LIKE '%LEAVE A GOOD ONE%'   --  NOT THE INDEX PAGE

 

Now update it by setting the entire 8KB to ASCII 0.

UPDATE dbpages

    SET page_data = CAST( cast( REPLICATE( CHAR(0), 8000 ) as varchar(max) ) + REPLICATE( CHAR(0), 192 ) as varbinary( max ) )

FROM dbpages

WHERE page_data LIKE '%FRY A PAGE%'

  AND page_data NOT LIKE '%THISDATAISRUBBISH%'  --  NOT THE INDEX PAGE

  AND page_data NOT LIKE '%LEAVE A GOOD ONE%'   --  NOT THE INDEX PAGE

GO

We can’t do REPLICATE and 8192 because the max is 8000 bytes hence our need to do it twice and cast the first to varchar(max).

Corruption complete, we now need to get the data back out of SQL and attach it as a database.

Exporting the data pages

We need to use a different format file and also use the queryout option on the BCP; the format file cannot be used to exclude columns (like it can for importing data).

Here is the format file to use...

9.0

1

1     SQLIMAGE    0 8192 "" 1 page_data ""

 

Note the use of SQLIMAGE, you cannot use SQLBINARY because its limited in size.

To actually export the data use this (which is a DOS executable and should be run through the DOS CMD.EXE, make sure the format file is placed in c:\temp and has the same name you are using on the BCP statement, also - if you are doing this through xp_cmdshell make sure the file is on c:\temp on the SQL Server)...

bcp "select page_data from TestCorruption_load..dbpages order by page_number" queryout "c:\program files\microsoft sql server\mssql.1\mssql\data\TestCorruption.mdf" -S -T -f c:\temp\TestCorruption_page_out.fmt

Some things to note here;

"select page_data from TestCorruption_load..dbpages order by page_number"

This is the query, it outputs only the page data but importantly you order by the page number (that’s the value calculated by using IDENTITY).

queryout

Tells BCP that you are using a query as output.

Now we have the data back out of SQL Server we can create a new database from that MDF file – note, sp_attach_db is being deprecated so you should use the CREATE DATABASE instead...

USE master

 

CREATE DATABASE CheckDBPractice ON

    (FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\TestCorruption.mdf')

FOR ATTACH;

GO

A log file will be created for us; using the old LDF is meaningless in this instance.

We can now do a check database on our newly corrupted database J

DBCC CHECKDB( CheckDBPractice )

Go

Msg 8909, Level 16, State 1, Line 1

Table error: Object ID 0, index ID -1, partition ID 0, alloc unit ID 0 (type Unknown), page ID (1:89) contains an incorrect page ID in its page header. The PageId in the page header = (0:0).

CHECKDB found 0 allocation errors and 1 consistency errors not associated with any single object.

Msg 8928, Level 16, State 1, Line 1

Object ID 2073058421, index ID 0, partition ID 72057594038321152, alloc unit ID 72057594043301888 (type In-row data): Page (1:89) could not be processed.  See other errors for details.

Msg 8928, Level 16, State 1, Line 1

Object ID 2073058421, index ID 2, partition ID 72057594038386688, alloc unit ID 72057594043367424 (type In-row data): Page (1:114) could not be processed.  See other errors for details.

Msg 8939, Level 16, State 98, Line 1

Table error: Object ID 2073058421, index ID 2, partition ID 72057594038386688, alloc unit ID 72057594043367424 (type In-row data), page (1:114). Test (IS_OFF (BUF_IOERR, pBUF->bstat)) failed. Values are 12716041 and -4.

CHECKDB found 0 allocation errors and 3 consistency errors in table 'testtable' (object ID 2073058421).

CHECKDB found 0 allocation errors and 4 consistency errors in database 'CheckDBPractice'.

repair_allow_data_loss is the minimum repair level for the errors found by DBCC CHECKDB (CheckDBPractice).

 

Unfortunately I seem to have stumbled across a problem with CHECKDB, my corruption causing CHECKDB to have an access violation so if you check in the ERRORLOG you’ll see a stack dump. This is a good thing from our perspective because it means you are getting more than just an error out of CHECKDB!

Now you have a corrupted database you can test your routines, just plonk this database into your live environment and when your consistency checker runs you should get notified or whatever your process does.

 Oh and if you try and use that broken index you’ll get the checksum changed or torn page error ->

SELECT *

FROM CheckDBPractice..testtable WITH ( INDEX = idx )

WHERE mydata1 = 'HERE IS YOUR DATA'

 

If PAGE_VERIFY is CHECKSUM option is enabled you get this...

mydata1           id          mydata2    mydata3    mydata4

----------------- ----------- ---------- ---------- ------------------------------------------Msg 824, Level 24, State 2, Line 1

SQL Server detected a logical consistency-based I/O error: incorrect checksum (expected: 0xe5582473; actual: 0x1a5f4505). It occurred during a read of page (1:114) in database ID 24 at offset 0x000000000e4000 in file 'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\TestCorruption.mdf'.  Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.


If PAGE_VERIFY is TORN_PAGE_DETECTION

mydata1           id          mydata2    mydata3    mydata4

----------------- ----------- ---------- ---------- ------------------------------------------Msg 824, Level 24, State 2, Line 1

SQL Server detected a logical consistency-based I/O error: torn page (expected signature: 0x55555555; actual signature: 0x12c048d5). It occurred during a read of page (1:114) in database ID 24 at offset 0x000000000e4000 in file 'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\TestCorruption.mdf'.  Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

If PAGE_VERIFY is NONE

mydata1           id          mydata2    mydata3    mydata4

----------------- ----------- ---------- ---------- ------------------------------------------Msg 5242, Level 22, State 1, Line 1

An inconsistency was detected during an internal operation in database 'CheckDBPractice'(ID:24) on page (1:114). Please contact technical support. Reference number 6.

On SQL Server 2000 you’d most likely not get the error message, instead you’ll just not get any data back – this is a major engine enhancement in SQL Server 2005.

I really want to expand on the corruptions, for instance breaking the page chain but alas time does not permit....

Anyway here are some references –

https://blogs.msdn.com/sqlserverstorageengine/archive/tags/CHECKDB+Series/default.aspx

https://blogs.msdn.com/sqlserverstorageengine/archive/tags/On-Disk+Structures/default.aspx

Filed under:

Comments

# Easy way to create a corrupt database for testing

22 March 2007 19:45 by SQL Server Storage Engine

In previous posts I've explained how to use a hex editor to corrupt a database for testing and exploration.

# SQL Server をbinary editor として使用する(MLメモ)

26 April 2007 21:49 by 米田 Blog ( SQL Server MEMO )

SQL Server をbinary editor として使用する(MLメモ)

# How to corrupt a page in a specific index and then fix it

07 October 2007 10:25 by Christian Bolton's SQL Server Blog

I was developing course materials for a workshop recently and I wanted to run a lab where you DBCC’d

# SQL Server Database Corruption - a nightmare for a DBA, what are you chances to survive?

I bet the subjet line is not an easy one to take on for any DBA to think about the survival chances in

# Twitter Trackbacks for How to create a corrupt database using BULK INSERT/ UPDATE and BCP - SQL Server as a HEX editor. - Tony Rogerson's ramblings on SQL Server [sqlblogcasts.com] on Topsy.com

Pingback from  Twitter Trackbacks for                 How to create a corrupt database using BULK INSERT/ UPDATE and BCP - SQL Server as a HEX editor. - Tony Rogerson's ramblings on SQL Server         [sqlblogcasts.com]        on Topsy.com

# how to corrupt a table in SQL Server? - Programmers Goodies

Pingback from  how to corrupt a table in SQL Server? - Programmers Goodies