You will have no doubt seen or heard branded about that its good to use multiple files for a file group but you’ve always wondered why?
The current recommendation from Microsoft is that you have a file per logical CPU for tempdb so as to balance load not only for object creation but for extent allocation and performance.
But what does SQL Server do when you use multiple files? SQL Server round robins the rows being inserted so the rows are distributed across the multiple files.
Consider this first database, notice the files have the same initial size and autogrowth-
USE [master]
GO
CREATE DATABASE [testfile2] ON PRIMARY
( NAME = N'testfile2', FILENAME = N'E:\testfile2.mdf' , SIZE = 5MB , MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB ),
FILEGROUP [FileTest]
( NAME = N'testfile21', FILENAME = N'E:\testfile21.ndf' , SIZE = 1MB , MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB ),
( NAME = N'testfile22', FILENAME = N'E:\testfile22.ndf' , SIZE = 1MB , MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB )
LOG ON
( NAME = N'testfile2_log', FILENAME = N'E:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\testfile2_log.LDF' , SIZE = 18560KB , MAXSIZE = 2048GB , FILEGROWTH = 10%)
COLLATE Latin1_General_CI_AS
GO
EXEC dbo.sp_dbcmptlevel @dbname=N'testfile2', @new_cmptlevel=90
go
ALTER DATABASE testfile2
MODIFY FILEGROUP FileTest DEFAULT;
We can now do a test to see how the file grows when we do an INSERT….
select * from ::fn_virtualfilestats(db_id(), NULL) where FileId > 2
go
select *
into SalesOrderDetail
from AdventureWorks.Sales.SalesOrderDetail
go
select * from ::fn_virtualfilestats(db_id(), NULL) where FileId > 2
go
I am using ::fn_virtualfilestats( <dbid>, <fileid> ) to capture the statistics against the file, thanks to Linchi Shea for pointing the best way to do that out.
On my test box measurements where
Before population
fileId BytesOnDisk
3 1048576
4 1048576
After population
fileId BytesOnDisk
3 6291456
4 6291456
We can see by the autogrowth that SQL Server has round robin’d between the two files equally because they had an equal amount of freespace in them.
Lets now look at a situation where the two files have different amounts of freespace, a situation that can result from a) autogrowth different, b) file added to the file group at different times.
Drop the original database and run this…
USE [master]
GO
CREATE DATABASE [testfile2] ON PRIMARY
( NAME = N'testfile2', FILENAME = N'E:\testfile2.mdf' , SIZE = 5MB , MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB ),
FILEGROUP [FileTest]
( NAME = N'testfile21', FILENAME = N'E:\testfile21.ndf' , SIZE = 10MB , MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB ),
( NAME = N'testfile22', FILENAME = N'E:\testfile22.ndf' , SIZE = 1MB , MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB )
LOG ON
( NAME = N'testfile2_log', FILENAME = N'E:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\testfile2_log.LDF' , SIZE = 18560KB , MAXSIZE = 2048GB , FILEGROWTH = 10%)
COLLATE Latin1_General_CI_AS
GO
EXEC dbo.sp_dbcmptlevel @dbname=N'testfile2', @new_cmptlevel=90
go
ALTER DATABASE testfile2
MODIFY FILEGROUP FileTest DEFAULT;
Notice we are still creating 2 files, but this time one starts with 10MB free and one with 1MB free – autogrowths are the same.
Re-running the select and insert gives this result…
Before population
fileId BytesOnDisk
3 1048576
4 10485760
After population
fileId BytesOnDisk
3 3145728
4 13631488
More data has been written to FileId 4 which is the one with more freespace.
So, the moral of the story – if you are going to use multiple files then be aware of how the round robin insert stuff works!
One last thing, another benefit of using multiple files is that each file has its own NT thread so its good when you have multiple logical CPU’s which is pretty much the norm now-a-days.
There are lots of other things I want to show – effects of shrinking, defraging etc… that will need to be another entry!
One last thing, the unit is rows, a single row can't span files - I think its done at the page level but I'm not sure - if anybody knows for sure perhaps they'd comment, otherwise I'll get back in the not too distant future with the answer....