There are some things in the life that are so implicit and taken for granted that nobody ever bothers to tell you about them or write up any documentation. This is how I feel the implicit support for date and time measures must have come about in Microsoft SQL Server 2005 Analysis Services (SSAS). Certainly I could not find any documentation about the topic and yet intriguingly both Measures and Calculated Members have "Simple Date" and "Simple Time" in their drop down list of possible text formats.
Clearly SSAS will allow you to add attributes to a dimension that have the data type DateTime. However, SSAS does not allow you to create a measure based on DateTime data type. Now this is a bit of a drawback when you want to calculate the average time a process took, which is exactly what I needed to do. After searching the internet, I was none the wiser as to how to solve the problem so some experimentation was in order.
An obvious approach would be to create a measure which contains the elapse time for the process in minutes. This would certainly give the right answer when averaged. However, presenting the resulting number in a format that was meaningful to the end-user is beyond standard out-of-the-box functionality. Especially when the elapse time needs to be presented in financial speak such as T+1 15:30 (i.e. the next day at 3:30pm). And since the end-user would be browsing the cube with Excel 2007, standard out-of-the-box functionality is all I could use.
My data was pretty simple:
CREATE TABLE dbo.WhenEventHappened(
SurrogateKey int NOT NULL,
MeasurementTypeID int NOT NULL,
ReferenceDate datetime NOT NULL,
DateTimeEventHappened datetime NOT NULL
) ON [PRIMARY]
Where ReferenceDate holds the date the event is relative to and DateTimeEventHappened clearly holds the date & time when the event took place. So for example, if we use the financial services analogy the trade took place on the ReferenceDate and it was settled on DateTimeEventHappened. So the elapse time is (DateTimeEventHappened – ReferenceDate) in days, hours and minutes. In my data, ReferenceDate is always a date with no time element.
The SQL Server DateTime data type
You probably know that behind the scenes SQL Server holds dates as a decimal number. The whole number contains the number of days since 1900 and the fraction represents the date. For example, 39415.5 represents 2007-12-01 12:00:00
So, if
ReferenceDate= 2007-12-01 00:00:00
and
DateTimeEventHappened=2007-12-01 12:00:00
then
(DateTimeEventHappened – ReferenceDate)= 1900-01-01 12:00:00 which is held as 0.5
OK so this gives me the elapse time, but now do I get it into the cube? Well the answer is simple. Convert the DateTime into a decimal number. To do this you need to apply the relevant cast/convert such as:
CONVERT(float, (A.DateTimeEventHappened - A.ReferenceDate)) AS EventDateDiffDecimal
To import this data into the cube, I created a view that performed the relevant cast/converts.

Now, for illustration of what happens between SQL and SSAS, I have added some redundant columns to my view just to make it obvious what is really going on here. So here is my data, including the extra ReferenceDateDecimal and DateTimeEventHappenedDecimal columns. I have also created some simple data which is shown below:

Next I simply added this new view to the cube's Data Source View and added the EventTimeDecimal and EventDateDiffDecimal plus the extra columns to my example cube as a new measure group which automatically sets the AggregateFunction to Sum. So here is the result:

Now apply magic!
OK, so the above is not so useful, so let's apply some magic. Simply set the FormatString and AggregateFunction for each measure as follows:
|
Column |
Format String |
AggregateFunction |
|
Reference Date Decimal |
dd-mmm-yyyy |
AverageOfChildren |
|
Date Time Event Happened Decimal |
dd-mmm-yyyy hh:mm |
AverageOfChildren |
|
Event Time Decimal |
hh:mm |
AverageOfChildren |
|
Event Date Diff Decimal |
dd-mmm-yyyy hh:mm |
AverageOfChildren |
So after re-deploying the cube, we get:
Oh look, wrong dates!
Well we have dates, but not the correct ones! It seems that date zero in SQL Server is 01-Jan-1990 and date zero in SSAS is 30-Dec-1899! To fix the problem, we simply add two to the values provided by our EventElapsePeriod view. After processing the cube we now get the correct dates:

However, I want the EventDateDiffDecimal to be presented in the format T+1 15:30. So I need to subtract one from the EventDateDiffDecimal value so that I get 1-Jan as my date. Then I can format the value, ignoring the month and year using the FormatString "\T\+d hh:mm". Now purist among you may point out that this strategy will not work when the date gets bigger than 31 as it will flick over T+1 again. This is true, but I have defined EventDateDiffDecimal as a semi-additive measure so I will not hit this problem.
So here is the final formatted data:

Note how the Grand Total is correctly calculated. Just magic!
If you would like to try this for yourself, I have published the T-SQL scripts and cube for download here. Note that the ZIP file also contains the SQL script to re-create the sample table and view used in this article.
Good Luck!
Excel spreadsheets with complicated layouts really do not lend themselves to processing with the standard Microsoft OLE DB drivers. Indeed, extracting one or two cells from different worksheets can be positively painful! This is where the power and flexibility of Aspose.Cells comes into its own.
Aspose.Cells provides an API that looks very similar to the OLE automation API exposed by Excel in the Office InterOp assemblies. However, you do not get all the headaches normally associated with the InterOp assemblies (i.e. modal dialogs, memory leaks, multiple Excel instances etc. etc.). With Aspose.Cells you do not need Excel to be installed on your server. Instead you only need a single assembly (Aspose.Cells.dll ) and the accompanying license file (Aspose.Cells.lic). Simple!
Working with Aspose.Cells in SSIS
To work with Aspose.Cells, you need to use the Script Task or Script Component. My personal preference is to use the Script Component in its asynchronous transformation mode. This way I can feed in a list of Workbooks that I want the component to process on its input buffer and the component can write the data read from the Workbooks onto multiple output buffers (one buffer for each type of data).
As SSIS uses Visual Studio for Applications (VSA), you do not have the flexible "Add Reference" dialog found in Visual Studio. In order to get around this issue while you are developing, you need to copy the Aspose.Cells assembly and associated license file into your .NET folder, which is usually:
C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727
Restart Visual Studio and you will see Aspose.Cells listed on VSA's "Add Reference" dialog. Now you can add the appropriate "imports" statement to the top of your module and start coding. Easy!
Later, when you move to production, you will need to register the Aspose.Cells assembly in the GAC by simply dropping the assembly into the following folder:
C:\WINDOWS\assembly
Dealing with the Aspose.Cells License file
Aspose.Cells has a license file which can either be in the same directory as the assembly, or in your own location. The easiest way to handle this in SSIS is to use a File Connection Manager which records the location of the license file. Then in code, you will need to add:
Dim license As Aspose.Cells.License = New Aspose.Cells.License()
license.SetLicense(Me.Connections.AsposeCellsLicense.ConnectionString)
Example Code
My sample code reads a list of workbooks from the input buffer and writes the extracted data to the output buffer. The data flow is shown below along with the code from my asynchronous Script Component. Enjoy!
' Aspose.Cells.dll must be registered in the GAC for production
' AND the C:\WINNT\Microsoft.NET\Framework\v2.0.50727 for development
' standard libraries
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper
' additional libraries
Imports System.IO
Imports System.Collections.Generic
Imports Aspose.Cells
Public
Class ScriptMain
Inherits UserComponent
Public
Overrides
Sub WorkbookList_ProcessInputRow(ByVal Row As WorkbookListBuffer)
' Imports all data from the the Excel workbooks provided on the input
' Note that each workbook contains several worksheets.
' For efficiency, we open the workbook once and then
' process all the worksheets in the file
Try
' The previous step in the package downloaded the workbook.
Dim fInfo As FileInfo = New FileInfo(Path.Combine(Me.Variables.ImportFolder, Row.URL))
If fInfo.Exists Then
' Set Aspose.Cells license
Dim license As Aspose.Cells.License = New Aspose.Cells.License()
license.SetLicense(Me.Connections.AsposeCellsLicense.ConnectionString)
' Open the workbook
Dim wrkBook As Workbook = New Workbook()
wrkBook.Open(fInfo.FullName)
' process each worksheet
For
Each wrkSheet As Worksheet In wrkBook.Worksheets
Try
' Check we have a valid worksheet by testing
' for text prompts in specific locations
If ValidWorksheetLayout(wrkSheet) Then
' process each worksheet found in the workbook
' that has the correct format
ImportWorksheetData(Row.WorkbookID, wrkSheet, Row)
End
If
Catch exi As Exception
Me.ComponentMetaData.FireInformation(0, Me.ComponentMetaData.Name, "WorkbookList_ProcessInputRow: Invalid Worksheet Layout: " & wrkSheet.Name, String.Empty, 0, True)
End
Try
Next
Else
Me.ComponentMetaData.FireError(0, Me.ComponentMetaData.Name, "WorkbookList_ProcessInputRow: Missing Excel File: " & fInfo.FullName, String.Empty, 0, True)
End
If
Catch ex As Exception
Me.ComponentMetaData.FireError(0, Me.ComponentMetaData.Name, "WorkbookList_ProcessInputRow: " & ex.Message, String.Empty, 0, True)
End
Try
End
Sub
Private
Function ValidWorksheetLayout(ByVal wrkSheet As Worksheet) As
Boolean
' checks if data is in expected locations. If not, then return false.
If wrkSheet.Cells("B3").Value.ToString = "Profit and loss projections"
And wrkSheet.Cells("B14").Value.ToString = "Revenue"
Then
Return
True
End
If
Return
False
End
Function
Private
Sub ImportWorksheetData(ByVal WorkbookID As
Integer, ByVal wrkSheet As Worksheet, ByVal Row As WorkbookListBuffer)
' imports all data from the worksheet
Dim iRow As
Integer
Dim iCol As
Integer
Dim iCount As
Integer
Dim dataRange As DataTable
Dim dr As DataTableReader
Dim HeaderRow As List(Of
Integer)
Try
' heading row is in seperate location to data, so import heading first
' note that Aspose.Cells uses zero-based row and column numbers in ExportDataTable call
' so Cell=4,Row=12 is actually cell E13
Try
' read single header row into generic list
HeaderRow = New List(Of
Integer)
dataRange = wrkSheet.Cells.ExportDataTable(12, 4, 1, 5)
dr = dataRange.CreateDataReader()
If dr.Read Then
For iCount = 0 To 4
HeaderRow.Add(GetInteger(dr, iCount))
Next
End
If
Catch ex0 As Exception
Me.ComponentMetaData.FireError(0, Me.ComponentMetaData.Name, "ImportWorksheetData ex0: " & ex0.Message, String.Empty, 0, True)
End
Try
Try
dataRange = wrkSheet.Cells.ExportDataTable(14, 2, 2, 7)
dr = dataRange.CreateDataReader()
While dr.Read
ProcessDataTable(WorkbookID, dr, 2, 6, HeaderRow)
End
While
Catch ex1 As Exception
Me.ComponentMetaData.FireError(0, Me.ComponentMetaData.Name, "ImportWorksheetData ex1: " & ex1.Message, String.Empty, 0, True)
End
Try
Try
dataRange = wrkSheet.Cells.ExportDataTable(19, 2, 2, 7)
dr = dataRange.CreateDataReader()
While dr.Read
ProcessDataTable(WorkbookID, dr, 2, 6, HeaderRow)
End
While
Catch ex2 As Exception
Me.ComponentMetaData.FireError(0, Me.ComponentMetaData.Name, "ImportWorksheetData ex2: " & ex2.Message, String.Empty, 0, True)
End
Try
Try
dataRange = wrkSheet.Cells.ExportDataTable(23, 2, 9, 7)
dr = dataRange.CreateDataReader()
While dr.Read
ProcessDataTable(WorkbookID, dr, 2, 6, HeaderRow)
End
While
Catch ex3 As Exception
Me.ComponentMetaData.FireError(0, Me.ComponentMetaData.Name, "ImportWorksheetData ex3: " & ex3.Message, String.Empty, 0, True)
End
Try
Catch exOuter As Exception
Me.ComponentMetaData.FireError(0, Me.ComponentMetaData.Name, "ImportWorksheetData: outer " & exOuter.Message, String.Empty, 0, True)
End
Try
End
Sub
Public
Sub ProcessDataTable(ByVal WorkbookID As
Integer, ByVal dr As DataTableReader, ByVal iStartDataCol As
Integer, ByVal iDataColCount As
Integer, ByVal HeaderRow As List(Of
Integer))
Dim iCol As
Integer
Dim DataValue As
Decimal
Dim bAddRow As
Boolean
Dim TimePeriodID As
Integer
Dim OutputCount As
Integer = 0
Try
For iCol = iStartDataCol To iDataColCount
bAddRow = False
Try
If
Not dr.IsDBNull(iCol) Then
DataValue = GetDecimal(dr, iCol)
If DataValue <> 0 Then bAddRow = True
If bAddRow Then
With
Me.WorksheetDataBuffer
.AddRow()
' get header from the row we read into a list earlier
.FinancialYear = HeaderRow(iCol - 2)
.WorkbookID = WorkbookID
.RowName = dr.GetString(0)
.DataValue = DataValue
End
With
End
If
End
If
Catch ex1 As Exception
Me.ComponentMetaData.FireError(0, Me.ComponentMetaData.Name, "ProcessDataTable: Inner " & ex1.Message, String.Empty, 0, True)
End
Try
Next
Catch ex As Exception
Me.ComponentMetaData.FireError(0, Me.ComponentMetaData.Name, "ProcessDataTable: Outer " & ex.Message, String.Empty, 0, True)
End
Try
End
Sub
Function GetDecimal(ByVal dr As DataTableReader, ByVal iCol As
Integer) As
Decimal
Dim DataValue As
Decimal
Try
Select
Case dr.GetDataTypeName(iCol)
Case
"String"
If IsNumeric(dr.GetString(iCol)) Then
DataValue = CDec(dr.GetString(iCol))
End
If
Case
"Integer"
DataValue = CDec(dr.GetInt32(iCol))
Case
"Double"
DataValue = CDec(dr.GetDouble(iCol))
Case
"Decimal"
DataValue = dr.GetDecimal(iCol)
End
Select
Catch ex As Exception
Me.ComponentMetaData.FireError(0, Me.ComponentMetaData.Name, "GetDecimal " & ex.Message, String.Empty, 0, True)
End
Try
Return DataValue
End
Function
Function GetInteger(ByVal dr As DataTableReader, ByVal iCol As
Integer) As
Integer
Dim DataValue As
Integer
Try
Select
Case dr.GetDataTypeName(iCol)
Case
"String"
If IsNumeric(dr.GetString(iCol)) Then
DataValue = CInt(dr.GetString(iCol))
End
If
Case
"Integer"
DataValue = dr.GetInt32(iCol)
Case
"Double"
DataValue = CInt(dr.GetDouble(iCol))
Case
"Decimal"
DataValue = CInt(dr.GetDecimal(iCol))
End
Select
Catch ex As Exception
Me.ComponentMetaData.FireError(0, Me.ComponentMetaData.Name, "GetInteger " & ex.Message, String.Empty, 0, True)
End
Try
Return DataValue
End
Function
End
Class
In my previous post, I outlined how to obtain a recursive list of all Excel files held in a SharePoint document library. In SSIS, I use this code to upsert records into a table which identifies which files are new or modified. Clearly I could process the file in-situ, but that would over complicate my code with data streams which SSIS would not understand – especially if you are attempting to use the standard Microsoft.Jet.OLEDB.4.0 engine to process the file! So downloading the file to a local folder is the easiest option. Fortunately, I spotted Greg Enslow's post which pointed me in the direction of using the WebClient library.
In my control flow I obtain a list all the files I need to process and store it in an ADO.NET recordset. I then iterate over this list using a ForEach task, which sets various variables containing the URL, file extension and the document's name and it's GUID in SharePoint. My script task then downloads each file so it can be processed by subsequent tasks in the control flow.
I have expanded on Greg's original code to support file system locations as well as SharePoint URLs. Enjoy!
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime
Imports System.Net
Imports System.IO
Public
Class ScriptMain
Public
Sub Main()
' downloads the file from SharePoint or a file system location to a local folder
Dim taskResult As
Integer = Dts.Results.Success
Try
' obtain location of local folder from variable
Dim dir As DirectoryInfo = New DirectoryInfo(Dts.Variables("ImportFolder").Value.ToString())
If dir.Exists Then
' Create the filename for local storage using
' the GUID from SharePoint as this will be unique.
Dim file As FileInfo = New FileInfo(dir.FullName & "\" & Dts.Variables("WorkbookGUID").Value.ToString() & Dts.Variables("Extension").Value.ToString())
If
Not file.Exists Then
' get the path of the file we need to download
Dim fileUrl As
String = Dts.Variables("EncodedAbsUrl").Value.ToString()
If fileUrl.Length <> 0 Then
' download the file from SharePoint or Archive file system to local folder
Dim client As
New WebClient()
If Left(fileUrl, 4).ToLower() = "http"
Then
'download the file from SharePoint
client.Credentials = System.Net.CredentialCache.DefaultCredentials
client.DownloadFile(fileUrl, file.FullName)
Else
' copy file from remote file system
System.IO.File.Copy(fileUrl, file.FullName)
End
If
Else
Throw
New ApplicationException("EncodedAbsUrl variable does not contain a value!")
End
If
End
If
Else
Throw
New ApplicationException("ImportFolder does not exist!")
End
If
Catch ex As Exception
Dts.Events.FireError(0, String.Empty, ex.Message, String.Empty, 0)
taskResult = Dts.Results.Failure
End
Try
Dts.TaskResult = taskResult
End
Sub
End
Class