Microsoft Access: an elegant solution to Data Warehouse metadata
Before you send me flame mail telling me I must be off my rocker, I am not proposing Microsoft Access store any data! We all know that its JET-based database engine has severe limitations in terms of speed and scalability. No, what I am proposing is that Access is used as is a very quick way for you to deliver an easy-to-use front-end for your business users to edit metadata which is actually stored in a SQL Server back-end database.
Whether you have a classic data warehouse architecture, or some botched system that evolved over the years, you will no doubt have some metadata that is vital to your data warehouse yet did not come from any source system. Usually this data needs to be changed by business users on an infrequent basis.
In big data warehouse (DWH) implementations, where money is no object, you can take the time to build an entire web-based application to edit and manage metadata. This is fantastic for those with deep pockets and long delivery timescales, but realistically not every customer can afford such luxuries. On the other hand Microsoft Access provides a quick and inexpensive way to deliver a mechanism to edit metadata using Linked Tables. This blog provides a tutorial on how to create a front-end metadata editing tool with Microsoft Access and also provides a useful starting template for you to build your own metadata editor. So here goes.
Before showing how to build your first front-end using my template, let's talk about the technical architecture.
The best approach is to store your metadata in a separate database. This offers better control over security as users can be given full permission on the metadata database but limited rights to the other databases in the data warehouse. Also, the metadata database can have full recovery model, so it can be restored to any point; something that is not always applied to a staging database. However, your architecture may be a sub-set of this. Needless to say, the template I provide will prove useful in any situation.
The following screen shots shows exactly what can be achieved using Microsoft Access. Now these are actually very crude by comparison to what can be achieved by proper MS-Access developers, but they are quick and easy to build for those with limited time and skills. Also business users find them easy to understand.
My example screens allow the business user to edit a Region hierarchy which becomes a dimension in the OLAP cube. The region hierarchy is roughly geographical in nature, but it is business focused and not related to geo-political boundaries and therefore not be available in any source system; hence the need to hold this data in the data warehouse. The hierarchy has three-levels: the top level has Regions, the mid-tier has Locations and the bottom level holds Business Units. Clearly we could have more levels, but this is sufficient for my example.
Been able to edit this kind of structure delivers a lot of power to your business users as they are now able to control how all reports are presented to end-users and how numbers roll-up in the OLAP cube. Nothing to be sneezed at!
The tables in the metadata database that support the region hierarchy are normalized for editing and de-normalized by a view when been sucked into the DWH for use in the OLAP cube. So our MS-Access database has to edit the following SQL Server tables:
The main menu is the first thing the user sees. This is their jumping off point for editing the metadata. Of course the screen must be in their language, not techno-babble!
Clicking "Regions" will show the following screen.
Although crude, this allows the user to a) change the name of a region and b) add new regions. Foreign key relationships or triggers in the underlying SQL Server database should stop users deleting rows unless you want them too.
In a classic DWH implementation, the user should indicate that a region is defunct by changing its status. In my simple example, they do so by deleting the row. Both approaches can be implemented using MS-Access, it is just a matter of design.
The Edit Locations form is more sophisticated as it provides drop-down lists for the user to select the region (i.e. from what the user entered on the Edit Regions form). This is because the metadata tables are normalized for editing and de-normalized by a view when been sucked into the OLAP cube.
Edit Location to Business Unit Mapping
The final screen allows the user to map their business data to the bottom level of the regions hierarchy. For my client, this screen related locations to the identifiers used by the finance department to classify all revenue and costs.
The easy way to create Linked Tables
In the template I provide a very simple mechanism to help you create your linked tables as well as switching between servers (i.e. between test and production database servers). However, if you are new to MS-Access you will not appreciate the pain of creating and managing them manually, so let's create a linked table the standard way first.
To create a linked table through MS-Access, you go to the External Data tab and click More and select ODBC Connection as shown below.
You next select Link to a data source by creating a linked table and click OK. This will display the Select Data Source dialog. Now select your data source or create a new ODBC connection file. Be careful to select the correct SQL Server Client Library for your version of SQL Server. For example:
Once you have created your ODBC connection, MS-Access will show you a list of all the tables in your SQL Server database where you can select multiple tables to link.
Click OK and the tables are linked. Well that seemed easy enough, so where is the pain? Well the pain comes when you want to switch servers. The connection string for each table is held in the metadata associated with the table. It is far too easy to connect different tables to different servers during development (e.g. some to localhost, others to a dev server etc.), secondly the Linked Table Manager provided in MS-Access is not the most intuitive UI for changing the database links. Indeed it does not tell you on which server the tables reside! For that, you need to hover over the linked table in the Navigation Pane as shown below. Not too convenient!
The other thing about these newly linked tables is that the schema is displayed, which means the first thing you will do is edit the name displayed in MS-Access. Again a pain if you have to drop and recreate tables when switching between servers.
Change server and create linked tables all in one step
My template provides a far simpler mechanism to simultaneously change server and re-create all the linked tables in one step. You simply edit the SourceServers table and fill in names and descriptions of all your servers.
You then edit the LinkedTables table and fill in the names of the tables you want to link along with the name of the database in which they are stored.
Then select Change Server from the main menu and a list of servers will appear. Simply select your server and click OK. The code will delete all existing linked tables and then re-create them based on the information provided in the LinkedTables table. So changing between development and production servers is now extremely easy!
The top level of the hierarchy is Region and is held as a simple list of names with an identity column. We only need a basic editor for the user to edit this list, so we will use the default datasheet view provided by MS-Access. To invoke the basic editor, simply double-click on the Region table and you will see the datasheet view (displayed below). Remember, if the left-hand Navigation Pane is not visible, simply click F11.
Editor with drop-down selections
Our Location table has a foreign key relationship with the Region table. Rather than have our users remember a lot of meaningless IDs, we want to provide them with a drop down menu. To do this, click on the Location table in the Navigation Pane and then open the Create tab and select the Forms Wizard as shown below.
Next you will see the Form Wizard's field selection dialog. Select all fields.
On the next Layout selection dialog, select Tabular.
On the Style dialog, pick your preferred style. I recommend sticking to the Office theme if you are unsure. On the final dialog, name the new form and click Finish.
You will now see the basic Location editing form ready for you to enter data. It looks much better than the basic form we created for Regions, but has the problem that the user has to remember the RegionID in order to make an entry.
We get around this by changing the RegionID field to use a combo box. Switch into design mode using the View option on the Home tab and select the RegionID field. Right-click and select change to combo box.
Now change the Row Source property by clicking the three dots in the Property Sheet (press F4 if this is not visible).
Use the query designer to create the query:
SELECT Region.RegionID, Region.RegionName FROM Region;
This SQL should be displayed in the Row Source property once you exit the designer.
Finally change the Column Count property to 2 and set the column widths to 0cm;4cm
This has the effect of hiding the RegionID from the user as shown below.
Edit Location Mapping table
The LocationMapping table maps business unit IDs from the source finance system to locations. So here the drop down list we provide to the user has to come from the source finance system. As this is a data warehouse, we have already extracted and cleaned that data with SSIS and written it into the staging database. So we simply need to create a linked table to the relevant table in the staging database. Again this is easily done using the LinkedTables table as this has a SourceDatabase column as shown below. Of course you will need to click the Change Server button to create the new linked table.
Using the same technique as we did with the RegionID column in the Location form, we can create a form containing two drop-down lists as shown below.
Creating the Main Menu
When the user first opens the database, you want to display a menu of options so they can find things easily. To create a new form, simply choose Form on the Create tab. Then switch to design mode and expand the detail section to reveal a white cross-hatched area. As you are in design mode, you will see the Form Design Tools tabs on the ribbon. Select Design tab and click Button. Now highlight an area of form and draw a button. When a dialog pops up, simply click Cancel as we do not want any standard option. Open the Property Sheet and give the button a sensible Name on the Other tab (e.g. cmdRegion). On the Event tab, select [Event Procedure] in the on-click event and click the three dots box to open the Visual Basic editor.
To open the region table directly for editing (i.e. because we do not need a special form) we use the OpenTable command as follows:
DoCmd.OpenTable "Region", acViewNormal, acEdit
We now repeat the steps to create buttons for our other forms. However, as we have created special forms to edit these tables we need to use the OpenForm command as follows:
DoCmd.OpenForm "Location", acNormal, , , acFormEdit
For some obscure reason MS-Access still uses macros to achieve certain tasks. In order to have the main menu appear when the user first opens the database, you need to create macro called AutoExec which does this.
If you have a separate metadata database, then your users only need read-write permissions to that database. If the MS-Access database uses other tables to populate drop-down lists, then the user will need read-only access to those tables. This can be managed on a table by table basis or by simply adding the user to a database role with db_datareader and/or db_datawriter roles.
Gotha: Tables must have a Primary Key
MS-Access can only edit tables that have primary keys defined. If no primary key is defined, then you will find that your form is locked in read-only mode with no indication whatsoever as to why you cannot edit the data. For me this was a painful learning curve, as I had most of my tables with primary keys but the odd one or two without. Given the lack of feedback from MS-Access as to why the forms were locked in read-only mode, it was quite some time until that I realised what caused the problem.
Clearly your business users need MS-Access installed on their desktop in order to use you new metadata editor. I would recommend developing the database using the same version of MS-Access as your users have installed. Although my screen shots are of MS-Access 2007, this application can be built using MS-Access 2003 as well.
Although convenient, I would not recommend using a file share to deploy the database as I have found that slow VPN connections can corrupt the MS-Access database making it unusable for everyone. Instead, have the users copy the database to their desktop or install on their PC to a standard location. You could even write an installer, although that is rather over the top as a simple batch file would do!
The other thing to be careful about is the version of SQL Client libraries on the user's desktop. I would recommend using what is already available rather than requiring a separate install. Clearly testing on several machines will highlight any issues in this area. If you need to change the version of SQL Client used by my template, simply edit the ODBC_DRIVER constant in the VB module called Common.
Replication of metadata to the staging database
If you are not happy using cross-database joins in your stored procedures, you may well need to replicate the metadata to the staging database for use in the data warehouse. This can be done in a number of ways using SSIS or SQL replication. However, this is a topic outside the scope of this blog entry.
The MS-Access Template and SQL scripts to create the demo region hierarchy is available for download here: Metadata_Editor.zip.