Does the direction of storage make us bad data citizens?
My career started at a company where we hardly had email and the network was a 10BASE2 affair with cables running all around the office. You used floppy disks, and the thought of a GB of data was absurd. You had to look after every byte and only keep what you really needed.
Whilst the cost of spinning disks gradually falls, the cost and size of flash storage continue to plummet.
The new Crucial SSD is £380 for 1TB.
I can now keep 128GB of data on an SD card the size of my finger. It only costs $50 a month to store 1TB of data in Azure ($61.29 on Amazon S3 Europe).
This brings along with it a whole host of problems.
It's too cheap
Before, you had to manage your mailbox, your desktop and your database because you ran out of space.
Now it's cheaper to keep data than it is to get rid of it.
Keeping data around leads to many problems, and the cost of the media is only a small part of the cost of maintaining the data:
1. You have to manage it, e.g. back it up.
2. You have to secure it.
3. The more you have of it, the more risk there is of losing some of it.
4. In a database, having more data affects your query performance.
5. Can your employees find the data they need, or is it a game of find the needle in the haystack?
6. Most data protection acts state you should only keep data for as long as you need it.
7. You are also liable to give customers copies of the data you have on them. Do you know all the data you have on someone? Even those notes the salesperson wrote about them in a OneNote notebook?
It's just like the Blob
The reason this is a problem for many organisations is that it's never on the list of priorities when a company or a project starts. There is always enough space at the beginning. However, as time goes by the pile gets bigger and bigger until it becomes a real problem.
By the time it becomes a problem, it's such a big thing to deal with (doing the work, changing attitudes, implementing policy) that no one really has the appetite.
What are you doing about it?
Do you think about what you are storing and decide if you really need to store it?
Do you have a data retention policy?
Do you have a data deletion policy?
What about a data archive policy?
Are you actively reviewing what data you are holding?
Do your IT guys really push back when a team says "we need 1TB of storage for project X"?
Do your developers have data retention in their definition of done?
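If you want a feel for what "data retention in the definition of done" could look like, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the category names, the retention periods, the `Record` type and the `split_by_retention` helper are all hypothetical, and real periods should come from your own policy and legal advice.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention periods per data category (illustrative values only).
RETENTION = {
    "sales_notes": timedelta(days=365),
    "invoices": timedelta(days=7 * 365),
}


@dataclass
class Record:
    category: str
    created: date


def is_expired(record: Record, today: date) -> bool:
    """Return True when the record is older than its category's retention period."""
    if record.category not in RETENTION:
        # No policy for this category: fail loudly instead of keeping it forever.
        raise KeyError(f"no retention period defined for {record.category!r}")
    return today - record.created > RETENTION[record.category]


def split_by_retention(records, today):
    """Split records into (keep, delete) lists according to RETENTION."""
    keep, delete = [], []
    for record in records:
        (delete if is_expired(record, today) else keep).append(record)
    return keep, delete


if __name__ == "__main__":
    records = [
        Record("sales_notes", date(2022, 6, 1)),
        Record("invoices", date(2022, 6, 1)),
    ]
    keep, delete = split_by_retention(records, today=date(2024, 1, 1))
    print("keep:", keep)
    print("delete:", delete)
```

The point of raising an error for an unknown category is that nobody can add a new kind of data without also deciding how long it lives.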
Deal with it sooner rather than later or it will be just too big to digest.