Many companies don’t store their electronic files in formal document management systems.
Despite this, when they store them on file servers or file sync ‘n’ share, many key document attributes get automatically stored as metadata, for example the dates a document was created, updated, and last opened.
Some documents may have dates stored in the content (that you’d have to manually read), but many don’t, so these metadata dates are critical when it comes to determining how long documents must be kept and when they should be deleted.
What would happen to the management of your file-based business records if the dates were altered? Is there any way you could re-create them other than by restoring all files from a backup? How long would it take you to notice they had been altered?
Don’t just assume this is a consideration for team / shared drives. Critical business documents often exist on employees’ home drives unless there are robust processes in place to prevent it. Even where such processes exist today, were they in place when these documents were created?
Migrate Data With Care
Despite the massive challenge that altered dates would present, when data is migrated to a new repository some of the original dates are often reset to the day of the migration. In other words, records that may be 4, 5 or 6 years old can now appear to be new documents unless they are individually analysed further.
When you are planning a data migration, especially if it is to the cloud, think about how data will be moved, and thoroughly test, document and validate the migrations.
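One way to validate a test migration is to compare the last-modified dates of the original files against their migrated copies. The following is a minimal sketch, assuming both the old and new locations are reachable as ordinary folders; the function name `find_date_mismatches` and the two-second tolerance are illustrative choices, not part of any particular migration tool.

```python
from pathlib import Path


def find_date_mismatches(source_root, target_root, tolerance_seconds=2.0):
    """List files whose last-modified date changed (or that went missing) in the target."""
    source_root, target_root = Path(source_root), Path(target_root)
    mismatches = []
    for src in source_root.rglob("*"):
        if not src.is_file():
            continue
        rel = src.relative_to(source_root)
        dst = target_root / rel
        if not dst.exists():
            mismatches.append((str(rel), "missing in target"))
            continue
        # Assumed tolerance: some filesystems store timestamps at coarse resolution.
        drift = abs(src.stat().st_mtime - dst.stat().st_mtime)
        if drift > tolerance_seconds:
            mismatches.append((str(rel), f"modified date differs by {drift:.0f}s"))
    return sorted(mismatches)
```

An empty result on a representative sample is a reasonable sign that modified dates survived the move. The sketch does not check created or last opened dates, which would need separate handling.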
Make sure documentation is available for your users if they are going to be migrating data themselves, so that it is done consistently and with either no impact on dates or only the impact you expect.
Be just as cautious when migrating data in bulk between file servers: some data migration apps correctly preserve the file dates, but others do not.
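The difference between a copy that keeps dates and one that does not can be seen in Python's standard library, where `shutil.copy2` attempts to carry over the last-modified and last-accessed times and `shutil.copy` does not. This is only a small illustration of the principle; the wrapper names below are hypothetical, and the original creation date is generally not carried over by either call.

```python
import shutil
from pathlib import Path


def copy_preserving_dates(source, target):
    """Copy a file and try to keep its modified/accessed times (shutil.copy2)."""
    target = Path(target)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)
    return target


def copy_without_dates(source, target):
    """Copy a file's content only; the copy gets a fresh modified time (shutil.copy)."""
    target = Path(target)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy(source, target)
    return target
```

Whatever tool is actually used, running it against a handful of files with known old dates and then inspecting the copies is a quick way to find out which of these two behaviours it has.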
Even a carefully executed migration will probably update the last opened date. If you have many identical copies of the same file (same size, name, last updated date/time) in your environment, the last opened date is a good indicator of which copy of the file is actively being used. That makes cleaning up your file data before a data migration much easier and safer than doing it afterwards.
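The idea of using the last opened date to pick the active copy can be sketched as follows. This is an illustration under stated assumptions: copies are treated as identical when name, size and last updated time (to the second) match, as described above, and the function name `group_identical_copies` is hypothetical. Some systems are configured not to record last opened dates reliably, so the ordering should be treated as a hint rather than proof.

```python
from collections import defaultdict
from pathlib import Path


def group_identical_copies(root):
    """Group files sharing name, size and modified time; most recently opened first."""
    groups = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            info = path.stat()  # reading metadata does not open the file itself
            key = (path.name, info.st_size, int(info.st_mtime))
            groups[key].append((info.st_atime, str(path)))
    # Keep only keys with more than one copy, newest last-opened date first.
    return {
        key: sorted(copies, reverse=True)
        for key, copies in groups.items()
        if len(copies) > 1
    }
```

The first entry in each group is the copy most likely to be in active use; the rest are candidates for review before the migration, not for automatic deletion.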
The migration of file documents to the cloud is sometimes seen as an opportunity to fix one or both of these problems:
- A chance to get rid of lots of data “that’s not needed”
  - The organisation decides that only very recent files on home drives will be migrated by IT to the new cloud storage, and everything else will stay on the old environment for a limited period
  - Any document that the user deems important will be available for them to upload manually to the cloud
  - Remember, some of these documents may be important business records!
- A chance to fix capacity problems with the limited storage available on their server
  - By lifting and shifting everything to the cloud, where the amount of capacity available often well exceeds what is needed
  - This puts off the inevitable task of understanding your data, which has become so vital with GDPR, and makes it harder in the process (see our blog on finding out what files you store in the cloud), even if you manage to preserve file dates
Data MetaMorph’s experience
We often see examples of dates having been updated en masse when we run our analytics against file listing output as part of our unstructured file data assessments, where we are looking for ROT (Redundant, Obsolete or Trivial) data as well as SELF (Sensitive, Exposed or Legacy Files) data.
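As a rough illustration of how a mass date update can show up in a file listing, the sketch below counts files per calendar day and flags any day that accounts for an unusually large share of the total. It is not a description of Data MetaMorph's analytics: the function name `find_mass_update_days` and the `min_share` threshold are assumptions made for the example, and a sensible threshold would depend on the data set.

```python
from collections import Counter


def find_mass_update_days(file_dates, min_share=0.25):
    """Return (day, file_count) pairs for days holding at least min_share of all files."""
    counts = Counter(stamp.date() for stamp in file_dates)
    total = sum(counts.values())
    return sorted(
        (day, count)
        for day, count in counts.items()
        if total and count / total >= min_share
    )
```

A day that holds a large fraction of all created or modified dates, particularly one that coincides with a known migration, is worth investigating before those dates are relied on for retention decisions.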
The dates are vital to us because they let us put file history into a structured order at scale and identify data that has exceeded any client retention policies. They are just as important to you on a file-by-file basis when you are looking for your important documents.