It’s pretty commonplace now to be meeting with a client, talking through their requirements, and then tossing in the “what about data privacy and ownership, and what are your requirements there?” question. It’s not a small thing, and it’s something that has been morphing over time. I think the morphing part comes from uncertainty and fear.
Uncertainty because the extremely large companies are facing down the regulations, laws, and expectations first. From there it’ll continue spreading and changing as people get used to both having and wanting control over their information. Expectations will change and be formed based on users’ experiences.
Fear because if you get it wrong, if you protect (or don’t protect) the wrong thing, you’re quite possibly in a world of hurt. Could be reputation, could be downtime, could be legal, or all of the above.
To help with this, we’ve been talking with several people about enhancing their logging of information while, at the same time, providing very clear guidelines about what is being kept and how long it stays around. If you can control that bit of expectation, you can help mitigate some fears about data loss.
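One way to make “what is being kept, and how long” concrete is to write the policy down as data that both people and code can read. The sketch below is only an illustration: the category names, descriptions, and retention windows are made up for the example, and your own legal and business requirements would drive the real values.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class RetentionRule:
    category: str        # hypothetical log category name
    description: str     # plain-language statement of what is kept
    keep_for: timedelta  # how long records in this category stay around


# Hypothetical policy: the categories and windows are assumptions for illustration.
POLICY = {
    "access_log": RetentionRule(
        "access_log", "Who read which data set, and when", timedelta(days=365)
    ),
    "debug_log": RetentionRule(
        "debug_log", "Verbose application diagnostics", timedelta(days=30)
    ),
}


def is_expired(category, created_at, now=None):
    """Return True when a record has outlived its published retention window."""
    now = now or datetime.now(timezone.utc)
    return now - created_at > POLICY[category].keep_for


def describe_policy():
    """Render the policy as plain sentences you could publish to users."""
    return [
        f"{rule.category}: {rule.description} (kept {rule.keep_for.days} days)"
        for rule in POLICY.values()
    ]
```

The point of keeping the description next to the retention window is that the same structure can drive both the cleanup job and the statement you hand to a client.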
The interesting one is the whole “what is being kept.” We’re seeing that more detail (not PII, just useful source information and data points and such) is better than less detail. Why? Because many people are working with data sets and asking where the data came from, how it was used, how it was derived, etc.
Add to this the fact that if something DOES happen to your systems, if you are breached or hacked or someone walks away with a thumb drive, you’ll want a clear idea of what was involved and what it might influence. Having a detailed audit trail is nothing but helpful.
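To give a feel for what a detailed audit trail buys you, here is a minimal in-memory sketch. Everything in it (the `AuditEvent` fields, the `AuditTrail` class, the actor and data set names in the usage note) is hypothetical, and a real system would write to durable, tamper-resistant storage rather than a Python list.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    actor: str    # who did it (user or service account)
    action: str   # what they did, e.g. "read" or "export"
    dataset: str  # which data set was involved
    at: datetime  # when it happened


class AuditTrail:
    """Append-only list of events; a stand-in for real audit storage."""

    def __init__(self):
        self._events = []

    def record(self, actor, action, dataset, at=None):
        event = AuditEvent(actor, action, dataset, at or datetime.now(timezone.utc))
        self._events.append(event)
        return event

    def datasets_touched(self, actor, since):
        """Answer the post-incident question: what did this actor touch, and since when?"""
        return sorted(
            {e.dataset for e in self._events if e.actor == actor and e.at >= since}
        )
```

After an incident, a call like `trail.datasets_touched("jdoe", since=breach_start)` gives you the list of data sets to look at first, which is exactly the “what was involved” question.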
To that end, we’ve been suggesting deeper information “wells” – the data bits that are non-identifiable – things that form the structure of your work with information. Could be sales information without specific customer data, could be data points from your 4,567,234+ IoT devices. Whatever it is, keep that stuff at a level that lets you prove out reports, usage, types of information, new uses for information.
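As a rough sketch of the “sales information without specific customer data” idea, the snippet below drops the identifying fields from a record and keeps everything else. The field names in `PII_FIELDS` are assumptions for the example; deciding what actually counts as identifying in your data (including combinations of otherwise harmless fields) is the hard part and isn’t something a field list alone can settle.

```python
# Hypothetical list of identifying fields; your own schema will differ.
PII_FIELDS = frozenset({"name", "email", "phone", "address"})


def scrub(record):
    """Return a copy of the record with the identifying fields removed."""
    return {key: value for key, value in record.items() if key not in PII_FIELDS}


# Made-up sales record for illustration.
sale = {
    "name": "Ada Example",
    "email": "ada@example.com",
    "sku": "W-100",
    "quantity": 3,
    "region": "EMEA",
}

well_entry = scrub(sale)  # keeps sku, quantity, and region only
```

Note that `scrub` returns a new dictionary and leaves the original alone, so the identifying copy can follow its own, shorter retention schedule.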
The initial response to data protection laws was to trim data structures and clear out information as soon as possible, but there are risks there as your company moves forward: risks that you won’t be able to learn from your approaches or your formulas, or know the basis for certain assertions made from your data analysis. It’s hard to learn from the story if all you have is the final chapter (of information).
There is much to be done to make that information accessible and usable, and to make sure it’s architected to cleanse the personally identifiable information but keep the bits that make it all work. I think it’ll become increasingly important to have those for your systems.