These are some really intriguing movies (Netflix, etc.) – they’re finally talking about data, data use, data ownership, privacy, customization (sort of) and all sorts of things that have significant implications for our professional ethics and responsibilities.
When you watch these, they can quickly lead to a rabbit hole of TED talks, blog posts and more. They’ll probably lead to some pondering time as well, which is a good thing. The end result is that we have a lot to talk about and consider.
Data privacy, data ownership, data use, and data protection are not the same thing, which is part of what makes this hard to discuss. And yet every single aspect of this falls to us, as database people, to administer.
Data Privacy
This is what most people will equate with much of the outrage in these movies, and what will be seen here as the culprit. I remember that when Minority Report came out, people laughed at the scene in the retail outlets, where the signage and announcements changed to address the specific person who was there: “Hey, Mr. Smith, like those overalls? Want some more?” At the time it was funny and absurd.
That was before Amazon and its uncanny ability to recommend other things you may be interested in, based not only on what you like but also on what others who bought the same widget bought. Then things come out about the way social, search and retail services store and use the information on each of us, and suddenly it takes on a weird “holy cow I had no idea…” aspect.
At the same time, I’ve seen reports where Facebook has tested offering to buy access to people’s information for the value of that information. The value on a discrete individual basis is something like $10/yr. That’s trivial by nearly any standard when you consider it. But the value to FB is in the aggregate. The point about the $10, though, is that people aren’t going to go through the trouble to do that; instead they’ll just kind of continue complaining about the privacy of their information and talking about it in their FB posts. :/
I think privacy is a bit of a shadow issue. I think the privacy of information wouldn’t be so important if we both knew how it was used in a meaningful way and knew it was being used for good (rather than influence). Unfortunately, that’s actually a really hard thing to do on the whole. Who determines what “good” is – and who provides the protection of information so that it’s truly not abused?
Ah, protections. DBAs. Data people.
Data Ownership
The whole GDPR model and structure set out definitions of this – who owns what data. But when it comes down to it, there are data derivatives too that have to be managed. Reporting, summaries, statistics, data pools, lakes, warehouses, and extracts.
All of that has to be managed and controlled. The knee-jerk reaction of allowing the originator of the data to control it throughout its complete lifecycle is doable… to a point. But it breaks down at incredible volume. As more and more information is generated, stored, and processed, and as systems learn about us and about life, it’ll be more and more of a gray area as to who owns that information. It’s one thing to say I own my address, phone number, SSN, all of that. It’s another to talk about sensor data in my sprinkler system in my yard, or black box information from my vehicle that is actually a rideshare vehicle…
Back to DBAs and data people having to provide ways to manage those bits of information, and to manage the designs and relationships between not only data elements but data systems. It’s a huge deal and, like privacy, I think if the information were used in accepted ways, and protected, it would be more manageable.
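To make the data derivatives point a little more concrete, here is a minimal sketch in Python of one way to keep track of which reports, extracts and warehouse tables are fed by a given source. Everything in it is an assumption for illustration: the dataset names and the `LINEAGE` map are made up, and `downstream` is a hypothetical helper, not part of any real tool or any GDPR requirement.

```python
from collections import deque

# Hypothetical lineage map: each key feeds the datasets listed as its values.
# All dataset names here are invented for illustration.
LINEAGE = {
    "crm.customers": ["warehouse.dim_customer", "extracts.marketing_csv"],
    "warehouse.dim_customer": ["reports.monthly_summary", "lake.customer_stats"],
    "extracts.marketing_csv": [],
    "reports.monthly_summary": [],
    "lake.customer_stats": ["reports.monthly_summary"],
}


def downstream(source, lineage):
    """Return every dataset derived, directly or indirectly, from source."""
    seen = set()
    queue = deque([source])
    # Breadth-first walk; the seen set keeps shared or cyclic paths from repeating.
    while queue:
        current = queue.popleft()
        for child in lineage.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


if __name__ == "__main__":
    for name in downstream("crm.customers", LINEAGE):
        print(name)
```

Even a toy map like this shows why the originator-controls-everything idea gets hard: one source table fans out into several derivatives, and each of those has to be found before anyone can say where a person’s information actually lives.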
Data Use
This, really, is the biggie. How will information be used? I have no real answers here, not because I can’t think of 1000 ways to use information, but because of the things I can’t imagine yet.
Example: I was on Amzn, buying windshield wipers for my daughter’s car as a gift (hey, we live an exciting, on the edge life). Amzn knew about my car and my wife’s, but not my daughter’s. It pointed out that the wipers I’d selected didn’t fit my vehicles. This was pretty cool. Once I put in the vehicle make/model I was buying for, it pointed out the best choices. Again, really appreciated, and they worked like a charm when installed.
I was pretty ok with this. I’m NOT OK with some of the data use tactics outlined in those movies that started out this post. I’m not sure where the line is, but for me, that’s potentially a pretty clear difference in intent.
Data use will become even more of an ethics question for DBAs and data professionals. This will be something we’re going to have to be willing to really go to the mat on, I believe. There will only be a small number of people in a position to connect the dots on how information is being used in the aggregate, the intent and all of that. Standing guard over that is no small feat of course, and to make it even more interesting, it’s highly subjective. Who decides the intent of personalization?
Data Protection
In my mind, this is a no-brainer, absolute responsibility of the data professional, full stop. It’s a core responsibility to do all you can, and keep clamoring to do more, to protect the information in your systems. This is no small can of worms though.
You’ll be working with SQL Server, Azure SQL, RDS SQL Server, on-premises databases, mixed cloud provider environments, open-source databases, “just a little test database in the department” systems and more.
But we’ve got to keep after this. We’ve got to do all we can, and it falls squarely in our collective laps to keep fighting for the right things here.
Summary
All of these things come together under the heading of “Angst” when it comes to these movies, blog posts, and TED talks. The discussion is just starting, and it’s important. But we’ve got to get some handholds on this and start breaking it down into pieces we can navigate. The cat is out of the bag on controlling our information – we simply rely too much on systems that are truly important and helpful. Not only that, but the benefits here are potentially huge for life in general. There is much good to be had from learning from the human experience, IMHO.
We need to actively be part of this learning process. We need to be constantly pushing forward, figuring out the best steps to take next and all of that. That way we can continue to work on the trust that people will need to have as they figure out how all of this comes together because really, it comes down to exactly that – trust.