This is the last workday of 2022. Next week starts a new year, and as I've often done, I wanted to look back at the year. This time I decided to go month by month, through some of the headlines and memorable data-related topics.
In January there was a set of "tech experts" who shared their thoughts on the best database management systems. This one is worth a read for the humor involved. I wouldn't really consider many of these to be DBMSes. I thought about including this one as an April 1 joke, but it was a real story. I get asked for my opinion at times by writers researching a topic they don't understand. I hope I don't come across like a few of these people.
February brought another sad data breach story, this one from the state of Washington, where many tech people live and have startups. It wasn't clear initially what happened, but later articles noted this was from a stolen device. To me, this was a great reminder of why dev machines (and databases) should NOT have PII. Mask/obfuscate/anonymize that data, please. Or at least delete my name from your dev systems.
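As a rough illustration of what masking could look like before data lands on a dev machine, here is a minimal Python sketch. The key, the column names, and the `mask_value` and `mask_row` helpers are all hypothetical, and a keyed hash is only one of several possible approaches.

```python
import hashlib
import hmac

# Hypothetical key for illustration only; a real one would be kept
# outside the dev environment so masked values can't be recomputed there.
MASK_KEY = b"example-key-not-for-real-use"


def mask_value(value: str, prefix: str = "user") -> str:
    """Replace a PII value with a stable, non-reversible token."""
    digest = hmac.new(MASK_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{prefix}_{digest[:12]}"


def mask_row(row: dict, pii_columns: set) -> dict:
    """Return a copy of the row with the named PII columns masked."""
    return {
        column: mask_value(str(value), column)
        if column in pii_columns and value is not None
        else value
        for column, value in row.items()
    }
```

Because the same input always produces the same token, joins between masked tables should still line up, which is often what a dev database needs.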
March had another humorous story (to me): Oracle is going to lure people away from AWS and SAP with their new offering. I could believe the latter, but not the former. Oracle hasn't ever been good about pricing and it seems more people are leaving Oracle than coming to it.
April is the month of April Fools, but this isn't a joke. Another data breach, again from a dev system. This one was from Fox News, and it included information on not only employees, but also guests and celebrities. The claim that this was a dev system and not production doesn't matter if the data is real. Please, people, keep prod PII out of dev.
May had a funny post from Hacker News. Someone put their whole life in a database. The comment that caught my eye: "Men will literally devote hundreds of hours to building a bespoke database tracking every moment of their lives instead of going to therapy."
I worked a lot on my weight and diet in 2022. June had me finding a public database on processed foods to help me choose better food. Great idea, but everyone has an agenda. I hope this has some crowdsourcing and reasoning and isn't just one person's opinion.
In July, another data breach. This time in China with information for 1 billion people. Wow. I dislike large databases for this reason. It's also a good reminder why you ought to remove information from your databases over time, at least the PII part. Again, delete my name, if nothing else.
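Removing PII over time can be as simple as a scheduled job. This is a minimal sketch using Python's built-in `sqlite3` module; the `customers` table, its `name`, `email`, and `last_active` columns, and the `scrub_old_pii` function are assumptions made up for the example, with `last_active` stored as ISO-8601 text.

```python
import sqlite3
from datetime import datetime


def scrub_old_pii(conn: sqlite3.Connection, cutoff: datetime) -> int:
    """Null out PII on rows not active since the cutoff.

    Returns the number of rows that were changed.
    """
    cursor = conn.execute(
        "UPDATE customers SET name = NULL, email = NULL "
        "WHERE last_active < ? "
        "AND (name IS NOT NULL OR email IS NOT NULL)",
        (cutoff.isoformat(),),
    )
    conn.commit()
    return cursor.rowcount
```

The non-PII columns stay in place, so reporting on old rows keeps working while the names and email addresses are gone.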
There are so many types of database platforms. Have you heard of a vector database? Apparently, it's for managing vector embeddings, whatever those are. That comes from an August article on the strange growth of the database market.
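For the curious, an embedding is just a list of numbers that stands in for an item, and the core query is "which stored vector is closest to this one?" The toy Python sketch below shows that idea with cosine similarity. The function names and the tiny two-number vectors are invented for illustration, and real products use approximate indexes rather than a full scan like this.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, items):
    """Return the key in `items` whose vector is most similar to `query`."""
    return max(items, key=lambda name: cosine_similarity(query, items[name]))
```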
September started this AI art craze. There was a call to remove living artists from the database of works that an AI uses. Makes sense to me. I think artists deserve support, and while I like AI doing new things, maybe wait until the artist is no longer producing work.
October showed a good reason why we need ongoing patches or open-sourcing of code for retired systems. There was a 22-year-old vulnerability reported in SQLite.
In November, what other news could there be than Lego Steve? It was a tiring week.
December is just ending, but I'm ending on a reason why databases without auditing are a problem. Men behaving badly in this one. Gathering public information (or even semi-public) at scale can be problematic, and the information gets abused. We need better controls, but also more auditing and triggering of some actions, to prevent this (and other) sorts of abuse.
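One way to picture auditing that triggers an action is to log every lookup and flag anyone who pulls too many records in a short window. This is a small Python sketch of that idea only; the `LookupAuditor` class, its limits, and what happens after a flag are all assumptions, not a description of any real system.

```python
from collections import defaultdict, deque


class LookupAuditor:
    """Keep an audit trail of lookups and flag unusually heavy users."""

    def __init__(self, max_lookups: int, window_seconds: float):
        self.max_lookups = max_lookups
        self.window_seconds = window_seconds
        self.log = []  # full audit trail: (timestamp, user, record_id)
        self._recent = defaultdict(deque)  # per-user timestamps in the window

    def record(self, user: str, record_id: str, timestamp: float) -> bool:
        """Log a lookup; return True if the user is over the limit."""
        self.log.append((timestamp, user, record_id))
        recent = self._recent[user]
        recent.append(timestamp)
        # Drop timestamps that have aged out of the sliding window.
        while recent and timestamp - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) > self.max_lookups
```

A `True` result is where an alert, a review, or a temporary block could hang, while the full log stays available for later investigation.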
Let me know if you remember these events, or perhaps if there's a favorite memory of the 2022 data world that you wish I'd included.