I'm occasionally accused of 'over-normalizing'. I take it as a compliment, like being 'over-virtuous'. Years of getting it wrong, and witnessing others doing likewise, have led me to believe in slim, narrow tables with a well-chosen primary key that is supported by a clustered index. Sure, it means that you are doing more joins, but this is only bad news if you believe, as some do, that all joins intrinsically take a long time.
I have twinges of guilt, of course. I had one the other day, while creating a dead simple customer database. Things get bought by a customer. Dead simple, name, gender, address. Hang on. Some items are bought by organizations, via representatives who are different every time we make contact. Organizations don't have a gender. What identifies a customer? Name? I once had a database that acquired eight identical names at the same address. We investigated, of course, and it was true. We had eight customers all with the same name living at the one house. It was one extended family, recently arrived, and the family tradition was to call all males by the same first name. After a while, some of them moved, both in and out, which made life even more difficult.
As well as many people sharing one name, people can have many names at once. I was once involved in a major project for a national retail bank. They decided to rewrite the customer database. After spending several million pounds on it, it had to be scrapped. At the heart of the disaster was the mistake of believing that the customer's name represented their identity. I remember some of the customers who broke the system: One was a lady who had had several marriages, and had accounts under several surnames. To make things even more interesting, she would change her first name occasionally, possibly to shake off some of the complications of her past life. She also took the option of lying about her considerable age.
The more the database team examined the problem, the more difficult it became to try to nail down the identity of an individual customer. Surely, they thought, we can just sort out these few edge cases 'by hand'? The scale of the problem appeared only when we were able to test on live data, as part of the deployment process. Oops.
Maybe it is just names that are awkward. What about gender? Ah, maybe not. Perhaps address is simple? Well, not only does it change, but an increasingly large minority have second or third homes, in contrast to the homeless and those in short-term lets or lodgings. I can remember NAD databases that had a single phone number and email address. How times change.
Real life is richer and more unexpected than we geeks imagine, and entities can end up slipping out of the neat structures to which society believes that they conform.
Phil Factor.