It seems that despite all of our work on solving SQL issues, we as a group rarely write design articles. So I'm starting a series on basic design issues and ideas to help those of you with little experience building schemas. I hope that I'll also inspire those gurus out there to show off some sample scenarios that can help the next group of DBAs build their skills
A long time ago I got called on some phone survey and asked a bunch of questions about the Internet. This was back in the late 90's when the Internet was getting popular and I had time for such nonsense.
In any case I was asked about what type of web site I would build. It was the middle of the day, I was taking 5 minutes from work and I didn't have a site at the time. I thought about it a minute and said I'd build a site that hosted book reviews. I've always had the dream of owning a bookstore and this was before Amazon's reviews were quite as evolved as they are now. Eventually I set up a site and added book reviews.
I wanted to write a short series on designing tables since I see so many forum posts and questions out there on designing things. I know that so many applications and situations are different, but there are some good examples of common situations, inventory, sales, etc., and I thought I'd start with one where I did something wrong early on and then fix it in a later article.
And maybe even learn something from a few of the design gurus out there. While I chose books, there are any number of scheduling type scenarios that you could adapt from this one.
NOTE: One big disclaimer here. This isn't the best design, nor even the current one :). It's the start of a series that shows how to spot flaws and evolve a design.
The Design
The premise for the book review section would be that I would log the books I bought or borrowed, track when they were complete and then write a short review. It was designed to be a single user system since it was my site and I had no interest in hosting reviews from other people. I also decided to ignore comments since I think they add noise to many sites.
My initial design examined books, since they're kind of at the heart of this subsystem. Books have their own unique key in their ISBN, which is used across all books. As we're publishing more now there are 13 digit ISBNs, but when I started there were 10 digit ones. The 10 digit ones still work, so those have turned out to be a decent primary key.
There are also a number of other characteristics of books that I needed to consider:
- Title
- Author
- Cover Image
- Status
- Review Date
- Rating
- Owned
- Review
I'll explain a few of these in more detail to explain how I viewed the system.
Title - The title is easy, every book as one. I wasn't concerned about sub-titles, series, etc. I just wanted what was printed on the cover. Keeping track of stuff in a series wasn't important either since I never knew if I'd actually read them all. So this was a simple character field.
Author - I know that many books have multiple authors, but this wasn't designed to be a full scale, searchable system. After all, how many books will there be to search through? I decided on a simple character field for authors. Multiple authors separated as they are on Amazon, by commas.
Cover Image - Since I was putting up reviews, I figured I'd see if I could make referral money. So this is a simple URL that goes to Amazon. I could automate these using the ISBN, but I found a few books early on that weren't on Amazon and I scanned in images. So this way I could host the image if needed.
Status - This was a tracking system for me in addition to the reviews, so this was a simple set of codes: reading, not reading, completed, etc. I decided on an integer here linked to a lookup table.
Review Date - I wasn't so concerned about figuring out how long it took to read a book as I was knowing when I finished it. Since I read some books in hours and some in months, just tracking the datetime of the finish was acceptable.
Rating - Obviously it makes some sense to rate the book in a review. I could have included this in the prose, but I thought that I might want to get the top books, worst books, etc. I decided on a simple 1-5 rating, but didn't think it was worth linking this in.
Owned - This was a good flag for me, separate from the status, so I could see how many books I bought in a year. Depending on how my finances were going, I've gotten lots of books from the library at different times, so this might be a nice number to know. Or not know, depending on how many I'd purchased.
Review - The last item was the crux of this whole system. The review of the book as I finished it. I had no idea how many words I typed, so I thought I should be optimistic and leave plenty of room.
Code
In looking at the above requirements, I ended up with a fairly simple design. Two tables, the code for which is below, were created and most of the web coding done in a day. As you might guess for a simple system, this was built rather rapidly.
create table books (
isbn varchar(10)
, title varchar(80)
, author varchar(200)
, ImageURL varchar(200)
, BookStatus int
, owner int
, Rating int
, ReviewDate datetime
, bookreview text
)
go
create table bookstatus
( Bookstatus int
, Status varchar(40)
)
I also needed to create a few indexes as part of the design. Leaving these to the end of our design or after things have gone into production is a mistake. I decided on three indexes for the books table and one for the lookup table.
ALTER TABLE [dbo].[books] ADD CONSTRAINT [PK_books] PRIMARY KEY NONCLUSTERED ( [isbn] ) ON [PRIMARY] GO CREATE CLUSTERED INDEX [books_IDX2] ON [dbo].[books]([reviewdate]) ON [PRIMARY] GO CREATE NONCLUSTERED INDEX [books_IDX3] ON [dbo].[books]([title]) ON [PRIMARY] GO ALTER TABLE [dbo].[bookstatus] ADD CONSTRAINT [PK_bookstatus] PRIMARY KEY CLUSTERED ( [bookstatus] ) ON [PRIMARY] GOConclusion
There are some fundamental problems with this design. It does meet the requirements, but it has some flexibility issues and likely will cause me some problems in the future. However this is the type of design I often seen built with beginning designers or those with no experience that are forced to develop a solution.
There are a few easy fixes to these problems that I'll address in the next article. Please feel free to comment on the design, where you see flaws, and what improvements you might make. I've got my own ideas about what to do differently, but this will be a good start and maybe you'll even teach me a thing or two.
Steve Jones © 2007 dkranch.net