March 5, 2009 at 1:59 pm
Hello:
I hope someone can help me with this--I'm at my wits' end!
I inherited a biological taxonomy database containing this table:
TaxonLookup (all fields are varchar(50))
Phylum
Class
Subclass
Superorder
Order
Suborder
Superfamily
Family
Subfamily
Genus
Species
Subspecies
It seems incorrect to me to have all of this data for each item, so my plan was to create these tables:
TaxonUnitType TaxonUnit
RankID (smallint, PK) TaxonUnitID (smallint, PK)
RankName (varchar(50)) UnitName (varchar(50))
DirectParentRankID (smallint) ParentTaxonUnitID (smallint)
RequiredParentRankID (smallint) RankID (smallint, FK)
So the TaxonUnitType table serves as a recursive hierarchy; for example, if Superorder's RankID is 4 and Class's RankID is 2, Order's record would look like this:
RankID.....................5
RankName.................Order
DirectParentRankID.....4
RequiredParentRankID..2
Now I need to create the same kind of hierarchy with the ParentTaxonUnitID in the TaxonUnit table, but it's considerably more difficult since there are tens of thousands of records to deal with. I need to come up with an UPDATE query that will, for example, find the genus associated with each species in the flat file, find the TaxonUnitID whose UnitName matches the genus name, then insert the genus's TaxonUnitID into the species's ParentTaxonUnitID field.
Does this make any sense? Can anyone help a poor stupid newbie? :crying:
March 5, 2009 at 2:07 pm
Break it up into stages. Assuming your flat file has species and genus in it as columns, you import that into a table with the same columns as the flat file. Then add the ID numbers for the parents to that table. Once you have that, insert into the final table.
Does that help?
If you provide a few sample rows of what the flat file looks like, I can help write import code, but without that, I really can't go beyond a description.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
March 5, 2009 at 2:20 pm
I have a question, also, on your proposed schema. Why would the required parent ID of Order be the ID of Class? I think you'd be better off with that rule in a separate join table. Have a table of the allowed values for parent rank, and use that in your keys.
For example:
create table dbo.Ranks (
ID int identity primary key,
Name varchar(100) not null);
go
create table dbo.Ranks_Ranks (
RankID int not null,
ParentRankID int not null,
constraint PK_Ranks_Ranks primary key (RankID, ParentRankID));
go
create table dbo.Units (
ID int identity primary key,
ParentID int null references dbo.Units(ID),
RankID int not null references dbo.Ranks(ID),
Name varchar(100));
go
create trigger Units_RankEnforcement on dbo.Units
after insert,update
as
if exists
(select *
from inserted
inner join dbo.Ranks_Ranks
on inserted.RankID = Ranks_Ranks.RankID
left outer join dbo.Units
on inserted.ParentID = Units.ID
and Units.RankID != Ranks_Ranks.ParentRankID)
rollback;
Something like that would allow you to hold the rule in the database, and you could allow a genus to have a parent that was either a family or subfamily, but not that was an order or phylum (for example).
Not that the taxonomic categories are likely to change all that frequently, but they do change over time. Was originally just Phylum, Class, Order, Family, Genus, Species, without the sub/super categories. And what happens if someone wants to add something at the bottom, like Breed (for dogs or horses)? Easy to do with something like this.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
March 6, 2009 at 11:12 am
Thanks, GSquared! You've definitely gotten me on the right track. I think I have some sort of mental block when it comes to recursive relationships. 🙂 As for your suggestion for the trigger, that's a good idea. I based my tables on an ITIS (Integrated Taxonomic Information System) model I found online--I should have known better than to try and get database design guidance from a government organization!
Thanks again!
March 6, 2009 at 11:38 am
You're welcome. Let us know how it works out, okay?
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply