January 8, 2014 at 4:38 am
Comments posted to this topic are about the item Relational database or graph database? Why not have both?
Tony
January 8, 2014 at 5:13 am
Excellent Article Tony. I have been playing with Neo4j myself for awhile, and you have me very excited as well. I am going to have to try this out. Thanks.
January 8, 2014 at 5:17 am
Hi Toni,
very interesting paper.
I could be that I could use these ideas in the near future (I hope !).
Since I would like to do some tests it would be important for me to work with some realistic (sample) data. Are those publicly available ?
Thanks
Teodoro Marinucci
January 8, 2014 at 5:30 am
teodorom (1/8/2014)
Hi Toni,very interesting paper.
I could be that I could use these ideas in the near future (I hope !).
Since I would like to do some tests it would be important for me to work with some realistic (sample) data. Are those publicly available ?
Thanks
Teodoro Marinucci
Teodoro, unfortunately not. The SnomedCT thesaurus distribution files are only available to download under licence from the International Health Terminology Standards Development Organisation (http://ihtsdo.org) or affiliated national organisations
Tony
January 8, 2014 at 6:39 am
Good work. Good doc.
Can you, please, let me know when you have CRUD working?
Thanks 🙂
January 8, 2014 at 9:03 am
Interesting and very eclectic.
January 8, 2014 at 9:38 am
Great work. On the one hand, I'd be really excited to try this out. On the other hand I hope I never have to.:hehe:
January 8, 2014 at 1:26 pm
It is important to note that:
SNOMED more facilitates information input into an electronic record during patient care (OLTP in concept)
Conversly, ICD facilitates data retrieval for purposes outside of patient care (i.e. reporting/OLAP).
This is a nice solution for extending SNOMED to serve in the latter role; although, I believe there are efforts going on to map SNOMED to ICD (if it is even possible?).
January 8, 2014 at 3:36 pm
I like to think of NOSQL as meaning Not Only SQL rather than No SQL
Haha - perfect.
Thank you for writing this; one of the most interesting articles I've seen for a while.
I am interested to know how scalable this approach is. Have you performed any tests with, say, a hundred or more simultaneous connections all using your TVF in queries?
January 8, 2014 at 4:24 pm
sneumersky (1/8/2014)
It is important to note that:SNOMED more facilitates information input into an electronic record during patient care (OLTP in concept)
Conversly, ICD facilitates data retrieval for purposes outside of patient care (i.e. reporting/OLAP).
This is a nice solution for extending SNOMED to serve in the latter role; although, I believe there are efforts going on to map SNOMED to ICD (if it is even possible?).
Mapping is available from SnomedCT to ICD10, ICD-O and OPCS4.
There is much work being done to facilitate the analytical role of SnomedCT and the approach described hereis part of that.
Tony
January 8, 2014 at 4:28 pm
I am interested to know how scalable this approach is. Have you performed any tests with, say, a hundred or more simultaneous connections all using your TVF in queries?
Not yet. As yet I haven't deployed it to a proper server situation. The only tests were done on my somewhat under-resourced laptop!
Tony
January 8, 2014 at 6:57 pm
teodorom (1/8/2014)
Hi Toni,very interesting paper.
I could be that I could use these ideas in the near future (I hope !).
Since I would like to do some tests it would be important for me to work with some realistic (sample) data. Are those publicly available ?
Thanks
Teodoro Marinucci
I've seen people use Twitter as a source for this kind of data. Tweets are considered public, so you should be able to do some testing with them
January 9, 2014 at 2:02 am
Good work .
Considering performance gain I'd make some query optimization first. Looks like CTE may be a bit shorter, eliminating join which is only needed at final step
WITH children (ConceptID, Level) AS
(
SELECT r.ConceptID1, 1
FROM dbo.Concepts c
INNER JOIN Relationships r ON r.ConceptID2=c.ConceptID
AND r.RelationshipType='116680003' -- conceptID for 'IS_A' relationship
WHERE c.ConceptID='371984007' -- conceptID for malignant neoplasm of oesophagus
UNION ALL
SELECT r.ConceptID1, c.Level+1
FROM children c
INNER JOIN Relationships r ON r.ConceptID2=c.ConceptID
AND r.RelationshipType='116680003' -- conceptID for 'IS_A' relationship
)
SELECT DISTINCT children.*, c.FullySpecifiedName
FROM children
INNER JOIN dbo.Concepts c ON c.ConceptID=children.ConceptID
Next one may load relationships data as 81 separate tables. So the query can be rewritten as
WITH children (ConceptID, Level) AS
(
SELECT r.ConceptID1, 1
FROM dbo.Concepts c
INNER JOIN IS_A r ON r.ConceptID2=c.ConceptID
WHERE c.ConceptID='371984007' -- conceptID for malignant neoplasm of oesophagus
UNION ALL
SELECT r.ConceptID1, c.Level+1
FROM children c
INNER JOIN IS_A r ON r.ConceptID2=c.ConceptID
)
SELECT DISTINCT children.*, c.FullySpecifiedName
FROM children
INNER JOIN dbo.Concepts c ON c.ConceptID=children.ConceptID
After the optimization NoSQL perfomance gain may be not so dramatic. And mind extra efforts to keep two DBs in sync.
January 11, 2014 at 2:21 pm
The thing I like about graph databases is that the relationship between nodes is a whole concept in its own right. It performs the equivalent of a link table between two objects and the necessary foreign keys all at once!
I saw a discussion on Neo4J where the original use case was described as an accidental discovery that they had cracked the problem of the relationships that take place within a content management system.
To be honest, all the types of NOSQL DB have their place. The challenge is to understand what their use case is and where it adds significant benefit to offset the cost of having to manage the technical diversity.
I'm wondering if a natural development path for future developments of SQL Server will be as a general data management platform that can use relational, key-value, document and distributed data all from a common framework.
Blending in Hadoop, column store indexes and in-memory OLTP may be the first step in this process.
January 13, 2014 at 2:18 am
David.Poole (1/11/2014)
I'm wondering if a natural development path for future developments of SQL Server will be as a general data management platform that can use relational, key-value, document and distributed data all from a common framework.
I couldn't agree more. That was partly the driver for getting me into Neo4j. SQL Server remains the bedrock of the RDBMS development, but using the power of other systems to deliver functionality that is best performed outside tSQL. What we really need is a set of ODBC drivers for the various NoSQL offerings, then my SQLCLR approach would be redundant!
Tony
Viewing 15 posts - 1 through 15 (of 17 total)
You must be logged in to reply to this topic. Login to reply