Creating a Glossary

  • I want to create a hyperlinked text glossary.

    I have a text field, which is in a 'results' table. Then I have another table which contains the actual glossary, ie Terms and Explanation.

    I want to create a SP or query, which results in something like this:

    1, 'This is a glossary ', ''

    2, 'Term', 'This Term is explained here'

    3, 'which is ', ''

    4, 'hyperlinked', 'another term explained'

    I guess I can create a SP which parses the text and then looks up each word in the glossary table, but I was looking for some more elegant solution.

    I would appreciate any suggestions!

    TIA

    Felix

  • I've never done it before but the first idea that comes to mind is as follows:

    create table tHyperlinkedTerms (hyperLinkedTermsId int IDENTITY, keyword varchar(100), description varchar(4000))

    create table tGlossary (glossaryId int IDENTITY, keyword varchar(100), description varchar(4000))

    -- a n to m match on this table where n<=m or m<=n

    -- with foreign keys pointing back to the appropriate Id columns

    create table tGlossaryHyperLinkMatches (glossaryId int, hyperLinkTermsId int)

    You could return two recordsets out of these tables... one with the glossary paragraph and the other with a list of words that need to be hyperlinked.

    If you are outputting this to a web page, I think that you could do the hyperlinking/substitution on the clientside.

    Other things to consider:

    Terms that may appear in the hyperlinked list that in turn should be hyperlined.

    IDEA #2

    Benefit of idea 2 is that the hyperlinked area is separated.

    Any word that you want to add a hyperlinked definition for can just be added to the hyperlink table without having to update the mapping table.

    Find each individual word in a varchar string by finding the spaces in the string:

    --will find spaces in the first 500 positions of the string

    create table #spaces (id int IDENTITY, positionIndex int)

    insert into #spaces

    select ones+tens+hundreds as theCount from tParagraph

    join (select 0 as ones UNION select 1 UNION select 2 UNION select 3 UNION select 4 UNION select 5 UNION select 6 UNION select 7 UNION select 8 UNION select 9) as ONES on 1=1

    join (select 0 as tens UNION select 10 UNION select 20 UNION select 30 UNION select 40 UNION select 50 UNION select 60 UNION select 70 UNION select 80 UNION select 90) as TENS on 1=1

    join (select 0 as hundreds UNION select 100 UNION select 200 UNION select 300 UNION select 400 UNION select 500) as HUNDREDS on 1=1

    where id =@paragraphID and ones+tens > 0 and substring(text, ones+tens, 1) = ' '

    order by ones+tens

    --will give the words inbetween the spaces that you found:

    --you'll need to make a fix to have it include the first word...

    create table #Words (id int IDENTITY, words varchar(4000)

    insert into #Words

    select substring(text,a.positionIndex+1, b.positionIndex-a.positionIndex) from tParagraph c

    join #spaces a on 1=1

    left outer join #spaces b on a.id = b.id - 1

    where c.ID = @paragraphID

    Then you can compare this list of words with your hyperlinked words...

    Obviously there's a lot to work out still. Finding commas and periods and what do you do when you even find the hyperlinked words???

    Just some thoughts.

    Good luck.

    -jraha

  • Hi Felix,

    quote:


    I want to create a hyperlinked text glossary.


    just to make sure, I understand you right. do you want to create something like http://www.investorwords.com , http://www.investorwords.com/cgi-bin/searchTerms.cgi?term=maturity ?

    Cheers,

    Frank

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • Thanks jraha and frank for your replies.

    Yes Frank, the end result is going to be a hyperlinked scientific text such as:

    "A drench effective against Barbers Pole can give you up to 6 weeks protection from re-infection"

    The underlined term is a hyperlink, which is looked up in the glossary table, eg. "Barbers Pole: a roundworm that lives in the abomasum, sucks blood and reproduces very quickly" (Thought you wanted to know!!)

    The problem is how can I efficiently parse my initial text to find the words which I need to hyperlink from the glossary.

    Regards

    Felix

  • Hi Felix,

    quote:


    "Barbers Pole: a roundworm that lives in the abomasum, sucks blood and reproduces very quickly" (Thought you wanted to know!!)


    sounds like a description of politicians

    quote:


    The problem is how can I efficiently parse my initial text to find the words which I need to hyperlink from the glossary.


    do you have any influence on this initial text?

    I mean, it would be better to manipulate this text, than to parse it at runtime.

    You could some self made app and a lookup table. Maybe something like this

    A drench effective against <a href="blabla.page?ID=someentryinyourlookuptable">Barbers Pole (which also can come from some field in thsi lookup table)</a> can give you up to 6 weeks protection from re-infection

    Hope you kow what I'm trying to say

    Cheers,

    Frank

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • Frank, unfortunately I don't have any control over the text (as you don't have much control over politicians ;-), so I guess I just have to do the parsing.

    Hovever, I just thought I could possibly create another text field which would contain the scientific text *including* the hyperlinked terms ready made. That way, I only need to parse when the user creates the text, rather than doing it everytime I display it. Of course I double my disk use that way, but we can always buy another disk....

    Still the parsing functions in T-SQL seem pretty basic, unless I missed something. Can't wait for T-.NET !

  • quote:


    Hovever, I just thought I could possibly create another text field which would contain the scientific text *including* the hyperlinked terms ready made.

    That way, I only need to parse when the user creates the text, rather than doing it everytime I display it. Of course I double my disk use that way, but we can always buy another disk....


    hey, that's a kind of influence you need. Is the original text updated frequently? How do the users enter data into the db?

    How many entries will there be? 100,000 or 200,000.

    Seems pretty manageable.

    quote:


    Still the parsing functions in T-SQL seem pretty basic, unless I missed something. Can't wait for T-.NET !


    Well, REPLACE is there, SUBSTRING, CHARINDEX...all you need

    Cheers,

    Frank

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply