containstable??

Question

containstable??

Felipe Saavedra

SSC-Addicted

Points: 477
More actions
February 2, 2005 at 2:18 pm

#66447

Hi again
I've never use CONTAINSTABLE before, and I'm trying to use it but it seems that it needs a indexed table to work with
this is the script that I'm testing
CREATE PROCEDURE metodo2
AS
DECLARE my_cursor CURSOR
FOR
SELECT rfqKey, Family
FROM ##table_metodo1
WHERE rfqKey IN (SELECT rfqKey FROM ##table_metodo1
WHERE pos=1
GROUP BY rfqKey
HAVING COUNT(1) > 1)
ORDER BY rfqKey
DECLARE @rfq int, @Family varchar(255), @keyword varchar(20)
OPEN my_cursor
FETCH NEXT FROM my_cursor INTO @rfq, @Family
WHILE @@fetch_status = 0
BEGIN
SELECT rfdDescription, rfdRfq_LNK INTO #temp1
FROM qstRfqDetail
WHERE (rfdRfq_LNK = @rfq)
DECLARE sec_cursor CURSOR
FOR SELECT keyword FROM ModelTest.dbo.keywords WHERE Family = @Family
OPEN sec_cursor
FETCH NEXT FROM sec_cursor INTO @keyword
WHILE @@fetch_status = 0
BEGIN
SELECT * FROM
CONTAINSTABLE (#temp1, rfdDescription,
'ISABOUT (@keyword)')
FETCH NEXT FROM sec_cursor INTO @keyword
END
CLOSE sec_cursor
DEALLOCATE sec_cursosr
DROP TABLE #temp1
FETCH NEXT FROM my_cursor INTO @rfq, @Family
END
CLOSE my_cursor
DEALLOCATE my_cursosr
GO
the Idea it's inside the double loop use the containstable in order to obtain a rank.
The general idea is, I have a table (Keywords) with 2 columns, one with a keyword and the other a Category that belongs that keyword, (It could be a lot of keywords for each Family)
And I have a #temp1 table with text description, and what i want is to rank if each keyword is inside the text, (I could use charindex but i want more info)
With this script I just want to see the output of containstable, that way I could think something to use it, but I don't really know how it works becuase it fails with this message
Server: Msg 7601, Level 16, State 1, Procedure metodo2, Line 32
Cannot use a CONTAINS or FREETEXT predicate on table '#temp1' because it is not full-text indexed.
That means I have to create a Keywords table with an identity column??
Well I'm just testing here, so if someone have a better picture of this please enlightme

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

John Kane SSCrazy Points: 2810 More actions · Answer 1

First of all, you cannot FT-enable a #temp table, but your can FT-enable the Tempdb database and then FT-Index and run a Full Population on tempdb..temp1. This assumes that tempdb..temp1 has a single column, non-nullable unique key (usually an identity column) that is used in the creation of the FT Index. Note, the difference in how the temp table is referenced: tempdb..temp1 vs. #temp1, the former will work, the latter will not.

Perhaps, I could help more, if I knew more about what you are trying to achieve. What is the goal of your stored proc? If you could provide an explaination of what problem you are trying to solve, we could work towards a solution from that problem description.

Thanks,

John

SQL Full Text Search Blog

http://spaces.msn.com/members/jtkane/

John T. Kane

Felipe Saavedra SSC-Addicted Points: 477 More actions · Answer 2

Well, hi there and thx for the replie

I will explain what I want to do

I have a table ##table_metodo1 with the ranking of Families for each rfqKey (id), but some of them are tied in the first place, so with the my_cursor I fetch the rfqKey and the tied Families one by one to be able to get a second rank to untie it.

Well after open my_cursor I fetch the rows corresponding to the description of the rfqKey and I put them into a temp table #temp1.

Then with sec_cursor I fetch keywords related to the family. The Keywords table it's a table with two columns, one with Families and the other with keywords related to the Femilies, it have no ids ( in example I put the keyword 'electr' with the Electronic, Illumination and Industrial Electricity families), so I select only the keywords for the Family fetched with the my_cursor.

Then I want to have a sum of ranks obtained with containstable for the keyword in all the description rows. With this I'll be able to untie the families, and select a max.

(In the last step I have to create a table with rfqKey, Family, rank) then I'll select the top 3 for each rfqKey, and returns a finall table.

That would be everything, I hope you can help me with this.

Meanwhile I will try to use charindex to obtain a rank, or try to read more carefully containstable in order to use it successfully

thx

John Kane SSCrazy Points: 2810 More actions · Answer 3

I begin to understand what you're looking to accomplish. Effectively, you want to "break the tie" between Rank values returned from a CONTAINSTABLE for your table Families based upon supplied keywords and categories of keywords associated with the Keywords table. Correct?

If so, then the cursor processing you've described is not the best approach. What you're looking for is often called "best bets" or an means to influence the Rank formula to *improve* or raise the Rank value of one row over another. One approach to this problem is to use another table that contains duplicate data of rows from your table Families that you want to have a higher Rank than what would be normally returned. You can see how this is done at http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql2k/html/sql_fulltextsearch.asp. However, IMHO, this method of achieving a "best bet" solution is not scalable and not very flexible, but give it a try and see if it comes close to what you are looking to achieve.

Thanks,

John

SQL Full Text Search Blog

http://spaces.msn.com/members/jtkane/

John T. Kane

Felipe Saavedra SSC-Addicted Points: 477 More actions · Answer 4

It would work if there is a way that i can index keyword with family, from the information in the db, but there isn't.

The real problem is I have to give the best Family for each rfq, and I already use an algorithm that gives me a very good results, but in some cases, there are a ties with more than 3 results, and that is inaceptable. So in order to deal with that, I select the items of the rfq (description of the rfq) it could be a hammer, nail or anything ( the rfqs are quotations), and my keywords would be pencil with Office Materials, or table with Office Material, and it can't be obtained with automatization.

Well I will try to continue with my aproximation but with charindex

very gratefull

Fei

John Kane SSCrazy Points: 2810 More actions · Answer 5

Ok, I think I know what you're trying to do and if my assumptions are correct, it is possible to do this without using cursors, or duplicating 40% - 50% of your data in another table in order to achieve a "best bet" result, i.e. break your ties. However, in order to be certain that I understand your problem, could you post the all the involved table structures via a "CREATE TABLE" statement with a few sample INSERT statements for each table with sample or dummy data?

If I correctly understand what you're trying to accomplish, I may be able to provide a possible solution using MS SQL Server Full Text Search CONTAINSTABLE or FREETEXTTABLE and the influence Rank values that are returned on a per query basis.

Thanks,

John

SQL Full Text Search Blog

http://spaces.msn.com/members/jtkane/

John T. Kane

Felipe Saavedra SSC-Addicted Points: 477 More actions · Answer 6

grrrr I lost my post :@ ...

again ... U_U

My principal table

CREATE TABLE ##table_metodo1

(rfqKey int, rfqName varchar (255), Family varchar(255),

Repet float, Freq float, total float, pos int )

INSERT INTO ##table_metodo1

VALUES (383710,'MATERIAL DE DIV. TENIENTE', 'Válvulas', 7.0447760820388794, 7.4790049420381806, 94.194028973579407, 1)

INSERT INTO ##table_metodo1

VALUES (383710, 'MATERIAL DE DIV. TENIENTE', 'Filtros y Elementos Filtrantes', 5.8059701323509216, 6.163840952147237, 94.194028973579407, 2)

INSERT INTO ##table_metodo1

VALUES (383710, 'MATERIAL DE DIV. TENIENTE', 'Electrónica e Instrumentación', 5.134328305721283, 5.4508001851809702, 94.194028973579407, 3)

INSERT INTO ##table_metodo1

VALUES (396660, 'REP TOMA FUERZA', 'Filtros y Elementos Filtrantes', 0.86567163467407227, 10.0, 8.6567163467407227, 1)

INSERT INTO ##table_metodo1

VALUES (396660, 'REP TOMA FUERZA', 'Hidráulica, Neumática y Mangueras Industriales', 0.86567163467407227, 10.0, 8.6567163467407227, 1)

INSERT INTO ##table_metodo1

VALUES (396660, 'REP TOMA FUERZA', 'Martillos Neumáticos e Hidráulicos', 0.86567163467407227, 10.0, 8.6567163467407227, 1)

INSERT INTO ##table_metodo1

VALUES (396660, 'REP TOMA FUERZA', 'Materiales Eléctricos', 0.86567163467407227, 10.0, 8.6567163467407227, 1)

INSERT INTO ##table_metodo1

VALUES (396660, 'REP TOMA FUERZA', 'Motores', 0.86567163467407227, 10.0, 8.6567163467407227, 1)

And the Keyword table its obtained by bulk insert with this script and file

create table Keywords

(

Family varchar(255),

keyword varchar(20)

)

BULK INSERT Keywords

FROM 'c:\keywords.csv'

WITH (FORMATFILE = 'c:\kw.fmt' )

file:

Materiales Eléctricos;corriente

Materiales Eléctricos;CIRCUIT

Materiales Eléctricos;CLAVIJA

Materiales Eléctricos;BOBINA

Materiales Eléctricos;WIRE

Materiales Eléctricos;ALAMBRE

Materiales Eléctricos;OHMS

Materiales Eléctricos;CABLE

Materiales Eléctricos;WAGO

Materiales Eléctricos;BORNE

Materiales Eléctricos;LAMP

Materiales Eléctricos;ENCHUFE

Motores;RODAMIENTO

Motores;TRUCK

Motores;WABCO

Motores;MOTOR

Hidráulica, Neumática y Mangueras Industriales;HIDRA

Hidráulica, Neumática y Mangueras Industriales;NEUM

Hidráulica, Neumática y Mangueras Industriales;MANGUERA

Hidráulica, Neumática y Mangueras Industriales;FLANGES

format file:

8.0

2

1 SQLCHAR 0 255 ";" 1 Family Latin1_General_CS_AS

2 SQLCHAR 0 20 "\r\n" 2 keyword Latin1_General_CI_AS

Well thx for the concern, and I already have a solution using charindex and the criteria

if value returned by charindex > 0

then is a match and sum a point to the actual family.

This should be everything you need, and the other temp tables used by my proc in the first post are created in the same proc.

Thx again