April 6, 2007 at 8:06 am
Heh... I missed THAT! Not only does the table need some fixin', but you do need to add some memory! My desktop box has a gig... a server running standard edition should have at least 2. For enterprise edition, you really need to scale memory to the number of connections hitting the server but, certainly, no less that 2 gig. Memory is pretty cheap... buy as much memory as the server and the edition of SQL can handle.
--Jeff Moden
Change is inevitable... Change for the better is not.
April 6, 2007 at 8:18 am
I totally aggree on Jeff, except that the "rename" of the new well designed table won't be a "snap". Rename is just not possible with SQL. It might look like in Management Studio. But what it does is create another new table (the one with the new name) an then copy data from the old -and this will take quite a while again and you need double space for a while. Then it'll drop the old one.
Just keep that in mind when using it in production.
Andreas
---------------------------------------------------
MVP SQL Server
Microsoft Certified Master SQL Server 2008
Microsoft Certified Solutions Master Data Platform, SQL Server 2012
www.insidesql.org/blogs/andreaswolter
www.andreas-wolter.com
April 6, 2007 at 9:53 am
This is obviously a log file from a web server, so the table setup is probably the default log created by IIS when you choose ODBC logging. I've done a lot of analysis of web traffic and I think your stuck with the schema - referrers can be very long, queries can be very long (particulary with asp/jsp/cgi sites). I would instead suggest you look at the content and attempt to ditch some data. The problem is IIS is a slut when it comes to logging it just logs everything. So hits to your site from developers, search engine bots, hits to images on the page ... I typically tried to use the default log as a staging area, then do an etl process into new tables (obviuosly you need to talk to your management about what they want). I would get rid of hits to the site from our own domain (I don't care how many hits I got from my people reviewing the site, just outsiders coming in). Get rid of the bot hits, Recode the user agent, ditch all the hits to gif, jpg ... and keep only the asp, html and real "pages". I found I could reduce the web site by a qtr by doing all the above. Then I could apply indexes that acutally worked. The downside is your traffic will appear less, but it will be closer to truth and your queries will be faster.
April 6, 2007 at 1:45 pm
or you could just exec sp_rename...
April 6, 2007 at 2:56 pm
April 6, 2007 at 4:51 pm
Sorry Andreas... That's absolutely incorrect and you really need to check your facts before making such an unqualified statement... Try this... insert a column in the middle of an existing table using EM... now, look at the code that action generates... yes, it's creating a new table (like I am in my suggestion) and that will take some time especially for 60 million rows... but, near the end of the code, you'll find that even EM uses sp_Rename and it only takes milli-seconds.
But, no sense arguing about the speed... a demonstration is in order...
The following creates a million row table and then measures the amount of time it takes to rename the table using sp_Rename... do read the comments in the code before you post back, eh?
--===== Identify the environment to run the test in
USE TempDB
SET NOCOUNT ON -- Suppress the autodisplay of rowcounts for speed and appearance
--===== Declare local variables
DECLARE @bar CHAR(78) --A line of "=" to separate sections on the report device
DECLARE @StartDate DATETIME --Remembers when a section of code started
SET @bar = REPLICATE('=',78)
PRINT @bar
--===== Create and populate a million row test table
-- (~37 seconds to execute).
-- Column "SomeDate" has a range of >=01/01/2000 <01/01/2010
-- That's ten years worth of dates.
-- Identify the section and start the section timer
PRINT 'Building the million row table...'
SET @StartDate = GETDATE()
-- Build and populate the table on the fly
SELECT TOP 1000000
RowNum = IDENTITY(INT,1,1),
SomeID = CAST(RAND(CAST(NEWID() AS VARBINARY))*50000+1 AS INT),
SomeValue = CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*25+65))
+ CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*25+65)),
SomeNumber = CAST(RAND(CAST(NEWID() AS VARBINARY))*100 AS MONEY),
SomeDate = CAST(RAND(CAST(NEWID() AS VARBINARY))*3653.0+36524.0 AS DATETIME)
INTO dbo.jbmBigTest
FROM Master.dbo.SysColumns sc1 WITH (NOLOCK),
Master.dbo.SysColumns sc2 WITH (NOLOCK)
-- Display the rowcount and the duration for this section
PRINT STR(@@ROWCOUNT,12) + ' Rows Inserted'
PRINT CONVERT(CHAR(12),GETDATE()-@StartDate,114) + ' (hh:mi:ss.mil)'
PRINT @bar
--===== Rename the table to show how long it takes
-- (~0 to 220 milliseconds to execute with an average less than 5 milliseconds)
-- Identify the section and start the section timer
PRINT 'Renaming the table from jbmBigTest to jbmBigTestRenamed...'
SET @StartDate = GETDATE()
-- Rename the table
EXEC dbo.sp_Rename 'jbmBigTest', 'jbmBigTestRenamed'
-- Display the duration for this section
PRINT CONVERT(CHAR(12),GETDATE()-@StartDate,114) + ' (hh:mi:ss.mil)'
PRINT @bar
--===== Display the first 10 rows just to show the table has been renamed
PRINT 'Selecting 10 rows from jbmBigTestRenamed to prove it''s there...'
SELECT TOP 10 * FROM dbo.jbmBigTestRenamed
--===== Housekeeping (and further proof that the table has been renamed)
DROP TABLE jbmBigTestRenamed
DROP TABLE jbmBigTest
PRINT 'Test complete.'
... and here's the output from the run...
============================================================================== Building the million row table... 1000000 Rows Inserted 00:00:36:830 (hh:mi:ss.mil) ============================================================================== Renaming the table from jbmBigTest to jbmBigTestRenamed... Caution: Changing any part of an object name could break scripts and stored procedures. The object was renamed to 'jbmBigTestRenamed'. 00:00:00:000 (hh:mi:ss.mil) ============================================================================== Selecting 10 rows from jbmBigTestRenamed to prove it's there... RowNum SomeID SomeValue SomeNumber SomeDate ----------- ----------- --------- --------------------- ------------------------------------------------------ 1 407 SH 45.0266 2009-08-02 18:07:35.383 2 31297 YI 57.4772 2007-07-03 19:22:33.887 3 19774 EK 9.5429 2007-08-13 19:13:12.447 4 11851 FL 51.3669 2000-03-01 02:02:53.813 5 24085 MQ 59.8272 2004-11-20 11:34:09.050 6 47946 RD 33.5944 2000-02-25 23:35:55.947 7 46825 EH 23.0357 2002-12-06 18:48:11.760 8 10240 FI 56.5999 2004-03-14 21:07:18.460 9 2467 SF 62.5958 2000-03-02 19:40:05.820 10 486 EH 56.5331 2006-09-22 00:19:23.660
Server: Msg 3701, Level 11, State 5, Line 56
Cannot drop the table 'jbmBigTest', because it does not exist in the system catalog.
Test complete.
Like I said... it's a "snap" and it's fast. Any questions, Andreas?
--Jeff Moden
Change is inevitable... Change for the better is not.
April 6, 2007 at 4:59 pm
Heh... point well taken, David... we have a day or two just like that. Our "big one" is affectionately known as "Cycle 28".
--Jeff Moden
Change is inevitable... Change for the better is not.
April 6, 2007 at 5:19 pm
Halifaxdal...
I just noticed that the size of the MDF table doubled when you added the column... you've done something wrong... very wrong... stop modifying the original table!!!
--Jeff Moden
Change is inevitable... Change for the better is not.
April 6, 2007 at 6:34 pm
Unless I'm missing something, his original table was about the same size, as he had three columns at 4096 each. While I realize that you questioned his need for that large of a varchar on two of them, it looks like he kept the same basic structure he originally had.
April 6, 2007 at 7:38 pm
David,
What I was looking at were the numbers from one of the previous posts (reprinted here for convenience...)
Warning: The table 'IISLog' has been created but its maximum row size (13625) exceeds the maximum number of bytes per row (8060). INSERT or UPDATE of a row in this table will fail if the resulting row length exceeds 8060 bytes.
(63008970 row(s) affected)
Original files' size:
mdf: 16,860,096
ldf: 1,008,000
New files' size:
mdf:37,697,856
ldf: 28,333,632
Notice that the size of the MDF file more than doubled... adding a simple INT IDENTITY column probably would not account for that. I'm thinking that an incorrect padding option was used and all the varchar columns have been right padded with spaces to the full extent of the data type... we can find that out by comparing DATALENGTH to LEN on each variable length column. If true, the transfer is going to need an RTRIM on each variable length column... that's going to slow things down even more not to mention causing additional and unwanted logging.
And, because someone is operating without the proverbial "net", they also managed to blow the log file way out of proportion and that someone (Halifaxdal) has even more work to do, now. THIS is the reason we needed to use SELECT/INTO to create the IDENTITY column and to do other things... when done correctly, logging will be minimized.
The tape backup boys are gonna be pissed about that log file!
--Jeff Moden
Change is inevitable... Change for the better is not.
April 6, 2007 at 7:49 pm
Halifaxdal,
I/we are trying to help... in order to do that, I need answers to the questions I posed, earlier. Of course, if you don't want the help or think you can do it on your own, no problem... but, considering that your little experiment caused the overall database storage requirements to grow from a piddley 18 gig to a whopping 66 gig just to add a single column to a medium sized table, I'm thinking you really need the help.
No wonder it took so long... those poor files had to grow many, many times.
And, considering bnordberg's observations about this being a system log file (in which case, I wouldn't need those answers I asked for), it might be a bad thing to change anything but some indexes on the table. We can repair all of this damage and an archive may be in order instead of modifying the table.
--Jeff Moden
Change is inevitable... Change for the better is not.
April 6, 2007 at 9:47 pm
Hi Jeff, and other gurus,
Thank you so much for the analysis. I apologize I reply late because of the holiday reason. (Happy Easterday! )
Yes like bnordberg said this is from a system generated log file. I have no idea how the field is populated and what will be populated in the future, before I left company on Thursday I checked the max size of those varchar(4096) fields, and the maximum size I got is definitely less than 1000.
While looking for fundamental database optimization, I am also thinking if I can do anything from the front-end, seriously! I used dotLucene in one of my other project, dotLucene is an excellent search engine that is usually used to index/search files. But it can be used to search database content too! I will try if it helps to sort it out. IT IS THUNDERING FAST! (probably I have to wait until Tuesday)
Here is a link to dotLucene:
April 6, 2007 at 9:56 pm
I have a local copy of the original data files, so it't not a problem to use your suggestion (SELECT/INTO) to create a new well-designed schema on the server.
April 6, 2007 at 10:23 pm
FYI, here is some data from the table:
select top 10 * from iislog
date time c-ip cs-username s-ip s-port cs-method cs-uri-stem cs-uri-query sc-status cs(User-Agent) cs(Referer) IISLog_id
------------------------------------------------------ ------------------------------------------------------ --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ------ --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- --------- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- -----------
2006-11-01 00:00:00.000 1900-01-01 00:10:18.000 201.141.179.251 - 168.165.10.18 443 GET /Imagenes/movmis.gif - 200 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1) https://inverweb.somecompany.com/InverWeb/MovtosCheques/MovChqsMisDia.asp?lang=0 1
2006-11-01 00:00:00.000 1900-01-01 00:10:18.000 201.141.179.251 - 168.165.10.18 443 GET /Imagenes/b_acept.gif - 200 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1) https://inverweb.somecompany.com/InverWeb/MovtosCheques/MovChqsMisDia.asp?lang=0 2
2006-11-01 00:00:00.000 1900-01-01 00:10:19.000 189.145.60.86 - 168.165.10.18 443 GET /Imagenes/toppage.gif - 304 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.1.4322) https://inverweb.somecompany.com/inverweb/Default.asp 3
2006-11-01 00:00:00.000 1900-01-01 00:10:19.000 189.145.60.86 - 168.165.10.18 443 GET /inverweb/rutinaJS.js - 304 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.1.4322) https://inverweb.somecompany.com/inverweb/Default.asp 4
2006-11-01 00:00:00.000 1900-01-01 00:10:19.000 189.158.178.21 - 168.165.10.18 443 GET /inverweb/ConsultaMovimientos/default.asp id=99999%2C160%2C0250%2C2%2C1%2C&lang=0 200 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.0.3705;+.NET+CLR+1.1.4322;+Media+Center+PC+4.0) https://inverweb.somecompany.com/inverweb/Default.asp 5
2006-11-01 00:00:00.000 1900-01-01 00:10:19.000 189.172.89.195 - 168.165.10.18 443 GET /Imagenes/movant_c.gif - 200 Mozilla/5.0+(Windows;+U;+Windows+NT+5.1;+en-US;+rv:1.7.2)+Gecko/20040804+Netscape/7.2+(ax) https://inverweb.somecompany.com/inverweb/ConsultaMovimientos/default_tc.asp?id=99999,238,0273,3,1,&lang=0 6
2006-11-01 00:00:00.000 1900-01-01 00:10:19.000 148.233.37.34 - 168.165.10.18 443 GET /inverweb/blanco.asp - 200 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.0) https://inverweb.somecompany.com/inverweb/ 7
2006-11-01 00:00:00.000 1900-01-01 00:10:19.000 63.68.228.197 - 168.165.10.18 443 GET /Inverweb/Inverweb_imagenes/fondo.gif - 404 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+InfoPath.1;+.NET+CLR+1.1.4322;+.NET+CLR+2.0.50727) https://inverweb.somecompany.com/Inverweb/entrada.asp 8
2006-11-01 00:00:00.000 1900-01-01 00:10:19.000 200.34.165.26 - 168.165.10.18 443 GET /Inverweb/salir.asp - 302 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.1.4322;+.NET+CLR+2.0.50727) https://inverweb.somecompany.com/inverweb/ 9
2006-11-01 00:00:00.000 1900-01-01 00:10:19.000 189.172.89.195 - 168.165.10.18 443 GET /Imagenes/movtc.gif - 200 Mozilla/5.0+(Windows;+U;+Windows+NT+5.1;+en-US;+rv:1.7.2)+Gecko/20040804+Netscape/7.2+(ax) https://inverweb.somecompany.com/inverweb/ConsultaMovimientos/default_tc.asp?id=99999,238,0273,3,1,&lang=0 10
(10 row(s) affected)
I also have a very bad news to report:
On Thursday just after I finished adding the index to the table using David's code, I was able to get the max length of a field in 9 mins, and then I check all the three varchar(4096) fields and get result in about 7 mins, but now, it's been 18 mins, result is still not there.... I'll check tomorrow.
April 8, 2007 at 1:37 pm
>>or you could just exec sp_rename...
which is not the point I was trying to make - I was trying to stress was the GUI (which USES sp_rename) actually DOES
Andreas
---------------------------------------------------
MVP SQL Server
Microsoft Certified Master SQL Server 2008
Microsoft Certified Solutions Master Data Platform, SQL Server 2012
www.insidesql.org/blogs/andreaswolter
www.andreas-wolter.com
Viewing 15 posts - 16 through 30 (of 40 total)
You must be logged in to reply to this topic. Login to reply