August 11, 2015 at 9:51 am
I'm facing the same issues right now as we migrate to a new data center and possibly migrate data from SQL Server to a new Hadoop environment.
I find it insanely hard to find the length of a tree or a piece of string. That's because how tall is a tree? How long is a piece of string really? I can't answer fully because trees and strings come in all different sizes and lengths.
So, I've been pretty clear with my system/server admins. Right now, we need X amount of space. But we may need a lot more than that depending on the project scope and client. We need to be able to add resources beyond our original estimates at a moment's notice.
Ranges help me, but so does just being clear about the guesswork.
August 11, 2015 at 10:05 am
xsevensinzx (8/11/2015)
I'm facing the same issues right now as we migrate to a new data center and possibly migrate data from SQL Server to a new Hadoop environment. I find it insanely hard to find the length of a tree or a piece of string. That's because how tall is a tree? How long is a piece of string really? I can't answer fully because trees and strings come in all different sizes and lengths.
So, I've been pretty clear with my system/server admins. Right now, we need X amount of space. But we may need a lot more than that depending on the project scope and client. We need to be able to add resources beyond our original estimates at a moment's notice.
Ranges help me, but so does just being clear about the guesswork.
My understanding is that scaling out with more nodes, on demand and inexpensively, is Hadoop's claim to fame.
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
August 11, 2015 at 10:16 am
Jeff Moden (8/11/2015)
Everyone I know keeps saying that "Disk space is cheap". I know better but it's still cheaper than having to work within ridiculously small confines on an unknown project. While I appreciate that SAN administrators DO have a job to do, they should never get upset when someone says something like "We need 20GB in the short term and we'll be able to drop back to 5GB when we're done". They should just leave the space so that they won't have to add it later when someone finds out their original estimate was incorrect.
Disk is cheap, but administration and development are not. And we use more than we ever have.
August 11, 2015 at 10:17 am
Eric M Russell (8/11/2015)
Disk space truly is cheap. Yes, we can get a lot more storage space for our money today, but I/O speed hasn't kept pace. So, when an application database team requests an order of magnitude more space, the real concern shouldn't be cost but rather how the unexpected usage and growth rate impact performance.
This is actually changing. Finally.
Newer flash storage and 3D architectures are finally starting to change I/O speeds dramatically.
August 11, 2015 at 10:19 am
For some reason I had never thought of the challenges that ensue when I need more space as being related to the person being fixated on an original estimate from a long time ago.
However, you're right. Asking for 15GB when you initially asked for 5GB can feel like the Spanish Inquisition. I have to explain why it's now 15GB and whether it will always be 15GB. And I'm left there thinking "my phone has more space than what we're arguing about".
I need to start all my estimates off in terabytes. That way, even if I only get 1, it'll cover me for a while.
Leonard
Madison, WI
August 11, 2015 at 11:35 am
John Hanrahan (8/11/2015)
Scotty did say that, but it was to Geordi LaForge (not sure on spelling) when he was on Star Trek Next Gen. I only remember this because I have a great memory, not because I watched every episode at least twice. ...
Sorry. Star Trek 3: The Search for Spock:
https://en.wikiquote.org/wiki/Star_Trek_III:_The_Search_for_Spock
I remember Scotty on TNG, wasn't it the Dyson Sphere episode? It's been so long since I saw it that I'm not sure. IIRC, Geordi was alarmed about pushing the engines beyond spec, or something like that, and Scotty assured him that the specs were written by him and he knew what the true max was.
I'm a firm believer in padding and "under promise, over deliver".
-----
Knowledge is of two kinds. We know a subject ourselves or we know where we can find information upon it. --Samuel Johnson
August 11, 2015 at 11:53 am
Agreed. Everybody knows it, but everybody wants to ignore it. Many things may be guesses, but this is surely a starting point to proceed from, which beats developing in thin air. At least 60-70% of a documented design is followed and doesn't change. What do you say?
August 11, 2015 at 1:04 pm
phonetictalk (8/11/2015)
I need to start all my estimates off in terabytes. That way, even if I only get 1, it'll cover me for a while.
For now. Might be PB soon.
August 11, 2015 at 1:19 pm
Steve Jones - SSC Editor (8/11/2015)
phonetictalk (8/11/2015)
I need to start all my estimates off in terabytes. That way, even if I only get 1, it'll cover me for a while.
For now. Might be PB soon.
FSM protect us! I can't imagine what the initialization time on a PB DB would be like, and backing up and verifying such a beast would be amusing. I know it's inevitable, I just don't think it would be an environment that I'd like to work in.
-----
Knowledge is of two kinds. We know a subject ourselves or we know where we can find information upon it. --Samuel Johnson
August 11, 2015 at 1:30 pm
John Hanrahan (8/11/2015)
Scotty did say that, but it was to Geordi LaForge (not sure on spelling) when he was on Star Trek Next Gen. I only remember this because I have a great memory, not because I watched every episode at least twice. On another note, it is always better to under promise and over deliver, so I always pad, pad, and pad some more. I have found that I usually need the first 2 pads anyway.
I knew he'd said it...just couldn't remember tv or movie.
My memory right now is shot, thanks to lack of sleep, lack of caffeine, and having to sort out a SQL issue with a pre-gen statement from some old code with 70+ fields, hunting down the one causing the "String or binary data would be truncated..." error... such fun.
August 11, 2015 at 1:32 pm
You can provide the business with an equation like the following:
Bytes of Storage =
{? avg orders per day}
X {? avg items per order}
X {? number of days}
X (350)
X (2)
The business must supply estimates for orders per day, items per order, and the number of days to plan for.
In this example, (350) is a hypothetical estimate for average row and index storage per transaction. It shouldn't be a total guess but rather based on something like a draft of the physical model and a sample data load in development. The (2) is padding, in this case doubling the estimate, and what you would use depends on how conservative you want to be. If there is reference data, something like a product catalog, then factor that in as well, but it is usually not multiplied by usage, and you probably have a good idea upfront of how many records and what layout you're going to deal with.
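As a rough illustration, the equation above could be sketched in Python as follows. The function name, the parameter names, and the sample inputs (1,000 orders per day, 3 items per order, one year) are assumptions made up for this sketch; only the 350-byte figure and the 2x padding come from the example. Adding reference data once, unpadded, is also just one possible choice.

```python
def estimate_storage_bytes(orders_per_day, items_per_order, days,
                           bytes_per_item=350, padding=2.0,
                           reference_bytes=0):
    """Rough storage estimate following the equation in the post.

    bytes_per_item (350) and padding (2) are the hypothetical values from
    the example. reference_bytes is an assumed extra for things like a
    product catalog; it is added once and not multiplied by usage.
    """
    transactional = orders_per_day * items_per_order * days * bytes_per_item
    return transactional * padding + reference_bytes


# Hypothetical inputs the business would have to supply.
estimate = estimate_storage_bytes(orders_per_day=1_000,
                                  items_per_order=3,
                                  days=365)
print(f"Estimated storage: {estimate / 1024**3:.2f} GiB")
```

Keeping the padding as a separate argument makes it easy to show the business how the number moves when the assumptions change.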
If this is a database for an eCommerce website, and the business wants you to do something crazy like record every time a visitor hovers their mouse over a product, then all bets are off. Just have the application stream the records off to a network folder somewhere (not on your data drive) and let the BI team deal with it. Convince the business it's outside the scope of the transactional database.
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
August 11, 2015 at 1:34 pm
jay-h (8/11/2015)
jckfla (8/11/2015)
Not sure who Hofstadter is/was. Have to look him/her up.
.
Hofstadter's book Gödel, Escher, Bach ranks at the top of my book list. Entertaining and inter-related dissertations on logic, mathematics, philosophy, music, art...
I will have to look it up. Thanks.
August 11, 2015 at 2:46 pm
For person-time (not disk space or bandwidth or other fun techie items) the standard in project management for the last 30 years has been "take the estimates you get and multiply by 2.6". When I've gone back and looked at past projects where that was not done, I've found that 2.4 to 2.6 was indeed the correct multiplier.
And oddly, it seems to work regardless of whether the people who gave the estimates knew about the multiplier and had already applied it... :hehe:
For rough estimates, we ALWAYS give a range, usually a pretty big one (depending on how rough it is and how many "known unknowns" there are). The range gets smaller as we receive more information and re-estimate.
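To make the arithmetic concrete, here is a minimal Python sketch of applying that multiplier and reporting a range instead of a single number. The function names and the 10-day sample estimate are assumptions for illustration; the 2.4 and 2.6 values are the multipliers mentioned above.

```python
def apply_multiplier(raw_estimate, multiplier=2.6):
    """Scale a raw person-time estimate by the rule-of-thumb multiplier."""
    return raw_estimate * multiplier


def estimate_range(raw_estimate, low=2.4, high=2.6):
    """Return a (low, high) range rather than a single point estimate."""
    return raw_estimate * low, raw_estimate * high


# Hypothetical raw estimate of 10 person-days from the team.
raw_days = 10
low_days, high_days = estimate_range(raw_days)
print(f"Plan for roughly {low_days:.0f} to {high_days:.0f} person-days")
```

As more information arrives, the same functions can be rerun with a fresh raw estimate, or with a narrower low/high pair, to tighten the range.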
And I have to say I don't mind when the SysAdmins ask for justification for increased disk (or memory or whatever). Having to explain it to another person has often led us to better solutions - or to increasing the disk allocation even more than requested based on things we might have missed. So I actually see their questioning as a positive, rather than a negative.
Love it when you throw out these really thoughtful editorials, Steve, even when I don't have time to respond. You always get me thinking! Thanks. 😉
Steph Brown
August 11, 2015 at 2:48 pm
Randy Worrell (8/11/2015)
I agree on the concept of making "guess-timates" when asked to provide estimated time and scope. Almost without fail, any estimate--made in good faith--will be received as having been cast in stone. What we understand, and management or others down the line in an IT shop don't (or won't), is that many IT projects have a margin of error of 100%, especially projects that have never been done before. A mechanic can estimate how long a brake job will take on a certain vehicle because it's been done millions of times. But IT projects aren't like that. We can only give what I've called "guess-timates" based upon similarities with other projects. The farther into the project we go, the more accurately we can ratchet down the estimate; but to create a project plan that is inflexible is simply irrational. The project manager then uses his/her negotiating skills to bring management into the real world. I shudder each and every time I'm asked for an estimate on any project... especially when I know the one asking is famous for scope creep! :w00t:
----------------------------------
Randy Worrell
+1 on your comment!
In a similar vein, Steve's questions combined with your comment made me think of reference class forecasting, which I have heard is pretty effective for situations where you do have access to similar projects. Apparently most of them do generally take a similar amount of time and resources, at least close enough to keep estimates in the ballpark.
Also, the editorial made me think of the Anchoring Effect:
https://en.wikipedia.org/wiki/Anchoring
That effect may explain some of the disagreements Steve mentioned. It is possible (I am guessing) that initial numbers for things like storage allocation serve as quantity anchors that stick too rigidly in some situations. It may not be anyone's fault in particular. As Steve said, a seemingly small margin of uncertainty in the initial storage request can come back to bite the requestor: a mental guesstimate of 5 GB vs 20 GB may seem small (it is not, say, a factor of 10 off) but can still cause unhappiness for storage admins who track their data down to the GB or even MB. They may see a real limit, or even a slippery slope: if everyone came back and asked for 15 more GB (or 300% more, depending on how you calculate it), they would quickly run out of space or money or both.
I happen to think organizations should try to build in flexibility for such variances, but at the same time there has to be a limit, too, almost like budgeting with money. Some months a person might see 3 movies instead of 2, but they know their budget is not flexible enough to see 6 movies. That kind of thing.
-------------------
A SQL query walks into a bar and sees two tables. He walks up to them and asks, "Can I join you?"
Ref.: http://tkyte.blogspot.com/2009/02/sql-joke.html
August 11, 2015 at 3:11 pm
There is a level of hell where you get committed to estimates before you find out the subject of the estimate.
Does anyone have any experience of estimates that have proven to be reasonable predictors of actual activity?
Does the whole bureaucracy surrounding estimates actually justify its own existence?