March 4, 2010 at 9:34 am
Hi Experts,
I have been through several interviews recently and have come across certain questions that everybody asks. Each interviewer seems to expect a different answer and is never satisfied with mine. I would like some experts to give me answers to these questions, or to tell me what exactly interviewers expect to hear:
1. How many dimension and fact tables did you have?
It always varies with the requirements. You also design your fact tables based on the reports you are going to produce.
2. How did you load your dimension tables?
You can load them using SSIS or T-SQL. All you need is a source to load from, some transformations to perform (these vary with the requirements again), and then your INSERT statements (or, I should say, an OLE DB Destination) to complete the load.
3. Now tell me about the fact table: how did you load it?
The fact table load is basically performed after loading all the related dimensions. It contains a lookup for each FK in the fact table, plus some measures (which vary too).
4. How often did you DELETE/UPDATE your fact tables? I said I never delete from a fact table, I only insert into fact tables, but they are not satisfied...
You don't need to update or delete from a fact table, only INSERT.
5. What BEST practices did you follow in your packages (control flow/data flow)?
6. How did you use logging and configurations in your package? If I say I used XML configuration and SQL Server logging, they ask again: HOW?
You can use logging to record events that you want to check after the package has run. You use configurations so that you don't have to hardcode everything and can pass values in at runtime.
7. How did you deploy your packages? (I didn't do it; the DBA does it.)
Deployment can be done to SQL Server or to the file system.
8. What is the difference between the MERGE and MERGE JOIN transformations?
9. Where have you done data modelling, and how? I said I assisted the data modeller, but they want to hear something else...
10. How did you do data cleansing and data profiling? I said I assisted the data analyst...
11. How and when did you use checkpoints? (They are very expensive and do not work well, so I didn't use them in my packages.)
12. Why do we create an INDEX on tables? Can we create an index on all the columns of a table? (I don't know; the DBA does all this work.)
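For questions 2 and 3, the load order (dimensions first, then a fact load that looks up each foreign key) can be sketched roughly as below. This is only an illustration in Python with an in-memory SQLite database, not SSIS; the table names `dim_product` and `fact_sales`, the columns, and the sample rows are all made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (
        product_key  INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        product_code TEXT UNIQUE NOT NULL                -- business key
    );
    CREATE TABLE fact_sales (
        product_key INTEGER NOT NULL REFERENCES dim_product(product_key),
        quantity    INTEGER NOT NULL                     -- measure
    );
""")

# Hypothetical source rows: (business key, measure).
source_rows = [("A100", 3), ("B200", 5), ("A100", 2)]

# Step 1: load the dimension first, inserting only business keys not seen yet.
conn.executemany(
    "INSERT OR IGNORE INTO dim_product (product_code) VALUES (?)",
    [(code,) for code, _ in source_rows],
)

# Step 2: load the fact table, looking up the surrogate key for each row.
conn.executemany(
    """
    INSERT INTO fact_sales (product_key, quantity)
    SELECT product_key, ? FROM dim_product WHERE product_code = ?
    """,
    [(qty, code) for code, qty in source_rows],
)
conn.commit()

dim_count = conn.execute("SELECT COUNT(*) FROM dim_product").fetchone()[0]
fact_count = conn.execute("SELECT COUNT(*) FROM fact_sales").fetchone()[0]
total_qty = conn.execute("SELECT SUM(quantity) FROM fact_sales").fetchone()[0]
print(dim_count, fact_count, total_qty)  # 2 3 10
```

In an SSIS package the second step would typically be a Lookup transformation per foreign key in the data flow, but the ordering idea is the same.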
I may have more, but these are the main questions.
Please provide your inputs and let me know if I missed something as well.
Thanks a ton
March 4, 2010 at 10:36 am
I don't think there are right or wrong answers for most of those. They look like questions designed to explore what you know, how you've worked in the past, what practices you use with SSIS, etc.
If the interviewer doesn't seem satisfied, ask for clarification, ask what it is he's looking for, ask for more details, etc. If you've never done something (data modelling, deployment, indexing) then just say so.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
March 4, 2010 at 11:07 am
I agree with Gail, ask for clarification.
Don't lie about what you have done. If you say you have done something, then when they ask you to do it you end up scrambling. If you say you have done data profiling and you haven't, you will probably not do it very well that first time.
CEWII
March 4, 2010 at 12:58 pm
Hi,
Thanks for your reply. I always try to do that, but still...
Can someone provide me some answers to questions 4, 5 and 8?
I am really confused about what to answer.
Thanks
March 4, 2010 at 1:17 pm
I think your answer to 4 is just fine; I would ask them what they are looking for.
I would ask for clarification for 5. That isn't a clear question.
For number 8 look in BOL:
Merge Transform
ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/extran9/html/cff8690c-07ac-46a0-aab5-20bd4848c677.htm
Merge Join Transform
ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/extran9/html/cd8b0412-f83b-4bd2-b227-e53dcfd941a8.htm
CEWII
March 4, 2010 at 1:20 pm
What I would answer on those questions:
4. Updates: only if corrections are allowed to be made. Otherwise it is just a new fact.
Deletes: if the fact table contains snapshots, I delete today's snapshot data if the ETL runs twice or more by accident.
5. Best practices. You just can't answer that question correctly 🙂 Every company/individual has its own best practices.
Use logging, package variables set from configuration tables, standard naming conventions and data types, et cetera...
8. I don't know it by heart 🙂 I only used a Merge Join once, to join different sheets of an Excel file together and put them in one table. For a Merge Join the data has to be sorted; for a Merge, well, I don't know... I would surely avoid using them though 🙂
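For question 8, a rough sketch of the difference may help. As far as I know, both transformations expect sorted inputs: Merge interleaves two sorted inputs into one sorted output (rows are stacked, like a sorted Union All), while Merge Join matches rows from two sorted inputs on a key (columns are combined). The Python below only imitates that behaviour on lists of dicts; the function names and sample data are made up, and only an inner join is shown.

```python
import heapq

def merge(left, right, key):
    """Imitates Merge: interleave two sorted inputs into one sorted output.
    The output has len(left) + len(right) rows."""
    return list(heapq.merge(left, right, key=key))

def merge_join(left, right, key):
    """Imitates an inner Merge Join: both inputs must already be sorted on key.
    Rows with equal keys are combined into one wider row."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = key(left[i]), key(right[j])
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right rows sharing this key.
            j_end = j
            while j_end < len(right) and key(right[j_end]) == lk:
                j_end += 1
            # Pair every left row with this key against that run.
            while i < len(left) and key(left[i]) == lk:
                for r in right[j:j_end]:
                    out.append({**left[i], **r})
                i += 1
            j = j_end
    return out

def by_id(row):
    return row["id"]

orders_a = [{"id": 1, "amount": 10}, {"id": 3, "amount": 30}]
orders_b = [{"id": 2, "amount": 20}, {"id": 4, "amount": 40}]
names = [{"id": 1, "name": "Ann"}, {"id": 3, "name": "Bob"}, {"id": 5, "name": "Cy"}]

merged = merge(orders_a, orders_b, key=by_id)   # 4 rows, same columns
joined = merge_join(merged, names, key=by_id)   # 2 rows, extra "name" column
print([r["id"] for r in merged])  # [1, 2, 3, 4]
print(joined)
```

If either input is not actually sorted, `merge_join` silently drops matches, which is one way to see why a sort ahead of a Merge Join matters.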
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
March 4, 2010 at 2:28 pm
Hi thanks a lot for your valuable inputs...
I still have doubts about BEST PRACTICES...
I mean, I say that we can use NOLOCK, that we can use SELECT queries in the OLE DB Source, and that we avoid the AGGREGATE and SORT transformations...
But apart from this I can't think of any other point to make. Is there any other point that I am missing?
Thanks a lot
March 4, 2010 at 2:33 pm
As far as NOLOCK goes, I always recommend caution when using it. You want to make sure that the data in your DW is valid and that you didn't read something in flux.
CEWII
March 4, 2010 at 3:03 pm
cRuchika (3/4/2010)
I still have doubts about BEST PRACTICES...
Google SSIS Best Practices
I mean I say that we can use NOLOCK,
I wouldn't necessarily say that's a best practice. Common practice, yes, often because people don't know what NOLOCK actually means.
See - http://sqlblog.com/blogs/andrew_kelly/archive/2009/04/10/how-dirty-are-your-reads.aspx
Gail Shaw
March 5, 2010 at 12:46 am
Very common best practices:
- never use *, always select only the columns you need. Not only in OLE_SRC, everywhere (e.g. the Lookup component).
Limit the results of the queries with appropriate WHERE clauses.
- avoid using blocking components
- never use the built-in slowly changing dimensions wizard
- double check your queries for cartesian products
- maintain documentation. The person after you wants to understand your code too.
March 5, 2010 at 9:04 am
never use the built-in slowly changing dimensions wizard
Why would you never use it? I'm not a big fan of most wizards, but I find this one very effective.
March 5, 2010 at 9:14 am
RonKyle (3/5/2010)
never use the built-in slowly changing dimensions wizard
Why would you never use it? I'm not a big fan of most wizards, but I find this one very effective.
I have found this component helpful in the past as well. There is a Kimball Method SCD component on CodePlex that I would seriously look at:
http://kimballscd.codeplex.com/
I have not actually used it myself.
CEWII
March 5, 2010 at 9:19 am
da-zero (3/5/2010)
- never use the built-in slowly changing dimensions wizard
If you make such a statement in an interview, any competent interviewer will immediately follow up with 'Why?'
Gail Shaw
March 6, 2010 at 5:00 am
GilaMonster (3/5/2010)
da-zero (3/5/2010)
- never use the built-in slowly changing dimensions wizard
If you make such a statement in an interview, any competent interviewer will immediately follow up with 'Why?'
Well, thanks for interviewing me then 😀
The MS built-in SCD component is, for example, not that efficient. It uses the OLE DB Command to perform the updates, which means that if a million rows have to be updated, a million separate updates will be issued against the DB instead of one big update. I'm not a DBA, but I think that is not so good for performance, locking and transactions.
The Kimball SCD at CodePlex, however, is much more efficient, so it is preferred over the built-in one.
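To make the row-by-row versus set-based point concrete, here is a small sketch. It is not SSIS and not the SCD component itself, just Python with an in-memory SQLite database; the tables `dim_customer` and `stg_customer` and their contents are invented for the example. Both approaches end with the same data, but the first issues one UPDATE per changed row while the second issues a single UPDATE driven by a staging table.

```python
import sqlite3

def make_db():
    """Build a tiny dimension plus a staging table holding the changed rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_customer (customer_code TEXT PRIMARY KEY, city TEXT);
        CREATE TABLE stg_customer (customer_code TEXT PRIMARY KEY, city TEXT);
    """)
    conn.executemany("INSERT INTO dim_customer VALUES (?, ?)",
                     [("C1", "Oslo"), ("C2", "Rome"), ("C3", "Lima")])
    conn.executemany("INSERT INTO stg_customer VALUES (?, ?)",
                     [("C1", "Bergen"), ("C3", "Cusco")])
    return conn

def update_row_by_row(conn):
    """One UPDATE statement per changed row (the pattern criticised above)."""
    changes = conn.execute(
        "SELECT customer_code, city FROM stg_customer").fetchall()
    for code, city in changes:
        conn.execute(
            "UPDATE dim_customer SET city = ? WHERE customer_code = ?",
            (city, code))
    return len(changes)  # number of UPDATE statements issued

def update_set_based(conn):
    """A single UPDATE that applies every staged change at once."""
    conn.execute("""
        UPDATE dim_customer
        SET city = (SELECT s.city FROM stg_customer AS s
                    WHERE s.customer_code = dim_customer.customer_code)
        WHERE customer_code IN (SELECT customer_code FROM stg_customer)
    """)
    return 1  # number of UPDATE statements issued

def snapshot(conn):
    return conn.execute(
        "SELECT customer_code, city FROM dim_customer "
        "ORDER BY customer_code").fetchall()

db_a, db_b = make_db(), make_db()
row_statements = update_row_by_row(db_a)
set_statements = update_set_based(db_b)
print(row_statements, set_statements)    # 2 1
print(snapshot(db_a) == snapshot(db_b))  # True
```

With two changed rows the difference is trivial; with a million it is a million statements against one.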
March 6, 2010 at 8:40 am
cRuchika (3/4/2010)
and we avoid using AGGREGATE and SORT transformation....
I would be careful on the sort transformation. Many times a sort transformation is necessary when using a merge join due to the IsSorted property not properly functioning.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw
Learn Extended Events