Data Mining Data Warehouse For Beginners

  • Hello!

    We have a lot of database used in 3rd party apps. Things like shop sales. and marketing data.

    As i understand it. The best way to Analyse the data is to put the 3rd party apps data into a Data Warehouse and use SSAS (or is SSAS is the data warehouse!?) and then run reports against it.. Is that about right? Where is the best place to start for a complete beginner..

    Im looking for books and preferable CBTs etc. Is a course in SSAS all i need? and that teaches you: how and what, data to take from the 3rd party database and where to put it? and how to analyse it? Or is just going to show how to look at data that's already in there?

    Thanks for any help.

    The accidental DBA

  • Yes, SSAS is aimed for analysis.

    A good reference is for example "Applied Microsoft Analysis Services 2005 and Microsoft Business Intelligence Platform" from Teo Lachev

    Regards

    Igor Micev,My blog: www.igormicev.com

  • I can also recommend this book:

    Smart Business Intelligence Solutions with Microsoft SQL Server 2008

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • n00bDBA (3/31/2012)


    Hello!

    We have a lot of database used in 3rd party apps. Things like shop sales. and marketing data.

    As i understand it. The best way to Analyse the data is to put the 3rd party apps data into a Data Warehouse and use SSAS (or is SSAS is the data warehouse!?) and then run reports against it.. Is that about right? Where is the best place to start for a complete beginner..

    Im looking for books and preferable CBTs etc. Is a course in SSAS all i need? and that teaches you: how and what, data to take from the 3rd party database and where to put it? and how to analyse it? Or is just going to show how to look at data that's already in there?

    Thanks for any help.

    The accidental DBA

    Before jumping into SSAS, I'd suggest reading a book that covers the academic and practical aspects of designing operational data stores and OLAP cubes.

    http://www.amazon.com/Star-Schema-The-Complete-Reference/dp/0071744320

    You don't want to end up developing and supporting an "Accidental" Data Warehouse.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • You may also find these useful:

    http://www.amazon.com/The-Microsoft-Data-Warehouse-Toolkit/dp/0470640383/ref=sr_1_6?s=books&ie=UTF8&qid=1333442957&sr=1-6

    http://www.amazon.com/The-Data-Warehouse-Toolkit-Dimensional/dp/0471200247/ref=sr_1_1?s=books&ie=UTF8&qid=1333442957&sr=1-1

    Andy

    ==========================================================================================================================
    A computer lets you make more mistakes faster than any invention in human history - with the possible exceptions of handguns and tequila. Mitch Ratcliffe

  • Hello to you all and thank you for the replies!

    Koen Verbeeck: Thank you! i had missed that one, which seems to be about right for me if the review quote of:

    This book would be a good manual if you do not understand some of the key concepts (Dimension vs Measure, Star schema vs Snowflake etc). If you want to actually implement a Snowflake schema, SSIS, and customize the results, the book was lacking. In other words, the book does not focus on the nuts and bolts of SQL 2008 BI. It gives you more of a 5000 ft overview.

    is right!

    Eric M Russell: Thanks looks a goodie!

    Andy Hyslop: I had heard of Kimball and that he was the god father of cubes or something important!. So had looked at hes books. Would these be good for learning the aspects of design or are they practical?

    I see CBT Nuggets do a BI course, which covers some SSAS. i was thinking of trying them out as well. so i might try them out.

    Thanks for all the replies!

    If anyone can offer a good track for beginners (ie read book 1, then book 2, then book 3) that would be great to..

    Thank you!

  • The Kimball books are about design and architecture of a data warehouse.

    They are technology independant, so it should be your first read if you really want to go into data warehousing.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Koen Verbeeck (4/3/2012)


    The Kimball books are about design and architecture of a data warehouse.

    They are technology independant, so it should be your first read if you really want to go into data warehousing.

    +1

    Although the first book does focus on the use of Microsoft Technologies.

    Personally I found them quite useful although some of the concepts are not quite suited to the real life situations IMO but they will give you a very good introduction.

    Andy

    ==========================================================================================================================
    A computer lets you make more mistakes faster than any invention in human history - with the possible exceptions of handguns and tequila. Mitch Ratcliffe

  • Hey,

    So do you all like the books Andy Hyslop suggested. Looking on Amazon there's a fair few Kimball books!

    Thanks again!

  • n00bDBA (4/3/2012)


    Hey,

    So do you all like the books Andy Hyslop suggested. Looking on Amazon there's a fair few Kimball books!

    Thanks again!

    If you are building this data warehouse for a corporation or organization, then your first step is to start gathering feedback from all potential users about what information they expect to get out of the warehouse and how they will be using it.

    The book I mentioned earlier, 'Star Schema The Complete Reference', presents both the Kimball and Inmon approaches to data warehousing, but neither contributed to the book. It's mostly at the conceptual level, and is not vendor specific. In short, the Kimball implementation typically revolves around OLAP Cubes, while the Inmon approach a relational ODS (operational data store) and summary tables.

    http://www.amazon.com/Star-Schema-The-Complete-Reference/dp/0071744320

    It's important to gather input from all stakeholders, and then start exploring various design options. At some point down the road, you may be questioned (perhaps you may even question yourself) about why you initially chose a specific approach.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Thanks Eric,

    I just don't want to start asking around and saying im planning on doing this super great thing.. then fail horribly at it! I work in a team of 1. I have no support or goto person in work for any stage of this so want to get my head in there. Do some playing around on my own.. build something to play with more importantly! Then start involving the user base!

    BI isn't really done here.. So im seeing a massive scoop to get my head in and make an good impression to my boss and all concerned! but im starting from scratch.

    Ive picked up a few books on Statistics

    Ive picked up some Business Analysis Books. (And found a course id like to do..)

    So i can start to see what the users might be looking for and how it can be got in theory.

    Then its about learning how/what data to get and how to format it in the DW

    Then its onto how to Analysis it and report it!

    I realise its a massive amount of learning and work. But hopefully the excitement of doing it will get me though!

    I do have another question! the bosses are keen to go to 2012 for Always On. This would been i get access to Tabular Models (No idea what they are. But the little bits ive seen make them look very cool!)

    Does Kimball and the Star Schemas etc all still fit in with Tabular models or is it a new way of doing things and completely different? (I believe PowerView only uses the tabular verison)

    Thanks again!

  • n00bDBA (4/3/2012)


    Thanks Eric,

    I just don't want to start asking around and saying im planning on doing this super great thing.. then fail horribly at it! I work in a team of 1. I have no support or goto person in work for any stage of this so want to get my head in there. Do some playing around on my own.. build something to play with more importantly! Then start involving the user base!

    BI isn't really done here.. So im seeing a massive scoop to get my head in and make an good impression to my boss and all concerned! but im starting from scratch.

    Ive picked up a few books on Statistics

    Ive picked up some Business Analysis Books. (And found a course id like to do..)

    So i can start to see what the users might be looking for and how it can be got in theory.

    Then its about learning how/what data to get and how to format it in the DW

    Then its onto how to Analysis it and report it!

    I realise its a massive amount of learning and work. But hopefully the excitement of doing it will get me though!

    I do have another question! the bosses are keen to go to 2012 for Always On. This would been i get access to Tabular Models (No idea what they are. But the little bits ive seen make them look very cool!)

    Does Kimball and the Star Schemas etc all still fit in with Tabular models or is it a new way of doing things and completely different? (I believe PowerView only uses the tabular verison)

    Thanks again!

    From what you've described, based on the lack of clear requirements, scope, or direction from supervisors, I actually don't think you need to start thinking about building a data warehouse now. What you probably need at this point is a daily or weekly copy of production, which you can then use to start crunching SQL and experimenting with pivot tables without impacting the user's work.

    A data warehouse is not something that one person can build in a silo; you have to include input from users into the design process in order for it to be useful. From what you've described, that's something that needs to happen down the road. It's not important at this point.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • When you use the tabular models, you can still build your data warehouse using the Kimball method. Since the star schema's are relatively simple, it will be pretty straight forward to implent the tabular model. Tabular models can be completely different: they can incorporate data from various sources, not only relational and you can create the whole model without using a data warehouse. However, in an enterprise environment, it's my opinion that a data warehouse is beneficial, in order to get a centralized and cleaned data storage.

    Make sure all the BI stuff you want to do is included in the edition that your company buys. If they get Enterprise, you're all set.

    You'll have to know the business pretty well to construct a data warehouse without stakeholder involvement. It's very courageous of you to dive into it like that, and I wish you the best of luck.

    If you have any more questions, please feel free to ask them on this thread.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Hey all

    Thanks for all the replies..

    I think your right for a starting point Eric. Ill do a restore onto my pc of some data and start seeing how its structured and what i can get from it.

    Ive still failed to find any known training provider offer up courses on Data Warehousing, the only small things ive found are from PluralSight (If anyone reading is interested)

    I will look into buying the following:

    http://www.amazon.co.uk/Star-Schema-The-Complete-Reference/dp/0071744320

    http://www.amazon.com/The-Microsoft-Data-Warehouse-Toolkit/dp/0470640383/ref=sr_1_6?s=books&ie=UTF8&qid=1333442957&sr=1-6%3Cbr%3E

    http://www.amazon.co.uk/Data-Mining-Microsoft-Server-2008/dp/0470277742/ref=sr_1_1?ie=UTF8&qid=1333447887&sr=8-1

    And hope to find some video tutorials (as learning by only reading hasnt had the greatest success for me!)

    Thanks again

  • Something else to decide is how all those disparate databases might be linked. And how you might find/fake the connections if they don't really exist...a huge challenge. Is there something in the Sales database that would identify that it came from a lead in the Marketing database? You can't always assume that since the name's the same, the contact should be the same.

    You have a multi-year task ahead of you -- learning SSIS, SSRS, and SSAS all on your own, and presumably on your own time -- and it's easy to get lost and discouraged. I believe that BI is a good "resume item" to have, so try to keep after it and don't let too much time lapse between learning-a-thing and putting-into-practice.

    Good luck.

    (I know this thread is a week old, I'm runnin' behind!)

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply