June 15, 2015 at 9:29 pm
I'm looking into the available tools in preparation to develop a first data warehouse, if you've used a Data Warehouse automation product before, what are your thoughts on it?
Also, are the below statements correct, or have I picked up a flawed understanding of the different available products?
The following Data Warehouse automation tools are coming up most on Google searches for DW automation products:
With the slower alternatives of hand-coding data warehouse schemas and performing indexing and using SSIS for ETL, or hand-coding the entire process. Both of these approaches require low or no initial investment, but requiring significantly more development resources and less flexible when business requirements change.
I also understand WhereScape is significantly more expensive (at least in initial investment) but is a great product to use due to it's holistic and abstracted approach to DW and ETL development, and has a reasonable fan base.
Both BIReady and timeXtender seem to cost upwards of USD$20,000 but provide an GUI that abstracts and reduces schema and ETL development time. These are not as powerful as RED, but cheaper and facilitate faster development than manual coding if used correctly.
June 15, 2015 at 10:35 pm
I have been doing BI for about 8 years now and have never heard of these products. That does not mean they're not good. I looked over the website for each of those companies and it appears that they are pretty new and untested. A data warehouse/BI project is the kind of thing that can make or break a business and I would not be comfortable with an untested product.
Regarding:
With the slower alternatives of hand-coding data warehouse schemas and performing indexing and using SSIS for ETL, or hand-coding the entire process. Both of these approaches require low or no initial investment, but requiring significantly more development resources and less flexible when business requirements change.
It depends on your goals and the scale of the project. Hand-coding a simple data mart for analyzing key metrics and slapping an Excel spreadsheet onto it is not an overly complex task. Indexing is not rocket science.I guess it all comes down to the amount of talent you have to work with and your budget. A data warehouse is a tool that you use to create a single version of your business truth formatted in a way that is easier to report against so that you can make better business decisions. Sometimes this requires an investment in experienced and educated people. My experience is that such an investment is worth the cost and yields a high rate of return if you can find the right people.
That's my opinion.
-- Itzik Ben-Gan 2001
June 17, 2015 at 3:38 am
TimeXtender has been on the market for nearly 10 years and our technology is being used daily by more than 2600 customers around the world. I would say that it is neither new nor untested.
Andrew - let me know if you are interested in taking a closer look at TimeXtender - I will be happy to assist you.
Kind regards,
Thomas Duun
Product Manager
TimeXtender
June 17, 2015 at 3:31 pm
Thanks Thomas - I appreciate the offer, but not at the moment. I've had a read through the TimeXtender 2014 user guide and have somewhat of an idea of what it can do. I will get in touch should the project grow to a position where it receives the management support required for financial investment in a Data Warehouse automation tool.
The current scale is very small (a Data Mart for a specific department) and is an initial step into Data Warehousing - I expect ideal for learning how to develop and maintain a small-scale Data Warehouse manually before any automation, with "Microsoft Exam 70-463: Implementing a Data Warehouse with Microsoft SQL Server 2012" as a guide.
June 17, 2015 at 4:20 pm
Andrew P (6/17/2015)
Thanks Thomas - I appreciate the offer, but not at the moment. I've had a read through the TimeXtender 2014 user guide and have somewhat of an idea of what it can do. I will get in touch should the project grow to a position where it receives the management support required for financial investment in a Data Warehouse automation tool.The current scale is very small (a Data Mart for a specific department) and is an initial step into Data Warehousing - I expect ideal for learning how to develop and maintain a small-scale Data Warehouse manually before any automation, with "Microsoft Exam 70-463: Implementing a Data Warehouse with Microsoft SQL Server 2012" as a guide.
That's a good book. I would also look at the Kimball Data Warehouse Toolkit (Microsoft version); it's excellent and gives you a great overview of the DW/BI development life cycle. It's worth noting that it includes how to provide constant deliverables so as to see a quick RIO.
-- Itzik Ben-Gan 2001
March 27, 2017 at 2:50 am
If you're still interested in finding out more about WhereScape RED, Attunity and other data warehouse tools (it doesn't look like timeXtender has been added yet) I recommend that you visit IT Central Station and read the user reviews of these products, all of which have been written by people with personal experience using these solutions.
For example, one BI Architect commented about WhereScape RED "It allows us to focus on the most important aspect of a data warehouse--business requirements--instead of focusing on the low-level details of the development platform." (if you want, you can read the rest of this review here.)
Hope this helps.
March 27, 2017 at 10:05 am
Regardless of what data warehousing tools you choose (if any); the majority of your human resource time will be spent gathering requirements, documenting requirements, team status meetings, troubleshooting source data quality issues, performance tuning, and attempting to overcome the inflexibility and limitations of your specific tool selection. Regardless of tool, you will at least be coding SQL against your source databases. It's a myth that some vendor's data warehousing "automation" tool will be a game changer in terms of project reducing project cost or implementation time. Really, the most deciding factor is that the team consist of experienced data warehouse analysts and developers.
Kimball's Data Warehouse Toolkit book should be required reading for everyone on the IT side of the team.
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
March 27, 2017 at 2:53 pm
Eric, thanks for the reply - I get that, the tool isn't the deciding factor, or where most of the time should be spent. I chose to go with SSIS, because it's purpose-built for ETL, and (I hope) the graphical design and sensible component labels will assist any other developers in understanding the ETL work flow, with the understanding that SSIS has it's own limitations that need time to learn workarounds for.
This isn't on-topic to this question, but here's some background. There's just me working on this project, and part-time. The end users are the IT team (the primary focus of this DW is to create a standardized data model for building reporting solutions from), and coming from a reporting background in this company, I'm confident I understand the user requirements and data quality for the primary resources (I've worked with the source data-sets for 6+ years).
I think I understand the schema design - at least I get the basics (star, snowflake, fact, dimension, measure attribute, surrogate key, accumulating snapshot) - Kimball's DW Toolkit has been a good help with this. I still need to study the last few chapters of that book -understand the subsystems, and modelling processes, those chapters seem more valuable than some of the earlier.
Currently I'm continuing to learn SSIS in the evenings through the Microsoft training kit for SSIS 2012 and Pluralsight and PASS videos for SSIS, so I can get a few basic star schemas with SCDs up through SSIS so the business can see some results, and I can gain some experience from running/maintaining these in production while I continue to study.
With myself being the most experienced in the IT team, and limited resource available for outside consultants, this forum (and you Eric and Alan B.) have been a great help in teaching me when I don't understand things or need advice - thanks.
March 27, 2017 at 3:38 pm
The SSIS task for implementing SCD does not scale well at all; for all practical purposes it creeps when loading more than about 10,000 rows.
https://connect.microsoft.com/SQLServer/feedback/details/632052/ssis-slowly-changing-dimension-scd-component-performance-unusably-slow
However, on CodePlex there is a 3rd party SCD task that you can plug into SSIS.
https://dimensionmergescd.codeplex.com/
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
March 1, 2018 at 2:26 am
If you plan to chose a Data Vault approach you could use the Datavault Builder https://datavault-builder.com which is automating all aspects of a DWH lifecycle and is generating objects on SQL Server in real time while you are designing.
June 8, 2018 at 6:06 am
This was removed by the editor as SPAM
December 29, 2022 at 5:30 am
There are several tools available for automating the process of building and maintaining a data warehouse. Some of the more popular options include:
Talend: This is an open-source ETL (extract, transform, load) tool that can be used to automate the process of extracting data from various sources, transforming it into a format that is suitable for analysis, and loading it into a data warehouse.
Informatica PowerCenter: This is a commercial ETL tool that offers a wide range of features for automating the process of building and maintaining a data warehouse, including support for real-time data integration, data quality, and data governance.
AWS Glue: This is a fully managed ETL service offered by Amazon Web Services (AWS) that can be used to extract data from various sources, transform it, and load it into a data warehouse.
Talend Cloud: This is a cloud-based version of Talend that offers many of the same features as the on-premises version, but with the added convenience of being hosted in the cloud.
Google Cloud Data Fusion: This is a fully managed, cloud-native ETL service offered by Google Cloud that can be used to extract data from various sources, transform it, and load it into a data warehouse.
Microsoft Azure Synapse Analytics (formerly SQL Data Warehouse): This is a fully managed data warehouse service offered by Microsoft Azure that includes built-in ETL capabilities for extracting data from various sources, transforming it, and loading it into the data warehouse.
It's important to evaluate the different tools available and choose the one that best meets the needs of your organization. Factors to consider might include the types of data sources you need to support, the volume of data you need to handle, and your budget.
Read here complete guide on warehouse automation
January 10, 2023 at 2:35 pm
This was removed by the editor as SPAM
September 14, 2023 at 12:46 pm
This was removed by the editor as SPAM
September 25, 2023 at 4:49 am
If you can get a trial version of the data warehouse software, You should be able to see if it is the right tool. You may consider looking at some case studies with these tools. Prototyping tools could be useful in your decision making process.
Prototyping is a feature of the automated agile data warehouse lifecycle that skips several manual steps. It allows the customer to visualize the design of the architecture they have requested, with full lineage, before the build takes place.
Here is something I found:
Statistical Analysis System (SAS) is one of the large data warehouse tools that are used to access information from various data sources. The benefit of SAS data management is that users can access the data from wherever it is stored (from cloud, to legacy systems, to data lakes, like Hadoop, etc.) without any hassle.
DBASupport
Viewing 15 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply