Can I get three examples of ETL?

  • Here are my guesses: 1) putting a huge .txt file into a database using a BULK INSERT or BCP command. 2) converting Excel spreadsheets into tables in a SQL Server database. 3) converting an Oracle database into a SQL Server database

  • Anything that takes data from one place and puts it somewhere else.

    Converting an Oracle DB to SQL wouldn't be ETL. That would be a very complex development project.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Slight modification in the sentence and now it qualifies for ETL example...

    'converting an Oracle data base into a SQL Server data base'

    And yes, ETL projects vary in complexity from 'simple' to 'very very complex'.

  • Just loading a table hardly qualifies as ETL.

    The term ETL, which stands for "Extract, Transform and Load" is usually used in the context of dimensional Data Warehousing referring to the process of:

    - Extracting data from OLTP systems.

    - Transform such data mapping the original relational structure to the target dimensional one, and

    - Load such transformed data into the data warehouse's core FACT and DIM tables.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • PaulB-TheOneAndOnly (2/11/2012)


    Just loading a table hardly qualifies as ETL.

    Seriously? You don't think that taking a mega-row text file from something like Double-Click.Net and loading it into a table qualifies as at least the first step of and ETL process? Loading a warehouse from OLTP data is just one form of ETL.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I agree with Jeff, ETL is in no way specific to the dimensional data warehouse. I define ETL as any process which takes data from one source into another source. The act of moving data from a database table to an Excel extract is also a form of ETL, although it's generally the other way around (hopefully).

    So three examples:

    1) Moving data from an OLTP database into a dimensional data warehouse.

    2) Moving data from .csv files into a database.

    3) Moving data from a database into a cube.

    Jake

  • jake.mayher (2/16/2012)


    I agree with Jeff, ETL is in no way specific to the dimensional data warehouse. I define ETL as any process which takes data from one source into another source. The act of moving data from a database table to an Excel extract is also a form of ETL, although it's generally the other way around (hopefully).

    So three examples:

    1) Moving data from an OLTP database into a dimensional data warehouse.

    2) Moving data from .csv files into a database.

    3) Moving data from a database into a cube.

    Jake

    I would say that your #2 is a bit off... No extraction 🙂 How about, extract data from an OCS server to a .csv file, then import it into an OLTP database.

    Jared
    CE - Microsoft

  • SQLKnowItAll (2/16/2012)


    jake.mayher (2/16/2012)


    I agree with Jeff, ETL is in no way specific to the dimensional data warehouse. I define ETL as any process which takes data from one source into another source. The act of moving data from a database table to an Excel extract is also a form of ETL, although it's generally the other way around (hopefully).

    So three examples:

    1) Moving data from an OLTP database into a dimensional data warehouse.

    2) Moving data from .csv files into a database.

    3) Moving data from a database into a cube.

    Jake

    I would say that your #2 is a bit off... No extraction 🙂 How about, extract data from an OCS server to a .csv file, then import it into an OLTP database.

    After you've worked with a DB2 file, flat files are definately in the extraction category. CSV is just an easier storage mechanic to extract from. I'll have to go with Jeff Jake (sorry) on that one. 😀


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Evil Kraig F (2/16/2012)


    SQLKnowItAll (2/16/2012)


    jake.mayher (2/16/2012)


    I agree with Jeff, ETL is in no way specific to the dimensional data warehouse. I define ETL as any process which takes data from one source into another source. The act of moving data from a database table to an Excel extract is also a form of ETL, although it's generally the other way around (hopefully).

    So three examples:

    1) Moving data from an OLTP database into a dimensional data warehouse.

    2) Moving data from .csv files into a database.

    3) Moving data from a database into a cube.

    Jake

    I would say that your #2 is a bit off... No extraction 🙂 How about, extract data from an OCS server to a .csv file, then import it into an OLTP database.

    After you've worked with a DB2 file, flat files are definately in the extraction category. CSV is just an easier storage mechanic to extract from. I'll have to go with Jeff on that one. 😀

    Touche! Ha! I forgot that some systems actually write and store to .csv files. I retract my previous statement 🙂

    Jared
    CE - Microsoft

  • SQLKnowItAll (2/16/2012)


    jake.mayher (2/16/2012)


    I agree with Jeff, ETL is in no way specific to the dimensional data warehouse. I define ETL as any process which takes data from one source into another source. The act of moving data from a database table to an Excel extract is also a form of ETL, although it's generally the other way around (hopefully).

    So three examples:

    1) Moving data from an OLTP database into a dimensional data warehouse.

    2) Moving data from .csv files into a database.

    3) Moving data from a database into a cube.

    Jake

    I would say that your #2 is a bit off... No extraction 🙂 How about, extract data from an OCS server to a .csv file, then import it into an OLTP database.

    Ummm... not quite true. While it's true that the Double-Click.net files (for example) that I had to "ETL" where in a TAB delimited format, some extreme "extraction" was required to normalize the data. The number of columns was never known and could vary from minute to minute (all the same in any given file) and column "pairs" and "quads" had to be extracted and matched to the first 9 columns before it could be inserted into the final tables. It made loading a data warehouse or a DB2 file look like a cakewalk.

    The same goes for some data I used to get from ATT for call detail records. It was a mixed bag of multiple line types... 1 for the call, 1 for the tax information, and several for the routing information that the call took.

    Just because you're reading from a CSV or other delimited file doesn't mean there isn't an extraction process involved.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (2/17/2012)


    SQLKnowItAll (2/16/2012)


    jake.mayher (2/16/2012)


    I agree with Jeff, ETL is in no way specific to the dimensional data warehouse. I define ETL as any process which takes data from one source into another source. The act of moving data from a database table to an Excel extract is also a form of ETL, although it's generally the other way around (hopefully).

    So three examples:

    1) Moving data from an OLTP database into a dimensional data warehouse.

    2) Moving data from .csv files into a database.

    3) Moving data from a database into a cube.

    Jake

    I would say that your #2 is a bit off... No extraction 🙂 How about, extract data from an OCS server to a .csv file, then import it into an OLTP database.

    Ummm... not quite true. While it's true that the Double-Click.net files (for example) that I had to "ETL" where in a TAB delimited format, some extreme "extraction" was required to normalize the data. The number of columns was never known and could vary from minute to minute (all the same in any given file) and column "pairs" and "quads" had to be extracted and matched to the first 9 columns before it could be inserted into the final tables. It made loading a data warehouse or a DB2 file look like a cakewalk.

    The same goes for some data I used to get from ATT for call detail records. It was a mixed bag of multiple line types... 1 for the call, 1 for the tax information, and several for the routing information that the call took.

    Just because you're reading from a CSV or other delimited file doesn't mean there isn't an extraction process involved.

    Yes, I was being narrow minded in haste. Thanks for the reality check!

    Jared
    CE - Microsoft

  • Jeff Moden (2/11/2012)


    PaulB-TheOneAndOnly (2/11/2012)


    Just loading a table hardly qualifies as ETL.

    Seriously? You don't think that taking a mega-row text file from something like Double-Click.Net and loading it into a table qualifies as at least the first step of and ETL process? Loading a warehouse from OLTP data is just one form of ETL.

    well... it turns out that you are talking a more complex table load that involves moving text from an external data source to a data warehouse - in this case it does qualifies as ETL.

    Having said that, I stick to my arms "the (simple) load of a table hardly qualifies as ETL".

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • PaulB-TheOneAndOnly (2/17/2012)


    Jeff Moden (2/11/2012)


    PaulB-TheOneAndOnly (2/11/2012)


    Just loading a table hardly qualifies as ETL.

    Seriously? You don't think that taking a mega-row text file from something like Double-Click.Net and loading it into a table qualifies as at least the first step of and ETL process? Loading a warehouse from OLTP data is just one form of ETL.

    well... it turns out that you are talking a more complex table load that involves moving text from an external data source to a data warehouse - in this case it does qualifies as ETL.

    Having said that, I stick to my arms "the (simple) load of a table hardly qualifies as ETL".

    I disagree. Complexity isn't what makes it ETL or not. Neither does scope. Type of process is what makes it ETL.

    Importing a .txt file with rows and columns defined by line-returns and pipe-characters, into a table, is what if not ETL?

    Extract obtain something from source: to obtain something from a source, usually by separating it out from other material

    Transform change something dramatically: to change somebody or something completely, especially improving their appearance or usefulness

    Load transitive verb put something in machine: to put into a machine the items that it will work on, e.g. clothes for washing

    To load a table from a text file, you have to extract the data from the text file, including conversion of delimiters or spacings (fixed-width material) into columnar format.

    Then there's transform: conversion of storage format (external .txt file to internal binary representation in the database), validation of length and characteristics of the data to conform to the table definition (typed instead of free-text data), etc.

    Then it's actually loaded into the table, exactly as per the definition of "load".

    What else would it be, if not ETL?

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (2/17/2012)


    PaulB-TheOneAndOnly (2/17/2012)


    Jeff Moden (2/11/2012)


    PaulB-TheOneAndOnly (2/11/2012)


    Just loading a table hardly qualifies as ETL.

    Seriously? You don't think that taking a mega-row text file from something like Double-Click.Net and loading it into a table qualifies as at least the first step of and ETL process? Loading a warehouse from OLTP data is just one form of ETL.

    well... it turns out that you are talking a more complex table load that involves moving text from an external data source to a data warehouse - in this case it does qualifies as ETL.

    Having said that, I stick to my arms "the (simple) load of a table hardly qualifies as ETL".

    I disagree. Complexity isn't what makes it ETL or not. Neither does scope. Type of process is what makes it ETL.

    Importing a .txt file with rows and columns defined by line-returns and pipe-characters, into a table, is what if not ETL?

    Extract obtain something from source: to obtain something from a source, usually by separating it out from other material

    Transform change something dramatically: to change somebody or something completely, especially improving their appearance or usefulness

    Load transitive verb put something in machine: to put into a machine the items that it will work on, e.g. clothes for washing

    To load a table from a text file, you have to extract the data from the text file, including conversion of delimiters or spacings (fixed-width material) into columnar format.

    Then there's transform: conversion of storage format (external .txt file to internal binary representation in the database), validation of length and characteristics of the data to conform to the table definition (typed instead of free-text data), etc.

    Then it's actually loaded into the table, exactly as per the definition of "load".

    What else would it be, if not ETL?

    If I'm reading it correctly - as per your own definition of ETL Jeff's scenario qualifies as ETL, scenario calls for:

    - Extract data from a plain text external datasource

    - Transform that text (at least) mapping it to the target columns

    - Load the target table which by Jeff's scenario is part of a data warehouse.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • I would say that your #2 is a bit off

Viewing 15 posts - 1 through 15 (of 18 total)

You must be logged in to reply to this topic. Login to reply