Is Big Data good for Data Professionals?

  • Comments posted to this topic are about the item Is Big Data good for Data Professionals?

  • Argh, just one more new marketing term conjured up to convince us, and business leaders, to buy something.

    Yet another term for something that already exists and has existed for years, like calling the Internet "The Cloud".

    Like Samsung's revelation about the stylus... like it's something new.

    Or touch, which has been around since the '80s.

    The only thing that matters is that you (the business) know what data you need, why you need it and how it will be used; beyond that it is all just noise.

    And that the data provides a competitive advantage... period (otherwise what's the point?).

  • I initially thought Big Data was a "so what" but then had an epiphany.

    Big Data is not an IT term, it is a business term and as such it is the business realising that data is an asset and not a liability!

    The business teams obviously know that the reports and KPIs are generated from data, but in their heart of hearts they have never really believed that data was an asset.

    Big Data gets talked about in terms of volume, velocity and variability/variety but that is IT completely grabbing the wrong end of the stick.

    Big Data is about value, trustworthiness (veracity) and the fact that it may not be on your premises or owned by you but is legitimately available for you to use (virtual).

    It is also about collaboration; after all, there are more big brains outside your organisation than there are within it.

  • To me "big data" just means data that's too big for the normal data transforms and analysis you would use with smaller data. So you need different methods of dealing with it.

  • To me big data has something to do with the fact that nowadays, when I ask people (analysts of different sorts) what they want, they sometimes say "I don't know what I want until I see it". And nowadays this may be a perfectly reasonable response, with the mass of information growing at light speed. This makes it hard to predict what amount of data we need to handle, so we need an architecture that can handle... well, any amount of data. I certainly think IT pros can use it as a political instrument to give their work more leverage.

  • Did anyone notice Dilbert today? Dilbert says that the analysis of the "Big Data" shows that his productivity plunges whenever the boss learns new jargon.

  • When I think of "big data", I think of cutting-edge analytical systems that crunch TB-sized datasets to solve new problems. Things like predicting weather patterns or election results. It's not about the total size of all the tables in the database, but rather how the data is being used. The source data could be housed in one centrally managed database or in a distributed and loosely coupled grid computing solution like SETI@home.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • I think it is getting every possible piece of information from transactional systems instead of just getting data specific for reporting. This gives you more potential for analysis but at the same time, it also creates sizing issues and information overload.

    This is especially bad with those end users who request that they see all the data so they can analyze it when in fact they just want to see all the data because they have no idea what is in it and want to aggregate and slice it from every possible angle.

    The upside is that you can produce better forecasting models and such using data of this grain but it often takes a while for the business to feel comfortable with the data. So you end up storing a lot more data than what you really need. And usually once you have started to store it, it becomes critical that you keep all of it even if it has never actually been accessed.

    So to me, big data means retaining all of the data instead of just what is deemed as required. It is essentially the opposite of what data professionals have been trying to do for years by narrowing scope during the requirements gathering process of ETL development.

  • I like the term "Big Data", for anything that brings more attention to data and how it becomes information is good for IT. But there is also a relative side to "Big Data". As mentioned, it has to do with appropriate load management across your servers and whether that load is too large for the available resources on site.

    For those of you who love equations, one way of looking at this is Medium Data + Tiny Hardware = Big Data. Or, more correctly stated, Medium Data + Tiny Hardware = Big Data Problem. If you have more than you can manage or use within reasonable performance limits, your data is big data for your situation.

    You also have to take into consideration that some think they have or will have big data when in reality they do not. Developers and business experts are proponents of the systems they maintain and develop. As advocates, they project that each and every system being developed is going to be the "Killer Application" and that its data needs are going to be huge. Developers often used to worry about how to better manage the "hot spots" that might occur around the read/write heads of the data server. The concern was great and the amount of time spent considering indexes was massive, only to find that the number of real users in the system was only 12% of what was estimated and that the transaction rate per user was only about 20% of the projected rate.
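
    To put rough numbers on that, here is a small Python sketch of the arithmetic. It assumes that total load is simply users multiplied by transactions per user; the function name is made up for illustration.

```python
def actual_load_fraction(user_fraction: float, rate_fraction: float) -> float:
    """Fraction of the projected transaction load that actually shows up.

    Assumes total load = number of users * transactions per user,
    so the two shortfalls multiply.
    """
    return user_fraction * rate_fraction


# Figures from the post: 12% of the estimated users,
# each running about 20% of the projected transaction rate.
load = actual_load_fraction(0.12, 0.20)
print(f"{load:.1%}")  # prints 2.4%
```

    Under that assumption the system sees only about 2.4% of the load it was sized for, roughly a 40-fold overestimate.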

    Sorry for the ramble.

    M.

    Not all gray hairs are Dinosaurs!

  • One characteristic of "big data" I've noted seems to be that the dataset isn't easily contained and managed in a single server instance. "Sharding" comes to mind, and once you start splitting datasets across multiple servers, normal query techniques can go out the window. This isn't along the lines of connected SQL servers where one server sends transactions to another; in my mind it's more that a single relation simply cannot be handled by a single server, i.e. a table lives on many servers.

    If all the data can be contained on one server, is it really big data, or is it rather just a VLDB?
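
    To make the "a table lives on many servers" idea concrete, here is a minimal Python sketch. It assumes hash-based sharding, with plain in-memory dicts standing in for the servers; `ShardedTable`, `shard_for` and `NUM_SHARDS` are hypothetical names, not any particular product's API.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical number of servers


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Pick a shard from a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedTable:
    """One logical table whose rows are spread over several 'servers'."""

    def __init__(self, num_shards: int = NUM_SHARDS) -> None:
        # Each dict stands in for the slice of the table held by one server.
        self.shards = [dict() for _ in range(num_shards)]

    def insert(self, key: str, row: dict) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = row

    def get(self, key: str):
        # A lookup by key touches exactly one shard.
        return self.shards[shard_for(key, len(self.shards))].get(key)

    def total(self, column: str):
        # An aggregate has to visit every shard (scatter), compute a
        # partial result on each, then combine the partials (gather).
        partials = [
            sum(row[column] for row in shard.values())
            for shard in self.shards
        ]
        return sum(partials)


table = ShardedTable()
for i in range(100):
    table.insert(f"customer-{i}", {"spend": i})

print(table.get("customer-7"))
print(table.total("spend"))
```

    A lookup by key only needs one shard, but even a simple SUM becomes a two-step scatter-and-gather. Joins and ordering across shards are harder still, which is one way normal query techniques go out the window.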

  • I think the diversity of opinions on this thread shows that "Big Data" is just another magic marketing word that means whatever you want:

    "Buy our server to support Big Data"

    "Buy our professional services to implement Big Data"

    "Buy our DBMS to support Big Data"

    "Buy our software to manage, analyze, etc. Big Data"

    Or at a lower level:

    We have "Big Data" = We have "Big D***s"

  • I remember when I first got into IT some 15 years ago. I was working for a grocery chain, and a big industry topic was customer loyalty cards. The challenge was, since everyone was now collecting all this shopping data on customers, who could analyze and make use of the data the fastest. So "big data" certainly is not a new concept.

    Tony
    ------------------------------------
    Are you suggesting coconuts migrate?

  • "What data professionals have been trying to do for years" is exactly where the problem lies (imho). Way too many IT departments are trying to steer the ship instead of the business leaders.

    IT and technology are ALWAYS enablers; we need to understand the levers of the business and provide correctly implemented and right-sized solutions. Technology for technology's sake is nearly always a disaster.

    The business needs to understand what it needs and how it will be used before it is pushed into the ecosystem.

  • I think the diversity of opinions on this thread shows that "Big Data" is just another magic marketing word that means whatever you want:

    "Buy our server to support Big Data"

    "Buy our professional services to implement Big Data"

    "Buy our DBMS to support Big Data"

    "Buy our software to manage, analyze, etc. Big Data"

    Or at a lower level:

    We have "Big Data" = We have "Big D***s"


    BINGO

    There have always been, and always will be, large information stores. Improved technology simply allows some of that data to be used in faster, more practical ways.

    If a new term helps to create dialog in the C-Suite about what the REAL needs for data are and what practical and competitive purpose they serve, then fine, but otherwise it is once again marketing driving technology needs.
