A Tool for the Job

  • David.Poole (11/21/2012)


    Darren Wallace (11/20/2012)


    Ultimately any task where you care more about high speed and low cost than you do about consistency is a great candidate for NoSQL.

    Anyone have a view on data quality with regard to NoSQL? Does it help/hinder?

    I believe that the importance of data quality (i.e., dropped, uncommitted, or orphaned records) depends on the application. That's why non-relational databases are a more natural choice for some organizations than they are for others.

    Let's assume that Google, Facebook, or Twitter had an intermittent transactional consistency issue such that 1 listing out of the top 100 was randomly excluded each time a user hit the website. Unless a QA engineer were systematically analyzing the results looking for such specific irregularities... no one would notice. Even if some users did notice, it wouldn't be a show stopper. It would be like "Oh, yeah, we know about that bug and some people are working on it.", but it's not as if external auditors or a regulatory agency is going to shut the business down until the problem is fixed.

    In the banking industry, accounts have to balance, and information presented to clients is not subjective at all. Likewise, in the healthcare, government, or scientific industries, the data drives critical decisions and thus matters in a critical way.

    For an e-commerce company like Amazon or eBay, those listings presented to users in the web browser are potential sales. I guess the same applies to Google and Facebook, but to a lesser extent. However, from their perspective it's the paid ad links, not the aggregated content, that matter most.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • My concern regarding data quality is largely that it is absent.

    The solution compiles, tests run, the users like the UI, job done. The reporting/BI aspect is someone else's problem.

    Only it's NOT! For data to be an asset rather than a liability you have to build sufficient domain/referential integrity in as a foundation stone. It matters not whether it is NoSQL, flat files or RDBMS. If a NoSQL document needs to be a certain structure then how is that enforced? If attributes within the documents have rules defining legitimate values, how are those enforced?

    I worry that some of the development community are jumping on the bandwagon of NoSQL simply because they are trying to bypass the disciplines that DBAs insist on without understanding why those disciplines are so important.
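
    To make those questions concrete, here is a minimal sketch of what application-side enforcement might look like when the store itself enforces nothing. Everything in it is an assumption for illustration: the "order" document, its field names, the allowed status values and the `validate_order` function are hypothetical, not taken from any particular NoSQL product.

```python
# Hypothetical rules for an "order" document; names and values are illustrative only.
ALLOWED_STATUS = {"new", "paid", "shipped"}

# Structural rule: required fields and the types they must have.
REQUIRED_FIELDS = {
    "order_id": str,
    "customer_id": str,
    "status": str,
    "total": (int, float),
}


def validate_order(doc, known_customer_ids):
    """Return a list of rule violations; an empty list means the document passes."""
    errors = []

    # Structure: every required field is present with the expected type.
    for field, expected in REQUIRED_FIELDS.items():
        if field not in doc:
            errors.append(f"missing field: {field}")
        elif isinstance(doc[field], bool) or not isinstance(doc[field], expected):
            errors.append(f"wrong type for field: {field}")

    # Domain integrity: attribute values must be legitimate.
    status = doc.get("status")
    if isinstance(status, str) and status not in ALLOWED_STATUS:
        errors.append(f"status not allowed: {status}")

    total = doc.get("total")
    if isinstance(total, (int, float)) and not isinstance(total, bool) and total < 0:
        errors.append("total must not be negative")

    # Referential integrity: the customer must already exist.
    customer_id = doc.get("customer_id")
    if isinstance(customer_id, str) and customer_id not in known_customer_ids:
        errors.append(f"unknown customer: {customer_id}")

    return errors
```

    The point of the sketch is that the rules still have to live somewhere. Written this way, every application that writes to the store must call the same check before saving, which is exactly the discipline an RDBMS constraint gives you for free.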

    If you've been to any Big Data conferences you will come away thinking "but that's been my day job for 'x' years! Big Data is just a marketing term!".

    Yes and no. The reason something so old has only just been given a marketing term is that non-IT and non-data professionals are now sitting up and taking notice of data and recognising its intrinsic worth. Up until now that audience has paid lip service to topics surrounding "data as an asset" without really believing it.

    Now that they are realising they could make serious money out of data, the light-bulbs have gone on, only for them to find that the ancient curse of SHISHO is still as potent today as it was to our ancestors.

    I may be data Santa Claus but what I can deliver on the stale mince pies of data quality and astringent brandy of technical debt is a rather nasty smell.
