banks rants – data lake

My first official banks rant is about “data lake”. Anyone who wants to annoy me just has to say “data lake” to me. The term drives me crazy.

Why? Well, because people throw it out there like they understand what it means, and they usually don’t. It’s almost like the term blockchain. Companies say they want it, but most don’t know what it means. Proof? When a company puts it in their company name, their stock goes up. For instance, On-Line Plc put it in their name and the stock went up 394%, and an iced tea company changed its name to “Long Blockchain” and its stock went up 275%. Are you getting my point?

When I first heard “data lake” I had no idea what it was, so I asked. The concept was that streams and rivers of data come together in a central location, creating a “data lake”. The term is generally credited to James Dixon, the CTO of Pentaho.

Gartner defines it as “a collection of storage instances of various data assets additional to the originating data sources. These assets are stored in a near-exact, or even exact, copy of the source format. The purpose of a data lake is to present an unrefined view of data to only the most highly skilled analysts, to help them explore their data refinement and analysis techniques independent of any of the system-of-record compromises that may exist in a traditional analytic data store (such as a data mart or data warehouse).”
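To make that definition a little more concrete, here is a minimal Python sketch of the "exact copy of the source format" idea. It is an illustration under assumptions, not anyone's product: the function names `land_in_lake` and `read_orders`, the `raw/<source_name>/` folder layout, and the tiny orders file are all made up for this example. The file is copied into the lake untouched, and structure is only applied later, when someone reads it.

```python
import csv
import shutil
import tempfile
from pathlib import Path


def land_in_lake(source_file: Path, lake_root: Path, source_name: str) -> Path:
    """Copy a source file into the lake unchanged.

    The raw/<source_name>/ layout is a hypothetical convention for this sketch.
    """
    target_dir = lake_root / "raw" / source_name
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source_file.name
    shutil.copyfile(source_file, target)  # byte-for-byte copy, no parsing or cleanup
    return target


def read_orders(lake_file: Path) -> list[dict]:
    """Apply structure only at read time, when an analyst asks for the data."""
    with lake_file.open(newline="") as handle:
        return list(csv.DictReader(handle))


def demo() -> tuple[bool, list[dict]]:
    """Land a made-up source file in a temporary lake and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_path = Path(tmp)
        source = tmp_path / "orders.csv"
        source.write_text("order_id,amount\n1,9.99\n2,20.00\n")
        landed = land_in_lake(source, tmp_path / "lake", "billing")
        same_bytes = landed.read_bytes() == source.read_bytes()
        return same_bytes, read_orders(landed)


if __name__ == "__main__":
    print(demo())
```

Notice that nothing here cleans, reshapes, or models the data on the way in. That is the whole point of the definition, and also why the analyst on the other end needs to know what they are doing.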

It’s a concept; it is a way of doing something. It isn’t a product. It is an enabler. It should be a benefit. One truth: a data lake is for analytics needs, and everyone wants to do it differently. So understand what you are getting yourself into… understand what you need… ask questions.

  1. #1 by Lori Schlesman on January 30, 2018 - 11:48

    Love it – exactly! However, maybe a bit more Banks style “rant” in the next rant?

    • #2 by Erin K. Banks on January 30, 2018 - 11:52

      Ok…. I will work on that!

  2. #3 by James Yazejian on January 30, 2018 - 12:57

    didn’t know you ranted, but I like it

    • #4 by Erin K. Banks on January 30, 2018 - 13:12

      Thank you!!! I have a lot of passion… I needed to share

  3. #5 by Fiona Schrader on January 30, 2018 - 15:03

    I like it and can’t wait for the next installment although perhaps a little more of the passion we know you have to share.

    • #6 by Erin K. Banks on January 30, 2018 - 17:43

      I will work on it, but you aren’t the first to mention it

  4. #7 by Bobo on January 31, 2018 - 00:11

    Ha, so glad this showed up in my feed this morning. I think you’ll need at least one #pewpew in every rant.

    • #8 by Erin K. Banks on January 31, 2018 - 07:38

      I will absolutely work on that. Thank you for the suggestion

  5. #9 by Openbridge (@openbridgeinc) on June 14, 2019 - 00:13

    Erin, agree that the term can be opaque and overly broad. We have seen a fair amount of bad advice and critiques that confuse those trying to understand what it is and how it can add value. The original concept was simple enough and like most things tech, everyone made it into something else.

    We pulled together a post on this topic to help folks peel the onion on the topic:
    https://blog.openbridge.com/8-myths-about-data-lakes-c0f1fc712406

    Thank you for highlighting the issues in this space.
