Getting into Data

I’ve been asked often enough about making a transition into the Data Analytics space, aka my day job, that I thought it would be useful to combine my thoughts into a coherent post. This is a quick take on who this industry/job/role suits, what the required skills typically look like, and some background info on the industry as a whole. Note that this is oriented towards the Data Analyst / Analytics Engineer roles. Also note this post is mostly links.

The barrier to entry for working as a data analyst is falling rapidly. If you are a combination of curious, technically astute, outcomes-oriented, driven, observant, people-oriented and a natural leader, then the technology should not be a hindrance, as it is quite literally being made easier every single day.

Consider this as a primer, and reach out to me directly if you think you’d like to work in this space, especially if you are motivated and have any data experience.

Is it a good fit for me/you?

  1. It is a great fit for people with good soft skills, a strong intuition for business operations and a desire to make a difference in how a company operates. Are you interested in finding out the facts about what happened in the past and suggesting changes to how the company should operate in the future? To tell the CEO this? To back the decision and measure the changes? To possibly have been wrong? Good things to like the sound of.

  2. There is Very Strong Demand in the startup space for these kinds of roles, so if you can convince the right person that you have what it takes then it is entirely possible to transition into this career, without any formal qualification. Typically good transitions are from engineering and technical operations in fast moving businesses.

  3. With that goes the obvious implication that the current high demand may not be sustainable in the long term. 

  4. The term Data Scientist has largely been merged into the analytics roles, as pure data science is very niche, and often misapplied. Typically companies need more data analytics than they do data science unless their core product is somehow related to data science, and even if they say it is, in reality it very often is not.

  5. I prefer working in or as a provider to startups and scaleups, as this environment is more dynamic. Larger organisations and enterprises have analytics functions, but they often have more functionally or departmentally specialised roles, which means less exposure to the business as a whole. As you become more senior, larger companies will provide a great growth opportunity. 

Data Analyst & Analytics Engineer stuff:

Necessary tech skills.

  1. The Data Analyst / Data Engineer concepts have been pushed together into a single role, Analytics Engineer, which broadly makes sense, and can be considered a role related to the Data Scientist. I explained this to some extent visually in this tweet. Further, this describes in a bit more detail what we do and don’t know about analytics engineering.

  2. These roles are generally distinct from or work alongside Business Analyst and Product Manager roles.

  3. If you can do some part-time education, this short course from one of the ex-dbt employees looks worth considering. It is brand new, but they intend to teach the exact thing I usually look to hire for.

  4. So, onto The Analytics Engineer skills/tools to know

     I’d give 60% focus on SQL, with maybe 10% each for Git, CLI, Python, BI tools.

    1. SQL

      1. Analytical queries like this tutorial, especially the “advanced” section, also this tutorial

      2. CTEs

      3. Window functions

      4. Focus on consistency, neatness, style

    2. Git basics

      1. Pull, branch, commit, merge

      2. Most people don’t need to know much more

    3. Command-line basics

      1. Navigation, creating, deleting, moving

    4. Python

      1. Python is a huge domain, super useful but harder to get to grips with, and also less useful unless you are in a specific niche that calls for it

      2. Jupyter notebooks

      3. Basic pandas for data manipulation

      4. Functions

      5. Virtual environments basics

    5. BI Tools

      1. BI tools are often prohibitively expensive, so people’s experience is often limited to one tool, and often the wrong one or an old one.

      2. Metabase is the de facto open source quick and simple option, and it functions in a reasonably useful way. It is worth downloading and testing out.

      3. Another shoutout to Lightdash, a very simple BI tool that runs with dbt, and is familiar to users of Looker.

    6. Amendment: *Excel*

      1. This post strongly assumes Excel expertise, but some readers pointed out that this would be worth stating.

      2. Pivots, XLOOKUP etc.

      3. Highly recommend reading the post Doing Better With Excel.
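To make the SQL items in the list above a little more concrete, here is a minimal sketch of a CTE feeding a window function. It is not taken from any of the linked tutorials. The `orders` table, its columns and the sample rows are all made up for illustration, and it uses Python’s built-in `sqlite3` module only so that it runs without a data warehouse (window functions assume SQLite 3.25 or newer, which most current Python installs ship with).

```python
import sqlite3

# Hypothetical sample data, invented purely for illustration:
# (order_date, customer, amount)
ORDERS = [
    ("2024-01-05", "alice", 120.0),
    ("2024-01-20", "bob", 80.0),
    ("2024-02-03", "alice", 200.0),
    ("2024-02-14", "carol", 50.0),
    ("2024-03-01", "bob", 150.0),
]

QUERY = """
with monthly_revenue as (
    -- CTE: collapse individual orders to one row per month
    select
        substr(order_date, 1, 7) as order_month,
        sum(amount) as revenue
    from orders
    group by order_month
)

select
    order_month,
    revenue,
    -- Window function: cumulative revenue up to and including each month
    sum(revenue) over (order by order_month) as running_revenue
from monthly_revenue
order by order_month
"""


def monthly_running_revenue(rows):
    """Load rows into an in-memory SQLite table and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table orders (order_date text, customer text, amount real)"
        )
        conn.executemany("insert into orders values (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in monthly_running_revenue(ORDERS):
        print(row)
```

Run as a script, this prints one row per month with the month’s revenue and the running total: 200.0 / 200.0 for January, 250.0 / 450.0 for February and 150.0 / 600.0 for March. The shape is the part worth practising: a named CTE that does one clear thing, then a final select that layers a window function on top, laid out consistently.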

Analytics industry context:

Articles/books etc that I think are worth a look to add some colour to the above.

  1. I wrote some articles about what I think about technology; the data-as-a-utility-tool one is probably the only one worth reading.

  2. Building a data team at a mid-stage startup: a short story

    1. This very neatly describes my jobs and career so far 

    2. A good representation of what the “data” industry looks like

  3. The Modern Data Stack: Past, Present, and Future

    1. dbt is the tool that I use for SQL transformations. It is probably the single most useful thing to learn in addition to the SQL, git, command line, Python list

    2. They are doing lots of thought leadership in the analytics space, with very good blog posts

  4. Technology in Data Analytics

    1. Simplest, shortest overview of the most important tech tools. Anything not on this list is possibly out of date or redundant, or used in a different context

  5. The Analytics Setup Guidebook

    1. This is a deep dive into modern analytics. The opinions are now pretty generic, but it is a BOOK

    2. Some interesting meta-analysis takes on the jobs in the industry here

    3. Possibly the thing that is done most badly in the analytics space is data modelling, with the background here. This is probably the hardest skill to attain, and near impossible to hire for, but it is something that is mostly trial and error anyway!

That is all I’ve got. Topical and relevant as of publishing. I’ll continue to add and amend, but for now, if you know of someone interested in Data, send them this. There is a world of nuance not covered here, and the most interesting themes will all be covered in due course in a post here. Or not.

As may be obvious from the long list of articles, there are lots of thoughts and opinions in this space. I have personally experienced the tension and growth of data analytics alongside data science and data engineering (and also a frustrated relationship with software engineering). In very many contexts the data science/engineering/analytics terms are used interchangeably. They often end up meaning the exact same thing, and just as often mean nothing similar at all, which makes it confusing, especially for data engineering. See an upcoming post on Data Engineering: Backend Developer, or Data Analyst.

And as I said, I’d love to hear from you if you are making this transition, or feel like understanding it in more detail. The above is just a primer! Get in touch to set up time directly, quickest via the dreaded:

Who is Matt Arderne