Career in Data Science

Data science is about building data-driven algorithms and making decisions based on data.

Most businesses generate a lot of data, and analyzing that data helps them make well-informed, evidence-backed decisions.

In this discussion, we will walk through the following topics:

  1. What are the qualifications required to pursue a career in Data Science?
  2. What are the career options in Data Science?
  3. Career growth in Data Science

What are the qualifications required to pursue a career in Data Science?

To pursue a career in data science, you must hold any one of the following degrees:

  1. BTech / BEng
  2. BSc / MSc (Maths, Computational maths)
  3. MCA

What are the career options in Data Science?

There are three major tracks in Data Science:

  1. Data Analyst
    This role involves analyzing and visualizing data using tools such as Microsoft Excel, Tableau, etc.
  2. Data Engineer
    This role involves building data infrastructure to handle vast amounts of data, including integrating and deploying various tools for big data. This role requires good software engineering skills.
  3. Data Scientist
    This role involves building predictive models using data. This role requires good machine learning-related skills and moderate software engineering skills.

Skills of Data Analyst

The main responsibility of a Data Analyst is to answer business-related questions based on data, by figuring out the relation between various variables and their impacts.

As a data analyst you should have the following skills:

  1. Data visualization: Visualizing data with tools such as MS Excel and Tableau, and finding patterns in the data to answer business questions.
  2. SQL: Reading data from databases and analyzing it with various tools.
  3. Reporting: Very good communication and presentation skills.
  4. Domain knowledge: Knowing the business side of the data, understanding where the data comes from, and interpreting it.
  5. A/B testing: A testing method that compares the impact of two variants (A and B).
  6. Coding: Understanding or writing code in R may be required, depending on the company.
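
To make the A/B testing item above more concrete, here is a minimal sketch in Python of one common way to compare two variants: a two-proportion z-test on conversion rates. The function name, the choice of test, and the visitor and conversion counts are all assumptions made for illustration; real A/B analyses may use different tests and tooling.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / conv_b: number of conversions in variants A and B.
    n_a / n_b: number of visitors shown variants A and B.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value


# Hypothetical numbers: 5000 visitors per variant, 200 vs 250 conversions.
z, p = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=250, n_b=5000)
print(f"z = {z:.2f}, p-value = {p:.4f}")
```

A small p-value suggests the difference between the two variants is unlikely to be due to chance alone; the threshold to use (0.05 is a common convention) is a decision for the team running the test.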

Skills of Data Engineer

The main responsibility of a data engineer is to build a high-performance data infrastructure to support loading and reading big data.

As a data engineer you should have the following skills:

  1. Backend technology: Well versed in developing server-side applications and building stream-processing software.
  2. Distributed computing: Understanding of distributed systems and their implementation.
  3. Programming: Python / Java

Experienced software engineers are mostly hired for this role.

Skills of Data Scientist

The main responsibility of a data scientist is to build a predictive model using data.

As a data scientist you should have the following skills:

  1. Machine learning: Applying machine learning to get insights from data, and building classification, regression, and segmentation models.
  2. Coding: Implementation of various algorithms in Python.
  3. Database: Writing and reading data from databases. SQL understanding is a basic requirement.
  4. Data: Data visualization and modeling.
  5. Metrics: Understanding how to measure the performance of machine learning models.
  6. Deep learning: Applying deep learning to solve problems in domains such as image, text, video, speech, etc.

Note that deep learning is an advanced data-driven approach and may not be required for many roles.
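
As an illustration of the metrics skill listed above, the following is a minimal sketch in plain Python that computes four widely used classification metrics (accuracy, precision, recall, and F1) for a binary classifier. The function name and the labels are made up for the example; in practice a library such as scikit-learn is often used instead of hand-written code.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical ground-truth labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Which metric matters most depends on the problem: for example, recall is usually weighted more heavily when missing a positive case is costly.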

What is a machine learning engineer or deep learning engineer?

Both are data scientist roles with slight differences.

As a machine learning engineer or deep learning engineer, apart from building predictive models, you will also need to deploy these models to production environments using software engineering skills.

In the case of the deep learning engineer role, you may not need to work on classical machine learning.

Problem-solving skills

For both data engineer and data scientist roles (depending on the company), problem-solving skills are tested during the interview.
Problem-solving questions test one's understanding of computer fundamentals such as data structures, algorithms, memory management, and compute optimization.

It is very important that you practice solving problems to improve your skills. You can find interesting problems at the following sites:

  1. Leetcode Problems
  2. Codeforces Problems

Tools for Data Science

The following tools are required for both data scientists and data engineers.
It is very important to get familiar with tools used for software development before applying to any job. Here are some important tools:

  1. Git (version control) - GitHub / GitLab / Bitbucket
  2. IDE or editor (coding platform) - Vim / VS Code
  3. Open source - Contributing to and reading open-source software code. You can learn a lot about coding by going through experts' code.
  4. Stack Overflow - Q&A platform.

Career growth in Data Science

In all three roles, the first 10 years of your career journey would look as follows:

  1. Beginner: Joining as level 1 (individual contributor, IC) [0 years of experience]
    In this role, you will be working alongside other level 1 colleagues and seniors. You will be asked to work on non-critical tasks to get started and to test your performance. If you do very well, you will gradually be given more and more critical tasks. (This level is mostly not applicable to data engineers.)
  2. Intermediate: Joining as level 2 (IC) [2 to 5 years of experience]
    In this role, you will also be working with other level 1 and level 2 colleagues and seniors. Mostly, you will be assigned specific tasks to work on by yourself, taking full ownership of each task.
  3. Senior: Joining as level 3 / senior (IC) [5 to 10 years of experience]
    In this role, depending on the team/company, you will need to either take ownership of a few tasks and work with juniors, or mentor juniors and sometimes even lead a team.

After about 10 years in the industry (roughly 8 to 12), you will have two choices:

  1. Move to a managerial role
  2. Stay on as an IC

In a managerial role, you will be responsible for managing a team. On this track, as on the IC track, you will progress from manager level 1 to level X.
For data scientists, based on your performance, you may also get a chance to lead the entire data science group.

For data analysts, there will be no significant growth unless you upskill and move to a data scientist role.

For a data engineer, based on your performance, you will be promoted to an architect role, where your responsibility would be to own the entire data infrastructure of a particular product.

For a data scientist, based on your performance, you will be promoted to a lead role, where you will take the lead in developing predictive models for a particular product.

Additionally, as an architect or lead, you will have to mentor and train junior engineers and data scientists on modeling, design, and coding standards.
