Hailed as the sexiest job of the 21st century, data science has clearly created a lot of buzz in the market. Its popularity shows all over the internet, from courses promising “learn data science in 3 months” to forums flooded with all kinds of absurd questions about making a career in data science.
But what do you really understand by it?
- Because the field is new, all kinds of myths surround it, such as confusing it with data engineering and analytics, when these are entirely different niches with different roles.
- Another common misconception is that you need to be a mathematics pundit or a great statistician to make a career as a data scientist. Yes, mathematics is involved, but the more important skill you’ll need is problem-solving. Mathematics and statistics will make your job easier, but they’re only a fraction of a data scientist’s day-to-day work.
- A third myth is that more data means more accuracy. Data in itself is quite useless; the way you process it determines its usability.
These are just a few; the field is filled with myths and misconceptions.
The fact that only 29% of organizations are able to derive meaningful conversions from data science tells us there’s still a long way to go.
Is data science your cup of tea?
The pay data scientists receive justifies people’s interest in the profession. But let’s face reality: it’s not everyone’s cup of tea to be intrigued by numbers and data, solve complex problems on a day-to-day basis, and implement those solutions in a rather mathematical form.
Though there’s nothing one can’t attain with enough effort and dedication, the real question is: will you enjoy it? Will all the struggle be worthwhile?
The scarcity of talent makes it an even more sought-after field; the skill gap is real. Over 50,000 data scientist jobs are vacant in India alone.
But apart from proficiency in Python or R, there’s a lot you need to be a successful data scientist. Only if something more than the salary or the buzz around the job draws you to the field is it a profession worth considering.
Once your pockets are filled, money won’t be a motivation anymore.
To have a smooth transition into the career, you should be meticulous and analytical enough to investigate trends and patterns in vast amounts of data.
You must be willing to question the obvious, even when the answer seems counter-intuitive, and think from a broad perspective. You should be comfortable enough with math and statistics to work out solutions to concrete business problems. You must be patient enough to perform monotonous tasks and to keep calm when weeks of effort fail to show any result.
If you believe you have what it takes to be a data scientist, let’s get started.
How to make a career in Data Science
Data science, in straightforward terms, is a process of asking questions and answering them using data.
An average day in the life of a data scientist involves finding, cleaning and organizing data, with very little time remaining to perform actual analysis; it can be very different from what you might have pictured.
Do the math
To start with, the first thing you need is mathematics: if not a good command, then at least some familiarity with probability, statistics, and linear algebra.
To avoid flooding you with information, I’m only mentioning what you need to get started. The learning for a data scientist may never end; you’ll need to learn a lot and stay updated with all the technology trends.
You will be learning more mathematical concepts as you proceed further with projects.
There are a bunch of sources available to learn math for data science. My recommendation would be the statistics and probability course and the linear algebra course from Khan Academy, or, if you prefer books, there are a bunch of data science math books you can download for free.
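To give a feel for the level of math involved at the start, here is a minimal sketch using only Python’s standard library. The numbers and variable names (`orders`, `features`, `weights`) are made up for illustration; the point is simply that a mean, a standard deviation, an empirical probability and a dot product are the kind of concepts you’ll meet first.

```python
import statistics

# Hypothetical sample: daily order counts for one week (invented numbers).
orders = [12, 15, 11, 20, 18, 15, 14]

# Statistics: where the data is centred and how spread out it is.
mean_orders = statistics.mean(orders)      # arithmetic mean
median_orders = statistics.median(orders)  # middle value once sorted
spread = statistics.stdev(orders)          # sample standard deviation

# Probability: empirical share of days with more than 14 orders.
p_busy = sum(1 for x in orders if x > 14) / len(orders)

# Linear algebra: dot product of a feature vector and a weight vector,
# the basic operation behind many machine learning models.
features = [1.0, 2.0, 3.0]
weights = [0.5, -1.0, 2.0]
score = sum(f * w for f, w in zip(features, weights))

print(mean_orders, median_orders, round(spread, 2), round(p_busy, 2), score)
```

If you can read this and explain what each value means, you have enough math to begin; the rest can be picked up along the way.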
Get your hands on coding!
The major programming languages used in data science are R and Python.
Python is ideal to start with, as its syntax reads much like plain English and is more intuitive than R’s, even if you have experience with other programming languages.
Editor – There are numerous editors and environments available for writing Python code, such as Sublime Text, Atom, PyCharm, Jupyter Notebook, etc. You can use any of these to start writing your code.
There are a number of online sources available to learn Python without paying a dime. A few of my recommendations are DataCamp, Introduction to Python by Python Org, Google’s Python Class, Python Jumpstart by Building 10 Apps, and w3schools’ Python Tutorial.
Data Analysis – To perform data analysis using Python, you’ll mainly be using two libraries: NumPy and Pandas. Pandas is the go-to tool for data scientists coding in Python, and it’s what you’ll need for most data operations.
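As a rough illustration of the cleaning and organizing work described earlier, here is a small sketch with Pandas and NumPy. The table, its column names (`city`, `sales`) and the choice to fill gaps with the median are all assumptions made for the example, not a recommended recipe for every dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data with typical problems:
# one missing value and one exact duplicate row.
raw = pd.DataFrame(
    {
        "city": ["Pune", "Delhi", "Mumbai", "Mumbai", "Pune"],
        "sales": [120.0, np.nan, 200.0, 200.0, 90.0],
    }
)

# Cleaning: drop exact duplicate rows, then fill the missing
# sales figure with the median of the remaining values.
clean = raw.drop_duplicates().copy()
clean["sales"] = clean["sales"].fillna(clean["sales"].median())

# Analysis: total sales per city, largest first.
totals = clean.groupby("city")["sales"].sum().sort_values(ascending=False)
print(totals)
```

Most real projects spend far longer on the first two steps than on the final `groupby`, which is why the cleaning skills matter so much.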
Data Visualization – Data visualization comes in handy for presenting data in visual form. The main Python library for this is Matplotlib, whose plotting interface resembles MATLAB’s.
For interactive plotting, there are libraries such as Plotly and Bokeh, and for quick statistical plots, Pandas has a built-in plotting interface built on Matplotlib.
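Here is a minimal sketch of the two plotting routes just mentioned: plain Matplotlib and the Pandas `.plot` interface that draws through Matplotlib. The data and the output file name `signups.png` are hypothetical, and the `Agg` backend is chosen only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; works without a screen
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical monthly figures, invented for illustration.
df = pd.DataFrame(
    {"month": ["Jan", "Feb", "Mar", "Apr"], "signups": [30, 45, 38, 60]}
)

fig, (left, right) = plt.subplots(1, 2, figsize=(8, 3))

# Plain Matplotlib: call plotting methods on an Axes object.
left.plot(df["month"], df["signups"], marker="o")
left.set_title("Matplotlib")
left.set_ylabel("signups")

# Pandas plotting: DataFrame.plot draws onto a Matplotlib Axes.
df.plot(x="month", y="signups", kind="bar", ax=right,
        legend=False, title="Pandas .plot")

fig.tight_layout()
fig.savefig("signups.png")  # hypothetical output file name
```

In a Jupyter Notebook you would usually skip the backend line and the `savefig` call, since the figure is shown inline.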
Machine Learning – The most popular machine learning library in Python for clustering, dimensionality reduction, classification, regression, model selection, and preprocessing is scikit-learn. For deep learning, there are libraries such as TensorFlow, PyTorch and Keras, the last of which is a good one to begin with due to its simplicity.
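To show how preprocessing, classification and model evaluation fit together in scikit-learn, here is a short sketch on the Iris dataset that ships with the library. The choice of logistic regression, the 25% test split and `random_state=0` are assumptions made for the example, not the only sensible options.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Small built-in dataset: 150 flowers, 4 measurements each, 3 species.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the rows so the model is scored on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Preprocessing (feature scaling) and classification chained in one pipeline.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))
model.fit(X_train, y_train)

# Fraction of held-out flowers classified correctly.
accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

The same fit and score pattern carries over to most other scikit-learn models, so swapping in a different classifier is usually a one-line change.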
Editor – RStudio is the most popular IDE for R, though R can also be written in Jupyter Notebook.
Data Analysis – A lot of data analysis features are already present in R, but if you’re using R for data analysis, it’s a must to learn the tidyverse, a collection of data analysis packages such as dplyr for data manipulation, tidyr for data cleaning, readr for reading data, etc.
Data Visualization – The most popular R library for data visualization is ggplot2; it also works with Plotly, whose ggplotly function turns ggplot2 charts into interactive visualizations.
Machine Learning – The packages and libraries for machine learning in R are vast enough to overwhelm a beginner, so caret is one uncomplicated package you can start with. You can master other packages later as you learn more.
If you want data science to be more than just a skill on your resume, you can opt for certification courses, a diploma or a master’s in the field.
Very few universities provide a specialization in data science, though, and when it comes to Indian universities, you can count them on your fingers.
Universities offering postgraduate-equivalent courses in data science are rarer still, though a lot of executive, diploma or online courses are available which you can opt for.
If you’d like to pursue a proper master’s with a specialization in data science, you can look at studying abroad or at Indian institutes like IISc Bangalore or the IIITs.
No “Data Science in 3 months” course is going to make you proficient or will land you a job.
Merely learning data science is not enough; your skills should be practical and employable. To practice what you’ve learned, you can build some sample projects, or take on some from sites like Freelancer and Upwork to sharpen your skills.
With that, you can start applying for data science jobs. While learning, also try to attend events related to data science; they will help you network with people you can learn from and who may also prove to be a helping hand in landing you a job.
Making a career in another domain is difficult but not impossible. At the end of the day, the most important thing you need is determination towards your goal; there’s nothing you can’t do with enough effort and persistence.