On Data Science, Data Analysis, and Machine Learning

Kevin Siswandi
2 min read · Nov 23, 2020

The definition of data science varies quite a bit depending on whom you ask. In the academic sense, data science is a multidisciplinary field that combines mathematics, statistics, machine learning, and computer science to solve difficult problems in an area of application (such as physics, astronomy, finance, medicine, or the social sciences). Unfortunately, in the real world, data science is often used as a blanket term for several things that can be totally different in nature. These include data analysis, data engineering, and machine learning.

Photo by Myriam Jessier on Unsplash

In many cases, typical data analysis tasks or projects are rebranded as data science for various purposes (which I am not going into). Such tasks include data analysis with Excel, pulling data with SQL, data visualization with Tableau, or dashboarding with QlikView. Sometimes they also involve managing online campaigns with Google Analytics. Although these tasks might sound a bit too menial to someone with a more academic background, they bring a lot of value to an organization. Mastering these skills is therefore the way to go for anyone aspiring to build a data career in the business world.

Data engineering is usually far more specific: it refers to setting up data infrastructure (though it often involves a heavy data analysis component). Thus, it is more relevant to large companies that have a lot of data, or to startups that have acquired a significant amount of traffic. In contrast to data science, data engineering is mainly about software engineering and much less about statistics or machine learning.

If you are more academically inclined, machine learning engineering might be your cup of tea. Be aware, however, that it is often extremely product-centric and thus involves a huge software engineering component. It is probably the ideal path for someone with a strong background in an academic field (e.g. mathematics, physics, or economics) who is looking to transition into a company focused on building data products or ML/AI platforms. Internet companies and forward-looking enterprises that value digital innovation and automation are often the early adopters of machine learning engineering.

Lastly, the general data science problem is one that aims to answer operational questions for an organization from a messy dataset (ideally generating actionable insights in the process). This is usually more relevant in companies or organizations where data is not the core product but that still care a lot about data. Here, the work can involve data preparation, machine learning, visualization, model deployment, and ‘big data’.

👨🏻‍💻 Data Scientist 📚 MSc. Mathematics & Computer Science 🎓 BSc. (Hons) Physics https://www.linkedin.com/in/kevinsiswandi/
