Part 1 — SQL, Python, R and Data Visualization

Recently, I graduated from Chemical Engineering and got my first role as a data analyst in a tech company. I documented my journey here from Chemical Engineering to Data Science. Since then, as I spoke to students from my school about the move, many expressed the same interest and the same question…

That was the exact question I asked myself: how can I make the move? That same thought bugged me and pushed me to start pursuing the skills of a data scientist a little over a year ago.

It…


Model Interpretability

Understand LIME Visually by Modelling Breast Cancer Data

Photo by National Cancer Institute on Unsplash

It is almost trite at this point for anyone to espouse the potential of machine learning in the medical field. There is a plethora of examples to support this claim; one would be Microsoft’s use of medical imaging data in helping clinicians and radiologists make an accurate cancer diagnosis. Simultaneously, the development of sophisticated AI algorithms has drastically improved the accuracy of such diagnoses. Undoubtedly, these are amazing applications of medical data, and one has every good reason to be excited about their benefits.

However, such cutting-edge algorithms are black boxes that might be difficult, if not impossible, to…


Which Tool is Better for Scalable Data Science?

An illustration of the end-to-end data science pipeline without GPU, parallel programming and a DevOps team. Photo by amirali mirhashemian on Unsplash

Ask any data scientist for the most common tool they use at work. Chances are, you will hear a lot about Jupyter Notebook or Google Colab. That’s no surprise, since we data scientists often need an interactive environment to code: to see the results of our data wrangling immediately, to extract insights from visualizations, and to monitor the performance of machine learning models closely. I for one would love for my code to execute very quickly, if not immediately. That is usually done with the help of GPUs and parallel programming.

After a machine learning model is developed and…


A Worthy Cert for Aspiring Data Analysts from Non-traditional Backgrounds

(I am not paid to promote this course. This is my personal opinion.)

In March 2021, Google launched a Data Analytics Professional Certificate. This came at a perfect time, as supply lags behind the demand for analytics roles, creating a shortfall of data analysts in the market.

The demand for data analytics roles has skyrocketed in recent years, increasing the number of openings in analytical roles with lucrative paychecks. In fact, according to Burning Glass, there are 337,400 data analytics jobs in the US alone, with $67,900 as the average entry-level salary.

At the same…


In Conversation with Michael Ng, the Data Analytics Manager at Agilent

I recently stumbled across Symbolic Connection, a data podcast run by experienced data practitioners, Thu Ya Kyaw and Koo Ping Shung. This podcast caught my attention as the guests are mainly from Singapore, one of the burgeoning tech hubs of South-East Asia (it is also where I am based).

Michael from Agilent. Image published with Michael’s permission.

As a data analyst in a tech company, I am always curious about the work of analysts in other tech firms. So I listened to the podcast featuring Michael Ng from Agilent.

In this podcast, Michael talked about his role as a data analytics manager at Agilent, advice for…


Useful Courses and Resources for Machine Learning

Ready to move into data science? Photo by Andrea Gradilone on Unsplash

Recently, I graduated from Chemical Engineering and got my first role as a data analyst in a tech company. Since then, as I spoke to students from my school about the move, many expressed the same interest and the same question…

That was the exact question I asked myself: how can I make the move? That same thought bugged me and pushed me to start pursuing the skills of a data scientist a little over a year ago. …


Lessons from an Economist turned Data Analyst

I recently stumbled across Symbolic Connection, a data podcast run by experienced data practitioners, Thu Ya Kyaw and Koo Ping Shung. This podcast caught my attention as the guests are mainly from Singapore, one of the burgeoning tech hubs of South-East Asia (it is also where I am based).

As a data analyst in a tech company, I am always curious about the work of analysts in other tech firms. So I listened to the podcast featuring Cliff Chew, a senior data analyst at Grab.

Cliff is a senior data analyst at Grab, a Singapore-based ride-hailing tech unicorn with a presence across South-East Asia.

In this podcast, Cliff shared about how his role as a data analyst in…


Chihuahua, Muffins and Confusion Matrix in Machine Learning

Photo by Alicia Gauthier on Unsplash

Conditional probability is one of the fundamental concepts in probability and statistics, and by extension, data science and machine learning.

In fact, we can think about the performance of a machine learning model using a confusion matrix, which can be interpreted from a conditional probability perspective.
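To make that link concrete, here is a minimal sketch of how the cells of a binary confusion matrix can be read as conditional probabilities. The labels are made up for illustration (1 = chihuahua, 0 = muffin, borrowing from the title), and the helper names `confusion_counts`, `precision` and `recall` are hypothetical, not taken from any library.

```python
from collections import Counter


def confusion_counts(actual, predicted):
    """Count the four cells of a binary confusion matrix (1 = positive)."""
    counts = Counter(zip(actual, predicted))
    return {
        "tp": counts[(1, 1)],  # actual positive, predicted positive
        "fn": counts[(1, 0)],  # actual positive, predicted negative
        "fp": counts[(0, 1)],  # actual negative, predicted positive
        "tn": counts[(0, 0)],  # actual negative, predicted negative
    }


def precision(c):
    # Estimate of P(actually positive | predicted positive)
    return c["tp"] / (c["tp"] + c["fp"])


def recall(c):
    # Estimate of P(predicted positive | actually positive)
    return c["tp"] / (c["tp"] + c["fn"])


# Made-up labels for illustration: 1 = chihuahua, 0 = muffin
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

c = confusion_counts(actual, predicted)
print(c)
print("precision:", precision(c))
print("recall:", recall(c))
```

Conditioning on "predicted positive" restricts attention to one column of the matrix, while conditioning on "actually positive" restricts it to one row, which is why the two metrics generally differ on the same predictions.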

In this blog post, we will cover all that and more:

  1. Introduction
  2. A formal definition of conditional probability
  3. Intuition for conditional probability: Moving to a new universe
  4. Conditional Probability in Machine Learning and Confusion Matrix
  5. How to learn more probability

Let’s get started.

Conditional Probability Introduction

Conditional probability is a method to reason about the outcome of an…


OPINION

I’m sorry, but it’s true.

Photo by Martin Klausen on Unsplash

“A proud man is always looking down on things … as long as you are looking down, you cannot see something that is above you.” — C.S. Lewis

Humans are prideful.

We think that human intelligence is the pinnacle, and thus mock artificial intelligence, even in the face of its incredible intellect and creativity.

AI can code in any language, make amazing art (we’ll get into that), and drive better than us with Tesla’s Full Self-Driving. These are all advancements of the last year.

To think that AI won’t eventually do your job better than you is sheer pride, or the…


What’s the difference and why data scientists should know it

Photo by Wolfgang Hasselmann on Unsplash

Probability and statistics — which comes first?

We almost always discuss probability and statistics hand-in-hand.

In fact, it is not uncommon for students to not know the distinction between statistics and probability.

But which comes first? And what are the differences, anyway?

It’s a chicken-and-egg problem. You can’t have probability without statistics, and vice versa. What do I mean by that? I’ll explain more in this post.

In this short post, I will highlight some of the differences between probability and statistics, and some of their applications in data science.

From this, I hope you will gain a new perspective on why data scientists…

Travis Tang

A Data Science Guy in Tech from Singapore. linkedin.com/in/travistang | travistang.com
