1 to N Journey to Deep Learning Engineer

"Fall down seven times, get up eight"

It's been about two weeks since I decided to actively work on my long-term goal of becoming a deep learning engineer. A handful of catalysts spurred my motivation:

  1. My transition into my remote MS degree in data science in the fall

  2. My transition into a data engineering position in the summer

  3. My "need" to become an expert on all things machine learning

And if I learned anything from my career as a student, it's that I am by no means gifted, so the earlier I can start experimenting with an ML model in a REPL, the better.

Who am I?

My name is Joel Montano and I am finally (I say finally because it took me 8 years to get a 4-year degree) graduating with a BS in Computer Science in May. And as one life goal closes, another opens: becoming a world-class machine learning engineer.

Even with a degree, I still consider myself a novice engineer, so I thought it'd be interesting to chronicle my journey to becoming an expert (which will also give me a chance to work on another lifetime goal of mine, to become a writer).

My Learning Strategy

Let's consider a scenario where I try to learn convolutional neural networks (CNNs). There are two ways I could approach this. The first is top-down: I would look around for some images, an implementation of a popular CNN model such as ResNet-50, and some already-implemented data preprocessing methods. After a couple of hours, I would have a deep learning model that was considered world-class about 10 years ago.

Meanwhile, with a bottom-up approach, I would start by looking up some popular CNN papers, specifically the ResNet paper, and read post after post on topics like skip connections, 1x1 convolutional filters, max pooling versus average pooling, feature maps, strides, and so on, all the way back to the dawn of math. Finally, after a handful of weeks of learning, I would gather some data and build a model (only after implementing it from scratch).
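To make a few of those terms concrete, here is a minimal pure-Python sketch of strides, feature maps, max pooling, a 1x1 filter, and a skip connection. It is a toy example, not code from the ResNet paper or from any library: the helper names `conv2d`, `max_pool2d`, and `skip_connection` are hypothetical, the input is a single-channel 6x6 grid, and there is no padding. In a real network a 1x1 filter mixes information across channels; with one channel it reduces to a simple scaling, which is enough here to show that it preserves the spatial size.

```python
def conv2d(image, kernel, stride=1):
    """Valid (no padding) 2D cross-correlation of a single-channel image."""
    n_rows, n_cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    # Output size along each axis: floor((n - k) / stride) + 1
    out_rows = (n_rows - k_rows) // stride + 1
    out_cols = (n_cols - k_cols) // stride + 1
    out = []
    for i in range(out_rows):
        row = []
        for j in range(out_cols):
            total = 0.0
            for a in range(k_rows):
                for b in range(k_cols):
                    total += image[i * stride + a][j * stride + b] * kernel[a][b]
            row.append(total)
        out.append(row)
    return out


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling (the stride equals the window size)."""
    out_rows = len(feature_map) // size
    out_cols = len(feature_map[0]) // size
    return [
        [
            max(
                feature_map[i * size + a][j * size + b]
                for a in range(size)
                for b in range(size)
            )
            for j in range(out_cols)
        ]
        for i in range(out_rows)
    ]


def skip_connection(x, fx):
    """Residual add: output = F(x) + x, which requires matching shapes."""
    return [[u + v for u, v in zip(row_x, row_fx)] for row_x, row_fx in zip(x, fx)]


# Toy 6x6 input where each pixel value is 6 * row + column
image = [[float(r * 6 + c) for c in range(6)] for r in range(6)]
edge_kernel = [[1.0, 0.0, -1.0]] * 3  # 3x3 vertical-edge filter

feature_map = conv2d(image, edge_kernel)        # 6x6 input -> 4x4 feature map
strided = conv2d(image, edge_kernel, stride=2)  # 6x6 input -> 2x2 feature map
pooled = max_pool2d(feature_map)                # 4x4 feature map -> 2x2
residual = skip_connection(image, conv2d(image, [[0.5]]))  # 1x1 filter keeps 6x6
```

The skip connection only works because the 1x1 filter leaves the height and width unchanged, so the input and the filtered output can be added element by element.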

Top-Down Learning

Pros | Cons
I can start building practical things quickly | Limited by APIs
I can showcase to the world a lot quicker | Seeing technical terms can confuse me

Since my focus leans more towards development and I already have a solid computational foundation, due to my degree, I'll probably make this about 80% of my approach.

Bottom-Up Learning

Pros | Cons
Easier to get into technical articles | It might be weeks before I build something useful
Can debug models quicker | I might get discouraged
Enables building things from scratch |

The remaining 20% of my time will be committed to understanding the theory. My split isn't meant to put down the bottom-up approach, but rather to tailor my learning to my strengths and weaknesses. Realistically, at least for me, abstract concepts tend to marinate more deeply when informed by hours and hours of trial and error.

WEEKLY CHALLENGE ALERT

So, I am currently trying to understand the basics of CNNs and there are a handful of goals I'd like to accomplish this upcoming week:

  • Go over FastAI's lessons 3 and 4

  • Learn more about nbdev and try to build a portfolio on my GitHub page highlighting my deployed dog/cat classifier

  • Review FastAI's documentation, specifically the data portion

  • Look to see if there are some Kaggle competitions I can join

  • Write a blog post on materials covered from lessons 1 and 2

  • Understand the history of CNNs and some basic concepts surrounding them

  • Review the "ImageNet Classification with Deep Convolutional Neural Networks" paper by Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton

  • Try to implement a basic CNN using PyTorch

I don't expect to get through all of these goals in a week, but I'll list them here regardless to help stretch myself as the week goes by.

Additional Resources

To give some credit where credit is due, I've linked some books, articles, and videos that helped inspire this journey: