What's Next for Welch Labs?

Welch Labs has been inactive for three months - that will be changing soon. I'll be finishing up the Imaginary Numbers series, and I'm excited to premiere a new series at ML CONF in New York this April. (Use the code Welch18 for $45 off your ticket!)

My vision for Welch Labs in 2016 is to continue making the highest-quality educational content possible. I'm excited to create new Machine Learning content this year, as well as math content in the vein of my Imaginary Numbers series.

If you're interested in the Machine Learning content, please read on - I'd love to hear what you think. If you're more interested in math content, click here to see what topics are coming up - I'd love your feedback there, too.

The next Machine Learning series from Welch Labs will cover Decision Trees and Mutual Information. This direction came about for a few reasons:

1. As a strong Intro to Machine Learning. I love making content that appeals to and is understandable by people from all education levels, high school to graduate school. Decision trees are perhaps the "original" machine learning tool - they are easy to understand, and provide great "bang for the buck" - making for a great introduction to machine learning.

The Five Tribes of Machine Learning According to Pedro Domingos. Image from The Master Algorithm. 

2. Stealing from The Master Algorithm. If you're interested in ML and haven't read Pedro Domingos' book, The Master Algorithm, I highly recommend it. It's certainly the best "pop" machine learning book I've read. Pedro weaves a wide variety of Machine Learning approaches into a single narrative in a clear and approachable way. I love this stuff - it's the kind of history and context you can't get from technical resources. Pedro structures his book by dividing Machine Learning into 5 tribes. While this is clearly a simplification of a complex field, I think it's a very useful one. So useful, in fact, that I'll be stealing it :). I'll be roughly following Pedro's path through the world of Machine Learning for the next few series, starting with Decision Trees. Pedro seemed like a nice enough guy when I met him at ML CONF Atlanta, so I'm hoping he won't be too mad.

3. Decision Trees = The Most Popular ML on the Planet. Although no longer among the newest and hottest algorithms out there, trees are incredibly widely used across many diverse applications. The success of trees is likely the result of many factors, but I believe the most important is the simplicity of the resulting models. In my professional work, trees have beaten out other models again and again, simply because I can clearly explain the resulting algorithm to anyone. Try that with a Neural Network! (For a quick taste of this interpretability, see the sketch below.)
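To make that interpretability point concrete, here's a minimal sketch - my own toy example using scikit-learn's DecisionTreeClassifier on the classic Iris dataset, not something from the upcoming series. A shallow fitted tree prints as a handful of readable if/else rules:

```python
# A minimal sketch of decision tree interpretability.
# The library (scikit-learn) and dataset (Iris) are my choices
# for illustration - the series may use different tools.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)

# Unlike a neural network's weight matrices, the fitted model is a
# short set of human-readable threshold rules:
print(export_text(tree, feature_names=list(iris.feature_names)))
```

Capping max_depth keeps the whole model small enough to read aloud - exactly the property that makes trees so easy to hand to a non-specialist.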

Research Time

The next couple of weeks are all about research. I'm reading through Ross Quinlan's book (C4.5: Programs for Machine Learning) as well as the CART book (Classification and Regression Trees). I'm working to develop a deep understanding of the history and fundamentals, and will build up from there. Along the way, I'm looking for interesting resources on and applications of decision trees. I'd like to cover a simple example in detail to put emphasis on the tools and techniques, but I'd also like to spend some time exploring how decision trees are used in big, challenging problems. Mutual information and entropy are central to most decision tree algorithms, and could clearly fill a series of their own - to keep my scope in check, I'll be investigating them through the lens of decision trees.
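Since entropy and mutual information will anchor the series, here's a rough sketch of the two calculations as they show up in ID3/C4.5-style tree building. The function names and the toy outlook/play data are my own, invented for illustration:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H(Y) = -sum p(y) * log2 p(y) over a list of labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Mutual information I(X; Y) = H(Y) - H(Y | X): how much splitting
    on feature X reduces our uncertainty about the labels Y."""
    n = len(labels)
    # Group the labels by feature value...
    groups = {}
    for x, y in zip(feature_values, labels):
        groups.setdefault(x, []).append(y)
    # ...then compute the weighted average entropy within the groups.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

# Toy data: how much does "outlook" tell us about whether we "play"?
outlook = ["sunny", "sunny", "rain", "rain", "overcast", "overcast"]
play    = ["no",    "no",    "yes",  "no",   "yes",      "yes"]
print(information_gain(outlook, play))  # ~0.667 bits of uncertainty removed
```

A tree-growing algorithm just runs this calculation for every candidate feature and splits on the one with the highest gain, then recurses on each branch.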

Help!

As I begin to shape the series, I'd love to hear from you. What do you think is interesting here? What are some cool applications of decision trees? What resources have you found helpful? What would you like to know about decision trees and mutual information? Please let me know in the comment section below. Thank you!