These are powerful ideas that will take your Machine Learning pipelines to the next level.
To help me understand you, fill out this survey (anonymous).
Thanks to the internet and the open-source community, there has never been a better time to start building products that leverage AI and Data to create valuable insights for various organizations. In this article, I will share some amazing techniques that will make your Machine Learning solutions much better. These techniques have shown a lot of success in projects, but too many ML practitioners overlook them and jump straight to expensive Deep Learning methods.
If you have been following my work for a while, this will not surprise you. Working a degree of randomness into your Machine Learning pipelines will improve your models in terms of performance, robustness, and even cost. For example, Google researchers were able to beat SOTA image-classification models while using 12 times fewer images (and without needing to label these images).
We show a counter-intuitive result that adding more sources of variation to an imperfect estimator approaches better the ideal estimator at a 51x reduction in compute cost.
Why does this happen? The most likely explanation is that adding an element of randomness and chaos into your training protocols actually expands the solution space your model searches through. Thus the final model configurations will be more suited to generalize to a greater possible set of inputs. This helps a lot when it comes to building solutions that can generalize to the often chaotic and messy input you will see in the real world.
The amazing thing about this is that the randomness added to your datasets doesn’t have to be too fancy. In Computer Vision, TrivialAugment and RandAugment are two of the most successful Data Augmentation protocols. Both of them are extremely simple and do much better than the fancier protocols. Well-crafted simple policies create a lot of input noise (and thus diversity), and will not introduce as much bias as the more elaborate policies, which take entire datasets into account and will thus often mimic the original dataset distributions.
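To show how little machinery a simple policy needs, here is a minimal sketch in the spirit of TrivialAugment: for every image, pick one operation uniformly at random and one strength uniformly at random, then apply it. The four operations, their strength ranges, and the list-of-lists image format are my own illustrative assumptions, not the operation set from the paper.

```python
import random

# A grayscale "image" is a list of rows, each a list of floats in [0, 1].
# All operations below are simplified stand-ins chosen for illustration.

def identity(img, strength):
    # Return an unchanged copy.
    return [row[:] for row in img]

def flip_horizontal(img, strength):
    # Mirror each row; strength is ignored.
    return [row[::-1] for row in img]

def adjust_brightness(img, strength):
    # Map strength in [0, 1] to a brightness factor in [0.5, 1.5].
    factor = 0.5 + strength
    return [[min(1.0, max(0.0, p * factor)) for p in row] for row in img]

def translate_x(img, strength):
    # Shift right by up to half the width, padding with black pixels.
    width = len(img[0])
    shift = int(round(strength * (width // 2)))
    return [[0.0] * shift + row[: width - shift] for row in img]

OPS = [identity, flip_horizontal, adjust_brightness, translate_x]

def trivial_augment(img, rng):
    # One random operation, one random strength from 31 discrete levels.
    op = rng.choice(OPS)
    strength = rng.randint(0, 30) / 30
    return op(img, strength)

if __name__ == "__main__":
    rng = random.Random(0)
    image = [[0.0, 0.25, 0.5, 1.0], [1.0, 0.5, 0.25, 0.0]]
    for _ in range(3):
        print(trivial_augment(image, rng))
```

There is nothing to tune here beyond the list of operations, which is a big part of why this kind of policy is so cheap to try.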
Research comparing Deep Learning models has further backed up my hypothesis that more noise → bigger space to search through → better results. I have written an article breaking down a paper that compared Deep Learning Ensembles to Bayesian Neural Networks (BNNs). The authors showed that Ensembles outperformed BNNs because they were able to sample from a more diverse search space.
If you’re interested in working randomness into your pipelines, check out my guide on using Randomness effectively in Deep Learning. It will cover the places you can implement randomness and other important details. If you need additional help with this, you can always reach out to me. Contact info is at the end of this article.
Speaking of things that can traverse a very diverse search space, it’s time to cover possibly the most overlooked tool in a Machine Learning Engineer’s arsenal: Evolutionary Algorithms (EAs). The amount of disrespect EAs get is unbelievable. The number of ML people I’ve talked to who have never even considered evolutionary methods is staggering. It’s time to change that.
Evolutionary Algorithms are amazing for many reasons: they are cheap, can work on a more diverse set of problems than neural networks, have shown amazing performance, and their training can easily be followed and analyzed. We have seen more and more people use Evolutionary Algorithms as one layer in a higher-level optimization problem. They have also been used in some of the coolest projects, including fighting government censorship. The thread above contains more information, including various implementations of EAs in domains like Neural Architecture Search, Computer Vision, and much more.
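If you have never written one, here is a minimal sketch of an evolutionary loop, assuming a toy problem: minimizing the sphere function with Gaussian mutation and elitist selection over parents plus children. The function names, population size, and mutation scale are hypothetical choices for illustration, not settings from any of the projects mentioned here.

```python
import random

def sphere(x):
    # Toy fitness function: lower is better, optimum at the origin.
    return sum(v * v for v in x)

def evolve(fitness, dim=5, pop_size=20, generations=100, sigma=0.3, seed=0):
    rng = random.Random(seed)
    # Start from a random population of candidate solutions.
    population = [[rng.uniform(-5, 5) for _ in range(dim)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Mutation: every parent produces one child by adding Gaussian noise.
        children = [[v + rng.gauss(0, sigma) for v in parent]
                    for parent in population]
        # Selection: keep the best pop_size individuals among parents and children.
        population = sorted(population + children, key=fitness)[:pop_size]
        # Record the best fitness so progress is easy to follow.
        history.append(fitness(population[0]))
    return population[0], history

if __name__ == "__main__":
    best, history = evolve(sphere)
    print("best fitness after first generation:", history[0])
    print("best fitness after last generation:", history[-1])
```

Notice that the loop never needs a gradient, only a way to score candidates, and the `history` list makes the whole run easy to inspect. Those two properties are what I mean when I say EAs work on a wide set of problems and are easy to follow.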
You can separate a Good ML Engineer from a Great one based on their familiarity with Bayesian Methods. Most people in Machine Learning don’t understand Math and thus aren’t fully comfortable with (or even aware of) Bayesian Statistics and Causal Inference. A lot of the ‘cool’ ML research doesn’t cover this topic a lot, but Bayesian Learning has been a game changer to my effectiveness in ML.
Bayesian Learning can be a tricky thing to learn and understand, but it will be a great addition to your arsenal. I only really got good with it through my work on Supply Chain Analysis at ForeOptics (shoutout to my manager Shaheen for all his help with this), but since then, I’ve been able to solve some pretty interesting problems. To develop your skills with it, look up some talks and papers that use Bayesian Learning to understand the contexts it’s being applied in (and why it works there). Use that to gain an appreciation for the subject. Then (and only then) look into the Math and coding of the topic. If you’re struggling with understanding Math, read the above article to help you.
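When you do get to the coding stage, a good first exercise is a conjugate update. Below is a minimal sketch, assuming a Beta prior over a success rate and made-up counts (18 successes, 2 failures). The helper names and the numbers are hypothetical and are not taken from my work.

```python
def update_beta(alpha, beta, successes, failures):
    # Beta prior + binomial data gives a Beta posterior (conjugacy).
    return alpha + successes, beta + failures

def beta_mean(alpha, beta):
    # Expected value of a Beta(alpha, beta) distribution.
    return alpha / (alpha + beta)

def beta_variance(alpha, beta):
    # Variance of a Beta(alpha, beta) distribution.
    total = alpha + beta
    return alpha * beta / (total ** 2 * (total + 1))

if __name__ == "__main__":
    prior = (1.0, 1.0)  # uniform prior: no opinion about the rate yet
    posterior = update_beta(*prior, successes=18, failures=2)
    # The posterior is Beta(19, 3), with mean 19/22.
    print("posterior:", posterior)
    print("mean:", beta_mean(*posterior))
    print("variance:", beta_variance(*posterior))
```

The point is that you end up with a full distribution over the rate instead of a single number, so you can see how the uncertainty shrinks as more data comes in.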
If you’re looking to get into ML, this article gives you a step-by-step plan to develop proficiency in Machine Learning. It uses FREE resources. Unlike the other boot camps/courses, this plan will help you develop your foundational skills and set yourself up for long-term success in the field.
For Machine Learning, a base in Software Engineering, Math, and Computer Science is crucial. It will help you conceptualize, build, and optimize your ML. My daily newsletter, Technology Interviews Made Simple, covers topics in Algorithm Design, Math, Recent Events in Tech, Software Engineering, and much more to make you a better developer. I am currently running a 20% discount for a WHOLE YEAR, so make sure to check it out.
I created Technology Interviews Made Simple using new techniques discovered through tutoring multiple people into top tech firms. The newsletter is designed to help you succeed, saving you from hours wasted on the Leetcode grind. I have a 100% satisfaction policy, so you can try it out at no risk to you. You can read the FAQs and find out more here.
Feel free to reach out if you have any interesting jobs/projects/ideas for me as well. Always happy to hear you out.
Use the links below to check out my other content, learn more about tutoring, or just to say hi. Also, check out the free Robinhood referral link. We both get a free stock (you don’t have to put in any money), and there is no risk to you. So not using it is just losing free money.
Check out my other articles on Medium: https://rb.gy/zn1aiu
My YouTube: https://rb.gy/88iwdd
Reach out to me on LinkedIn. Let’s connect: https://rb.gy/m5ok2y
My Instagram: https://rb.gy/gmvuy9
My Twitter: https://twitter.com/Machine01776819
If you’re preparing for coding/technical interviews: https://codinginterviewsmadesimple.substack.com/
Get a free stock on Robinhood: https://join.robinhood.com/fnud75