On this episode of The Artists of Data Science, we get a chance to hear from Kyle Polich, a computer scientist turned data skeptic. He has a wide array of interests and skills in A.I., machine learning, and statistics. These skills have made him a sought-after consultant in the data science field. He is also the host of the very popular data podcast, “Data Skeptic”, which discusses topics related to data science from the perspective of applying critical thinking and the scientific method to evaluate the veracity of claims and the efficacy of approaches.
In this episode, Kyle defines what a data skeptic is and gives advice on how to communicate effectively with leaders and executives as a data scientist. Kyle brings a unique perspective on all things data, along with actionable advice!
Some notable segments from the show
[5:28] Kyle defines “data skeptic” and describes his journey to becoming one
[17:19] The mission statement of the Data Skeptic podcast
[18:55] Is data science more of an art or science?
[23:36] Advice for data scientists trapped in a perfectionist mindset
[30:43] Important soft skills that you need to succeed
[39:40] How to communicate your ideas with executives
Kyle's journey into data science
Kyle has had a lifelong fascination with computers, and he knew he wanted to be a computer scientist from a young age. As he learned more about computer science, he also discovered A.I., which became his focus.
While in graduate school, Kyle took a part-time job at a search engine marketing company. This was, to some degree, his first experience working as a data scientist. He decided he wanted to keep working in industry rather than pursue the academic career he had originally intended.
The opportunities that followed led him to become an independent consultant, build a team that helps small and medium-sized enterprises figure out how to use machine learning, and create the Data Skeptic podcast.
[3:11] “Sure. Yeah. I mean, I guess I have had a lifelong fascination with computers. And I you know, I could've told you at four years old I was gonna be a computer scientist. That was just an obvious path. And naturally, along that journey, I became interested in artificial intelligence and that really became my focus. And as I studied that, I guess I originally thought I might go a more academic path try and go for professorship, something like that. But while in grad school, I started working a part time job just to afford to be in grad school, basically. And that was at a very unique time. And I got in a somewhat unique place. Nothing particularly special. We're a search engine marketing company, so we help small businesses use Google AdWords, basically. But as you might expect, there's a whole lot of what would eventually be called data science that went on there. And my A.I. skills transferred very well. What might not be abundantly obvious to everyone is that A.I. is very largely statistics and a lot of software design. And those two things work wonderfully in industry, especially at the time I'm talking about, which was pre a lot of things that we have today. There wasn't a cloud, there wasn't CICD, there wasn't all this kind of stuff. There was just a lot of elbow grease to do in the rudimentary versions of those. So I learned a lot of lessons about working, and decided I guess I like that better or simply was more successful with that than in academia and just kind of focused on that path.
That job led to an opportunity to move and that led out to California. And I guess the rest is history. I worked in, you know, a couple of various capacities doing different data science things. And at some point after I was at a startup that imploded, I decided I should strike it out on my own, became an independent consultant, and after about a year that started building a team and now we're, you know, we're, I guess, a media company, as you mentioned, we do the podcast, but we're most of the revenue comes from is really our work as a boutique consulting group. So we help small medium enterprises figure out how to do machine learning in the cloud, in particular with real time and streaming kinds of things.”
Where is the field headed in 2-5 years?
[8:13] On the engineering side, there's going to be a continued progression of improved tooling. That means easier and faster and better ways to do stuff. More automation, more transfer learning, more serverless, etc. What 10 data scientists can do today will be done by one in two to five years.
On the academic side, I think there's a lot of neat stuff going on in the theory of database design and in tying together ideas such as ACID compliance, the CAP theorem, Paxos, etc., and in finding unique ways to serve up tools that are customized and hyper-efficient, so that maybe some of that stuff becomes a utility, or more of a utility.
What will separate great data scientists from the rest of them?
[11:18] I think good and great often differ just by luck.
Key takeaways from the episode
What does it mean to be a Data Skeptic?
[5:28] A data skeptic is someone who takes in as much information as they can, weighs it against the evidence, and aligns with the truest version of the world that they know. As a data skeptic, I am skeptical of data, and with data, since data is a tool that can be misused.
Kyle talks to us about the mission statement of the Data Skeptic podcast
[17:19] I want to be a resource for data scientists out there wherever data or skepticism should be applied. I want the podcast to be a casual place where people get exposed to deep ideas, not hype. I want to tell the story of how data is changing the world.
Tips on communicating with executives
[39:40] 1.) Understand the dynamics of the room (who’s leading the meeting?)
2.) Know your audience (who will be attending the meeting?)
Is data science an art or science?
[19:02] Art embraces interpretation and even encourages it, and that part I'm not good with when it relates to data. Science is about getting to the truth, and the truth is not open to interpretation.
In data science, the art lies in how one executes the methodologies that lead to the truth in the data.
How the creative process manifests itself in data science
[22:07] I think most of the creative process is really about system design. How can I build something that is sustainable and maintainable and is more of a process?
Advice to those trying to break into data science
[25:51] Be honest about what you know and what you don't know, and come up with a good battle plan for learning. Figure out where you are and where you want to be, and draw the straightest line between those two.
[11:43] “...greatness is achieved by a commitment to your craft and pursuing it.”
[16:42] “The greatest trick the devil ever pulled was convincing the world he didn't exist. That's what good data science does to me.”
[24:42] “…being able to fall down but get up fast is important.”
The one thing that Kyle wants you to learn from his story
Kyle wants to share the things he's learned with people and help everyone understand that it's not always easy to manage data, store it, analyze it and leverage it. But it's well worth it because the tools and methodologies that you can learn are pretty much the most effective way to build things and to learn things and to optimize processes.
From the lightning round
The best advice Kyle has ever received
The best advice he's ever received is a classic bit of wisdom: work smarter, not harder.
He elaborated on what this meant to him: "...Ingenuity through intelligence. So find a smarter way to do your process, to automate the eat the hard parts or automate.. Figure out what's the core of the problem...Solve it not with lifting more, but with smarter techniques, better algorithms and that kind of stuff."
What motivates Kyle
A burning desire to understand the mechanism of everything I encounter.
The advice that Kyle would give to his 20-year-old self
"I feel like every good and bad choice I made brought me exactly where I am. And it'd be a sort of existential suicide to say anything else. I also don't know that I could get through to myself at that age. So I guess just keep on keeping on."
The song that Kyle has on repeat
Home Away From Home, Be Like Max
“OpenIntro Statistics” by Christopher Barr, David M. Diez, and Mine Çetinkaya-Rundel
“The Elements of Statistical Learning” by Jerome H. Friedman, Robert Tibshirani, and Trevor Hastie
Books and other media mentioned in this episode
“An Introduction to Kolmogorov Complexity and Its Applications” by Ming Li and Paul Vitányi
How you can connect with Kyle Polich
Data Skeptic Slack Channel