The Artists of Data Science Blog
Chat Transcript from Data Science Happy Hours 22, March 5, 2021
Sun, 07 Mar 2021 00:00:00 -0500
This is the awesome transcript from the happy hour session.
16:31:11 From Naresh Reddy : Hello everyone
16:31:27 From Nicholas Lowthorpe : hey guys
16:31:29 From Russell Willis : Evening All
16:31:31 From Eric Sims : Yo!
16:31:32 From Nicholas Lowthorpe : and gals
16:31:44 From Ashit Debdas : Hello everyone
16:31:56 From Nicholas Lowthorpe : we've got Albert Bellamy LIVE from the (MAU) mobile analytics unit
16:32:18 From Eric Sims : @Albert with the #48HoursofData coming up!!!
16:32:43 From Eric Sims : @Rodney - Great to see you!
16:33:54 From Russell Willis : Gelotologist, or JelloShotologist..., I think I'd be cool with learning more about either ;)
16:34:36 From Carlos Mercado : Normal, Poisson, gamma, beta, chi2, Weibull (exponential). But a lot of distributions are different scaled versions of other distributions
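A quick illustration of the "scaled versions of other distributions" point, as a minimal sketch: with the standard parameterization (assumed here, using a scale and a shape parameter), a Weibull density with shape 1 is the same as an exponential density with rate 1/scale. The function names are made up for this example.

```python
import math

def weibull_pdf(x, scale, shape):
    """Weibull density with scale (lambda) and shape (k), for x > 0."""
    return (shape / scale) * (x / scale) ** (shape - 1) * math.exp(-((x / scale) ** shape))

def exponential_pdf(x, rate):
    """Exponential density with the given rate, for x >= 0."""
    return rate * math.exp(-rate * x)

scale = 2.0
for x in (0.5, 1.0, 3.0):
    w = weibull_pdf(x, scale, shape=1.0)       # shape fixed at 1
    e = exponential_pdf(x, rate=1.0 / scale)   # rate = 1 / scale
    print(f"x={x}: weibull={w:.6f} exponential={e:.6f}")
```

The two columns agree at every x, which is one concrete case of learning distributions through their relationships rather than one by one.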
16:34:37 From Rodney Beard : @Eric hi!
16:34:49 From Eric Sims : Thanks, @Carlos!
16:35:06 From Mikiko Bazeley : I feel like it’s also dependent on domain — when I took supply chain, poisson was king
16:35:11 From Carlos Mercado : agreed^
16:35:20 From Carlos Mercado : A/B tests use Weibull a lot
16:35:25 From Akshay Mandke :
16:35:37 From Albert Bellamy : I was gonna say, no love for the Weibull???
16:35:40 From Rodney Beard : Kotz and johnson classic ref for distributions
16:35:44 From Akshay Mandke : Cassie Kozyrkov recently shared a post on distributions
16:35:51 From Carlos Mercado : Saw but didn't read that ^
16:36:01 From Thomas Ives : Sorry to cut you off Carlos!
16:36:13 From Russell Willis : @harpreet What about more basic Statistical distributions single point / 3 point , PERT, Mean, etc?...
16:36:24 From Thomas Ives : Albert! Thanks for serving brother!
16:36:29 From Rodney Beard : Triangular is terrible
16:37:33 From Carlos Mercado : Bernoulli!
16:37:57 From Carlos Mercado : Is the IBM Data Science Professional cert legit? I'm hearing some in industry disproportionately like it.
16:37:57 From Mikiko Bazeley : I think it’s good to intervene if cost is an issue as well
16:38:16 From Thomas Ives : Yes Mikiko - AGREE!
16:38:36 From Mikiko Bazeley : If someone is asking about a $500-600 class, I like to point to something more available
16:38:47 From Carlos Mercado : Oh fair point^
16:39:14 From Russell Willis : Don't try to understand ALL distributions in one bite... there are tangible relationships between some, so they can lend themselves to an organic learning journey...
16:39:16 From Carlos Mercado : I liked the Coursera Data Science Specialization (~$500). It's how I got started.
16:39:17 From Thomas Ives : Maybe take a little time to tell them there's no perfect course and that they need to focus on concepts and to keep a living ever growing learning plan.
16:39:19 From Harpreet Sahota : If you have a question - let me know! I will add you to the queue
16:39:31 From Eric Sims : Oof, that's expensive!
16:39:36 From Thomas Ives : Love your point Russell!
16:40:16 From Carlos Mercado : Eric - universities are like $60k? LOL
16:40:39 From Thomas Ives : If it's good, $500 is cheap compared to Uni's
16:40:47 From Eric Sims : Haha, yeah, but mine has a much better job placement rate than a Udemy course :)
16:41:12 From Carlos Mercado : NCSU has the actual best job placement numbers and report of any analytics program I've ever seen.
16:41:19 From Thomas Ives : Good point Eric.
16:41:22 From Greg Coquillo : I have a question
16:42:10 From Thomas Ives : Vikram!
16:42:44 From Vikram Krishna : Thom !!
16:43:07 From Ben Taylor : Levy distributions are my favorite..
16:43:11 From Kristen Davis : It’s tough though, as someone entering the field the entry level jobs out there are asking for masters or phd - seems a disconnect from the community which really embraces the power of self learning over the degree and these hiring managers emphasizing the degree not the passion / self learning
16:43:17 From Carlos Mercado : Thomas - you ever do that computer vision thing with the student who called in a few months ago?
16:43:19 From Eric Sims : Ooh, the dictionary is a really good comparison! I like that
16:43:33 From Mikiko Bazeley : I did boot camps & a program, so definitely not knocking paying for high-value programs & workshops (Reforge was $3K for a workshop), but I always like to gauge commitment before sending them down the $$$ path
16:43:44 From Eric Sims : ^
16:44:00 From Mikiko Bazeley : But going to bootcamp was the best thing for me and no regret
16:44:14 From Carlos Mercado : I'm dying bro. This answer is amazing.
16:44:17 From Austin Loveless : Glad to be back :) have had the last few weeks where I had something scheduled. Just eating atm so will hop on camera in a bit!
16:44:17 From Eric Sims : I took free and inexpensive courses before committing to a degree. My wife actually talked me into going back to school
16:44:27 From Thomas Ives : Carlos, I tried to hang in there with them on LinkedIn messaging, and we talked for a while, but then I stopped hearing from them.
16:44:40 From Carlos Mercado : dang, at least you tried!
16:44:57 From Saurabh Dixit : Which bootcamps are good.. simple to grasp and building good foundation
16:45:00 From Christian Capdeville : Vin - this hits so close to home it hurts. But even if someone would've told me, I probably would have ignored it
16:45:04 From Tor Rorvik : Questioning the PPM (Power Point Money)
16:45:17 From Ben Taylor : I wish I had been told that job security is a lie, you need to fight for market security
16:45:18 From Vikram Krishna : @Saurabh DPhi are good
16:45:24 From Eric Sims : Reality 1010 sounds like a tough class...
16:45:37 From Russell Willis : @Eric "the distribution of my course has fewer outliers, when it comes to positive job placements!..."
16:45:46 From Mikiko Bazeley : Wish they taught that in high School lol
16:45:55 From Vin Vashishta : Thanks for the book reco.
16:46:00 From Eric Sims : Amen, @Mikiko!
16:46:08 From Thomas Ives : Good point Ben
16:46:20 From Saurabh Dixit : Thanks @Vikram, @Eric
16:46:22 From Russell Willis : +1 @Ben
16:47:21 From Eric Sims : Ordering a BS detector on Amazon right now...
16:47:35 From Carlos Mercado : David's been a good counter to all my blockchain bs.
16:47:39 From Akshay Mandke : +1 prime order
16:47:46 From Russell Willis : +1 @David BS and unnecessary hyperbole, both!
16:47:53 From Eric Sims : lol, @carlos, I love it!
16:48:04 From Ben Taylor : Thomas’s book video was legendary...
16:48:09 From Carlos Mercado : I linked the google search so that people didn't necessarily buy from Amazon lol
16:48:19 From David Knickerbocker : Sorry, yes, "Calling Bullshit"
16:48:42 From Mikiko Bazeley : Love that web comic
16:48:56 From Mikiko Bazeley :
16:49:16 From Nicholas Lowthorpe : I love that advice Thom.
16:49:30 From Russell Willis : +1 Thom!
16:49:41 From Saurabh Dixit : Great message Thomas
16:50:16 From Ashen Rana : Learning Data Science vs working in data is eye opening. Need to get your feet wet in data world somehow and learning becomes less intimidating
16:50:31 From Ashen Rana : Add value = job security
16:50:38 From Carlos Mercado : My uninformed opinion on PhDs.
16:50:43 From Mikiko Bazeley : Something about penetration of ML in production: slide 40 here:
16:50:49 From Eric Sims : Hey Ben! Enjoyed your DataTalks episode with Alexey today 👍
16:50:55 From Mikiko Bazeley : Still lots of opportunity for people to enter the field
16:50:56 From Vikram Krishna : I started my Data Science journey with "Python Data Science Handbook"
16:51:02 From Christian Capdeville : Nice Thom, reminds me of: Resourcefulness vs Resources - this comes up everywhere. If you don't have the resources - good news! You're about to learn how to be very resourceful...
16:51:31 From Austin Loveless : Build that social proof by learning in public:
16:51:43 From Carlos Mercado : Austin I still post this link like 3x a day
16:51:50 From Robert Robinson : Most hiring managers don't really understand what they are looking for.
16:51:51 From Mikiko Bazeley : It’s the False Positives companies are trying to defend against
16:51:54 From Mikiko Bazeley : Supposedly
16:52:02 From Mikiko Bazeley : Is why they fall back on MS/PhD
16:52:14 From Austin Loveless : It's a great resource @Carlos. Glad to hear you're still spreading the word
16:52:40 From Russell Willis : @Thom - There have been some good posts on LinkedIn recently about unorthodox and meandering career paths often leading to wider and more resilient skillsets. @BenTaylor has been active in some of these
16:52:54 From Thom (like Thomas without the ass) Ives : Agreed Robert!
16:53:50 From Rodney Beard : Measurement theory
16:53:58 From Rodney Beard : For building intuition
16:53:59 From Carlos Mercado : Echoing Khuyen - I was on 0 people's radar prior to LinkedIn. I think personal branding is just printing value right now. It's still not fully saturated. New platforms coming out regularly and ways to leverage multiple at once.
16:54:03 From Mikiko Bazeley : Packt has a bunch of those “DS/ML for…..” books
16:54:18 From Tor Rorvik : Don't forget that the 1st round application/resume review is performed by Mr/Mrs Algorithm and not a human being nowadays, hence it's important to have the right "SEO" wording in there. The 2nd round review is human; that's when you need projects and a good storyline, and then the interview.
16:54:24 From Thom (like Thomas without the ass) Ives : Greg, That just comes with doing real world projects over time and thinking a lot and doing tons of visualizations throughout the process.
16:54:34 From Carlos Mercado : Packt asks me every week if I'll review a book and then they send me a copy and then I don't read it and then they destroy my inbox for a review
16:54:52 From Mikiko Bazeley : Reforge also has some workshops focused on experimentation & testing for Growth/product
16:54:53 From Thom (like Thomas without the ass) Ives : Good point Tor!
16:55:02 From Harpreet Sahota : If you have a question, let me know! I will add you to the list
16:55:14 From Carlos Mercado : @ Tor - Resumes are sales documents. Agreed on SEO.
16:55:17 From Eric Sims : +1 to Carlos' statement (about making value, not about Packt destroying his inbox...)
16:55:20 From Vikram Krishna : @Carlos I am in a similar situation right now!
16:55:37 From Jaya To Harpreet Sahota(privately) : I have a question from the Sunday mentoring
16:55:50 From Mikiko Bazeley : I feel attacked by the “read then do nothing” comment
16:55:53 From Mikiko Bazeley : XD
16:55:58 From Tor Rorvik : So instead of saying: I have 3 years of experience, say: I feel like I have done this for 25 years......
16:56:05 From Eric Sims : lol
16:56:17 From Russell Willis : @Greg Do you mean to use past output accuracies to determine which distributions to apply first, rather than trial and error?
16:56:53 From Carlos Mercado : YouTube is the actual GOAT resource for applied learning. You just copy them, then randomly mix it up and experiment with the code, then watch other videos that explain stuff. I could be on YouTube 10 hours/day without even thinking about it.
16:57:03 From Carlos Mercado : But yeah it's not structured to Vin's point.
16:57:27 From Thom (like Thomas without the ass) Ives : I love that Carlos!
16:57:40 From Nicholas Lowthorpe : The idea of learning from YouTube fills me with dread. I hate learning from video, unless there's a transcription I can read.
16:57:48 From Rodney Beard : Measurement theory works really well for building the link between application and distributions. A lot of it is in introductory stats books but is often not mentioned by that name, but that’s where it comes from. Examples nominal scaled, interval scaled, ratio scaled variables, count data etc. these map well to distributions although imperfectly.
16:58:01 From Thom (like Thomas without the ass) Ives : Nicholas 😂
16:58:11 From Nicholas Lowthorpe : Give me a textbook any day.
16:58:18 From Greg Coquillo : @Russell yes, if you're new in the field, how do you tackle business cases you've never seen before? How do you do that with high intuition for building the bridge between a use case and probability distribution?
16:58:26 From Thom (like Thomas without the ass) Ives : YouTube and Popular Mechanics have saved my ass at times!
16:58:37 From Mikiko Bazeley : Daniel Bourke has a great idea about the reverse kaggle
16:58:42 From Carlos Mercado : To Vin's point about the problem for bootcamps (resources) vs colleges (inflexible). The best scenario is (and was prior to like the 1990s) that corporations do the training as close to the daily work as possible.
16:59:08 From Mikiko Bazeley : Reverse Kaggle:
16:59:10 From Eric Sims : There is no safe space. I have found I can be proven wrong pretty much everywhere... :)
16:59:30 From Austin Loveless : Thanks Mikiko I was just looking for that
16:59:38 From Vivienne DiFrancesco : I'd be curious to hear from any of the experienced people about your "oh crap" moments. Are there times when you did some project and later realized that you made mistakes after implementation was already done or underway?
17:00:16 From Carlos Mercado : Google's academy for google suite, Amazon learning paths for AWS, etc. are the structure we should aim for, in my opinion. This is actually where community colleges can fill in a lot of gaps. Less overhead, more relationships to community jobs, etc.
17:00:18 From Russell Willis : I think that using "real-world" data, including anomalies and errors, is some of the best training around! If training exclusively with precleaned data sets, then the "real-world" transition will be PAINFUL!!
17:00:59 From Carlos Mercado : I like real-world; but I also like to recommend people learn data simulation.
17:01:08 From Austin Loveless : Data simulation?
17:01:11 From Carlos Mercado : Making random data that has patterns to mine is a tough problem.
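One way to read the data simulation suggestion, as a minimal sketch: generate random data with a pattern you planted yourself, then check whether your analysis recovers it. The linear pattern, the function names, and all the parameter values below are assumptions chosen for illustration, not anything specified in the chat.

```python
import random

def simulate_linear(n, slope, intercept, noise_sd, seed=0):
    """Simulate n points from y = intercept + slope * x + Gaussian noise."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [intercept + slope * x + rng.gauss(0, noise_sd) for x in xs]
    return xs, ys

def fit_line(xs, ys):
    """Ordinary least squares for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Plant a known pattern, then see whether the fit finds it again.
xs, ys = simulate_linear(2000, slope=3.0, intercept=1.0, noise_sd=0.5)
slope_hat, intercept_hat = fit_line(xs, ys)
print(f"planted slope=3.0 intercept=1.0, recovered {slope_hat:.3f} and {intercept_hat:.3f}")
```

Because the true slope and intercept are known, any gap between planted and recovered values points at the method rather than at the data.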
17:01:38 From Ben Taylor : I’m walking away for 5-10min… I’ll remain on the call. Helping throw my 6 yr old in the car so he can go to brain balance.
17:01:38 From Rodney Beard : Simulation is really useful
17:01:39 From Wilson Man : Sweet, good question to enter on.
17:01:46 From Vin Vashishta : My earliest oh crap moment was when I realized how hard sourcing data is. Not so much real world data, etc. but what happens when the business does not have the data and does not have access to the data?
17:02:09 From Christian Capdeville : Random crossover question (applies to both data projects and product creation) - do you have any favorite questions to ask users to find out what they actually want/need (rather than what they are asking you for)?
17:02:10 From Russell Willis : one counter to this is the potential to develop data trusts, in which to coalesce and aggregate multiple sources of data, for increased learning pools... However, this needs buy-in from multiple parties, to be prepared to give up their data, without fear of losing IP, or other market advantage...
17:02:23 From Thom (like Thomas without the ass) Ives : Creating Fake data is a powerful skill too IMO.
17:02:28 From Eric Sims : @Austin - I just learned today that Pandas has a built in dummy data generator!
17:02:34 From Nicholas Lowthorpe : Vivienne - I imaged assault rifles in luggage to create a small dataset resembling x-ray images (think airport baggage scanners). To flesh it out, I then used that to create tons of artificial imagery. My CNN ended up being really good at identifying photoshopped items in x-ray images...
17:02:37 From Carlos Mercado : Something I was telling a junior consultant who joined my team was, "trust, but verify". Never trust data collectors to collect in a way that is amenable to analysis. Get involved as close to data generation as possible. Data simulation helps with this skill.
17:03:09 From Saurabh Dixit : Can you repeat the book title please “Pragmatic Programming… ?
17:03:23 From Thom (like Thomas without the ass) Ives : Agreed Carlos!
17:03:25 From Harpreet Sahota : And pragmatic thinking and learning
17:03:30 From Austin Loveless : @Eric, oh wow I didn't realize that. I only knew about the "get_dummies" for one hot encoding haha
17:03:34 From Saurabh Dixit : Got it.. thanks
17:03:53 From Thom (like Thomas without the ass) Ives : It's a process Jaya!
17:03:55 From Russell Willis : -- data trusts will be entirely reliant upon being able to effectively anonymise/pseudonymise multiple data sets, so as not to allow reverse engineering of data to be identifiable from any single contributing entity!!
17:04:04 From Thom (like Thomas without the ass) Ives : Yes Harpreet! MUST 1st step!
17:04:11 From Wilson Man : Didn't know about the dummy data generator either. Thanks Eric.
17:04:14 From Eric Sims : @Austin, this article talks about it in point 2:
I think it still exists and hope it hasn't been deprecated.
17:04:19 From Thom (like Thomas without the ass) Ives : Then STAY engaged
17:04:33 From Austin Loveless : Trust the data initially as far as you can throw it.
17:04:54 From Austin Loveless : Thanks @Eric. I dig the resource
17:05:04 From Russell Willis : +1 @Thom creating artificial data for testing purposes, is invaluable!
17:05:14 From Carlos Mercado : RE: convincing managers to start data projects.

(1) Are you sure data is needed to generate value? You can generate value with easier things like Excel templates, RPA bots, and process improvements and it's almost guaranteed value. Data might not generate value.

(2) Do you know anything about the data they 'hold close'? Are you sure it's valuable? You might be way too early
17:05:27 From Vivienne DiFrancesco : +1 @ Nicholas
17:05:40 From Carlos Mercado : I think data CAN help companies, but they might not be collecting anything useful.
17:06:16 From Austin Loveless : Agreed @Carlos. Sometimes the value is actually trying to augment/enhance a process.
17:06:27 From Harpreet Sahota : If you have a question let me know, I will add you to the queue
17:06:37 From Wilson Man : Kind of on the same train of thought as Carlos. Data and Excel are not mutually exclusive, so I'm not entirely sure that they're not using data, just because they're using Excel.
17:06:41 From Thom (like Thomas without the ass) Ives : Spot on Mikiko!
17:06:43 From Carlos Mercado : Our Robotic Process Automation team has grown faster than our Analytics team for over 2 years.
17:06:45 From Akshay Mandke : add to queue
17:07:00 From Harpreet Sahota To Akshay Mandke(privately) : added
17:07:10 From Tor Rorvik To Harpreet Sahota(privately) : i like to respond
17:07:19 From Harpreet Sahota To Tor Rorvik(privately) : ok
17:07:48 From Russell Willis : RPA is a huge growth area!
17:08:25 From Mikiko Bazeley : Getting buy-in isn’t always a 1-shot thing — sometimes it needs to be built over small increments
17:09:16 From Russell Willis : +1 @Carlos, though create them an Excel template, then wait 10 seconds for them to break it!!
17:09:21 From Harpreet Sahota To Vivienne DiFrancesco(privately) : Just saw your Q I will add you in
17:09:43 From Wilson Man : Right in the feels, Russell. :'(
17:09:59 From Mikiko Bazeley : Gsheets >>> Excel
17:10:03 From Austin Loveless : A huge time saver that recently came from our finance group was automating a weekly KPI report that would take HOURs each week to pull from various data sources.

They finally pulled the data into the Business Intelligence tool and automated the exporting of the spreadsheet to go out to a list every Monday.
17:10:22 From Russell Willis : +1 @Mikiko... Especially if it is a big shift from existing paradigms!
17:10:33 From Thom Ives : Nice Russell
17:10:36 From Mikiko Bazeley : It’s a nice way to get people bought into the cloud
17:11:01 From Eric Sims : If you want an awesome no-code/low-code tool for process automation across different platforms, check out Zapier. It's my fave.
17:11:09 From Mikiko Bazeley : At one company we built a bunch of dashes out of sheets and some plug-ins
17:11:15 From Christian Capdeville : Recurring theme around how to convince people of stuff: Focus on outcomes (for them - not you)
17:11:20 From Carlos Mercado : "A lot of customer complaints" LOL
17:11:31 From Thom Ives : Think of your data science efforts in a company as being a startup, and do customer centric design for your internal customers.
17:11:36 From Eric Sims : @Christian - BAM! Yes.
17:11:50 From Mikiko Bazeley : THE Eric Sims
17:11:52 From Saurabh Dixit To Harpreet Sahota(privately) : Hi, I need to drop off now.. but it was amazing to join and hear those gems. It’s amazing to see so many awesome books that you have on the list / recommendations. Great to catch up. Good day and take care.
17:11:58 From Mikiko Bazeley : mic drop
17:12:00 From Thom Ives : Love that Christian!
17:12:06 From Carlos Mercado : A really basic thing you can do is study their process and identify the "simplest data entry". If there is something that is ALWAYS the same and they are just entering data? Can you turn that into a structured google/access form?
17:12:16 From Thom Ives : Eric, Accept your fame and popularity!
17:12:24 From Christian Capdeville : Eric is the man!
17:12:24 From Thom Ives : U Duh Mahn!
17:12:28 From Eric Sims : Haha :)
17:12:46 From Mikiko Bazeley : All about performance over baseline
17:12:54 From Ben Taylor : I love Mikiko’s toy story background
17:13:03 From Mikiko Bazeley : <3
17:13:05 From Carlos Mercado : Get that used to the idea that "Excel has limits". Because right now they have PROOF that Excel = profit. You need to build proof that Not Excel = MORE profit.
17:13:11 From Carlos Mercado : get them used*
17:13:47 From Vivienne DiFrancesco To Harpreet Sahota(privately) : Cool, thanks!
17:14:09 From Mikiko Bazeley : The Challenger Sale is a great book about bringing people to the precipice of fear
17:14:22 From Mikiko Bazeley : And then selling them an enterprise solution
17:14:49 From Christian Capdeville : Absolutely Vin - Incentives are everything
17:15:00 From Russell Willis : For scrappy solutions, one potentially ugly truth is that improving may entail changing things and monitoring the shift, to keep positive shifts and undo negative shifts... Guerrilla improvement!
17:15:48 From Nicholas Lowthorpe : 2 marketing mantras:

  1. "sell the benefits, not the features" is key for getting your idea across the line. If you've identified a problem, then present your solution as a benefit that is as close to their world as possible. Revenue, EBITDA, etc.

  2. "people will walk to pleasure but run from pain" - benefits are a great sell, but often a stronger sell is to talk about the pain someone will encounter if they don't act.
17:15:54 From Nicholas Lowthorpe : They're highly applicable
17:16:10 From Russell Willis : +1 @Carlos also Excel is a well worn comfort blanket for many!!
17:16:37 From Carlos Mercado : Excel is the worst possible tool for every job in business- but it's always possible.
17:17:23 From Austin Loveless : It's Turing complete I've heard, anything must be possible with it
17:17:56 From Christian Capdeville : Carlos - not to nitpick, I'm actually interested in your answer to this: Is there a better tool than excel to create custom financial/economic models?
17:18:10 From Mikiko Bazeley : Google Sheets
17:18:18 From Mikiko Bazeley : You keep saying Excel
17:18:24 From Mikiko Bazeley : When you should be saying Google Sheets
17:18:33 From Rodney Beard : Python is good for economic models
17:18:36 From Ben Taylor : Find a problem that causes them pain…
17:18:36 From Russell Willis : @Carlos I'm not so sure?... There's always Power Point!!
17:18:57 From Ashit Debdas : @carlos yes there is a high chance. 99.99% of my current work is in Excel
17:19:49 From Austin Loveless : That's wonderful Akshay!
17:19:57 From Eric Sims : Woohoo! So awesome.
17:20:01 From Kristen Davis : Congrats to them!
17:20:08 From Tor Rorvik : Awesome news Akshay....
17:20:15 From Austin Loveless : They got some great feedback from the group here, glad they used it to make some solid improvements :)
17:20:39 From Carlos Mercado : Is there a better tool than excel to create custom financial/economic models?

R Shiny, Python Dash. Anything that brings reproducibility and scaled parameter manipulation to the model.
17:20:42 From Thom Ives : Agreed Austin
17:21:13 From Carlos Mercado : Amazon Mechanical Turk
17:21:16 From Carlos Mercado : Fiverr
17:21:25 From Thom Ives : Great Advice Ben!
17:21:27 From Robert Robinson : $10 :-)
17:21:40 From Mikiko Bazeley : I feel like my answer is always some combination of requests, json_normalize but regex patterns
17:21:53 From Mikiko Bazeley : & regex patterns
17:22:01 From Thom Ives : Scraping does become specialized with each new problem!
17:22:19 From Mikiko Bazeley : I’ll test out the patterns on
17:22:47 From Wilson Man : Money solves so many problems lmao
17:22:49 From Austin Loveless : That's a great resource Mikiko!
17:22:51 From Thom Ives : If you do a lot of Beautiful Soup, not a big deal, but if you do it here and there - OUCH!
17:23:20 From Carlos Mercado : Money Solves [.*]
17:23:30 From Emanuel Vassiliadis : Chat parser (in Perl!):
17:23:34 From Wilson Man : I see what you did there, Carlos.
17:23:36 From Mikiko Bazeley : There is value to learning how to parse html though
17:23:41 From Austin Loveless : Thanks for that Emanuel!
17:23:46 From Carlos Mercado : Rvest is great for html parsing
17:23:49 From Thom Ives : Right Mikiko
17:24:04 From Ben Taylor : Darn.. I was trying to log into upwork to take a screenshot of the $10 to prove I wasn’t full of shit but my google account is no longer valid… no proof that what I said is true.
17:24:31 From Thom Ives : Bummer Ben
17:24:49 From Carlos Mercado : @Wilson thank you
17:24:58 From Thom Ives : I have just a few words on this.
17:25:12 From Austin Loveless : You make it even harder to believe you now Ben. Sounds like an excuse.
17:25:14 From Harpreet Sahota To Thom Ives(privately) : Go for it
17:25:24 From Vin Vashishta : Mikiko nailed it
17:26:07 From Mikiko Bazeley : (This is like my life right now: preparing for data structures & algos and going through the Standard library)
17:26:18 From Wilson Man : We do? Speak for yourself! ;)
17:26:24 From Mikiko Bazeley : XD
17:26:49 From Akshay Mandke : Thanks for all the inputs in the chat
17:26:55 From Mikiko Bazeley : Yes, every project
17:27:02 From Carlos Mercado : Just to have my joke explained in the chat. [.*] means "begins with anything, and anything else" i.e. "All".
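A side note on the notation for anyone reading along, as a small sketch using Python's `re` module: the "match anything" pattern is `.*` without brackets, while `[.*]` is a character class that matches one literal dot or one literal asterisk. The sample string is made up for this example.

```python
import re

text = "Money solves a.b*c"

# ".*" : any character (except newline), repeated zero or more times.
match_all = re.fullmatch(r".*", text)

# "[.*]" : a character class matching a single literal "." or "*".
literal_hits = re.findall(r"[.*]", text)

print(match_all is not None)  # the whole string matches ".*"
print(literal_hits)           # only the literal dot and asterisk are found
```

The joke still lands either way; the brackets just change what the pattern means to a regex engine.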
17:27:09 From Russell Willis : +10 @Mikiko!!
17:27:30 From Kurtis Pykes : @akash - (Web Scraping Book)
17:27:52 From Rodney Beard : Biggest mistake: a sorting error while grading an 1800 student Stats class.
17:27:52 From Carlos Mercado : 20 TB of DATA
17:27:56 From Mikiko Bazeley : Serverless ftw
17:28:01 From Kurtis Pykes : @akshay * sorry about spelling
17:28:08 From Carlos Mercado : Cloud, i.e., a computer in Virginia.
17:28:09 From David Knickerbocker : I effectively blew away all of the data in my production database 20 years ago lol
17:28:23 From David Knickerbocker : with no backup....
17:28:30 From Christian Capdeville : Great time as usual everyone - have a great weekend!
17:28:44 From Thom Ives : Ouch Ben! But awesome share dude!
17:28:49 From Mikiko Bazeley : I’ve seen at least 3 companies (that I got emails from) where some junior engineer killed customer data
17:28:58 From Akshay Mandke : @kurtis- no prob. and thanks for the link to the book
17:28:59 From Russell Willis : @ben that data wasn't backed up anywhere else in the data hall?
17:29:18 From Thom Ives : I know I should have a story to top Ben's but I must have blocked it from my memory.
17:29:45 From Ben Taylor : Thom I’m supposed to endorse the same book you did last week and now I’m at a loss on how I can 1-up you… you won that fight
17:30:07 From Carlos Mercado :
School calls: "Did you really name your child Robert'); DROP TABLE students;--"
Dad: "Bobby Tables we call him!"
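For context on the Bobby Tables joke, a minimal sketch of the usual defence: bind user input through query placeholders instead of pasting it into the SQL string. This uses Python's built-in `sqlite3` with an in-memory database; the table layout is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT)")

name = "Robert'); DROP TABLE students;--"

# Placeholder binding: the value is passed as data, never parsed as SQL text.
conn.execute("INSERT INTO students (name) VALUES (?)", (name,))

# The table still exists and holds the odd name as an ordinary string.
rows = conn.execute("SELECT name FROM students").fetchall()
print(rows)
conn.close()
```

Building the same statement with string formatting is what would let the embedded `DROP TABLE` reach the database as a command.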
17:30:17 From Thom Ives : I envy you for biggest pain story though. That's pretty good!
17:30:17 From Austin Loveless : I love bobby tables
17:30:33 From Ben Taylor : I have a fail that ended up with me drinking on a plane (raised Mormon, don’t normally drink) and saying F-the-world… hahaha Data Problems...
17:30:43 From Thom Ives : I know you can best me Ben - become one with your inner producer!
17:31:13 From Carlos Mercado : Bubble Sort is not your friend lol.
17:31:14 From Wilson Man : Common theme here seems to be: backing up your data is a good idea. :D
17:31:21 From Carlos Mercado : Version Control your Data!
17:31:21 From Eric Sims : Wait, so Excel isn't a database??
17:31:32 From Mikiko Bazeley : I spent 3 months on a project that got solved in a 1 hr conversation
17:31:47 From Thom Ives : Mikiko - That's awesome!
17:32:04 From Austin Loveless : Oof… lol At least it got solved
17:32:15 From Harpreet Sahota : Tor, vijay, then vin
17:32:53 From Mikiko Bazeley : After spending the three months lol
17:33:03 From Mikiko Bazeley : Like actively working on it lol
17:33:10 From Thom Ives : Great lesson though Mikiko ;-)
17:33:11 From Austin Loveless : "We wasted 100s of hours and destroyed the relationship, but I think it was worth it" - Carlos
17:33:28 From Akshay Mandke : I need a frame that says this ^
17:33:34 From Mikiko Bazeley : Cross-stitched
17:33:38 From Eric Sims : Mark!!
17:33:48 From Mikiko Bazeley : In a cute little embroidery hoop
17:33:50 From Ben Taylor : People that become “Linux bash pros” often over do it… and have oh shit moments…
17:33:53 From Emanuel Vassiliadis : Chat parser relies on text input from Harpreet. If anyone has ideas to automate upload and processing, let me know afterwards.
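As a rough idea of what the processing half of such a parser might look like, here is a minimal Python sketch. It is not Emanuel's Perl parser; the line format is inferred from this transcript (`HH:MM:SS From Sender : text`, with an optional `To Recipient(privately)`), and the names `ChatMessage`, `LINE_RE`, and `parse_chat` are hypothetical.

```python
import re
from typing import NamedTuple, Optional

class ChatMessage(NamedTuple):
    time: str
    sender: str
    recipient: Optional[str]  # set only for private messages
    text: str

# Assumed format: "HH:MM:SS From Sender [To Recipient(privately)] : text"
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}) From (?P<sender>.+?)"
    r"(?: To (?P<recipient>.+?)\(privately\))? : ?(?P<text>.*)$"
)

def parse_chat(lines):
    """Parse chat lines; lines without a timestamp continue the previous message."""
    messages = []
    for raw in lines:
        line = raw.rstrip("\n")
        m = LINE_RE.match(line)
        if m:
            messages.append(
                ChatMessage(m["time"], m["sender"], m["recipient"], m["text"])
            )
        elif messages and line.strip():
            last = messages[-1]
            joined = (last.text + "\n" + line.strip()).strip()
            messages[-1] = last._replace(text=joined)
    return messages

sample = [
    "16:31:11 From Naresh Reddy : Hello everyone",
    "17:07:00 From Harpreet Sahota To Akshay Mandke(privately) : added",
    "17:30:07 From Carlos Mercado :",
    'Dad: "Bobby Tables we call him!"',
]
msgs = parse_chat(sample)
for msg in msgs:
    print(msg)
```

Feeding it a saved chat file would be a matter of passing the open file object to `parse_chat`; automating the upload itself is a separate problem this sketch does not touch.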
17:33:56 From Thom Ives : Austin - classic!
17:33:58 From Carlos Mercado : I gotta drop y'all. Thanks though!
17:34:04 From Akshay Mandke : Bye Carlos
17:34:06 From Harpreet Sahota : @Mark we are talking about fails at work, do you want to share
17:34:06 From Austin Loveless : Carlos, can you tweet that so I can get it engraved as a laser tweet?
17:34:09 From Harpreet Sahota : Peace out Carlos!
17:34:09 From Thom Ives : Ben - I've been there!
17:34:10 From Austin Loveless : Have a good one Carlos!
17:34:14 From Ben Taylor : I’ve had plenty of linux commands where I thought I was a linux bad ass and then executed terrible commands… oh geez
17:34:18 From Mikiko Bazeley : Bye Carlos!
17:34:19 From Vivienne DiFrancesco : I feel like there's a lot of good one liners to keep from these stories
17:34:21 From Mark Freeman : OH DO I!
17:34:23 From Carlos Mercado : Austin whats your twitter, I'll add you on my fake twitter
17:34:34 From Austin Loveless : @ALovelessGuru
17:34:40 From Carlos Mercado : bet
17:34:48 From Austin Loveless : Can't wait
17:34:57 From Mark Freeman : Let me know when to share Harpreet!
17:34:59 From Thom Ives : Take care Carlos!
17:35:07 From Mikiko Bazeley : @ben: I recently went on a zsh spree — locked my computer unzipping 50 gb high-res images for a CV project
17:35:34 From Austin Loveless : ope!
17:35:38 From Mikiko Bazeley : Again - when I was just trying to add a pipe to tree to output the folder structure lol
17:35:42 From Robert Robinson : doh!
17:36:17 From Mikiko Bazeley : Somebody HAS to get fired
17:36:35 From Robert Robinson : Someone has to go to jail. That's the rule. LOL
17:36:42 From Vikram Krishna : LOL
17:36:44 From Mikiko Bazeley : I mean I’m offended as a Japanese person
17:36:52 From Mikiko Bazeley : But also that’s kind of true
17:37:39 From Harpreet Sahota : @Mark - After Vin
17:37:57 From Harpreet Sahota : @Greg after Mark
17:37:59 From Eric Sims : @Vivienne, this question is pure gold.
17:38:14 From Mikiko Bazeley : With that being said, in general, probably best not to make jokes about someone’s ethnicity, gender, or religion
17:38:15 From Austin Loveless : It's pure Bitcoin. We going to the moon
17:38:20 From Eric Sims : LOL
17:38:28 From Mikiko Bazeley : Or sexual orientation
17:38:55 From Austin Loveless : Good call out Mikiko.
17:38:56 From Thom Ives : Ryan Reynolds' Mint Mobile Company has a commercial about a guy that can't remember his Bitcoin pass phrases to his Quarter Billion Dollar Bitcoin account - That guy has us ALL beat!
17:39:31 From Ben Taylor : I watched someone cry once when our 5TB raid failed at the hedge fund. 5 guys standing in a room watching our head of IT cry… good memories
17:39:50 From Mikiko Bazeley : Q.Q
17:40:46 From Thom Ives : Oh wow Ben. That's rough.
17:40:51 From Mikiko Bazeley : Mark- Speaking the story of my life with sales
17:41:33 From Vin Vashishta : Do not share are 3 words the sales team does not understand
17:41:34 From Mikiko Bazeley : (Plot twist: We later find out they were all pictures of the actors who’ve ever played Captain Hook)
17:42:22 From Thom Ives : Vivienne, You've opened a flood gate!
17:42:25 From Eric Sims : Gotta drop off - sad to miss out on the rest of this! Have a good weekend, all!
17:42:32 From Thom Ives : Bye Eric!
17:42:33 From Austin Loveless : Have a good weekend Eric!
17:42:37 From Akshay Mandke : Bye Eric ! Have a good weekend
17:42:41 From Vikram Krishna : Bye Eric
17:42:42 From Vivienne DiFrancesco : So glad I asked this question!
17:43:06 From Rodney Beard : Bye Eric
17:43:20 From Thom Ives : Vivienne, You should be! Many things to help you feel better in the future!
17:43:42 From Ben Taylor : I used to own a rock climbing e-commerce store and a cat pissed on some of our new inventory, one of my co-founder's cats. Cat piss corrodes metal, had to throw away $500-1000 of product? Seems like a small number but it hurt
17:44:05 From Mikiko Bazeley : Not a single hair then Thom
17:44:11 From Ben Taylor : I’m 37 though
17:44:19 From David Knickerbocker : I'm only 19
17:44:22 From Austin Loveless : I have like 3 wisps I'm catching up
17:44:25 From Ben Taylor : Grayest 30 year old ever
17:44:57 From Austin Loveless : I have an uncle who was gray at 19.. didn't know that could happen
17:44:59 From Rodney Beard : @David Knickerbocker, how does negative age work for making mistakes?
17:45:50 From Robert Robinson : The grass isn't always greener, but sometimes it is.
17:46:25 From Vin Vashishta : You say that like it's a bad thing
17:46:51 From Robert Robinson : Go for it, Harpreet!
17:46:56 From Mark Freeman : I’m super patchy on my face… One day I’m just going to pull the trigger and get a fake beard :P
17:47:20 From Mikiko Bazeley : Not wrong at all
17:47:43 From Mikiko Bazeley : And it’s one of the defining features of some of the key improvements made in convenient nets architectures
17:47:53 From Mikiko Bazeley : *ConvNets
17:48:48 From Mikiko Bazeley : Memory optimization can be important
17:48:48 From Austin Loveless : It's work
17:49:03 From Austin Loveless : He's trying to get the scrapers off his computer and into the cloud
17:49:25 From Thom Ives : Chrome is kind of memory lite I thought.
17:49:27 From Vin Vashishta : Memory, network bandwidth, drive I/O off the top of my head.
17:49:40 From Thom Ives : Agreed Ben
17:49:53 From Thom Ives : Something just sounds off.
17:50:22 From Thom Ives : Experiment
17:50:23 From Austin Loveless : I think he was initially on the free plan and needed to scale up was the response given last time, but no idea what he did from that last week
17:51:14 From Ben Taylor : KISS
17:51:18 From Ben Taylor : Works…
17:51:19 From Thom Ives : Ben said experiment - I agree with that
17:51:56 From Robert Robinson : Great idea!
17:52:03 From Thom Ives : If Elon can build a tunnel under LA, can't we give awards!
17:52:11 From Akshay Mandke : I filled this one today. So excited
17:52:21 From Vin Vashishta : This is an amazing idea.
17:52:24 From Thom Ives : Get out and VOTE!
17:52:34 From Thom Ives : Filled it out today!
17:52:42 From Robert Robinson : Vote early, vote often. ;-)
17:53:20 From Robert Robinson : Is Dominion in charge of the results?
17:53:37 From Thom Ives : Robert :-D
17:53:40 From Austin Loveless : I'll see if Carlos will post it on his fake twitter.
17:53:52 From Harpreet Sahota :
17:54:03 From Ben Taylor : My fake social media account was quoted by USA Today…
17:54:14 From Mikiko Bazeley : Lol fake twitter
17:54:23 From Ben Taylor : Fake me is more famous than real me...
17:54:36 From Wilson Man : Fame is incredibly overrated anyway.
17:54:49 From Mikiko Bazeley : Money is not lol
17:54:55 From Angelo : Good to see you all, have an awesome weekend!
17:54:56 From Kristen Davis To Harpreet Sahota(privately) : Will you drop the link to the Sunday group?
17:54:57 From Ben Taylor : Fame can be engineered… wouldn’t be surprised if the top influencers in the next 5-10 years turn out to not be real, all AI
17:55:00 From Vijay Kumar : Whats the link for sunday zoom
17:55:02 From Thom Ives : Fake us is less inhibited!
17:55:10 From Wilson Man : Money is underrated! haha
17:55:32 From Russell Willis : +1 @Ben There are already fake/AI social media profiles with huge followings and interaction...
17:55:36 From Akshay Mandke : For me personally - Passion, Experimentation, Self- Mentorship, Being Fearless
17:55:39 From Vin Vashishta : Deep fake Vin is more fun at parties
17:56:33 From Jillian Katz : can someone share the info for the Sunday morning calls?
17:56:39 From Mark Freeman : I have a rule that if I give an employer 40hrs, I give myself 10hrs. Used this to learn python and ml!
17:56:50 From Vivienne DiFrancesco : Can someone share the group slack?
17:57:05 From Harpreet Sahota :
17:57:16 From Harpreet Sahota : Sorry,
17:57:19 From Harpreet Sahota : That’s the Sunday session
17:57:21 From Ben Taylor : Ok… proof that everything I say is true even if it sounds like bullshit:
17:57:22 From Ben Taylor :
17:57:29 From Ben Taylor : USA Today article with my fake account
17:57:37 From Ben Taylor : EDITOR'S NOTE: USA TODAY has learned that a source in this story, Klara Jonsson, was using an alias. We still believe the analysis to be correct.
17:57:41 From Harpreet Sahota : Slack: 👋 Let’s move this to Slack! We’ve got 382 folks from the team there already. You can sign up here:
17:57:42 From Ben Taylor : USA Today was PISSED!!
17:58:03 From Vivienne DiFrancesco : Thanks, Harpreet.
17:58:23 From Vivienne DiFrancesco : Thanks everyone who shared this week! This group is great. See ya next time!
17:58:42 From Vikram Krishna : Bye Vivienne
17:58:51 From Austin Loveless : Bye Vivienne!
17:59:05 From Thom Ives : I struggle with it too Vin and Ben
18:00:28 From Robert Robinson : Great advice, Vin.
18:01:01 From Mikiko Bazeley : We’re all human — and growing in data science and machine learning is the expression of uniquely human traits, perseverance in the face of seemingly unconquerable odds
18:01:26 From Austin Loveless : This article is hilarious @Ben
18:01:40 From Mikiko Bazeley To Harpreet Sahota(privately) : Greg & Kyle have great stories
18:01:47 From Mikiko Bazeley To Harpreet Sahota(privately) : *Kurtis
18:01:59 From Harpreet Sahota To Mikiko Bazeley(privately) : Thanks, I will bring them on
18:02:09 From Vin Vashishta : What Mikiko said. WOW.
18:02:27 From Mark Freeman : I 100% felt that psychological safety when I broke our product this week. It made such a huge difference. I just had to focus on fixing the problem and sharing with others how to improve our systems. Compared to other jobs where it was full of finger pointing and cover your ass.
18:02:29 From Austin Loveless : @Mikiko I think that many of us come from varied backgrounds and it creates that unique story for what got us excited about it.
18:03:03 From Mikiko Bazeley : @Austin: Absolutely — it was a good first lesson to not take Silicon Valley at first sniff
18:04:25 From Robert Robinson : Goodnight, everyone. Need to run. See you next week. Thanks for the office hours.
18:04:35 From Austin Loveless : Have a good weekend Robert!
18:04:52 From Austin Loveless : Tutorial Hell.. yeah
18:05:05 From Vin Vashishta : Me too.
18:05:15 From Mikiko Bazeley : The Mikiko Effect
18:05:16 From Mikiko Bazeley : Love it
18:05:18 From Mikiko Bazeley : TM
18:05:19 From Austin Loveless : I love it
18:05:44 From Thom Ives : Mikiko's brain needs to be cloned ... or is it her spirit? 🤔
18:05:45 From Tor Rorvik : Another great session.... have a great weekend all..... my pillow is calling me at 1am...... Stay safe..... See you Sunday......
18:05:52 From Austin Loveless : @Thom both for sure
18:05:59 From Austin Loveless : @Tor have a good one!
18:06:05 From Thom Ives : Agreed Austin!
18:06:09 From Thom Ives : Bye Tor!
18:06:10 From Akshay Mandke : Bye Tor
18:06:37 From Thom Ives : Greg is the Top DS student in the World!
18:06:41 From Mikiko Bazeley : Spirit for cloning, brain could be replaced by 30% of Netflix, 30% of 90’s romcom, 40% of the personal self-help selection
18:06:52 From Thom Ives : He gets DS as a businessman too!
18:07:05 From Thom Ives : Mikiko - LOVE IT!
18:07:19 From Austin Loveless : Those are good percentages though!
18:07:41 From Thom Ives : Love 190% = Priceless!
18:08:04 From Thom Ives : DOH! 90's not 90%
18:08:12 From Thom Ives : OK = 100%
18:08:39 From Austin Loveless : Math on a Friday evening... I feel you @Thom
18:09:03 From Thom Ives : 90's threw me off - wrong units issue ;-)
18:09:07 From Mikiko Bazeley : I love your posts Kurtis and greg
18:09:24 From Austin Loveless : Thank you everyone!
18:09:25 From Mikiko Bazeley : Just great material
18:09:30 From Greg Coquillo : Thank you Mikiko!
18:09:32 From Vikram Krishna : Thank you everyone!!
18:09:36 From Greg Coquillo : You ROCK
18:10:08 From Jaya : Thanks everyone
18:10:26 From Vikram Krishna : Thanks for the session guys. Happy weekend :)
18:10:26 From Akshay Mandke : Have a good weekend everyone
18:10:27 From Austin Loveless : Have a great weekend all!
18:10:46 From Mikiko Bazeley : Bye everyone! Have a great weekend!

Chat Transcript from Data Science Happy Hours 14, Dec 18 2020
Sun, 20 Dec 2020 00:00:00 -0500
The full chat transcript from the Happy Hour on Dec 18th. Lots of excellent questions being asked and some amazing answers.
16:31:14 From Thomas Ives : Susan Walsh is in the ROOM!
16:31:25 From Susan Walsh : just!!
16:32:38 From Eric Sims : Unbalanced data and unbalanced data scientists?
16:33:26 From Akshay Mandke : Sometimes you need to take a few steps back to leap forward. I think it relates to balance
16:34:26 From Carlos Mercado : @ Eric whats up
16:34:37 From Eric Sims : Yo!
16:34:52 From George Firican : Hi everyone!
16:35:23 From Ashit Debdas : Hello everyone
16:35:32 From Greg Coquillo : Hi Team!!
16:35:33 From Thomas Ives : George!
16:35:34 From Ashen Rana : Hi all! Spilled a drink on my laptop right at the beginning :D
16:35:41 From Lalita : hello everyone
16:36:00 From Carlos Mercado : oh noooo @ Ashen upside down + dehumidifier + rice ??
16:36:04 From George Firican : Hope all is ok, Ashen
16:36:17 From Ashen Rana : Lol I might have to dry the rice technique Carlos
16:36:24 From Carlos Mercado : @ Greg whats up man, got some more stump us questions this week?
16:36:32 From Carlos Mercado : @ Mark whats good bro
16:36:34 From Ashen Rana : Managing it for now, George - thanks! Seems okay for now thankfully
16:36:54 From Naresh Reddy : Hello everybody!
16:37:04 From Eric Sims : Hey @Naresh!
16:37:04 From Greg Coquillo : lol @Carlos I'm more on the learning flow today. Might have some questions
16:37:32 From Carlos Mercado : I asked a PhD in biophysical chem that question of yours on AlphaFold, still waiting for the reply, will send that answer to you on LI.
16:37:40 From Mark Freeman : @Carlos hey! Happy to see you in office hours this week!
16:38:05 From Harpreet Sahota : If anyone has questions - send me a message!
16:38:07 From Carlos Mercado : Trying to do every other week at a minimum. These are so big now!
16:38:07 From Greg Coquillo : @Carlos that would be awesome! thank you
16:38:11 From Dave Langer : I would offer that analytics is like any other applied STEM field in business (e.g., software engineering).
16:38:41 From Dave Langer : That is, to stay relevant you need to constantly invest in your skills - and that usually happens outside of work hours.
16:38:51 From Saurabh Dixit To Harpreet Sahota(privately) : For someone starting new…. and looking at so many branded Data Science programs … :
16:39:28 From Dave Langer : For many, this level of consistent effort/investment year after year is too much sacrifice. If you love it, however, it's awesome.
16:39:30 From Ashen Rana To Harpreet Sahota(privately) : What are some trends that you’ve noticed on LinkedIn recently? Did you take a course on “how to LinkedIn”?
16:39:42 From Joe Reis : ^ this
16:39:57 From Saurabh Dixit To Harpreet Sahota(privately) : Tableau Data Scientist, SAS Data Scientist, Microsoft Azure Data Scientist programs… how relevant and useful are they in the industry? Is it worth pursuing any of these?
16:40:52 From Eric Sims : Welcome, Saurabh!
16:40:55 From Dave Langer : Just want to do a shoutout - Ameya in the house!
16:41:22 From Ameya Dhaygude : Thank you Dave :)
16:41:29 From Ameya Dhaygude : Hi Everyone
16:41:52 From Carlos Mercado : I would just remind people reading this later; that it doesn't HAVE to be outside work hours. People tend to grind their jobs, never saying no, and then get surprised that they have no time.

If your job benefits from you being up to date on skills and knowledge, then it's PART of your job to do that. Whenever possible, carve out a few hours a week of self-time, locked down time on your calendar, to read newsletters, read papers, etc. And if it benefits your job then it counts as your job.
16:42:04 From Mark Freeman : I have a question on people’s process for debugging their code efficiently. What’s your personal order of operations as a flexible guide, or any tools you use?
16:42:06 From Florin Badita - Corruption Kills : One of my pet hobby projects is scraping a list with most of the users on the Interert. I currently have around 500M users
16:42:11 From Carlos Mercado : I agree with you @ Dave, just adding my consulting feeling.
16:42:15 From Akshay Mandke : Role based certifications by Microsoft are really great. I recently did one and I can concur it is beneficial to apply your skills.
16:42:47 From George Firican : That's impressive, Florin
16:42:55 From Florin Badita - Corruption Kills : My question was if you know scientists working on this, my aim would be to be able to do hobby based discovery of persons
16:42:56 From Ashen Rana : Good point, Carlos!
16:43:54 From Jennifer Nardin To Harpreet Sahota(privately) : Might start a war but…. As a business-side-data-geek, I want to deep-dive into either R or Python over vacation. Which should I choose? (Organization doesn’t care, nearly any tool available) I would ask for responders to give me 1 reason why to choose their language of choice and 1 reason why it should not be the other language.
16:43:54 From Akshay Mandke : Tools are simply enablers for data science. I think it begins with what problem you want to solve and do you have the 4Vs of data answered well before starting your DS journey.
16:44:16 From Ameya Dhaygude : Question for the experts - I am a SAS veteran, using it for the last 10 years. My company stopped using SAS and is moving to Azure. Can you share your experience and advice on managing this transition?
16:44:37 From Jennifer Nardin To Harpreet Sahota(privately) : If you’ve covered that, or if it creates contention, then I am happy to just choose offline :)
16:44:46 From Carlos Mercado : @ Mark on debugging. There are 3 levels of debugging (and probably more if you are legit in software development).

1 - Commenting out code and trying again
2 - formal debugging with traceback with something like browser() in R to change your environment
3 - formal unit testing (including automated testing) that return set outputs, e.g. unexpected classes, failed asserts, etc.

Maybe there's a 4th but that's my typical range.
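A minimal Python sketch of those three levels, for readers following along later. The `parse_age` helper and its tests are hypothetical and not from the chat; `breakpoint()` is only a rough Python counterpart to the `browser()` call mentioned for R.

```python
import unittest


def parse_age(raw):
    """Hypothetical helper: turn a raw string such as ' 42 ' into an int."""
    # Level 1: comment lines in or out and re-run to narrow the problem down.
    # print(repr(raw))
    cleaned = raw.strip()
    # Level 2: uncomment to pause here and inspect variables interactively.
    # breakpoint()
    return int(cleaned)


class ParseAgeTest(unittest.TestCase):
    # Level 3: unit tests pin down expected outputs and expected failures.
    def test_strips_whitespace(self):
        self.assertEqual(parse_age(" 42 "), 42)

    def test_rejects_non_numeric(self):
        with self.assertRaises(ValueError):
            parse_age("forty-two")
```

Saved to a file, the tests can be run with `python -m unittest <file>`, which is one way to get the automated checking described in level 3.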
16:45:04 From Matt Housley : My perspective is that R is a really nice transitional language from SAS into the data science mainstream.
16:45:32 From Srivatsan Srinivasan : Hi all.. Good to see you all in one place :)
16:45:40 From Eric Sims : Someone needs to make the "Data Science Unicorn Unicourse" - One course to rule them all...
16:45:44 From Akshay Mandke : Yeah I like R’s functional approach. You can layer around your code as you explore the data. Great if you are a statistician or exploring data for sales and making charts using ggplot.
16:45:59 From Greg Coquillo : +1 @Srivatsan!
16:46:10 From Carlos Mercado : @ Ameya

SAS -> Azure is a bit of a strange way of wording it. SAS the statistical language is easy to swap into something like R. SAS the integrated development workflows with SAS Viya, etc. switching to Azure is like switching to a full cloud infrastructure. You have to care about VMs, database architectures, Containers (Kubernetes), etc. I think we'd need more info.
16:46:23 From Akshay Mandke : Hi Greg, good to see you.
16:46:56 From Carlos Mercado : SAS code just works, always, it's all backtested. In an open source programming language, dependency management starts mattering in ways that SAS just "works" for you.
16:47:00 From Greg Coquillo : You too AkshaY!
16:47:04 From Matt Housley : Once you know R, various data frame paradigms across different frameworks will make sense. Spark, Python/Pandas, etc.
16:47:38 From Mark Freeman : @Carlos unit tests have been REALLY helpful for my current project. I’m currently struggling with reading the error statements and stack traces when it’s not obvious.
16:47:48 From Eric Sims : Don't post links in your own posts!!!
16:47:50 From Carlos Mercado : that's because you're not using FP lol
16:48:14 From Carlos Mercado : @ Eric. Correct, anything that makes your audience leave LinkedIn is suppressed. The algorithm is always changing.
16:49:33 From Ashen Rana To Harpreet Sahota(privately) : What are your expectations from someone in a Junior role?
16:49:52 From Giovanna Galleno : your style is amazing Susan! I love your tone of voice! You rock! :D
16:49:58 From Carlos Mercado : yeah, to Dave's point. The fundamentals transfer - SQL isn't going anywhere. Designing databases isn't going anywhere. OOP/FP isn't going anywhere. Understanding debugging, classes, O Notation, etc.

16:50:19 From Akshay Adlakha : Agreed Carlos
16:51:44 From Carlos Mercado To Harpreet Sahota(privately) : You can remind people who are just chilling, that the chat is very active and answers will be recorded in the show notes. Lots of questions in here.
16:51:59 From Saurabh Dixit : Great point ..Susan, Giovanna … Be yourself .. fail, learn .. Thanks.
16:52:09 From Matt Housley : I would argue the SQL has grown more important. In the early days of Hadoop, you had to write your own raw map reduce jobs in Java. The advent of Hive changed that, and now SQL is an extremely powerful language at scale. SQL is still annoying sometimes, but it’s a baseline skill.
16:52:16 From Lalita : I have one question regarding landing a entry level machine learning or data science position.
16:52:26 From Eric Sims : Remember, Harpreet interviewed Greg before he was famous!
16:52:26 From Dave Langer : Regarding LinkedIn I would suggest the first thing to do is to determine why you are doing it. Growing a LinkedIn audience is a lot of work. For example: Do you want to establish your "authority" in a particular domain? If so, guide your content accordingly.
16:52:27 From Ray Givler : The other facet is data viz. Human perception isn't going to change any time soon. Clarity won't go out of style.
16:52:41 From Florin Badita - Corruption Kills : Carlos, totally agree . Just started reading Rapid Development by Steve McConnell, written in 1994, and the book is so up to the point and still relevant 25 years later
16:52:43 From Saurabh Dixit : Thanks Srivatsan and Dave… point taken.. focus on foundational skills, transition on the basis of domain knowledge… Thanks a lot.
16:52:44 From Giovanna Galleno : Thank you Saurabh :D
16:53:24 From Ray Givler : @Florin. I read that book in 1994!
16:53:25 From Carlos Mercado : My #2 post ever on LinkedIn was "Data Viz is more important than Deep Learning".
16:53:32 From Carlos Mercado : I'm a big fan of DV.
16:54:32 From Naresh Reddy : what transition does it take to become a data scientist from data analyst?
16:54:36 From Matt Housley : @Carlos Yeah, I second that. We’ve seen that issue with clients - hyper competent data scientists who struggle to communicate with execs because they can’t do data viz.
16:54:47 From Monica Royal : Don't fall into the LinkedIn algorithm trap. Just be you and share things that you enjoy. Also, remember we are all learning from each other, so anything that you post will help someone else :)
16:54:51 From Eric Sims : It's true - Greg never misses!
16:55:00 From Carlos Mercado : Can't do Data Viz and CANT DO POWERPOINT. Execs only speak 3 languages. English, PPT, Finance Excel.
16:55:10 From Susan Walsh : great point Monica!
16:55:12 From Greg Coquillo : Thanks Eric!
16:55:13 From Carlos Mercado : @ Naresh - you just call yourself a data scientist and you're done.
16:55:28 From Carlos Mercado : The roles are supposed to be distinct and different, but they're just not anymore, it's all fumbled.
16:55:38 From Eric Sims To Harpreet Sahota(privately) : Harpreet, I've got a technical question if I can get in the queue
16:55:46 From Mikiko Bazeley : OH man, debugging, good one
16:55:49 From Naresh Reddy : @carlos :D
16:55:52 From Matt Housley : @naresh One of the main things you need to make this transition is strong sponsor. Think of it as an internship.
16:56:34 From Naresh Reddy : @Matt Housley I will keep that in my mind :
16:57:18 From Carlos Mercado : At RStudio 2020 I took Jenny Bryan's workshop on debugging; she also gave a keynote speech on debugging. Available here:
16:57:19 From Florin Badita - Corruption Kills : Thanks for organising this, I will need to sop, have another call coming up. Happy and relaxing xmas to you and your family. Stay safe!
16:57:31 From Eric Sims : I start worrying when I'm not getting enough errors...
16:57:47 From Ashen Rana : Echoing the importance of SQL. I thought it was too simple and easy, and not as exciting as Python or R. I am currently using DBT (data build transform) tool and SQL is key there
16:58:10 From Giovanna Galleno : bye @Florin! :D it was great to see you! :D
16:58:21 From Joe Reis : Check out my next meetup about ML model testing and debugging
16:58:22 From Joe Reis :
16:58:27 From Eric Sims : See you, @Florin!
16:58:33 From Joe Reis : Josh Tobin (Open AI) is presenting
16:58:44 From Matt Housley : @ashen +1 for DBT.
16:58:47 From Akshay Adlakha : We can checkpoint while building a model. This makes it easy to debug a model.
16:58:53 From Carlos Mercado To Harpreet Sahota(privately) : I have something I want to chime in on RE Debugging
16:58:58 From Akshay Adlakha : add*
16:59:05 From Akshay Mandke : Keeping in mind data privacy and restrictions for compliance plays a huge role in making changes to data pipelines and solutions. I have experience working in Forensics industry and see this as a big challenge.
17:00:11 From Carlos Mercado : In R the Pins library can record datasets, models, and objects that Srivatsan is mentioning. In Python I think Metaflow does caching for it too
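A rough sketch of the checkpointing idea using only Python's standard library. This is not the Pins or Metaflow API; the `cached` helper, the file name, and the fake `expensive_step` are all assumptions made for illustration.

```python
import pickle
import tempfile
from pathlib import Path


def cached(path, compute):
    """Load a pickled result from path if it exists, else compute and save it."""
    path = Path(path)
    if path.exists():
        with path.open("rb") as f:
            return pickle.load(f)
    result = compute()
    with path.open("wb") as f:
        pickle.dump(result, f)
    return result


calls = []  # records how many times the expensive step really ran


def expensive_step():
    """Stand-in for a slow training or data-prep step (made-up values)."""
    calls.append(1)
    return {"coef": [0.5, 1.5], "rows_used": 100}


checkpoint_dir = Path(tempfile.mkdtemp())
first = cached(checkpoint_dir / "step1.pkl", expensive_step)
second = cached(checkpoint_dir / "step1.pkl", expensive_step)  # read from disk
print(len(calls))
```

With each stage saved like this, a failing later stage can be re-run and inspected without repeating the earlier ones.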
17:00:56 From George Firican To Harpreet Sahota(privately) : I can mention a quick thing about debugging if there's time
17:00:57 From Ashen Rana : Heck yeah, Matt Housley! Getting started with DBT + Snowflake
17:01:38 From Ray Givler : For actual error messages, I like to record the message and my resolution and keep that in a document or OneNote. Multiple problems can cause the same error, so once you find out what yours is, it's worth writing down. If you make a mistake once, you are likely to repeat it. So keep some notes.
17:01:40 From Lalita : I think commenting the sections have helped me a lot while debugging my scripts in Python
17:01:44 From Joe Reis : Also, debug your data...
17:01:47 From Matt Housley : This framework is pretty interesting for model monitoring.
17:02:15 From Susan Walsh : nice hat Monica!
17:02:26 From Joe Reis : WhyLabs is great for data profiling in your pipelines. Great Expectations is a good data unit testing framework
17:02:52 From Joe Reis : Data’s different than software debugging, in that you need to debug your software code AND your data
17:03:32 From Matt Housley : One of the things you have to think about in data science debugging is looking for errors in your source data.
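One way to picture "debugging your data" is a handful of plain checks run on the source rows before any modeling. The sketch below uses only standard Python, not the Great Expectations or WhyLabs APIs; the `check_rows` helper, the column names, and the sample rows are made up for illustration.

```python
def check_rows(rows):
    """Return a list of human-readable problems found in the rows."""
    problems = []
    seen_ids = set()
    for i, row in enumerate(rows):
        order_id = row.get("order_id")
        if order_id in seen_ids:
            problems.append(f"row {i}: duplicate order_id {order_id}")
        seen_ids.add(order_id)

        amount = row.get("amount")
        if amount is None:
            problems.append(f"row {i}: missing amount")
        elif amount < 0:
            problems.append(f"row {i}: negative amount {amount}")
    return problems


# Made-up source data with a missing value, a duplicate key, and a bad sign.
rows = [
    {"order_id": 1, "amount": 19.99},
    {"order_id": 2, "amount": None},
    {"order_id": 2, "amount": -5.0},
]

for problem in check_rows(rows):
    print(problem)
```

Which checks matter depends on the domain, which is why it helps to ask people who know the data what "doesn't smell right".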
17:03:56 From Monica Royal : Thank you @Susan :D
17:03:59 From Carlos Mercado : Also RE: debugging. Take a break and come back if it's really bad. Then check your assumptions.

The most common debugs are the ones we don't post to stackoverflow. "Shit my date is being stored as a character".

It's like very basic CompSci stuff.

But the #1 way to debug is to create REPREX and actually isolate your error. That will solve it 80%+
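A small Python illustration of the "date stored as a character" bug, written as a minimal reproducible example. The dates are made up and the date format is an assumption; the point is that a tiny self-contained input isolates the problem.

```python
from datetime import datetime

# Tiny, self-contained input: three made-up dates stored as strings.
raw_dates = ["12/01/2020", "02/15/2021", "11/30/2020"]

# Check the assumption first: these are str, not date objects.
print(type(raw_dates[0]))

# The bug: strings sort character by character, not chronologically,
# so February 2021 lands ahead of November 2020.
sorted_as_text = sorted(raw_dates)

# The fix: parse to real dates, then sort.
parsed = [datetime.strptime(d, "%m/%d/%Y").date() for d in raw_dates]
sorted_as_dates = sorted(parsed)

print(sorted_as_text)
print([d.isoformat() for d in sorted_as_dates])
```

Shrinking a problem to something this small often reveals the cause before the question ever gets posted anywhere.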
17:04:35 From Susan Walsh : step away for a while then come back also helps
17:04:36 From Matt Housley : It really helps to track down domain experts in your company. They can often tell you when events or statistics don’t smell right.
17:04:38 From Thom Ives : Atom
17:04:38 From Dave Langer : RStudio!
17:04:39 From Ashit Debdas : pycharm
17:04:39 From Carlos Mercado : RStudio is the best Python IDE - don't @ me.
17:04:41 From Mark Freeman : Vscode with mypy and flake8!!!
17:04:42 From Greg Coquillo To Harpreet Sahota(privately) : I have a question as well brother
17:04:43 From Joe Reis : VSCode and vim
17:04:48 From Liuna Issagholian : VScode
17:04:49 From Kamarin Lee : VSCode +1
17:04:50 From Lalita : spyder
17:04:57 From DavidTello : I once missed a Space in a Python code, it took me all night to find it
17:04:59 From Monica Royal : Good point @Joe. You may just need to take a walk and come back to it later. Often times you will find the missing comma right away

17:05:02 From Saurabh Dixit : Sometimes just stepping away from your computer for 10 mins and then returning helps solve the problem ;) in my humble exp
17:05:05 From Ray Givler : @Carlos - I was gonna say the same thing regarding debugging. Assumptions are a common source of bugs.
17:05:22 From Akshay Adlakha : Guys, due to some urgency, I have to go. Bye Happy holidays. Enjoy rest of the session.
17:05:27 From George Firican : sorry everyone, I have to step out as I have a couple more hours in the office before the weekend starts. Lovely seeing everyone here
17:05:28 From Joe Reis : If I had a dollar for every simple bug caused by a misplaced character, I’d be retired
17:05:30 From Susan Walsh : Florin that is a wicked camera you have
17:05:34 From Eric Sims : Take care, @Akshay!
17:05:34 From Joe Reis : Later George
17:05:35 From Matt Housley : Has anyone debugged LaTeX code? It’s horrible.
17:05:44 From Joe Reis : Matt - that’s morbid
17:05:50 From Akshay Adlakha : Thanks Eric
17:06:00 From Susan Walsh : Bye George!
17:06:09 From Liuna Issagholian : LaTeX …… can be challenging :|
17:06:14 From Mark Freeman : Thank you everyone for your answers! This was so helpful and I will be coming back to this podcast for all the great gems on debugging
17:06:26 From DavidTello : @Matt, I wrote my dissertation (120 pages) using latex. It was painful
17:06:31 From Mark Freeman :
17:06:43 From Mark Freeman : ^ open source Carlos was referring to
17:07:02 From Eric Sims : Sounds YUGE.
17:07:39 From Thom Ives : Mikiko's antlers are EPIC!
17:07:51 From Carlos Mercado : +1 Mikiko's antlers
17:08:19 From DavidTello : Is web scraping illegal?
17:08:29 From Joe Reis : no
17:08:29 From Carlos Mercado : It's not GDPR compliant
17:08:33 From Carlos Mercado : in most instances
17:08:43 From Carlos Mercado : because you would be liable for certain types of identifiable storage.
17:08:45 From Lalita : Is there any good resource where I can learn to write complex SQL queries, just to play around with it more?
17:08:51 From Thom Ives : I'm going to be arrested if it is!
17:08:55 From Joe Reis : If you’re gonna scrape, just keep it quiet
17:09:05 From Dave Langer : ^
17:09:10 From Sarah Nabelsi : lol
17:09:10 From Thom Ives : Lalita - David Langer's course!
17:09:13 From Susan Walsh : we’ll come visit Thom
17:09:21 From Matt Housley : @DavidTello I think it lands in a legal gray area. You’re typically violating terms of service, which you may or may not have actually signed off on.
17:09:44 From Thom Ives : I dream of it Susan!
17:09:49 From Dave Langer : Lalita - I have a free SQL tutorial if you're interested.
17:09:52 From Lalita : Thanks Thom
17:10:05 From Eric Sims : "I don't intend to start a war" - Famous last words immediately before starting a war.
17:10:13 From Susan Walsh : lol
17:10:20 From Lalita : @Dave thanks. I would love to use that.
17:10:28 From Mark Freeman : I’m a heavy python user… but I think R is more friendly to have it just work
17:10:31 From Mikiko Bazeley : For scraping, there are so many conditions
17:10:34 From Mikiko Bazeley : C++!
17:10:41 From DavidTello : Flip a coin and let mathematics decide :)
17:10:49 From Akshay Mandke : R if only researching, Python if it's part of a larger software solution
17:11:01 From Susan Walsh : I don’t code 😱
17:11:04 From Matt Housley : @Lalita One nice resource is the DBT walk through. They really emphasize using CTEs to compose complex operations. Basic composition is one of the most under appreciated capabilities of SQL, largely because business users don’t use it, and we tend to learn from business users.
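To show what composing with CTEs looks like, here is a small sketch run through Python's built-in `sqlite3` module. It is not taken from the DBT walk through; the `orders` table, its rows, and the 40 threshold are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, amount REAL);
    INSERT INTO orders VALUES
        ('ana', 10.0), ('ana', 30.0), ('bo', 5.0), ('cy', 50.0);
""")

# Each CTE is a named step; later steps build on earlier ones.
query = """
WITH customer_totals AS (
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
),
big_spenders AS (
    SELECT customer, total
    FROM customer_totals
    WHERE total >= 40
)
SELECT customer, total
FROM big_spenders
ORDER BY customer;
"""

rows = conn.execute(query).fetchall()
print(rows)
```

Reading top to bottom, each named step can be queried on its own while debugging, which is harder with deeply nested subqueries.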
17:11:13 From Dave Langer : @Lalita -
17:11:46 From Akshay Mandke : R is easier if you have no programming background
17:11:53 From Joe Reis : I’m learning swift right now…
17:11:55 From Susan Walsh : see which one you get on best with
17:11:59 From Joe Reis : It’s amazing
17:12:11 From Kamarin Lee : plotly ftw lol
17:12:16 From Dave Langer : In my experience teaching 100s of working professionals, R is easier to learn for folks with no programming background.
17:12:20 From Lalita : Thanks @Matt and @Dave I will check that out
17:12:24 From Sarah Nabelsi : +1 to harpreet
17:12:26 From Carlos Mercado : The V8 package uses Chromium to render JS on webpages prior to scraping (this is unlike rvest in R that has issues with rendering JS first).

You will find a LOT of sites actively prevent web scraping. So, thinking as a white hat hacker, you should protect your sites from scraping by rendering important details as JS outputs rather than HTML, and by making sure the site can effectively block V8 / Chromium engine rendering of JS.
17:12:27 From Eric Sims : @Jennifer - I'm a noob, and I started with Python. It works well for me, and it is widely used in companies that I want to work with, so it's a skill I want.
17:12:29 From Faraaz Sheriff : I believe the learning curve for R is steep vs Linear for Python. Agree?
17:12:41 From Susan Walsh : Hey @Kam
17:12:49 From Joe Reis : Let’s put it this way - my 10 year old can write Python
17:12:52 From Saurabh Dixit : I was a Java developer long back, and I am now starting to love R. It's amazing how much data wrangling it does so easily… No strong feelings.
17:12:55 From Dave Langer : If you're working as a solo analyst it is hard to go wrong with R. If you need Production software engineering, then Python might be a better choice.
17:13:06 From Carlos Mercado : I think R step 0 is hilariously easy. And Python step 0 is horrendous. But Python is very readable and has I believe 10x+ the stackoverflow answers.
17:13:10 From Kamarin Lee : @Susan!!! It’s been too long, great to see you, lovely! I’m in love with the Xmas edition you produced today haha :)
17:13:23 From Carlos Mercado : My opinion: R for individuals; Python for Teams.
17:13:28 From Eric Sims : @Carlos - R doesn't have a step 0. Python starts at 0.
17:13:33 From Carlos Mercado : ha ha ha LOL
17:13:35 From Susan Walsh : Haha thanks, that was freshly produced today!
17:13:46 From Karan Ambasht : Sorry gotta run.. great listening to wonderful thoughts from everyone. Happy Holidays all !!
17:13:54 From DavidTello : I started in Python, moved to VBA because of work needs
17:13:55 From Carlos Mercado To Harpreet Sahota(privately) : Can I shout out my mentee on this conversation really quick
17:13:59 From Lalita : I find Python my go-to language. Never touched R
17:14:12 From Matt Housley : Python is a true object oriented language, so there’s a steep learning curve. R doesn’t have that problem, but won’t help you to segue into Java like Python will.
17:14:14 From Eric Sims : To Carlos' point, I love Python because it has a b'zillion StackOverflow answers
17:14:18 From Kamarin Lee : @Susan I need to think of ideas for a duet next year haha… 99 Luftballoons
17:14:19 From Joe Reis : Python’s the 2nd best language at everything
17:14:29 From Carlos Mercado : ^ +1 It's truly general purpose.
17:14:47 From Dave Langer : Ben!!!!
17:15:00 From Susan Walsh : Hey Ben!!
17:15:00 From DavidTello : If you want to do R, check out Matt’s courses, among the best I've seen
17:15:16 From Greg Coquillo To Harpreet Sahota(privately) : Let me know if I'm in line for a question
17:15:16 From Carlos Mercado : I personally think R has the better community as well, which does matter for learning.
17:15:35 From Matt Housley : +1 for Matt Harrison’s Python courses.
17:15:40 From Ameya Dhaygude To Harpreet Sahota(privately) : I have a question on SAS to Azure transition
17:15:48 From Mikiko Bazeley : +1 for Matt Harrison
17:15:49 From Saurabh Dixit : I had the same Q
17:15:52 From Mikiko Bazeley : And his course
17:16:10 From Joe Reis : Matt Harrison is the king of Python courses
17:16:11 From Eric Sims : Hey, @Al! I hope things are less crazy this week than last!
17:16:23 From Kamarin Lee : Python has been instrumental for me with automation + analytics - Matt Dancho is amazing with R though. Definitely recommend his courses
17:16:59 From Akshay Mandke : +1 ^ I am doing his DS4B-101
17:17:04 From Akshay Mandke : in R
17:17:08 From Carlos Mercado : Python also has the excellent book "Automate the Boring Stuff". For command line automations Python is just unmatched.
17:17:32 From Joe Reis : I think of R like a hand calculator. It’s awesome for out of the box stats
17:17:33 From Ashen Rana : +1 to documentation. That’s been my emphasis
17:17:45 From Matt Housley : Right on Srivatsan
17:17:45 From Ashen Rana : I see you shaking your head, Dave lol
17:17:56 From Dave Langer : SQL, baby!
17:17:56 From Susan Walsh : lol
17:17:57 From Kamarin Lee : Data Storytelling - documentation and translating insights from reports is also incredibly valuable. I personally look for SQL + data storytelling ability before Python
17:18:01 From Mark Freeman : I LOVE SQL SO MUCH!!!!!
17:18:05 From Sarah Nabelsi : Dave!
17:18:06 From Sarah Nabelsi : haha
17:18:08 From Sarah Nabelsi : All you
17:18:10 From Ameya Dhaygude : Super agree with Srivatsan on SQL
17:18:27 From Akshay Mandke : CTEs are life savers
17:18:32 From Joe Reis : Also underrated - shell scripting
17:18:42 From Susan Walsh : next question - how do you say SQL?? 😂
17:18:42 From Carlos Mercado : in R you can do.
summary(lm(y ~ x)) and immediately start talking about model outputs. completely out of the box.

Also R has tidyverse; an entire optimized dialect that even has a SQL converter with dbplyr.

Like RStudio makes R worth learning and the community is insane.
17:18:45 From Ameya Dhaygude : Thank you Srivatsan for mentioning the jargon
17:18:46 From Sarah Nabelsi : I wanna also say creativity!
17:18:47 From Kamarin Lee : the industry conflating ML with AI for example…LOL
17:18:53 From Sarah Nabelsi : YES!
17:19:05 From DavidTello : Does anyone know of a SQL course that covers questions of the type that Eric Weber usually shares
17:19:21 From Ashen Rana : 3 months into my Jr. Data Developer role and I believe I got this role mostly because of my soft skills hah
17:19:25 From Dave Langer : For the teams that I've managed, Jr. Analytics folks are SQL, R (Python is OK, I would teach you R), and some basic knowledge of data analysis.
17:19:33 From Eric Sims : @DavidTello, Eric Weber also shares lists of helpful SQL courses, so definitely check them out!
17:19:59 From Matt Housley : @akshay I think that CTEs have developed a bad reputation because people don’t document them. Basically, SQL is a great language in its domain if we treat it like a programming language and actually document and version control.
17:20:00 From Carlos Mercado : Things a Jr Data Scientist needs to have:

  • Be curious
  • find interesting relationships and BRING THEM TO PEOPLE - don't over-dive in. Get team input
  • Good data viz
  • Can talk to people, i.e. with Data Viz and PPT
  • Python or R
  • Can handle their own basic debugging and stackoverflowing to a reasonable level.
17:20:08 From Harpreet Sahota To Greg Coquillo(privately) : Hey! Yea I have you after two more- feel free to jump in on any response though
17:20:26 From Carlos Mercado : Better to understand GLM very well than know 10 different ML models and have no clue how to use them.
17:20:42 From Akshay Mandke : @matt I totally agree. I have struggled fixing broken scripts and CTEs that don’t work in my journey. Documentation is crucial.
17:20:42 From Ashen Rana : Good points Carlos! It’s surprising how many people/teams work in silos and bringing those people together is a skill by itself
17:20:48 From DavidTello : Thanks @Eric Sims
17:20:51 From Kamarin Lee : Jr. Data Scientists should know how to translate data analyses with sound data storytelling and collaborative abilities
17:22:03 From Carlos Mercado : Hey @ Everyone, my mentee William Rodriguez just graduated from college; I mentored him in R, he's looking for his first analyst role either remotely or in the Orlando area. It'd be awesome if y'all added him on LinkedIn and could help him out.
17:22:09 From Carlos Mercado : He's in the chat today
17:22:34 From William Rodriguez : Hey everyone!
17:22:37 From Matt Housley : @akshay My standard is that someone should be able to look at a subquery in my CTE and understand what it does based on the inline comments. We generally expect that for functions in Python, but don’t always apply that standard in SQL.
17:22:44 From Eric Sims : Hey @William!
17:23:02 From Giovanna Galleno : Hola William! :D
17:23:32 From Ashit Debdas : Hello @william
17:23:33 From Akshay Mandke : @matt concur. and don’t forget nested sub queries. That’s where CTEs really play a role especially in complex analytics for financial datasets.
17:23:55 From Dave Langer : Death to nested sub-queries!
17:24:04 From Joe Reis : ^ amen
17:24:06 From Matt Housley : Yeah, nested subqueries can be horrifying.
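To make the CTE versus nested subquery point concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `orders` table, its columns, and the data are all hypothetical, invented for this illustration. Both queries return customers whose total spend is above the average customer total; the CTE version names each step once and documents it with an inline comment, in the spirit of the standard Matt describes.

```python
import sqlite3

# Hypothetical table and data, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ana", 10), ("ana", 30), ("bo", 5), ("cy", 50), ("cy", 20)],
)

# Nested version: the per-customer aggregation is written twice.
nested_sql = """
SELECT customer, total
FROM (SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer)
WHERE total > (
    SELECT AVG(total)
    FROM (SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer)
)
ORDER BY customer
"""

# CTE version: each step is named once and carries an inline comment.
cte_sql = """
WITH customer_totals AS (
    -- one row per customer with their total spend
    SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer
),
average_total AS (
    -- single row: the mean of the per-customer totals
    SELECT AVG(total) AS avg_total FROM customer_totals
)
SELECT customer, total
FROM customer_totals CROSS JOIN average_total
WHERE total > avg_total
ORDER BY customer
"""

nested_rows = conn.execute(nested_sql).fetchall()
cte_rows = conn.execute(cte_sql).fetchall()
print(cte_rows)
```

The two queries are equivalent here; the difference is only in how easy each one is to read and maintain.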
17:24:28 From Akshay Mandke : Are they not so popular from your experience? I want to know if I need to avoid them in future
17:24:45 From Greg Coquillo : Tell us how you like nested sub-queries @David
17:24:48 From Greg Coquillo : lol
17:24:51 From Joe Reis : They’re popular for all the wrong reasons
17:24:56 From Dave Langer : Also some heresy here. Make all your SQL code CAPS for the love that all is holy! :-p
17:25:11 From Susan Walsh : 😂😂
17:25:11 From Dave Langer : The reserved words, that is.
17:25:16 From Ashen Rana : Hi Al(bert) Bellamy! Nice to see you here :D
17:25:20 From Carlos Mercado : I don't SQL and nested subquery just sounds bad as a word lol. I would assume never do it.
17:25:26 From Matt Housley : Depends - I’ve just seen some horrible examples of nested subqueries that would be much nicer in a CTE
17:25:45 From Matt Housley : But I generally don’t have a problem with them if they’re readable.
17:25:46 From Carlos Mercado : All my SQL skills are: Select * FROM x WHERE … Then save to csv and work in R LOL.
17:25:49 From Srivatsan Srinivasan : CTE has both sides.. But if it is analysis sometimes we might have to just iterate over same base query and generate different rollups quickly
17:26:13 From Akshay Mandke : nested sub Qs and CTEs if not designed well can throw your script into a never ending loop.
17:26:28 From Dave Langer : @Carlos - LOL
17:26:40 From Akshay Mandke : @shrivatsan - great advice
17:26:50 From Mikiko Bazeley : The limit for nested subQ’s & CTE’s was wholly dependent on how loudly our CTO yelled in the data warehouse slack channel
17:27:10 From Joe Reis : It’s the same philosophy in Python - most devs spend their time reading code, not writing it. Make your code readable.
17:27:13 From Greg Coquillo : lol @Mikiko
17:27:17 From Susan Walsh : I need to bail guys, need to sleeeeeep. Great to see you all!
17:27:25 From Joe Reis : Later Susan
17:27:27 From Eric Sims : Sleep tight, @Susan!
17:27:34 From Dave Langer : Bye, Susan!
17:27:38 From Susan Walsh : cheerio....
17:27:39 From Greg Coquillo : Bye Susan!
17:27:41 From Ashen Rana : Nite Susan !
17:27:42 From Giovanna Galleno : bye Susan! It was AMAZING to see you here! :D
17:27:43 From Monica Royal : See you later Susan!
17:27:44 From Thom Ives : Bye Susan!
17:27:53 From Ashit Debdas : bye susan, take acre
17:27:54 From Kamarin Lee : See you soon Susan! Happy Holidays!
17:27:56 From Ashit Debdas : care
17:28:39 From Mark Freeman : I have to get back to work. But I enjoy this so much! Happy holidays!
17:28:50 From Eric Sims : Later, Mark!
17:28:55 From Greg Coquillo : same to you Mark!
17:29:02 From Joe Reis : See ya Mark
17:29:16 From Monica Royal : See you later Mark!
17:29:44 From Carlos Mercado : @ Lalita -> What is your sales pitch? -> Are you doing a deep application or doing spray and pray? -> I recommend the following:

I recommend deep applications; when you apply, make yourself stand out, break the rules.
17:29:56 From Carlos Mercado : Lalita, you can send me your resume, happy to review.
17:30:33 From Saurabh Dixit : It would be worth getting your CV reviewed and formatted by professional resume developers.. I say this because there are some filters at the entry stage which could potentially filter out the CV if some key words are not seen.
17:30:43 From Ashen Rana : Ben Taylor, what’s the story behind #unrecruitable on your LinkedIn? [thinking face emoji]
17:31:05 From Ben Taylor : I can talk finally! :)
17:31:17 From Thom Ives : Yay Ben!
17:31:23 From Ben Taylor : #unrecruitable... yeah there’s a story there
17:31:25 From Matt Housley : I would recommend specifically building relationships with data scientists and engineers who have an academic background. That’s how I got my first DS job, and that’s true for many of my friends with PhDs in math as well.
17:31:26 From Lalita : Thanks @Carlos and @Saurabh. I will do that
17:31:31 From Dave Langer : Yes! Ben in the house!
17:31:33 From Eric Sims : @Lalita - Being active on LinkedIn has helped me create authentic relationships that I really value. It has also helped me connect with potential job opportunities.
17:32:08 From DavidTello : 100% agree @Matt Housley
17:32:52 From Kamarin Lee : its about how YOU can contribute to the BOTTOM LINE for a business
17:33:02 From Sarah Nabelsi : YES! There we go
17:33:16 From Sarah Nabelsi : Do your research, find out how you can add value
17:33:18 From Sarah Nabelsi : And lead with that
17:33:39 From Carlos Mercado : Have a day 1 contribution plan. "I understand your business, I did research on you, I know you make money by X, I can add to that with [hire me to find out]".
17:33:41 From Thom Ives : CARE - sorry - 4 letter word!
17:33:43 From Ameya Dhaygude : @Lalita - It's also difficult to get an interview call if you will need visa sponsorship. You may be a top candidate based on your profile, but the visa requirements sometimes play against international students. I have heard from several international students that active and persistent networking has helped them more to secure jobs. That's how I was able to get my job.
17:33:44 From Kamarin Lee : 👏🏽👏🏽👏🏽 @Mikiko amazing!
17:33:44 From Saurabh Dixit : Personally I recommend .. I leveraged their service to reformat my CV … At a stage when I was laid off and was looking for new job. .. my personal experience there. .. we can’t do much if our CV is kicked out by automated systems
17:34:03 From Joe Reis : Just call them and yell, “DO YOU KNOW WHO I AM?!!!”
17:34:03 From Greg Coquillo : +1 Mikiko!!
17:34:07 From Joe Reis : Actually, don’t.
17:34:28 From Ashen Rana To Harpreet Sahota(privately) : Can we hear Ben Taylor’s story behind the #unrecruitable hashtag? :D
17:35:23 From Eric Sims : Greatest snow on earth.
17:35:24 From Mikiko Bazeley : OMG that’s amazing
17:35:25 From Carlos Mercado : Unethical Interview tip:
Call the corporate number; and say the data science team didn't call you at 3PM last Thursday when they said they would and why. Benefits if you can name-drop a data science manager.
17:35:53 From Joe Reis : Brighton is a great resort
17:36:13 From Saurabh Dixit : Got to jump off.. .. Awesome audience , great advice.. thanks so much everyone.. !! Thank you Harpreet for connecting with so many amazing people in the field .!!!
17:36:22 From Eric Sims : See you, @Saurabh!
17:36:53 From Carlos Mercado : Just for the comment recording. Ben is actually taking this call from the ski slopes in Utah and answering this in full ski gear XD
17:36:58 From Saurabh Dixit : Cheers @Eric, all
17:36:58 From Mikiko Bazeley : OMG Ben were you snowboarding while on the call!!! #FridayGoals
17:37:29 From DavidTello : while sky diving :)
17:37:32 From Carlos Mercado : 33 to 6 :( :( :( haha
17:37:38 From Joe Reis : While rodeo flipping
17:37:40 From Mikiko Bazeley : Besides Dave and Carlos who voted for R
17:37:43 From Ashen Rana To Harpreet Sahota(privately) : Oh man, that is freaking awesome Ben!
17:38:11 From William Rodriguez : I voted for R :)
17:38:21 From Joe Reis : Funny - Ben’s literally 20 minutes from my house right now. He’s snowboarding at my local spot
17:38:22 From Mikiko Bazeley : Also it’s tough right now for job searching
17:38:48 From Dave Langer : Regarding the poll, ask yourself this. Why would a former C++ OO elitist prefer R over Python? :-D
17:38:58 From Mikiko Bazeley : XD
17:39:00 From Dave Langer : That's me, BTW. :-p
17:39:52 From Ray Givler : I voted for R. Just learning, but I knew I could get some internal R mentors.
17:40:09 From Dave Langer : Wise man, Ray. ;-)
17:40:10 From Monica Royal : Great dedication Ben!! Glad to have you call into the event and hear from you as always! :D
17:40:10 From Joe Reis : Dave - is this you?
17:40:11 From Joe Reis :
17:40:27 From Liuna Issagholian To Harpreet Sahota(privately) : sure
17:40:46 From Srivatsan Srinivasan : I have to drop now.. Have a nice holiday time, Merry Christmas and New Year
17:40:54 From Joe Reis : Later Srivatsan
17:40:57 From Eric Sims : See you, @Srivatsan!
17:41:25 From Monica Royal : See you later Srivatsan!
17:41:57 From Ben Taylor : getting a job isn’t a charity, make sure you sell the value you’ll bring. the employer needs a return on your salary
17:42:01 From Carlos Mercado : relationships != transactions. Great point Jean-Sebastien.
17:42:24 From Mikiko Bazeley : +1 to Ben’s point
17:42:24 From Dave Langer : @Joe - No, need to check this out. Hopefully it's using EJBs!
17:42:30 From Ben Taylor : sorry about my strange joining... hopefully not too distracting with my coin toss silence
17:42:54 From Matt Housley : @ben There was a lot of envy on this call.
17:43:03 From Mikiko Bazeley : Tons lol
17:43:04 From Carlos Mercado : @ Dave what is your list of jobs that are data adjacent but don't call themselves data scientists?
17:43:10 From Joe Reis : Which lift you on @Ben?
17:43:33 From Dave Langer : @Joe - Look at all the GoF.
17:43:47 From Eric Sims : Young ones... because Greg is sooooo old :)
17:44:26 From Greg Coquillo : lmao
17:44:36 From Carlos Mercado : Titles for jobs you can chase after:
Data Scientist
Data Analyst
Quantitative Analyst
Business Analyst
Financial Analyst
Database Analyst

You can also chase after things like Consulting where data science background gives you an in to doing data science work under a different title.
17:45:13 From Carlos Mercado : Economics is the best undergrad clearly lol.
17:45:32 From Ray Givler : Gotta go. Take care folks! See you in the new year!
17:45:36 From Eric Sims : I have also just searched skill terms like "regression" in LinkedIn because then I'll get more data-oriented roles
17:46:04 From Lalita : Thanks a lot everyone for your advice
17:46:04 From Liuna Issagholian : yaaaas Carlos!
17:46:13 From Ashen Rana : DataOps roles maybe?
17:46:24 From Mikiko Bazeley : Women Who Code, Tech Ladies, and AnitaB are also really good places to put resumes or look at jobs
17:46:45 From Carlos Mercado : Eric is ACTUALLY HACKING LinkedIn right now
17:46:46 From Sarah Nabelsi : Two more things I would add here are: 1. Tailor your resume to the job you’re looking for, don’t just dump everything. 2. It’s going to take time, so be kind to yourself :)
17:46:52 From Ameya Dhaygude : Awesome advice Eric
17:47:01 From Carlos Mercado : Eric with the JUMBO brain right now.
17:47:02 From Mikiko Bazeley : With Tech Ladies you can get access to the hiring manager
17:47:06 From Thom Ives : Nice Sarah!
17:47:17 From Carlos Mercado : Eric stop job hunting, you're going to work with me at Guidehouse.
17:48:13 From Akshay Mandke : how are the scores calculated? is that the only metric for building a model?
17:48:25 From Carlos Mercado : I would need a re-statement of the question.
17:49:03 From Carlos Mercado : all transformations have costs in terms of interpretability and the unit of measure for things like error, coefficients, etc. So be careful in general.
17:49:20 From Ben Taylor : listen to Dave... he speaks data gospel!
17:49:28 From Mikiko Bazeley : +1 Dave!
17:49:57 From Akshay Mandke : Always be mindful of the trade offs and standard errors
17:50:19 From Dave Langer : Wait a second, I thought I was a heretic?
17:50:28 From Dave Langer : My R-loving ways and all that.
17:50:35 From Mikiko Bazeley : Dave - Should have ended with “All done in R"
17:50:45 From Dave Langer : ^
17:50:52 From Mikiko Bazeley : Always Be Closing
17:50:57 From Kamarin Lee : ^
17:51:06 From Carlos Mercado : ABC ^
17:52:00 From Eric Sims : "Not trying to stump" - Famous last words immediately before stumping ;)
17:52:05 From Mikiko Bazeley : @Dave - If it helps, my reason for sticking with python is because it’s the last language I needed to use extensively and I didn’t feel like context switching
17:52:09 From Mikiko Bazeley : So basically laziness
17:52:23 From Eric Sims : Googling now...
17:52:39 From Joe Reis : Greg - hit me up. I’ve got someone you should talk with about federated learning
17:52:41 From Dave Langer : Not to worry, I've coded in so many languages over my career. My love with R will change before I retire I'm sure.
17:53:33 From Dave Langer : @Mikiko - Totally makes sense. If you're awesome at Python, stick with it.
17:53:38 From Eric Sims : Greg's questions are top-shelf!
17:53:55 From Mikiko Bazeley : ^
17:53:57 From Greg Coquillo : Got you Joe
17:54:05 From Carlos Mercado : Keywords for Federated Learning:

  • Transfer Learning
  • Edge Computing
  • Blockchain AI

Look into these keywords to fully get the scope of Federated Learning.
17:54:06 From Greg Coquillo : Thanks! Eric!
17:54:43 From DavidTello : @Lalita, I was in the same situation last year. Here are a few points of advice. (1) Looking for a job is hard and takes time. I was in the job market from Sept 2019 to March 2020. I had several hundred rejections and was only able to get 5 interviews altogether. (2) Harpreet once shared that he got an interview from emailing a manager, and he shared the format of his email. After I got a rejection for a position with the Army that I had applied for online, I went ahead and emailed the hiring manager using Harpreet’s style of email. Within a few days I got a direct call from the hiring manager and she asked me to come in for an interview. Sometimes it is well worth it to take a chance. (3) Be more than willing to take a job in a small town in the middle of nowhere. When the Bank where I work today interviewed me, they asked me: “Are you willing to move to Topeka, KS, a small town of about 100K people?”
17:57:05 From Lalita : Thanks @DavidTello for the advice. Would you mind sharing the template of that email if you have it. Would really appreciate it
17:57:05 From DavidTello : @Lalita, The Bank literally gave my family everything we asked for, including relocation money. I started working there two weeks before the pandemic. 9 months on, I am still working for the Bank, but I now do it from a home office in Phoenix, AZ, where I live, and I can see myself doing it for at least 6 more months as everyone in the Bank is in WFH mode.
17:57:30 From Carlos Mercado : This also gets deep into the actual hardware itself. The new Apple chip designed specifically for certain ML applications, optimized for certain types of data transfer, etc. I don't know this stuff tbh.
17:57:43 From Kamarin Lee : i was just thinking about that^
17:57:58 From DavidTello : @Lalita, please send me an email reminder to and give me a little time to look for it
17:58:49 From Joe Reis : @greg -
17:58:58 From Joe Reis : Have fun with those
17:59:00 From Sarah Nabelsi : Fun discussion guys! I have to hop off, but lovely seeing everyone! Happy holidays. Ben, super jealous—looks amazing!
17:59:13 From Greg Coquillo : thanks Joe!
17:59:33 From Monica Royal : Go Ben!!
17:59:41 From Joe Reis : Ben - did your kid biff it?
17:59:49 From Joe Reis : Saw that
17:59:53 From Lalita : @DavidTello thank you so much for the help. I will share it for sure
17:59:56 From Dave Langer : If I was 1/2 as cool as Ben I would be 5x cooler than I am now.
18:00:05 From Mikiko Bazeley : ^
18:00:10 From Joe Reis : Dammit, wish I was up there right now
18:00:10 From Monica Royal : That looks so fun!

18:00:19 From Ashen Rana : ^ yes !
18:00:37 From Akshay Mandke : I’m in Canada I need I should pull up a Ben
18:01:02 From Mikiko Bazeley : I’m low-key envious that anyone who looks at this recording is going to see Ben snowboarding
18:01:11 From Jacqueline Lefèvre López : I have to go, but this was awesome. I learned a lot tonight!
Happy holidays everyone! :)
18:01:40 From Carlos Mercado : Oh you're gonna get Ben's whole HIPAA rant, it's a good one though!
18:02:11 From DavidTello : @Ben can you please share it with me
18:02:13 From Mikiko Bazeley : I can get behind it, it was why we could offer social community at Livongo/Teladoc even for people to engage with their own data
18:02:18 From Mikiko Bazeley : *couln't
18:02:21 From Mikiko Bazeley : HIPAA
18:02:47 From Mikiko Bazeley : (^ in response to Ben’s HIPAA rant)
18:02:53 From Ashit Debdas : I got to go. thank you .. have a happy weekend
18:03:36 From Mikiko Bazeley : +1 Carlos
18:03:49 From Joe Reis : @Carlos - that’s almost buzzword bingo ;)
18:03:50 From Carlos Mercado : @ Greg, Also sent these on LinkedIn. I'm doing a presentation on these papers. It includes federated learning with blockchain notes. Hopefully I didn't butcher anything, I am barely ahead of you on this LOL.
18:04:06 From Carlos Mercado : you call it buzzword bingo, I call it technical sales XD XD
18:04:18 From Mikiko Bazeley : Tomatoe, Tomato
18:04:20 From Joe Reis : basically
18:04:35 From Greg Coquillo : lol
18:04:40 From Greg Coquillo : thanks so much Carlos!
18:04:41 From Joe Reis : Cloud Blockchain for Edge ai
18:04:49 From Mikiko Bazeley : BAM! Bingo!
18:04:54 From Greg Coquillo : I like that!
18:05:09 From Carlos Mercado : Joe, that's the title of my next white paper, I'll tag you when it's public.
18:05:22 From Joe Reis : deal
18:05:46 From Eric Sims To Harpreet Sahota(privately) : Hey Harpreet, I know Matthew Blasa announced on LinkedIn a couple of days ago that he just got a new job with Brinks. I thought he might be interested in sharing the news, plus it could be motivating for your listeners. Just dropping it on your radar.
18:06:00 From Harpreet Sahota To Eric Sims(privately) : AWESOME! Is he here still?
18:06:11 From Eric Sims To Harpreet Sahota(privately) : Yes! He just doesn't have his camera on
18:06:32 From Eric Sims To Harpreet Sahota(privately) : He just got a job as a Data Governance Analyst with Brinks
18:07:23 From Lalita : I learnt Python using Kdnuggets and medium articles. They are really good and descriptive.
18:07:27 From Mikiko Bazeley : Well if you learn python you’ll be learning the 1st or 2nd most popular language for the group depending on who you talk to, so there’s that
18:09:09 From Mikiko Bazeley : Maybe see if you can convince some of the leads to foot the budget for workshops and training?
18:09:17 From Carlos Mercado : I am gonna drop guys; getting late for free. Happy Holidays all. Stay safe & sane. If you're looking for a new game, Unstable Unicorns has become my favorite.

Good luck to all people looking for jobs too!
18:09:28 From Joe Reis : Later Carlos. Looking forward to your paper!
18:09:30 From Mikiko Bazeley : Bye Carlos!
18:09:39 From Eric Sims : Later Carlos!
18:09:43 From Monica Royal : See you later Carlos!
18:09:45 From Ashen Rana : See ya Carlos !
18:10:19 From Naresh Reddy : Congrats Matthew
18:10:40 From Lalita : congratulations 🥳
18:10:59 From Mikiko Bazeley : There is never
18:11:06 From Mikiko Bazeley : Kidding but kind of not)
18:11:23 From Kamarin Lee : congrats!
18:11:33 From Harika Panuganty : Congrats Matthew!
18:11:42 From Mikiko Bazeley : Congrats!!!
18:11:52 From Kamarin Lee : be right back!
18:11:54 From Ashen Rana : Congrats! :)
18:12:40 From Ameya Dhaygude : Thank you Harpreet, Dave, Joe, and Matt for advice on SAS to Azure
18:12:54 From Eric Sims : 🥳 🥳 🥳
18:13:03 From Liuna Issagholian : Congrats Matthew
18:13:15 From Joe Reis : NP @ameya
18:13:19 From Naresh Reddy : I have question Harpreet
18:13:37 From DavidTello : I have one.
18:13:59 From Lalita : Thanks everyone. It was nice interacting and getting great advice from everyone
18:14:12 From DavidTello : In the past week or so, I've been feeling what I think could be
18:14:16 From Matthew Blasa : Thank you everyone
18:15:02 From DavidTello : “survivor’s guilt” because of everything that is going on. Is anyone else experiencing something similar
18:15:47 From Dave Langer : @Naresh - It is often the amount/level of software engineering requires.
18:15:53 From Joe Reis : “Data Scientist” as a title has turned into a honeypot for some companies who want to attract more candidates. What those companies really mean is data analyst
18:15:54 From Dave Langer : *required
18:16:28 From Ben Taylor : thanks for this, too hard multitasking! have a great weekend!!
18:16:31 From Joe Reis : I actually think the role “data scientist” will be split into more accurate roles in the next few years. It’s too overloaded.
18:16:39 From Dave Langer : ^
18:16:45 From Mikiko Bazeley : ^
18:16:54 From Joe Reis : It’s becoming like “Sandwich artist”
18:16:59 From Ashen Rana : ^ Data Scientists will be specialized roles
18:17:02 From Greg Coquillo : Have a great one Ben!!
18:17:09 From Monica Royal : I hope you are right @Joe. It would be easier for all to have specialized roles
18:17:20 From Joe Reis : Analyst is now cool again though
18:17:27 From Joe Reis : For the last few years, analyst was a 4 letter word
18:17:51 From Giovanna Galleno : Agree with you @Joe!
18:20:40 From Matt Housley : Companies will also use the title “data scientist” to promote themselves with investors. I’ve witnessed many cases where a company doesn’t care at all about data, but promotes some people to the data scientist role so they can tell the prospective investors in their next round that they have “AI.”
18:20:50 From Eric Sims : 200 years ago, physicians didn't really have specialties, but as medical science has progressed the roles and titles have become super specialized. It took trial and error and regulation. It's not going to take 200-ish years to subdivide data science, but I think it's good to keep the pattern in mind that this isn't the first time an industry has matured and specialized.
18:22:37 From Eric Sims : I hadn't even thought about negotiating a kibble allowance into my compensation package...
18:22:37 From Greg Coquillo : +1 Vin!!!!!!
18:25:31 From Matt Housley : A friend of mine was hired as a data engineer at a small startup. The founder told him that they needed data engineering so that he could tell the investors that the company “had data.”
18:25:45 From Mikiko Bazeley : I mean he’s not wrong XD
18:25:57 From Matt Housley : Haha, yeah for sure
18:25:59 From Joe Reis : ^ yep. Done many pitch decks for VCs to get the AI/data multiple on the valuation
18:26:04 From Mikiko Bazeley : Investor milestones essentially go from:
18:26:06 From Matt Housley : It’s absolutely part of the startup game now.
18:26:14 From Mikiko Bazeley : Have app => Have data => Have models
18:26:14 From Matt Housley : Or part of selling any company.
18:26:29 From Mikiko Bazeley : +> have USers
18:26:42 From Mikiko Bazeley : It’s a pretty big jump valuation wise
18:27:26 From Albert Bellamy : did research, right after gunning down a squadron of rebel pilots...
18:27:34 From Mikiko Bazeley : XD
18:28:04 From Joe Reis : “Red 5 standing by”
18:28:07 From Thom Ives : Head Rebel Alliance Fighter Pilot - ERIC SIMS
18:28:11 From Albert Bellamy : crushing my dreams Eric....
18:28:19 From Harpreet Sahota : Last call for questions!
18:28:38 From Harpreet Sahota : Type them out!
18:28:41 From Eric Sims : Gold Leader, standing by!
18:29:14 From Albert Bellamy : remind me never to hang out with 2013 Dave Langer. he sounds like a real snob.
18:29:29 From Joe Reis : Vintage Langer 2013
18:29:42 From Matt Housley : A total short hair. Could have worked at the CIA.
18:29:50 From Joe Reis : lol
18:29:56 From Eric Sims : Short-haired, non-heretical Dave? Hard to imagine.
18:30:04 From Albert Bellamy : I prefer "Langer Classic"
18:30:08 From Akshay Mandke : I agree. I worked in fraud investigations and sometimes Excel can solve your problem so not every business problem is a data science solution.
18:30:10 From Mikiko Bazeley : XD
18:30:14 From Joe Reis : He had an earring too
18:31:07 From DavidTello : Do any of you think that DS position requirements could eliminate the college degree (any degree) requirement at any point in the near future?
18:31:21 From Joe Reis : 2013 Joe would’ve been good friends with 2013 Langer. At the time, ML was the cool thing
18:31:34 From Dave Langer : Dave of 2013 was very passionately wrong about so many things - including short hair.
18:31:37 From Matt Housley : @DavidTello I think job recs should eliminate this requirement.
18:31:39 From Joe Reis : Then I saw the title morph into…other stuff…
18:31:43 From Vin Vashishta : @David. YES!
18:31:55 From Thom Ives : Proven problem solving rules, BUT many businesses see degrees as a rite of passage and an easy first base qualification.
18:32:03 From Eric Sims : ^
18:32:14 From Matt Housley : Academia does a poor job of preparing people to be data scientists. Degree or not, you have to develop a lot of the skills on the job.
18:32:32 From Joe Reis : Colleges have rent seeked their way to being a job filter
18:32:46 From Dave Langer : What @Matt said. Same is true for software engineering.
18:32:54 From Matt Housley : @Thom To your point, getting the first job is really tough without the credentials.
18:33:15 From Joe Reis : The university where Matt and I teach has seen record enrollment in the data programs
18:33:29 From Joe Reis : So, Matt and I will be busy this Spring
18:33:36 From DavidTello : I agree completely. I have started to believe that the ”college degree requirement” will most likely die after the pandemic
18:34:07 From DavidTello : @Joe where do you guys teach?
18:34:15 From Joe Reis : University of Utah
18:34:22 From Matt Housley : @DavidTello Seems like this is happening much faster for software engineering.
18:34:37 From Eric Sims : Gotta drop off - my wife and I are going to watch the new Mandalorian episode! #DateNight
18:34:44 From Joe Reis : As a general rule, I think data’s 5-7 years behind software
18:34:59 From Greg Coquillo : Have a great one Eric!
18:35:02 From Giovanna Galleno : bye Eric! :D
18:35:04 From Monica Royal : See you later Eric!
18:35:09 From Thom Ives : Matt - that's correct, but those hiring people that realize that portfolios of proven work are key, will look for that, BUT your portfolio of work, showcased online, better be first class!
18:36:02 From Lalita : see you later everyone
18:36:29 From Naresh Reddy : Thank you Monica, Dave, Giovanna, Joe, Vin, albert, Eric, Mikiko and all the attendees :)
18:36:34 From Joe Reis : Thanks all
18:36:59 From Thom Ives : Great job Harpreet!
18:37:09 From Thom Ives : Harpreet started this because he cares!
18:37:13 From Greg Coquillo : Awesome job Harpreet!!!
18:37:15 From Joe Reis : yep
18:37:18 From Giovanna Galleno : My pleasure! Thank you for your question @Naresh! :D
18:37:49 From Mikiko Bazeley : Harpreet you’re the best!
18:38:01 From Mikiko Bazeley : Virtual pub
18:38:16 From Ameya Dhaygude : You are awesome Harpreet! Thank you.
18:38:17 From Giovanna Galleno : Agree with Mikiko! :D
18:38:20 From Matt Housley : Thanks Harpreet!
18:38:28 From Monica Royal : This is so awesome, I love this event and you are helping so many! :D
18:38:35 From Akshay Mandke : pd.append(“refill”)
18:39:16 From Naresh Reddy : There is so much learning here and at o cost
18:39:20 From Juan Francisco : Happy Holidays
18:39:22 From Naresh Reddy : 0

18:39:28 From Monica Royal : Happy Holidays all
18:39:29 From Mikiko Bazeley : Happy holidays everyone!
18:39:32 From Ashen Rana : Thank you all for your company. And thank you as always, Harpreet. Happy holidays, Happy New year and stay safe everyone! <3
18:39:33 From Ameya Dhaygude : Happy Holidays Everyone. See you on Jan 8th.
18:39:36 From Liuna Issagholian : Thanks Harpreet. Happy Holidays everyone :)
18:39:37 From Thom Ives : God Bless Everyone!
18:39:38 From Akshay Mandke : Happy Holidays everyone. So glad to join this today.
18:39:46 From William Rodriguez : Happy holidays and thank you for letting me listen in on the conversation, very interesting and useful to hear from people who are in the field!
18:39:47 From Naresh Reddy : Happy holidays and New years !
18:39:49 From Harika Panuganty : Happy Holidays everyone! See you all in the new year. Really enjoyed happy hour today!

Chat Transcript from Data Science Happy Hours 13, Dec 11 2020 Sun, 13 Dec 2020 00:00:00 -0500 3ef4bed2-aa53-4149-ba66-dc5836e38072
16:37:17 From Harpreet Sahota : If you guys have questions shoot me a message and I will add you to the queue!
16:37:32 From Vipul Mehta To Harpreet Sahota(privately) : Hi Harpreet,
16:38:05 From Naresh Reddy : Can anyone please suggest datasets that are easy to start with for a data portfolio?
16:38:26 From Thomas Ives : SKLearn Practice Data Sets
16:38:32 From Camille Leonard : People often use these data sets
16:39:01 From Jake Beliveau : Hi Harpreet! I had to leave the video portion to look after my daughter. My main question is what is the most rewarding part of being a data scientist? Why are people going into this field?
16:40:05 From Akshay Adlakha : Hello Harpreet, I have a question. Can you please provide any suggestions regarding how to get first breakthrough in Data Science field. As I am a graduate student and looking for full-time opportunities. This is because my previous experience was in Software Development. So, just want to have some expert guidance to get first breakthrough in the industry.
16:40:31 From Himashree R S : I was reading an article and it said "None of your observed variables have to be normal in linear regression analysis, which includes t-test and ANOVA. The errors after modeling, however, should be normal to draw a valid conclusion by hypothesis testing." I always tried normalizing before. Am I wrong?
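A minimal sketch of the claim in the quoted article, using only synthetic data (the seed, sample size, and coefficients are assumptions chosen for illustration): the predictor is strongly skewed, yet the residuals of a simple least-squares fit stay roughly symmetric because the errors were generated as normal.

```python
import random

def skewness(values):
    """Sample skewness: third central moment over the 1.5 power of the second."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

rng = random.Random(42)  # fixed seed so the run is repeatable

# Synthetic data (an assumption for illustration): a right-skewed predictor
# and errors that are normal by construction.
x = [rng.expovariate(1.0) for _ in range(5000)]
y = [1.0 + 2.0 * xi + rng.gauss(0.0, 1.0) for xi in x]

# Closed-form simple linear regression (ordinary least squares).
mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
slope = sxy / sxx
intercept = mean_y - slope * mean_x
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

print(f"skewness of x:         {skewness(x):.2f}")
print(f"skewness of residuals: {skewness(residuals):.2f}")
```

The thing to inspect after fitting is the residuals, not the raw columns. Transforming a variable can still help for other reasons, such as linearity or stabilizing variance, but that is a separate question from the normality assumption.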
16:40:32 From Mikiko Bazeley : @Naresh: Quite a few of the datasets on Kaggle are good, especially if they’ve been used for a competition
16:40:38 From Vipul Mehta To Harpreet Sahota(privately) : I am working in Product Management as a Product Owner. My current role involves some data analysis through SQL and data visualization tools like Power BI. So my question is how to break into the Data Science field. Should I focus on new tools, or follow the path of statistical learning and learn R or Python?
16:40:49 From Eric Sims : Excel + dates = LOL
16:41:31 From Mikiko Bazeley : Also encoding
16:41:41 From Naresh Reddy : Thank you Thomas Ives, Camille Leonard and Mikiko Bazeley
16:43:08 From Mikiko Bazeley : Not old at all!
16:44:15 From Christian Capdeville : Question for those with experience delivering data presentations to business stakeholders: do you have a general framework you like to follow in your data presentations?
16:44:47 From Faraaz Sheriff To Harpreet Sahota(privately) : Hi Harpreet,
This is my first happy hour. I am quite excited to be part of it.
I have a question: How does one account for rare events like COVID in predictive models? Do you just skip a few months or come up with a correction factor in your models?

Thank you!
16:45:13 From Dave Langer : As a former PM at the Evil Empire I can tell you that Product Management is a great role to start building your analytics chops.
16:45:22 From Mikiko Bazeley : Agree with everything Thomas and Jennifer said and are saying.
16:45:40 From Mikiko Bazeley : +1 to Dave’s comment
16:45:56 From Dave Langer : Example - If you have access to event logs/telemetry data you've got a gold mine of opportunity.
16:48:03 From Dave Langer : Two words - Process Mining
16:48:03 From Christian Capdeville : Fantastic stuff Dave, thank you!
16:48:04 From Russell Willis : Hybrid solutions can be very useful in some circumstances, using existing transformation methods as a prelude to visualisation in modern BI tools, i.e. SQL/Python/R transformation prior to Power BI/Tableau/Qlik visualisation, etc.?
16:48:14 From Ray Givler : Tableau Server has good audit data related to what Dave was talking about.
16:48:18 From Mark Freeman : Can you dive more into process mining? I would love to hear more about that.
16:48:45 From Dave Langer : Market basket analysis is also very useful in Prod Mgmt Analytics.
16:48:58 From Jennifer Nardin To Harpreet Sahota(privately) : +1 deep dive on process mining
16:49:11 From Mikiko Bazeley : One of my roles was working as a Data Scientist focused on Product Adoption for Autodesk, so it’s definitely a thing
16:49:26 From Dave Langer : Titanic
16:49:29 From Eric Sims : @Naresh - What are you interested in?
16:49:45 From Jennifer Nardin : +1 deep dive on process mining
16:49:50 From Naresh Reddy : @Eric Sports
16:50:01 From Eric Sims : @Dave - shots fired. Titanic hit, sunk.
16:50:06 From Ray Givler : @Christian - do you mean a static presentation or an interactive dashboard?
16:50:23 From Dave Langer : Titanic is great for initial skill-building.
16:50:51 From Russell Willis : Any data set for which you can intuitively tell that incorrect output is incorrect is a great source to start a transformation/visualisation journey with... Then progress to more complicated ones, taking previous learnings with you.
16:51:16 From Dave Langer : Titanic isn't 100% clean, the classification problem isn't trivial if you limit yourself to the data at hand, it is ripe for feature engineering, imputation.
16:51:34 From Dave Langer : Oh, and you can use it for learning market basket analysis as well. :-o
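One way to read Dave's remark, sketched under assumptions: treat each passenger's categorical attributes as the "items" in a basket, then compute the two basic market basket measures, support and confidence. The five rows below are made up for illustration and are not taken from the real Titanic data; the helper names are hypothetical.

```python
def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)

def confidence(baskets, antecedent, consequent):
    """Estimated P(consequent | antecedent) from basket counts."""
    both = set(antecedent) | set(consequent)
    return support(baskets, both) / support(baskets, antecedent)

# Made-up passenger rows (NOT real Titanic records), one set of items per passenger.
baskets = [
    {"class=1st", "sex=female", "survived=yes"},
    {"class=3rd", "sex=male", "survived=no"},
    {"class=2nd", "sex=female", "survived=yes"},
    {"class=3rd", "sex=female", "survived=no"},
    {"class=1st", "sex=male", "survived=no"},
]

rule_support = support(baskets, {"sex=female", "survived=yes"})
rule_confidence = confidence(baskets, {"sex=female"}, {"survived=yes"})
print(rule_support, rule_confidence)
```

The same two functions work unchanged on product data, for example sets of features used per account, which is closer to the Prod Mgmt use Dave mentions.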
16:51:44 From Christian Capdeville : @ray - either, really. Those are probably two different types of findings you would be discussing, but I'm interested in frameworks others lean on for bringing stakeholders up to speed with your data findings
16:52:37 From Eric Sims : @naresh - Sean Sullivan is into baseball. You can check him out here:
16:52:38 From Timothy Gordon : Agree with Brandon. Collecting your own data and making conclusions on something that interests you can lead to a great and different project
16:52:48 From Russell Willis : LinkedIn now allows each user to request a complex suite of their own data, which can also be a good source, for you to review your own activities...
16:53:06 From Eric Sims : @Harpreet, if I can get in the queue, I've got a question about "deploying" a model/app
16:53:20 From Harpreet Sahota To Eric Sims(privately) : Added!
16:53:26 From Eric Sims To Harpreet Sahota(privately) : Thanks!
16:55:06 From Eric Sims : @Akshay - I'm still working on my breakthrough, but LinkedIn has been awesome for me. Being authentic. Taking the time to participate and get to know people and companies makes a big difference.
16:55:58 From Dave Langer : T-Shaped Professional, Monica
16:55:59 From Mikiko Bazeley : Back online!
16:56:10 From Harpreet Sahota To Mikiko Bazeley(privately) : Ok cool - ill get you next
16:56:20 From Albert Bellamy : The "Superpower"
16:56:22 From Mark Freeman : @Akshay Another piece of advice I received from mentors was playing on my domain expertise. I was able to get my first DS job because of my deep knowledge of healthcare, knowing enough stats/python, not for my coding skills.
16:57:05 From Ray Givler : @Christian - basically, know the customer, know their goals - get some KPIs tied to those goals, graph the trends in those KPIs, determine what data correlate with those goals, graph that too, and look for outliers in those that they can take action on to move their KPIs and ultimately achieve their goals. PM me in LinkedIn for more.
16:57:55 From Naresh Reddy : Thanks @Eric
16:57:56 From Akshay Adlakha : Thanks Eric and Mark for your valuable feedback.
16:58:25 From Abe Diaz : Hello,
16:58:53 From Thomas Ives To Harpreet Sahota(privately) : Let me add just ONE extra thing at the end.
16:58:54 From Russell Willis : "Data Science" is a very broad field!
16:58:56 From Abe Diaz : Any insight on wealth data analytics? That's the field I want to be in.
16:59:10 From Harpreet Sahota To Thomas Ives(privately) : Ok, go for it
17:00:54 From Albert Bellamy : Drax: I wasn't listening, I was thinking of something else.
17:01:02 From Russell Willis : I think that is a great method for many initiatives - Identify the issues first, then work on the most appropriate and urgent solutions!...
17:02:26 From Thomas Ives : Hi Everyone, Can you send me a LinkedIn Connection request? I've embarrassingly exceeded my connection request quota.
17:02:28 From Christian Capdeville : Gotta run - looking forward to catching the rest on youtube- thanks everyone! have a great weekend
17:02:37 From Eric Sims : Later Christian!
17:04:30 From Thomas Ives : Bye Christian!
17:04:36 From Mikiko Bazeley : Worth thinking about:
17:04:44 From Mikiko Bazeley : The two schools of thought
17:05:02 From Mikiko Bazeley : Statisticians vs ML school of thought
17:05:44 From Dave Langer : If you're interested in Process Mining, here's an awesome Coursera course:
17:05:58 From Mark Freeman : Thanks @Dave!
17:06:22 From Dave Langer : BTW - It was the single best Coursera class I have taken to date.
17:06:31 From Jennifer Nardin : @Dave - thanks! Coursera is RICH with content
17:06:36 From Mark Freeman : I work with a lot of event logs… so really excited!
17:08:27 From Dave Langer : Ben!!!!
17:10:06 From Russell Willis : Extraordinary events like extreme weather can be modelled with historical patterns, but COVID was VERY extraordinary, so is it very difficult to account for?
17:10:25 From Eric Sims : Can't you use an intervention in time series data? Basically a 0,1 flag
17:11:59 From Dave Langer : Eric - Depends on what algo you are using.
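A minimal sketch of the 0/1 intervention flag Eric describes, with made-up monthly numbers and a hypothetical helper name. It assumes the simplest possible model, an intercept plus the flag, where the least-squares coefficient on the flag equals the difference between the mean inside and outside the flagged window. As Dave notes, whether and how a flag like this can be used depends on the algorithm.

```python
from datetime import date

def intervention_flag(dates, start, end):
    """Return 1 for dates inside [start, end] and 0 otherwise."""
    return [1 if start <= d <= end else 0 for d in dates]

# Made-up monthly series with a dip from March to June 2020.
months = [date(2020, m, 1) for m in range(1, 13)]
sales = [100, 102, 60, 55, 58, 62, 99, 101, 103, 100, 102, 104]

flag = intervention_flag(months, date(2020, 3, 1), date(2020, 6, 1))

# With only an intercept and one 0/1 regressor, the fitted coefficient on the
# flag is the gap between the two group means.
inside = [y for y, f in zip(sales, flag) if f == 1]
outside = [y for y, f in zip(sales, flag) if f == 0]
effect = sum(inside) / len(inside) - sum(outside) / len(outside)
print(flag, effect)
```

In a fuller model the flag would be passed as an extra regressor next to trend and seasonality terms, so the dip is absorbed by the flag instead of distorting the other estimates.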
17:19:28 From Russell Willis : @Greg That could really help if you identify a "Perfect Wave" of sub impacts!!
17:19:51 From Vipul Mehta : Thank you everyone for your insights and explaining things clearly. See you all in the next session. Need to drop.
17:19:59 From Greg Coquillo : @Russell, that's right!
17:20:22 From Dave Langer : If folks are interested in a different perspective on using statistics to analyze business data, my single favorite book on data analysis:
17:20:23 From Dave Langer :
17:22:25 From Russell Willis : Are there also time targets within SLA's that need to be accounted for?
17:23:57 From Mikiko Bazeley : So true!
17:26:31 From Ben Taylor : ha!
17:27:04 From Mikiko Bazeley : +1 Ben
17:28:29 From Ben Taylor : eternal consciousness work
17:29:39 From Russell Willis : Problem solving is a great skill and the payoff of cultivating solutions is great... Data Science provides lots of problems, so if you like problem solving it can be VERY rewarding, but also challenging and on occasion frustrating!
17:32:28 From Camille Leonard : If anyone would like to connect on LinkedIn, I'd love to connect with you!
17:33:12 From Thomas Ives : Mikiko, Great answer!!!
17:34:42 From Ray Givler : Gotta go! Thanks everyone!
17:34:49 From Jennifer Nardin : Gotta run; thanks for another great Happy Hour!
17:34:57 From Thomas Ives : Bye Jenn!
17:36:59 From Ben Taylor : data science is the ether in the world. it connects everything. I find myself interacting with linguists, psychologists, engineers, doctors, it surrounds us. It feels like a powerful magic where impossible is redefined every few years. I would have never imagined we would be doing the things we are doing today.
17:37:18 From Ben Taylor : ha!
17:38:08 From Didier Muvandimwe To Harpreet Sahota(privately) : Hi Harpreet,
My name is Didier and have a question about starting out in Data Science. I am a maintenance engineer transitioning into this field.
17:42:39 From Ben Taylor : gotta drop, thanks for hosting
17:42:40 From Thomas Ives :
17:42:42 From Mark Freeman : My M.S. has been extremely helpful for domain experience in an area I love and learning research methods, the data science was learned outside of school.
17:43:21 From Mikiko Bazeley :
17:43:28 From Mark Freeman : Domain knowledge*
17:43:46 From Harpreet Sahota : GitHub:
17:44:02 From Mikiko Bazeley : I think this one was also good:
17:44:34 From Evangelos Tzimopoulos : Here's a business question, if there's time, that I'm sure a lot of data scientists relate to. How do you balance more EDA to get a better understanding of your dataset vs. making quick steps forward to produce a prototype that you might not even be able to explain sometimes due to bad data :)? Especially when there's pressure from the business to tick the boxes early and deal with data later?
17:45:36 From Eric Sims : ^ I like this question! Practical.
17:46:42 From Mikiko Bazeley : So by interesting we mean “pain in the butt"?
17:46:46 From Russell Willis : @Evangelos I am also in London. Good to see you here!
17:47:41 From Faraaz Sheriff To Harpreet Sahota(privately) : Thank you for having me, Harpreet. I would have to drop off. Catch you next week
17:48:51 From Harpreet Sahota To Faraaz Sheriff(privately) : Cheers thanks for comin
17:52:02 From Russell Willis : @Thomas Every once in a while you need to prepare a meal from the back of the refrigerator, to remind yourself to keep an eye on quality and freshness!!
17:52:30 From Eric Sims : ^ Ha! Love that.
17:52:44 From Thomas Ives : I agree with that.
17:54:04 From Thomas Ives : Mikiko's answer is spot on!
17:55:24 From Greg Coquillo : Indeed!
17:56:01 From Timothy Gordon : Great point Mikiko and Brandon!

17:56:01 From Russell Willis : @Mikiko What they need vs. What they want can sometimes be quite a gauntlet to run... until the realisation lands!
17:56:10 From Mikiko Bazeley : Absolutely!
17:56:47 From Mikiko Bazeley : Sometimes your business partners can be the hammer that gets the data cleared up
17:57:22 From Timothy Gordon : Great answer Monica!
17:57:41 From Monica Royal : Thank you @Timothy
17:58:12 From Evangelos Tzimopoulos : hey @Russell, good to be here.
17:58:59 From Evangelos Tzimopoulos : All, thank you for your insights, was great to be here. Here's my linkedin profile if you'd like to connect and continue the great chat online -
18:01:52 From Eric Sims : Managers like Brandon make the world a better place.
18:01:57 From Didier Muvandimwe : Hey all, Would like to connect with you and stay in touch.
18:01:59 From Thomas Ives : Thought you guys might want to support Camille's post about our office hours today
18:02:13 From Timothy Gordon : Great question Mark!
18:02:39 From Camille Leonard : I won't be able to make next week. Looking forward to next year!
18:02:46 From Melania : Great points! Thanks everyone!
18:03:22 From Eric Sims : Dave loves R!
18:03:46 From Russell Willis : Great to have been here. Thanks everyone!

Chat Transcript from Data Science Happy Hours 12, Dec 4 2020 Sun, 06 Dec 2020 00:00:00 -0500 f2790ddf-a1ed-4189-82e0-7be1396faa67
16:31:14 From Eric Sims : Ray, your hoodie is awesome!
16:31:22 From Ray Givler : Thanks, man
16:31:30 From Shantanil Bagchi : Congratulations to your sister Harpreet
16:31:38 From Naresh Reddy : Hi Harpreet!
Does it read ''CloaseBy Card"?
16:31:45 From Naresh Reddy : Close*
16:31:46 From Harpreet Sahota : CloseBuy
16:33:01 From Nicole Bills :
16:33:20 From Harpreet Sahota : Thanks Nicole!
16:34:35 From Naresh Reddy : Thanks Harpreet and Nicole!
16:34:59 From Carlos Mercado : Harpreet this shit is so poppin
16:35:48 From Haseeb Mohammed : fireemoji.png
16:36:45 From Haseeb Mohammed : Dialectical behavior therapy (DBT)
16:36:51 From Joe Reis : Data Build Tool
16:37:00 From Haseeb Mohammed : (kidding)
16:37:01 From Haseeb Mohammed :
16:37:10 From Joe Reis : lolz
16:37:59 From Dave Langer : AnalyticsOps!!!
16:38:05 From Carlos Mercado : Oh my god. Analytics Ops.
16:38:09 From Eric Sims : @monica - Full stack analytics engineer
16:38:12 From Carlos Mercado : So DevOps, DevSecOps, AIOps, MLOps.
16:38:19 From Haseeb Mohammed :
16:38:48 From Carlos Mercado : @ Eric lmaooooo dude; I can't. Full stack Analytics Engineer? This is not a real job title.

"Can do SQL and also make nice visuals and also is aware of AWS".
16:38:55 From Mikiko Bazeley : It’s a well -loved tool by Zenefits
16:39:26 From Mikiko Bazeley : My friend Sean is director of analytics there, and they implemented it a couple of years ago
16:39:51 From Mikiko Bazeley : Another good post on “analytics engineer”:
16:40:07 From Monica Royal : Haha! Yes @Eric... the mystical unicorn
16:40:10 From Timothy Gordon : Great follow-up question Dave. Definitely depends on the environment
16:41:07 From Haseeb Mohammed : the monsters in your head
16:41:08 From Haseeb Mohammed : !!
16:41:51 From Dave Langer : Ben!!!!!
16:41:58 From Ben Taylor : Dave!!
16:43:32 From Ben Taylor :
16:43:46 From Ben Taylor : Price wasn’t public, but someone figured out it was $34M? If I remember right
16:43:56 From Joe Reis : That’s it?
16:43:59 From Ben Taylor : I don’t think they had much revenue
16:44:10 From Joe Reis : Acqui-hire
16:44:23 From Ben Taylor : It was really to get Richard:
16:45:38 From Nicholas Lowthorpe : Hope that's tea in that mug Dave
16:45:46 From Ben Taylor : Question: Why did DataRobot buy Zeff?
16:46:00 From Dave Langer : I'm not telling what's in the mug. ;-P
16:46:01 From Carlos Mercado : Ben shouldn't you be answering that one XD
16:46:32 From Haseeb Mohammed : lol
16:46:32 From Joe Reis : DR bought Zeff cuz of you and Gonzo?
16:48:21 From Nicole Bills : Thank you - that was an awesome debrief
16:48:47 From Shantanil Bagchi : Mikiko..thank you
16:49:11 From Mikiko Bazeley : Totally!
16:50:34 From Carlos Mercado : (1) A GitHub with actual READMEs and a commit history that shows they used GitHub while making the project instead of dumping it on there at the end.
16:51:39 From Mikiko Bazeley : YES!!!!!
16:51:42 From Mikiko Bazeley : LaTeX
16:53:01 From Mikiko Bazeley : If it’s on the resume, it’s fair game also
16:53:54 From Matt Housley : I have a love/hate relationship with LaTeX
16:54:10 From Mikiko Bazeley : Use LaTeX to generate the initial template and then edit in Word
16:54:44 From Carlos Mercado : Does LaTeX pass ATS - I assume so, but looking at the backend of LaTeX outputs sometimes, I wonder if it can get busted depending on what program opens it
16:54:45 From Ben Taylor : We’re hiring data science right now… just saying.
16:54:53 From Haseeb Mohammed : ;)
16:55:00 From Ben Taylor :
16:55:01 From Carlos Mercado : Ben say it on the audio - you'll get drowned in applications from listeners.
16:55:07 From Sasha Prokhorova : I’ve been advised recently to include some soft skills on the resume, such as leadership, communication etc etc. But for me personally it looks unnecessary and even redundant. What do you guys think? Is it worth it to allocate the valuable real estate to this?
16:55:20 From Mikiko Bazeley : Same Dave
16:55:43 From Carlos Mercado : My opinion on soft skills is you tell business stories that show the skill; don't say "leadership" tell a story that makes it obvious you led.
16:56:08 From Ben Taylor : I have a really funny hedge fund resume story...
16:56:08 From Eric Sims : I wrote a data science slam poem to show I had some understanding of stats and data. It made me stand out, and it worked.
16:56:12 From Carlos Mercado : Like when LinkedIn came out with that article "oral communication is in demand" I was like - nobody talks like that.
16:57:13 From Ben Taylor : I’ve only hired PHDs and a college drop out… I’m missing the middle...
16:57:19 From Ben Taylor : Time for me to hire a BS/MS… :)
16:57:30 From Haseeb Mohammed : working on my masters, dec 2021!
16:57:58 From Eric Sims : Graduating in May and scrolling through the Careers page now :)
16:58:13 From Carlos Mercado : No Eric, you're earmarked for Guidehouse stop
16:58:30 From Eric Sims : Haha, perfect
16:58:33 From DavidTello : I got hired at the Federal Home Loan Bank two weeks before the pandemic and I think what stood out in my resume was a project that I did on interest rates and my prediction on having a recession in 2020. I did the project back on Oct of 2019.
16:59:32 From Carlos Mercado : I'm the 2006 Time Person of the Year.
16:59:36 From Haseeb Mohammed : this
16:59:47 From Eric Sims : @Carlos - Hired.
16:59:48 From Carlos Mercado :
17:00:01 From Ben Taylor : Hire slow, fire fast
17:00:10 From Joe Reis : ^ this
17:00:17 From Ben Taylor : If only I could fire people faster… (life goal)
17:00:28 From Joe Reis : Automate jerbs
17:00:33 From Christian Capdeville : show vs tell
17:00:39 From Haseeb Mohammed :

Six things I learned as a Software Engineer trying out Machine Learning - Haseeb Mohammed
17:00:39 From Matt Housley : Async firing?
17:00:48 From Haseeb Mohammed : my chipy talk from earlier this year
17:00:53 From Christian Capdeville : I have a bar graph on my resume. If you can do data viz - show it!
17:01:10 From Ray Givler : ha
17:01:21 From Ben Taylor : Thanks for doing this @Harpreet! Gotta run to a meeting!
17:01:28 From Joe Reis : Later Ben
17:01:30 From Harpreet Sahota : Thanks for coming ben!!!
17:01:32 From Haseeb Mohammed : thanks Ben!
17:02:53 From Nicole Bills : is about Carl Gold’s excellent tips for churn analytics
17:03:35 From Carlos Mercado : (I have a small contention with bringing up outliers, it's technically correct, I just worry beginners over-engineer outliers to their detriment) but it's a good article
17:03:59 From Nicholas Lowthorpe : The bit about including projects in your resume is that it gives you an opportunity to focus on a result and not a process
17:04:21 From Christian Capdeville : Very similar to the classic elevator pitch format: "I help X with Y by doing Z"
17:04:39 From Nicole Bills : re: outliers - I think beginners over-engineer a lot of data science preprocessing efforts
17:04:55 From Haseeb Mohammed : the oldest stuff on my resume is summed up in a single sentence. my recent stuff is more spelled out for each project
17:05:35 From Carlos Mercado : I love to use catch phrases on my resume. I have "Technical Sales lead for the AI group - win big work more often". I was product manager overseeing a document summarization tool and said "hours of reading, in minutes".

Throw in copywriting; it's a whole skill and it shows communication skill.

17:05:48 From DavidTello : I would limit showing work experience to no more than 10 years and eliminate anything that is not that relevant to the position you are applying for
17:06:06 From Nicholas Lowthorpe : +1 for copywriting Carlos
17:06:33 From Haseeb Mohammed : anomaly detection
17:06:37 From Haseeb Mohammed : cascading learning
17:06:40 From Haseeb Mohammed : SMOTE
17:06:45 From Haseeb Mohammed : undersampling
17:06:47 From Haseeb Mohammed : oversampling
17:08:07 From Greg Coquillo To Harpreet Sahota(privately) : I do have a question
17:08:07 From Haseeb Mohammed : "Learning from Imbalanced Data Sets"
17:08:16 From Carlos Mercado : (1) Verify it's actually a problem. 20%, probably not a problem to be honest. 3% - that's where things get tough. Fraud detection especially.
(2) What makes sense in the business context (false positives vs false negatives have costs and your options will mess with this balance)
(3) then do the mathy stuff and test it out on a training sample.

(3b) split test/train/validate prior to re-sampling. This is something people actually forget.... Posted an article about it.
17:09:27 From Carlos Mercado : There are also situations where resampling works on training data and is useless in production. You don't get to resample in production. So again, (1) verify it's actually a problem.
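A minimal sketch of point (3b), assuming toy data and plain random oversampling (the function name, the 3% positive rate, and the 25% test fraction are illustrative choices, not anything from the article Carlos mentions). The split happens first, and only the training part is resampled, so the test set keeps the class balance the model would face in production.

```python
import random

def split_then_oversample(rows, label_of, test_fraction=0.25, seed=0):
    """Split first, then oversample smaller classes in the training part only."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    test, train = shuffled[:n_test], shuffled[n_test:]

    # Group the training rows by class label.
    by_class = {}
    for row in train:
        by_class.setdefault(label_of(row), []).append(row)

    # Duplicate random rows of the smaller classes until each class
    # matches the size of the largest one.
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    rng.shuffle(balanced)
    return balanced, test

# Toy rows of (id, label) with 3% positives, the range Carlos calls tough.
rows = [(i, 1 if i < 30 else 0) for i in range(1000)]
train, test = split_then_oversample(rows, label_of=lambda r: r[1])

print(len(train), len(test))
print(sum(r[1] for r in test) / len(test))  # positive rate in the untouched test set
```

Resampling before the split would let copies of the same minority row land on both sides, which inflates test scores. Keeping the test set untouched also makes the gap Carlos warns about visible, because the evaluation data still looks like production.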
17:09:31 From Timothy Gordon : Appreciate the responses everyone!
17:10:08 From Harpreet Sahota To Greg Coquillo(privately) : Ok for sure!
17:10:15 From Haseeb Mohammed : "There are also situations where resampling works on training data and is useless in production. You don't get to resample in production."
17:10:18 From Haseeb Mohammed : 100%
17:10:28 From Haseeb Mohammed : in training you can be really happy with your oversampled/undersampled dataset
17:10:49 From Timothy Gordon : Can you share the article on 3b Carlos?
17:11:12 From Sasha Prokhorova : Lol skill bubbles :))
17:11:40 From Haseeb Mohammed : someone said it above, everything on the resume is fair game
17:12:04 From Haseeb Mohammed : during our software engineering interviews, i pick out the obscure stuff on the resume and ask them why they added it to their resume, what they've done with it.
17:12:16 From Carlos Mercado : Amen ^
17:12:18 From Haseeb Mohammed : i loathe the skills section, when it doesn't include the project you used those skills on
17:12:21 From Shantanil Bagchi : Thanks Haseeb
17:12:26 From Mikiko Bazeley : David’s comment on real estate pleases me
17:12:32 From Sasha Prokhorova : What if a person is trying to show the bubbles for the comparison between the skills? For instance, they are more comfortable with SQL than Python etc.
17:12:44 From Carlos Mercado : nah
17:12:57 From Carlos Mercado : you're either ready to create value with the skill or not. It's binary.
17:13:03 From Timothy Gordon : Great comment Haseeb definitely want to highlight your abilities with actual proof
17:13:14 From Haseeb Mohammed : you're coming in for a junior position, i'm going to assume you can't do anything anyway -- software related
17:13:40 From Nicholas Lowthorpe : I had a 45 minute conversation with a data and analytics recruiter the other day about the state of recruitment. We had a half-joking conversation about the use of numbers to skills, where we concluded that the only ranking out of 5 you can realistically give yourself is a 4. If you list yourself 3 out of 5 you're saying you're average, and you'd never write on your CV "average at python". 5 is essentially saying 'mastery' because there's no room to improve. Anything less than 3 is a skill you wouldn't list on your CV. So you'd list everything at 4!
17:14:00 From Nicholas Lowthorpe : (hence don't use numbers)
17:14:05 From Carlos Mercado : People actually say "average" "beginner" I hate it, why say "I suck" on a sales document!
17:14:14 From Carlos Mercado : Resumes are a sales document.
17:14:16 From Haseeb Mohammed :
17:14:21 From Shantanil Bagchi : Nicely said Nicholas
17:14:23 From Mikiko Bazeley : Know what you’re selling with your resume
17:14:26 From Mikiko Bazeley : period
17:14:30 From Carlos Mercado : Amen ^
17:14:33 From Haseeb Mohammed : 100%
17:14:35 From Christian Capdeville : Solid points Nicholas and carlos
17:14:41 From Mikiko Bazeley : Are you selling features or a solution?
17:15:52 From Carlos Mercado : RE: domain expertise
Read case studies - like 3-8 page things. You want to have a bunch of stories.

IBM/Walmart Blockchain
RStudio - Blog
17:16:20 From Joe Reis : I heard some apocryphal tale that Guido Van Possum (the creator of Python) rated his Python skills 6/10. So there’s that.
17:16:26 From Joe Reis : Rossum
17:16:27 From Carlos Mercado : haha
17:16:30 From Haseeb Mohammed : lol
17:16:44 From Monica Royal : That is great advice @Nicole!
17:17:11 From Dave Langer : These days you can also find articles, books, and online courses that are dedicated to analytics for particular business domains.
17:17:20 From Nicholas Lowthorpe : Nicole Bills - that is absolutely brilliant. I literally just googled and found my local authority has a ton of public data for anonymised people movement through mobile cell towers and air quality
17:17:25 From Nicholas Lowthorpe : Awesome advice
17:17:52 From Dave Langer : For example:
17:17:53 From Dave Langer :
17:18:37 From Nicole Bills : lol I like Van Possum
17:19:32 From Joe Reis : Autocorrect yields a cooler Guido
17:19:40 From Haseeb Mohammed : i leave my only shiny nugget of advice for resumes, no full sentences
17:20:07 From Nicole Bills : Amazon does like the STAR method
17:20:53 From Nicole Bills : Their 14 Leadership Principles are kind of cult-y, but effective
17:21:02 From Carlos Mercado : If you haven't watched the AlphaGo movie:
Alpha stuff is crazy
17:21:39 From Shantanil Bagchi : really mindboggling...the top player was so shocked initially
17:21:42 From DavidTello : I think it matters if the sample space was truly random @Greg
17:21:48 From Carlos Mercado : I cried during the movie for real
17:24:15 From Brandon Quach : This is great! I learned so much from all the experts on this call. I’ve got to get back to work (Pacific Time. California).
17:24:16 From Nick Urban To Harpreet Sahota(privately) : Congrats, biggest OH I’ve seen!
17:24:29 From Brandon Quach : Bye everyone!
17:24:35 From Harpreet Sahota To Nick Urban(privately) : Insane
17:24:36 From Shantanil Bagchi : Thanks Brandon
17:24:36 From Eric Sims : Bye, Brandon!
17:24:36 From Austin Loveless : Bye Brandon! Have a good one
17:26:09 From Haseeb Mohammed : ive got to jet as well, take care folks!
17:26:12 From Carlos Mercado : @ Mark
17:26:17 From Haseeb Mohammed : thanks Harpreet!
17:26:24 From Carlos Mercado : Fundamentals of products is fundamentals of software engineering.
17:26:38 From Austin Loveless : Have a good one Haseeb!
17:26:54 From Carlos Mercado : Specific to your question Mark:
17:27:23 From Carlos Mercado : To maybe speak for Dave - I think the answer is "Fast" first; Scale when it hurts.
17:27:41 From Ray Givler : Gotta go. Thanks everyone!
17:27:50 From Eric Sims : See you, Ray!
17:29:27 From Joe Reis : Later Ray
17:30:06 From Carlos Mercado : There's a difference between Open Source & Free and Open Source - learning that the hard way
17:30:13 From Eric Sims : Dave, you ask great clarifying questions! It's really helpful.
17:30:33 From Carlos Mercado :
17:30:50 From Timothy Gordon : ^ agree great questions Dave! Learning a lot from everyone here
17:31:49 From Carlos Mercado To Harpreet Sahota(privately) : Harpreet, this has been a sick one. I think a small tweak would be including a separate LinkedIn post or email address in the videos for people to ask questions asynchronously to bring up here.
17:32:14 From Harpreet Sahota To Carlos Mercado(privately) : Yeah - good point
17:33:14 From Nicole Bills : Brent

17:33:23 From Carlos Mercado : Thank you! Brent*
17:33:24 From Jacqueline Lefèvre López : I've got to go, thank you for all the advice :) this was great!
17:33:28 From Carlos Mercado : I just read it too LOL
17:33:39 From Dave Langer : Joe - BTW, I have supported a DB2 system on Z-Series at one point back in the 1800s
17:33:55 From Austin Loveless : Have a good one Jacqueline!
17:34:27 From DavidTello : COBOL :)
17:34:34 From Dave Langer : Hold on, COBOL is not dead!
17:34:41 From DavidTello : Joking
17:34:46 From Dave Langer : :-)
17:34:48 From Joe Reis : dave'
17:34:52 From Carlos Mercado : 01000011 01001111 01000010 01001111 01001100 00100000 01110011 01110101 01100011 01101011 01110011
17:34:57 From Joe Reis : Has been writing Cobol since 1868
17:35:07 From Sasha Prokhorova : I had a podcast episode on COBOL not so long ago - it appears to be very much alive
17:35:25 From Dave Langer : Those were the good old days. 6-shooters, snake oil, and COBOL!
17:35:35 From Mark Freeman : That was extremely helpful!!! Thanks everyone!
17:35:37 From DavidTello : please share the URL, I would like to listen to it @Sasha
17:35:56 From Venkataramana Maram : samr here with ...
17:36:12 From Venkataramana Maram : same here with
17:37:33 From DavidTello : Don’t take that for granted @Sasha. Most programmers/DS are seem as introverts
17:37:55 From Joe Reis : You can show your communication skills by giving talks (and getting them recorded)
17:38:13 From Mikiko Bazeley : I was thinking with covid
17:38:17 From DavidTello : Try Toastmasters, it helped me a lot
17:38:24 From Carlos Mercado : I gotta bounce, but thank you all, these are so great!!
17:38:31 From Joe Reis : Later carlos
17:38:31 From Eric Sims : Later, Carlos!
17:38:38 From Nicole Bills : Bye Carlos!!
17:38:39 From Timothy Gordon : seconding David with Toastmasters
17:38:44 From Naresh Reddy : I got a qucik question, Harpreet!
17:38:51 From Austin Loveless : By Carlos
17:38:58 From Austin Loveless : Bye*
17:39:34 From Nicholas Lowthorpe : Anyone who's interested in toastmasters and might want to take it to the next level - there is a global public speaking competition in STEM called FAMELAB - runs every year
17:39:34 From Eric Sims : This is a chat message Easter Egg for the NLP challenge. Message me on LinkedIn if it's 2021 and you found this...
17:39:43 From Nicholas Lowthorpe : it's a lot of fun
17:39:52 From Mikiko Bazeley : Hey Naresh could you speak up
17:40:01 From Nicole Bills : Data viz based on transcript NLP
17:40:21 From Mark Freeman : I have to get back to work. Always appreciate the knowledge shared here!
17:40:29 From Joe Reis : Later Mark
17:40:38 From Nicole Bills : Going to hop offline as well - thanks all!
17:40:40 From Austin Loveless : Have a good one Mark!
17:40:49 From Joe Reis : Later Nicole
17:40:53 From Austin Loveless : Later Nicole! Appreciate your inisights!
17:40:58 From Austin Loveless : insight* wow haha
17:41:37 From DavidTello : Thank you @Nicholas, I never heard of FAMELAB before, I will check it out
17:42:06 From Timothy Gordon : Haven't heard of FAMELAB either thanks!
17:42:09 From Nicholas Lowthorpe : There are heats in most places - the format is 3 minutes, no notes, talk about a topic to an audience and a panel, Q&A after
17:42:25 From Austin Loveless : Same! I'll check it out Nicholas :)
17:43:02 From Nicholas Lowthorpe : I took part twice, I progressed to Northern UK and won my first time, then the year after I got too confident and went down in a ball of flames in round 1 :)
17:45:24 From Joe Reis : If you’re in college, join something like the debate team. I found debate super useful for developing my communication skills
17:46:27 From Joe Reis : Also, Warren Buffett says one of his best investments was a public speaking course
17:46:28 From Joe Reis :
17:49:24 From DavidTello : Wolf of Wall Street
17:52:09 From Sasha Prokhorova : I could use some public speaking classes myself!
17:53:10 From Mikiko Bazeley : I swear I got that exact line from my early business partners as feedback on my presentations
17:53:25 From Dave Langer : More Meetup talks, less deep learning, people!
17:53:28 From Dave Langer : ;-P
17:53:30 From Matthew Blasa : Yup. My boss told me once not to puke data.
17:53:35 From Joe Reis : yep
17:54:09 From Venkataramana Maram : pls write in chat ...Harpreet for books
17:54:12 From Joe Reis : Also, read “never split the difference”
17:54:31 From Nicholas Lowthorpe : robert cialdini - pre-suasion
17:54:33 From Nicholas Lowthorpe : was one of them
17:54:34 From Matthew Blasa : Meditations has a graphic novel?
17:54:35 From Nicholas Lowthorpe : awesome book
17:54:36 From Joe Reis : And influence
17:54:48 From Mikiko Bazeley : Crucial Conversations
17:54:49 From Austin Loveless : The art of selling anything was another I believe
17:54:55 From Austin Loveless : To sell as human I think the third?
17:54:55 From Joe Reis : I ran into Robert last year. A new version of Influence is coming out soon
17:54:58 From Venkataramana Maram : great
17:55:03 From Venkataramana Maram : thanks

Chat Transcript from Data Science Office Hours 10 Sun, 22 Nov 2020 00:00:00 -0500 fbcf895a-fb2b-4da0-9a79-566196f269ab CHAT TRANSCRIPT

16:54:07 From Carlos Mercado :
16:56:10 From Christian Capdeville : Excellent share Carlos - thank you!
16:57:12 From Kate Strachnyi : hey! I’m with the kids so won’t be joining audio this evening lol. it would be too loud and crazy - good to see you all here :)
16:57:28 From Harpreet Sahota : Thanks for coming Kate! Your presence is felt!
16:57:42 From Thomas Ives : I can relate to that Kate.
16:57:57 From Carlos Mercado : What % of your week is spent reading relevant domain or data science papers / reading newsletters, etc? What newsletters or journals are you following?
16:58:37 From Thomas Ives : Never enough Carlos. Maybe 10%, but my manager and I are thinking I could take more.
16:59:18 From Thomas Ives : Excellent point on projects Gio!
16:59:38 From Ben Taylor : 5% reading audible books since I can multi-task (hike, backcountry ski, workout, bike, etc..). My books are mostly business focused now, less technical.
16:59:51 From Ashen Rana : Currently busy with learning new tools like Snowflake, DBT, and Airflow so I’ve been in the weeds reading documentations and tutorials! What newsletters do you follow, Carlos?
16:59:54 From Carlos Mercado : Just downloaded audible!
16:59:58 From Haseeb Mohammed : i'm overwhelmed with the amount of content available, ive recently unfollowed a bunch of stuff to focus just on databricks for the last month or so.
17:00:06 From Ashen Rana : Finishing up Thinking in systems book also!
17:00:51 From Sasha Prokhorova : I’m reading the Thinking in Systems right now too!
17:01:49 From Harpreet Sahota : If you guys have a question for the crew here type it out and hold your place in “line”!
17:02:34 From Eric Sims : Question: What is unit testing? What is it used for?
17:02:57 From Carlos Mercado : My newsletters are:

Andriy Burkov's AI Newsletter
Emerging Tech Brew
Data Elixir (just joined this one)
The Forecast

17:03:09 From Carlos Mercado : It's a lot but its both technical, pop-AI, and Sales.
17:05:44 From Ashen Rana : Nice list Carlos! Will check em out
17:07:01 From Ashen Rana : Could you please speak to “a significant shortage of business domain experts in the data science and analytics space”? What’s your observation? Prediction?
17:07:25 From Ashen Rana : ^ quote from Jason Krantz’s post
17:08:01 From Kate Strachnyi : not sure if this is the right forum to do this but I had a request today from a contact that’s looking to hire -Bank of America account team for Pega. The bank is looking to do a direct hire of a Data Scientist to help develop insights for Consumer Bank clients.
17:08:21 From Carlos Mercado : does it have a citizenship or location requirement?
17:09:28 From Haseeb Mohammed : re: recruiters
17:10:05 From Carlos Mercado : @Ashen:
A lot of people are learning data science as:

problem -> Python -> Deep Learning -> Results -> look at me.

Instead of problem -> context -> hypothesis -> exploration -> discussion -> model -> does model pass common sense tests -> identify if this problem is solved or if it is now a monitoring problem that needs continuous development / an API / cloud hosting of a model / delivery of insights, etc.
17:11:00 From Ashen Rana : That’s a good way of breaking it down, Carlos - thanks!
17:12:24 From Mark Freeman : What’s your approach for V1s of implementing data science solutions? In my role, I’m the first data scientist in the company as they are now ready to work towards personalizing their product. Thus I’m implementing a lot of V1s at my org.
17:15:51 From Ben Taylor : The more you MAKE them care about the tools and details you used to solve the problem… the MORE they want to fire you :)
17:16:12 From Karan Ambasht : Lol !!
17:16:29 From Christian Capdeville : lol!
17:16:30 From Karan Ambasht : Thanks Carlos, Monica
17:17:33 From Carlos Mercado : @ Mark
figure out really quickly if what you're programming is going to be needed in 6 months; if the vast majority of your work right now is going to get dumped; then just go for speed and iteration.

Make as many models and charts and outputs and features that might be useful; be in constant communication with the team; and avoid overcommitting to useless things. Like Ben said earlier; get 1-month timelines not 6-month moonshots.
17:17:46 From Sasha Prokhorova : I’ve been wanting to learn about the best time managing practices as well! Multi-tasking vs. uni-tasking?
17:18:22 From Haseeb Mohammed : @Ben can I quote you
17:18:41 From Ben Taylor : @haseeb Always
17:19:04 From Ashen Rana : The Pomodoro technique has been effective for me Sasha! Work on tasks in a 25 minutes block, without distractions
17:19:07 From Carlos Mercado : time management
(1) don't beat yourself up. If you drop something, oh well, you dropped it.
(2) consistent 15mins beat 2-hr random sprints.
(3) leverage "flow" state. if you're not feeling like doing something, go do something else for a bit. I code in sprints; I can't just sit there and bang my head for hours.
17:19:10 From Manna Sirak : So it makes sense that ideally you'd have a focused portfolio/projects to present to interviewers. Any thoughts on how to structure a DS/analytics portfolio if you don't know exactly what industry you want to be in?
17:19:42 From Haseeb Mohammed : Carlos> (1) don't beat yourself up. If you drop something, oh well, you dropped it.

17:19:46 From Haseeb Mohammed : do what you said you would do
17:19:47 From Carlos Mercado : Google GitHub project outlines; there are a lot of optimized file path structures for certain tasks (e.g. what gitignores to use, file naming, etc.)
17:19:54 From Haseeb Mohammed : so if you find yourself dropping the ball, figure out how to say no
17:19:55 From Carlos Mercado : Ok ok ok - yes
17:20:09 From Ben Taylor : @manna I would fight ‘sameness’, everyone’s resumes look the same. Pick projects you are passionate about, that wake you up early on the weekend because you’re so excited to invest in those projects. Those are the best. I look for passion.
17:20:20 From Christian Capdeville : Rewrite your priority list each day or week.
17:20:34 From Christian Capdeville : It will change - number 2 this week may be number 6 next week
17:20:35 From Carlos Mercado : if you told someone else you'd do something, do it. But if you don't study one night and then you just hate yourself and quit studying; then that was a waste.
17:20:54 From Haseeb Mohammed : agreed
17:21:38 From Carlos Mercado : @Haseeb I am trying to say this:
17:22:06 From Haseeb Mohammed : that was a great response from the govinator
17:22:16 From Kate Strachnyi : best time management tip is saying no to anything that doesn’t support the main goals
17:22:49 From Carlos Mercado : My mentees will waste 2 weeks beating themselves up for not getting something done. It's like, if someone steals $10 from you, do you delete your bank account??
17:23:49 From Carlos Mercado :
17:24:03 From Manna Sirak : @Ben Thank you!
17:24:25 From Ben Taylor : You get better at what your practice too. Things you think you are weak at (if you want them to be strengths) they will get better over time. I used to hate moderating a panel, I sucked at it, now I’m fine with it.
17:24:48 From Giovanna Galleno : great thoughts @Ben!
17:25:34 From Ben Taylor : The power of procrastination…. When deadlines come… shit gets done. You will fill the time you give for the projects you have, don’t give yourself too much time.
17:25:47 From Thomas Ives : Joe!
17:25:49 From Ashen Rana : ^ Being able to say no to people (politely) is a big hurdle at first but so rewarding in the long run
17:25:55 From Joe Reis : Howdy!
17:26:03 From Haseeb Mohammed : hi joe!
17:26:06 From Ben Taylor : HEY Joe!
17:26:12 From Eric Sims : I'm pressure prompted, so the power of procrastination is real!
17:26:22 From Carlos Mercado : @ Ben - 100% agree; I can do a week's worth of work in 1 14-hour sprint if I don't get a choice.
17:26:24 From Ashen Rana : Procrastination got me through college lol @Ben
17:26:43 From Ben Taylor : Take a f*cking nap if you need it!!! I’ve spent 2 hrs on a damn bug that took me 4 min after a nap
17:27:11 From Haseeb Mohammed : told my teammate this week to take a nap
17:27:14 From Haseeb Mohammed : he came back after 2 hours
17:27:16 From Haseeb Mohammed : and crushed it
17:27:21 From Haseeb Mohammed : WFH >
17:27:23 From Joe Reis : Naps are nice
17:27:43 From Giovanna Galleno : Agree @Joe!
17:27:54 From Sasha Prokhorova : Siesta is the king.
17:27:57 From Thomas Ives : Computer screens are the biggest hindrance to next level troubleshooting sometimes in my experience.
17:28:11 From Thomas Ives : AND lack of sleep.
17:30:48 From Ben Taylor : Carlos time limit yourself on that type of effort/project as well, that could easily consume weeks or months of “R&D”. What is the 2 day effort that could tell a story that a SME could get. You’re hunting for intuition
17:31:40 From Carlos Mercado : I think I needed to hear "2-day effort", we keep doing stand-ups and I have updates and new visuals, but it isn't translating because it doesn't have that KPI. That's the missing piece.
17:32:12 From Carlos Mercado : Thanks!
17:33:24 From Harpreet Sahota : Ted Petrou, Dunder Data, Build a Data Analysis Library from Scratch
17:33:45 From Manna Sirak : Have to head out - thanks all!
17:33:53 From Thomas Ives : Thanks Harpreet
17:33:53 From Eric Sims : See you Manna!
17:34:00 From Thomas Ives : Bye Manna
17:34:06 From Haseeb Mohammed : take care folks, i'm out as well
17:34:11 From Giovanna Galleno : Thanks for joining Manna! All the best! :D
17:34:47 From Giovanna Galleno : Thanks! Take care @Haseeb :D
17:35:02 From Carlos Mercado :
Great book on "don't scale until it hurts"; free too.
17:35:20 From Dave Langer : Go, Ben!
17:35:34 From Dave Langer : The Grim Reaper argument, love it!
17:36:43 From Nicholas Urban : Gotta run, thank you all!
17:36:58 From Karan Ambasht : Thanks everyone, got to head out !!
17:37:16 From Ben Taylor : Pinch your time, not your pennies
17:37:27 From Suraj Bondugula : Got to go. This was really fun and informative. Thanks, Harpreet for organizing this. Helps me a lot in knowing things. Bye!
17:37:56 From Ben Taylor : VIM shortcuts
17:38:01 From Joe Reis : ^ true
17:38:17 From Joe Reis : Think outside the algorithm
17:40:33 From Joe Reis : I’m actually a big proponent of talking to people
17:40:34 From Eric Sims : Genchi genbutsu! Go to where the work gets done.
17:40:39 From Ben Taylor : I’ve sat down with someone working the process (an underwriter) and taken notes for 2hrs… it was SO insightful… more than any VP or director could have communicated.
17:40:46 From Joe Reis : yep
17:40:52 From Ben Taylor : Who has worked the process the most you are trying to automate, talk to them
17:41:05 From Thomas Ives : ^ great examples.
17:41:12 From Thomas Ives : Great points Gio
17:41:19 From Dave Langer : Folks in the trenches usually know how the process REALLY works.
17:41:38 From Ben Taylor : I’m an expert “Googler”
17:41:39 From Carlos Mercado : but maybe don't tell them your plan is to automate them out of their job without a plan for them. We have so much change management problems with my clients who have employees terrified of being fired by effective consulting.
17:42:04 From Kate Strachnyi : great conversation all! have to go now but glad I joined :)
17:42:12 From Joe Reis : Later
17:42:15 From Ben Taylor : @Carlos I’ve found that people don’t typically lose their jobs, their jobs become less boring. They can finally be creative with a process that has been stale for way too long
17:42:17 From Sasha Prokhorova : What are your thoughts on unpaid internships? Do you think they bring value for the newcomers, or do you find them to be unethical practices? Or is there no “cookie cutter” answer?
17:42:18 From Dave Langer : Wisdom, Carlos. Long ago I worked for a P&C insurance company and I sat with Claims people to do exactly that. 😲
17:42:22 From Carlos Mercado : That's 100% true
17:42:38 From Eric Sims : See you later, Kate!
17:42:39 From Ashen Rana : That Dave. Over time those folks probably have figured out shortcuts to be more efficient
17:42:56 From Carlos Mercado : But people, especially with my (government) clients, enjoy that their jobs for 5,10,15+ years has been mentally automated. They can do it in their sleep. Boring is a feature sometimes.
17:43:02 From Joe Reis : “What would you do here” - The Bobs
17:43:09 From Thomas Ives : That's a hard one Sasha. It seems to depend on many things.
17:43:22 From Ben Taylor : Office space… sooooo good!
17:43:59 From Carlos Mercado : unpaid internships solidify societal inequalities. That is a reason to not have them. Although I would support education/institution based co-ops where a 3rd party funds an internship (school, tuition, government, etc.)
17:44:18 From Ashen Rana : Office Space was funnier before I worked full time lol. It’s my nightmare now
17:44:29 From Joe Reis : It’s actually a documentary
17:44:37 From Carlos Mercado : especially for small businesses who NEED analytics support and can't afford real help and are getting drowned by big competition.
17:44:52 From Ben Taylor : I don’t like Silicon Valley…. I relate WAAAYY to much to those stupid episodes...
17:45:08 From Ashit Debdas : my work background not related to IT and data stuff, how can I do transaction ?
17:45:12 From Ben Taylor : Like picking the name of a company… why we changed our business name from Ziff to Zeff……
17:45:15 From Joe Reis : Season 2 of SV was a documentary when we were raising our Series B. It was sad
17:47:07 From Carlos Mercado : Things data scientists need to know:

Reproducible code
how to name files and variables
dependency mgmt. (i.e. package versions)
Git for version control
17:47:12 From Thomas Ives To Harpreet Sahota(privately) : You've earned this and more. Treat it like the new normal
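
Two items from Carlos's list above, reproducible code and dependency management, can be illustrated with a minimal Python sketch. This is only one possible approach using the standard library; the seed value, the function names, and the package names passed in are assumptions made for illustration, not anything stated in the chat.

```python
import random
import sys
from importlib import metadata

SEED = 42  # hypothetical seed; any fixed integer works


def sample_with_seed(seed, n=3):
    """Draw n pseudo-random integers from a generator seeded explicitly."""
    rng = random.Random(seed)  # local generator, so global state is untouched
    return [rng.randint(0, 100) for _ in range(n)]


def environment_snapshot(packages):
    """Record the Python version and the installed version of each package."""
    snapshot = {"python": sys.version.split()[0]}
    for name in packages:
        try:
            snapshot[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            snapshot[name] = "not installed"
    return snapshot


# Same seed, same draws: the result can be reproduced on a rerun.
print(sample_with_seed(SEED) == sample_with_seed(SEED))

# Package names here are examples only; list whatever the project imports.
print(environment_snapshot(["numpy", "pandas"]))
```

Saving the snapshot next to a model's outputs, and committing both to Git, is one way to tie results to the package versions that produced them.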

Overcoming Imposter Syndrome | Paul McLachlan, PhD on The Artists of Data Science Podcast Sun, 23 Aug 2020 10:00:00 -0400 e539a35b-0cb5-4e43-8e8b-7f331212de41 On this episode of The Artists of Data Science, we get a chance to hear from Paul McLachlan, a data scientist who has over a decade of experience applying his knowledge and expertise to academia, corporate businesses, and entrepreneurial endeavours. His contributions and expertise have led to numerous startups and nonprofits inviting him to serve as an advisor.

He gives insight into what sparked his interest in the data science field, his tips for beginners in data science, and how he stays motivated.

Paul shares with us his powerful journey from being a high school dropout to getting his PhD in computational social science and becoming the A.I. research leader for the Consumer and Industry Lab at Ericsson Research.

This episode is packed with advice, wisdom, and tips that will change your mindset. It was a great honor interviewing Paul!

Some notable segments from the show

[3:57] How Paul became interested in data science
[6:19] How Paul got over his fear of "looking stupid"
[27:42] Actionable tips for cultivating the habit of critical thinking
[40:07] Advice on how to be the hero when you feel like a failure

Where to listen to the episode

Listen to the episode on Apple Podcasts, Spotify, Overcast, Stitcher, Castbox, Google Podcasts, TuneIn, YouTube, or on your favorite podcast platform.

Paul's journey into data science

Paul was a high school dropout, and is one of the few people who have both a GED and a PhD. What really sparked Paul's interest in data science was a math class he took during his undergraduate studies at Columbia. He put off taking this class until his last semester, since he was afraid of making mistakes and getting a poor grade.

The teaching assistant in the class met with Paul during office hours and helped him gain a deeper understanding of math. This newfound foundation motivated Paul to delve deeper into statistics.

Once he gained the foundation to think like a data scientist, he was excited to apply his math skills to answer questions. This fuelled him through graduate school and now in his career.
[3:57] "So I'm a really curious person. And I thought, oh my God, there's this technology or this technique that you can use to test things, to understand questions. That is just a coolest thing I've ever heard of. And that fuelled me through graduate school and now in my career, because I just think of this as a technology to answer questions in a rigorous way. And I just think that's the coolest thing and that's what motivates me."

Where is the field headed in 2–5 years?

What Paul is really excited about is 5G. The connection between 5G and data science might not be very obvious, but the biggest impact will be that 5G reduces latency. This means the lag time for data to be recorded and processed will be significantly reduced.

This will improve data precision and accuracy.

[8:20] "And also is really interesting implications for privacy. But for me, I think the real shift in the near-term is going to be towards thinking of real time data and the type of systems we can build to work with real time data rather than trying to build systems to work more and more with larger and larger historical data sets."

What will separate great data scientists from the rest of them?

Paul considers the willingness to ask good and difficult questions as the differentiator between the good and the great data scientists. One of the challenges in data science is that data scientists need to have an incredible amount of domain expertise, which requires data scientists to keep on top of the literature, while also being subject matter experts in specific industries.

A great data scientist will be able to bridge the gap between these two facets, and be able to communicate and bring value to their specific industry.

[11:28] "It also means that we have a lot of work to bring our non-technical stakeholders along the journey with us because we can build the best and most innovative cutting edge algorithm. But if our sales team doesn't feel empowered to talk about it, that can be a challenge. Or if our stakeholders don't understand the research and the innovation that went into it, that can also be a problem. So I really think a great data scientist brings domain expertise and machine learning, subject matter expertise in their industry and an ability to bridge the two."

Key takeaways from the episode

How A.I. can help fight COVID-19

[21:37] I think we can find ways to minimize societal cost using A.I. and data science. There is also the question of being able to develop a vaccine at scale. This means supply chain optimization and manufacturing optimization, all of which require data science. The number of ways that data scientists can get involved is limitless, but we need to make sure that the work is embedded in a real stakeholder need.

Extended Reality and Virtual Reality

[27:15] XR (extended reality) and VR (virtual reality) are both areas where research is being conducted to ensure security. For example, making sure the content you see is safe, or that deep fakes are not a major concern. Ethics with this technology is on the forefront of research and development, and it has a lot of implications for the future of how people interact with one another.

Tips for beginners

[32:11] You need to be proactive in your communication with non-technical stakeholders. You need to ensure that you can communicate the importance of your work, and the value that it brings to the organization. You need to be able to explain how your tools work, and what they do for the stakeholders. This requires a lot of experience and practice, but is super critical.

Important soft skills

[35:29] Be humble, be curious. Talk to people who you might not have interacted with before. Ask questions, even if they might sound very basic. Read books, nonfiction and fiction. All of this is a great foundation to build on.

Staying motivated

[44:22] Try to make time in your week to have fun. This can mean different things for different people. For example, I like to learn about other domains that I don't know much about. This fuels my creativity. Remember, your career is a marathon, not a sprint. You must find ways to have fun to stay motivated and have longevity in your career.

Memorable quotes

[19:05] "Data science is really a collective endeavour… even the most skilled and successful data scientist is going to have to be able to successfully work with technical stakeholders, non-technical stakeholders…"

[34:51] "…Start from a position of humility…that that can go much further for data scientists than always trying to be the smartest technical person in a conversation…"

[45:29] "Having fun and staying connected and staying entertained is actually part of your job responsibilities rather than something that can be set aside."

The one thing that Paul wants you to learn from his story

[47:13] You don't know the story of the person who is sitting across from you or sitting next to you. We assume that everyone has had a straight and linear path to success without any setbacks. That's just not true. Everyone has setbacks. It is critical to keep in mind that everyone you're interacting with, from your CEO to your classmate to your professor, is a human being.

From the lightning round

Data science superpower


Best advice

Speak more slowly.

What motivates you?

Solving puzzles.

Advice to 20-year-old self

Careers are marathons, not sprints.

Topic outside of data science we should study

Social sciences.

Recommended book

"Connected" by James Fowler and Nicholas Christakis

Books and other media mentioned in this episode

Amazon Prime show: "The Feed"
Song: Are you feeling sad? - by Little Dragon

Episode transcript

You are welcome to share the below transcript (up to 500 words) in media articles (e.g., The New York Times, LA Times, The Guardian), on your personal website, in a non-commercial article or blog post (e.g., Medium), and/or on a personal social media account for non-commercial purposes, provided that you include attribution to "The Artists of Data Science" and link back to the URL.

For the sake of clarity, media outlets with advertising models are permitted to use excerpts from the transcript per the above.

The Hero's Journey | T. Scott Clendaniel on The Artists of Data Science Podcast Sun, 23 Aug 2020 10:00:00 -0400 76a883c1-1555-4925-8129-6bbe2a6f9664 On this episode of The Artists of Data Science, we get a chance to hear from T. Scott Clendaniel, a leader in the data science space with over three decades of experience serving in various roles in business, analytics and artificial intelligence.

Currently, he's a chief data scientist who is aiming to create cutting edge artificial intelligence that can be made accessible to all. He gives insight into the future of A.I., how to be an effective leader, and how to use storytelling in data science.

Scott shares with us his incredible career journey and the insights he has gathered from it. This episode is packed with advice, wisdom, and tips for every data scientist to take something from. It was a great honor interviewing T. Scott!

Some notable segments from the show

[7:57] What is an A.I. winter?

[10:54] Where the field of data science is headed in the next few years?

[13:58] Tips on being an effective leader

[20:39] The underrated skill of storytelling, and how to cultivate it

[32:43] Tips for people that want to break into data science

Where to listen to the episode

Listen to the episode on Apple Podcasts, Spotify, Overcast, Stitcher, Castbox, Google Podcasts, TuneIn, YouTube, or on your favourite podcast platform.

T. Scott’s journey into data science

T. Scott’s journey began quite accidentally! He majored in strategic planning at the University of Baltimore. He heard of an opportunity to apply for a position at SteadyCorp through one of his internships. Unbeknownst to him, the position was for marketing analytics, something he was unfamiliar with. He was still able to land the job, and so began his career.

[3:40] "So I go into the interview and get all the way to the final question. And the gentleman asked me "Gee, your major seems to be strategic planning. Why would you be interested in focusing on marketing analytics?" Well, my jaw almost hit the table because the one thing that the placement officer had not told me was the job was actually for marketing analytics. I was like, well, it's so important to be able to track your return on investment and ability to do things in a market place and set up metrics. And so that's why I was interested. And somehow that worked so I started off doing analytics from that point on."

Where is the field headed in 2-5 years?

In the next two to five years, T. Scott thinks we are going to continue seeing A.I. being used to solve traditional problems. Although these areas are not “sexy”, they provide a big payoff. The biggest applications are actually going to be on the cost savings side and in eliminating waste.

(10:54) " But more importantly, more traditional problems can be solved. And they're not nearly as sexy, but they have a lot bigger payoff. So which of my customers is going to open my e-mail? Which of my customers is going to buy? Which product? Recommender systems. From what you've seen from Amazon's been doing that forever, improving those types of areas. I think that the biggest applications are actually going to be on the cost savings side and eliminating waste and solving lots of classic classification problems, which my customers is going to buy. Which of my customers is going to default, which my customers might be a credit risk? Those type things are much lower hanging fruit, but they don't attract nearly the attention. But that's why I see the next three to five years having the biggest opportunity."

What will separate great data scientists from the rest of them?

In T. Scott’s opinion, what will separate the great data scientists from the rest is the ability to take a step back and assess what the organization really needs. Instead of writing code to solve everything, ask yourself what the criteria for success are.

(12:35) "I think that we need to go back to say, let's look at this from the standpoint of what does the organization really need? What is the problem we're trying to solve? How are we going to define criteria for success? How are we going to say when good enough is good enough, as opposed to ultimately reaching for some unreachable state of perfection and moving more towards what happened with software development and more of an agile based approach and iterating through, I think great data scientists are going to become much more focused on how we're gonna solve this problem. What are our criteria for success? What stages can we do this in? And let's put on our problem-solving hats and stop trying to make code by itself solve everything."

Key takeaways from the episode

A.I. winter

(7:57) An A.I. winter is a period of time where the field of artificial intelligence goes fallow. This means not a whole lot of development goes on and people start to lose faith in the field. This usually happens because A.I. journalists overhype the field, creating a false narrative about future A.I. capabilities.


(13:58) To be a good leader, you have to first learn how to be a good team member. You need to be willing to focus on the greater good. You need to have a vision and the ability to get things done.

Storytelling in data science

(20:39) Storytelling is a very underrated skill that data scientists should develop. Here is a basic outline for storytelling:

Who is your hero? - It’s always the audience
What do they need to overcome?
What tool or technology are they going to use to overcome?
How is that going to happen?
What is the celebration or result of overcoming that problem going to be at the end?

What to do with these crazy job descriptions

(36:45) Realize that very few people have all of these crazy skills outlined in these job descriptions. If you have more than half of the requirements listed, send in the application. Also, find the job that you want to have, and check to see if other jobs have similar requirements. If you lack a certain requirement, then make sure you spend some time acquiring this skill.

Memorable quotes

(16:01) “If you're the first data scientist in an organization...make sure that you focus on a crawl, walk, run approach.”
(17:50) “Simplicity is ridiculously underrated…people do not support what they don't understand. Instead, they fear what they don't understand.”
(35:03) “Find your why and make sure it's the right why and use that to propel you…”

The one thing that T. Scott wants you to learn from his story

(46:54) Take your ego out of the equation. Be humble, and be willing to continuously learn.

From the lightning round

Best advice

Treat others the way they wish to be treated, not the way you wish to be treated.

Advice to 20 year old self

Be humble.

Topic outside of data science we should study

Graphics. Using a picture to communicate an idea does wonders for advancing your career.

Favorite interview question to ask

What concerns do you have that I might be able to address?

Recommended book

“In Search of Excellence” by Robert H. Waterman Jr. and Tom Peters.

Books and other media mentioned in this episode

“Start with Why” by Simon Sinek.

Episode transcript

You are welcome to share the below transcript (up to 500 words) in media articles (e.g., The New York Times, LA Times, The Guardian), on your personal website, in a non-commercial article or blog post (e.g., Medium), and/or on a personal social media account for non-commercial purposes, provided that you include attribution to “The Artists of Data Science” and link back to the URL.

For the sake of clarity, media outlets with advertising models are permitted to use excerpts from the transcript per the above.

The transcript for this episode can be found here.

Data Science Double Bam | Joshua Starmer, PhD on The Artists of Data Science Sun, 23 Aug 2020 09:00:00 -0400 1db22a91-d030-417f-a9ed-b5f3df8b45ec On this episode of The Artists of Data Science, we get a chance to hear from Joshua Starmer, a data scientist who has helped empower learners from all over the globe by breaking down complicated statistics and machine learning topics into small bite-sized pieces that are easy to understand.

You may know Joshua from his YouTube channel StatQuest, where he's beloved by his audience of over 320,000 subscribers and 15 million viewers.

Joshua shares with us his powerful journey from being a cellist and music composer to getting his PhD in computational biology and then creating StatQuest.

This episode is packed with advice, wisdom, and tips for developing a creative process and facing your fears. It was a great honor interviewing Joshua!

Some notable segments from the show

[9:05] How music has helped Joshua become more creative

[17:19] Inspiration for StatQuest

[24:00] The most challenging part of creating content

[28:02] The most misunderstood concept from statistics and machine learning

[36:38] How Joshua approaches his creative endeavours

Where to listen to the show

Listen to the episode on Apple Podcasts, Spotify, Overcast, Stitcher, Castbox, Google Podcasts, TuneIn, YouTube, or on your favorite podcast platform.

Joshua’s journey into statistics and data science

Joshua’s journey into stats began when he took a statistics class as a graduate student because he thought the women in the class were cute. He thought that he had to become good at stats to get their attention.

[3:20] "It's a little embarrassing. So I first got interested in statistics as a graduate student. I wasn't a statistics major, but I had to take statistics classes and I thought the women in the statistics program were pretty cute. And I thought that if I wanted to get their attention, I had to be good at statistics. So I studied a lot. I mean that's kind of how it all started. I was in these classes and I was like, how do I get. How do I get these people attention?"

Where is the field headed in 2-5 years?

Joshua sees the field of data science becoming more and more important. He believes that the field is currently like the wild west, where there is a lot of data being generated and people don’t know exactly what to do with it.

[4:18] "I just assume that next year they're gonna have a whole new way of generating crazy amounts of data, doing something new, and they're going to need us to come in and make sense of it. Some of that's gonna be using established statistics, some of that is just making stuff up as we go. I just see it becoming more and more important."

What will separate great data scientists from the rest of them?

Joshua believes that the great data scientists are the ones that understand the main concepts, without getting lost in the details. This allows them to stay focused on the more important ideas without getting caught in the hype.

[5:23] "I think, and this is true of any field. I feel like the great people are the ones that understand the main ideas and don't get lost in the details, because when you understand the main ideas, you can see a tool for what it truly is and what it's truly worth. And you don't get swept up in all the hype. And our field is full of hype and that's good and bad. You know it attracts people to the field. Smart people get sucked into it as well. And then the bad thing is, you have to kind of recognize despite the hype, tools are only good at doing certain things. And if you know the main ideas, you will know what tool is the right one for the right job. I think those are going to be the great Data scientists."

Key takeaways from the episode

Music theory

[6:38] Music theory is a way to break down music into its components of harmony, rhythm and melody. Learning how to break down music into its individual components has allowed Joshua to carry over this skillset into machine learning and data science, where he now does this for algorithms.

Inspiration for StatQuest

[17:19] StatQuest began as a way to teach basic statistical fundamentals to others. Videos were recorded as a reference, so that new employees at the genetics lab where Joshua worked could learn some of these stats topics. These videos snowballed on YouTube and grew into what StatQuest is today.

Most challenging part of creating content

[24:00] There are many aspects of creating content that can be terrifying. But the most terrifying aspect is beginning. Staring at the blank page and creating the rough draft can be a daunting task. It is so simple, yet so scary.

Most misunderstood concept from statistics and machine learning

[28:04] The most misunderstood concept in statistics is that the probability of a specific measurement can be zero. For a continuous measurement, the more precisely you specify the value, the smaller its probability becomes, and it approaches zero as the value becomes exact.
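
As a rough illustration of this idea, the sketch below uses Python's standard-library `NormalDist` to show how the probability shrinks as the window around a value narrows. The height distribution (mean 170 cm, standard deviation 10 cm) and the helper name `prob_within` are assumptions made up for this example, not something from the episode.

```python
from statistics import NormalDist

# Hypothetical model: heights follow Normal(mean=170 cm, sd=10 cm).
heights = NormalDist(mu=170, sigma=10)

def prob_within(dist, value, half_width):
    """Probability that a draw lands within +/- half_width of value."""
    return dist.cdf(value + half_width) - dist.cdf(value - half_width)

# Narrow the window around 170 cm step by step.
widths = [10, 1, 0.1, 0.01, 0.001]
probs = [prob_within(heights, 170, w) for w in widths]

for w, p in zip(widths, probs):
    print(f"170 +/- {w} cm: probability {p:.6f}")
```

Each narrower window has a smaller probability, and in the limit of an exact value the probability is zero, even though a height of exactly 170 cm is perfectly possible.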

Data Science: Art or science?

[34:36] It’s a little bit of both. Some tools that have been around for many years will always have their place in data science, and these tools are based in solid statistical theory. On the other hand, there is an art to how the data can be presented.

The creative process in data science

[36:38] Being creative is critical in data science. All data scientists create insights out of data. That is the very nature of the field. If you are someone who doesn't believe that you are creative, think again. Humans are innately creative. We create things all the time.

Everything begins with the blank page. You need to start somewhere to begin creating. This part can be very scary. This part is similar for all creative endeavours.

The key difference is that with music, you never know when you are done. You know when you are done with your publication or creating videos, because you are trying to target a few main points. With music, you have to decide when you are done.

Memorable quotes

[9:38] “I pick up my guitar, my ukulele, and I start playing, and my head just completely clears.”
[19:52] “what I really want people to take home is that anyone can understand these things [statistics]. Ninety nine times out of 100, the only thing between them and understanding is fancy terminology and fancy notation”
[23:31] “It's probably a good thing that I'm a little nervous...because it pushes me just a little harder to make sure that what I'm talking about is correct”
[33:16] “...if you want to educate [people, you] have to relate with them and you have to see the material from their perspective.”

The one thing that Josh wants you to learn from his story

[39:46] You don’t have to be the smartest kid in math class to be good at data science. I struggle with learning certain concepts, and I need to take things slowly. But taking things slowly has allowed me to be able to break concepts down into understandable and easy to explain pieces for others. If I can do it, so can anyone else.

From the lightning round

Future of StatQuest in the next 2-4 years

Josh wants to see StatQuest grow! Right now it’s just him running the show, which is no doubt a lot of work. "If I can work with like-minded people, I can cover a lot more information." He's got a vision to create a StatQuest curriculum and maybe even a StatQuest University.

Best advice

From my dad: Do something you are passionate about, and focus on the main idea.

From my boss: Do the most important thing you can do.

Advice to 18 year old self

"When I was 18, I wanted to be a professional cello player. I wanted to write music and create film scores. I wouldn’t give any advice to my younger self, because I probably wouldn’t believe what I would end up being passionate about."

Recommended book

Neal Stephenson’s books. They contain action and philosophy, and who doesn't love that combination?

Source of motivation

"I won’t be here forever, so I need to get things done now."

Episode transcript

You are welcome to share the below transcript (up to 500 words) in media articles (e.g., The New York Times, LA Times, The Guardian), on your personal website, in a non-commercial article or blog post (e.g., Medium), and/or on a personal social media account for non-commercial purposes, provided that you include attribution to “The Artists of Data Science” and link back to the URL.

For the sake of clarity, media outlets with advertising models are permitted to use excerpts from the transcript per the above.

The transcript for this episode can be found here.

Flash Statistics | Marco Andreoni on The Artists of Data Science Podcast Sun, 23 Aug 2020 09:00:00 -0400 595a7418-d845-4afe-a4d8-fa90db8777e6 On this episode of The Artists of Data Science, we get a chance to hear from Marco Andreoni, a statistician and data scientist who has a master’s degree in mathematics and machine learning, as well as a master’s degree in mathematics and cryptography.

He is the lead data scientist at Quantyca, where he covers every part of the data lifecycle from ingestion, storage, analytics, web applications, cloud storage and beyond.

Marco shares with us his passion for teaching others statistics in a more meaningful way. This led him to create Flash Statistics, a way to make statistics more accessible to people. Marco brings an interesting perspective on sharing knowledge in a creative way, a skill that all of our listeners should develop to be competitive!

Some notable segments from the show

[9:01] Marco’s inspiration for creating flash statistics

[16:42] An episode of flash statistics that is an absolute must for data scientists to watch

[17:21] The most misunderstood topic in statistics

[22:33] Whether data science is more of an art or science

[32:00] The importance of creativity

[36:00] Tips on communicating effectively

Where to listen to the episode

Listen to the episode on Apple Podcasts, Spotify, Overcast, Stitcher, Castbox, Google Podcasts, TuneIn, YouTube, or on your favorite podcast platform.

Marco's journey into data science

Marco’s always been good at math, but he didn’t know much about statistics. Luckily, he met some great professors while in college, who gave him advice and passed on their knowledge of statistics.

[3:13] To be honest, when I started my university studies at Polytechnic of Milan, you know, 2011, I didn't even know statistics. I have to be honest, I've always been good at math. That's why you have chosen mathematics and engineering, so simple. Luckily during the five years of the of my university, the first three years and then the last two years or so, I was really lucky, I had the opportunity to meet amazing professors. They share their knowledge with me and they turn me on onto statistics. And that's simply why I've chosen my measure of my master's degree in applied statistics. Really I wish you'd meet a person like them on your way because they are able to explain complex ideas so easily and to share passion about their job. So that's where everything started.

Key takeaways from the episode

Relation of cryptography and data science

[5:59] Privacy is crucial in data science. Protecting and masking your data is important to every data scientist, and cryptography is the field of protecting and masking messages. This is how cryptography will interplay with data science now and in the future.

Flash statistics

[9:01] Marco’s inspiration for flash statistics started without a precise goal in mind. He wanted to challenge himself to make statistics more accessible. So he connected his passion for drawing with statistics to create flash statistics. It started off as a fun experiment, and ended up being a successful project.

Most misunderstood statistics topic

[17:57] Most people have a difficult time understanding p-values. When doing hypothesis testing, you never accept the null hypothesis. Rather, you check whether there is enough evidence to reject it; if there is not, you simply fail to reject it.
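
To make the "reject or fail to reject" wording concrete, here is a minimal sketch of an exact two-sided binomial test for a fair coin, written with only the standard library. The coin-flip scenario, the counts, the 0.05 threshold, and the function name `two_sided_binomial_p` are all assumptions chosen for illustration; they do not come from the episode.

```python
from math import comb

def two_sided_binomial_p(heads, flips):
    """Exact two-sided p-value for H0: the coin is fair (p = 0.5).

    Sums the probability of every outcome at least as far from the
    expected count as the one observed.
    """
    expected = flips / 2
    observed_gap = abs(heads - expected)
    extreme = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - expected) >= observed_gap
    )
    return extreme / 2 ** flips

alpha = 0.05  # significance threshold, chosen here by convention

# Hypothetical observation: 60 heads in 100 flips.
p_value = two_sided_binomial_p(60, 100)
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"p-value = {p_value:.4f} -> {decision}")
```

With these numbers the p-value lands just above 0.05, so the outcome is "fail to reject H0". That is not the same as showing the coin is fair; it only means the data do not provide enough evidence against fairness.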

Data Science: art or science?

[22:34] A data scientist must be competent, creative, and communicative. A data scientist must follow the scientific approach, in order for the field of data science to have more scalability and reproducibility.

Importance of creativity

[32:00] Creativity is crucial. I believe it comes into play in two important ways: the way you communicate with others and the model you choose. If you are someone that does not see him or herself as a creative, just observe the experts and learn, then repeat their methodologies.

Tips on communicating effectively

[36:00] The audience is more impressed by your results than by your process. Just cover the main points of your results, and don’t dwell on all the details.

Memorable quotes

[21:03] “You don't need to memorize every single equation...But you must know the underlying idea.”
[31:23] “Only if you measure something, you can control something”
[35:00] “Focus on the process, the result takes care of itself”

The one thing that Marco wants you to learn from his story

[39:31] Creativity makes it easier to share both simple and complex concepts.

From the lightning round

Future of flash statistics

Marco wants to keep developing flash statistics episodes and even some books, with the hope of providing people a more meaningful platform to appreciate statistics.

What would you put on a billboard

Marco’s motto in life is “Give due weight to things”. Essentially, he only focuses on what matters most, without spending energy on useless issues.


Marco is motivated by working with skilled and qualified people interested in sharing knowledge. This allows him to perform at his best.

Advice to 18 year old self

Marco decided that he would not give his younger self any advice, since he firmly believes in learning through experience.

Recommended book

Marco recommends reading “Serve to Win” by Novak Djokovic.

Books and other media mentioned in this episode

“The Creative Calling” by Chase Jarvis

Episode transcript

You are welcome to share the below transcript (up to 500 words) in media articles (e.g., The New York Times, LA Times, The Guardian), on your personal website, in a non-commercial article or blog post (e.g., Medium), and/or on a personal social media account for non-commercial purposes, provided that you include attribution to “The Artists of Data Science” and link back to the URL.

For the sake of clarity, media outlets with advertising models are permitted to use excerpts from the transcript per the above.

The transcript for this episode can be found here.

The Infinite Retina | Irena Cronin on The Artists of Data Science Podcast Sat, 22 Aug 2020 19:00:00 -0400 35839390-562b-4a45-81ee-029e64c45c3f On this episode of The Artists of Data Science, we get a chance to hear from Irena Cronin, the co-author of “The Infinite Retina”. She currently serves as CEO of Infinite Retina, an organization which provides research and business strategy to help companies succeed in spatial computing.

She gives insight into what sparked her interest in spatial computing, how she sees spatial computing influencing our world, and the potential data problems that will result from more spatial computing technology.

Irena shares with us what led her from leaving her career as an equity research analyst on Wall Street to working with AR/VR and other spatial computing tech.

This episode is packed with interesting insights into our future, and I believe anyone listening will have something to ponder!

Some notable segments from the show

[8:15] Some concerns with spatial computing
[11:10] What makes us human, and how it relates to spatial computing
[17:20] The four technical paradigm shifts
[28:56] How spatial computing and autonomous vehicles will shape our future
[43:12] Advice for women that want to break into tech

Irena’s journey into spatial computing

Irena was an economics major, and her education led her to become an equity research analyst on Wall Street. She spent 8 years working as an equity research analyst, and in 2015, she was introduced to VR through a friend of hers. Ever since then, she has focused her attention on this space.

[3:00] “This is a nascent kind of industry. It’s going to still take a while for it to develop, but that’s what makes it really awesome.”

Where is the field of spatial computing headed in 2–5 years?

The field of spatial computing should be going more mainstream. Everyday people will start having access to this technology within the next few years.

[6:53] “Okay, so it’s been leaked Apple is going to come out with their headsets on 2020–2023… However, in two to five years, it should be going more mainstream, so regular everyday people will be able to access and use it. Additionally, I’m currently working with a company called Mojo Vision, which is working on an augmented contact lens. So this is another way you could integrate spatial computing into your world.”

Key takeaways from the episode

What is spatial computing?

[4:09] Spatial computing is all the technology associated with bringing a 3D realm to its users. This encompasses artificial intelligence, computer vision, augmented reality, VR, sensor technology, automated vehicles, etc.

Concerns of spatial computing

[8:15] The biggest issue with spatial computing technologies is the data that will be coming in. Privacy is also a concern, and the details of what information companies will be collecting still need to be hashed out. Another concern is tech addiction, which will only get worse with spatial technology.

The four technical paradigm shifts

1) The desktop computer: everyday people begin using computers
2) Graphical interface: everyday people can now use interfaces without having a technical background. An example is being able to print a document by clicking “print” rather than running a command to do so.
3) Mobile: phones become cell phones, and computers become laptops.
4) Spatial computing: combines all of the previous aspects and adds a third dimension.

Spatial computing and autonomous vehicles shaping our future

[28:56] Autonomous vehicles can affect our culture in various ways. One example is that these vehicles can be used to bring things to you, rather than you having to leave to go somewhere. This is similar to us watching movies on Netflix in our homes, rather than going to the movie theater. Secondly, these vehicles will allow for the decentralization of cities. Public transportation will not be needed anymore, and commuting to work takes on a whole different meaning.

Being a woman in STEM

[43:12] The most important thing women can do is be extremely persistent. Any setback, whether real or imagined, should not stop someone from going after their goals. Obviously, everyone needs to learn certain technical skills to succeed, but continuous persistence is the key.

Memorable quotes

[16:59] “Technology…it’s always been a tool for us. But even more so with spatial computing.”
[43:12] “I’d say the most important thing you can ever do is to be extremely persistent, no matter what”
[44:42] “I think it’s extremely important to have professors and the students in a class, …take time to listen to everyone who wants to speak… and not let anyone monopolize that precious time.”

The one thing that Irena wants you to learn from her story

[46:12] Just do what you want to do and don’t let anybody stop you.

From the lightning round

Best advice

Don’t always listen to others’ advice, even if it seems to be well intentioned.

Saying on billboard

Have courage to do what you want to do.

Advice to 18 year old self

Travel more.

Topic outside of data science we should study

Behavioral science. You need to understand people, and how data affects people.

Recommended book

Animal Farm by George Orwell.

Episode Transcript

You are welcome to share the below transcript (up to 500 words) in media articles (e.g., The New York Times, LA Times, The Guardian), on your personal website, in a non-commercial article or blog post (e.g., Medium), and/or on a personal social media account for non-commercial purposes, provided that you include attribution to “The Artists of Data Science” and link back to the URL.

For the sake of clarity, media outlets with advertising models are permitted to use excerpts from the transcript per the above.

The transcript for this episode can be found here.

The Legend of Jeff Jonas | Jeff Jonas on The Artists of Data Science Podcast Sat, 22 Aug 2020 19:00:00 -0400 4b60b8dc-4862-4583-bc97-d7caf677c72a On this episode of The Artists of Data Science, we get a chance to hear from Jeff Jonas, a data scientist who, for over three decades, has been at the forefront of solving complex big data problems for companies and governments.

His software has helped casinos identify fraud, increased voter registration, protected Singapore’s waterways from piracy, and even predicted possible collisions between 600,000 asteroids over 25 years.

Jeff shares with us his journey from creating word processors in high school to being able to sell one of his companies to IBM, along with being one of three people to complete every Ironman triathlon in the global circuit.

This episode is packed with advice, wisdom, and tips that will motivate you!

Some notable segments from the show

[10:27] Jeff’s journey from high school dropout and bankruptcy to finding success
[17:24] Advice on taking entrepreneurial action
[23:04] The importance of curiosity and creativity
[24:42] How Jeff began his Ironman triathlon journey
[31:44] Advice for people trying to break into the data science field

Where to listen to the show

Listen to the episode on Apple Podcasts, Spotify, Overcast, Stitcher, Castbox, Google Podcasts, TuneIn, YouTube, or on your favorite podcast platform.

Jeff’s journey into data

Jeff’s journey began when his mom introduced him to computers when he was 14 years old. This sparked his fascination with computers and data, and this carried over to his high school years, where he wrote word processors. He was able to make some money off of them, and that instantly hooked him.

[4:08] “I was like 16 years old and somebody sent me money for writing software. And I thought, this is crazy. And I was hooked. It’s just been an obsession my whole life.”

Where is the field headed in 2–5 years?

In Jeff’s opinion, the field is actually beginning to flatline in certain areas. For example, it is going to take more than five years until we have autonomous cars on the road.

The biggest challenges we currently face are related to security and privacy. For instance, the number of ransomware and phishing attacks on public and private institutions has gone up tremendously. With this in mind, Jeff thinks we have our hands full helping secure our systems.

[6:07] You know, it’s been kind of flat lining, I think, for a while now. Like, you know, it’s it’s showing a lot of utility against pictures and and and like multimedia sound data, but it’s not really continuing to have the same gains and other kinds of prediction areas, and it tends to flatten out early. You get all these early gains. But then to get to the last mile, it’s not… It’s been a little tougher.

What will separate great data scientists from the rest of them?

The great data scientists will be able to converge multiple data sets. They will be able to gather secondary data from secondary sources, and weave together this data for more effective outcomes.

[8:08] I think that a lot of the big gains to come are not about pointing algorithms at a data set, but converging multiple data sets and getting orthogonal data like secondary data from secondary sources thinking about it like a puzzle. You’ve got red puzzle pieces, blue, yellow, you know, green, white, black, brown, all these color puzzle pieces.
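
As a very small sketch of what converging data sets can look like in practice, the example below joins records from two unrelated sources on a normalized key. Everything in it is hypothetical: the two sources, the email key, and the `normalize` and `converge` helpers are made up for illustration and are far simpler than real-world entity resolution.

```python
# Hypothetical records from two unrelated sources.
loyalty_program = [
    {"email": "Ana@Example.com", "tier": "gold"},
    {"email": "bo@example.com", "tier": "silver"},
]
support_tickets = [
    {"email": "ana@example.com ", "open_tickets": 3},
    {"email": "cy@example.com", "open_tickets": 1},
]

def normalize(email):
    """Clean up the shared key so the same person matches across sources."""
    return email.strip().lower()

def converge(*sources):
    """Merge records from every source into one profile per normalized key."""
    merged = {}
    for source in sources:
        for record in source:
            key = normalize(record["email"])
            profile = merged.setdefault(key, {"email": key})
            profile.update({k: v for k, v in record.items() if k != "email"})
    return merged

profiles = converge(loyalty_program, support_tickets)
for profile in profiles.values():
    print(profile)
```

The profile for Ana ends up with fields from both sources, which neither data set could provide on its own. That is the puzzle-piece idea in miniature.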

Key takeaways from the episode

How to be an entrepreneur

[20:19] If you’re an entrepreneur and want to be successful in the next couple years, focus on building worker productivity tools. If you’re creating something and it doesn’t give somebody a really fast return on investment, it’s going to be very, very hard to sell because companies right now are scrambling to reduce their costs.

Another area to focus on as a data scientist is any type of fraud, waste or abuse detection product. This is a great place for data scientists and machine learning to find patterns in the data to quantify fraud and stop it.

Important soft skills

[23:04] Do not underestimate curiosity. Knowing where the data is and how it’s structured, how it flows and how to combine it, are very important soft skills to cultivate.

Word of encouragement for new data scientists

[31:44] Download some data, and start. Just work on stuff, even if it’s free, just to get your hands on real problems and real data.

Working on projects that have utility

[35:50] Focus your efforts on projects that have utility and are sustainable for society. Don’t be directionless with the projects you choose to work on.

Being accessible

[49:57] It’s important to be accessible. You can learn a lot by connecting with others. It also creates a lot of goodwill.

Memorable quotes

[15:46] “For everybody that’s had a close call in life…every day since then has been an extra day. When you think about life like that, it allows you to just unleash a little bit more and make the most of it…”
[31:01] “…You have to let new observations reverse earlier assertions.”
[34:31] “If you don’t have something that’s like 10 times better and high margins, then you can’t innovate”
[43:03] “…My work is often about helping humans focus their finite resources”

The one thing that Jeff Jonas wants you to learn from his story

[32:44] If you quit, there’s no chance a miracle will happen.

From the lightning round

Best advice

If you want to do really high quality work and build a reputation, you’ve got to really deliver on what you promise.

Advice to 18 year old self

Stay focused on things that are useful and sustainable.

Recommended books

One Man’s View of the World by Lee Kuan Yew
Zero to One by Peter Thiel

What motivates you?

Right now, I’m just trying to make a difference, especially in this COVID world. I’m trying to make sure my team is doing well and their families are doing well.

Books and other media mentioned in this episode

“SAFE: The Race to Protect Ourselves in a Newly Dangerous World” by Evan Ratliff, Oliver Morton, Katrina Heron, Martha Baer
“The Numerati” by Stephen L. Baker
“No Place to Hide: Behind the Scenes of Our Emerging Surveillance Society” by Robert O’Harrow, Jr.
“The Watchers: The Rise of America’s Surveillance State” by Shane Harris

Episode transcript

You are welcome to share the below transcript (up to 500 words) in media articles (e.g., The New York Times, LA Times, The Guardian), on your personal website, in a non-commercial article or blog post (e.g., Medium), and/or on a personal social media account for non-commercial purposes, provided that you include attribution to “The Artists of Data Science” and link back to the URL.

For the sake of clarity, media outlets with advertising models are permitted to use excerpts from the transcript per the above.

The transcript for this episode can be found here.

Level Up Your Leadership | Pooja Sund on The Artists of Data Science Sat, 22 Aug 2020 18:00:00 -0400 87d5cf07-e8f1-4e8b-9464-907c22e6299c On this episode of The Artists of Data Science, we get a chance to hear from Pooja Sund, a technology leader who has over two decades of global technology and financial experience delivering business and organizational impact across a variety of roles.

Her contributions and expertise have led her to be a powerful leader and energizer, and she currently serves as the Director of Technology and Analytics at Microsoft. She gives insight into her journey into working for Microsoft, her tips to becoming more self-aware, and how she energizes her teams.

Pooja shares with us her powerful journey from switching career paths to landing her dream job at Microsoft. This episode is packed with advice, wisdom, and tips about cultivating a growth mindset. It was a great honor interviewing Pooja!

Some notable segments from the show

[8:57] Leadership skills that you aren’t taught in school

[10:29] What Pooja looks for in data scientists

[17:38] How to ask yourself the right questions to cultivate a growth mindset

[24:32] How to develop self-awareness to know what tools you have

[27:31] Advice for women in tech

Where to listen to the show

Listen to the episode on Apple Podcasts, Spotify, Overcast, Stitcher, Castbox, Google Podcasts, TuneIn, YouTube, or on your favorite podcast platform.

Pooja's journey into data

Pooja was initially interested in becoming a doctor, but this changed when she didn’t get into the university that she wanted to. She changed her career path and decided to pursue an MBA instead. She had been focused on becoming a doctor in order to serve the broader community, but she realized that she could also serve the community by analyzing data.

During her MBA program, one of her professors asked where everyone saw themselves after completing their MBA. Pooja wanted to join Microsoft, and she had unflinching faith in her capabilities to get there.

Did you find yourself in your excellence zone? I'm not talking about the time when you are in your great mode, zone or let's say this is a best zone. I'm talking about the zone where you can call yourself an excellent person because you just lose track of time. And I feel that when I am working on projects in analytics and technology, I just lose track of time. So I found my passion and I embraced it with an open heart.

Key takeaways from the episode

Leadership skills that aren’t taught in school

[8:57] You need to be able to see the big picture. You need to know how to think outside the box, and be able to connect your organizational goals with the goals of your team.

Important soft skills

[9:49] You need to learn strategic thinking, negotiation, and the ability to network in and out of your organization.

Five key questions to ask stakeholders


  1. What is that thing that we are trying to solve?
  2. What is the risk involved here?
  3. What would happen if an alternate course is taken?
  4. Is this a high priority project?
  5. How does this help the organization as a whole?

What do you look for in a data scientist?

[10:29] Have the curiosity to learn. When I interview candidates, I look for someone who is open to learning, and has been that way throughout their career. I look for someone who is proactive and creative, and is able to connect the dots between multiple projects and skills. I look for candidates who can influence without authority, and collaborate well with others.


How to cultivate a growth mindset

[17:38] The way data scientists analyze data sets should be how they analyze their own mind. Ask questions regarding outcomes and what you want to achieve. This will allow you to focus your energy on the appropriate tasks, without wasting time on every opportunity that comes your way.

How to develop self-awareness

[24:32] When I talk about self-awareness, I mean the capability to conquer yourself before you can conquer the world. I call this executive presence. You might think that you need to keep learning more to begin solving problems, but sometimes you need to take a step back and realize that you already have the tools. Pause and evaluate yourself.

Advice for women in tech

[27:31] Be you, not someone else. Don’t hesitate, and be assertive.

Memorable quotes

[7:03] "You need to really look at the things that are in front of you and decide what are the things that excite you..."

[12:42] "…Rather than jumping in, take time to understand the problem."

[24:42] "I have seen people, including me, thinking that... I need to keep on learning...there's nothing wrong with it but at times you'll need to really look at the arsenal that you have created for yourself."

The one thing Pooja wants you to learn from her story

[27:52] Follow your passion, and combine your passion with your leadership persona and create your own brand.

From the lightning round

Favorite question to ask during an interview

What is your superpower?

Recommended book

“Data Science from Scratch” by Joel Grus

Books and other media mentioned in this episode

“The Alchemist” by Paulo Coelho

“Jonathan Livingston Seagull” by Richard Bach

Episode Transcript

You are welcome to share the below transcript (up to 500 words) in media articles (e.g., The New York Times, LA Times, The Guardian), on your personal website, in a non-commercial article or blog post (e.g., Medium), and/or on a personal social media account for non-commercial purposes, provided that you include attribution to “The Artists of Data Science” and link back to the URL.

For the sake of clarity, media outlets with advertising models are permitted to use excerpts from the transcript per the above.

Don’t Be Afraid To Build Your Brand | Srivatsan Srinivasan on The Artists of Data Science podcast Sun, 02 Aug 2020 18:00:00 -0400 f8e8fae9-d46e-49d3-b91d-85cfb7fb6660 On this episode of The Artists of Data Science, we get a chance to hear from Srivatsan Srinivasan, a data scientist who has spent nearly two decades applying his intense passion for building data driven products. He’s a strong leader who effectively motivates, mentors, and directs others, and has served as a trusted advisor to senior level executives. He gives insight into how he broke into the data science field, the importance of focusing on business outcomes, and some important soft skills. Srivatsan shares with us his tips on how to navigate crazy job descriptions, as well as his methods for communicating with executives. This episode contains actionable advice from someone who has been working with data since the beginning!

Some notable segments from the show

[5:25] Srivatsan discusses his inspiration for sharing knowledge on social media
[10:26] What it means to be a good leader in data science
[11:45] How to productionize a model
[17:54] How to navigate difficult job descriptions
[20:33] Tips on communicating with executives

Srivatsan’s journey into data science

For Srivatsan, breaking into data science was a gradual process. He has worked with data his entire career, but not always in data science. He began as a Java application developer, but he quickly realized that working with Java was not his forte.

From there, he transitioned into working in the ETL world, and then working with big data. Whenever he worked with customers, he noticed that he had to work with large datasets, leading him into data science and machine learning.

Initially, Srivatsan and his team had many failures in their first project, but it was a great learning experience for him. Eventually, he redid the project and succeeded, marking the inception of his data science career.

Breaking into the field was kind of a gradual transition. So I’ve been in the data space from the beginning. Not in the data science though. I’ve been working with data from the starting of my career.

Where is the field headed in 2–5 years?

Srivatsan sees the two biggest aspects of the future of data science as research and model explanation.


Research

There is a lot of activity currently on advanced algorithms. Srivatsan expects the accuracy of these models to increase over time, with the insights being democratized.

Model Explanation

As models become more complex, they become harder to explain. Srivatsan expects more adoption of these techniques across the enterprise, which will lead to more models that are accessible to the end user.

[6:53] So when we talk about where the field is headed, right. There are two aspects of it. The very first aspect is the research side of it, right. There’s a lot going on in the research world on advanced algorithms and everything. The key thing is like you have a lot of technology companies sitting over there like Amazon, Microsoft, and Google. They have a lot of data at their disposal. And they are trying to create, like, pretty accurate systems for complex jobs. The complex job can be speech to text, or it can be OCR. It’s typically not accessible to the industry, right. Industry does not have that much data to train a translation model, or a speech to text model. So what I see is the accuracy over time for these models will get better, but the insights will be democratized. So you’ll see this as cloud services running around and accessible to the industry. That is one aspect of it. The second may be the model explanation aspect of it. As we go into the complex model we lose the explanation capability of it. So there is a lot of research going on, that is on the research side of it. But in the industry side of it, there are a lot of initiatives that are getting started; but more in POC stages. The adoption is not completely federated across enterprise. So what I see is more and more enterprise line of business will adopt more of these techniques and then you can see like that fuels a new way of adoption in industry. So that’s what I see like in two to five years. It’s more like more adoption and more like models getting more accessible to end-users. Like complex models like speech to text and it’s still that. But when you really use it in industry, you don’t get that accurate models. So what I meant, it would become more accurate.

What will separate great data scientists from the rest of them?

Srivatsan very insightfully observes that most data scientists are too focused on algorithms and technology.

The real focus should be on business outcomes. It doesn’t matter what tool you use to solve a problem; you just need to solve the problem and deliver.

[8:51] When we say how you adopt your data science journey, we typically, we are more focused today on algorithm and technology, the real focus should be on business outcome. It does not matter whether you use TensorFlow or PyTorch to solve a problem. It’s about how you are solving a problem and delivering in business outcomes. Right. That should be the clean focus of it. I think more and more data scientists today are technology focused. They need to use technology to just solve a problem. Right. So they should focus more on business outcomes. And that’s what, like, will really differentiate the good and best data scientist.

Key takeaways from the episode

Concept drift

[15:01] Concept drift is when your underlying business assumptions change; data drift is when your assumptions about the data change.
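
To make the data drift half of this concrete, here is a minimal sketch in Python. It is not something Srivatsan describes in the episode: the two-sample Kolmogorov-Smirnov statistic, the synthetic numbers, and the DRIFT_THRESHOLD cut-off are all assumptions chosen for illustration. The idea is simply to compare the data a model was trained on against newer data and flag a feature whose distribution has moved. Concept drift is harder to catch this way, because it needs outcome labels and not just inputs.

```python
import random


def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical CDFs of the two samples (0 = identical, 1 = disjoint)."""
    ref = sorted(reference)
    cur = sorted(current)
    i = j = 0
    max_gap = 0.0
    while i < len(ref) and j < len(cur):
        value = min(ref[i], cur[j])
        # Step past every observation equal to `value` in both samples.
        while i < len(ref) and ref[i] == value:
            i += 1
        while j < len(cur) and cur[j] == value:
            j += 1
        max_gap = max(max_gap, abs(i / len(ref) - j / len(cur)))
    return max_gap


# Synthetic, made-up numbers purely for illustration.
rng = random.Random(0)
training = [rng.gauss(100, 15) for _ in range(1000)]  # data the model was built on
stable = [rng.gauss(100, 15) for _ in range(1000)]    # new data, same distribution
shifted = [rng.gauss(130, 15) for _ in range(1000)]   # new data, the mean has moved

DRIFT_THRESHOLD = 0.1  # hypothetical cut-off; tune per feature in practice

for name, sample in [("stable", stable), ("shifted", shifted)]:
    gap = ks_statistic(training, sample)
    print(f"{name}: KS gap = {gap:.3f}, drift flagged = {gap > DRIFT_THRESHOLD}")
```

In a real pipeline a check like this would run on a schedule for each input feature, and a flagged feature would prompt a closer look or a retrain.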

Tips on communicating with executives

[20:33] Convert your model outcome into stories, and keep the stories simple.

Important soft skills

[19:30] Problem solving skills and presentation skills. You need the ability to present your insights to people who lack your technical skills. The end user must understand your insights.

What to do with these crazy job descriptions

[18:07] These job descriptions are due to the lack of maturity in the industry currently. A lot of industries are experimenting with AI and ML, and so they don’t know what they want. Keep applying to these positions. Focus on learning core skills, and then apply.

Memorable quotes

[9:09] “I think more and more data scientists today are technology focused. They need to use technology to just solve a problem…they should focus more on business outcomes.”
[10:26] “…a good leader in data science…should be ready to embrace failure”
[12:21] “…start with modularizing your code, see where are your common functions that you can use”

The one thing that Srivatsan wants you to learn from his story

[21:23] Give it your best in anything you try to do, and learn as much as possible. Don’t worry if you currently lack the skill set. The skills can be learned along the way.

From the lightning round

The best advice Srivatsan has ever received

Learn how to manage your time.

Advice that Srivatsan would give his 20 year old self

Don’t delay building your brand. Be more active in your network.

A topic outside of data science Srivatsan thinks we should study

[22:11] Focus on the industry you want to impact, and try to understand how the current business processes work.
[22:48] Connect with the industry leaders. Send them a note, and ask for a 10–15 minute chat.

Srivatsan’s book recommendation

“Naked Statistics: Stripping the Dread from the Data” by Charles Wheelan

How to connect with Srivatsan


How to Whisper to Data (and Executives) | Scott Taylor on The Artists of Data Science Podcast Sun, 02 Aug 2020 18:00:00 -0400 f388c7a8-8897-45ed-9b52-c396aa9a0e10 On this episode of The Artists of Data Science, we get a chance to hear from Scott Taylor, also known as the “Data Whisperer.” He has spread the gospel of digital transformation through public speaking engagements, blogs, videos, white papers, podcasts, puppet shows, cartoons and all forms of verbal and written communications. He has also helped organizations, such as Microsoft and Nielsen, comb through and organize their data for meaningful use. Scott shares his “eight ‘ates of master data”, a set of rules to engage with master data in a meaningful way. He also goes over his tips for communicating with executives, along with important soft skills that are being overlooked by data scientists. Scott is very articulate, and his passion for data and teaching are definitely evident in this episode!

Some notable segments from the show

[10:08] What separates great data scientists from good ones
[11:32] The “eight ‘ates of master data” defined and explained
[17:04] How to communicate effectively with executives
[23:40] The biggest data blunder Scott has seen in the past year
[28:14] Important soft skills that data scientists need to develop
[29:54] Words of encouragement for those trying to break into data science

Scott's journey into data

Scott has been working in the data space for a couple of decades. His first data role was at an organization where he worked with location data about supermarkets. Before then, he worked in consumer promotion, where he was never quite satisfied. He is happy to be working with data, and helping people tell their data story.

[3:13] I think I was kind of hard wired for the master data taxonomy ontology space because my parents told me when I was a kid, instead of building with my Lego blocks, I sorted them. So I think if you sort your blocks, sort your toys, you've got a chance to be in the data business. So anybody out there listening, if your kids sort their toys, encourage that. It's not all about building. Sometimes it's about making sure things are structured and organized the right way.

Where is the field headed in 2-5 years?

Scott tells us that the core value of the content is not going to change to the point where it is unrecognizable. Big data needs highly structured content to work. Regardless of what happens in A.I. and machine learning in the next few years, companies will still need highly structured data.

The stakes are getting higher. Companies understand the importance of managing data. If a company is complacent in some kind of legacy space, others are going to win on the data side.

[6:04] I kind of take a sort of a dissenting view here a little bit because I don't think the core value of this foundational content is going to change to a point where it's unrecognizable. Kind of an awkward way to describe it there... Software comes and goes. Data always remains. The one point that I want to make is I think the stakes are going to change. I think the stakes are getting higher.

What will separate great data scientists from the rest of them?

Scott tells us what separates the great data scientists from the merely good ones is the ability to manage and govern core data assets. People tend to focus on the cool and sexy stuff. But if you build upon a weak foundation, it’s going to fall. Great data scientists build a foundation and are better equipped for long term success. It's not all about the latest data science thing you're doing, because you can't do that at scale unless you've got great data.

[10:08] People tend to focus on the fancy, cool, sexy stuff. And if you build all that on a weak foundation, it's going to fall. And I keep using that word foundation because it's a great way to think about it. And it helps get the enterprise stakeholders, the business leadership who have to be engaged, to realize it's not all about the cool stuff. It's not all about the latest data science thing you're doing, because you can't do that at scale unless you've got great data.

Key takeaways from the episode

The eight ‘ates of master data

Hear Scott go deep on the eight 'ates at the [12:57] mark.

Relate - build relationships with customers, suppliers, etc. No relationship = no business

Validate - is the data real? Is it right? Is it deceptive?

Integrate - take the data sources and pull them together in some way

Aggregate - you need to aggregate the information up to the executives

Interoperate - this allows systems to talk to each other; how things connect to one another

Evaluate - How do we put this data into play? Think A.I., analytics, etc.

Communicate - You need to be able to communicate your metrics with others in an understandable way

Circulate - data has to be in motion for it to have value

Tips on communicating with executives

[17:04] Focus on the results from your findings. As proud as you might be about your algorithms and about all the mistakes you made that you fixed to get here, that executive team doesn't care about that. You need to tie your results to some recommended action. Ask yourself, “how is my insight that I created or the opportunity that I discovered going to move the business forward?”

Legacy systems in a start-up

[21:43] Start-ups have the advantage of not having legacy systems. If you've got a set of data that you're working with that's becoming the standard for your organization, make sure you share that. Try to form consensus around some form of standards as early as you can.

Important soft skills

[28:14] You need to know how to communicate and listen effectively. You need to be able to communicate your idea through a story. I advise people to take a sales course, because that’s how you learn how to tell a great story with empathy, emotion, and emphasis.

People who think they don’t belong in data science

[29:58] You need to find your passion in data. Data science is the hottest space to be in, and that isn’t an original opinion. Data science can help businesses grow, improve, and protect themselves. I don't think you need to do it all, but find your niche and find your expertise and then run with it. There are a lot of opportunities in this space.

Memorable quotes

[3:37] “It's not all about building. Sometimes it's about making sure things are structured and organized the right way.”
[7:11] “Hardware comes and goes. Software comes and goes. Data always remains.”
[16:11] “Data, to have value, has got to be in motion.”
[20:36] “If you're a data scientist, you are the business….and it's impossible for you to learn too much about your own business.”
[27:08] “You've got to bring people from “I have no idea what you're talking about” to “how can we live without this?” and that comes from telling a good story.”

The one thing that Scott wants you to learn from his story

[33:08] Remember the value, the foundational importance, the critical nature of master data. It is the most important data any organization has.

From the lightning round

The best advice Scott has ever received

“When you walk into a sales call, never give them the magazine.” What that means is, customers can’t listen and look at a product simultaneously. They can’t focus on two things at once.

What motivates Scott

Doing fun stuff. I love the reaction I get from people to my crazy stuff.

Advice that Scott would give to his 18 year old self

Get focused, and recognize where your true strengths are.

The book Scott recommends you check out

“Big Data, Big Dupe: A Little Book about a Big Bunch of Nonsense” by Stephen Few

Skepticism is NOT a Denial Activity | Kyle Polich Tue, 21 Jul 2020 19:00:00 -0400 64735776-3026-4103-8d3f-911af9d5a23d

On this episode of The Artists of Data Science, we get a chance to hear from Kyle Polich, a computer scientist turned data skeptic. He has a wide array of interests and skills in A.I, machine learning, and statistics. These skills have made him a sought after consultant in the data science field. He is also the host of the very popular data podcast, “Data Skeptic”, which discusses topics related to data science all from the perspective of applying critical thinking and the scientific method to evaluate the veracity of claims and efficacy of approaches.

In this episode, Kyle defines what a data skeptic is, and also goes on to give advice on how to communicate effectively with leaders and executives as a data scientist. Kyle brings a very unique perspective related to all things data, along with actionable advice!

Some notable segments from the show

[5:28] Kyle defines “data skeptic” and his journey into becoming one
[17:19] The mission statement of the Data Skeptic podcast
[18:55] Is data science more of an art or science?
[23:36] Advice for data scientists trapped in a perfectionist mindset
[30:43] Important soft skills that you need to succeed
[39:40] How to communicate your ideas with executives

Kyle's journey into data science

Kyle has had a lifelong fascination with computers, and he knew he wanted to be a computer scientist from a young age. As he learned more about computer science, he also stumbled onto A.I, which became his focus.

While in graduate school, Kyle got a part time job at a search engine marketing company. This was, to some degree, his first experience working as a data scientist. He decided he preferred working in industry to pursuing academia, which had been his original intention.

The opportunities that followed led him to become an independent consultant, build a team that helps small and medium sized enterprises figure out how to use machine learning, and create the Data Skeptic podcast.

[3:11] “Sure. Yeah. I mean, I guess I have had a lifelong fascination with computers. And I you know, I could've told you at four years old I was gonna be a computer scientist. That was just an obvious path. And naturally, along that journey, I became interested in artificial intelligence and that really became my focus. And as I studied that, I guess I originally thought I might go a more academic path try and go for professorship, something like that. But while in grad school, I started working a part time job just to afford to be in grad school, basically. And that was at a very unique time. And I got in a somewhat unique place. Nothing particularly special. We're a search engine marketing company, so we help small businesses use Google AdWords, basically. But as you might expect, there's a whole lot of what would eventually be called data science that went on there. And my A.I. skills transferred very well. What might not be abundantly obvious to everyone is that A.I. is very largely statistics and a lot of software design. And those two things work wonderfully in industry, especially at the time I'm talking about, which was pre a lot of things that we have today. There wasn't a cloud, there wasn't CICD, there wasn't all this kind of stuff. There was just a lot of elbow grease to do in the rudimentary versions of those. So I learned a lot of lessons about working, and decided I guess I like that better or simply was more successful with that than in academia and just kind of focused on that path.

That job led to an opportunity to move and that led out to California. And I guess the rest is history. I worked in, you know, a couple of various capacities doing different data science things. And at some point after I was at a startup that imploded, I decided I should strike it out on my own, became an independent consultant, and after about a year that started building a team and now we're, you know, we're, I guess, a media company, as you mentioned, we do the podcast, but we're most of the revenue comes from is really our work as a boutique consulting group. So we help small medium enterprises figure out how to do machine learning in the cloud, in particular with real time and streaming kinds of things.”

Where is the field headed in 2-5 years?

[8:13] On the engineering side, there's going to be a continued progression of improved tooling. That means easier and faster and better ways to do stuff. More automation, more transfer learning, more serverless, etc. What 10 data scientists can do today will be done by one in two to five years.

On the academic side, I think there's a lot of neat stuff going on in theory of database design and tying together ideas, such as ACID compliance, the CAP theorem, Paxos, etc. and finding unique ways to serve up tools that are customized and hyper efficient so that maybe some of that stuff is a utility or more of a utility.

What will separate great data scientists from the rest of them?

[11:18] I think good and great often differ just by luck.

Key takeaways from the episode

What does it mean to be a Data Skeptic?

[5:28] A data skeptic is someone who takes in as much information as they can, weighs it against the evidence, and aligns with the truest version of the world that they know. As a data skeptic, I am skeptical of data, and with data, since data is a tool that can be misused.

Kyle talks to us about the mission statement of the Data Skeptic podcast

[17:19] I want to be a resource for data scientists out there wherever data or skepticism should be applied. I want the podcast to be a casual place where people get exposed to deep ideas, not hype. I want to tell the story of how data is changing the world.

Tips on communicating with executives

[39:40] 1.) Understand the dynamics of the room (who’s leading the meeting?)

2.) Know your audience (who will be attending the meeting?)

Is data science an art or science?

[19:02] Art embraces interpretation and even encourages it, and that part I'm not good with when it relates to data. Science is about getting to the truth, and the truth is not open to interpretation.

In data science, the art lies in how one executes the methodologies that lead to the truth in the data.

How the creative process manifests itself in data science

[22:07] I think most of the creative process is really about system design. How can I build something that is sustainable and maintainable and is more of a process?

Advice to those trying to break into data science:

[25:51] Be honest about what you know, what you don't know and come up with a good battle plan for learning. Figure out where you are and where you want to be and draw the straightest line between those two.

Memorable quotes

[11:43] “...greatness is achieved by a commitment to your craft and pursuing it.”

[16:42] “The greatest trick the devil ever pulled was convincing the world he didn't exist. That's what good data science does to me.”

[24:42] “…being able to fall down but get up fast is important.”

The one thing that Kyle wants you to learn from his story

Kyle wants to share the things he's learned with people and help everyone understand that it's not always easy to manage data, store it, analyze it and leverage it. But it's well worth it because the tools and methodologies that you can learn are pretty much the most effective way to build things and to learn things and to optimize processes.

From the lightning round

The best advice Kyle has ever received

The best advice he's ever received is a classic bit of wisdom: work smarter, not harder.

He elaborated on what this meant to him: "...Ingenuity through intelligence. So find a smarter way to do your process, to automate the hard parts... Figure out what's the core of the problem... Solve it not with lifting more, but with smarter techniques, better algorithms and that kind of stuff."

What motivates Kyle

A burning desire to understand the mechanism of everything I encounter.

The advice that Kyle would give to his 20 year-old self

"I feel like every good and bad choice I made brought me exactly where I am. And it'd be a sort of existential suicide to say anything else. I also don't know that I could get through to myself at that age. So I guess just keep on keeping on"

The song that Kyle has on repeat

Home Away From Home, Be Like Max

Recommended books

“OpenIntro Statistics” by Christopher Barr, David M. Diez, and Mine Çetinkaya-Rundel
“The Elements of Statistical Learning” by Jerome H. Friedman, Robert Tibshirani, and Trevor Hastie

Books and other media mentioned in this episode

“An Introduction to Kolmogorov Complexity and Its Applications” by Ming Li and Paul Vitányi

How you can connect with Kyle Polich

Data Skeptic
Data Skeptic Slack Channel
Twitter: @DataSkeptic

All The Things I Wish They Taught Us In Bootcamps | Eric Weber, PhD Sun, 19 Jul 2020 20:00:00 -0400 8f485c08-b10d-4e8f-a1a2-65b7ba351707

On this episode of The Artists of Data Science, we get a chance to hear from Eric Weber, a lifelong learner, mathematician, and data scientist. He has cultivated a passion for sharing his work and experience with others to help them become excited about data science, as well as educating executives on all aspects of data science. He gives insight into his perspective of learning, how to be a leader in the data science field, and important skills that data scientists need to develop.

Eric shares with us what drew him to the field, and his transition from academia to the business side of data science. This episode highlights the journey and success of someone who has seen the field develop from the beginning, and has continuously improved over time. I think there is a lot to learn from this conversation!

Some notable segments from the show
[4:43] How Eric transitioned from academia to the business context

[11:40] What separates a good from a great data scientist

[20:59] Tips to communicate effectively with your team

[24:07] Is data science an art?

[34:52] Important soft skills that you might be missing

[41:15] How to navigate crazy job postings

Eric's journey into data science

Eric explains that his journey into data science began while he was in academia, teaching statistics and programming. He remembers a phone call with his dad in which they discussed the rise of “big data”, and this kick-started Eric’s fascination. Most of the concepts and ideas behind data science were already familiar to Eric, but now he had the chance to work with data at scale.

Eric also got the opportunity to explore how data science can be used in a business context. He had to learn how to transition from a classroom setting to a business setting, which was an eye-opening experience.

[4:43] “Yeah, that's a really it's a good question and one I think that is often potentially overlooked because even with data science, people talk about it as if it's pretty new. Like they're like "oh data science is still..." It's actually been around for a number of years, but people's journeys into it continue to amaze me. They're all different. There's no one set way to end up in a data science position. For me, I was in the academic world teaching statistics and programming, experimental design, things like that, all the way up until 2013, 2014. And I distinctly remember in a phone conversation with my dad and he at the time was an engineer for United Health. And he was like, well, they're talking about all of this big data stuff, like blah blah blah, like, don't you do things with data? Like, I do. But I don't know if I do things with big data. And that kind of kicked off my fascination. I think I remember after that that was the first time I dove into some of the online certificates. And when I started doing some of the MOOCs that you see on my profile with like Johns Hopkins, it's like, what is this all about? And I learned in a lot of ways that the concepts and the ideas that I already knew and used were pretty foundational to working with data. For me, it was having to think about how to operate with data at scale, whereas I was used to being able to run things on my local machine.

Right. I can run things in R. I was programming in R when there was nothing pretty about it. It was a brutal experience. Now I look at it and it's this beautiful, tidy, clean thing to do. Back then it was not. So for me, my journey was all about figuring out two things. One, how to work with data at scale. And two, what does it mean to actually do data science in a business context? And those two things are really, really important. And they're also, I think, probably overlooked. My transition really entailed a lot of learning about what it meant to really do data science in a business context. You can learn about data science and how it operates within business. But to actually make it effective for business is actually quite a journey, to go from an academic side where you're in a classroom and you're used to delivering things in a way, and then you push them over the line and they're done, almost like a homework assignment. In our industry, it's sort of continuous, and that continuous learning, that continuous improvement and the fact that even when you create projects, you never really get to just turn them in as homework, that was sort of an eye-opening experience for me. Aside from the day-to-day business, to go from taking my teaching mindset from the classroom to my team, that was a huge experience for me.”

Where is the field headed in 2-5 years?

The field is headed in a non-uniform direction, which is very valuable to the field.

Data science was always meant to be divided into sub-disciplines. There are very few people who are experts in everything, and it's just infeasible at this point to hire someone who's good at everything. Companies are changing in how they hire and what their expectations are, and specialists are becoming the norm.

Because of the recent economic pressure from COVID, companies are going to really evaluate everything. This is the time for data science to really prove its worth over these next few years.

[8:42] “I think I see the field heading in a non-uniform direction. And I think that is maybe the most valuable thing that we have going on for us: data science has evolved into what it always probably was meant to be, a bunch of sub-disciplines, just like the idea of saying that you're a data engineer or that you're an engineer overall. There's so many different types of engineering. They all require different skill sets. There's very few people who are experts in everything. So as much as there's been a lot of debate, I think people like, do we need specialists or generalists?

I think we're getting to the point where specialties are the norm. But that's not a bad thing. Just because you're focused on time-series work, generally speaking, doesn't mean you're not skilled. Just because you tend to focus on machine learning or system design doesn't mean you're not skilled. It's just infeasible at this point to hire someone who's good at everything. That isn't just from a candidate perspective, though, it's from a company perspective. They're figuring out that what they've done in the past, which was hire data scientists and basically make them responsible for all things data, it's actually really tricky to figure out how to use them effectively. This has produced, in a lot of companies, reorganizations, merging parts of data science with engineering, parts of data science with product. In some cases, data science has become its own entity within an organization. And so for me, what's coming next is this alignment, because companies are at the point now where they're like, how do we get the ROI out of this practice? And in a lot of cases getting that ROI out is making sure you figure out that there actually are sub-disciplines and you can't hire one person to do the job of three, even if they're extraordinarily talented.

And so there's a simultaneous change, like students and people prepping to go into the field are changing. Companies are changing in how they hire and what their expectations are. And I think in a lot of cases, it's actually this really interesting experiment to see what's going to happen with people that are working with data. At some point, it still is like this sexiest job, but companies still have a bottom line, and more so than ever with the current public health and economic conditions that we're in. Companies are going to be evaluating the true value of essentially everything within their midst. And while in the last 10 years it's perhaps been easier and they've had budgets that allow them a little bit of wiggle room, for the next 12 to 24 months that's not going to exist. And so I see it as like a real prove-it time for data science. As much as it's hard to think about that, I really think it's going to be that in the next year or two.”

What will separate great data scientists from the rest of them?

Flexibility. What that means is the ability to have a different approach to different tasks, rather than using the same uniform methodology. Data science tasks require different skills and models.

Also, the ability to deliver business value, and not just scientific value. This is incredibly important. Some people go into data science positions thinking that it is doing a series of projects and establishing good code. But then they hand it off as if the business is going to magically use the thing that they've created. This is not a good assumption.

This flexibility to pick the right approach and the ability to really transform the business with your solutions are what make good data scientists great. Those two things are going to differentiate data scientists who are going to stick around at companies and data scientists who are going to be viewed as just “scientists”.

[11:40] “Flexibility like this idea of being flexible doesn't mean that you can handle a whole bunch of tasks coming at once. It means that you can sort of ramp up your - not just flexibility. I think the best way to put this is, you know what's required to do different tasks. And you don't always use a uniform approach to do everything. A data science task, task A is probably always going to be different from data science task B. And they're probably going to require different skills. They're going to require different models. They're going to require people to understand how much is actually needed to solve the problem. You don't need to build an incredibly powerful model for every situation, but you need to know what's going to allow the business to thrive in a productive way. The other part is businesses, like I said, are going to be looking for the bottom line value of data science as a practice. And I'm saying this specifically, if you're not talking about research groups and teams that are at Google or Amazon and all of these places that will continue to be research focused, they may not be the immediate business value. In almost every other case, data scientists who understand that their skillset gives value to the business by actually delivering business value and not just scientific value, it's going to be incredibly important. I think there's a disconnect. Sometimes people go into data science positions thinking that it is doing a series of projects and to establish really good code. And the optimal solution. But then they kind of hand it off as if the business is going to magically use the thing that they've created. That, I don't think, is a good assumption. This flexibility to pick the right approach or to also know why an approach may or may not work and your ability to really transform the business with your solutions. Those two things are going to differentiate data scientists who are going to stick around at companies and data scientists who are going to be viewed as really just scientists. There's this weird perception. Oh, they just do the science-y stuff. If a company is looking at you as you do the science stuff, I can probably guarantee that they don't see how you're impacting the business. And those are really key things for people to keep in mind.”

Key takeaways from the episode

Lifelong learner

[14:25] I think we often go to school and we think about getting a degree and we continue to improve enough until we get granted our certificate. But in almost every case, those skills you learn are going to be outdated. In data science, that happens every two years, probably less.

To get the most out of your education, focus on the energy and desire to go into uncomfortable situations. You want to be uncomfortable most days.

You should work on stuff that isn't clear or easy.

Important soft skills

[35:14] Being able to connect with people with clear communication. Most people say a lot, without actually saying much. Are you able to communicate something useful in a clear way? That is the only thing that matters in a business context.

What to do with these crazy job descriptions

[41:32] You have to be comfortable with what value you can deliver. If you think that your skill set and your background can provide value for that company in that position, then go for it. Companies often will post things because they've seen their competitors use a similar job posting or they've just aggregated all the words from all the job postings that they've seen. It doesn't mean that they know what they're looking for.

Is data science an art or science?

[24:07] Science in general is an art. Any science done right requires technical mastery, yet there is a lot of gray area in how you do things. As you become better at data science, you start to see that there's a lot of ways to approach problems. It's not always obvious what makes one data scientist better than another. It's not in the model that they build, typically. It's in how they define a question and then pursue an answer.

How to navigate the ambiguous real world environment of data science

[27:32] The challenge is that step by step problem solving doesn't work well in the real world. In the real world, problem solving happens in a continuous cycle of development. A lot of people have a difficult time adjusting to this way of problem solving.

Being a leader

[30:52] Being a good leader is about figuring out how to unlock, amplify, and develop the people around you. Most companies evaluate leadership based on your impact, and your impact as an individual contributor can be huge. You can make it easier for your whole organization to do something. So it is not always about titles. It comes down to your impact and how it helps the people around you.

Memorable quotes

[6:35] “...my journey was all about figuring out two things. One, how to work with data at scale. And two, what does it mean to actually do data science in a business context. And those two things are really, really important…”

[12:17] “You don't need to build an incredibly powerful model for every situation, but you need to know what's going to allow the business to thrive in a productive way.”

[19:48] “…getting by is not a long term solution to delivering value for a business, because what you're doing right now to get by is probably going to be automated in a few years…”

[23:50] “You're not always gonna be the expert in the room. And if you are, you're probably in the wrong room.”

The one thing that Eric Weber wants you to learn from his story

[47:16] You have to be pretty tenacious if you want to be good at something. If you want to make an impact, you are going to have to do things that make you uncomfortable.

From the lightning round

Best advice

Be humble.

Advice to 20 year old self

Do what you’re doing.

What motivates Eric

Knowing just a little more than he did yesterday, constant learning and evolving as a person.

Topic outside of data science we should study

Social science.

Recommended book:

“Blink: The Power of Thinking Without Thinking” by Malcolm Gladwell

Books and other media mentioned in this episode

“Multipliers: How the Best Leaders Make Everyone Smarter” by Liz Wiseman, Greg McKeown
“Thinking Fast and Slow” by Daniel Kahneman
“The 5AM Club” by Robin Sharma

How you can connect with Eric online


The Secret to Success Is In This Episode | Kyle McKiou Sun, 19 Jul 2020 20:00:00 -0400 15725a7e-3306-4513-a1e8-47b31ea1c8b9

On this episode of The Artists of Data Science, we get a chance to hear from Kyle McKiou, a data scientist who took the lessons from his own struggles to break into data science and packaged them into a course for up-and-coming data scientists. He is known for his remarkable talent for building skilled, balanced and productive teams. He gives insight into how he broke into the data science field, his approach to problem solving, and the importance of facing your fears.

Kyle shares with us the importance of finding a mentor that can guide you to accomplish your goals and the important soft skills that you may be overlooking. Kyle brings unprecedented wisdom and advice to this episode, and the points he outlines can help everyone step up their professional goals.

Some notable segments from the show

[7:43] What value Kyle believes data science will bring within the next few years

[11:38] How to transition into data science
[16:33] The importance of cultivating a growth mindset
[28:30] Soft skills that data science candidates are missing
[33:01] The single biggest myth about breaking into data science

Kyle’s journey into data science

Kyle’s journey began back in college, when he decided to get a PhD in mathematics. His original plan was to work as an investment banker or at a hedge fund. After a year into his classes, he realized that he didn’t want to be a part of the banking system. He wanted to find a career that was more fulfilling.

He eventually decided to drop out of his PhD program, and left with his master's degree. He then got a job as a software engineer. Although the job had aspects that were interesting, he still didn’t feel fully satisfied with the impact he was making in his role.

One day, Kyle stumbled upon a Harvard Business Review article about data science being the “sexiest job of the 21st century”. After reading that article, he decided that he wanted to transition into data science. The field promised a role where he would make a real impact in a business context.

[3:30] “Sure. So I originally went to school because I was interested in working at a big hedge fund or a big investment bank or something as a quant.

So I decided I was gonna get a PhD in mathematics. So I made the switch. I was actually studying exercise science at the time. I made the switch, and right into my first semester of college at University of Illinois, I was taking senior level math classes. I took five math classes my first semester. So I was working my way through the mathematics degree. And then after a year or so, I realized that I didn't want to really be part of the banking system. I didn't really see that necessarily adding a lot of value to society. I didn't think that was going to be very fulfilling. So I was kind of stuck in this position where I had a mathematics degree. I'm like, man, I don't know what to do with it because I don't want to work in banking. I don't want to be a math teacher. I don't want to work for the NSA. What can you even do with a math degree? So I started looking at what can I do with this? How can I start applying this knowledge? And that's where I started studying economics and computer science and statistics and all these other fields that were related to math. And I got really interested in doing mathematics on computers. So I ended up starting a PhD in scientific computing. And then I decided that a life in academia wasn't where I wanted to go either, because that's just a lot of research and writing papers.

And you never really get to do anything. You don't really get to make a real impact in the world. You don't really get to apply your knowledge. You just try to find new knowledge for the sake of it. So I ended up dropping out of my PhD program. I was actually doing a computer science PhD at University of Illinois at the time, and I just left with a master's degree. And so I had a master's degree and then I got a job doing software engineering. We were basically helping companies do electromagnetics simulations. So if you wanted to design a stealth fighter or a battleship, we were creating the software for you to do those simulations and development. Now, this sounds really cool, but the problem is that all of our clients were classified. So it's basically all top secret. And we would roll out this new software, all these improvements, and nobody would say a single word back. It's like if you had clients that just totally ignored you and then you thought, well, what the hell? Do people like the product? Is this helpful? Is this useful? Are they enjoying it? Or is this just a big waste of time? So I was a little bit frustrated that I couldn't really tell if I was making an impact. And that's when I learned about data science. I think I probably saw the same article that, you know, everyone else saw from Harvard Business Review, sexiest job of the 21st century. A friend sent it to me, and I looked at it and said, man, this seems cool because you really get to make a real impact on businesses. You're much closer to the business side, the application side, and the work that you do, while it's still mathematics and computer science and statistics and these other things, it's actually used by the business to get a real result.

It's not just theory, it's put into practice, and that's how I got into data science. I said "Man, how do I really make a difference, a positive impact in a company that I can see?" And that's why I decided I got to make this transition into data science.”

Where is the field of data science headed in the next 2-5 years?

Data science has been slow to adapt. It needs to become more scalable. The way to accomplish this is by making data science more systematic. It has to become more organized, with an engineering focus, instead of an analytics focus.

Many companies have tried to implement data science practices quickly. They focus on how to make money with the data they have. This does not work, simply because it is not well planned. You cannot scale with this approach. This is why companies struggle year after year.

Creating a systematic engineering focused discipline will make data science scalable, repeatable, and adaptable to new markets. Doing this will allow data science to have a large impact for the company.

[7:43] - “What needs to happen to make data science more scalable is it has to be much more systematic. It has to be much more organized. It has to have much more of an engineering focus and not just an ad hoc analytics focus, because a lot of companies have tried to set up data science practices really quickly.

What they do is they get one person or maybe a couple of people and they say, "oh, we've got some data, what do we do with it? How do we make more money with it?" That's just an approach that doesn't work at all. Then it ends up, of course, not working because it's not planned out. There's no real way to get value from this. There's no way to scale it. There's no way to make it repeatable. Then they say, "oh, data science doesn't work for us". They shut it down or they struggle year after year. So really, it has to be a much more systematic engineering focused discipline because that's what makes it scalable. That's what makes it repeatable. That's what makes it adaptable to new markets, to new problems, to new situations. When you can do that, that's when it makes a lot of money and a lot of impact for the company. That's when data science flourishes, gets a bigger budget, makes a bigger impact. So it's really a focus on engineering, more so than just can we build a machine learning model or do we have "artificial intelligence"?”

What will separate great data scientists from the rest of them?

What separates great data scientists from good ones is not dependent on skill in statistics, building statistical models, or even software engineering. It’s being able to understand the context of the problem that they are solving.

You can be a good data scientist if you can build good models, but if you want to be great, you will need to understand the data, where it came from, how it was collected, the relevance of the data to the business, and what problem you are trying to solve. This will allow you to build a model that is specific to your business problem. Too many people want to build models that have high R-squared or have low error, but this doesn't necessarily make the largest business impact.

Your model may not fit the real world, because the model is a hypothetical, perfect situation, while the real world is messy with multiple angles of interpretation. So understanding the context of the problem is what takes a data scientist from good to great.
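As a rough illustration of this point (it is not from the episode), the Python sketch below compares two hypothetical demand forecasts. The numbers, the `business_cost` function, and the assumption that under-predicting costs ten times more than over-predicting are all invented for the example. Under those assumptions, the forecast with the higher R-squared is the more expensive one for the business.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def business_cost(actual, predicted, under_cost=10.0, over_cost=1.0):
    """Hypothetical asymmetric cost: each unit under-predicted (a stockout)
    costs `under_cost`, each unit over-predicted (extra stock) costs `over_cost`."""
    total = 0.0
    for a, p in zip(actual, predicted):
        if p < a:
            total += (a - p) * under_cost
        else:
            total += (p - a) * over_cost
    return total


# Made-up weekly demand and two made-up forecasts.
actual = [100, 120, 90, 110, 130]
model_a = [98, 118, 88, 108, 128]    # always 2 units too low
model_b = [103, 123, 93, 113, 133]   # always 3 units too high

for name, preds in (("A", model_a), ("B", model_b)):
    print(name, round(r_squared(actual, preds), 3), business_cost(actual, preds))
```

Model A fits the data more closely, yet its errors all fall on the expensive side. Which side is expensive depends on the business context, which is exactly the information the accuracy metric does not carry.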

[9:56] “What will separate the great ones from the good ones is not their skill in statistics or building statistical models or even software engineering, for that matter. It's really understanding the context of the problem that they're solving. So you can be really good if you make this repeatable, if you understand how to build good models. If you can take data and turn it into predictions. But if you want to be great, you have to understand the context of the data, where it came from, how it was collected, the relevance to the business and the problem that you're trying to solve. So you can build a model that's, you know, much more subtle and specific to the situation of the problem you're solving, because a lot of people just see numbers and they say, "oh, well, I want to build a model that has a high R squared or that has a low error", and it's great to have a model with a small error, but that doesn't necessarily make the biggest business impact. So it's realizing that sometimes your data is biased, sometimes your data is not good, and a model that has more accuracy, you know, whatever your accuracy metric is, a model with more accuracy is not necessarily better in the real world, because, you know, this is a hypothetical, perfect situation model. Whereas the real world is big and it's messy and there's a lot of different angles to it. So understanding the context of the situation is really what takes someone from good to great and from making a moderate impact to a huge impact.”

Key takeaways from the episode

Why you need to do things that scare you

[19:38] Move towards the thing you fear. Fear is an indicator that there is an opportunity for you to grow here.

Why creating a system is the key to successful problem solving

[24:29] Don’t just solve the problem. Solve the problem by creating a system.

[25:49] Too many people stress the importance of problem solving by having an original, creative approach that is very innovative. This can be a daunting task, and it is very difficult to always have this approach. Instead, focus on breaking down your problem into smaller, actionable pieces.

How to work with non-data scientist stakeholders


1.) Understand what other people want and what they need to be successful.
2.) Formulate the solution that's going to help them.
3.) Communicate to them, in their words, why and how it is going to help them, because you need their buy-in.

The importance of modeling someone who you want to be like

[42:01] Find someone who has accomplished the goals you want to accomplish and has reached the level of success you are aiming for, and take their advice. That will accelerate your growth.

Important Soft Skills:

[29:22] The most important thing is your communications skills and being able to present your ideas.
[29:31] Understand what other people want, then align yourself with their interests.

Memorable quotes

[16:13] “Be risk averse; Test everything.”

[24:50] “You've got to engineer a system that solves the problem for you, because if you have to leverage your own intelligence to solve a problem, well, you're going to be very limited in the amount of work that you can do.”

[27:23] “...start with the problem you want to solve. You break it down to simpler problems. You break those problems down to simpler problems...all the way back until you get to your present state and then you see the exact path forward at any point in time…”

[28:31] “...I think in most careers it's not going to be the hard skills that separate you, particularly in data science…[it’s] those soft skills, because you realize that if you want to make an impact in the company as a scientist, you're going to need other people to work with you…”

[34:55] “It doesn't matter how much you know, it matters how much you can learn and adapt.”

The one thing that Kyle wants you to learn from his story

[37:12] “Do the thing, and you will have the power.” - Ralph Waldo Emerson.

The one difference between people that are “successful” versus not successful is they've done the thing. They've done the work. So if you want some sort of result, it really just comes down to you putting the work in to get the results.

Be patient enough to not quit. That's all there is to it. You put in the work and you keep improving and you don't quit. And if you do that, you'll get there. I guarantee it.

Lightning Round

Best advice that Kyle has ever received

“You should find someone who knows more than you and has had some of the success that you're looking for, and take their advice”

Advice that Kyle would give to his younger self:

[39:45] You can do anything you want. You just have to do it.

Books and other media mentioned in this episode

“The 10X Rule” by Grant Cardone
“Mindset” by Carol Dweck
“Psycho-Cybernetics” by Maxwell Maltz
“Deep Work: Rules for Focused Success in a Distracted World” by Cal Newport

How you can connect with Kyle

Data Science Dream Job - Free Webinar

Why We Should Be More Like Winnie The Pooh | Khuyen Tran Sun, 19 Jul 2020 20:00:00 -0400 0895c10c-fca6-4d3c-9963-fabe15901c9e

On this episode of The Artists of Data Science, we get a chance to hear from Khuyen Tran, a student of data science that is currently in pursuit of breaking into the field. She gives insight into how she prioritizes her tasks every day and strategies she uses to take notes and read books.

This episode gives our listeners a fresh perspective on how to approach the data science field, and some very interesting soft skills that you can implement to step up your game! Khuyen is definitely someone I believe will bring lots of value into the data science field.

Some notable segments from the show

[3:23] Ways to boost your efficiency and learning rate

[9:34] What inspired Khuyen to begin writing her posts on data science
[11:42] How to initiate projects in data science
[26:43] Reading books the right way

Khuyen's journey into data science

Khuyen has always been interested in the combination of math, programming, and application. She knew that she wanted to pursue a career that combined these areas. When she found machine learning, she knew that this was an area that could answer some really fascinating questions.

[2:35] “Yes. So I was always interested in the combination of math, programming, and application. All through my coursework I majored in applied mathematics, but I never found anything like machine learning: how I could use mathematical equations for something really useful, such as predicting heart disease. That fascinated me the first time I saw machine learning.”

Key takeaways from the episode

Strategies for boosting your learning rate

[3:23] Create a system, with the goal of finishing three tasks per day. Before the week begins, check to see what needs to be accomplished that week. Make sure you prioritize tasks for every day, and carve out time to check your email or phone. This will minimize any distractions.

Note taking tips

[10:59] Only write down the most important points from the courses you take, and make sure you can take action on what you learned right away.

Planning a project


  1. Set a deadline
  2. Make sub-tasks for your large tasks. This helps create small, approachable tasks.
  3. Understand the data (data visualization) and ask the right questions.

Khuyen's approach to reading books:

[26:43] Only read a book that interests you. Don’t be afraid to skip entire sections of a book if they don’t apply to you. There are so many great books out there, so don’t feel obliged to waste time reading every book cover to cover. If you read books that interest you, then the information will also stick more.

Memorable quotes

[4:43] “...maximize important tasks over the urgent but not important tasks...”

[11:25] “...the best way to learn anything is not from taking notes, but from... using it.”

[24:15] “...learn to love whatever you are doing and you will start to do it really well.”

The one thing that Khuyen wants you to learn from her story

[21:51] Have a specific goal in your mind, and go for it. Make a plan of attack that has specific tasks outlined. If you do this, you can achieve anything.

From the lightning round

Where do you see yourself in 5 years?

Working as a data scientist with lots of implementation.

Question she loves to ask in an interview

What attributes are you looking for in a candidate for this position?

Most interesting question asked during an interview

If you can be a cartoon character, which one would you be?

Best advice that Khuyen has ever received

One minute spent organizing can give you back hours in the future.

Advice that Khuyen would give to her 15 year old self:

Learn to love whatever you are doing, and then you will start to do it well

Recommended book:

“Outliers” by Malcolm Gladwell

Books and other media mentioned in this episode

“Deep Work” by Cal Newport
“Ultralearning” by Scott Young
“So Good They Can’t Ignore You” by Cal Newport
“Peak” by K. Anders Ericsson

How you can connect with Khuyen Tran

Personal Website

Everybody Has A Unique Gift and Perspective | Deborah Berebichez, PhD Sun, 19 Jul 2020 19:00:00 -0400 bc5d460d-5f7c-4b92-907c-356a555134cd On this episode of The Artists of Data Science, we get a chance to hear from Deborah Berebichez, a physicist, data scientist, and TV host. Her passion for learning and teaching has led her to become a voice for women and minorities in STEM. She gives insight into how she broke into the data science field, how to cultivate the right mindset to succeed, and the importance of diversity and inclusion in tech. Deborah shares with us how she grew up in a conservative environment, and the obstacles that she had to overcome to become the first Mexican woman to graduate with a physics PhD from Stanford University. This episode is packed with actionable advice along with wisdom from someone who has had tremendous success!


Some notable segments from the show

[17:11] What value Deborah believes data science will bring within the next few years

[20:43] Deborah’s role model for being curious and inquisitive

[27:42] Actionable tips for cultivating the habit of critical thinking

[40:07] Advice on how to be the hero when you feel like a failure

[51:47] Advice for women that want to break into tech

Deborah's journey into data science

As a physicist, Deborah began her career working on Wall Street, where she analyzed stock market data. Her background was in computational physics, which is very closely related to data science.

One day, she was invited to the STRATA conference, which was a humbling experience for Deborah. It gave her more insight into how data science can be applied outside of Wall Street stock data, and the various types of data that can be analyzed.

She always knew she wanted to have a large impact with her work. That led her to Metis, where she was able to connect education with data science. Her work at Metis has allowed her to teach others data science, while also taking part in projects that help underprivileged areas.

[3:54] I think it was a serendipitous path, in that I didn't really expect to become a data scientist.

I had never heard the term. And about 15 years ago, when I had finished my PhD, I started working on Wall Street.

Like many physicists, because I wanted to be able to get a green card and stay in the U.S. And as you know, there was a strong connection between the financial markets and the PhD programs in physics and math and statistics across the country. And so it did not even raise eyebrows. There were a few thousand physicists working on Wall Street. And so I finished two post-doctoral fellowships after Stanford, at Columbia University and at NYU, at the Courant Institute, in applied math and applied physics.

And then I started working in physics, and I realized that academia was a bit too isolating for me. I wanted to communicate more with the public and evangelize different products and have an impact with my coding and what I was doing. I did computational physics, by the way, and so it was pretty close to data science. We would just not call it that. But I had never realized that what I was doing was a very narrow form of data science, meaning I was, you know, quite proficient with that particular aspect of machine learning.

But data science was much more vast than what I was doing. And so I was humbled by an experience I had at STRATA, the big data science conference, when I was interviewed on video. I think I said something that I've regretted saying ever since, which was: oh, but come on, data science is nothing new. You know, we have had physicists and Wall Street people doing it for the past 50 years and nothing has changed. And, you know, I was proven wrong quite quickly, because we definitely were analyzing things with different algorithms, and we were analyzing different kinds of data that we had never analyzed before, such as audio and text and images and whatnot. And so there were a lot of differences. And also, in data science you were required to translate the insights that were gained into quite, you know, lay and entertaining terms, so that the stakeholders in a company could actually enact policies and change things and veer the company in a different direction to gain success based on those insights. So that's how I started. I finished my postdocs. I worked on Wall Street for six years, and then I realized that what I was doing on Wall Street was research and, again, working with data. It was stock market data, to be specific. But I also knew that I wanted to have more of an impact in the world and do good for people.

And at the time, I had been following my friend Hilary Mason, who's a renowned data scientist, and I loved her work.

And I saw Cathy O'Neil and other people use the data science analysis that they did to bring more ethics into the world and more visibility to underserved communities when it came to doing data science work.

And so I ended up wanting to connect education with data science. And that's how I came to Metis, where I'm currently the chief data scientist. We're a data science training company where I've had the chance not only to train people by teaching a machine learning bootcamp and creating curriculum, but also to do Metis for good projects, like helping create a live map of needed things during an earthquake that happened in Mexico about four years ago. People could go to the map and in real time see what kinds of items or people were needed in different locations. So it's been a wonderful world of work where I can actually not only help people, but also educate companies and others in data literacy. And that's what I've loved about my work in data.

Where is the field headed in 2-5 years?

Data science has been slowly partitioning into more and more specific professions. People in the field have tried to capture a lot under the umbrella of data science, but that has not worked well. People want to know how data science can serve them. With this in mind, we are going to see a shift in who has access to data. People will now have insights at every level.

This change will also shift how data is reported to executives. Companies will need to hire people with strong engineering or data science backgrounds to act as translators between the executive level and the company's data.

At the same time, as sophisticated and complex algorithms become successful at solving certain problems, we're going to see more people hire specific experts within data science. This will create more jobs in the field for people with specific skills and training in certain areas, as well as people with less technical backgrounds.

[9:51] “Yeah. So I think that a lot of data science has been slowly partitioning into more and more specific professions.

We have tried to capture a vast amount of things under the umbrella of data science, and that has not worked well, because companies have been hiring data scientists, some of whom have expertise in data management.

Others more in sophisticated algorithms like deep learning, and others more in statistically based data analysis. And so I think people want to know what they can get out of data science.

And so we're seeing the proliferation of dashboards and easy platforms like Tableau that are going to be able to be used within an organization with very little training.

That is, pretty much anyone with some initial training will be able to have access to the data that a company has. And people will have insights at every level. So we're going to see that.

And those people are going to be translators, or bridges, between the executive levels of the company and the company's data.

And so we will need to hire people with very heavy engineering or data science backgrounds for those things. At the same time, as more sophisticated and complex algorithms become successful at solving certain problems, we're going to see more people hire specific bands within data science.

That is somebody who is exclusively an expert in NLP algorithms or in visualization techniques and whatnot. And so I think that more and more jobs are going to open.

But they're going to require more specific skills and more training in certain areas, as well as people from less technical backgrounds having access to more commoditized platforms.”

What will separate great data scientists from the rest of them?

A good data scientist is someone that can efficiently manipulate, clean, and gain insights from a data set and can propel a company forward.

A great data scientist can think outside the box and outside the established algorithm. They can think critically. They check that the statistics are correct, and are not deceived by the data source. They are aware of possible agendas behind the data source, and make sure that data is not being misused to propagate opinions or certain political views.

[12:19] “Oh, that's a good question. I think somebody who has the skills that I call critical thinking will definitely advance way more than the good data scientist. So I think we could define a good data scientist as somebody who is able to efficiently manipulate, clean, and gain insights from a data set that have actionable metrics that can propel a company's or an institution's business forward, whereas a great data scientist will be somebody who can think outside the box and outside of the established algorithm, and both go back to the basics and make sure that the statistics are correct, which a lot of people don't think about now, and not be deceived by, say, the sample that gathered the data, the agenda behind the data source or the company that's providing the data, and whatnot, and really gain deeper insights by creating an algorithm that specifically tests what they know they want to test, with metrics that are specific as to the errors that get propagated when statistically measuring only a sample of the population.

And we're not really paying attention to how, at every step of a data science project, we can unintentionally or sometimes intentionally propagate these errors and misuse data science to gain insights that are eliminating from our goal the version of a comprehensive truth, so to speak. Like, we can, you know, test a voter data set by unconsciously eliminating the opinions of certain minorities or certain other political views and, of course, then gain insights that are not actually representative of what the political ecosystem is.”
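The voter data set point can be made concrete with a small sketch. The group sizes and support rates below are made-up numbers chosen only for illustration, not figures from the episode. The sketch shows how leaving one group out of a sample shifts the estimate away from the true population value.

```python
def approval_rate(responses):
    """Share of 'yes' answers (1) in a list of 0/1 responses."""
    return sum(responses) / len(responses)

# Hypothetical electorate (illustrative numbers only):
# 8,000 majority voters, 40% of whom support a policy,
# 2,000 minority voters, 80% of whom support it.
majority = [1] * 3200 + [0] * 4800
minority = [1] * 1600 + [0] * 400

# Estimate from the whole population.
full_rate = approval_rate(majority + minority)

# Estimate when the minority group is dropped from the data set.
biased_rate = approval_rate(majority)

print(f"all voters:       {full_rate:.2f}")    # 0.48
print(f"minority dropped: {biased_rate:.2f}")  # 0.40
```

In this toy case the estimate is off by eight percentage points, and nothing in the filtered data itself signals that anything is missing.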

Key takeaways from the episode

Art or science

[34:00] I caution people who view it as an art. Although there are aspects to data science that are artful, the best data scientists are those who are meticulous in their scientific process.

The creative process

[37:08] A good data scientist can look at the data and begin to piece together a story. What is the data telling us? How will it affect change? This is how the creative process manifests itself in data science.

How to be a hero when you feel like a failure

[40:07] It’s by reminding ourselves that our measure of success is not about how many likes we get, but how we measure our learning and our growth based on what our goal was before we took on an enterprise. Just learning to appreciate how much you have grown and come forward is an incredible skill that you need to nurture and practice.

Biggest myth

[44:14] You need to be a genius in every aspect of data science to do well. In reality, the vast majority of data scientists are good at a few aspects of data science. Have some general knowledge, but do not strive to become an expert before you get your first job.

Diversity and inclusion

[51:47] Don't let the perception and the stereotypes that have formed your unfortunate biases govern what you do and how you behave. Act as if you're confident, even if you don't feel confident yet, and things will happen for you.

To foster inclusion in the data science community, make women role models visible. Put technical women in highly visible roles. That's the best you can do.

Memorable quotes

[19:57] “…I think the most amazing things that are going to happen [due to data science] are giving transparency to industries and to communities of people that otherwise in the past have remained quite invisible”

[24:19] “I am a very strong supporter of making people learn and educat[ing] others in the basics of science so that we can become empowered citizens and know more about the world.”

[24:50] “...Critical thinking to me is about questioning authority…[it] allows us to gain the proficiency in being able to discard lies from the truth.”

[28:12] “...Make sure that you recognize the biases that you have about the world and what you want to be truth so that you don't blind yourself to the actual results of a data analysis”

[40:59] “…The people who end up succeeding in life are not the ones for whom things come easily. They are the ones for whom obstacles are just something to transcend and the ones that get up every time that they experience a failure in their lives and they keep going.”

The one thing that Deborah wants you to learn from her story

[55:47] You can make your dreams come true no matter what. Do not believe what people think of you and what your abilities are. Always seek for that inner voice that tells you what you want to do and believe in yourself.

From the lightning round

Deborah's data science superpower

Being very detail oriented! She's known for finding even the most infuriating of bugs in code, like a missing comma buried deep in some module.

Deborah talks about the most fundamental truth of physics

Science is not about facts. It's about discovery and an ever increasing, more comprehensive view of reality. The school system doesn't do us justice when it comes to learning science: it's more than just selecting an answer on a multiple choice test. The real world is messy, and finding the truth is an iterative process. Science is a process, and it is a process of discovery.

The best advice that Deborah has ever received

The best advice she got was from her husband's PhD adviser. He once said: "Hold your water". It didn't click in the moment, but she has since understood it to mean: Don't engage. Don't try to be the one who's right. Don't try to be the one who wins an argument.

The advice that Deborah would give to her 20 year-old self

At 20 years old, Deborah was in Mexico studying for an undergraduate degree in philosophy, and was being told by everyone around her not to pursue the hard sciences. If she could go back in time, she would tell herself to pursue the dreams that she has, even if she is not the best at them. It's better to do the things that are hard than to stay with something that comes easily.

What's a topic outside of data science we should study

Critical thinking.

Recommended book

“What Do You Care What Other People Think?” by Richard Feynman

Song on repeat:

David Bowie, Changes

Books and other media mentioned in this episode

Books by Ruth Spiro (for children).
Outrageous Acts of Science

Where you can find Dr. Berebichez online

Personal Website

Pick The Right Voices To Listen To | Brenda Hali Sun, 19 Jul 2020 19:00:00 -0400 3a26db43-3f52-4b05-a3b1-5e63cdc57a99 On this episode of The Artists of Data Science, we get a chance to hear from Brenda Hali, a marketing guru turned data scientist who is passionate about using data to understand causation and to promote company growth. She gives insight into how she broke into the data science field, how marketing and data science are related in some ways, and the struggles she faced when breaking into tech. Brenda shares with us her transition from marketing into data science, along with the importance of having the representation of women and other minorities in the tech industry. This episode really shows why diversity and inclusion in tech is so important, and how we can all play a role to help others break into the field.


Some notable segments from the show

[6:56] What marketers can learn from data scientists

[11:07] Steps to take when beginning a new project

[17:33] How to communicate effectively with your team in the post-COVID world

[20:56] Advice for women and minorities that want to enter into data science

Brenda’s journey into data science

Brenda heard of data science about six years ago, even though she did not have a background in tech. She had always been interested in learning more about coding, and so she decided to learn how to code on her own, by watching video tutorials and practicing at home.

Brenda took the leap into data science when she participated in a program that helped entrepreneurs from Latin America. She was a part of a small team, and they had gathered data from over 20,000 people! They needed to find trends in the data, but managing such a large data set was difficult to do.

Brenda was very passionate about this program, since she grew up in Mexico and lived across many Latin American countries. At that moment, she decided that she needed to find better ways of analyzing data.

[2:25] “I think that I heard of data science a couple of years ago, maybe like six years ago. And I mean, even though my background is not exactly in tech, I didn't study anything related to software or tech in undergrad. I've been learning by myself. I like--through YouTube, through tutorials. Like I learned how to do--how to program in front-end development. I learned how to build apps. And I learned--I even created like a bot for social media because I wanted to follow some certain hashtags. But I never did that as a formal education. And in that curiosity, I was at a tech conference, because I love conferences and I love meeting people. And I was amazed about all the possibilities when I heard the term big data, but that was back in 2013 and I didn't know exactly how I could transition into data science. After that, I think the moment when I decided to make that transition to data science for real was because I was working in this program for helping entrepreneurs from Latin America, and it was a White House initiative. It was Obama's White House initiative, and we have all of these data from 500 entrepreneurs that we needed to find. We had several datasheets, like from surveys, CV information, applications. Over twenty thousand people applied, and we needed to look for some trends. We had a really small team and they were mostly inclined to political science, public policy, but not really into tech. So I came to the team and I literally showed them how to use pivot tables. So the problem that we had at that moment was that when we were transitioning from one White House to another White House, a way how programs are a way that are in a different sense sometimes before you go. You could find like a couple of stories more and just share those stories like this is changing the world, and this is changing the U.S. as well. But in this case, they want that number. So they wanted to measure that impact.
And we launched a survey and then we had all these responses from entrepreneurs. We didn't know how to analyze all of the data that we had. It was a lot of data, a lot of rows. We didn't even know how to properly manage it, like how to read it completely or to find trends. And I remember at that moment, my team, they were mostly from the US. I'm from Mexico, and I have lived in a couple of places in South America, like across Latin America. For me, that was really close to my heart, because I was seeing the impact of this program. I didn't want the program to finish. So in that moment, I basically told myself, I need to do something about this. I need to know how to analyze data, because there must be a better way than just reading or printing a bunch of things and analyzing it in that way. So that moment is when I decided to really look for that transition into data science. And it took me a couple of other years to actually act on it.”
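Looking for trends in survey rows, as described above, often starts with counting responses by category, which is what a spreadsheet pivot table does. Here is a minimal sketch using only the Python standard library. The rows, countries, and sectors are hypothetical placeholders, not data from the program Brenda describes, and `pivot_counts` is an illustrative helper name.

```python
from collections import Counter

# Hypothetical survey rows as (country, sector) pairs. Not real program data.
rows = [
    ("Mexico", "education"),
    ("Mexico", "health"),
    ("Colombia", "education"),
    ("Peru", "agriculture"),
    ("Colombia", "education"),
    ("Mexico", "education"),
]

def pivot_counts(rows):
    """Count rows for each (row_key, column_key) pair, like a pivot table."""
    counts = Counter(rows)
    row_keys = sorted({r for r, _ in rows})
    col_keys = sorted({c for _, c in rows})
    # Missing combinations come back as 0 because Counter returns 0 for unseen keys.
    return {r: {c: counts[(r, c)] for c in col_keys} for r in row_keys}

table = pivot_counts(rows)
for country, by_sector in table.items():
    print(country, by_sector)
```

With tens of thousands of rows the same idea applies. Only the input changes, and a tool such as a spreadsheet or a dataframe library would normally take over from here.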

How do you see data science affecting marketing in 2-5 years?

Brenda predicts that marketing teams are going to focus on hiring marketing data scientists, instead of just hiring creative professionals. Teams that do this will have an advantage of having someone that can track the data and analyze it.

Brenda also sees automation on the rise within the next few years. She thinks that small businesses will need data scientists too, since a majority of the data these businesses collect is “dark data”, or data that is not being used.

Lastly, Brenda thinks that we are going to see faster GPUs, which in turn will lead to running algorithms much faster.

[9:01] “Well, I see how possibly every marketing team now is going to have a marketing data scientist in their team, more than a data analyst. Like someone that really understands how to track the data. So on the team side, I see them hiring more data scientists. That's one thing that I've seen, because before, these marketing teams were more like creative people. But now I'm seeing that they're hiring more people with a major in math, with a major on the economics side. You can say, like, why is someone that is majoring in math in marketing, if marketing is a creative field? No, like, everything should be based on numbers. So I'm seeing the field move into that. I'm definitely seeing more automation, way more automation. And right now, 93% of the data that a small business has goes to dark data. That means that that data is not used. So I can see also how even the small businesses in their marketing teams will start hiring people in data science to use their data possibly. And I see, like in five years, probably most businesses will be generating big data, so that's what I'm seeing. Besides that, everything is going to be faster because of the use of GPUs. GPUs are getting cheaper. And that means that, for example, if before you ran an algorithm to predict something and it took eight hours, now it is going to take 50 minutes. So that process is getting faster. So probably we're going to reach the point in which everything is going to be close to real time with big data, and that, with marketing data and with all the software, is just gonna get crazy. Well, I love it.”
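To put the eight-hours-to-50-minutes example in numbers, here is a quick back-of-the-envelope calculation. It uses only the two durations from the quote; the function name is illustrative.

```python
def speedup(before_minutes, after_minutes):
    """How many times faster the new run is compared with the old one."""
    return before_minutes / after_minutes

before = 8 * 60   # eight hours, in minutes
after = 50        # fifty minutes

factor = speedup(before, after)
saved = before - after

print(f"{factor:.1f}x faster")                               # 9.6x faster
print(f"{saved // 60} h {saved % 60} min saved per run")     # 7 h 10 min saved per run
```

That is a little under a tenfold speedup.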

Key takeaways from the episode

What data science and marketing can learn from each other

[6:56] Marketers might not be aware of the tools that exist that can help them understand the trends in data regarding impact. This is where data science can help marketing. Tools exist that can measure what music is catchy, and will therefore generate more money.

Steps taken during the beginning of a new project

1.) Understand the type of project you are working on. How does this project help the organization?
2.) Communicate clearly with the organization, to receive the right feedback.

Post-Covid teamwork

[17:33] Be comfortable communicating with your manager and team members openly during this time. Even though you might not be in the office together anymore, make sure your team uses project management tools that optimize the communication between team members. Also, if you have some free time, take some classes to advance your career, and communicate this with your manager. Now is the time to learn and grow.

Breaking into tech as a woman

[20:56] It is difficult breaking into a field where you are not being represented. If you see yourself being represented, then it allows others like you to believe they can also achieve success. My advice to women is to find a community that is going through the same struggles as you, and find a mentor that can guide you.

The four things women need to make tech their next big success

Don’t be afraid to explore. You don’t want to have a fixed mindset about your career, since you will be doing it for the rest of your life.
Trust your plan.
If you need to, delegate tasks.
Be comfortable being the minority in the room.

Memorable quotes

[15:02] “...need to have communication with your team, and that communication needs to be in one place”

[15:47] “...experiment fast and let things go…”

[23:52] “Be careful with who you listen to, and be careful when those voices are close to you.”

The one thing that Brenda wants you to learn from her story

[29:43] Don’t be afraid to explore and find your passion. You have a long professional career ahead of you, so don’t be afraid to look for your calling. Also, never stop learning.

From the lightning round

Brenda's data science superpower

Brenda has a knack for growing things: she's a data gardener! Her superpower is that she can start something and grow it.

The best advice Brenda has ever received

Brenda got a very insightful bit of advice, which I absolutely love: people care about themselves. They don’t care about you. We often think that people think about us way more than they actually do, but the truth is, nobody is thinking about you. They're too busy thinking about themselves.

What motivates Brenda

Brenda is the first one in her family to go to college and become a professional in tech.

Now that she has accomplished some amazing things academically and professionally, she is motivated by the opportunity to be a role model to her siblings and cousins, and show them that there is a place for them in tech.

The advice that Brenda would give to her 20 year-old self

Change your career path.

Topic outside of data science we should study:

Effective communication.

Recommended book:

Nudge by Cass Sunstein and Richard Thaler

Data science bias blunder:

Instagram blocking photos of overweight women in bikinis.

How to connect with Brenda


Overcome Hurdles in the Job Search by Igniting Your Passion | Chhavi Arora Fri, 03 Jul 2020 14:00:00 -0400 ca481a8b-c11a-4467-a204-f6e26552d071 A mock interview with a rising star of our industry and some helpful tips for preparing for any upcoming interviews you have

On this episode of The Artists of Data Science, we get a chance to hear from Chhavi Arora, one of the rising stars in the data science industry! She gives insight into how she broke into the field, the hurdles she had to overcome in the job search, and how she answered commonly asked questions during an interview.

Chhavi shares with us what got her interested in data science in the first place, along with the biggest self-limiting fear that she had to overcome in order to begin her journey into data science. If you are interested in becoming a data scientist but don’t know where to start, then this episode can answer many of your questions!

Some notable segments from the show

[9:23] The mindset you need to adopt during the job search process

[11:04] How Chhavi overcame her biggest self-limiting belief

[14:58] How to get a leg-up on your competition when applying for jobs

[18:05] Commonly asked questions during interviews, and how to answer them

[24:55] How to prepare questions for the interviewers, and why it’s crucial

Chhavi's Journey into Data Science

Chhavi’s journey into data science began when she volunteered in various NGOs during her undergraduate program. One that was very instrumental for Chhavi was an NGO that focused on children with special needs. Her passion and interest to answer difficult questions began here.

Coincidentally, her statistics professor at the time introduced her to data science and machine learning. Once she found out about this career path, she decided to acquire a master’s in mathematics to build upon the foundation of machine learning.

[3:22] So during my undergrad, as you mentioned correctly, I used to volunteer a lot for a bunch of NGOs, but there was a specific one which catered to a bunch of children with special needs. And they used to conduct different events, social events for them and the students there with special needs, they were the ones who need to step up and perform. And I mean, it just baffled me to see that there's so many things and go around and around us and the world be it on personal level, be on a professional level and so many questions. People want answers for. And I felt I just internally I had this I felt like I found my calling and I felt I would find peace eventually in a career where I can impact the lives of people I work with or who might come in contact with everyday in in one way or another. And coincidentally, it was during my undergrad so my statistic professor has been in touch with me for just understanding a lot. And he he in fact introduced data science to me. And that is that as a first time, I heard of the whole idea of machine learning and how it has been used in a variety of formats to help people around. And I just realized that this is this is the career I wanted to build for myself. And that is when I decided to do a masters in mathematics, because mathematics is the foundation for machine learning, no matter what, no matter how much you code, it is the essence of it. So I just wanted to have a very strong foundation to begin with. And that is that is the whole idea of why I eventually decided to be a data scientist. So it goes all all the way back there.

Key takeaways from the episode

Growth mindset

[9:29] Having a growth mindset is absolutely crucial when it’s time to begin applying to jobs as a data scientist. There is so much competition now, and getting your foot in the door is becoming increasingly challenging. Having the growth mindset allows you to navigate through the rejections and continuously learn from them.

Biggest self-limiting belief to overcome

[11:04] Take pride in any gaps in employment in your life. Rather than thinking these gaps will bog you down, think about the positive benefits these gaps may have had in your life. Make sure to speak about these positives during your interview, and the interviewer will notice your honesty and confidence.

Interview Questions

Chhavi and I go through a mini mock interview where we touch on some commonly asked behavioural interview questions and her responses to them. This starts at the [18:05] mark of the show.

Tell me about yourself

I always begin this answer by talking about my superpower. My biggest superpower is my determination and a desire to learn.

Can you describe a time when you had to deal with competing priorities and with competing deadlines?

Make a to-do list to define the specific tasks that come under each of the projects. Then speak to the stakeholders for each of the projects and try to understand the impact of each one.

What's the most difficult type of person to deal with and how do you deal with them?

I think the most difficult type of person to deal with is someone who is a little too adamant in their choice of approach and who has difficulty accepting it when a new approach to a task is suggested.

So walk me through your discovery process when you're starting a new project.

Get out a pen and paper and write down the following questions:
Why am I doing this project?
What sort of questions am I actually trying to solve from this project?
What impact will this project bring to the business?

Do you have any questions for us?

For this question, make sure you prepare by reading any public information that is available on the company you're interviewing at. Then, ask HR as many questions as you can. From the insights you gather from your HR interviewer, frame appropriate questions for the rest of your interview.

How to network with people on LinkedIn:

Make sure that your first message to a person is about them, not about you. Please don't talk about how amazing you are, how many skills you have, how much education you have, etc. Instead, focus on talking points regarding what you learned from this person.

Memorable quotes

[6:39] “...every project you do as a data scientist needs to be something that you have interest in so that you know what questions you are looking for and you will eventually find answers to your work.”

[12:46] “...every little weakness that you think you have can become a positive thing if you spin the story right.”

[17:16] “...the most important thing is to never, never stop being passionate about data science...because the learning never stops.”

The one thing that Chhavi Arora wants you to learn from her story

[28:58] The most important thing about my story is my passion. I take a lot of pride in how passionate I am about being a data scientist, and just about everything else that I do. Know that passion is the only way to go.

From the lightning round

Where do you see yourself in 5 years?

Chhavi envisions herself in a leadership role as a data science manager.

Best advice

Have faith in yourself.

Advice that Chhavi would give to her 18 year-old self

Don't worry.

Recommended book:

“Made to Stick” by Chip Heath and Dan Heath

How you can connect with Chhavi Arora

Connect with Chhavi on LinkedIn.

The Monsters in Your Head | Brandon Quach, PhD Fri, 03 Jul 2020 14:00:00 -0400 76464534-4a8a-4074-82e7-7f7bfb72bfe4 On this episode we talk to Dr. Brandon Quach and he shares with us his leadership philosophy, why great thinkers (like data scientists) should hate being told what to do, the mindset of future judgement, and how to deal with the monsters in our head so that we can achieve our full potential.

On this episode of The Artists of Data Science, we get a chance to hear from Brandon Quach, a data scientist who has a PhD in bioengineering, and has worked on threat analysis for security and business ecosystems. He's currently a principal data scientist and manager, leading the charge to modernize the customer experience by applying machine learning to customer support.

Brandon shares his perspective on how data scientists should approach problems, the importance of passing on knowledge, how to be a leader in the data workspace, and the appropriate mindset to develop when faced with difficult problems. Speaking with him was an honor, and this episode has something for everyone to take away.

Some notable segments from the show

[11:50] Brandon discusses automation and whether or not we will be able to automate human judgement

[18:01] What qualities do you need to become an intrapreneur in your organization

[22:19] A unique way to approach leadership in your organization

[30:08] Why great thinkers abhor being told what to do

[37:37] How important is agile and scrum methodology in data science

[46:13] The mindset you need to accept the monsters in your life

Brandon’s journey into data science

Brandon’s journey began during his PhD program at Caltech, where he was focused on experimental subjects. While working on these experiments, he began to notice factors in an experiment that had large impacts, but were not quite being measured. He thought about the math behind these factors, which led him to becoming a consultant. In this role, his title was Senior Associate, even though he was working as a data scientist. During this time, the field of data science had not quite been established, hence his title.

[4:06] “Yeah. I mean, for me, there wasn't really much of a moment of breaking into it because there wasn't too much of a field when I started. Right. So I did my PhD at Caltech and I was very experimental. We were taking silicon wafers and we were trying to make them do things like duplicate DNA, using micro-fabrication, things like this throughout the process. I was thinking there were a lot of setbacks that happened in the laboratory setting that wasn't didn't really reflect what I thought I was capable of intellectually. Right. So, you know, maybe I would design a cool experiment, but by the time I did it, then something random would happen.

Maybe I turned on the nitrogen tank. There's no more nitrogen, and everything has been thawed, everything's been prepped, and now it's like you've just lost a whole day's worth of work. And sometimes worse, sometimes weeks or maybe even months worth of work because of that one moment when everything ready to go. Something happened that totally contaminated the experiment. And what things like that happen - and I kind of noticed well, you know, I think I'm more of a like a thinking person. I think I enjoy doing the math more than the experimental stuff.

So I started to look into sort of alternate careers that weren't so experimental. And, you know, consulting was there. There were the legal path was there. There were legal firms would come in and say, well, maybe if you work with us reviewing patents and such for a few years, then you might go to law school after that and and become like a patent lawyer. And so then I was interested in all those all of those things. And eventually I went into consulting. So it was with my first employer Opera solutions now known as ElectrifAi. And we just did consulting for a bunch of different companies. There are a lot of different fields that you had mentioned before, and the title was called Senior Associate. That was just it. And it wasn't really called - I mean, even when I was still called something like analytics manager. It wasn't a data scientist. We never really called it that until probably that transition of when I left and went to Lytx when the title became data scientist.”

Where is the field headed in 2-5 years?

When Brandon entered the field, everyone was saying that automation was going to allow models to build themselves, and he thinks we are still in this phase. He thinks this is going to continue to happen over the next few years as well.

[12:06] “Yeah, you know, I thought about that a lot. I'm not really sure myself, because two to five years ago, everyone was saying that this automation was going to come in, that this building the models, the models were gonna build themselves, they're going to tune themselves and all this.

And it makes sense to me. A lot of the things that I was doing did seem pretty simple. You would do it a couple of models you would choose, you would optimize in this way. You would choose this. You would do this parameter searches and. Yeah. You could. I mean, I was automating them to write. I was writing scripts that would automate all those kinds of stuff. And I thought, yeah, this is probably got legs and this is gonna happen. And so for the last two to three years, I was thinking the next to two years was gonna be about automation and that the data scientists would be akin to a modern, let's say, mechanical engineering who might have done studies in how to - like in fluid dynamics, right. And how to model fluids and what's the pressure and velocity at every point along this wing. But they have software for that. You do do the simulation and you're like, well, now I've got the software. So you're thinking, does that mean I don't need the engineer because the software did it automatically? We're - I think we're gonna get to that phase. But the strange thing to me is that I've gotten the impression that that phase is coming very quickly.

For the last couple of years. So now here I am, right. Fast forward two years and it's not and it's kind of here. I've seen it here and there, but I'm at least I'm still using Python. I'm still coding things myself. And so I think that's what's going to happen, continue to happen in the next few years, two to five years. That being said, you know, maybe the time will come when a lot of these efforts to automate things come into play. And, you know, as I mentioned before, it depends on the industry and the problem you're working on in my career path, since I've always been working on new problems. I don't see it impacting us much, but I'm trying it right. And some of the employers that I'm working for, they themselves are trying to build and have built these kinds of things. And I'm I'm trying to use them as well. And I'm providing feedback on the features that have been developed. And what does it mean to work on a like a real data science project where our ROIs expected it to happen and not just kind of a research thing?”

What will separate great data scientists from the rest of them?

[15:56] I think it's the ability to think through problems, and having the intuition to think about what the next step is to your problem.

Key takeaways from the episode

Important soft skills

[50:24] Learn how to think through a problem in its individual components.

How to be an intrapreneur

[18:16] An intrapreneur is somebody who is willing to do whatever it takes to solve the problem, even if that means thinking like a software engineer or from a business perspective. You need a vision, and the ability to take ownership.

Growth mindset and grit

[44:17] You show grit in all aspects of your life when you learn to live with setbacks, pain, and the work that’s required in any endeavor.


Leadership

[22:30] My philosophy on leadership is based on servant leadership. This is a leader that helps people grow and produce their best work, and gives people the independence to choose how they solve a problem, rather than telling them what to do.

Mindset for difficulties

[46:13] If you have monsters and you feed them, they will keep coming back. Even if you don’t feed them, they may not go away. Don't think “woe is me” for the problems in your life. This feeds the monster. Instead, just accept them and move on.

Memorable quotes

[22:37] “..., to me, comes from your ability to not be scared of the results that come out of your work or anything that you do.”

[27:25] “...If I received good advice and...good guidance, then I feel it's sort of my job, my duty, to pass it on to the next generation.”

[30:08] “Great thinkers like to figure things out and come to a point that they believe in the solution.”

[35:33] “I want people to look back long after I've gone and say...that decision that was made early on that nobody had appreciated...turned out to be really critical down the road…”

[53:33] “...successful data scientists can think through any kind of problem surrounding data science, not just the core problem.”

[57:05] “You should learn how to think through code. How can you learn how to think through code? Well, either you have a built in imagination... and/or you have gone through a lot of iterations of code and you can understand the process...”

The one thing that Brandon wants you to learn from his story

[58:21] Expect that good and bad things will happen to you. The good and the bad things will stick with you through your journey, and you can’t get rid of them. Just accept that, don’t fight it.

From the lightning round

Best advice

When you need to communicate a problem with your leaders, make sure you bring solutions.

Source of motivation:

What motivates me is the idea that I'm going to do something that I'm proud of, not something that I hope somebody else will like. As a leader, what motivates me is to help people grow.

Advice Brandon would give to his 20 year-old self

Keep doing things that you think are interesting.

Topic outside of data science we should study:

Body language. Data scientists spend time explaining things to non-data scientists. If you can understand body language, you may pick up on cues of whether people understand you or not.

Books that Brandon recommends you check out

“Case in Point: Complete Case Interview Preparation” by Marc Cosentino

Books and other media mentioned in this episode

“Search Inside Yourself” by Chade-Meng Tan
“Linchpin: Are You Indispensable?” by Seth Godin

How you can connect with Brandon Quach

Connect with Brandon on LinkedIn
Personal Website and Blog

Tech Culture Needs to Embrace EQUITY | Brandeis Marshall, PhD Fri, 03 Jul 2020 13:00:00 -0400 3aeed775-a6bd-487a-9e21-8201ca0a51f9 Dr. Marshall stops by the show to discuss how she broke into data science, her research involving social media, the #BlackTwitterProject, plus the why's and how's of embracing diversity and equity in the tech world.

On this episode of The Artists of Data Science, we get a chance to hear from Brandeis Marshall, a computer scientist who is excellent at breaking down difficult concepts into easily digestible pieces. She is passionate about educating people on data, as well as understanding the impact data has on race, gender, and socio-economic disparities. She is the CEO of DataEdx, a company which focuses on making data science accessible to all professionals.

She shares her perspective on how data impacts communities, how to promote diversity and inclusion in the data science space, and the importance of documenting your process. It was an absolute pleasure to hear her perspective, and I believe her message will help broaden the data science field.

Some notable segments from the show

[8:29] How data impacts marginalized communities

[13:29] From Brandeis’s perspective, what separates great data scientists from good ones

[14:48] Understanding how data is packaged, and ways to break it down into bite-size portions

[19:30] The impact of live tweeting on social movements

[30:09] Discussing inclusiveness in the data workspace

[39:46] How to be gritty and break away from negative thoughts

Brandeis Marshall’s journey

Brandeis began her data science journey when she entered graduate school. She had always been interested in user experience (UX). But when she found out that the well-known professor that taught UX was retiring, she decided she needed to switch paths.

She happened to be taking a course which involved data at the time, which ended up sparking her interest in data. She then took a course that revolved around information retrieval, which ended up being the focus of her dissertation during her PhD program.

[5:10] “Oh, wow. Where do I start? Okay, I'll start easily with entering graduate school. And when I entered graduate school, I actually was very interested in UX.

When I got there, the person who was a UX professor. Well, the well-known individual was actually retiring. And I was like, well, I need to find something else to do. And then happenstance I fell into -

I'm looking at some data that was part of a course. And I was like, this is interesting. This is really interesting. I like the structure of the organization. I and then I start thinking, well, everything needs the structure and organization. So I've actually been a data head since 2000. I've been thinking data is cool from way back then. Took a data science, databases course that dovetail into information retrieval and that's wind up what I concentrated my PhD dissertation on. And so for me, data has been part of my entire career. And in fact, applying data in how data is applied in different spaces has been something I've been doing since I can remember it as part of my my graduate work. So what I see as far as getting into the field, it's a matter of where do you know the origins of the data? Are you interested in that part? And of course, moving forward and trying to figure out what the dataset is, figuring out, how do you know, clean up the data? How do you figure out how to analyze the data? So all of those parts are interesting to me. And that's kind of how I got into it. It was happenstance by luck, by interest, by passion. But I already was a computer scientist, so I always say I'm computer science first, data scientist second.”

Where is the field headed in 2-5 years?

In the next five years, Brandeis feels that we’re going to look at the gender, race, and class disparities that happen inside data. There is a concern about who is able to participate and have access to data, as well as who is represented in the data.

Over and over again, we are seeing marginalized communities that are disproportionately not included or are over saturated in certain datasets.

We need to assess the impact this has on our communities, and then develop and enforce policies in order to make sure that data becomes part of every facet of our society, not just STEM.

[8:29] “Yeah. So next sort of five years is going to be one where I actually tweeted about this early. And in the top of 2020 is to say we're going to be looking a lot at the gender, race, class disparities that happened inside of data, how data is used. We're going to be concerned about who is participating, who has access, how inclusion strategies are working or not working, as well as who's represented in the data.

We're seeing it over and over again where marginalized communities are disproportionately not included or over saturated inside of certain datasets.

And how do we shift the conversation so that all people are included in the data conversation? So the next two to five years is going to now being bringing aboard the understanding of the importance and the power of data and how that impacts communities differently. And, of course, developing policies, enforcing those policies, whatever regulations at the local, federal, national level in order to make sure that data becomes part of our known fabric inside of every facet from curriculum at K through 12, through those that are currently in the workforce, in all workforces, not just STEM.

Don't get me wrong, I'm definitely love my STEM people, but it's an all workforces across the board. Everyone needs to know more about data.”

What will separate great data scientists from the rest of them?

What will really set great data scientists apart is an open mind paired with very good documentation [of their processes]. They consistently learn from quality sources, and are able to discern what is a quality source and what isn’t.

Great data scientists are those that do not try to take on all the responsibilities for the whole process. They understand the importance of teamwork, and know when to delegate to someone who is a better fit to answer certain questions. They understand their expertise.

[13:29] “What's really going to set those apart is going to be those that have open minds with very good documentation.

Those that are consistently learning from sources of quality. And that means you're going to hit some bumpy road, you're going to hit some you know, you might get some disinformation, you might get some misinformation, but then you're going to learn from it.

And then you're going to now be able to discern what is quality and what isn't quality. You're then going to be able to talk about, oh, I know this individual is working in this space. That's not my expertise. So don't ask me. Ask this expert. And I think that's going to be extremely important for data scientists to not try to take on all the responsibility for the whole process. This is one where it is a it's teamwork. So you have to be able in order to share out where other people are better talent and a better fit to answer those questions.”

Key takeaways from the episode


Grit

[40:09] Gritty people are those that choose to look at difficult situations with a particular lens. They understand the positive, the negative, and realize where work needs to be done. Then, they do the work.

Diversity and inclusion

[31:07] To be inclusive doesn't mean that you are pushing away anybody. It means that you are seeking out those who have an open mind. It means to be someone willing to listen to others and not suppress marginalized groups. It’s about safety. Ask, “can I share with people, and can people share with me?”

Impact of data on communities

[10:59] You need to connect with people you have not connected with before. Open up the conversation about how the data that you're currently using impacts communities that you're not necessarily a part of. Get yourself out of your comfort zone. That is the key. One way to do this is to follow someone you've never followed before.

Data across the board

[7:38] Data is a part of every industry. Everyone is concerned with how their data is being used. Data is being created and used at rapid rates, and the challenge is to be able to understand and harvest data in a meaningful way.

Documenting your process

[11:03] It’s very important to document your process, whether it’s scientific or non-scientific, and make sure you push your team to do the same. This is crucial because your data may be used in a way that you did not intend. But if you document your process and start having conversations about it, misinformation will be less likely to spread.

Memorable quotes

[7:57] “I'm trying to do my best to be... that beacon to talk about data in sizable, understandable nuggets, because it's not just a science thing. It is our everyday life.”

[11:45] “...if you stay within your own lane in your own expertise, only talking to people who have your particular background, you're losing the whole story... and with data, there's always a story”

[29:34] “...I want...other people to know that they can talk about their particular ethnicities, content in a research space, in the tech space, and still be successful.”

The one thing that Brandeis wants you to learn from her story

[43:12] My story is not done yet. If you feel like you're done, then that's not data science. The story is never complete.

From the lightning round

Best advice Brandeis has ever received

Don’t take any wooden nickels.

Data Science superpower

The ability to explain things easily to people.

Advice that Brandeis would give to her 20 year-old self

I would say it's going to be OK. Your time to shine isn't quite yet.

Topic outside of data science we should study

Sociology. You have to understand social context. If you don't understand social context, you don't understand data.

Books that Brandeis recommended you should read

“Algorithms of Oppression” by Safiya Noble

“Who Gets What and Why: The New Economics of Matchmaking and Market Design” by Alvin E. Roth

Books and other media mentioned in this episode

Race After Technology: Abolitionist Tools for the New Jim Code by Ruha Benjamin
Anything by Andre Brock
Data Feminism by Catherine D'Ignazio and Lauren F. Klein
Podcast: #causeascene by Kim Crayton

How you can connect with Brandeis online

Personal Website

Take a Leap of Faith | Alistair Croll Fri, 03 Jul 2020 00:00:00 -0400 4d233e85-fa82-40cb-8b85-b8d9c482443e In this episode we get an opportunity to hear from the co-author of Lean Analytics, Alistair Croll. Here are some key takeaways from our conversation.

On this episode of The Artists of Data Science, we get a chance to hear from Alistair Croll, a well-established entrepreneur, analyst, and author. He is known for writing Lean Analytics, and for being one of the founders of Coradiant, Year One Labs, and the Strata conference.

He shares some excellent tips on how to ask the right questions when working with data, how to communicate with customers, and the need to be obsessed as an entrepreneur.

Alistair touches on some amazing tips that anyone can use to catapult their success. It was an amazing honor to interview him!

Some notable segments from the show

[11:11] How privacy concerns are being addressed related to data science

[13:39] Incorporating philosophy into data

[14:22] How to compete as an early stage company

[18:30] How unwavering faith can instill the entrepreneurial mindset

[22:35] Identifying the various types of innovation, and their impacts in the organization

[36:56] Music science, and how the digital era has affected the consumption of music

[46:04] The formula for informing and engaging people the right way

[51:38] How to profit off of attention

Alistair's Journey

Alistair’s journey into data science can be traced back to his childhood, when he first began toying with his Apple 2. Then, as a college student, he took a statistics class, which he did well in. This gave Alistair a good stats background, which he relied on when he started his own company, called Coradiant.

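Coradiant focused on developing real user monitoring products. These products showed clients not just what users did on a website, but whether they were actually able to do it, for example when a slow page load or a server error got in the way.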

Being in this space, Alistair eventually had the opportunity to write books on data science topics, such as “Complete Web Monitoring” and “Lean Analytics”.

[2:37] I can go back pretty far. I... As a kid, I had an Apple 2 with a 300 baud modem. So I had to, like, spend my summers figuring out how to get that to do things. And then in university, I had a feud with the dean of the business school. It's a long story I won't bore you with. That's to do with student fraud and all kinds of stuff. And we kind of uncovered some stuff that was going on with the University.

And we were running the student council and that caused a lot of discomfort for the dean of the business school. Turns out he was also my stats teacher. And so I had to do really well because if I was going to make any mistakes, he was going to fail me. Right?

So I was like I actually had to open up the books and work hard. And so I got some good stats background there. And my parents are both scientists. So I grew up with the scientific method and thinking about, you know, biases and how to how to understand things properly. And then a few years later, I started this company with some friends called Coradiant. And Coradiant was - It was really user monitoring. So Web analytics shows you what people do on your Web site, but it doesn't show you if they could do it. So, like, maybe the person didn't buy it because the page took forty, forty seconds to load. Maybe the person didn't buy it because they got a five hundred error and there was no JavaScript to tell you that. Right? And so with this product was called TrueSite. It was part of what we call real user monitoring products. But in order to sell it, we started out selling to the technologists - they didn't have the budget or the sort of authority within the organization. So we wound up having to sell to marketers. And so that forced me to get into, you know, speaking analytics to them, even though I was like a networking head. And then we kind of moved from Web analytics - I wrote a book with a a guy named Sean Power called Complete Web Monitoring that got into - like, how do you measure social profiles? And all this other stuff that requires a lot of big data. And then O'Reilly was putting out a book series called Lean - called it based on a lean startup series.

And they asked my co-author Ben Yoskovitz and I to write a book. So we wrote Lean Analytics. Lean - The Lean Startup is an amazing book. It's a book that's launched thousands of ships, that's proverbially speaking, but it is very aspirational.

It's not specific. And we're more like Bob Ross like paint a happy tree, you know, like very boring. Here's the prescriptive stuff. If you're here, do this until it gets this. And so I think that's one of the things that helped the book catch on. Nobody's more surprised at how far it's gone. I got a mail from someone who's taking university in Madagascar who's like, this is my textbook. But I think that if you are trying to understand the modern world and you're not thinking critically about data and statistics, you are probably being tricked or taken advantage of almost every day. I mean, we need data literacy to survive the modern Internet. And so part of it was business through analytics and web. But a lot of it was just figuring out how to navigate today's information heavy world.

Where is the field headed in 2-5 years?

For Alistair, the early days of data science involved the process of ETL (extract, transform, load). But now, he believes that the data science community is starting to see a lot of democratization of those tools. He is starting to see tools that help non-technical people experiment with models.

He also believes that we are going to have the ability to attach models to automated systems that produce new models. This will essentially create an environment where data science can be used to correct and update the system itself. The need for a human to interpret the data will be the exception, not the rule. Instead, data scientists will make sure that the system is still aligned with business goals.

[6:01] I think one of the dirty secrets of data science, at least in the early days, was that 80 percent of what people call data science was just ETL. Just cleaning up data and moving it from one place to another, you know, moving stuff between buckets. We're starting to see a lot of democratization of those tools. And so you're starting to see things like DataRobot that will let a non-technical person experiment with models and kick the tires and so on. But I think one of the things that is going to happen is, when things start out, we use them tentatively. In the early days of cloud computing, you know, that was fine for QA. It was fine for like putting a dev build on there. But you didn't really do it in production.

What will separate great data scientists from the rest of them?

In Alistair’s view, really great data scientists can arrive at a working model much more quickly than their peers. This will be partly due to their intuition, and also due to the role of managing the model rather than building it. The process of anticipating what will make something happen sooner is going to be the mark of a great data scientist.

Alistair also argues that data scientists who build ethics and trust into their models are going to do really well. In the future, everything's going to be data-driven and running on a model. The companies we trust will be the ones that don't squander the trust we've given them.

[8:33] If you look at an exponential curve, right, what matters in the exponential curve is the starting number. Right? Because the sooner you start, the better. And the slope of the curve. I think you're going to see that really great data scientists can narrow - can arrive at a working model much more quickly. Partly through their intuition, and that they will transition from building a model to managing the model. Which is actually a different set of skills: correcting for drift, finding out what could go wrong, are the factors still there, and so on. And I think they're also much more focused on like fast start learning. So instead of needing, you know, millions of compute hours to actually generate a good model, you're going to have data scientists go, oh, I know the model that's going to be very close to what I need ahead of time.

Key takeaways from the episode

[13:39] One skill that a data scientist should have is the ability to incorporate philosophy into data. Questions like “What should the user know?” or “How could this be misused?” are important. They are what make data science so interesting.

Important soft skills

[45:16] You've got to know what you're willing to fight for, and what you’re willing to compromise on.
Try and understand the customer journey. Ask yourself, “What are the steps that users go through? How do I use those steps to make sure that there's a satisfying experience?”

How to be an intrapreneur

[28:28] For a data scientist to become an intrapreneur, they have to transition to the unknown unknowns. Ask yourself, “What can the data tell me that I don’t know?” and go look in the data for patterns.

How to be an entrepreneur

[19:11] You have to take a leap of faith that this idea that you're absolutely obsessed with is right in the face of criticism. You need incredible humility to learn what other people think and adjust your perception, and then you also have to have this insanely high level of faith to keep you going and believe that you're absolutely right.

Memorable quotes

[14:22] “…as an early stage company, your focus is your biggest currency.”
[22:10] “…crises have a way of accelerating the inevitable.”
[46:04] “…got to first seek to engage and entertain and then you have the ability to inform people.”
[51:38] “…find a way to capture attention that you can turn into profitable demand better than the competition.”

The one thing that Alistair wants you to learn from his story

[53:26] All my life I have seen people solve for lots of things, such as fame, or profit. All the good things that have happened to me come from “solving for interesting”. Find a way to solve for what's interesting about your company, about your life, about your hobbies, and you will thrive in an attention economy.

From the lightning round

Best advice Alistair's ever received

People do things because they want to get laid (perceived as attractive), made (“Made man in the mob”; powerful), paid, or unafraid (reduce risk).

Advice that Alistair would give to his 20-year-old self

Find out how to cultivate a personality that's public and fairly strong. Exercise more. Get a sense of style. Be yourself, and don’t be afraid to be a little larger than life, as long as you aren’t a jerk.

The number one book that Alistair recommends you read

“The Righteous Mind” by Jonathan Haidt

Books and other media mentioned in this episode

“Complete Web Monitoring” by Alistair Croll and Sean Power
“Lean Analytics” by Benjamin Yoskovitz and Alistair Croll
“Propose, Prepare, Present: How to Become a Successful, Effective, and Popular Speaker at Industry Conferences” by Alistair Croll
“Just Evil Enough” a book Alistair is currently working on
“The Lean Startup” by Eric Ries

How you can connect with Alistair

Connect with Alistair on LinkedIn
On Twitter
Solve for Interesting