The Artists of Data Science

The ONLY self-development podcast for Data Scientists

Chat Transcript from Data Science Happy Hours 13, Dec 11 2020

16:37:17 From Harpreet Sahota : If you guys have questions shoot me a message and I will add you to the queue!
16:37:32 From Vipul Mehta To Harpreet Sahota(privately) : Hi Harpreet,
16:38:05 From Naresh Reddy : Can anyone please suggest datasets that are easy to start with for a data portfolio?
16:38:26 From Thomas Ives : SKLearn Practice Data Sets
16:38:32 From Camille Leonard : People often use these data sets https://archive.ics.uci.edu/ml/index.php
16:39:01 From Jake Beliveau : Hi Harpreet! I had to leave the video portion to look after my daughter. My main question is what is the most rewarding part of being a data scientist? Why are people going into this field?
16:40:05 From Akshay Adlakha : Hello Harpreet, I have a question. Can you please share any suggestions on how to get a first breakthrough in the data science field? I am a graduate student looking for full-time opportunities, and my previous experience was in software development, so I would like some expert guidance on getting that first break in the industry.
16:40:31 From Himashree R S : I was reading an article and it said "None of your observed variables have to be normal in linear regression analysis, which includes t-test and ANOVA. The errors after modeling, however, should be normal to draw a valid conclusion by hypothesis testing." I always tried normalizing before analysis... am I wrong?
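The claim in the quoted article can be checked on simulated data. The sketch below is only an illustration under stated assumptions: the data are synthetic, and the seed, sample size, and coefficients (2 and 3) are made up for the example. It fits a straight line to a strongly skewed predictor and compares the skewness of the predictor with the skewness of the residuals.

```python
import random
import statistics


def skewness(values):
    """Sample skewness: near 0 for symmetric data, near 2 for exponential data."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


rng = random.Random(0)  # fixed seed (assumption) so the run is repeatable

# Skewed, clearly non-normal predictor; the noise added to y is normal.
x = [rng.expovariate(1.0) for _ in range(5000)]
y = [2.0 + 3.0 * xi + rng.gauss(0.0, 1.0) for xi in x]

intercept, slope = fit_line(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

x_skew = skewness(x)
resid_skew = skewness(residuals)
print(f"slope estimate:        {slope:.3f}")
print(f"skewness of x:         {x_skew:.3f}")
print(f"skewness of residuals: {resid_skew:.3f}")
```

In this setup neither x nor y is normally distributed, yet the residuals are roughly symmetric, which is the condition the article refers to. Skewness is only one rough check; a Q-Q plot of the residuals is a common alternative.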
16:40:32 From Mikiko Bazeley : @Naresh: Quite a few of the datasets on Kaggle are good, especially if they’ve been used for a competition
16:40:38 From Vipul Mehta To Harpreet Sahota(privately) : I am working in Product Management as a Product Owner. My current role involves some data analysis through SQL and data visualization tools like Power BI. So my question is how to break into the Data Science field: should I focus on new tools, or follow the path of statistical learning and learn R or Python?
16:40:49 From Eric Sims : Excel + dates = LOL
16:41:31 From Mikiko Bazeley : Also encoding
16:41:41 From Naresh Reddy : Thank you Thomas Ives, Camille Leonard and Mikiko Bazeley
16:43:08 From Mikiko Bazeley : Not old at all!
16:44:15 From Christian Capdeville : Question for those with experience delivering data presentations to business stakeholders: do you have a general framework you like to follow in your data presentations?
16:44:47 From Faraaz Sheriff To Harpreet Sahota(privately) : Hi Harpreet,
This is my first happy hour. I am quite excited to be part of it.
I have a question: How does one account for rare events like COVID in predictive models? Do you just skip a few months or come up with a correction factor in your models?

Thank you!
16:45:13 From Dave Langer : As a former PM at the Evil Empire I can tell you that Product Management is a great role to start building your analytics chops.
16:45:22 From Mikiko Bazeley : Agree with everything Thomas and Jennifer said and are saying.
16:45:40 From Mikiko Bazeley : +1 to Dave’s comment
16:45:56 From Dave Langer : Example - If you have access to event logs/telemetry data you've got a gold mine of opportunity.
16:48:03 From Dave Langer : Two words - Process Mining
16:48:03 From Christian Capdeville : Fantastic stuff Dave, thank you!
16:48:04 From Russell Willis : Hybrid solutions can be very useful in some circumstances: using existing transformation methods as a prelude to visualisation in modern BI tools, i.e. SQL/Python/R transformation prior to Power BI/Tableau/Qlik visualisation, etc.
16:48:14 From Ray Givler : Tableau Server has good audit data related to what Dave was talking about.
16:48:18 From Mark Freeman : Can you dive more into process mining? I would love to hear more about that.
16:48:45 From Dave Langer : Market basket analysis is also very useful in Prod Mgmt Analytics.
16:48:58 From Jennifer Nardin To Harpreet Sahota(privately) : +1 deep dive on process mining
16:49:11 From Mikiko Bazeley : One of my roles was working as a Data Scientist focused on Product Adoption for Autodesk, so it’s definitely a thing
16:49:26 From Dave Langer : Titanic
16:49:29 From Eric Sims : @Naresh - What are you interested in?
16:49:45 From Jennifer Nardin : +1 deep dive on process mining
16:49:50 From Naresh Reddy : @Eric Sports
16:50:01 From Eric Sims : @Dave - shots fired. Titanic hit, sunk.
16:50:06 From Ray Givler : @Christian - do you mean a static presentation or an interactive dashboard?
16:50:23 From Dave Langer : Titanic is great for initial skill-building.
16:50:51 From Russell Willis : Any data set where you can intuitively tell when an output is incorrect is a great source to start a transformation/visualisation journey with... Then progress to more complicated ones, taking previous learnings with you.
16:51:16 From Dave Langer : Titanic isn't 100% clean, the classification problem isn't trivial if you limit yourself to the data at hand, and it is ripe for feature engineering and imputation.
16:51:34 From Dave Langer : Oh, and you can use it for learning market basket analysis as well. :-o
16:51:44 From Christian Capdeville : @ray - either, really. Those are probably two different types of findings you would be discussing, but I'm interested in frameworks others lean on for bringing stakeholders up to speed with your data findings
16:52:37 From Eric Sims : @naresh - Sean Sullivan is into baseball. You can check him out here: https://seanwsullivan1.wixsite.com/ssullivananalytics/post/pick-a-pitch-any-pitch
16:52:38 From Timothy Gordon : Agree with Brandon. Collecting your own data and making conclusions on something that interests you can lead to a great and different project
16:52:48 From Russell Willis : LinkedIn now allows each user to request a complex suite of their own data, which can also be a good source, for you to review your own activities...
16:53:06 From Eric Sims : @Harpreet, if I can get in the queue, I've got a question about "deploying" a model/app
16:53:20 From Harpreet Sahota To Eric Sims(privately) : Added!
16:53:26 From Eric Sims To Harpreet Sahota(privately) : Thanks!
16:55:06 From Eric Sims : @Akshay - I'm still working on my breakthrough, but LinkedIn has been awesome for me. Being authentic. Taking the time to participate and get to know people and companies makes a big difference.
16:55:58 From Dave Langer : T-Shaped Professional, Monica
16:55:59 From Mikiko Bazeley : Back online!
16:56:10 From Harpreet Sahota To Mikiko Bazeley(privately) : Ok cool - I'll get you next
16:56:20 From Albert Bellamy : The "Superpower"
16:56:22 From Mark Freeman : @Akshay Another piece of advice I received from mentors was playing on my domain expertise. I was able to get my first DS job because of my deep knowledge of healthcare, knowing enough stats/python, not for my coding skills.
16:57:05 From Ray Givler : @Christian - basically, know the customer, know their goals - get some KPIs tied to those goals, graph the trends in those KPIs, determine what data correlate with those goals, graph that too, and look for outliers in those that they can take action on to move their KPIs and ultimately achieve their goals. PM me on LinkedIn for more.
16:57:55 From Naresh Reddy : Thanks @Eric
16:57:56 From Akshay Adlakha : Thanks Eric and Mark for your valuable feedback.
16:58:25 From Abe Diaz : Hello,
16:58:53 From Thomas Ives To Harpreet Sahota(privately) : Let me add just ONE extra thing at the end.
16:58:54 From Russell Willis : "Data Science" is a very broad field!
16:58:56 From Abe Diaz : Any insight on wealth data analytics? That's the field I want to be in.
16:59:10 From Harpreet Sahota To Thomas Ives(privately) : Ok, go for it
17:00:54 From Albert Bellamy : Drax: I wasn't listening, I was thinking of something else.
17:01:02 From Russell Willis : I think that is a great method for many initiatives - Identify the issues first, then work on the most appropriate and urgent solutions!...
17:02:26 From Thomas Ives : Hi Everyone, Can you send me a LinkedIn Connection request? I've embarrassingly exceeded my connection request quota.
https://www.linkedin.com/in/thomives/
17:02:28 From Christian Capdeville : Gotta run - looking forward to catching the rest on YouTube - thanks everyone! Have a great weekend
17:02:37 From Eric Sims : Later Christian!
17:04:30 From Thomas Ives : Bye Christian!
17:04:36 From Mikiko Bazeley : Worth thinking about: https://projecteuclid.org/euclid.ss/1009213726
17:04:44 From Mikiko Bazeley : The two schools of thought
17:05:02 From Mikiko Bazeley : Statisticians vs ML school of thought
17:05:44 From Dave Langer : If you're interested in Process Mining, here's an awesome Coursera course: https://www.coursera.org/learn/process-mining
17:05:58 From Mark Freeman : Thanks @Dave!
17:06:22 From Dave Langer : BTW - It was the single best Coursera class I have taken to date.
17:06:31 From Jennifer Nardin : @Dave - thanks! Coursera is RICH with content
17:06:36 From Mark Freeman : I work with a lot of event logs… so really excited!
17:08:27 From Dave Langer : Ben!!!!
17:10:06 From Russell Willis : Extraordinary events like extreme weather can be modelled with historical patterns, but COVID was VERY extraordinary, so isn't it very difficult to account for?
17:10:25 From Eric Sims : Can't you use an intervention in time series data? Basically a 0,1 flag
17:11:59 From Dave Langer : Eric - Depends on what algo you are using.
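Eric's 0/1 flag idea can be sketched for the simplest case, a linear trend fitted by least squares. This is a hedged illustration, not a recommendation for any particular model: the monthly series is synthetic, the shock window (months 24 to 29), its size (-30), and the helper names `solve` and `ols` are all assumptions made for the example. As Dave notes, whether a flag like this is usable depends on the algorithm.

```python
import random


def solve(a, b):
    """Solve the small linear system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


rng = random.Random(1)  # fixed seed (assumption) so the run is repeatable
months = range(36)

# Intervention flag: 1 during the hypothetical shock window, 0 otherwise.
flag = [1 if 24 <= t < 30 else 0 for t in months]

# Synthetic series: linear trend, a temporary drop of 30 while flagged, noise.
y = [100 + 2 * t - 30 * f + rng.gauss(0, 2) for t, f in zip(months, flag)]

# Design matrix columns: intercept, time index, intervention flag.
rows = [[1.0, float(t), float(f)] for t, f in zip(months, flag)]
intercept, trend, shock = ols(rows, y)

print(f"estimated trend per month: {trend:.2f}")
print(f"estimated shock effect:    {shock:.2f}")
```

With the flag in the model, the trend estimate is not dragged down by the shock months, and the flag's coefficient gives a rough size for the disruption. Dropping the affected months instead would discard the information about how large the shock was.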
17:19:28 From Russell Willis : @Greg That could really help if you identify a "Perfect Wave" of sub impacts!!
17:19:51 From Vipul Mehta : Thank you everyone for your insights and for explaining things clearly. See you all in the next session. Need to drop
17:19:59 From Greg Coquillo : @Russell, that's right!
17:20:22 From Dave Langer : If folks are interested in a different perspective on using statistics to analyze business data, my single favorite book on data analysis:
17:20:23 From Dave Langer : https://www.amazon.com/Making-Sense-Data-Donald-Wheeler/dp/0945320612/
17:22:25 From Russell Willis : Are there also time targets within SLAs that need to be accounted for?
17:23:57 From Mikiko Bazeley : So true!
17:26:31 From Ben Taylor : ha!
17:27:04 From Mikiko Bazeley : +1 Ben
17:28:29 From Ben Taylor : eternal life...baby consciousness work
17:29:39 From Russell Willis : Problem solving is a great skill and the payoff of cultivating solutions is great... Data Science provides lots of problems, so if you like problem solving it can be VERY rewarding, but also challenging and on occasion frustrating!
17:32:28 From Camille Leonard : If anyone would like to connect on LinkedIn, I'd love to connect with you! https://www.linkedin.com/in/camillevleonard/
17:33:12 From Thomas Ives : Mikiko, Great answer!!!
17:34:42 From Ray Givler : Gotta go! Thanks everyone!
17:34:49 From Jennifer Nardin : Gotta run; thanks for another great Happy Hour!
17:34:57 From Thomas Ives : Bye Jenn!
17:36:59 From Ben Taylor : Data science is the ether in the world. It connects everything. I find myself interacting with linguists, psychologists, engineers, doctors; it surrounds us. It feels like a powerful magic where impossible is redefined every few years. I would have never imagined we would be doing the things we are doing today.
17:37:18 From Ben Taylor : ha!
17:38:08 From Didier Muvandimwe To Harpreet Sahota(privately) : Hi Harpreet,
My name is Didier and I have a question about starting out in Data Science. I am a maintenance engineer transitioning into this field.
17:42:39 From Ben Taylor : gotta drop, thanks for hosting
17:42:40 From Thomas Ives : https://www.amazon.com/dp/B08D9SP6MB/ref=dp-kindle-redirect?_encoding=UTF8&btkr=1
17:42:42 From Mark Freeman : My M.S. has been extremely helpful for domain experience in an area I love and for learning research methods; the data science was learned outside of school.
17:43:21 From Mikiko Bazeley : https://www.amazon.com/Machine-Learning-Algorithmic-Trading-alternative/dp/1839217715/ref=sr_1_1?crid=1046O0NY32OC3&dchild=1&keywords=machine+learning+for+algorithmic+trading&qid=1607730148&sprefix=machine+lea%2Caps%2C256&sr=8-1
17:43:28 From Mark Freeman : Domain knowledge*
17:43:46 From Harpreet Sahota : GitHub: https://github.com/stefan-jansen/machine-learning-for-trading
17:44:02 From Mikiko Bazeley : I think this one was also good: https://www.amazon.com/Advances-Financial-Machine-Learning-Marcos/dp/1119482089/ref=sr_1_4?crid=1046O0NY32OC3&dchild=1&keywords=machine+learning+for+algorithmic+trading&qid=1607730207&sprefix=machine+lea%2Caps%2C256&sr=8-4
17:44:34 From Evangelos Tzimopoulos : Here's a business question, if there's time, that I'm sure a lot of data scientists relate to. How do you balance more EDA to get a better understanding of your dataset vs making quick steps forward to produce a prototype that you might not even be able to explain sometimes due to bad data :) ? Especially when there's pressure from the business to tick the boxes early and deal with data later?
17:45:36 From Eric Sims : ^ I like this question! Practical.
17:46:42 From Mikiko Bazeley : So by interesting we mean “pain in the butt"?
17:46:46 From Russell Willis : @Evangelos I am also in London. Good to see you here!
17:47:41 From Faraaz Sheriff To Harpreet Sahota(privately) : Thank you for having me, Harpreet. I would have to drop off. Catch you next week
17:48:51 From Harpreet Sahota To Faraaz Sheriff(privately) : Cheers thanks for comin
17:52:02 From Russell Willis : @Thomas Every once in a while you need to prepare a meal from the back of the refrigerator, to remind yourself to keep an eye on quality and freshness!!
17:52:30 From Eric Sims : ^ Ha! Love that.
17:52:44 From Thomas Ives : I agree with that.
17:54:04 From Thomas Ives : Mikiko's answer is spot on!
17:55:24 From Greg Coquillo : Indeed!
17:56:01 From Timothy Gordon : Great point Mikiko and Brandon!

17:56:01 From Russell Willis : @Mikiko What they need vs. What they want can sometimes be quite a gauntlet to run... until the realisation lands!
17:56:10 From Mikiko Bazeley : Absolutely!
17:56:47 From Mikiko Bazeley : Sometimes your business partners can be the hammer that gets the data cleared up
17:57:22 From Timothy Gordon : Great answer Monica!
17:57:41 From Monica Royal : Thank you @Timothy
17:58:12 From Evangelos Tzimopoulos : hey @Russell, good to be here.
17:58:59 From Evangelos Tzimopoulos : All, thank you for your insights, was great to be here. Here's my linkedin profile if you'd like to connect and continue the great chat online - https://www.linkedin.com/in/etzimopoulos/
18:01:52 From Eric Sims : Managers like Brandon make the world a better place.
18:01:57 From Didier Muvandimwe : Hey all, Would like to connect with you and stay in touch. https://www.linkedin.com/in/didiermuvandimwe/
18:01:59 From Thomas Ives : Thought you guys might want to support Camille's post about our office hours today https://www.linkedin.com/posts/camillevleonard_datascience-machinelearning-ai-activity-6743312233234857984-cTiM
18:02:13 From Timothy Gordon : Great question Mark!
18:02:39 From Camille Leonard : I won't be able to make next week. Looking forward to next year!
18:02:46 From Melania : Great points! Thanks everyone!
18:03:22 From Eric Sims : Dave loves R!
18:03:46 From Russell Willis : Great to have been here. Thanks everyone!
