#DATATALK at IronHack: How we built the Trump Mood Index
Published by newspill Team | 2019-09-02
On Thursday, 29 August 2019, we were invited to IronHack Paris, a bootcamp whose mission is to prepare the next generation of web developers, UX designers and data scientists.
IronHack gave us the opportunity to give a master class to their Data Science students on how to lead a data project from idea to production, using our Trump Mood Index mini-project as an illustration.
Here is some of the advice we gave during the presentation:
- Data Acquisition: Do not reinvent the wheel. Build on top of existing libraries and APIs, because it is very likely
that someone has already done what you are trying to achieve.
- Data Cleaning: Don’t be a machine; get to know your data. Cleaning the data is not purely technical:
it is your opportunity to understand the composition of your dataset and to get a clear picture of its potential and its flaws.
- Research process: It’s not just artificial intelligence, it’s human intelligence. Domain intuition is key to making
your data talk, and without a knowledgeable human to ask the right questions you won’t get anywhere close to an interpretable result.
- Modelling: Start simple; don’t throw a complex “end-game” model at the problem right at the beginning.
By starting simple you will avoid the black-box effect and will be more likely to build a smart algorithm by ensembling models you understand.
- Production: A notebook Data Scientist is not a real Data Scientist. If you want to build real-life projects that can have an impact,
you need to get them out of your notebook and put them into production (user interface, automated databases, etc.).
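To make the "start simple, then ensemble" advice concrete, here is a minimal Python sketch. It is not the method behind the Trump Mood Index: the word lists, the three toy rules and the function names are all hypothetical and chosen purely for illustration. Each model is small enough to read at a glance, and the ensemble is a plain majority vote over them.

```python
from collections import Counter

# Hypothetical word lists, chosen only for illustration.
POSITIVE = {"great", "win", "tremendous", "good"}
NEGATIVE = {"fake", "bad", "sad", "disaster"}


def keyword_model(text):
    """Label text by counting positive vs. negative keywords."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score >= 0 else "negative"


def exclamation_model(text):
    """Toy rule: two or more exclamation marks are read as a negative tone."""
    return "negative" if text.count("!") >= 2 else "positive"


def caps_model(text):
    """Toy rule: mostly upper-case letters are read as a negative tone."""
    letters = [c for c in text if c.isalpha()]
    upper = sum(c.isupper() for c in letters)
    return "negative" if letters and upper / len(letters) > 0.5 else "positive"


def ensemble(text, models=(keyword_model, exclamation_model, caps_model)):
    """Majority vote over the simple models (odd count, so no ties)."""
    votes = Counter(model(text) for model in models)
    return votes.most_common(1)[0][0]
```

Because every vote comes from a rule you can inspect, a surprising prediction can be traced back to the model that caused it, which is the opposite of the black-box effect. For example, calling `keyword_model("bad deal")` and `ensemble("bad deal")` separately shows when one model is outvoted by the other two.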
Thank you IronHack for this opportunity, and thank you Zakariae Moutchou, Data Scientist at Sysmo, for acing the presentation!
Check out our live Trump Mood Predictor: https://www.sysmo.io/en/trump