For the past few months, I attended Insight Data Science, a self-directed fellowship (not a bootcamp) designed to help PhDs from all fields transition into careers as data scientists in industry. I'll say it upfront: Insight was the most challenging and intense professional endeavor I've undertaken (for me, it tops even the PhD and building a nonprofit!), but also one of the most rewarding. I'd like to take this opportunity to share some of my experiences.
My Insight project: CampaignCritic
I spent about half my time at Insight developing a project that would showcase my data science skills. As a big fan of crowdfunding and indie products, I decided to build a web app for Kickstarter creators called CampaignCritic, which analyzes their project description and helps them boost the chances of being funded by suggesting concrete ways to improve. The app is no longer online (not cheap to host perpetually), but the source code, the design pipeline and thought process are fully documented on my GitHub.
CampaignCritic uses natural language processing to parse the structure and content of your Kickstarter's project description and then uses machine learning to determine the chances of being funded. Here's the landing page:
As an example, let's use the project called Codey Rocky. The results page reports the project's probability of success, but that alone isn't really useful.
To help improve the chances of being funded, CampaignCritic scores how effectively your project uses six structural components that it learned to be most predictive of funded projects (having analyzed thousands of them). The app then compares your project's scores with those of the most funded projects over the years, and displays both sets of scores in a bar graph.
Your task is to match or exceed the scores of the top projects. In the example above, the creator did a good job including images and bolded text but could've added more hyperlinks, exclamation marks, and innovation-related buzzwords to increase their chances of being funded. Of course, all suggestions should be implemented pragmatically, i.e., littering your project description with exclamation marks probably wouldn't be prudent.
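The comparison described above can be sketched in a few lines of Python. This is not CampaignCritic's actual code, just a minimal illustration under stated assumptions: it counts only the five components named in the example (the sixth isn't named here), it uses raw counts rather than the app's learned scores, and the `BUZZWORDS` list, the function names, and the benchmark values are all hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical buzzword list; the app's real vocabulary is not given in this post.
BUZZWORDS = {"innovative", "revolutionary", "breakthrough", "cutting-edge"}


class StructureCounter(HTMLParser):
    """Count a few structural elements in a project description's HTML."""

    def __init__(self):
        super().__init__()
        self.counts = {"images": 0, "bold": 0, "hyperlinks": 0}
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.counts["images"] += 1
        elif tag in ("b", "strong"):
            self.counts["bold"] += 1
        elif tag == "a" and any(name == "href" for name, _ in attrs):
            self.counts["hyperlinks"] += 1

    def handle_data(self, data):
        # Keep the visible text so we can count punctuation and buzzwords.
        self.text_parts.append(data)


def extract_features(html):
    """Return raw counts for five structural components of a description."""
    parser = StructureCounter()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.text_parts)
    words = [w.strip(".,!?;:()\"'").lower() for w in text.split()]
    features = dict(parser.counts)
    features["exclamations"] = text.count("!")
    features["buzzwords"] = sum(w in BUZZWORDS for w in words)
    return features


def find_gaps(features, benchmark):
    """Return the names of features that fall below the benchmark values."""
    return sorted(name for name, target in benchmark.items()
                  if features.get(name, 0) < target)


if __name__ == "__main__":
    description = (
        '<p>Meet our <b>innovative</b> robot! '
        '<a href="https://example.com">Learn more</a></p>'
        '<img src="robot.png">'
    )
    # Made-up benchmark standing in for the scores of top-funded projects.
    top_projects = {"images": 1, "bold": 1, "hyperlinks": 3,
                    "exclamations": 2, "buzzwords": 2}
    print(find_gaps(extract_features(description), top_projects))
```

Anything that `find_gaps` returns is a component where the description trails the benchmark, which is the same information the bar graph conveys visually.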
CampaignCritic also learned that writing succinctly, thanking your investors, discussing stretch goals, and reiterating the pledges within the project description were advisable. However, the app isn't flawless: it doesn't account for a Kickstarter project's aesthetics, feasibility, usefulness, market demand, media buzz, etc. Given more time, I'd also love to analyze the pledge structure, the choice of the funding goal, and the creator's video, among other things.
Transitioning into data science in industry
During grad school, every PhD student designs and completes an independent research project, ideally one that's tied to some real-world problem. Successful completion of this project involves a comprehensive understanding of past findings, and a painstaking and exhaustive analysis that amasses evidence to support the research hypothesis. Not surprisingly, this takes years (a little over five for me). But outside of academia, solving problems with this ideology and on this timescale is unheard of.
Insight does a superb job of training PhDs to solve industry problems using data science. First, we were tasked with designing a project that makes a tangible, real-world impact. Second, we unlearned the idea that completing a project demands perfection, and became comfortable with tight deadlines and building a minimum viable product: a "good enough" state that's ready to be presented at any time. Finally, we learned to assess the return on investment of every tool and approach while working on a project, so that we stayed focused, managed our time effectively, and made sound decisions.
Developing a project with collaborative learning is amazing
For many fellows including myself, week 1 of the fellowship—brainstorming a solid project in a matter of days—was extremely daunting considering we had months to come up with our PhD projects. I shortlisted several ideas but the program directors—having a keen sense of a successful project—quickly weeded out the duds. I couldn't imagine settling on a project by the end of the week—I had pushed my creativity to the brink! Somehow, we all made it happen by repeatedly pitching to the other fellows, thus rapidly garnering feedback that refined our ideas, identified potential pitfalls, and continually spurred our imaginations.
Next, we had exactly two weeks to finish our projects while learning the necessary concepts and tools (no classes or teachers!). Considering every minute was precious, going at it alone wasn't an option. In fact, none of us was well-versed in every tool needed to complete our individual projects; however, each of us was an expert in particular areas of data science. For example, I needed to construct a dataset by collecting thousands of webpages, something I'd never done before. Instead of just scouring Stack Overflow, I consulted with a fellow experienced with web scraping (thanks Ruth!) and was able to build a working scraper in a matter of hours.
Creating a five-minute presentation is no cakewalk
While working on our projects, we were visited by dozens of companies from New York City, ranging from startups to billion-dollar behemoths in tech, finance, media, fashion, healthcare, nonprofit, retail, entertainment, and consulting. Week 4 was dedicated to designing a cogent five-minute presentation summarizing our project. Even though I've given many 10-minute talks in the past, shrinking a project down to a few minutes while still being articulate was an order of magnitude more difficult! You can check out my slides below:
The final three weeks of Insight involved traveling around the city and delivering our presentations to data science teams at the visiting companies we'd like to work for. During this time, we also shifted into interview prep mode. Again, we utilized collaborative learning to fill in any gaps in our data science knowledge. And throughout the program, we had access to Insight's vast alumni network—these mentors helped us with hurdles in our projects, critiqued our presentations, and prepped us for upcoming interviews.
Advice for future fellows
As Insight is unbelievably challenging and dramatically different from grad school, here's some advice for those starting the fellowship:
- Delineate the use case of your project and identify the target audience early. This isn't academia—your efforts can't be driven by curiosity or passion alone—keep real-world impact and actionable insights at the forefront.
- Make sure you can actually complete your project in the allotted time. It's perfectly OK to have lofty goals, but be ready to scope them down as you spend time addressing roadblocks or pausing to learn a new tool.
- Verify that someone (or Google) hasn't already built it. Even if your project isn't groundbreaking, you can probably add a twist such as serving a different audience or incorporating other datasets to deliver new insights.
- Pitch your ideas to people outside of the program. Use your friends and family to unearth everyday struggles and how data science can address them to develop and refine your project ideas. If you're constructing a web app, use them to gather feedback on its user experience.
- Highlight the full data science pipeline. Your project is essentially your "portfolio" so pick one that demonstrates your familiarity with as many areas of data science as possible.
- Nail that five-minute demo. A large chunk of data science is communication so ensure your delivery is buttery smooth, exuberant, and crystal clear for the lowest common denominator in the audience.
- Fail fast. I never truly understood this motto until Insight, but it's crucial for completing a great project with limited time.
- Eat and sleep well, exercise, and don't neglect your social life. This may sound like a no-brainer, but it's shockingly easy to burn out during the program. You also don't want to get sick and fall behind. I even picked up yoga and meditation!
This is only the beginning
About a year ago, I decided to become a data scientist—it encapsulates my passions, strengths, and everything I've wanted in a career. I spent the next eight months teaching myself the tools of the trade, while becoming more adept at machine learning, programming and statistics. I finished several open online courses, attended dozens of meetups, and read countless blogs and books. However, as noted in my previous post, the ultimate steps—networking, prepping for interviews and landing a position—are excruciatingly difficult on your own, especially when coming from a different background.
Attending Insight was one of the best decisions of my professional life and I simply can't recommend it enough. Not only did I bridge my technical gaps, learn about working in industry, and meet with top data science teams, I also had a lot of fun working on my project while making lifelong friends who are just as enthusiastic about data science as I am. And best of all, through the fellowship I was able to fulfill my dream: I recently started as a Lead Data Scientist at Nielsen in New York City!
I couldn't have gotten here without help from the other fellows, the alumni mentors, and the incredible program directors and staff at Insight NYC! I especially want to thank my wife for supporting me through this colossal adventure. And even though this journey comes to a close, I'll absolutely keep writing this blog. Data science is a vast, constantly evolving field and I've only scratched the surface. I want to continue learning and working on side projects, so stay tuned and thanks for reading!