Susan Shu Chang is a Principal Data Scientist at Clearbanc. She transitioned into data science from economics and has an awesome blog. I had the chance to speak to her about what she does at Clearbanc, what advice she has for those new to the field, and how we can learn to be intentional about our career growth.
Clearbanc is an ecommerce investor providing non-dilutive growth capital to founders. Clearbanc's co-founder and president, Michele Romanow, has been on the Dragons' Den TV show (think Shark Tank in Canada), where she noticed that for an early-stage company, raising funding via equity can be way too expensive for founders.
So we use machine learning and automation to provide fast and flexible funding, with less of the human friction found in traditional lending and funding (banks, venture capital). More can be found in this Forbes article and on the company website.
I've mentioned that machine learning helps with those investment decisions, so naturally those are the components of the product that my team works on. There are the models themselves, and the infrastructure that allows inference from multiple models to be served on demand. Lately I have been focusing on leading the infrastructure part, which involves technical design and implementation.
To sum up: My team works on augmenting Clearbanc's funding products with machine learning and data science. This includes ML model training, data quality, infrastructure, and automation.
As of late, I have been working on the design of infrastructure to help our models and experiments be more reproducible - using open source tools for tracking and saving run artifacts.
Previously I worked in a large corporation, which could throw money at setting this up with Red Hat OpenShift, but since Clearbanc is much smaller (for now!) it’s up to us. This does have the added benefit that the solutions are tailored to what our product needs, and that we stay nimble enough to mix and match the functionalities that scale our machine learning operations.
A major part of my role is also knowledge sharing via lunch and learns and capturing the content of those talks in documentation on best practices. I often help out team members through Slack calls and chats to unblock them, working as a multiplier instead of trying to be the “tank” of the team, to use a video game analogy.
Code reviews are also essential, and I have been experimenting with blocking out a regular time to do them. Since I am familiar with a broad part of the stack, not just the cozy corner of data science, I provide feedback on code quality as well as design.
I really enjoy our Founder School, which is like a bootcamp for new hires. A speaker from each team gives the cohort an overview of that team. So I introduce them to the data science team, and someone from marketing introduces the cohort to the marketing team.
I have quite a lot of experience in public speaking, to both technical and non-technical audiences. So the wide mix of new hires from all kinds of teams is a great way for me to test my communication skills. And I think it’s worked - I was voted “favorite facilitator” (facilitator being the person that presents).
My personal educational background was in Economics, which heavily focused on inferential or predictive modelling, using data such as financial markets, pricing, household earnings, and so on. Sounds quite a lot like data science in industry, doesn’t it?
Econometrics gave me a solid understanding of statistics, since the upper-year courses require a lot of calculus and matrix algebra, which were invaluable when I started teaching myself more about machine learning algorithms. But there was a catch - I only learned statistical programming through proprietary software called Stata, not the languages common in industry, such as Python or R.
In terms of formal education, I would have put my Python skills at a 1/10, since I only took one elective computer science course. Note: a 10 here is not a 10 in industry overall, but my impression, whether realistic or not, of what is expected of a very competitive entry-level candidate.
However, I had been using Python for a couple of years because I programmed video games for fun. One day a friend said to me, “You have these two skills. Have you heard of a field called data science?” I had not, and googled it that day. It was a perfect, almost accidental combination of my knowledge of statistics and Python. I write about this entire process in this two-part article series.
I think it’s important to be surrounded by good peers. I went to two of the most competitive universities in Canada, so there were lots of people like that around. In fact, I often considered myself the slacker!
Outside of university, one can definitely find such peers by expanding one's circle, which I elaborate on in this article.
For example, outside of work I have spent 2 years on the steering committee of Aggregate Intellect, a machine learning knowledge platform and community with 13k+ YouTube subscribers. I met many friends and ambitious peers there who inspire me every day.
I like the predictive side and information extraction of data science. Even if it's not AGI, it has still automated so much of the repetitive, manual work that humans used to spend their time on. That can't be denied. I enjoy the many, many facets of problem solving and the sheer number of tools (algorithms, or ways of dealing with sparse data) at our disposal. It's a lot of fun.
There is almost nothing I dislike about working as a data scientist. I might be kind of weird in that I enjoy things like public speaking and the human aspect of the job, which I didn't when I started (leave me alone! I just want to code!)
If I really had to name one thing, it would be that there is often misunderstanding about what data science and machine learning can and cannot do. However, I consider it part of my job to help people understand, so that we can work towards building products together. It's a two-way street: if people are open to learning, I find the effort to explain things in plain English completely worth it.
I think that regardless of age, educational background, or previous work experience, it is common to have some sort of impostor syndrome when one is looking for a role in data science or other highly competitive fields.
One way I frame difficult things is, "I've been through worse. While this is difficult, it too shall pass." I talk about this concept quite a lot when speaking to folks that I mentor, and you can find the details in this article.
It is also important to recognize what employers are looking for: not the person who has taken the most online courses, collected the most certifications, or earned the most prestigious degree. They want someone who can help solve a problem. Certifications and degrees are just hints; they are not as solid a proof as projects.
So, from a person that did not have a formal education directly related to data science, here’s to your journey into data science! Don’t be afraid, even if times can get challenging. All the best!