Can you talk about your background?
I work for a bank at the minute. I basically got interested in programming because of the whole financial technology side of things. I wanted to work in the data team at the bank, which was actually using DataCamp. So I started using it in my spare time to learn. Then I ended up getting a job with them.
I found out about DataCamp beforehand. I knew that the data team used it, and I knew DataCamp was their main way of learning data science. So I started using it as a result of finding out that they did.
What was your experience with data science before DataCamp?
None at all, really. I had a basic understanding of statistics, but no real programming or data science experience.
Can you talk about your career path so far?
The whole financial technology thing was taking off when I joined the bank, so I was quite interested in that. And I think a lot of those companies focus on data, particularly machine learning. So I was quite interested in that and was trying to learn about it, but there's a point where you have to understand the technical side as well to really understand what's going on. I started doing a bit of research to see what was happening in the bank and I found this team that was using R, and they were learning through DataCamp. So it was literally just that, seeing what was out there, how people are doing it. I've used other sites—Coursera, Udacity, things like that—but DataCamp's been the one that I've stuck with because I think it's a lot better.
What were some of the greatest challenges in your career process?
The biggest obstacle to getting a job was that a lot of companies seem to be old-fashioned in that they exclusively want people who have studied computer science or statistics at university, which I didn't do. Knowing where and how to start can be really hard (which is why DataCamp, and particularly the Tracks feature, is so good!) because it is such a huge, intimidating field, and if you're entirely new to it then even something like understanding the language used is quite hard. I spent a lot of time on Wikipedia to start with.
What is your current role?
I just recently started on an analytics team. My first project is about text mining and sentiment analysis. Another project I'm working on is using machine learning on our risk data to predict where we should focus our efforts. We do a lot of testing to make sure customers get exactly what they're supposed to and that we pass all regulatory requirements and stuff like that. If we use machine learning to home in on exactly what we want to look at, it will save a lot of time and effort. At the minute, it is all very manual and not very efficient.
What drew you to data within finance?
I suppose, the strategy. Finance is a data-heavy field, and with things like big data, machine learning, and AI, there's much more of a push in terms of the bank's strategy towards being able to do that sort of stuff and growing their capabilities in that area. And I just think the technology is really cool!
Do you use skills you've learned from DataCamp at work?
Particularly Bag of Words. That was the most recent course I took, and I am actually doing a project now using that. It is directly applicable. I think as I get more embedded into my team, I will pretty much use all of it. Courses on data visualization and data manipulation are really useful because it is stuff you are going to do all the time.
Tell us more about the bag of words project!
It is kind of similar to the scenario you had in the exercises, actually. The bank does this big survey of all colleagues once a year, and they want me to take all of the verbatim comments from that survey, do some analysis on them, and find out: Are we doing better than last year? Are we doing better than other similar companies? What do people like? What don't they like? Very similar to the scenario you had in the course, comparing comments from Amazon and Google.
What do you like about DataCamp?
I like the fact that you have learning tracks. Everything is laid out in progression. Before, I was just randomly picking courses myself. And while that's fine, it meant sometimes I picked a course that had a prerequisite I didn't know about, which meant I didn't understand some of the stuff in the course. But because the track tells you which course to take next, it is easier to make sure that you take things in the right order. I quite like that.
I also like the general interactivity. The fact that you are listening to the videos, then reading, then actually doing the code yourself is really useful. And the fact that you've got the documentation in the browser itself is quite good so you don't have to get other tabs out to look things up if you don't understand what's going on.
One of the things I found quite useful is that in many of the videos, when you're talking about certain packages, you also mention new packages, which is quite good. You can go look at those and sort of teach yourself. DataCamp is very up to date.
Oh, and I had a problem with one of the exercises recently, and I got responses really quickly. From that point of view as well, I'm really happy.
How do you choose between R and Python?
Originally I chose Python because I had already been starting to learn it to build websites and things like that. I think Python is a bit more wide-ranging to start with, whereas R is more specific, so it depends on what you are trying to do. When I realized that the team I wanted to work for was using R, that's when I transitioned to using R and focusing just on that. But actually, I plan to go back to Python as well in the future, because some teams obviously use Python, some use R, so I think it is good to have the option to learn both, and to get good at transitioning between the two languages.
What convinced you DataCamp would be worth it?
It is very unintimidating. The website looks really nice, which helps. The fact that you've got videos, and you've got people actually in the videos talking to you, really helps, I think. There is obviously loads of stuff out there on the internet where you can learn data science. But some of it is really intimidating and a bit dry. DataCamp is quite engaging and very visual, which makes it a lot easier. And it is quite reassuring as well: if there is something that is a bit tricky to learn, the person in the video will say so, and you feel a little better about carrying on, because obviously it can be quite a steep learning curve. I think I have pretty much gotten everything I want out of DataCamp.
What were your goals when you started on DataCamp? Were those goals met?
My goal was to get to a level of understanding that would help me get a job in analytics – and I definitely achieved that! Now, my next goal is to finish the Data Science track – 55% to go! – and longer-term to build a useful machine learning model at work.
What would you advise someone just starting to learn data science?
One of the main things is having a sort of structure for how you're going to do it. So like I said, the Tracks thing really helps with that because you've then got a learning plan. But it's also about blocking out time in the week to do it and deciding how much time. If you're going to sit down for an hour, make sure you actually do, and make sure you do that a few times a week. The DataCamp website doesn't really cover that side of things. But obviously setting a schedule is a really important part of keeping the momentum up when you're learning something. It is doable to learn this stuff!
My ideal learning plan would involve, say, 3-4 long sessions in a week, where I'd spend 2-3 hours split between DataCamp and reading statistics textbooks. However, in reality, it's pretty hard to find that time, so I tend to do a minimum of 10 minutes of statistics reading or 10 minutes of DataCamp each weekday morning so that at least something has been done each day, and 2-3 times per week I try to do an extra hour of either one in the evening too. Usually on the weekends, I get in one longer session as well.
How does DataCamp compare to other platforms you've tried?
There is a lot more continuity on DataCamp between each of the courses. You can see where one course builds on another. If you're doing something, it will say if you haven't done this course, you might want to go back and do that one first, things like that. Each bit kind of builds on the last. It also doesn't overload you with information like a lot of other websites do. So you're learning the statistics and things like that as you go along, through doing the programming, which I think is the best way. Some of the other courses will have a statistics lesson, then a programming lesson, and you kind of don't see the connection between the two.