It’s the Christmas holidays as I write this, and I’m in the first of four weeks off after completing the first half of my Data Analytics Certificate. I feel like taking a semi-break before using most of this time off to get exposure to concepts I will be expected to learn in the second half of my program.
With that in mind, I first wanted to figure out which platform to use for this self-learning. My options were to read books at the library, watch online tutorials on sites such as Khan Academy or Lynda, or get involved in online data projects on Kaggle.
Where to Start
My decision was to complete some online courses on Lynda (recently acquired by LinkedIn). It made the most sense to me for two reasons. First, the kids: with two young children, I have to find time around them to study. So my need for relative independence filtered out Kaggle; I wasn’t sure if I would have to coordinate schedules with fellow collaborators. Second, being relatively new to modern “big data” computer science, I didn’t feel knowledgeable enough to guide myself through the plethora of information in the computers section of a library.
The two courses below were what I was able to get through for week one. I came across both by searching for courses related to data analytics, at a beginner level, running less than two hours.
[Lynda]: Learning Node.js | Instructor: Alexander Zanfir
- The course was built around a small project (building a chat box), which was a great way to show how Node.js would be used end to end to fulfill a user need.
- I wanted to see a bit more about object-oriented databases at work, and this did a good job of demonstrating them, even though it focused more on the front end.
- Coding alongside the instructor would have helped me get the most out of this course; I didn’t do this because I came in expecting only a demonstration of a couple of short examples.
[Lynda]: Spark for Machine Learning & AI | Instructor: Dan Sullivan
- Ideally, I’d like to do as many courses as I can that focus on components of the Hadoop ecosystem.
- Visuals and presentation were great. I was also comfortable with the pace.
- This course demonstrated many of the machine-learning tools in Spark’s MLlib library, using Python.
- The course did not go into the underlying algorithms behind methods such as Naïve Bayes, decision tree classification or k-means clustering, but instead went straight into demonstrating how to use MLlib’s built-in functions on Spark to solve the problems at hand.
- Lastly, the cosmetic best practices the instructor followed when coding helped keep things clear, such as clearing the screen and using soft returns to separate the parameters within function calls.
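Since the course skipped the algorithms themselves, here is a rough sketch of the idea behind k-means clustering in plain Python. This is not MLlib’s implementation and not code from the course; the function name `kmeans`, the sample points, and the choice to seed the centroids with the first k points are all my own assumptions for illustration. The call at the bottom also uses the one-parameter-per-line layout mentioned above.

```python
from math import dist


def kmeans(points, k, iterations=10):
    """Toy k-means: returns (centroids, labels) for a list of numeric tuples."""
    # Assumption: seed the centroids with the first k points. Real libraries
    # use smarter seeding strategies than this.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)

    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: dist(p, centroids[c]))
            for p in points
        ]

        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )

    return centroids, labels


# Hypothetical sample data: two loose groups of 2-D points.
sample = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]

centroids, labels = kmeans(
    sample,
    k=2,
    iterations=5,
)
print(labels)     # cluster index for each point, in input order
print(centroids)  # final centre of each cluster
```

A built-in function like the ones the course used hides exactly this assign-then-update loop, which is why it was possible to solve problems without ever seeing it.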
Well, it was a productive week one of my holiday season. Volunteering on a field trip with my daughter’s kindergarten class was a nice cherry on top of an ideal week for a data student dad!
The above courses gave me a nice heads-up on several fronts: how much time (and when) I can comfortably budget each week for this self-learning, the format of these online data courses (I should definitely follow along on my computer with what the instructor is doing), and the value that subsequent courses could bring, which I find encouraging! This was exactly what I needed to kick off my holiday season.