On a whim in late October 2020, on the advice of a friend, I signed up for an AI Product Development course with educational start-up Aggregate Intellect.
The course I signed up for was a six-week workshop with streams in Product Development, Machine Learning (ML) Ops, and of course Machine Learning. In this workshop, we were developing a product that used Graph Neural Networks.
Having just completed the course, I am putting down all my thoughts on it now, including an unbiased review of the pros and cons of signing up for an Aggregate Intellect workshop.
TL;DR
What?
A six-week workshop on AI Product Development.
Who should take it?
Anyone who works in or aspires to work in data science.
Prerequisite knowledge?
I would suggest being decently familiar with statistical and machine learning concepts, and comfortable writing code with Python. Depending on the product you end up building, HTML/CSS knowledge might help as well, as would being able to optimize your code.
Cost?
I paid $320 CAD, which comes with six months of access to the course material and to the Aggregate Intellect Slack channel, job boards, and ML presentations.
Review
To give you an idea of what you can build in the course, I present to you PatentNet. This was the product my team and I built over the course of six weeks.
http://34.204.94.115:5000/main_page
Above is our demo video and the link to the beta version of our product.
Overview
THE WORKSHOP
First, let me tell you a bit about the workshop itself.
The Workshop can be broken down into three main components – learning, touchpoints, and building.
LEARNING
The learning section of the workshop covered three major topics, as I listed above: Product Development, ML Ops, and Machine Learning. Each six-week workshop focuses on a different Machine Learning concept.
There was a set learning plan for each week, but it was mostly a self-paced affair. A series of videos was provided for each topic, which gave you enough information to get started. For the Product Development section, PowerPoint slides were also provided that we could fill out as we worked. In the ML Ops section, example code was provided. Finally, for the Machine Learning section, Google Colab Notebooks were provided with explanations and examples of ML implementation, along with bonus Notebooks so we could try it ourselves.
TOUCHPOINTS
Due to the time differences between people in the workshop, two hour-long touchpoints were hosted over Zoom each week. In these sessions, we had the chance to discuss the state of our projects, ask questions, and generally keep up with what the rest of the class was working on.
We could also book one half-hour session with a TA per week to get help with ML Ops, Product Development, or the Machine Learning aspect. These “office hours” meetings were very helpful in resolving blocks in our work.
BUILDING
Since the focus of the course was to build something, the building section was the aspect where I spent the most time.
For the first two weeks, we each tried to develop our own idea, interviewing potential users to break down the problem we wanted to solve. After the second week, we were placed into teams of three, and each team selected one idea to build out for the rest of the workshop.
This was where the learning was put into practice, as we ideated a product, reviewed it with potential users, designed a website to showcase it, collected data and built the models that would become our product features.
If that seems like a lot to pack into four weeks, especially if you are working a full-time job, you are not wrong.
Comparison to other courses
I have tried other online courses in the past, including from Coursera and Udemy, and I have also completed a full Master's in Data Science.
As someone who learns best by doing, I found this course offered a very impactful way of learning. With Coursera and Udemy courses, I find my enthusiasm flagging after a few weeks of learning, and it can be difficult to get to the end as life's priorities pile on. The touchpoints here helped keep the end goals in sight.
With my Master's program, I had significantly more time to absorb the information I was learning, and a whole lot more structure in the tasks I was working on. If you do not have the time to complete a full certificate or degree, one or two of these workshops would help you get a good feel for working in data science.
Overall Experience
I’ll start by saying that I learned a lot during this course, but not in the direction I thought I would. Originally, I signed up because I wanted practical experience deploying Machine Learning models. I do not feel I learned a lot about Graph Neural Networks (what I originally wanted to be learning). Instead, I gained a whole lot of knowledge in the fields of product development, project management, and data management. Partly, this is on me and my team. For the product we were building, data collection and storage ended up being a monumental task, and our focus did not turn to GNNs until the final few days.
Probably due to the nature of our product, I did find that the course took up a lot more time than I had originally expected. Obviously, you get out what you put in, and the hours I put in were worth it as I think I learned quite a bit, but it did start to drag on (especially in the final two weeks).
LEARNING
I found the learning videos to be well done, and they were very useful to refer back to whenever I needed to apply a concept. That is definitely better than watching a lecture, forgetting what was said, and needing to ask about the concept again and again.
The machine learning notebooks were good for practical explanation, though I found they did not help us learn how to implement GNNs all that effectively. GNNs are rather new on the Machine Learning scene, and their frameworks are still in development. As such, I did not find these notebooks all that helpful in translating the implementation to our actual product.
However, they accomplished their job as a primer to working with GNNs.
TOUCHPOINTS
There is not a lot to say about the touchpoints. On the whole, I found them decently effective, especially in the first few weeks as we “got our sea-legs under us”, so to speak. As we progressed deeper into the workshop, they started to become more like a scrum, where teams would detail what they had accomplished in the previous week and discuss their current blocks with the TAs.
The office hours were also very useful, and definitely helped resolve blocks that my teammates and I ran into in Product Development, ML Ops, and Machine Learning implementation.
BUILDING
The product we built was eventually named PatentNet, with the main feature being a patent similarity search function.
To be honest, this product did not play well with the timeline we had to develop it. US Patent data, though open, was difficult to access, pull, and clean. Putting different sets of data together was another difficult undertaking, but both of these needed to be taken care of before we could even start to build our product.
The teamwork aspect was done completely remotely, and due to different schedules for all of us, it was often difficult to coordinate our tasks.
I ran into my own set of difficulties as well. I am currently working on rural internet, and let’s just say it sucks. Downloading and uploading data was painfully slow. We used my Google Drive for most of our work, and as I have so far been too cheap to pay for extra storage, that quickly became an issue as well (amazing how fast you can eat up 15 GB).
These complaints aside, we did successfully accomplish our task, building a product with three key features: patent similarity search, patent document classification, and output of similar sentences from other patents to add some automation to filling out an Information Disclosure Form.
We built our data pipelines with Python, mostly using Google Colab Notebooks stored on Google Drive. Our models were built the same way. For the ML Ops aspects, we packaged our final models using MLflow, built our website with Flask (Python), and hosted it on AWS.
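For readers wondering what a patent similarity search can look like in code, here is a minimal sketch. It is not our PatentNet implementation: it assumes each patent has already been turned into a numeric vector (an embedding) by some model, and it ranks patents by cosine similarity, which is one common choice among several. The patent IDs, the toy vectors, and the function names are all made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(query_id, embeddings, top_k=3):
    """Return the top_k (patent_id, score) pairs closest to query_id."""
    query = embeddings[query_id]
    scores = [
        (other_id, cosine_similarity(query, vec))
        for other_id, vec in embeddings.items()
        if other_id != query_id
    ]
    # Highest similarity first
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]


# Hypothetical patent IDs with toy 3-dimensional embeddings (assumption:
# a real system would use much longer vectors produced by a trained model).
embeddings = {
    "patent_A": [1.0, 0.0, 0.0],
    "patent_B": [0.9, 0.1, 0.0],
    "patent_C": [0.0, 1.0, 0.0],
    "patent_D": [0.0, 0.0, 1.0],
}

results = most_similar("patent_A", embeddings, top_k=2)
for patent_id, score in results:
    print(patent_id, round(score, 3))
```

In a sketch like this, the quality of the search depends almost entirely on how the vectors are produced, which is where the modelling work goes.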
Final Thoughts
I think many of the difficulties my team ended up facing were related, at least partially, to trying to work with GNNs. When coming up with a product, we first needed to identify data sources that were already in a network format, or could easily be put into one. This is why we settled on patents, as there is an existing dataset containing a citation network for many patents.
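To make "network format" a bit more concrete, here is a small sketch of how citation records could be turned into a graph. It is only an illustration under my own assumptions, not the code we used: the patent IDs are invented, and the two-row edge list (one row of source nodes, one row of target nodes) is modelled on the edge index layout that libraries such as PyTorch Geometric use.

```python
# Hypothetical citation records: (citing_patent, cited_patent)
citations = [
    ("P3", "P1"),
    ("P3", "P2"),
    ("P4", "P3"),
    ("P4", "P1"),
]


def build_graph(citation_pairs):
    """Map patent IDs to integer node indices and build a two-row edge list."""
    node_index = {}
    sources, targets = [], []
    for citing, cited in citation_pairs:
        # Assign the next free integer to any patent we have not seen yet
        for patent in (citing, cited):
            if patent not in node_index:
                node_index[patent] = len(node_index)
        sources.append(node_index[citing])
        targets.append(node_index[cited])
    # Row 0 holds the citing nodes, row 1 the cited nodes
    return node_index, [sources, targets]


node_index, edge_index = build_graph(citations)
print(node_index)
print(edge_index)
```

Each column of the edge list is one citation, so a dataset that already ships as citing/cited pairs needs very little reshaping before a GNN can use it.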
I would suggest to anyone going into this course that they budget an appropriate amount of time per week for it (this might end up being as much as 15 hours in a week).
Although you do not need a lot of prerequisite knowledge to take the course, and your teammates may be able to make up for your lack of knowledge in certain areas, I would recommend that you be comfortable working with Python, understand statistics and machine learning, and be capable of dealing with messy, real-world data. Without at least two of these, you will end up spending a lot more time working to complete your tasks.
In the end, I would recommend this course to anyone looking to gain some practical experience in product development and machine learning. The only caveat I would add is to make sure you have the time available to fully commit to it. It may be the type of thing that is best taken while a student, or while looking for a job.