Neural Network Analysis of MOOC Dropout


How can we use neural nets to better predict which students will earn certification in an edX course? Better predictions would let teachers design courses that do more to help students who are seeking certification.


Our team used recurrent neural nets to build a model that predicts student certification from the actions a student takes within an edX course, such as loading course content, pausing a video, and answering questions. This work became the basis of a paper, "Communication at Scale in a MOOC Using Predictive Engagement Analytics."
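As a rough illustration of what feeding student actions to a recurrent net involves, here is a minimal sketch of turning one student's event stream into a fixed-length sequence of integer indices. It is not the project's actual preprocessing code. The event names, the reserved padding and unknown indices, and the helper functions build_vocab and encode_stream are all assumptions made for this example.

```python
from typing import Dict, List, Sequence

# Hypothetical event names, loosely based on the actions listed above.
EVENT_TYPES = ["load_content", "play_video", "pause_video", "answer_question"]

PAD = 0  # index reserved for padding (assumption for this sketch)
UNK = 1  # index reserved for event types missing from the vocabulary


def build_vocab(event_types: Sequence[str]) -> Dict[str, int]:
    """Map each event type to an integer index, starting after PAD and UNK."""
    return {name: i + 2 for i, name in enumerate(event_types)}


def encode_stream(events: Sequence[str], vocab: Dict[str, int], max_len: int) -> List[int]:
    """Encode one student's chronological events as a fixed-length index list.

    Keeps only the most recent max_len events and left-pads shorter
    streams with PAD so every student yields a list of the same length.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    ids = [vocab.get(event, UNK) for event in events][-max_len:]
    return [PAD] * (max_len - len(ids)) + ids


vocab = build_vocab(EVENT_TYPES)
student_events = ["load_content", "play_video", "pause_video", "answer_question"]
encoded = encode_stream(student_events, vocab, max_len=6)
print(encoded)  # [0, 0, 2, 3, 4, 5]
```

Sequences in this form are the kind of input an embedding layer followed by a recurrent layer typically expects. Left-padding is one common choice because it keeps the most recent events at the end of the sequence.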


I set the research goals and managed the project, keeping the scope clear and sequencing the work around research dependencies. I also programmed and trained the neural network.


Keras, Pandas, Python, and a really big GPU


We began with a literature review of research on predicting dropout in Massive Open Online Courses (MOOCs). We found that work in other areas of educational research had recently applied recurrent neural nets (RNNs) to predicting future events, but that using the raw event stream as the input to an RNN had not yet been tried. I created a project plan that split the work into two tracks: replicating previous results, to give us a baseline for comparison, and developing the RNN model. After assigning tasks within the team, I worked on the RNN research myself, adapting existing neural net code and consulting local experts to apply the analysis to our problem. As the project progressed, I adjusted its scope so that it would finish on time and make good use of our team's skills. One such adjustment was adding t-SNE analysis to look for patterns in the hidden state of the neural net.
Code is available at