Educational systems have traditionally been evaluated using cross-sectional studies, namely,
examining a pretest, a posttest, and a single intervention. Although this is a popular approach in
education, it does not model valuable information such as confounding variables, feedback to
students, and other real-world deviations of studies from ideal conditions. Moreover, learning
is inherently a sequential process and should involve a sequence of interventions. Nowadays, due
to the availability of a large volume of educational data, researchers can develop more intelligent
inference algorithms.
We propose to exploit the rich features in time series data and use them to develop more intelligent
and individualized educational systems. Our approach is five-fold: First, we model the
sequential nature of education using hidden Markov models and show that analysis of a sequence
of student actions is predictive of posttest results. Second, we propose more intelligent experimental
designs that collect richer data from students, by adding questions on potential confounders
to the diagnostic test and by recording instructor interventions during office hours. Third, we propose
various experimental and quasi-experimental designs for educational systems and quantify them
using the language of graphical models and directed acyclic graphs (DAGs). We discuss the
application and limitations of each method in education. Fourth, we propose to model the education system
in terms of time-varying treatments, confounders, and time-varying treatment-confounder feedback. We
show that if we control for a sufficient set of confounders and use appropriate inference techniques
such as inverse probability of treatment weighting (IPTW) or the g-formula, we can close the backdoor
paths and obtain an unbiased estimate of the causal effect of joint interventions on the outcome. Fifth,
we compare the performance of the g-formula and IPTW and discuss the pros and cons of each
method.
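
To make the fourth and fifth points concrete, the following is a minimal sketch of the nonparametric g-formula and IPTW applied to simulated data with two treatment times and treatment-confounder feedback. Everything in it is an illustrative assumption rather than the study's actual data or implementation: the variable names (L0, A0, L1, A1, Y), the data-generating probabilities, and the helper functions simulate, g_formula, and iptw are all hypothetical. Under this assumed model the true effect of the joint intervention (A0=1, A1=1) versus (A0=0, A1=0) is 2.6.

```python
import random
from collections import defaultdict


def simulate(n, seed=0):
    """Hypothetical data: confounder L0 -> treatment A0 -> confounder L1 -> treatment A1 -> outcome Y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        l0 = int(rng.random() < 0.5)
        a0 = int(rng.random() < 0.3 + 0.4 * l0)              # L0 confounds A0
        l1 = int(rng.random() < 0.2 + 0.3 * a0 + 0.3 * l0)   # feedback: A0 affects L1
        a1 = int(rng.random() < 0.2 + 0.5 * l1)              # L1 confounds A1
        y = a0 + a1 + 2.0 * l1 + l0 + rng.gauss(0.0, 1.0)
        rows.append((l0, a0, l1, a1, y))
    return rows


def g_formula(rows, a0, a1):
    """Estimate E[Y^(a0,a1)] as sum over l0,l1 of E[Y|a0,a1,l0,l1] P(l1|a0,l0) P(l0)."""
    n = len(rows)
    n_l0 = defaultdict(int)      # count of L0 = l0
    n_l0a0 = defaultdict(int)    # count of L0 = l0 with A0 = a0
    n_l1 = defaultdict(int)      # count of (l0, l1) with A0 = a0
    y_sum = defaultdict(float)   # sum of Y in cell (l0, l1) with A0 = a0, A1 = a1
    y_cnt = defaultdict(int)
    for l0, b0, l1, b1, y in rows:
        n_l0[l0] += 1
        if b0 == a0:
            n_l0a0[l0] += 1
            n_l1[(l0, l1)] += 1
            if b1 == a1:
                y_sum[(l0, l1)] += y
                y_cnt[(l0, l1)] += 1
    total = 0.0
    for l0 in (0, 1):
        for l1 in (0, 1):
            mean_y = y_sum[(l0, l1)] / y_cnt[(l0, l1)]
            total += mean_y * (n_l1[(l0, l1)] / n_l0a0[l0]) * (n_l0[l0] / n)
    return total


def iptw(rows, a0, a1):
    """Estimate E[Y^(a0,a1)] as a weighted mean with weights 1 / (P(A0|L0) P(A1|L0,A0,L1))."""
    n_l0 = defaultdict(int)
    n_a0 = defaultdict(int)
    n_hist = defaultdict(int)
    n_a1 = defaultdict(int)
    for l0, b0, l1, b1, _ in rows:
        n_l0[l0] += 1
        n_a0[(l0, b0)] += 1
        n_hist[(l0, b0, l1)] += 1
        n_a1[(l0, b0, l1, b1)] += 1
    num = den = 0.0
    for l0, b0, l1, b1, y in rows:
        if (b0, b1) != (a0, a1):
            continue  # only subjects who followed the regime contribute
        p0 = n_a0[(l0, b0)] / n_l0[l0]                  # empirical P(A0 | L0)
        p1 = n_a1[(l0, b0, l1, b1)] / n_hist[(l0, b0, l1)]  # empirical P(A1 | L0, A0, L1)
        w = 1.0 / (p0 * p1)
        num += w * y
        den += w
    return num / den


rows = simulate(20000, seed=1)
effect_g = g_formula(rows, 1, 1) - g_formula(rows, 0, 0)
effect_w = iptw(rows, 1, 1) - iptw(rows, 0, 0)
print("g-formula:", round(effect_g, 3), "IPTW:", round(effect_w, 3))
```

Because all variables here are binary and both estimators use saturated (cell-count) models, the two estimates coincide up to floating-point error. Differences between the methods arise in practice when parametric outcome or treatment models are used, which is where a comparison of their pros and cons becomes relevant.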