In recent years, there have been unprecedented technological advances in sensor technology, and sensors have become more affordable than ever. Thus, sensor-driven data collection is increasingly becoming an attractive and practical option for researchers around the globe. Such data is typically extracted in the form of time series data, which can be investigated with data mining techniques to summarize the behaviors of a range of subjects including humans and animals. While enabling cheap and mass collection of data, continuous sensor data recording results in datasets which are big in size and volume, which are challenging to process and analyze with traditional techniques and software in a timely manner. Such collected sensor data is typically extracted in the form of time series data.
Time series data is widely used in many domains and is of significant interest in data mining. In recent years researchers have proposed various algorithms for the efficient processing of time series datasets. The problem of time series classification has been around for decades. There are two main approaches for time series classification in the literature, namely, shape-based classification and feature-based classification. Shape-based classification determines the best class according to a distance measure (e.g., Euclidean distance). Feature-based classification, on the other hand, measures properties of the time series and finds the best class according to the set of features defined for the time series.
In this dissertation, we demonstrate that neither of the two techniques will dominate for some problems, but that some “combination” of both might be the best. In other words, on a single problem, it might be possible that one of the techniques is better for one subset of the behaviors, and the other technique is better for another subset of behaviors. We introduce a hybrid algorithm to classify behaviors, using both shape and feature measures, in weakly labeled time series data collected from sensors to quantify specific behaviors performed by the subject. We demonstrate that our algorithm can robustly classify real, noisy, and complex datasets, based on a combination of shape and features, and tested our proposed algorithm on real-world datasets.