Early fault detection and diagnostics are essential to improving the efficiency, safety, and stability of vehicle operation. Diagnostics also has direct and indirect financial impacts on the service and support organizations that maintain the vehicle. Remote diagnostics can reduce vehicle downtime in service centers and increase customer satisfaction, particularly when it is carried out through over-the-air updates or over the telephone. To troubleshoot a vehicle, technicians can use specialized tools to read failure codes stored in vehicle controllers, or failure symptoms can be gathered manually through customer service hotlines and remote service technicians. Gathering the data, interpreting it, and carrying out any repairs still consumes valuable time and is prone to human error. Recently, numerous studies have investigated data-driven approaches to improve vehicle diagnostics using available vehicle data. This dissertation investigates a machine learning pipeline to improve automated vehicle diagnostics and prognostics. Using Natural Language Processing (NLP), we demonstrate a comprehensive model that extracts customer and agent interactions from repair-service call transcriptions. Machine Learning (ML) algorithms are then applied to identify accurate failure reports and claims, to route service requests to the proper service department, and to combine historical service information with current customer claims to identify vehicle parts that may have failed.
First, NLP techniques are used to automate the extraction of crucial information from free-text failure reports (generated during customers' calls to the service department). Because general-purpose NLP techniques performed poorly on such texts, we introduce an NLP taxonomy for the automotive domain. We show that domain-based NLP processing and feature extraction help extract meaningful information from the reports.
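As a rough illustration of domain-based preprocessing and feature extraction, the following minimal sketch maps colloquial phrases in a free-text report to canonical component terms and counts them. The lexicon entries and the function names `normalize_report` and `extract_features` are hypothetical and are not taken from the taxonomy introduced in this work.

```python
import re
from collections import Counter

# Hypothetical domain lexicon (illustrative assumption only): maps colloquial
# or abbreviated phrases to canonical component terms.
DOMAIN_LEXICON = {
    "check engine light": "malfunction_indicator_lamp",
    "cel": "malfunction_indicator_lamp",
    "a/c": "air_conditioning",
    "ac": "air_conditioning",
    "brakes": "brake_system",
    "brake": "brake_system",
}


def normalize_report(text):
    """Lowercase a report and replace known phrases with canonical terms."""
    text = text.lower()
    # Replace longer phrases first so multi-word entries win over short ones.
    for phrase in sorted(DOMAIN_LEXICON, key=len, reverse=True):
        # Match the phrase only when it is not part of a longer word.
        pattern = r"(?<![\w/])" + re.escape(phrase) + r"(?![\w/])"
        text = re.sub(pattern, DOMAIN_LEXICON[phrase], text)
    return text


def extract_features(text):
    """Count the canonical domain terms that occur in a normalized report."""
    canonical = set(DOMAIN_LEXICON.values())
    tokens = re.findall(r"[a-z_]+", normalize_report(text))
    return Counter(token for token in tokens if token in canonical)


report = "Customer says the CEL is on and the A/C blows warm; brakes squeal."
features = extract_features(report)
```

The resulting counts could serve as input features for a downstream classifier. A production taxonomy would be far larger and would likely need to handle misspellings and transcription errors.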
Deep learning algorithms are employed to validate service requests and filter out vague or misleading claims. Various classification algorithms are then implemented so that valid service requests can be directed to the relevant service department. We propose a Bidirectional Long Short-Term Memory (BiLSTM) network combined with a Convolutional Neural Network (CNN), which validates service requests with more than 18% better performance than an average technician. Furthermore, using domain-based NLP techniques at the preprocessing and feature extraction stages, together with CNN-BiLSTM-based request validation, enhanced the performance of the Gradient Tree Boosting (GTB) service classification model, whose Receiver Operating Characteristic Area Under the Curve (ROC-AUC) reached 0.82.
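To make the ROC-AUC figure concrete, the sketch below computes the metric for the binary case as the fraction of (positive, negative) pairs in which the positive example receives the higher score. The labels and scores are made-up illustrative values, and how the metric was averaged across service departments is not stated here, so the binary form is an assumption.

```python
def roc_auc(labels, scores):
    """Binary ROC-AUC: share of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct ranking.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Illustrative data only: 1 = request belongs to the department, 0 = it does not.
example_labels = [1, 0, 1, 1, 0, 0]
example_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
example_auc = roc_auc(example_labels, example_scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so 0.82 means that a randomly chosen positive request is scored above a randomly chosen negative one about 82% of the time.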
Next, we perform automated failure classification on the extracted data to route claims to the proper service departments. With optimized feature extraction and classification methods, requests can be forwarded to the correct department with 80% accuracy, which exceeds the 60% baseline accuracy of an average customer service technician. NLP analysis of the text reports can also generate technical information for vehicle and component prognostics, a use that has not been previously studied.
Finally, we propose a novel network structure that uses a multi-variant high-dimensional Markov chain to predict the component likely to fail in the next service interval and thereby enhance the performance of the CNN-LSTM model. The Markov model takes advantage of historical records to identify the most efficient CNN kernels in the network structure. The proposed model significantly improved classification efficiency on correlated historical records such as vehicle service reports. Compared to conventional CNN-LSTM models, it improved accuracy by 8%, sensitivity by 9%, specificity by 11%, precision by 10%, and F-score by 12% by reducing the false positive cases in customer claim classification.
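The idea of using historical service records to anticipate the next failed component can be sketched with a much simpler model than the one proposed: a first-order Markov chain over a single sequence of failed components per vehicle. This is a deliberately reduced stand-in, not the multi-variant high-dimensional chain described above, and it does not show how the chain guides CNN kernel selection. The component names, the service histories, and the function names `fit_transitions` and `predict_next` are hypothetical.

```python
from collections import Counter, defaultdict


def fit_transitions(histories):
    """Estimate P(next component | current component) from service histories."""
    counts = defaultdict(Counter)
    for sequence in histories:
        # Count each consecutive (current, next) pair in the history.
        for current, nxt in zip(sequence, sequence[1:]):
            counts[current][nxt] += 1
    return {
        current: {nxt: c / sum(nexts.values()) for nxt, c in nexts.items()}
        for current, nexts in counts.items()
    }


def predict_next(transitions, last_component):
    """Return the most probable next failed component, or None if unseen."""
    nexts = transitions.get(last_component)
    if not nexts:
        return None
    return max(nexts, key=nexts.get)


# Hypothetical per-vehicle histories of failed components, oldest first.
histories = [
    ["battery", "alternator", "starter"],
    ["battery", "alternator", "battery"],
    ["brakes", "battery", "starter"],
]
transitions = fit_transitions(histories)
likely_next = predict_next(transitions, "battery")
```

In this toy data, two of the three recorded transitions out of "battery" lead to "alternator", so that component is the predicted next failure. A higher-order or multivariate chain would condition on more than the last service record.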