Enhancing Data Imputation with Generative AI: Transforming Time Series through Image Processing
Recent improvements in data collection efficiency have increased the volume of time series data in fields such as finance, environmental science, and healthcare. One challenge arising from this surge is dealing with missing data. Gaps in time series may arise for various reasons, including sensor malfunctions, discrepancies in financial reports, or other data collection inconsistencies. These gaps can significantly impact time series analyses, leading to erroneous forecasts and misinformed decisions. Data imputation addresses this issue by replacing missing values with estimated substitutes. Despite advancements in this field, existing techniques still suffer from information loss, bias, computational complexity, and reliance on assumptions, all of which limit their reliability and generalization. To tackle these problems, this thesis introduces an approach that combines the ability of generative AI to synthesize data with state-of-the-art image processing methods. The goal is to overcome the limitations of current strategies and pave the way for reliable data analysis and predictions. A fundamental component of our methodology is the use of the Gramian Angular Field (GAF) algorithm to convert time series data into images while preserving important temporal relationships. Subsequently, these synthetic images are used to train our Conditional Generative Adversarial Networks (cGANs), which are designed to fill in missing pixels in images created from incomplete time series data. Lastly, we retrieve the full time series from the images produced by our generative model. This approach shows promise, particularly in cases where a considerable portion of the data is lost. To validate this approach, we conducted an analysis using the MIT-BIH Arrhythmia dataset.
Compared with a conventional imputation method, our proposed approach improved both accuracy and training time. This work suggests a promising direction for improving data imputation in time series analysis, potentially leading to more precise and reliable predictions.
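The conversion between a time series and its Gramian Angular Field image, and the retrieval of the series from that image, can be illustrated with a minimal sketch. It is not the implementation used in this thesis. It assumes the summation variant of the GAF and a rescaling of the series to [0, 1], neither of which is specified above; under that rescaling the diagonal of the image alone determines the series. The function names `to_gasf` and `from_gasf` are hypothetical, and the cGAN inpainting step is omitted.

```python
import numpy as np


def to_gasf(series):
    """Encode a 1-D series as a Gramian Angular Summation Field image.

    Assumption: the series is min-max rescaled to [0, 1] so that the
    encoding can be inverted from the image diagonal.
    """
    x = np.asarray(series, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        raise ValueError("a constant series cannot be rescaled")
    scaled = (x - lo) / (hi - lo)
    # Polar encoding: each rescaled value becomes an angle in [0, pi/2].
    phi = np.arccos(np.clip(scaled, 0.0, 1.0))
    # Pixel (i, j) holds cos(phi_i + phi_j).
    image = np.cos(phi[:, None] + phi[None, :])
    return image, lo, hi


def from_gasf(image, lo, hi):
    """Recover the series from the diagonal of a GASF image.

    On the diagonal, cos(2 * phi_i) = 2 * scaled_i**2 - 1, so
    scaled_i = sqrt((diagonal_i + 1) / 2) when scaled_i is in [0, 1].
    """
    diag = np.clip(np.diag(image), -1.0, 1.0)
    scaled = np.sqrt((diag + 1.0) / 2.0)
    return scaled * (hi - lo) + lo


# Round trip on a toy signal (illustrative data, not the MIT-BIH dataset).
signal = np.sin(np.linspace(0.0, 3.0, 50))
gasf_image, signal_min, signal_max = to_gasf(signal)
recovered = from_gasf(gasf_image, signal_min, signal_max)
```

In this sketch a gap in the series would appear as missing rows and columns in the image, and a generative model would have to restore at least the diagonal pixels before `from_gasf` could return the completed series.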