Conference Program

All Times ET

Thursday, June 9
Software & Data Science Technologies
Contributions to Software and Technology
Thu, Jun 9, 3:45 PM - 5:15 PM
Butler
 

WITHDRAWN Practical Target-Based Synchronization Strategies for Immutable Time-Series Data Tables (310074)

Amy Apon, Clemson University 
Bennett Andrew Meares, Clemson University 
Mitch Shue, Clemson University 

Keywords: time-series, database synchronization, ETL

As the Internet of Things and industrial monitoring of utilities grow, efficiently synchronizing immutable time-series data streams between databases becomes a pressing issue. Extracting data from critical production databases demands careful consideration of the stress imposed on the machines, so synchronization strategies are required to minimize the transfer of duplicate data and the load imposed on remote sources.

Literature on the synchronization problem is generalized to arbitrary tables and does not consider the characteristics of time-series data streams, so research was required to investigate methods to quickly synchronize source and target time-series data tables. This paper examines immutable time-series scenarios and synchronization strategies to answer the following question: given several scenarios, which target-based immutable time-series synchronization strategies best optimize run-time, bandwidth, and accuracy?

Many real-world data streaming applications generate immutable time-series data streams. Particularly in the growing IoT industry, numerous commercial and open-source time-series data management systems have emerged. A persistent problem when working with time-series data is how to regularly and efficiently synchronize historical data between databases, such as when copying records from a production server to an analytical database.

Simple strategies may work well for a few small tables, but as the number of data streams grows, so too does the need to reduce the processing and bandwidth requirements per transaction. For example, copying a table of a few thousand rows might take only a second, but updating several growing tables with millions of rows each requires careful planning to synchronize changes rapidly.
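
As an illustration of what a target-based strategy could look like, the following minimal sketch asks the target table for its latest timestamp and pulls only newer rows from the source. It is not taken from the paper: the table name `readings`, the columns `ts` and `value`, and the helper names `make_db` and `sync_append_only` are all hypothetical, and the sketch assumes rows are append-only with strictly increasing timestamps and no late-arriving data.

```python
import sqlite3


def make_db(rows):
    """Create an in-memory table of (ts, value) rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (ts INTEGER PRIMARY KEY, value REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    conn.commit()
    return conn


def sync_append_only(source, target):
    """Copy rows newer than the target's latest timestamp; return rows copied."""
    # Ask the target, not the source, how far synchronization has progressed.
    last = target.execute("SELECT MAX(ts) FROM readings").fetchone()[0]
    if last is None:
        # Empty target: a full copy is unavoidable.
        query, params = "SELECT ts, value FROM readings ORDER BY ts", ()
    else:
        # Assumption: immutable rows with increasing timestamps, so only
        # rows past the target's high-water mark need to be transferred.
        query = "SELECT ts, value FROM readings WHERE ts > ? ORDER BY ts"
        params = (last,)
    rows = source.execute(query, params).fetchall()
    target.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    target.commit()
    return len(rows)
```

Under these assumptions, each run sends a single cheap aggregate query to the target and transfers only the missing rows, so the load on the source scales with the amount of new data rather than with table size. Scenarios with backfilled or out-of-order rows would need a different strategy.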