Data duplication after Apache IoTDB Pipe restart: how to preserve the existing sync progress?
When using Apache IoTDB's pipe feature to sync data to another Apache IoTDB instance, I encountered duplicate data transmission. The scenario is as follows:
I configured a Pipe from root.source to a remote IoTDB instance (root.target):
CREATE PIPESINK remote_sink AS IoTDB ('ip:6667', 'username', 'password');
CREATE PIPE source_to_target TO remote_sink FROM (SOURCE 'root.source.we4rwe.**') INTO (TARGET 'root.target.faknlv93.**');
START PIPE source_to_target;
The initial sync works fine, but when I manually restart the Pipe (e.g., STOP PIPE source_to_target followed by START PIPE), some historical data (e.g., already synced root.source.d1.sensor data) is retransmitted, causing duplicates on the target. The pipe status (SHOW PIPES) shows status=RUNNING, with no error logs.
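A minimal way to quantify the retransmission, sketched here under the assumption that the source and target paths are exactly those used in the CREATE PIPE statement above, is to compare point counts on both sides before and after the restart. Run the first statement on the source instance and the second on the target instance; the paths are taken from the statements above, and nothing else about the deployment is assumed.

```sql
SELECT COUNT(*) FROM root.source.we4rwe.**;
SELECT COUNT(*) FROM root.target.faknlv93.**;
```

If the target counts grow after the restart while the source counts stay the same, that would confirm the already-synced points are being written again.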
Is this the expected behavior of IoTDB Pipe? How can I avoid duplicate data transmission when a Pipe is restarted? Are additional configurations (e.g., sync_progress or other parameters) required to keep the existing sync progress?
Asked by Hester Tso
(101 rep)
Aug 1, 2025, 08:12 AM