Unix & Linux Stack Exchange
Q&A for users of Linux, FreeBSD and other Unix-like operating systems
Latest Questions
7
votes
1
answers
2083
views
Btrfs/ZFS Network Replication
Is it possible to replicate a ZFS or Btrfs raid volume in real-time (or as close to as possible, network specs aside) over a network?
ZFS and Btrfs are ideal because of their CoW properties.
I'm thinking of something similar to DRBD, but DRBD won't work because it requires a single block device, and we're ruling out the option of exporting each disk as a DRBD device because that would get messy.
I don't want to use send/receive because they would be too slow, even if scripted.
Ideally, I'd like something relatively simple to avoid unnecessary complexity.
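For reference, the scripted send/receive baseline that the question rules out as too slow usually looks roughly like the loop below. This is a sketch only; the dataset `tank/data`, the host `backuphost` and the 60-second interval are made up, and it assumes one common snapshot already exists on both sides.
# minimal sketch, run on the sending host
prev=$(zfs list -H -t snapshot -o name -s creation -d 1 tank/data | tail -n 1)
while true; do
    snap="tank/data@repl_$(date +%Y%m%d%H%M%S)"
    zfs snapshot "$snap"
    zfs send -i "$prev" "$snap" | ssh backuphost zfs receive -F tank/data
    zfs destroy "$prev"
    prev="$snap"
    sleep 60
done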
DevinM
(171 rep)
Nov 10, 2015, 12:23 AM
• Last activity: Jul 30, 2025, 04:05 PM
0
votes
0
answers
34
views
CacheFiles when the cached system is unmounted, or alternatives
In my current setup, I have two machines, `serverA` and `serverB`, in different geographical areas. `serverA` has a limited amount of persistent storage (~256GB), while `serverB` can be considered to have enough that I will never use it all up (several TB).
`serverA` has a directory `/data` which is an NFS share from `serverB`, and also has CacheFiles enabled.
This setup achieves the following:
1. replication: if `serverA`'s disks die, I can still recover the data from `serverB`
2. unlimited storage: I am not limited by `serverA`'s small amount of persistent storage
3. fast access to data: the content of `/data` that is in the cache (basically the most recently accessed 200GB) can be accessed without a round-trip on the network
Note that a simple backup setup would not achieve 2. I'd like to achieve 1., 2. and 3., but also the following:
4. robustness: if `serverB` goes down temporarily, `serverA` can still work with the data that's been cached, without me having to manually intervene on `serverA`
5. encryption: `/data` is encrypted by `serverA`, so that someone with access to `serverB` cannot access the data
I'm mostly interested in 4.; 5. would only be a bonus. Here are my questions:
- I suppose CacheFiles does not achieve 4.; is this correct?
- What are the simplest setups that would allow me to achieve 1., 2., 3. and 4., and possibly also 5.?
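For context, the setup described above is usually wired together roughly as below. This is a sketch with made-up export path, cache sizes and thresholds; the `fsc` mount option plus a running `cachefilesd` is what enables FS-Cache/CacheFiles for an NFS mount.
# /etc/cachefilesd.conf on serverA (cache location and culling thresholds are examples)
dir /var/cache/fscache
tag mycache
brun 10%
bcull 7%
bstop 3%
# /etc/fstab on serverA: mount the export from serverB with the fsc option
serverB:/export/data  /data  nfs4  rw,fsc,_netdev  0 0
# start the cache daemon
systemctl enable --now cachefilesd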
Quentin
(25 rep)
Jun 8, 2025, 11:46 AM
• Last activity: Jun 8, 2025, 04:35 PM
2
votes
1
answers
68
views
How to replicate an entire disk using ZFS
I have 2 x 2TB disks (they are not equal). On the `master` disk I have my ZFS pool (`rpool`, where Proxmox is installed). I'd like to clone/replicate the entire disk to the `slave` disk once a day.
How can I do this?
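At the pool level (this does not copy the partition table or bootloader), the usual once-a-day pattern is an incremental snapshot plus send/receive driven from cron. A rough sketch, assuming the second disk already carries a pool named `backup` and that yesterday's snapshot still exists on both sides:
#!/bin/sh
# hypothetical daily job: incremental replication of rpool into a pool named "backup"
today=$(date +%F)
yesterday=$(date -d yesterday +%F)
zfs snapshot -r "rpool@$today"
zfs send -R -i "rpool@$yesterday" "rpool@$today" | zfs receive -F -d -u backup
zfs destroy -r "rpool@$yesterday"
The very first run has to be a full `zfs send -R rpool@<snapshot>` instead of an incremental.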
KaMZaTa
(121 rep)
Sep 5, 2024, 10:04 AM
• Last activity: Oct 2, 2024, 08:55 PM
0
votes
1
answers
34
views
Adding a fresh zfs sub-dataset pri_zp/Z1/Z99-future to pri_zp/Z1, and resuming recursive replication to sec_zp/Z1
I have set up and replicated the OpenZFS dataset `pri_zp/Z1` (with `pri_zp/Z1/Z00-initial`) to `sec_zp/Z1` using a `zfs send -R`.
But then (months later), when I try to create (and replicate) a newer dataset called `pri_zp/Z1/Z99-future`, the replication to `sec_zp/Z1` fails.
How do I add a fresh sub-dataset `pri_zp/Z1/Z99-future` and enable its recursive replication?
Below is an example trace of the `zfs` commands that demonstrates the hiccup I encountered.
**A. Setup Z1 base with Z00-initial:**
# zfs create pri_zp/Z1
# zfs create pri_zp/Z1/Z00-initial
# zfs snapshot -r pri_zp/Z1@ssA
# zfs send -V -R pri_zp/Z1@ssA | zfs receive -d sec_zp
full send of pri_zp/Z1@ssA estimated size is 79.1K
full send of pri_zp/Z1/Z00-initial@ssA estimated size is 78.6K
total estimated size is 158K
**B. Send the next snapshot, ssB:**
# zfs snapshot -r pri_zp/Z1@ssB
# zfs send -V -R -I pri_zp/Z1@ss{A,B} | zfs receive -d sec_zp
send from @ssA to pri_zp/Z1@ssB estimated size is 624B
send from @ssA to pri_zp/Z1/Z00-initial@ssB estimated size is 624B
total estimated size is 1.22K
**C. Start integration of Z99-future:**
# zfs create pri_zp/Z1/Z99-future
# zfs snapshot -r pri_zp/Z1@ssC
# zfs send -V -I pri_zp/Z1@ss{B,C} | zfs receive -d sec_zp
send from @ssB to pri_zp/Z1@ssC estimated size is 61.1K
total estimated size is 61.1K
Everything works fine up to here. But now I need to send the initial ssC snapshot of `pri_zp/Z1/Z99-future@ssC`. (It's not an incremental.)
**This next zfs command probably triggers the problem, but it cannot be done recursively:**
# zfs send -V pri_zp/Z1/Z99-future@ssC | zfs receive -d sec_zp
**D. Here the problem manifests itself:**
# zfs snapshot -r pri_zp/Z1@ssD
# zfs send -V -R -I pri_zp/Z1@ss{C,D} | zfs receive -d sec_zp
send from @ssC to pri_zp/Z1@ssD estimated size is 624B
send from @ssC to pri_zp/Z1/Z99-future@ssD estimated size is 624B
send from @ssC to pri_zp/Z1/Z00-initial@ssD estimated size is 624B
total estimated size is 1.83K
cannot receive incremental stream: destination sec_zp/Z1 has been modified
since most recent snapshot
**The specific dataset change caused by** `zfs send -V pri_zp/Z1/Z99-future@ssC`:
# zfs diff sec_zp/Z1@ssC
+ /sec_zp/Z1/Z99-future
M /sec_zp/Z1/
- /sec_zp/Z1/Z99-future
- /sec_zp/Z1/Z99-future/
- /sec_zp/Z1/Z99-future//security.selinux
Also:
# zfs list -r -t all -S creation sec_zp/Z1
NAME USED AVAIL REFER MOUNTPOINT
sec_zp/Z1/Z99-future 96K 1.22T 96K /sec_zp/Z1/Z99-future
sec_zp/Z1@ssC 64K - 104K -
sec_zp/Z1/Z99-future@ssC 0B - 96K -
sec_zp/Z1@ssB 0B - 96K -
sec_zp/Z1/Z00-initial@ssB 0B - 96K -
sec_zp/Z1/Z00-initial 96K 1.22T 96K /sec_zp/Z1/Z00-initial
sec_zp/Z1 416K 1.22T 96K /sec_zp/Z1
sec_zp/Z1@ssA 0B - 96K -
sec_zp/Z1/Z00-initial@ssA 0B - 96K
Any hints are welcome.
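For what it's worth, the error at step D is ZFS's generic complaint that the receive side has local changes newer than its last snapshot. A hedged sketch of the two usual ways past it (stated generally, not as a verified fix for this exact trace): roll the destination back to the last snapshot it received, or let `zfs receive -F` do that rollback for you.
# option 1: roll sec_zp/Z1 back to its last received snapshot, then resend the increment
zfs rollback -r sec_zp/Z1@ssC
zfs send -V -R -I pri_zp/Z1@ss{C,D} | zfs receive -d sec_zp
# option 2: let the receive force that rollback itself
zfs send -V -R -I pri_zp/Z1@ss{C,D} | zfs receive -F -d sec_zp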
NevilleDNZ
(250 rep)
Jul 17, 2024, 02:54 AM
• Last activity: Jul 17, 2024, 11:07 AM
0
votes
2
answers
956
views
Help removing a failed replica from a FreeIPA setup
I have two FreeIPA servers in my system: ns-1 and ns-2. To my limited knowledge, ns-1 is our main IPA server and ns-2 was set up as a replica, but I may be incorrect in that regard.
In my attempts to upgrade the OS on ns-2, the upgrade failed somewhere in the middle and now the machine is toast. ns-1 is still operating fine as I was holding off on upgrading that machine until ns-2 was complete.
I blew away ns-2 and rebuilt a new VM in its place and now want to set it up as the new ns-2 replacement. The problem though is that ns-1 still has a record of the original ns-2 and is preventing the ipa-replica-install command from succeeding on my new ns-2.
In ns-1's Web UI, it still lists ns-2 as an ipa server and displays ns-2 in the topology graph.
From the ns-1 machine I've issued the following commands:
# ipa-replica-manage list
ns-2..: master
ns-1..: master
# ipa-replica-manage del --force --cleanup ns-2..
Updating DNS system records
Not allowed on non-leaf entry
# ldap_delete -x -h 127.0.0.1 -D 'cn=directory manager' -w 'cn=ns-2..,cn=masters,cn=ipa,cn=etc,dc=.'
ldap_delete: Operation not allowed on non-leaf (66)
additional info: Entry has replication conflicts as children
# ipa-replica-manage dnsrange-show
ns-2..: Connection failed: cannot connect to 'ldaps://ns-2..:636': Transport endpoint is not connected
On my new ns-2 machine I've run the ipa-client-install command successfully. Then I ran "ipa-replica-install --setup-dns --setup-ca --no-forwarders -P ". It fails because the ns-1 machine appears to believe that there's already an ns-2 machine defined.
I've found the following thread, which appears to describe the same problem, but no resolution is included:
https://www.spinics.net/linux/fedora/fedora-users/msg498296.html
I've tried following this documentation, but it does not explain how to resolve replicas that have "children":
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/identity_management_guide/ipa-replica-manage#repl-conflicts
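A hedged sketch of how the "replication conflicts as children" situation is usually inspected in the underlying 389-ds directory (the suffix, DN and hostname below are placeholders; conflict entries carry an nsds5ReplConflict attribute and have to be deleted before their parent can go):
# list replication-conflict entries anywhere under the suffix (example suffix)
ldapsearch -x -D 'cn=Directory Manager' -W -b 'dc=example,dc=com' \
    '(nsds5ReplConflict=*)' dn nsds5ReplConflict
# delete each conflict child (the DN below is only a placeholder), then retry the cleanup
ldapdelete -x -D 'cn=Directory Manager' -W \
    'nsuniqueid=<id>+cn=ns-2.example.com,cn=masters,cn=ipa,cn=etc,dc=example,dc=com'
ipa-replica-manage del --force --cleanup ns-2.example.com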
dutsnekcirf
(179 rep)
Dec 14, 2022, 04:49 PM
• Last activity: Jun 3, 2024, 08:28 AM
30
votes
6
answers
79184
views
how to one-way mirror an entire zfs pool to another zfs pool
I have one zfs pool containing several zvols and datasets of which some are also nested.
All datasets and zvols are periodically snapshotted by zfs-auto-snapshot.
All datasets and zvols also have some manually created snapshots.
I have set up a remote pool on which, due to lack of time, the initial copy over a local high-speed network via `zfs send -R` did not complete (some datasets are missing, some datasets have outdated or missing snapshots).
Now the pool is physically remote over a slow connection and I need to periodically sync the remote pool with the local pool: data present in the local pool must be copied to the remote pool, and data gone from the local pool (or present only in the remote pool) must be deleted from the remote pool, where "data" means zvols, datasets or snapshots.
If I was doing this between two regular filesystems using rsync, it would be "-axPHAX --delete" (that's what I actually do to back up some systems).
How do I set up a synchronizing task so that the remote pool's zvols & datasets (including their snapshots) stay in sync with the local zvols, datasets & snapshots?
I would like to avoid transferring over ssh because of its low throughput; I'd prefer mbuffer or iscsi instead.
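On the transport point specifically, a raw-TCP mbuffer pipe is the usual way to take ssh out of the bulk transfer; a minimal sketch (host name, port and pool/snapshot names are placeholders):
# on the receiving host: listen on a plain TCP port and feed zfs receive
mbuffer -s 128k -m 1G -I 9090 | zfs receive -F -d -u remotepool
# on the sending host: stream an incremental replication package to that port
zfs send -R -I localpool@lastsync localpool@now | mbuffer -s 128k -m 1G -O remotehost:9090
Tools such as syncoid or zrepl script the snapshot bookkeeping around exactly this kind of pipe.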
Costin Gușă
(629 rep)
Feb 16, 2016, 06:34 PM
• Last activity: Sep 18, 2022, 06:48 AM
0
votes
1
answers
1170
views
Create multiple file from single txt file
I have a single input file that needs to be replicated multiple times, with one small change in each copy: line 30 of every file is different.
For example:
My main text file is named D0.txt, and I need to replicate it a number of times; in it, line 30 is "variable D1 equal 0.0".
The first time I replicate this file, I would like to change that string to "variable D1 equal 1" and then save the file as D1.txt.
Similarly, I would like to create, say, 5 files, and then I would like to loop it 100 times, so the files are saved in a folder as
D0.txt
D1.txt
D2.txt and so on, each with its own line 30 along the lines of "Diameter = $n".
A sample text file format is attached below, along with the desired format of the files in the folder.
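A minimal sketch of the usual approach, assuming the change really is confined to line 30; the replacement text and the 1..100 range are examples to adjust:
#!/bin/bash
# write D1.txt .. D100.txt from D0.txt, rewriting only line 30 of each copy
for n in $(seq 1 100); do
    sed "30s/.*/variable D1 equal $n/" D0.txt > "D$n.txt"
done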


Yash54
(19 rep)
Mar 24, 2022, 09:44 AM
• Last activity: Mar 24, 2022, 10:35 AM
3
votes
1
answers
12066
views
ZFS send/recv full snapshot
I have been backing up my ZFS pool on Server A to Server B (the backup server) via `zfs send/recv`, using daily incremental snapshots.
Server B acts as a backup server, holding 2 pools for Server A and Server C respectively (`zfs41` and `zfs49/tank`).
Due to hardware issues, the ZFS pool in Server A is now gone - and I want to restore/recover it asap.
Currently the snapshot list in my Server B is as follows :
NAME USED AVAIL REFER MOUNTPOINT
zfs41@2021Nov301205 14.9G - 3.74T -
zfs41@2021Dec011205 3.87G - 3.74T -
zfs41@2021Dec021205 3.77G - 3.74T -
zfs41@2021Dec031205 0B - 3.74T -
zfs49/tank@2021Nov301705 368G - 3.52T -
zfs49/tank@2021Dec011705 65.2G - 3.52T -
zfs49/tank@2021Dec021705 66.4G - 3.52T -
zfs49/tank@2021Dec031705 0B - 3.52T -
where `zfs49/tank@2021Dec031705` is the latest on Server B.
I would like to send the whole pool (including the snapshots) back to Server A, but I'm unsure of the exact command to run.
Question: On Server B, will doing `zfs send zfs49/tank@2021Dec031705 | ssh zfs recv tank` be sufficient to receive the full ZFS pool + all the snapshots (so I can continue incremental send/recv backups) on Server A?
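For comparison, a plain `zfs send snapshot` stream carries only that one snapshot's data, not the earlier snapshots; a replication stream is what carries the whole history. A sketch of what that usually looks like (the destination host name and target dataset are placeholders):
# on Server B: -R builds a replication stream containing every snapshot up to the
# one named; -F on Server A lets the receive recreate/roll back "tank" as needed
zfs send -R zfs49/tank@2021Dec031705 | ssh serverA zfs receive -F tank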
LooseAlien123
(31 rep)
Dec 5, 2021, 03:39 PM
• Last activity: Dec 6, 2021, 12:33 AM
1
votes
1
answers
288
views
(Almost-)Atomic way to merge 2 folders
I'm exploring the different ways to add some consistency to a file deployment operation.
The current situation is:
*A `current` version folder which contains approximately 100K different files*
- /current
- /path1
- file1
- file2
- ...
- /path2
- fileX
- ...
*An `update` folder which contains around 100 files*
- /update
- /path1
- file1
- /path2
- fileX
- ...
The final goal is to send all the files from the `update` folder to the `current` folder, and I insist on the "all": either none of the files should be replicated if there's an error during the operation, or all the files should be deployed, in order to flag the operation as successful.
In an ideal world, the scenario I'm looking for would be an "atomic" rsync which would return either a failure or a success code depending on what happened during the operation, and would ensure that the system instantly sees the `current` directory (after the `rsync` operation) as the newer version (= no intermediary state during the copy because of a potential power cut or whatsoever).
From my understanding, atomic operations are not available on most UNIX systems, so I can consider that the ideal case will clearly not be reached. I'm trying to approximate this behavior as much as possible.
I explored different solutions for this:
- `cp -al` to mirror the `current` directory to a `tmp` directory, then copy all the files from the `update` directory into it, then remove `current` and rename `tmp` to `current`.
- `rsync` (so far the most pertinent), using the `--link-dest` option in order to create an intermediary folder with hard links to the `current` directory's files. Basically the same as the previous case, but probably much cleaner as it doesn't require any `cp`.
- `atomic-rsync`: I encountered an existing Perl script, atomic-rsync, which supposedly does this kind of operation, but which results in only taking into account the files present in the `update` directory and getting rid of the rest of the `current` folder's files.
Both solutions seem to work but I have no confidence in using any of them in a real production use case, the problem being that it might be very slow or somehow costly/useless to create 100K hard links.
I also know that a very consistent solution would be to use snapshots, and there are plenty of options for that, but it's not acceptable in my case due to the disk size (~70GB, and the `current` folder already takes ~60GB).
I ran out of options in my knowledge, would there be any (better) way to achieve the expected goal?
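One near-atomic pattern worth putting beside the options above is to keep `current` as a symlink to a versioned directory and flip that symlink with rename(), which is atomic on a single filesystem. A sketch with made-up directory names:
#!/bin/sh -e
# assumes "current" is (or will become) a symlink to a versioned tree, e.g. releases/v1
cp -al releases/v1 releases/v2      # hard-link copy of the ~100K existing files
rsync -a update/ releases/v2/       # overlay the ~100 new/changed files
ln -sfn releases/v2 current.tmp     # build the new symlink next to the old one
mv -T current.tmp current           # rename() over the old symlink: the switch is atomic
Because rsync writes changed files to a temporary name and renames them into place, the hard-linked copies in releases/v1 are left untouched.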
Bil5
(113 rep)
Jun 20, 2021, 10:27 AM
• Last activity: Jun 21, 2021, 04:33 AM
1
votes
1
answers
309
views
Automating password changes
I'm trying to automate password changes on 36+ servers because doing it the manual way is ridiculous and annoying. Basically, I can run a 'grep -ir password' on one of my Linux host servers and see how many servers respond to the query. I can then go to each one of them, cd into the proper directories and locations, run an update as ':1+$s1password+newpassword+g ' and then save the change. However, this is very tedious, and I have to multiply it by how many directories responded to the query and then by 36+ servers.
So, can someone please assist if you know how to accomplish my inquiry? The only thing I can think of, though I don't know if it's right, would be to do something like this...
grep -ir password | vi *directory/file | :1+$s1password+newpassword+g && :wq
OR MAYBE I should be thinking of this as a bash script that needs to be created.
Again, your solutions will be much appreciated, thank you!
Regards,
CG
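A hedged sketch of the usual shape of such a sweep (the host list, the /etc/myapp path and the two strings are placeholders; special characters in the real passwords would need escaping for sed):
#!/bin/bash
# replace an old password string with a new one on every host listed in hosts.txt
old='oldpassword'
new='newpassword'
while read -r host; do
    echo "== $host"
    ssh "$host" "grep -rl -- '$old' /etc/myapp | xargs -r sed -i 's/$old/$new/g'"
done < hosts.txt
For anything beyond a one-off, a configuration-management tool (Ansible and the like) is the more robust way to push this kind of change.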
user477924
(11 rep)
Jun 17, 2021, 10:04 PM
• Last activity: Jun 17, 2021, 10:27 PM
1
votes
0
answers
43
views
Is there a Bash command or series of commands which will identify and remove replicated directories?
I'm working on developing a script which will, inside of a directory, verify the existence of a sub-directory, then locate and delete replicated copies in the same directory. For example:
A directory has the following sub-directory added:
FLDR6544_8765
Other copies of the folder are replicated, such as:
FLDR6544_8765-0
FLDR6544_8765-1
FLDR6544_8765-2
Is there a command or scripted series of commands which will verify the existence of the original folder (in the example, FLDR6544_8765), then remove the replicated folders (FLDR6544_8765-0, FLDR6544_8765-1, FLDR6544_8765-2)?
I can use
find -type d -name "*-0" -exec rm -r {} \; -prune
to find the directories with the replicated endings (-0, -1, etc.), but that doesn't ensure the original folder exists before deleting the replicated copies.
Thanks in advance for any ideas you might have!
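A minimal sketch of the "check the original first" logic in plain bash (the FLDR glob pattern is taken from the question and assumes the copies always end in a dash plus digits):
#!/bin/bash
# delete FLDRxxxx_yyyy-N copies only when the matching original FLDRxxxx_yyyy exists
shopt -s nullglob
for copy in FLDR*_*-[0-9]*; do
    [ -d "$copy" ] || continue
    original=${copy%-*}          # strip the trailing -N suffix
    if [ -d "$original" ]; then
        rm -r -- "$copy"
    fi
done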
William Hauk
(11 rep)
Apr 16, 2021, 02:45 PM
• Last activity: Apr 16, 2021, 02:49 PM
3
votes
1
answers
9560
views
DRBD Failure: (127) Device minor not allocated
I use VMware Workstation to run two virtual machines with OpenVZ 2.6.32-042stab108.2 installed on top of CentOS 6.6. I have created another primary partition, /dev/sda4, to configure as a DRBD resource, and I also created a filesystem on it. The second machine was actually created from the virtual disk of the first one, with the hostname and eth0 IP address changed. The DRBD configuration file is this:
global { usage-count no; }
common { syncer { rate 100M; } }
resource r0 {
protocol C;
startup {
wfc-timeout 15;
degr-wfc-timeout 60;
}
net {
cram-hmac-alg sha1;
shared-secret "password";
}
on primary {
device /dev/drbd0;
disk /dev/sda4;
address 192.168.18.10:7788;
meta-disk internal;
}
on secondary {
device /dev/drbd0;
disk /dev/sda4;
address 192.168.18.20:7788;
meta-disk internal;
}
}
After creating the resource with **drbdadm create-md r0**, when I enter **service drbd start**, I get:
Failure: (127) Device minor not allocated.
The output of **drbdadm dump all** might be helpful:
[root@primary ~]# drbdadm dump all
# /etc/drbd.conf
# resource r0 on primary: not ignored, not stacked
resource r0 {
protocol C;
on primary {
device /dev/drbd0 minor 0;
disk /dev/sda4;
address ipv4 192.168.18.10:7788;
meta-disk internal;
}
on secondary {
device /dev/drbd0 minor 0;
disk /dev/sda4;
address ipv4 192.168.18.20:7788;
meta-disk internal;
}
net {
cram-hmac-alg sha1;
shared-secret danuts;
}
startup {
wfc-timeout 15;
degr-wfc-timeout 60;
}
}
What is causing this error and how can it be mitigated? Thanks!
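Not a definitive diagnosis, but the usual first checks for "Device minor not allocated" are whether the drbd kernel module is loaded on this node and whether the resource has actually been brought up; a sketch:
# is the kernel module loaded at all?
lsmod | grep -q drbd || modprobe drbd
# bring the resource up by hand (attach the backing disk, allocate the minor, connect)
drbdadm up r0
# per-minor state; if nothing is listed, no minor has been allocated yet
cat /proc/drbd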
Tanatos Daniel
(295 rep)
Jun 8, 2015, 11:52 PM
• Last activity: Jan 16, 2021, 04:04 PM
1
votes
1
answers
736
views
ZFS send/receive whilst changing dataset properties
**Summary**: Does a ZFS send/receive always produce an exact replica of the sending dataset - matching dataset properties included - or is it possible to receive into a newly created dataset with different properties (recordsize, compression etc.) that the received data (after checksum verification etc.) is then written with?
For example, suppose I receive a (non-incremental) dataset whose objects were initially written with recordsize=128K into a dataset newly created by `zfs receive` that either specifies the option -o recordsize=1M or inherits from a parent dataset with a current value of recordsize=1M. Will the objects in my new dataset be written with a recordsize of 1M, or must the objects be written as an exact "replica" of the sending dataset?
zfs get recordsize tank/files # returns: 128K
# Set parent of receiving dataset to 1M
zfs set recordsize=1M freezer
# Also set at receive time
zfs send tank/files@transfer | zfs receive -o recordsize=1M freezer/files
My question is essentially:
# Query recordsize of objects as written in freezer
???
I understand that checking the recordsize of the new dataset isn't itself an answer, as this would just report what the dataset property is _currently_ set to; it doesn't say anything about the recordsize used for any previous writes. I've tried to examine the objects directly with something like `zdb -dd freezer`, but my zfs version (on FreeBSD 9) seems not to accept this.
`man zfs` suggests that only 'set-once' properties must match (casesensitivity, normalization, utf8only), but it's not clear to me how objects are written in the general case. I've also checked the docs and Google but can't seem to get an explicit answer on this behaviour without getting into studying zfs internals.
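On the "how do I check what was actually written" part, the per-object dump from zdb is the usual tool; a sketch (flag handling differs between ZFS releases, which may be why it failed on FreeBSD 9):
# per-object dump of the received dataset; each object line carries a "dblk"
# field, the data block size that object was actually written with
zdb -ddddd freezer/files | head -n 60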
wardw
(396 rep)
Oct 2, 2020, 05:39 PM
• Last activity: Oct 2, 2020, 06:01 PM
0
votes
1
answers
17
views
How to change journal attribute in Hitachi F900
To configure **Hitachi Universal Replication**, I need to change the journal attribute from **initial** to **restore**.
How can I do this?
Dexpras
(99 rep)
Jun 24, 2020, 03:01 PM
• Last activity: Jun 24, 2020, 05:42 PM
0
votes
0
answers
29
views
Data syncronization
I am planning to provision a medium-scale NAS server at my site and at the DR location. I expect this to be around 20TB. Mostly this will have a fileserver-type workload, with few updates to existing files. My issue is how to keep the 2 locations in sync. While CPU is not a big concern and bandwidth may become a problem, the biggest worry is latency: I want to ensure that the DR site is as up-to-date as practical, and would prefer something where I can measure the amount of data waiting to be replicated.
While I could simply run unison or rsync on the filesystem(s), it is my understanding that this simply polls for updates to replicate rather than capturing and enqueueing the writes as they happen. DRBD would be more efficient in terms of CPU and bandwidth, but I can't see how I would be able to have the DR filesystem "live" (writeable) at the same time as the production site (and I don't see how to measure the lag).
Should I be looking at AFS / InterMezzo? Any pointers to case studies would be appreciated.
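On the "measure the backlog" point, one low-tech option with an rsync-based setup is a dry run with statistics before each real run; a sketch with placeholder paths and host:
# how much data would the next sync move?
rsync -a --delete --dry-run --stats /srv/nas/ dr-site:/srv/nas/ \
    | grep -E 'transferred|deleted'
An inotify-driven wrapper such as lsyncd can then trigger the real runs on change rather than on a timer.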
symcbean
(6301 rep)
Nov 3, 2019, 07:07 PM
• Last activity: Nov 3, 2019, 08:26 PM
1
votes
0
answers
102
views
differences between two rsync commands
I'm following the postgresql documentation for upgrading a cluster of 2 machines - one primary and one secondary (with replication). According to the docs, I need to upgrade only the primary and then I can run rsync on the primary to sync the secondary. The command that is mentioned in the docs:
rsync --archive --delete --hard-links --size-only --no-inc-recursive primary_parent_of_old_data_dir primary__parent_of_new_data_dir secondary_remote_parent_of_old_data_dir
I wanted to run the rsync command from the secondary, therefore I used the following command :
rsync --archive --hard-links --size-only --verbose --human-readable --no-inc-recursive --delete root@primaryIP:primary_parent_of_new_data_dir secondary_parent_of_old_data_dir
I wanted to ask what is the difference between the commands. Why do I need to specify 3 directories in the first command and not just two?
JeyJ
(139 rep)
Jul 31, 2019, 07:14 AM
• Last activity: Jul 31, 2019, 07:45 AM
1
votes
0
answers
59
views
OpenLDAP cluster: deletion of an entry is not replicated
We have an OpenLDAP 2.4 cluster of three nodes configured in multi-master and accessed through a VIP in round-robin. The three machines run RHEL7.
We noticed that deletion of an entry (done from a Windows machine onto the first node via Oracle's tool `ldapmodify.exe`) is not replicated in the cluster, i.e. the entry is not deleted from the second and third node.
Here's the relevant extract of `cn=config` for the first node:
olcSyncrepl: {0}rid=001 provider=ldap://mynode2:389/ bindmethod=simple
binddn="cn=Replicator,dc=mydomain,dc=org" credentials=1234567890 searchbase="dc=mydomain,dc=org" scope=sub schemachecking=on type=refreshAndPersist
retry="30 5 300 +" keepalive="60:5:10"
olcSyncrepl: {1}rid=002 provider=ldap://mynode3:389/ bindmethod=simple
binddn="cn=Replicator,dc=mydomain,dc=org" credentials=1234567890 searchbase="dc=mydomain,dc=org" scope=sub schemachecking=on type=refreshAndPersist
retry="30 5 300 +" keepalive="60:5:10"
olcMirrorMode: TRUE
Why does this happen, and what could be done to fix it, apart from running `ldapmodify` on all three nodes (which we'd like to avoid)?
EDIT: After a few days we noticed that the cluster was in sync again. We looked for the offending entry (`thisentry`) in all nodes' logs and found this line on mynode3:
Jun 18 14:18:20 mynode3 slapd: conn=1987936 op=14 DEL dn="dc=thisentry,ou=myou,ou=foobars,dc=mydomain,dc=org"
There are no references to `thisentry` (apart from SRCH operations) on node1 and node2, even though the entry was originally deleted from node1, as said above.
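When chasing this kind of divergence, comparing the suffix's contextCSN on all providers is the standard quick check; a sketch (bind credentials omitted, adjust to your ACLs):
# an in-sync multi-master cluster shows identical contextCSN values on every node
for h in mynode1 mynode2 mynode3; do
    echo "== $h"
    ldapsearch -x -H "ldap://$h" -s base -b "dc=mydomain,dc=org" contextCSN
done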
dr_
(32068 rep)
Jun 14, 2019, 09:39 AM
• Last activity: Jun 24, 2019, 12:42 PM
1
votes
0
answers
30
views
Currently I have a two-node cluster (SLES as HANA DB). Under which conditions will a failover be triggered?
Below are the conditions. We could not test these because these nodes are already in production.
1. Replication between the nodes fails.
2. Replication is working but the network link is down, with no bonding.
3. Replication is working but one of the NICs in the bond has failed.
4. Replication is working but the management IP is down.
5. Replication is working but the backup IP is down.
6. Replication is working but the DB service is stopped/restarted.
7. Replication is working but the server shuts down unexpectedly.
Fabian19
(11 rep)
Jun 14, 2019, 01:43 AM
0
votes
1
answers
1402
views
How to verify which is primary node in a HANA DB cluster operating on SLES 12?
Currently I have two HANA DB servers in a cluster and replication is not running. Upon checking from SUSE Hawk, node 2 is acting as the primary node and node 1 is available. How do I confirm that node 2 is primary?
[![enter image description here][1]][1]
[1]: https://i.sstatic.net/5ZtvI.png
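A few hedged ways this is usually checked outside Hawk (command names come from the SLES SAPHanaSR stack and HANA itself; exact options vary between versions):
# cluster view: which node currently holds the promoted (master) SAPHana resource
crm_mon -1r
# SAPHanaSR helper: prints the per-node sync/role attributes the cluster tracks
SAPHanaSR-showAttr
# on the DB node itself, as the <sid>adm user: this site's system replication role
hdbnsutil -sr_state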

Fabian19
(11 rep)
Jun 11, 2019, 08:43 AM
• Last activity: Jun 11, 2019, 09:12 AM
2
votes
1
answers
1044
views
Cannot replicate DNS zone data from master on LAN to slave behind firewall in a DMZ
Here is some information about my setup. I have a master DNS server in my LAN subnet running on a Ubuntu 16.04 box. In addition, I have some slave DNS servers on my various other subnets (DMZ subnet, service subnet, etc.). All DNS slave servers run different kinds of Linux.
Since my master DNS server must know several different subnets, it is set up as split DNS / split horizon.
My firewall defines three zones: LAN, WAN, and DMZ. For safety reasons, no connection from DMZ to LAN can be initiated. The connection must be initiated from the LAN subnet. Such is by policy and I do not want to change it.
# Technical information about relevant servers: #
Master DNS on my LAN subnet:
OS: Ubuntu 16.04
Hostname: master.lan.mydomain.dk
IP: 192.168.1.4 255.255.255.0
Slave DNS on DMZ subnet:
OS: Debian 9
Hostname: tools.dmz.mydomain.dk
IP: 172.16.1.4 255.255.255.0
So far, my split-horizon setup works fine on my master server. But I cannot replicate between the master and slave servers: there is no transfer of zone files.
## Here are the relevant setup files: ##
### named.conf from master DNS server: ###
key "rndc-key" {
algorithm hmac-md5;
secret "w26wwSa7rJB04IsuW99kGQ==";
};
controls {
inet 127.0.0.1 port 953
allow { 127.0.0.1; } keys { "rndc-key"; };
};
include "/etc/bind/named.conf.logging";
include "/etc/bind/named.conf.keys";
include "/etc/bind/named.conf.options";
include "/etc/bind/named.conf.local";
### named.conf.keys from master DNS server: ###
Key definitions are placed in a separate file, so they can be easily updated via rsync.
key lan-key {
algorithm HMAC-MD5;
secret AaEjmxhg3WT2;
};
key dmz-key {
algorithm HMAC-MD5;
secret BEhp4DeLnX4u;
};
key service-key {
algorithm HMAC-MD5;
secret 7rP4CN3Km2QT;
};
key management-key {
algorithm HMAC-MD5;
secret gNsRz2H7AxLH;
};
key update-key {
algorithm HMAC-MD5;
secret B88bqW33Fuap;
};
### named.conf.local from master DNS server: ###
//
// Do any local configuration here
//
// Keys are defined in /etc/bind/named.conf.keys
//
acl lan-subnet {
!key dmz-key;
!key service-key;
!key management-key;
key lan-key;
127.0.0.0/8;
192.168.1.0/24;
};
acl dmz-subnet {
!key lan-key;
!key service-key;
!key management-key;
key dmz-key;
172.16.1.0/24;
};
acl service-subnet {
!key lan-key;
!key dmz-key;
!key management-key;
key service-key;
192.168.128.0/24;
};
acl management-subnet {
!key lan-key;
!key dmz-key;
!key service-key;
key management-key;
10.21.12.0/24;
};
view "internal" {
match-clients { lan-subnet; };
allow-recursion { any; };
allow-transfer { key lan-key; };
allow-update { key update-key; };
// prime the server with knowledge of the root servers
zone "." {
type hint;
file "/etc/bind/db.root";
};
// be authoritative for the localhost forward and reverse zones, and for
// broadcast zones as per RFC 1912
zone "localhost" {
type master;
file "/etc/bind/db.local";
};
zone "127.in-addr.arpa" {
type master;
file "/etc/bind/db.127";
};
zone "0.in-addr.arpa" {
type master;
file "/etc/bind/db.0";
};
zone "255.in-addr.arpa" {
type master;
file "/etc/bind/db.255";
};
zone "lan.mydomain.dk" {
type master;
file "/etc/bind/internals/db.lan.mydomain.dk"; # zone file path
also-notify { 192.168.1.5 key lan-key; };
notify yes;
};
zone "1.168.192.in-addr.arpa" {
type master;
file "/etc/bind/internals/db.192.168.1-rev";
also-notify { 192.168.1.5 key lan-key; };
notify yes;
};
zone "dmz.mydomain.dk" {
type master;
file "/etc/bind/internals/db.dmz.mydomain.dk"; # zone file path
also-notify {
192.168.1.5 key lan-key;
172.16.1.4 key dmz-key;
172.16.1.5 key dmz-key;
127.0.0.1 key dmz-key;
};
notify yes;
};
zone "1.16.172.in-addr.arpa" {
type master;
file "/etc/bind/internals/db.172.16.1-rev";
also-notify {
192.168.1.5 key lan-key;
172.16.1.4 key dmz-key;
172.16.1.5 key dmz-key;
127.0.0.1 key dmz-key;
};
notify yes;
};
zone "service.mydomain.dk" {
type master;
file "/etc/bind/internals/db.service.mydomain.dk"; # zone file path
also-notify {
192.168.1.5 key lan-key;
192.168.1.10 key service-key;
192.168.1.11 key service-key;
127.0.0.1 key service-key;
};
notify yes;
};
zone "128.168.192.in-addr.arpa" {
type master;
file "/etc/bind/internals/db.192.168.128-rev";
also-notify {
192.168.1.5 key lan-key;
192.168.1.10 key service-key;
192.168.1.11 key service-key;
127.0.0.1 key service-key;
};
notify yes;
};
zone "management.mydomain.dk" {
type master;
file "/etc/bind/internals/db.management.mydomain.dk"; # zone file path
also-notify {
192.168.1.5 key lan-key;
10.21.12.4 key management-key;
127.0.0.1 key management-key;
};
notify yes;
};
zone "12.21.10.in-addr.arpa" {
type master;
file "/etc/bind/internals/db.10.21.12-rev";
also-notify {
192.168.1.5 key lan-key;
10.21.12.4 key management-key;
127.0.0.1 key management-key;
};
notify yes;
};
};
view "externals" {
match-clients { any; };
allow-recursion { none; };
allow-transfer { key dmz-key; };
zone "dmz.mydomain.dk" {
type slave;
masters { 127.0.0.1 key lan-key; };
file "/etc/bind/externals/db.dmz.mydomain.dk"; # zone file path
also-notify { 192.168.1.5 key dmz-key; };
};
zone "1.16.172.in-addr.arpa" {
type slave;
masters { 127.0.0.1 key lan-key; };
file "/etc/bind/externals/db.172.16.1-rev";
also-notify { 192.168.1.5 key dmz-key; };
};
};
view "services" {
match-clients { service-subnet; };
allow-recursion { none; };
allow-transfer { key service-key; };
zone "service.mydomain.dk" {
type slave;
masters { 127.0.0.1 key lan-key; };
file "/etc/bind/services/db.service.mydomain.dk"; # zone file path
also-notify { 192.168.1.5 key service-key; };
};
zone "128.168.192.in-addr.arpa" {
type slave;
masters { 127.0.0.1 key lan-key; };
file "/etc/bind/services/db.192.168.128-rev";
also-notify { 192.168.1.5 key service-key; };
};
};
view "management" {
match-clients { management-subnet; };
allow-recursion { none; };
allow-transfer { key management-key; };
zone "management.mydomain.dk" {
type slave;
masters { 127.0.0.1 key lan-key; };
file "/etc/bind/management/db.management.mydomain.dk"; # zone file path
also-notify { 192.168.1.5 key management-key; };
};
zone "12.21.10.in-addr.arpa" {
type slave;
masters { 127.0.0.1 key lan-key; };
file "/etc/bind/management/db.10.21.12-rev";
also-notify { 192.168.1.5 key management-key; };
};
};
### db.dmz.mydomain.dk from master DNS server: ###
$TTL 604800
@ IN SOA ns1.dmz.mydomain.dk. root.lan.mydomain.dk. (
2018102001 ; Serial
604800 ; Refresh
86400 ; Retry
2419200 ; Expire
604800 ) ; Negative Cache TTL
; name and mail servers - NS records
@ IN NS ns1.dmz.mydomain.dk.
IN NS ns2.dmz.mydomain.dk.
IN MX 10 proxymail.dmz.mydomain.dk.
IN A 172.16.1.4
; name servers - A records
ns1.dmz.mydomain.dk. IN A 172.16.1.4
ns2.dmz.mydomain.dk. IN A 172.16.1.5
; 172.16.1.0/24 - A records
fwdmz.dmz.mydomain.dk. IN A 172.16.1.2
tools.dmz.mydomain.dk. IN A 172.16.1.4
x3690.vmhost.dmz.mydomain.dk. IN A 172.16.1.20
x3650.vmhost.dmz.mydomain.dk. IN A 172.16.1.21
wwwgate.dmz.mydomain.dk. IN A 172.16.1.30
proxymail.dmz.mydomain.dk. IN A 172.16.1.40
### named.conf.local from slave DNS server:
zone "dmz.mydomain.dk" {
type slave;
file "/etc/bind/slaves/db.dmz.mydomain.dk";
masters { 172.16.1.1 key dmz-key; };
};
zone "1.16.172.in-addr.arpa" {
type slave;
file "/etc/bind/slaves/db.172.16.1-rev";
masters { 172.16.1.1 key dmz-key; };
};
As can be seen from the above, I have set the master IP address to 172.16.1.1, which is the gateway address of the DMZ subnet. The firewall rewrites any LAN address to the DMZ gateway address plus a random port number, so it does not make sense to use the master server's LAN IP address here, as it is never allowed to pass through the firewall.
On the slave server there is the following error message:
"zone dmz.mydomain.dk/IN: refused notify from non-master: 172.16.1.1#47161".
So I can understand why the error message appears: I only specified that the master server is 172.16.1.1, not 172.16.1.1#47161.
So how do I get Bind9 on the slave server to accept that it's not just an IP address but an IP address and a random port number?
Thanks in advance.
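For the record, BIND matches NOTIFY on the source address (and TSIG key, if any), never on the source port, so the #47161 part is not the issue in itself. The knob that widens which addresses a slave accepts NOTIFY from is allow-notify; a sketch of what that could look like on the slave, under the assumption that the NATted source really is 172.16.1.1 (not offered as a verified fix for this exact setup):
zone "dmz.mydomain.dk" {
    type slave;
    file "/etc/bind/slaves/db.dmz.mydomain.dk";
    masters { 172.16.1.1 key dmz-key; };
    // illustrative: explicitly accept NOTIFY from the NATted master address
    allow-notify { 172.16.1.1; };
};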
Søren Sjøstrøm
(45 rep)
Oct 21, 2018, 06:06 PM
• Last activity: Oct 21, 2018, 09:34 PM
Showing page 1 of 20 total questions