Unix & Linux Stack Exchange
Q&A for users of Linux, FreeBSD and other Unix-like operating systems
Latest Questions
2
votes
1
answers
40
views
How to obtain a RHEL 8 repo that contains ceph-fuse and install it?
We have planned a Ceph upgrade to Reef 18.2.7, and the release notes say that client kernel drivers older than kernel 5.4 are not supported.
There is one RHEL 8 machine with kernel 4.18. I tried to mount CephFS anyway, and it really doesn't work.
I get the message:
mount: /mnt: wrong fs type, bad option, bad superblock on 192.168.22.101,192.168.22.100,192.168.22.102:/backup, missing codepage or helper program, or other error.
So the only option, other than upgrading to RHEL 9 (this machine is not in my department; it is a Bacula backup machine), is to use **ceph-fuse**.
However, my colleague wasn't able to install it either with the repo rhceph-5-tools-for-rhel-8-x86_64-rpms or with the repo rhceph-6-tools-for-rhel-8-x86_64-rpms.
This is actually described here: https://docs.redhat.com/en/documentation/red_hat_ceph_storage/8/html-single/file_system_guide/index#mounting-the-ceph-file-system-as-a-fuse-client_fs , but it seems that this repo doesn't exist.
I tried the ceph-fuse RPM downloaded from https://download.ceph.com/ , but there are so many missing dependencies that this is not feasible.
Does anyone know a method to install ceph-fuse on RHEL 8?
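For illustration only: once ceph-fuse is available from whichever repository turns out to work, the userspace mount that replaces the failing kernel mount would look roughly like this (the monitor addresses and the /backup path are taken from the error above; the keyring path is an assumption):
# FUSE mount of the /backup subtree, bypassing the 4.18 kernel client
ceph-fuse -n client.admin -k /etc/ceph/ceph.client.admin.keyring -m 192.168.22.100,192.168.22.101,192.168.22.102 -r /backup /mnt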
dotokija
(133 rep)
Jul 18, 2025, 02:24 PM
• Last activity: Jul 18, 2025, 02:45 PM
1
votes
1
answers
1924
views
mount cephfs failed because of failure to load kernel module
I got a confusing issue in Docker, as below:
After installing Ceph successfully, I want to mount CephFS, but it fails:
[root@dbffa72704e4 ~]$ /bin/mount 172.17.0.4:/ /cephfs -t ceph -o name=admin,secretfile=/etc/ceph/admin.secret -v
failed to load ceph kernel module (1)
parsing options: rw,name=admin,secretfile=/etc/ceph/admin.secret
mount error 5 = Input/output error
But the Ceph-related kernel modules do exist:
[root@dbffa72704e4 ~]$ lsmod | grep ceph
ceph 327687 0
libceph 287066 1 ceph
dns_resolver 13140 2 nfsv4,libceph
libcrc32c 12644 3 xfs,libceph,dm_persistent_data
Check the Ceph state (I only set a data disk for the OSD):
[root@dbffa72704e4 ~]$ ceph -s
cluster:
id: 20f51975-303e-446f-903f-04e1feaff7d0
health: HEALTH_WARN
Reduced data availability: 128 pgs inactive
Degraded data redundancy: 128 pgs unclean
services:
mon: 2 daemons, quorum dbffa72704e4,5807d12f920e
mgr: dbffa72704e4(active), standbys: 5807d12f920e
mds: cephfs-1/1/1 up {0=5807d12f920e=up:creating}, 1 up:standby
osd: 0 osds: 0 up, 0 in
data:
pools: 2 pools, 128 pgs
objects: 0 objects, 0 bytes
usage: 0 kB used, 0 kB / 0 kB avail
pgs: 100.000% pgs unknown
128 unknown
[root@dbffa72704e4 ~]$ ceph version
ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
My container is based on centos:centos7.2.1511.
I saw some Ceph-related images on Docker Hub, so I think the above operation should be fine. Did I miss something important?
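Not a confirmed fix, just a sketch of the usual workaround: mount -t ceph relies on the *host* kernel, and an unprivileged container can neither load modules nor call mount(2). Loading the module on the Docker host and giving the container mount privileges (or falling back to the userspace client) is the common pattern; the image name below is the one from the question:
# on the Docker host: the ceph kernel client lives in the host kernel, not in the container
modprobe ceph
lsmod | grep '^ceph'
# start the container with enough privileges to call mount(2)
docker run -it --cap-add SYS_ADMIN centos:7.2.1511 /bin/bash
# or avoid the kernel module entirely and use the userspace client inside the container
# (requires --device /dev/fuse when starting the container)
ceph-fuse -m 172.17.0.4:6789 /cephfs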
daixiang0
(141 rep)
Nov 13, 2017, 05:39 AM
• Last activity: May 3, 2025, 08:04 AM
0
votes
1
answers
111
views
Is there a FUSE-based caching solution for selective prefetching from a remote filesystem?
I am working with a remote parallel file system (CephFS), mounted at /mnt/mycephfs/, which contains a large dataset of small files (200 GB+). My application trains on these files, but reading directly from /mnt/mycephfs/ is slow due to parallel file system contention and network latency.
I am looking for a FUSE-based solution that can:
1. Take a list of files required by the application.
2. Prefetch and cache these files into a local mount point (e.g., /mnt/prefetched/) without replicating the entire remote storage (as my local RAM and disk space are limited).
The desired behavior:
• If a file (e.g., /mnt/mycephfs/file) is already cached at /mnt/prefetched/file, it should be served from the cache.
• If not cached, the solution should fetch the file (along with other files from the prefetch list), cache it at /mnt/prefetched/, and then serve it from there.
Are there existing tools or frameworks that support this kind of selective caching and prefetching using FUSE?
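I'm not aware of a single FUSE tool that does exactly this, but as a sketch of the prefetch-and-fallback idea under the paths named above (the file-list name is hypothetical): populate the local cache from the list with rsync, and fall back to the CephFS mount on a cache miss:
# prefetch the listed files (paths relative to /mnt/mycephfs/) into the local cache
rsync -a --files-from=prefetch_list.txt /mnt/mycephfs/ /mnt/prefetched/
# read a file from the cache if present, otherwise straight from CephFS
f=relative/path/to/file
cat "/mnt/prefetched/$f" 2>/dev/null || cat "/mnt/mycephfs/$f"
A FUSE overlay such as catfs aims at the transparent-caching half of this, but it caches on first read rather than from a supplied list.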
H.Jamil
(31 rep)
Dec 11, 2024, 04:23 AM
• Last activity: Dec 11, 2024, 06:45 PM
1
votes
1
answers
1093
views
does CEPH RBD mount on Linux support boot device?
Does a Ceph RBD mount on Linux support being used as a boot device?
For RBD deployment, an example would be like this:
http://blog.programster.org/ceph-deploy-and-mount-a-block-device
Thomas G. Lau
(204 rep)
Nov 3, 2017, 01:01 AM
• Last activity: Nov 22, 2024, 03:49 PM
0
votes
0
answers
67
views
Mount error Ceph storage
I'm trying to mount CephFS on my host:
mount -t ceph -o name=admin,secretfile=/etc/ceph/admin.secret 192.168.1.88:/exports/data/ /rhev/data-center/mnt/
But I get an error:
mount error 2 = No such file or directory
This directory is definitely present.
drwxr-xr-x. 3 root root 4096 Oct 17 17:00 exports
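One hedged check: with the kernel client, mount error 2 usually refers to the path *inside* CephFS (here /exports/data), not the local mountpoint. Mounting the filesystem root first and looking for the path inside it shows which side is missing (the /mnt mountpoint below is just an example):
mount -t ceph -o name=admin,secretfile=/etc/ceph/admin.secret 192.168.1.88:/ /mnt
ls /mnt/exports/data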
Ceph info:
[root@host1 /]# ceph -s
cluster:
id: 4ef9a8f9-7ac2-4617-a5d8-c450d857973c
health: HEALTH_WARN
mons are allowing insecure global_id reclaim
clock skew detected on mon.host2, mon.host3
services:
mon: 3 daemons, quorum host1,host2,host3 (age 2h)
mgr: host1.home.dom(active, since 5h)
mds: 1/1 daemons up
osd: 3 osds: 3 up (since 5h), 3 in (since 25h)
data:
volumes: 1/1 healthy
pools: 3 pools, 81 pgs
objects: 24 objects, 594 KiB
usage: 107 MiB used, 150 GiB / 150 GiB avail
pgs: 81 active+clean
[root@host1 /]# systemctl status ceph-mds.target
● ceph-mds.target - ceph target allowing to start/stop all ceph-mds@.service instances at once
Loaded: loaded (/usr/lib/systemd/system/ceph-mds.target; enabled; vendor preset: enabled)
Active: active since Thu 2024-10-17 13:22:28 MSK; 5h 30min ago
Guamokolatokint
(123 rep)
Oct 17, 2024, 03:54 PM
0
votes
0
answers
33
views
Ovirt. Create domain POSIX Ceph
I am creating a POSIX domain, but there is an error when mounting the directory:
2024-10-11 13:52:20,164+0300 INFO (jsonrpc/4) [storage.storageServer] Creating directory
'/rhev/data-center/mnt/192.168.1.88:_' (storageServer:217)
2024-10-11 13:52:20,164+0300 INFO (jsonrpc/4) [storage.fileutils] Creating directory:
/rhev/data-center/mnt/192.168.1.88:_ mode: None (fileUtils:214)
2024-10-11 13:52:20,164+0300 INFO (jsonrpc/4) [storage.mount] mounting 192.168.1.88:/ at
/rhev/data-center/mnt/192.168.1.88:_ (mount:190)
2024-10-11 13:52:20,166+0300 INFO (jsonrpc/5) [api.host] START getAllVmStats()
from=::1,57074 (api:31)
2024-10-11 13:52:20,167+0300 INFO (jsonrpc/5) [api.host] FINISH getAllVmStats return=
{'status': {'code': 0, 'message': 'Done'}, 'statsList': (suppressed)} from=::1,57074 (api:37)
2024-10-11 13:52:20,293+0300 INFO (jsonrpc/4) [IOProcessClient] (Global) Starting client
(__init__:340)
2024-10-11 13:52:20,304+0300 INFO (ioprocess/24690) [IOProcess] (Global) Starting ioprocess
(__init__:465)
2024-10-11 13:52:20,305+0300 WARN (jsonrpc/4) [storage.oop] Permission denied for directory:
/rhev/data-center/mnt/192.168.1.88:_ with permissions:7 (outOfProcess:177)
2024-10-11 13:52:20,305+0300 INFO (jsonrpc/4) [storage.mount] unmounting /rhev/data-
center/mnt/192.168.1.88:_ (mount:198)
2024-10-11 13:52:20,342+0300 ERROR (jsonrpc/4) [storage.storageServer] Could not connect to
storage server (storageServer:75)
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/vdsm/storage/fileSD.py", line 64, in
validateDirAccess
getProcPool().fileUtils.validateAccess(dirPath)
File "/usr/lib/python3.6/site-packages/vdsm/storage/outOfProcess.py", line 178, in
validateAccess
raise OSError(errno.EACCES, os.strerror(errno.EACCES))
PermissionError: [Errno 13] Permission denied
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/vdsm/storage/storageServer.py", line 73, in
connect_all
con.connect()
File "/usr/lib/python3.6/site-packages/vdsm/storage/storageServer.py", line 241, in connect
six.reraise(t, v, tb)
File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise
raise value
File "/usr/lib/python3.6/site-packages/vdsm/storage/storageServer.py", line 234, in connect
self.getMountObj().getRecord().fs_file)
File "/usr/lib/python3.6/site-packages/vdsm/storage/fileSD.py", line 75, in validateDirAccess
raise se.StorageServerAccessPermissionError(dirPath)
vdsm.storage.exception.StorageServerAccessPermissionError: Permission settings on the
specified path do not allow access to the storage. Verify permission settings on the
specified storage path.: 'path = /rhev/data-center/mnt/192.168.1.88:_'
2024-10-11 13:52:20,343+0300 INFO (jsonrpc/4) [storage.storagedomaincache] Invalidating
storage domain cache (sdc:57)
How to fix it?

Guamokolatokint
(123 rep)
Oct 11, 2024, 12:29 PM
1
votes
1
answers
213
views
Mount CephFS error connection
I need to mount Ceph as file storage, but I get an error:
[root@rv31 ~]# mount -t ceph 192.168.1.88:/ /data/cephmount/ -o name=admin,secretfile=/etc/ceph/admin.secret
mount error 110 = Connection timed out
The firewall is turned off.
Ceph status:
[root@host1 ~]# ceph status
cluster:
id: 28f0f54f-10a0-442a-ab10-ab68381f56e3
health: HEALTH_WARN
mons are allowing insecure global_id reclaim
services:
mon: 3 daemons, quorum host1,host2,host3 (age 7h)
mgr: host1.home.dom(active, since 7h)
osd: 3 osds: 3 up (since 7h), 3 in (since 21h)
data:
pools: 2 pools, 33 pgs
objects: 905 objects, 3.4 GiB
usage: 10 GiB used, 50 GiB / 60 GiB avail
pgs: 33 active+clean
[root@host1 ~]# ceph orch host ls
Error ENOENT: No orchestrator configured (try ceph orch set backend)
Ceph.conf:
[global]
fsid = 28f0f54f-10a0-442a-ab10-ab68381f56e3
mon_initial_members = host1, host2, host3
mon_host = 192.168.1.88,192.168.1.89,192.168.1.90
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
[mds.a]
host = host1.home.dom
Other info:
[root@host1 ceph]# systemctl status ceph-mds.target
● ceph-mds.target - ceph target allowing to start/stop all ceph-
mds@.service instances at once
Loaded: loaded (/usr/lib/systemd/system/ceph-mds.target; enabled;
vendor preset: enabled)
Active: active since Wed 2024-09-25 08:26:56 MSK; 9h ago
[root@host1 ceph]# ceph mds stat
2 up:standby
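One thing the output above shows is 2 up:standby with no active MDS rank, which is what you get when no CephFS filesystem has been created yet. Purely as a sketch (pool names and PG counts are placeholders), creating a filesystem lets an MDS go active:
ceph osd pool create cephfs_data 32
ceph osd pool create cephfs_metadata 32
ceph fs new cephfs cephfs_metadata cephfs_data
ceph mds stat    # should now report an active rank instead of "2 up:standby"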


Guamokolatokint
(123 rep)
Sep 25, 2024, 06:43 AM
• Last activity: Sep 26, 2024, 11:46 AM
2
votes
0
answers
213
views
How can I create an S3 bucket on Ceph?
root@tuy:/# ceph --version
ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
Ceph is deployed on VMware.
I can create an S3 bucket from my Ceph dashboard:
root@tuy:/# radosgw-admin bucket list
[
"bucket1"
]
But I need to create it from the CLI. Is it possible?
There is a bucket rm subcommand, but no create function:
root@tuy:/# radosgw-admin bucket --help
bucket list list buckets (specify --allow-unordered for
faster, unsorted listing)
bucket limit check show bucket sharding stats
bucket link link bucket to specified user
bucket unlink unlink bucket from specified user
bucket stats returns bucket statistics
bucket rm remove bucket
bucket check check bucket index by verifying size and object count stats
bucket check olh check for olh index entries and objects that are pending removal
bucket check unlinked check for object versions that are not visible in a bucket listing
bucket chown link bucket to specified user and update its object ACLs
bucket reshard reshard bucket
bucket rewrite rewrite all objects in the specified bucket
bucket sync checkpoint poll a bucket's sync status until it catches up to its remote
bucket sync disable disable bucket sync
bucket sync enable enable bucket sync
bucket radoslist list rados objects backing bucket's objects
I found solutions for developers: https://docs.ceph.com/en/latest/radosgw/s3/bucketops/
I also found the ceph osd crush add-bucket command, but I don't understand what it is.
I need help.
I can't add the radosgw and s3 tags.
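For context, radosgw-admin has no bucket create subcommand because buckets are created through the S3 API itself. A minimal sketch with an RGW user and a generic S3 client (the endpoint URL is an assumption; point it at your RGW host and port):
# create an RGW user to obtain S3 access/secret keys
radosgw-admin user create --uid=cli-user --display-name="CLI user"
# feed those keys to any S3 client (aws configure, s3cmd --configure, ...) and create the bucket
aws --endpoint-url http://rgw-host:8080 s3 mb s3://bucket2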
tuytuy20
(115 rep)
Jul 22, 2024, 07:51 AM
• Last activity: Aug 22, 2024, 06:35 PM
1
votes
1
answers
604
views
Ceph crush rules explanation for multiroom/racks setup
I started recently with Ceph: I inherited one large cluster for maintenance and am now building a recovery cluster. By trial and error I managed to create CRUSH rules that fit my purpose, but I failed to understand the syntax of the CRUSH rule definitions. Could someone please explain (don't reference the Ceph docs, since they don't explain this)?
Here is my setup of production cluster:
20 hosts distributed across 2 rooms, 2 racks in each room, 5 servers per rack, 10 OSDs per host, 200 OSDs in total.
Someone wanted a super-safe setup, so replication is 2/4 and the rules are (supposedly) defined to replicate to the other room, 2 copies in each rack, 4 in total for every object.
Here is the rule:
rule replicated_nvme {
id 4
type replicated
min_size 1
max_size 100
step take default class nvme
step choose firstn 0 type room
step choose firstn 2 type rack
step chooseleaf firstn 1 type host
step emit
}
On my new cluster I have a smaller setup, just 2 racks with 2 servers in each for testing. I tried this, similar to the above, but without the room:
rule replicated-nvme {
id 6
type replicated
step take default class nvme
step choose firstn 0 type rack
step chooseleaf firstn 1 type host
step emit
}
However, this doesn't produce the desired result (with replication 2/4 it should put a copy in the other rack, each copy on a different server). What I got is 2 replicas on servers in different racks, and the 2 additional copies were not created. I get this from ceph:
pgs: 4/8 objects degraded (50.000%)
1 active+undersized+degraded
and I see that only 2 OSDs are used, not 4!
So, I played around and just changed it to this:
rule replicated-nvme {
id 6
type replicated
step take default class nvme
step choose firstn 0 type rack
step chooseleaf firstn 0 type host
step emit
}
and it works. The pool's PGs are replicated to 4 OSDs across 2 racks (2 OSDs per rack). The only difference is chooseleaf firstn 0 type host instead of chooseleaf firstn 1 type host.
The questions are:
- What is the difference between choose and chooseleaf?
- What is the meaning of the *number* after firstn?
- How is the hierarchy defined for the **steps**: what is evaluated first, what after?
In short, I would like to know the syntax of CRUSH rules.
Just for clarification: although the production cluster has an even number of hosts per room/rack and even replication rules, the object distribution is not perfectly even, i.e. PG distribution may differ by up to 10% per OSD.
I suspect that the 1st rule defined above is wrong and that it is only because of the large number of OSDs that the distribution is more or less equal.
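Not an answer to the syntax questions, but a way to test them empirically: crushtool can simulate a rule's placements offline before you inject the map, which makes the effect of choose vs chooseleaf and the firstn numbers visible (rule id 6 and 4 replicas are taken from the setup above):
ceph osd getcrushmap -o crush.bin
crushtool -d crush.bin -o crush.txt      # decompile; edit the rule in crush.txt
crushtool -c crush.txt -o crush.new      # recompile
crushtool -i crush.new --test --rule 6 --num-rep 4 --show-mappings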
dotokija
(133 rep)
Aug 2, 2024, 08:51 AM
• Last activity: Aug 9, 2024, 12:09 PM
1
votes
0
answers
282
views
My Ceph mon on one node fails and won't start
I have a Ceph cluster on 3 nodes that has been working for a year.
I get a HEALTH_WARN about:
2 OSD(s) have spurious read errors
1/3 mons down, quorum ceph01,ceph03
I tried to start the mon on ceph02, but it is not working.
xxxxxxx@ceph02:~# systemctl status ceph-mon@ceph02
● ceph-mon@ceph02.service - Ceph cluster monitor daemon
Loaded: loaded (/lib/systemd/system/ceph-mon@.service; enabled; vendor preset: enabled)
Drop-In: /usr/lib/systemd/system/ceph-mon@.service.d
└─ceph-after-pve-cluster.conf
Active: active (running) since Sat 2024-02-03 12:27:49 CST; 5 months 12 days ago
Main PID: 1450 (ceph-mon)
Tasks: 24
Memory: 3.4G
CPU: 2w 4d 14h 10min 5.925s
CGroup: /system.slice/system-ceph\x2dmon.slice/ceph-mon@ceph02.service
└─1450 /usr/bin/ceph-mon -f --cluster ceph --id ceph02 --setuser ceph --setgroup ceph
Jul 17 12:17:16 ceph02 ceph-mon: 2024-07-17T12:17:16.574+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
Jul 17 12:17:31 ceph02 ceph-mon: 2024-07-17T12:17:31.590+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
Jul 17 12:17:46 ceph02 ceph-mon: 2024-07-17T12:17:46.603+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
Jul 17 12:18:01 ceph02 ceph-mon: 2024-07-17T12:18:01.615+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
Jul 17 12:18:16 ceph02 ceph-mon: 2024-07-17T12:18:16.627+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
Jul 17 12:18:31 ceph02 ceph-mon: 2024-07-17T12:18:31.644+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
Jul 17 12:18:46 ceph02 ceph-mon: 2024-07-17T12:18:46.660+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
Jul 17 12:19:01 ceph02 ceph-mon: 2024-07-17T12:19:01.672+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
Jul 17 12:19:16 ceph02 ceph-mon: 2024-07-17T12:19:16.685+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
Jul 17 12:19:31 ceph02 ceph-mon: 2024-07-17T12:19:31.697+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
And I did some googling about how to debug it.
xxxxxx@ceph02:~# ceph tell mon.1 mon_status
Error ENXIO: problem getting command descriptions from mon.1
And tried:
sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph02.asok mon_status
ceph-mon -i ceph02 --debug_mon 10
ls /var/lib/ceph/mon/ceph-ceph02/
None of them produce any output or response.
My system disk still has free space and HEALTH is OK, no error.
It looks like the mon's data directory on this node has some issue.
Should I rm it, or just reboot the node?
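A hedged diagnostic sketch rather than a recommendation: handle_auth_bad_method ... Permission denied points at a key mismatch, so comparing the mon keyring on ceph02 with the mon. key the cluster expects is a reasonable first step. Removing and re-creating the monitor is a common recovery path but is destructive, so back up the mon directory first:
# on a healthy node (ceph01/ceph03): the key the cluster expects
ceph auth get mon.
# on ceph02: the key this monitor actually uses
cat /var/lib/ceph/mon/ceph-ceph02/keyring
# if they differ, a common path is to drop and re-create this mon
# (this looks like Proxmox, so pveceph mon destroy / pveceph mon create would be the wrappers)
ceph mon remove ceph02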
Abe Xu
(11 rep)
Jul 17, 2024, 04:32 AM
• Last activity: Jul 17, 2024, 04:44 AM
0
votes
2
answers
453
views
Optimizing storage usage in Proxmox + CEPH cluster
My friend and I bought 3 Dell PowerEdge R740xd servers (128 GB of RAM each) along with 11 SSDs of 1 TB each and 14 HDDs of 14 TB each. They are interconnected through 2 switches on two different networks via 1 Gb Ethernet interfaces.
We are wrapping our heads around how to get the best out of the current storage inventory in setting up a decent CEPH-powered Proxmox cluster.
With the caveat that we have little to no background in this, this is our current arrangement for each server:
2 SSDs in RAID1 for mirroring Proxmox at the hardware level.
1 SSD for running containers and VMs.
4 HDDs for Ceph pools.
The remaining 2 HDDs are used for Proxmox backups.
God only knows what to use the remaining 2 SSDs for.
I have to say I don't agree with the RAID1 idea. Yes, you get 1-disk fault tolerance, but at the cost of around 4.8 TB of SSD; the OS only requires a recommended 32 GB per the Proxmox docs.
Also (and again according to the docs), Ceph managers, monitors and MDS daemons (in case we set up CephFS) perform heavy reads and writes per second, so I think they are best placed on SSDs.
Regarding a shared library for sharing files among the 3 servers, I was wondering whether it would be best to format the disk and share the filesystem using the NFS protocol (with NFS-Ganesha?). From what I have read, I concluded that NFS is better than CephFS for this: it is a more robust, performant and battle-tested protocol.
So my question is: if you were us, how would you make the best out of this storage for using Proxmox along with Ceph? Consider also that we want to use Proxmox Backup, you know, for backups.
d3vr10
(1 rep)
Jun 3, 2024, 01:34 AM
• Last activity: Jul 2, 2024, 01:31 AM
0
votes
2
answers
4032
views
Purge disks after removing Ceph
I'm trying to remove Ceph entirely from my servers. I released the OSDs from the server node, formatted the disks and created a new partition with parted, but I still see the Ceph partition on the disks. I followed this procedure to remove the OSDs: https://docs.ceph.com/en/nautilus/rados/operations/add-or-rm-osds/#removing-osds-manual
I need to release the disks and let CentOS use them by itself.
What am I missing?
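A hedged sketch of fully clearing a former OSD disk so the OS can reuse it; /dev/sdX is a placeholder, and every command below destroys data on that device:
ceph-volume lvm zap /dev/sdX --destroy   # if ceph-volume is still installed: removes the OSD's LVs/VG and labels
wipefs -a /dev/sdX                       # drop remaining filesystem/LVM signatures
sgdisk --zap-all /dev/sdX                # clear GPT and MBR partition structures
partprobe /dev/sdX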
fth
(101 rep)
Dec 29, 2020, 08:49 AM
• Last activity: Jun 19, 2024, 11:24 PM
1
votes
0
answers
379
views
How to isolate cpu cores, even from kernel space, at boot?
I have a faulty Ryzen 5900X desktop CPU. Previously, I somewhat tamed its faulty cores via the isolcpus=2,3,14,15 kernel parameter in GRUB2 (see https://blog.cbugk.com/post/ryzen-5850x/).
However, on Proxmox 8.2, I have set up a **CEPH** cluster. It had crippling performance of around 2 MB/s. Redoing the cluster got **20 MB/s** while cloning a template VM. I suspected my use of second-hand enterprise SSDs, but even fresh ones did it (with or without an NVMe DB cache).
But when I checked my faulty cores (2,3,14,15), they were being used. The moment I shut down the computer with the 5900X, transfer speed jumps to around **100 MB/s** on the remaining two nodes. Networking is 10G between each node; iperf previously had shown 6G throughput, ~~it cannot be the bottle-neck.~~ **It was the damn cabling.**
Some duckduckgo-ing later, I found out that isolcpus= works for user space but not for **kernel space**.
watch -n1 -- "ps -axo psr,pcpu,uid,user,pid,tid,args --sort=psr | grep -e '^ 2 ' -e '^ 3 ' -e '^ 14 ' -e '^ 15'"
(source) gives:
2 0.0 0 root 27 27 [cpuhp/2]
2 0.0 0 root 28 28 [idle_inject/2]
2 0.3 0 root 29 29 [migration/2]
2 0.0 0 root 30 30 [ksoftirqd/2]
2 0.0 0 root 31 31 [kworker/2:0-events]
2 0.0 0 root 192 192 [irq/26-AMD-Vi]
2 0.0 0 root 202 202 [kworker/2:1-events]
3 0.0 0 root 33 33 [cpuhp/3]
3 0.0 0 root 34 34 [idle_inject/3]
3 0.3 0 root 35 35 [migration/3]
3 0.0 0 root 36 36 [ksoftirqd/3]
3 0.0 0 root 37 37 [kworker/3:0-events]
3 0.0 0 root 203 203 [kworker/3:1-events]
14 0.0 0 root 99 99 [cpuhp/14]
14 0.0 0 root 100 100 [idle_inject/14]
14 0.3 0 root 101 101 [migration/14]
14 0.0 0 root 102 102 [ksoftirqd/14]
14 0.0 0 root 103 103 [kworker/14:0-events]
14 0.0 0 root 210 210 [kworker/14:1-events]
15 0.0 0 root 105 105 [cpuhp/15]
15 0.0 0 root 106 106 [idle_inject/15]
15 0.3 0 root 107 107 [migration/15]
15 0.0 0 root 108 108 [ksoftirqd/15]
15 0.0 0 root 109 109 [kworker/15:0-events]
15 0.0 0 root 211 211 [kworker/15:1-events]
Since Ceph uses a kernel driver, I need a way to isolate cores from the whole system. Running everything from PID 1 onwards under a taskset is okay. I cannot use cset due to cgroups v2; numactl is also okay.
With isolcpus I do not have apparent system stability issues; without it I would face secure-connection errors in Firefox and OS installs would fail. But even that is not enough when using CEPH, and now I conclude that it could corrupt data unnoticed if this wasn't my homelab machine.
Can anyone suggest a way to **effectively ban these faulty threads as soon as the system allows**, permanently? (I had better use the phrase CPU affinity in the post.)
---
I was wrong: having redone the Cat6 cables at just the right length (and having routed them clear of power cables earlier), I can state that interference should now be much lower than before. The same error was there when I disabled half the cores in the BIOS, including the faulty ones. I get instant VM clones on the CEPH pool now, thanks to the NVMe DB cache I suppose.
Also, the kernel threads on those cores are the ones used for scheduling processes; their PIDs and the set of threads on those cores stay constant under the watch command above, even during a VM clone on the CEPH pool. So if no tasks are being scheduled there, it might be working as intended.
Found these tangentially relevant readings interesting: migration - reddit , nohz - lwn.net
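One approach worth mentioning (a sketch, not a guaranteed cure for faulty silicon): taking the cores offline through sysfs removes them from the scheduler entirely and parks their per-CPU kernel threads (ksoftirqd, migration, kworker), which goes further than isolcpus or any affinity mask:
# offline the faulty SMT siblings; their per-CPU kthreads are parked, not scheduled
for c in 2 3 14 15; do echo 0 > /sys/devices/system/cpu/cpu$c/online; done
cat /sys/devices/system/cpu/online   # verify, e.g. 0-1,4-13,16-23
To make it permanent, a small oneshot systemd unit running that loop early in boot (before the Ceph OSD services start) would do.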
cbugk
(446 rep)
May 13, 2024, 11:57 PM
• Last activity: May 14, 2024, 10:52 PM
3
votes
1
answers
1442
views
Ceph for small files?
Currently I have 6 dedicated servers in a data center: two mail servers running Exim and Dovecot (Maildir) and 4 web servers. Each server has two 3 TB HDDs.
My current problem is that we now have a video production team and they need storage, probably scalable storage. Currently they have to check which server has enough free space, and that's what I want to solve.
So my idea is to use Ceph for two things. First, to create a failover solution for the mail and web servers: if a server fails, the load balancer just switches to another server where the files are also available.
The second is to get scalable storage for the video files, so the video team doesn't have to care about file size. They have their file structure on a single machine and can work with their files on this "machine". And if I need more storage, I just rent another dedicated server and add it to the "cluster".
That's why I'd like to ask whether Ceph is a good idea for this. Or do you have another, better suggestion?
user39063
(201 rep)
Jul 6, 2018, 10:12 AM
• Last activity: Apr 9, 2024, 03:27 PM
0
votes
1
answers
197
views
cephadm - how to separate ssh network from monitor network
In my company, for several years we were using Ceph with ceph-ansible as the deployer (and for upgrade and scale operations, etc.). Recently I was assigned to migrate to 'cephadm' for installation and for day-2 operations too.
While doing a PoC, I experienced 2 issues, one of them more acute than the other:
1. We have different separated networks that were relevant for ceph-ansible:
a. provisioning network, used for ssh and for running tasks remotely at ceph-related hosts (nodes with mons/osds/clients)
b. public network - used for nodes that host the mons,mgrs,mdss. These addresses are **not ssh-able**. Our Ceph clusters worked this way perfectly.
c. cluster network - used for internal ceph traffic like heartbeat, replication, etc. Also not ssh-able.
So with cephadm, when bootstrapping, it forces me to 'combine' the public network and the provisioning network. In other words, unless I allow this network to be SSH-able (which for security reasons we prefer not to do), the bootstrap command will fail with the message below. I couldn't find a way to install the Ceph cluster with separate networks for SSH and for Ceph purposes (public network for monitors):
/usr/bin/ceph: stderr File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 225, in raise_if_exception
/usr/bin/ceph: stderr e = pickle.loads(c.serialized_exception)
/usr/bin/ceph: stderr TypeError: __init__() missing 2 required positional arguments: 'hostname' and 'addr'
/usr/bin/ceph: stderr
ERROR: Failed to add host : Failed command: /usr/bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph --init -e CONTAINER_IMAGE=/ceph/daemon:quincy-rockylinux-8-x86_64 -e NODE_NAME= -e CEPH_USE_RANDOM_NONCE=1 -v /var/log/ceph/a0a19cd2-44ec-11ee-a922-ec0d9a94e986:/var/log/ceph:z -v /tmp/ceph-tmpb0u6hlv7:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpooy56ocy:/etc/ceph/ceph.conf:z /ceph/daemon:quincy-rockylinux-8-x86_64 orch host add
2. We used the original Ceph service names like 'ceph-mon@hostname.service'. With cephadm, each service and each container name has to have the fsid as part of the name. I tried searching for where this could be changed but didn't find anything.
Itay R
(1 rep)
Aug 27, 2023, 03:45 PM
• Last activity: Aug 28, 2023, 09:31 AM
1
votes
1
answers
98
views
is it possible to run a ceph rbd on erasure coded ceph pool, without a separate replicated metadata pool?
I'm new to Ceph, so forgive me if this is common knowledge, but I can't find it. This seems like a simple question, but I can't find any solid answer. In 2017, when RBD on EC pools was first implemented, you had to have a separate replicated pool to store the RBD metadata, and then you could store the actual data on the EC pool.
Is this still true, or is there nowadays some way to store the metadata in the same EC pool so I don't have to manage two pools to make an RBD?
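As far as I know the split is still required, because EC pools cannot hold the omap metadata an RBD image needs; the documented pattern remains a replicated pool for the image header plus --data-pool for the bulk data. A minimal sketch with hypothetical pool names:
ceph osd pool set ec_pool allow_ec_overwrites true      # required for RBD data on an EC pool
rbd create --size 100G --data-pool ec_pool replicated_pool/myimage
# the image "lives" in replicated_pool (header/omap); its data objects land in ec_pool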
stu
(143 rep)
Aug 4, 2023, 01:24 AM
• Last activity: Aug 11, 2023, 02:51 PM
1
votes
1
answers
129
views
Can't remove ceph xattrs on linux
I had set xattrs for quota limits on CephFS
$ setfattr -n ceph.quota.max_bytes -v 1100000000 /mnt/cephfs/data/
I can get the value of this attribute:
$ getfattr -n ceph.quota.max_bytes /mnt/cephfs/data/
getfattr: Removing leading '/' from absolute path names
# file: mnt/cephfs/data/
ceph.quota.max_bytes="1100000000"
But when I try to remove the quota, I get:
$ setfattr -x ceph.quota.max_bytes /mnt/cephfs/data/
setfattr: /mnt/cephfs/data/: No such attribute
How can I remove this xattr?
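For reference, the CephFS quota xattrs are virtual attributes, which is why setfattr -x reports "No such attribute"; per the CephFS quota documentation, the limit is cleared by setting it to 0:
# clearing the quota = setting the limit to 0, not removing the xattr
setfattr -n ceph.quota.max_bytes -v 0 /mnt/cephfs/data/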
Dan B
(11 rep)
Jun 4, 2023, 09:29 AM
• Last activity: Jun 21, 2023, 08:07 AM
4
votes
3
answers
1962
views
How to delete an invalid OSD in a Ceph cluster?
[root@dev-master ceph-cluster]# ceph osd tree
ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.01740 root default
-4 0.00580 host osd2
0 0.00580 osd.0 down 0 1.00000
-5 0.00580 host osd3
1 0.00580 osd.1 down 0 1.00000
-6 0.00580 host osd1
2 0.00580 osd.2 down 0 1.00000
5 0 osd.5 up 0 1.00000
[root@dev-master ceph-cluster]# ceph osd out 5
osd.5 is already out.
[root@dev-master ceph-cluster]# ceph osd crush remove osd.5
device 'osd.5' does not appear in the crush map
[root@dev-master ceph-cluster]# ceph auth del osd.5
entity osd.5 does not exist
[root@dev-master ceph-cluster]# ceph osd rm 5
Error EBUSY: osd.5 is still up; must be down before removal.
But I could not find osd.5 on any host.
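A hedged sequence for a phantom entry like osd.5 above: mark it down first so rm stops refusing; on Luminous (12.x) and later, a single purge wraps the crush-remove, auth-del and rm steps:
ceph osd down 5
ceph osd rm 5
# Luminous and newer only:
ceph osd purge 5 --yes-i-really-mean-it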
Inuyasha
(41 rep)
Dec 21, 2016, 07:07 PM
• Last activity: Apr 11, 2023, 08:24 PM
0
votes
0
answers
580
views
Ceph non-replicated pool (replication 1)
I have a 10 node cluster. I want to create a non-replicated pool (replication 1) and I want to ask some questions about it:
Let me tell you my use case:
- I don't care about losing data,
- All of my data is JUNK and these junk files are usually between 1KB to 32MB.
- These files will be deleted in 5 days.
- Writable space and I/O speed is more important.
- I have high Write/Read/Delete operations, minimum 200GB a day.
I'm afraid that, in the event of a failure, I won't be able to access the whole cluster. Losing data is okay, but I have to be able to ignore missing files, remove the lost data from the cluster and continue with the existing data, and while doing this, I want to be able to write new data to the cluster.
My questions are:
1. To reach this goal, do you have any recommendations?
2. With this setup, what potential problems do you have in mind?
3. I think erasure coding is not an option because of the performance problems and slow file deletion. With this I/O load, EC will miss files and leaks may happen (I've seen this before on Nautilus).
4. You have read my needs; is there a better way to do this? Maybe an alternative to Ceph?
Thank you for the answers.
Best regards.
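For what it's worth, the replication-1 pool itself is only a few commands (the pool name and PG count are placeholders; Octopus and newer additionally gate this behind mon_allow_pool_size_one):
ceph config set global mon_allow_pool_size_one true     # Octopus+ only
ceph osd pool create junk 128 128 replicated
ceph osd pool set junk size 1 --yes-i-really-mean-it
ceph osd pool set junk min_size 1
With size 1, any OSD failure leaves its PGs unavailable until the OSD comes back or the PGs are recreated, so the "ignore missing files and carry on" handling has to be planned around that.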
Ozbit
(439 rep)
Apr 10, 2023, 08:13 PM
• Last activity: Apr 10, 2023, 10:30 PM
0
votes
1
answers
398
views
DRBD on top of Ceph
Would it be possible to have DRBD running directly inside a Ceph pool?
I have a backup machine with files stored directly on disk. The offsite backup machine has Ceph installed and configured on all of its disks.
I would like to have a second replica of the backup data on the offsite backup machine, but I'm a bit confused about which 'layers' DRBD and Ceph operate at. Would it be possible to create an RBD pool on the offsite backup machine and configure DRBD directly on that, or do I need to go the route where I run a virtual machine using Ceph and configure DRBD inside the virtual machine as an abstraction layer?
Edit:
The reason the (single-node) offsite backup machine is running Ceph is that it is mirroring the pools of a (multi-node) main Ceph cluster.
In addition to the main Ceph cluster, we have a backup server creating file backups of the machines running on the cluster. This is a simple RAID5 configuration where the data is stored.
To have an extra copy of the backup data, I also want to sync it to the offsite backup machine using DRBD, so that I do not have a problem with small files. But as the disks of the backup machine are already configured as Ceph OSDs, I need to store it somehow in a Ceph pool.
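On the layering question: DRBD replicates block devices, and a mapped RBD image *is* a block device, so one sketch (no VM needed) is to back the offsite half of a DRBD resource with a mapped RBD image. All names, sizes and addresses below are made up:
# on the offsite machine: carve a block device out of the Ceph pool and map it
rbd create backup-pool/drbd-backing --size 4T
rbd map backup-pool/drbd-backing          # shows up as e.g. /dev/rbd0
# DRBD resource: raw disk on the backup server, mapped RBD on the offsite node
cat > /etc/drbd.d/backup.res <<'EOF'
resource backup {
  on backupsrv {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   192.0.2.1:7789;
    meta-disk internal;
  }
  on offsite {
    device    /dev/drbd0;
    disk      /dev/rbd0;
    address   192.0.2.2:7789;
    meta-disk internal;
  }
}
EOF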
Mr. Diba
(400 rep)
Feb 14, 2023, 09:50 AM
• Last activity: Feb 15, 2023, 03:45 PM