
Unix & Linux Stack Exchange

Q&A for users of Linux, FreeBSD and other Unix-like operating systems

Latest Questions

2 votes
1 answer
40 views
How to obtain a RHEL 8 repo that contains ceph-fuse and install it?
We have planned a Ceph upgrade to Reef 18.2.7, and the documentation says that kernel clients older than kernel 5.4 are not supported. There is one RHEL 8 machine with kernel 4.18. I tried to mount CephFS anyway, and it really doesn't work. I get the message:

mount: /mnt: wrong fs type, bad option, bad superblock on 192.168.22.101,192.168.22.100,192.168.22.102:/backup, missing codepage or helper program, or other error.

So the only option, apart from upgrading to RHEL 9 (this machine is not in my department; it is some Bacula backup machine), is to use **ceph-fuse**. However, my colleague wasn't able to install it either with the repo rhceph-5-tools-for-rhel-8-x86_64-rpms or with the repo 6-tools-for-rhel-8-x86_64-rpms. This is described here: https://docs.redhat.com/en/documentation/red_hat_ceph_storage/8/html-single/file_system_guide/index#mounting-the-ceph-file-system-as-a-fuse-client_fs , but it seems that this repo doesn't exist. I tried the ceph-fuse RPM downloaded from https://download.ceph.com/ , but there are so many missing dependencies that this is not feasible. Does anyone know a method to install ceph-fuse on RHEL 8?
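For readers hitting the same wall, a minimal sketch of the usual approach, assuming the machine has a Red Hat Ceph Storage Tools entitlement and using the repo ID mentioned in the question (whether it is actually available depends on the subscription):

```
# Check which Ceph Tools repos the subscription exposes, then enable one
subscription-manager repos --list | grep tools-for-rhel-8
subscription-manager repos --enable=rhceph-5-tools-for-rhel-8-x86_64-rpms

# Install and mount via FUSE (mount point is illustrative)
dnf install -y ceph-fuse
ceph-fuse -n client.admin /mnt
```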
dotokija (133 rep)
Jul 18, 2025, 02:24 PM • Last activity: Jul 18, 2025, 02:45 PM
1 vote
1 answer
1924 views
mount cephfs failed because of failure to load kernel module
I got a confusing issue in Docker, as below. After installing Ceph successfully, I want to mount CephFS, but it fails:

[root@dbffa72704e4 ~]$ /bin/mount 172.17.0.4:/ /cephfs -t ceph -o name=admin,secretfile=/etc/ceph/admin.secret -v
failed to load ceph kernel module (1)
parsing options: rw,name=admin,secretfile=/etc/ceph/admin.secret
mount error 5 = Input/output error

But the Ceph-related kernel modules do exist:

[root@dbffa72704e4 ~]$ lsmod | grep ceph
ceph                  327687  0
libceph               287066  1 ceph
dns_resolver           13140  2 nfsv4,libceph
libcrc32c              12644  3 xfs,libceph,dm_persistent_data

Check the Ceph state (I only set a data disk for the OSD):

[root@dbffa72704e4 ~]$ ceph -s
  cluster:
    id:     20f51975-303e-446f-903f-04e1feaff7d0
    health: HEALTH_WARN
            Reduced data availability: 128 pgs inactive
            Degraded data redundancy: 128 pgs unclean
  services:
    mon: 2 daemons, quorum dbffa72704e4,5807d12f920e
    mgr: dbffa72704e4(active), standbys: 5807d12f920e
    mds: cephfs-1/1/1 up {0=5807d12f920e=up:creating}, 1 up:standby
    osd: 0 osds: 0 up, 0 in
  data:
    pools:   2 pools, 128 pgs
    objects: 0 objects, 0 bytes
    usage:   0 kB used, 0 kB / 0 kB avail
    pgs:     100.000% pgs unknown
             128 unknown

[root@dbffa72704e4 ~]$ ceph version
ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)

My container is based on centos:centos7.2.1511. I saw some Ceph-related images on Docker Hub, so I think the above operation is fine; did I miss something important?
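For what it's worth when reading output like this: a CephFS mount usually cannot succeed while the cluster has no OSDs in and all PGs are inactive, independent of any kernel-module question. A minimal diagnostic sketch using standard Ceph CLI commands:

```
# Confirm whether any OSDs exist and are up/in before attempting the mount
ceph osd stat
ceph osd tree

# The MDS cannot leave "up:creating" until the data/metadata PGs become active
ceph -s
ceph health detail
```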
daixiang0 (141 rep)
Nov 13, 2017, 05:39 AM • Last activity: May 3, 2025, 08:04 AM
0 votes
1 answer
111 views
Is there a FUSE-based caching solution for selective prefetching from a remote filesystem?
I am working with a remote parallel file system (CephFS), mounted at /mnt/mycephfs/, which contains a large dataset of small files (200 GB+). My application trains on these files, but reading directly from /mnt/mycephfs/ is slow due to parallel file system contention and network latency. I am looking for a FUSE-based solution that can:

1. Take a list of files required by the application.
2. Prefetch and cache these files into a local mount point (e.g., /mnt/prefetched/) without replicating the entire remote storage (as my local RAM and disk space are limited).

The desired behavior:

- If a file (e.g., /mnt/mycephfs/file) is already cached at /mnt/prefetched/file, it should be served from the cache.
- If not cached, the solution should fetch the file (along with other files from the prefetch list), cache it at /mnt/prefetched/, and then serve it from there.

Are there existing tools or frameworks that support this kind of selective caching and prefetching using FUSE?
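Not a FUSE answer, but as a baseline worth keeping in mind: when the file list is known up front, the prefetch half can be sketched with plain rsync (the list file name and target directory are assumptions for illustration):

```
# Prefetch only the files named in filelist.txt (paths are relative to /mnt/mycephfs/)
rsync -a --files-from=filelist.txt /mnt/mycephfs/ /mnt/prefetched/
```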
H.Jamil (31 rep)
Dec 11, 2024, 04:23 AM • Last activity: Dec 11, 2024, 06:45 PM
1 vote
1 answer
1093 views
does CEPH RBD mount on Linux support boot device?
Does a Ceph RBD mount on Linux support being used as a boot device? For RBD deployment, an example would be like this: http://blog.programster.org/ceph-deploy-and-mount-a-block-device
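For context, the linked post covers the ordinary (non-boot) RBD workflow; a hedged sketch with made-up pool and image names is below. Booting from RBD would additionally require the initramfs to map the image before the root filesystem is mounted, which is the part the question is really about.

```
# Create, map and mount an RBD image as an ordinary block device
rbd create mypool/myimage --size 10240
rbd map mypool/myimage          # appears as e.g. /dev/rbd0
mkfs.ext4 /dev/rbd0
mount /dev/rbd0 /mnt
```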
Thomas G. Lau (204 rep)
Nov 3, 2017, 01:01 AM • Last activity: Nov 22, 2024, 03:49 PM
0 votes
0 answers
67 views
Mount error Ceph storage
I'm trying to mount CephFS to my host:

mount -t ceph -o name=admin,secretfile=/etc/ceph/admin.secret 192.168.1.88:/exports/data/ /rhev/data-center/mnt/

But I get an error:

mount error 2 = No such file or directory

This directory is definitely present:

drwxr-xr-x. 3 root root 4096 окт 17 17:00 exports

Ceph info:

[root@host1 /]# ceph -s
  cluster:
    id:     4ef9a8f9-7ac2-4617-a5d8-c450d857973c
    health: HEALTH_WARN
            mons are allowing insecure global_id reclaim
            clock skew detected on mon.host2, mon.host3
  services:
    mon: 3 daemons, quorum host1,host2,host3 (age 2h)
    mgr: host1.home.dom(active, since 5h)
    mds: 1/1 daemons up
    osd: 3 osds: 3 up (since 5h), 3 in (since 25h)
  data:
    volumes: 1/1 healthy
    pools:   3 pools, 81 pgs
    objects: 24 objects, 594 KiB
    usage:   107 MiB used, 150 GiB / 150 GiB avail
    pgs:     81 active+clean

[root@host1 /]# systemctl status ceph-mds.target
● ceph-mds.target - ceph target allowing to start/stop all ceph-mds@.service instances at once
   Loaded: loaded (/usr/lib/systemd/system/ceph-mds.target; enabled; vendor preset: enabled)
   Active: active since Thu 2024-10-17 13:22:28 MSK; 5h 30min ago
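Worth noting when diagnosing errors like this: the path in 192.168.1.88:/exports/data/ is interpreted relative to the CephFS root, not the monitor host's local filesystem, so an `ls` on the server does not prove the path exists inside the file system. A hedged check (the temporary mount point is an assumption):

```
# Mount the CephFS root first and verify the path exists inside the file system
mount -t ceph -o name=admin,secretfile=/etc/ceph/admin.secret 192.168.1.88:/ /mnt
ls /mnt            # does exports/data/ actually exist here?
```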
Guamokolatokint (123 rep)
Oct 17, 2024, 03:54 PM
0 votes
0 answers
33 views
Ovirt. Create domain POSIX Ceph
I am creating a POSIX domain, but there is an error mounting the directory:

2024-10-11 13:52:20,164+0300 INFO  (jsonrpc/4) [storage.storageServer] Creating directory '/rhev/data-center/mnt/192.168.1.88:_' (storageServer:217)
2024-10-11 13:52:20,164+0300 INFO  (jsonrpc/4) [storage.fileutils] Creating directory: /rhev/data-center/mnt/192.168.1.88:_ mode: None (fileUtils:214)
2024-10-11 13:52:20,164+0300 INFO  (jsonrpc/4) [storage.mount] mounting 192.168.1.88:/ at /rhev/data-center/mnt/192.168.1.88:_ (mount:190)
2024-10-11 13:52:20,166+0300 INFO  (jsonrpc/5) [api.host] START getAllVmStats() from=::1,57074 (api:31)
2024-10-11 13:52:20,167+0300 INFO  (jsonrpc/5) [api.host] FINISH getAllVmStats return= {'status': {'code': 0, 'message': 'Done'}, 'statsList': (suppressed)} from=::1,57074 (api:37)
2024-10-11 13:52:20,293+0300 INFO  (jsonrpc/4) [IOProcessClient] (Global) Starting client (__init__:340)
2024-10-11 13:52:20,304+0300 INFO  (ioprocess/24690) [IOProcess] (Global) Starting ioprocess (__init__:465)
2024-10-11 13:52:20,305+0300 WARN  (jsonrpc/4) [storage.oop] Permission denied for directory: /rhev/data-center/mnt/192.168.1.88:_ with permissions:7 (outOfProcess:177)
2024-10-11 13:52:20,305+0300 INFO  (jsonrpc/4) [storage.mount] unmounting /rhev/data-center/mnt/192.168.1.88:_ (mount:198)
2024-10-11 13:52:20,342+0300 ERROR (jsonrpc/4) [storage.storageServer] Could not connect to storage server (storageServer:75)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/storage/fileSD.py", line 64, in validateDirAccess
    getProcPool().fileUtils.validateAccess(dirPath)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/outOfProcess.py", line 178, in validateAccess
    raise OSError(errno.EACCES, os.strerror(errno.EACCES))
PermissionError: [Errno 13] Permission denied

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/storage/storageServer.py", line 73, in connect_all
    con.connect()
  File "/usr/lib/python3.6/site-packages/vdsm/storage/storageServer.py", line 241, in connect
    six.reraise(t, v, tb)
  File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise
    raise value
  File "/usr/lib/python3.6/site-packages/vdsm/storage/storageServer.py", line 234, in connect
    self.getMountObj().getRecord().fs_file)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/fileSD.py", line 75, in validateDirAccess
    raise se.StorageServerAccessPermissionError(dirPath)
vdsm.storage.exception.StorageServerAccessPermissionError: Permission settings on the specified path do not allow access to the storage. Verify permission settings on the specified storage path.: 'path = /rhev/data-center/mnt/192.168.1.88:_'
2024-10-11 13:52:20,343+0300 INFO  (jsonrpc/4) [storage.storagedomaincache] Invalidating storage domain cache (sdc:57)

How to fix it?
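A note on the traceback above: vdsm validates that it can access the mounted directory as the vdsm user (uid/gid 36:36 in oVirt), so the usual remedy is to give the CephFS root that ownership. A hedged sketch, assuming the CephFS root can be mounted once for administration:

```
# Mount the CephFS root as an admin and hand ownership to vdsm:kvm (36:36)
mount -t ceph -o name=admin,secretfile=/etc/ceph/admin.secret 192.168.1.88:/ /mnt
chown 36:36 /mnt
chmod 0755 /mnt
umount /mnt
```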
Guamokolatokint (123 rep)
Oct 11, 2024, 12:29 PM
1 vote
1 answer
213 views
Mount CephFS error connection
I need to mount the Ceph cluster as file storage, but I get an error:

[root@rv31 ~]# mount -t ceph 192.168.1.88:/ /data/cephmount/ -o name=admin,secretfile=/etc/ceph/admin.secret
mount error 110 = Connection timed out

The firewall is turned off. Ceph status:

[root@host1 ~]# ceph status
  cluster:
    id:     28f0f54f-10a0-442a-ab10-ab68381f56e3
    health: HEALTH_WARN
            mons are allowing insecure global_id reclaim
  services:
    mon: 3 daemons, quorum host1,host2,host3 (age 7h)
    mgr: host1.home.dom(active, since 7h)
    osd: 3 osds: 3 up (since 7h), 3 in (since 21h)
  data:
    pools:   2 pools, 33 pgs
    objects: 905 objects, 3.4 GiB
    usage:   10 GiB used, 50 GiB / 60 GiB avail
    pgs:     33 active+clean

[root@host1 ~]# ceph orch host ls
Error ENOENT: No orchestrator configured (try ceph orch set backend)

ceph.conf:

[global]
fsid = 28f0f54f-10a0-442a-ab10-ab68381f56e3
mon_initial_members = host1, host2, host3
mon_host = 192.168.1.88,192.168.1.89,192.168.1.90
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx

[mds.a]
host = host1.home.dom

Other info:

[root@host1 ceph]# systemctl status ceph-mds.target
● ceph-mds.target - ceph target allowing to start/stop all ceph-mds@.service instances at once
   Loaded: loaded (/usr/lib/systemd/system/ceph-mds.target; enabled; vendor preset: enabled)
   Active: active since Wed 2024-09-25 08:26:56 MSK; 9h ago

[root@host1 ceph]# ceph mds stat
2 up:standby
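One detail worth flagging in the output above: `ceph mds stat` reports only standby MDS daemons and `ceph status` shows no mds service or volumes line, which usually means no CephFS file system has been created yet; a kernel-client mount then times out. A hedged sketch of the check and of creating a file system (pool names and PG counts are placeholders):

```
# Is there a file system at all, and does any MDS hold the active rank?
ceph fs ls
ceph mds stat

# If not, create pools and a file system (names are illustrative)
ceph osd pool create cephfs_metadata 32
ceph osd pool create cephfs_data 64
ceph fs new cephfs cephfs_metadata cephfs_data
```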
Guamokolatokint (123 rep)
Sep 25, 2024, 06:43 AM • Last activity: Sep 26, 2024, 11:46 AM
2 votes
0 answers
213 views
how can I create s3 bucket on ceph?
root@tuy:/# ceph --version
ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)

Ceph is deployed on VMware. I can create an S3 bucket from my Ceph dashboard:

root@tuy:/# radosgw-admin bucket list
[
    "bucket1"
]

But I need to create it from the CLI. Is that possible? rm is there, but there is no create function:

root@tuy:/# radosgw-admin bucket --help
  bucket list                  list buckets (specify --allow-unordered for faster, unsorted listing)
  bucket limit check           show bucket sharding stats
  bucket link                  link bucket to specified user
  bucket unlink                unlink bucket from specified user
  bucket stats                 returns bucket statistics
  bucket rm                    remove bucket
  bucket check                 check bucket index by verifying size and object count stats
  bucket check olh             check for olh index entries and objects that are pending removal
  bucket check unlinked        check for object versions that are not visible in a bucket listing
  bucket chown                 link bucket to specified user and update its object ACLs
  bucket reshard               reshard bucket
  bucket rewrite               rewrite all objects in the specified bucket
  bucket sync checkpoint       poll a bucket's sync status until it catches up to its remote
  bucket sync disable          disable bucket sync
  bucket sync enable           enable bucket sync
  bucket radoslist             list rados objects backing bucket's objects

I found a solution for developers: https://docs.ceph.com/en/latest/radosgw/s3/bucketops/

I also found the "ceph osd crush add-bucket" command, but I don't understand what it is. Need help. (I can't add the tags radosgw and s3.)
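For reference, buckets are normally created through the S3 API rather than radosgw-admin; a hedged sketch with s3cmd against the RGW endpoint (user name, keys and endpoint are placeholders):

```
# Create an RGW user and note the generated access/secret keys
radosgw-admin user create --uid=testuser --display-name="Test User"

# Point any S3 client at the RGW endpoint and create the bucket
s3cmd --host=rgw.example.com:8080 --host-bucket=rgw.example.com:8080 \
      --access_key=ACCESS --secret_key=SECRET mb s3://bucket2
```

As an aside, `ceph osd crush add-bucket` is unrelated: a CRUSH "bucket" is a node in the CRUSH hierarchy (host, rack, room), not an S3 bucket.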
tuytuy20 (115 rep)
Jul 22, 2024, 07:51 AM • Last activity: Aug 22, 2024, 06:35 PM
1 vote
1 answer
604 views
Ceph crush rules explanation for multiroom/racks setup
I started recently with Ceph, inherited one large cluster for maintenance, and am now building a recovery cluster. By trial and error I managed to create CRUSH rules that fit my purpose, but I failed to understand the syntax of the CRUSH rule definition. Could someone please explain (don't just reference the Ceph docs, since they don't explain this)? Here is the setup of the production cluster: 20 hosts distributed over 2 rooms, 2 racks in each room, 5 servers per rack, 10 OSDs per host, 200 OSDs in total. Someone wanted a super-safe setup, so replication is 2/4 and the rules are (supposedly) defined to replicate to the other room, 2 copies in each rack, 4 in total for every object. Here is the rule:
rule replicated_nvme {
	id 4
	type replicated
	min_size 1
	max_size 100
	step take default class nvme
	step choose firstn 0 type room
	step choose firstn 2 type rack
	step chooseleaf firstn 1 type host
	step emit
}
At my new cluster I have smaller setup so just 2 racks with 2 servers in each for test. I tried this, similar to the above, but without room:
rule replicated-nvme {
	id 6
	type replicated
	step take default class nvme
	step choose firstn 0 type rack
	step chooseleaf firstn 1 type host
	step emit
}
However, this doesn't produce the desired result (with replication 2/4 it should place copies in the other rack as well, each copy on a different server). What I got is 2 replicas on servers in different racks, and the 2 additional copies were not created. I get this from Ceph:
pgs:     4/8 objects degraded (50.000%)
             1 active+undersized+degraded
and I see that only 2 OSDs are used, not 4! So I experimented and just changed it to this:
rule replicated-nvme {
	id 6
	type replicated
	step take default class nvme
	step choose firstn 0 type rack
	step chooseleaf firstn 0 type host
	step emit
}
and it works. Pool PGs are replicated to 4 OSDs across 2 racks (2 OSDs per rack). The only difference is chooseleaf firstn 0 type host instead of chooseleaf firstn 1 type host. The questions are:

- what is the difference between choose and chooseleaf?
- what is the meaning of the *number* after firstn?
- how is the hierarchy defined for the **steps** — what is checked first, and what after?

In short, I would like to know the syntax of CRUSH rules. Just for clarification: although the production cluster has an even number of hosts per room/rack and even replication rules, the object distribution is not super even, i.e. the PG distribution may differ by up to 10% per OSD. I suspect that the first rule defined above is wrong and that the distribution is more or less equal purely because of the large number of OSDs.
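One way to reason about such rules without touching a live pool is to test them offline with crushtool; a hedged sketch (the rule ID and replica count are taken from the question, the file name is a placeholder):

```
# Dump the current CRUSH map and see how rule 6 would place 4 replicas
ceph osd getcrushmap -o crushmap.bin
crushtool -i crushmap.bin --test --rule 6 --num-rep 4 --show-mappings
crushtool -i crushmap.bin --test --rule 6 --num-rep 4 --show-statistics
```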
dotokija (133 rep)
Aug 2, 2024, 08:51 AM • Last activity: Aug 9, 2024, 12:09 PM
1 vote
0 answers
282 views
My Ceph mon on one node fails and won't start
I have a Ceph cluster on 3 nodes that has been working for a year. I get a HEALTH_WARN about:

2 OSDs have spurious read errors
1/3 mons down, quorum ceph01,ceph03

I tried to start the mon on ceph02, but it is not working:

xxxxxxx@ceph02:~# systemctl status ceph-mon@ceph02
● ceph-mon@ceph02.service - Ceph cluster monitor daemon
     Loaded: loaded (/lib/systemd/system/ceph-mon@.service; enabled; vendor preset: enabled)
    Drop-In: /usr/lib/systemd/system/ceph-mon@.service.d
             └─ceph-after-pve-cluster.conf
     Active: active (running) since Sat 2024-02-03 12:27:49 CST; 5 months 12 days ago
   Main PID: 1450 (ceph-mon)
      Tasks: 24
     Memory: 3.4G
        CPU: 2w 4d 14h 10min 5.925s
     CGroup: /system.slice/system-ceph\x2dmon.slice/ceph-mon@ceph02.service
             └─1450 /usr/bin/ceph-mon -f --cluster ceph --id ceph02 --setuser ceph --setgroup ceph

Jul 17 12:17:16 ceph02 ceph-mon: 2024-07-17T12:17:16.574+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
Jul 17 12:17:31 ceph02 ceph-mon: 2024-07-17T12:17:31.590+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
Jul 17 12:17:46 ceph02 ceph-mon: 2024-07-17T12:17:46.603+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
Jul 17 12:18:01 ceph02 ceph-mon: 2024-07-17T12:18:01.615+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
Jul 17 12:18:16 ceph02 ceph-mon: 2024-07-17T12:18:16.627+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
Jul 17 12:18:31 ceph02 ceph-mon: 2024-07-17T12:18:31.644+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
Jul 17 12:18:46 ceph02 ceph-mon: 2024-07-17T12:18:46.660+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
Jul 17 12:19:01 ceph02 ceph-mon: 2024-07-17T12:19:01.672+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
Jul 17 12:19:16 ceph02 ceph-mon: 2024-07-17T12:19:16.685+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
Jul 17 12:19:31 ceph02 ceph-mon: 2024-07-17T12:19:31.697+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied

I did some googling about debugging it:

xxxxxx@ceph02:~# ceph tell mon.1 mon_status
Error ENXIO: problem getting command descriptions from mon.1

and tried:

sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph02.asok mon_status
ceph-mon -i ceph02 --debug_mon 10
ls /var/lib/ceph/mon/ceph-ceph02/

None of them have any output and there is no response. My system disk still has space and its health is OK, no errors. It looks like the folder storing the mon data on this node has some issue. Should I rm it, or just reboot the node?
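The repeated `handle_auth_bad_method ... Permission denied` messages usually point at an authentication mismatch between this mon and the rest of the quorum rather than a corrupted store. A hedged first check, assuming cephx is in use:

```
# Compare this mon's local keyring with the cluster's current mon key
cat /var/lib/ceph/mon/ceph-ceph02/keyring
ceph auth get mon.          # run on a node that still has quorum

# Also confirm the admin keyring used by the CLI on ceph02
cat /etc/ceph/ceph.client.admin.keyring
```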
Abe Xu (11 rep)
Jul 17, 2024, 04:32 AM • Last activity: Jul 17, 2024, 04:44 AM
0 votes
2 answers
453 views
Optimizing storage usage in Proxmox + CEPH cluster
My friend and I bought 3 Dell PowerEdge R740xd servers (128GB of RAM each) along with 11 SSDs of 1TB each and 14 HDDs of 14TB each. They are interconnected through 2 switches on two different networks via 1GB Ethernet interfaces. We are wrapping our heads around how to get the best out of the current storage inventory in setting up a decent Ceph-powered Proxmox cluster. Saying first that we have little to no background in this, this is our current arrangement for each server:

- 2 SSDs in RAID1 for mirroring Proxmox at the hardware level.
- 1 SSD for running containers and VMs.
- 4 HDDs for Ceph pools.
- The remaining 2 HDDs for Proxmox backup.
- God only knows what to use the remaining 2 SSDs for.

I have to say I don't agree with the RAID1 idea. Yes, you get 1-disk fault tolerance, but at the cost of around 4.8TB in SSD. The OS only requires a recommended space of 32GB per the Proxmox docs. Also, (and again) according to the docs, Ceph managers, monitors and MDS (in case we set up CephFS) perform heavy reads and writes per second, so I think they are best placed on SSDs. Regarding a shared library for sharing files among the 3 servers, I was wondering whether it was best to format a disk and share the filesystem using the NFS protocol (with NFS Ganesha?). From what I have read I concluded NFS is better than CephFS for this; it is a more robust, performant and battle-tested protocol. So my question is: if you were us, how would you make the best out of this storage for using Proxmox along with Ceph? Consider also that we want to use Proxmox Backup, you know, for backups.
d3vr10 (1 rep)
Jun 3, 2024, 01:34 AM • Last activity: Jul 2, 2024, 01:31 AM
0 votes
2 answers
4032 views
Purge disks after removing Ceph
I'm trying to remove Ceph totally from my servers. I released the OSDs from the server node, formatted the disks and created a new partition with parted, but I still see the Ceph partition inside the disks. I followed this procedure to remove the OSDs: https://docs.ceph.com/en/nautilus/rados/operations/add-or-rm-osds/#removing-osds-manual I need to release the disks and let CentOS use them by itself. What am I missing?
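A hedged sketch of fully wiping former OSD devices, assuming they were created with ceph-volume/LVM and that /dev/sdb is one of the disks in question (destructive, so double-check the device name):

```
# Remove the LVM structures ceph-volume created, then wipe signatures and GPT
ceph-volume lvm zap --destroy /dev/sdb
wipefs -a /dev/sdb
sgdisk --zap-all /dev/sdb
```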
fth (101 rep)
Dec 29, 2020, 08:49 AM • Last activity: Jun 19, 2024, 11:24 PM
1 vote
0 answers
379 views
How to isolate cpu cores, even from kernel space, at boot?
I have a faulty Ryzen 5900X desktop CPU. Previously, I somewhat tamed its faulty cores via the isolcpus=2,3,14,15 kernel parameter in GRUB2 (see https://blog.cbugk.com/post/ryzen-5850x/). However, on Proxmox 8.2 I have set up a **Ceph** cluster. It had crippling performance of around 2 MB/s. Redoing the cluster got **20 MB/s** while cloning a template VM. I was suspecting my use of second-hand enterprise SSDs, but even fresh ones did it (with or without an NVMe DB cache). But when I checked my faulty cores (2,3,14,15), they were being used. The moment I turn off the computer with the 5900X, the transfer speed jumps to around **100 MB/s** on the remaining two nodes. Networking is 10G between each node; iperf previously had shown 6G throughput, ~~it cannot be the bottleneck.~~ **It was the damn cabling.** Some duckduckgo-ing later I found out isolcpus= works for user space but not for **kernel space**. watch -n1 -- "ps -axo psr,pcpu,uid,user,pid,tid,args --sort=psr | grep -e '^ 2 ' -e '^ 3 ' -e '^ 14 ' -e '^ 15'" (source) gives:
  2  0.0     0 root          27      27 [cpuhp/2]
  2  0.0     0 root          28      28 [idle_inject/2]
  2  0.3     0 root          29      29 [migration/2]
  2  0.0     0 root          30      30 [ksoftirqd/2]
  2  0.0     0 root          31      31 [kworker/2:0-events]
  2  0.0     0 root         192     192 [irq/26-AMD-Vi]
  2  0.0     0 root         202     202 [kworker/2:1-events]
  3  0.0     0 root          33      33 [cpuhp/3]
  3  0.0     0 root          34      34 [idle_inject/3]
  3  0.3     0 root          35      35 [migration/3]
  3  0.0     0 root          36      36 [ksoftirqd/3]
  3  0.0     0 root          37      37 [kworker/3:0-events]
  3  0.0     0 root         203     203 [kworker/3:1-events]
 14  0.0     0 root          99      99 [cpuhp/14]
 14  0.0     0 root         100     100 [idle_inject/14]
 14  0.3     0 root         101     101 [migration/14]
 14  0.0     0 root         102     102 [ksoftirqd/14]
 14  0.0     0 root         103     103 [kworker/14:0-events]
 14  0.0     0 root         210     210 [kworker/14:1-events]
 15  0.0     0 root         105     105 [cpuhp/15]
 15  0.0     0 root         106     106 [idle_inject/15]
 15  0.3     0 root         107     107 [migration/15]
 15  0.0     0 root         108     108 [ksoftirqd/15]
 15  0.0     0 root         109     109 [kworker/15:0-events]
 15  0.0     0 root         211     211 [kworker/15:1-events]
Since Ceph uses a kernel driver, I need a way to isolate cores from the whole system. Running PID 1 onwards in a taskset is okay. I cannot use cset due to cgroups v2. numactl is also okay. With isolcpus I do not have apparent system stability issues; without it I would face secure connection errors in Firefox and OS installs would fail. But even that is not enough when using Ceph, and I now conclude that it could corrupt data unnoticed if this weren't my homelab machine. Can anyone suggest a way to **effectively ban these faulty threads as soon as the system allows**, permanently? (I'd better use the phrase CPU affinity in the post.) --- I was wrong: having redone the Cat6 cables at just the right length, and having cleared the power cables earlier, I can state that interference should be quite a bit lower than before. The same error was there when I disabled half the cores in the BIOS, including the faulty ones. I get instant VM clones on the Ceph pool now, thanks to the NVMe DB cache I suppose. Also, the kernel threads on those cores are the ones used for scheduling processes; their PIDs and the set of threads on those cores stay constant with the above watch command even during a VM clone on the Ceph pool. So if no tasks are being scheduled there, it might be working as intended. I found these tangentially relevant readings interesting: migration - reddit, nohz - lwn.net
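One approach that takes cores away from the kernel as well as user space is to offline them entirely through sysfs; a minimal sketch (the core list is taken from the question, and how you make it persistent, e.g. an early-boot unit, is up to you):

```
# Offline the faulty SMT threads; their per-CPU kernel threads (ksoftirqd/N, kworker/N:*) go away with them
for c in 2 3 14 15; do
    echo 0 > /sys/devices/system/cpu/cpu$c/online
done

# Verify which CPUs remain schedulable
cat /sys/devices/system/cpu/online
```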
cbugk (446 rep)
May 13, 2024, 11:57 PM • Last activity: May 14, 2024, 10:52 PM
3 votes
1 answer
1442 views
Ceph for small files?
Currently I have 6 dedicated servers in a data center. Two servers are mail servers running Exim and Dovecot (Maildir), and 4 are web servers. Each server has two 3TB HDDs. My current problem is that we now have a video production team and they need storage, probably scalable storage. Currently they have to check which server has enough free space, and that's what I want to solve. So my idea is to use Ceph for two things. First, to create a failover solution for the mail and web servers: if a server fails, the load balancer just switches to another server where the files are also available. Second, to get scalable storage for the video files, so the video team doesn't have to care about file size. They have their file structure on a single "machine" and can work with their files there, and if I need more storage, I just rent another dedicated server and add it to the "cluster". That's why I'd like to ask whether Ceph is a good idea for this, or do you have a better suggestion?
user39063 (201 rep)
Jul 6, 2018, 10:12 AM • Last activity: Apr 9, 2024, 03:27 PM
0 votes
1 answer
197 views
cephadm - how to separate ssh network from monitor network
In my company, for several years we were using Ceph with ceph-ansible as the deployer (and for upgrades, scale operations, etc). Recently I was assigned to migrate to 'cephadm' for installation and for day-2 operations too. While doing a POC, I experienced 2 issues, one of them more acute than the other:

1. We have different separated networks that were relevant for ceph-ansible:

   a. provisioning network - used for ssh and for running tasks remotely on Ceph-related hosts (nodes with mons/osds/clients)
   b. public network - used for nodes that host the mons, mgrs, mdss. These addresses are **not ssh-able**. Our Ceph clusters worked this way perfectly.
   c. cluster network - used for internal Ceph traffic like heartbeat, replication, etc. Also not ssh-able.

   So with cephadm, when bootstrapping, it forces me to 'combine' the public network and the provisioning network. In other words, unless I allow this network to be ssh-able (which for security reasons we prefer not to do), the bootstrap command will fail with the message below. I couldn't find a way to install the Ceph cluster with separate networks for ssh and for Ceph purposes (public network for monitors):

   /usr/bin/ceph: stderr File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 225, in raise_if_exception
   /usr/bin/ceph: stderr e = pickle.loads(c.serialized_exception)
   /usr/bin/ceph: stderr TypeError: __init__() missing 2 required positional arguments: 'hostname' and 'addr'
   /usr/bin/ceph: stderr ERROR: Failed to add host : Failed command: /usr/bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph --init -e CONTAINER_IMAGE=/ceph/daemon:quincy-rockylinux-8-x86_64 -e NODE_NAME= -e CEPH_USE_RANDOM_NONCE=1 -v /var/log/ceph/a0a19cd2-44ec-11ee-a922-ec0d9a94e986:/var/log/ceph:z -v /tmp/ceph-tmpb0u6hlv7:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpooy56ocy:/etc/ceph/ceph.conf:z /ceph/daemon:quincy-rockylinux-8-x86_64 orch host add

2. We used the original Ceph services with names like 'ceph-mon@hostname.service'. With cephadm, each service and each container name has to have the fsid as part of the name. I tried searching for where this could be changed but didn't find anything.
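On the first point, cephadm does distinguish the host address it uses for SSH from the Ceph networks to some extent; a hedged sketch of the relevant knobs (addresses and network ranges are placeholders, and whether this fully separates the networks in your topology is exactly the open question):

```
# The mon/public network and the cluster network are ordinary config options
ceph config set mon public_network 10.10.0.0/24
ceph config set global cluster_network 10.20.0.0/24

# Hosts are added with an explicit address, which is the one cephadm uses for SSH
ceph orch host add host1 192.168.50.11
```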
Itay R (1 rep)
Aug 27, 2023, 03:45 PM • Last activity: Aug 28, 2023, 09:31 AM
1 vote
1 answer
98 views
is it possible to run a ceph rbd on erasure coded ceph pool, without a separate replicated metadata pool?
I'm new to Ceph, so forgive me if this is common knowledge, but I can't find it. This seems like a simple question, but I can't find any solid answer. In 2017, when RBD on EC pools was first implemented, you had to have a separate replicated pool to store the RBD metadata, and then you could store the actual data on the EC pool. Is this still true, or is there nowadays some way to store the metadata in the same EC pool so I don't have to manage two pools to make an RBD?
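For context, the two-pool arrangement the question describes is expressed with the --data-pool option; a hedged sketch (pool and image names are placeholders):

```
# Image metadata lives in the replicated pool, object data goes to the EC pool
rbd create --size 100G --data-pool ec_pool rbd_replicated/myimage
```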
stu (143 rep)
Aug 4, 2023, 01:24 AM • Last activity: Aug 11, 2023, 02:51 PM
1 vote
1 answer
129 views
Can't remove ceph xattrs on linux
I had set an xattr for quota limits on CephFS:

$ setfattr -n ceph.quota.max_bytes -v 1100000000 /mnt/cephfs/data/

I can get the value of this attribute:

$ getfattr -n ceph.quota.max_bytes /mnt/cephfs/data/
getfattr: Removing leading '/' from absolute path names
# file: mnt/cephfs/data/
ceph.quota.max_bytes="1100000000"

But when I try to remove the quota, I get:

$ setfattr -x ceph.quota.max_bytes /mnt/cephfs/data/
setfattr: /mnt/cephfs/data/: No such attribute

How can I remove this xattr?
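For reference, CephFS quota attributes are virtual xattrs, and the documented way to clear a quota is to set it to 0 rather than remove the attribute:

```
# Setting the limit to 0 disables the quota on that directory
setfattr -n ceph.quota.max_bytes -v 0 /mnt/cephfs/data/
```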
Dan B (11 rep)
Jun 4, 2023, 09:29 AM • Last activity: Jun 21, 2023, 08:07 AM
4 votes
3 answers
1962 views
How to delete an invalid osd in a ceph cluster?
[root@dev-master ceph-cluster]# ceph osd tree
ID WEIGHT  TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.01740 root default
-4 0.00580     host osd2
 0 0.00580         osd.0     down        0          1.00000
-5 0.00580     host osd3
 1 0.00580         osd.1     down        0          1.00000
-6 0.00580     host osd1
 2 0.00580         osd.2     down        0          1.00000
 5       0 osd.5                up        0          1.00000

[root@dev-master ceph-cluster]# ceph osd out 5
osd.5 is already out.
[root@dev-master ceph-cluster]# ceph osd crush remove osd.5
device 'osd.5' does not appear in the crush map
[root@dev-master ceph-cluster]# ceph auth del osd.5
entity osd.5 does not exist
[root@dev-master ceph-cluster]# ceph osd rm 5
Error EBUSY: osd.5 is still up; must be down before removal.

But I could not find osd.5 on any host.
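A hedged sketch of the usual way out when no daemon can be found for the stale entry: force the map entry down, then remove it; on Luminous (12.x) and later a single purge command wraps the whole sequence:

```
# Mark the entry down in the OSD map, then remove it
ceph osd down 5
ceph osd rm 5

# On Luminous and newer, purge combines crush remove + auth del + rm
ceph osd purge 5 --yes-i-really-mean-it
```

If the entry flips back to up, an osd.5 daemon is still running and reporting in from somewhere; `ceph osd find 5` prints the address it last reported from.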
Inuyasha (41 rep)
Dec 21, 2016, 07:07 PM • Last activity: Apr 11, 2023, 08:24 PM
0 votes
0 answers
580 views
Ceph non-replicated pool (replication 1)
I have a 10 node cluster. I want to create a non-replicated pool (replication 1) and I want to ask some questions about it. Let me tell you my use case:

- I don't care about losing data,
- All of my data is JUNK, and these junk files are usually between 1KB and 32MB.
- These files will be deleted in 5 days.
- Writable space and I/O speed are more important.
- I have high Write/Read/Delete operations, minimum 200GB a day.

I'm afraid that, in any failure, I won't be able to access the whole cluster. Losing data is okay, but I have to be able to ignore missing files, remove the lost data from the cluster and continue with the existing data, and while doing this, I want to be able to write new data to the cluster.

My questions are:

1. To reach this goal, do you have any recommendations?
2. With this setup, what potential problems do you have in mind?
3. I think Erasure Coding is not an option because of the performance problems and slow file deletion. With this I/O need, EC will miss files and leaks may happen (I've seen it before on Nautilus).
4. You read my needs; is there a better way to do this? Maybe an alternative to Ceph?

Thank you for the answers. Best regards.
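For reference, the mechanics of creating such a pool are straightforward; recent releases just require an explicit opt-in before size 1 is accepted. A hedged sketch (pool name and PG count are placeholders):

```
# Allow size-1 pools (required on Octopus and later), then create and shrink the pool
ceph config set global mon_allow_pool_size_one true
ceph osd pool create junk 128 128 replicated
ceph osd pool set junk size 1 --yes-i-really-mean-it
```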
Ozbit (439 rep)
Apr 10, 2023, 08:13 PM • Last activity: Apr 10, 2023, 10:30 PM
0 votes
1 answer
398 views
DRBD on top of Ceph
Would it be possible to have DRBD running directly inside a Ceph pool? I have a backup machine with files stored directly on disk. The offsite backup machine has Ceph installed and configured on all the disks. I would like to have a second replica of the backup data on the offsite backup machine, but I'm a bit confused about which 'layers' DRBD and Ceph operate at. Would it be possible to create an RBD pool on the offsite backup machine and configure DRBD directly on that, or do I need to go the route where I run a virtual machine using Ceph and configure DRBD inside the virtual machine as an abstraction layer?

Edit: The reason the (single node) offsite backup machine is running Ceph is that it is mirroring the pools of a (multi node) main Ceph cluster. In addition to the main Ceph cluster, we have a backup server creating file backups of the machines running on the cluster. This is a simple RAID5 configuration where the data is stored. To have an extra copy of the backup data I also want to sync it to the offsite backup machine, using DRBD so that I do not have a problem with small files. But as the disks of the backup machine are already configured as Ceph OSDs, I need to store it somehow in a Ceph pool.
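Conceptually, DRBD replicates a block device and RBD provides one, so one sketch of the layering (without judging whether it is a good idea) is to map an RBD image on the offsite machine and point DRBD's backing disk at it. Resource name, image name and device paths below are assumptions:

```
# On the offsite machine: create and map an RBD image to use as DRBD backing storage
rbd create backup/drbd-backing --size 2T
rbd map backup/drbd-backing        # e.g. /dev/rbd0

# Minimal DRBD resource stanza using the mapped device as its lower-level disk
# (hostnames and addresses are placeholders; goes in /etc/drbd.d/r0.res)
# resource r0 {
#   device    /dev/drbd0;
#   disk      /dev/rbd0;
#   meta-disk internal;
#   on offsite  { address 192.168.0.2:7789; }
#   on backup01 { address 192.168.0.1:7789; }
# }
```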
Mr. Diba (400 rep)
Feb 14, 2023, 09:50 AM • Last activity: Feb 15, 2023, 03:45 PM
Showing page 1 of 20 total questions