Unix & Linux Stack Exchange
Q&A for users of Linux, FreeBSD and other Unix-like operating systems
Latest Questions
2
votes
1
answers
40
views
How to obtain a RHEL 8 repo that contains ceph-fuse and install it?
We have planned a Ceph upgrade to Reef 18.2.7, and the release notes say that client kernel drivers older than kernel 5.4 are not supported.
There is one RHEL 8 machine with kernel 4.18. I tried to mount CephFS anyway, and it really doesn't work.
I get the message:
mount: /mnt: wrong fs type, bad option, bad superblock on 192.168.22.101,192.168.22.100,192.168.22.102:/backup, missing codepage or helper program, or other error.
So the only option, other than upgrading to RHEL 9 (this machine is not in my department; it is a Bacula backup machine), is to use **ceph-fuse**.
However, my colleague wasn't able to install it either with the repo rhceph-5-tools-for-rhel-8-x86_64-rpms or with the repo rhceph-6-tools-for-rhel-8-x86_64-rpms.
This is actually described here: https://docs.redhat.com/en/documentation/red_hat_ceph_storage/8/html-single/file_system_guide/index#mounting-the-ceph-file-system-as-a-fuse-client_fs , but it seems that this repo doesn't exist.
I tried the ceph-fuse RPM downloaded from https://download.ceph.com/ , but there are so many missing dependencies that this is not feasible.
Does anyone know a method to install ceph-fuse on RHEL 8?
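For illustration only: once ceph-fuse is available from whichever repository turns out to work, the userspace mount that replaces the failing kernel mount would look roughly like this (the monitor addresses and the /backup path are taken from the error above; the keyring path is an assumption):
# FUSE mount of the /backup subtree, bypassing the 4.18 kernel client
ceph-fuse -n client.admin -k /etc/ceph/ceph.client.admin.keyring -m 192.168.22.100,192.168.22.101,192.168.22.102 -r /backup /mnt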
dotokija
(133 rep)
Jul 18, 2025, 02:24 PM
• Last activity: Jul 18, 2025, 02:45 PM
1
votes
1
answers
1924
views
mount cephfs failed because of failure to load kernel module
I got a confusing issue in Docker, as below:
After installing Ceph successfully, I want to mount CephFS, but it fails:
[root@dbffa72704e4 ~]$ /bin/mount 172.17.0.4:/ /cephfs -t ceph -o name=admin,secretfile=/etc/ceph/admin.secret -v
failed to load ceph kernel module (1)
parsing options: rw,name=admin,secretfile=/etc/ceph/admin.secret
mount error 5 = Input/output error
But the Ceph-related kernel modules do exist:
[root@dbffa72704e4 ~]$ lsmod | grep ceph
ceph 327687 0
libceph 287066 1 ceph
dns_resolver 13140 2 nfsv4,libceph
libcrc32c 12644 3 xfs,libceph,dm_persistent_data
Check the Ceph state (I only set a data disk for the OSD):
[root@dbffa72704e4 ~]$ ceph -s
cluster:
id: 20f51975-303e-446f-903f-04e1feaff7d0
health: HEALTH_WARN
Reduced data availability: 128 pgs inactive
Degraded data redundancy: 128 pgs unclean
services:
mon: 2 daemons, quorum dbffa72704e4,5807d12f920e
mgr: dbffa72704e4(active), standbys: 5807d12f920e
mds: cephfs-1/1/1 up {0=5807d12f920e=up:creating}, 1 up:standby
osd: 0 osds: 0 up, 0 in
data:
pools: 2 pools, 128 pgs
objects: 0 objects, 0 bytes
usage: 0 kB used, 0 kB / 0 kB avail
pgs: 100.000% pgs unknown
128 unknown
[root@dbffa72704e4 ~]$ ceph version
ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
My container is based on centos:centos7.2.1511.
I saw some Ceph-related images on Docker Hub, so I think the above operation should be fine. Did I miss something important?
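Not a confirmed fix, just a sketch of the usual workaround: mount -t ceph relies on the *host* kernel, and an unprivileged container can neither load modules nor call mount(2). Loading the module on the Docker host and giving the container mount privileges (or falling back to the userspace client) is the common pattern; the image name below is the one from the question:
# on the Docker host: the ceph kernel client lives in the host kernel, not in the container
modprobe ceph
lsmod | grep '^ceph'
# start the container with enough privileges to call mount(2)
docker run -it --cap-add SYS_ADMIN centos:7.2.1511 /bin/bash
# or avoid the kernel module entirely and use the userspace client inside the container
# (requires --device /dev/fuse when starting the container)
ceph-fuse -m 172.17.0.4:6789 /cephfs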
daixiang0
(141 rep)
Nov 13, 2017, 05:39 AM
• Last activity: May 3, 2025, 08:04 AM
0
votes
1
answers
111
views
Is there a FUSE-based caching solution for selective prefetching from a remote filesystem?
I am working with a remote parallel file system (CephFS), mounted at /mnt/mycephfs/, which contains a large dataset of small files (200 GB+). My application trains on these files, but reading directly from /mnt/mycephfs/ is slow due to parallel file system contention and network latency.
I am looking for a FUSE-based solution that can:
1. Take a list of files required by the application.
2. Prefetch and cache these files into a local mount point (e.g., /mnt/prefetched/) without replicating the entire remote storage (as my local RAM and disk space are limited).
The desired behavior:
• If a file (e.g., /mnt/mycephfs/file) is already cached at /mnt/prefetched/file, it should be served from the cache.
• If not cached, the solution should fetch the file (along with other files from the prefetch list), cache it at /mnt/prefetched/, and then serve it from there.
Are there existing tools or frameworks that support this kind of selective caching and prefetching using FUSE?
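I'm not aware of a single FUSE tool that does exactly this, but as a sketch of the prefetch-and-fallback idea under the paths named above (the file-list name is hypothetical): populate the local cache from the list with rsync, and fall back to the CephFS mount on a cache miss:
# prefetch the listed files (paths relative to /mnt/mycephfs/) into the local cache
rsync -a --files-from=prefetch_list.txt /mnt/mycephfs/ /mnt/prefetched/
# read a file from the cache if present, otherwise straight from CephFS
f=relative/path/to/file
cat "/mnt/prefetched/$f" 2>/dev/null || cat "/mnt/mycephfs/$f"
A FUSE overlay such as catfs aims at the transparent-caching half of this, but it caches on first read rather than from a supplied list.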
H.Jamil
(31 rep)
Dec 11, 2024, 04:23 AM
• Last activity: Dec 11, 2024, 06:45 PM
1
votes
1
answers
1093
views
does CEPH RBD mount on Linux support boot device?
Does a Ceph RBD mount on Linux support being used as a boot device?
For RBD deployment, an example would be like this:
http://blog.programster.org/ceph-deploy-and-mount-a-block-device
Thomas G. Lau
(204 rep)
Nov 3, 2017, 01:01 AM
• Last activity: Nov 22, 2024, 03:49 PM
0
votes
0
answers
67
views
Mount error Ceph storage
I'm trying to mount CephFS on my host:
mount -t ceph -o name=admin,secretfile=/etc/ceph/admin.secret 192.168.1.88:/exports/data/ /rhev/data-center/mnt/
But I get an error:
mount error 2 = No such file or directory
This directory is definitely present.
drwxr-xr-x. 3 root root 4096 Oct 17 17:00 exports
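One hedged check: with the kernel client, mount error 2 usually refers to the path *inside* CephFS (here /exports/data), not the local mountpoint. Mounting the filesystem root first and looking for the path inside it shows which side is missing (the /mnt mountpoint below is just an example):
mount -t ceph -o name=admin,secretfile=/etc/ceph/admin.secret 192.168.1.88:/ /mnt
ls /mnt/exports/data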
Ceph info:
[root@host1 /]# ceph -s
cluster:
id: 4ef9a8f9-7ac2-4617-a5d8-c450d857973c
health: HEALTH_WARN
mons are allowing insecure global_id reclaim
clock skew detected on mon.host2, mon.host3
services:
mon: 3 daemons, quorum host1,host2,host3 (age 2h)
mgr: host1.home.dom(active, since 5h)
mds: 1/1 daemons up
osd: 3 osds: 3 up (since 5h), 3 in (since 25h)
data:
volumes: 1/1 healthy
pools: 3 pools, 81 pgs
objects: 24 objects, 594 KiB
usage: 107 MiB used, 150 GiB / 150 GiB avail
pgs: 81 active+clean
[root@host1 /]# systemctl status ceph-mds.target
● ceph-mds.target - ceph target allowing to start/stop all ceph-mds@.service instances at once
Loaded: loaded (/usr/lib/systemd/system/ceph-mds.target; enabled; vendor preset: enabled)
Active: active since Thu 2024-10-17 13:22:28 MSK; 5h 30min ago
Guamokolatokint
(123 rep)
Oct 17, 2024, 03:54 PM
0
votes
0
answers
33
views
Ovirt. Create domain POSIX Ceph
I am creating a POSIX domain, but there is an error when mounting the directory:
2024-10-11 13:52:20,164+0300 INFO (jsonrpc/4) [storage.storageServer] Creating directory
'/rhev/data-center/mnt/192.168.1.88:_' (storageServer:217)
2024-10-11 13:52:20,164+0300 INFO (jsonrpc/4) [storage.fileutils] Creating directory:
/rhev/data-center/mnt/192.168.1.88:_ mode: None (fileUtils:214)
2024-10-11 13:52:20,164+0300 INFO (jsonrpc/4) [storage.mount] mounting 192.168.1.88:/ at
/rhev/data-center/mnt/192.168.1.88:_ (mount:190)
2024-10-11 13:52:20,166+0300 INFO (jsonrpc/5) [api.host] START getAllVmStats()
from=::1,57074 (api:31)
2024-10-11 13:52:20,167+0300 INFO (jsonrpc/5) [api.host] FINISH getAllVmStats return=
{'status': {'code': 0, 'message': 'Done'}, 'statsList': (suppressed)} from=::1,57074 (api:37)
2024-10-11 13:52:20,293+0300 INFO (jsonrpc/4) [IOProcessClient] (Global) Starting client
(__init__:340)
2024-10-11 13:52:20,304+0300 INFO (ioprocess/24690) [IOProcess] (Global) Starting ioprocess
(__init__:465)
2024-10-11 13:52:20,305+0300 WARN (jsonrpc/4) [storage.oop] Permission denied for directory:
/rhev/data-center/mnt/192.168.1.88:_ with permissions:7 (outOfProcess:177)
2024-10-11 13:52:20,305+0300 INFO (jsonrpc/4) [storage.mount] unmounting /rhev/data-
center/mnt/192.168.1.88:_ (mount:198)
2024-10-11 13:52:20,342+0300 ERROR (jsonrpc/4) [storage.storageServer] Could not connect to
storage server (storageServer:75)
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/vdsm/storage/fileSD.py", line 64, in
validateDirAccess
getProcPool().fileUtils.validateAccess(dirPath)
File "/usr/lib/python3.6/site-packages/vdsm/storage/outOfProcess.py", line 178, in
validateAccess
raise OSError(errno.EACCES, os.strerror(errno.EACCES))
PermissionError: [Errno 13] Permission denied
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/vdsm/storage/storageServer.py", line 73, in
connect_all
con.connect()
File "/usr/lib/python3.6/site-packages/vdsm/storage/storageServer.py", line 241, in connect
six.reraise(t, v, tb)
File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise
raise value
File "/usr/lib/python3.6/site-packages/vdsm/storage/storageServer.py", line 234, in connect
self.getMountObj().getRecord().fs_file)
File "/usr/lib/python3.6/site-packages/vdsm/storage/fileSD.py", line 75, in validateDirAccess
raise se.StorageServerAccessPermissionError(dirPath)
vdsm.storage.exception.StorageServerAccessPermissionError: Permission settings on the
specified path do not allow access to the storage. Verify permission settings on the
specified storage path.: 'path = /rhev/data-center/mnt/192.168.1.88:_'
2024-10-11 13:52:20,343+0300 INFO (jsonrpc/4) [storage.storagedomaincache] Invalidating
storage domain cache (sdc:57)
How to fix it?

Guamokolatokint
(123 rep)
Oct 11, 2024, 12:29 PM
1
votes
1
answers
213
views
Mount CephFS error connection
I need to mount Ceph as file storage, but I get an error:
[root@rv31 ~]# mount -t ceph 192.168.1.88:/ /data/cephmount/ -o name=admin,secretfile=/etc/ceph/admin.secret
mount error 110 = Connection timed out
The firewall is turned off.
Ceph status:
[root@host1 ~]# ceph status
cluster:
id: 28f0f54f-10a0-442a-ab10-ab68381f56e3
health: HEALTH_WARN
mons are allowing insecure global_id reclaim
services:
mon: 3 daemons, quorum host1,host2,host3 (age 7h)
mgr: host1.home.dom(active, since 7h)
osd: 3 osds: 3 up (since 7h), 3 in (since 21h)
data:
pools: 2 pools, 33 pgs
objects: 905 objects, 3.4 GiB
usage: 10 GiB used, 50 GiB / 60 GiB avail
pgs: 33 active+clean
[root@host1 ~]# ceph orch host ls
Error ENOENT: No orchestrator configured (try ceph orch set backend)
Ceph.conf:
[global]
fsid = 28f0f54f-10a0-442a-ab10-ab68381f56e3
mon_initial_members = host1, host2, host3
mon_host = 192.168.1.88,192.168.1.89,192.168.1.90
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
[mds.a]
host = host1.home.dom
Other info:
[root@host1 ceph]# systemctl status ceph-mds.target
● ceph-mds.target - ceph target allowing to start/stop all ceph-
mds@.service instances at once
Loaded: loaded (/usr/lib/systemd/system/ceph-mds.target; enabled;
vendor preset: enabled)
Active: active since Wed 2024-09-25 08:26:56 MSK; 9h ago
[root@host1 ceph]# ceph mds stat
2 up:standby
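One thing the output above shows is 2 up:standby with no active MDS rank, which is what you get when no CephFS filesystem has been created yet. Purely as a sketch (pool names and PG counts are placeholders), creating a filesystem lets an MDS go active:
ceph osd pool create cephfs_data 32
ceph osd pool create cephfs_metadata 32
ceph fs new cephfs cephfs_metadata cephfs_data
ceph mds stat    # should now report an active rank instead of "2 up:standby"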


Guamokolatokint
(123 rep)
Sep 25, 2024, 06:43 AM
• Last activity: Sep 26, 2024, 11:46 AM
2
votes
0
answers
213
views
How can I create an S3 bucket on Ceph?
root@tuy:/# ceph --version
ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
Ceph is deployed on VMware.
I can create an S3 bucket from my Ceph dashboard:
root@tuy:/# radosgw-admin bucket list
[
"bucket1"
]
But I need to create it from the CLI. Is it possible?
There is a bucket rm subcommand, but no create function:
root@tuy:/# radosgw-admin bucket --help
bucket list list buckets (specify --allow-unordered for
faster, unsorted listing)
bucket limit check show bucket sharding stats
bucket link link bucket to specified user
bucket unlink unlink bucket from specified user
bucket stats returns bucket statistics
bucket rm remove bucket
bucket check check bucket index by verifying size and object count stats
bucket check olh check for olh index entries and objects that are pending removal
bucket check unlinked check for object versions that are not visible in a bucket listing
bucket chown link bucket to specified user and update its object ACLs
bucket reshard reshard bucket
bucket rewrite rewrite all objects in the specified bucket
bucket sync checkpoint poll a bucket's sync status until it catches up to its remote
bucket sync disable disable bucket sync
bucket sync enable enable bucket sync
bucket radoslist list rados objects backing bucket's objects
I found solutions for developers: https://docs.ceph.com/en/latest/radosgw/s3/bucketops/
I also found the ceph osd crush add-bucket command, but I don't understand what it is.
I need help.
I can't add the radosgw and s3 tags.
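For context, radosgw-admin has no bucket create subcommand because buckets are created through the S3 API itself. A minimal sketch with an RGW user and a generic S3 client (the endpoint URL is an assumption; point it at your RGW host and port):
# create an RGW user to obtain S3 access/secret keys
radosgw-admin user create --uid=cli-user --display-name="CLI user"
# feed those keys to any S3 client (aws configure, s3cmd --configure, ...) and create the bucket
aws --endpoint-url http://rgw-host:8080 s3 mb s3://bucket2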
tuytuy20
(115 rep)
Jul 22, 2024, 07:51 AM
• Last activity: Aug 22, 2024, 06:35 PM
1
votes
1
answers
604
views
Ceph crush rules explanation for multiroom/racks setup
I started recently with Ceph: I inherited one large cluster for maintenance and am now building a recovery cluster. By trial and error I managed to create CRUSH rules that fit my purpose, but I failed to understand the syntax of the CRUSH rule definitions. Could someone please explain (don't reference the Ceph docs, since they don't explain this)?
Here is my setup of production cluster:
20 hosts distributed across 2 rooms, 2 racks in each room, 5 servers per rack, 10 OSDs per host, 200 OSDs in total.
Someone wanted a super-safe setup, so replication is 2/4 and the rules are (supposedly) defined to replicate to the other room, 2 copies in each rack, 4 in total for every object.
Here is the rule:
rule replicated_nvme {
id 4
type replicated
min_size 1
max_size 100
step take default class nvme
step choose firstn 0 type room
step choose firstn 2 type rack
step chooseleaf firstn 1 type host
step emit
}
On my new cluster I have a smaller setup, just 2 racks with 2 servers in each for testing. I tried this, similar to the above, but without the room:
rule replicated-nvme {
id 6
type replicated
step take default class nvme
step choose firstn 0 type rack
step chooseleaf firstn 1 type host
step emit
}
However, this doesn't produce the desired result (with replication 2/4 it should put a copy in the other rack, each copy on a different server). What I got is 2 replicas on servers in different racks, and the 2 additional copies were not created. I get this from ceph:
pgs: 4/8 objects degraded (50.000%)
1 active+undersized+degraded
and I see that only 2 OSDs are used, not 4!
So, I played around and just changed it to this:
rule replicated-nvme {
id 6
type replicated
step take default class nvme
step choose firstn 0 type rack
step chooseleaf firstn 0 type host
step emit
}
and it works. The pool's PGs are replicated to 4 OSDs across 2 racks (2 OSDs per rack). The only difference is chooseleaf firstn 0 type host instead of chooseleaf firstn 1 type host.
The questions are:
- What is the difference between choose and chooseleaf?
- What is the meaning of the *number* after firstn?
- How is the hierarchy defined for the **steps**: what is evaluated first, what after?
In short, I would like to know the syntax of CRUSH rules.
Just for clarification: although the production cluster has an even number of hosts per room/rack and even replication rules, the object distribution is not perfectly even, i.e. PG distribution may differ by up to 10% per OSD.
I suspect that the 1st rule defined above is wrong and that it is only because of the large number of OSDs that the distribution is more or less equal.
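Not an answer to the syntax questions, but a way to test them empirically: crushtool can simulate a rule's placements offline before you inject the map, which makes the effect of choose vs chooseleaf and the firstn numbers visible (rule id 6 and 4 replicas are taken from the setup above):
ceph osd getcrushmap -o crush.bin
crushtool -d crush.bin -o crush.txt      # decompile; edit the rule in crush.txt
crushtool -c crush.txt -o crush.new      # recompile
crushtool -i crush.new --test --rule 6 --num-rep 4 --show-mappings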
dotokija
(133 rep)
Aug 2, 2024, 08:51 AM
• Last activity: Aug 9, 2024, 12:09 PM
1
votes
0
answers
282
views
My Ceph mon on one node fails and won't start
I have a Ceph cluster on 3 nodes that has been working for a year.
I get a HEALTH_WARN about:
2 OSD(s) have spurious read errors
1/3 mons down, quorum ceph01,ceph03
I tried to start the mon on ceph02, but it is not working.
xxxxxxx@ceph02:~# systemctl status ceph-mon@ceph02
● ceph-mon@ceph02.service - Ceph cluster monitor daemon
Loaded: loaded (/lib/systemd/system/ceph-mon@.service; enabled; vendor preset: enabled)
Drop-In: /usr/lib/systemd/system/ceph-mon@.service.d
└─ceph-after-pve-cluster.conf
Active: active (running) since Sat 2024-02-03 12:27:49 CST; 5 months 12 days ago
Main PID: 1450 (ceph-mon)
Tasks: 24
Memory: 3.4G
CPU: 2w 4d 14h 10min 5.925s
CGroup: /system.slice/system-ceph\x2dmon.slice/ceph-mon@ceph02.service
└─1450 /usr/bin/ceph-mon -f --cluster ceph --id ceph02 --setuser ceph --setgroup ceph
Jul 17 12:17:16 ceph02 ceph-mon: 2024-07-17T12:17:16.574+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
Jul 17 12:17:31 ceph02 ceph-mon: 2024-07-17T12:17:31.590+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
Jul 17 12:17:46 ceph02 ceph-mon: 2024-07-17T12:17:46.603+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
Jul 17 12:18:01 ceph02 ceph-mon: 2024-07-17T12:18:01.615+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
Jul 17 12:18:16 ceph02 ceph-mon: 2024-07-17T12:18:16.627+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
Jul 17 12:18:31 ceph02 ceph-mon: 2024-07-17T12:18:31.644+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
Jul 17 12:18:46 ceph02 ceph-mon: 2024-07-17T12:18:46.660+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
Jul 17 12:19:01 ceph02 ceph-mon: 2024-07-17T12:19:01.672+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
Jul 17 12:19:16 ceph02 ceph-mon: 2024-07-17T12:19:16.685+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
Jul 17 12:19:31 ceph02 ceph-mon: 2024-07-17T12:19:31.697+0800 7f1ccdd33700 -1 mon.ceph02@1(peon) e3 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
And I did some googling about how to debug it.
xxxxxx@ceph02:~# ceph tell mon.1 mon_status
Error ENXIO: problem getting command descriptions from mon.1
And tried:
sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph02.asok mon_status
ceph-mon -i ceph02 --debug_mon 10
ls /var/lib/ceph/mon/ceph-ceph02/
None of them produce any output or response.
My system disk still has free space and HEALTH is OK, no error.
It looks like the mon's data directory on this node has some issue.
Should I rm it, or just reboot the node?
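A hedged diagnostic sketch rather than a recommendation: handle_auth_bad_method ... Permission denied points at a key mismatch, so comparing the mon keyring on ceph02 with the mon. key the cluster expects is a reasonable first step. Removing and re-creating the monitor is a common recovery path but is destructive, so back up the mon directory first:
# on a healthy node (ceph01/ceph03): the key the cluster expects
ceph auth get mon.
# on ceph02: the key this monitor actually uses
cat /var/lib/ceph/mon/ceph-ceph02/keyring
# if they differ, a common path is to drop and re-create this mon
# (this looks like Proxmox, so pveceph mon destroy / pveceph mon create would be the wrappers)
ceph mon remove ceph02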
Abe Xu
(11 rep)
Jul 17, 2024, 04:32 AM
• Last activity: Jul 17, 2024, 04:44 AM
0
votes
2
answers
453
views
Optimizing storage usage in Proxmox + CEPH cluster
My friend and I bought 3 Dell PowerEdge R740xd servers (128 GB of RAM each) along with 11 SSDs of 1 TB each and 14 HDDs of 14 TB each. They are interconnected through 2 switches on two different networks via 1 Gb Ethernet interfaces.
We are wrapping our heads around how to get the best out of the current storage inventory in setting up a decent CEPH-powered Proxmox cluster.
With the caveat that we have little to no background in this, this is our current arrangement for each server:
2 SSDs in RAID1 for mirroring Proxmox at the hardware level.
1 SSD for running containers and VMs.
4 HDDs for Ceph pools.
The remaining 2 HDDs are used for Proxmox backups.
God only knows what to use the remaining 2 SSDs for.
I have to say I don't agree with the RAID1 idea. Yes, you get 1-disk fault tolerance, but at the cost of around 4.8 TB of SSD; the OS only requires a recommended 32 GB per the Proxmox docs.
Also (and again according to the docs), Ceph managers, monitors and MDS daemons (in case we set up CephFS) perform heavy reads and writes per second, so I think they are best placed on SSDs.
Regarding a shared library for sharing files among the 3 servers, I was wondering whether it would be best to format the disk and share the filesystem using the NFS protocol (with NFS-Ganesha?). From what I have read, I concluded that NFS is better than CephFS for this: it is a more robust, performant and battle-tested protocol.
So my question is: if you were us, how would you make the best out of this storage for using Proxmox along with Ceph? Consider also that we want to use Proxmox Backup, you know, for backups.
d3vr10
(1 rep)
Jun 3, 2024, 01:34 AM
• Last activity: Jul 2, 2024, 01:31 AM
0
votes
2
answers
4032
views
Purge disks after removing Ceph
I'm trying to remove Ceph entirely from my servers. I released the OSDs from the server node, formatted the disks and created a new partition with parted, but I still see the Ceph partition on the disks. I followed this procedure to remove the OSDs: https://docs.ceph.com/en/nautilus/rados/operations/add-or-rm-osds/#removing-osds-manual
I need to release the disks and let CentOS use them by itself.
What am I missing?
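A hedged sketch of fully clearing a former OSD disk so the OS can reuse it; /dev/sdX is a placeholder, and every command below destroys data on that device:
ceph-volume lvm zap /dev/sdX --destroy   # if ceph-volume is still installed: removes the OSD's LVs/VG and labels
wipefs -a /dev/sdX                       # drop remaining filesystem/LVM signatures
sgdisk --zap-all /dev/sdX                # clear GPT and MBR partition structures
partprobe /dev/sdX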
fth
(101 rep)
Dec 29, 2020, 08:49 AM
• Last activity: Jun 19, 2024, 11:24 PM
1
votes
0
answers
379
views
How to isolate cpu cores, even from kernel space, at boot?
I have a faulty Ryzen 5900X desktop CPU. Previously, I somewhat tamed its faulty cores via the isolcpus=2,3,14,15 kernel parameter in GRUB2 (see https://blog.cbugk.com/post/ryzen-5850x/).
However, on Proxmox 8.2, I have set up a **CEPH** cluster. It had crippling performance of around 2 MB/s. Redoing the cluster got **20 MB/s** while cloning a template VM. I suspected my use of second-hand enterprise SSDs, but even fresh ones did it (with or without an NVMe DB cache).
But when I checked my faulty cores (2,3,14,15), they were being used. The moment I shut down the computer with the 5900X, transfer speed jumps to around **100 MB/s** on the remaining two nodes. Networking is 10G between each node; iperf previously had shown 6G throughput, ~~it cannot be the bottle-neck.~~ **It was the damn cabling.**
Some duckduckgo-ing later, I found out that isolcpus= works for user space but not for **kernel space**.
watch -n1 -- "ps -axo psr,pcpu,uid,user,pid,tid,args --sort=psr | grep -e '^ 2 ' -e '^ 3 ' -e '^ 14 ' -e '^ 15'"
(source) gives:
2 0.0 0 root 27 27 [cpuhp/2]
2 0.0 0 root 28 28 [idle_inject/2]
2 0.3 0 root 29 29 [migration/2]
2 0.0 0 root 30 30 [ksoftirqd/2]
2 0.0 0 root 31 31 [kworker/2:0-events]
2 0.0 0 root 192 192 [irq/26-AMD-Vi]
2 0.0 0 root 202 202 [kworker/2:1-events]
3 0.0 0 root 33 33 [cpuhp/3]
3 0.0 0 root 34 34 [idle_inject/3]
3 0.3 0 root 35 35 [migration/3]
3 0.0 0 root 36 36 [ksoftirqd/3]
3 0.0 0 root 37 37 [kworker/3:0-events]
3 0.0 0 root 203 203 [kworker/3:1-events]
14 0.0 0 root 99 99 [cpuhp/14]
14 0.0 0 root 100 100 [idle_inject/14]
14 0.3 0 root 101 101 [migration/14]
14 0.0 0 root 102 102 [ksoftirqd/14]
14 0.0 0 root 103 103 [kworker/14:0-events]
14 0.0 0 root 210 210 [kworker/14:1-events]
15 0.0 0 root 105 105 [cpuhp/15]
15 0.0 0 root 106 106 [idle_inject/15]
15 0.3 0 root 107 107 [migration/15]
15 0.0 0 root 108 108 [ksoftirqd/15]
15 0.0 0 root 109 109 [kworker/15:0-events]
15 0.0 0 root 211 211 [kworker/15:1-events]
Since Ceph uses a kernel driver, I need a way to isolate cores from the whole system. Running everything from PID 1 onwards under a taskset is okay. I cannot use cset due to cgroups v2; numactl is also okay.
With isolcpus I do not have apparent system stability issues; without it I would face secure-connection errors in Firefox and OS installs would fail. But even that is not enough when using CEPH, and now I conclude that it could corrupt data unnoticed if this wasn't my homelab machine.
Can anyone suggest a way to **effectively ban these faulty threads as soon as the system allows**, permanently? (I had better use the phrase CPU affinity in the post.)
---
I was wrong: having redone the Cat6 cables at just the right length (and having routed them clear of power cables earlier), I can state that interference should now be much lower than before. The same error was there when I disabled half the cores in the BIOS, including the faulty ones. I get instant VM clones on the CEPH pool now, thanks to the NVMe DB cache I suppose.
Also, the kernel threads on those cores are the ones used for scheduling processes; their PIDs and the set of threads on those cores stay constant under the watch command above, even during a VM clone on the CEPH pool. So if no tasks are being scheduled there, it might be working as intended.
Found these tangentially relevant readings interesting: migration - reddit , nohz - lwn.net
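One approach worth mentioning (a sketch, not a guaranteed cure for faulty silicon): taking the cores offline through sysfs removes them from the scheduler entirely and parks their per-CPU kernel threads (ksoftirqd, migration, kworker), which goes further than isolcpus or any affinity mask:
# offline the faulty SMT siblings; their per-CPU kthreads are parked, not scheduled
for c in 2 3 14 15; do echo 0 > /sys/devices/system/cpu/cpu$c/online; done
cat /sys/devices/system/cpu/online   # verify, e.g. 0-1,4-13,16-23
To make it permanent, a small oneshot systemd unit running that loop early in boot (before the Ceph OSD services start) would do.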
cbugk
(446 rep)
May 13, 2024, 11:57 PM
• Last activity: May 14, 2024, 10:52 PM
3
votes
1
answers
1442
views
Ceph for small files?
Currently I have 6 dedicated servers in a data center: two mail servers running Exim and Dovecot (Maildir) and 4 web servers. Each server has two 3 TB HDDs.
My current problem is that we now have a video production team and they need storage, probably scalable storage. Currently they have to check which server has enough free space, and that's what I want to solve.
So my idea is to use Ceph for two things. First, to create a failover solution for the mail and web servers: if a server fails, the load balancer just switches to another server where the files are also available.
The second is to get scalable storage for the video files, so the video team doesn't have to care about file size. They have their file structure on a single machine and can work with their files on this "machine". And if I need more storage, I just rent another dedicated server and add it to the "cluster".
That's why I'd like to ask whether Ceph is a good idea for this. Or do you have another, better suggestion?
user39063
(201 rep)
Jul 6, 2018, 10:12 AM
• Last activity: Apr 9, 2024, 03:27 PM
0
votes
1
answers
197
views
cephadm - how to separate ssh network from monitor network
In my company, for several years we were using Ceph with ceph-ansible as the deployer (and for upgrade and scale operations, etc.). Recently I was assigned to migrate to 'cephadm' for installation and for day-2 operations too.
While doing a PoC, I experienced 2 issues, one of them more acute than the other:
1. We have different separated networks that were relevant for ceph-ansible:
a. provisioning network, used for ssh and for running tasks remotely at ceph-related hosts (nodes with mons/osds/clients)
b. public network - used for nodes that host the mons,mgrs,mdss. These addresses are **not ssh-able**. Our Ceph clusters worked this way perfectly.
c. cluster network - used for internal ceph traffic like heartbeat, replication, etc. Also not ssh-able.
So with cephadm, when bootstrapping, it forces me to 'combine' the public network and the provisioning network. In other words, unless I allow this network to be SSH-able (which for security reasons we prefer not to do), the bootstrap command will fail with the message below. I couldn't find a way to install the Ceph cluster with separate networks for SSH and for Ceph purposes (public network for monitors):
/usr/bin/ceph: stderr File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 225, in raise_if_exception
/usr/bin/ceph: stderr e = pickle.loads(c.serialized_exception)
/usr/bin/ceph: stderr TypeError: __init__() missing 2 required positional arguments: 'hostname' and 'addr'
/usr/bin/ceph: stderr
ERROR: Failed to add host : Failed command: /usr/bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph --init -e CONTAINER_IMAGE=/ceph/daemon:quincy-rockylinux-8-x86_64 -e NODE_NAME= -e CEPH_USE_RANDOM_NONCE=1 -v /var/log/ceph/a0a19cd2-44ec-11ee-a922-ec0d9a94e986:/var/log/ceph:z -v /tmp/ceph-tmpb0u6hlv7:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpooy56ocy:/etc/ceph/ceph.conf:z /ceph/daemon:quincy-rockylinux-8-x86_64 orch host add
2. We used the original Ceph service names like 'ceph-mon@hostname.service'. With cephadm, each service and each container name has to have the fsid as part of the name. I tried searching for where this could be changed but didn't find anything.
Itay R
(1 rep)
Aug 27, 2023, 03:45 PM
• Last activity: Aug 28, 2023, 09:31 AM
1
votes
1
answers
98
views
is it possible to run a ceph rbd on erasure coded ceph pool, without a separate replicated metadata pool?
I'm new to Ceph, so forgive me if this is common knowledge, but I can't find it. This seems like a simple question, but I can't find any solid answer. In 2017, when RBD on EC pools was first implemented, you had to have a separate replicated pool to store the RBD metadata, and then you could store the actual data on the EC pool.
Is this still true, or is there nowadays some way to store the metadata in the same EC pool so I don't have to manage two pools to make an RBD?
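As far as I know the split is still required, because EC pools cannot hold the omap metadata an RBD image needs; the documented pattern remains a replicated pool for the image header plus --data-pool for the bulk data. A minimal sketch with hypothetical pool names:
ceph osd pool set ec_pool allow_ec_overwrites true      # required for RBD data on an EC pool
rbd create --size 100G --data-pool ec_pool replicated_pool/myimage
# the image "lives" in replicated_pool (header/omap); its data objects land in ec_pool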
stu
(143 rep)
Aug 4, 2023, 01:24 AM
• Last activity: Aug 11, 2023, 02:51 PM
1
votes
1
answers
129
views
Can't remove ceph xattrs on linux
I had set xattrs for quota limits on CephFS
$ setfattr -n ceph.quota.max_bytes -v 1100000000 /mnt/cephfs/data/
I can get the value of this attribute:
$ getfattr -n ceph.quota.max_bytes /mnt/cephfs/data/
getfattr: Removing leading '/' from absolute path names
# file: mnt/cephfs/data/
ceph.quota.max_bytes="1100000000"
But when I try to remove the quota, I get:
$ setfattr -x ceph.quota.max_bytes /mnt/cephfs/data/
setfattr: /mnt/cephfs/data/: No such attribute
How can I remove this xattr?
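For reference, the CephFS quota xattrs are virtual attributes, which is why setfattr -x reports "No such attribute"; per the CephFS quota documentation, the limit is cleared by setting it to 0:
# clearing the quota = setting the limit to 0, not removing the xattr
setfattr -n ceph.quota.max_bytes -v 0 /mnt/cephfs/data/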
Dan B
(11 rep)
Jun 4, 2023, 09:29 AM
• Last activity: Jun 21, 2023, 08:07 AM
4
votes
3
answers
1962
views
How to delete an invalid OSD in a Ceph cluster?
[root@dev-master ceph-cluster]# ceph osd tree
ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.01740 root default
-4 0.00580 host osd2
0 0.00580 osd.0 down 0 1.00000
-5 0.00580 host osd3
1 0.00580 osd.1 down 0 1.00000
-6 0.00580 host osd1
2 0.00580 osd.2 down 0 1.00000
5 0 osd.5 up 0 1.00000
[root@dev-master ceph-cluster]# ceph osd out 5
osd.5 is already out.
[root@dev-master ceph-cluster]# ceph osd crush remove osd.5
device 'osd.5' does not appear in the crush map
[root@dev-master ceph-cluster]# ceph auth del osd.5
entity osd.5 does not exist
[root@dev-master ceph-cluster]# ceph osd rm 5
Error EBUSY: osd.5 is still up; must be down before removal.
But I could not find osd.5 on any host.
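A hedged sequence for a phantom entry like osd.5 above: mark it down first so rm stops refusing; on Luminous (12.x) and later, a single purge wraps the crush-remove, auth-del and rm steps:
ceph osd down 5
ceph osd rm 5
# Luminous and newer only:
ceph osd purge 5 --yes-i-really-mean-it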
Inuyasha
(41 rep)
Dec 21, 2016, 07:07 PM
• Last activity: Apr 11, 2023, 08:24 PM
0
votes
0
answers
580
views
Ceph non-replicated pool (replication 1)
I have a 10 node cluster. I want to create a non-replicated pool (replication 1) and I want to ask some questions about it:
Let me tell you my use case:
- I don't care about losing data,
- All of my data is JUNK and these junk files are usually between 1KB to 32MB.
- These files will be deleted in 5 days.
- Writable space and I/O speed is more important.
- I have high Write/Read/Delete operations, minimum 200GB a day.
I'm afraid that, in the event of a failure, I won't be able to access the whole cluster. Losing data is okay, but I have to be able to ignore missing files, remove the lost data from the cluster and continue with the existing data, and while doing this, I want to be able to write new data to the cluster.
My questions are:
1. To reach this goal, do you have any recommendations?
2. With this setup, what potential problems do you have in mind?
3. I think erasure coding is not an option because of the performance problems and slow file deletion. With this I/O load, EC will miss files and leaks may happen (I've seen this before on Nautilus).
4. You have read my needs; is there a better way to do this? Maybe an alternative to Ceph?
Thank you for the answers.
Best regards.
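For what it's worth, the replication-1 pool itself is only a few commands (the pool name and PG count are placeholders; Octopus and newer additionally gate this behind mon_allow_pool_size_one):
ceph config set global mon_allow_pool_size_one true     # Octopus+ only
ceph osd pool create junk 128 128 replicated
ceph osd pool set junk size 1 --yes-i-really-mean-it
ceph osd pool set junk min_size 1
With size 1, any OSD failure leaves its PGs unavailable until the OSD comes back or the PGs are recreated, so the "ignore missing files and carry on" handling has to be planned around that.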
Ozbit
(439 rep)
Apr 10, 2023, 08:13 PM
• Last activity: Apr 10, 2023, 10:30 PM
0
votes
1
answers
398
views
DRBD on top of Ceph
Would it be possible to have DRBD running directly inside a Ceph pool?
I have a backup machine with files stored directly on disk. The offsite backup machine has Ceph installed and configured on all of its disks.
I would like to have a second replica of the backup data on the offsite backup machine, but I'm a bit confused about which 'layers' DRBD and Ceph operate at. Would it be possible to create an RBD pool on the offsite backup machine and configure DRBD directly on that, or do I need to go the route where I run a virtual machine using Ceph and configure DRBD inside the virtual machine as an abstraction layer?
Edit:
The reason the (single-node) offsite backup machine is running Ceph is that it is mirroring the pools of a (multi-node) main Ceph cluster.
In addition to the main Ceph cluster, we have a backup server creating file backups of the machines running on the cluster. This is a simple RAID5 configuration where the data is stored.
To have an extra copy of the backup data, I also want to sync it to the offsite backup machine using DRBD, so that I do not have a problem with small files. But as the disks of the backup machine are already configured as Ceph OSDs, I need to store it somehow in a Ceph pool.
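On the layering question: DRBD replicates block devices, and a mapped RBD image *is* a block device, so one sketch (no VM needed) is to back the offsite half of a DRBD resource with a mapped RBD image. All names, sizes and addresses below are made up:
# on the offsite machine: carve a block device out of the Ceph pool and map it
rbd create backup-pool/drbd-backing --size 4T
rbd map backup-pool/drbd-backing          # shows up as e.g. /dev/rbd0
# DRBD resource: raw disk on the backup server, mapped RBD on the offsite node
cat > /etc/drbd.d/backup.res <<'EOF'
resource backup {
  on backupsrv {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   192.0.2.1:7789;
    meta-disk internal;
  }
  on offsite {
    device    /dev/drbd0;
    disk      /dev/rbd0;
    address   192.0.2.2:7789;
    meta-disk internal;
  }
}
EOF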
Mr. Diba
(400 rep)
Feb 14, 2023, 09:50 AM
• Last activity: Feb 15, 2023, 03:45 PM