Unix & Linux Stack Exchange

Q&A for users of Linux, FreeBSD and other Unix-like operating systems

Latest Questions

2 votes

1 answers

2080 views

Setup of TFTP to boot OS from PXE

I am trying to boot a system from the network using PXE. I am almost done, but I have a problem.
When I boot my client from PXE, I get this on the client's screen:

    PXE-E11: ARP timeout
    PXE-E38: TFTP cannot open connection

There are some attempts on the Internet to fix it, but nothing concrete, so I will describe my case in detail.

In the file /etc/default/atftpd I changed
**USE_INETD=true**
to
**USE_INETD=false**

Then in the file /etc/default/tftpd-hpa I put TFTP_DIRECTORY="/srv/tftp", because /srv/tftp was at the end of the file /etc/default/atftpd.

Then I ran sudo /etc/init.d/atftpd start.

Finally, I typed sudo mount -o loop /home/tux/ubuntu16-Desktop.iso /srv/tftp/ubuntu/ and was done with TFTP, but it does not work.

Do you have any ideas how to fix it?
                                

John (297 rep)

Nov 15, 2016, 08:20 PM • Last activity: May 7, 2025, 09:02 PM

2 votes

1 answers

59 views

Cannot set valid_lft to forever in Ubuntu 24.04 with diskless network boot, causes freeze

ubuntu freeze netboot

After starting Ubuntu 24.04 over the network, I have a problem where `valid_lft`, after going down to 0, causes the system to freeze.

`valid_lft` can be checked by executing the command `ip a`.  
How can I set the `valid_lft` parameter to `forever`? I don't want to execute additional commands; I just want to set it in Linux. I don't see such an option in dhcpcd.

In Ubuntu 22.04, started over the network, the default value of `valid_lft` is set to `forever`.
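
To watch the lifetime described above without reading `ip a` by eye, the value can be pulled out of the one-line output of `ip -o -4 addr show`. The following is a minimal Python sketch; the `SAMPLE` text, the interface names, and the helper name `parse_valid_lft` are illustrative assumptions, not output from the asker's machine.

```python
import re
from typing import Dict

# Hypothetical sample in the one-line format printed by "ip -o -4 addr show".
SAMPLE = (
    "1: lo    inet 127.0.0.1/8 scope host lo\\       valid_lft forever preferred_lft forever\n"
    "2: eth0    inet 192.168.1.50/24 brd 192.168.1.255 scope global dynamic eth0\\"
    "       valid_lft 86394sec preferred_lft 86394sec\n"
)


def parse_valid_lft(ip_output: str) -> Dict[str, str]:
    """Map each interface name to its IPv4 valid_lft value ('forever' or e.g. '86394sec')."""
    lifetimes = {}
    for line in ip_output.splitlines():
        # Interface index, interface name, the "inet" keyword, then the lifetime field.
        match = re.match(r"\d+:\s+(\S+)\s+inet\s.*\bvalid_lft\s+(\S+)", line)
        if match:
            lifetimes[match.group(1)] = match.group(2)
    return lifetimes


print(parse_valid_lft(SAMPLE))
```

On a live system the same function could be fed the text returned by `subprocess.run(["ip", "-o", "-4", "addr", "show"], capture_output=True, text=True).stdout`. This only reports the lifetime; it does not change it.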

Jarosław Krawczyński (21 rep)

Apr 25, 2025, 04:13 PM • Last activity: Apr 25, 2025, 08:20 PM

0 votes

2 answers

2281 views

What's the relationship between vmlinuz and ISO image (netboot)?

kernel iso pxe netboot

                                  Why can't I use vmlinuz and initrd from the newest release of kernel 5.11 while the default netboot  image used in PXE is 5.4?
                                

user1098490 (99 rep)

Oct 8, 2021, 10:13 AM • Last activity: Jun 10, 2024, 10:06 AM

0 votes

1 answers

462 views

Can systemd networkd be configured for netboot, PXE boot, if yes, how?

linux systemd-networkd pxe dhcpcd netboot

If I understand this "issue" [(systemd-networkd DHCP Server ignores SendOptions #15780)](https://github.com/systemd/systemd/issues/15780) correctly, systemd can be configured to handle network booting. However, I am unable to find more information about that functionality. I am currently using a DHCPD server with minimal configuration, so it would be nice if it could be moved to systemd-networkd, which handles all other network functionality in my environment.

    # /etc/dhcpd.conf
    allow booting; # How is this defined in systemd-networkd?
    allow bootp;   # How is this defined in systemd-networkd?

    # If this DHCP server is the official DHCP server for the local
    # network, the authoritative directive should be uncommented.
    authoritative; # How is this defined in systemd-networkd?

    option architecture code 93 = unsigned integer 16; # I think this corresponds to SendOption=93:uint16:architecture

    host client_computer {
      hardware ethernet a1:b2:c3:d4:e5:f6; # This should be captured with [Match] MACAddress=a1:b2:c3:d4:e5:f6

      fixed-address 192.168.1.101; # I think this corresponds to something like SendOption=???:ipv4address:192.168.1.101
      next-server 192.168.1.100; # This should be defined as [Network] Address?

      option host-name "clientname"; # I think this corresponds to something like SendOption=12:string:clientname
      option root-path "/srv/tftp";  # I think this corresponds to something like SendOption=17:string:/srv/tftp

      if option architecture = 00:07 {
        filename "grub/x86_64-efi/core.efi"; # I think this corresponds to SendOption=67:string:grub/x86_64-efi/core.efi
      }
      else {
        filename "grub/i386-pc/core.0"; # I think this corresponds to SendOption=67:string:grub/i386-pc/core.0
      }
    }

It seems that I need the "option" codes, but where can I find them? Is there a specification? -- Found them :)

* [Dynamic Host Configuration Protocol (DHCP) and Bootstrap Protocol (BOOTP) Parameters](https://www.iana.org/assignments/bootp-dhcp-parameters/bootp-dhcp-parameters.xhtml)
* The SystemD NetworkD [documentation](https://www.freedesktop.org/software/systemd/man/latest/systemd.network.html#%5BDHCPv4%5D%20Section%20Options).

What I have so far:

    #allow booting; = ? # Not necessary?
    #allow bootp;   = ? # Not necessary?
    #authoritative; = ? # Not necessary?

    [Match]
    MACAddress=a1:b2:c3:d4:e5:f6

    [Network]
    DHCP=no
    DHCPServer=true

    Address=192.168.1.100/24 # DHCP server IP

    [DHCPv4]
    ClientIdentifier=mac

    [DHCPServer]
    PoolOffset=3
    PoolSize=7

    BootServerAddress=192.168.1.100/24

    #SendOption=93:uint16:architecture # Failed to parse DHCP uint16 data, ignoring assignment: architecture # Not necessary?

    #SendOption=???:ipv4address:192.168.1.101

    SendOption=12:string:clientname # 12 "Hostname"

    SendOption=17:string:/srv/tftp  # 17 "Root Path"

    # BootFilename=grub/i386-pc/core.0 # Same as code 67
    # SendOption=67:string:grub/x86_64-efi/core.efi
    SendOption=67:string:grub/i386-pc/core.0

    [DHCPServerStaticLease]
    MACAddress=a1:b2:c3:d4:e5:f6
    Address=192.168.1.101

# Update #1

Using tcpdump -i -nn -s0 -v -A udp port 67 I can see that the SystemD NetworkD DHCP server is interacting with the client! However, the system does not boot, and the problem seems to be that the expected static IP address is not assigned to the client. The [DHCPServerStaticLease] section does not seem to have an effect.

I found something about a bug in systemctl --version <= 253 and added the workaround ClientIdentifier=mac. However, it should not be necessary as I am running version 255.

* [Static IP address not being assigned by DHCP server to host on a certain interface of a systemd-networkd bridge](https://superuser.com/questions/1760528/static-ip-address-not-being-assigned-by-dhcp-server-to-host-on-a-certain-interfa)

Oh, and I added the Pool* parameters but the DHCP server still assigns the same IP (192.168.1.242).
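
For reading the `tcpdump` capture, it may help to know how a string option looks on the wire: one byte for the option code, one byte for the length, then the value (the layout described in RFC 2132). Below is a small Python sketch of that encoding using codes and strings from the question. The helper name `encode_option` is made up for illustration, and this is not a description of how systemd-networkd is implemented.

```python
def encode_option(code: int, value: bytes) -> bytes:
    """Encode one DHCPv4 option as code, length, value."""
    if not 1 <= code <= 254:
        raise ValueError("option code must be between 1 and 254")
    if len(value) > 255:
        raise ValueError("option value is limited to 255 bytes")
    return bytes([code, len(value)]) + value


# Codes and values taken from the configuration in the question.
OPTIONS = [
    (12, b"clientname"),                # 12 "Hostname"
    (17, b"/srv/tftp"),                 # 17 "Root Path"
    (67, b"grub/x86_64-efi/core.efi"),  # 67 "Bootfile name"
]

for code, value in OPTIONS:
    # Print the option as hex so it can be compared with a packet dump.
    print(code, encode_option(code, value).hex())
```

Comparing these bytes with the hex dump of a DHCP offer is one way to check whether a given option was actually sent.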

user212827 (91 rep)

May 12, 2024, 09:06 PM • Last activity: May 13, 2024, 02:52 PM

1 votes

1 answers

209 views

Is it possible to boot into new Linux image which is stored in RAM (and not written to disk)

boot ramdisk kexec netboot

                                  I'm looking to create a setup where an OS image is (automatically) downloaded over the network and then booted into. The obvious way would be to write it to disk, reconfigure grub (or whatever) and reboot, but I'm looking for a way to do this **without any disk writes** at all. The disk would only be used to read the fixed initial image, responsible for downloading the real image and everything after that would run purely from a RAMdisk.

After a real reboot (like a shutdown command or disconnected power), I would expect the device to boot back into the initial disk image, which would again start from scratch by downloading the real image, etc.

I've heard of netboot for diskless setups, but it seems not appropriate for my use case as I need a full Linux userland running for downloading the image (I want to have the option to download over WiFi, use gpg to verify signatures, etc. which is not feasible from bootloader).

I've also looked into kexec, but I'm not sure how it could be used to load a full bootable image.

PhilipRoman (149 rep)

Mar 16, 2024, 02:50 PM • Last activity: Apr 11, 2024, 08:36 AM

1 votes

0 answers

1137 views

Alpine linux how to load system into RAM

ssh-tunneling initramfs initrd netboot tftpd

I need advice on whether it is possible to load an entire system into RAM. Let's imagine the following situation: after booting a live OS from DVD and logging in as root, lsblk shows

    NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
    loop0    7:0    0  2.9G  1 loop 
    sda      8:0    0 19.1G  0 disk 
    |-sda1   8:1    0    1G  0 part 
    `-sda2   8:2    0 18.1G  0 part 
    sr0     11:0    1 1024M  0 rom

I cannot netboot with iPXE because I would need a VPN. The only option here is to use dd and overwrite the existing /dev/sda with my custom alpine.img. I will use QEMU and the Alpine Linux virt image. Here are the steps I have done so far:

- Booted Alpine Linux from the ISO and created /dev/sda
- sda will have only a boot partition as a mount point, as follows:

    NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
    sda      8:0    0    1G  0 disk 
    ├─sda1   8:1    0  300M  0 part /boot

- Into the /boot directory I have copied the following:

    /boot/
    ├── alpine-rootfs.tar.gz
    ├── System.map-virt
    ├── boot -> .
    ├── config-virt
    ├── extlinux.conf
    ├── initramfs-virt
    ├── ldlinux.c32
    ├── ldlinux.sys
    ├── libcom32.c32
    ├── libutil.c32
    ├── lost+found
    ├── mboot.c32
    ├── menu.c32
    ├── vesamenu.c32
    └── vmlinuz-virt

extlinux.conf contains

    # Generated by update-extlinux 6.04_pre1-r15
    DEFAULT menu.c32
    PROMPT 0
    MENU TITLE Alpine/Linux Boot Menu
    MENU HIDDEN
    MENU AUTOBOOT Alpine will be booted automatically in # seconds.
    TIMEOUT 10
    LABEL virt
      MENU LABEL Linux virt
      LINUX vmlinuz-virt
      INITRD initramfs-virt
      APPEND root=/dev/ram0 modules=sd-mod,usb-storage,ext4 quiet rootfstype=ext4

In the initramfs-virt I have added the following code:

    # Create a RAM disk with a filesystem (adjust the size as needed)
    mkdir -p /sysroot
    mount -t tmpfs -o size=512M tmpfs /sysroot
    # Extract the contents of your root filesystem (e.g., Alpine Linux) to the RAM disk
    tar -xzvf /path/to/your/alpine-rootfs.tar.gz -C /sysroot
    # Switch to the RAM disk as the new root filesystem
    exec switch_root /sysroot /sbin/init

Unfortunately the boot process fails. Can someone advise what I am doing wrong here? The whole point is to have the following solution:

- Once the rescue DVD is booted, use dd to overwrite the existing /dev/sda with my custom alpine.img
- Next, boot from the custom alpine.img and load the system into RAM
- Once logged in to a shell, create an SSH tunnel and deploy the final operating system

Rafal Niznik (333 rep)

Oct 15, 2023, 01:46 PM

0 votes

1 answers

178 views

how to debug losing nfsroot connection on Centos 7 ? (observing "task blocked for more than 120 seconds")

centos nfs pxe netboot

I am experiencing diskless clients losing connection to their nfsroot server within 24 hours of booting. Initially I thought it was hardware related, as I simultaneously upgraded 16 blades from Centos6 to Centos7 (diskless/PXE boot with nfsroot) and they all lose connection at the same time after booting OK and running for 12+ hours. When they do, they all print "task blocked for more than 120 seconds" to the console. I set up one of the blades to boot from local disk, and when reproducing the problem the 15 diskless blades fail as described while the blade with the boot disk continues as before. The NFS server continues serving other clients fine.

I've concluded that my nfsroot connection is getting lost on these diskless blades (Dell M620s in an M1000e chassis). Nothing interesting is getting logged in the messages file at either end. I do not think it is hardware, because all that has changed is the upgrade from Centos6 to 7, although there could be a compatibility issue, I suppose. The hardware does claim to support Centos7.

Can anyone advise a good way to debug why the nfsroot connection is getting lost?  
kernel = 3.10.0-1160.59.1.el7.x86_64

richm1000 (1 rep)

Jan 14, 2023, 12:09 PM • Last activity: Jan 16, 2023, 06:45 PM

1 votes

1 answers

2470 views

How to supply nfsroot path via DHCP when network booting

dhcp pxe netboot

I'm trying to network boot some Linux machines, so they mount their root filesystem over NFS. I would like to supply the path to the NFS mount via DHCP, so that all the machines can share the same TFTP configuration but mount different NFS folders according to the value in the DHCP response.

If I supply the `nfsroot` kernel parameter (e.g. `nfsroot=1.2.3.4:/srv/client`) then they will happily mount the root filesystem from that NFS server and path and boot normally, so the setup is working as long as I supply the path in the kernel parameters.

However I don't want to hard-code this parameter on the kernel command line, and I want it loaded from the DHCP response, but I can't get this to work.

If I set DHCP option 17 with `option root-path "1.2.3.4:/srv/client"` in my DHCP server config then the kernel boot log shows `IP-Config: Complete` and `rootserver=1.2.3.4, rootpath=/srv/client` so the kernel autoconfiguration is picking up this DHCP option correctly.

However once I remove the `nfsroot` parameter from the kernel command line it appears to skip some of the NFS steps:

    :: mounting '' on real root
    nfs: Bad value for 'source'

So if I add `nfsroot=` back (but leave it empty) then the missing NFS messages come back but fail because there is no path - it is not picking the path supplied by DHCP:

    NFS-Mount: y:y
    Waiting 10 seconds for device /dev/nfs ...
    nfsmount: need a path
    ERROR: Failed to mount the real root device.

Looking at [the kernel code](https://github.com/torvalds/linux/blob/5bfc75d92efd494db37f5c4c173d3639d4772966/fs/nfs/nfsroot.c#L141) it seems if the path starts with `/`, `,` or a digit then it will parse it, otherwise it populates the path with the default `/tftproot/%s` string instead. So I tried supplying the path as `,` (i.e. empty server, empty path, the options separator character, and then no options) which made a little progress - now the DHCP-supplied path appears:

    NFS-Mount: ,:/srv/client
    Waiting 10 seconds for device /dev/nfs ...
    nfsmount: can't parse IP address ','
    ERROR: Failed to mount the real root device.

However now it's treating the comma as the server IP. Testing out my theory to see what happens if I do supply an actual path with `nfsroot=/dummy`, sure enough it overrides the DHCP response:

    NFS-Mount: /dummy:/dummy
    Waiting 10 seconds for device /dev/nfs ...
    nfsmount: can't parse IP address '/dummy'
    ERROR: Failed to mount the real root device.

I'm kind of stuck at this point. It seems that if you omit the `nfsroot` parameter then it won't try to do an NFS mount, but if you supply it then it will overwrite the value returned from the DHCP server. Any ideas how I can tell the kernel to do an nfsroot mount but to use the values from the DHCP response?
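
The failing messages above all come down to which part of a string is taken as the server and which as the path. As a way to reason about the inputs, here is a minimal Python sketch that splits a `server:path` value such as the DHCP root-path from the question. It is an illustration only: the function name `split_root_path` is hypothetical, and the logic is a simplification, not the kernel's or the initramfs's actual parser.

```python
from typing import Optional, Tuple


def split_root_path(root_path: str) -> Tuple[Optional[str], str]:
    """Split 'server:/path' into (server, path); a bare path yields (None, path)."""
    server, separator, path = root_path.partition(":")
    if separator and not server.startswith("/"):
        return server, path
    # No server part was given, so the whole value is treated as the path.
    return None, root_path


print(split_root_path("1.2.3.4:/srv/client"))
print(split_root_path("/srv/client"))
```

With the value from the question, the first call yields the server `1.2.3.4` and the path `/srv/client`, matching the `rootserver` and `rootpath` values reported in the boot log.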

Malvineous (7395 rep)

Jul 16, 2021, 11:58 AM • Last activity: Jul 16, 2021, 12:06 PM

0 votes

0 answers

368 views

pxe boot pass variable to submenu

grub2 pxe netboot

I'm using RHEL 8 and I currently have the ability to install a single image over the LAN. I'm trying to expand our options so that we can select from several images. I'd like to have a top-level menu where I can select image A, B, or C. It would then call a submenu and pass in the selected image. The menus and submenus work, but I cannot figure out how to pass a variable from the top-level menu to the lower menu. For example, if I want to pass image A into the submenu, this is what I've tried:

    BUILD="A"
    append pxelinux.cfg/Config BUILD="$BUILD"

Then in the submenu

    menu begin
    menu title BUILD="$BUILD"

When I boot the system it does not show the variable; it just prints

    BUILD="$BUILD"

So my question is am I passing in the variable correctly? How do I display the variable in the menu?
                                

EncryptedWatermelon (101 rep)

May 14, 2021, 02:40 PM

5 votes

1 answers

1661 views

Debian preseed: How to force prompt for hostname and domain?

debian debian-installer pxe netboot

                                  I have a preseed file which works perfectly in that the install goes from start to finish fully automated without prompts.

However, I want to force a prompt for hostname and domain.

I have tried adding:

    d-i netcfg/get_hostname seen false
    d-i netcfg/get_domain seen false

However the installer just ignores this and I end up with a system with the default debian hostname etc.

netcfg/get_hostname, d-i netcfg/dhcp_hostname and netcfg/get_domain are **not** defined in my preseed file.

If it makes any difference, this question relates to Debian 10.

Little Code (491 rep)

Jan 20, 2020, 11:54 AM • Last activity: May 4, 2021, 02:50 PM

5 votes

1 answers

6841 views

Can grub load a kernel from HTTP?

kernel boot grub netboot

                                  Typically the kernel file is loaded via disk. 

### Questions
- Is it possible to load it via HTTP/HTTPS?
- If so how?
                                

Duke Dougal (1135 rep)

Sep 9, 2015, 01:41 AM • Last activity: May 2, 2020, 10:14 AM

0 votes

1 answers

316 views

Ubuntu 18.04.3 netboot PXE fresh installed boots up with Ethernet not managed by Network Manager

network-interface networkmanager pxe initrd netboot

                                  I'm trying to set up a **PXE** server to deliver Linux image for end user workstation in my company.

I've finally managed to set up `dnsmasq` and `nfs-kernel-server`, so now I can get the images via the network and start the installer.

*The problem is*: all installer steps run well, but when I boot into my freshly installed OS the Ethernet connection appears as "**unmanaged**" by Network Manager.

Looking for an answer on the web, I've found that there's a workaround that solves the issue.

*My main question is*: Is there a way to preset a post-install config file (/etc/NetworkManager/NetworkManager.conf) inside the ISO's folder to prevent this behavior?

**NOTE**: installing the same ISO from a flash drive prevents the problem, but I don't understand this behavior because both PXE server and flash drive are using same initrd and vmlinuz files.

Could anyone please give me a hand? 
Thanks in advance.

mr troubleshooter (1 rep)

Dec 18, 2019, 08:02 PM • Last activity: Dec 27, 2019, 07:41 PM

2 votes

2 answers

1930 views

Redhat Satellite 6 - valueerror new value non-existent xfs filesystem

rhel tftp netboot redhat-satellite

                                  When trying to provision a CentOS 7 system with the RedHat Satellite 6.2.9 I got the following error:

>...
>
> ValueError: new value non-existent xfs filesystem is not valid as a default fs type
>
> Pane is dead

Can anyone help?
                                

Adail Junior (141 rep)

May 30, 2018, 07:31 PM • Last activity: Oct 21, 2019, 06:27 AM

1 votes

0 answers

1644 views

Should I be able to boot any ISO with pxelinux?

boot iso pxe netboot

                                  I have downloaded some ISO file for linux (actually from here: http://minimal.linux-bg.org/download/2018/) 

Then I configured it with PXE:

    LABEL minimallinux
    MENU LABEL Minimal Linux
    KERNEL vmlinuz
    INITRD /iso/minimal_linux_live_28-Jan-2018_64-bit_bios.iso
    APPEND iso raw

and got a hang or a kernel panic.

I was able to boot a lot of ISOs this way, and I can see on the booting computer that the kernel is downloaded OK, the ISO is downloaded, and the printouts start.

Should I be able to boot any ISO this way? If not, then what are the requirements for it?

What are the "bios", "uefi" and "mixed" types? Isn't an ISO just a container of files, which are mounted as a read-only filesystem? How can it depend on that?

***

I was able to boot with 

    KERNEL memdisk

where memdisk is a binary from the syslinux package. What did I do? Where is the Linux kernel then? Was it booted from inside the ISO at a later stage, or is memdisk itself a kernel?

Dims (3425 rep)

Aug 22, 2019, 07:28 PM • Last activity: Aug 23, 2019, 05:38 PM

5 votes

1 answers

7529 views

How to redirect PXE boot to another TFTP or to HTTP?

pxe syslinux netboot

                                  I would like to make the installation of Linux always possible in my LAN. So, I have configured PXE boot on my DHCP server. The DHCP server points to the TFTP server, and it normally loads menus and can boot kernels. All files were taken from syslinux.org distribution.

Also I know that Ubuntu ISO normally contains some PXE files for network boot. But I don't want to replace all my PXE menus with ones from the distro. 

Is it possible to redirect or chain (don't know how to say) from my PXE menu to another PXE menu and / or serve it differently (via HTTP)?

The following straightforward config does not work (nothing happens, no errors):

    DEFAULT vesamenu.c32
    PROMPT 0
    
    MENU TITLE MyTitle
    
    LABEL install1404server
    MENU LABEL Install Ubuntu 14.04.1 Server AMD64
    KERNEL http://192.168.10.25/boot/ubuntu-14.04.1-server-amd64/install/netboot/ubuntu-installer/amd64/linux.0 
    APPEND vga=788 initrd=http://129.168.10.25/boot/ubuntu-14.04.1-server-amd64/install/netboot/ubuntu-installer/amd64/initrd.gz 

All these files are accessible via HTTP.
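
One thing that is easy to miss in an entry like this is that the URLs may not point at the same host. In the entry as posted, the `KERNEL` line uses 192.168.10.25 while the `initrd=` URL uses 129.168.10.25; whether that is a typo in the post or in the real configuration cannot be told from the post alone. A minimal Python sketch for spotting such mismatches follows; the helper name `url_hosts` is hypothetical and the `CONFIG` text is copied from the entry above.

```python
import re
from typing import Set
from urllib.parse import urlparse

# The menu entry from the question, copied as posted.
CONFIG = """\
LABEL install1404server
MENU LABEL Install Ubuntu 14.04.1 Server AMD64
KERNEL http://192.168.10.25/boot/ubuntu-14.04.1-server-amd64/install/netboot/ubuntu-installer/amd64/linux.0
APPEND vga=788 initrd=http://129.168.10.25/boot/ubuntu-14.04.1-server-amd64/install/netboot/ubuntu-installer/amd64/initrd.gz
"""


def url_hosts(config: str) -> Set[str]:
    """Return the set of hostnames used by every http(s) URL in the config text."""
    urls = re.findall(r"https?://\S+", config)
    return {urlparse(url).hostname for url in urls}


hosts = url_hosts(CONFIG)
if len(hosts) > 1:
    print("URLs point at different hosts:", sorted(hosts))
```

If more than one host is reported and only one HTTP server is intended, the addresses are worth a second look before investigating the bootloader itself.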

The directory is as follows:

    $ ls
    ldlinux.c32   libutil.c32  moon640.jpg  pxelinux.0    sagittarius-a.jpg
    libcom32.c32  menu.c32     moon800.jpg  pxelinux.cfg  vesamenu.c32
    $ pwd
    /var/lib/tftpboot

**UPDATE**

I found that:

1) To work with HTTP, lpxelinux.0 should be used instead of pxelinux.0.

2) To redirect to another menu, its menu binary should be set as KERNEL, 
and config file should be set as APPEND (not sure).

3) TFTPD does not support symlinks for now.

Dims (3425 rep)

Mar 27, 2016, 04:45 PM • Last activity: Mar 10, 2018, 01:47 PM

5 votes

1 answers

1887 views

archlinux netboot diskless node/system, systemd on NFS (v4) fails, rpc.idmapd

arch-linux systemd nfs netboot nfsv4

                                  **updates: 5 (20171209)**

**updates: 5 (20171210)**

* mount -t nfs4 [SERVER IP]:/archlinux /mnt works.
* ss -ntp | grep 2049 the client establishes a connection to the server before systemd begins.
* NFSv4 id mapper can only be used with Kerberos?

# the problem
I am attempting to set up a diskless node/workstation/system. The OS (4.13.12-1-ARCH) is installed on the SERVER /srv/archlinux. After a [successful netboot from GRUB to NFSv4](https://unix.stackexchange.com/questions/408477/archlinux-efi-netboot-kernel-ip-does-not-work-systemd-failed-to-start-switc) , systemd begins but fails at multiple stages, for example:

* Failed to mount Kernel Configuration File System.
* Failed to mount Kernel Debug File System.
* Failed to mount Huge Pages File System
* Failed to start Load/Save Random Seed.
* Failed to mount /tmp.
* Failed to start Rebuild Journal Catalog.
* Then ends with Not tainted 4.13.12-1-ARCH #1...

Or,

* Failed to mount POSIX Message Queue File System.
* Failed to start Remount Root and Kernel File System.
* Failed to mount Huge Pages File System.
* Failed to mount Kernel Debug File System.
* Failed to mount Kernel Configuration File System.
* Then ends with Not tainted 4.13.12-1-ARCH #1...

I suspect the failures are caused by an incorrect configuration of NFSv4 or the local network. 

## rpc.idmapd

    /etc/idmapd.conf
      [General]
      Verbosity = 7
      Pipefs-Directory = /var/lib/nfs/rpc_pipefs
      Domain = localdomain
      [Mapping]
      Nobody-User = nobody
      Nobody-Group = nobody
      [Translation]
      Method = nsswitch

    /etc/exports
    (printed using # exportfs -v)
      /srv            (rw,sync,wdelay,hide,no_subtree_check,fsid=0,sec=sys,no_root_squash,no_all_squash)
      /srv/archlinux  (rw,sync,wdelay,hide,no_subtree_check,sec=sys,no_root_squash,no_all_squash)

    (Exposed to "world" for debugging purposes)

Running rpc.idmapd -fvvv on a separate tty during bootup logs the following:

    rpc.idmapd: libnfsidmap: using domain: localdomain
    rpc.idmapd: libnfsidmap: Realms list: 'LOCALDOMAIN'
    rpc.idmapd: libnfsidmap: processing 'Method' list
    rpc.idmapd: libnfsidmap: loaded plugin /usr/lib/libnfsidmap/nsswitch.so for method nsswitch
    rpc.idmapd: Expiration time is 600 seconds.
    rpc.idmapd: Opened /proc/net/rpc/nfs4.nametoid/channel
    rpc.idmapd: Opened /proc/net/rpc/nfs4.idtoname/channel
    rpc.idmapd: nfsdcb: authbuf=* authtype=user
    rpc.idmapd: nfs4_uid_to_name: calling nsswitch->uid_to_name
    rpc.idmapd: nfs4_uid_to_name: nsswitch->uid_to_name returned 0
    rpc.idmapd: nfs4_uid_to_name: final return value is 0
    rpc.idmapd: Server : (user) id "0" -> name "root@localdomain"

If exportfs sec=sys, it continues like:

    rpc.idmapd: nfsdch: authbuf=* authtype=user
    rpc.idmapd: nfs4_name_to_uid: calling nsswitch->name_to_uid
    rpc.idmapd: nss_getpwnam: name '0' domain 'localdomain': resulting localname '(null)'
    rpc.idmapd: nss_getpwnam: name '0' does not map into domain 'localdomain'
    rpc.idmapd: nfs4_name_to_uid: nsswitch->name_to_uid returned -22
    rpc.idmapd: nfs4_name_to_uid: final return value is -22
    rpc.idmapd: Server : (user) name "0" -> id "99"
    (stops here)

+(20171209) After making sure that the /etc/hostname for the CLIENT was set to client2 (duh), if exportfs sec=none **or** sec=sys, it continues like:

    rpc.idmapd: nfsdch: authbuf=* authtype=group
    rpc.idmapd: nfs4_gid_to_name: calling nsswitch->gid_to_name
    rpc.idmapd: nfs4_gid_to_name: nsswitch->gid_to_name returned 0
    rpc.idmapd: nfs4_gid_to_name: final return value is 0
    rpc.idmapd: Server : (group) id "190" -> name "systemd-journal@localdomain"
    rpc.idmapd: nfsdch: authbuf=* authtype=user
    rpc.idmapd: nfs4_name_to_uid: calling nsswitch->name_to_uid
    rpc.idmapd: nss_getpwnam: name '0' domain 'localdomain': resulting localname '(null)'
    rpc.idmapd: nss_getpwnam: name '0' does not map into domain 'localdomain'
    rpc.idmapd: nfs4_name_to_uid: nsswitch->name_to_uid returned -22
    rpc.idmapd: nfs4_name_to_uid: final return value is -22
    rpc.idmapd: Server : (user) name "0" -> id "99"
    (stops here)

If I instead change method from nsswitch to static (https://unix.stackexchange.com/questions/286924/uid-mapping-in-nfs) 

    /etc/idmapd.conf
      ...
      [Translation]
      Method = static
      [Static]
      root@localdomain = root

The rpc.idmapd -fvvv on a separate tty during bootup logs the following:

    rpc.idmapd: libnfsidmap: using domain: localdomain
    rpc.idmapd: libnfsidmap: Realms list: 'LOCALDOMAIN'
    rpc.idmapd: libnfsidmap: processing 'Method' list
    rpc.idmapd: static_getpwnam: name 'root@localdomain' mapped to 'root'
    rpc.idmapd: static_getpwnam: group 'root@localdomain' mapped to ' root'
    rpc.idmapd: libnfsidmap: loaded plugin /usr/lib/libnfsidmap/static.so for method static
    rpc.idmapd: Expiration time is 600 seconds.
    rpc.idmapd: Opened /proc/net/rpc/nfs4.nametoid/channel
    rpc.idmapd: Opened /proc/net/rpc/nfs4.idtoname/channel
    rpc.idmapd: nfsdcb: authbuf=* authtype=user
    rpc.idmapd: nfs4_uid_to_name: calling static->uid_to_name
    rpc.idmapd: nfs4_uid_to_name: static->uid_to_name returned 0
    rpc.idmapd: nfs4_uid_to_name: final return value is 0
    rpc.idmapd: Server : (user) id "0" -> name "root@localdomain"

If exportfs sec=sys, it continues like:

    rpc.idmapd: nfsdch: authbuf=* authtype=user
    rpc.idmapd: nfs4_name_to_uid: calling static->name_to_uid
    rpc.idmapd: nfs4_name_to_uid: static->name_to_uid returned -2
    rpc.idmapd: nfs4_name_to_uid: final return value is -2
    rpc.idmapd: Server : (user) name "0" -> id "99"
    (stops here)
    
If exportfs sec=none, it continues like:
    
    rpc.idmapd: nfsdch: authbuf=* authtype=group
    rpc.idmapd: nfs4_gid_to_name: calling static->gid_to_name
    rpc.idmapd: nfs4_gid_to_name: static->gid_to_name returned -2
    rpc.idmapd: nfs4_gid_to_name: final return value is -2
    rpc.idmapd: Server : (group) id "190" -> name "nobody"
    rpc.idmapd: nfsdch: authbuf=* authtype=user
    rpc.idmapd: nfs4_name_to_uid: calling static->name_to_uid
    rpc.idmapd: nfs4_name_to_uid: static->name_to_uid returned -2
    rpc.idmapd: nfs4_name_to_uid: final return value is -2
    rpc.idmapd: Server : (user) name "0" -> id "99"
    (stops here)

Similar problems with the user ID mapping:

* [NFSv4 User Mapping](https://serverfault.com/questions/812813/nfsv4-user-mapping) 
* [NFS user mapping](https://serverfault.com/questions/520276/nfs-user-mapping) 
* [Mapping UID and GID of local user to the mounted NFS share](https://serverfault.com/questions/514118/mapping-uid-and-gid-of-local-user-to-the-mounted-nfs-share) 
* And many many more... Often related to a switch from NFSv3 to NFSv4, and rarely about netboot.

# troubleshooting

* No firewall
* No Kerberos, LDAP, etc.
* No SELinux
* The user root exists on both SERVER and CLIENT, with the same password.

## SERVER
These are all the other NFSv4-relevant configuration files I could identify on the SERVER.

    /etc/nsswitch.conf
      passwd: compat mymachines systemd
      group: compat mymachines systemd
      shadow: compat
      publickey: files
      hosts: files mymachines resolve [!UNAVAIL=return] dns myhostname
      networks: files
      protocols: files
      services: files
      ethers: files
      rpc: files
      netgroup: files

    /etc/nfs.conf
      (all settings commented out)
    /etc/conf.d/nfs-common.conf
      (all settings commented out)

### network configuration
* [How to set the domain name on GNU/Linux?](https://serverfault.com/questions/490825/how-to-set-the-domain-name-on-gnu-linux) 
* [Archlinux Wiki Network configuration: Set the hostname](https://wiki.archlinux.org/index.php/Network_configuration#Set_the_hostname) 
* [Archlinux Wiki Network configuration: Local network hostname resolution](https://wiki.archlinux.org/index.php/Network_configuration#Local_network_hostname_resolution) 

The SERVER's hostname is server, and it has three network devices (nd[1-3]). The default gateway is 192.168.0.1, reached via nd1.

    /etc/hosts
      127.0.0.1      localhost.localdomain  localhost
      ::1            ip6.localhost          localhost
      192.168.0.101  nd1.localdomain        server servernd1
      192.168.1.101  nd2.localdomain        server servernd2
      192.168.2.101  nd3.localdomain        server servernd2
      192.168.1.102  client1.localdomain    client1
      192.168.2.102  client2.localdomain    client2

    /etc/resolvconf.conf
      name_servers=192.168.0.1

    # hostname -f
    # nd1.localdomain

    # hostname -i
    192.168.0.101 192.168.1.101 192.168.2.101

    # getent hosts IP -> the corresponding line in /etc/hosts
    # getent ahosts HOSTNAME -> the corresponding line in /etc/hosts

    # ping -c 3 server.localdomain -> 0% packet loss

    # id -u root -> 0
    # id -un 0 -> root

    Display the system's effective NFSv4 domain name on stdout.
    # nfsidmap -d -> localdomain

    Display on stdout all keys currently in the keyring used to cache ID mapping results. These keys are visible only to the superuser.
    # nfsidmap -l -> nfsidmap: '.id_resolver' keyring was not found.

## CLIENT

    /etc/hostname +(20171209)
      client2
    /etc/hosts
      (exactly the same as the hosts file on the server)
    /etc/resolvconf.conf
      name_servers=192.168.0.1
    /etc/idmapd.conf
      (exactly the same as the idmapd.conf file on the server)
    /etc/fstab
      # sec=sys or sec=none, to correspond to the server export settings.
      /dev/nfs  /  nfs  rw,hard,rsize=9151,sec=sys,clientaddr=192.168.2.102  0  0
      devtmpfs  /dev   devtmpfs  defaults
      proc      /proc  proc      defaults
      none      /run   tmpfs     defaults
      sys       /sys   sysfs     defaults
      run       /run   tmpfs     defaults
      tmp       /tmp   tmpfs     defaults

The fstab was defined by comparing the mounted directories on the server using findmnt -A.

## net_nfs4

* +(20171210) NFS version on SERVER and CLIENT cat /proc/fs/nfsd/versions -> -2 +3 +4 +4.1 +4.2 
* On SERVER and CLIENT [cat /sys/module/nfsd/parameters/nfs4_disable_idmapping -> N](https://wiki.archlinux.org/index.php/NFS#Ensure_NFSv4_idmapping_is_fully_enabled) .
* On SERVER echo "options nfsd nfs4_disable_idmapping=0" > /etc/modprobe.d/nfsd.conf.
* On CLIENT, /sys/module/nfs/parameters/nfs4_disable_idmapping does not exist, and I am not sure how to create it manually, since /sys is read-only.
*  +(20171210) On CLIENT  echo "options nfs nfs4_disable_idmapping=0" > /etc/modprobe.d/nfs.conf.

The CLIENT IP is 192.168.2.102/24. The CLIENT network device is connected to SERVER nd2 192.168.2.101/24 (hostname: servernd2).

The network information during boot:

    :: running early hook [udev]
    starting version 235
    :: running hook [udev]
    :: Triggering uevents...
    :: running hook [net_nfs4]
    IP-Config: eth0 hardware address [CLIENT NETWORK DEVICE MAC] mtu 1500 DHCP
    hostname client2 IP-Config: eth0 guessed broadcast address 192.168.2.255
    IP-Config: eth0 complete (from 192.168.0.101):
     address: 192.168.2.102     broadcast: 192.168.2.255     netmask: 255.255.255.0
     gateway: 192.168.2.101     dns0     : 192.168.0.1       dns1   : 0.0.0.0
     host   : client2
     domain : localdomain
     rootserver: 192.168.0.101 rootpath: /srv/archlinux
     filename  : /netboot/grub/i386-pc/core.0
    NFS-Mount: 192.168.2.101:/archlinux
    Waiting 10 seconds for device /dev/nfs ...
    (systemd takes over from here)

## Why do the NFSv4 errors occur?
### Server : (group) id "190" -> name "nobody"

>With NFSv4, things change: users are mapped by username, and the mapping between user names and user IDs is handled by a process called "ID map daemon" (idmapd). In particular, NFSv4 clients and server should use the same domain for the mapping to work properly, otherwise requests will be mapped to the anonymous user/group.
-- [Trying out NFSv4 (on Linux and Solaris) -- March 15th, 2012 - 13:03 / bronto](https://syslog.me/2012/03/15/trying-out-nfsv4-on-linux-and-solaris/) 

---

>In an ideal world, the user and group of the requesting client would determine the permissions of the data returned. We don't live in an ideal world. Two real-world problems intervene:

> 1. You might not trust the root user of a client with root access to
     the server's files.
> 1. The same username on client and server might have different
     numerical ID's

>Problem 1 is conceptually simple. John Q. Programmer is given a test machine for which he has root access. In no way does that mean that John Q. Programmer should be able to alter root owned files on the server. Therefore NFS offers root squashing, a feature that maps uid 0 (root) to the anonymous (nfsnobody) uid, which defaults to -2 (65534 on 16 bit numbers).
-- [NFS: Overview and Gotchas -- Copyright (C) 2003 by Steve Litt](http://www.troubleshooters.com/linux/nfs.htm#_Configure_the_NFS_Server) 
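
The "-2 (65534 on 16 bit numbers)" remark in the quote is plain two's-complement arithmetic. A minimal sketch in Python, assuming only that the anonymous uid is stored as an unsigned integer of a fixed width (the helper name `as_unsigned` is made up for illustration and is not part of any NFS tool):

```python
def as_unsigned(value: int, bits: int) -> int:
    """Reinterpret a signed integer as an unsigned integer of the given width."""
    return value & ((1 << bits) - 1)


# -2 seen as a 16-bit and as a 32-bit unsigned uid
anon_uid_16 = as_unsigned(-2, 16)
anon_uid_32 = as_unsigned(-2, 32)
print(anon_uid_16, anon_uid_32)  # 65534 4294967294
```

This only explains where the number 65534 comes from; it says nothing about which uid a given server actually uses for squashed requests.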

### +(20171209) rpc.idmapd: nss_getpwnam: name '0' domain 'localdomain': resulting localname '(null)'

According to [Steve Dickson in a comment (2011-08-12 16:01:55 EDT) to a Red Hat Bugzilla – Bug 715430 report](https://bugzilla.redhat.com/show_bug.cgi?id=715430#c2) 

>The [error] statement explains the problem. DNS on the local machine was
not set up (or returning NULL) and the Domain= variable in 
/etc/idmapd.conf was  not set.

### nss_getpwnam: name '0' does not map into domain

On the Debian Mailing Lists, in an [e-mail correspondence between Jonas Meurer and Christian Seiler (20150722) concerning "Kerberos-secured NFSv4"](https://lists.debian.org/debian-user/2015/07/msg00966.html)  the error is explained in detail. My summary of the discussion:

When the NFS client sends a bare uid, the server logs nss_getpwnam: name '8' domain 'freesources.org': resulting localname '(null)'

> The NFS client sends just the uid converted to a string in some cases instead of the properly translated NFS username, which the server then rejects.

The client should send nss_getpwnam: name 'mail@freesources.org' domain 'freesources.org': resulting localname 'mail'

> Here you can see that the owner name that was transmitted by the NFS
client was 'mail@freesources.org' (and not simply '8'), so that does
contain an @; nss_getpwname can see that the domain name matches
and just strips it, resulting in a user name 'mail', which it looks
up in /etc/passwd, returns the user id (in this case, 8, because it's
the same on client and server) and the server is perfectly happy.
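
The mapping described in the two quotes above can be sketched as a toy model in Python. This is not libnfsidmap's actual code: the function name `name_to_uid`, the exact-match domain comparison, and the dictionary standing in for /etc/passwd are all assumptions made for illustration.

```python
from typing import Dict, Optional


def name_to_uid(owner: str, local_domain: str, passwd: Dict[str, int]) -> Optional[int]:
    """Map an NFSv4 owner string ("user@domain") to a local uid, or None on failure."""
    user, sep, domain = owner.rpartition("@")
    if not sep:
        # No "@" at all, e.g. the bare uid string "8": there is no domain to strip.
        return None
    if domain != local_domain:
        # The owner belongs to a different id mapping domain.
        return None
    # Domain matches: strip it and look the remaining user name up locally.
    return passwd.get(user)


passwd = {"root": 0, "mail": 8}  # hypothetical stand-in for /etc/passwd
print(name_to_uid("mail@freesources.org", "freesources.org", passwd))  # 8
print(name_to_uid("8", "freesources.org", passwd))                     # None
```

In this model the name '0' from my own log fails for the same reason as '8' in the mailing list thread: it carries no @domain part.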

> So why does the client send the wrong username?
> ... every once in a while, idmapping will fail, so the kernel will just send a number. But that number will cause the chown command to fail, since the server won't translate it
back.
>
> Short answer: I have no idea.
>
> Longer answer: ...

If I understand the longer answer correctly, the problem could occur because the NFS client relies on the "kernel's key cache". For the NFS server this should never be a problem because the "kernel's key cache" is never used. 

Nonetheless,

> Since you are using just regular nsswitch via /etc/passwd, nss_getpwnam should *never* fail in your case, unless you do some weird stuff with /etc/passwd at the same time.

The answer also mentions an alternative to idmapd, nfsidmap, although after reading the man page I cannot quite see how it would replace idmapd.

### +(20171209)  nss_getpwnam: name 'root@domain.com' does not map into domain 'localdomain'
This error message does not seem to occur for me. I am nevertheless including the answer from [SUSE's support knowledgebase -- 10-DEC-13 Modified Date: 12-OCT-17 --](https://www.suse.com/support/kb/doc/?id=7014266)  because of its description of the cause, and because the proposed remedy stands in contrast to the other discussions I found.

>NFSv4 handles user identities differently than NFSv3.  In v3, an nfs client would simply pass a UID number in chown (and other requests) and the nfs server would accept that (even if the nfs server did not know of an account with that UID number).  However, v4 was designed to pass identities in the form of `user@domain`.  To function correctly, that normally requires idmapd (id mapping daemon) to be active at client and server, and for each to consider themselves part of the same id mapping domain.
>
>Chown failures or idmapd errors like the ones documented above are typically a result of either:
>
>1.  The username is known to the client but not known to the server, or
>2.  The idmapd domain name is set differently on the client than it is on the server.
>
>Therefore, this issue can be fixed by insuring that the nfs server and client are configured with the same idmapd domain name (/etc/idmapd.conf) and both have knowledge of the usernames / accounts in question.
>
>However, it is often not convenient to insure that both sides have the same user account knowledge, especially if the nfs server is a filer.  The NFS community has recognized that this idmapd feature of NFSv4 is often more troublesome that it is worth, so there are steps and modifications coming into effect to allow the NFSv3 behavior to work even under NFSv4.

The proposed remedy is to disable idmapd.

    nfs.nfs4_disable_idmapping=1

## +(20171209) Wireshark
The Wireshark log is quite extensive, but it begins with something like:

    [IP CLIENT] -> [IP SERVER] NFS 226 V4 Call ACCESS FH: [HEX VALUE], [Check: RD LU MD XT DL]
    [IP SERVER] -> [IP CLIENT] NFS 238 V4 Reply (Call In 34) ACCESS, [Allowed: RD LU MD XT DL]
    [IP CLIENT] -> [IP SERVER] NFS 246 V4 Call LOOKUP DH: [HEX VALUE]/archlinux

where a similar pattern [A HEX VALUE]/[PATH] can be discerned for 
/sbin, /usr, /bin, /init, /lib, /systemd, /dev, /proc, /sys, /run, /, /lib64.

When the CLIENT requests /ld-linux-x86-64.so.2, the first errors start to appear:

    [IP CLIENT] -> [IP SERVER] NFS 342 V4 Call OPEN DH: [HEX VALUE]/ld-linux-x86-64.so.2
    [SERVER IP] -> [CLIENT IP] NFS 166 V4 Reply (Call In 124) OPEN Status: NFS4ERR_SYMLINK

The pattern more or less repeats itself with increasingly frequent errors, for example LOOKUP Status: and OPEN Status: replies reporting NFS4ERR_NOENT.

Interestingly, the first and only reference to user permissions appears at the very end of the log:

    [SERVER IP] -> [CLIENT IP] NFS 182 V4 Reply (Call In 9562) SETATTR Status: NFS4ERR_BADOWNER

## RFC
According to

* [RFC7530 (Network File System (NFS) Version 4 Protocol, 201503, PROPOSED STANDARD)](https://www.rfc-editor.org/rfc/rfc7530)  -- Updated by [RFC7931](https://www.rfc-editor.org/rfc/rfc7931) 
* [RFC5661 (Network File System (NFS) Version 4 Minor Version 1 Protocol, 201001, PROPOSED STANDARD)](https://www.rfc-editor.org/rfc/rfc5661)  -- Updated by [RFC8178](https://www.rfc-editor.org/rfc/rfc8178) 
* [RFC7862 (Network File System (NFS) Version 4 Minor Version 2 Protocol, 201611, PROPOSED STANDARD)](https://www.rfc-editor.org/rfc/rfc7862)  -- Updated by [RFC8178](https://www.rfc-editor.org/rfc/rfc8178)  -- which refers back to [RFC5661].

### NFS4ERR_BADOWNER (Error Code 10039)
>This error is returned when an owner or owner_group attribute value or the who field of an ACE within an ACL attribute value cannot be translated to a local representation.

The specification discusses this in Section 5.9, *Interpreting owner and owner_group*; however, I am not sure what to cite as relevant.

### NFS4ERR_SYMLINK (Error Code 10029)
>The current filehandle designates a symbolic link when the current operation does not allow a symbolic link as the target.

### NFS4ERR_NOENT (Error Code 2)
> This indicates no such file or directory.  The file system object referenced by the name specified does not exist.

The error could however be expected ...

>The current filehandle is assumed to refer to a regular directory or a named attribute directory.  LOOKUPP assigns the filehandle for its parent directory to be the current filehandle.  If there is no parent directory, an NFS4ERR_NOENT error must be returned.  Therefore, NFS4ERR_NOENT will be returned by the server when the current filehandle is at the root or top of the server's file tree.

## +(20171210) mount -t nfs4 [SERVER IP]:/archlinux /mnt
On the client computer, using the Archlinux "LiveUSB", I was able to mount the network drive, download the latest kernel (4.14.4-1-ARCH) via the SERVER internet connection, and install archlinux on [SERVER IP]:/archlinux.

During install rpc.idmapd -fvvv indicated a successful mapping of usernames, for example,

    rpc.idmapd: Server : (user) id "0" -> name "root@localdomain"
    rpc.idmapd: Server : (group) id "99" -> name "nobody@localdomain"
    ... -> name "tty@localdomain"
    ... -> name "systemd-journal-upload@localdomain"
    ... -> name rpc@localdomain
    ... -> name systemd-journal@localdomain
    ... -> name utmp@localdomain

The result of genfstab was also different:


Nevertheless, after reboot systemd failed again with the same failures as described at the beginning of the post.

## +(20171210) Is the remote directory on the server mounted to /new_root?

The mkinitcpio script uses the variable mount_handler to hold the assigned "mounting function", in this case nfs_mount_handler(), to which the root path, /new_root, is later passed as $1.

I am trying to verify that the client has mounted [SERVER IP]:/archlinux on /new_root. On the server, I can only observe that the client has established a connection, not whether the directory is mounted or where.

    showmount -a server -> All mount points on server: (empty)

    ss -ntp | grep 2049 ->
    ESTAB  0    0   192.168.2.101:2049  192.168.2.102:809 (random port)

## +(20171210) NFS4, sec=sys and id mapper are incompatible? 

>**Reading the doco, it looks like sec=sys and the id mapper can be used to correctly map uid/gid to name where the client and server have different mappings in /etc/passwd and /etc/group. This simply isn't true.**
>
>That's because with sec=sys the id mapper doesn't come into play in the authentication part of the nfs protocol, only the file attributes part. With sec=sys authentication, nfs just passes the client uid/gid which is used directly by the server. So permissions checks will be screwed if client and server uid and gid don't align. To confuse things further, when the client creates a new file it is the authentication credentials that are used, so the file gets created at the server with the client's uid/gid. After that nfs uses idmap to get the file attributes, so the uid/gid (which originally came from the client) gets mapped at the server, and you end up seeing the server's name for a client uid/gid. Borkage! On the other hand, if the file was originally created at the server, you will see the correct name at the client, even if the uid/gid differs. But permissions checking will still be broken. -- [kimmie -- Posted: Wed Feb 20, 2013 3:14 am    Post subject:](https://forums.gentoo.org/viewtopic-p-7250220.html?sid=f9d53191215294ce744797d1da1aee27#7250220)  -- Emphasis in original
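
The behaviour described in the quote can be illustrated with a toy model in Python. Everything here is hypothetical: the account tables are deliberately mismatched, and the two functions only mimic the two paths the quote distinguishes (numeric uid for the credential, name@domain for file attributes). It is a sketch of the claim, not of the NFS implementation.

```python
DOMAIN = "localdomain"

# Hypothetical, deliberately mismatched account tables (uid -> name).
client_names = {1000: "alice", 1001: "bob"}
server_names = {1000: "bob", 1001: "alice"}
client_uids = {name: uid for uid, name in client_names.items()}


def create_file(client_uid: int) -> int:
    """Credential path: with sec=sys the server stores the client's uid as-is."""
    return client_uid


def owner_seen_on_client(stored_uid: int) -> str:
    """Attribute path: server maps uid -> name@domain, client maps the name back."""
    wire_owner = f"{server_names[stored_uid]}@{DOMAIN}"
    user = wire_owner.split("@")[0]
    return client_names[client_uids[user]]


stored = create_file(client_uids["alice"])  # alice (uid 1000 on the client) creates a file
print(stored, owner_seen_on_client(stored))  # 1000 bob
```

In this model alice creates a file and then sees it as owned by bob, which is the mismatch the quote describes for hosts whose uid tables do not align.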
                                

user212827 (91 rep)

Dec 8, 2017, 04:21 PM • Last activity: Dec 27, 2017, 04:04 AM

1 votes

0 answers

689 views

Adding wifi nfs boot to initramfs, under armbian and uboot

nfs initramfs u-boot netboot armbian

I'm trying to set up an initramfs image to NFS-boot from a WPA Wi-Fi access point. My image works with Armbian and U-Boot.

So far I have NFS booting over Ethernet working just fine, but I have no clear path for getting the right software, drivers, modules and scripts onto the initramfs image, or even for working out which software, drivers, modules and scripts are the right ones.

For hardware I have 35 OrangePi Zero boards
- manufacturer link and wiki link

The idea is to get them all up and booted from NFS using Wi-Fi and one SD card.

Tasha (11 rep)

Dec 17, 2017, 07:26 AM

2 votes

1 answers

1714 views

archlinux efi netboot kernel "ip" does not work?; systemd "Failed to start Switch Root."

arch-linux linux-kernel systemd nfs netboot

                                  I am attempting to set up a diskless [node/workstation](https://en.wikipedia.org/wiki/Diskless_node)/[system](https://wiki.archlinux.org/index.php/Diskless_system) , using the instructions provided in the guide [Diskless system](https://wiki.archlinux.org/index.php/Diskless_system)  for [archlinux](https://www.archlinux.org)  (4.13.12-1-ARCH).

# the problem
The client successfully connects to TFTP ([atftp](http://github.com/seveas/atftp)), transfers all files and presents the GRUB selection menu (relevant excerpt from grub.cfg):

    load_video
    set gfxpayload=keep
    insmod gzip
    insmod ext3
    insmod net
    insmod tftp
    insmod efinet

    set root=(tftp,192.168.0.101)
    set prefix=(tftp,192.168.0.101)/netboot/grub

    linux /netboot/vmlinuz-linux add_efi_memmap root=/dev/nfs rootfstype=nfs nfsroot=192.168.0.101:/srv/[CLIENT OS] nfsrootdebug rw ip=dhcp
    initrd /netboot/initramfs-linux.img

I have tried various assignments of ip (https://www.kernel.org/doc/Documentation/filesystems/nfs/nfsroot.txt) 

     ip=:::::efinet0:dhcp
     ip=:::::eno1s0:dhcp
     ip=:::::eth0:dhcp
     ip=[CLIENT IP]:[SERVER IP]:[GATEWAY IP]:[NETMASK]:[HOSTNAME]:[DEVICE]:dhcp
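
For reference, the ip= variants above follow the colon-separated layout from the linked nfsroot.txt: client IP, server IP, gateway, netmask, hostname, device, autoconf. A small Python sketch of how such a string is assembled; the helper name `build_ip_param` is made up, and the netmask and hostname in the second call are illustrative values rather than something I have verified on my setup:

```python
def build_ip_param(client="", server="", gateway="", netmask="",
                   hostname="", device="", autoconf="") -> str:
    """Join the first seven ip= fields; fields left empty stay empty."""
    fields = [client, server, gateway, netmask, hostname, device, autoconf]
    return "ip=" + ":".join(fields)


# Only device and autoconf set: five empty fields precede the device name.
print(build_ip_param(device="eth0", autoconf="dhcp"))  # ip=:::::eth0:dhcp

# Fully static variant (illustrative values).
print(build_ip_param(client="192.168.2.102", server="192.168.2.101",
                     gateway="192.168.2.101", netmask="255.255.255.0",
                     hostname="client2", device="eth0", autoconf="none"))
```

The first call reproduces the ip=:::::eth0:dhcp form listed above, which at least confirms that the number of colons in my attempts matches the documented field order.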

While both linux and initrd are loaded, continuing results in

    [FAILED] "Failed to start Switch Root."
    See 'systemctl status initrd-switch-root.service' for details.
    You are in emergency mode. After logging in, type "journalctl -xb" to view
    system logs, "systemctl reboot" to reboot, "systemctl default" or ^D to enter into default mode.
    Press Enter for maintenance
    (or press Control-D to continue):

# troubleshooting
## removing add_efi_memmap
Instead of Failed to start Switch Root., the kernel panics:

    [    1.114386] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,255)
    [    1.114458] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 4.13.12-1-ARCH #1 
    [    1.114509] Hardware name: ASUSTeK COMPUTER INC. UX51V2A/UX51VZA, BIOS UX51VZA.204 12/03/2012
    [    1.114573] Call Trace:
    [    1.114604]  dump_stack+0x63/0x8b
    [    1.114637]  panic+0xe4/0x23d
    [    1.114667]  mount_block_root+0x1f4/0x2ab
    [    1.114703]  ? set_debug_rodata+0x17/0x17
    [    1.114737]  mount_root+0x6a/0x6d
    [    1.114767]  prepare_namespace+0x134/0x16c
    [    1.114802]  kernel_init_freeable+0x1ec/0x205
    [    1.114840]  ? rest_init+0xe0/0xe0
    [    1.114872]  kernel_init+0xc/0xfc
    [    1.114904]  ret_from_fork+0x25/0x30
    [    1.114957] Kernel Offset: 0x3000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
    [    1.115040] ---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,255)

## [systemd debugging](https://freedesktop.org/wiki/Software/systemd/Debugging/) 
I cannot access journalctl. Either the keyboard is not detected or the system freezes, because neither Enter nor ^D has any effect.

Attempting to boot directly into emergency mode by adding systemd.unit=emergency.target or emergency to the kernel command line does not seem to work.

+(UPDATE 2) The mkinitcpio argument break=premount does not change the systemd startup.

## network
Wireshark shows no network activity after the initial PXE boot; that is, once linux and initrd are loaded, there is no further communication between the client and server. 

    SERVER IP: 192.168.2.101/24
    CLIENT IP: 192.168.2.102/24

### GRUB
The GRUB net_* [commands](https://www.gnu.org/software/grub/manual/grub/html_node/Networking-commands.html#Networking-commands)  and [environment variables](https://www.gnu.org/software/grub/manual/grub/html_node/Network.html)  seem to indicate that everything is in order; tftp works.

    net_ls_cards  efinet0 [CLIENT NETWORK DEVICE MAC]
    net_ls_addr   efinet0 [CLIENT NETWORK DEVICE MAC] 192.168.2.102
    net_ls_routes efinet0:local 192.168.2.0/24 efinet0
                  efinet0:default 0.0.0.0/0 gw 192.168.2.101

    echo $net_default_ip               192.168.2.102
    echo $net_default_mac              [CLIENT NETWORK DEVICE MAC]
    echo $net_default_server           192.168.2.101
    echo $net_efinet0_ip               192.168.2.102
    echo $net_efinet0_mac              [CLIENT NETWORK DEVICE MAC]
    echo $net_efinet0_hostname         (empty)
    echo $net_efinet0_domain           (empty)
    echo $net_efinet0_dhcp_server_name (empty)
    echo $net_efinet0_next_server      192.168.0.101
    echo $net_efinet0_root_path        192.168.0.101:/srv/[CLIENT OS]
    echo $net_efinet0_extensionpath    (empty)

### Kernel support for nfsroot and ip
Given that there is no network activity, I presume that neither the ip nor the nfsroot parameter is being acted on.

In fact, the problem I am having is described in the question [Built the kernel with NFS support but not getting /dev/nfs](https://unix.stackexchange.com/questions/140176/built-the-kernel-with-nfs-support-but-not-getting-dev-nfs) .

The answer to that question states (Andreas Wiese, Jul 1 '14 at 14:58):

> ... make sure to have NFS support built into your kernel binary and not as a module (or have an initramfs, which takes care of this). Same goes for network drivers: you'll most probably want to have the driver for you ethernet NIC built into your kernel image, otherwise you'll have to load it from an initramfs.
>
> In short, there are several possibilities:
>
> 1. Do as above link tells you: have root=/dev/nfs set, give the correct nfsroot parameter and tell your kernel your network configuration via the ip parameter (this would be the best way to make sure it's working at all, i.e. to rule out a misconfigured DHCP server).
> 2. Have CONFIG_IP_PNP and CONFIG_IP_PNP_DHCP enabled and set up a DHCP daemon to tell your client which IP address to use and where to find its NFS-root.
> 3. Build an initramfs which does the correct configuration and NFS-mounting.

Investigating the archlinux kernel 

    zgrep CONFIG_NFS_FS= /proc/config.gz -> CONFIG_NFS_FS=m
    zgrep DHCP /proc/config.gz           -> (nothing)
    zgrep _IP_PNP_ /proc/config.gz       -> CONFIG_IP_PNP is not set

indicates that the archlinux kernel does not have support for the ip parameter compiled in.
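
The same check can be written so that "built in", "module", "not set" and "absent" are told apart explicitly, which plain zgrep output makes easy to misread. A minimal Python sketch, assuming the usual kernel config line formats (`OPTION=value` and `# OPTION is not set`); the function name `kconfig_state` is made up, the sample text simply mirrors the zgrep results above, and CONFIG_ROOT_NFS is an extra option I added to the loop because it is the one that governs NFS root in the kernel itself:

```python
import re


def kconfig_state(config_text: str, option: str) -> str:
    """Return the assigned value ('y', 'm', ...), 'not set', or 'absent' for one option."""
    for line in config_text.splitlines():
        stripped = line.strip()
        match = re.fullmatch(rf"{re.escape(option)}=(.*)", stripped)
        if match:
            return match.group(1)
        if stripped == f"# {option} is not set":
            return "not set"
    return "absent"


# Sample text mirroring the zgrep results above.
sample = """\
CONFIG_NFS_FS=m
# CONFIG_IP_PNP is not set
"""

for opt in ("CONFIG_NFS_FS", "CONFIG_IP_PNP", "CONFIG_IP_PNP_DHCP", "CONFIG_ROOT_NFS"):
    print(opt, "->", kconfig_state(sample, opt))
```

To run it against a live kernel, the text could be read with `gzip.open("/proc/config.gz", "rt").read()`, assuming the kernel exposes its configuration there, as the zgrep calls above imply.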

A comment on a bug report from 2006, [FS#5056 - Default kernel has NFS root mouting disabled](https://bugs.archlinux.org/task/5056?opened=243&status%5B0%5D=), states:

> mkinitcpio supports netbooting already without changing the kernel

This can be compared with a comment on the accepted answer to the question referred to above:

> Since around 10 years the kernel doesn't boot nfs directly, but it mounts an initial ramdisk, which re-interprets the kernel command line and boots from where you want. – peterh Jun 17 '16 at 13:54 

### mkinitcpio

From the output of lsinitcpio -a:

    ...
    Created with mkinitcpio 24
    Kernel: 4.13.12-1-ARCH
    Size: 55,63 MiB
    Compressed with: gzip
      ...

    Included modules:
    ... nfs ... nfsv3 nfsv4 [explicit] ...

    Included binaries:
    ... ipconfig ... mount.nfs4 ... nfsmount ...  

    Early hook run order:
    udev

    Hook run order:
    udev net net_nfs4 nbd

    Cleanup hook order:
    udev

### mkinitcpio support for network device (update #1)
Although the drivers for the network card should be loaded, I wanted to make sure after reading [[SOLVED] Diskless - ipconfig: no devices to configure](https://bbs.archlinux.org/viewtopic.php?id=169335) .

> put network module drive in /etc/mkinitcpio.conf.

    MODULES=(atl1c nbd nfsv4)

Neither explicitly declaring the module nor building the entire initramfs.img on the client made any difference.

> Don't use autodetect if the image should run on different machines. autodetect removes all drivers which are not necessary for booting on the currently running system.

Removing autodetect from the hooks had an interesting result: the kernel panic observed earlier when removing add_efi_memmap occurred again. Removing add_efi_memmap when loading the no-autodetect initramfs had no further effect.  

### mkinitcpio support for nfs
Archlinux may or may not have support for nfs4.

* [mkinitcpio Runtime customization Using net](https://wiki.archlinux.org/index.php/mkinitcpio#Using_net) 
* [FS#28287 - [mkinitpio-nfs-utils] NFS4 Support](https://bugs.archlinux.org/task/28287) 

As far as I can tell, this is a secondary issue; the network must work before an attempt to mount nfs can be made.

### mkinitcpio support for ip

I have just found out that

* [mkinitcpio-nfs-utils (0.3-5)](https://www.archlinux.org/packages/core/x86_64/mkinitcpio-nfs-utils/)  includes an "ipconfig" binary, and
* there is a [mkinitcpio-netconf 0.0.4-2](https://aur.archlinux.org/packages/mkinitcpio-netconf/) .

# additional information
This may or may not be relevant.

The reason for using "UEFI PXE boot" instead of "BIOS PXE boot" is that GRUB i386-pc fails to load grub.cfg. The computer restarts, freezes on "Welcome to GRUB!", or clutters the screen with colored pixels; the outcome seems random. The Wireshark logs reveal that tftp sometimes loads all GRUB modules and sometimes does not. The last log entry is often the client asking for the server's network device: ARP 60 Who has [SERVER IP]? Tell [CLIENT IP]
 


                                

user212827 (91 rep)

Dec 3, 2017, 02:26 AM • Last activity: Dec 10, 2017, 11:48 PM

14 votes

2 answers

8385 views

Describe in detail the boot process of a Linux system

linux boot boot-loader netboot

I am preparing a detailed document that sheds light on the Linux boot sequence, from pressing the power button on the host to the appearance of the login prompt.

It would be great if we could combine and collate the right answers here into a single place of reference. Please include any details worth noting during startup.
Once the document is complete on all points, I will post its details here as well and update the link in the question. 

Please consider all possible scenarios, such as booting from disk, booting from USB, and booting from the network on a diskless client where the rootfs (/) is on the network.

Nikhil Mulley (8405 rep)

Dec 17, 2011, 10:05 AM • Last activity: Sep 12, 2017, 01:38 PM

2 votes

1 answers

141 views

Debian on Virtual Mips Malta platform

debian debian-installer mips netboot

I am trying to run Debian on the MIPS Malta platform, emulated by OVP.
The problem is that the setup tool needs to download files from Debian mirrors, but the download does not proceed. I tried many different mirrors from different countries, but I always get the following message:

> The installer failed to download a file from the mirror. This may be a
> problem from your network, or with the mirror. You can choose to retry
> to download, select a different mirror, or cancel and choose another
> installation method.

I also replaced the default Debian Installer embedded in OVP with the one available here: http://ftp.nl.debian.org/debian/dists/jessie/main/installer-mipsel/current/images/malta/netboot/  but the same message appears.

Obviously, this installation method is based on netboot. Do you know where exactly I can find the files required for an offline installation, in case the netboot version cannot be fixed?

                                

user211993 (21 rep)

Jan 23, 2017, 05:01 PM • Last activity: Jan 23, 2017, 06:23 PM
