
Unix & Linux Stack Exchange

Q&A for users of Linux, FreeBSD and other Unix-like operating systems

Latest Questions

0 votes
1 answer
1666 views
If I have a json string how do I calculate the number of bytes needed when stored?
I have a formatted JSON string displayed in a web page. What I am trying to understand is the size, in bytes, that this JSON string requires. If I copy it and pipe it to `wc -c` I get `1000`, which is the number of characters, but I don't think this means that the JSON string is 1000 bytes, as I have seen suggested when googling around. The reason I am confused is the following: in Java, for instance, a String is composed of chars, and each char is 2 bytes to support utf-8. JSON also supports utf-8, so I am not sure whether I should consider that the JSON string requires 2000 bytes. What is a way to figure this out?
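A minimal sketch of how the stored size can be measured, assuming the string is saved as UTF-8. `wc -c` counts bytes rather than characters, so its result is already the storage size; `wc -m` counts characters instead and depends on the current locale. The sample JSON below is made up for illustration.

```shell
# Hypothetical sample; \303\274 is the UTF-8 encoding of "ü".
json=$(printf '{"name":"M\303\274ller"}')

# Bytes the string occupies when stored as UTF-8.
# printf '%s' avoids the trailing newline that echo would add.
json_bytes=$(( $(printf '%s' "$json" | wc -c) ))
echo "bytes: $json_bytes"

# Character count; the result depends on the current locale.
printf '%s' "$json" | wc -m
```

The sample has 17 characters but takes 18 bytes, because "ü" needs two bytes in UTF-8 while every ASCII character needs one.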
Jim (1479 rep)
Feb 22, 2024, 06:27 PM • Last activity: Mar 1, 2024, 09:17 PM
3 votes
1 answer
1058 views
What is "length" of a string in Bourne shell compatibles' `${#string}`?
Arising from [this](https://unix.stackexchange.com/questions/685602/count-bytes-of-filename/685603?noredirect=1#comment1295723_685603) discussion: When I have (zsh 5.8, bash 5.1.0)
var="ASCII"
echo "${var} has the length ${#var}, and is $(printf "%s" "$var"| wc -c) bytes long"
the answer is simple: these are 5 characters, occupying five bytes. Now, var=Müller yields
Müller has the length 6, and is 7 bytes long
Which suggests the ${#} operator counts codepoints, not bytes. This is a bit unclear [in POSIX](https://pubs.opengroup.org/onlinepubs/9699919799.2016edition/utilities/V3_chap02.html#tag_18_06_02) , where they say it counts "characters". This would be clearer if characters in POSIX C weren't octets, normally. Anyways: Nice! Kind of good, seeing that LANG==en_US.utf8. Now,
var='🧜🏿‍♀️'
echo "${var} has the length ${#var}, and is $(printf "%s" "$var"| wc -c) bytes long"
🧜🏿‍♀️ has the length 5, and is 17 bytes long
Soooo, we decompose "Mermaid of dark skin color" into the Unicode codepoints
1. Merperson
2. Dark skin tone
3. Zero-Width Joiner
4. Female
5. Print the previous character as emoji
Fine, so we're really counting Unicode codepoints!
var="e\xcc\x81"
echo "${var} has the length ${#var}, and is $(printf "%s" "$var"| wc -c) bytes long"
é has the length 9, and is 9 bytes long
(of course, my console font decided that the ´ combines with the following space, not the preceding e. The latter would be correct. But let's leave my rage about that for somewhen else.) Um, a slight "wat" is in order here.
> printf "e\xcc\x81"|wc -c
3
> printf "%s" "${var}" |wc -c
9
> echo -n ${var} |wc -c
3
> echo "${var} has the length ${#var}, and is $(printf "%s" "$var"| wc -c) bytes long"
é has the length 9, and is 9 bytes long
> printf "%s" "${var}" |xxd
00000000: 655c 7863 635c 7838 31                   e\xcc\x81
Here's where I give up. echo $var, echo ${var} and echo "${var}" all "correctly" emit three bytes. However, echo ${#var} tells me it's 9 characters. Where is this documented/standardized, and what are the rules for all this?
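A sketch of what is probably going on, assuming a POSIX-style shell: inside double quotes `\x` is not an escape sequence, so the variable holds nine literal characters. The three bytes only appear when something decodes the escapes at output time, such as `printf` with the string as its format, or an `echo` that interprets backslash escapes (zsh's builtin does so by default). The variable names below are illustrative.

```shell
# Double quotes keep the backslashes: 9 literal characters (e \ x c c \ x 8 1).
literal="e\xcc\x81"
echo "literal: length ${#literal}"

# Let printf decode the escapes once, at assignment time, so the variable
# really holds 'e' followed by the two bytes of U+0301 (combining acute).
# \314\201 is the octal spelling of \xcc\x81.
decoded=$(printf 'e\314\201')
decoded_bytes=$(( $(printf '%s' "$decoded" | wc -c) ))
echo "decoded: ${decoded_bytes} bytes, length ${#decoded} in the current locale"
```

The last `${#decoded}` value depends on the shell and the locale: a shell counting characters in a UTF-8 locale should report 2, one counting bytes reports 3.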
Marcus Müller (47107 rep)
Jan 9, 2022, 12:36 PM • Last activity: Jan 30, 2023, 04:42 PM
0 votes
1 answer
2448 views
Convert variable from little endian to big endian
Working in Bash, I have a hex variable that I must convert from little endian to big endian. I am new to the entire concept and only learned about it about 20 minutes ago, so please bear with me. My script determines a hex variable that undergoes a few changes: decimal, signed 2's complement, and division by 8. Before everything, though, it must go through little endian to big endian conversion (I may be confusing the two, but my example below should clarify).
EXAMPLE:
1. Hex Value: 0080 After Conversion: 8000
2. Hex Value: 9800 After Conversion: 0098
3. Hex Value: 1234 After Conversion: 3412
I believe that this is a 16-bit hex variable, as it is always 4 digits.
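For a value that is always exactly four hex digits, the swap in the examples can be sketched with plain parameter expansion. `swap16` is a hypothetical helper name, and the sketch assumes the input has no `0x` prefix.

```shell
# swap16 HEX: swap the two bytes of a 4-digit hex string (hypothetical helper).
swap16() {
  hex=$1
  # ${hex#??} drops the first two digits, ${hex%??} drops the last two.
  printf '%s\n' "${hex#??}${hex%??}"
}

swap16 0080
swap16 9800
swap16 1234
```

The three calls reproduce the conversions listed in the question: 8000, 0098 and 3412.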
Nir (1 rep)
Nov 15, 2022, 07:52 PM • Last activity: Nov 15, 2022, 08:25 PM
0 votes
1 answer
666 views
print byte from number in awk
I can print a byte from a string literal like: awk 'BEGIN {print "\001"}' | cat -v But I need to print a byte of the result of a bitwise OR. So how can I print a byte from a number? Gawk is ok.
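A minimal sketch, assuming the number is below 128: awk's `printf "%c"` prints the character whose code is the numeric argument. The OR is done in shell arithmetic here so the example runs with any awk; gawk also has a built-in `or()` function that could be used inside the awk program instead.

```shell
# Do the bitwise OR in shell arithmetic (0x40 | 0x01 = 0x41), then let
# awk's printf "%c" turn the number into the byte with that value.
value=$(( 0x40 | 0x01 ))

# n + 0 forces a numeric value; od shows the resulting byte in hex.
awk -v n="$value" 'BEGIN { printf "%c", n + 0 }' | od -An -tx1
```

For values above 127, gawk in a multibyte locale may emit a multibyte character rather than a single byte; running it with `LC_ALL=C` is one way to avoid that.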
sedwho (5 rep)
Feb 4, 2022, 11:39 PM • Last activity: Feb 5, 2022, 12:48 AM
0 votes
2 answers
1187 views
Count bytes of filename
How can I find out how many bytes the name of a file takes up? Just the file name, not the full path. I've tried this: echo 'filename.extension' | wc -c. Is this right?
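A short sketch of the difference between the two ways of feeding the name to `wc -c`. `echo` appends a newline, which `wc -c` counts as well, whereas `printf '%s'` writes only the name itself.

```shell
name='filename.extension'

# echo appends a newline, so the count is one byte too high.
with_newline=$(( $(echo "$name" | wc -c) ))

# printf '%s' writes exactly the bytes of the name.
name_bytes=$(( $(printf '%s' "$name" | wc -c) ))

echo "echo: $with_newline, printf: $name_bytes"
```

For this name the two counts are 19 and 18.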
Smeterlink (295 rep)
Jan 8, 2022, 11:23 PM • Last activity: Jan 9, 2022, 01:56 PM
0 votes
1 answer
101 views
Execute Program until specific amount of bytes has been returned on stdout, then terminate
Imagine I have the following program/script ./generate-infinite-byte-stream:
#!/bin/bash
echo -n 'hello'
sleep infinity
The infinite sleep command represents a network connection that may or may not deliver more data in the indefinite future that I am not interested in. I would like to have a program, let's call it take 5 that runs ./generate-infinite-byte-stream until it has output 5 bytes on stdout and then terminates it:
take 5 ./generate-infinite-byte-stream
# gives 'hello' and returns with exit code 0
Is there such a program or do I need to roll my own with popen()? The program take should also redirect stdin to the executed program. Note: head -c 5 does not do the right thing, because it does not terminate:
./generate-infinite-byte-stream | head -c 5
# this returns 'hello', but never terminates
Aside: The name Take is inspired by the https://reference.wolfram.com/language/ref/Take.html command which returns the first n elements of a list.
masterxilo (137 rep)
Jun 4, 2021, 09:48 AM • Last activity: Jun 4, 2021, 10:13 AM
11 votes
2 answers
2798 views
How do I find the first non-zero byte on a block device, with an optional offset?
I'm trying to find the first non-zero byte (starting from an optional offset) on a block device using dd and print its offset, but I am stuck. I didn't mention dd in the title as I figured there might be a more appropriate tool than dd to do this, but I figured dd should be a good start. If you know of a more appropriate tool and/or more efficient way to reach my goal, that's fine too. In the meantime I'll show you how far I've come with dd in bash, so far.
#!/bin/bash

# infile is just a temporary test file for now, which will be replaced with /dev/sdb, for instance
infile=test.txt
offset=0

while true; do
  byte=$(dd status='none' bs=1 count=1 if="$infile" skip=$offset)
  ret=$?

  # the following doesn't appear to work
  # ret is always 0, even when the end of file/device is reached
  # how do I correctly determine if dd has reached the end of file/device?
  if [ $ret -gt 0 ]; then
    echo 'error, or end of file reached'
    break
  fi

  # I don't know how to correctly determine if the byte is non-zero
  # how do I determine if the read byte is non-zero?
  if [ $byte ???? ]; then
    echo "non-zero byte found at $offset"
    break
  fi

  ((++offset))
done
As you can see, I'm stuck with two issues that I don't know how to solve:
a. How do I make the while loop break when dd has reached the end of the file/device? dd gives an exit code of 0, where I expected a non-zero exit code instead.
b. How do I evaluate whether the byte that dd read and returns on stdout is non-zero? I think I've read somewhere that special care should be taken in bash with \0 bytes as well, but I'm not even sure this pertains to this situation.
Can you give me some hints on how to proceed, or perhaps suggest an alternative way to achieve my goal?
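One alternative to reading a byte at a time with dd, as a sketch only: let `od` print every byte as a decimal number and let awk report the position of the first one that is not 0. This sidesteps both issues, since no NUL byte is ever stored in a shell variable and the end of input is simply the end of the pipe. `first_nonzero` is a hypothetical helper; it will be slow on large devices.

```shell
# first_nonzero FILE [OFFSET]: print the offset of the first non-zero byte
# at or after OFFSET; return 1 if there is none (hypothetical helper).
first_nonzero() {
  file=$1 offset=${2:-0}
  # -An: no address column, -v: do not collapse repeated lines,
  # -tu1: one unsigned decimal number per byte, -j: skip OFFSET bytes.
  od -An -v -tu1 -j "$offset" "$file" |
    awk -v pos="$offset" '
      { for (i = 1; i <= NF; i++) { if ($i != 0) { print pos; found = 1; exit } pos++ } }
      END { exit !found }'
}

# Demo on a small temporary file: three NUL bytes, then "A", then a NUL.
tmp=$(mktemp)
printf '\000\000\000A\000' > "$tmp"
first_nonzero "$tmp"
first_nonzero "$tmp" 4 || echo "no non-zero byte from offset 4"
rm -f "$tmp"
```

The first call prints 3, the offset of "A". The second finds nothing and returns a non-zero status.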
ExploringQuest (113 rep)
Jun 1, 2021, 01:01 PM • Last activity: Jun 2, 2021, 10:53 PM
0 votes
2 answers
304 views
Is there a way to strip the high bit of each byte in a file?
I've been trying to figure out if this can be done in sed or tr, but I can't find it. I have a bunch of files from an old Apple II which have the high bit set on each byte. On a Mac, this results in a bunch of gibberish. Of course, I could write a program to xor $80 each byte, but I'm thinking that there MUST be a way in UNIX to do this! Any ideas?
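A sketch of how this might be done with `tr`, assuming a `tr` that accepts octal ranges: map the upper half of the byte range onto the lower half, which clears bit 7 of every byte. `strip_high_bit` is a hypothetical wrapper name.

```shell
# Map bytes 0x80-0xFF onto 0x00-0x7F, i.e. clear bit 7 of every byte.
# LC_ALL=C makes tr treat the input as single bytes.
strip_high_bit() {
  LC_ALL=C tr '\200-\377' '\000-\177'
}

# "HI" with the high bit set is 0xC8 0xC9 (octal 310 311).
printf '\310\311\n' | strip_high_bit
```

Bytes that already have the high bit clear, such as the newline in the demo, pass through unchanged. Usage on a file would be `strip_high_bit < infile > outfile`.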
bjb (113 rep)
Apr 1, 2021, 10:56 PM • Last activity: Apr 1, 2021, 11:50 PM
5 votes
3 answers
9541 views
What is the difference between a byte and a character (at least *nixwise)?
I understand that any character is made up of one or more bytes.
If I am not mistaken, at least in *nix operating systems, a character will generally (or always?) be made up of only one byte. What is the difference between a byte and a character (at least *nixwise)?
variableexpander (125 rep)
Feb 23, 2021, 06:00 PM • Last activity: Feb 24, 2021, 06:00 PM
20 votes
4 answers
40315 views
Is there a oneliner that converts a binary file from little endian to big endian?
and vice versa. I am running Red Hat, if relevant.
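For 16-bit values only, one possible one-liner is `dd` with `conv=swab`, which swaps every pair of input bytes. This is a sketch under that assumption; wider words (32 or 64 bit) need a different tool.

```shell
# conv=swab swaps every pair of input bytes, i.e. flips 16-bit words.
# dd's transfer statistics go to stderr and are discarded here.
printf 'ABCD' | dd conv=swab 2>/dev/null
```

The demo turns `ABCD` into `BADC`. Applying the same command again restores the original order, so it covers both directions.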
Fermat's Little Student (585 rep)
Oct 29, 2015, 04:22 PM • Last activity: Feb 24, 2021, 06:20 AM
5 votes
5 answers
2990 views
How to count the number of bytes in a file, grouping the same bytes?
Example: I have the file "mybinaryfile", and the contents in hex are:
A0 01 00 FF 77 01 77 01 A0
I need to know how many A0 bytes there are in this file, how many 01, and so on. The result could be:
A0: 2
01: 3
00: 1
FF: 1
77: 2
Is there some way to make this count directly in shell or do I need to write a program in whatever language to do this specific task?
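A sketch of one shell-only approach: dump the file as one hex byte per line, then sort and count identical lines. `byte_histogram` is a hypothetical helper name; the output has the count first and uses lowercase hex, so it differs slightly from the format in the question.

```shell
byte_histogram() {
  # od prints each byte as two hex digits; tr puts one byte per line,
  # grep . drops the empty lines, sort | uniq -c counts identical values.
  od -An -v -tx1 "$1" | tr -s ' ' '\n' | grep . | LC_ALL=C sort | uniq -c
}

# Demo with the nine bytes from the question (written as octal escapes).
tmp=$(mktemp)
printf '\240\001\000\377\167\001\167\001\240' > "$tmp"
byte_histogram "$tmp"
rm -f "$tmp"
```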
Lawrence (329 rep)
Jun 28, 2019, 04:50 PM • Last activity: Mar 31, 2020, 11:56 AM
1 votes
4 answers
1569 views
How to count the number of bytes in a very large file, grouping the same bytes?
I am searching for a way to get statistics on a very large file (multiple times larger than the available RAM) that show which byte values are present in the file and how often:
A0 01 00 FF 77 01 77 01 A0
I need to know how many A0 bytes there are in this file, how many 01, and so on. The result could be:
A0: 2
01: 3
00: 1
FF: 1
77: 2
Therefore this question is very close to the question [How to count the number of bytes in a file, grouping the same bytes?](https://unix.stackexchange.com/questions/527521/how-to-count-the-number-of-bytes-in-a-file-grouping-the-same-bytes) but none of the existing answers works for larger files. From my understanding, all answers require a minimum amount of RAM equal to the size of the file to be tested (up to multiple times). Hence the answers don't work on systems with little RAM, e.g. a Raspberry Pi processing a multi-GB file. Is there a simple solution that works on any file size even if we have, for example, only 512MB RAM available?
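A sketch of a streaming approach whose memory use does not grow with the file: `od` converts the input to hex as it reads, and awk keeps one counter per distinct byte value, so at most 256 counters exist no matter how large the file is. `byte_counts` is a hypothetical helper name, and the output uses lowercase hex.

```shell
byte_counts() {
  od -An -v -tx1 "$1" |
    awk '{ for (i = 1; i <= NF; i++) count[$i]++ }
         END { for (b in count) printf "%s: %d\n", b, count[b] }' |
    LC_ALL=C sort
}

# Demo with the nine bytes from the question (written as octal escapes).
tmp=$(mktemp)
printf '\240\001\000\377\167\001\167\001\240' > "$tmp"
byte_counts "$tmp"
rm -f "$tmp"
```

Only the final list of at most 256 lines is sorted, so the `sort` step stays cheap regardless of the input size.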
Robert (163 rep)
Mar 30, 2020, 12:31 PM • Last activity: Mar 31, 2020, 12:51 AM
0 votes
1 answer
287 views
Entropy: what's the difference between bits and bytes?
If I use openssl to generate some random data (for a keyfile, for example):
openssl rand -hex 2048 >/tmp/file
Is this 4097 bits (or bytes?) of entropy?
-rw-rw-r-- 1 username username 4097 Oct 30 20:01 /tmp/file
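The arithmetic behind the 4097 can be sketched as follows, assuming the argument to `openssl rand -hex` is the number of random bytes and that the hex output ends with a newline, which matches the file size shown.

```shell
random_bytes=2048                        # argument to `openssl rand -hex`
file_size=$(( random_bytes * 2 + 1 ))    # two hex digits per byte, plus a newline
random_bits=$(( random_bytes * 8 ))      # 8 bits per random byte
echo "file size: $file_size bytes, random data: $random_bits bits"
```

So 4097 is the file size in bytes, while the random data it encodes is 2048 bytes, or 16384 bits.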
user318576 (3 rep)
Oct 31, 2018, 03:04 AM • Last activity: Oct 31, 2018, 03:44 AM
1 votes
2 answers
28619 views
What options does wget --report-speed take?
When I run this command:
wget --report-speed=type
the only *type* it accepts is bits. It won't take numbers, kilobits / kilobytes or bytes. The help page (wget --help) says:
--report-speed=TYPE Output bandwidth as TYPE. TYPE can be bits.
suggesting that the TYPE **can** be something else? What options does it take that I haven't found, and (if this option doesn't do this) how can I force the speed to be displayed as bytes or kilobytes?
Tim (123 rep)
Dec 7, 2014, 01:13 PM • Last activity: Jul 8, 2017, 08:37 PM
22 votes
3 answers
26815 views
How do I trim bytes from the beginning and end of a file?
I have a file that has trash (a binary header and footer) at the beginning and end of the file. I would like to know how to nuke these bytes. For example, let's assume 25 bytes from the beginning and 2 bytes from the end. I know I can use truncate and dd, but truncate doesn't work with a stream, and it seems kind of kludgy to run two commands on the hard file. It would be nicer if truncate, knowing how big the file was, could cat the file to dd. Or is there a nicer way to do this?
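One possible pipeline, as a sketch: `tail -c +N` starts output at byte N, and `head -c` keeps a fixed number of bytes, so the file is read once and never modified in place. `trim_bytes` is a hypothetical helper; it asks `wc` for the size, so it needs a regular file rather than a stream.

```shell
# trim_bytes FILE HEAD TAIL: print FILE without its first HEAD and last TAIL bytes.
trim_bytes() {
  size=$(( $(wc -c < "$1") ))
  # tail -c +N is 1-based, so skipping HEAD bytes means starting at HEAD + 1.
  tail -c "+$(( $2 + 1 ))" "$1" | head -c "$(( size - $2 - $3 ))"
}

# Demo: 5 header bytes, a payload, 2 footer bytes.
tmp=$(mktemp)
printf 'HHHHHpayloadFF' > "$tmp"
trim_bytes "$tmp" 5 2
rm -f "$tmp"
```

For the numbers in the question the call would be `trim_bytes file 25 2`. With GNU head, `head -c -2` drops the last two bytes without needing the size, which also works on a stream.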
Evan Carroll (34663 rep)
May 24, 2017, 08:38 PM • Last activity: May 24, 2017, 09:57 PM
26 votes
3 answers
1536 views
Byte count of "ls -l <random file>" versus that of "wc -c <random file>"
Is there any possible situation in which ls -l file.txt shows a different number of bytes than wc -c file.txt? In one script I found a comparison of those two values. What could be the reason for that? Is it even possible to have different byte counts for the same file?
Rokas.ma (573 rep)
Jan 16, 2017, 02:35 PM • Last activity: Jan 16, 2017, 03:49 PM
3 votes
2 answers
740 views
How to bury an invisible mark into lines of text?
How can I bury an **invisible** mark into random lines of text? Such a mark has to be there, though it will be invisible to someone reading that text printed out on the console. I want to identify those lines by means of an invisible mark in order to, for instance, grep them in or out later. I tried 0x00 without success. I expected grep to print lines matching 0x00 somewhere. But this didn't work:
$ echo -e "a\0b" | hexdump -C
00000000 61 00 62 0a |a.b.|
00000004
$ echo -e "a\0b" | grep "a\0b"
n.r. (2263 rep)
Dec 29, 2013, 09:53 PM • Last activity: Dec 29, 2013, 11:17 PM
-2 votes
1 answer
2045 views
How large is the Linux kernel compared to Unix?
Not in just LOC (lines of code), but in storage size, as in bytes, megabytes, gigabytes, etc. Also, any sources where I can download the original-based Unix OS? Thanks!
thomas brain (15 rep)
Dec 6, 2013, 09:42 PM • Last activity: Dec 6, 2013, 10:37 PM
0 votes
1 answer
316 views
Why is od calculating decimal values wrong?
This question is related to the answer from enzotib to the question: https://unix.stackexchange.com/questions/88848/how-could-i-use-bash-to-find-2-bytes-in-a-binary-file-increase-their-values-an
This converts the two bytes into their hex value:
$ echo -n $'\x1b\x1f' | od -tx2
0000000 1f1b
0000002
But now, this should give me the decimal value:
$ echo -n $'\x1b\x1f' | od -tu2
0000000 7963
0000002
But if I convert the hex value into decimal, it should be
$ printf "%d" 0x1b1f
6943
Why is that? Am I using od wrong for decimal output?
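A sketch of the arithmetic, assuming `od` reads each two-byte unit in the machine's native byte order, which the `1f1b` output suggests is little-endian here. The two od results then agree with each other, and only the manual conversion uses the opposite byte order. The variable names are illustrative.

```shell
# The two bytes in file order.
first=$(( 0x1b ))
second=$(( 0x1f ))

# Little-endian: the second byte is the most significant one.
little=$(( second * 256 + first ))
# Big-endian: the first byte is the most significant one.
big=$(( first * 256 + second ))

printf 'little-endian: 0x%04x = %d\n' "$little" "$little"
printf 'big-endian:    0x%04x = %d\n' "$big" "$big"
```

The little-endian reading gives 0x1f1b = 7963, matching both od outputs; the big-endian reading gives 0x1b1f = 6943, the value from the manual `printf`.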
erik (17679 rep)
Aug 30, 2013, 09:48 PM • Last activity: Aug 30, 2013, 10:18 PM
Showing page 1 of 19 total questions