Unix & Linux Stack Exchange
Q&A for users of Linux, FreeBSD and other Unix-like operating systems
Latest Questions
0
votes
4
answers
1869
views
create variables from CSV with varying number of fields
Looking for some help turning a CSV into variables. I tried using IFS, but it seems you need to define the number of fields. I need something that can handle a varying number of fields.
*I am modifying my original question with the current code I'm using (taken from the answer provided by hschou), which includes updated variable names using type instead of row, section, etc.
I'm sure you can tell by my code that I am pretty green with scripting, so I am looking for help to determine if and how I should add another loop, or take a different approach to parsing the typeC data. Although all three types follow the same format, there is only one entry each for the typeA and typeB data, while there can be between 1 and 15 entries for the typeC data. The goal is only 3 files, one for each of the data types.
Data format:
Container: PL[1-100]
TypeA: [1-20].[1-100].[1-1000].[1-100]-[1-100]
TypeB: [1-20].[1-100].[1-1000].[1-100]-[1-100]
TypeC (1 to 15 entries): [1-20].[1-100].[1-1000].[1-100]-[1-100]
*There is no header in the CSV, but if there were it would look like this (Container, typeA, and typeB data always being in positions 1, 2, 3, and typeC data being all that follow): Container,typeA,typeB,typeC,typeC,typeC,typeC,typeC,..
CSV:
PL3,12.1.4.5-77,13.6.4.5-20,17.3.577.9-29,17.3.779.12-33,17.3.802.12-60,17.3.917.12-45,17.3.956.12-63,17.3.993.12-42
PL4,12.1.4.5-78,13.6.4.5-21,17.3.577.9-30,17.3.779.12-34
PL5,12.1.4.5-79,13.6.4.5-22,17.3.577.9-31,17.3.779.12-35,17.3.802.12-62,17.3.917.12-47
PL6,12.1.4.5-80,13.6.4.5-23,17.3.577.9-32,17.3.779.12-36,17.3.802.12-63,17.3.917.12-48,17.3.956.12-66
PL7,12.1.4.5-81,13.6.4.5-24,17.3.577.9-33,17.3.779.12-37,17.3.802.12-64,17.3.917.12-49,17.3.956.12-67,17.3.993.12-46
PL8,12.1.4.5-82,13.6.4.5-25,17.3.577.9-34
Code:
#!/bin/bash
#Set input file
_input="input.csv"
# Pull variables in from csv
# read file using while loop
while read; do
declare -a COL=( ${REPLY//,/ } )
echo -e "container=${COL[0]}\ntypeA=${COL[1]}\ntypeB=${COL[2]}" >/tmp/typelist.txt
idx=1
while [ $idx -lt 16 ]; do
echo "typeC$idx=${COL[$((idx+2))]}" >>/tmp/typelist.txt
let idx=idx+1
done
#whack off empty variables
sed '/=$/d' /tmp/typelist.txt > /tmp/typelist2.txt && mv /tmp/typelist2.txt /tmp/typelist.txt
#set variables from temp file
. /tmp/typelist.txt
sleep 1
#Parse data in this loop.#
echo -e "\n"
echo "Begin Processing for $container"
#echo $typeA
#echo $typeB
#echo $typeC
#echo -e "\n"
#Strip - from sub data for extra parsing
typeAsub="$(echo "$typeA" | sed 's/\-.*$//')"
typeBsub="$(echo "$typeB" | sed 's/\-.*$//')"
typeCsub1="$(echo "$typeC1" | sed 's/\-.*$//')"
#strip out first two decimils for extra parsing
typeAprefix="$(echo "$typeA" | cut -d "." -f1-2)"
typeBprefix="$(echo "$typeB" | cut -d "." -f1-2)"
typeCprefix1="$(echo "$typeC1" | cut -d "." -f1-2)"
#echo $typeAsub
#echo $typeBsub
#echo $typeCsub1
#echo -e "\n"
#echo $typeAprefix
#echo $typeBprefix
#echo $typeCprefix1
#echo -e "\n"
echo "Getting typeA dataset for $typeA"
#call api script to pull data ; echo out for test
echo "API-gather -option -b "$typeAsub" -g all > "$container"typeA-dataset"
sleep 1
echo "Getting typeB dataset for $typeB"
#call api script to pull data ; echo out for test
echo "API-gather -option -b "$typeBsub" -g all > "$container"typeB-dataset"
sleep 1
echo "Getting typeC dataset for $typeC1"
#call api script to pull data ; echo out for test
echo "API-gather -option -b "$typeCsub1" -g all > "$container"typeC-dataset"
sleep 1
echo "Getting additional typeC datasets for $typeC2-15"
#call api script to pull data ; echo out for test
echo "API-gather -option -b "$typeCsub2-15" -g all >> "$container"typeC-dataset"
sleep 1
echo -e "\n"
done < "$_input"
exit 0
Speed isn't a concern, but if I've done anything really stupid up there, feel free to slap me in the right direction. :)
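A minimal sketch of an alternative shape for the read: let the shell split off the three fixed fields and capture the variable-length typeC tail in one go (assumes bash and no quoted fields in the CSV):

```shell
# Let read split off the three fixed fields; everything after them lands in
# "rest", which is then split into an array of 1-15 typeC entries.
while IFS=, read -r container typeA typeB rest; do
    IFS=, read -ra typeC <<<"$rest"
    echo "container=$container typeA=$typeA typeB=$typeB typeC_count=${#typeC[@]}"
done <<'EOF'
PL4,12.1.4.5-78,13.6.4.5-21,17.3.577.9-30,17.3.779.12-34
PL8,12.1.4.5-82,13.6.4.5-25,17.3.577.9-34
EOF
```

This avoids the fixed-size inner loop entirely: `"${typeC[@]}"` can be iterated regardless of how many entries the row carried.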
Jdubyas
(45 rep)
Jul 12, 2017, 05:09 AM
• Last activity: Aug 6, 2025, 12:04 AM
0
votes
1
answers
28
views
Storing the iterations of the Receiver (or node) number and RSSI value into a CSV file
I'm quite new to Linux. Recently, I've been able to create a bash script that allows me to obtain the RSSI of the receivers (or nodes) with a running iteration. How can I store these results in a CSV file with a format such as:
Node RSSI
... [...,...]
... [...,...]
Here is what the output looks like when running the file:
bash test.sh -l 170,171 -k 2
=== Iteration 1 ===
Node RSSI
------ ----------
170 -43 dBm
171 -43 dBm
=== Iteration 2 ===
Node RSSI
------ ----------
170 -43 dBm
171 -44 dBm
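A sketch of one way to get there: keep only lines whose first field is all digits (this drops the === separators and the header rules) and join node and RSSI with a comma. Sample output is inlined via a heredoc; in practice you'd pipe `bash test.sh -l 170,171 -k 2` into the awk instead and redirect to a file.

```shell
# Filter the iteration output into CSV: a header line, then "node,rssi"
# for every line whose first field is all digits.
awk 'BEGIN { print "Node,RSSI" }
     $1 ~ /^[0-9]+$/ { print $1 "," $2 " " $3 }' <<'EOF'
=== Iteration 1 ===
Node   RSSI
------ ----------
170    -43 dBm
171    -43 dBm
EOF
```

If you want one CSV per iteration, you could additionally match the `=== Iteration N ===` lines and switch the output file there.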
Trinh Dat
(1 rep)
Jul 25, 2025, 08:57 PM
• Last activity: Jul 25, 2025, 09:33 PM
1
votes
3
answers
95
views
try to split csv with multiple headers
I have a csv from a solar inverter and I need to input the data into a SQL database.
The problem I have is that two inverters are in the same csv, so I need to split the csv into two or "do something" with the data from the first, then the second.
This is a sample:
#SmartLogger ESN:102469042181
#INV1 ESN:ES22B0048634
#Time;Upv1;Upv2;Upv3;Upv4;Upv5;Upv6;Upv7;Upv8;Ipv1;Ipv2;Ipv3;Ipv4;Ipv5;Ipv6;Ipv7;Ipv8;Uac1;Uac2;Uac3;Iac1;Iac2;Iac3;Status;Error;Temp;cos;fac;Pac;Qac;Eac;E-Day;E-Total;Cycle Time
08-12-2024 15:30:00;504.3;504.3;502.8;502.8;620.3;620.3;493.0;493.0;0.11;-0.04;0.13;-0.05;0.06;0.00;0.09;0.00;228.7;229.7;228.0;0.640;0.607;0.637;512;0;19.4;0.975;50.00;0.030;0.007;0.01;6.42;162.22;5;
08-12-2024 15:25:00;506.5;506.5;500.2;500.2;631.5;631.5;460.9;460.9;0.10;-0.04;0.12;-0.06;0.04;0.00;0.09;0.00;228.7;229.7;228.0;0.552;0.541;0.563;512;0;19.6;0.994;49.99;0.026;0.003;0.00;6.41;162.21;5;
#INV2 ESN:ES22B0048591
#Time;Upv1;Upv2;Upv3;Upv4;Upv5;Upv6;Upv7;Upv8;Ipv1;Ipv2;Ipv3;Ipv4;Ipv5;Ipv6;Ipv7;Ipv8;Uac1;Uac2;Uac3;Iac1;Iac2;Iac3;Status;Error;Temp;cos;fac;Pac;Qac;Eac;E-Day;E-Total;Cycle Time
08-12-2024 15:30:00;480.3;480.3;492.7;492.7;377.1;377.1;386.9;386.9;-0.07;0.13;0.02;0.05;-0.01;0.07;0.02;0.00;229.6;231.3;231.7;0.510;0.469;0.523;512;0;19.5;0.999;50.00;0.045;-0.002;0.01;6.65;164.65;5;
08-12-2024 15:25:00;478.8;478.8;484.7;484.7;385.1;385.1;410.9;410.9;-0.07;0.12;0.02;0.04;-0.02;0.06;0.00;0.00;229.6;232.3;231.7;0.486;0.451;0.522;512;0;19.6;0.993;49.99;0.036;0.004;0.00;6.64;164.64;5;
so I need to do something with these lines:
08-12-2024 15:30:00;504.3;504.3;502.8;502.8;620.3;620.3;493.0;493.0;0.11;-0.04;0.13;-0.05;0.06;0.00;0.09;0.00;228.7;229.7;228.0;0.640;0.607;0.637;512;0;19.4;0.975;50.00;0.030;0.007;0.01;6.42;162.22;5;
08-12-2024 15:25:00;506.5;506.5;500.2;500.2;631.5;631.5;460.9;460.9;0.10;-0.04;0.12;-0.06;0.04;0.00;0.09;0.00;228.7;229.7;228.0;0.552;0.541;0.563;512;0;19.6;0.994;49.99;0.026;0.003;0.00;6.41;162.21;5;
and then I need to do something with those lines:
08-12-2024 15:30:00;480.3;480.3;492.7;492.7;377.1;377.1;386.9;386.9;-0.07;0.13;0.02;0.05;-0.01;0.07;0.02;0.00;229.6;231.3;231.7;0.510;0.469;0.523;512;0;19.5;0.999;50.00;0.045;-0.002;0.01;6.65;164.65;5;
08-12-2024 15:25:00;478.8;478.8;484.7;484.7;385.1;385.1;410.9;410.9;-0.07;0.12;0.02;0.04;-0.02;0.06;0.00;0.00;229.6;232.3;231.7;0.486;0.451;0.522;512;0;19.6;0.993;49.99;0.036;0.004;0.00;6.64;164.64;5;
Any idea how to distinguish between the two headers?
Header one:
#INV1 ESN:ES22B0048634
Header two: #INV2 ESN:ES22B0048591
those should be ignored:
#Time;Upv1;Upv2;Upv3;Upv4;Upv5;Upv6;Upv7;Upv8;Ipv1;Ipv2;Ipv3;Ipv4;Ipv5;Ipv6;Ipv7;Ipv8;Uac1;Uac2;Uac3;Iac1;Iac2;Iac3;Status;Error;Temp;cos;fac;Pac;Qac;Eac;E-Day;E-Total;Cycle Time
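A sketch of one way with awk: use the `#INVn` marker lines to switch the output file, skip all other `#` lines, and append data rows to the current file (`combined.csv`, `inv1.csv`, `inv2.csv` are made-up names here):

```shell
# Split the combined file on the "#INVn" marker lines: data rows are
# appended to invN.csv, all other "#" lines (SmartLogger, column headers)
# are ignored.
awk '/^#INV/ { n = substr($1, 5); out = "inv" n ".csv"; next }
     /^#/    { next }
     out     { print > out }' combined.csv
```

The same pattern scales to any number of inverters, since the file name is derived from the marker itself.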
humnab
(21 rep)
Jul 22, 2025, 08:48 PM
• Last activity: Jul 24, 2025, 09:10 PM
1
votes
1
answers
2662
views
How can I use a variable in awk command
With my code I am trying to sum up the values with the specific name of a column in a csv file, depending on the input of the name.
Here's my code:
#!/bin/bash
updatedata() {
index=0
while IFS="" read -r line
do
IFS=';' read -ra array <<< "$line"
for arrpos in "${array[@]}"
do
if [ "$arrpos" == *"$1"* ] || [ "$1" == "$arrpos" ]
then
break
else
let index=index+1
fi
done
break
done < data.csv
((index=$index+1))
if [ $pos -eq 0 ]
then
v0=$(awk -F";", -v index=$index '{x+=$index}END{print x}' ./data.csv )
elif [ $pos -eq 1 ]
then
v1=$(awk -F";" '{x+=$index}END{print x}' ./data.csv )
elif [ $pos -eq 2 ]
then
v2=$(awk -F";" '{x+=$index}END{print x}' ./data.csv )
elif [ $pos -eq 3 ]
then
v3=$(awk -F";" '{x+=$index}END{print x}' ./data.csv )
fi
}
In the middle of the code (the v0= line) you can see I was trying to experiment a little, but I just keep getting errors:
First I tried this:
v0=$(awk -F";" '{x+=$index}END{print x}' ./data.csv)
but it gave me this error:
'awk: line 1: syntax error at or near }'
so then I decided to try this (as you can see in the code):
v0=$(awk -F";", -v index=$index '{x+=$index}END{print x}' ./data.csv )
And I got this error:
'awk: run time error: cannot command line assign to index
type clash or keyword
FILENAME="" FNR=0 NR=0'
I don't know what to do. Can you guys help me?
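Two separate problems seem to be at play here: shell variables are not visible inside a single-quoted awk program, and `index` is a reserved awk built-in function, hence the "cannot command line assign to index" error. Passing the column position with `-v` under a different name works; a sketch:

```shell
# Sum column $col of a ;-separated file; "c" avoids the reserved name "index"
col=3
awk -F';' -v c="$col" '{ x += $c } END { print x }' data.csv
```

Inside the program, `$c` then means "the field whose number is stored in c", which is exactly the dynamic column selection the loop above computes.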
anonymous
(11 rep)
Aug 28, 2020, 09:26 AM
• Last activity: Jun 25, 2025, 10:52 AM
2
votes
7
answers
5384
views
Removing new line character from a column in a CSV file
We are getting a newline character in one of the columns of a CSV file. The data for that column comes in consecutive rows.
Eg:
ID,CODE,MESSAGE,DATE,TYPE,OPER,CO_ID
12202,INT_SYS_OCS_EX_INT-0000,"""OCSSystemException: HTTP transport error: java.net.ConnectException: Tried all: '1' addresses, but could not connect over HTTP to server: '10.244.166.9', port: '8080'
failed reasons:
address:'/10.244.166.9',port:'8080' : java.net.ConctException: Connection refused
""",06-09-2021 05:52:32,error,BillCycle,6eb8642aa4b
20840,,,06-09-2021 16:17:18,response,changeLimit,1010f9ea05ff
The issue is for column Message and id 12202, in which data is coming in triple quotes and in consecutive rows.
My requirement is that for the column Message, the data should come in a single row rather than multiple rows, because my ETL loader fails to import an embedded newline.
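One possible approach, a sketch assuming quotes are balanced within each record: accumulate physical lines until the accumulated record contains an even number of double quotes, then print it as one row. The `gsub` here replaces each quote with itself purely to count them.

```shell
# Join continuation lines: a record is complete when its quote count is even.
awk '{
    buf = (buf == "" ? $0 : buf " " $0)
    if (gsub(/"/, "\"", buf) % 2 == 0) { print buf; buf = "" }
}' file.csv
```

Records without embedded newlines pass through untouched (their quote count is already even on the first line).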
mansi bajaj
(29 rep)
Sep 22, 2021, 11:47 AM
• Last activity: Jun 24, 2025, 01:25 PM
5
votes
1
answers
3191
views
sqlite3 command line - how to set mode and import in one step
I need to be able to do this via command line in one step:
lab-1:/etc/scripts# sqlite3 test.db
SQLite version 3.8.10.2 2015-05-20 18:17:19
Enter ".help" for usage hints.
sqlite> .mode csv ;
sqlite> .import /tmp/test.csv users
sqlite> select * from users;
John,Doe,au,0,"",1,5555,91647fs59222,audio
sqlite> .quit
I've tried the following:
lab-1:/etc/scripts# sqlite3 test.db ".mode csv ; .import /tmp/deleteme.csv users"
and
lab-1:/etc/scripts# sqlite3 test.db ".mode csv .import /tmp/deleteme.csv users"
I don't get errors but I also don't end up with any data in the users table.
Any tips would be appreciated.
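A portable route is to feed the dot-commands on standard input, one per line (the CLI treats each line as a separate command); recent sqlite3 builds also accept them as separate command-line arguments. A sketch:

```shell
# Feed the dot-commands on stdin, one per line
printf '.mode csv\n.import /tmp/test.csv users\n' | sqlite3 test.db

# Recent sqlite3 builds also accept multiple command arguments:
sqlite3 test.db ".mode csv" ".import /tmp/test.csv users"
```

The quoted single-argument form from the question fails silently because the whole string is handed to the shell tool as one command, not two.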
dot
(755 rep)
May 23, 2018, 06:46 PM
• Last activity: Jun 12, 2025, 08:19 PM
2
votes
3
answers
2013
views
Convert SQLite CSV output to JSON
I want to format SQLite output in JSON format from the command line. Currently, I have CSV output that looks like this:
label1,value1
label2,value2
label3,value3
...
Now I'd like to have it formatted like this:
{'label1' : 'value1', 'label2': 'value2', ... }
Thanks!
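Worth noting the target shown above isn't strictly valid JSON (JSON requires double quotes). A sketch that folds two-column CSV into a proper object with awk (no escaping of quotes inside values):

```shell
# Fold label,value pairs into one JSON object
awk -F, 'BEGIN { printf "{" }
         { printf "%s\"%s\": \"%s\"", (NR > 1 ? ", " : ""), $1, $2 }
         END { print "}" }' <<'EOF'
label1,value1
label2,value2
EOF
```

For anything with embedded commas or quotes, a CSV-aware tool would be safer than splitting on `-F,`.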
michelemarcon
(3593 rep)
Jan 13, 2016, 03:45 PM
• Last activity: Jun 12, 2025, 08:10 PM
92
votes
30
answers
69531
views
Is there a robust command line tool for processing CSV files?
I work with CSV files and sometimes need to quickly check the contents of a row or column from the command line. In many cases cut, head, tail, and friends will do the job; however, cut cannot easily deal with situations such as
"this, is the first entry", this is the second, 34.5
Here, the first comma is part of the first field, but cut -d, -f1 disagrees. Before I write a solution myself, I was wondering if anyone knew of a good tool that already exists for this job. It would have to, at the very least, be able to handle the example above and return a column from a CSV formatted file. Other desirable features include the ability to select columns based on the column names given in the first row, support for other quoting styles and support for tab-separated files.
If you don't know of such a tool but have suggestions regarding implementing such a program in Bash, Perl, or Python, or other common scripting languages, I wouldn't mind such suggestions.
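As a stop-gap until a dedicated tool turns up, Python's csv module (usually preinstalled) already parses the quoting correctly; a sketch printing the first column:

```shell
# Print column 1 of properly quoted CSV using Python's csv module
python3 -c '
import csv, sys
for row in csv.reader(sys.stdin):
    print(row[0])
' <<'EOF'
"this, is the first entry", this is the second, 34.5
EOF
```

Selecting by header name is a one-line extension (read the first row, look up the index with `.index(...)`).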
Steven D
(47418 rep)
Feb 15, 2011, 07:55 AM
• Last activity: May 11, 2025, 10:46 PM
5
votes
2
answers
1954
views
Filter one very large CSV based on values from another CSV
I am processing some CSV files that do not fit in RAM.
The 2 CSV files have the following structure:
**first.csv**
| id | name | timestamp |
|--------|------|---------------------|
| serial | str | yyyy-mm-dd hh:mm:ss |
**second.csv**
| id | name | date |
|--------|------|------------|
| serial | str | yyyy-mm-dd |
The goal is to select rows from first.csv that match some criteria compared to second.csv:
- name is equal
- timestamp is in the range of [date - 1, date + 1].
After iterating all these rows the output can be combined into one output file.
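A sketch under the assumption that second.csv (presumably the smaller file) fits in memory: index it by name, then stream first.csv row by row and keep matches within one day. File layouts are as described above; Python's csv and datetime modules do the parsing.

```shell
python3 -c '
import csv, sys
import datetime as dt

# Index second.csv by name (assumes it fits in memory)
second = {}
with open("second.csv") as f:
    r = csv.reader(f)
    next(r)                               # header
    for _id, name, day in r:
        second[name] = dt.date.fromisoformat(day)

out = csv.writer(sys.stdout)
with open("first.csv") as f:
    r = csv.reader(f)
    out.writerow(next(r))                 # copy header
    for row in r:                         # stream: one row in memory at a time
        name, ts = row[1], row[2]
        if name in second:
            d = dt.datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").date()
            if abs((d - second[name]).days) <= 1:
                out.writerow(row)
'
```

Because first.csv is streamed, its size is irrelevant; only second.csv needs to fit in RAM.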
conclv_damian
(51 rep)
Nov 17, 2021, 12:00 AM
• Last activity: May 1, 2025, 05:57 PM
4
votes
2
answers
4216
views
How to replace commas with white spaces in a csv, but inserting a different number of space after each column?
I have a file in the following format:
s1,23,789
s2,25,689
and I would like to transform it into a file in the following format:
s1 23 789
s2 25 689
i.e. 6 spaces between the 1st and 2nd columns and only 3 spaces between the 2nd and 3rd column?
Is there a quick way of pulling this off using sed or awk?
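With awk, an explicit printf format string gives you a different gap per column; a sketch:

```shell
# 6 spaces after column 1, 3 spaces after column 2
awk -F, '{ printf "%s      %s   %s\n", $1, $2, $3 }' <<'EOF'
s1,23,789
s2,25,689
EOF
```

If you'd rather align than pad, `%-8s%-5s%s` style width specifiers do column alignment instead of fixed gaps.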
Alex Kinman
(575 rep)
Jan 17, 2017, 01:22 AM
• Last activity: Apr 23, 2025, 03:28 PM
0
votes
5
answers
1056
views
command-line tool to sum the values in a column of a CSV file
I am looking for a command-line tool to calculate the sum of the values in a specified column of a CSV file. (**Update**: The CSV file might have quoted fields, so a simple solution that just breaks on a delimiter (',') does not work.)
Given the following sample CSV file:
description A,description B,data 1, data 2
fruit,"banana,apple",3,17
veggie,cauliflower,7,18
animal,"fish,meat",9,22
I want to build the sum, for example, over the column data 1 with the result **19**.
I have tried to use csvkit for this but didn't get very far. Are there other command-line tools specialised in this CSV operation?
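If csvkit doesn't cooperate, Python's csv module (usually preinstalled) handles the quoted fields; a sketch that sums the column selected by its header name, with the sample data inlined:

```shell
# Sum the column whose header is "data 1", honoring quoted fields
python3 -c '
import csv, sys
rows = csv.reader(sys.stdin)
header = next(rows)
i = header.index("data 1")
print(sum(int(r[i]) for r in rows))
' <<'EOF'
description A,description B,data 1, data 2
fruit,"banana,apple",3,17
veggie,cauliflower,7,18
animal,"fish,meat",9,22
EOF
```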
halloleo
(649 rep)
Jul 2, 2024, 01:41 AM
• Last activity: Apr 23, 2025, 03:04 PM
2
votes
4
answers
8378
views
How do I remove all \r\n from a file, but preserve \n
I have a CSV with unix line endings, but some of the string values have windows line endings in them:
date,notes\n
2014-01-01,"Blah Blah Blah"\n
2014-01-02,"Two things:\r\n - first thing\r\n - second thing\n
2014-01-03,"Foo"\n
Note that \n and \r just show where the non-printable characters are in the file, it's not how it would look if you opened it in a text editor.
**I want to remove instances of \r\n, but keep the actual line endings, where it's just \n.** The output should look like:
date,notes\n
2014-01-01,"Blah Blah Blah"\n
2014-01-02,"Two things: - first thing - second thing\n
2014-01-03,"Foo"\n
I need something like
tr -d '\r\n' file.csv
but where it deletes the string \r\n, rather than either \r or \n.
If I try to process it with sed, it's treated like so when processing line-by-line, so it doesn't really work:
date,notes
2014-01-01,"Blah Blah Blah"
2014-01-02,"Two things:\r
- first thing\r
- second thing
2014-01-03,"Foo"
Dean
(551 rep)
Mar 22, 2016, 06:36 PM
• Last activity: Apr 23, 2025, 12:48 PM
2
votes
2
answers
153
views
If there are more than x number of pipes in a csv line, then delete the 2nd instance
I have a csv file which is supposed to contain 4 columns of data: a product number, a title, a url and a price. Each column is separated by a | delimiter (this has to be maintained; there are other reasons why I can't switch to an alternative delimiter which I won't go into here). As can be seen in the bottom entry (which is the problem entry in this example), the title contains a pipe, which breaks the pattern and could potentially cause issues if the data needs to be imported into a database.
5456435121|The making of the blue album|https://www.example1.co.uk|55
1321354567|Wow this example has no imagination|https://www.cherrypickers.co.uk|89
5456456456|King of the Barbarians | Last Man Standing|https://www.babarians.co.uk|79
What I would like to know is: how can I run a command which analyses the file and, for every line where there are more than 3 pipes (i.e. every line where the title contains a pipe), deletes the 2nd pipe in that line? This would effectively allow me to remove the pipe(s) in the title if one or more are present. I don't know how to achieve it.
I would like the file to look like this once processed:
5456435121|The making of the blue album|https://www.example1.co.uk|55
1321354567|Wow this example has no imagination|https://www.cherrypickers.co.uk|89
5456456456|King of the Barbarians Last Man Standing|https://www.babarians.co.uk|79
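A sketch of one way with awk: for any line with more than four |-separated fields, glue fields 2 through NF-2 back into the title and squeeze the leftover spacing, leaving the product number and the last two fields in place:

```shell
# Rows with extra pipes: merge fields 2..NF-2 into the title, squeeze spaces
awk -F'|' -v OFS='|' '
NF > 4 {
    title = $2
    for (i = 3; i <= NF - 2; i++) title = title " " $i
    gsub(/  +/, " ", title); sub(/ $/, "", title)
    print $1, title, $(NF-1), $NF
    next
}
{ print }' file.csv
```

This handles any number of stray pipes in the title, not just one, because everything between the first and the last two fields is treated as title material.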
neilH
(433 rep)
May 5, 2016, 11:15 AM
• Last activity: Apr 23, 2025, 12:07 PM
0
votes
3
answers
989
views
insert comma after the last letter
I have a txt file, and I need to convert it into a CSV file:
Saint Petersburg 0 10 0.1 - N
Moscow - 9 0 - N
Novgorod 0 7 1 30 Y
In bash, how can I insert a comma after the last letter, and after every number or "-"? For example:
Saint Petersburg, 0, 10, 0.1, -, N
Moscow, -, 9, 0, -, N
Novgorod, 0, 7, 1, 30, Y
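Because the city name can itself contain spaces, it's easier to count from the end: the last five whitespace-separated fields are always data, and everything before them is the name. A sketch with awk:

```shell
# The last 5 fields are data; whatever precedes them is the (possibly
# multi-word) city name.
awk '{
    out = $1
    for (i = 2; i <= NF - 5; i++) out = out " " $i   # rest of the name
    for (i = NF - 4; i <= NF; i++) out = out ", " $i # comma before each value
    print out
}' file.txt
```

This sidesteps having to classify tokens as "letter vs number vs dash" at all, as long as the trailing field count is fixed.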
Best
Enric Agud Pique
(123 rep)
Nov 10, 2014, 05:50 PM
• Last activity: Apr 23, 2025, 04:44 AM
-3
votes
2
answers
130
views
Add a character to duplicate emails using bash only
Input data:
id,location_id,name,title,name@zzz.com,department
1,1,Susan houston,Director of Services,shouston@zzz.com,
2,1,Christina Gonzalez,Director,cgonzalez@zzz.com,
3,2,Brenda brown,"Director, Second Career Services",bbrown@zzz.com,
4,3,Howard Lader,"Manager, Senior Counseling",hlader@zzz.com,
8,6,Bart charlow,Executive Director,bcharlow@zzz.com,
9,7,Bart Charlow,Executive Director,bcharlow@zzz.com,
I need to add a character to duplicate emails after the email part, i.e. bcharlow@zzz.com would become bcharlow7@zzz.com (the digit after the email part needs to be taken from the second column). How can I do that in Bash for all entries?
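A sketch with awk rather than pure bash. Because the quoted titles contain commas, splitting on every comma shifts the field numbers; counting from the end sidesteps that, since each row ends with a trailing comma (empty department field) and the email is always next-to-last:

```shell
# Email is the next-to-last ,-field even when the quoted title contains a
# comma; on repeat sightings, splice location_id ($2) in before the "@".
awk -F, -v OFS=, '{
    e = NF - 1
    if (seen[$e]++) { split($e, p, "@"); $e = p[1] $2 "@" p[2] }
    print
}' input.csv
```

`seen[$e]++` is zero (false) the first time an email appears, so only the later duplicates are rewritten.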
NetRanger
(7 rep)
Mar 29, 2025, 12:48 PM
• Last activity: Apr 4, 2025, 06:14 PM
-1
votes
6
answers
145
views
How to count blank fields from a delimited file in Unix
From the file below:
EmpID:Name:Designation:UnitName:Location:DateofJoining:Salary
1001:Thomson:SE:IVS:Mumbai:10-Feb-1999:60000
1002:Johnson:TE::Bangalore:18-Jun-2000:50000
1003:Jackson:DM:IMS:Hyderabad:23-Apr-1985:90000
1004:BobGL::ETA:Mumbai:05-Jan-2004:55000
1005:Alice:PA:::26-Aug-2014:25000
1006:LilySE::IVS:Bangalore:17-Dec-2015:40000
1007:Kirsten:PM:IMS:Mumbai:26-Aug-2014:45000
1004:BobGL::ETA:Mumbai:05-Jan-2021:55000
I would like to get the count of blank fields (which appear as '::'). Thank you, any support is appreciated.
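A sketch with awk: walk every :-delimited field and count the empty ones across the whole file.

```shell
# Count empty :-delimited fields
awk -F: '{ for (i = 1; i <= NF; i++) if ($i == "") n++ } END { print n }' file
```

A per-line count is the same idea with the print moved into the main block (`print NR, n; n = 0`).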
Ismael Sanchez
(183 rep)
Mar 3, 2025, 02:21 AM
• Last activity: Mar 30, 2025, 06:48 PM
0
votes
2
answers
239
views
CSV processing: moving column/row value to different row where column value matches
I have a csv file with around 50 columns, can be anywhere between 20 and 100 rows.
The individual records have IDs, and some records can be in a group of 2. Essentially what I need to do is add an ID to the same row that another ID in that group is in. Example:
ID ,group,blank column
2019-1 , ,
2019-2 ,GRP1 ,
2019-3 ,GRP2 ,
2019-4 ,GRP1 ,
2019-5 , ,
2019-6 ,GRP2 ,
And the output I would like is:
ID ,group,blank column
2019-1 , ,
2019-2 ,GRP1 ,2019-4
2019-3 ,GRP2 ,2019-6
2019-5 , ,
In my attempts using awk, I haven't had any luck. I either end up leaving out rows that have no group, or I end up repeating values.
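A two-pass awk sketch, assuming each group has exactly two members and the spacing is as shown: the first pass records the first and second ID seen per group; the second pass prints ungrouped rows unchanged, fills column 3 on each group's first member, and drops the second member.

```shell
# Pass 1 (NR==FNR): remember first/second ID per group.
# Pass 2: header and group-less rows pass through; first members get their
# partner ID in column 3; second members are skipped.
awk -F, -v OFS=, '
NR == FNR {
    if ($2 !~ /^ *$/) { if ($2 in first) second[$2] = $1; else first[$2] = $1 }
    next
}
FNR == 1 || $2 ~ /^ *$/ { print; next }
first[$2] == $1 { $3 = second[$2]; print }
' file.csv file.csv
```

Note the file is named twice, once per pass; the padding spaces in the sample are carried along as-is.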
Cody Thomas
(9 rep)
Feb 11, 2019, 06:52 PM
• Last activity: Mar 10, 2025, 10:53 PM
1
votes
5
answers
494
views
Sorting elements of a CSV file
I have a csv file with seven numbers per line like this:
1083,20,28,42,23,10,43
1084,20,5,29,59,40,33
1085,39,50,21,12,40,55
1086,45,4,6,23,10,2
1087,36,46,28,32,3,20
I want to keep the first number in place (column 1) and sort columns 2 to 7, making the file like
1083,10,20,23,28,42,43
1084,5,20,29,33,40,59
1085,12,21,39,40,50,55
1086,2,4,6,10,23,45
1087,3,20,28,32,36,46
How can I do that with awk, sed or whatever?
Thanks
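sort works on lines, not on fields within a line, so one simple route is a small bash loop that re-sorts the tail of each line; a sketch:

```shell
# Sort fields 2..7 numerically within each line, keeping field 1 in place
while IFS=, read -r first rest; do
    sorted=$(printf '%s' "$rest" | tr ',' '\n' | sort -n | paste -sd, -)
    printf '%s,%s\n' "$first" "$sorted"
done < file.csv
```

For large files, a single gawk program using its `asort()` function would avoid spawning a pipeline per line, at the cost of requiring GNU awk.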
Duck
(4794 rep)
Feb 26, 2019, 10:03 AM
• Last activity: Mar 10, 2025, 10:35 PM
1
votes
3
answers
226
views
How could I (painlessly) split or reverse "Last, First" within a record in Miller?
I have a tab-delimited file where one of the columns is in the format "LastName, FirstName". What I want to do is split that record out into two separate columns, last and first, use cut or some other verb(s) on _that_, and output the result to JSON.
I should add that I'm not married to JSON, and I know how to use other tools like [jq](https://github.com/stedolan/jq), but it would be nice to get it in that format in one step.
The syntax for the nest verb looks like it requires memorizing a lot of frankly non-memorable options, so I figured that there would be a simple DSL operation to do this job. Maybe that's not the case?
Here's what I've tried. (Let's just forget about the extra space that's attached to Firstname right now, OK? I would use strip or ssub or something to get rid of that later.)
echo -e "last_first\nLastName, Firstname" \
| mlr --t2j put '$o=splitnv($last_first,",")'
# result:
# { "last_first": "LastName, Firstname", "o": "(error)" }
# expected something like:
# { "last_first": "LastName, Firstname", "o": { 1: "LastName", 2: "Firstname" } }
#
# or:
# { "last_first": "LastName, Firstname", "o": [ "LastName", "Firstname" ] }
Why (error)? Is it not reasonable that assigning to $o as above would assign a new column o to the result of splitnv?
Here's something else I tried that didn't work like I would've expected either:
echo -e "last_first\nLastName, Firstname" \
| mlr -T nest --explode --values --across-fields --nested-fs , -f last_first
# result (no delimiter here, just one field, confirmed w/ 'cat -A')
# last_first
# LastName, Firstname
# expected:
# last_first_1last_first_2
# LastName, Firstname
**Edit**: The problem with the command above is I should've used --tsv, **not** -T, which is a synonym for --nidx --fs tab (numerically-indexed columns). Problem is, Miller doesn't produce an error message when it's obviously wrong to ask for named columns in that case, which might be a mis-feature; see [issue #233](https://github.com/johnkerl/miller/issues/233).
Any insight would be appreciated.
Kevin E
(540 rep)
Mar 7, 2019, 10:52 AM
• Last activity: Mar 10, 2025, 10:05 PM
1
votes
1
answers
1048
views
Sorting CSV data by third column (Revenues) in descending order
**Problem:**
I have a CSV file with five columns (the 1st column is a string and the other four are integers). I want to sort based on the third column, Revenues, with the largest on the top and smallest on the bottom. I want to write the result to a new CSV file.
It seems that I would need to use something like
awk -F '","' 'BEGIN {OFS=","} { if (Revenues($5) > ?? print }' Valuation.csv > Ranking.csv
**Data:**
Company,Nbr employees, Revenues , Revenues per employee , Valuation
Facebook,"35,587","55,800,000,000","1,567,988","491,000,000,000"
Uber,"16,000","11,300,000,000","706,250","120,000,000,000"
Snapchat,"3,069","1,180,000,000","384,490","7,200,000,000"
Airbnb,"3,100","2,600,000,000","838,710","38,000,000,000"
LinkedIn,"13,000","26,200,000,000","2,015,385","26,200,000,000"
Coursora,280,"140,000,000","500,000","815,000,000"
Google,"98,771","39,120,000,000","396,068","720,000,000,000"
Stripe,"1,500","450,000,000","300,000","22,500,000,000"
Epic Games,700,"3,000,000,000","4,285,714","15,000,000,000"
Grab,"3,000","2,750,000,000","916,667","10,000,000,000"
Pinterest,800,"1,000,000,000","1,250,000","12,000,000,000"
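Since the numeric columns are quoted and contain comma grouping, neither `sort -t,` nor a naive awk split can parse them; a CSV-aware sort, e.g. with Python's csv module, avoids that. A sketch, using the Valuation.csv/Ranking.csv names from the question:

```shell
# Sort by the Revenues column (index 2), largest first; the csv module
# handles the quoted, comma-grouped numbers.
python3 -c '
import csv, sys
rows = list(csv.reader(sys.stdin))
header, body = rows[0], rows[1:]
body.sort(key=lambda r: int(r[2].replace(",", "")), reverse=True)
csv.writer(sys.stdout).writerows([header] + body)
' < Valuation.csv > Ranking.csv
```

The quoting is re-applied on output by csv.writer, so fields containing commas stay intact in Ranking.csv.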
James
(11 rep)
Mar 15, 2019, 01:05 PM
• Last activity: Mar 10, 2025, 09:15 PM