Unix & Linux Stack Exchange
Q&A for users of Linux, FreeBSD and other Unix-like operating systems
Latest Questions
0
votes
4
answers
1869
views
create variables from CSV with varying number of fields
Looking for some help turning a CSV into variables. I tried using IFS, but it seems you need to define the number of fields. I need something that can handle a varying number of fields.
*I am modifying my original question with the current code I'm using (taken from the answer provided by hschou), which includes updated variable names using type instead of row, section, etc.
I'm sure you can tell by my code that I am pretty green with scripting, so I am looking for help to determine if and how I should add another loop, or take a different approach to parsing the typeC data. Although all three types follow the same format, there is only one entry each for the typeA and typeB data, while there can be between 1 and 15 entries for the typeC data. The goal is only 3 files, one for each of the data types.
Data format:
Container: PL[1-100]
TypeA: [1-20].[1-100].[1-1000].[1-100]-[1-100]
TypeB: [1-20].[1-100].[1-1000].[1-100]-[1-100]
TypeC (1 to 15 entries): [1-20].[1-100].[1-1000].[1-100]-[1-100]
*There is no header in the CSV, but if there were it would look like this (Container, typeA, and typeB data always being in positions 1, 2, 3, and typeC data being all that follow): Container,typeA,typeB,typeC,typeC,typeC,typeC,typeC,..
CSV:
PL3,12.1.4.5-77,13.6.4.5-20,17.3.577.9-29,17.3.779.12-33,17.3.802.12-60,17.3.917.12-45,17.3.956.12-63,17.3.993.12-42
PL4,12.1.4.5-78,13.6.4.5-21,17.3.577.9-30,17.3.779.12-34
PL5,12.1.4.5-79,13.6.4.5-22,17.3.577.9-31,17.3.779.12-35,17.3.802.12-62,17.3.917.12-47
PL6,12.1.4.5-80,13.6.4.5-23,17.3.577.9-32,17.3.779.12-36,17.3.802.12-63,17.3.917.12-48,17.3.956.12-66
PL7,12.1.4.5-81,13.6.4.5-24,17.3.577.9-33,17.3.779.12-37,17.3.802.12-64,17.3.917.12-49,17.3.956.12-67,17.3.993.12-46
PL8,12.1.4.5-82,13.6.4.5-25,17.3.577.9-34
Code:
#!/bin/bash
#Set input file
_input="input.csv"
# Pull variables in from csv
# read file using while loop
while read; do
declare -a COL=( ${REPLY//,/ } )
echo -e "container=${COL[0]}\ntypeA=${COL[1]}\ntypeB=${COL[2]}" >/tmp/typelist.txt
idx=1
while [ $idx -lt 16 ]; do
echo "typeC$idx=${COL[$((idx+2))]}" >>/tmp/typelist.txt
let idx=idx+1
done
#whack off empty variables
sed '/=$/d' /tmp/typelist.txt > /tmp/typelist2.txt && mv /tmp/typelist2.txt /tmp/typelist.txt
#set variables from temp file
. /tmp/typelist.txt
sleep 1
#Parse data in this loop.#
echo -e "\n"
echo "Begin Processing for $container"
#echo $typeA
#echo $typeB
#echo $typeC
#echo -e "\n"
#Strip - from sub data for extra parsing
typeAsub="$(echo "$typeA" | sed 's/\-.*$//')"
typeBsub="$(echo "$typeB" | sed 's/\-.*$//')"
typeCsub1="$(echo "$typeC1" | sed 's/\-.*$//')"
#strip out first two decimils for extra parsing
typeAprefix="$(echo "$typeA" | cut -d "." -f1-2)"
typeBprefix="$(echo "$typeB" | cut -d "." -f1-2)"
typeCprefix1="$(echo "$typeC1" | cut -d "." -f1-2)"
#echo $typeAsub
#echo $typeBsub
#echo $typeCsub1
#echo -e "\n"
#echo $typeAprefix
#echo $typeBprefix
#echo $typeCprefix1
#echo -e "\n"
echo "Getting typeA dataset for $typeA"
#call api script to pull data ; echo out for test
echo "API-gather -option -b "$typeAsub" -g all > "$container"typeA-dataset"
sleep 1
echo "Getting typeB dataset for $typeB"
#call api script to pull data ; echo out for test
echo "API-gather -option -b "$typeBsub" -g all > "$container"typeB-dataset"
sleep 1
echo "Getting typeC dataset for $typeC1"
#call api script to pull data ; echo out for test
echo "API-gather -option -b "$typeCsub1" -g all > "$container"typeC-dataset"
sleep 1
echo "Getting additional typeC datasets for $typeC2-15"
#call api script to pull data ; echo out for test
echo "API-gather -option -b "$typeCsub2-15" -g all >> "$container"typeC-dataset"
sleep 1
echo -e "\n"
done < "$_input"
exit 0
Speed isn't a concern, but if I've done anything really stupid up there, feel free to slap me in the right direction. :)
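A minimal sketch of an alternative shape for the read: let the shell split off the three fixed fields and capture the variable-length typeC tail in one go (assumes bash and no quoted fields in the CSV):

```shell
# Let read split off the three fixed fields; everything after them lands in
# "rest", which is then split into an array of 1-15 typeC entries.
while IFS=, read -r container typeA typeB rest; do
    IFS=, read -ra typeC <<<"$rest"
    echo "container=$container typeA=$typeA typeB=$typeB typeC_count=${#typeC[@]}"
done <<'EOF'
PL4,12.1.4.5-78,13.6.4.5-21,17.3.577.9-30,17.3.779.12-34
PL8,12.1.4.5-82,13.6.4.5-25,17.3.577.9-34
EOF
```

This avoids the fixed-size inner loop entirely: `"${typeC[@]}"` can be iterated regardless of how many entries the row carried.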
Jdubyas
(45 rep)
Jul 12, 2017, 05:09 AM
• Last activity: Aug 6, 2025, 12:04 AM
0
votes
1
answers
28
views
Storing the iterations of the Receiver (or node) number and RSSI value into a CSV file
I'm quite new to Linux. Recently, I've been able to create a bash script that allows me to obtain the RSSI of the receivers (or nodes) with a running iteration. How can I store these results in a CSV file with a format such as:
Node RSSI
... [...,...]
... [...,...]
Here is what the output looks like when running the file:
bash test.sh -l 170,171 -k 2
=== Iteration 1 ===
Node RSSI
------ ----------
170 -43 dBm
171 -43 dBm
=== Iteration 2 ===
Node RSSI
------ ----------
170 -43 dBm
171 -44 dBm
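A sketch of one way to get there: keep only lines whose first field is all digits (this drops the === separators and the header rules) and join node and RSSI with a comma. Sample output is inlined via a heredoc; in practice you'd pipe `bash test.sh -l 170,171 -k 2` into the awk instead and redirect to a file.

```shell
# Filter the iteration output into CSV: a header line, then "node,rssi"
# for every line whose first field is all digits.
awk 'BEGIN { print "Node,RSSI" }
     $1 ~ /^[0-9]+$/ { print $1 "," $2 " " $3 }' <<'EOF'
=== Iteration 1 ===
Node   RSSI
------ ----------
170    -43 dBm
171    -43 dBm
EOF
```

If you want one CSV per iteration, you could additionally match the `=== Iteration N ===` lines and switch the output file there.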
Trinh Dat
(1 rep)
Jul 25, 2025, 08:57 PM
• Last activity: Jul 25, 2025, 09:33 PM
1
votes
3
answers
95
views
try to split csv with multiple headers
I have a csv from a solar inverter and I need to input the data into a SQL database.
The problem I have is that two inverters are in the same csv, so I need to split the csv into two or "do something" with the data from the first, then the second.
This is a sample:
#SmartLogger ESN:102469042181
#INV1 ESN:ES22B0048634
#Time;Upv1;Upv2;Upv3;Upv4;Upv5;Upv6;Upv7;Upv8;Ipv1;Ipv2;Ipv3;Ipv4;Ipv5;Ipv6;Ipv7;Ipv8;Uac1;Uac2;Uac3;Iac1;Iac2;Iac3;Status;Error;Temp;cos;fac;Pac;Qac;Eac;E-Day;E-Total;Cycle Time
08-12-2024 15:30:00;504.3;504.3;502.8;502.8;620.3;620.3;493.0;493.0;0.11;-0.04;0.13;-0.05;0.06;0.00;0.09;0.00;228.7;229.7;228.0;0.640;0.607;0.637;512;0;19.4;0.975;50.00;0.030;0.007;0.01;6.42;162.22;5;
08-12-2024 15:25:00;506.5;506.5;500.2;500.2;631.5;631.5;460.9;460.9;0.10;-0.04;0.12;-0.06;0.04;0.00;0.09;0.00;228.7;229.7;228.0;0.552;0.541;0.563;512;0;19.6;0.994;49.99;0.026;0.003;0.00;6.41;162.21;5;
#INV2 ESN:ES22B0048591
#Time;Upv1;Upv2;Upv3;Upv4;Upv5;Upv6;Upv7;Upv8;Ipv1;Ipv2;Ipv3;Ipv4;Ipv5;Ipv6;Ipv7;Ipv8;Uac1;Uac2;Uac3;Iac1;Iac2;Iac3;Status;Error;Temp;cos;fac;Pac;Qac;Eac;E-Day;E-Total;Cycle Time
08-12-2024 15:30:00;480.3;480.3;492.7;492.7;377.1;377.1;386.9;386.9;-0.07;0.13;0.02;0.05;-0.01;0.07;0.02;0.00;229.6;231.3;231.7;0.510;0.469;0.523;512;0;19.5;0.999;50.00;0.045;-0.002;0.01;6.65;164.65;5;
08-12-2024 15:25:00;478.8;478.8;484.7;484.7;385.1;385.1;410.9;410.9;-0.07;0.12;0.02;0.04;-0.02;0.06;0.00;0.00;229.6;232.3;231.7;0.486;0.451;0.522;512;0;19.6;0.993;49.99;0.036;0.004;0.00;6.64;164.64;5;
so I need to do something with these lines:
08-12-2024 15:30:00;504.3;504.3;502.8;502.8;620.3;620.3;493.0;493.0;0.11;-0.04;0.13;-0.05;0.06;0.00;0.09;0.00;228.7;229.7;228.0;0.640;0.607;0.637;512;0;19.4;0.975;50.00;0.030;0.007;0.01;6.42;162.22;5;
08-12-2024 15:25:00;506.5;506.5;500.2;500.2;631.5;631.5;460.9;460.9;0.10;-0.04;0.12;-0.06;0.04;0.00;0.09;0.00;228.7;229.7;228.0;0.552;0.541;0.563;512;0;19.6;0.994;49.99;0.026;0.003;0.00;6.41;162.21;5;
and then I need to do something with those lines:
08-12-2024 15:30:00;480.3;480.3;492.7;492.7;377.1;377.1;386.9;386.9;-0.07;0.13;0.02;0.05;-0.01;0.07;0.02;0.00;229.6;231.3;231.7;0.510;0.469;0.523;512;0;19.5;0.999;50.00;0.045;-0.002;0.01;6.65;164.65;5;
08-12-2024 15:25:00;478.8;478.8;484.7;484.7;385.1;385.1;410.9;410.9;-0.07;0.12;0.02;0.04;-0.02;0.06;0.00;0.00;229.6;232.3;231.7;0.486;0.451;0.522;512;0;19.6;0.993;49.99;0.036;0.004;0.00;6.64;164.64;5;
Any idea how to distinguish between the two headers?
Header one:
#INV1 ESN:ES22B0048634
Header two: #INV2 ESN:ES22B0048591
those should be ignored:
#Time;Upv1;Upv2;Upv3;Upv4;Upv5;Upv6;Upv7;Upv8;Ipv1;Ipv2;Ipv3;Ipv4;Ipv5;Ipv6;Ipv7;Ipv8;Uac1;Uac2;Uac3;Iac1;Iac2;Iac3;Status;Error;Temp;cos;fac;Pac;Qac;Eac;E-Day;E-Total;Cycle Time
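A sketch of one way with awk: use the `#INVn` marker lines to switch the output file, skip all other `#` lines, and append data rows to the current file (`combined.csv`, `inv1.csv`, `inv2.csv` are made-up names here):

```shell
# Split the combined file on the "#INVn" marker lines: data rows are
# appended to invN.csv, all other "#" lines (SmartLogger, column headers)
# are ignored.
awk '/^#INV/ { n = substr($1, 5); out = "inv" n ".csv"; next }
     /^#/    { next }
     out     { print > out }' combined.csv
```

The same pattern scales to any number of inverters, since the file name is derived from the marker itself.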
humnab
(21 rep)
Jul 22, 2025, 08:48 PM
• Last activity: Jul 24, 2025, 09:10 PM
1
votes
1
answers
2662
views
How can I use a variable in awk command
With my code I am trying to sum up the values with the specific name of a column in a csv file, depending on the input of the name.
Here's my code:
#!/bin/bash
updatedata() {
index=0
while IFS="" read -r line
do
IFS=';' read -ra array <<< "$line"
for arrpos in "${array[@]}"
do
if [ "$arrpos" == *"$1"* ] || [ "$1" == "$arrpos" ]
then
break
else
let index=index+1
fi
done
break
done < data.csv
((index=$index+1))
if [ $pos -eq 0 ]
then
v0=$(awk -F";", -v index=$index '{x+=$index}END{print x}' ./data.csv )
elif [ $pos -eq 1 ]
then
v1=$(awk -F";" '{x+=$index}END{print x}' ./data.csv )
elif [ $pos -eq 2 ]
then
v2=$(awk -F";" '{x+=$index}END{print x}' ./data.csv )
elif [ $pos -eq 3 ]
then
v3=$(awk -F";" '{x+=$index}END{print x}' ./data.csv )
fi
}
In the middle of the code (the v0= line) you can see I was trying to experiment a little, but I just keep getting errors:
First I tried this:
v0=$(awk -F";" '{x+=$index}END{print x}' ./data.csv)
but it gave me this error:
'awk: line 1: syntax error at or near }'
so then I decided to try this (as you can see in the code):
v0=$(awk -F";", -v index=$index '{x+=$index}END{print x}' ./data.csv )
And I got this error:
'awk: run time error: cannot command line assign to index
type clash or keyword
FILENAME="" FNR=0 NR=0'
I don't know what to do. Can you guys help me?
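Two separate problems seem to be at play here: shell variables are not visible inside a single-quoted awk program, and `index` is a reserved awk built-in function, hence the "cannot command line assign to index" error. Passing the column position with `-v` under a different name works; a sketch:

```shell
# Sum column $col of a ;-separated file; "c" avoids the reserved name "index"
col=3
awk -F';' -v c="$col" '{ x += $c } END { print x }' data.csv
```

Inside the program, `$c` then means "the field whose number is stored in c", which is exactly the dynamic column selection the loop above computes.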
anonymous
(11 rep)
Aug 28, 2020, 09:26 AM
• Last activity: Jun 25, 2025, 10:52 AM
2
votes
7
answers
5384
views
Removing new line character from a column in a CSV file
We are getting a newline character in one of the columns of a CSV file. The data for that column comes in consecutive rows.
Eg:
ID,CODE,MESSAGE,DATE,TYPE,OPER,CO_ID
12202,INT_SYS_OCS_EX_INT-0000,"""OCSSystemException: HTTP transport error: java.net.ConnectException: Tried all: '1' addresses, but could not connect over HTTP to server: '10.244.166.9', port: '8080'
failed reasons:
address:'/10.244.166.9',port:'8080' : java.net.ConctException: Connection refused
""",06-09-2021 05:52:32,error,BillCycle,6eb8642aa4b
20840,,,06-09-2021 16:17:18,response,changeLimit,1010f9ea05ff
The issue is for column Message and id 12202, in which data is coming in triple quotes and in consecutive rows.
My requirement is that for the column Message, the data should come in a single row rather than multiple rows, because my ETL loader fails to import an embedded newline.
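One possible approach, a sketch assuming quotes are balanced within each record: accumulate physical lines until the accumulated record contains an even number of double quotes, then print it as one row. The `gsub` here replaces each quote with itself purely to count them.

```shell
# Join continuation lines: a record is complete when its quote count is even.
awk '{
    buf = (buf == "" ? $0 : buf " " $0)
    if (gsub(/"/, "\"", buf) % 2 == 0) { print buf; buf = "" }
}' file.csv
```

Records without embedded newlines pass through untouched (their quote count is already even on the first line).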
mansi bajaj
(29 rep)
Sep 22, 2021, 11:47 AM
• Last activity: Jun 24, 2025, 01:25 PM
5
votes
1
answers
3191
views
sqlite3 command line - how to set mode and import in one step
I need to be able to do this via command line in one step:
lab-1:/etc/scripts# sqlite3 test.db
SQLite version 3.8.10.2 2015-05-20 18:17:19
Enter ".help" for usage hints.
sqlite> .mode csv ;
sqlite> .import /tmp/test.csv users
sqlite> select * from users;
John,Doe,au,0,"",1,5555,91647fs59222,audio
sqlite> .quit
I've tried the following:
lab-1:/etc/scripts# sqlite3 test.db ".mode csv ; .import /tmp/deleteme.csv users"
and
lab-1:/etc/scripts# sqlite3 test.db ".mode csv .import /tmp/deleteme.csv users"
I don't get errors but I also don't end up with any data in the users table.
Any tips would be appreciated.
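A portable route is to feed the dot-commands on standard input, one per line (the CLI treats each line as a separate command); recent sqlite3 builds also accept them as separate command-line arguments. A sketch:

```shell
# Feed the dot-commands on stdin, one per line
printf '.mode csv\n.import /tmp/test.csv users\n' | sqlite3 test.db

# Recent sqlite3 builds also accept multiple command arguments:
sqlite3 test.db ".mode csv" ".import /tmp/test.csv users"
```

The quoted single-argument form from the question fails silently because the whole string is handed to the shell tool as one command, not two.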
dot
(755 rep)
May 23, 2018, 06:46 PM
• Last activity: Jun 12, 2025, 08:19 PM
2
votes
3
answers
2013
views
Convert SQLite CSV output to JSON
I want to format SQLite output in JSON format from the command line. Currently, I have CSV output that looks like this:
label1,value1
label2,value2
label3,value3
...
Now I'd like to have it formatted like this:
{'label1' : 'value1', 'label2': 'value2', ... }
Thanks!
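Worth noting the target shown above isn't strictly valid JSON (JSON requires double quotes). A sketch that folds two-column CSV into a proper object with awk (no escaping of quotes inside values):

```shell
# Fold label,value pairs into one JSON object
awk -F, 'BEGIN { printf "{" }
         { printf "%s\"%s\": \"%s\"", (NR > 1 ? ", " : ""), $1, $2 }
         END { print "}" }' <<'EOF'
label1,value1
label2,value2
EOF
```

For anything with embedded commas or quotes, a CSV-aware tool would be safer than splitting on `-F,`.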
michelemarcon
(3593 rep)
Jan 13, 2016, 03:45 PM
• Last activity: Jun 12, 2025, 08:10 PM
92
votes
30
answers
69531
views
Is there a robust command line tool for processing CSV files?
I work with CSV files and sometimes need to quickly check the contents of a row or column from the command line. In many cases cut, head, tail, and friends will do the job; however, cut cannot easily deal with situations such as
"this, is the first entry", this is the second, 34.5
Here, the first comma is part of the first field, but cut -d, -f1 disagrees. Before I write a solution myself, I was wondering if anyone knew of a good tool that already exists for this job. It would have to, at the very least, be able to handle the example above and return a column from a CSV formatted file. Other desirable features include the ability to select columns based on the column names given in the first row, support for other quoting styles and support for tab-separated files.
If you don't know of such a tool but have suggestions regarding implementing such a program in Bash, Perl, or Python, or other common scripting languages, I wouldn't mind such suggestions.
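As a stop-gap until a dedicated tool turns up, Python's csv module (usually preinstalled) already parses the quoting correctly; a sketch printing the first column:

```shell
# Print column 1 of properly quoted CSV using Python's csv module
python3 -c '
import csv, sys
for row in csv.reader(sys.stdin):
    print(row[0])
' <<'EOF'
"this, is the first entry", this is the second, 34.5
EOF
```

Selecting by header name is a one-line extension (read the first row, look up the index with `.index(...)`).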
Steven D
(47418 rep)
Feb 15, 2011, 07:55 AM
• Last activity: May 11, 2025, 10:46 PM
5
votes
2
answers
1954
views
Filter one very large CSV based on values from another CSV
I am processing some CSV files that do not fit in RAM.
The 2 CSV files have the following structure:
**first.csv**
| id | name | timestamp |
|--------|------|---------------------|
| serial | str | yyyy-mm-dd hh:mm:ss |
**second.csv**
| id | name | date |
|--------|------|------------|
| serial | str | yyyy-mm-dd |
The goal is to select rows from first.csv that match some criteria compared to second.csv:
- name is equal
- timestamp is in the range of [date - 1, date + 1].
After iterating all these rows the output can be combined into one output file.
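A sketch under the assumption that second.csv (presumably the smaller file) fits in memory: index it by name, then stream first.csv row by row and keep matches within one day. File layouts are as described above; Python's csv and datetime modules do the parsing.

```shell
python3 -c '
import csv, sys
import datetime as dt

# Index second.csv by name (assumes it fits in memory)
second = {}
with open("second.csv") as f:
    r = csv.reader(f)
    next(r)                               # header
    for _id, name, day in r:
        second[name] = dt.date.fromisoformat(day)

out = csv.writer(sys.stdout)
with open("first.csv") as f:
    r = csv.reader(f)
    out.writerow(next(r))                 # copy header
    for row in r:                         # stream: one row in memory at a time
        name, ts = row[1], row[2]
        if name in second:
            d = dt.datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").date()
            if abs((d - second[name]).days) <= 1:
                out.writerow(row)
'
```

Because first.csv is streamed, its size is irrelevant; only second.csv needs to fit in RAM.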
conclv_damian
(51 rep)
Nov 17, 2021, 12:00 AM
• Last activity: May 1, 2025, 05:57 PM
4
votes
2
answers
4216
views
How to replace commas with white spaces in a csv, but inserting a different number of space after each column?
I have a file in the following format:
s1,23,789
s2,25,689
and I would like to transform it into a file in the following format:
s1 23 789
s2 25 689
i.e. 6 spaces between the 1st and 2nd columns and only 3 spaces between the 2nd and 3rd column?
Is there a quick way of pulling this off using sed or awk?
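With awk, an explicit printf format string gives you a different gap per column; a sketch:

```shell
# 6 spaces after column 1, 3 spaces after column 2
awk -F, '{ printf "%s      %s   %s\n", $1, $2, $3 }' <<'EOF'
s1,23,789
s2,25,689
EOF
```

If you'd rather align than pad, `%-8s%-5s%s` style width specifiers do column alignment instead of fixed gaps.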
Alex Kinman
(575 rep)
Jan 17, 2017, 01:22 AM
• Last activity: Apr 23, 2025, 03:28 PM
0
votes
5
answers
1056
views
command-line tool to sum the values in a column of a CSV file
I am looking for a command-line tool to calculate the sum of the values in a specified column of a CSV file. (**Update**: The CSV file might have quoted fields, so a simple solution that just breaks on a delimiter (',') does not work.)
Given the following sample CSV file:
description A,description B,data 1, data 2
fruit,"banana,apple",3,17
veggie,cauliflower,7,18
animal,"fish,meat",9,22
I want to build the sum, for example, over the column data 1 with the result **19**.
I have tried to use csvkit for this but didn't get very far. Are there other command-line tools specialised in this CSV operation?
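If csvkit doesn't cooperate, Python's csv module (usually preinstalled) handles the quoted fields; a sketch that sums the column selected by its header name, with the sample data inlined:

```shell
# Sum the column whose header is "data 1", honoring quoted fields
python3 -c '
import csv, sys
rows = csv.reader(sys.stdin)
header = next(rows)
i = header.index("data 1")
print(sum(int(r[i]) for r in rows))
' <<'EOF'
description A,description B,data 1, data 2
fruit,"banana,apple",3,17
veggie,cauliflower,7,18
animal,"fish,meat",9,22
EOF
```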
halloleo
(649 rep)
Jul 2, 2024, 01:41 AM
• Last activity: Apr 23, 2025, 03:04 PM
2
votes
4
answers
8378
views
How do I remove all \r\n from a file, but preserve \n
I have a CSV with unix line endings, but some of the string values have windows line endings in them:
date,notes\n
2014-01-01,"Blah Blah Blah"\n
2014-01-02,"Two things:\r\n - first thing\r\n - second thing\n
2014-01-03,"Foo"\n
Note that \n and \r just show where the non-printable characters are in the file, it's not how it would look if you opened it in a text editor.
**I want to remove instances of \r\n, but keep the actual line endings, where it's just \n.** The output should look like:
date,notes\n
2014-01-01,"Blah Blah Blah"\n
2014-01-02,"Two things: - first thing - second thing\n
2014-01-03,"Foo"\n
I need something like
tr -d '\r\n' file.csv
but where it deletes the string \r\n, rather than either \r or \n.
If I try to process it with sed, it's treated like so when processing line-by-line, so it doesn't really work:
date,notes
2014-01-01,"Blah Blah Blah"
2014-01-02,"Two things:\r
- first thing\r
- second thing
2014-01-03,"Foo"
Dean
(551 rep)
Mar 22, 2016, 06:36 PM
• Last activity: Apr 23, 2025, 12:48 PM
2
votes
2
answers
153
views
If there are more than x number of pipes in a csv line, then delete the 2nd instance
I have a csv file which is supposed to contain 4 columns of data: a product number, a title, a url and a price. Each column is separated by a | delimiter (this has to be maintained; there are other reasons why I can't switch to an alternative delimiter which I won't go into here). As can be seen in the bottom entry (which is the problem entry in this example), the title contains a pipe, which breaks the pattern and could potentially cause issues if the data needs to be imported into a database.
5456435121|The making of the blue album|https://www.example1.co.uk|55
1321354567|Wow this example has no imagination|https://www.cherrypickers.co.uk|89
5456456456|King of the Barbarians | Last Man Standing|https://www.babarians.co.uk|79
What I would like to know is: how can I run a command which analyses the file and, for every line where there are more than 3 pipes (i.e. every line where the title contains a pipe), deletes the 2nd pipe in that line? This would effectively allow me to remove the pipe(s) in the title if one or more are present. I don't know how to achieve it.
I would like the file to look like this once processed:
5456435121|The making of the blue album|https://www.example1.co.uk|55
1321354567|Wow this example has no imagination|https://www.cherrypickers.co.uk|89
5456456456|King of the Barbarians Last Man Standing|https://www.babarians.co.uk|79
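A sketch of one way with awk: for any line with more than four |-separated fields, glue fields 2 through NF-2 back into the title and squeeze the leftover spacing, leaving the product number and the last two fields in place:

```shell
# Rows with extra pipes: merge fields 2..NF-2 into the title, squeeze spaces
awk -F'|' -v OFS='|' '
NF > 4 {
    title = $2
    for (i = 3; i <= NF - 2; i++) title = title " " $i
    gsub(/  +/, " ", title); sub(/ $/, "", title)
    print $1, title, $(NF-1), $NF
    next
}
{ print }' file.csv
```

This handles any number of stray pipes in the title, not just one, because everything between the first and the last two fields is treated as title material.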
neilH
(433 rep)
May 5, 2016, 11:15 AM
• Last activity: Apr 23, 2025, 12:07 PM
0
votes
3
answers
989
views
insert comma after the last letter
I have a txt file, and I need to convert it into a CSV file:
Saint Petersburg 0 10 0.1 - N
Moscow - 9 0 - N
Novgorod 0 7 1 30 Y
In bash, how can I insert a comma after the last letter, and after every number or "-"? For example:
Saint Petersburg, 0, 10, 0.1, -, N
Moscow, -, 9, 0, -, N
Novgorod, 0, 7, 1, 30, Y
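Because the city name can itself contain spaces, it's easier to count from the end: the last five whitespace-separated fields are always data, and everything before them is the name. A sketch with awk:

```shell
# The last 5 fields are data; whatever precedes them is the (possibly
# multi-word) city name.
awk '{
    out = $1
    for (i = 2; i <= NF - 5; i++) out = out " " $i   # rest of the name
    for (i = NF - 4; i <= NF; i++) out = out ", " $i # comma before each value
    print out
}' file.txt
```

This sidesteps having to classify tokens as "letter vs number vs dash" at all, as long as the trailing field count is fixed.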
Best
Enric Agud Pique
(123 rep)
Nov 10, 2014, 05:50 PM
• Last activity: Apr 23, 2025, 04:44 AM
-3
votes
2
answers
130
views
Add a character to duplicate emails using bash only
Input data:
id,location_id,name,title,name@zzz.com,department
1,1,Susan houston,Director of Services,shouston@zzz.com,
2,1,Christina Gonzalez,Director,cgonzalez@zzz.com,
3,2,Brenda brown,"Director, Second Career Services",bbrown@zzz.com,
4,3,Howard Lader,"Manager, Senior Counseling",hlader@zzz.com,
8,6,Bart charlow,Executive Director,bcharlow@zzz.com,
9,7,Bart Charlow,Executive Director,bcharlow@zzz.com,
I need to add a character to duplicate emails after the email part, i.e. bcharlow@zzz.com would become bcharlow7@zzz.com (the digit after the email part needs to be taken from the second column). How can I do that in Bash for all entries?
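A sketch with awk rather than pure bash. Because the quoted titles contain commas, splitting on every comma shifts the field numbers; counting from the end sidesteps that, since each row ends with a trailing comma (empty department field) and the email is always next-to-last:

```shell
# Email is the next-to-last ,-field even when the quoted title contains a
# comma; on repeat sightings, splice location_id ($2) in before the "@".
awk -F, -v OFS=, '{
    e = NF - 1
    if (seen[$e]++) { split($e, p, "@"); $e = p[1] $2 "@" p[2] }
    print
}' input.csv
```

`seen[$e]++` is zero (false) the first time an email appears, so only the later duplicates are rewritten.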
NetRanger
(7 rep)
Mar 29, 2025, 12:48 PM
• Last activity: Apr 4, 2025, 06:14 PM
-1
votes
6
answers
145
views
How to count blank fields from a delimited file in Unix
From the file below:
EmpID:Name:Designation:UnitName:Location:DateofJoining:Salary
1001:Thomson:SE:IVS:Mumbai:10-Feb-1999:60000
1002:Johnson:TE::Bangalore:18-Jun-2000:50000
1003:Jackson:DM:IMS:Hyderabad:23-Apr-1985:90000
1004:BobGL::ETA:Mumbai:05-Jan-2004:55000
1005:Alice:PA:::26-Aug-2014:25000
1006:LilySE::IVS:Bangalore:17-Dec-2015:40000
1007:Kirsten:PM:IMS:Mumbai:26-Aug-2014:45000
1004:BobGL::ETA:Mumbai:05-Jan-2021:55000
I would like to get the count of blank fields (which appear as '::'). Thank you, any support is appreciated.
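A sketch with awk: walk every :-delimited field and count the empty ones across the whole file.

```shell
# Count empty :-delimited fields
awk -F: '{ for (i = 1; i <= NF; i++) if ($i == "") n++ } END { print n }' file
```

A per-line count is the same idea with the print moved into the main block (`print NR, n; n = 0`).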
Ismael Sanchez
(183 rep)
Mar 3, 2025, 02:21 AM
• Last activity: Mar 30, 2025, 06:48 PM
0
votes
2
answers
239
views
CSV processing: moving column/row value to different row where column value matches
I have a csv file with around 50 columns, can be anywhere between 20 and 100 rows.
The individual records have IDs, and some records can be in a group of 2. Essentially what I need to do is add an ID to the same row that another ID in that group is in. Example:
ID ,group,blank column
2019-1 , ,
2019-2 ,GRP1 ,
2019-3 ,GRP2 ,
2019-4 ,GRP1 ,
2019-5 , ,
2019-6 ,GRP2 ,
And the output I would like is:
ID ,group,blank column
2019-1 , ,
2019-2 ,GRP1 ,2019-4
2019-3 ,GRP2 ,2019-6
2019-5 , ,
In my attempts using awk, I haven't had any luck. I either end up leaving out rows that have no group, or I end up repeating values.
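A two-pass awk sketch, assuming each group has exactly two members and the spacing is as shown: the first pass records the first and second ID seen per group; the second pass prints ungrouped rows unchanged, fills column 3 on each group's first member, and drops the second member.

```shell
# Pass 1 (NR==FNR): remember first/second ID per group.
# Pass 2: header and group-less rows pass through; first members get their
# partner ID in column 3; second members are skipped.
awk -F, -v OFS=, '
NR == FNR {
    if ($2 !~ /^ *$/) { if ($2 in first) second[$2] = $1; else first[$2] = $1 }
    next
}
FNR == 1 || $2 ~ /^ *$/ { print; next }
first[$2] == $1 { $3 = second[$2]; print }
' file.csv file.csv
```

Note the file is named twice, once per pass; the padding spaces in the sample are carried along as-is.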
Cody Thomas
(9 rep)
Feb 11, 2019, 06:52 PM
• Last activity: Mar 10, 2025, 10:53 PM
1
votes
5
answers
494
views
Sorting elements of a CSV file
I have a csv file with seven numbers per line like this:
1083,20,28,42,23,10,43
1084,20,5,29,59,40,33
1085,39,50,21,12,40,55
1086,45,4,6,23,10,2
1087,36,46,28,32,3,20
I want to keep the first number in place (column 1) and sort columns 2 to 7, making the file like
1083,10,20,23,28,42,43
1084,5,20,29,33,40,59
1085,12,21,39,40,50,55
1086,2,4,6,10,23,45
1087,3,20,28,32,36,46
How can I do that with awk, sed or whatever?
Thanks
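sort works on lines, not on fields within a line, so one simple route is a small bash loop that re-sorts the tail of each line; a sketch:

```shell
# Sort fields 2..7 numerically within each line, keeping field 1 in place
while IFS=, read -r first rest; do
    sorted=$(printf '%s' "$rest" | tr ',' '\n' | sort -n | paste -sd, -)
    printf '%s,%s\n' "$first" "$sorted"
done < file.csv
```

For large files, a single gawk program using its `asort()` function would avoid spawning a pipeline per line, at the cost of requiring GNU awk.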
Duck
(4794 rep)
Feb 26, 2019, 10:03 AM
• Last activity: Mar 10, 2025, 10:35 PM
1
votes
3
answers
226
views
How could I (painlessly) split or reverse "Last, First" within a record in Miller?
I have a tab-delimited file where one of the columns is in the format "LastName, FirstName". What I want to do is split that record out into two separate columns, last and first, use cut or some other verb(s) on _that_, and output the result to JSON.
I should add that I'm not married to JSON, and I know how to use other tools like [jq](https://github.com/stedolan/jq), but it would be nice to get it in that format in one step.
The syntax for the nest verb looks like it requires memorizing a lot of frankly non-memorable options, so I figured that there would be a simple DSL operation to do this job. Maybe that's not the case?
Here's what I've tried. (Let's just forget about the extra space that's attached to Firstname right now, OK? I would use strip or ssub or something to get rid of that later.)
echo -e "last_first\nLastName, Firstname" \
| mlr --t2j put '$o=splitnv($last_first,",")'
# result:
# { "last_first": "LastName, Firstname", "o": "(error)" }
# expected something like:
# { "last_first": "LastName, Firstname", "o": { 1: "LastName", 2: "Firstname" } }
#
# or:
# { "last_first": "LastName, Firstname", "o": [ "LastName", "Firstname" ] }
Why (error)? Is it not reasonable that assigning to $o as above would assign a new column o to the result of splitnv?
Here's something else I tried that didn't work like I would've expected either:
echo -e "last_first\nLastName, Firstname" \
| mlr -T nest --explode --values --across-fields --nested-fs , -f last_first
# result (no delimiter here, just one field, confirmed w/ 'cat -A')
# last_first
# LastName, Firstname
# expected:
# last_first_1last_first_2
# LastName, Firstname
**Edit**: The problem with the command above is I should've used --tsv, **not** -T, which is a synonym for --nidx --fs tab (numerically-indexed columns). Problem is, Miller doesn't produce an error message when it's obviously wrong to ask for named columns in that case, which might be a mis-feature; see [issue #233](https://github.com/johnkerl/miller/issues/233).
Any insight would be appreciated.
Kevin E
(540 rep)
Mar 7, 2019, 10:52 AM
• Last activity: Mar 10, 2025, 10:05 PM
1
votes
1
answers
1048
views
Sorting CSV data by third column (Revenues) in descending order
**Problem:**
I have a CSV file with five columns (the 1st column is a string and the other four are integers). I want to sort based on the third column, Revenues, with the largest on the top and smallest on the bottom. I want to write the result to a new CSV file.
It seems that I would need to use something like
awk -F '","' 'BEGIN {OFS=","} { if (Revenues($5) > ?? print }' Valuation.csv > Ranking.csv
**Data:**
Company,Nbr employees, Revenues , Revenues per employee , Valuation
Facebook,"35,587","55,800,000,000","1,567,988","491,000,000,000"
Uber,"16,000","11,300,000,000","706,250","120,000,000,000"
Snapchat,"3,069","1,180,000,000","384,490","7,200,000,000"
Airbnb,"3,100","2,600,000,000","838,710","38,000,000,000"
LinkedIn,"13,000","26,200,000,000","2,015,385","26,200,000,000"
Coursora,280,"140,000,000","500,000","815,000,000"
Google,"98,771","39,120,000,000","396,068","720,000,000,000"
Stripe,"1,500","450,000,000","300,000","22,500,000,000"
Epic Games,700,"3,000,000,000","4,285,714","15,000,000,000"
Grab,"3,000","2,750,000,000","916,667","10,000,000,000"
Pinterest,800,"1,000,000,000","1,250,000","12,000,000,000"
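Since the numeric columns are quoted and contain comma grouping, neither `sort -t,` nor a naive awk split can parse them; a CSV-aware sort, e.g. with Python's csv module, avoids that. A sketch, using the Valuation.csv/Ranking.csv names from the question:

```shell
# Sort by the Revenues column (index 2), largest first; the csv module
# handles the quoted, comma-grouped numbers.
python3 -c '
import csv, sys
rows = list(csv.reader(sys.stdin))
header, body = rows[0], rows[1:]
body.sort(key=lambda r: int(r[2].replace(",", "")), reverse=True)
csv.writer(sys.stdout).writerows([header] + body)
' < Valuation.csv > Ranking.csv
```

The quoting is re-applied on output by csv.writer, so fields containing commas stay intact in Ranking.csv.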
James
(11 rep)
Mar 15, 2019, 01:05 PM
• Last activity: Mar 10, 2025, 09:15 PM