Add columns from variable number of files to base file
4 votes, 3 answers, 230 views
I'm dealing with a series of bed files, which look like this:
chr1 100 110 0.5
chr1 150 175 0.2
chr1 200 300 1.5
The columns are chromosome, start, end, and score. I have several files, each with different scores, and I'd like to combine them like this:
> cat a.bed
chr1 100 110 0.5
chr1 150 175 0.2
chr1 200 300 1.5
> cat b.bed
chr1 100 110 0.4
chr1 150 175 0.7
chr1 200 300 0.9
> cat c.bed
chr1 100 110 1.5
chr1 150 175 1.2
chr1 200 300 0.1
> cat combined.bed
chr1 100 110 0.5 0.4 1.5
chr1 150 175 0.2 0.7 1.2
chr1 200 300 1.5 0.9 0.1
All the score columns (the last column of each file) are collected into a single file. I found [this answer](https://unix.stackexchange.com/a/167290/387150), which can combine a column from one additional file into an existing file, but I would like a command that can append a *variable* number of columns. So if I have 10 bed files to combine, I'd like a command that can process them all together and create a single file with 10 score columns.
Each file should have the same number of lines, and each entry should have the same coordinates in all the files, so there should be no conflicts there. However, there can be a lot of entries in each file (generally 100K or more), so I'd like to avoid processing each one multiple times.
Is there a way to handle this cleanly? This will be in a script, so it doesn't need to be a one-liner.
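To make the goal concrete, here is a rough sketch of the kind of single-pass approach that might work, not necessarily the best one. It assumes every file has exactly four whitespace-separated columns and that all files list the same entries in the same order, as described above. The function name `combine_beds` is made up for illustration.

```shell
#!/bin/sh
# Sketch only: combine the score (4th) column of any number of bed files.
# Assumption: every file has exactly 4 whitespace-separated columns and the
# same entries in the same order. combine_beds is a hypothetical helper name.
combine_beds() {
    # paste reads all files side by side in one pass (tab-joined by default);
    # awk keeps the coordinates from the first file and every 4th field after.
    paste "$@" | awk -v n="$#" '{
        out = $1 OFS $2 OFS $3
        for (i = 0; i < n; i++) out = out OFS $(4 * i + 4)
        print out
    }'
}

# When the script is called with file arguments, combine them to stdout.
if [ "$#" -gt 0 ]; then
    combine_beds "$@"
fi
```

Saved as a script (say `combine.sh`, also a made-up name), it would be run as `sh combine.sh a.bed b.bed c.bed > combined.bed`, or with a glob such as `*.bed` for however many files there are. Each input is read once, since `paste` walks all the files in parallel.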
Asked by Whitehot
(245 rep)
Feb 24, 2025, 03:00 PM
Last activity: Feb 25, 2025, 10:43 PM