Problems with join on a sorted key column: Linux says "not in expected order"
1 vote, 0 answers, 35 views
File 1. Column1=old_id; Column2=snp_info (50 000 integers, so too large to show; contains 0, 1, 2, 5, e.g. 020112202010511).
File 2. Column1=old_id; Column2=new_id.
Result 1: where it continues.
Result 2: where it stops.
Any help will be immensely appreciated. Kind regards, Michiel.
80024979
80024987
80025141
80107980
80922131
81666414
81667586
87021127
87028460
2010112924
2010115513
2010186050
File 2. Column1=old_id; Column2=new_id.
79931168 58155
79944336 58190
79969242 72833
80107980 58150
80922131 58109
2010112924 96821
2010115513 80604
2010186050 47254
2010198857 90190
2010229173 96927
2010229330 67548
I am trying to join the two files on column 1. Both files were sorted on that column, using sort -k 1n file1 and the same for file2. When joining (on Linux) I get the following errors:
join: file 2 is not in expected order
join: file 1 is not in expected order.
If I look at the join, this is where it gives the error:
...200000010000000000000000 58150
join: file 2 is not in expected order -> but it continues with the next old_id (file 1) and its column 2 value ->
80922131 1100000020000...
...001000010 58109 -> and then it stops here, again after column 2 of file 1 and column 2 of file 2 (the new_id column I want to add).
I looked at the values: the errors start, and the output stops, at the point where the old_id in column 1 of file 2 goes from 8-digit values to 10-digit values. Note that the column is indeed sorted correctly in numeric order.
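The jump from 8 to 10 digits can be checked in isolation. The sketch below is a diagnostic under assumptions, not a confirmed explanation of these exact files: it takes two IDs from the listings and tests them with sort -c, once as numbers and once as plain text in the C locale. join compares keys as text, not as numbers, so a numerically sorted file can still look out of order to it. The temporary directory and the variable names numeric and text are made up for the illustration.

```shell
#!/bin/sh
# Hypothetical scratch directory for the illustration.
tmp=$(mktemp -d)

# Two IDs from the listings, in the order that `sort -k 1n` leaves them.
printf '%s\n' '80922131' '2010112924' > "$tmp/ids"

# Numeric check: 80922131 < 2010112924 as numbers, so this passes.
if sort -c -n "$tmp/ids" 2>/dev/null; then numeric=ok; else numeric=disorder; fi

# Text (byte-order) check: the character "8" sorts after "2", so this fails.
if LC_ALL=C sort -c "$tmp/ids" 2>/dev/null; then text=ok; else text=disorder; fi

echo "numeric: $numeric, text: $text"
rm -r "$tmp"
```

If the text check reports disorder at the same place where join complains, the numeric sort is the likely culprit.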
How do I solve this? What is the cause? Pictures included. Thank you in advance.
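A minimal sketch of one likely fix, assuming the cause is that join expects its inputs sorted on the join field as text (in the current locale's collating order) rather than numerically. Both files are re-sorted on field 1 without the n modifier, and sort and join are run under the same locale. The miniature files and the shortened snp_info strings are invented for the illustration; the IDs and new_id values are taken from the listings.

```shell
#!/bin/sh
# Hypothetical miniature inputs; the real snp_info strings are 50 000 digits long.
tmp=$(mktemp -d)
printf '%s\n' '80107980 0201' '80922131 1100' '2010112924 0012' > "$tmp/file1"
printf '%s\n' '80107980 58150' '80922131 58109' '2010112924 96821' > "$tmp/file2"

# Re-sort both files on field 1 as plain text (byte order), not numerically.
LC_ALL=C sort -k1,1 "$tmp/file1" > "$tmp/file1.sorted"
LC_ALL=C sort -k1,1 "$tmp/file2" > "$tmp/file2.sorted"

# Join on field 1 under the same locale that was used for sorting.
joined=$(LC_ALL=C join "$tmp/file1.sorted" "$tmp/file2.sorted")
printf '%s\n' "$joined"

rm -r "$tmp"
```

The joined output comes out in text order (the 10-digit IDs starting with 2 come before the 8-digit IDs starting with 8). If numeric order is needed afterwards, the result can be passed through sort -k1,1n as a final step.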
File 1. Column1=old_id (column2 not shown, but = 0101020222015..., i.e. 50 000 integers per id).




Asked by Michiel Van Niekerk
(41 rep)
Jan 7, 2024, 04:29 PM
Last activity: Jan 7, 2024, 04:39 PM