Problem with join on a sorted key column: Linux says "not in expected order"

1 vote
0 answers
35 views
File 1. Column1=old_id; Column2=snp_info (50 000 digits per id, so too long to show; it contains only the digits 0, 1, 2 and 5, e.g., 020112202010511)
80024979
80024987
80025141
80107980
80922131
81666414
81667586
87021127
87028460
2010112924
2010115513
2010186050
File 2. Column1=old_id; Column2=new_id.
79931168 58155
79944336 58190
79969242 72833
80107980 58150
80922131 58109
2010112924 96821
2010115513 80604
2010186050 47254
2010198857 90190
2010229173 96927
2010229330 67548
I am trying to join on column 1, which was sorted using sort -k 1n file1 (and the same for file2). When joining (on Linux) I get the following errors:
join: file 2 is not in expected order
join: file 1 is not in expected order.
Looking at the join output, this is where the first error appears:

...200000010000000000000000 58150
join: file 2 is not in expected order

The join then continues with the next old_id from file1 and its column 2 value:

80922131 1100000020000... ...001000010 58109

and then it stops there. Each output line ends with column 2 of file1 followed by column 2 of file2 (the new_id column I want to add).

Looking at the values, the errors start, and the output stops, at the point in file2 where the old_id in column 1 goes from 8-digit values to 10-digit values. Note that the files are indeed sorted correctly (numerically).

What is the cause, and how do I solve this? Pictures included. Thank you in advance.

Screenshots (images not reproduced here):
File 1, column1 (column2 not shown, but = 0101020222015..., 50 000 digits per id)
File 2, column1=old_id, column2=new_id
Result 1, where it continues
Result 2, where it stops

Any help will be immensely appreciated.
Kind regards, Michiel.
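One likely cause, offered as a sketch rather than a confirmed diagnosis of these exact files: join expects both inputs to be sorted on the join field in the same collation order it uses for comparison (the order plain sort gives, without -n), whereas sort -k 1n produces numeric order. The two orders differ once the keys have different lengths: compared as strings, 2010112924 sorts before 80107980. The example below uses small made-up stand-ins for the two files (the temporary directory, the file names and the shortened snp_info values are hypothetical) and sorts both files as strings in the C locale before joining.

```shell
#!/bin/sh
# Hypothetical miniature versions of file1 (old_id snp_info) and file2 (old_id new_id).
tmpdir=$(mktemp -d)
printf '%s\n' '80107980 0112' '80922131 1100' '2010112924 0201' > "$tmpdir/file1"
printf '%s\n' '79931168 58155' '80107980 58150' '80922131 58109' '2010112924 96821' > "$tmpdir/file2"

# Sort on field 1 as a string (no -n), using the same locale for sort and join.
LC_ALL=C sort -k1,1 "$tmpdir/file1" > "$tmpdir/file1.sorted"
LC_ALL=C sort -k1,1 "$tmpdir/file2" > "$tmpdir/file2.sorted"

# Join on the first field of both files (the default join field).
result=$(LC_ALL=C join "$tmpdir/file1.sorted" "$tmpdir/file2.sorted")
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

The joined output is then in string order of old_id rather than numeric order; if numeric order is needed, the result can be piped through sort -k1,1n afterwards.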
Asked by Michiel Van Niekerk (41 rep)
Jan 7, 2024, 04:29 PM
Last activity: Jan 7, 2024, 04:39 PM