Unix & Linux Stack Exchange

Q&A for users of Linux, FreeBSD and other Unix-like operating systems

Latest Questions

-2 votes
1 answers
73 views
Parse txt file on basis of occurrence of a tag in Linux
I am trying to parse a txt file containing XML "messages" on Linux, something like this:
xyz     xyz     xyz     xyz   and so on
The code will read file, extract each section from
till
and put each section into a separate file. My code for this is as below
input_file="input_file.txt"

# Extracting Document parts
sed -n '//p' "$input_file" > temp_output.txt

# Splitting into Different Files
csplit -f output -b %d.txt -z temp_output.txt '//' '{*}'

# Cleaning up temporary files
rm temp_output.txt
However, this code is extracting several xml messages into one file, particularly the ones with no line break. Could someone suggest what can be rectified in the above code?
python6 (1 rep)
May 15, 2024, 01:29 PM • Last activity: May 15, 2024, 02:09 PM
3 votes
1 answers
8509 views
Split a text file into multiple files, beyond the {99} limit of csplit
I'd like to split the contents of a .txt file into multiple files, but I have two questions about limitations of csplit:
(1) Can anyone offer a way around csplit's maximum limit of '99' file splits? I have a file with up to 384 splits based on a recurring blank line or character. I'd like csplit to be able to accommodate this with {*}, but this exceeds csplit's intrinsic file generation capacity.
(2) Does anyone know of a way to pass the contents of a file to csplit (pipe to csplit), or can csplit only be used in its conventional way of calling a file in place? i.e.
csplit -f split_name file_to_split.txt /split/ {*}
vs.
[series of commands] | csplit -f split_name /split/ {*}
Thank you for any suggestions, or alternatives to accomplish a similar task.
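A minimal sketch touching both points, assuming GNU csplit: `-n 3` widens the numeric suffix to three digits, and a file operand of `-` makes csplit read standard input. The `part_` prefix and the sample data are made up for illustration.

```shell
# Work in a scratch directory so the sample files do not clutter anything.
cd "$(mktemp -d)" || exit 1

# Hypothetical input: three sections, each new one starting at a "split" line.
# "-" tells csplit to read the piped data; -n 3 gives names like part_000.
printf 'a\nsplit\nb\nsplit\nc\n' |
  csplit -s -z -n 3 -f part_ - '/^split$/' '{*}'

ls
```

With three-digit suffixes there is room for up to 1000 pieces, which covers the 384 splits mentioned in the question.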
kehmsen (59 rep)
Mar 25, 2016, 10:42 PM • Last activity: Feb 26, 2024, 07:10 PM
1 votes
2 answers
151 views
Split file into specific output filenames by pattern match
I have a file with this content:
# new file
text in file 1
# new file
text in file 2
# new file
text in file 3
The pattern here is # new file. Instead of saving each section to xx00, xx01 and xx02, I want to save them to specific files: another file, file new, last one. The 3 files exist in the current directory, so I want to provide them as an array and overwrite them:
csplit -z infile '/# new file/' "${array[*]}"
The array can be provided directly:
array=('another file' 'file new' 'last one')
echo ${array[*]}
another file file new last one
Or by listing the current directory:
array=($(find . -type f))
echo ${array[*]}
./another file ./file new ./last one
A modification of this script could be the solution:
awk -v file="1" -v occur="2" '
{ print > (file".txt") }
/^\$\$\$\$$/{
  count++
  if(count%occur==0){
    if(file){
      close(file".txt")
      ++file
    }
  }
}
' Input_file
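One way this could look, as a sketch rather than a definitive answer: awk switches to the next target name every time it sees the marker line. The `names` variable and its `|` separator are assumptions made for this example, and keeping the marker line in each output file mirrors what csplit would do.

```shell
cd "$(mktemp -d)" || exit 1

# Sample input matching the question: three sections introduced by "# new file".
printf '# new file\ntext in file 1\n# new file\ntext in file 2\n# new file\ntext in file 3\n' > infile

# names is a "|"-separated list (an assumption; use any separator that
# cannot occur in the file names).
awk -v names='another file|file new|last one' '
  BEGIN { split(names, out, "|") }
  /^# new file/ {                  # a new section starts here
    if (cur != "") close(cur)
    cur = out[++i]                 # next target name, "" once the list runs out
  }
  cur != "" { print > cur }        # marker line is written too
' infile
```

Sections beyond the end of the name list are silently dropped in this sketch; a real script would probably want to report that instead.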
Smeterlink (295 rep)
Dec 12, 2023, 06:47 AM • Last activity: Dec 13, 2023, 11:14 AM
0 votes
1 answers
300 views
using csplit to split a file based on a regular expression to multiple files
I have a text file that has the contents of the example below, and I would like to split the file to multiple files.
[TXT]	/path/to/[TXT]
[BAT]	/path/to/[BAT]
[TXT]	/path/to/blah/[TXT]
[BAT]	/path/to/blah/[BAT]
So I have figured out I can use csplit to at least partially do what I wanted to achieve.
csplit -f 'paths-' -b '%04d.txt' 'path/to/filelist.txt' '/^\[(.*)]\t/' '{*}'
However, this splits to paths-0000.txt. I was hoping for something more like paths-txt.txt and paths-bat.txt. Is there any way I can get the regex match into the prefix? I did try things like -f 'paths-$1.txt' and -f 'paths-\1.txt', but neither of those did what I was hoping for.
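As far as I know, csplit's -f and -b options only take a fixed prefix and a printf-style number format, so a captured group cannot be interpolated there. A possible alternative is to let awk derive the file name from the tag in the first column. This is a sketch that assumes tab-separated lines exactly like the sample; `filelist.txt` is a stand-in name.

```shell
cd "$(mktemp -d)" || exit 1

# Stand-in for the real list: tag, a tab, then the path.
printf '[TXT]\t/path/to/[TXT]\n[BAT]\t/path/to/[BAT]\n[TXT]\t/path/to/blah/[TXT]\n[BAT]\t/path/to/blah/[BAT]\n' > filelist.txt

awk -F'\t' '
  $1 ~ /^\[.*\]$/ {
    # Strip the surrounding brackets and lowercase: "[TXT]" -> "txt"
    tag = tolower(substr($1, 2, length($1) - 2))
    print > ("paths-" tag ".txt")
  }
' filelist.txt
```

All lines sharing a tag end up in the same file, which may or may not be what is wanted if the tags are meant to delimit consecutive blocks.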
AeroMaxx (227 rep)
Jul 12, 2023, 11:21 PM • Last activity: Jul 13, 2023, 04:31 AM
4 votes
1 answers
4497 views
How do I use modern coreutils on Mac?
How do I get modern coreutils on Mac? I ran into this problem using csplit. foo.txt:
foo
1
foo
2
foo
3
$: csplit foo '^foo$' '{*}'
# error
Double checking the manpage, man csplit, csplit on Mac is the FreeBSD version and does not offer the '{*}' option. In fact, I must provide the exact number of splits ahead of time. This will either trigger a csplit re-implementation by me, or maybe I can get GNU coreutils on Mac. Is there a way?
Chris (1075 rep)
Dec 20, 2022, 03:32 PM • Last activity: Dec 20, 2022, 08:07 PM
3 votes
1 answers
1255 views
csplit regex with pipe (|)
I want to split a file by a regular expression. The file format is as below:
0|t| lorem ... some text 138|t| title some text
If I execute egrep "[0-9]+\|t\|" file | wc -l, it counts the occurrences correctly, but if I execute csplit filename /[0-9]+\|t\|/, it says no match found and does not split the file. There seems to be some issue with the pipe in the pattern, but I am not able to figure out a solution.
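A likely explanation is that csplit reads its pattern as a basic regular expression, where `+` is an ordinary character and an unescaped `|` is a literal pipe, whereas egrep uses extended syntax. The sketch below sticks to constructs that mean the same thing in any basic regex: `[0-9][0-9]*` for "one or more digits" and plain `|`. The sample file layout (one record header per line) is an assumption.

```shell
cd "$(mktemp -d)" || exit 1

# Assumed layout: each record starts with a line like "138|t| title".
printf '0|t| lorem\nsome text\n138|t| title\nsome text\n' > file

# Basic-regex spelling: no "+", and "|" left unescaped so it stays literal.
csplit -s -z file '/^[0-9][0-9]*|t|/' '{*}'

head xx*
```

The `'{*}'` repeat and `-z` are GNU extensions; on other implementations a fixed repeat count would be needed.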
Jigar Parekh (133 rep)
Mar 30, 2017, 05:58 AM • Last activity: Oct 15, 2022, 08:52 PM
1 votes
2 answers
215 views
Divide a fasta file with scaffolds into same length files respecting the scaffold ID and the sequence
I am currently working with a large fasta file (3.7GB) that has scaffolds in it. Each scaffold has a unique identifier that starts with > on the first line, followed on the next lines by the DNA sequence, like this:
>9999992:0-108
AAAGAATTGTATTCCCTCCAGGTAGGGGGGATAGTTGAGGGGATACATAG
TGGGAAGGCTTTTCATGCGGAGGGACTAGAATGTGCTCCCGACTGACAAA
GCAGCTTG
>9999993:0-118
AGGGACTAGAAATGAGATTAAAAAGAGTAAAAGCACTGATACAAGTACAA
AAACAAATTGCTTCACCTCCAAAACCCCAGAAACTGCCCCACTTGGCTCC
CATTTAACCTACCTTCAA
>9999994:0-113
CCATCCTCATCCTTTCCTCCCCATATCTTCCTCTGACCCCAAAGCTCAGG
TTTCCTGTCTTGTTTCCCAGAATCTGTACCTCATGGTAGTTAAACCTTCC
CCTCTGGCAGCCA
>9999997:0-87
AACATCCCTGTGGCCTGAGAGACTGCCAGCCACAGCGGTGACAGTCCCTG
CGAGAGGCTGCTGCAAAAAGACTGGAGAGAAAGCAGA
>9999998:0-100
AAACATCAGCGCCAAGTCCCCGAAACCAGCAGGGTCACTGGGCGGCCGGC
CTGAAATACCCCAGCAGGCCAGCAGTGCCGGGTGCCTGGGGAGGTGTCCT
>9999999:0-94
AAGAAACTTTTCCCTTAACCAATGAAGAGTTTTATGTAAAGGAAATTTAG
TAATTTTTTAAAAAATGGTAATGACAGATTTAAGTAATTTAATT
I want to split the file into small files preferably of the same length to process it, but I need to respect the ID and the sequence together, and obtain something like this:
file1.fasta
>9999992:0-108
AAAGAATTGTATTCCCTCCAGGTAGGGGGGATAGTTGAGGGGATACATAG
TGGGAAGGCTTTTCATGCGGAGGGACTAGAATGTGCTCCCGACTGACAAA
GCAGCTTG
>9999993:0-118
AGGGACTAGAAATGAGATTAAAAAGAGTAAAAGCACTGATACAAGTACAA
AAACAAATTGCTTCACCTCCAAAACCCCAGAAACTGCCCCACTTGGCTCC
CATTTAACCTACCTTCAA

file2.fasta
>9999994:0-113
CCATCCTCATCCTTTCCTCCCCATATCTTCCTCTGACCCCAAAGCTCAGG
TTTCCTGTCTTGTTTCCCAGAATCTGTACCTCATGGTAGTTAAACCTTCC
CCTCTGGCAGCCA
>9999997:0-87
AACATCCCTGTGGCCTGAGAGACTGCCAGCCACAGCGGTGACAGTCCCTG
CGAGAGGCTGCTGCAAAAAGACTGGAGAGAAAGCAGA

file3.fasta
>9999998:0-100
AAACATCAGCGCCAAGTCCCCGAAACCAGCAGGGTCACTGGGCGGCCGGC
CTGAAATACCCCAGCAGGCCAGCAGTGCCGGGTGCCTGGGGAGGTGTCCT
>9999999:0-94
AAGAAACTTTTCCCTTAACCAATGAAGAGTTTTATGTAAAGGAAATTTAG
TAATTTTTTAAAAAATGGTAATGACAGATTTAAGTAATTTAATT
Please help me. I have tried to use csplit and grep but I get the wrong outputs.
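A sketch of one possible approach, assuming "same length" may be read as "same number of scaffolds per file": awk starts a new output file every `per` header lines, so an ID and its sequence lines always stay together. The variable name `per`, the tiny sample and the `fileN.fasta` naming are assumptions chosen to match the desired output above.

```shell
cd "$(mktemp -d)" || exit 1

# Tiny stand-in for the real 3.7GB file: five records.
printf '>id1\nAAA\nCCC\n>id2\nGGG\n>id3\nTTT\n>id4\nAAA\n>id5\nCCC\n' > scaffolds.fa

awk -v per=2 '
  /^>/ {                               # a header line starts a new record
    if (count % per == 0) {            # time to switch to the next output file
      if (out != "") close(out)
      out = sprintf("file%d.fasta", ++part)
    }
    count++
  }
  out != "" { print > out }
' scaffolds.fa

ls
```

Only one output file is open at a time, so the number of parts is not limited by the open-file limit. Splitting by equal byte size instead would need a different counter.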
Nadia Tamayo (13 rep)
Oct 13, 2022, 02:02 AM • Last activity: Oct 13, 2022, 10:45 AM
0 votes
1 answers
336 views
Help me understand a script that uses csplit and sed
I wanted a simple way to export notes from the reference manager Zotero. I start by selecting multiple notes and dragging them into a blank text file. I also want to achieve "atomicity" of my notes, so I need to split the resulting text files, which contain the individual notes in sections separated by lines of dashes. I then want to use the heading I gave to each note to name the new files, i.e. rename each with the first line of its section. I want to save these new files as markdown files.
The script I have put together is made up of suggestions for each of these functions by contributors on the web. I am trying to make sure that I understand the commands in the script correctly before sharing it with colleagues who have a similar use case to mine.
My understanding (from reading Gilles' answer to another question - see reference link below) of the need for quote marks around the "$f" in the "head" command does not seem to be correct. I tried the script without the quotes and got the same result. Are the double quotes not really needed because "$f" appears on the right-hand side of an assignment? Are they only there because it is easier to double quote by default than to remember when they are not needed? Any further explanation would be much appreciated.
An example of the input file would be the following in Notes_test.txt:
This is note 1

It has some notes

--------------------------------------------------

This is note 2

It has some more notes
The output from this should be two files:
This is note 1.md
This is note 2.md
This is the script I am using on the command line:
csplit Notes_test.txt -f_ -z -b'%03d.md' /--------------------------------------------------/1 {*} && sed -i '/./,$!d' *.md && for f in *.md
    do
    f1=$(head -n1 "$f")
    mv -n "$f" "$f1.md"
    done
and this is my understanding of the commands so far:
-fPREFIX: Use PREFIX as the output file name prefix. In this case an underscore is specified: "_", which I see is just a placeholder.
-z: Suppress the generation of zero-length output files. I think this is necessary because otherwise csplit will produce an empty file at the end of each run through splitting the original files.
-bSUFFIX: Use SUFFIX as the output file name suffix. In this case: "md". %03d puts a 3 digit number as a placeholder for the file name. I added the zero before the 3 at the suggestion of FelixJN.
/--------------------------------------------------/1: specifies the delimiter for the split, with the split being made 2 lines below the line of "-"s (count starts from 0).
{*}: tells bash to run the split until the end of the file. As Felix points out, "{n}" is the number of splits to be executed. In this case "*" means do as many as possible.
&&: means execute the following command on the condition that the previous command has completed.
sed -i: directs sed to operate on files with a particular suffix.
'/./,$!d': means "remove blank lines at head of file". Thanks to Felix again for explaining that this is to specify the range on which sed works: A "." means any character, so it specifies the first character that occurs in the document. Since empty lines do not have any characters, we will need to apply the negative "!" after defining the range. The range is defined by the pattern /"start"/,/"end"/ to apply the command between the strings "start" and "end". $ refers to the last line, so the range is all the non-empty lines in the document. To apply the negative use "!" meaning "NOT", i.e. tell sed to select the opposite of the previous range. In this case all lines before the first line with any character. "d" then deletes these lines.
*.md: means "which has any name with suffix .md".
f1=$(head -n1 "$f"): means define f1 as the first line ("head" means "first line") of the file. This is done by using the variable signifier "$" to define "f1", which will be a placeholder (in the next line of the script) for the new file names (minus suffix). "head" is a bash command that normally outputs the first 10 lines of each file: head [OPTION]... [FILE]... The option -n1 specifies to output one line only. Here, instead of specifying a particular FILE, "$f" specifies "all files." The quote marks around "$f" are needed so that whitespace is ignored (otherwise $f uses whitespace as field separator and further splits the files - see reference link below).
mv -n "$f" "$f1.md": means rename each file as "f1.md". The bash command "mv" takes options and arguments: mv [OPTION]... [-T] SOURCE DEST, i.e. "Rename SOURCE to DEST." The -n option stands for --no-clobber, "do not overwrite an existing file." I think this is just in case there are files (notes) that have the same first line.
See https://www.tutorialspoint.com/unix_commands/csplit.htm and coreutils for unix-like operating systems at https://www.gnu.org/software/coreutils/manual/coreutils.pdf and https://www.howtoforge.com/linux-csplit-command/ (Q2. How to split files using regular expressions?) and https://unix.stackexchange.com/questions/131766/why-does-my-shell-script-choke-on-whitespace-or-other-special-characters https://unix.stackexchange.com/questions/68694/when-is-double-quoting-necessary
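On the quoting question, a small experiment may help. Inside the loop, "$f" holds one file name at a time; left unquoted, the shell would split that name on whitespace and expand any glob characters in it before head ever sees it. The names csplit generates here (_000.md and so on) contain neither, which would explain why removing the quotes made no visible difference. The sketch below uses a hypothetical file name with a space to show where the quotes start to matter.

```shell
cd "$(mktemp -d)" || exit 1

# Hypothetical input file whose name contains a space.
printf 'This is note 1\n\nIt has some notes\n' > '_000 copy.md'

for f in *.md; do
  # "$f" must be quoted here: unquoted, the name would be split at the space
  # and head would be asked to open two files, "_000" and "copy.md".
  f1=$(head -n1 "$f")
  # The assignment itself needs no outer quotes: f1=$(...) and f1="$(...)"
  # behave the same, because no word splitting happens in an assignment.
  mv -n "$f" "$f1.md"
done

ls
```

So the quotes are not tied to the assignment; they protect the argument passed to head (and to mv) inside the command.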
Christopher J Poor (3 rep)
Sep 6, 2021, 04:35 AM • Last activity: Sep 11, 2021, 09:43 PM
4 votes
2 answers
764 views
How to split a file into multiple files after N appearances of a pattern?
I have a file on Linux, containing the coordinates of thousands of molecules. Each molecule starts with a line containing always the same pattern: @MOLECULE And then continues with other lines. I would like to split the file into multiple files, each containing a certain number of molecules. What is the easiest way to do this?
ginopino (380 rep)
May 21, 2021, 09:04 AM • Last activity: May 22, 2021, 02:16 PM
0 votes
1 answers
1107 views
How to make csplit start outputting files with filenames starting from 001?
I use csplit to divide a complex file named file.docked.pdb into small files.
csplit -k -s -n 3 -f file.docked. file.docked.pdb '/^ENDMDL/+1' '{'7'}'
man csplit explains the code perfectly:
NAME
       csplit - split a file into sections determined by context lines


       -k, --keep-files
              do not remove output files on errors

      -s, --quiet, --silent
              do not print counts of output file sizes
      -n, --digits=DIGITS
              use specified number of digits instead of 2

       -f, --prefix=PREFIX
              use PREFIX instead of 'xx'

   Each PATTERN may be:
      

       /REGEXP/[OFFSET]
              copy up to but not including a matching line

       {*}    repeat the previous pattern as many times as possible
My doubt is that the output files start being named from file.docked.000 and extend forward. How can I make the numbering start from file.docked.001? If the tooling does not support this at all, please give a workaround.
Praveen Kumar-M (622 rep)
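I am not aware of a csplit option that sets the starting number, so one workaround is to shift the names afterwards. The sketch below renames from the highest number down so that nothing is overwritten; the three stand-in files replace real csplit output and are an assumption of this example.

```shell
cd "$(mktemp -d)" || exit 1

# Stand-ins for csplit output numbered from 000.
for n in 000 001 002; do echo "model $n" > "file.docked.$n"; done

# Highest number first, so no rename overwrites a file not yet moved.
for f in $(printf '%s\n' file.docked.[0-9][0-9][0-9] | sort -r); do
  n=${f##*.}                                 # numeric suffix, e.g. 002
  new=$(printf '%03d' "$(expr "$n" + 1)")    # expr reads 008 as decimal 8
  mv "$f" "file.docked.$new"
done

ls
```

The unquoted command substitution in the for loop is safe only because these names contain no whitespace.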
May 31, 2020, 03:55 PM • Last activity: Jun 1, 2020, 02:08 AM
2 votes
3 answers
2036 views
csplit multiple files into multiple files
Folks, I'm a bit stumped on this one. I'm trying to write a bash script that will use csplit to take multiple input files and split them according to the same pattern. (For context: I have multiple TeX files with questions in them, separated by the \question command. I want to extract each question into its own file.) The code I have so far:
#!/bin/bash
# This script uses csplit to run through an input TeX file (or list of TeX files) to separate out all the questions into their own files.

# This line is for the user to input the name of the file they need questions split from.
read -ep "Type the directory and/or name of the file needed to split. If there is more than one file, enter the files separated by a space. " files
read -ep "Type the directory where you would like to save the split files: " save
read -ep "What unit do these questions belong to?" unit

# This is a check for the user to confirm the file list, and proceed if true:
echo "The file(s) being split is/are $files. Please confirm that you wish to split this file, or cancel."
select ynf in "Yes" "No"; do
    case $ynf in
        No ) exit;;
        Yes ) echo "The split files will be saved to $save. Please confirm that you wish to save the files here."
            select ynd in "Yes" "No"; do
                case $ynd in
                    Yes )
                        # This line will create a loop to conduct the script over all the files in the list.
                        for i in ${files[@]}
                        do
                            # Mass re-naming is formatted to give "question###.tex" to enable processing a large number of questions quickly.
                            # csplit is the utility used here; run "man csplit" to learn more of its functionality.
                            # the structure is "csplit [name of file] [output options] [search filter] [separator(s)]".
                            # this script calls csplit, will accept the name of the file in the argument, searches the files for calls of "question", splits the file everywhere it finds a line with "question", and renames it according to the scheme [prefix]#[suffix] (the %03d in the suffix-format is what increments the numbering automatically).
                            # the '\\question' allows searching for \question, which eliminates the split for \end{questions}; eliminating the \begin{questions} split has not yet been understood.
                            csplit $i --prefix=$save'/'$unit'q' --suffix-format='%03d.tex' /'\\question'/ '{*}'
                        done; exit;;
                    No ) exit;;
                esac
            done
    esac
done
return
I can confirm it does do the loop as I intended for the input files I have. However, the behavior I'm noticing is that it'll split the first file into "q1.tex q2.tex q3.tex" as expected, and when it moves on to the next file in the list, it'll split the questions and overwrite the old files, and the third file will overwrite the second file's splits, etc. What I would like to happen is that, say, if File1 has 3 questions, it will output:
q1.tex q2.tex q3.tex
And then if File2 has 4 questions, it will then continue incrementing to:
q4.tex q5.tex q6.tex q7.tex
Is there a way for csplit to detect the numbering that has already been done in this loop, and increment appropriately? Thanks for any help you folks can offer!
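Since each csplit run starts counting from zero, one possible way around the overwriting is to skip the loop and feed all files to a single csplit run through a pipe, so the numbering simply continues. This is a sketch assuming GNU csplit; the directory `out`, the unit name `unit1` and the two sample files are invented for the example.

```shell
cd "$(mktemp -d)" || exit 1
mkdir out

# Two hypothetical input files with 2 and 1 questions.
printf '\\question First\n\\question Second\n' > file1.tex
printf '\\question Third\n' > file2.tex

# One csplit run over the concatenation; "-" means read standard input.
cat file1.tex file2.tex |
  csplit -s -z --prefix=out/unit1q --suffix-format='%03d.tex' - '/\\question/' '{*}'

ls out
```

A caveat of this design: anything a later file contains before its first \question (a preamble, for instance) is appended to the last question of the previous file.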
Wayne (35 rep)
Jan 3, 2020, 01:35 PM • Last activity: Jan 5, 2020, 07:06 PM
0 votes
1 answers
895 views
Split file into n files using csplit (or similar tool)
I have a huge file with the following pattern:
ABC
line 1
line 2
line 3
ABC
line 1
line 2
ABC
line1
ABC
line 1
line 3
Using the csplit tool I'm able to split the file above according to the /ABC/ pattern into 4 subfiles:
csplit -z input.txt /ABC/ {*}
I wonder how to manually specify the number of desired output files.
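One reading of the question is "stop splitting after a given number of pieces". For that, a numeric repeat count can replace '{*}': '{N}' applies the pattern N more times, and whatever remains after the last split goes into the final file. A sketch with the sample data, assuming GNU csplit:

```shell
cd "$(mktemp -d)" || exit 1

printf 'ABC\nline 1\nline 2\nline 3\nABC\nline 1\nline 2\nABC\nline1\nABC\nline 1\nline 3\n' > input.txt

# '{1}' repeats /ABC/ once more, i.e. the pattern is applied twice in total.
# The first match is on line 1, so the first piece is empty and -z drops it.
csplit -s -z input.txt '/ABC/' '{1}'

wc -l xx*
```

Here that yields two files, the first block and everything after it. If the count asks for more matches than the file contains, csplit fails and removes its output unless -k is given.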
Andrej (353 rep)
Dec 17, 2019, 06:41 AM • Last activity: Dec 17, 2019, 10:08 AM
2 votes
1 answers
550 views
Bash - extract an indented code block into new file
I have a bunch of [LilyPond](http://www.lilypond.org) files in the following format: \score { \new StaffGroup = "" \with { instrumentName = \markup { \bold \huge \larger "1." } } > \layout {} \midi {} } How would one extract the \relative c {...} block into a new file, so it would look like this: \relative c { \clef bass \key c \major \time 3/4 \tuplet 3/2 4 { c8(\downbow\f b c e g e) } c'4 | %01 \tuplet 3/2 4 {c,8( b c e f a) } c4 | %02 \tuplet 3/2 4 { g,8( d' f g f d) } b'4 | %03 } A fix of the indentation is not necessarily needed in this case. Would that be an awk or csplit task? What would it look like?
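This looks more like an awk task than a csplit one, because the end of the block is defined by a matching brace rather than by a fixed pattern. A rough sketch: start printing at the `\relative c` line and stop once the brace depth returns to zero. The sample `piece.ly` is invented, and the sketch assumes the opening brace sits on the same line as `\relative c` and that no braces hide inside strings or comments.

```shell
cd "$(mktemp -d)" || exit 1

# Invented stand-in for one of the LilyPond files.
cat > piece.ly <<'EOF'
\score {
  \new Staff {
    \relative c {
      \clef bass
      \tuplet 3/2 4 { c8( b c e g e) } c'4 |
    }
  }
  \layout {}
}
EOF

awk '
  /\\relative c/ { grab = 1 }          # block of interest starts here
  grab {
    print
    line = $0
    depth += gsub(/[{]/, "", line)     # count opening braces on this line
    depth -= gsub(/[}]/, "", line)     # count closing braces on this line
    if (depth == 0) exit               # matching close brace reached
  }
' piece.ly > relative.ly

cat relative.ly
```

The indentation is kept as found, which the question says is acceptable.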
nath (6094 rep)
Dec 1, 2019, 08:24 PM • Last activity: Dec 2, 2019, 01:13 AM
4 votes
4 answers
892 views
text processing rows to columns for a block
I have a file containing lists on Solaris: List A hi hello hw r u List B Hi Yes List C Hello I need to transpose the lists as shown below: List A List B List C hi Hi Hello hello Yes hw r u How can I do this on Solaris?
John (51 rep)
Sep 7, 2017, 10:44 AM • Last activity: Apr 9, 2019, 10:12 AM
0 votes
1 answers
208 views
Splitting a file based on values next to matching pattern
I have a file input.txt which includes ~50,000 rows and ~100 columns. I want to split it according to the entry that follows a matching pattern. The field separators are both space and tab.
input.txt
#information
#dateofcreation
#file type
AA BB CC DD EE FF GG HH II
AA bb ac aD FF GG hg ad
DA ga Dt pp Ee FF gg pm TT
DA bR AT GT Gg FF GG Hb Yh
NM gt Jh GT FF hb TH KM MM
In the input file there is a matching field FF in all the lines, followed by an entry that matches in some lines. I want to have three output files from this input file:
GG.txt
AA BB CC DD EE FF GG HH II
AA bb ac aD FF GG hg ad
DA bR AT GT Gg FF GG Hb Yh
gg.txt
DA ga Dt pp Ee FF gg pm TT
hb.txt
NM gt Jh GT FF hb TH KM MM
Thanks.
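A sketch of how awk could do this: for each data line, find the field equal to FF and use the field after it as the output file name. awk's default field splitting already treats runs of spaces and tabs alike. The shortened sample below is an assumption, and lines starting with # are skipped as headers.

```shell
cd "$(mktemp -d)" || exit 1

# Shortened stand-in for input.txt (mixed space and tab separators).
printf '#information\nAA BB CC DD EE FF GG HH II\nAA bb ac aD FF GG hg ad\nNM gt\tJh GT FF hb TH KM MM\n' > input.txt

awk '
  /^#/ { next }                        # skip header lines
  {
    for (i = 1; i < NF; i++)
      if ($i == "FF") {                # exact, case-sensitive match
        print > ($(i + 1) ".txt")      # file named after the next field
        break
      }
  }
' input.txt

ls
```

With many distinct values after FF, some awk implementations may run into the open-file limit; closing files and appending with >> would be one way around that.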
user3377241 (103 rep)
Oct 18, 2018, 08:42 PM • Last activity: Oct 18, 2018, 09:38 PM
1 votes
1 answers
1391 views
alternative to csplit - splitting after the pattern
I want to split a file after a delimiter, not before the delimiter, which is what csplit does. I can't find anything anywhere! (Also, why would there be a tool that specifically splits before a pattern, but not one that splits after it?) File: a b c d split at c output: file1: a b c file 2 d
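csplit patterns accept a line offset, so splitting after the match can be sketched as `/REGEXP/+1`: the cut is made one line below the matching line. The sample file below follows the example in the question.

```shell
cd "$(mktemp -d)" || exit 1

printf 'a\nb\nc\nd\n' > file

# +1 moves the split point to the line after the one matching ^c$.
csplit -s file '/^c$/+1'

head xx*
```

This leaves a, b and c in xx00 and d in xx01.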
LizzAlice (113 rep)
Apr 26, 2018, 01:28 PM • Last activity: Apr 26, 2018, 01:48 PM
5 votes
2 answers
1186 views
csplit not recognizing provided regexp
I'm working on this big file (**DATA.DAT**, ~900MB) which contains several other files. It's from a PS2 game. Sound samples (which are in **.AIFF** format), precisely what I'm after, make up most of its size. After searching the web for PS2 **.DAT** extractors I found out that they're basically developer dependent and since this game/tool is rather obscure and not finding much about it online, I thought about automating the process myself. Inspecting the file on a hex editor I came across some **.AIFF** headers, cloned the chunks to new **.AIFF** files and without any further work, they were playable. Having spent a while getting the rust out of my VERY limited bash knowledge and having read similar questions here, I came up with this expression: gcsplit -f "sample-" -b "%04d.aif" DATA.DAT /FORM/ '{*}' (I'm on OSX using coreutils, hence the g- prefix on csplit) Given that **.AIFF** files start with the string "FORM" and given that basically all samples in the file are next to each other (spaced apart by disregardable amounts of data that won't generate unwanted end noise on the samples), I thought that the regexp /FORM/ would suffice to split the files up. However, every split file is being output with junk data that sits in between sound samples before the **.AIFF** header, rendering it unplayable. Screenshots of the hex data of a split sound sample below: bad split This actual sample begins roughly around the 1500 bytes mark: sample What's making this expression split the files with an offset?
João (53 rep)
Nov 26, 2017, 04:02 AM • Last activity: Nov 26, 2017, 08:15 PM
2 votes
2 answers
831 views
How to split a file based on context?
I have some files that contain the results of the lldpneighbors command from all our servers. I would like to split these files into individual files for each server in order to make it easier to import this data into our inventory system. **Sample Input** === Output from 00000000-0000-0000-0000-000000000000 (SERVERNAME1): Interface 'ixgbe0' has 1 LLDP Neighbors: Neighbor 1: Chassis ID: MAC Address - 00 01 02 03 04 05 Port ID: Interface Name - TenGigabitEthernet 0/6 Time To Live: 120 seconds System Name: name-of-switch-01 End Of LLDPDU: Interface 'igb0' has 1 LLDP Neighbors: Neighbor 1: Chassis ID: MAC Address - 00 01 02 03 04 05 Port ID: Interface Name - TenGigabitEthernet 0/23 Time To Live: 120 seconds System Name: name-of-switch-02 End Of LLDPDU: === Output from 00000000-0000-0000-0000-000000000000 (SERVERNAME2): Interface 'ixgbe0' has 1 LLDP Neighbors: Neighbor 1: Chassis ID: MAC Address - 00 01 02 03 04 05 Port ID: Interface Name - TenGigabitEthernet 0/2 Time To Live: 120 seconds System Name: name-of-switch-01 End Of LLDPDU: Interface 'igb0' has 1 LLDP Neighbors: Neighbor 1: Chassis ID: MAC Address - 00 01 02 03 04 05 Port ID: Interface Name - TenGigabitEthernet 0/19 Time To Live: 120 seconds System Name: name-of-switch-02 End Of LLDPDU: This is roughly what all the results look like with some variation(They are not all the same length, some are several lines longer because of more interfaces). 
The delimiting string I would like to match on is: === Output from [UUID] ([HOSTNAME]): Ideally I would like each file to be named the hostname(this would just be convenience and is not necessary), so above results would be split into files like: **SERVERNAME1** === Output from 00000000-0000-0000-0000-000000000000 (SERVERNAME1): Interface 'ixgbe0' has 1 LLDP Neighbors: Neighbor 1: Chassis ID: MAC Address - 00 01 02 03 04 05 Port ID: Interface Name - TenGigabitEthernet 0/6 Time To Live: 120 seconds System Name: name-of-switch-01 End Of LLDPDU: Interface 'igb0' has 1 LLDP Neighbors: Neighbor 1: Chassis ID: MAC Address - 00 01 02 03 04 05 Port ID: Interface Name - TenGigabitEthernet 0/23 Time To Live: 120 seconds System Name: name-of-switch-02 End Of LLDPDU: **SERVERNAME2** === Output from 00000000-0000-0000-0000-000000000000 (SERVERNAME2): Interface 'ixgbe0' has 1 LLDP Neighbors: Neighbor 1: Chassis ID: MAC Address - 00 01 02 03 04 05 Port ID: Interface Name - TenGigabitEthernet 0/2 Time To Live: 120 seconds System Name: name-of-switch-01 End Of LLDPDU: Interface 'igb0' has 1 LLDP Neighbors: Neighbor 1: Chassis ID: MAC Address - 00 01 02 03 04 05 Port ID: Interface Name - TenGigabitEthernet 0/19 Time To Live: 120 seconds System Name: name-of-switch-02 End Of LLDPDU: I'm trying to use csplit to accomplish this but I'm not able to match the regex for some reason. The commands I've tried: $ csplit jbutryn_us-west-a_neighbors %===.*:% '{20}' csplit: ===.*:: no match $ csplit jbutryn_us-west-a_neighbors /===.*:/ '{20}' 552 552 552 csplit: ===.*:: no match $ csplit jbutryn_us-west-a_neighbors '/===.*:/' '{20}' 552 552 552 csplit: ===.*:: no match $ csplit -ks -f test jbutryn_us-west-a_neighbors '/===.*:/' '{20}' csplit: ===.*:: no match Any suggestions?
jesse_b (41447 rep)
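The "no match" that appears after a few sizes have been printed looks consistent with '{20}' asking for more repetitions than the file has headers; '{*}' would avoid that with GNU csplit. For naming each file after the host, awk may be the simpler tool. The sketch below assumes every "=== Output from" header is on its own line and ends in "(HOSTNAME):"; the shortened sample file `neighbors` is made up.

```shell
cd "$(mktemp -d)" || exit 1

# Shortened stand-in for the real neighbors file.
cat > neighbors <<'EOF'
=== Output from 00000000-0000-0000-0000-000000000000 (SERVERNAME1):
Interface 'ixgbe0' has 1 LLDP Neighbors:
=== Output from 00000000-0000-0000-0000-000000000000 (SERVERNAME2):
Interface 'igb0' has 1 LLDP Neighbors:
EOF

awk '
  /^=== Output from / {
    if (out != "") close(out)
    out = $NF                    # last field, e.g. "(SERVERNAME1):"
    gsub(/[():]/, "", out)       # strip punctuation -> "SERVERNAME1"
  }
  out != "" { print > out }
' neighbors

ls
```

If the same hostname shows up twice, the second block would overwrite the first in this sketch.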
Aug 24, 2017, 03:44 PM • Last activity: Aug 24, 2017, 04:41 PM
1 votes
2 answers
87 views
Select the contents respective to some specific content from a file and move it to an output file
I have a file tnsnames.ora and its contents are as below. NEWDB = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = linuxerp.de.mph.com)(PORT = 1521)) (ADDRESS = (PROTOCOL = TCP)(HOST = linuxerp.de.mph.com)(PORT = 1550)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = newdb) ) ) LISTENER_DG11G = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = linuxerp.de.mph.com)(PORT = 1521)) (ADDRESS = (PROTOCOL = TCP)(HOST = linuxerp.de.mph.com)(PORT = 1550)) ) LISTENER_SABDB = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = linuxerp.de.mph.com)(PORT = 1521)) (ADDRESS = (PROTOCOL = TCP)(HOST = linuxerp.de.mph.com)(PORT = 1550)) ) STEST = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = linuxerp.de.mph.com)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = STEST) ) ) RBSDB = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = linuxerp.de.mph.com)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = RBSDB) ) ) In the above file NEWDB = LISTENER_DG11G = LISTENER_SABDB = STEST = RBSDB = are the database names and the respective service names are included in SERVICE_NAME = So, From the above file I am trying to extract the Database name and respective service names and put it into a file or .xls in linux. The output file should be like NEWDB newdb STEST STEST RBSDB RBSDB And what all databases that don't have service name should not be added into the output file. I tried using CSPLIT and move the first set of lines to a file "X" and select the first line and SERVICE_NAME using cat X | grep -i "SERVICE_NAME" | cut -d "=" -f2 | rev | cut -d ")" -f2 | rev | awk "NF" and move it to a file and append the same way to rest of database names. But it seems so complicated. Any other idea how it can be done will be appreciated.
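A single awk pass might be simpler than csplit plus a chain of cut and rev. The sketch remembers the most recent alias and prints it whenever a SERVICE_NAME line is seen, so entries without a service name never reach the output. It assumes aliases start in the first column and nested lines are indented, as in a typical tnsnames.ora; the trimmed sample and the output name `services.txt` are illustrative.

```shell
cd "$(mktemp -d)" || exit 1

# Trimmed stand-in for the real tnsnames.ora.
cat > tnsnames.ora <<'EOF'
NEWDB =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = linuxerp.de.mph.com)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = newdb)
    )
  )
LISTENER_DG11G =
  (ADDRESS_LIST =
    (ADDRESS = (PROTOCOL = TCP)(HOST = linuxerp.de.mph.com)(PORT = 1521))
  )
STEST =
  (DESCRIPTION =
    (CONNECT_DATA =
      (SERVICE_NAME = STEST)
    )
  )
EOF

awk '
  /^[A-Za-z0-9_.]+[ \t]*=/ {                       # unindented alias line
    name = $0
    sub(/[ \t]*=.*/, "", name)                     # keep only the alias
  }
  /SERVICE_NAME[ \t]*=/ {
    svc = $0
    sub(/.*SERVICE_NAME[ \t]*=[ \t]*/, "", svc)    # drop everything up to "="
    sub(/[ \t]*\).*/, "", svc)                     # drop the closing parenthesis
    print name, svc
  }
' tnsnames.ora > services.txt

cat services.txt
```

The result is plain space-separated text; a spreadsheet can import it, or the output separator could be set to a tab or comma.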
sabarish jackson (628 rep)
Mar 21, 2017, 04:00 AM • Last activity: Mar 21, 2017, 12:27 PM
7 votes
3 answers
2751 views
Exclude delimiter with csplit
Is it possible to remove the delimiter with csplit? Example: $ cat in abc --- def --- ghi $ csplit -q in /-/ '{*}' $ ls x* xx00 xx01 xx02 $ head xx* ==> xx00 xx01 xx02 xx00 xx01 xx02 <== ghi While it can be done in two steps as above, can it be done in one step? If it cannot be done with csplit, is there a one-step way that is shorter compared to the two invocations (csplit + sed) above? No preference to a tool used as long as it's reasonably readable.
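Newer GNU csplit versions (coreutils 8.22 and later, as far as I recall) have a --suppress-matched option that leaves the matching line out of the output, which would make this a single step. A sketch with the sample from the question:

```shell
cd "$(mktemp -d)" || exit 1

printf 'abc\n---\ndef\n---\nghi\n' > in

# --suppress-matched drops the delimiter lines instead of copying them.
csplit -q --suppress-matched in '/^---$/' '{*}'

head xx*
```

On systems without this option, the two-step csplit plus sed approach or an awk one-liner remains the fallback.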
levant pied (231 rep)
May 5, 2016, 07:33 PM • Last activity: May 11, 2016, 01:31 PM