using wget to download all audio files (over 100,000 pages on wikia)
2 votes · 1 answer · 3158 views
I am trying to download all audio files from Wookieepedia, the Star Wars wiki.
My first thought was something like this:
wget -r -A -nd .mp3 .ogg http://starwars.wikia.com/wiki/
This should download every .mp3 and .ogg file from the wiki without recreating the site's directory hierarchy (-nd). However, when I run it in the terminal I get:
>bash: http://starwars.wikia.com/wiki/ : No such file or directory
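From the wget man page, -A wants its accept list as a single comma-separated argument right after the flag, so in the command above it swallowed -nd and left the patterns dangling; and since the error comes from bash rather than wget, I suspect the URL also ended up on its own line when I pasted the command. My guess (untested) at what the invocation was meant to read:

wget -r -nd -A mp3,ogg http://starwars.wikia.com/wiki/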
The problem is that I can't use a for loop, since the URL of each wiki page is unique. For example:
http://starwars.wikia.com/wiki/Retcon
http://starwars.wikia.com/wiki/C-3PX
http://starwars.wikia.com/wiki/Star_Wars_Legends
Is it possible to download from URLs with this structure?
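One alternative I have been wondering about: Wikia runs MediaWiki, so the stock api.php endpoint should be able to list uploaded files by MIME type directly, skipping the page crawl entirely. A rough, untested sketch (the audio/ogg MIME value and the api.php path are assumptions, jq must be installed, and pagination via aicontinue is omitted):

# list uploaded files of a given MIME type via the MediaWiki API,
# extract the direct file URLs with jq, and feed them back to wget
wget -q -O - 'http://starwars.wikia.com/api.php?action=query&list=allimages&aiprop=url&aimime=audio/ogg&ailimit=500&format=json' \
  | jq -r '.query.allimages[].url' \
  | wget -nd -i -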
EDIT: This is the output I get when I run the command from the answer.
--2016-02-10 16:21:26-- http://starwars.wikia.com/wiki/
Resolving starwars.wikia.com (starwars.wikia.com)... 23.235.33.194, 23.235.37.194, 104.156.81.194, ...
Connecting to starwars.wikia.com (starwars.wikia.com)|23.235.33.194|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://starwars.wikia.com/wiki/Main_Page [following]
--2016-02-10 16:21:26-- http://starwars.wikia.com/wiki/Main_Page
Reusing existing connection to starwars.wikia.com:80.
HTTP request sent, awaiting response... 200 OK
Length: 569628 (556K) [text/html]
Saving to: ‘index.html’

100%[========================>] 569,628 217KB/s in 2.6s

2016-02-10 16:21:29 (217 KB/s) - ‘index.html’ saved [569628/569628]

Removing index.html since it should be rejected.

FINISHED --2016-02-10 16:21:29--
Total wall clock time: 2.7s
Downloaded: 1 files, 556K in 2.6s (217 KB/s)
ls gives me nothing; there are no files in the working directory.
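Looking at the log again: wget fetched Main_Page, removed it because of -A, and then stopped instead of recursing into the links it had parsed. As far as I understand, wget honors robots.txt during recursive retrieval, and I am guessing that is what cuts the crawl short on wikia.com. If so, something like this might get further (untested, and -e robots=off should be paired with a polite --wait):

wget -r -l 2 -nd -A mp3,ogg -e robots=off --wait=1 http://starwars.wikia.com/wiki/Main_Page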
Asked by user147855
Feb 6, 2016, 05:40 PM
Last activity: Jul 3, 2025, 01:03 PM