I bought a Deluge a while back, and I’ve owned synthesizers and kaossilators and all kinds of other things for years. The Deluge is several things: expensive, awkward to use, and (with practice) capable of making some reasonable music. Here are some ambient tunes I’ve written with it:
[Edit: If you want a light introduction to this, I recommend this double CD]
I wanted to download it … all!
But apart from this gnomic explanation it isn’t obvious how, so I had to work it out. Here’s how I did it …
First, you need to start with the Advanced Search form. Using the second form on that page, put collection:georgeblood in the query box, select only the identifier field, and set the format to CSV. Set the limit to 30000 (there are more than 25,000 records), and download the huge CSV:
$ ls -l search.csv
-rw-rw-r--. 1 rjones rjones 2186375 Aug 14 21:03 search.csv
$ wc -l search.csv
25992 search.csv
$ head -5 search.csv
"identifier"
"78_jeannine-i-dream-of-you-lilac-time_bar-harbor-society-orch.-irving-kaufman-shilkr_gbia0010841b"
"78_a-prisoners-adieu_jerry-irby-modern-mountaineers_gbia0000549b"
"78_if-i-had-the-heart-of-a-clown_bobby-wayne-joe-reisman-rollins-nelson-kane_gbia0004921b"
"78_how-many-times-can-i-fall-in-love_patty-andrews-and-tommy-dorsey-victor-young-an_gbia0013066b"
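A bit of URL exploration found a fairly straightforward way to turn those identifiers into directory listings: append the identifier to https://archive.org/download/. For example:
https://archive.org/download/78_jeannine-i-dream-of-you-lilac-time_bar-harbor-society-orch.-irving-kaufman-shilkr_gbia0010841b/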
What I want to do is pick the first MP3 file in each directory and download it. I’m not fussy about how to do that, and Python has both a CSV library and an HTML fetching library. The following script turns the CSV file of identifiers into a list of MP3 URLs. You could easily adapt it to download FLAC files instead.
#!/usr/bin/python
# Python 2 script: print the URL of the first MP3 in each item's directory.
import csv
import re
import urllib2
import urlparse
from BeautifulSoup import BeautifulSoup

with open('search.csv', 'rb') as csvfile:
    csvreader = csv.reader(csvfile, delimiter=',', quotechar='"')
    for row in csvreader:
        # Skip the CSV header line.
        if row[0] == "identifier":
            continue
        url = "https://archive.org/download/%s/" % row[0]
        page = urllib2.urlopen(url).read()
        soup = BeautifulSoup(page)
        links = soup.findAll('a', attrs={'href': re.compile(r"\.mp3$")})
        # Skip items with no MP3 rather than crashing on links[0].
        if not links:
            continue
        # Only want the first link in the page.
        link = links[0]
        link = link.get('href', None)
        link = urlparse.urljoin(url, link)
        print link
When you run this it converts each identifier into a download URL:
Edit: Amusingly WordPress turns the next pre section with MP3 URLs into music players. I recommend listening to them!
$ ./download.py | head -10
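The script above is written for Python 2 (urllib2, urlparse and the old BeautifulSoup module). If you only have Python 3, here is a rough equivalent of its link-picking step using nothing but the standard library. Mp3LinkParser and first_mp3_url are names I made up for this sketch, and the listing in the demo is invented rather than a real archive.org page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class Mp3LinkParser(HTMLParser):
    """Collect the href of every <a> tag whose target ends in .mp3."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.endswith(".mp3"):
            self.links.append(href)


def first_mp3_url(base_url, html):
    """Return the absolute URL of the first MP3 link in html, or None."""
    parser = Mp3LinkParser()
    parser.feed(html)
    if not parser.links:
        return None
    return urljoin(base_url, parser.links[0])


if __name__ == "__main__":
    # Invented listing for illustration. A real one would come from
    # urllib.request.urlopen(base).read().decode("utf-8", "replace").
    base = "https://archive.org/download/example-item/"
    listing = '<a href="notes.txt">notes</a> <a href="side-a.mp3">A</a>'
    print(first_mp3_url(base, listing))
```

Returning None when a directory has no MP3 lets the caller skip that item instead of stopping halfway through 25,000 identifiers.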
And after that you can download as many 78s as you can handle 🙂 by doing:
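One way to do the bulk download, as a minimal Python 3 sketch rather than the exact command I used: fetch the URLs one at a time, pause between requests, and skip anything already on disk so the job can be restarted. The helper names, the delay and the idea of saving the URL list to a file first are all assumptions for illustration.

```python
import os
import time
import urllib.request
from urllib.parse import unquote, urlparse


def local_name(url):
    """Derive a local filename from the last path component of url."""
    return unquote(os.path.basename(urlparse(url).path))


def download_all(urls, dest_dir=".", delay=1.0,
                 fetch=urllib.request.urlretrieve):
    """Download urls one at a time, skipping files already present.

    fetch is any callable taking (url, target_path); it defaults to
    urllib.request.urlretrieve and can be swapped out for testing.
    """
    saved = []
    for url in urls:
        target = os.path.join(dest_dir, local_name(url))
        if os.path.exists(target):
            continue  # already downloaded on an earlier run
        fetch(url, target)
        saved.append(target)
        time.sleep(delay)  # a single connection, with a pause between files
    return saved
```

To use it, save the URL list first (for example with ./download.py > urls.txt, a hypothetical filename), then call download_all(line.strip() for line in open("urls.txt") if line.strip()).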
I only downloaded about 5% of the tracks, but it looks as if downloading it all would be ~ 100 GB. Also most of these tracks are still in copyright (thanks to insane copyright terms), so they may not be suitable for sampling on your next gramophone-rap record.
(and that doesn’t count things like radio programs encoded in mp4 or flv)
I still haven’t found a music player that can deal with this. In particular:
ID3 tags are often wrong — don’t use them. Use a database like Shazam to work out what’s really in each file.
There are duplicates. I’m never going to be able to fix that. Transparently pick the highest quality file when playing, and don’t show me duplicates ever.
Give me an intelligent way to navigate this. I’d like to have the player automatically group them or suggest related music, perhaps by looking for similarities in the audio (and definitely not by using the ID3 tag).
Related to the previous point, search should be super simple.
When choosing a party playlist, don’t mix in radio programs or podcasts.
Some of the titles are not US ASCII. It still seems like this isn’t a completely solved problem (sigh).
It’d be nice if it could stream the music to my squeezebox players (of which I now have two, yay!), but I’m not expecting miracles …
Update #2: Whichever cock used a download manager, making 10s of simultaneous connections to my site at once, you’ve ruined it for everyone because I’ve now taken these down.
Update: If you can mirror these files, please post a comment with the link.
Frog were a totally brilliant band that I had the pleasure of seeing live several times. Sadly their music has long been unavailable in any format, not even for live gigs.
Important note: In order to fool the IFPI drones, these are encoded as encrypted ZIP files. You will need to supply a password, which is the name in all lowercase of a famous 80s arcade game involving a small green amphibian trying to cross a major highway. What was that game called?
I am Richard W.M. Jones, a computer programmer. I have strong opinions on how we write software, about Reason and the scientific method. Consequently I am an atheist. [To nutcases: Please stop emailing me about this, I'm not interested in your views on it.] By day I work for Red Hat on all things to do with virtualization. I am a "citizen of the world".
My motto is "often wrong". I don't mind being wrong (I'm often wrong), and I don't mind changing my mind.
This blog is not affiliated with or endorsed by Red Hat, and all views are entirely my own.