Use machine learning to generate brand new never-before-heard sounds

to_the_sun's icon

Maybe it's a bit of blasphemy to post this here, but I would love to see someone break sound design by using machine learning to generate endless new never-before-heard sounds to be utilized like presets in a VSTi. It may be cutting edge at the moment, but it's certainly possible. See this video:

Instead of using images as in the video, train the unsupervised deep learning algorithm on tons of different sounds (probably both VSTi notes and other musical samples). Then refine its output with an adversarial network, etc. Anyone else get giddy with excitement at this prospect or does it make you fear for your job? Anyone out there with the skills to make an attempt at this??

Roman Thilenius's icon

this algorithm is how they summoned trump.

maaark's icon

Check this out: https://deepmind.com/blog/wavenet-generative-model-raw-audio/
Under "Knowing what to say" it gets really interesting. There are lots of implementations on GitHub... Easy enough to set up, but it takes so long to train and generate...!
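
For anyone wondering what a WaveNet-style model actually predicts: as far as I understand the paper, it does not output raw 16-bit values but one of 256 classes per sample, obtained by mu-law companding. Below is a minimal sketch of that quantization step only, in plain Python. The function names are mine and hypothetical, not taken from any of the GitHub implementations, and the rounding convention is an assumption.

```python
import math

MU = 255  # assumed 8-bit mu-law, i.e. 256 output classes


def mu_law_encode(x, mu=MU):
    """Map a sample in [-1.0, 1.0] to an integer class in [0, mu]."""
    x = max(-1.0, min(1.0, x))  # clamp out-of-range input
    # Logarithmic compression: quiet samples get finer resolution than loud ones.
    compressed = math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)
    # Shift [-1, 1] to [0, mu] and round to the nearest class.
    return int((compressed + 1.0) / 2.0 * mu + 0.5)


def mu_law_decode(index, mu=MU):
    """Map an integer class in [0, mu] back to a sample in [-1.0, 1.0]."""
    compressed = 2.0 * index / mu - 1.0
    # Inverse of the logarithmic compression above.
    expanded = math.expm1(abs(compressed) * math.log1p(mu)) / mu
    return math.copysign(expanded, compressed)


if __name__ == "__main__":
    for sample in (-1.0, -0.01, 0.0, 0.01, 0.5, 1.0):
        index = mu_law_encode(sample)
        print(sample, "->", index, "->", round(mu_law_decode(index), 5))
```

The network then only has to pick one of 256 classes for each new sample, given the samples before it. That sample-by-sample loop is a big part of why generation is so slow.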

Also check out Espen Sommer Eide's article on Deep Learning Dead Languages: http://www.3quarksdaily.com/3quarksdaily/2017/01/deep-learning-dead-languages.html

If anyone's got more links in this vein, I'm super interested.

lyve forms's icon

am v interested in this, working on something myself.
some random stuff so far

- there are several courses on ML for artists on kadenze & other platforms
- https://cycling74.com/forums/machine-learning-in-max-lKKLSjT
- https://cycling74.com/forums/machine-learning-in-max-hallelujah/
- max objects: mubu, ml.adaboost, cnmat NN, ml.*, ml-lib >> DTW-based methods (time-based) eventually most suited to audio?
- ...
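
Since DTW came up in the list above: a minimal sketch of classic dynamic time warping between two sequences of numbers, in plain Python. This is the textbook dynamic-programming version and is not meant to show how mubu, ml-lib or any other Max object implements it. The function name and the absolute-difference cost are my own assumptions.

```python
def dtw_distance(a, b):
    """Return the DTW alignment cost between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i items of a
    # with the first j items of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])  # local distance (assumed metric)
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b holds (b is stretched)
                cost[i][j - 1],      # b advances, a holds (a is stretched)
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]


if __name__ == "__main__":
    envelope = [0, 1, 2, 1, 0]
    stretched = [0, 0, 1, 2, 2, 1, 0]  # same shape, played more slowly
    print(dtw_distance(envelope, stretched))
```

The appeal for audio is that two gestures with the same shape but different timing still come out as close matches. In practice you would feed it feature frames (loudness, spectral centroid, MFCCs) rather than raw samples, because the cost grows with the product of the two lengths.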

brendan mccloskey's icon

Here's a spanner for the machinery:

Until we relinquish the 'discrete sample' model of audio DSP, there can't be anything that is truly "endless new never-before-heard sounds". And yes, I am repeating something someone else said.
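
To put a number on that: with discrete samples, a clip of fixed length and bit depth can only take finitely many values, so "endless" is strictly off the table, even though the count is absurdly large. A minimal sketch of the arithmetic in Python follows. The function name is hypothetical, and mono audio at CD-style settings is just an assumed example.

```python
import math


def clip_count_digits(sample_rate, bit_depth, seconds):
    """Decimal digits in the number of distinct mono PCM clips.

    A clip holds sample_rate * seconds samples, each with 2**bit_depth
    possible values, so there are 2**total_bits distinct clips. That number
    is too large to print comfortably, so only its digit count is returned.
    """
    total_bits = int(sample_rate * seconds) * bit_depth
    return math.floor(total_bits * math.log10(2)) + 1


if __name__ == "__main__":
    # One second of 16-bit mono audio at 44.1 kHz.
    print(clip_count_digits(44100, 16, 1))
```

For one second of 16-bit mono at 44.1 kHz the count of possible clips is a number with a couple of hundred thousand digits. Finite, yes, but nobody is going to run out. Nearly all of those clips are noise, of course, which is where the "never-before-heard" question gets harder than the "endless" one.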

2c

to_the_sun's icon

@maaark Those wavenets are highly intriguing. The second article you linked sounded like it was using them as well, and it made it sound like anyone could try them out just by downloading "the necessary software". Could it be true? I haven't stumbled across a way as of yet.

Roman Thilenius's icon

@brendan

i think with regard to "endless" ... this question is already clarified.

that means that "truly endless" (which is probably more endless than endless) also does not exist.

but we should discuss "never-before-heard". somehow i think it is not so easy to find, but it is still on the to-do list, isn't it?

Bill 2's icon

@Roman: "this algorithm is how they summoned trump"

That must be the funniest thing I've ever read on this forum. :-)

Noah's icon

I'm extremely ambivalent about this idea. I have no doubt that if it were applied to sound design, it'd generate some things beyond anything we could make ourselves. But at the same time there are two small yet very important parts to that last sentence: "some" and "make ourselves". The very fact that machine learning tends to imitate what it's been fed makes me wonder exactly how unique its sounds would be after a certain point, and the fact that it seems to fall flat before that point makes me wonder how many of the cool sounds it makes would be happy accidents rather than its intended function. Its practical applications appear to be more utilitarian than artistic. Admittedly, I don't claim to have as deep a knowledge of machine learning as I could, but knowing what I do, that's my first concern.

Secondly, though, is the "ourselves" part of things. Even if, hypothetically, there did end up being some "instant sound design" (maybe I'm misinterpreting you there? Apologies if I am) application for this, I have a hard time seeing myself using it. Part of what I love so much about sound design is personally interacting with synths and effects and samples and such, and constantly finding new things I can do with them, or, when I'm not doing that, hearing other people's work and thinking to myself "holy shit, that sounds incredible!! how did they do that?!". To me, a source of instant gratification like that would be kind of like replacing hunting for buried treasure with the lottery.

maaark's icon

@to_the_sun well, if you just search for wavenet on GitHub, you'll find some good code sources. Even I was able to get one of them to run. It was a few months ago now, so I don't remember which one I used. I wasn't able to get it running on the GPU, so it was very slow. https://github.com/search?utf8=✓&q=wavenet