Stability.ai sent a take down request to Runway ML's SD v1.5 citing IP Leak (huggingface.co)
179 points by amrrs on Oct 20, 2022 | hide | past | favorite | 72 comments


Seems the CEO/co-founder of Runway has replied that there hasn't been any leak and that it's a proper release, made as permitted by the license, so it should be back online soon.

> Cris here - the CEO and Co-founder of Runway. Since our founding in 2018, we’ve been on a mission to empower anyone to create the impossible. So, we’re excited to share this newest version of Stable Diffusion so that we can continue delivering on our mission.

> This version of Stable Diffusion is a continuation of the original High-Resolution Image Synthesis with Latent Diffusion Models work that we created and published (now more commonly referred to as Stable Diffusion). Stable Diffusion is an AI model developed by Patrick Esser from Runway and Robin Rombach from LMU Munich. The research and code behind Stable Diffusion was open-sourced last year. The model was released under the CreativeML Open RAIL M License.

> We confirm there has been no breach of IP as flagged and we thank Stability AI for the compute donation to retrain the original model.

https://huggingface.co/runwayml/stable-diffusion-v1-5/discus...

Edit: According to Discord messages from Emad Mostaque (founder of Stability AI), they have withdrawn the takedown request, and it seems this is the official release of Stable Diffusion version 1.5.


Sounds to me like runway released it without consulting stability, called it “1.5” - which according to the license they’re allowed to do, but pretty scammy since emad had hyped a model with that label. And now stability is deciding to call this the official release to be nice to runway and avoid a general PR mess and community infighting.


To me it seems the other way around.

1.5 was apparently held back by Stability for weeks. Runway finally decided to just release it.

Stability requested a takedown, and here you see the Runway CEO telling Stability in no uncertain terms: "this is ours to release, we created it, it's under an OS license, you don't hold any IP rights here; all you did was provide compute"

If anything this is a pretty stern rebuke of Stability and a sign of considerable disagreement between the two parties.


Well if that’s the case, that’s still a pretty shitty thing to do on runway's part. Just be courteous to stability’s needs and keep good business relations. Weird behaviour, and I wouldn’t be surprised if in the future runway is silently excluded from before-public releases (of which there seem to be many in the years ahead).


Doesn't the "OS license" mean that Runway has permission to release it already? Ack that there might be other agreements and business relations involved though.


Release it, fine. There have been lots of fine-tuned and further-trained SD models. Just don’t call it “1.5”, which is the specific label for the model stability is training internally. Again, the license ‘permits’ them to do it, but it seems like a very bad business decision: given what their service does, runway would likely benefit hugely from early access to e.g. stability’s future text2video models, which they now likely won’t get until everyone else does (leaving someone else to possibly take market share in their field; if that’s the trade they made because they got ‘impatient’, it seems awfully unwise).


It seems very weird to me that “stability” is building a company around something they didn’t create.


Seems to me that stability AI did a pretty shitty thing and runway ran out of patience. Pretty weird to paint runway as the bad guys here.


didn't runway create the model? How could stability exclude them?


Hilarious. So a VC darling $100mil investment company is just a server farm dishing out free compute donations to get some good PR, and doesn't own anything.


This is exactly why Stability's role as a middleman in these model releases is an indefensible business position. The research underlying stable diffusion does not belong to them, so any company which wants to fund the training of the next iteration of these models can easily step in and fill their role. And since Stability itself is not pioneering this line of research (rather, they are just facilitating the training and distribution of models), they will always be lagging behind other companies which employ the researchers themselves (i.e. Runway's apparent position in this debacle). Stability's role as middleman has been so far innocuous if not generally helpful, but now that this role has a market price of $1B, companies and researchers are going to be eager to cut them out of the process and reap the rewards for themselves.


Runway is not "any company"; they were part of the research for the original stable-diffusion release and were still working with the StabilityAI team on SD. There must have been internal disagreements about the release of the 1.5 version.


Stability has a research team. They've just brought over David Ha (@hardmaru on Twitter) from Google Brain to lead strategy.


It's quite strange to have a great RS leading _strategy_


If that's what they want to work on?


This is really strange, and is being handled very badly.

RunwayML are an official partner of Stability AI. Stability has been delaying the release of their 1.5 model for several weeks now, citing unspecified "legal concerns".

RunwayML has seemingly gone ahead and released it themselves anyway, forcing Stability to file a takedown request for leaking their IP.

In the aftermath of the AUTOMATIC1111 debacle last week, this doesn't reflect well on Stability's comms. Things seem a lot less open since investors got on board...


I hadn't heard of the controversy; this seems like a good summary: https://old.reddit.com/r/StableDiffusion/comments/y1uuvj/aut...


Maybe that post has been edited into coherence, but when it was posted that summary was misleading or flat wrong about a lot of points.


https://old.reddit.com/r/OutOfTheLoop/comments/y22zg6/whats_...

This is the summary that should be read, the top post by ttopE.


I'm not sure how trustworthy that post is. I haven't read it all, but there are lots of mistakes & misnomers that make it hard to trust the rest.

> Stable Diffusion is an open source program

It's not a program; it's a model. It is also not open source: its license is "CreativeML OpenRAIL-M", which is "open-source like" or "permissive", but it does limit usage of the model.

> has seen many variations, or forks, on the original code, all typically hosted on github, a code sharing platform. The most popular and arguably the most complex of these forks is automatic1111s fork

Automatic1111 is not maintaining a fork of the model/software but rather providing a UI that uses the model. It was started from scratch by a user on 4chan, then taken up by automatic1111 and put on GitHub, where they continued working on the codebase. I think automatic1111's is the only SD UI with a public codebase that is not licensed under a permissive license (automatic1111 retains the copyright).

These are things I know for sure are not true/misnomers. Which makes statements that I'm not 100% sure about, like:

> someone on 4chan used an exploit on github to extract the unique model of Stable Diffusion that NovelAI had developed and just released on their service

less likely to also be true. Also, if there was a 0day for GitHub, don't you think someone would have used it for something more interesting than leaking an image diffusion model and some associated code? It would probably also have been all over the news, plus there'd be announcements from GitHub, but I personally haven't come across any of that. But maybe I just missed it.

Overall, I'd take everything you read about all this drama with a pinch of salt; there are a lot of misunderstandings and misinformation about the whole thing, and the fact that many of the people who "summarize" it don't seem to be actual developers doesn't help either.


You're right, although I'd call them oversimplifications rather than outright fabrications. SD is not technically open source, and the Automatic1111 "fork" should not really be called a fork. The 4chan leak of NovelAI's model is also real (although I don't know the details of how exactly it was obtained), as I've downloaded and used it before. It does quite well with anime characters.


The apology it references at the end was also fake.


What makes you say that?

The link at the end of the post by ttopE was mangled. It should have linked to [1] which links directly to the actual apology: [2]

1: https://www.reddit.com/r/StableDiffusion/comments/y34h2a/ema...

2: https://github.com/AUTOMATIC1111/stable-diffusion-webui/disc...


See this post from that thread [1]. But please, I just don't care about it anymore. I don't care what the truth is or who was lying about what, or which account is fake and which is real. Stable Diffusion has the worst tech community I've ever dealt with. I have never seen so many bullshitters, scammers and liars in one place. I've never seen a company piss away good will as fast as Stability AI. It's a top-to-bottom shit show.

1: https://www.reddit.com/r/StableDiffusion/comments/y34h2a/ema...


Ah, I see. Thanks for that.

It’s hopeless deciding who to believe when everyone involved is choosing to be economical with the truth.


It's a little confusing, but as far as I understand it, Stable Diffusion was created by a collaboration between RunwayML and CompVis (aka the Machine Vision and Learning research group at Ludwig Maximilian University of Munich), with Stability.AI funding the computing power for training and LAION.AI (also funded by Stability.AI) providing the dataset.

The first few releases of the model and code were done by CompVis, and this one by RunwayML. Beyond merely being permitted by the license, this release seems to have come from the actual developers.

I've seen speculation that it was released this way to distance Stability.AI from potential future litigation, but it feels more like an internal ultimatum deadline.


> I've seen speculation that it was released this way to distance Stability.AI from potential future litigation, but it feels more like an internal ultimatum deadline.

It could be both. A company is not necessarily a monolithic entity.


Stable Diffusion was released under the OpenRAIL license, which allows anyone to fine-tune and release their own models while complying with the terms of the original SD release. It's quite strange that an "IP leak" is cited in the takedown request.

https://huggingface.co/spaces/CompVis/stable-diffusion-licen...


Hmm, that doesn't seem like an open source license.

https://opensource.org/osd


Pedantically, iiuc, Stable Diffusion 1.3 and 1.4 were released under that license, but 1.5 has yet to be released officially. Besides a verbal promise from Emad, which isn't legally binding, there's no reason a) 1.5 ever has to be released, and b) if it is, since it's Stability AI's IP, they're free to put whatever license they want on it, including a proprietary for-money license whose revenue a public torrent leak would cut into.


1.5 is out now, so I suppose the point's moot.

edit: never mind, jumped the gun and thought it was an official release but it turns out I was just confused.

https://huggingface.co/runwayml/stable-diffusion-v1-5


That is the repo/release this takedown request was about.

Note that it is not the stabilityai repo.


I get the impression we're about to see an ocean of litigation around AI and IP. If you're not a participant, it will probably make good popcorn fodder. Though I don't doubt that there is a threat to graphic design and other creative jobs, it may not be time to switch careers quite yet.


Because there is a lot of money and fear involved I agree that there will be a lot of litigation, but the outcomes will be mixed at best for the copyright holders.

Lots of people have memorized lyrics to copyrighted songs but no one is suing them for the memorization. They are only suing when the memorizer is trying to make money off of the song without permission of the copyright holder. The burden of proof is on the copyright holder to prove that the builder of the model was intending to distribute IP that they did not own.


A lot of people have sampled literally everything in music and it is very very uncommon to get sued for creating something substantially different. I fail to see how AI is any different.

People need to realize that IP enforcement is a very leaky sieve at best, and that the very concept of intellectual property is an affront to the nature of information and ideas, particularly in a digital realm where bits are free to be copied at will

The sooner we come to this realization, and those that stand to profit from IP enforcement bullshit take their monetization elsewhere, the better off we will be as a species.


A lot of established artists (or up-and-coming artists) don't just randomly sample stuff, release it and call it a day. Usually songs go through a "sample clearing" pass where all the samples are "cleared", meaning basically licensed for use. You usually don't do that until right before release (it's not part of the conception of the song), but most popular released music has had its samples cleared before the songs come out.

If you're not familiar with the process, you can read about it here: https://medium.com/the-courtroom/the-art-of-clearing-a-sampl...


I don't think that most of Beatport's or Bandcamp's catalogue has gone through that clearance process, and I'd wager a large part of Spotify (via tunecore/etc) hasn't either

Nobody has ever given me any crap for the blatant sampling I do from old movies and video games, though I am a nobody and they are always a small (if focal) part of a much larger and original work


They probably haven't, yet ;)

But if a song (yours, or other people's songs on Beatport/Bandcamp/Spotify) becomes popular enough and it has samples you haven't cleared, expect your door to be knocked on by some of the rights holders, who are gonna ask you to pay up.

There are plenty of cases of unknown people making songs that become popular overnight and then get stuck in long litigations regarding sample usage. Obviously they're not gonna go after music makers who barely have any listeners.


The "Blurred Lines" lawsuit may be a good counterexample: Marvin Gaye's family was awarded $7.4 million because "Blurred Lines" copied the "feel" and "sound" of "Got to Give It Up". I think it's ultimately going to rest less on the details of specific examples or the nature of the underlying technology and more on how much money is involved.


It's premature to claim these "AI" models are equivalent to a human memorizing lyrics to copyrighted songs. They're much closer to XOR'ing copyrighted works so that the encoded form is not casually recognizable. I don't have sources close to hand, but I seem to recall during the 00's reading about such schemes being regularly proposed and also regularly struck down, legally.

From a high level both things consist of putting copyrighted bits in a digital blender, producing a mixture unlike the originals but which, if you prod it the right way, can be induced to reproduce some parts of the (copyrighted) input. I'm not sure calling it AI makes it (legally) different. What is the argument here - that previous schemes just didn't blend the bits enough?
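A minimal sketch of the XOR scheme described above (in Python; the byte strings are placeholders standing in for copyrighted works): blend two inputs and the result resembles neither, yet either original falls out given the other.

    # Toy "digital blender": XOR two works together. The blend looks like
    # noise on its own, but XOR-ing it with either input recovers the other.
    work_a = b"Imagine this is one copyrighted work....."
    work_b = b"...and this is a second copyrighted work."

    n = min(len(work_a), len(work_b))
    blend = bytes(x ^ y for x, y in zip(work_a[:n], work_b[:n]))

    recovered = bytes(x ^ y for x, y in zip(blend, work_b[:n]))
    assert recovered == work_a[:n]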


It isn't premature at all. Stable Diffusion took 100GB of already compressed images and turned them into a 2GB model. The XOR of those 100GB of images could never fit into 2GB.
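Back-of-the-envelope, using the figures above (a rough sketch with the parent's own numbers; the real LAION training set is far larger than 100GB, which only strengthens the argument):

    # Capacity argument with the parent comment's figures.
    training_bytes = 100 * 1024**3   # ~100 GB of already-compressed images
    model_bytes    = 2 * 1024**3     # ~2 GB of model weights
    print(model_bytes / training_bytes)  # 0.02 -- at most ~2% of the input
                                         # bytes could survive verbatim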


All this argument proves is that the model doesn't contain lossless, full-size, byte-perfect copies of all the images. And that's assuming all those 100GB of images are unique with no common data. Copying 2% of those images, or maybe 8% of them at 50% width, would still be a problem. Just the difference between high- and low-quality JPEG compression can easily make a 10x size difference. You start to see visible artifacts at those compression levels, but I wouldn't try to convince a movie studio that a badly compressed version of their movie doesn't count as copyright infringement.


But that is exactly how copyright law works. If you copy the first 10,000 bytes of a Stephen King book, compress it with zip and distribute it, there is a good chance you would lose the lawsuit if sued.

If on the other hand you copy the first, middle, and last byte of a Stephen King book and share those three bytes with someone, you would win a copyright lawsuit, because those three bytes are not unique enough to be copyrighted.


> They are only suing when the memorizer is trying to make money off of the song without permission of the copyright holder.

Granted, everything I know about copyright law I learned from a few audio tapes in my car, but I think the question is not whether you can sue an AI for absorbing information, but whether the AI's output is unique enough that you couldn't find byte strings or patterns that match source material too closely.

See the recent Twitter thread from @DocSparse for example: https://twitter.com/docsparse/status/1581461734665367554


It is just weights; there is no string of bytes.


You are correct that the "model" may be a separate piece of work with its own copyright, but that does not mean that the model does not produce copyrighted works.


I’m talking about outputs, not the model itself; the GitHub Copilot example has lots of shared bytes.


They claim to have not sent a takedown. https://i.imgur.com/ayMxTYp.png


This sits strangely with the "Stability legal team reached out to Hugging Face reverting the initial takedown request, therefore we closed this thread" note on the Hugging Face page.


Yeah suuuure, Emad


Does anyone have a good resource on Stability, their history, and relationship to Runway? I haven't followed the space closely in a couple of years but they seemingly exploded from nowhere and I'm not entirely sure what their mandate is.


tl;dr: the original Stable Diffusion was developed through a collaboration between Runway and StabilityAI.


I’d highly recommend that anyone chiming in read about the origins of Stable Diffusion before casting judgement: https://research.runwayml.com/the-research-origins-of-stable...


As I see it, this whole drama exists only because big money, its lawyers, and clueless C-level managers tasked with creating a business are getting involved.


Someone involved with Stability.ai posted this article about the "leak":

https://danieljeffries.substack.com/p/why-the-future-of-open...


> We’ve heard from regulators and the general public that we need to focus more strongly on security to ensure that we’re taking all the steps possible to make sure people don't use Stable Diffusion for illegal purposes or hurting people.

This is absurd. We don't hold off on releasing other open source code or datasets that could be used to break the law or hurt people, except where they are subject to export controls (cryptography).

I can sympathize with people worried about the danger of photorealistic AI-generated images of individuals, but (1) stable diffusion is not that good: it is quite common to see extra fingers or limbs that blur together. (2) When image generation does become good enough to be dangerous, society will adapt by holding people less responsible for images they appear in.


> stable diffusion is not that good: it is quite common to see extra fingers or limbs that blur together.

This is quite simply no longer the case. Yes, naive use of the toolset will output bad-quality images, but experienced users can get images of this quality reliably now: https://imgur.com/a/W7D7Djh (possibly nsfw: realistic vampire woman with clothes drenched in blood).

It would be easy for someone to instead generate illegal content of that quality.

This does not mean that I believe future releases should be blocked because of this fact, however. Someone can use readily available FaceApp filters to create illegal content and the application is still on the app stores; any skilled person can use Photoshop to stitch together illegal content and Adobe has yet to close down; the list goes on.


Claiming IP, eh?

Tell me where you're sourcing all those pictures and videos from, along with the license for each.


Are models that aren't overfit on something else that's copyrighted even copyrightable in the US? I know they are in the UK, and it may differ from place to place.


Oops, meant to say Stability.ai are in the UK, not to imply trained weights are copyrightable in the UK; I don't think that is known.


I'm thoroughly confused. Is this the 1.5 release everyone was waiting for, or not?


Yes, but it was unclear.


No


This is the v1.5 we were waiting for; you can test it by using the same seed locally and on DreamStudio.
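For anyone who wants to try that comparison, here's a minimal sketch using Hugging Face's diffusers library (assumes a CUDA GPU; exact pixel-for-pixel equality also depends on the sampler and hardware, so treat it as a rough check rather than a definitive test):

    import torch
    from diffusers import StableDiffusionPipeline

    # Load the weights from the runwayml repo this thread is about.
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    # A fixed seed makes the run repeatable; use the same prompt and seed
    # on DreamStudio and compare the outputs.
    gen = torch.Generator(device="cuda").manual_seed(42)
    image = pipe("an astronaut riding a horse", generator=gen).images[0]
    image.save("sd15-seed42.png")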


I wonder how this will be sorted out in the future: what if Stability comes out with a newer version on top of Runway's? I foresee a never-ending battle of version numbers, as both teams share the same trunk of work.


>Use Restrictions: You agree not to use the Model or Derivatives of the Model:

>- To generate or disseminate information for the purpose to be used for administration of justice, law enforcement, immigration or asylum processes, such as predicting an individual will commit fraud/crime commitment (e.g. by text profiling, drawing causal relationships between assertions made in documents, indiscriminate and arbitrarily-targeted use).

Ironic, isn't it? People who licensed their software forbidding its use for stopping illegal activity are now themselves accused of breaking the law.


What are the odds this has to do with the midterm elections in the US?


From heroes to villains in just over a month.


Are machine learning models even copyrightable? They are usually clearly derived from copyrighted materials, but does the training process extinguish that copyright?


That remains to be seen.

Laws and courts take their sweet time venturing into novel fields, which is good when it comes to technology.

I wouldn't be surprised if it ends up classified as a derivative work in the end, with some sort of royalty payment to registered rights holders (managed by some sort of collection society).

It'll probably differ depending on the jurisdiction.

Has anyone trained a model on music yet?



That (music) will definitely bring the rightsholders out of the woodwork.



