A long time ago, on what.cd, I spent some time uploading rare and odd CDs I found at the library in Austin, TX. For a while, I consistently had only one downloader: a user named Librarian. I looked up their IP address while it was connected to my seedbox, and it belonged to the Internet Archive. I suspect they have more content than they let on: that they are sitting on a trove of content while waiting for copyright expiration or reform.
If I were Kahle, on my death bed I would just upload the entire archive to some IPFS or torrent site and just let it all be free.
Side note: libgen is still around and it’s also another pure instance of what the internet should be.
> I suspect they have more content than they let on: that they are sitting on a trove of content while waiting for copyright expiration or reform.
This is one of the things that their official status as a "library" explicitly allows them to do, so yes, I think they have even acknowledged that. When stuff goes into the public domain, the Internet Archive will release it.
If I'm remembering it correctly, the user name was "Archivist". As far as I know, most of the metadata for the albums that were listed on What is publicly visible on archive.org.
The IA is one of the purer forces for good on the internet. It's definitely worth your donations. I do wonder, though, how well-filled the niche is for providers for the content that can't be on IA, specifically that which has been requested to be taken down. Quite a lot of the "wild west" era internet content both ranks highly on things I personally want preserved and is quite likely to have takedown requests.
It'd be pretty cool if someone could organize an "Internet Wild West" archive specifically for that kind of content and host it using torrents and I2P so that it can't be easily challenged. Yeah, there's probably a lot of problems with that, but there's nevertheless parts of the Before Time that I'd like preserved.
This is why I download and archive locally things I enjoy repeatedly and find significant. If I remember something I once loved or laughed hysterically at back in the old days, I use yt-dlp to download it because I know someday it'll get erased and a shrinking minority of people will even remember what it was.
To me, it's about personal culture. There are videos and other online content that became an institution in my young brain, but inevitably faded into irrelevance, eventually disappearing entirely in some cases. If I remember a piece of media but find that it's been lost for good, it's like I've lost a small part of myself.
In any case, I definitely recommend creating a personal archive. I agree that the Internet Archive is an amazing thing and that we are blessed to have it, but clearly not everything has an obvious place on it and I think it's only a matter of time before the Cathedral, if you will, recognizes its existence and subverts or destroys it.
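For anyone who wants to make the yt-dlp habit described above repeatable, here is a minimal sketch in Python. It only wraps the yt-dlp command line; the function names, the folder layout, and the `downloaded.txt` file name are my own assumptions, not anything from this thread or from yt-dlp itself. The flags used (`--download-archive`, `--write-info-json`, `--write-thumbnail`, `-o`) are standard yt-dlp options.

```python
import shutil
import subprocess
from pathlib import Path


def build_ytdlp_command(url: str, archive_dir: Path) -> list[str]:
    """Build a yt-dlp command line that saves a video plus its metadata."""
    # Hypothetical layout: one folder per uploader, video ID kept in the name.
    template = archive_dir / "%(uploader)s" / "%(upload_date)s - %(title)s [%(id)s].%(ext)s"
    return [
        "yt-dlp",
        # Record downloaded video IDs so re-runs skip what is already saved.
        "--download-archive", str(archive_dir / "downloaded.txt"),
        # Keep the description, uploader, dates, etc. next to the video file.
        "--write-info-json",
        "--write-thumbnail",
        "-o", str(template),
        url,
    ]


def archive_video(url: str, archive_dir: Path) -> int:
    """Run yt-dlp if it is installed and return its exit code."""
    if shutil.which("yt-dlp") is None:
        raise RuntimeError("yt-dlp is not on PATH")
    archive_dir.mkdir(parents=True, exist_ok=True)
    return subprocess.run(build_ytdlp_command(url, archive_dir), check=False).returncode
```

Calling `archive_video(url, Path("personal-archive"))` for each thing you want to keep should be enough; saving the info JSON means the title and description survive even if the original page later disappears.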
> If I remember a piece of media but find that it's been lost for good, it's like I've lost a small part of myself.
I could not have phrased it more perfectly myself. Those things are a part of us. Back in the day your preserved memories were a journal, a book, a photograph, a videotape. Now they live in the cloud and all it takes to lose part of it is for a creator to decide that content doesn't suit their brand anymore and erase it. Just going through your YouTube Liked Videos list and seeing how many are no longer available should be enough to send a cold chill down your spine.
If I like it, I save it. There are some things I suspect I may have the only existing copy of. I wish someone would come up with a site archiver like HTTrack, but built for archiving modern websites that are full of content not hosted directly on the site's server.
YouTube's search experience became so bad that I can't find videos that I watched 10 years or more ago. And I can't tell whether it's because their search algorithm & UI went downhill or because the videos have been removed.
I'm pretty sure it's all of those things combined. Their search has become heavily weighted towards mainstream media, so if a video has even a single keyword that can be related to a current event or important person, it may be impossible to find because CNN, Vox, and Jimmy Kimmel Live dominate most of the results.
The search feature also seems to return less in general, and has become geared towards suggesting content below the fold. You can get it to keep returning more that's related to the search query by just continuing to scroll, but it runs out remarkably quickly.
So yeah, if you like it, download it. I don't care what anyone says about copyright. Unless it's something you know will be around, it can disappear for good.
the-eye.eu is supposed to be that, but it appears to be centralized. I can't remember if they had any decentralized file storage abilities. It's currently mostly down and being restored from backups or something.
I have a disorder where I go through phases of compulsively scanning and cleaning up old books and booklets that are long out of print but which, for one reason or another, made an impression on my younger self. I have uploaded those to the Internet Archive so that they might be more likely to outlive me.
For those who don’t know, if you don’t need the book or media back, you can mail the artifact to the Internet Archive’s San Francisco address and it will be added to their catalog as processing capacity permits. I’m sending them a family member’s extensive LP collection when they pass, for example.
Brewster Kahle is a fascinating human being. Before the Internet Archive, he created and sold Alexa.com to Amazon, and sold a different company to AOL. He had a similar start to many current serial entrepreneurs or VC folks, but he chose a different path.
I'm generally a big fan of the IA, but lately they've gotten lost on one big issue and this could hurt them. They tried to give away digital books still under copyright when the pandemic began. They seemed to think that this was somehow helping people who couldn't get to a physical library, but this was kind of a ruse. First, many bookstores were still open online and they were desperate for orders to stay in business. Second, many libraries were also operating, albeit under COVID rules.
When they started this plan, I wrote my friends at the IA and begged them not to go down this route. They answered with lots of rhetoric about how they were somehow helping and seemed oblivious to the claims of the authors and the libraries.
So I'm not surprised that they're being sued. They went down this path willfully.
The books were not given away, they were lent, with significant DRM-enforced restrictions, whilst libraries were closed. What they did was allow multiple users to concurrently borrow a single work in some cases, where previously this wouldn't have been allowed. But they also had the support of multiple physical libraries while doing this, so there's that too.
The books were not DRM-protected in any meaningful way. They could not have been, given that they were viewable using open source browsers without proprietary plugins.
I remember specifically testing this by borrowing a book, flipping through it, and seeing that the image URLs were all very easily accessible in Firefox's developer tools.
> So I'm not surprised that they're being sued. They went down this path willfully.
That's right. And when compelled to remove copyrighted works by living authors, Kahle made claims equating the lawsuits with "digital book burnings."
If Ray Bradbury were still alive, he would likely have rejected this claim. The Science Fiction and Fantasy Writers of America (SFWA), which Bradbury was long associated with, and which many major writers still belong to, has been critical of the IA's actions from the beginning:
> The Internet Archive justified these actions based on an unproven and dubious legal argument called “Controlled Digital Lending” which supposedly would allow the Archive to make and distribute a single digital copy of a donated physical book in their storehouse as long as they “control” its distribution. It was and is SFWA’s understanding that this is not library lending, but direct infringement of authors’ copyrights. As if this wasn’t bad enough, using the Coronavirus pandemic as an excuse, the Archive has created the “National Emergency Library” and removed virtually all controls from the digital copies so that they can be viewed and downloaded by an unlimited number of readers. The uncontrolled distribution of copyrighted material is an additional blow to authors who are already facing long-term disruption of their income because of the pandemic. Uncontrolled Digital Lending lacks any legal argument or justification.
Idk I like the idea of controlled digital lending. There was a really niche topic book (the seminal work on the expulsion of Zainichi Koreans from Japan to the DPRK post-WWII) that I had been trying to find for a while. It was out of print and fetching 100 plus dollars on the resale market and I couldn't find a library that had it. Luckily IA had a single copy that they are lending out to one user at a time via a DRM applet on their site. I have to check it out every hour in order to read it but I'm super grateful.
I wonder if they were always planning to bring the concept of CDL to court and this was a good way to provoke that while not making it too easy to be painted as feckless pirates by the plaintiffs (they were just trying to help in a national crisis, you know?).
CDL seems a common-sense process, inasmuch as copyright can be said to be sensible. If I own a book, I can show it to others, even electronically, but only to one person at a time, and I don't look at it while showing it, so I'm not actually increasing the copies in circulation. Most normal (as in, not familiar with IP law) people would probably agree that that seems a reasonable way for a library to operate, since that's fundamentally how normal libraries do operate (except for the electronic bit). Those weird DVD-ripping jukeboxes that locked the data and/or discs down are similar too, and they were declared legal.
However, making sense is not required under the law: the concept has (apparently, according to the plaintiffs) never been actually legally tested. It is in the IA's interest to actually test it, as getting it legally proven to be a valid way to run a library would be revolutionary: any library could suddenly lend their legally-held holdings digitally without legal clouds around them. It would also put the IA's massive CDL system on a solid footing.
Meanwhile, proving it illegal would raise difficult questions over what it means to own even a physical item. As a form of accelerationism, that can work to the IA's advantage too, by shining a light on the fact that you, yes you, actually do not really own a book in that it's illegal to lend it in certain ways.
Either way, the outcome will be very interesting, though it'll be far better if CDL comes out as allowed.
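The "one copy owned, one reader at a time" rule described above is simple enough to sketch in a few lines of Python. This is only a toy illustration of the idea, not how the Internet Archive's lending system is actually built; the class and method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LendingDesk:
    """Toy model of controlled digital lending for a single title."""

    copies_owned: int
    borrowers: set[str] = field(default_factory=set)

    def checkout(self, reader: str) -> bool:
        """Lend a copy only if fewer copies are out than are owned."""
        if reader in self.borrowers:
            return True  # this reader already holds a loan
        if len(self.borrowers) >= self.copies_owned:
            return False  # every owned copy is already on loan
        self.borrowers.add(reader)
        return True

    def give_back(self, reader: str) -> None:
        """Return a copy so that someone else can borrow it."""
        self.borrowers.discard(reader)


desk = LendingDesk(copies_owned=1)
first = desk.checkout("alice")   # the only copy goes out
second = desk.checkout("bob")    # refused while alice has it
desk.give_back("alice")
third = desk.checkout("bob")     # allowed once the copy is back
```

The point of the sketch is that the number of simultaneous readers never exceeds `copies_owned`, which is the whole of the "controlled" part; removing that check is roughly what lifting the waitlists amounted to.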
That issue isn't nearly as straightforward as you are trying to present it.
> First, many bookstores were still open online and they were desperate for orders to stay in business.
Bookstores were not an option for all the people who lost their job during the pandemic.
> Second, many libraries were also operating, albeit under COVID rules.
And many libraries were closed, leaving people with no access to affordable reading material.
> They tried to give away digital books still under copyright when the pandemic began.
This doesn't seem accurate at all. The IA expanded their existing lending program and removed wait limits, they didn't start "giving away books".
The ongoing efforts to hobble libraries' ability to lend eBooks are a significant factor in this.
While I will buy that this expanded lending program broke the crappy copyright laws we are stuck with, I wholeheartedly support this IA decision.
If these industry associations didn't have their head so far up their ass, they would have realized what a wonderful opportunity the pandemic was to boost book readership and their overall market, rather than trying to sue the IA because they want to wring every last penny out of the crisis.
The law is in large part a creation of the interests it serves; in the case of copyright, that means publishers.
Changing that regime requires challenging the assumptions and spotlighting the harms and costs. Which is what IA undertook. It's an activist cause, yes, and one aligned with the Archive's interests and mission.
To that extent, IA's action is entirely commendable.
It's not in the Archive's interest to get sued out of existence. Breaking the law isn't just "challenging" something: they'll lose in court and set a bad precedent for future challenges.
Maybe it should separate its activism from its legal, and important, archiving mission, rather than throw the baby out with the bathwater.
It's not controlling preservation, but it is involved:
“Through our partnership with Cloudflare, we are learning about, and archiving, web pages we might not have otherwise known about, and by integrating with Cloudflare’s Always Online service, archives of those pages are available to people trying to access them if they become unavailable via the live web,” said Mark Graham, Director of the Wayback Machine at the Internet Archive.
The Internet Archive will continue to rely on its own infrastructure to perform crawls. You can take corporate handouts without becoming beholden to the corporation.