Quite frequently I come across scanned books that are viewable for free online. For example, the publisher put them there (such as preview chapters), a library (old books from their collection that are in public domain), etc. Since I like hoarding data, and the online viewers that are used to present the book to me might not be very practical, I frequently try to download the books one way or another. This requires toying with the “inspect element” tool and various other methods of getting the images/PDF. Now, all that I access is what is, well, accessible; I don’t hack into the servers or something. But - the stuff is meant to be hidden from the normal user. Does that act of hiding the material, no matter how primitive and easily circumvented, mean that I’m not allowed to access it at all?
I suppose ripping a public domain book is no big deal, but would books under copyright fare differently?
Mainly I’m asking out of curiosity, I don’t expect the police to come visit me for ripping a 16th century dictionary.
Note: I live in the EU, but I’d be curious to hear how this is treated elsewhere too.
Edit: I also remembered a funny trick I noticed on one site - it allows viewing PDFs on their website, but not downloading, unless you pay for the PDF. But when you load the page, even without paying, the PDF is already downloaded onto your computer and can be found in the browser cache. Is it legal to simply save the file that is already on your computer?
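For anyone curious what "already in the browser cache" looks like in practice, here is a minimal sketch. It only checks which files in a folder begin with the PDF signature `%PDF-`. The function name `find_pdfs` is made up for illustration, and it assumes each cache file starts with the raw response body. Real browser caches differ by browser and version, and entries may carry their own headers or be compressed, so treat this as an illustration rather than a reliable tool.

```python
from pathlib import Path

# Every PDF file begins with these bytes.
PDF_MAGIC = b"%PDF-"


def find_pdfs(directory):
    """Return paths under `directory` whose first bytes are the PDF signature.

    Assumption: the file content starts at byte 0 (no cache-specific header).
    """
    matches = []
    for path in sorted(Path(directory).rglob("*")):
        if not path.is_file():
            continue
        try:
            with path.open("rb") as f:
                if f.read(len(PDF_MAGIC)) == PDF_MAGIC:
                    matches.append(path)
        except OSError:
            # Unreadable entries (locked, permission denied) are skipped.
            continue
    return matches
```

Pointing `find_pdfs` at a copy of a cache folder would list candidate files. Whether keeping them is allowed is the legal question asked above, which the code does not answer.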
According to big tech, it’s OK if you’re training a large language model with it.
You’re confusing the law that applies to the ruling class with the one that applies to common people.
There’s a law for the ruling class? I always figured they gotta just cut their political buddies in.
My brain is essentially an enormous language model.
Unironically yes, you would not know who Spiderman was without viewing a copyrighted work demonstrating what he looks like, and now you understand why generative AI fundamentally has to ingest copyrighted works.
AFAIK web scraping (the act of grabbing and downloading any data you see available on the internet) isn’t illegal, and I would assume downloading PDFs provided to you online would fall under that. Since it is copyrighted it would probably be illegal to share it, though.
This. In a case involving LinkedIn, US courts ruled that it’s legal to scrape publicly available data. The company doing the scraping was selling that data to corporate customers, but ultimately what you can do might depend on the information you’re accessing and under what permissions. (Not a lawyer)
What if I web scraped something like a pirate site full of all the good media?
If you scraped a pirate site and stored a bunch of links to copyrighted content you’d probably be fine, actually using those links to download or share copyrighted content is what’s illegal. It’d be like buying the stuff to make a bomb or drugs, but then not making any bombs or drugs.
That being said, while not necessarily illegal, I wouldn’t want authorities to find my bomb and drug ingredients, or my scraped piracy links, as I’d probably have some 'splainin to do.
(Not a lawyer)
Why can I scrape the content from one place and not another?
Who said that you can’t scrape content from the other place?
If you scraped a pirate site and stored a bunch of links to copyrighted content you’d probably be fine,
If you’re referring to the last line, I say I wouldn’t want authorities to find it because I don’t want to have to explain it. I’m 99% sure someone would not just store links to a bunch of pirated content for fun, they probably have accessed said pirated content, now you have to explain to the authorities why you have links to pirated content without implicating yourself in copyright infringement.
Like I said, probably fine, I just wouldn’t want the hassle if I somehow got caught.
Why links and not video?
Sorry man, I’m not exactly sure what you’re asking.
If you are able to load the content on your computer without infringing copyright laws, you’re allowed to circumvent whatever the website has in place to store whatever data you would like from whatever website you would like, regardless of the nature of the site, so long as the content is legal (is not CP) and again not being presented in a way that infringes aforementioned copyright laws.
If you’re asking why the copyright laws exist, I can’t really help you with that one.
If you can see it, you’ve already downloaded it. You’re just choosing to retain it.
As with everything with the law, it depends.
In Australia, distribution is the illegal part, seeding/sharing is where they get you. Not the actual download itself.
It’s usually not a question of legality, but efficiency.
It’s easy and efficient to bust someone for seeding, but busting hundreds for the odd file you can prove they downloaded is expensive and takes forever.
You wouldn’t download a car…
Nooo never…
You’re right! Babies take way too much work to raise properly.
You might not.
viewable for free online
If you are viewing it on your computer, you have already downloaded it.
Don’t let anyone tell you otherwise.
already downloaded onto your computer and can be found in the browser cache
Exactly.
Ask the AI companies who scraped my sites while the media companies were DMCA-ing everything in sight and working with enforcement paid for with public funds to prosecute/persecute the “pirates”.
It’s ridiculous that Homeland Security is spending resources taking down pirate sites. That’s a department specifically created to prevent terrorism, and instead they’re operating as Pinkertons for broadcasting companies.
The laws are bullshit and shouldn’t be followed. Information should be free to all
I never said I follow the law, I’m just wondering what the law says ;)
I’d say if the copyright holder says you’re not allowed to then you’re not. It’s piracy.
People will tell you that you’ve already downloaded the data so saving it is fundamentally, technically no different, but that doesn’t matter to the law, it’s still piracy.
Like yeah, it’s absurd and pointless and anti-consumer and anti-knowledge and unenforceable and unsustainable, but that’s copyright. It’s always been that way.
Copyright destroys culture and piracy is our ethical duty in the face of that. The only reason to care about it is so you don’t get caught.
What about AI? Don’t they basically do exactly this?
Sure, and I’d say that’s piracy too. I wouldn’t mind if it wasn’t also being siloed into private hands to enrich the wealthy and screw the rest of us.
Not an expert, but in the U.S. making a copy of a broadcast for personal use is legal under fair-use. Anything that loads up on your computer screen you can make a copy and save it for personal use. So screen captures are by definition legal.
How exactly you copy the material on your screen gets tricky under the DMCA clusterfuck. Breaking encryption to copy the material is illegal unless there is a valid exception for fair use. What exactly those valid exceptions are is above my paygrade.
Laws of course differ from country to country, but generally, if it is legally publicly available, then no: at most it violates their EULA or something if you scrape such data. A company trying to prevent direct downloads cannot really charge you for finding ways around that, because from a technical point of view the data was already cached onto your PC anyway.
As a tip, use the Network tab of the browser’s developer tools (F12) instead of inspect element. For videos you may also try the absolute right click addon. It breaks the video player controls when enabled, but often you can just right click and save the video if it hasn’t timed out, and you can also bring back the regular controls via right click, show controls. Tools like JDownloader2 can also often scrape various files, but the former methods may work better.
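If the Network tab shows a long list of requests, you can export it as a HAR file (a JSON log of the requests the page made) and filter it offline. Below is a rough sketch of that idea. The function name `list_document_urls` and the choice of MIME prefixes are my own assumptions for illustration. The field names (`log`, `entries`, `request.url`, `response.content.mimeType`) follow the HAR format as I understand it.

```python
import json

# MIME type prefixes worth keeping; an illustrative choice, adjust as needed.
WANTED_PREFIXES = ("application/pdf", "image/")


def list_document_urls(har_text):
    """Return the URLs of PDF and image responses recorded in a HAR export.

    URLs are returned in the order they were requested, without duplicates.
    """
    har = json.loads(har_text)
    urls = []
    for entry in har.get("log", {}).get("entries", []):
        content = entry.get("response", {}).get("content", {})
        mime = (content.get("mimeType") or "").lower()
        url = entry.get("request", {}).get("url")
        if url and mime.startswith(WANTED_PREFIXES) and url not in urls:
            urls.append(url)
    return urls
```

You would read the exported file into a string and pass it to `list_document_urls`. The result is just a list of addresses the viewer already requested, which is the same information the Network tab shows.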
There’s also the Video DownloadHelper add-on for Firefox that will allow you to download streams that aren’t just media files your browser can fetch with a plain HTTP GET. Your browser can still access those streams, but it needs a script component to handle them, so the built-in file downloader/saver won’t even see them as something to download.
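To illustrate why a plain "save as" does not see such a stream: one common format (HLS, which is my assumption here, since the comment above does not name a protocol) serves a small text playlist that lists many short media segments, and a script fetches them one by one. A minimal sketch of reading such a playlist follows. The function name `segment_urls` is hypothetical, and it ignores nested variant playlists and encryption.

```python
from urllib.parse import urljoin


def segment_urls(playlist_text, playlist_url):
    """Return absolute URLs for the entries listed in an HLS-style playlist.

    Lines starting with '#' are tags or comments; every other non-empty line
    is a URI, resolved relative to the playlist's own address.
    """
    urls = []
    for line in playlist_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        urls.append(urljoin(playlist_url, line))
    return urls
```

Each returned URL is only a few seconds of video, which is why the browser never presents the stream as a single downloadable file.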
That one is scummy as hell.
How so?
Just check the reviews, or the permissions.
Reviews say it’s adding a giant QR code to downloaded videos to get people to pay a license fee but I do not see that after downloading something just now. Though tbf, they did update it yesterday and might have removed that because of the feedback they were getting.
Permissions look reasonable to me, based on my understanding of what they need to do for the functionality, though I suppose there is potential for abuse.
It requires a companion desktop program for some streams, which did seem sketchy at first but I wasn’t able to find any specific claims of it doing anything undesired, just people who noped out when they saw it wanted them to install something and others who said it does function as desired. Again, hard to say if it does anything in addition to enabling some streams to be downloaded, but I haven’t noticed anything out of place on my PC since installing it either from tool-based scans or manual checks of places where malware can put itself to survive restarts.
There were also claims that it didn’t work with YouTube in the reviews, but that doesn’t seem to be the case for me, since it does light up. Though maybe that was timing-based, too, where Google briefly managed to block it only for them to adjust.
So I haven’t seen any of those issues but YMMV. I’m going to keep using it but will also keep an eye on it. Either way, thanks for letting me know.
Everything on the Internet can be downloaded, copied etc
You care more than all of the ‘AI’ companies combined
Not true in the US, but commonly stated by Americans like it’s fact.
That sadly isn’t true everywhere. Here in Germany (and I suspect large parts of the EU) downloading/streaming copyrighted content without license used to be a grey area but has been completely illegal for a few years now.
Of course, VPNs are perfectly legal.
If it’s in the public domain, it’s almost certainly legal. I don’t have the general answer to your question.
Really this question shows how outdated copyright law is; in many countries it prohibits “copying”, but in the age of computers nearly all accessing of information involves “copying” it in some way.