THE LEGALITY OF DATA SCRAPING : THE REALITY AND THE MYTH
QUESTION 1: CAN I LEGALLY SCRAPE IMAGES?
Web scraping and web crawling are not illegal in themselves; what matters is how they are used. Like any other tool, a scraper can be used well or badly, and the bots that do the work are commonly called “good bots” and “bad bots” accordingly. Good bots enable, for example, search engines to index web content, price comparison services to save consumers money, and market researchers to gauge sentiment on social media. Bad bots, by contrast, fetch content from a website with the intent of using it for purposes outside the site owner’s control.
In fact, web scraping and web crawling were historically associated with well-known search engines like Google and Bing. These search engines crawl sites and index the web. Because they built trust and brought traffic and visibility back to the sites they crawled, their bots created a favorable view of web scraping. It is all about how you scrape and what you do with the data you acquire.
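As one illustration of the "how you scrape" side, a well-behaved bot can check a site's robots.txt file before fetching a page. The sketch below uses Python's standard `urllib.robotparser` module. robots.txt is not discussed elsewhere in this article, and the file contents, bot name and URLs shown are hypothetical. Passing this check is a courtesy convention only; it says nothing about copyright or a site's Terms & Conditions.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for an imaginary site. A real crawler
# would download this file from the site it intends to visit.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleBot" and both URLs are made up for illustration.
    for url in (
        "https://example.com/images/a.jpg",
        "https://example.com/private/b.jpg",
    ):
        print(url, is_allowed(ROBOTS_TXT, "ExampleBot", url))
```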
Let us consider an example to further simplify it for you:
Suppose you generally allow people to enter your residence through the main gate, but someone prefers to climb over the boundary wall instead. Would you allow that person in? Similarly, the data displayed by most websites is generally accessible to the public, and storing it on your own system for personal use is generally legal. But if you intend to pass it off as your own without the owner’s consent, in violation of the site’s ‘Terms & Conditions’, that use will be treated as illegal. The law on web scraping is not clear-cut, but unauthorized scraping can still expose you to liability on several grounds. Some of these are listed below:
- Violation of the Digital Millennium Copyright Act (DMCA)
- Violation of the Computer Fraud and Abuse Act (CFAA)
- Breach of Contract
- Copyright Infringement
- Trespassing, etc.
QUESTION 2: CAN I DISPLAY SCRAPED IMAGES?
Extracting images from the internet is often described as “screen scraping”. The question that arises is: if photos are available on the internet for free, are they free to take and use?
The answer is “NO”. Screen scraping is the act of copying information shown on a digital display so that it can be used for another purpose. Visual data can be collected from on-screen elements, such as text or images, that appear on the desktop, in an application or on a website. Screen scraping has a variety of uses, both ethical and unethical. In the world of copyright, each original image has an “author” who created it and who is the first owner of the copyright. The exception to this rule is that an image (or, indeed, any other copyright-protected work) created by an employee in the course of employment is owned by the employer. So an image has an owner, even if that owner chose to post the image online, and copying that image without the owner’s permission could infringe the owner’s copyright.
QUESTION 3: WHICH IS THE LAW THAT PROTECTS SEARCH ENGINES AND CAN WE USE IT?
Search engines are protected under the Copyright Act. Copyright is an exclusive statutory right of literary (authors, playwrights, poets, etc.), musical (composers, musicians), visual (painters, photographers, sculptors, etc.) and other artists to control the reproduction, use and disposition of their work.
Article 27 of the Universal Declaration of Human Rights (UDHR) provides as a basic right that ‘everyone has the right to the protection of the moral and material interests resulting from any scientific, literary or artistic production of which he is the author’. It thus recognizes protection against copyright infringement as a basic right. Several countries treat copyright as a basic right of property. Different countries have different copyright statutes: for example, the USA has the Copyright Act of 1976, and the UK has the Copyright, Designs and Patents Act 1988. In addition, some countries have technology-specific copyright laws, such as the US Digital Millennium Copyright Act.
Anyone who, without the authorization of the copyright owner, exercises any of the exclusive rights of a copyright owner, as granted and limited by the Copyright Act of the land, is an infringer of the copyright. Copyright infringement is determined without regard to the intent or state of mind of the infringer; “innocent infringement” is infringement nonetheless.
QUESTION 4: WHAT CREDIT DO WE NEED TO DISPLAY TO THE OWNER? CAN WE PUT A GENERIC ‘MAY BE SUBJECT TO COPYRIGHT’ MESSAGE LIKE GOOGLE IMAGES SHOWS?
WHAT CREDIT DO WE NEED TO DISPLAY TO THE OWNER?
To understand how to give credit, and the terms and conditions set for a given image, here are some commonly used terms:
Copyright is the right to “(1) distribute the work, (2) reproduce (or make copies of) the work, (3) display the work (for example, a painting that you want to allow a museum to publicly display), (4) perform the work, and (5) create Derivative Works based upon the original work”. It is often indicated with the copyright symbol ©.
Image License :
An image license specifies what rights the image owner grants to potential users. The terms below should help you understand the most commonly used licenses.
All Rights Reserved :
All Rights Reserved signals to the potential image user that they can use that specific visual work only if they have received permission directly from the image owner, who has reserved all rights in it.
Some Rights Reserved :
Some Rights Reserved means that image owners can choose, along a spectrum, which rights they reserve for their work and which they grant to users.
Creative Commons :
A Creative Commons (CC) license is a legal framework that enables a copyrighted work to be distributed under certain conditions set by the image owner.
Fair Use :
Fair use means that the copyrighted photo can be used only for educational, personal or research purposes, or if the use is beneficial to the public. A common misunderstanding is that fair use rules out all commercial uses. In practice, courts always balance “the purpose and character of use” against a number of other factors, so a “fair use” image may sometimes be allowed for commercial purposes, but you must clarify that with the image owner.
Public Domain :
Public Domain is the absence of copyright, meaning that the given work is not subject to copyright or other legal restrictions. This can happen if the copyright has expired (typically a set period after the owner’s death) or if the owner has abandoned all rights related to the work, which is called the relinquishment of copyright.
HOW TO GIVE IMAGE CREDITS?
Make sure you can use the image in the first place :
In some cases, the image owner won’t give permission to use their work in any form. In other instances, as mentioned, you might not be able to use it because the image credits are missing and you cannot contact the owner to clarify the rules, so there is simply no legally safe way to use the image and give credit.
If the image owner and the image credits allow you to use the image, take time to learn in what way you can actually use it :
Some image owners or photographers restrict usage by platform (for example, online vs. print), while others may let you use the image online but not in an advertisement.
Place the image credits adjacent to the photo :
This usually means below it or positioned somewhere along one edge.
Make the image credit noticeable and readable :
Besides placing it near the image it refers to, use a font style and size that is easy to see and understand.
Follow an image credit template :
Do so unless the template has to be modified specifically because of the image owner’s required terms and conditions.
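The idea of a reusable credit template can be sketched in a few lines of Python. The field names (`title`, `creator`, `source`, `license_name`) and the wording of the credit line are assumptions chosen for illustration; they are not a legally required format, and the image owner's own terms take precedence over any template.

```python
from typing import Optional


def format_image_credit(
    title: str,
    creator: str,
    source: str,
    license_name: Optional[str] = None,
) -> str:
    """Build a one-line image credit from a simple, hypothetical template."""
    credit = f'"{title}" by {creator}, via {source}'
    # Append the license only when one is known.
    if license_name:
        credit += f" ({license_name})"
    return credit


if __name__ == "__main__":
    # All values below are made up for illustration.
    print(format_image_credit("Sunset", "Jane Doe", "example.com", "CC BY 4.0"))
```

The resulting line would then be placed next to the image, in a readable font, as described above.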
QUESTION 5: WHO IS THE COPYRIGHT OWNER? THE WEBSITE OR THE CONTENT CREATOR (THESE ARE DIFFERENT ENTITIES IN ALL OUR CASES). HOW DO WE DETERMINE COPYRIGHT OWNERSHIP?
WHO IS THE COPYRIGHT OWNER? THE WEBSITE OR THE CONTENT CREATOR?
The first owner of copyright to a work is generally the original creator or author of the work. There are, however, some exceptions to this rule. In some countries, for example, the economic rights to a copyright work initially rest with the person/organization employing the creator. In other countries the economic rights are deemed to be automatically assigned or transferred to the employer.
HOW DO WE DETERMINE COPYRIGHT OWNERSHIP?
Figuring out who the copyright holder is can be tricky. It depends upon the type of work, who made it, who it was made for, and whether it was sold or not. Often the copyright belongs to the person who made the work, who is known as the “author”. This is true particularly for non-commercial works.
For example, if A draws a picture of B in his sketchbook, A is the copyright holder of the picture as soon as he draws it. If A rips the sketch out of the book and gives it to B, A will still own the copyright even though B owns the only copy of it in the world. Therefore, if you want to contact whoever owns a copyright, a good place to start is the person who made the work.
It’s extremely common though, particularly for commercial work, that the copyright holder is different from the person who made the work. Generally, you have to determine who owns a copyright on a case-by-case basis. The common situations are as follows:
Work for Hire
Because works that are made within the scope of a creator’s employment belong to the employer, the copyright might belong to the person or company who the work was made for.
Assignment/transfer of Copyright
The copyright holder may have sold or transferred the rights to someone else through an assignment.
No Copyright Owner
There may have been no copyright at all, such as in the case of works made by the U.S. federal government, or the copyright may have expired, in which case the work belongs to the public. This is known as the public domain.
QUESTION 6: HOW DO WE HANDLE REMOVAL REQUESTS? WHO SHOULD WE AGREE TO REMOVE AND WHO SHOULD WE DECLINE?
If your hosting service receives a DMCA takedown notice regarding your content, it will ordinarily respond by removing the complained-of material, and it will do this automatically without making any judgement about whether your content actually is infringing. However, the DMCA notice and takedown procedures provide you with protection from wrongful claims of copyright infringement. The DMCA requires your service provider to notify you promptly when it removes any of your content because of a takedown notice, and you have the right to submit a counter-notice asking that the material be put back up. There is no specific time limit for submitting a counter-notice, but you should not delay unreasonably in doing so. If you send a counter-notice, your online service provider is required to replace the disputed content unless the complaining party sues you within 14 business days of your sending the counter-notice. (Your service provider may replace the disputed material after 10 business days if the complaining party has not filed a lawsuit, but it is required to replace it within 14 business days.)
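The 10-to-14-business-day window described above can be estimated with a short calculation. The sketch below is only an approximation under stated assumptions: it counts Monday to Friday as business days, ignores public holidays, and counts from the date the counter-notice is sent. The function names are hypothetical, and the result is not a substitute for legal advice on actual deadlines.

```python
from datetime import date, timedelta
from typing import Tuple


def add_business_days(start: date, days: int) -> date:
    """Return the date `days` business days after `start`.

    Assumption: business days are Monday to Friday; holidays are ignored.
    """
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


def restoration_window(counter_notice_sent: date) -> Tuple[date, date]:
    """Estimate the earliest (10 business days) and latest (14 business days)
    dates on which removed material would be put back if no lawsuit is filed."""
    earliest = add_business_days(counter_notice_sent, 10)
    latest = add_business_days(counter_notice_sent, 14)
    return earliest, latest


if __name__ == "__main__":
    # Hypothetical example: counter-notice sent on Monday, 4 March 2024.
    earliest, latest = restoration_window(date(2024, 3, 4))
    print("Earliest restoration:", earliest.isoformat())
    print("Latest restoration:", latest.isoformat())
```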
Before you send a counter-notice, you should consider carefully whether you are in fact infringing the complaining party’s copyright. There are two reasons to do so.
- The counter notice requires you to state, under penalty of perjury, that you have a good faith belief that your material was wrongly removed. You do not want to make this claim lightly because it might come back to haunt you.
- If the complaining party has a good infringement claim, sending a counter notice may trigger a lawsuit.
If you are not prepared to stand up for your use of the copyright owner’s work in a lawsuit, you should think twice about firing back a counter-notice. That said, copyright owners sometimes send bogus takedown notices that have no basis in law or fact and are meant solely to intimidate the target. A prompt counter-notice can make these empty threats go away for good.