Web Scraping Watch: Cases Set to Clarify Application of the Computer Fraud and Abuse Act

For years, website owners have leveraged the federal Computer Fraud and Abuse Act (CFAA) as a tool to combat unauthorized scraping of data and other content from their websites. Because the circuit courts are split on the interpretation of the CFAA’s “exceeds authorized access” provision, there has long been a legal gray area around the widespread practice of web scraping, in particular whether scraping data from publicly accessible websites can give rise to liability under the CFAA. A set of closely watched cases before the Supreme Court and the Ninth Circuit, however, may soon offer some long-awaited clarification on the reach of the CFAA to web scraping.

Web scraping refers to the use of automated bots to extract large amounts of data from websites at a rate not humanly possible. It is now a common practice in many industries, used to collect data for a wide array of purposes, including research, web indexing, price comparison, gathering real estate listings, and weather data monitoring, among countless others. Although web scraping has become widespread, many website owners prohibit the use of bots to scrape data from their sites, typically in their terms of use. Many also take measures to restrict scraping, such as implementing methods of detecting and blocking bots or sending cease-and-desist letters.

The CFAA, while primarily a criminal anti-hacking statute, has been the main legal mechanism by which website owners have attempted to stop unwanted scraping of their sites. The CFAA provides a number of causes of action related to unauthorized access to or breach of computer systems. With respect to web scraping, the statute prohibits one who “intentionally accesses a computer without authorization or exceeds authorized access” with intent to obtain information, further a fraud, or damage the computer or its data.

Most cases involving CFAA claims against web scraping have focused on whether the defendant was authorized to access the website, whether the defendant exceeded its authorization, or whether the defendant consciously intended to cause damage to the website. In most instances, the data being scraped is publicly available, but the web scraper’s activities exceed what a court finds the website owner would expect or be willing to tolerate. For instance, some courts have found that a web scraper violated the CFAA when the website owner expressly revoked the scraper’s authorization to access and scrape its website, such as by sending a cease-and-desist letter and terminating the scraper’s access, yet the scraper persisted. In those cases, the scraper typically continued scraping through a proxy network or a third-party vendor, or circumvented the website’s security protections, such as by scraping behind a login.

Meanwhile, some courts have held that web scraping from a publicly accessible website cannot violate the CFAA without some express revocation of access to the site. While that position has not been fully adopted among the circuits, two important cases may soon resolve the split and clarify the issue.

In June 2021, the Supreme Court decided a case that, while not directly on point, offers some additional clarity as to whether web scraping from publicly accessible websites violates the CFAA. In Van Buren v. United States, the Court held by a vote of 6‐3 that the CFAA does not cover situations in which the defendant was authorized to access information on a computer system, but did so for an improper purpose. The case involved a police sergeant who had used a law enforcement database, which he was authorized to access as part of his job, to obtain information for a personal purpose outside the scope of his job. The Court found that the defendant had authority to access the database at issue, such that he could not be convicted under the CFAA for misusing that access. More specifically, the Court found that “an individual ‘exceeds authorized access’ when he accesses a computer with authorization but then obtains information located in particular areas of the computer‐such as files, folders, or databases‐that are off limits to him.” In other words, because the defendant was permitted to access the information, he did not exceed his authorized access under the CFAA, regardless of whether he used the information for an improper purpose. The outcome would have been different if he had accessed restricted areas of the database to obtain the information.

Though the case involved different circumstances, the Court’s ruling reinforces the position held by some circuit courts that the CFAA’s “exceeds authorized access” provision does not apply to web scraping from publicly accessible websites, even if such activity is prohibited by the website’s terms of use. What the Court did not clarify, however, is whether such activity could still violate the CFAA’s “intentionally access a computer without authorization” provision. If it could, the further question is what qualifies as revoking authorization to access a public website: whether the website must affirmatively restrict access to the site through technological barriers, or whether sending a cease-and-desist letter is sufficient.

Meanwhile, in another high-profile case directly involving the application of the CFAA to web scraping, hiQ Labs, Inc. v. LinkedIn Corp., the Supreme Court recently granted LinkedIn’s petition for certiorari, vacating the Ninth Circuit’s 2019 opinion. In hiQ, LinkedIn had sent a cease-and-desist letter to hiQ, which scraped and used data from publicly available profiles on LinkedIn’s website. HiQ sought a preliminary injunction barring LinkedIn from restricting its access to the website. The district court granted the injunction, finding that applying the CFAA to the access of websites open to the public would have sweeping consequences. The Ninth Circuit then affirmed the injunction, finding that scraping publicly accessible data does not constitute accessing a computer without authorization or exceeding authorized access in violation of the CFAA. Following its ruling in Van Buren, the Supreme Court remanded the case to the Ninth Circuit for further consideration in light of that opinion, and a hearing is scheduled for October 18, 2021.

Now that the Supreme Court has granted LinkedIn’s petition for certiorari in hiQ and remanded the case, it seems likely that the Ninth Circuit, potentially followed by the Supreme Court, will eventually provide some clarification on two questions: whether the CFAA’s “without authorization” provision can be used to restrict web scraping from a publicly available website after a cease-and-desist letter has been sent to the scraper, or whether the CFAA can only be used to prevent scraping when the scraper has circumvented some technological or code-based measure.

Regardless of the outcome in hiQ, there are a number of other potential legal claims website owners may have against unauthorized web scraping, including for copyright infringement, trespass to chattels, and misappropriation, such that there remain significant legal risks with certain forms of web scraping even if the courts ultimately restrict the application of the CFAA.
