Following up on our earlier post regarding the Era of Hashtag Surveillance, we note that the FBI has published documents indicating that it intends to enter into a deal with a Twitter data miner, appropriately named Dataminr (and partially owned by Twitter), for access to its monitoring technology. TechCrunch reports that the FBI disclosed its intent to enter into a licensing agreement with Dataminr for access to Twitter’s “firehose” data stream. Unlike the normal data streams that Twitter makes available to the public, which provide access to only a fraction of the posts made to the site, the “firehose” stream contains all public posts made on Twitter and would essentially allow a user to search, in near real time, every post made to the service.
Earlier this year, the CIA reportedly used Twitter’s unfiltered data stream in a similar fashion, until Twitter rescinded the CIA’s access over concerns that the public’s use of Twitter would slow if users knew that the CIA was monitoring their posts. According to TechCrunch, however, Dataminr has stated that the service it is providing to the FBI is different from what it provided to the CIA.
Officials at Dataminr have stated that the FBI would have “a limited version of [Dataminr’s] breaking news alerting product.” This seems to indicate that the license provided to the FBI would be restricted in some manner, but the statement does not say what those restrictions are. The FBI’s disclosure also noted that it had identified two other government contracts with Dataminr: one with the Department of Homeland Security’s Transportation Security Administration and one with the Department of Defense.
As social media has become an integral part of how we interact with our friends and community, government agencies are recognizing the potential of these same platforms to aid in the execution of their respective mandates. That in turn raises plenty of questions, the answers to which are being, or soon will be, decided in the courts. When it comes to data streams and unfiltered “firehoses,” who gets access, how complete is that access, and how can such access be balanced against the privacy rights of individuals? While those questions remain unanswered, the firehose flows on.