A targeted hack on the Internet Archive has threatened billions of archived web pages and a comprehensive digital history of the globe.

The website remained offline on Friday after its founder confirmed a major cyber attack that also exposed millions of users' data.

But what is it and why is it so important?


What is the Internet Archive?

The online archive of web pages, images, historical documents and books was originally set up in 1996 by Brewster Kahle, a US IT specialist.

Based in San Francisco and set up as a nonprofit, the archive operates a tool called the Wayback Machine, which takes snapshots of web pages and saves them in case they are later altered or deleted.

Professor George Buchanan, the deputy dean of RMIT's School of Computing Technologies, called it an "internet time machine" for its ability to show users things like what the White House website looked like in 1995 or other important historical records — a crucial resource for fact-checkers, researchers and journalists.

"The internet has no memory, there's no undo on that," Dr Buchanan said.

"The whole point of the Internet Archive is to time-travel back," he continued, listing musical archives, knitting patterns and family genealogies as other ways people make use of the digital library.


What happened in the hack?

Mr Kahle, the Internet Archive's founder and digital librarian, acknowledged that a series of distributed denial-of-service (DDoS) attacks had been aimed at disrupting the archive's website and servers since Tuesday.

The assault led to the "defacement of our website" and a breach of usernames, emails and passwords, he wrote on X on Wednesday.



In a new post hours later, Mr Kahle said the attackers had returned, knocking down both the Internet Archive's main site and its "Open Library," an open-source catalogue of digitised books.

The Internet Archive's data "has not been corrupted," he wrote in a subsequent post.

On Wednesday, users reported a pop-up message claiming the site had been hacked and the data of 31 million accounts breached.

"Have you ever felt like the Internet Archive runs on sticks and is constantly on the verge of suffering a catastrophic security breach?" said the pop-up, apparently posted by the hackers.

"It just happened. See 31 million of you on HIBP!"

HIBP refers to "Have I Been Pwned," a site that allows users to check whether their emails and passwords have been leaked in data breaches.

In another post on X, HIBP confirmed that 31 million records from the Internet Archive had been stolen, including email addresses, screen names and passwords.

A hacker group called "SN_BLACKMETA" claimed responsibility, saying it had targeted the archive "because [it] belongs to the USA" and linking the attack to the US government's alliance with and support of Israel.

The Internet Archive is not owned by the US government and has no ties to Israel.

"They're probably doing it more for the shock value and visibility of the story," Dr Buchanan said.


What would it mean if the archive was gone?

While most libraries operate digital archives that capture some of our online history, there are vast expanses of the internet that are recorded nowhere except the Internet Archive.

"There's hundreds of things where for any of us those things won't matter, but there will be someone for whom it does matter," Dr Buchanan said.

"It is very literally irreplaceable," he continued.

"The cost of running it every year is significant and there's no alternative available because of the technical expertise that's needed to develop that system."

Digitised versions of local newspapers, or crucial histories such as those of the early #MeToo movement's writers, who used blogs or Tumblr, could also be lost if the Internet Archive's data were deleted, said Dana Mckay, the associate dean at RMIT's School of Computing Technologies.

For now, the archive remains offline with the Wayback Machine and Open Library inaccessible, but the site's operators said services would be restored "as quickly and safely as possible".

Users across social media were quick to mourn the service's disruption.