BBC: The decaying web and our disappearing history
This is just one investigation, and a preliminary one at that. The figures, though, suggest a clear linear trend: the loss of just over 10% of the resources shared via social media each year, even when archiving is taken into account, or around 0.02% of this content lost every day.
Stories along these lines seem to surface at least once a year and, it seems to me, are almost accepted as truth. Curious, I decided to follow through to the original paper (you can tell I have a lot of things I should be doing when I start to check sources).
The news item is accurate in that the research is studying the decay of the web, or more accurately, the URI/URLs used to identify and link to pieces of information on the web.
However, just because a URL may no longer be valid does not mean that the information no longer exists. It may simply have moved to another URL. Is this bad practice as far as web architecture goes? Yes. Is maintaining URL persistence always the highest priority for web development projects? No.
(I am certainly guilty personally and professionally of not making URL persistence a high priority for many of my web projects and sites. Something for me to work on.)
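To make the distinction between a dead URL and lost information a little more concrete, here is a minimal Python sketch of how a link checker might interpret an HTTP response. The function name `classify_link` and its labels are hypothetical, made up for illustration; only the status codes themselves (301, 302, 303, 307, 308, 404, 410) are standard HTTP. It assumes you have already fetched the status code and any `Location` header by some other means.

```python
from typing import Optional


def classify_link(status: int, location: Optional[str] = None) -> str:
    """Roughly classify what an HTTP response tells us about a link.

    Illustrative only: the labels are invented for this sketch.
    """
    if 200 <= status < 300:
        return "ok"
    # 301 and 308 are permanent redirects: the content has a new home.
    if status in (301, 308):
        return "moved" if location else "unknown"
    # 302, 303, and 307 are temporary redirects.
    if status in (302, 303, 307):
        return "temporarily-redirected" if location else "unknown"
    # 404 and 410 only say this URL no longer resolves to anything.
    # They say nothing about whether the content lives on elsewhere.
    if status in (404, 410):
        return "missing-at-this-url"
    return "unknown"
```

Note that the last named case is deliberately labelled "missing at this URL" rather than "lost": a 404 on its own cannot tell us whether the information still exists somewhere else on the web.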
While URL persistence is a desirable goal, it isn’t necessary for the web to exist or for us to use it effectively. Web authors are constantly updating their links, and our search tools are constantly updating their indexes, and for the most part if the information still exists, we’re able to find it regardless of whether its URL has changed.
Now, there is information that is disappearing from the web, some of it in huge chunks as hosting services shut down and web sites go offline. That, IMHO, is a far more important issue than link rot. It would be interesting to see some studies that look into that phenomenon. Thankfully we have the Internet Archive (currently offline as I write this!) and groups like the Archive Team who are working to preserve that information.
To say that the web is decaying is true, but it doesn’t provide a full or accurate picture of what is happening. There is as much if not more growth than decay. The content and structures of the web are being added to and modified continually. That the web is usable at all suggests that it is for the most part stable. Given the choice, I think we should focus our efforts on creating, improving, and preserving content, and issues with the structure will work themselves out.