Fact-checking with archives: finding deleted or modified information
When a news item disappears from a website or a post on a social network is deleted, the information is not necessarily lost forever. Web archives store digital “snapshots” of pages, which let you reconstruct the chronology of changes, find deleted data, and check the original version of the content.
Web archives are digital repositories that regularly save copies of web pages. With their help, you can access deleted information or see how a site looked at a specific point in time, before it was changed. Which repository or service to use depends on the kind of data you are looking for, since each takes its own approach to saving and restoring data.
Web archives rely on automated programs (crawlers) that visit websites, analyze their contents, and store copies of pages on servers. Each saved copy captures the state of the page at the time of scanning, including HTML code, images, styles, and scripts. To optimize the process, algorithms prioritize sites according to their popularity and how often they change. The data is archived as snapshots, which allow you to restore past versions. Such services can store billions of pages thanks to compression and distributed server networks that keep access fast.
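The idea of dated snapshots can be illustrated with a minimal sketch. It is not how any real archive is implemented: the `Snapshot` and `PageHistory` names are hypothetical, the store lives in memory, captures are assumed to arrive in chronological order, and skipping unchanged content is an assumption of the sketch rather than something described above.

```python
import bisect
import hashlib
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Snapshot:
    captured_at: datetime  # when the crawler saw the page
    content: str           # page content at that moment
    digest: str            # SHA-256 of the content, used to detect changes


@dataclass
class PageHistory:
    """Hypothetical in-memory history of one URL, oldest snapshot first."""
    snapshots: list = field(default_factory=list)

    def capture(self, content: str, captured_at: datetime) -> bool:
        """Store a snapshot; return False if nothing changed since the last one."""
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        if self.snapshots and self.snapshots[-1].digest == digest:
            return False
        self.snapshots.append(Snapshot(captured_at, content, digest))
        return True

    def as_of(self, moment: datetime):
        """Return the latest snapshot taken at or before `moment`, or None."""
        times = [s.captured_at for s in self.snapshots]
        index = bisect.bisect_right(times, moment)
        return self.snapshots[index - 1] if index else None
```

Calling `as_of` with a date between two captures returns the earlier one, which mirrors how picking a date in an archive shows the page as it stood at that time.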
Working with saved copies of pages can reveal many useful details:
- Confirm that a page was indexed. The presence of a snapshot in the search engine cache directly confirms that a robot scanned the page.
- Identify the indexed version of the content. By comparing a copy with the current page, you can determine exactly which version of the text the search engine is ranking.
- Establish the chronology of changes. Comparing snapshots from different dates shows which content was changed or deleted and when it happened.
- Recover lost data. Copies serve as an evidence base and a source for restoring information if the site has become unavailable (for example, due to domain expiration).
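Comparing an archived copy with the current page can be partly automated. The sketch below uses Python's standard `difflib` module to list the lines that were removed and added between two versions of a text. The function name `changed_lines` and the sample texts are illustrative assumptions, and the text of each version is assumed to have already been extracted from the page.

```python
import difflib


def changed_lines(old_text: str, new_text: str):
    """Return (removed, added): lines only in the old text and only in the new one."""
    removed, added = [], []
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    for line in diff:
        if line.startswith("- "):    # line present only in the archived copy
            removed.append(line[2:])
        elif line.startswith("+ "):  # line present only in the current page
            added.append(line[2:])
        # lines starting with "  " are unchanged; "? " lines are ndiff hints
    return removed, added


archived = "Minister says budget will rise.\nPublished at 10:00."
current = "Minister says budget may rise.\nPublished at 10:00."
removed, added = changed_lines(archived, current)
print("Removed:", removed)
print("Added:", added)
```

A line-level diff is a deliberately simple choice: it is enough to spot an edited sentence, though heavily reformatted pages may need the text normalized first.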
Services and their application
- Analysis of the history of site changes
If your task is to analyze an old website, recover lost data, or study the history of content changes, a service such as the Wayback Machine will be useful. Enter the URL of the page in the search bar, then select a date from the calendar to open the version of the page saved on that day.
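The same lookup can be scripted. The sketch below builds two kinds of Wayback Machine URLs: a query to the public availability endpoint, which reports the snapshot closest to a given date, and a direct replay link of the form `web.archive.org/web/<timestamp>/<url>`. The function names are hypothetical, the JSON layout handled by `closest_snapshot` reflects the documented response format as best understood here and should be checked against the current documentation, and no network request is made in the sketch itself.

```python
import json
from datetime import datetime
from urllib.parse import urlencode

WAYBACK_AVAILABILITY = "https://archive.org/wayback/available"
WAYBACK_REPLAY = "https://web.archive.org/web"
TIMESTAMP_FORMAT = "%Y%m%d%H%M%S"  # 14-digit timestamp: YYYYMMDDhhmmss


def availability_query(page_url: str, when: datetime) -> str:
    """Build the URL that asks for the snapshot closest to `when`."""
    params = {"url": page_url, "timestamp": when.strftime(TIMESTAMP_FORMAT)}
    return f"{WAYBACK_AVAILABILITY}?{urlencode(params)}"


def replay_url(page_url: str, when: datetime) -> str:
    """Build a link that opens the archived copy nearest to `when`."""
    return f"{WAYBACK_REPLAY}/{when.strftime(TIMESTAMP_FORMAT)}/{page_url}"


def closest_snapshot(response_text: str):
    """Extract (snapshot_url, timestamp) from an availability response, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["url"], closest["timestamp"]


print(availability_query("example.com", datetime(2020, 1, 15)))
print(replay_url("https://example.com/news", datetime(2020, 1, 15, 12, 30)))
```

To run a real query, the URL from `availability_query` could be fetched with `urllib.request.urlopen` and the response body passed to `closest_snapshot`.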
- Archiving dynamic and protected content
Another effective tool is Archive.today. Unlike the Wayback Machine, it saves content “on demand” rather than relying on regular scanning. It is especially useful for dynamic websites, because it stores a static snapshot of the page as it looked at the moment of capture. Notably, the service can capture even pages that are protected from automatic scanners.
The tool is also ideal for saving social media pages, news portals, and other resources where content is frequently updated or deleted.
- Access to the search engine cache
When verifying information, you often need to check whether a web page was modified after an event, or to view the original version of a publication that has since been edited or deleted. The search engine cache helps here: it is a “snapshot” of the page as it was when the search robot last visited.
Google and Bing have removed the option to view cached versions of pages from their search engines. The Yandex search engine, however, provides this feature. There are two ways to view the cached version of a page:
1. Using Yandex search: find the page in the search results, click on the three dots next to the URL and select “Saved copy”.
2. Using the CacheView service: enter the URL of the desired page on the service's site and click “View cache in Yandex”. If Yandex has indexed the page, you will see its cached version.
- View deleted posts
For a fact-checker, the deletion of a post is not the end of the story but an important part of it. Analyzing the reasons for deletion (for example, a public figure hastily deleting their own statement) often helps to establish context and motives. Because information persists on the internet, you can use the Telemetr or TGStat services to access deleted content on Telegram. Enter the channel name in the search, go to the “View posts” section, and uncheck “Hide deleted”. Then find the desired publication in the list: a note under the date indicates that the post has been deleted or changed. To see its contents, click on the date.
It is important to remember that using third-party services and applications may not be safe. Take care to protect your personal data.