Importing from old archive html files?


Tim Wheatley
08-14-2020, 04:51 PM
I ran a site 20 years ago that was lost, and I recently acquired the entire archive, or at any rate everything that is on archive.org. It's a ton of posts, attachments, etc.

Is there any way at all to import from the archive (for example index.php?t-125231.html or whatever) into a real post in a database? Obviously expecting some data loss, non-recovery of users, etc. But is ANYTHING possible?

If not I will likely just upload the html archive and let search engines crawl it, maybe remove the links to the 'real' forum that no longer exists.

Dave
08-14-2020, 05:00 PM
If you have the .html files in a relatively organized structure and format then it's possible by creating a custom PHP script that iterates over all the files.

This PHP script should utilize either the DOMDocument PHP class or regular expressions to get the content and to insert it properly into a database.
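
As a rough illustration of that approach, here is a minimal sketch. It is written in Python rather than PHP, with the standard-library `html.parser` standing in for DOMDocument and SQLite standing in for the real database. The class names `username`, `date` and `posttext`, the `posts` table and every function name below are assumptions made for the example, so check them against the actual archive markup before relying on any of it.

```python
import re
import sqlite3
from html.parser import HTMLParser
from pathlib import Path


class ArchivePostParser(HTMLParser):
    """Collect (username, date, text) tuples from one archive page.

    Assumed (hypothetical) markup: each post has <div class="username">,
    <div class="date"> and <div class="posttext">, in that order.
    """

    FIELDS = ("username", "date", "posttext")

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.posts = []      # finished posts
        self._current = {}   # fields of the post being read
        self._field = None   # field currently being captured
        self._depth = 0      # <div> nesting depth inside that field

    def handle_starttag(self, tag, attrs):
        if self._field is not None:
            if tag == "div":
                self._depth += 1
            elif tag == "br":
                self._current[self._field] += "\n"  # keep line breaks
            return
        if tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            for name in self.FIELDS:
                if name in classes:
                    self._field, self._depth = name, 1
                    self._current[name] = ""

    def handle_endtag(self, tag):
        if self._field is None or tag != "div":
            return
        self._depth -= 1
        if self._depth == 0:
            finished, self._field = self._field, None
            if finished == "posttext":  # the post body closes a post
                self.posts.append(
                    tuple(self._current.get(f, "").strip() for f in self.FIELDS)
                )
                self._current = {}

    def handle_data(self, data):
        if self._field is not None:
            self._current[self._field] += data


def create_schema(conn):
    # Hypothetical target table; a real forum schema will differ.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        "id INTEGER PRIMARY KEY, thread_id INTEGER, "
        "username TEXT, posted_at TEXT, body TEXT)"
    )


def thread_id_from_name(filename):
    # File names are assumed to look like "index.php?t-125231.html".
    match = re.search(r"t-(\d+)", filename)
    return int(match.group(1)) if match else None


def import_page(html, thread_id, conn):
    parser = ArchivePostParser()
    parser.feed(html)
    parser.close()
    conn.executemany(
        "INSERT INTO posts (thread_id, username, posted_at, body) "
        "VALUES (?, ?, ?, ?)",
        [(thread_id, user, date, body) for user, date, body in parser.posts],
    )
    return len(parser.posts)


def import_directory(root, conn):
    # Iterate over every saved thread page in one folder.
    total = 0
    for path in sorted(Path(root).glob("*.html")):
        thread_id = thread_id_from_name(path.name)
        if thread_id is not None:
            html = path.read_text(encoding="utf-8", errors="replace")
            total += import_page(html, thread_id, conn)
    conn.commit()
    return total
```

Posts land with the author name as plain text only, which matches the expectation that user accounts cannot be recovered. Dates are stored as the raw strings from the page; converting them to timestamps would be a separate step.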

Tim Wheatley
08-15-2020, 01:09 AM
Thanks very much for the reply. I'm really interested to hear it may be possible. The backups were downloaded from archivarix_com, so the formatting is quite nice I think. I'll see what I can do in terms of using find/replace and uploading it as an archive at this point...
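
For the find/replace route, something along these lines might do for stripping links to the forum that no longer exists while keeping the link text. It is only a sketch in Python: `forum.example.com` is a placeholder for the old forum's hostname, the function name is made up for the example, and it assumes the dead links are absolute URLs on that host.

```python
import re

# Placeholder hostname; replace with the old forum's real domain.
DEAD_FORUM = "forum.example.com"

# Matches <a ... href="http(s)://DEAD_FORUM/...">text</a> and captures the text.
_DEAD_LINK = re.compile(
    r'<a\b[^>]*\bhref\s*=\s*["\'](?:https?:)?//'
    + re.escape(DEAD_FORUM)
    + r'[^"\']*["\'][^>]*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)


def unlink_dead_forum(html):
    """Replace links to the dead forum with their plain link text."""
    return _DEAD_LINK.sub(r"\1", html)
```

Links to other sites are left alone, so the pages stay crawlable. Running it over a copy of the files first would make it easy to compare the result with the originals.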