Rebuilding GeoCities on Mac OS 12

A while back, some of you may remember that I entered RetroChallenge 2021/10. I was somewhat successfully at a rebuild of the GeoCities Archive, also known as The Archive Team Geocities Valhalla, and managed to host it on a Raspberry Pi.

Of course it wasn’t perfect and I knew that. So since then I’ve tried to reboot it again using the GitHub repository from despens as it would potentially solve some of the problems I had when I started to merge data together. But then this happened…

Ideally, I wanted to use Mac OS instead of Ubuntu, but Mac OS doesn’t support long file names. Limit of 1024 characters apparently. Ubuntu can hit a whopping 4096 or so. Nice!

I was also getting checksum mismatches compared to the original torrent, which needless to say, is very bad. Not sure what caused that, but it was possibly the same thing that resulted in duplicates. Note the “.1.” in the file names.

I did find what appears to be a typo in the despens GitHub too.

GEO SCRIPTS/filer-indexes.pl should probably be GEO SCRIPTS/filter-indexes.pl. Don’t worry, I’ll fix it in post.

The following is just for my reference and was more related to the pre-004 steps on despens instructions under Mac OS. You can see at the end I hit the 1024-character limit in Mac OS which is where I headed back to Ubuntu 22.04.

Verify and complete your Geocities Download
From here - https://blog.geocities.institute/archives/3209

curl -L https://install.perlbrew.pl | bash
source ~/perl5/perlbrew/etc/bashrc
perlbrew install perl-5.36.0	
perlbrew switch perl-5.36.0	This needs to be done repeatedly in the active terminal window, find bin file
/usr/bin/which -a perl
sudo cpan -i XML::LibXSLT

This makes future installs less questiony
cpan -i App::cpanminus

This is needed for the thing
cpanm XML::TreePP

Also needs this to calculate md5s on Mac
brew install md5sha1sum

perl geo-torrent-checksums.pl 2009-archiveteam-geocities-part1_files.xml > download1.sh
perl geo-torrent-checksums.pl 2009-archiveteam-geocities-part2_files.xml > download2.sh
perl geo-torrent-checksums.pl 2009-archiveteam-geocities-part3_files.xml > download3.sh
perl geo-torrent-checksums.pl 2009-archiveteam-geocities-part4_files.xml > download4.sh
perl geo-torrent-checksums.pl 2009-archiveteam-geocities-part5_files.xml > download5.sh
perl geo-torrent-checksums.pl 2009-archiveteam-geocities-part6_files.xml > download6.sh
perl geo-torrent-checksums.pl 2009-archiveteam-geocities-part7_files.xml > download7.sh
perl geo-torrent-checksums.pl 2009-archiveteam-geocities-part8_files.xml > download8.sh

Note: All must be in the same directory when verifying

Next, backup files to root to avoid accidental erasure.
Then sudo chmod +x download*.sh
./download1.sh
./download2.sh
./download3.sh
./download4.sh
./download5.sh
./download6.sh
./download7.sh
./download8.sh

Once completed sudo rm download*.sh files re-run the md5 checker:
perl geo-torrent-checksums.pl 2009-archiveteam-geocities-part1_files.xml > download1-1.sh
perl geo-torrent-checksums.pl 2009-archiveteam-geocities-part2_files.xml > download2-1.sh
perl geo-torrent-checksums.pl 2009-archiveteam-geocities-part3_files.xml > download3-1.sh
perl geo-torrent-checksums.pl 2009-archiveteam-geocities-part4_files.xml > download4-1.sh
perl geo-torrent-checksums.pl 2009-archiveteam-geocities-part5_files.xml > download5-1.sh
perl geo-torrent-checksums.pl 2009-archiveteam-geocities-part6_files.xml > download6-1.sh
perl geo-torrent-checksums.pl 2009-archiveteam-geocities-part7_files.xml > download7-1.sh
perl geo-torrent-checksums.pl 2009-archiveteam-geocities-part8_files.xml > download8-1.sh

Everything *should* be commented out, indicating no issues. Then it is time to decompress.

Resolving conflicts
https://github.com/despens/Geocities/tree/master/scripts/geocities.archiveteam.torrent Per readme on https://github.com/despens/Geocities check variables in terminal with env nano ~/.zshrc This appears to be updated! https://github.com/kyledrake/Geocities Mac OS limit of ~1024 characters /Volumes/GeoCities/work/geocities/YAHOOIDS/E/n/EnchantedForest/Dell/2237/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino/dino-d.html

For now, I’m back in the game with RetroChallenge 2022/10 doing something similar but different on hardware and software from 2012. This should result in less errors and a more accurate replication of despens results. 🙂


Creative Commons License Netscape Internet Explorer ICQ Powered by Debian! Tor - Anonymity Online mIRC for all! Do Only Good Everyday!

Established February 2012 and was last updated 6th of April 2024. You are visitor number Counter since sometime in the mid-1990s. We are powered by a Raspberry Pi.
:BIGGAYROLL: