Sunday, April 25, 2021

Hosting webtrees on NearlyFreeSpeech

Brief notes on setting up webtrees (PHP, MySQL/MariaDB) genealogy software on NearlyFreeSpeech.net Web Hosting
  • Email did not work out of the box (required for user signup, etc): NFS requires sites use sendmail. The default option webtrees uses is "sendmail -bs" which implies an SMTP config, instead use "-t" (see doco)
  • If you have a domain name managed by NFS, you can host any standard type of site from a subdomain very easily when you create it and it will add the DNS record. Then just configure webtrees to use that, eg `base_url="http://webtrees.example.com"` in `data/config.ini.php`
  • Set up HTTPS with Let's Encrypt; NFS provides a CLI script `tls-setup.sh` - the first time I ran it I got an error accessing the `.well-known/acme-challenge/...` file it creates, but the 2nd time it worked fine. Then it auto-upgrades connections to HTTPS, which meant static files didn't load, so I updated the base_url to https in the config file as above. 
  • Pretty URLs: A slight deviation from the documentation. NFS uses Apache 2.4 and apparently without the mod_access_compat module for backwards compatibility (fix found here). The contents of data/.htaccess should be changed to "Require all granted". For the root .htaccess file, I used "RewriteBase /" because webtrees is hosted at the root, and otherwise doesn't work once rewrite_urls="1" is set in data/config.ini.php 
  • Changing the session timeout: default is 7200s (2 hours) but apparently depends on number of page loads, so if you have a quiet site it may not log you out?... don't know why the cookies don't just expire. 
  • Upgrading: As of July 2022 I have successfully completed 2 upgrades without any above changes being reverted, but it's still possible. Just follow the release instructions.
Importing data from other tools:
  • GEDCOM files can be imported (use standard v5.5.1) - if they are small they can be uploaded from the webpage, but beyond some size (maybe 50MB?) they will take a while and then fail to upload, so you have to copy the file to the `data/` folder on the server. 
  • When you start a large import (I had a 755,000 individual file), it shows you progress on the screen, but for some reason it only progresses while you have a tab open - you can't sleep the computer, but you can just keep it in the background. If the computer does go to sleep, if you just browse back to the tree in the Control Panel, it will show the import status and continue. 
  • I had a file that failed to import because of a bad character in a description; the file was valid UTF-8 but inserting this seemed to confuse the SQL statement with "SQLSTATE[22007]: Invalid datetime format: 1366 Incorrect string value: '\xF2\xAC\xA0\xB3 S...' for column `webtrees`.`wt_individuals`.`i_gedcom`..." It appeared a bad copy paste caused this; I opened the GEDCOM file directly in a text editor, found the entry from the value it was inserting, and removed some weird characters, to be fixed up later. 
Workaround to delete a large tree:

I needed to delete the aforementioned large tree after debugging the bad characters, and to import new changes, but the website delete button didn't give any feedback and eventually the database locked up, as it continued to try to delete in the background. I stopped the DB process gracefully via the web interface, with no ill effects except people complained the site was down, and the tree remained untouched. 

Obviously this is beyond the scope of a normal NFS MariaDB instance, but I had been able to observe the delete statements running via phpMyAdmin, yet they were still there afterwards, so it must be running in a transaction. I looked up the source to determine the SQL statements, and ran each or a few at a time (so not in a transaction, but still in order), and it actually worked perfectly and much faster. 

First, determine the tree ID, a.k.a. `gedcom_id` that you want to delete, from the wt_gedcom table. 

Then run these statements for that ID, eg 2 (sorry about the lack of code formatting): 
delete from wt_gedcom_chunk where gedcom_id = 2;
delete from wt_individuals where i_file = 2;
delete from wt_families where f_file = 2;
delete from wt_sources where s_file = 2;
delete from wt_other where o_file = 2;
delete from wt_places where p_file = 2;
delete from wt_placelinks where pl_file = 2;
delete from wt_name where n_file = 2; -- big
delete from wt_dates where d_file = 2; -- big
delete from wt_change where gedcom_id = 2;
delete from wt_link where l_file = 2; -- big one 2.2 mil rows
delete from wt_media_file where m_file = 2;
delete from wt_media where m_file = 2;
delete wt_block_setting from wt_block_setting join wt_block on wt_block.block_id = wt_block_setting.block_id where gedcom_id = 2
delete from wt_block where gedcom_id = 2;
delete from wt_user_gedcom_setting where gedcom_id = 2;
delete from wt_gedcom_setting where gedcom_id = 2;
delete from wt_module_privacy where gedcom_id = 2;
delete from wt_hit_counter where gedcom_id = 2;
delete from wt_default_resn where gedcom_id = 2;
delete from wt_log where gedcom_id = 2;
delete from wt_gedcom where gedcom_id = 2;

These may change over time if tables are added or changed. 

Of course, you should take a database backup first. Adapted from the NFS documentation, running this on the web host will create a gzipped SQL backup of database "mywebtrees" with a datetime stamp:

mysqldump --user=yourusername --host=mywebtrees.db mywebtrees -p | gzip > /home/private/backup-webtrees`date "+%F_%H-%M-%S"`.sql.gz

It would be prudent to copy this to your own backup lest NFS lose it, same as your media files.