Video 10: Upgrading VuFind® Using Git
The tenth VuFind® instructional video explains how to add Git version control to an existing installation, if that installation is not already version-controlled, and how to use Git and shell scripting to simplify the process of upgrading both the core code and your local configurations/customizations.
- VuFind's changelog: a vital resource when planning an upgrade
- Git wiki page: useful advice and background information about Git
- Git Branches wiki page: information on how branches and tags are used in VuFind's Git repository
- Automatically updating locally customized files with Git and diff3: the blog post containing the script referenced in the video
With the recent release of VuFind 7.0, I thought now would be a good time to share my upgrade workflow to show how I use Git to upgrade VuFind. This video is going to assume that you have a basic familiarity with Git, and if you do not, I strongly, strongly recommend learning at least the basics. Doing software development and deployment without version control is like driving without a seatbelt, and you will find that investing a little time in learning these tools will save you all kinds of trouble down the road.
So this video will cover two basic topics. The first will be turning an existing installation of VuFind into a Git repository so that you can use Git to manage it, and the second part will show some processes and tools for using Git to perform actual upgrades.
So the virtual machine I've been using for this series of tutorials was initially set up by installing VuFind from the Debian package, which just puts the files on disk but does not include any kind of version control. We are going to take advantage of a useful characteristic of Git, namely that all of the version control data is stored in a .git subdirectory, which can be easily moved around, and we're going to use that to turn our existing installation into a Git repository. We will check out a temporary clone of the repo, move the .git directory into our VuFind home, and then we'll be able to work from there.
But before we can begin doing all of that, the most important thing we need to do first is figure out exactly which version of VuFind we are already running, because that will enable us to put Git into the correct state to start all of the work that we need to do. There are a couple of ways to identify which version of VuFind you're running. A simple one, particularly useful if you don't have access to the server running VuFind, is to simply view the page source while looking at any VuFind webpage, and look for the generator meta tag, which should have a version number embedded in it, so we can see here 6.1.1. However, the generator meta tag is not 100% reliable, because it's actually created based on a config.ini setting, and if that configuration file gets out of date or is customized, it's possible that what it's reporting is not actually the truth. If you want to be more confident that you have the right number, you should go to the command line, switch to your VuFind home directory, and open the build.xml file in an editor. This is the control file used by the Phing automation tool to do various VuFind-related build tasks. Most users don't have to worry about it on a day-to-day basis, but it does have the useful characteristic of having a version number embedded in it. So if you scroll down until you find the property named version, you can see that once again this confirms we are working with VuFind 6.1.1.
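If you would rather not open the file in an editor, the same check can be scripted. This is only a sketch: the function name `vufind_version` is made up for illustration, and it assumes the version is stored on a single line of the form `<property name="version" value="..." />`.

```shell
# Hypothetical helper (not part of VuFind): print the "version" property
# from a Phing build file. Assumes the property sits on a single line.
vufind_version() {
    build_file="${1:-build.xml}"
    sed -n 's/.*<property name="version" value="\([^"]*\)".*/\1/p' "$build_file" | head -n 1
}

# Example (run from your VuFind home directory):
#   vufind_version build.xml
```

If this prints nothing, the property is probably formatted differently in your copy, and looking at the file by hand is the safer option.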
Now that we know what version we're running, as I said, we can start creating a Git repository to turn this bare file collection into a tracked, version-controlled repository. And before I begin, I'm just going to confirm that if I run git status from my VuFind home, it tells me it's not a Git repository. So let's go get one.
If we go to the temp directory, we can run git clone https://github.com/vufind-org/vufind.git, and that will get us a copy of the full repository and all of the history of VuFind development. And while this takes time to download, I wanted to talk a little bit about how we use Git to track various versions of VuFind.
First of all, mainline bleeding-edge development takes place in a branch called dev. So if you want the latest, most up-to-date code, you would want to check out dev. But of course, bleeding-edge code comes with risks because there may be some recently introduced bugs. And while we try to never break the dev branch, less tested code is always more risky.
If you want more stable code, we have a series of release branches. So for every major or minor release, we create a branch named release- and a number. So for example, release-7.0 or release-6.1. And these branches are where we do bug fixing on existing releases. So we'll never add a new feature to a release branch, but we will add fixes as needed. And it's these release branches that we use to issue bug fix releases that only include fixes, even after development in the main dev branch has moved on to further new exciting territory.
A final tool that you may find useful is that we also tag every release. So if you want the exact point in time for a particular VuFind release, you can check out a tag, which is just going to be v and a number. So for example, v7.0 or v6.1.1. So if you want to get to a very precise state in the code, you can use a tag. And that will be useful, for example, in this situation where we're trying to sync up with a particular known version. But if you're trying to get the latest stable code on a particular thread of development, use a release branch instead.
All that being said, the Git repo is now cloned. We need to switch into the directory that the git clone operation created: cd vufind.
And then we need to check out the appropriate version tag, in this case, v6.1.1. Now Git is going to complain that it's in a detached HEAD state because we've selected a tag rather than a branch. We will clean up that mess in a moment. But first, since we now have Git at the same point in time as the release of VuFind we're working with, we should be able to move that .git directory into the VuFind home directory to turn it into a Git repo. So now I'm going to switch to VuFind home. And now if I do a git status, instead of telling me this is not a repository, it's going to re-index itself and then tell me we're at a detached HEAD at version 6.1.1. And there are a whole bunch of untracked files. This is exactly what we want. We're not seeing any diffs, which means that we picked the right version and we haven't accidentally edited any core files. But it is seeing all of the local files that we added in our local directory and our custom theme. So what we want to do now is to establish a branch for our local code so that we can begin to track the things that we change and customize over time.
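The clone, checkout and move steps described so far can be collected into one small function. Treat this as a sketch under stated assumptions: the function name `adopt_git_history` is invented here, and the install location in the example comment (/usr/local/vufind) is only a guess at where your VuFind home might be.

```shell
# Hypothetical helper: graft the history of an upstream repository onto an
# existing, untracked copy of the same release.
#   $1 = repository URL
#   $2 = tag matching the installed version
#   $3 = existing installation directory
adopt_git_history() {
    repo_url="$1"
    tag="$2"
    target="$3"
    work=$(mktemp -d) || return 1
    git clone --quiet "$repo_url" "$work/vufind" || return 1
    # Checking out a tag leaves Git in a detached HEAD state; that is expected.
    git -C "$work/vufind" checkout --quiet "$tag" || return 1
    # All version control data lives in .git, so moving it is enough.
    mv "$work/vufind/.git" "$target/.git" || return 1
    rm -rf "$work"
    # Untracked local files are fine; modified core files are a warning sign.
    git -C "$target" status --short
}

# Example, assuming VuFind 6.1.1 is installed in /usr/local/vufind:
#   adopt_git_history https://github.com/vufind-org/vufind.git v6.1.1 /usr/local/vufind
```

In the short status output, lines starting with ?? are untracked files, which is what you hope to see for local files. Lines starting with M would mean a core file differs from the chosen tag, so either the tag is wrong or a core file was edited.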
What we can do is say git checkout -b and make up a name for our branch. I'm going to call this local_tutorial. And now Git has created a local branch where we can begin to add files and track changes. I should note that in this example, I'm going to be committing files containing passwords and local information to this Git branch. So if you do something similar, you just have to be careful that you don't accidentally publish this branch somewhere you don't wish to. So watch out for that. But here there's not too much danger and tracking these files is useful.
So I'm just going to go ahead and start adding files to Git. I'm going to git add the env.bat file that the installer set up, even though we don't need it since we're not running Windows. I'm going to add everything under the local/config/vufind directory. I'm going to add local/harvest/oai.ini because that configuration is useful to track. I'm going to add local/import, local/httpd-vufind.conf and themes/tutorial. I do another git status just to check that I did that right. We see that I've added a whole bunch of files that are waiting to be committed. The only things I've omitted are these two harvest subdirectories, which contain harvested metadata, and it really doesn't make any sense to track those in version control. So in fact what I'm going to do is edit my .gitignore file, which tells Git about certain directories that it can safely ignore. And I'm going to add local/harvest/expositions and local/harvest/vufind so that they don't confuse me or get accidentally committed in the future. I do a git status again. It now tells me that I've modified .gitignore, but it doesn't show me those directories I wanted to hide. So I'm going to git add .gitignore to my list and then I'm going to commit everything with a message noting that these are my initial local files. So now I've got VuFind 6.1.1 branched off into my own local branch where I have added all of my local configuration files. We now have set the stage for performing the upgrade.
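Here is one way the branch, ignore and commit steps might be scripted. It is a sketch, not what was typed in the video: the function name `commit_local_files` and the commit message are made up, and the list of paths you pass in will depend on what you have customized.

```shell
# Hypothetical helper: create a branch for local work, ignore harvested
# metadata, and commit the chosen local files.
#   $1 = VuFind home, $2 = new branch name, remaining arguments = paths to track
commit_local_files() (
    cd "$1" || exit 1
    branch="$2"
    shift 2
    git checkout --quiet -b "$branch" || exit 1
    # Harvested records can be downloaded again, so keep them out of the history.
    printf '%s\n' 'local/harvest/expositions' 'local/harvest/vufind' >> .gitignore
    git add .gitignore "$@" || exit 1
    git commit --quiet -m 'Add local configuration and customizations'
)

# Example, using some of the paths mentioned in the video
# (the VuFind home location is an assumption):
#   commit_local_files /usr/local/vufind local_tutorial \
#       env.bat local/config/vufind local/harvest/oai.ini local/import
```

The function body runs in a subshell (note the parentheses), so the cd does not change the directory of the shell you call it from. Remember the warning above: if any of these files contain passwords, be careful where this branch gets pushed.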
But before I upgrade, I want to talk about a few things that are really important to keep in mind. First of all, before you start to upgrade, be sure that you understand what your local customizations are. What themes have you created? Do you have a local code module? Have you customized configuration files? What's in there and why? Obviously, if you're working on a team you may not know every detail of everything, but that's one of the reasons to use version control, because the version history of your local files can tell you who did things, when they did them, why they did them and so forth. But at least at a high level, it's really helpful to go into the upgrade process with an idea of what you customized so that you can test those customizations post-upgrade and be sure that they didn't get broken. One of the reasons why VuFind separates local files from core files so deliberately, with the local settings directory, a separate module for local code, and a separate theme directory for local themes, is to make it very clear what is yours and what is part of the project, so that if, for example, a new developer needs to take over a VuFind instance that someone has already customized, they can review those local directories and see what's been done and sort of understand the context.
This is why it's important to be disciplined about separating your localizations from the core. And one of the advantages of using Git is that it helps you to do this, because if you accidentally change a core file, that will show up when you do a git status, and then you'll know: oh, I shouldn't have done that, I need to move that to the appropriate place.
In any case, once you're comfortable that you understand your local customizations and their scope, the next thing I highly recommend doing is looking at the changelog in the VuFind wiki, and I will link to this from the video recording. For every release of VuFind we include notes not only on new features that might be of interest but also on changes to code and configuration that could potentially cause problems during an upgrade. We are very inclusive here because we want to catch every possible issue that might be a problem. Most of these are unlikely to affect most users, but that's why it's helpful to have a broad idea of what you've customized, because you can then read through this list and take note of which issues are likely to be a problem and which ones you can very safely ignore.
In the case of this specific upgrade that we are about to run, the one changelog note that I need to be concerned about is this one. Starting with VuFind 7 we changed the default port that Solr runs on from 8080 to 8983. This is because 8983 is the standard port number used by Solr, and using 8080 historically has caused port conflicts with other applications. It seems to make sense to standardize that, but now at upgrade time we need to be aware of this change so that we can deal with it appropriately. Right now our Solr is running on port 8080, and after the upgrade VuFind will be looking for it on port 8983. I'll show you how to deal with that after the rest of the process, and of course this particular issue only applies to VuFind 7, but this is just an example of the kind of thing you should be aware of when you're reviewing the changelog.
Finally the third thing that you should always do before attempting an upgrade is back everything up and of course it's best to test an upgrade on a non-production server before you dive in in the real world. In this instance I'm not going to show you how I backed things up because this is a virtual machine and I've just backed up the whole disk image before I started so I can roll back if I have to but whatever your situation just be sure that you have a rollback plan so if something goes wrong during upgrade you haven't broken your system and gotten into an unrecoverable position.
With all that background out of the way, we're just about ready to begin. There's only one other thing to watch out for, and this is that when we use Git to do updates, the user running the git command needs to have permission to write to all of the folders and files in your VuFind home, because it's going to be updating core files. So it's a good idea to check and make sure that you're using an appropriate user. So in this instance, if I look at the file ownership of my VuFind directory, I see that most of these are owned by me, dkatz, except for some reason this batch file is owned by root and the solr directory is owned by solr.
We changed the Solr ownership because the user running the Solr process needs to be able to write there, but what we can do is work around this with a group ownership. I will leave the solr directory owned by solr, but I'm going to add it to the dkatz group, which will give my user account permission to update files in that directory, and I'm going to do that with sudo chgrp -R dkatz solr, so that's going to recursively change the group ownership of the solr directory, and then I'm going to use chmod -R g+w to add group write permissions to that directory. So now, because all these files and folders have my group and they have group write permission, I will be able to modify them. Again, the way that you actually modify your VuFind directory permissions in a real-world situation will depend very much on your strategy for deployment, but if you're using Git to update things you need to make sure that Git can write to all the files.
So now that that's done, all I need to do is run a git merge operation to pull in all of the changes between release 6.1.1, where we are currently, and the target endpoint we want to get to. So if I wanted to upgrade to specifically the 7.0 release, I could say git merge v7.0, using the v7.0 tag, and I would get exactly to the code as it was released in 7.0. However, in this instance I happen to know that there were some bugs in 7.0 that have been subsequently fixed and that would cause me problems on this test box, so I'm instead going to choose to merge the release-7.0 branch, which is the stable code from release 7.0 with subsequent bug fixes applied. So all I need to do is say git merge origin/release-7.0. The origin part is because origin is the default remote repository name that you get when you clone something, so that refers to the public VuFind repo, and of course release-7.0 is the branch I want. So when I run that command I am prompted to customize the commit message if I want to, but I'm happy to take the default, so I'll just exit out of this text editor, and now I watch a bunch of changes fly by and my core code has been upgraded to VuFind 7.0.
But we're not quite done. First of all, we need to make sure that all of our dependencies are up to date, because some parts of VuFind are loaded in with Composer, and updating the code with Git will have changed the Composer configuration but it won't have automatically triggered a composer install.
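If you want the merge step in a script, where nobody is around to close a text editor, Git's --no-edit option accepts the default merge message for you. The wrapper below is a sketch; the name `merge_upstream` is made up for illustration.

```shell
# Hypothetical helper: merge an upstream tag or branch into the current
# local branch, accepting Git's default merge commit message.
#   $1 = VuFind home
#   $2 = what to merge (for example v7.0 or origin/release-7.0)
merge_upstream() (
    cd "$1" || exit 1
    git merge --no-edit "$2"
)

# Example (the VuFind home location is an assumption):
#   merge_upstream /usr/local/vufind origin/release-7.0
```

If the merge hits a conflict in a core file, git merge exits with a non-zero status, and so does this function, which lets a longer upgrade script stop at that point instead of carrying on.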
So if I just say composer install following my merge, that will bring all of our dependencies up to date, and there are quite a few changes in play with this upgrade because this is where we switch from Zend Framework to its successor, Laminas. So almost every core dependency of VuFind changed its name here even though the functionality is the same. So we have to wait for all of that to download and update itself. The other thing we want to be sure to clean up is VuFind's cache, because it's possible that there is data in the cache that is now out of date that could cause problems post-upgrade.
So I am just going to sudo rm -rf local/cache/* to clear out all of the cache directories under there. But then I'm also going to recreate the local/cache/cli directory and change that to be owned by me, because we need a command-line cache separate from the web-based cache to allow our command-line utilities to run correctly.
So in reality I usually have a prewritten script for automating all the steps of the upgrade, which includes removing the cache and recreating the command-line cache. You would probably benefit from doing that as well, but for now I just wanted to show those important steps.
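As a starting point for such a script, the cache cleanup might look like the sketch below. The function name `reset_vufind_cache` is invented, and the sketch leaves out sudo and the ownership change, both of which you may need if the web server owns the cache files.

```shell
# Hypothetical fragment of an upgrade script: empty VuFind's cache and
# recreate the command-line cache directory.
#   $1 = VuFind home directory
reset_vufind_cache() {
    vufind_home="${1:?usage: reset_vufind_cache <vufind_home>}"
    rm -rf "$vufind_home"/local/cache/*
    mkdir -p "$vufind_home/local/cache/cli"
}

# In a full script this would typically follow the merge and composer install
# (the VuFind home location is an assumption):
#   composer install && reset_vufind_cache /usr/local/vufind
```

The `${1:?...}` check makes the function refuse to run without an argument, which is a cheap safeguard for anything that wraps rm -rf.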
So now we have updated code, we have updated dependencies, we have a clean cache but there's one more very important step. Because all of our local files are separated from the core, obviously when we did our git merge and update, it updated all the core files but the upstream changes don't apply to our local files at all because they're local files. However because all of our local files are just copies of core files with a few changes applied, we can do a bit of clever scripting to find changes to the core equivalents to the local files and then merge those changes into the local files.
I've written a bash script that does all of this work and in fact I've written a blog post explaining exactly how that works and the reasoning behind it. That's another link that I will include on the notes in this video. But for right now I am just going to skip to the answer and copy and paste my script code out of the blog post and into a file on disk. So I'm going to create a file called merge local.sh and I'm going to paste all my code into it.
So with this file all ready to go I just need to make it executable so I can run the script. Now I'm ready to apply the upgrades from the core to my local files. One very important note about this script is that you have to run it immediately after you perform a merge because it works by looking at what changed in the most recent Git commit and if the most recent Git commit is not the merge to upgrade VuFind it won't know how to change your local files.
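To see why the timing matters, it helps to look at what "the most recent commit" contains. The sketch below is not the script from the blog post; it just lists the files that the latest commit changed, using a made-up function name, `files_changed_by_last_commit`.

```shell
# Hypothetical helper: list the files that the most recent commit changed
# relative to its first parent. Right after a merge, the first parent is the
# local branch as it was before the merge, so this shows what upstream changed.
#   $1 = repository directory (defaults to the current directory)
files_changed_by_last_commit() (
    cd "${1:-.}" || exit 1
    git diff --name-only HEAD~1 HEAD
)

# Example, run immediately after the upgrade merge:
#   files_changed_by_last_commit /usr/local/vufind
```

If you make another commit after the merge, HEAD~1 and HEAD no longer span the upgrade, and a list like this would only show that later commit's changes.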
So it's always important to do this merge process right away after doing a merge to update the core. So I'm just going to run my script and then it's going to spew out a bunch of messages. Note that it complained about all of the metadata files under my local harvest directory because of course those have no equivalent in the core code because they're metadata files. So the script just safely skips over them. It just gives me an alert that it saw them and didn't know what to do with them.
It also reports that it ran into at least one conflict, and conflicts are somewhat inevitable in any kind of diff and merge situation, because what my script is essentially doing is saying: I know that we started with a particular file in VuFind 6.1.1, and when VuFind 7 came out some changes were made to that file to upgrade to VuFind 7. But you've also created a local copy where you've made some changes of your own. The script process will try to reconcile those changes as best it can, but sometimes both paths of history will have tried to change the same part of the file, and then of course there's no automated way to figure out which thing is right, and that's how we end up having to do some manual conflict resolution. Fortunately it's not too difficult in most cases, as long as you understand how to read the conflict markers and you understand the history and reasoning for your customizations.
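The reconciliation described here is a three-way merge, and the diff3 tool named in the blog post title can do one on its own. The function below is a minimal illustration with a made-up name, `three_way_merge`; it is not the blog post's script, which has to do more work to find the right three files.

```shell
# Hypothetical helper: three-way merge of one customized file.
#   $1 = local (customized) copy
#   $2 = core file as it was in the old release
#   $3 = core file as it is in the new release
# Prints the merged text; the exit status is 1 when conflicts were found.
three_way_merge() {
    diff3 -m "$1" "$2" "$3"
}

# Example with hypothetical file names:
#   three_way_merge local-config.ini old-core-config.ini new-core-config.ini > merged.ini
```

When the local copy and the new release changed different parts of the file, the output simply contains both sets of changes. When they changed the same part, diff3 brackets all three versions with conflict markers and leaves the decision to you.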
So first of all let's find out which files have conflicts in them. To figure out which files contain conflicts, we are going to run a git diff command, which will output a list of all the changes to files that haven't been committed yet. We're going to pipe that into a grep command looking for a series of greater-than symbols, which is a conflict marker that the merge process will have inserted into the files, and that should give us a hint of which files need attention.
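Two small helpers along these lines are sketched below. Both names are made up, and the second one is a variation that is not shown in the video: it searches the files directly rather than going through git diff.

```shell
# Hypothetical helper: show the closing conflict-marker lines that were added
# to uncommitted files. The label after the marker names a merged file.
#   $1 = repository directory (defaults to the current directory)
conflict_markers_in_diff() (
    cd "${1:-.}" || exit 1
    git diff | grep '^+>>>>>>>'
)

# Hypothetical companion: list the files under the given directories that
# still contain a closing conflict marker.
files_with_conflicts() {
    grep -rl '^>>>>>>> ' "$@"
}

# Example, run from the VuFind home directory:
#   conflict_markers_in_diff .
#   files_with_conflicts local themes
```

Both helpers exit with a non-zero status when grep finds nothing, which in this case is the good outcome: no conflict markers are left.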
As you can see in this example, we have three conflicts in config.ini. You'll notice that the file names we're seeing here actually refer to the core file, so we also need to be able to figure out, based on familiarity with our code, which local file is the equivalent to that. In this case, I know that configuration files exist in my local directory and otherwise have an equivalent path, so what I need to do is edit my local/config/vufind/config.ini file, and I can search through this file for any series of less-than or greater-than signs to find areas of potential conflict.
So for example, here is one conflict that's presenting me with three different sets of text. It's showing me what my original local file looked like, then it's showing me what the old VuFind 6.1 file looked like, and then it's showing me what the VuFind 7 file looks like, and what I essentially need to do is pick one of these versions, delete all the other ones and get rid of the surrounding markers.
This particular conflict occurred because at some point in the past, I upgraded this tutorial box from VuFind 6.0 to VuFind 6.1, and I didn't do a very good job, so I introduced an inconsistency. So, I apologize for causing that confusion, but it did create a useful example here, and in this instance, it really doesn't matter too much what we do because these are all just comments and they're not going to have any effect on the code. But, to be ready for future updates, the smart thing to do is to accept the newest version of the comment, this one down here that's associated with config/vufind/config.ini, the current core code. So, I'm going to delete the two unwanted old versions, delete the conflict marker at the end, and then I can move on. And then there's another conflict right below here as well; this is telling me that there are some additions that did not exist in either my locally customized file or the previous version of VuFind. Again, this is all related to me not having fully updated my config.ini the last time I upgraded, but once again, the solution is simply to accept the final option here, which is the newest version of the configuration. So, I'm going to delete the old versions and take off the conflict markers. And then there is just one more, and it's the same kind of thing. There are three different versions of the form setting in this example in the comment, and the one on the bottom is the current default in VuFind 7.0. So, I am just going to delete all the other versions and delete the conflict marker, and that's it. All of my conflicts are now resolved.
This obviously requires a bit of brainpower to get through. Sometimes it's not entirely obvious how to resolve a conflict, but as long as you understand that each section of the conflict block is showing you a different version of that chunk of code or configuration, it's usually possible to figure out from the context which is the right one, and if nothing else, you can make notes and review your commit history to see why you had gotten a particular line of code into a particular state. Anyway, now that I have completed my conflict resolution, I can do a git diff, and this will just summarize everything that got automatically applied by that merge script, and it's always a good idea to look through these and just see if anything stands out as a potential problem.
So, as I mentioned at the very top of the video, the generator value that VuFind uses is stored in config.ini, and we can see here that the upgrade process correctly updated that from 6.1.1 to 7.0. Then, I think most of the remainder of the diffs we're going to see here are new settings or changed comments that took place during the course of VuFind development between the versions. You can see here the Solr port number has changed from 8080 to 8983, as I was expecting to see, and then we just have more changed comments. Some things have been removed here because of the removal of support for Amazon services in VuFind 7.0, but most of these changes are in comments, so it's pretty safe to trust that they're not going to cause any problems. Just keep scrolling, and of course looking through these diffs is also a great way to learn about new VuFind features, because any new thing that gets added will show up as a new set of options in the config files, so it certainly pays to pay attention to these if you're doing this for real.
Right now, I'm just going to move quickly past the rest of config.ini. And now I've reached my import settings, where the only change is related to the Solr port number change once again. And this is so that the MARC import tool knows where to find Solr. So we have two files with that port number changing. We have some new examples added to our marc_local.properties. Again, it's all just comments, so it won't hurt anything, but it brings our local version up to date in case we want to turn this on in the future. And finally, a little bit of adjustment to our local custom theme to reflect some style changes that were made to the core and which have been automatically applied correctly here thanks to the merge script.
So that's it, our VuFind is now fully upgraded. We just have one last thing to deal with, which is the issue I raised earlier of the changed Solr port number. Because right now, we haven't done anything to Solr during this upgrade process, so it's still running on port 8080 where it used to live. So first, let's stop it by saying SOLR_PORT=8080 systemctl stop vufind. This is taking advantage of the systemd configuration that was set up in an earlier video.
So VuFind has now stopped, but we are going to need to make one small adjustment to the systemd configuration because the port number is embedded in the name of the PID file that VuFind uses to keep track of running processes. So we need to tell systemd that this 8080.pid has changed to an 8983.pid. And because we made a change to a systemd file, we need to say sudo systemctl daemon-reload to make sure that the latest version of that is loaded. And then we can start the VuFind service once again.
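The PID file edit can be done by hand in a text editor, which is what the video shows. If you prefer to script it, something like the sketch below would work, assuming the PID file name in your unit file ends with the port number followed by .pid. The function name `update_pid_port` and the unit file path in the example are assumptions, so check where your own service file lives.

```shell
# Hypothetical helper: print a copy of a systemd unit file with the Solr port
# in the PID file name changed. Assumes the name ends in "<port>.pid".
#   $1 = unit file, $2 = old port, $3 = new port
update_pid_port() {
    sed "s/$2\.pid/$3.pid/" "$1"
}

# Example (the unit file path is an assumption):
#   update_pid_port /etc/systemd/system/vufind.service 8080 8983 > /tmp/vufind.service
#   sudo cp /tmp/vufind.service /etc/systemd/system/vufind.service
#   sudo systemctl daemon-reload
#   sudo systemctl start vufind
```

Writing the result to a temporary file first gives you a chance to compare it with the original before replacing anything.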
There's just one important finishing touch, which is of course that we want to commit all of the files that were customized by the merge script to Git so that the whole history is tracked, and we're ready to move on to our next set of changes when the time comes. So let's just do a git status to look again at what has changed. Not too much. So we can say git add local/config local/import themes/tutorial. Do one more git status to be sure all the right files were selected. They were, so we can git commit with a message like "merge 7.0 changes to local files" and we're all set.
And now, if all has gone smoothly, I should be able to refresh my VuFind homepage. And if I view the source of it, I see that my generator now says 7.0 instead of 6.1.1. That's a good sign. And let's perform a search and confirm that yes, the whole thing still works. We are now upgraded to VuFind 7.0.
So just to summarize, upgrading VuFind with git. It's not an easy process. It requires an understanding of your local system and configuration. It requires some problem solving and resolving conflicts. It's always good to have backups before you attempt to tackle it. But as you become familiar with the tools of git and the supplemental merge script, it does automate the vast majority of the work for you and draws your attention to key areas that might require additional work. So I hope this demonstration has been of some help and will help you form useful habits to keep up to date on VuFind in the future. And as always, if you have problems or questions, please reach out to me directly or to the VuFind mailing lists, and I'll be happy to help. Thank you for your time.
This is an edited version of an automated transcript. Apologies for any errors.