Video 14: Internationalization (I18n)
The fourteenth VuFind® instructional video discusses VuFind®'s internationalization (i18n) capabilities, explaining how to configure languages, how to create new translations, and how to incorporate i18n into your code.
For this month's VuFind video, we are taking a deep dive into VuFind's internationalization system. We're going to start at the surface, talking about what it does and how you can configure it. Then we'll go a little bit deeper to talk about how you can add additional languages to the software, and then even deeper to talk about how internationalization can be used when you're developing on the software, adding new features, or building local customizations. I've tried to organize this content in order of complexity, so you can leave the video whenever you think you've had enough, or you can keep going to get deeper and deeper into the system.
So VuFind is a piece of software that's used throughout the world, and of course in different parts of the world people speak different languages, so VuFind needs a way to present its interface in multiple languages, and that's where internationalization comes into play. If you have a default VuFind installation, you'll have this language control at the top right, as you can see, which shows many different languages, all rendered in their native form, so people who speak a language can instantly pick it out, and if you choose one of these, the whole interface redraws itself in the new language. All of this, like everything else in VuFind, is highly configurable, so we're going to go into some of that today.
The internationalization system is also useful even if you're presenting your VuFind in only one language, because it provides a way to override a lot of the text that VuFind uses. If you need to customize specific wording in the interface, you can often do it just by changing one line of text, which makes it a really easy way to customize your VuFind experience. We'll talk about that too. So let's start by taking a look at some configuration files.
I'm just bringing up Visual Studio, which is pointed at my VuFind home directory, so I can quickly navigate through the files. So if we look at the main configuration file, config.ini, there are a few settings that are of interest. First of all, fairly near the top, there are a few language-related settings. There's browserDetectLanguage, which is set to true by default, and this means that VuFind will try to read the HTTP headers sent by your web browser to determine your preferred language, and if that language is supported by VuFind, it will be selected by default.
So if you have a VuFind instance that's used by many people speaking many different languages, having this setting on just ensures that they won't be inconvenienced by having to switch the language manually. Right below the browser detection setting is the language setting, where you can set the default language of your VuFind installation. This is the language that will be used if you either turn off browser detection or if a user's browser specifies a language that VuFind isn't currently configured to support. So you aren't necessarily forcing this language, but this is the one that will be used if nothing more specific is identified.
I should also note that right under language is the locale setting, and this indicates where your VuFind instance is located. This is mainly used for things like determining what type of currency is displayed when users look at their fines and so forth. So you want to set this to where you are or where your library is, and it's completely independent of the language setting, because the language is controlled by the end user.
In any case, there's one more part of config.ini we should look at, which is further down, and this is a section called Languages. This is a list of language codes mapped to language names in English, and we use English language names in the configuration here just for consistent readability. They get translated into their native forms elsewhere, and I'll show you that later, but the point is that this lists all of the languages that VuFind currently supports, and most of them are turned on by default.
The order of this list also controls the order of the language drop down in the VuFind interface, so if your community is more likely to speak a certain set of languages, you might want to move them to the top of this list, and if there are languages here that are very narrowly used and don't apply to your audience, you might want to turn them off just to simplify the interface. You'll also notice that there are some languages that are turned off by default, and these are regional variations, so for example, you could switch English to use British rather than American spellings, or you could turn on both versions if you wanted to.
Similarly, there's a separate file for Flemish Dutch, but that's turned off by default. We do offer both Portuguese and Brazilian Portuguese by default, so this is something you might even want to consider turning off if that's not relevant to your users.
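To make the interaction between browser detection, the default language, and the list of enabled languages a little more concrete, here is a minimal Python sketch. It is not VuFind's actual code (VuFind is written in PHP, and the video does not show how it parses headers); the function name pick_language and its parameters are invented for illustration, and the header handling is a simplified reading of the standard Accept-Language format.

```python
from typing import List, Optional


def pick_language(accept_language: Optional[str], enabled: List[str],
                  default: str, browser_detect: bool = True) -> str:
    """Return the interface language code to use for a request (illustrative only)."""
    if browser_detect and accept_language:
        # Parse entries like "fr-CA,fr;q=0.9,en;q=0.5" into (code, weight) pairs.
        candidates = []
        for part in accept_language.split(","):
            piece = part.strip()
            if not piece:
                continue
            code, _, params = piece.partition(";")
            weight = 1.0
            params = params.strip()
            if params.startswith("q="):
                try:
                    weight = float(params[2:])
                except ValueError:
                    weight = 0.0
            candidates.append((code.strip().lower(), weight))
        # Highest weight first; sorted() is stable, so ties keep header order.
        for code, _ in sorted(candidates, key=lambda c: c[1], reverse=True):
            if code in enabled:
                return code
            # Fall back from a regional code ("fr-ca") to its base language ("fr").
            base = code.split("-")[0]
            if base in enabled:
                return base
    # Detection is off, or nothing the browser asked for is enabled.
    return default


enabled_languages = ["en", "de", "fr"]  # hypothetical subset of enabled codes
print(pick_language("fr-CA,fr;q=0.9,en;q=0.5", enabled_languages, "en"))
print(pick_language("ja", enabled_languages, "en"))
```

The point of the sketch is only the fallback order: a supported browser language wins, and the configured default is used when detection is off or finds nothing usable.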
While we're in the configuration files, there's just one other thing that I would like to show you, which is in facets.ini, the facet configuration. In the Advanced_Settings section, there's a setting called translated_facets, and this tells VuFind, when it's displaying facet lists, which facet fields should be run through the translation system and potentially translated. As you can see, by default we're only translating the format names and the high level Library of Congress call numbers. In the future, we may add more translated fields, but of course it's a lot of work to translate all possible values in a facet, so we only add those as time permits, and this is certainly something members of the community could contribute back if it's important to them. For example, there's a long-standing project to translate language names into all supported languages, which I'd love to see completed, but it just hasn't been done yet because it's a lot of work. It is certainly technically possible within this system.
One little detail I will highlight is that when translating a facet, you can specify either that it comes from the main language file by specifying the name of the field by itself, or you can follow the field name with a colon and a name of a text domain containing relevant translations. Text domains are a way of organizing our translations, and I will show these to you in more detail in just a moment.
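As a rough illustration of the field-name-plus-optional-text-domain convention just described, here is a small Python sketch. It is not VuFind's implementation; the function names are invented, and the sample setting values and translations are assumptions chosen to resemble the format and call number examples mentioned above.

```python
from typing import Dict, Optional, Tuple


def parse_translated_facet(setting: str) -> Tuple[str, Optional[str]]:
    """Split 'field' or 'field:TextDomain' into (field, text domain or None)."""
    field, sep, domain = setting.partition(":")
    return field.strip(), (domain.strip() if sep else None)


def translate_facet_value(value: str, domain: Optional[str],
                          main_strings: Dict[str, str],
                          domain_strings: Dict[str, Dict[str, str]]) -> str:
    """Look a facet value up in the main file or in the named text domain."""
    table = main_strings if domain is None else domain_strings.get(domain, {})
    # Untranslated values are shown as-is.
    return table.get(value, value)


# Hypothetical data standing in for a German language file and one text domain.
main_de = {"Book": "Buch"}
domains_de = {"CallNumberFirst": {"A - General Works": "A - Allgemeines"}}

for setting in ["format", "callnumber-first:CallNumberFirst"]:
    print(parse_translated_facet(setting))
```

A field with no colon falls back to the main language file, which matches the behavior described above for plain field names.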
So now that I've shown you how to configure internationalization, both in terms of what languages are provided and also in terms of what facet fields can get translated, let me show you how the translations are actually organized within VuFind. Inside your VuFind home directory, you'll find a languages subdirectory that contains many, many .ini files, and as you can see, most of these are named with a two-letter language code, or a two-letter code plus a regional subdivision, the same codes that are found in the config.ini file.
This is, of course, how we organize each language's translations into a separate file. There's also this special native.ini file, and as I mentioned, this is how we translate the English language names from the configuration file into their native forms for display in the drop down. It's the one special exception within this directory where otherwise every file name corresponds to a code.
So let's look at one of the actual language files, en.ini, where all of the English strings are found. The .ini files containing VuFind's language translations use a subset of the broader .ini file format. They support semicolons for comments, and beyond that, everything is in the format of a translation key, an equals sign, and then the translation in double quotes. VuFind's language files are standardized to always use this format. Every string is always in double quotes, and the keys are always sorted alphabetically, just for consistency and to make it easier to find things.
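To illustrate the file format described above (semicolon comments, key, equals sign, double-quoted value, keys sorted alphabetically), here is a minimal Python sketch of a reader and a normalizer. It is an approximation for illustration, not VuFind's own parser: the function names are invented, comments are simply dropped when normalizing, and the case-insensitive sort order is an assumption.

```python
from typing import Dict


def parse_language_file(text: str) -> Dict[str, str]:
    """Read 'key = "value"' lines, skipping blank lines and ; comments."""
    strings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue
        # Split on the first equals sign only, so values may contain "=".
        key, sep, value = line.partition("=")
        if not sep:
            continue
        value = value.strip()
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            value = value[1:-1]
        strings[key.strip()] = value
    return strings


def normalize_language_file(text: str) -> str:
    """Rewrite a file with sorted keys and every value in double quotes."""
    strings = parse_language_file(text)
    ordered = sorted(strings, key=str.lower)
    return "".join(f'{key} = "{strings[key]}"\n' for key in ordered)


sample = '; a comment\nzebra = "Z"\nApple = apple pie\nbanana = "a = b"\n'
print(normalize_language_file(sample), end="")
```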
If you scroll through, you'll notice that some of the translation strings include placeholder values of the format %%name%%, that is, a name wrapped in double percent signs. These placeholder tokens are used in cases where a translation needs to have a value inserted into it at a particular position, and using these tokens is really important for translation, because if you're assembling a sentence, different languages have different grammars, and so the word order may be different. Using a token makes it possible for translators to always put the changeable parts of a phrase in the correct position for their language.
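Here is a small Python sketch of why placeholder tokens matter for word order. The translation key, the English and German strings, and the function name are all made up for this example; only the %%name%% token convention comes from the language files themselves.

```python
from typing import Dict, Optional

# Hypothetical strings: the same key, with tokens in different positions.
EN = {"results_for": "%%count%% results found for %%query%%"}
DE = {"results_for": "Für %%query%% wurden %%count%% Treffer gefunden"}


def translate(key: str, strings: Dict[str, str],
              tokens: Optional[Dict[str, str]] = None) -> str:
    """Look up a key and substitute %%token%% placeholders with values."""
    # An unknown key is returned unchanged so the gap is visible on screen.
    text = strings.get(key, key)
    for token, value in (tokens or {}).items():
        text = text.replace(token, value)
    return text


values = {"%%count%%": "12", "%%query%%": "cats"}
print(translate("results_for", EN, values))
print(translate("results_for", DE, values))
```

The calling code passes the same tokens for every language; only the translation string decides where each value lands in the sentence.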
You'll also notice that some of the translation keys end in _html. This indicates that the translation string itself may include some HTML, and when it is being displayed in the VuFind interface, it should not be escaped, because we want to actually render the HTML for the end user rather than display the literal <em> tag, as in this particular example. The other keys that do not have the _html suffix are assumed to be plain text with no HTML, and when they get rendered, any special characters are escaped to ensure that they display to the end user exactly as they are written in the translation file, with no characters being interpreted as HTML.
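The difference between the two kinds of keys can be sketched in a few lines of Python. This is an illustration only: in VuFind the template author decides whether to escape, and the _html suffix is a naming convention that signals the right choice, whereas this sketch makes the suffix check automatic for brevity. The keys and strings below are invented.

```python
import html
from typing import Dict


def render(key: str, strings: Dict[str, str]) -> str:
    """Return display-ready text: raw for *_html keys, escaped otherwise."""
    text = strings.get(key, key)
    if key.endswith("_html"):
        # The translation is trusted to contain markup, so leave it alone.
        return text
    # Plain-text translations are escaped so <, > and & display literally.
    return html.escape(text)


strings = {
    "note_html": "Try a <em>broader</em> search",  # hypothetical key
    "plain_note": "Tom & Jerry <3",                # hypothetical key
}
print(render("note_html", strings))
print(render("plain_note", strings))
```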
You'll also notice that there's kind of a mix in here of actual English phrases being used as keys and more abstract tokens being used as keys. This is sort of a side effect of VuFind's code evolving over time. When VuFind was first written, everything was just an English key, but it pretty quickly became apparent that it was sometimes more useful to use a more abstract name for a translation. This is particularly critical when the same English word might have different words for different meanings in another language, so we started using keys that offer more context about how the translation is used so that even if the text is identical for two different things in English, it's possible for a translation to use different phrases for each context in which the words are used if that happens to be necessary.
You'll also notice that there are some subdirectories under the languages directory, and these contain more .ini files. These are the text domains that I mentioned earlier. If you have a specific set of related translations, rather than putting them in the top level language file, which is already very long and contains a lot of different things, you could instead decide that they belong in a text domain, which is just an .ini file in a named subdirectory. So for example, when I showed you the facet translations earlier, I showed that the call number first field had its own text domain. So if we look in these files, you'll see that all of the top level Library of Congress classifications are translated in these files. In some cases, complete translations haven't been provided yet, but the point is this way we can sort out our translations into a more logical arrangement. It becomes easier to find things and easier to read things.
Again, there are still many things in the top level translation files that would probably better belong in a separate text domain, but because text domains were introduced to VuFind later in its development, not everything has been refactored yet. So over time you may start to see more text domains and shorter top level language files, but for now we are just refactoring as time permits and circumstances allow. The language files are really useful for translating short phrases and words to build larger interfaces, but there are also some other translation facilities in VuFind for situations where you have a huge block of text that perhaps would not be appropriate to include as a line in a language file.
One of the most obvious examples of this is VuFind's help pages. If I go over here, there are a few different screens in VuFind, like the search tips page which pops up here, that have a whole lot of text in them, but which we want to be able to provide to users in multiple languages. So for example, if I switch this to Spanish and pop open the search tips, I get that whole page in Spanish. If I had to break that down into a separate translation for every sentence in the file, that would be unreasonably tedious.
So instead, for help screens, we have a template-based system. If you go into the themes, you will find that in the root theme, where our widely shared content lives, there is a HelpTranslations folder under the templates directory. In here, as you can see, there are folders that correspond to language codes. Not every help screen has been translated into every language yet, but where they have been, you'll find that there are templates with parallel names. So for example, here under en is search.phtml, which has the HTML template for search tips in English, and under es, for Español, we have the Spanish version of the same file. So for help screens, VuFind just has some code that looks at the user's current language, checks to see if a help template exists in that language, and displays it if it's available. If a translation is not available, the help system will just display the English template with a warning message apologizing that a translation is not available. This is another area, of course, where members of the community could contribute by translating more of the help screens into more languages.
One final thing, which I will briefly mention now and will show in more detail in a future video when we talk about managing static content in VuFind: there are a few different ways that VuFind can display static pages of content.
If you want to use it as a lightweight content management system, for example if you want to build local pages with frequently asked questions or information about your library, all of these also have template based mechanisms for providing translations. Just as a quick example, let me switch into the Bootstrap 3 theme, go into templates, and in here there is a folder called content, and as you can see there are a few different example content templates here.
All of these are really designed to be overridden locally, so for example the frequently asked questions page is very short, and the Ask a Librarian page just says that this is the default page and isn't meant to be used in production. But what I want to show here relating to internationalization is that all of these content pages have a fixed name, and if you put an underscore and a language code after that name, you can create a language-specific version of the page that will be used when that language is selected. So for example, you can see that Ask a Librarian has a default page and an English page, so if I go back to English and click on Ask a Librarian, it says this is the English Ask a Librarian page.
So now that I've shown you where all of the language files are located and how they work, let me do a quick demo of how you can easily override some language strings to customize something in your local instance of VuFind. Suppose I don't like the message that is displayed when I perform a search and don't get any results. Right now it says that my search, "no results", did not match any resources. But say I want to change that wording. The first thing I need to do is figure out what language string is actually causing this message to appear.
The easiest way to do that is to go into the main language file and search for a phrase that matches the string I want to change.
And of course if I don't find it in the main file, I can search in all of the text domains as well. But most things you see in the VuFind interface are going to come from the main translation file. So let me go back to Visual Studio here and look in the English language file. I'm just going to search for "any resources", because that's a fairly distinctive part of the string I want to override. And sure enough, I find that there is a string here called nohit_lookfor_html. So this is the string I can customize if I want to change this particular message in the user interface.
Like many things in VuFind, you can create files in your local settings directory to override settings from the core. Languages are no exception. So if I go to my local directory and create a folder called languages, and inside that folder I create a file called en.ini, I can customize English language strings here and override the defaults. A nice thing about language file overriding is that you don't have to copy the whole language file from the core. You can override only the strings that you want to change, and they'll get merged automatically with the defaults from the core. So suppose I want this to instead say, sorry, we could not find any matches for your search. I just add the string here in my local language file.
But there's one more important step that we need to remember. Because the language translations are distributed across a number of files, VuFind maintains a cache of language strings so that it doesn't have to spend a lot of time gathering things together every time it renders a page. So if you change a language file, you also need to remember to empty out the language cache so that your changes will take effect immediately. I can do this by going to my VuFind home directory and then simply running sudo rm -rf local/cache/languages, and that will clean out my cache. And now if I refresh my page over here, sure enough, my custom translation has kicked in, and it says, sorry, we could not find any matches for "no results". So customizing any piece of text in VuFind is that easy.
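The two ideas in this demo, merging local overrides over the core strings and having to clear a cache before a change shows up, can be sketched in Python as follows. This is a toy model, not VuFind's code: the class and function names are invented, the cache here lives in memory rather than in local/cache/languages, and the wording of the sample strings, including the %%lookfor%% token, is illustrative.

```python
from typing import Callable, Dict


def merge_language_strings(core: Dict[str, str],
                           local: Dict[str, str]) -> Dict[str, str]:
    """Local strings override core strings; everything else is inherited."""
    merged = dict(core)
    merged.update(local)
    return merged


class LanguageCache:
    """Remembers merged strings per language until clear() is called."""

    def __init__(self, loader: Callable[[str], Dict[str, str]]) -> None:
        self._loader = loader
        self._cache: Dict[str, Dict[str, str]] = {}

    def get(self, lang: str) -> Dict[str, str]:
        if lang not in self._cache:
            self._cache[lang] = self._loader(lang)
        return self._cache[lang]

    def clear(self) -> None:
        self._cache.clear()


core_en = {
    "nohit_lookfor_html": "Your search - %%lookfor%% - did not match any resources.",
    "Search": "Search",
}
local_en: Dict[str, str] = {}  # stands in for local/languages/en.ini

cache = LanguageCache(lambda lang: merge_language_strings(core_en, local_en))
before = cache.get("en")["nohit_lookfor_html"]

# Add a local override; the cached copy does not notice the change...
local_en["nohit_lookfor_html"] = "Sorry, we could not find any matches for %%lookfor%%."
stale = cache.get("en")["nohit_lookfor_html"]

# ...until the cache is emptied, which is the role of removing the cache directory.
cache.clear()
fresh = cache.get("en")["nohit_lookfor_html"]
print(stale)
print(fresh)
```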
And of course, if you need to customize the text in multiple languages, you just need to create multiple language files inside your local languages directory. So now that we've talked about how to use the language system and how to customize what is displayed, let's go a little bit deeper and talk about how you can add additional languages to VuFind.
You can probably guess that it's really just a matter of creating some new files. First, you need to figure out what the language code is for your new language. Then you can take one of the existing language files, rename a copy of it to match the code of the new language, and translate all the strings within it. You of course have to repeat that process for all of the text domains if you want your translation to be comprehensive. You also need to add the new language to the Languages section of config.ini so that it's included in the list of available languages. And you want to make sure that there's an entry in native.ini, so that while the language name is represented in English in config.ini, for consistency with the other languages in that file, it can also be represented in its native form in the user interface. So it's really just those few steps. The hard part, of course, is doing the actual translation work.
And of course if anybody does want to add a new language to VuFind and has any questions about the context of particular strings or needs any help, please feel free to reach out to me or to the community at large and you'll get some assistance because we're always pleased to expand the reach of VuFind by adding more languages.
But outside of that general process for creating languages, I also wanted to highlight one useful tool that VuFind includes that's helpful while managing languages, which I use every time a new VuFind release is being planned and we need to update all of the translations. VuFind actually includes several tools that are geared toward developers, and as such these tools are only available when VuFind is switched into development mode. I showed this in a past video, but just as a reminder, if you go to your Apache configuration in the local directory, httpd-vufind.conf, there is a line in there which sets development mode.
And right now in the demo instance I'm teaching with, development mode is already turned on. But if this were not already turned on, you would have to make sure that any comment marker in front of the line was removed, and after making that change you would need to restart Apache to make the change take effect. If this setting were not in place, you would not be able to see the developer tools. But since in my case it is already turned on, I can just go ahead and show you that if you go to the devtools path under your VuFind URL, you will see a page summarizing the available development tools, and one of these is language details.
If you look at the language details page, you will see a list of all of the languages currently supported by VuFind, and all of these are compared against the English language file to show you where there are gaps. While we try to keep all of our translations up to date, not all of our volunteer translators are always available, so some languages have more complete translations than others. And this tool allows you to find out what strings are missing. So for example, if I wanted to see which lines still need to be translated into Welsh, I could click show here, and I get a pop-up summarizing all of the English lines that still need translation. If you click on this, it automatically highlights the whole thing for you, so it's easy to copy and paste. And this is the process I follow when I request new translations for every release. I go through here, copy and paste all the lists of missing lines, and email them to the volunteer translators, who then send them back to me for incorporation into the project.
This extra lines column is also useful, because if you accidentally create a translation in another language file that doesn't exist in English, it will be highlighted here. And sometimes this can catch minor typographical errors or formatting problems, because they'll be interpreted as unexpected lines in the file.
This will help you track them down and fix them. You can also see this extra help files column gives you a count of how many files exist in that help template directory I showed you for each language.
So as you can see, there are some opportunities to fill gaps here if anybody cares to do so. Also note that there's been some recent work to improve the look and feel of this screen, so by the time you're watching this video, it may actually look a little bit different, but the functionality will remain the same.
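The comparison that the language details page performs can be approximated with a couple of set-style operations. The Python below is a sketch under assumptions: the function names and the sample keys are invented, and the real tool also reports things like help file counts that are not modeled here.

```python
from typing import Dict, List


def compare_to_english(english: Dict[str, str],
                       other: Dict[str, str]) -> Dict[str, List[str]]:
    """List keys missing from a translation and keys English does not have."""
    missing = sorted(key for key in english if key not in other)
    extra = sorted(key for key in other if key not in english)
    return {"missing": missing, "extra": extra}


def missing_lines(english: Dict[str, str], other: Dict[str, str]) -> str:
    """Format the untranslated English lines, ready to send to a translator."""
    report = compare_to_english(english, other)
    return "".join(f'{key} = "{english[key]}"\n' for key in report["missing"])


# Hypothetical English file and a partial translation containing one typo key.
en = {"Author": "Author", "Title": "Title", "Year": "Year"}
cy = {"Author": "Awdur", "Titel": "Teitl"}

print(compare_to_english(en, cy))
print(missing_lines(en, cy), end="")
```

In this made-up example the misspelled key shows up as an extra line and its correctly spelled counterpart shows up as missing, which is how the extra lines column helps catch typos.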
Now that I've showed you where to create files to add a new language and how to manage languages through this development tool, I'd also like to highlight a few useful command line tools that you can use for managing the language files. And these are mainly used during development of the core project. They're less useful for local customizations, so you would find yourself using these primarily if you're adding new features to the VuFind core and those features introduce new language strings.
So I'm going to pop to the command line here, and just a reminder, you can always get a summary of all of VuFind's command line capabilities by just running the index.php file in the public directory through php. This gives me a summary of many commands, and you'll notice that there are a few language-specific ones whose names all start with language.
So there is language/addusingtemplate. This provides a mechanism for creating new language strings by combining existing ones, and I will actually do a demo of this a little bit later, so it will become more clear then.
There is language/copystring, which simply copies one string to another key in all of the language files. This could be useful if you want to differentiate something. Suppose you have a text string that's being used in two places, but you decide that you want to refine the wording in one of those two places to be more specific. You could use the copy string command to create two separate but identical strings in all of the language files, and then you could customize them where appropriate. This way you can add a new string without creating gaps in the translations, but you still provide the opportunity to customize more specifically as needed.
There's also language/delete, which is fairly self-explanatory. It removes a particular translation key from all of the language files. This can be useful if something becomes obsolete.
Again, I'll demonstrate this for you in a moment. And finally, there is language/normalize, the language file normalizer. For this one, you give it the name of a directory containing .ini files, and it will format all of them to meet VuFind's language file standards. As I mentioned earlier, it alphabetizes everything by key, and it puts all of the text to the right of the equals sign into double quotes. It's just a nice way to be sure that everything is neatly formatted. And if you're submitting new language strings to VuFind, our continuous integration process, which looks at submissions and checks their validity, will double check that all of your language files are normalized correctly. So running this tool can help prevent some bumps during the contribution process.
So with all of that introduced, let's dive into a hands-on example of using these command line tools. And I'm actually going to use this to create a real change to the VuFind code that I think will be useful, which I'll submit as a pull request following this call. If I go to the home page of VuFind, by default, I see all of these "Browse by" headings for the different facets that we display on the home page, and I happen to notice that this isn't implemented in the best possible way. Let me show you what I mean. If I go into the English language file and search for "Browse by", I find this home_browse key, which just translates to "Browse by". What this means, and I can show you this in a moment, is that we are just concatenating a facet name onto a translation to create those headings. And as I mentioned, that may not be appropriate in every language, because perhaps in some other language the grammar requires that the facet name come before the string, or even in the middle of the string. This is a situation where we really should be using a placeholder token instead of combining strings together.
And I'm sure the reason this is the way it is, is that this home_browse string was added to VuFind in its early days, before we improved our practices. So this seems like a perfect opportunity to improve that little detail and demonstrate all of the processes around it along the way.
So if we're going to change a string in the language files, the first thing we need to do is figure out where it is being used. I like to use the grep command to do this. It's a standard Unix command line tool for finding things in files. So I'll go to the command line, and I'm going to go into the themes directory, because I think in this instance it's safe to assume that this text is only being used in templates. It's probably not found in any of our controller or service logic. And then I'm just going to run grep -r home_browse *. The -r means recursively search through all the subdirectories of the current directory, home_browse is the phrase to look for, and the star just means look in every file. Grep is telling us here that home_browse is only found in one file, but within that file it's actually used in three places.
So let me open up this file, templates/ContentBlock/FacetList.phtml. And let me make sure I'm in the right theme for doing that. Yes, ContentBlock/FacetList.phtml. There are a few different conditions in this file that cause the headings to be generated in different contexts, but in all three situations it's doing exactly the same thing. As I said before, it's first translating the home_browse text, then it's adding a space to it, and then it is displaying the label. But this is not as flexible as we would like. So let's change this so that it instead uses a token in the appropriate position.
This is a perfect use case for the add using template command line tool for managing language files. So I'm moving back to my VuFind home directory, and I'm just going to show you that you can run php public/index.php language/addusingtemplate --help to get help on how to use this particular command. In this instance, the help screen shows us that the first parameter is the name of the new key that we want to create, and the second is a template showing it how to build that string.
And the template uses placeholders wrapped in double pipe characters to embed existing translation strings.
So this will become more clear as I show the example. Right now we have an existing translation string called home_browse, and we need to create a new one, because this tool creates a new key rather than overwriting an existing one. Let's take this opportunity to use a more specific key name anyway. So let's call this home_browse_by_facet, and then we're going to create the template to match the current behavior. I'm going to use double quotes to surround my template, because it will have spaces in it, and I want it to be treated as a single piece of text when it gets passed in as a command line parameter to my tool. So I'm going to write home_browse with double pipes on either side, which is going to take the existing translation of the home_browse key and put it into my new home_browse_by_facet key. Then I'm going to add a space, and I'm going to create a token called %%facet%%, and then I'm going to end my double-quoted template.
So what this is going to do is go through every single language file and create a new home_browse_by_facet translation, which will be the existing home_browse translation followed by a space and followed by this %%facet%% token. Of course, this may not be optimal in every language, but it is going to match the current behavior, and it's at least going to give us the possibility of improving this in the future, whereas as currently implemented, there's nothing you can do to change the order of this display. So when I run it, it runs through all of my language files. You'll notice that it skips a couple of files because they don't contain a home_browse key and thus cannot translate it, and these files that got skipped are the regional translations.
So the British English and the Flemish Dutch, and these don't contain this particular key because they only contain text that provides a regional variation, and in both of these instances, there's no difference between regions for this particular piece of text. In any case, I've now edited a whole bunch of files.
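To show what the template expansion amounts to, here is a Python sketch of the same idea applied to in-memory dictionaries instead of .ini files. It is an approximation for illustration, not the command's actual implementation: the function name is invented, the German string is a made-up example, and only the double pipe and %%facet%% conventions come from the demo above.

```python
import re
from typing import Dict, List

# Matches ||existing_key|| placeholders inside a template.
PLACEHOLDER = re.compile(r"\|\|(.+?)\|\|")


def add_using_template(languages: Dict[str, Dict[str, str]],
                       new_key: str, template: str) -> List[str]:
    """Create new_key in every language from existing strings.

    Returns the codes of languages that were skipped because they lack
    one of the keys referenced by the template.
    """
    needed = PLACEHOLDER.findall(template)
    skipped = []
    for code, strings in languages.items():
        if any(key not in strings for key in needed):
            skipped.append(code)
            continue
        # Replace each ||key|| with this language's existing translation.
        strings[new_key] = PLACEHOLDER.sub(
            lambda match: strings[match.group(1)], template
        )
    return skipped


languages = {
    "en": {"home_browse": "Browse by"},
    "de": {"home_browse": "Stöbern nach"},  # illustrative translation
    "en-gb": {},                            # regional file without the key
}
skipped = add_using_template(languages, "home_browse_by_facet",
                             "||home_browse|| %%facet%%")
print(languages["en"]["home_browse_by_facet"])
print(skipped)
```

Note that the %%facet%% token passes through untouched, since it is meant to be filled in later, when the heading is actually displayed.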
So if I do a git diff, it will show me what has been added, and as you can see, it's just taken the existing home_browse translation and added the %%facet%% token on the end of it, which is exactly what we wanted. But now our intent is to replace home_browse with home_browse_by_facet, so this is an opportunity to also use the language/delete command. Now that we've moved home_browse, we can get rid of it, and that works just the same way. We run php public/index.php language/delete followed by the name of the key to delete. It runs through all the files and deletes the key, and it reports any files that don't contain it. So if I do a git diff now, we can see that it has removed home_browse, but it's still adding home_browse_by_facet.
So now that we've created home_browse_by_facet, we need to actually use it. This is a great opportunity to demonstrate VuFind's translation view helpers, which are used throughout the templates in the themes. There are two of them. One is called translate and one is called transEsc, invoked in templates as $this->translate and $this->transEsc, and they both work exactly the same way, except that transEsc HTML-escapes the translation when it's done running, whereas translate leaves the text raw. So the escaping helper is used in most places, but the translate helper is used if we need to render HTML, which is why we have that _html suffix to indicate translations that contain HTML, or in contexts where we're not generating HTML, for example if we're building the body of a plain text email.
So let's refactor this code a little bit to make it easier to modify. Since we have identical logic in three different places, I think it will be cleaner if we just take this whole bit of logic and create a variable called label heading. Then we can do the calculation in one place in the code and display it three times over. So I'm just going to put some PHP logic right here that sets the label heading variable to the current logic, which we will change in a moment.
I'm just going to finish refactoring this everywhere.
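As a rough illustration of the refactoring described above, here is a minimal sketch in Python rather than in VuFind's PHP templates. The `translate` and `trans_esc` functions only model the behavior described for the two view helpers, and the `STRINGS` table and its values are hypothetical stand-ins for a loaded language file. The heading is calculated in one function using the original concatenation logic.

```python
from html import escape

# Hypothetical translations standing in for a loaded language file.
STRINGS = {
    "home_browse": "Browse by",
    "Author": "Author & Creator",
}


def translate(key):
    """Return the raw translation, or the key itself if none is found."""
    return STRINGS.get(key, key)


def trans_esc(key):
    """Translate the key, then HTML-escape the result."""
    return escape(translate(key))


def label_heading(facet_label):
    """Pre-refactoring logic: two escaped translations joined by a space."""
    return trans_esc("home_browse") + " " + trans_esc(facet_label)


# Calculate once, then reuse the value wherever the heading is displayed.
heading = label_heading("Author")
print(heading)
```

Because the calculation now lives in one place, changing how the heading is built later means editing a single function instead of three copies.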
I know there are three of these. Where's the third one? One, two, three. Okay. So now we have one piece of code that's generating the label heading, and we're using that value in three different places, but we now have a token. Just to refresh memory, what we used to be doing was translating the home_browse key, then adding a space, then translating the name of the facet field being displayed. Let's be smarter about this, and instead of concatenating this stuff together, let's translate home_browse_by_facet. Both of the translate view helpers accept a second parameter, which is optional, and contains an array mapping tokens to values. In this case we created a token called %%facet%%, and we want that to be replaced with the name of the facet. So I'm going to take away the concatenation, and instead plug this value right into this array.
But there's one more important detail in this instance, which is that here we're translating and escaping the label, but we're also translating and escaping this whole thing. We don't want to escape the value twice, because if we do that we might end up causing some HTML entities to be displayed to the end user, and that would be confusing. So when generating the list of tokens here, we'll just do a flat-out translate, and then we'll escape at the end after we've combined all the parts together.
So that completes my refactoring, but you will recall that I mentioned that the language cache needs to be cleared in order for changes to the language files to take effect, and let me just demonstrate that to prove that what I've done has actually had an effect. If I refresh the page now, all of my headings just changed to the raw key, home_browse_by_facet, because I created a new string, but that string isn't in the cache yet.
So now if I go down to the terminal and remove local/cache/languages again to clear out that part of the cache, and then refresh my page again, we're back to having appropriate headings on all of our columns, but now we're using a tokenized version of the language string that will be easier to customize in the future.
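The token replacement and the escape-once rule from this refactoring can be sketched as follows. This is a language-neutral model written in Python, not VuFind's actual PHP helpers: the `translate` and `trans_esc` functions, the `STRINGS` table, and the translation values are all assumptions made for illustration.

```python
from html import escape

# Hypothetical translations; the first value contains a %%facet%% token.
STRINGS = {
    "home_browse_by_facet": "Browse by %%facet%%",
    "Author": "Author & Creator",
}


def translate(key, tokens=None):
    """Look up the key, then replace each token with its value (no escaping)."""
    text = STRINGS.get(key, key)
    for token, value in (tokens or {}).items():
        text = text.replace(token, value)
    return text


def trans_esc(key, tokens=None):
    """Same as translate, but HTML-escape the combined result."""
    return escape(translate(key, tokens))


# Escape once: raw translation for the token value, escaping at the very end.
good = trans_esc("home_browse_by_facet", {"%%facet%%": translate("Author")})

# Escape twice: the token value is escaped, then escaped again with the whole.
bad = trans_esc("home_browse_by_facet", {"%%facet%%": trans_esc("Author")})

print(good)  # Browse by Author &amp; Creator
print(bad)   # Browse by Author &amp;amp; Creator
```

The second result is the problem described above: a browser would display a literal "&amp;" to the end user instead of "&".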
So we've just about run out of time here, but there's one last thing I wanted to very quickly highlight. I've shown you how translation works in the template files, but you may also occasionally need to do translations deeper in the code when you're writing controllers or building services, and VuFind provides a couple of useful mechanisms to help you with that. If you look inside the core VuFind module code under the VuFind\I18n\Translator namespace, there's something called the TranslatorAwareInterface and something called the TranslatorAwareTrait. To make a long story short, if you implement the TranslatorAwareInterface on a service or controller, many of VuFind's service managers will automatically call its setTranslator method and pass in the Laminas class that actually does the work of translating things. Even if it doesn't get auto-injected, the interface makes it easy to inject the translator through a factory, depending on context. The point is that anything that implements the TranslatorAwareInterface can receive a translator and use it to translate strings.
The TranslatorAwareTrait is a trait that can be mixed into any of your classes. It gives you the setTranslator method to match the TranslatorAwareInterface, and it provides some useful utility methods that you might find helpful. It has a getTranslatorLocale method, which will tell you which user language code is currently active, and it has the translate method, which we saw in the view helper earlier. That method takes a string to translate, an array of tokens, and also a third parameter, a default value that can be used if no translation is found at all. So this is used throughout the VuFind code, and it can be useful in your custom code. I don't have time today to go much deeper into it, but I just wanted to make you aware that this is available should you need it.
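The translator-aware pattern described above can be modeled roughly as follows. This is a Python sketch of the idea only, not VuFind's PHP interface or trait: the `Translator` class, the `TranslatorAwareMixin`, the `HoldNotifier` service, the `hold_ready` key, and the "en" fallback locale are all hypothetical names and assumptions chosen for illustration.

```python
class Translator:
    """Hypothetical stand-in for the translator object that gets injected."""

    def __init__(self, locale, strings):
        self.locale = locale
        self.strings = strings

    def lookup(self, key):
        """Return the translation for key, or None if there is none."""
        return self.strings.get(key)


class TranslatorAwareMixin:
    """Models a mix-in offering set_translator plus utility methods."""

    translator = None

    def set_translator(self, translator):
        self.translator = translator

    def get_translator_locale(self, default="en"):
        # Assumption: fall back to a default locale when nothing is injected.
        return self.translator.locale if self.translator else default

    def translate(self, key, tokens=None, default=None):
        """Translate key, falling back to default (or the key), then fill tokens."""
        found = self.translator.lookup(key) if self.translator else None
        if found is None:
            found = default if default is not None else key
        for token, value in (tokens or {}).items():
            found = found.replace(token, value)
        return found


class HoldNotifier(TranslatorAwareMixin):
    """Hypothetical service that needs translations outside a template."""

    def subject(self, title):
        return self.translate(
            "hold_ready", {"%%title%%": title}, "Your hold is ready: %%title%%"
        )


notifier = HoldNotifier()
# A factory or service manager would normally perform this injection.
notifier.set_translator(Translator("de", {"hold_ready": "Bereit: %%title%%"}))
print(notifier.get_translator_locale())  # de
print(notifier.subject("Faust"))         # Bereit: Faust
```

A service with no matching translation falls back to the default value passed as the third parameter, which mirrors the behavior described for the translate method.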
And that's all we have time for, so I hope this has been a useful introduction to internationalization, and that it will help you customize your local interface, and perhaps inspire you to contribute some more languages to VuFind. Thank you as always for your time, and see you next time.
This is an edited version of an automated transcript. Apologies for any errors.