Warning: This page has not been updated in over a year and may be outdated or deprecated.
videos:administering_a_vufind_server
Differences
This shows you the differences between two versions of the page.
videos:administering_a_vufind_server [2021/12/18 00:02] – [Transcript] akilsdonk | videos:administering_a_vufind_server [2023/04/26 13:34] (current) – [Transcript] crhallberg | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== Video 5: Administering a VuFind | + | ====== Video 5: Administering a VuFind® |
- | The fifth VuFind | + | The fifth VuFind® |
Video is available as an [[https:// | Video is available as an [[https:// | ||
Line 18: | Line 18: | ||
===== Transcript ===== | ===== Transcript ===== | ||
- | // This is a raw machine-generated transcript; | + | Welcome to the fifth VuFind video. |
- | Welcome to the fifth VuFind | + | So let's start with Solr setup. For the purposes of this video, I am going to show you how to set up VuFind autostart using systemd. Systemd is a set of tools shared across Linux distributions which are focused on system and service management. This is a comparatively new development in Linux land, which is to say it' |
- | This time around we are going to talk | + | |
- | about administering a VuFind server | + | |
- | after you've built and configured it | + | |
- | since there are a few common tasks that | + | |
- | it' | + | |
- | cover both getting Solr setup in a | + | |
- | secure way and making sure that it | + | |
- | starts automatically when your server | + | |
- | boots up and we're going to talk about | + | |
- | some cleanup you'll want to do to make | + | |
- | sure that you don't accidentally fill up | + | |
- | your disk without realizing | + | |
- | So let' | + | But first, |
- | of this video I am going to show you how | + | |
- | to set up VuFind | + | |
- | systemd. Systemd is a set of tools | + | |
- | shared across Linux distributions which | + | |
- | are focused on system and service | + | |
- | management. This is a comparatively new | + | |
- | development in Linux land, which is to | + | |
- | say it's been around for several years, | + | |
- | but if you've been in Linux a long time | + | |
- | you'd be you may be more familiar with | + | |
- | the earlier system that used symbolic | + | |
- | links to a directory called init.d. | + | |
- | systemd gets rid of obscure Bash | + | |
- | scripts and replaces them with | + | |
- | configuration files. And while it took a | + | |
- | little getting used to I've come to | + | |
- | really like it and so I'm going to go | + | |
- | through it in some detail in this video | + | |
- | to help you understand how it works and | + | |
- | what it's doing for you. | + | |
- | But first let's talk about Solr security quickly. So | + | So what we are going to do in this video is create a Solr user, give ownership of the Solr directories to that Solr user, and then set up systemd so that the Solr user starts up Solr when the server boots. So first of all, we'll just do the bare minimum to create a user. We will say '' |
- | when you install Solr you're creating a | + | |
- | web service that can have data sent to | + | |
- | it and that VuFind communicates with | + | |
- | to do searching. I hope it goes without | + | |
- | saying, but it might not, that you do not | + | |
- | want to expose your Solr index to the | + | |
- | whole world because people can do | + | |
- | malicious things to it. So you should | + | |
- | always have Solr behind a firewall so | + | |
- | that VuFind can talk to it but nobody | + | |
- | who doesn' | + | |
- | Additionally it's a really good idea to | + | |
- | create a user account dedicated to | + | |
- | running Solr and give ownership of the | + | |
- | Solr directories to that account so that if somebody does | + | |
- | somehow get to your Solr web interface | + | |
- | and exploit a bug that allows them to do | + | |
- | something malicious, their ability to | + | |
- | do harm is somewhat constrained by file | + | |
- | ownerships and so forth. | + | |
- | going to do in this video is create a | + | |
- | Solr user, give ownership of the Solr | + | |
- | directories to that Solr user, and then | + | |
- | set up systemd so that the Solr user | + | |
- | starts up Solr when the server boots. | + | |
- | So first of all we'll just do the bare | + | And we're just going to accept all the defaults because this is a demo. So now that we have a user created called solr, we can give it ownership of our Solr directories, " |
- | minimum to create a user we will say "sudo add user solr" so this creates a | + | |
- | user named solr and we're going to set | + | Now we are all set to create a systemd service to boot up Solr with. There is a directory called |
- | the disabled password switch because we | + | |
- | don't need to have a password for this | + | I'm going to start with '' |
- | account we're not going to be logging in | + | |
- | as it. And we're just going to accept all | + | And I'm going to create a '' |
- | the defaults because this is a demo. So | + | |
- | now that we have a user created called | + | Next I'm going to say '' |
- | solr we can give it ownership of our | + | |
- | solr directories "sudo chown -R", for recursive, " | + | Next I'm going to say '' |
- | ownership of the directory in question | + | |
- | and we're going to say " | + | Next I'm going to say '' |
- | the VuFind home Solr directory I see | + | |
- | that it is owned by solr and in the | + | Next, '' |
- | solr group. Now we are all set to create | + | |
- | a systemd service to boot up Solr with. | + | We also add '' |
- | There is a directory called | + | |
- | system D slash system which is where | + | Finally we add '' |
- | service definitions live so I'm going to | + | |
- | use my Nano editor | + | Finally, |
- | Etsy system | + | |
- | find that service | + | If you used the older Linux startup system, with init.d, |
- | definition needs to end in that service | + | |
- | to tell system view that | + | So I've now saved this file, I'm going to exit out of here and hope I didn't make any typos. |
- | is a service | + | |
- | file and I'm going to type in a whole | + | So just to show you I'm going to open up VuFind |
- | bunch of parameters here to explain how | + | |
- | the service is supposed to work. I'm | + | I just do '' |
- | going to start with after equals | + | |
- | dot target | + | I wait a moment |
- | to essentially define dependencies so | + | |
- | you can make services that wait for | + | What we are really concerned about here is ensuring that Solr starts every time our server boots up so that we don't have to remember to start it by hand and if something happens in the middle of the night, it just recovers on its own. So that's easily done. We just say '' |
- | other services to start before they | + | |
- | start and so forth | + | So let's prove that this works and reboot the server. All right. So now I'm going to log in, and now if everything worked, I should be able to open up a web browser and access my VuFind |
- | but in this instance we're just going to | + | |
- | use network target which is a predefined | + | So just to demonstrate, if I cd into my VuFind |
- | setting in systemd which means wait | + | |
- | further | + | The default is to use PHP's built-in disk-based session handling, where it just sticks files in a directory, but you can also set it up to use a database table or to use different kinds of memory-based stores like Redis or Memcached. Depending |
- | you do anything Solr doesn' | + | |
- | good without network access | + | With the default disk-based sessions, normally PHP should clean up after itself, and you shouldn' |
- | going to create a service | + | |
- | most of the main settings will live I'm | + | If you use the database-based session storage, there' |
- | going to say type equals | + | |
- | used when you run a script that exits | + | So that's all I have for today. I hope that's helpful. There are certainly other issues to think about when administering a server, and there are some wiki pages that talk about this in more detail, but if you can get Solr to start and you can avoid filling up your disk, you are well on your way to having a happy and healthy |
- | quickly but spawns a long-lived child | + | |
- | process which is a an accurate | + | //This is an edited version of an automated transcript. Apologies for any errors.// |
- | description of the solr SH script that | + | |
- | you find uses to start up solr script | + | |
- | returns but it forks a process that | + | |
- | lives until we stop it next I'm going to | + | |
- | say exact start equal slash bin slash s | + | |
- | H minus L minus C and then in single | + | |
- | quotes a slash user slash local slash Q | + | |
- | fine slash solar dot s H starch and | + | |
- | single quote minus X so as you can | + | |
- | probably guess from the setting name | + | |
- | exact start is where you specify the | + | |
- | systemd what command you use to start | + | |
- | the service | + | |
- | full paths to everything because we | + | |
- | don't want to make any assumptions about | + | |
- | the environment that's set up when | + | |
- | systemd is running things | + | |
- | saying use the standard shell the minus | + | |
- | else which makes the shell act as if a | + | |
- | user has logged into it when running a | + | |
- | command which sets up the environment | + | |
- | correctly so this gives us access to if | + | |
- | you find home if you find local dir etc | + | |
- | then the minus C is just telling the | + | |
- | shell what command to run so that quoted | + | |
- | string we're running the Solar scripts | + | |
- | to start solar and finally | + | |
- | just provides extra detailed | + | |
- | the shell which can be useful for error | + | |
- | logging and troubleshooting | + | |
- | going to say Pig file equals user local | + | |
- | few fine solar vendor bin solar - 88 e | + | |
- | dot Pig this tells system D where the | + | |
- | file containing the process ID of the | + | |
- | running | + | |
- | something that's set up by the solar dot | + | |
- | s H script when we start things up and | + | |
- | it's useful for knowing whether or not | + | |
- | the service is running and also | + | |
- | understanding how to stop the process | + | |
- | next I'm going to say user equals solar | + | |
- | this is where we specify that the solar | + | |
- | user we created earlier will be used to | + | |
- | run the solar process | + | |
- | equals slash bin slash s H minus L minus | + | |
- | C user local view fine solar SH stop all | + | |
- | in single quotes and minus X so as you | + | |
- | can see this exactly mirrors the exact | + | |
- | start command except this is the command | + | |
- | used to stop solar instead of to start | + | |
- | it up we add success exit status equals | + | |
- | 0 this tells system D that the exact | + | |
- | start command will return a return value | + | |
- | of zero when it succeeds | + | |
- | so if solar dive SH comes back with some | + | |
- | other exit status something is wrong and | + | |
- | the process will throw an error | + | |
- | finally limit n Oh file equals 65,000 | + | |
- | and limit in proc equals 65,000 you may | + | |
- | have noticed recently when you start up | + | |
- | solar data SH from the command line | + | |
- | it throws warnings about file | + | |
- | and process limit settings if you don't | + | |
- | allow a solar to have lots of files open | + | |
- | it can potentially cause performance | + | |
- | issues and so it's recommended that you | + | |
- | set these limits to these values when | + | |
- | running solar in production and using | + | |
- | system D provides a really convenient | + | |
- | way to set those settings and then not | + | |
- | have to think about them anymore | + | |
- | we create an install | + | |
- | wanted by equals | + | |
- | this section tells system D what | + | |
- | circumstances it should start this | + | |
- | process under when the process is | + | |
- | enabled | + | |
- | the server is running and accepting | + | |
- | logins but isn't necessarily presenting | + | |
- | a graphical interface so it's kind of a | + | |
- | low threshold for system is up and | + | |
- | running in a normal mode if you use the | + | |
- | older Linux startup system | + | |
- | D there were things called run levels | + | |
- | those don't exist anymore | + | |
- | instead | + | |
- | multi-user target is a safe and | + | |
- | appropriate option for this use case so | + | |
- | I've now saved this file I'm going to | + | |
- | exit out of here and hope I didn't make | + | |
- | any typos so just to show you I'm going | + | |
- | to open up if you find in a web browser | + | |
- | and try to do a search and it fails | + | |
- | because | + | |
- | that I've defined a service I can start | + | |
- | solar up using the standard | + | |
- | command which systemd uses to start and | + | |
- | stop services | + | |
- | systemctl start view fine because I | + | |
- | named my file view find dot service | + | |
- | I wait a moment | + | |
- | and now I'm back | + | |
- | prompt so it appears to have succeeded | + | |
- | let's refresh our browser | + | |
- | have search results it worked | + | |
- | wanted to stop or restart the service I | + | |
- | could do sudo systemctl stop view find | + | |
- | which stops it and if it were running I | + | |
- | could just substitute restart for stop | + | |
- | to stop and then start it but what we | + | |
- | are really concerned about here is | + | |
- | ensuring that solar starts every time | + | |
- | our server boots up so that we don't | + | |
- | have to remember to start it by hand and | + | |
- | if something happens in the middle of | + | |
- | the night it just recovers on its own so | + | |
- | that's easily done we just say sudo | + | |
- | systemctl enable | + | |
- | system has enabled the service based on | + | |
- | that wanted by setting we put in the | + | |
- | service file it knows that when it's | + | |
- | enabled it needs to start up whenever | + | |
- | the system is running in multi-user mode | + | |
- | and accepting connections | + | |
- | that this works and reboot the server | + | |
- | all right so now I'm going to login and | + | |
- | now if everything worked I should be | + | |
- | able to open up a web browser and access | + | |
- | my view find instance and do searches | + | |
- | without having to manually start | + | |
- | anything | + | |
- | so here' | + | |
- | results we are successful | + | |
- | covered getting | + | |
- | automatically let's also talk a little | + | |
- | bit about cleaning up because | + | |
- | potentially has a lot of users accessing | + | |
- | it and some of the activity that users | + | |
- | perform creates traces that can over | + | |
- | time accumulate into quite a bit of data | + | |
- | so first of all search history | + | |
- | time anybody does a search it creates a | + | |
- | row in a database in my sequel | + | |
- | whatever database platform you're using | + | |
- | called search | + | |
- | table is that it allows us to maintain a | + | |
- | search history | + | |
- | I can see that I did a blank search | + | |
- | every user has a search history | + | |
- | maintained in the search table also | + | |
- | there' | + | |
- | can potentially save their searches so | + | |
- | that they can refer back to them in | + | |
- | future | + | |
- | have a notification feature that when | + | |
- | enabled lets people subscribe to | + | |
- | searches and get emails | + | |
- | new results have showed up in those sets | + | |
- | so it's useful to have this database but | + | |
- | of course the vast majority of searches | + | |
- | that get entered in the database are | + | |
- | just forgotten about and never referred | + | |
- | to again and if you have people doing | + | |
- | thousands or millions of searches in | + | |
- | your system this database table can get | + | |
- | really big fortunately view find has a | + | |
- | command-line utility called expire | + | |
- | searches which will clean out the table | + | |
- | so just to demonstrate if I CD into my | + | |
- | view find home directory and run PHP | + | |
- | public index dot PHP util expire | + | |
- | searches in this example it deleted 70 | + | |
- | old searches from all of the time | + | |
- | I've done searching in past videos and | + | |
- | you can see that if I manage to create | + | |
- | 70 searches just in the process of | + | |
- | recording these videos you can end up | + | |
- | with a lot of these things if you have | + | |
- | search engines crawling you and/or a | + | |
- | large user base so I strongly encourage | + | |
- | you to set up a cron job that regularly | + | |
- | runs this expire searches | + | |
- | otherwise | + | |
- | sequel | + | |
- | you'll also learn if you find yourself | + | |
- | in that situation that well my sequel | + | |
- | can grow it can never shrink | + | |
- | sequel | + | |
- | even if you clear data out of it it | + | |
- | doesn' | + | |
- | the already claimed space if you really | + | |
- | need to reclaim disk space from an | + | |
- | out-of-control | + | |
- | best thing to do is to dump the whole | + | |
- | database with my sequel | + | |
- | the database and re-import | + | |
- | clean up all the disk space and make a | + | |
- | nice new small optimized file for you so | + | |
- | just a heads up this might also be a | + | |
- | good time to point out that if you find | + | |
- | has a whole bunch of command line tools | + | |
- | for you and if you just run the public | + | |
- | index dot PHP script from the command | + | |
- | line you will get a list of all of them | + | |
- | so there are a few different kinds of | + | |
- | exploration | + | |
- | and ends some of which we will go into | + | |
- | more detail in on future videos but just | + | |
- | be aware these exist they might come in | + | |
- | handy | + | |
- | so getting back to the subject of | + | |
- | cleaning up after ourselves there' | + | |
- | other thing that can potentially take up | + | |
- | a lot of space and that is user sessions | + | |
- | so the way that PHP and really any | + | |
- | web-based system allows users to have a | + | |
- | persistent state within the system such | + | |
- | as being logged in or tracking a | + | |
- | partially completed workflow is to store | + | |
- | some | + | |
- | data on the server called a session | + | |
- | PHP sends a session cookie to the user | + | |
- | which gives them a unique identifier | + | |
- | that's tied to a session file on the | + | |
- | server and every time the user comes in | + | |
- | with that cookie PHP loads that session | + | |
- | data and then can use it to see who is | + | |
- | being interacted with and what they' | + | |
- | currently in the process of doing whew | + | |
- | fine doesn' | + | |
- | most of the time but there are certainly | + | |
- | places where it's important such as | + | |
- | enabling you to log in and stay logged | + | |
- | in or tracking what page to redirect you | + | |
- | to after you've completed a login | + | |
- | process | + | |
- | setting that controls how user session | + | |
- | data is stored because there are | + | |
- | actually several options | + | |
- | to use PHP s built-in disk based session | + | |
- | handling where it just sticks files in a | + | |
- | directory but you can also set it up to | + | |
- | use a database table or to use different | + | |
- | kinds of memory based stores like Redis | + | |
- | or memcache D depending | + | |
- | you choose you may have different | + | |
- | maintenance issues to deal with | + | |
- | with the default disk based sessions | + | |
- | normally PHP should clean up after | + | |
- | itself and you shouldn' | + | |
- | about it | + | |
- | but I have experienced situations where | + | |
- | things have not quite gone as planned | + | |
- | and session files have accumulated | + | |
- | faster than desired | + | |
- | have such a heavy load that there are | + | |
- | too many files in the session directory | + | |
- | for PHP to handle it might stop cleaning | + | |
- | up after itself | + | |
- | may want to monitor on your server | + | |
- | perhaps with a cron job that cleans out | + | |
- | files past a certain age in the | + | |
- | directory used for holding sessions | + | |
- | you use the database based session | + | |
- | storage there' | + | |
- | command-line utility which you can see | + | |
- | listed right here which cleans up the | + | |
- | table in the database | + | |
- | you where these settings live if you | + | |
- | look in your config | + | |
- | local slash config slash you find slash | + | |
- | config | + | |
- | file called session which I'm going to | + | |
- | search for and as you can see you can | + | |
- | set the type which here defaults to file | + | |
- | that other options include memcache and | + | |
- | database | + | |
- | session which defaults to an hour so in | + | |
- | theory these things should be cleaned up | + | |
- | after an hour if a user stops being | + | |
- | active | + | |
- | if you're worried about anything | + | |
- | sensitive in there and then there are a | + | |
- | number of settings that are specific to | + | |
- | different session handlers so for | + | |
- | example if you're using files you can | + | |
- | specify a non-default save path from the | + | |
- | directory where the sessions live if | + | |
- | you're using memcache you can specify | + | |
- | how to connect the memcache server etc | + | |
- | so that's all I have for today I hope | + | |
- | that's helpful | + | |
- | there are certainly other issues to | + | |
- | think about when administering a server | + | |
- | and there are some wiki pages that talk | + | |
- | about this in more detail but if you can | + | |
- | get solar to start and you can avoid | + | |
- | filling up your disk you are well on | + | |
- | your way to having a happy and healthy | + | |
- | view fine server | + | |
---- struct data ---- | ---- struct data ---- | ||
+ | properties.Page Owner : | ||
---- | ---- | ||
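
The service definition described in the transcript can be sketched as a small shell script that writes the unit file to a scratch location for review. This is a sketch under stated assumptions, not the file shown in the video: the install path `/usr/local/vufind`, the Solr port `8983` (which determines the PID file name), and the `Description` line are assumptions, and `daemon-reload` is an extra step not mentioned in the transcript.

```shell
#!/bin/sh
# Sketch only. Assumed values (not confirmed by this page): adjust to your system.
VUFIND_HOME="${VUFIND_HOME:-/usr/local/vufind}"   # assumed VuFind install path
SOLR_PORT="${SOLR_PORT:-8983}"                    # assumed Solr port; the PID file name includes it
UNIT_OUT="${UNIT_OUT:-${TMPDIR:-/tmp}/vufind.service}"  # scratch copy to review before installing

# Write the unit file. The settings follow the transcript: wait for the
# network, treat solr.sh as a forking script, run it as the solr user,
# and raise the open-file and process limits.
cat > "$UNIT_OUT" <<EOF
[Unit]
Description=VuFind Solr index
After=network.target

[Service]
Type=forking
ExecStart=/bin/sh -l -c '$VUFIND_HOME/solr.sh start' -x
PIDFile=$VUFIND_HOME/solr/vendor/bin/solr-$SOLR_PORT.pid
User=solr
ExecStop=/bin/sh -l -c '$VUFIND_HOME/solr.sh stop' -x
SuccessExitStatus=0
LimitNOFILE=65000
LimitNPROC=65000

[Install]
WantedBy=multi-user.target
EOF

# Print (rather than run) the privileged steps so they can be checked first.
printf '%s\n' \
    "Wrote $UNIT_OUT. After reviewing it, install and enable it with:" \
    "  sudo cp $UNIT_OUT /etc/systemd/system/vufind.service" \
    "  sudo systemctl daemon-reload   # makes systemd re-read unit files" \
    "  sudo systemctl start vufind" \
    "  sudo systemctl enable vufind"
```

The script deliberately stops short of touching `/etc/systemd/system` itself, so a typo in the generated file can be caught before systemd ever sees it.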
videos/administering_a_vufind_server.1639785752.txt.gz · Last modified: 2021/12/18 00:02 by akilsdonk
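
The cleanup advice in the transcript (a cron job for expiring old searches, and optionally one for stale disk-based session files) could look roughly like the following. This is a hedged sketch: the session directory, the two-hour age threshold, the cron schedule, and the exact spelling of the expire-searches command are assumptions. Running `php public/index.php` with no arguments lists the commands your VuFind version actually provides.

```shell
#!/bin/sh
# Sketch only: removes PHP session files older than a given age.
# PHP's file-based handler names its files sess_<id>; anything else is left alone.
# Usage: clean_old_sessions <session-dir> <max-age-minutes>
clean_old_sessions() {
    find "$1" -maxdepth 1 -type f -name 'sess_*' -mmin "+$2" -delete
}

# Only act when a directory is passed on the command line, for example:
#   clean_sessions.sh /var/lib/php/sessions 120
# (the script name and directory are hypothetical; check session.save_path
# in your PHP configuration or the [Session] section of config.ini)
if [ -n "${1:-}" ]; then
    clean_old_sessions "$1" "${2:-120}"
fi

# Hypothetical crontab entries (paths, times and command spelling are assumptions):
#   0 2 * * *  cd /usr/local/vufind && php public/index.php util/expire_searches
#   30 2 * * * /usr/local/bin/clean_sessions.sh /var/lib/php/sessions 120
```

The age threshold is kept longer than the one-hour session lifetime mentioned in the transcript so that sessions still in use are not removed.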