Warning: This page has not been updated in over a year and may be outdated or deprecated.
====== Fault Tolerance and Load Balancing ======
This page contains strategies, examples and guidelines for creating a fault-tolerant and load-balanced VuFind® service.

===== High-Availability Strategies =====

==== Load-Balancing ====

Load-balancing distributes requests across multiple servers, or VuFind® nodes. If configured correctly, a load-balanced service can continue to serve users even when an individual node fails.

If possible, load balancers should be redundant themselves. Many vendor load balancers (F5, Barracuda, Kemp, etc.) and open-source solutions (Zen, HAProxy, etc.) have documented solutions for achieving high availability.

Note that many installations separate the VuFind® front-end (Apache, MySQL, etc.) from the Solr back-end in order to minimize the risk associated with downtime on a particular server and to keep service management simple.

Below are two example configurations:
//(Diagram: two example load-balanced configurations)//
=== Front-End Sessions ===

When configuring a load-balancer, keep in mind that VuFind® uses sessions to track logged-in users: if a request is routed to a node that doesn't recognize the user's session, the user appears to be logged out.

There are two primary strategies for dealing with this issue:
== Node Persistence ==

Most load-balancers can be configured so that a user always hits the same node during their session. This is often referred to as "sticky sessions." The strategies for achieving this depend on the load balancer and can range from using cookies to keeping track of IP addresses. Consult the documentation for the load balancer to learn about the strategies available.

The issue with this strategy is that if a front-end node goes down, all the users on that node will be kicked off and will need to re-authenticate when served by another node. Additionally, sticky sessions can distribute load unevenly across the nodes.
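As an illustration, cookie-based persistence in HAProxy might look like the following sketch (the backend name, addresses and cookie values are placeholders, not taken from this page):

```
backend vufind
    balance roundrobin
    # Insert a SERVERID cookie so each client keeps hitting the same node
    cookie SERVERID insert indirect nocache
    server node1 192.0.2.11:80 check cookie node1
    server node2 192.0.2.12:80 check cookie node2
    server node3 192.0.2.13:80 check cookie node3
```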
== Session Distribution ==

If sessions are stored in a local database instance on each node (MySQL/MariaDB), the database must be replicated across the nodes so that any node can serve any user's session. One way to accomplish this:

  * MariaDB Galera Cluster (see the Galera Cluster documentation for setup instructions)
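A minimal sketch of the Galera-related server settings, assuming a three-node cluster (the cluster name, addresses and provider path are placeholders and vary by distribution):

```
[mysqld]
# Galera requires row-based replication and InnoDB
binlog_format            = ROW
default_storage_engine   = InnoDB
innodb_autoinc_lock_mode = 2

wsrep_on                 = ON
wsrep_provider           = /usr/lib/galera/libgalera_smm.so
wsrep_cluster_name       = vufind_cluster
# All cluster members; the first node is bootstrapped with an empty gcomm://
wsrep_cluster_address    = gcomm://192.0.2.11,192.0.2.12,192.0.2.13
wsrep_node_address       = 192.0.2.11
```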
==== Solr Replication ====

There are two primary options for keeping multiple Solr nodes synchronized:

  * SolrCloud (see the SolrCloud section of the Apache Solr Reference Guide)
  * Traditional index replication (see the index replication section of the Apache Solr Reference Guide)
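For the traditional option, replication is configured per core in solrconfig.xml. A sketch, assuming a master node at a placeholder address and a biblio core:

```
<!-- On the master: publish the index after each commit -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
  </lst>
</requestHandler>

<!-- On each slave: poll the master for changes -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://solr-master.example.org:8983/solr/biblio/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>
```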
===== Example Configuration =====

This example is by no means complete, but it gives some hints and guidelines for creating a fault-tolerant and load-balanced VuFind® service. Configuration of a load balancer is out of the scope of this document; a hardware load balancer could be used as well as a software-based solution such as HAProxy.

There are, of course, many ways to improve the fault tolerance of a service, but this example describes one way to implement a fault-tolerant and load-balanced VuFind®. The basic idea is to replicate a single VuFind® server into at least three separate servers. Three is an important number: it allows clustered services to always reach a majority vote and agree on a leader, often called having a quorum. This avoids the so-called split-brain situation, in which two groups of servers that have trouble communicating with each other both continue to serve users' requests. Having at least three servers also allows one to fail without the total capacity suffering as much as it would with only two servers.

While it's easy to replicate VuFind®'s own code and configuration across several servers, the database and the Solr index need replication mechanisms of their own, which the components below provide.
==== Required Software Components ====

All of the components below will have an instance running on each server node.

  * VuFind®, of course
  * MariaDB Galera Cluster
    * This replaces the standard MySQL or MariaDB installation and allows the database to be replicated across all the nodes.
    * See the MariaDB Galera Cluster documentation for instructions on how to get started with a database cluster.
  * SolrCloud
    * This replaces the single Solr instance and keeps the index synchronized on all the nodes.
    * See the SolrCloud documentation for instructions on getting started.
=== Load Balancer and Apache Setup ===

To enable the front-end servers' Apache instances to see and log the real client IP addresses instead of the load balancer's address, install and configure the mod_rpaf module.
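A mod_rpaf configuration might look like the following sketch (directive names follow the 0.8 fork of mod_rpaf; the module path and the load balancer's IP are placeholders):

```
LoadModule rpaf_module modules/mod_rpaf.so

# Trust X-Forwarded-For headers coming from the load balancer only
RPAF_Enable       On
RPAF_ProxyIPs     192.0.2.1
RPAF_Header       X-Forwarded-For
RPAF_SetHostName  On
RPAF_SetHTTPS     On
RPAF_SetPort      On
```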
=== VuFind® Setup ===

Here it is assumed that the MariaDB Galera Cluster and SolrCloud are already up and running on each node. Basic setup of VuFind® is also assumed to have been done on each node. The cluster-specific steps are:

  * Configure the database connection in config.ini as usual on each node; Galera keeps the local database instances synchronized.
  * Configure the Solr URL in config.ini on each node:

  [Index]
  url[] = http://
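With placeholder host names, an [Index] section listing every Solr node might look like the following; with several url[] entries VuFind® can fail over between the nodes:

```
[Index]
url[] = http://solr1.example.org:8983/solr
url[] = http://solr2.example.org:8983/solr
url[] = http://solr3.example.org:8983/solr
```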
  * Configure VuFind® to store sessions in the database:

  [Session]
  type = Database

  * Set up the load balancer to use a health probe to check each server's status from VuFind®, so that failed nodes are removed from the rotation.
  * Set up scheduled tasks like removal of expired searches to run on only one of the servers.
  * Set up AlphaBrowse index creation (if you use it) to run on ALL servers. AlphaBrowse index files are not automatically distributed to other nodes by Solr.
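The health probe above could be a simple HTTP check in HAProxy (the backend name, path and addresses are placeholders; any lightweight VuFind® page can act as the probe target):

```
backend vufind
    # Mark a node as down when the probe URL stops returning a success code
    option httpchk GET /vufind/Search/Home
    server node1 192.0.2.11:80 check
    server node2 192.0.2.12:80 check
    server node3 192.0.2.13:80 check
```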
=== Special Considerations ===

== Static File Timestamps ==

It is important to make sure all the static files served by VuFind® carry identical modification timestamps on all the servers; otherwise browsers may re-download files whenever a request happens to hit a different node. Some ways to achieve this:

  * Deploy the files using a .zip package so that timestamps are preserved.
  * Deploy from a git repository and use a tool that restores the files' original modification times after checkout.

In any case, make sure the timestamps are advanced if a file is changed.
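A minimal illustration of timestamp-preserving deployment (all paths are placeholders; rsync -a or an unzip-based deployment behaves similarly):

```shell
# Create a sample static file with a fixed, explicit modification time.
mkdir -p /tmp/staging/themes /tmp/vufind/themes
echo "body { color: black; }" > /tmp/staging/themes/main.css
touch -t 202303300101 /tmp/staging/themes/main.css

# cp -p preserves the modification time, so every node serving this file
# reports the same mtime (keeping If-Modified-Since behavior consistent).
cp -p /tmp/staging/themes/main.css /tmp/vufind/themes/main.css

date -r /tmp/vufind/themes/main.css +%Y%m%d%H%M   # prints 202303300101
```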
== Apache ETag ==

An ETag is used between a web server and a browser, in addition to the last modification time, to check whether a file has changed. The server sends an ETag header along with the file, and every time the browser revalidates the file it includes the ETag in the request. If the tag doesn't match (or the file is newer than the If-Modified-Since header in the request), the file is returned by the server; otherwise the server can optimize the transfer by returning just a 304 "Not Modified" response. By default, Apache may include the file's inode number in the ETag, and since inode numbers differ from server to server, the ETag should be restricted to attributes that are identical on all the nodes.
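The Apache directive that accomplishes this is presumably FileETag (an assumption based on the surrounding text, which refers to modification time and size):

```
# Build ETags from modification time and size only, never the inode
FileETag MTime Size
```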
With this configuration, the ETag of a file will be identical on all servers as long as the modification time and file size are identical.
== Asset Pipeline and other shared files ==

Load balancing and VuFind®'s asset pipeline do not mix well by default: the pipeline writes its combined CSS and JavaScript files to local disk, so a file generated on one node does not exist on the others. Possible remedies:

  * Use a shared disk for all the load-balanced servers. This might have performance and reliability implications.
  * Use sticky sessions in the load balancer. This has its own downsides: future requests from a client go to the same server as before, which can cause imbalance between the servers, especially when new ones are added.

Note that the above issues also affect things like the cover cache, but since covers can always be recreated from the source, this does not cause actual problems in servicing requests.
== Additional Implementation Notes ==

  * If you use Shibboleth, configure it to use ODBC to store its data so that it's available on all the server nodes. If you're running RHEL/CentOS 6.x, additional setup may be needed for ODBC support.
  * At the time of writing, it's not recommended to run Piwik in a load-balanced environment like this; the Piwik documentation describes the officially supported way of setting up a load-balanced Piwik.
administration/fault_tolerance_and_load_balancing.txt · Last modified: 2023/03/30 19:31 by cmurdoch