[VUFIND-1187] Tags are semi-case-insensitive in MySQL-based installations
Created: 23/Jun/16  Updated: 26/Aug/16  Resolved: 26/Aug/16

Status: Resolved
Project: VuFind®
Components: MyResearch, User Interface
Affects versions: 1.0RC1, 1.0RC2, 1.0, 1.0.1, 1.1, 1.2, 1.3, 1.4, 2.0alpha, 2.0beta, 2.0RC1, 2.0, 2.0.1, 2.1, 2.1.1, 2.2, 2.2.1, 2.3, 2.3.1, 2.4, 2.4.1, 2.5, 2.5.1, 2.5.2, 2.5.3, 2.5.4, 3.0, 3.0.1, 3.0.2
Fix versions: 3.1

Type: Bug Priority: Minor
Reporter: Demian Katz Assignee: Demian Katz
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified


 Description   
I’ve just discovered a strange behavior in VuFind that I’m pretty sure has existed since day one, but which no one has noticed before. Here’s the sequence to reproduce it:
 
1.) Go to record A, and tag it “WaCKy”.
2.) Go to record B, and tag it “wacky” – your tag shows up as “WaCKy.”
 
Basically, if there are multiple case variations of a tag within VuFind, the first one always wins, and you can never change it.
 
The reason for this is that MySQL uses a case-insensitive collation by default, so while it can store different cases, it always compares (and therefore retrieves) strings in a case-insensitive fashion: a lookup for “wacky” matches the existing “WaCKy” row. This seems to me like a very bad, unintuitive default behavior – but it is what it is.
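To illustrate the mechanism, here is a minimal sketch that reproduces the symptom outside VuFind. It assumes SQLite's NOCASE collation as a stand-in for MySQL's default case-insensitive collation; the `tags` table and the `get_or_create_tag` helper are hypothetical and are not VuFind's actual schema or code.

```python
import sqlite3

# Assumption: SQLite's NOCASE collation stands in for MySQL's default
# case-insensitive collation. Table and helper names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tags (id INTEGER PRIMARY KEY, tag TEXT COLLATE NOCASE)"
)


def get_or_create_tag(conn, text):
    """Return the stored text of the tag matching `text`, creating it if absent."""
    row = conn.execute("SELECT tag FROM tags WHERE tag = ?", (text,)).fetchone()
    if row is None:
        conn.execute("INSERT INTO tags (tag) VALUES (?)", (text,))
        return text
    # The comparison ignored case, so the first-stored spelling comes back.
    return row[0]


first = get_or_create_tag(conn, "WaCKy")   # tagging record A
second = get_or_create_tag(conn, "wacky")  # tagging record B
print(first, second)  # prints "WaCKy WaCKy"
```

The second call finds the existing row because the equality test is case-insensitive, which is why the first spelling always wins.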
 
I think we have a few options:
 
1.) Ignore it – nobody has complained yet, so maybe it’s not a big deal.
2.) Force all tags to a particular case – if we want case-insensitive tags, perhaps we should force all strings to lowercase so no one is disconcerted by random, inexplicable case variations.
3.) Change the MySQL definition to a case-sensitive collation – this would allow “WaCKy” and “wacky” to be treated as two independent tags, which might or might not be desirable. This would require database structure changes, and I’d have to investigate how easy it would be to incorporate into the database upgrade script, and it’s also possible that the change will reveal bugs we haven’t noticed before due to the odd database default.
4.) Hybrid of 2 & 3 – make the database case sensitive, but include a config setting to allow strings to be forced to lower case. Best of both worlds since it eliminates confusing behavior and makes case sensitivity configurable… but also the most work to implement.
 
Also note that this ONLY affects MySQL as far as I know… so PostgreSQL users are probably currently experiencing case sensitive behavior. This might be another argument toward option #4, since that would better lend itself to cross-platform consistency.
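As a rough sketch of what option #4 might look like: the column compares case-sensitively, and a setting decides whether incoming tags are forced to lowercase. Everything here is an assumption for illustration, including the `case_sensitive` flag, the `normalize_tag` and `add_tag` helpers, and the use of SQLite (whose default BINARY collation is case sensitive) instead of MySQL.

```python
import sqlite3


def normalize_tag(text, case_sensitive):
    """Leave the tag alone when case matters; otherwise force lowercase."""
    return text if case_sensitive else text.lower()


def add_tag(conn, text, case_sensitive):
    """Store the (possibly normalized) tag unless an exact match exists."""
    tag = normalize_tag(text, case_sensitive)
    # SQLite's default BINARY collation makes "=" case sensitive here.
    row = conn.execute("SELECT tag FROM tags WHERE tag = ?", (tag,)).fetchone()
    if row is None:
        conn.execute("INSERT INTO tags (tag) VALUES (?)", (tag,))
    return tag


def demo(case_sensitive):
    """Tag two records with different spellings and list the stored tags."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tags (tag TEXT UNIQUE)")
    for text in ("WaCKy", "wacky"):
        add_tag(conn, text, case_sensitive)
    return [r[0] for r in conn.execute("SELECT tag FROM tags ORDER BY rowid")]


print(demo(case_sensitive=True))   # two independent tags
print(demo(case_sensitive=False))  # one lowercase tag
```

With the flag on, “WaCKy” and “wacky” are independent tags; with it off, both collapse to a single lowercase tag, so no one sees an unexpected spelling.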


 Comments   
Comment by Enrique Martínez [ 23/Jun/16 ]
5) Use a hash function instead of a case change.
Comment by Demian Katz [ 23/Jun/16 ]
Interesting idea. So you would hash the text of the tag to a separate field, and then do all lookups using that field? Seems like it should work, though it's not the most straightforward approach!
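A minimal sketch of that reading of the hash idea, assuming SHA-256 over the UTF-8 bytes of the tag and a hypothetical `tag_hash` column. SQLite's NOCASE collation again stands in for MySQL's default, and none of these names come from VuFind.

```python
import hashlib
import sqlite3


def tag_hash(text):
    """Hash the exact bytes of the tag, so different cases give different digests."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
# The tag column keeps a case-insensitive collation; lookups bypass it
# by going through the hash column instead.
conn.execute(
    "CREATE TABLE tags (tag TEXT COLLATE NOCASE, tag_hash TEXT UNIQUE)"
)


def get_or_create_tag(conn, text):
    """Look the tag up by hash, creating it if no exact-case match exists."""
    digest = tag_hash(text)
    row = conn.execute(
        "SELECT tag FROM tags WHERE tag_hash = ?", (digest,)
    ).fetchone()
    if row is None:
        conn.execute(
            "INSERT INTO tags (tag, tag_hash) VALUES (?, ?)", (text, digest)
        )
        return text
    return row[0]


first = get_or_create_tag(conn, "WaCKy")
second = get_or_create_tag(conn, "wacky")
print(first, second)  # prints "WaCKy wacky"
```

Because the hex digest is always lowercase and differs for each spelling, the lookup is effectively case sensitive even if the database compares strings case-insensitively.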
Comment by Jay Roos [ 27/Jun/16 ]
6) Store the tag in a case-sensitive fashion, but allow the query to be configured as case-sensitive or not. This would allow a library to decide to be case-insensitive without losing the ability to use the user-entered case if they later change their mind.
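One way to picture this suggestion, as a sketch only: tags are stored exactly as entered, and a setting chooses the collation at query time. The `case_sensitive` flag and `find_tags` helper are hypothetical, and SQLite's postfix `COLLATE NOCASE` operator is used here in place of whatever MySQL-specific mechanism a real implementation would need.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Default BINARY collation: the stored text keeps its case and "=" respects it.
conn.execute("CREATE TABLE tags (tag TEXT)")
conn.executemany("INSERT INTO tags (tag) VALUES (?)", [("WaCKy",), ("wacky",)])


def find_tags(conn, text, case_sensitive):
    """Return stored tags matching `text`, honoring the configured sensitivity."""
    sql = "SELECT tag FROM tags WHERE tag = ?"
    if not case_sensitive:
        # An explicit collation on the comparison overrides the column's default.
        sql += " COLLATE NOCASE"
    sql += " ORDER BY rowid"
    return [r[0] for r in conn.execute(sql, (text,))]


print(find_tags(conn, "wacky", case_sensitive=True))   # ['wacky']
print(find_tags(conn, "wacky", case_sensitive=False))  # ['WaCKy', 'wacky']
```

Since the stored data is never altered, switching the setting later changes only how tags are matched, not what users originally typed.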
Comment by Demian Katz [ 05/Jul/16 ]
The proposed #6 also makes sense -- but like #4 would require either #3 or #5 as a prerequisite. (Not that that's a problem -- just documenting dependencies for future reference).
Comment by Demian Katz [ 06/Jul/16 ]
See these dev call minutes for some further discussion: https://vufind.org/wiki/developers_call:minutes20160705#tags_and_case_sensitivity
Comment by Demian Katz [ 25/Jul/16 ]
See https://github.com/vufind-org/vufind/pull/764
Comment by Demian Katz [ 26/Aug/16 ]
Pull request has been merged; see the 3.1 section of the changelog at https://vufind.org/wiki/changelog for best practices when upgrading.