After we set up AboutUs’s search feature, our (awesome) sysadmin wanted a redundant setup. He doesn’t like having to cancel his weekend if a box goes down. We talked about running an instance of HAProxy on all our app servers to load balance between two Sphinx servers, but that seemed heavy-handed.
It turned out to be really easy to add this functionality to ThinkingSphinx (and the Riddle client it uses to talk to Sphinx).
Basically, instead of a sphinx.yml like this:
production:
  morphology: stem_en
  mem_limit: 1600M
  address: 10.1.0.42
  listen: 0.0.0.0
It looks like this:
production:
  morphology: stem_en
  mem_limit: 1600M
  timeout: 0.5
  address:
    - 10.1.0.42
    - 10.1.0.43
  listen: 0.0.0.0
Queries are load balanced across the servers listed in address (it’s still OK to have just one), and if one server fails they fail over to the others. You can also specify a timeout option, and clients will fail over once they hit it. (This is good for cases where the server is totally down, since it avoids waiting on a TCP timeout of around 30 seconds.)
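To make the idea concrete, here is a rough Ruby sketch of client-side failover with a connect timeout. It is not the actual ThinkingSphinx or Riddle code: with_failover and open_sphinx_socket are hypothetical helpers, and picking servers in random order is just one assumed way to spread the load.

```ruby
require "socket"

# Hypothetical helper (not Riddle's API): yields each address in turn,
# in random order by default, until the block returns without raising
# a connection-level error.
def with_failover(addresses, shuffle: true)
  candidates = Array(addresses) # a single address string still works
  candidates = candidates.shuffle if shuffle
  failures = []

  candidates.each do |address|
    begin
      return yield(address)
    rescue SystemCallError, SocketError, IOError => e
      # Remember why this server was skipped, then try the next one.
      failures << "#{address} (#{e.class})"
    end
  end

  raise IOError, "all Sphinx servers failed: #{failures.join(', ')}"
end

# Hypothetical helper: opens a TCP connection to the first server that
# accepts within `timeout` seconds, instead of waiting for the OS-level
# TCP timeout.
def open_sphinx_socket(addresses, port, timeout: 0.5)
  with_failover(addresses) do |address|
    Socket.tcp(address, port, connect_timeout: timeout)
  end
end
```

With the config above, a call would look something like open_sphinx_socket(["10.1.0.42", "10.1.0.43"], port), where port is whatever your searchd listens on. If 10.1.0.42 is down, the connect attempt gives up after half a second and the next server gets tried.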
These changes were pulled back into ThinkingSphinx and Riddle, so they should be available when the next versions of these gems are released.