» Is your app behaving erratically?
If your app is experiencing mysterious stability problems, Timeout::Error may be to blame. The general consensus in the Ruby community is to avoid using Timeout, especially in network-related code (Redis, Elasticsearch, web API calls, database calls, or any service that runs over TCP/UDP).
The main problem with using Timeout in network-related code is that the timeout exception is raised only in your own process. The other end of the connection never learns about it and may still send its response, which your application can then read in the wrong context. This causes unforeseen, hard-to-debug errors in your application.
Here is an excerpt from Mike Perham's article on the same topic which does a great job explaining this scenario:
Imagine this sequence of events:
- Code makes request A to Redis.
- Timeout triggers, block stops executing.
- Redis connection is returned to connection pool.
- Network receives response A for request A.
- Code checks out same connection and makes request B.
- Code reads response A instead of waiting for response B!
- That shared Redis connection has been corrupted due to Timeout skipping response A handling.
Mike's conclusion sums it up well:
Ruby's Timeout is a giant hammer and will only lead to a big mess. Don't use it.
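The sequence of events above can be reproduced without a real Redis server. The following is a minimal sketch, not code from Mike's article: FakeConnection is a hypothetical stand-in for a pooled connection, whose "server" thread answers requests strictly in order and knows nothing about client-side timeouts.

```ruby
require "timeout"

# Hypothetical stand-in for a pooled network connection (an assumption for
# illustration). A single "server" thread answers requests in the order sent.
class FakeConnection
  def initialize
    @requests = Queue.new
    @replies  = Queue.new
    Thread.new do
      loop do
        name, delay = @requests.pop
        sleep delay                      # simulated server processing time
        @replies << "response #{name}"
      end
    end
  end

  # Send a request that the server will take `delay` seconds to answer.
  def write(name, delay: 0)
    @requests << [name, delay]
  end

  # Block until the next reply arrives on this connection.
  def read
    @replies.pop
  end
end

conn = FakeConnection.new

begin
  Timeout.timeout(0.05) do
    conn.write("A", delay: 0.2)
    conn.read                            # interrupted here by the timeout
  end
rescue Timeout::Error
  # The connection goes back to the pool as if it were healthy.
end

conn.write("B")
reply_for_b = conn.read                  # the server's answer to A arrives first
puts reply_for_b                         # prints "response A"
```

The reply to request A is still in flight when the timeout fires, so the next caller to use the connection receives it in place of its own reply.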
» On using Rack::Timeout
If your project uses Rack::Timeout, read up on the Timing Out Inherently Unsafe section of the Rack::Timeout README, and understand the potential problems you may run into before using this gem.
» What can you do about it?
If you run into a Timeout::Error while using a library or someone else's code, try to understand why such a situation would occur. Check if it is possible to reduce the execution time of your code. If not, see if you can increase the timeout using a configuration setting.
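As one example of such a configuration setting, Ruby's standard Net::HTTP client exposes per-connection timeouts. This is a minimal sketch: the host name, the values, and the fetch_status helper are assumptions for illustration, and other libraries name their settings differently.

```ruby
require "net/http"

# Hypothetical host. No connection is opened until a request is actually made.
http = Net::HTTP.new("api.example.com", 80)
http.open_timeout  = 2   # seconds to wait for the connection to open
http.read_timeout  = 5   # seconds to wait for each read from the socket
http.write_timeout = 5   # seconds to wait for each write (Ruby 2.6+)

# Hypothetical helper: returns the status code as a string, or nil on timeout.
def fetch_status(http, path)
  http.get(path).code
rescue Net::OpenTimeout, Net::ReadTimeout => e
  warn "request timed out: #{e.class}"
  nil
end
```

Net::OpenTimeout and Net::ReadTimeout are both subclasses of Timeout::Error, so a Timeout::Error in your logs may well come from one of these settings and not from a Timeout.timeout block.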
» More resources
The Ultimate Guide to Ruby Timeouts is a great article which documents how to add timeouts while using popular gems.
» Discussion
This post resulted in some lively discussion on reddit.
I think most Ruby implementors agree that timeout is pretty bad. Timeouts are raised in blocking code by interrupting the call by basically whatever means possible, and it can be very abrupt and unhelpful to the other end, can close file handles so nothing can be communicated when handling the timeout, and they also do some things I believe like create a new thread per timeout.
Rubinius re-implemented them using at least a single thread instead of one per timeout. I hope to do something better in TruffleRuby.
Timing out a "complex" operation means that any resources used within that operation are in an unknown, unpredictable state. If you throw a giant timeout around a Rack operation, how on earth do you know which resources were used and how to reset them?
That's why Timeout.timeout(&block) is a bad idea but simple network timeouts are ok.
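To illustrate what a "simple network timeout" can look like, here is a minimal sketch that waits on a single IO object with IO.select and no extra thread. The read_with_deadline helper is hypothetical, and IO.pipe stands in for a real socket so the example runs without a network.

```ruby
# Hypothetical helper: read one chunk from `io`, or return nil if nothing
# becomes readable within `seconds`. No other code is interrupted.
def read_with_deadline(io, seconds, max_bytes = 4096)
  ready, = IO.select([io], nil, nil, seconds)
  return nil unless ready
  io.read_nonblock(max_bytes)
end

reader, writer = IO.pipe                       # stand-in for a socket pair

timed_out = read_with_deadline(reader, 0.05)   # nothing written yet, so nil
writer.write("PONG")
reply = read_with_deadline(reader, 0.05)       # data is waiting, so "PONG"
```

Because the timeout is scoped to one read, the caller knows exactly which resource is affected when nil comes back, and can close that connection so a late response is never read by someone else.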
I also wrote about why this is bad with a suggestion on how we could change the API to allow for a safer timeout. https://www.schneems.com/2017/02/21/the-oldest-bug-in-ruby-why-racktimeout-might-hose-your-server/
As Seaton mentioned already the problem is that the timeout can fire when cleaning up anything, not just when checking connections back into thread pools.
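The point about cleanup can be demonstrated with a minimal sketch in which the timeout fires while an ensure block is still running. The sleep is an assumed stand-in for slow cleanup work, such as returning a connection to a pool.

```ruby
require "timeout"

events = []

begin
  Timeout.timeout(0.05) do
    begin
      events << :work_done
    ensure
      sleep 0.2                  # stand-in for slow cleanup work
      events << :cleanup_done    # skipped: the timeout fires during the sleep
    end
  end
rescue Timeout::Error
  events << :timed_out
end

p events                         # prints [:work_done, :timed_out]
```

The ensure block starts but never finishes, so whatever it was meant to release or reset is left half done.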