Trading Bot Error Handling: Retries, Timeouts & Fail-Safes

A backtest never throws an exception, but a live bot faces a hostile world: exchanges time out, rate-limit you, reject orders, return partial fills and occasionally go down entirely. The difference between a bot that survives and one that blows up is rarely the strategy; it is the error handling. A bot that crashes mid-trade can leave a naked position; one that retries blindly can double an order. This guide covers the error types you must handle, retry with backoff, timed-out orders, rate limits and the kill switch, with code.

Why error handling decides survival

Live trading is unreliable by nature. A bot that assumes every API call succeeds will eventually crash at the worst possible moment — mid-position, with a stop not yet placed. Robust error handling turns transient failures into harmless retries and turns genuine problems into a safe, controlled stop rather than an uncontrolled loss. It is the operational core of risk management.

The error types to handle

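With ccxt the failures fall into clear buckets: NetworkError (timeouts, DNS, connection resets; usually transient, safe to retry), ExchangeError (rejected order, bad params; usually a real bug, do not blindly retry), RateLimitExceeded (back off and slow down), and InsufficientFunds (a logic error in sizing). Each needs a different response. The buckets nest: in ccxt's exception hierarchy RateLimitExceeded is a subclass of NetworkError and InsufficientFunds is a subclass of ExchangeError, so catch the specific class before its parent when the two need different handling.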

Branch by error type: retry transient network errors, back off on rate limits, halt on genuine exchange rejections.
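The branching described above might be sketched as follows. The `Action` enum and the `classify` helper are hypothetical names invented for this illustration, not part of ccxt. The stand-in exception classes exist only so the sketch runs where ccxt is not installed; they mirror the relevant part of ccxt's hierarchy with intermediate classes omitted.

```python
from enum import Enum

try:
    from ccxt import ExchangeError, NetworkError, RateLimitExceeded
except ImportError:
    # Stand-ins so the sketch runs without ccxt installed (assumption:
    # simplified copies of the ccxt classes, intermediate classes omitted).
    class NetworkError(Exception): ...
    class RateLimitExceeded(NetworkError): ...
    class ExchangeError(Exception): ...


class Action(Enum):          # hypothetical names for this sketch
    RETRY = "retry"
    BACKOFF = "backoff"
    HALT = "halt"


def classify(exc):
    """Map an exception to the response the bot should take."""
    # Check the subclass before its parent; otherwise RateLimitExceeded
    # would be handled as a plain NetworkError.
    if isinstance(exc, RateLimitExceeded):
        return Action.BACKOFF
    if isinstance(exc, NetworkError):
        return Action.RETRY
    if isinstance(exc, ExchangeError):
        return Action.HALT   # includes InsufficientFunds, a sizing bug
    return Action.HALT       # unknown errors: stop rather than guess
```

Defaulting unknown exceptions to a halt is a design choice of this sketch: an error the bot does not recognise is treated as unsafe to retry.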

Retry with exponential backoff

python · retry.py
import time

import ccxt

def safe_call(fn, *args, tries=5):
    for i in range(tries):
        try:
            return fn(*args)
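        except ccxt.NetworkError as e:
            if i == tries - 1:       # last attempt: give up without sleeping
                raise RuntimeError("exhausted retries") from e
            wait = 2 ** i            # 1, 2, 4, 8s between attempts
            print(f"network error, retry in {wait}s: {e}")
            time.sleep(wait)
        except ccxt.ExchangeError as e:
            print(f"exchange rejected, NOT retrying: {e}")
            raise               # a real bug: surface it
    raise RuntimeError("exhausted retries")   # only reached if tries <= 0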

Order-level safety

Never blindly retry an order

If a create_order call times out, you do not know whether it filled. Blindly retrying can place the order twice. Instead, re-fetch open orders and recent trades to learn the true state before acting, and use a durable clientOrderId. The complete workflow is in duplicate-order prevention; persist the intent and reconcile it after restart with crash-safe state recovery.
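A minimal sketch of that lookup-before-resend idea follows, under several assumptions. `place_order_once` is a hypothetical helper, `exchange` is assumed to be a ccxt-style client, and the `clientOrderId` parameter name is an assumption because exchanges differ in what they accept. The sketch searches open and closed orders rather than recent trades, since ccxt's unified order structure carries a `clientOrderId` field; not every exchange supports `fetch_closed_orders`.

```python
import uuid


def place_order_once(exchange, symbol, side, amount, price,
                     transient_errors=(TimeoutError, ConnectionError)):
    """Submit a limit order; on a transient failure, look it up, never resend.

    With ccxt itself, pass transient_errors=(ccxt.NetworkError,).
    Returns the order dict, or None if the order cannot be found yet.
    """
    client_id = uuid.uuid4().hex   # a real bot would persist this first
    try:
        return exchange.create_order(symbol, "limit", side, amount, price,
                                     {"clientOrderId": client_id})
    except transient_errors:
        # Outcome unknown: the request may or may not have reached the book.
        candidates = (exchange.fetch_open_orders(symbol)
                      + exchange.fetch_closed_orders(symbol))
        for order in candidates:
            if order.get("clientOrderId") == client_id:
                return order       # it went through; do not send again
        return None                # not visible yet; caller must re-check
```

A `None` result means "unknown", not "failed": the order may still appear a moment later, so the caller should check again before deciding to resend.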

Handling rate limits

Keep ccxt’s enableRateLimit on so it self-throttles, and still catch RateLimitExceeded to add an extra pause. Hammering an exchange gets your key temporarily banned, which can strand an open position. Use a shared priority queue and bounded backoff as detailed in trading bot API rate limits. If authenticated calls fail because their signatures are stale, follow the separate clock synchronization runbook.
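The extra pause on a rate-limit error could look something like this. It is a sketch, not the shared priority queue mentioned above: `call_with_backoff` is a hypothetical helper, the exception types are passed in so the block runs without ccxt, and the random jitter is an addition of this example rather than something ccxt provides.

```python
import random
import time


def call_with_backoff(fn, rate_limit_errors, tries=5, base=1.0, cap=30.0,
                      sleep=time.sleep):
    """Call fn(); on a rate-limit error pause, then try again.

    With ccxt, pass rate_limit_errors=(ccxt.RateLimitExceeded,). The pause
    doubles on each attempt but never exceeds `cap` seconds (bounded
    backoff); jitter keeps several workers from retrying in lockstep.
    """
    for attempt in range(tries):
        try:
            return fn()
        except rate_limit_errors:
            if attempt == tries - 1:
                raise                      # give up; let the caller halt
            pause = min(cap, base * 2 ** attempt)
            sleep(pause * random.uniform(0.5, 1.0))
```

Passing `sleep` as a parameter is only there to make the helper easy to test without real delays.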

The kill switch

A top-level handler should halt risk-increasing actions on an unrecoverable error, then apply an incident-specific policy for working orders and exposure. Immediate flattening is not always safest in an illiquid market, and simply killing the process can leave remote orders live. The trading bot kill-switch guide defines the trigger ladder, shutdown order, and controlled restart. Pair it with account-level risk limits and full logging.
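As a rough illustration of the "halt risk-increasing actions first" idea, the sketch below latches on the first unrecoverable error and hands the incident to a policy callback. `KillSwitch`, `run_guarded` and `on_trip` are hypothetical names; what the policy does with working orders and exposure is deliberately left to the caller.

```python
class KillSwitch:
    """Latches on the first unrecoverable error and blocks new risk."""

    def __init__(self):
        self.reason = None

    @property
    def tripped(self):
        return self.reason is not None

    def trip(self, reason):
        if self.reason is None:      # keep the first cause for the log
            self.reason = reason

    def allows(self, reduces_risk):
        # Once tripped, only risk-reducing actions (cancels, closes) pass.
        return reduces_risk or not self.tripped


def run_guarded(step, switch, on_trip):
    """Run one bot iteration; any uncaught error trips the switch."""
    if switch.tripped:
        return False                 # stay halted until a controlled restart
    try:
        step()
        return True
    except Exception as exc:         # deliberate top-level catch-all
        switch.trip(repr(exc))
        on_trip(switch.reason)       # incident policy, e.g. cancel orders
        return False
```

Order-placing code would consult `switch.allows(reduces_risk=False)` before opening or adding to a position, while cancels and closes pass `reduces_risk=True` and remain available after a trip.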

Not financial advice. This content is educational. Automated and algorithmic trading carries a real risk of financial loss. Never trade money you cannot afford to lose. Review the SEC investor.gov and CFTC resources before trading.

Frequently asked questions

Why is error handling important for a trading bot?

Because live trading is unreliable: exchanges time out, rate-limit, reject orders and occasionally go down. A bot that assumes every call succeeds will eventually crash mid-trade, potentially leaving a naked position with no stop in place. Robust error handling turns transient failures into harmless retries and genuine problems into a safe, controlled stop instead of an uncontrolled loss.

How should a trading bot handle network errors?

Network errors such as timeouts and connection resets are usually transient, so the right response is to retry with exponential backoff — waiting 1, then 2, then 4 seconds and so on for a few attempts. In ccxt you catch ccxt.NetworkError specifically and retry it, while letting genuine exchange rejections halt the bot instead of being retried blindly.

What happens if an order request times out?

A timed-out order is dangerous because you do not know whether it actually filled, so blindly retrying can place it twice. The safe approach is to re-fetch open orders and recent trades to discover the true state before acting, and to attach a unique clientOrderId so that an exchange which enforces uniqueness rejects an accidental duplicate. Support for client order IDs varies by exchange, so always reconcile state from the exchange rather than assuming.

What is a kill switch in a trading bot?

A kill switch is a top-level safety mechanism, a software dead-man’s switch. On an unrecoverable error or a breached risk limit it stops risk-increasing actions, then cancels open orders or flattens positions according to the incident policy. Combined with a maximum-drawdown limit and full logging, it ensures that when something goes badly wrong the bot fails safely instead of compounding the damage.
