Retry async with delay

Hi,

I’m trying to implement a delay-based retry. As far as I know, nothing in ic_cdk_timers supports this directly, so here is my implementation. Are there any big holes? Is this dangerous AF, especially if it’s called multiple times?


pub async fn retry_with_attempts_delay<F, Fut, T, E>(
    max_attempts: u8,
    delay_duration: Duration,
    mut f: F
)
    -> Result<T, E>
    where F: FnMut() -> Fut + 'static, Fut: std::future::Future<Output = Result<T, E>>, E: Default
{
    let mut attempt = 0;

    while attempt < max_attempts {
        match f().await {
            Ok(result) => {
                return Ok(result);
            }
            Err(err) if attempt < max_attempts - 1 => {
                attempt += 1;

                set_timer(Duration::from(delay_duration), move || {
                    ic_cdk::spawn(async move {
                        let _ = retry_with_attempts_delay(max_attempts, delay_duration, f).await;
                    });
                });

                return Err(err);
            }
            Err(err) => {
                return Err(err);
            }
        }
    }

    Err(E::default()) 
}

You may want to check out how async is handled in PanIndustrial-Org/timerTool on GitHub (a timer management tool for Motoko). Obviously it’s Motoko, but it should translate well enough to Rust. A safety timer is set with each async call and a timeout is applied. There are hooks for handling both trapped errors and eventual timeouts on the client side, so you can retry if you’d like. Currently this makes async handling sequential per timerTool instance, but you could instantiate a new one for each process.


Hi @frederico02, it looks like that code will keep retrying indefinitely for as long as the function f returns errors, because the attempt variable is reset to 0 on every timer run.

Try something like this:


use std::time::Duration;
use ic_cdk_timers::set_timer;

// Public entry point: starts counting at attempt 1.
pub async fn retry_with_attempts_delay<F, Fut, T, E>(
    max_attempts: u8,
    delay_duration: Duration,
    f: F
)
    -> Result<T, E>
    where F: FnMut() -> Fut + 'static, Fut: std::future::Future<Output = Result<T, E>>, E: Default
{
    __retry_with_attempts_delay(max_attempts, 1, delay_duration, f).await
}

// Internal helper: carries the attempt counter across timer runs.
async fn __retry_with_attempts_delay<F, Fut, T, E>(
    max_attempts: u8,
    current_attempt: u8,
    delay_duration: Duration,
    mut f: F
)
    -> Result<T, E>
    where F: FnMut() -> Fut + 'static, Fut: std::future::Future<Output = Result<T, E>>, E: Default
{
    if current_attempt <= max_attempts {
        match f().await {
            Ok(result) => {
                return Ok(result);
            }
            Err(err) => {
                // Schedule the next attempt only if one is left.
                // `<` avoids the u8 overflow of `current_attempt + 1 <= max_attempts` at 255.
                if current_attempt < max_attempts {
                    set_timer(delay_duration, move || {
                        ic_cdk::spawn(async move {
                            let _ = __retry_with_attempts_delay(max_attempts, current_attempt + 1, delay_duration, f).await;
                        });
                    });
                }
                // The timer cannot hand a result back, so the caller still gets this attempt's error.
                return Err(err);
            }
        }
    }
    // Only reached when max_attempts is 0.
    Err(E::default())
}


Thanks Levi :) that worked really well.

                            let _ = __retry_with_attempts_delay(max_attempts, current_attempt + 1, delay_duration, f).await;

You are never getting the recursive result, right?
Meaning that with your code, if you retry at least once you’ll return an error even if one of the nested calls succeeded, or am I missing something?

One thing about this.

 ic_cdk::spawn(async move {
                            let _ = __retry_with_attempts_delay(max_attempts, current_attempt + 1, delay_duration, f).await;
                        });

Don’t we need to return the result of the call instead of assigning it to let _ = …;?

Yes, that’s right. The original question was about retrying after some delay with a timer. A timer has no return value and you can’t await a timer, but a timer can update your canister’s state. If you want to wait for the retries within the current call context, you can remove the timer and return the result.
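To make the last option concrete, here is a minimal sketch of retrying inside the current call context with no timer, so the caller receives the final result. The names retry_with_attempts and block_on are made up for this illustration and are not part of ic_cdk. Note that this sketch does not wait between attempts; it only shows how the result can be returned. The tiny block_on executor is there only so the function can be exercised off-chain with futures that complete immediately; in a canister you would simply await retry_with_attempts from your update method.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical helper: calls `f` up to `max_attempts` times in the current
/// call context and returns the first `Ok`, or the last `Err` once the
/// attempts run out. No timer is used and there is no delay between attempts.
pub async fn retry_with_attempts<F, Fut, T, E>(max_attempts: u8, mut f: F) -> Result<T, E>
where
    F: FnMut() -> Fut,
    Fut: Future<Output = Result<T, E>>,
    E: Default,
{
    // Only returned as-is when max_attempts is 0.
    let mut last_err = E::default();
    for _ in 0..max_attempts {
        match f().await {
            Ok(value) => return Ok(value),
            Err(err) => last_err = err,
        }
    }
    Err(last_err)
}

// Waker that does nothing; enough for futures that never actually suspend.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Minimal off-chain executor for trying the sketch locally (busy-polls).
pub fn block_on<Fut: Future>(fut: Fut) -> Fut::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

Because the loop awaits each attempt in place, a success on a later attempt reaches the caller, which is exactly what the timer-based version cannot do.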

Thanks Levi. Unfortunately this won’t work for us. It’s quite frustrating that having a simple delay in a Rust-based canister is so complex.