Maybe it’s because I query OpenAI and each replica gets a different answer, given the generative nature of the response? If that’s correct, does it mean OpenAI cannot be queried from a canister on the IC?
Damn, that was overengineered. I think it took me close to five hours to make it happen.
Indeed, the solution was to extend my proxy to support an idempotency key, so that it always returns the same successful result for the same key. The tricky part was that the request had to be made atomic within the proxy, because the functions can run in parallel and because OpenAI returns an error if too many requests are sent at once. So I had to record requests in a database atomically and, for requests that were already queued, implement a retry-with-timeout mechanism that waits until the first request has completed.
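For anyone hitting the same issue, here is a rough sketch of the idempotency idea described above. It is not my actual proxy code: it is a simplified Python illustration in which a lock and an in-memory dict stand in for the atomic database write, and `IdempotentProxy`, `upstream`, `poll_interval` and `timeout` are hypothetical names chosen for the example.

```python
import threading
import time


class IdempotentProxy:
    """In-memory stand-in for a proxy backed by a database of requests."""

    def __init__(self, upstream, poll_interval=0.01, timeout=5.0):
        self._upstream = upstream            # hypothetical callable that calls the real API
        self._poll_interval = poll_interval  # delay between two checks while waiting
        self._timeout = timeout              # maximum time a duplicate request waits
        self._lock = threading.Lock()        # stands in for a database transaction
        self._records = {}                   # key -> {"status": ..., "result": ...}

    def _claim(self, key):
        """Atomically register the key; returns True only for the first caller."""
        with self._lock:
            if key in self._records:
                return False
            self._records[key] = {"status": "pending", "result": None}
            return True

    def handle(self, key, payload):
        if self._claim(key):
            # First request for this key: it is the only one that goes upstream.
            try:
                result = self._upstream(payload)
            except Exception:
                # Only successful results are kept, so free the key for a retry.
                with self._lock:
                    del self._records[key]
                raise
            with self._lock:
                self._records[key] = {"status": "success", "result": result}
            return result

        # Duplicate request: poll until the first one finishes or we time out.
        deadline = time.monotonic() + self._timeout
        while time.monotonic() < deadline:
            with self._lock:
                record = self._records.get(key)
            if record is None:
                raise RuntimeError("the original request failed")
            if record["status"] == "success":
                return record["result"]
            time.sleep(self._poll_interval)
        raise TimeoutError(f"request {key!r} still pending after {self._timeout}s")
```

With this shape, every replica sends the same key, only one call reaches the upstream API, and all callers receive the identical stored result. In a real deployment the claim step would have to be a genuine atomic database operation rather than a process-local lock.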
In addition, I needed to adjust the smart contract to transform the response, trimming information that is unnecessary and can be dynamic and therefore cannot be replicated. This step also took a few extra minutes because the examples in the IC documentation are outdated.
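To illustrate the trimming logic only (the real transform lives in the canister and uses the IC's own types, which this does not reproduce), here is a minimal Python sketch. The dict-shaped response, the `transform` name, and the choice of `choices` as the only field worth keeping are assumptions made for the example.

```python
import json

# Assumption: only these body fields are needed and stable across replicas.
KEPT_FIELDS = ("choices",)


def transform(response):
    """Return a copy of the response that is identical on every replica."""
    body = json.loads(response["body"])
    # Keep the fields we care about; per-request values are dropped.
    trimmed = {name: body[name] for name in KEPT_FIELDS if name in body}
    return {
        "status": response["status"],
        # Headers such as dates or request IDs differ per call, so drop them all.
        "headers": [],
        # Serialize deterministically so equal content yields equal bytes.
        "body": json.dumps(trimmed, sort_keys=True, separators=(",", ":")).encode("utf-8"),
    }
```

The point is that two responses which differ only in headers or per-request metadata end up byte-for-byte equal after the transform, which is what the replicas need in order to agree.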
Nevertheless, it worked out. Thanks for the hint, @domwoe!