Greetings everyone,
Our team is working on a project that relies heavily on a language model. However, canister storage limits prevent us from hosting a billion-parameter model on-chain, so we opted to download the model directly into the user's browser. Unfortunately, this approach introduces two significant issues:
- The model download takes several minutes to complete in the user's browser.
- Users must enable WebGPU in their browser, which is still an experimental feature in most browsers.

Together, these issues result in a poor user experience.
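For context, the WebGPU requirement means we have to feature-detect before even starting the download. A minimal sketch of that check (the return shape and messages here are illustrative, not our exact code) looks like this:

```javascript
// Detect WebGPU support before attempting the in-browser model download.
// `navigator.gpu` is the entry point defined by the WebGPU spec; it is
// absent in browsers (and non-browser runtimes) that do not expose the API.
async function detectWebGPU() {
  if (typeof navigator === "undefined" || !("gpu" in navigator)) {
    return { supported: false, reason: "navigator.gpu not exposed" };
  }
  // requestAdapter() resolves to null when no suitable GPU adapter exists,
  // e.g. when the feature flag is off or the hardware is unsupported.
  const adapter = await navigator.gpu.requestAdapter();
  if (!adapter) {
    return { supported: false, reason: "no GPU adapter available" };
  }
  return { supported: true, reason: "ok" };
}
```

In practice we can only show users an "enable WebGPU" prompt when this check fails, which is exactly the onboarding friction described above.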
In light of these challenges, we are actively looking for ways to work around these issues without compromising the project. Any insights or suggestions would be greatly appreciated.