Improve model size detection: replace ad-hoc string parsing with reliable params_b field in MODELS dict ab92e0d Luigi committed on Oct 12
Set better defaults for free-tier users: Qwen3-1.7B model, 1024 max tokens, search disabled 2cae073 Luigi committed on Oct 12
Adjust duration estimation for H200 performance - reduce conservative estimates de766da Luigi committed on Oct 12
Use actual parameter count for AOT decision instead of string matching e3e334f Luigi committed on Oct 12
Make AOT compilation conditional for models >= 2B parameters to optimize free tier usage 4500f92 Luigi committed on Oct 12
disable two models that cannot run or run too slowly on HF Spaces with ZeroGPU 3dc7ced Luigi committed on Oct 11
feat(models): add Granite-4.0-Micro and Qwen3-4B-Instruct-2507 to MODELS registry c30a7f7 verified Luigi committed on Oct 9
remove previously added breeze models (as they didn't work), add smollm 135m taiwan b3fd72e Luigi committed on Aug 4