Thought on architecture
#6, opened by sometimesanotion
This is already one of my favorite models for its size! Could a derivative of it have dense layers appended to the head, so that it can be fine-tuned easily without expert collapse? I see there are already two dense layers at the foot.
Is that a sensible strategy for a model deployed on laptops and workstations? Is LiquidAI interested in making such models?
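To make the idea concrete, here is a toy sketch in plain Python of what I have in mind. It is not the actual architecture of this model: the router, the experts, the sizes, and all the names are hypothetical stand-ins, and I am assuming the new dense layer sits after the MoE part, on the output side. The point is only that if the router and experts stay frozen and training touches nothing but the appended dense weights, the routing decisions cannot drift during fine-tuning.

```python
import random

random.seed(0)
DIM, N_EXPERTS, N_SAMPLES = 4, 3, 16  # toy sizes, chosen arbitrarily


def rand_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


# Frozen "pretrained" part (hypothetical stand-in): a top-1 router and its experts.
router = rand_matrix(N_EXPERTS, DIM)
experts = [rand_matrix(DIM, DIM) for _ in range(N_EXPERTS)]

# Newly appended dense layer: the only weights that fine-tuning updates.
dense = rand_matrix(1, DIM)


def route(x):
    """Index of the expert with the highest router score (top-1 routing)."""
    scores = matvec(router, x)
    return max(range(N_EXPERTS), key=lambda i: scores[i])


def hidden(x):
    """Output of the frozen MoE part for input x."""
    return matvec(experts[route(x)], x)


def predict(x):
    return matvec(dense, hidden(x))[0]


def mean_loss(data):
    return sum((predict(x) - t) ** 2 for x, t in data) / len(data)


def train_dense_only(data, steps=200, lr=0.02):
    """Full-batch gradient descent on the dense weights; router and experts untouched."""
    for _ in range(steps):
        grad = [0.0] * DIM
        for x, t in data:
            h = hidden(x)
            err = matvec(dense, h)[0] - t
            for j in range(DIM):
                grad[j] += err * h[j] / len(data)
        for j in range(DIM):
            dense[0][j] -= lr * grad[j]


# Made-up fine-tuning data: random inputs and random scalar targets.
data = [
    ([random.uniform(-1, 1) for _ in range(DIM)], random.uniform(-1, 1))
    for _ in range(N_SAMPLES)
]

routes_before = [route(x) for x, _ in data]
loss_before = mean_loss(data)
train_dense_only(data)
routes_after = [route(x) for x, _ in data]
loss_after = mean_loss(data)

print("routing unchanged:", routes_before == routes_after)
print("loss went down:", loss_after < loss_before)
```

In this toy the routing is unchanged by construction, since nothing upstream of the dense layer is trained. The open question is whether dense layers alone give enough capacity to adapt a real model; if gradients also flow into the router or the experts, the guarantee no longer holds.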
sometimesanotion changed discussion status to closed