Elijah Wilt
ooj
AI & ML interests
None yet
Recent Activity
updated a model 25 days ago: ooj/my_awesome_billsum_model
updated a model 26 days ago: ooj/distilbert-rotten-tomatoes
published a model about 1 month ago: ooj/my_awesome_billsum_model
Organizations
None yet
Model Approaches
Big | New
- OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework
  Paper • 2404.14619 • Published • 126
- Multi-Head Mixture-of-Experts
  Paper • 2404.15045 • Published • 60
- Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
  Paper • 2404.14219 • Published • 258
- Learn Your Reference Model for Real Good Alignment
  Paper • 2404.09656 • Published • 89
Optimize
- The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions
  Paper • 2404.13208 • Published • 40
- Mixture-of-Agents Enhances Large Language Model Capabilities
  Paper • 2406.04692 • Published • 59
- Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
  Paper • 2406.04271 • Published • 30
3D
Prompting
RL
Runnin Local
Specific Models
- Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
  Paper • 2404.14219 • Published • 258
- How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study
  Paper • 2404.14047 • Published • 45
- Octopus v4: Graph of language models
  Paper • 2404.19296 • Published • 118
- DeepSeek-V3 Technical Report
  Paper • 2412.19437 • Published • 71
Stable Diffusion
- Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis
  Paper • 2404.13686 • Published • 28
- Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
  Paper • 2406.06525 • Published • 71
- IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models
  Paper • 2308.06721 • Published • 33
- Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps
  Paper • 2501.09732 • Published • 71
Datasets
Theoretical
- Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality
  Paper • 2405.21060 • Published • 67
- Your Transformer is Secretly Linear
  Paper • 2405.12250 • Published • 158
- KAN: Kolmogorov-Arnold Networks
  Paper • 2404.19756 • Published • 115
- Trusted Machine Learning Models Unlock Private Inference for Problems Currently Infeasible with Cryptography
  Paper • 2501.08970 • Published • 6
Keystone