LLM-Leaderboard

StarscreamDeceptions

AI & ML interests

None yet

Recent Activity

liked a dataset 5 days ago

LLM-Tuning-Safety/HEx-PHI

upvoted a paper 28 days ago

DeepWideSearch: Benchmarking Depth and Width in Agentic Information Seeking

upvoted a paper 28 days ago

HSCodeComp: A Realistic and Expert-level Benchmark for Deep Search Agents in Hierarchical Rule Application

spaces 1

🌐 Multilingual MMLU Benchmark Leaderboard

View and submit LLM benchmarks

models 0

None public yet

datasets 2

StarscreamDeceptions/results

Viewer • Updated Nov 13, 2024 • 17 • 17

StarscreamDeceptions/requests

Preview • Updated Nov 13, 2024 • 34