LLM-Leaderboard

StarscreamDeceptions

AI & ML interests

None yet

Recent Activity

liked a dataset 8 days ago

LLM-Tuning-Safety/HEx-PHI

upvoted a paper about 1 month ago

DeepWideSearch: Benchmarking Depth and Width in Agentic Information Seeking

upvoted a paper about 1 month ago

HSCodeComp: A Realistic and Expert-level Benchmark for Deep Search Agents in Hierarchical Rule Application

Organizations

StarscreamDeceptions's Spaces (1)

🌐 Multilingual MMLU Benchmark Leaderboard

View and submit LLM benchmarks