Peter Szemraj PRO

pszemraj

https://pszemraj.carrd.co/

AI & ML interests

metallic intuition

Recent Activity

upvoted an article about 19 hours ago

ViDoRe V3: a comprehensive evaluation of retrieval for enterprise use-cases

upvoted an article about 20 hours ago

The Heterogeneous Feature of RoPE-based Attention in Long-Context LLMs

updated a model about 21 hours ago

pszemraj/medgemma-4b-it-heretic

upvoted an article about 19 hours ago

ViDoRe V3: a comprehensive evaluation of retrieval for enterprise use-cases

Article • 15 days ago • 48

upvoted an article about 20 hours ago

The Heterogeneous Feature of RoPE-based Attention in Long-Context LLMs

Article • 4 days ago • 11

upvoted a paper 1 day ago

OlmoEarth: Stable Latent Image Modeling for Multimodal Earth Observation

Paper • 2511.13655 • Published 2 days ago • 9

upvoted a collection 5 days ago

Motif-2-12.7B

2 items • Updated 14 days ago • 5

upvoted a paper 5 days ago

Motif 2 12.7B technical report

Paper • 2511.07464 • Published 13 days ago • 38

upvoted a collection 8 days ago

SYNTH

Fully generalist synthetic dataset and SOTA small reasoners • 3 items • Updated 9 days ago • 9

upvoted a collection 14 days ago

GVE

Towards General Video Embeddings: Models and Benchmarks • 4 items • Updated 17 days ago • 19

upvoted a paper 15 days ago

Trove: A Flexible Toolkit for Dense Retrieval

Paper • 2511.01857 • Published 16 days ago • 10

upvoted 2 papers 18 days ago

The Principles of Diffusion Models

Paper • 2510.21890 • Published 27 days ago • 57

Reasoning Language Model Inference Serving Unveiled: An Empirical Study

Paper • 2510.18672 • Published 29 days ago • 7

upvoted a collection 18 days ago

LightOnOCR

The Case for End-to-End and Efficient Domain-Specific Vision-Language Models for OCR • 7 items • Updated 6 days ago • 14

upvoted a paper 19 days ago

Kimi Linear: An Expressive, Efficient Attention Architecture

Paper • 2510.26692 • Published 20 days ago • 107

upvoted 2 papers 23 days ago

WorldGrow: Generating Infinite 3D World

Paper • 2510.21682 • Published 26 days ago • 41

Reasoning with Sampling: Your Base Model is Smarter Than You Think

Paper • 2510.14901 • Published Oct 16 • 47

upvoted 2 papers 26 days ago

Attention Sinks in Diffusion Language Models

Paper • 2510.15731 • Published Oct 17 • 48

olmOCR 2: Unit Test Rewards for Document OCR

Paper • 2510.19817 • Published 28 days ago • 13

upvoted 2 papers 28 days ago

AION-1: Omnimodal Foundation Model for Astronomical Sciences

Paper • 2510.17960 • Published about 1 month ago • 28

Chem-R: Learning to Reason as a Chemist

Paper • 2510.16880 • Published Oct 19 • 52

upvoted a paper 29 days ago

When to Ensemble: Identifying Token-Level Points for Stable and Fast LLM Ensembling

Paper • 2510.15346 • Published Oct 17 • 33

upvoted a paper 30 days ago

Robust Layerwise Scaling Rules by Proper Weight Decay Tuning

Paper • 2510.15262 • Published Oct 17 • 5