Jiangshan's Personal Website
Tag: qwen2.5-3B
Posted 2026-06-29 | Updated 2026-07-01 | Study Experiment | 6 minutes read (About 918 words)

2026-06-29-SlimeSearchR1Example

Training and evaluation of a Qwen2.5-3B model, trained with Search-R1 in slime, on NQ and HotpotQA.
Jiangshan Gong

University of Illinois at Urbana-Champaign


© 2026 Jiangshan Gong  Powered by Hexo & Icarus
