Reading - Category - vllbc02's blogs

Reading

2025

Group Sequence Policy Optimization 07-28

Can Language Models Serve as Text-Based World Simulators? 07-28

Towards Effective Code-Integrated Reasoning 07-26

Routine: A Structural Planning Framework for LLM Agent System in Enterprise 07-25

Search and Refine During Think: Autonomous Retrieval-Augmented Reasoning of LLMs 07-20

Peri-LN: Revisiting Normalization Layer in the Transformer Architecture 07-19

ZEROSEARCH: Incentivize the Search Capability of LLMs without Searching 07-18

WebThinker: Empowering Large Reasoning Models with Deep Research Capability 07-16

Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning 07-16

Process Reinforcement through Implicit Rewards 07-16
