456 Episodes

  1. ShiQ: Bringing back Bellman to LLMs

    Published: 5/22/2025
  2. Policy Learning with a Natural Language Action Space: A Causal Approach

    Published: 5/22/2025
  3. Multi-Objective Preference Optimization: Improving Human Alignment of Generative Models

    Published: 5/22/2025
  4. End-to-End Learning for Stochastic Optimization: A Bayesian Perspective

    Published: 5/21/2025
  5. TEXTGRAD: Automatic Differentiation via Text

    Published: 5/21/2025
  6. Steering off Course: Reliability Challenges in Steering Language Models

    Published: 5/20/2025
  7. Past-Token Prediction for Long-Context Robot Policies

    Published: 5/20/2025
  8. Recovering Coherent Event Probabilities from LLM Embeddings

    Published: 5/20/2025
  9. Systematic Meta-Abilities Alignment in Large Reasoning Models

    Published: 5/20/2025
  10. Predictability Shapes Adaptation: An Evolutionary Perspective on Modes of Learning in Transformers

    Published: 5/20/2025
  11. Efficient Exploration for LLMs

    Published: 5/19/2025
  12. Rankers, Judges, and Assistants: Towards Understanding the Interplay of LLMs in Information Retrieval Evaluation

    Published: 5/18/2025
  13. Bayesian Concept Bottlenecks with LLM Priors

    Published: 5/17/2025
  14. Transformers for In-Context Reinforcement Learning

    Published: 5/17/2025
  15. Evaluating Large Language Models Across the Lifecycle

    Published: 5/17/2025
  16. Active Ranking from Human Feedback with DopeWolfe

    Published: 5/16/2025
  17. Optimal Designs for Preference Elicitation

    Published: 5/16/2025
  18. Dual Active Learning for Reinforcement Learning from Human Feedback

    Published: 5/16/2025
  19. Active Learning for Direct Preference Optimization

    Published: 5/16/2025
  20. Active Preference Optimization for RLHF

    Published: 5/16/2025

Cut through the noise. We curate and break down the most important AI papers so you don’t have to.