456 Episodes

  1. Test-Time Alignment of Diffusion Models without Reward Over-Optimization

    Published: 5/16/2025
  2. Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback

    Published: 5/16/2025
  3. GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-time Alignment

    Published: 5/16/2025
  4. Advantage-Weighted Regression: Simple and Scalable Off-Policy RL

    Published: 5/16/2025
  5. Can RLHF be More Efficient with Imperfect Reward Models? A Policy Coverage Perspective

    Published: 5/16/2025
  6. Transformers can be used for in-context linear regression in the presence of endogeneity

    Published: 5/15/2025
  7. Bayesian Concept Bottlenecks with LLM Priors

    Published: 5/15/2025
  8. In-Context Parametric Inference: Point or Distribution Estimators?

    Published: 5/15/2025
  9. Enough Coin Flips Can Make LLMs Act Bayesian

    Published: 5/15/2025
  10. Bayesian Scaling Laws for In-Context Learning

    Published: 5/15/2025
  11. Posterior Mean Matching Generative Modeling

    Published: 5/15/2025
  12. Can Generative AI Solve Your In-Context Learning Problem? A Martingale Perspective

    Published: 5/15/2025
  13. Dynamic Search for Inference-Time Alignment in Diffusion Models

    Published: 5/15/2025
  14. Is In-Context Learning in Large Language Models Bayesian? A Martingale Perspective

    Published: 5/12/2025
  15. Leaked Claude Sonnet 3.7 System Instruction Tuning

    Published: 5/12/2025
  16. Converging Predictions with Shared Information

    Published: 5/11/2025
  17. Test-Time Alignment Via Hypothesis Reweighting

    Published: 5/11/2025
  18. Rethinking Diverse Human Preference Learning through Principal Component Analysis

    Published: 5/11/2025
  19. Active Statistical Inference

    Published: 5/10/2025
  20. Data Mixture Optimization: A Multi-fidelity Multi-scale Bayesian Framework

    Published: 5/10/2025

Cut through the noise. We curate and break down the most important AI papers so you don’t have to.