Sasha Rush

Cornell, Mamba

Recent Activity · 57 x-posts

srush_io
Sasha Rush

@srush_io

Been working on text feedback / OPSD in Composer. Really interesting space, and much more to be explored.

Highlights: Sasha Rush is working on text feedback and OPSD in Composer, indicating ongoing development in AI-assisted coding tools.

Worth reading: Provides insight into the direction of Cursor's research on interactive code generation.

Tooling · Agent
srush_io
Sasha Rush

@srush_io

⛏️

Highlights: A short post with a pickaxe emoji, possibly referencing mining or digging into research.

Worth reading: May signal a new project or discovery, but lacks context.

LLM
srush_io
Sasha Rush

@srush_io

today i woke up to a living version of a phd student's nightmare. a new paper in my inbox: a detailed reproduction of a paper i wrote

Highlights: Sasha Rush expresses surprise and perhaps concern about a reproduction of his paper arriving in his inbox.

Worth reading: Highlights the real-world impact of reproducibility in AI research.

Evaluation · Safety
srush_io
Sasha Rush

@srush_io

today i woke up to a living version of a phd student's nightmare. a new paper in my inbox: a detailed reproduction of a paper i wrote

Highlights: A personal anecdote about receiving a detailed reproduction of his own paper, highlighting independent replication of published work.

Worth reading: Shows the human side of academic research and the impact of reproducibility.

Evaluation
srush_io
Sasha Rush

@srush_io

One personal reflection is how interesting a challenge RL is. Unlike other ML systems, you can't abstract

Highlights: Reflects on the unique challenges of reinforcement learning compared to other ML systems.

Worth reading: Provides insight into the difficulties of RL from a researcher's perspective.

Agent
srush_io
Sasha Rush

@srush_io

⛏️

Highlights: A tweet with a pickaxe emoji, possibly hinting at mining or hard work.

Worth reading: Ambiguous but may signal a new project or effort.

Tooling
srush_io
Sasha Rush

@srush_io

today i woke up to a living version of a phd student's nightmare. a new paper in my inbox: a detailed reproduction of a paper i wrote

Highlights: Sasha Rush woke up to a detailed reproduction of his own paper, a common PhD student nightmare.

Worth reading: Illustrates the anxiety of academic reproduction and the vulnerability of published work.

Evaluation
srush_io
Sasha Rush

@srush_io

Oh it looks like I only made videos for the first half... If people start doing it I'll add more videos.

Highlights: Sasha Rush notes he only made videos for the first half of a series and will add more if there is interest.

Worth reading: Shows his responsiveness to audience engagement and iterative content creation.

Tooling
srush_io
Sasha Rush

@srush_io

(Thanks to @AntonAbilov who led a lot of this work)

Highlights: Sasha Rush thanks Anton Abilov for leading a significant portion of a project.

Worth reading: Demonstrates collaborative spirit and attribution in research.

LLM
srush_io
Sasha Rush

@srush_io

On the infra side, composer 2 uses CP. This is (i think?) the first real detail from using CP on MLA. My understanding is that each rank

Highlights: Sasha discusses infrastructure details: Composer 2 uses CP (context parallelism) on MLA (Multi-head Latent Attention).

Worth reading: Provides insight into practical implementation details of large model training infrastructure.

Infra · LLM
srush_io
Sasha Rush

@srush_io

today i woke up to a living version of a phd student's nightmare. a new paper in my inbox: a detailed reproduction of a paper i wrote

Highlights: Sasha humorously describes the experience of receiving a reproduction of his own paper.

Worth reading: Illustrates the academic experience of having one's work independently replicated.

Evaluation
srush_io
Sasha Rush

@srush_io

On the infra side, composer 2 uses CP. This is (i think?) the first real detail from using CP on MLA. My understanding is that each rank

Highlights: Sasha Rush discusses infrastructure details about composer 2 using CP (context parallelism) on MLA (multi-head latent attention), noting it may be the first real detail from using CP on MLA.

Worth reading: Provides insight into cutting-edge LLM infrastructure and parallelism techniques.

Infra · LLM
srush_io
Sasha Rush

@srush_io

Sasha Rush (@srush_nlp). 7 likes.

Highlights: Tweet with 7 likes, content not available.

Worth reading: Not enough context to determine significance.

srush_io
Sasha Rush

@srush_io

Sasha Rush (@srush_nlp). 15 likes.

Highlights: Tweet with 15 likes, content not available.

Worth reading: Not enough context to determine significance.

srush_io
Sasha Rush

@srush_io

⛏️

Highlights: Tweet with a pickaxe emoji, 140 likes.

Worth reading: Possibly a reference to mining or hard work, but context is minimal.

srush_io
Sasha Rush

@srush_io

⛏️

Highlights: A single pickaxe emoji post, possibly indicating a new tool or project.

Worth reading: A minimal post that may hint at new work, though the post itself gives no context.

Tooling
srush_io
Sasha Rush

@srush_io

No content extracted.

Highlights: A post with 10 likes, content not available.

Worth reading: Cannot determine significance without content.

srush_io
Sasha Rush

@srush_io

No content extracted.

Highlights: A post with 7 likes, content not available.

Worth reading: Cannot determine significance without content.

srush_io
Sasha Rush

@srush_io

On the infra side, composer 2 uses CP. This is (i think?) the first real detail from using CP on MLA. My understanding is that each rank

Highlights: Sasha Rush discusses infrastructure details about composer 2 using CP (context parallelism) on MLA (Multi-Head Latent Attention), noting that he thinks it is the first real detail from using CP on MLA.

Worth reading: Provides insight into cutting-edge infrastructure for large language models, relevant for those working on model parallelism.

Infra · LLM
srush_io
Sasha Rush

@srush_io

On the infra side, composer 2 uses CP. This is (i think?) the first real detail from using CP on MLA. My understanding is that each rank

Highlights: Sasha Rush discusses infrastructure details about composer 2 using CP (context parallelism) on MLA (multi-head latent attention).

Worth reading: Provides insight into practical deployment of advanced parallelism techniques for large language models.

Infra · LLM
srush_io
Sasha Rush

@srush_io

On the infra side, composer 2 uses CP. This is (i think?) the first real detail from using CP on MLA. My understanding is that each rank first computes the compressed KVs, all gather this compressed latents. while the all gather is in flight, compute the Q proj

Highlights: Sasha discusses infrastructure details of composer 2 using CP (context parallelism) on MLA, describing the process of computing compressed KVs and all-gathering latents.

Worth reading: Provides insight into how context parallelism can be combined with latent-compressed attention in large language model infrastructure.

Infra · LLM
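The post above is cut off, so the following is only a rough sketch of the steps it names (compress KVs per rank, all-gather the latents, compute the Q projection meanwhile), not Composer 2's actual implementation. It simulates the ranks inside one Python process, uses a single head with no causal mask or positional encoding, and every function and weight name (`mla_context_parallel`, `w_dkv`, `w_uk`, `w_uv`, `w_q`) is a hypothetical chosen for illustration. The "all-gather" is a plain list concatenation, so no overlap of communication and compute actually happens here.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    """Plain single-head attention (no causal mask, for brevity)."""
    scale = 1.0 / math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    probs = [softmax([s * scale for s in row]) for row in scores]
    return matmul(probs, v)

def rand_matrix(rows, cols, rng):
    """Random matrix with entries in [-1, 1]."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

def mla_reference(x, w_dkv, w_uk, w_uv, w_q):
    """Single-device baseline: compress to latents, up-project, attend."""
    latent = matmul(x, w_dkv)
    return attention(matmul(x, w_q), matmul(latent, w_uk), matmul(latent, w_uv))

def mla_context_parallel(x, w_dkv, w_uk, w_uv, w_q, world_size):
    """Simulated CP: each rank owns a contiguous slice of the sequence.

    Assumes len(x) is divisible by world_size.
    """
    shard = len(x) // world_size
    shards = [x[r * shard:(r + 1) * shard] for r in range(world_size)]
    # Step 1: every rank compresses only its own tokens into small latents.
    local_latents = [matmul(xs, w_dkv) for xs in shards]
    # Step 2: stand-in for the all-gather, concatenating latents in rank order.
    gathered = [row for lat in local_latents for row in lat]
    out = []
    for xs in shards:
        # Step 3: the Q projection needs only local tokens, so a real system
        # could compute it while the all-gather is still in flight.
        q = matmul(xs, w_q)
        # Step 4: up-project the gathered latents to K and V, then attend.
        out.extend(attention(q, matmul(gathered, w_uk), matmul(gathered, w_uv)))
    return out

if __name__ == "__main__":
    rng = random.Random(0)
    x = rand_matrix(8, 6, rng)       # 8 tokens, model width 6
    w_dkv = rand_matrix(6, 3, rng)   # down-projection to a width-3 latent
    w_uk = rand_matrix(3, 4, rng)    # latent -> K
    w_uv = rand_matrix(3, 4, rng)    # latent -> V
    w_q = rand_matrix(6, 4, rng)     # model width -> Q
    ref = mla_reference(x, w_dkv, w_uk, w_uv, w_q)
    cp = mla_context_parallel(x, w_dkv, w_uk, w_uv, w_q, world_size=4)
    diff = max(abs(a - b) for ra, rb in zip(ref, cp) for a, b in zip(ra, rb))
    print(f"max abs difference vs. single-device baseline: {diff:.2e}")
```

In this sketch the sharded result matches the single-device baseline, and the only data exchanged between ranks is the latent matrix, which is narrower than the full K and V it expands into. That is one plausible reason to gather latents rather than full keys and values, though the post does not spell out the motivation.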
srush_io
Sasha Rush

@srush_io

⛏️

Highlights: A single pickaxe emoji, possibly indicating a mining or digging metaphor.

Worth reading: Minimal content, but may signal a new project or finding.

LLM
srush_io
Sasha Rush

@srush_io

Highlights: No text content available from search snippet.

Worth reading: Incomplete data; unable to determine significance.

srush_io
Sasha Rush

@srush_io

today i woke up to a living version of a phd student's nightmare. a new paper in my inbox: a detailed reproduction of a paper i wrote

Highlights: Sasha Rush expresses surprise at a reproduction of his own paper.

Worth reading: Reflects on the reproducibility culture in ML research.

Evaluation
srush_io
Sasha Rush

@srush_io

Talk at Ray Summit on 'Building Cursor Composer.' Overview of the work from our research team.

Highlights: Sasha Rush gave a talk at Ray Summit about building Cursor Composer.

Worth reading: Showcases technical work on AI-powered code composition tools.

Tooling
srush_io
Sasha Rush

@srush_io

today i woke up to a living version of a phd student's nightmare. a new paper in my inbox: a detailed reproduction of a paper i wrote

Highlights: Sasha Rush humorously describes receiving a reproduction of his own paper, a common academic anxiety.

Worth reading: Illustrates the reproducibility challenge in AI research with a personal anecdote.

Evaluation
srush_io
Sasha Rush

@srush_io

Been reflecting a bit on the Harvard news. This paper from 2017 was ... Didn't realize at the time how lucky for us Americans to work with incredible people from around the world.

Highlights: Reflecting on Harvard news and gratitude for international collaboration.

Worth reading: Provides personal perspective on academic and global collaboration.

Safety
srush_io
Sasha Rush

@srush_io

Been reflecting a bit on the Harvard news. This paper from 2017 was ... Didn't realize at the time how lucky for us Americans to work with incredible people from around the world.

Highlights: Reflection on Harvard news and gratitude for working with international colleagues.

Worth reading: Provides personal perspective from an AI researcher on academic and global collaboration.

Safety
srush_io
Sasha Rush

@srush_io

Been reflecting a bit on the Harvard news. This paper from 2017 was ... Didn't realize at the time how lucky for us Americans to work with incredible people from around the world.

Highlights: Reflection on Harvard news and gratitude for international collaboration.

Worth reading: Offers personal perspective on academic events and global teamwork.

Evaluation
srush_io
Sasha Rush

@srush_io

Been reflecting a bit on the Harvard news. This paper from 2017 was ... Didn't realize at the time how lucky for us Americans to work with incredible people from around the world.

Highlights: Reflection on Harvard news and gratitude for international collaboration.

Worth reading: Personal reflection on academic community.

Safety
srush_io
Sasha Rush

@srush_io

Composer is a new model we built at Cursor. We used RL to train a big MoE model to be really good at real-world coding, and also very fast. https://cursor.com/blog/composer Excited for the potential of building specialized models to help in critical domains.

Highlights: Sasha Rush announces Composer, a new RL-trained MoE coding model from Cursor, emphasizing speed and real-world coding performance.

Worth reading: Showcases a practical application of RL and MoE for specialized coding assistance.

LLM · Fine-tuning · Tooling
srush_io
Sasha Rush

@srush_io

today i woke up to a living version of a phd student's nightmare. a new paper in my inbox: a detailed reproduction of a paper i wrote

Highlights: Sasha Rush woke up to a detailed reproduction of his own paper, a common PhD student nightmare.

Worth reading: Illustrates the pressures of academic reproducibility and the anxiety of having one's work scrutinized.

Evaluation
srush_io
Sasha Rush

@srush_io

today i woke up to a living version of a phd student's nightmare. a new paper in my inbox: a detailed reproduction of a paper i wrote

Highlights: Sasha Rush received a reproduction paper of his own work, calling it a PhD student's nightmare.

Worth reading: Reflects on reproducibility in AI research.

Evaluation
srush_io
Sasha Rush

@srush_io

today i woke up to a living version of a phd student's nightmare. a new paper in my inbox: a detailed reproduction of a paper i wrote

Highlights: Sasha Rush received a reproduction paper of his own work, calling it a PhD student's nightmare.

Worth reading: Reflects on the reproducibility culture in AI research.

Evaluation · Safety
srush_io
Sasha Rush

@srush_io

Some personal news: I recently joined Cursor. Cursor is a small, ambitious team, and they've created

Highlights: Sasha Rush announced joining Cursor, a small ambitious team.

Worth reading: Shows a notable career move by a prominent AI researcher.

Tooling
srush_io
Sasha Rush

@srush_io

Some personal news: I recently joined Cursor. Cursor is a small, ambitious team, and they've created

Highlights: Sasha Rush announced joining Cursor, a small ambitious team.

Worth reading: Indicates a notable career move by a prominent researcher.

Tooling
srush_io
Sasha Rush

@srush_io

Some personal news: I recently joined Cursor. Cursor is a small, ambitious team, and they've created

Highlights: Sasha Rush announces joining Cursor, an ambitious small team.

Worth reading: Shows a notable researcher moving to a startup, indicating industry trends.

Tooling
srush_io
Sasha Rush

@srush_io

Some personal news: I recently joined Cursor. Cursor is a small, ambitious team, and they've created

Highlights: Sasha Rush announced joining Cursor, a small ambitious team.

Worth reading: Highlights a notable career move by a prominent AI researcher.

LLM · Deployment · Tooling
srush_io
Sasha Rush

@srush_io

Some personal news: I recently joined Cursor. Cursor is a small, ambitious team, and they've created

Highlights: Sasha Rush announced joining Cursor, a small ambitious team.

Worth reading: Indicates a move to industry from academia.

Deployment
srush_io
Sasha Rush

@srush_io

Some personal news: I recently joined Cursor. Cursor is a small, ambitious team, and they've created

Highlights: Sasha Rush announced joining Cursor, highlighting the team's ambition and product.

Worth reading: Shows a prominent researcher's move to a startup, indicating industry trends.

Tooling
srush_io
Sasha Rush

@srush_io

Some personal news: I recently joined Cursor. Cursor is a small, ambitious team, and they've created...

Highlights: Sasha Rush announced joining Cursor, a small ambitious team.

Worth reading: Indicates a career move to an AI startup.

Tooling
srush_io
Sasha Rush

@srush_io

Some personal news: I recently joined Cursor. Cursor is a small, ambitious team, and they've created

Highlights: Sasha Rush announces joining Cursor, a small ambitious team.

Worth reading: Highlights a notable career move by a prominent NLP researcher.

Tooling
srush_io
Sasha Rush

@srush_io

Some personal news: I recently joined Cursor. Cursor is a small, ambitious team, and they've created

Highlights: Sasha Rush announces joining Cursor, a small ambitious team.

Worth reading: Highlights a notable career move in AI/ML industry.

Deployment · Tooling
srush_io
Sasha Rush

@srush_io

Some personal news: I recently joined Cursor. Cursor is a small, ambitious team, and they've created

Highlights: Sasha Rush announced joining Cursor, a small ambitious team.

Worth reading: Indicates a key career move for a prominent AI researcher.

Tooling
srush_io
Sasha Rush

@srush_io

Wager established. Jonathan Frankle (@jefrankle) stepped up to my Transformer long bet.

Highlights: Sasha Rush made a bet about Transformers with Jonathan Frankle.

Worth reading: Shows engagement in research debates around Transformers.

LLM
srush_io
Sasha Rush

@srush_io

Wager established. Jonathan Frankle (@jefrankle) stepped up to my Transformer long bet.

Highlights: Sasha Rush makes a bet about Transformers with Jonathan Frankle.

Worth reading: Illustrates a public debate on Transformer longevity in AI.

LLM
srush_io
Sasha Rush

@srush_io

Wager established. Jonathan Frankle (@jefrankle) stepped up to my Transformer long bet.

Highlights: Sasha Rush established a bet with Jonathan Frankle about Transformers.

Worth reading: Shows engagement in AI community debates about Transformer architectures.

LLM
srush_io
Sasha Rush

@srush_io

Wager established. Jonathan Frankle (@jefrankle) stepped up to my Transformer long bet.

Highlights: Sasha Rush made a long bet about Transformers with Jonathan Frankle.

Worth reading: Shows an ongoing debate about the future of Transformer architectures.

LLM · Infra
srush_io
Sasha Rush

@srush_io

Wager established. Jonathan Frankle (@jefrankle) stepped up to my Transformer long bet.

Highlights: Sasha Rush made a long bet with Jonathan Frankle about Transformers.

Worth reading: Shows a playful academic challenge between researchers.

LLM
srush_io
Sasha Rush

@srush_io

Wager established. Jonathan Frankle (@jefrankle) stepped up to my Transformer long bet.

Highlights: Sasha Rush established a long bet with Jonathan Frankle about Transformers.

Worth reading: Shows Rush's engagement in public bets on ML research.

LLM
srush_io
Sasha Rush

@srush_io

Wager established. Jonathan Frankle (@jefrankle) stepped up to my Transformer long bet.

Highlights: Sasha Rush establishes a bet with Jonathan Frankle regarding Transformers.

Worth reading: Shows engagement in research debates about Transformer architectures.

LLM
srush_io
Sasha Rush

@srush_io

Wager established. Jonathan Frankle (@jefrankle) stepped up to my Transformer long bet.

Highlights: Sasha Rush engages in a public bet about Transformers with Jonathan Frankle.

Worth reading: Shows engagement in ML community debates.

LLM
srush_io
Sasha Rush

@srush_io

#acl2020nlp Lot of threads online about likes and dislikes for the conference. Twitter is fleeting, github is forever. Send issues or PRs: https://github.com/Mini-Conf/Mini-Conf/issues… It's early days, we're making up virtual conferences as we go along.

Highlights: Sasha Rush advocates for using GitHub over Twitter for lasting conference feedback, and acknowledges the experimental nature of virtual conferences.

Worth reading: Highlights the shift to virtual conferences and the value of persistent open-source contributions.

Infra
srush_io
Sasha Rush

@srush_io

(My last chance to tweet about Yoon Kim as he leaves the lab 😢. Part of an amazing group of students.) Congrats to Yoon on winning this year's HarvardCS thesis award! And since its public, Yoon is heading next to MIT. Highly recommend sending an app🍎 https://seas.

Highlights: Sasha Rush congratulates his student Yoon Kim on winning this year's HarvardCS thesis award and notes that Yoon is heading to MIT next.

Worth reading: Shows Rush's mentorship and pride in his students' achievements.

Evaluation
srush_io
Sasha Rush

@srush_io

Congrats to Dr. Yoon Kim 🍾 who zoom defended his dissertation "Deep Latent Variable Model of Natural Language". Yoon's research is wonderful, he's also such a thoughtful teacher and dedicated collaborator. Very curious what he decides to do next

Highlights: Sasha Rush celebrates Yoon Kim's PhD defense and praises his research and character.

Worth reading: Demonstrates Rush's support for students and recognition of their contributions.

Evaluation
srush_io
Sasha Rush

@srush_io

Some news: moving this fall from Harvard -> Cornell Tech. Sad to leave such an incredible ...

Highlights: Sasha Rush announced his move from Harvard to Cornell Tech.

Worth reading: Shows a career transition in academia, relevant to understanding his professional trajectory.

Deployment
srush_io
Sasha Rush

@srush_io

Some news: moving this fall from Harvard -> Cornell Tech. Sad to leave such an incredible

Highlights: Sasha Rush announced moving from Harvard to Cornell Tech.

Worth reading: Shows a career move in academia.

Deployment
57 x-posts · All time