Co-Director @McGill-NLP.bsky.social
Researcher @ServiceNow.bsky.social
Alumni: @StanfordNLP.bsky.social, EdinburghNLP
Natural Language Processor #NLProc
Read the paper here: arxiv.org/abs/2502.05670
If you want the naive, training-free / model-agnostic approach: their related work section says the most common choice is to use the final token’s last hidden state.
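A rough sketch of what "final token's last hidden state" means in practice, not taken from the paper. It assumes you already have the last layer's hidden states as nested lists shaped (batch, seq_len, dim) plus a 0/1 attention mask; the function name `last_token_state` and the toy numbers are made up for illustration. With a real model you would do the same indexing on tensors.

```python
from typing import List, Sequence

Vector = List[float]


def last_token_state(
    last_layer: Sequence[Sequence[Vector]],
    attention_mask: Sequence[Sequence[int]],
) -> List[Vector]:
    """For each sequence, return the last-layer hidden state of its final real token.

    Padding positions (mask == 0) are skipped, so this works for both
    left-padded and right-padded batches.
    """
    pooled = []
    for states, mask in zip(last_layer, attention_mask):
        # Indices of the non-padding tokens in this sequence.
        real_positions = [i for i, m in enumerate(mask) if m == 1]
        if not real_positions:
            raise ValueError("sequence has no non-padding tokens")
        pooled.append(list(states[real_positions[-1]]))
    return pooled


# Toy batch (hypothetical values): two sequences, three positions, dim 2.
last_layer = [
    [[0.1, 0.2], [0.3, 0.4], [0.0, 0.0]],  # right-padded, 2 real tokens
    [[0.5, 0.6], [0.7, 0.8], [0.9, 1.0]],  # no padding
]
attention_mask = [[1, 1, 0], [1, 1, 1]]

embeddings = last_token_state(last_layer, attention_mask)
```

Using the mask instead of always taking position -1 matters: on a padded batch, position -1 can be a padding token rather than the final real token.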