I saw you ran ablations on model and context size.
Did you also run an ablation to assess the benefit of making the cross-attention mechanism phylogeny-aware?