RETNET: Challenging the Inference Cost of Transformer Based Language Models
We all love our new best friend – the large language model. However, this friendship can be a costly one due to the high cost of inference. A novel architecture proposed by Sun et al. holds promise to optimize the return on investment. The Retentive Network (RetNet – https://arxiv.org/pdf/2307.08621.pdf) is a pioneering architecture that combines recurrence […]