"The results show that AMALIA-DPO [Direct Preference Optimisation] achieves the best performance among the fully open models by a considerable margin, even obtaining the best results among all the models in lexicology and semantics, demonstrating a robust mastery of the specific linguistic competences" of Portuguese in several categories.

The Portuguese Amália LLM [Large Language Model] has been evolving steadily under the consortium of Portuguese universities leading its development.

According to the technical report, in an in-depth evaluation of European Portuguese, Amália has clear advantages over other open models.

In Portuguese national exams (long-answer Portuguese questions), Amália "obtains the best score of all the fully open source models, demonstrating good comprehension of complex sentences and coherent text production, with appropriate grammar and register".

In this report, "we present an LLM that prioritises the European Portuguese language and its cultural context," reads the document, which states that Amália uses data from arquivo.pt and post-training data prepared specifically for European Portuguese.

The document indicates that the LLM was trained using language modelling and instruction-tuning strategies.
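The report does not spell out the training recipe, but instruction tuning conventionally works by turning each prompt/response pair into a single token sequence and applying the language-modelling loss only to the response tokens. A minimal sketch, assuming the `-100` masking convention used by common training frameworks (the function name and token IDs here are illustrative, not from the Amália codebase):

```python
# Sketch of instruction-tuning data preparation (illustrative only).
# Each prompt/response pair becomes one sequence; prompt positions are
# masked with -100 so the loss is computed only on the response.

def build_example(prompt_ids, response_ids, eos_id=0):
    """Concatenate prompt and response into one training example."""
    input_ids = prompt_ids + response_ids + [eos_id]
    # -100 is the conventional "ignore this position" label value
    labels = [-100] * len(prompt_ids) + response_ids + [eos_id]
    return {"input_ids": input_ids, "labels": labels}

example = build_example(prompt_ids=[10, 11, 12], response_ids=[20, 21])
```

During training, positions labelled `-100` contribute nothing to the loss, so the model is optimised to produce the response given the prompt rather than to reproduce the prompt itself.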

"A fundamental challenge in the development of this model was the lack of benchmarks to monitor the progress of the model's performance," the report notes.

To mitigate this limitation, "we used national PT-PT exams, created a linguistic benchmark and translated several datasets" with a dedicated high-quality machine translation (MT) model.
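As a rough sketch of what "translated several datasets" entails in practice: the text fields of each benchmark example are run through the MT model while the labels are kept untouched, so the translated benchmark remains comparable to the original. The `translate` callable below is a placeholder standing in for the consortium's dedicated MT model, not a real API:

```python
# Sketch of benchmark translation (illustrative; `translate` is a
# placeholder for the dedicated PT-PT machine translation model).

def translate_dataset(examples, translate):
    """Translate the text field of each example, preserving its label."""
    return [{"text": translate(ex["text"]), "label": ex["label"]}
            for ex in examples]

# Toy stand-in "MT model" just to show the data flow:
demo = translate_dataset([{"text": "hello", "label": 1}],
                         translate=lambda s: s.upper())
```

Keeping labels fixed while only the text changes is what allows scores on the translated benchmark to be compared against scores on the original.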

"The evaluation showed that Amália outperforms all previous open-source models in PT-PT and many 'open-weight' models [which share the weights (trained parameters)]," concludes the technical report.

"Experiments on language comprehension and inference benchmarks show state-of-the-art or comparable results, while in language generation benchmarks, the model excels in the quality of the generated text. Safety experiments also show that the model is in line with the state of the art," reads the report.

In the future, "we will explore other reinforcement learning methods and develop new combinations of training data to improve reasoning abilities in PT-PT".

In practice, these results indicate that Amália is becoming a reliable assistant in European Portuguese.

The report was written by the coordinators, João Magalhães (UNL) and André Martins (IST), together with a team of around 20 people from the Universidade de Lisboa and the Universidade Nova de Lisboa.

The Amália model is being developed by a team made up of the Universidade Nova de Lisboa, the Instituto Superior Técnico, the Universidade de Coimbra, the Universidade do Porto, the Universidade do Minho and the Fundação para a Ciência e Tecnologia.

The process of creating Amália began with the large-scale collection and processing of European Portuguese data, which was filtered for relevance and linguistic quality; the Portuguese Web Archive was used for this purpose. The model was pre-trained on this data and then fine-tuned on other datasets to follow instructions, reason and solve problems.
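Filtering web data "for relevance and linguistic quality" is typically implemented with simple document-level heuristics before pre-training. A hedged sketch of such a filter; the function name and thresholds are illustrative, not the ones actually used for Amália:

```python
# Toy document-quality filter (illustrative thresholds, not Amália's).
# Real pipelines combine many such heuristics with language identification.

def passes_quality_filter(doc, min_words=50, max_symbol_ratio=0.10):
    """Keep documents that are long enough and mostly natural text."""
    words = doc.split()
    if len(words) < min_words:
        return False  # too short to be a useful training document
    # Fraction of characters that are neither letters nor whitespace:
    non_alpha = sum(1 for ch in doc if not (ch.isalpha() or ch.isspace()))
    if non_alpha / max(len(doc), 1) > max_symbol_ratio:
        return False  # likely markup debris, tables or code dumps
    return True
```

Documents that survive this kind of filtering form the pre-training corpus; instruction-following behaviour then comes from the later fine-tuning stage.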

Large-scale computing infrastructure was used to train the models, drawing on national supercomputers (MareNostrum 5 and Deucalion) and European supercomputers through the EuroHPC network.