List of models optimized for Portuguese (both Brazilian and European varieties), focusing on architectures with at least 1 billion parameters.
When appropriate, the models are grouped into “families”. The information presented for each model includes: name, release date, license, Portuguese variety, size, base model, weights, variations, training data, data cutoff date, associated API, online chat, and development team.
Click on the family name to access information about each language model variant.
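Each entry below follows the same "- Field: value" layout, so it can be read into a simple record. The following is a minimal sketch of that idea, not part of the original list: the `ModelEntry` class, the `LABELS` mapping, and the `parse_entry` helper are hypothetical names chosen for illustration, and every field is kept as plain text because many values are placeholders such as "(?)".

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical mapping from the field labels used in this list to attribute names.
LABELS = {
    "Name": "name",
    "Release date": "release_date",
    "License": "license",
    "Portuguese variety": "portuguese_variety",
    "Size": "size",
    "Base model": "base_model",
    "Weights": "weights",
    "Variations": "variations",
    "Training data": "training_data",
    "Data cutoff (training)": "data_cutoff",
    "Associated API": "associated_api",
    "Online chat": "online_chat",
    "Team": "team",
}


@dataclass
class ModelEntry:
    """One model entry; all fields are optional free text."""
    name: Optional[str] = None
    release_date: Optional[str] = None
    license: Optional[str] = None
    portuguese_variety: Optional[str] = None
    size: Optional[str] = None
    base_model: Optional[str] = None
    weights: Optional[str] = None
    variations: Optional[str] = None
    training_data: Optional[str] = None
    data_cutoff: Optional[str] = None
    associated_api: Optional[str] = None
    online_chat: Optional[str] = None
    team: Optional[str] = None


def parse_entry(lines: list[str]) -> ModelEntry:
    """Build a ModelEntry from '- Field: value' lines, ignoring anything else."""
    values = {}
    for line in lines:
        text = line.strip()
        if not text.startswith("- "):
            continue  # headings and blank lines carry no field
        label, sep, value = text[2:].partition(":")
        attr = LABELS.get(label.strip())
        if sep and attr:
            values[attr] = value.strip()
    return ModelEntry(**values)


# Usage with a few fields taken from the Bode entry in this list.
bode = parse_entry([
    "Bode",
    "- Name: Bode",
    "- Release date: 2023-10-11",
    "- License: MIT",
    "- Base model: Llama 2",
])
print(bode.name, bode.release_date, bode.base_model)
```

Fields that are absent from an entry stay `None`, which keeps them distinct from the list's own "(?)" and "(future)" placeholders.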
Alpaca-LoRA-PTBR
- Name: Alpaca-LoRA-PTBR
- Release date: 2023-03-18
- License: Creative Commons Attribution 4.0
- Portuguese variety: Brazil
- Size: 7B
- Base model: Llama 1
- Weights: yes
- Variations: Alpaca-lora-ptbr-7b
- Training data: Stanford Alpaca (machine-translated)
- Data cutoff (training): ≥ 2022-09
- Associated API: —
- Online chat: —
- Team: Maicon Domingues
Amália
- Name: Amália
- Release date: (future)
- License: (?)
- Portuguese variety: Portugal
- Size: 9B
- Base model: (?)
- Weights: (future)
- Variations: (?)
- Training data: (?)
- Data cutoff (training): (?)
- Associated API: (future)
- Online chat: —
- Team: Government of Portugal, NOVA FCT, IST-UL
Amazônia IA
- Name: Amazônia IA
- Release date: 2024-08-05
- License: Proprietary
- Portuguese variety: Brazil
- Size: (confidential)
- Base model: (?)
- Weights: no
- Variations: —
- Training data: (?)
- Data cutoff (training): (?)
- Associated API: (future)
- Online chat: yes
- Team: Widelabs, Oracle, NVIDIA
Bode
- Name: Bode
- Release date: 2023-10-11
- License: MIT
- Portuguese variety: Brazil
- Size: 7B and 13B
- Base model: Llama 2
- Weights: yes
- Variations: Bode-7b-alpaca-pt-br, Bode-13b-alpaca-pt-br
- Training data: Portuguese subset of mC4
- Data cutoff (training): ≥ 2023-07
- Associated API: —
- Online chat: —
- Team: Recogna
Canarim
Canarim Instruct
- Name: Canarim Instruct
- Release date: 2023-11-17
- License: Llama 2 Community License
- Portuguese variety: Brazil
- Size: 7B
- Base model: Llama 2
- Weights: yes
- Variations: Canarim-7B-Instruct
- Training data: CC-MAIN-2023-23
- Data cutoff (training): ≥ 2023-07
- Associated API: —
- Online chat: —
- Team: Maicon Domingues
Canarim VestibulAide
- Name: Canarim VestibulAide
- Release date: 2023-11-17
- License: Llama 2 Community License
- Portuguese variety: Brazil
- Size: 7B
- Base model: Llama 2
- Weights: yes
- Variations: Canarim-7B-VestibulAide
- Training data: Brazilian university entrance exams (vestibulares)
- Data cutoff (training): ≥ 2023-07
- Associated API: —
- Online chat: —
- Team: Maicon Domingues
Carvalho_pt-gl
- Name: Carvalho_pt-gl
- Release date: 2024-03 to 2025-03
- License: Llama 3.1 Community License
- Portuguese variety: Galicia and Portugal
- Size: 1.3B
- Base model: Cerebras-GPT
- Weights: yes
- Variations: Llama-Carvalho-PT-GL, Carvalho_pt-gl-1.3B, Llama-Carvalho-PT, Llama-Carvalho-GL
- Training data: CorpusNOS, BNE-gl, Arquivo.pt
- Data cutoff (training): (?)
- Associated API: —
- Online chat: —
- Team: CiTIUS and ILG-USC
Cocoruta
Cocoruta
- Name: Cocoruta
- Release date: 2023-10-28
- License: Llama 2 Community License
- Portuguese variety: Brazil
- Size: 7B
- Base model: Llama 2
- Weights: yes
- Variations: Cocoruta-7b
- Training data: Brazilian environmental legislation
- Data cutoff (training): 2023
- Associated API: —
- Online chat: —
- Team: KEML-C4AI (Felipe de Oliveira Espírito Santo)
Cocoruta 2
- Name: Cocoruta 2
- Release date: 2025-02-10
- License: Llama 3.1 Community License
- Portuguese variety: Brazil
- Size: 8B
- Base model: Llama 3.1
- Weights: yes
- Variations: Cocoruta-2-8b
- Training data: Brazilian environmental legislation
- Data cutoff (training): 2025
- Associated API: —
- Online chat: —
- Team: KEML-C4AI (Felipe de Oliveira Espírito Santo)
Gaia
Gervásio
Gervásio 7B PTBR
- Name: Gervásio 7B PTBR
- Release date: 2024-02-28
- License: MIT
- Portuguese variety: Brazil
- Size: 7B
- Base model: Llama 2
- Weights: yes
- Variations: Gervasio-7b-portuguese-ptbr-decoder
- Training data: GLUE, SuperGLUE
- Data cutoff (training): (?)
- Associated API: —
- Online chat: —
- Team: Portulan Clarin
Gervásio 8B PTPT
- Name: Gervásio 8B PTPT
- Release date: 2025-06-11
- License: MIT
- Portuguese variety: Portugal
- Size: 8B
- Base model: Llama 3.1
- Weights: yes
- Variations: Gervasio-8b-portuguese-ptpt-decoder
- Training data: extraGLUE-Instruct, MMLU PT, Natural Instructions, Wikipedia, Proverbs
- Data cutoff (training): (?)
- Associated API: —
- Online chat: yes
- Team: Portulan Clarin
GlórIA
- Name: GlórIA
- Release date: 2024-02-26
- License: ClueWeb22 Dataset License
- Portuguese variety: Portugal
- Size: 1.3B
- Base model: GPT-Neo
- Weights: yes
- Variations: GlorIA-1.3B
- Training data: ClueWeb22 PTPT, OSCAR PTPT, ArquivoPT, OpenSubtitles PTPT, PTWiki, EuroParl PTPT
- Data cutoff (training): (?)
- Associated API: —
- Online chat: —
- Team: Researchers at NOVA FCT
Juru
- Name: Juru
- Release date: 2025-06-29
- License: (?)
- Portuguese variety: Brazil
- Size: 7B
- Base model: Mistral
- Weights: yes
- Variations: Juru-7B
- Training data: Academic articles from the Brazilian legal domain, LexML data, and STF documents
- Data cutoff (training): 2024
- Associated API: —
- Online chat: —
- Team: Roseval Malaquias Junior
L³M
- Name: L³M
- Release date: (future)
- License: Proprietary
- Portuguese variety: Brazil
- Size: (?)
- Base model: (?)
- Weights: no
- Variations: (?)
- Training data: Escavador
- Data cutoff (training): (?)
- Associated API: —
- Online chat: —
- Team: NeuralMind and Escavador
openCabrita
- Name: openCabrita
- Release date: 2023-07-06
- License: Apache 2.0
- Portuguese variety: (?)
- Size: 3B
- Base model: Llama 1
- Weights: yes
- Variations: Open-cabrita3b
- Training data: Portuguese subset of mC4
- Data cutoff (training): ≥ 2022-09
- Associated API: (future)
- Online chat: —
- Team: 22h
Sabiá
Sabiá-3
- Name: Sabiá-3
- Release date: 2024-04
- License: Proprietary
- Portuguese variety: Brazil
- Size: (confidential)
- Base model: (confidential)
- Weights: no
- Variations: Sabia-3, Sabiazinho-3
- Training data: Public internet data
- Data cutoff (training): 2023
- Associated API: paid
- Online chat: yes
- Team: Maritaca AI
Sabiá-3.1
- Name: Sabiá-3.1
- Release date: 2025-05
- License: Proprietary
- Portuguese variety: Brazil
- Size: (confidential)
- Base model: (confidential)
- Weights: no
- Variations: Sabia-3.1
- Training data: Public internet data
- Data cutoff (training): 2024-08
- Associated API: paid
- Online chat: yes
- Team: Maritaca AI
Sabiá-7B
- Name: Sabiá-7B
- Release date: 2023-11-08
- License: LLaMA License
- Portuguese variety: Brazil
- Size: 7B
- Base model: Llama 1
- Weights: yes
- Variations: Sabia-7b
- Training data: ClueWeb22
- Data cutoff (training): 2022
- Associated API: —
- Online chat: yes
- Team: Maritaca AI
SoberanIA
- Name: SoberanIA
- Release date: (future)
- License: (?)
- Portuguese variety: Brazil
- Size: (?)
- Base model: (?)
- Weights: (future)
- Variations: (?)
- Training data: (?)
- Data cutoff (training): (?)
- Associated API: (future)
- Online chat: (future)
- Team: Government of Piauí
Tucano
- Name: Tucano
- Release date: 2024-11-07
- License: Apache 2.0
- Portuguese variety: Brazil
- Size: 0.16B to 2B
- Base model: (?)
- Weights: yes
- Variations: Tucano-1b1-Instruct, Tucano-2b4-Instruct
- Training data: GigaVerbo
- Data cutoff (training): (?)
- Associated API: —
- Online chat: —
- Team: Tucano Project
Contributors:
- Vinícius Bitencourt Matos
- Arnaldo Candido Junior
