{"id":2574,"date":"2025-12-22T12:49:55","date_gmt":"2025-12-22T14:49:55","guid":{"rendered":"https:\/\/sites.usp.br\/keml\/?page_id=2574"},"modified":"2025-12-22T12:52:40","modified_gmt":"2025-12-22T14:52:40","slug":"language-models-for-portuguese","status":"publish","type":"page","link":"https:\/\/sites.usp.br\/keml\/en\/language-models-for-portuguese\/","title":{"rendered":"Language Models for Portuguese"},"content":{"rendered":"<p>List of models optimized for Portuguese (both Brazilian and European varieties), focusing on architectures with <strong>at least 1 billion parameters<\/strong>.<\/p>\n<p>When appropriate, the models are grouped into <strong>\u201cfamilies\u201d<\/strong>. The information presented for each model includes: name, release date, license, Portuguese variety, size, base model, weights, variations, training data, data cutoff date, associated API, online chat, and development team.<\/p>\n<p>Click on the family name to access information about each language model variant.<\/p>\n<details style=\"margin-bottom: 1.2em; padding: 0.8em 1em; border: 1px solid #ddd; border-radius: 8px; background-color: #f9f9f9;\">\n<summary style=\"font-weight: bold; color: #01484e; font-size: 1.1em; cursor: pointer; display: list-item;\">Alpaca-LoRA-PTBR<\/summary>\n<ul style=\"margin-top: 0; margin-bottom: 0.2em; padding-left: 1.2em;\">\n<li>Name: <strong>Alpaca-LoRA-PTBR<\/strong><\/li>\n<li>Release date: 2023-03-18<\/li>\n<li>License: Creative Commons Attribution 4.0<\/li>\n<li>Portuguese variety: Brazil<\/li>\n<li>Size: 7B<\/li>\n<li>Base model: Llama 1<\/li>\n<li>Weights: <a href=\"https:\/\/huggingface.co\/dominguesm\/alpaca-lora-ptbr-7b\" target=\"_blank\" rel=\"noopener\">yes<\/a><\/li>\n<li>Variations: Alpaca-lora-ptbr-7b<\/li>\n<li>Training data: <a href=\"https:\/\/github.com\/tatsu-lab\/stanford_alpaca\" target=\"_blank\" rel=\"noopener\">Stanford Alpaca (automatically translated)<\/a><\/li>\n<li>Data cutoff (training): \u2265 
2022-09<\/li>\n<li>Associated API: &#8212;<\/li>\n<li>Online chat: &#8212;<\/li>\n<li>Team: <a href=\"https:\/\/github.com\/DominguesM\" target=\"_blank\" rel=\"noopener\">Maicon Domingues<\/a><\/li>\n<\/ul>\n<\/details>\n<details style=\"margin-bottom: 1.2em; padding: 0.8em 1em; border: 1px solid #ddd; border-radius: 8px; background-color: #f9f9f9;\">\n<summary style=\"font-weight: bold; color: #01484e; font-size: 1.1em; cursor: pointer; display: list-item;\">Amadeus-Verbo<\/summary>\n<ul style=\"margin-top: 0; margin-bottom: 0.2em; padding-left: 1.2em;\">\n<li>Name: <strong>Amadeus-Verbo<\/strong><\/li>\n<li>Release date: 2025-02-15<\/li>\n<li>License: Apache 2.0<\/li>\n<li>Portuguese variety: Brazil<\/li>\n<li>Size: 0.5B to 72B<\/li>\n<li>Base model: Qwen 2.5<\/li>\n<li>Weights: <a href=\"https:\/\/huggingface.co\/collections\/amadeusai\/amadeus-verbo-qwen25-pt-br-powered-by-aws\" target=\"_blank\" rel=\"noopener\">yes<\/a><\/li>\n<li>Variations: <a href=\"https:\/\/huggingface.co\/collections\/amadeusai\/amadeus-verbo-qwen25-pt-br-powered-by-aws\" target=\"_blank\" rel=\"noopener\">21 variations<\/a><\/li>\n<li>Training data: (?)<\/li>\n<li>Data cutoff (training): (?)<\/li>\n<li>Associated API: &#8212;<\/li>\n<li>Online chat: &#8212;<\/li>\n<li>Team: <a href=\"https:\/\/www.linkedin.com\/company\/amadeus-ai\/?originalSubdomain=br\" target=\"_blank\" rel=\"noopener\">Amadeus<\/a><\/li>\n<\/ul>\n<\/details>\n<details style=\"margin-bottom: 1.2em; padding: 0.8em 1em; border: 1px solid #ddd; border-radius: 8px; background-color: #f9f9f9;\">\n<summary style=\"font-weight: bold; color: #01484e; font-size: 1.1em; cursor: pointer; display: list-item;\">Am\u00e1lia<\/summary>\n<ul style=\"margin-top: 0; margin-bottom: 0.2em; padding-left: 1.2em;\">\n<li>Name: <a href=\"https:\/\/www.portugal.gov.pt\/pt\/gc24\/comunicacao\/noticia?i=modelo-de-linguagem-em-grande-escala-para-a-lingua-portuguesa\" target=\"_blank\" 
rel=\"noopener\"><strong>Am\u00e1lia<\/strong><\/a><\/li>\n<li>Release date: (future)<\/li>\n<li>License: (?)<\/li>\n<li>Portuguese variety: Portugal<\/li>\n<li>Size: 9B<\/li>\n<li>Base model: (?)<\/li>\n<li>Weights: (future)<\/li>\n<li>Variations: (?)<\/li>\n<li>Training data: (?)<\/li>\n<li>Data cutoff (training): (?)<\/li>\n<li>Associated API: (future)<\/li>\n<li>Online chat: &#8212;<\/li>\n<li>Team: <a href=\"https:\/\/www.base.gov.pt\/Base4\/pt\/resultados\/?type=doc_documentos&amp;id=2233342&amp;ext=.pdf\" target=\"_blank\" rel=\"noopener\">Government of Portugal<\/a>, <a href=\"https:\/\/www.fct.unl.pt\/\" target=\"_blank\" rel=\"noopener\">NOVA FCT<\/a>, <a href=\"https:\/\/tecnico.ulisboa.pt\/pt\/\" target=\"_blank\" rel=\"noopener\">IST-UL<\/a><\/li>\n<\/ul>\n<\/details>\n<details style=\"margin-bottom: 1.2em; padding: 0.8em 1em; border: 1px solid #ddd; border-radius: 8px; background-color: #f9f9f9;\">\n<summary style=\"font-weight: bold; color: #01484e; font-size: 1.1em; cursor: pointer; display: list-item;\">Amaz\u00f4nia IA<\/summary>\n<ul style=\"margin-top: 0; margin-bottom: 0.2em; padding-left: 1.2em;\">\n<li>Name: <a href=\"https:\/\/amazoniaia.com.br\/\" target=\"_blank\" rel=\"noopener\"><strong>Amaz\u00f4nia IA<\/strong><\/a><\/li>\n<li>Release date: 2024-08-05<\/li>\n<li>License: Proprietary<\/li>\n<li>Portuguese variety: Brazil<\/li>\n<li>Size: (confidential)<\/li>\n<li>Base model: (?)<\/li>\n<li>Weights: no<\/li>\n<li>Variations: &#8212;<\/li>\n<li>Training data: (?)<\/li>\n<li>Data cutoff (training): (?)<\/li>\n<li>Associated API: (future)<\/li>\n<li>Online chat: <a href=\"https:\/\/plataforma.amazoniaia.com.br\/\" target=\"_blank\" rel=\"noopener\">yes<\/a><\/li>\n<li>Team: <a href=\"https:\/\/www.widelabs.com.br\/\" target=\"_blank\" rel=\"noopener\">Widelabs, Oracle, NVIDIA<\/a><\/li>\n<\/ul>\n<\/details>\n<details style=\"margin-bottom: 1.2em; padding: 0.8em 1em; border: 1px solid #ddd; border-radius: 8px; background-color: 
#f9f9f9;\">\n<summary style=\"font-weight: bold; color: #01484e; font-size: 1.1em; cursor: pointer; display: list-item;\">Bode<\/summary>\n<ul style=\"margin-top: 0; margin-bottom: 0.2em; padding-left: 1.2em;\">\n<li>Name: <a href=\"https:\/\/arxiv.org\/abs\/2401.02909\" target=\"_blank\" rel=\"noopener\"><strong>Bode<\/strong><\/a><\/li>\n<li>Release date: 2023-10-11<\/li>\n<li>License: MIT<\/li>\n<li>Portuguese variety: Brazil<\/li>\n<li>Size: 7B and 13B<\/li>\n<li>Base model: Llama 2<\/li>\n<li>Weights: <a href=\"https:\/\/huggingface.co\/collections\/recogna-nlp\/bode-llm-em-portugues-65b97aa411162bf34f8da221\" target=\"_blank\" rel=\"noopener\">yes<\/a><\/li>\n<li>Variations: Bode-7b-alpaca-pt-br, Bode-13b-alpaca-pt-br<\/li>\n<li>Training data: <a href=\"https:\/\/www.tensorflow.org\/datasets\/catalog\/c4#c4multilingual\" target=\"_blank\" rel=\"noopener\">Portuguese subset of mC4<\/a><\/li>\n<li>Data cutoff (training): \u2265 2023-07<\/li>\n<li>Associated API: &#8212;<\/li>\n<li>Online chat: &#8212;<\/li>\n<li>Team: <a href=\"https:\/\/www.recogna.tech\/\" target=\"_blank\" rel=\"noopener\">Recogna<\/a><\/li>\n<\/ul>\n<\/details>\n<details style=\"margin-bottom: 1.2em; padding: 0.8em 1em; border: 1px solid #ddd; border-radius: 8px; background-color: #f9f9f9;\">\n<summary style=\"font-weight: bold; color: #01484e; font-size: 1.1em; cursor: pointer; display: list-item;\">Boto<\/summary>\n<ul style=\"margin-top: 0; margin-bottom: 0.2em; padding-left: 1.2em;\">\n<li>Name: <strong>Boto<\/strong><\/li>\n<li>Release date: 2024-07-11<\/li>\n<li>License: Apache 2.0<\/li>\n<li>Portuguese variety: Brazil<\/li>\n<li>Size: 9B<\/li>\n<li>Base model: Gemma 2 9B<\/li>\n<li>Weights: <a href=\"https:\/\/huggingface.co\/lucianosb\/boto-9B\/tree\/main\" target=\"_blank\" rel=\"noopener\">yes<\/a><\/li>\n<li>Variations: Boto 9B it<\/li>\n<li>Training data: <a href=\"https:\/\/huggingface.co\/datasets\/lucianosb\/cetacean-ptbr\" target=\"_blank\" rel=\"noopener\">Open-Orca and 
Dolphin (translated into Portuguese)<\/a><\/li>\n<li>Data cutoff (training): \u2265 2024-04<\/li>\n<li>Associated API: &#8212;<\/li>\n<li>Online chat: &#8212;<\/li>\n<li>Team: <a href=\"https:\/\/www.linkedin.com\/in\/lucianosb\" target=\"_blank\" rel=\"noopener\">Luciano Santa Br\u00edgida<\/a><\/li>\n<\/ul>\n<\/details>\n<details style=\"margin-bottom: 1.2em; padding: 0.8em 1em; border: 1px solid #ddd; border-radius: 8px; background-color: #f9f9f9;\">\n<summary style=\"font-weight: bold; color: #01484e; font-size: 1.1em; cursor: pointer; display: list-item;\">Canarim<\/summary>\n<details style=\"margin-left: 1.2em; margin-top: 0.6em;\">\n<summary style=\"font-weight: 600; color: #333; cursor: pointer; display: list-item;\">Canarim Instruct<\/summary>\n<ul style=\"margin-top: 0; margin-bottom: 0.2em; padding-left: 1.2em;\">\n<li>Name: <a href=\"https:\/\/nlp.rocks\/projects\/canarim-7b-instruct\" target=\"_blank\" rel=\"noopener\"><strong>Canarim Instruct<\/strong><\/a><\/li>\n<li>Release date: 2023-11-17<\/li>\n<li>License: Llama 2 Community License<\/li>\n<li>Portuguese variety: Brazil<\/li>\n<li>Size: 7B<\/li>\n<li>Base model: Llama 2<\/li>\n<li>Weights: <a href=\"https:\/\/huggingface.co\/dominguesm\/Canarim-7B-Instruct\">yes<\/a><\/li>\n<li>Variations: Canarim-7B-Instruct<\/li>\n<li>Training data: <a href=\"https:\/\/data.commoncrawl.org\/crawl-data\/CC-MAIN-2023-23\/index.html\" target=\"_blank\" rel=\"noopener\">CC-MAIN-2023-23<\/a><\/li>\n<li>Data cutoff (training): \u2265 2023-07<\/li>\n<li>Associated API: &#8212;<\/li>\n<li>Online chat: &#8212;<\/li>\n<li>Team:\u00a0<a href=\"https:\/\/github.com\/DominguesM\" target=\"_blank\" rel=\"noopener\">Maicon Domingues<\/a><\/li>\n<\/ul>\n<\/details>\n<details style=\"margin-left: 1.2em; margin-top: 0.6em;\">\n<summary style=\"font-weight: 600; color: #333; cursor: pointer; display: list-item;\">Canarim VestibulAide<\/summary>\n<ul style=\"margin-top: 0; margin-bottom: 0.2em; padding-left: 1.2em;\">\n<li>Name: 
<a href=\"https:\/\/nlp.rocks\/projects\/canarim-7b-vestibulaide\" target=\"_blank\" rel=\"noopener\"><strong>Canarim VestibulAide<\/strong><\/a><\/li>\n<li>Release date: 2023-11-17<\/li>\n<li>License: Llama 2 Community License<\/li>\n<li>Portuguese variety: Brazil<\/li>\n<li>Size: 7B<\/li>\n<li>Base model: Llama 2<\/li>\n<li>Weights: <a href=\"https:\/\/huggingface.co\/dominguesm\/canarim-7b-vestibulaide\" target=\"_blank\" rel=\"noopener\">yes<\/a><\/li>\n<li>Variations: Canarim-7B-VestibulAide<\/li>\n<li>Training data: <a href=\"https:\/\/nlp.rocks\/projects\/canarim-7b-vestibulaide\" target=\"_blank\" rel=\"noopener\">College entrance exam tests<\/a><\/li>\n<li>Data cutoff (training): \u2265 2023-07<\/li>\n<li>Associated API: &#8212;<\/li>\n<li>Online chat: &#8212;<\/li>\n<li>Team: <a href=\"https:\/\/github.com\/DominguesM\" target=\"_blank\" rel=\"noopener\">Maicon Domingues<\/a><\/li>\n<\/ul>\n<\/details>\n<\/details>\n<details style=\"margin-bottom: 1.2em; padding: 0.8em 1em; border: 1px solid #ddd; border-radius: 8px; background-color: #f9f9f9;\">\n<summary style=\"font-weight: bold; color: #01484e; font-size: 1.1em; cursor: pointer; display: list-item;\">Caramelo<\/summary>\n<ul style=\"margin-top: 0; margin-bottom: 0.2em; padding-left: 1.2em;\">\n<li>Name: <strong>Caramelo<\/strong><\/li>\n<li>Release date: 2023-06-09<\/li>\n<li>License: Apache 2.0<\/li>\n<li>Portuguese variety: Brazil<\/li>\n<li>Size: 7B<\/li>\n<li>Base model: Falcon 7B<\/li>\n<li>Weights: <a href=\"https:\/\/huggingface.co\/Bruno\/Caramelo_7B\/tree\/main\" target=\"_blank\" rel=\"noopener\">yes<\/a><\/li>\n<li>Variations: Caramelinho<\/li>\n<li>Training data: <a href=\"https:\/\/github.com\/gururise\/AlpacaDataCleaned\" target=\"_blank\" rel=\"noopener\">Cleaned Alpaca (automatically translated)<\/a><\/li>\n<li>Data cutoff (training): \u2265\u00a0 2023-06<\/li>\n<li>Associated API: &#8212;<\/li>\n<li>Online chat: &#8212;<\/li>\n<li>Team: <a href=\"https:\/\/github.com\/brunotech\" 
target=\"_blank\" rel=\"noopener\">Bruno Henrique<\/a><\/li>\n<\/ul>\n<\/details>\n<details style=\"margin-bottom: 1.2em; padding: 0.8em 1em; border: 1px solid #ddd; border-radius: 8px; background-color: #f9f9f9;\">\n<summary style=\"font-weight: bold; color: #01484e; font-size: 1.1em; cursor: pointer; display: list-item;\">Carvalho_pt-gl<\/summary>\n<ul style=\"margin-top: 0; margin-bottom: 0.2em; padding-left: 1.2em;\">\n<li>Name: <a href=\"https:\/\/doi.org\/10.1007\/978-3-031-73503-5_24\" target=\"_blank\" rel=\"noopener\"><strong>Carvalho_pt-gl<\/strong><\/a><\/li>\n<li>Release date: 2024-03 to 2025-03<\/li>\n<li>License: Llama 3.1 Community License<\/li>\n<li>Portuguese variety: Galicia and Portugal<\/li>\n<li>Size: 1.3B<\/li>\n<li>Base model: Cerebras-GPT<\/li>\n<li>Weights: <a href=\"https:\/\/huggingface.co\/collections\/Nos-PT\/carvalho-family-67e423bf209c732396377b61\" target=\"_blank\" rel=\"noopener\">yes<\/a><\/li>\n<li>Variations: Llama-Carvalho-PT-GL, Carvalho_pt-gl-1.3B, Llama-Carvalho-PT, Llama-Carvalho-GL<\/li>\n<li>Training data: <a href=\"https:\/\/github.com\/proxectonos\/corpora\" target=\"_blank\" rel=\"noopener\">CorpusNOS<\/a>, BNE-gl, <a href=\"http:\/\/arquivo.pt\" target=\"_blank\" rel=\"noopener\">Arquivo.pt<\/a><\/li>\n<li>Data cutoff (training): (?)<\/li>\n<li>Associated API: &#8212;<\/li>\n<li>Online chat: &#8212;<\/li>\n<li>Team: <a href=\"https:\/\/citius.gal\/\" target=\"_blank\" rel=\"noopener\">CiTIUS<\/a> and <a href=\"https:\/\/ilg.usc.gal\/\" target=\"_blank\" rel=\"noopener\">ILG-USC<\/a><\/li>\n<\/ul>\n<\/details>\n<details style=\"margin-bottom: 1.2em; padding: 0.8em 1em; border: 1px solid #ddd; border-radius: 8px; background-color: #f9f9f9;\">\n<summary style=\"font-weight: bold; color: #01484e; font-size: 1.1em; cursor: pointer; display: list-item;\">Cocoruta<\/summary>\n<details style=\"margin-left: 1.2em; margin-top: 0.6em;\">\n<summary style=\"font-weight: 600; color: #333; cursor: pointer; display: 
list-item;\">Cocoruta<\/summary>\n<ul style=\"margin-top: 0; margin-bottom: 0.2em; padding-left: 1.2em;\">\n<li>Name: <a href=\"https:\/\/doi.org\/10.1109\/IJCNN60899.2024.10650895\" target=\"_blank\" rel=\"noopener\"><strong>Cocoruta<\/strong><\/a><\/li>\n<li>Release date: 2023-10-28<\/li>\n<li>License: Llama 2 Community License<\/li>\n<li>Portuguese variety: Brazil<\/li>\n<li>Size: 7B<\/li>\n<li>Base model: Llama 2<\/li>\n<li>Weights: <a href=\"https:\/\/huggingface.co\/felipeoes\/cocoruta-7b\" target=\"_blank\" rel=\"noopener\">yes<\/a><\/li>\n<li>Variations: Cocoruta-7b<\/li>\n<li>Training data: <a href=\"https:\/\/huggingface.co\/datasets\/felipeoes\/cocoruta-evaluation\" target=\"_blank\" rel=\"noopener\">Brazilian environmental legislation<\/a><\/li>\n<li>Data cutoff (training): 2023<\/li>\n<li>Associated API: &#8212;<\/li>\n<li>Online chat: &#8212;<\/li>\n<li>Team: <a href=\"https:\/\/sites.usp.br\/keml\/\" target=\"_blank\" rel=\"noopener\">KEML-C4AI<\/a> (<a href=\"https:\/\/www.linkedin.com\/in\/felipeoes\/\" target=\"_blank\" rel=\"noopener\">Felipe de Oliveira Esp\u00edrito Santo<\/a>)<\/li>\n<\/ul>\n<\/details>\n<details style=\"margin-left: 1.2em; margin-top: 0.6em;\">\n<summary style=\"font-weight: 600; color: #333; cursor: pointer; display: list-item;\">Cocoruta 2<\/summary>\n<ul style=\"margin-top: 0; margin-bottom: 0.2em; padding-left: 1.2em;\">\n<li>Name: <strong>Cocoruta 2<\/strong><\/li>\n<li>Release date: 2025-02-10<\/li>\n<li>License: Llama 3.1 Community License<\/li>\n<li>Portuguese variety: Brazil<\/li>\n<li>Size: 8B<\/li>\n<li>Base model: Llama 3.1<\/li>\n<li>Weights: <a href=\"https:\/\/huggingface.co\/felipeoes\/cocoruta-2-8b\" target=\"_blank\" rel=\"noopener\">yes<\/a><\/li>\n<li>Variations: Cocoruta-2-8b<\/li>\n<li>Training data: <a href=\"https:\/\/huggingface.co\/collections\/felipeoes\/cocoruta-2-67e83faabe30b17cb4afb1bd\" target=\"_blank\" rel=\"noopener\">Brazilian environmental legislation<\/a><\/li>\n<li>Data cutoff 
(training): 2025<\/li>\n<li>Associated API: &#8212;<\/li>\n<li>Online chat: &#8212;<\/li>\n<li>Team:\u00a0<a href=\"https:\/\/sites.usp.br\/keml\/\" target=\"_blank\" rel=\"noopener\">KEML-C4AI<\/a> (Felipe de Oliveira Esp\u00edrito Santo)<\/li>\n<\/ul>\n<\/details>\n<\/details>\n<details style=\"margin-bottom: 1.2em; padding: 0.8em 1em; border: 1px solid #ddd; border-radius: 8px; background-color: #f9f9f9;\">\n<summary style=\"font-weight: bold; color: #01484e; font-size: 1.1em; cursor: pointer; display: list-item;\">Gaia<\/summary>\n<ul style=\"margin-top: 0; margin-bottom: 0.2em; padding-left: 1.2em;\">\n<li>Name: <a href=\"https:\/\/deepmind.google\/models\/gemma\/gemmaverse\/gaia\/\" target=\"_blank\" rel=\"noopener\"><strong>Gaia<\/strong><\/a><\/li>\n<li>Release date: 2024-05<\/li>\n<li>License: Gemma<\/li>\n<li>Portuguese variety: Brazil<\/li>\n<li>Size: 4B<\/li>\n<li>Base model: Gemma 3 4B<\/li>\n<li>Weights: <a href=\"https:\/\/huggingface.co\/CEIA-UFG\/Gemma-3-Gaia-PT-BR-4b-it\" target=\"_blank\" rel=\"noopener\">yes<\/a><\/li>\n<li>Variations: Gemma-3-Gaia-PT-BR-4b-it<\/li>\n<li>Training data: Scientific papers and Wikipedia<\/li>\n<li>Data cutoff (training): 2024-09<\/li>\n<li>Associated API: &#8212;<\/li>\n<li>Online chat: &#8212;<\/li>\n<li>Team: <a href=\"https:\/\/ceia.ufg.br\/\" target=\"_blank\" rel=\"noopener\">CEIA-UFG<\/a><\/li>\n<\/ul>\n<\/details>\n<details style=\"margin-bottom: 1.2em; padding: 0.8em 1em; border: 1px solid #ddd; border-radius: 8px; background-color: #f9f9f9;\">\n<summary style=\"font-weight: bold; color: #01484e; font-size: 1.1em; cursor: pointer; display: list-item;\">Gerv\u00e1sio<\/summary>\n<details style=\"margin-left: 1.2em; margin-top: 0.6em;\">\n<summary style=\"font-weight: 600; color: #333; cursor: pointer; display: list-item;\">Gerv\u00e1sio 7B PTBR<\/summary>\n<ul style=\"margin-top: 0; margin-bottom: 0.2em; padding-left: 1.2em;\">\n<li>Name: <a href=\"https:\/\/arxiv.org\/abs\/2402.18766\" target=\"_blank\" 
rel=\"noopener\"><strong>Gerv\u00e1sio 7B PTBR<\/strong><\/a><\/li>\n<li>Release date: 2024-02-28<\/li>\n<li>License: MIT<\/li>\n<li>Portuguese variety: Brazil<\/li>\n<li>Size: 7B<\/li>\n<li>Base model: Llama 2<\/li>\n<li>Weights: <a href=\"https:\/\/huggingface.co\/PORTULAN\/gervasio-7b-portuguese-ptbr-decoder\" target=\"_blank\" rel=\"noopener\">yes<\/a><\/li>\n<li>Variations: Gervasio-7b-portuguese-ptbr-decoder<\/li>\n<li>Training data: <a href=\"https:\/\/gluebenchmark.com\/tasks\" target=\"_blank\" rel=\"noopener\">GLUE<\/a>, <a href=\"https:\/\/super.gluebenchmark.com\/tasks\" target=\"_blank\" rel=\"noopener\">SuperGLUE<\/a><\/li>\n<li>Data cutoff (training): (?)<\/li>\n<li>Associated API: &#8212;<\/li>\n<li>Online chat: &#8212;<\/li>\n<li>Team: <a href=\"https:\/\/portulanclarin.net\/\" target=\"_blank\" rel=\"noopener\">Portulan Clarin<\/a><\/li>\n<\/ul>\n<\/details>\n<details style=\"margin-left: 1.2em; margin-top: 0.6em;\">\n<summary style=\"font-weight: 600; color: #333; cursor: pointer; display: list-item;\">Gerv\u00e1sio 8B PTPT<\/summary>\n<ul style=\"margin-top: 0; margin-bottom: 0.2em; padding-left: 1.2em;\">\n<li>Name: <a href=\"https:\/\/huggingface.co\/PORTULAN\/gervasio-8b-portuguese-ptpt-decoder\" target=\"_blank\" rel=\"noopener\"><strong>Gerv\u00e1sio 8B PTPT<\/strong><\/a><\/li>\n<li>Release date: 2025-06-11<\/li>\n<li>License: MIT<\/li>\n<li>Portuguese variety: Portugal<\/li>\n<li>Size: 8B<\/li>\n<li>Base model: Llama 3.1<\/li>\n<li>Weights: <a href=\"https:\/\/huggingface.co\/PORTULAN\/gervasio-8b-portuguese-ptpt-decoder\/tree\/main\" target=\"_blank\" rel=\"noopener\">yes<\/a><\/li>\n<li>Variations: Gervasio-8b-portuguese-ptpt-decoder<\/li>\n<li>Training data: extraGLUE-Instruct, MMLU PT, Natural Instructions, Wikipedia, Proverbs<\/li>\n<li>Data cutoff (training): (?)<\/li>\n<li>Associated API: &#8212;<\/li>\n<li>Online chat: <a href=\"https:\/\/evaristo.ai\/models\/Gerv%C3%A1sio%208B%20PTPT\" target=\"_blank\" 
rel=\"noopener\">yes<\/a><\/li>\n<li>Team: <a href=\"https:\/\/portulanclarin.net\/\" target=\"_blank\" rel=\"noopener\">Portulan Clarin<\/a><\/li>\n<\/ul>\n<\/details>\n<\/details>\n<details style=\"margin-bottom: 1.2em; padding: 0.8em 1em; border: 1px solid #ddd; border-radius: 8px; background-color: #f9f9f9;\">\n<summary style=\"font-weight: bold; color: #01484e; font-size: 1.1em; cursor: pointer; display: list-item;\">Gl\u00f3ria<\/summary>\n<ul style=\"margin-top: 0; margin-bottom: 0.2em; padding-left: 1.2em;\">\n<li>Name: <a href=\"https:\/\/aclanthology.org\/2024.propor-1.45\/\" target=\"_blank\" rel=\"noopener\"><strong>Gl\u00f3rIA<\/strong><\/a><\/li>\n<li>Release date: 2024-02-26<\/li>\n<li>License: ClueWeb22 Dataset License<\/li>\n<li>Portuguese variety: Portugal<\/li>\n<li>Size: 1.3B<\/li>\n<li>Base model: GPT-Neo<\/li>\n<li>Weights: <a href=\"https:\/\/huggingface.co\/NOVA-vision-language\/GlorIA-1.3B\" target=\"_blank\" rel=\"noopener\">yes<\/a><\/li>\n<li>Variations: GlorIA-1.3B<\/li>\n<li>Training data: ClueWeb22 PTPT, OSCAR PTPT, ArquivoPT, OpenSubtitles PTPT, PTWiki, EuroParl PTPT<\/li>\n<li>Data cutoff (training): (?)<\/li>\n<li>Associated API: &#8212;<\/li>\n<li>Online chat: &#8212;<\/li>\n<li>Team: <a href=\"https:\/\/www.fct.unl.pt\/\" target=\"_blank\" rel=\"noopener\">Researchers from NOVA FCT<\/a><\/li>\n<\/ul>\n<\/details>\n<details style=\"margin-bottom: 1.2em; padding: 0.8em 1em; border: 1px solid #ddd; border-radius: 8px; background-color: #f9f9f9;\">\n<summary style=\"font-weight: bold; color: #01484e; font-size: 1.1em; cursor: pointer; display: list-item;\">Juru<\/summary>\n<ul style=\"margin-top: 0; margin-bottom: 0.2em; padding-left: 1.2em;\">\n<li>Name: <a href=\"https:\/\/arxiv.org\/abs\/2403.18140v2\" target=\"_blank\" rel=\"noopener\"><strong>Juru<\/strong><\/a><\/li>\n<li>Release date: 2025-06-29<\/li>\n<li>License: (?)<\/li>\n<li>Portuguese variety: Brazil<\/li>\n<li>Size: 7B<\/li>\n<li>Base model: Mistral<\/li>\n<li>Weights: 
<a href=\"https:\/\/huggingface.co\/roseval\/Juru-7B\" target=\"_blank\" rel=\"noopener\">yes<\/a><\/li>\n<li>Variations: Juru-7B<\/li>\n<li>Training data: Academic papers in the Brazilian legal domain, LexML data, and documents from Brazil&#8217;s Supreme Federal Court (STF)<\/li>\n<li>Data cutoff (training): 2024<\/li>\n<li>Associated API: &#8212;<\/li>\n<li>Online chat: &#8212;<\/li>\n<li>Team: <a href=\"https:\/\/huggingface.co\/roseval\" target=\"_blank\" rel=\"noopener\">Roseval Malaquias Junior<\/a><\/li>\n<\/ul>\n<\/details>\n<details style=\"margin-bottom: 1.2em; padding: 0.8em 1em; border: 1px solid #ddd; border-radius: 8px; background-color: #f9f9f9;\">\n<summary style=\"font-weight: bold; color: #01484e; font-size: 1.1em; cursor: pointer; display: list-item;\">L\u00b3M<\/summary>\n<ul style=\"margin-top: 0; margin-bottom: 0.2em; padding-left: 1.2em;\">\n<li>Name: <a href=\"https:\/\/neuralmind.ai\/2025\/02\/20\/neuralmind-e-escavador-desenvolvem-l%C2%B3m-um-modelo-de-linguagem-inovador-para-o-sistema-juridico-brasileiro\/\" target=\"_blank\" rel=\"noopener\"><strong>L\u00b3M<\/strong><\/a><\/li>\n<li>Release date: (future)<\/li>\n<li>License: Proprietary<\/li>\n<li>Portuguese variety: Brazil<\/li>\n<li>Size: (?)<\/li>\n<li>Base model: (?)<\/li>\n<li>Weights: no<\/li>\n<li>Variations: (?)<\/li>\n<li>Training data: Escavador<\/li>\n<li>Data cutoff (training): (?)<\/li>\n<li>Associated API: &#8212;<\/li>\n<li>Online chat: &#8212;<\/li>\n<li>Team: <a href=\"https:\/\/neuralmind.ai\/\" target=\"_blank\" rel=\"noopener\">NeuralMind<\/a> and <a href=\"https:\/\/www.escavador.com\/\" target=\"_blank\" rel=\"noopener\">Escavador<\/a><\/li>\n<\/ul>\n<\/details>\n<details style=\"margin-bottom: 1.2em; padding: 0.8em 1em; border: 1px solid #ddd; border-radius: 8px; background-color: #f9f9f9;\">\n<summary style=\"font-weight: bold; color: #01484e; font-size: 1.1em; cursor: pointer; display: list-item;\">openCabrita<\/summary>\n<ul style=\"margin-top: 0; 
margin-bottom: 0.2em; padding-left: 1.2em;\">\n<li>Name: <a href=\"https:\/\/arxiv.org\/abs\/2308.11878\" target=\"_blank\" rel=\"noopener\"><strong>openCabrita<\/strong><\/a><\/li>\n<li>Release date: 2023-07-06<\/li>\n<li>License: Apache 2.0<\/li>\n<li>Portuguese variety: (?)<\/li>\n<li>Size: 3B<\/li>\n<li>Base model: Llama 1<\/li>\n<li>Weights: <a href=\"https:\/\/huggingface.co\/22h\/open-cabrita3b\" target=\"_blank\" rel=\"noopener\">yes<\/a><\/li>\n<li>Variations: Open-cabrita3b<\/li>\n<li>Training data: <a href=\"https:\/\/www.tensorflow.org\/datasets\/catalog\/c4#c4multilingual\" target=\"_blank\" rel=\"noopener\">Portuguese subset of mC4<\/a><\/li>\n<li>Data cutoff (training): \u2265 2022-09<\/li>\n<li>Associated API: (future)<\/li>\n<li>Online chat: &#8212;<\/li>\n<li>Team: 22h<\/li>\n<\/ul>\n<\/details>\n<details style=\"margin-bottom: 1.2em; padding: 0.8em 1em; border: 1px solid #ddd; border-radius: 8px; background-color: #f9f9f9;\">\n<summary style=\"font-weight: bold; color: #01484e; font-size: 1.1em; cursor: pointer; display: list-item;\">Periquito<\/summary>\n<ul style=\"margin-top: 0; margin-bottom: 0.2em; padding-left: 1.2em;\">\n<li>Name: <strong>Periquito<\/strong><\/li>\n<li>Release date: 2023-09-20<\/li>\n<li>License: Apache 2.0<\/li>\n<li>Portuguese variety: Unspecified<\/li>\n<li>Size: 3B<\/li>\n<li>Base model: LLaMa 3B<\/li>\n<li>Weights: <a href=\"http:\/\/huggingface.co\/wandgibaut\/periquito-3B\/tree\/main\" target=\"_blank\" rel=\"noopener\">yes<\/a><\/li>\n<li>Variations: Periquito 3B i1 GGUF, Periquito 3B GGUF<\/li>\n<li>Training data: <a href=\"https:\/\/huggingface.co\/wandgibaut\/periquito-3B\" target=\"_blank\" rel=\"noopener\">Wikipedia in Portuguese<\/a><\/li>\n<li>Data cutoff (training): (?)<\/li>\n<li>Associated API: &#8212;<\/li>\n<li>Online chat: &#8212;<\/li>\n<li>Team: <a href=\"https:\/\/www.linkedin.com\/in\/wandgibaut\/?locale=en_US\" target=\"_blank\" rel=\"noopener\">Wandemberg 
Gibaut<\/a><\/li>\n<\/ul>\n<\/details>\n<details style=\"margin-bottom: 1.2em; padding: 0.8em 1em; border: 1px solid #ddd; border-radius: 8px; background-color: #f9f9f9;\">\n<summary style=\"font-weight: bold; color: #01484e; font-size: 1.1em; cursor: pointer; display: list-item;\">Sabi\u00e1<\/summary>\n<details style=\"margin-left: 1.2em; margin-top: 0.6em;\">\n<summary style=\"font-weight: 600; color: #333; cursor: pointer; display: list-item;\">Sabi\u00e1-3<\/summary>\n<ul style=\"margin-top: 0; margin-bottom: 0.2em; padding-left: 1.2em;\">\n<li>Name: <a href=\"https:\/\/chat.maritaca.ai\/\" target=\"_blank\" rel=\"noopener\"><strong>Sabi\u00e1-3<\/strong><\/a><\/li>\n<li>Release date: 2024-04<\/li>\n<li>License: Proprietary<\/li>\n<li>Portuguese variety: Brazil<\/li>\n<li>Size: (confidential)<\/li>\n<li>Base model: (confidential)<\/li>\n<li>Weights: no<\/li>\n<li>Variations: Sabia-3, Sabiazinho-3<\/li>\n<li>Training data: Public data from the internet<\/li>\n<li>Data cutoff (training): 2023<\/li>\n<li>Associated API: <a href=\"https:\/\/docs.maritaca.ai\/pt\/visao-geral\" target=\"_blank\" rel=\"noopener\">requires payment<\/a><\/li>\n<li>Online chat: <a href=\"https:\/\/chat.maritaca.ai\/\" target=\"_blank\" rel=\"noopener\">yes<\/a><\/li>\n<li>Team: <a href=\"https:\/\/www.maritaca.ai\/\" target=\"_blank\" rel=\"noopener\">Maritaca AI<\/a><\/li>\n<\/ul>\n<\/details>\n<details style=\"margin-left: 1.2em; margin-top: 0.6em;\">\n<summary style=\"font-weight: 600; color: #333; cursor: pointer; display: list-item;\">Sabi\u00e1-3.1<\/summary>\n<ul style=\"margin-top: 0; margin-bottom: 0.2em; padding-left: 1.2em;\">\n<li>Name: <a href=\"https:\/\/arxiv.org\/abs\/2410.12049\" target=\"_blank\" rel=\"noopener\"><strong>Sabi\u00e1-3.1<\/strong><\/a><\/li>\n<li>Release date: 2025-05<\/li>\n<li>License: Proprietary<\/li>\n<li>Portuguese variety: Brazil<\/li>\n<li>Size: (confidential)<\/li>\n<li>Base model: (confidential)<\/li>\n<li>Weights: 
no<\/li>\n<li>Variations: Sabia-3.1<\/li>\n<li>Training data: Public data from the internet<\/li>\n<li>Data cutoff (training): 2024-08<\/li>\n<li>Associated API: <a href=\"https:\/\/docs.maritaca.ai\/pt\/visao-geral\" target=\"_blank\" rel=\"noopener\">requires payment<\/a><\/li>\n<li>Online chat: <a href=\"https:\/\/chat.maritaca.ai\/\" target=\"_blank\" rel=\"noopener\">yes<\/a><\/li>\n<li>Team: <a href=\"https:\/\/www.maritaca.ai\/\" target=\"_blank\" rel=\"noopener\">Maritaca AI<\/a><\/li>\n<\/ul>\n<\/details>\n<details style=\"margin-left: 1.2em; margin-top: 0.6em;\">\n<summary style=\"font-weight: 600; color: #333; cursor: pointer; display: list-item;\">Sabi\u00e1-7B<\/summary>\n<ul style=\"margin-top: 0; margin-bottom: 0.2em; padding-left: 1.2em;\">\n<li>Name: <a href=\"https:\/\/huggingface.co\/maritaca-ai\/sabia-7b\" target=\"_blank\" rel=\"noopener\"><strong>Sabi\u00e1-7B<\/strong><\/a><\/li>\n<li>Release date: 2023-11-08<\/li>\n<li>License: LLaMA License<\/li>\n<li>Portuguese variety: Brazil<\/li>\n<li>Size: 7B<\/li>\n<li>Base model: Llama 1<\/li>\n<li>Weights: <a href=\"https:\/\/huggingface.co\/maritaca-ai\/sabia-7b\/tree\/main\" target=\"_blank\" rel=\"noopener\">yes<\/a><\/li>\n<li>Variations: Sabia-7b<\/li>\n<li>Training data: ClueWeb22<\/li>\n<li>Data cutoff (training): 2022<\/li>\n<li>Associated API: &#8212;<\/li>\n<li>Online chat: <a href=\"https:\/\/evaristo.ai\/models\/Sabi%C3%A1%207B\" target=\"_blank\" rel=\"noopener\">yes<\/a><\/li>\n<li>Team: <a href=\"https:\/\/www.maritaca.ai\/\" target=\"_blank\" rel=\"noopener\">Maritaca AI<\/a><\/li>\n<\/ul>\n<\/details>\n<\/details>\n<details style=\"margin-bottom: 1.2em; padding: 0.8em 1em; border: 1px solid #ddd; border-radius: 8px; background-color: #f9f9f9;\">\n<summary style=\"font-weight: bold; color: #01484e; font-size: 1.1em; cursor: pointer; display: list-item;\">Samba<\/summary>\n<ul style=\"margin-top: 0; margin-bottom: 0.2em; padding-left: 1.2em;\">\n<li>Name: 
<strong>Samba<\/strong><\/li>\n<li>Release date: (?)<\/li>\n<li>License: Academic Free License 3.0<\/li>\n<li>Portuguese variety: Unspecified<\/li>\n<li>Size: 1B<\/li>\n<li>Base model: TinyLlama-1.1B<\/li>\n<li>Weights: <a href=\"https:\/\/huggingface.co\/lrds-code\/boana-7b-instruct\/tree\/main\" target=\"_blank\" rel=\"noopener\">yes<\/a><\/li>\n<li>Variations: Samba 1.1B GGUF and Angelinis\/Outputs<\/li>\n<li>Training data: (?)<\/li>\n<li>Data cutoff (training): (?)<\/li>\n<li>Associated API: &#8212;<\/li>\n<li>Online chat: &#8212;<\/li>\n<li>Team: <a href=\"https:\/\/www.linkedin.com\/in\/leonardo-souza-289a16b3\/\" target=\"_blank\" rel=\"noopener\">Leonardo Souza<\/a><\/li>\n<\/ul>\n<\/details>\n<details style=\"margin-bottom: 1.2em; padding: 0.8em 1em; border: 1px solid #ddd; border-radius: 8px; background-color: #f9f9f9;\">\n<summary style=\"font-weight: bold; color: #01484e; font-size: 1.1em; cursor: pointer; display: list-item;\">SoberanIA<\/summary>\n<ul style=\"margin-top: 0; margin-bottom: 0.2em; padding-left: 1.2em;\">\n<li>Name: <a href=\"https:\/\/soberania.ai\/\" target=\"_blank\" rel=\"noopener\"><strong>SoberanIA<\/strong><\/a><\/li>\n<li>Release date: (future)<\/li>\n<li>License: (?)<\/li>\n<li>Portuguese variety: Brazil<\/li>\n<li>Size: (?)<\/li>\n<li>Base model: (?)<\/li>\n<li>Weights: (future)<\/li>\n<li>Variations: (?)<\/li>\n<li>Training data: (?)<\/li>\n<li>Data cutoff (training): (?)<\/li>\n<li>Associated API: (future)<\/li>\n<li>Online chat: (future)<\/li>\n<li>Team: <a href=\"https:\/\/www.pi.gov.br\/\" target=\"_blank\" rel=\"noopener\">Government of Piau\u00ed<\/a><\/li>\n<\/ul>\n<\/details>\n<details style=\"margin-bottom: 1.2em; padding: 0.8em 1em; border: 1px solid #ddd; border-radius: 8px; background-color: #f9f9f9;\">\n<summary style=\"font-weight: bold; color: #01484e; font-size: 1.1em; cursor: pointer; display: list-item;\">Tucano<\/summary>\n<ul style=\"margin-top: 0; margin-bottom: 0.2em; padding-left: 1.2em;\">\n<li>Name: <a 
href=\"https:\/\/arxiv.org\/abs\/2411.07854\" target=\"_blank\" rel=\"noopener\"><strong>Tucano<\/strong><\/a><\/li>\n<li>Release date: 2024-11-07<\/li>\n<li>License: Apache 2.0<\/li>\n<li>Portuguese variety: Brazil<\/li>\n<li>Size: 0.16B to 2B<\/li>\n<li>Base model: (?)<\/li>\n<li>Weights: <a href=\"https:\/\/huggingface.co\/TucanoBR\/models\" target=\"_blank\" rel=\"noopener\">yes<\/a><\/li>\n<li>Variations: Tucano-1b1-Instruct, Tucano-2b4-Instruct<\/li>\n<li>Training data: GigaVerbo<\/li>\n<li>Data cutoff (training): (?)<\/li>\n<li>Associated API: &#8212;<\/li>\n<li>Online chat: &#8212;<\/li>\n<li>Team: <a href=\"https:\/\/huggingface.co\/TucanoBR\" target=\"_blank\" rel=\"noopener\">Tucano Project<\/a><\/li>\n<\/ul>\n<\/details>\n<p>Contributors:<\/p>\n<ul>\n<li>Vin\u00edcius Bitencourt Matos<\/li>\n<li>Arnaldo Candido Junior<\/li>\n<li>Filipe Bison de Souza<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>List of models optimized for Portuguese (both Brazilian and European varieties), focusing on architectures with at least 1 billion parameters. When appropriate, the models are grouped into \u201cfamilies\u201d. 
The information presented for each model includes: name, release date, license, Portuguese variety, size, base model, weights, variations, training data, data cutoff date, associated API, online chat,<a href=\"https:\/\/sites.usp.br\/keml\/en\/language-models-for-portuguese\/\">[&#8230;]<\/a><\/p>\n","protected":false},"author":24022,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"inline_featured_image":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":"","_links_to":"","_links_to_target":""},"class_list":["post-2574","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/sites.usp.br\/keml\/wp-json\/wp\/v2\/pages\/2574","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sites.usp.br\/keml\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/sites.usp.br\/keml\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/sites.usp.br\/keml\/wp-json\/wp\/v2\/users\/24022"}],"replies":[{"embeddable":true,"href":"https:\/\/sites.usp.br\/keml\/wp-json\/wp\/v2\/comments?post=2574"}],"version-history":[{"count":2,"href":"https:\/\/sites.usp.br\/keml\/wp-json\/wp\/v2\/pages\/2574\/revisions"}],"predecessor-version":[{"id":2579,"href":"https:\/\/sites.usp.br\/keml\/wp-json\/wp\/v2\/pages\/2574\/revisions\/2579"}],"wp:attachment":[{"href":"https:\/\/sites.usp.br\/keml\/wp-json\/wp\/v2\/media?parent=2574"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}