
LatinX in
Natural Language Processing
Research Workshop at
NAACL 2024
June 16th, 2024 Mexico City, Mexico
* At least a tutorial only registration is required.
LatinXinAI @ NAACL24 -
LatinX in NLP (LXNLP) is hosting a hybrid workshop to connect LatinX-identifying AI researchers and engineers at the conference of the North American Chapter of the Association for Computational Linguistics (NAACL). The workshop takes place on June 16th, 2024.
Thanks everyone for making this workshop a great experience!
PROGRAM
Start | | End | Activity |
08:30 | - | | Check-in |
08:45 | - | 09:00 | Opening remarks |
09:00 | - | 09:45 | Keynote - Mariano Felice |
NLP for language assessment: where are we now and where are we going?
09:45 | - | 10:00 | Break |
10:00 | - | 11:00 | Oral Presentations |
10:00 | - | 10:10 | - An empirical study of Definition Modeling with LLMs for the main languages of Latin America |
10:10 | - | 10:20 | - Computational Resources for Indigenous Languages Spoken in Argentina: An Introductory Survey |
10:20 | - | 10:30 | - Understanding Toxicity and Sentiment Dynamics in Social Media: LLM Analysis of Diverse and Focused Interest Users |
10:30 | - | 10:40 | - Change My Frame: Reframing in the Wild in r/ChangeMyView |
10:40 | - | 10:50 | - A big ant or a small elephant: metaphor interpretation on large language models |
10:50 | - | 11:00 | - Identification of climate change and sustainability texts using pre-trained Spanish language models |
11:00 | - | 11:10 | Break |
11:10 | - | 12:00 | Poster Session |
- Mayasoundex: A Phonetically Grounded Algorithm for Information Retrieval in the Maya Language
- A Cross-linguistic Examination of Language Complexity in South American Indigenous Languages
- Challenging Linguistic and Cultural Diversity: Evaluation of AI Models in the Detection of Hate Speech in Brazilian Social Networks
- Towards Portuguese Hate Speech Detection with Transformers
- Detection of hate speech and inappropriate content in Mexican Spanish Memes
- Social Biases in Models Trained with Chilean Corpus
- Voces de Latinoamérica: Un sistema TTS de bajo recursos para múltiples acentos
- Learning how to cook healthy using LLMs, supervised fine-tuning and Retrieval Augmented Generation
12:00 | - | 13:45 | Lunch - LXAI Social |
13:45 | - | 14:30 | Keynote - Jocelyn Dunstan |
Clinical NLP: uses and challenges
14:30 | - | 14:40 | Break |
14:40 | - | 15:30 | Oral Presentations |
14:40 | - | 14:50 | - On the use of Multimodal Attention for Questionable Content Detection in Videos |
14:50 | - | 15:00 | - Detecting correct answers to open questions and its impact on language models' confidence scores |
15:00 | - | 15:10 | - A Spanish-language Dataset of Twitter Conversations for Stance Detection |
15:10 | - | 15:20 | - Towards Improved RAC Accessibility: Dataset and LLMs, approach to enhancing RAC accessibility |
15:20 | - | 15:30 | - Sequence-to-Sequence Spanish Pre-trained Language Models |
15:30 | - | 15:35 | Break |
15:35 | - | 16:40 | Poster Session |
- Multimodal Learning for Hate Speech Detection in Videos for Mexican Spanish
- On the Challenges of Creating Datasets for Analyzing Commercial Sex Advertisements to Assess Human Trafficking Risk and Organized Activity
- Depression Detection through Phrase Generation with ChatGPT-3 in Clinical Interviews
- Eigenpruning
- Exploring the Trade-off Between Model Performance and Explanation Plausibility of Text Classifiers Using Human Rationales
- Proyecto #Somos600M: Generación de recursos que representen la riqueza de las lenguas de LATAM, El Caribe y España
- Efficacy of ByteT5 in Multilingual Translation of Biblical Texts for Underrepresented Languages
- Cross-Linguistic Framing Analysis: Unveiling Political and Cultural Narratives in Spanish-Language News
16:40 | - | 17:30 | Networking Session |
17:30 | - | 17:45 | Closing remarks - Best Paper Awards |
19:00 | - | | NAACL Welcome Reception |
KEYNOTE SPEAKERS
Mariano Felice
Dr Mariano Felice is a Senior Researcher and Data Scientist for Language Assessment and Learning at the British Council. His work involves looking at how artificial intelligence (AI) and natural language processing (NLP) can be used to improve language learning and assessment, from mining datasets and building models to supporting colleagues in the adoption of new technology. Mariano is also a Visiting Researcher at the University of Cambridge, where he completed his PhD in Computer Science and worked as a Research Associate before joining the British Council. Mariano has published extensively in top-tier NLP conferences and is a frequent speaker at international conferences and reviewer for workshops, journals and conferences in his field.
Jocelyn Dunstan Escudero
Prof. Jocelyn Dunstan Escudero is an academic at the Catholic University of Chile in a joint appointment between the Department of Computer Science and the Institute for Mathematical and Computational Engineering. She holds a master's in Physics and a Ph.D. in applied mathematics, and completed a postdoc in public health. Prof. Dunstan leads the clinical natural language processing group in Chile. In particular, the group uses NLP methods to extract critical information and support decision-making. She also leads applied projects that leverage information extracted from free text, with a gender perspective and privacy-preserving approaches. She has a podcast called "ciencia de datos". She enjoys discussing work-life balance in academia, the need for more diversity in science, and the importance of interdisciplinary research.
IMPORTANT DATES
Submission open: February 23, 2024
Submission deadline: April 10, 2024 (previously March 29, 2024)
Notification of acceptance: April 26, 2024 (previously April 15, 2024)
Camera-ready submissions: May 14, 2024
Volunteer application deadline: April 30, 2024
Link for volunteer applications: https://forms.gle/qxTmN4LLbLE13cDu8
Grant application deadline: May 3, 2024
Link for grant applications: https://forms.gle/b1FV7SoZuXMgUg9h6
Hotel Hilton Reforma, Salon Don Genaro
Doña Julita, Av. Hidalgo 85, CDMX, Mexico
https://maps.app.goo.gl/6WHfeEQtQpqp6bwX7
Reclaim your tickets at the workshop registration counter.
*All deadlines are 23:59:59 AoE (UTC-12)
FINANCIAL ASSISTANCE
LatinX in AI is committed to supporting LatinX & Hispanic individuals from all around the world. This year we want to ensure that our workshop is accessible to everyone, no matter where they live. Consequently, you have three (3) options to finance your participation at our workshop:
Discounted Registration NAACL2024 for Latin American Researchers in the main conference
NAACL is offering 15 discounted registrations at the student rate for Latin American (LATAM) researchers. Eligible applicants should reside in a LATAM country and be affiliated with a LATAM institution. Note that the discounts are offered only to researchers (e.g., postdocs, professors, industrial or government researchers and engineers), not students. Eligible applicants should plan to attend NAACL in person.
Application: Fill out this form: https://forms.office.com/r/u99ERzYcN7
Deadline: April 18th, 2024
Notification: April 28th, 2024
For more information please visit: Discounted Registration for Latin American Researchers - NAACL-HLT 2024
NAACL Diversity and Inclusion Subsidies
NAACL 2024 is providing D&I funds for registration, caregiving, bandwidth, travel and VPN subsidies. We strongly encourage researchers to apply for both subsidies and volunteering opportunities to maximise their chances of getting their registration fees waived.
Application: Fill out this form: https://forms.office.com/r/H3c54ZvxEi
Deadline: April 18th, 2024
Notification: April 23rd, 2024
For more information visit: Call for NAACL Diversity and Inclusion Subsidies - NAACL-HLT 2024
LatinX in AI grants
We are happy to provide registration fee support and travel grants to help those with financial need. If you believe you need financial assistance to cover your travel expenses or to register virtually for NAACL, please make sure to fill out these applications as accurately and truthfully as possible.
Opening: April 22, 2024
Deadline: May 3, 2024
Notification: May 17, 2024
CALL FOR VOLUNTEERS
Join the amazing community of LatinX in AI doing Natural Language Processing by registering as a volunteer for the upcoming workshop co-located with NAACL 2024 in Mexico City, Mexico.
As a volunteer, you can choose from a variety of engaging roles, from moderating Q&A sessions to assisting with poster presentations and social media engagement. Plus, we understand the value of your time and commitment, which is why we will prioritize our volunteers when awarding registration grants. Do not miss out on this opportunity to contribute your skills and passion to our workshop.
Volunteer Application Deadline: April 25, 2024
Volunteer Notification Date: May 17, 2024
Call For Papers
We welcome extended abstracts that may introduce new theories, methodology, or applications of NLP. We also welcome position papers and demos. Work may be previously published, completed, or ongoing. Submissions will be peer-reviewed by at least 2 reviewers in the area. Specifically, we allow two types of submissions:
Archival: Must be blind for the double-blind review process. Accepted works will be published in the Journal of LatinX in AI Research as proceedings.
Non-archival: May be submitted to any venue in the future. Previously published work can also be submitted as non-archival, with the additional requirement to state the original publication source on the first page.
The two (2) best accepted papers of the workshop will be awarded a travel scholarship to attend KHIPU 2025 (Santiago, Chile).
Specific topics include, but are not limited to:
Dialogue and Interactive Systems
Discourse and Pragmatics
Efficient Methods for NLP
Ethics and NLP
Generation
Information Extraction
Information Retrieval and Text Mining
Interpretability and Analysis of Models for NLP
Language Grounding to Vision, Robotics and Beyond
Linguistic theories, Cognitive Modeling and Psycholinguistics
Machine Learning for NLP
Machine Translation and Multilinguality
NLP Applications
Phonology, Morphology and Word Segmentation
Question Answering
Resources and Evaluation
Semantics: Lexical
Semantics: Sentence-level Semantics, Textual Inference and Other areas
Sentiment Analysis, Stylistic Analysis, and Argument Mining
Speech and Multimodality
Summarization
Syntax: Tagging, Chunking and Parsing
Submissions will be double-blind peer-reviewed and should be submitted as a single PDF file of up to 3 pages, excluding references. Submissions should strictly follow the formatting guidelines (excluding paper length) provided by NAACL 2024 to avoid the risk of being rejected without consideration of their merits. Please follow the ACL Style Template for your submission. Submissions must state the research problem, motivation, and technical contribution. All submissions must be self-contained and include all figures, tables, and references.
The submission deadlines for the full papers and extended abstracts, as well as other important dates, are given in the IMPORTANT DATES section. Please note that no extensions will be offered for submissions.
If you have any questions, please email lxnlp2024@latinxinai.org
NOTE: Works may be submitted in English, Spanish or Portuguese. We will assist authors in translating their accepted work into English (translated works will appear in both languages in the proceedings).
Mission
This affinity workshop is aimed at LatinX individuals working on or interested in Computational Linguistics, with the goal of increasing the visibility of researchers of LatinX origin in a field that has been dominated by countries such as China, USA and Germany.
Those already working on Computational Linguistics will have the opportunity to connect with fellow LatinX and make their own work known, while those new to the field will benefit from the scientific exchange, guidance and advice of researchers with their same background. Participants will be able to engage in discussions about Computational Linguistics (formal and informal) and to share their thoughts on how to increase the presence of LatinX in Computational Linguistics.
Diversity and inclusion are key to achieving better and more creative science. By promoting this event, NAACL will not only advocate for an underrepresented community but will also help to promote the development of technologies and language resources that take into account the different languages that exist in Latin America.
MENTORING PROGRAM
LatinX in AI is hosting a mentoring program alongside our official workshops. The LatinX in AI Mentoring Program requires mentors and mentees to meet once a month. On the day of the workshop, some mentees will be asked to share their experiences and what they learned from the program.
Mentoring Program Notification: TBA
Mentoring Program End Date: TBA
ACCEPTED PAPERS
Title | Authors | Presentation Type |
Mayasoundex: A Phonetically Grounded Algorithm for Information Retrieval in the Maya Language | Alejandro Molina-Villegas (Conacyt-CentroGeo)* | Accept - POSTER |
On the use of Multimodal Attention for Questionable Content Detection in Videos | Arnold Morales (Instituto Nacional de Astrofísica, Óptica y Electrónica)*; Elaheh Baharlouei (University of Houston); Thamar Solorio (University of Houston, USA); Hugo Jair Escalante (INAOE) | Accept - ORAL |
A Cross-linguistic Examination of Language Complexity in South American Indigenous Languages | Felipe R. Serras (Institute of Mathematics and Statistics, University of São Paulo)*; Miguel de Mello carpi (Institute of Mathematics and Statistics, University of São Paulo); Matheus Castello Branco Lima de Araujo (Institute of Mathematics and Statistics, University of São Paulo); Marcelo Finger (University of Sao Paulo) | Accept - POSTER |
Depression Detection through Phrase Generation with ChatGPT-3 in Clinical Interviews | Karla M Valencia (Instituto Nacional de Astrofísica, Óptica y Electrónica)*; Hugo Jair Escalante (INAOE); Luis Villaseñor (INAOE) | Accept - POSTER |
Multimodal Learning for Hate Speech Detection in Videos for Mexican Spanish | Itzel Tlelo (INAOE)*; Hugo Jair Escalante (INAOE) | Accept - POSTER |
Understanding Toxicity and Sentiment Dynamics in Social Media: LLM Analysis of Diverse and Focused Interest Users | Abi Oppenheim (ICC)*; Federico Albanese (University of Buenos Aires); Esteban Feuerstein (FCEyN-UBA) | Accept - ORAL |
Challenging Linguistic and Cultural Diversity: Evaluation of AI Models in the Detection of Hate Speech in Brazilian Social Networks | Annie Amorim (Federal Fluminense University - UFF)*; Gabriel Assis (IC/UFF); Jonnathan Carvalho (Department of Informatics/Instituto Federal Fluminense); Daniela Vianna (Universidade Federal Fluminense); Daniel Oliveira (UFF, Brazil); Mariza Ferro (Federal Fluminense University - UFF); Aline Paes (Institute of Computing / Universidade Federal Fluminense) | Accept - POSTER |
Towards Portuguese Hate Speech Detection with Transformers | Gabriel Assis (IC/UFF)*; Annie Amorim (Federal Fluminense University - UFF); Jonnathan Carvalho (Department of Informatics / Instituto Federal Fluminense); Daniel Oliveira (UFF, Brazil); Daniela Vianna (Universidade Federal Fluminense); Aline Paes (Institute of Computing / Universidade Federal Fluminense) | Accept - POSTER |
An empirical study of Definition Modeling with LLMs for the main languages of Latin America | Erica Kido Shimomoto (National Institute of Advanced Industrial Science and Technology)*; Edison Marrese-Taylor (National Institute of Advanced Industrial Science and Technology (AIST)); Enrique A Reid (University of Tokyo) | Accept - ORAL |
Eigenpruning | Tomás Vergara Browne (PUC)*; Álvaro Soto (PUC); Akiko Aizawa (National Institute of Informatics) | Accept - POSTER |
Sequence-to-Sequence Spanish Pre-trained Language Models | Vladimir Araujo (KU Leuven)*; Maria Trusca (KU Leuven); Rodrigo Tufiño (Universidad Politécnica Salesiana); Sien Moens (KU Leuven) | Accept - ORAL |
A Spanish-language Dataset of Twitter Conversations for Stance Detection | Leo Ramos (Computer Vision Center, Universitat Autònoma de Barcelona); Mike Bermeo (Yachay Tech University)*; Silvana Escobar (Yachay Tech University); Diego Morales-Navarrete (Yachay Tech University); Erick Cuenca (Yachay Tech) | Accept - POSTER |
On the Challenges of Creating Datasets for Analyzing Commercial Sex Advertisements to Assess Human Trafficking Risk and Organized Activity | Pablo Rivas (Baylor University)*; Tomas Cerny (Baylor University); Alejandro Rodriguez Perez (Baylor University); Javier S Turek (Intel Labs); Laurie Giddens (University of North Texas); Gisela Bichler (California State University, San Bernardino); Stacie Petter (Wake Forest University) | Accept - POSTER |
Identificación de textos relacionados al cambio climático y sustentabilidad utilizando modelos de lenguaje preentrenados en español | Gerardo Huerta (UNI)*; Gabriela Zuñiga Rojas (UNSAAC) | Accept - ORAL |
Aprendiendo a cocinar de manera saludable con Large Language Models, Supervised Fine Tuning y Retrieval Augmented Generation | Andrea Morales-Garzón (University of Granada)*; Sara Benel Ramirez (Independent Researcher); Gabriel Tuco Casquino (UCSM); Oscar A. Rocha (Independent Researcher); Alberto Medina (UPM-ETSI) | Accept - POSTER |
Detection of hate speech and inappropriate content in Mexican Spanish Memes | Horacio Jarquín (INAOE)*; Itzel Tlelo (INAOE); Marco Casavantes (Instituto Nacional de Astrofísica, Óptica y Electrónica (INAOE)) | Accept - POSTER |
Towards Improved RAC Accessibility: Dataset and LLMs, approach to enhancing RAC accessibility | Edison Jair Bejarano Sepulveda (Universidad de Barcelona)*; Hector Nicolai Potes Patiño (Fundación Universitaria Los Libertadores); Santiago Pineda Montoya (UNAL); Felipe Ivan Rodriguez (Fundación Universitaria Los Libertadores); Jaime Enrique Orduy (Fundación Universitaria Los Libertadores); Danny Stevens Traslaviña (Fundación Universitaria Los Libertadores); Alec Mauricio Rosales (Fundación Universitaria Los Libertadores); Sergio Nicolás Madrid (Fundación Universitaria Los Libertadores) | Accept - ORAL |
Exploring the Trade-off Between Model Performance and Explanation Plausibility of Text Classifiers Using Human Rationales | Lucas E Resck (Fundação Getulio Vargas)*; Marcos M. Raimundo (University of Campinas); Jorge Poco (FGV, Brazil) | Accept - POSTER |
A big ant or a small elephant: metaphor interpretation on large language models | Luiz Matos (Universidade Federal Fluminense)*; Aline Paes (Institute of Computing / Universidade Federal Fluminense) | Accept - POSTER |
Social Biases in Models Trained with Chilean Corpus | Tamara Quiroga (Pontificia Universidad Católica de Chile)*; Jocelyn Dunstan (Pontificia Universidad Católica de Chile) | Accept - POSTER |
Cross-Linguistic Framing Analysis: Unveiling Political and Cultural Narratives in Spanish-Language News | Juan Cuadrado (Universidad Tecnologica de Bolivar)*; Elizabeth Martinez (Universidad Tecnologica de Bolivar); Edwin Puertas (Universidad Tecnológica Bolívar); Juan Carlos Martinez-Santos (Universidad Tecnológica Bolívar) | Accept - POSTER |
Detecting correct answers to open questions and its impact on language models' confidence scores | Guido Ivetta (Universidad Nacional de Córdoba)*; Hernán Maina (CONICET); Luciana Benotti (Universidad Nacional de Cordoba) | Accept - ORAL |
Efficacy of ByteT5 in Multilingual Translation of Biblical Texts for Underrepresented Languages | Jason Wu (Baylor University); Colton Wismer (Baylor University); Zhaoyu Wang (Baylor University); Xiaokan Tian (Baylor University); Lauren Adams (Baylor University); Corinne Aars (Baylor University); Pablo Rivas (Baylor University)*; Korn Sooksatra (Baylor University); Matthew Fendt (Baylor University) | Accept - POSTER |
Voces de Latinoamérica: Un sistema TTS de bajo recursos para múltiples acentos | Jefferson Quispe (vozy)* | Accept - POSTER |
Suitability in Combining Structural and Content Features for Early Detection of Alzheimer’s Disease | Carlos Olachea (Instituto Nacional de Astrofísica, Óptica y Electrónica)* | Accept - ORAL |
Change My Frame: Reframing in the Wild in r/ChangeMyView | Arturo MP (NAIST)* | Accept - ORAL |
Computational Resources for Indigenous Languages Spoken in Argentina: An Introductory Survey | María Belén Ticona (University of Buenos Aires)*; Fernando Carranza (Universidad de Buenos Aires); Cotik Viviana (Universidad de Buenos Aires) | Accept - ORAL |
Proyecto #Somos600M: Generación de recursos que representen la riqueza de las lenguas de LATAM, El Caribe y España | María Grandury (SomosNLP)* | Accept - POSTER |
PROGRAM COMMITTEE
Reviewer | Affiliation |
Jaime Acevedo Viloria | LXAI |
Fernando Alva-Manchego | Cardiff University |
Vladimir Araujo | KU Leuven |
Jorge Arraut | AcrossA |
Juan Banda | Stanford University |
CJ Barberan | Rice University |
Rubenia Borge | University of North Texas |
Juan Cardenas Cartagena | University of Groningen |
Jose Cordova-Garcia | ESPOL |
Daniela Cortes Bermudez | LatinX in AI (LXAI) |
Oscar Cumbicus Pineda | Universidad Nacional de Loja |
Mariela De Lucas Alvarez | German Research Center for Artificial Intelligence |
Mateo Espinosa Zarlenga | University of Cambridge |
Diana Galvan-Sosa | University of Cambridge |
Yuan Gao | University of Cambridge |
Gabrielle Gaudeau | University of Cambridge |
Miguel Gonzalez-Mendoza | Tecnologico de Monterrey |
Andrew Hamara | Baylor University |
Julio Hurtado | University of Warwick |
Pride Kavumba | Tohoku University |
Bikram Khanal | Baylor University |
Jinqi Luo | University of Pennsylvania |
Errol Mamani Condori | Research and Innovation Center in Computer Science UCSP |
Mauricio Mazuecos | ESolutions |
Juan Miguel Navarro Carranza | Stanford University |
Ted Pedersen | University of Minnesota Duluth |
Nayeli Perez Padilla | Universidad de Guadalajara |
Ernesto Quevedo Caballero | Baylor University |
Maisha Binte Rashid | Baylor University |
Abel Reyes-Angulo | Michigan Technological University |
Pablo Rivas | Baylor University |
Alejandro Rodriguez Perez | University of Havana |
David Romero | MBZUAI |
Danae Sanchez Villegas | University of Copenhagen |
Jesus Solano | ETH Zürich |
Korn Sooksatra | Baylor University |
Sadia Tisha | Baylor University |
Maria Trusca | KU Leuven |
Javier Turek | Intel |
Matias Valdenegro Toro | Department of AI, University of Groningen |
Jorge Yero Salazar | Baylor University |
ORGANIZERS
General Chairs — Vladimir Araujo (KU Leuven), Diana Galvan Sosa (University of Cambridge)
Program Committee Chairs — Javier Turek (Intel Labs), Pablo Rivas (Baylor University)
Finance and Sponsor Chairs — Miguel Gonzales (Tecnologico de Monterrey), Luciana Benotti (UNC), Jaime Acevedo-Viloria (BrainFood)
Public Relations Chair — Errol Wilderd (Universidad Católica San Pablo)
Website Chair — Jesus Solano (ETH Zürich)
Volunteers Chair — Danae Sánchez Villegas (University of Copenhagen)
SPONSORS
PLATINUM
GOLD
SILVER
BRONZE