Document Type

Conference Proceeding

Publication Date

2021

Abstract

Language models are trained only on text, even though humans learn their first language in a highly interactive and multimodal environment where the first words learned are largely concrete, denoting physical entities and embodied states. To enrich language models with some of this missing experience, we leverage two sources of information: (1) the Lancaster Sensorimotor norms, which provide ratings (means and standard deviations) for over 40,000 English words along several dimensions of embodiment and capture the extent to which something is experienced across 11 different sensory modalities, and (2) vectors from coefficients of binary classifiers trained on images for the BERT vocabulary. We pre-train the ELECTRA model and fine-tune the RoBERTa model with these two sources of information, then evaluate using the established GLUE benchmark and the Visual Dialog benchmark. We find that enriching language models with the Lancaster norms and image vectors improves results on both tasks, with some implications for robust language models that capture holistic linguistic meaning in a language learning context.

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Publication Information

Kennington, Casey. (2021). "Enriching Language Models with Visually-Grounded Word Vectors and the Lancaster Sensorimotor Norms". Proceedings of the 25th Conference on Computational Natural Language Learning, 148-157. http://doi.org/10.18653/v1/2021.conll-1.11

Included in

Computer Sciences Commons

Enriching Language Models with Visually-Grounded Word Vectors and the Lancaster Sensorimotor Norms
