Multimodal Processing of Humorous Tweets

Doctor of Philosophy (Ph.D.)


Literature and Languages

Fall 2020


For this dissertation, I investigated the processing of multimodal (text and image) humorous tweets by native speakers of English and Spanish. The dissertation consists of three separate but related studies. The first study examines whether untrained participants can identify the structural elements of the humor in a multimodal text. Specifically, I contrast three conditions: whether the humor lies in the text, in the image, or in both. The second study examines the order of processing of textual and visual stimuli (tweets), focusing in particular on which type of information (verbal or visual) the participants access first. Finally, the third study investigates whether pupil dilation correlates with the recognition of the humor.
While there is plenty of literature showing that naive participants are sensitive to the location of humor in text, and a much smaller amount of research on the capacity to spot humor in visual sources such as cartoons, there is almost no research on the processing of multimodal (text and image) humorous stimuli. Previous studies on the processing of image and text (e.g., Hernández-Méndez & Muñoz-Leiva, 2015; Yus, 2019) have shown that the order in which these modes are analyzed is neither predictable nor consistent across multimedia formats such as memes or cartoons. Viewers usually focus on the area that captures their attention or seek relevant information that promotes the understanding of the image-text interaction. Research on pupillometry (e.g., Schmidtke, 2018; Sirois & Brisson, 2014) shows that maximum pupil dilation coincides with cognitive effort or processing. Additionally, humor comprehension and humor appreciation are similarly connected to cognitive and affective processes (Chen et al., 2017). In order to investigate these issues, this dissertation employed a quantitative design and drew on two different groups of participants.
With regard to identifying the location of incongruity, 49 participants took part in a survey to determine the humorous part(s) of English tweets. The results from the survey indicated that naive viewers were able to identify the location of incongruity in a visual scene. Due to COVID-19 restrictions during the data collection period (Spring 2020), it was not possible to ask this same group of participants to perform the second task regarding the processing of humorous tweets. Instead, Spanish data previously collected from 36 participants were used to examine the order of processing of humorous tweets. These data showed that participants were attracted first to the image and then to the caption. Finally, for the pupillometry analysis, the same Spanish data were used, for the same reason. The findings did not reveal a correlation between pupil dilation and discovering the joke in the tweets. In summary, the studies discussed in this dissertation show that speakers can identify and differentiate the locations of humor in multimodal texts and that their strategies for ordering the processing of multimodal humorous texts differ from those used for other multimodal texts.


Salvatore Attardo

Arts and Humanities | Digital Humanities