Intro Slide

NY-14 is currently represented by Alexandria Ocasio-Cortez.  On the right, she is pictured protesting the Trump administration immigration policies and child detention in Texas.

Slide #1

Congressional District NY-14 has a high, non-native, non-white population.  21.6% of residents are naturalized foreign-born citizens, while 25.4% are non-U.S. citizens.  By racial demographics, the district is comprised of 47.7% Latinx, 18% Asian, and 11.1% Black constituents.  Academic literature suggests that, because of the high proportions of minority groups, NY-14 will have resources available to improve lives of immigrants.  Specifically, Abrajano & Singh (2009) found that, when available, Spanish television news sources are statistically significantly more positive when speaking about immigration topics than their English counterparts.  Because of NY-14’s high minority population, the current study hypothesizes the following: H1) that the local Spanish daily newspaper will be more positive towards immigration than the local English daily newspaper; H2) that the Spanish coverage will focus more on humanitarian aid than the English source.  These hypotheses are grounded in the research by Abrajano & Singh as more positive coverage may imply more focus on humanitarian aid rather than partisanship issues and border security.

Slide #2

One hundred articles were collected from each of the two local news sites — El Diario (Spanish) and the Daily News (English) —  from the period of December 11,2018 – January 31,2019, coinciding with the government shutdown.  An article was collected if it contained any of the following words: immigration, immigrant, border, wall, shutdown, security, or undocumented (English); or inmigrante, muro fronterizo, inmigración, muro en la frontera, indocumentado, indocumentada, al cierre del gobierno, or seguridad (Spanish).  Perfect translations for the Spanish terms often did not exist and needed to be constructed by a phrase or were necessarily extended to multiple genders if an adjective.  In order to specify key terms and date ranges, articles were scraped by a Python script through Google News.  The scrape was limited to 100 articles in each language due to a lack of computational power.  To complete the analysis, word sentiments were obtained through the NRC Word-Emotion Association lexicon, which provides positive-negative ratings and emotional connotations of words in over 100 languages based in the original English lexicon.  In order to determine article focus in categories of Partisanship, Security, and Humanitarian Aid, two words (pictured on the bottom right) were chosen for each category that acted as signals for article content.  As with the search terms, “Republican” was necessarily extended to “Republicano” and “Republicana” to account for gendered adjectives in Spanish.

Slide #3

Two-sample t-tests were run on the English and Spanish article content to determine if one source used more positive or negative language in speaking about immigration. Both sentiments were tested because positivity and negativity are not necessarily mutually exclusive since many words in the lexicon are not associated with either sentiment and some words may carry both connotations.  Results did not show a significant difference between sentiment values, so the null hypothesis that the language sources are not different in sentiment cannot be rejected and H1 is inconclusive.  Graphs are pictured on the right, which show the top words used in each language for each sentiment.  Although it is not a statistical test and general conclusions cannot be drawn, the results are interesting to interpret.  For example, both language sources heavily used “government” and “president,” but English positive words center around money and resources, while the Spanish words are about agreement and personal security (“Seguro”, “casa”, “construir”).  Visualizing the top used words also helps identify the limitations of this analysis and the NRC lexicon in particular.  “President,” for example, while generally a positive term, could easily be seen as negative or neutral in the situation of the shutdown, depending on the political beliefs of the human interpreter.  Likewise, “cierre” in Spanish is classified as positive; however, it means “closure,” which is not positive in the context of immigration. It is also interesting, and perhaps problematic in this context, that “white” is heavily used in the English source and that it is classified as positive.  Future research should address these limitations by compiling an emotion lexicon specific to immigration through mass polling to avoid researcher bias.

Slide #4

Articles were classified into each focus by examining which category had the most terms per article. For example, if an article had 30 Security terms, and only 10 each of Partisanship and Humanitarian Aid, it was classified as a Security article.  Two-sample t-tests were also run to determine focus of each language. Although the three categories are mutually exclusive, all three tests were run to determine which focuses were preferred, if any, in each language.  There was no significant difference in Partisanship classification between sources; however, a significant difference was shown for both Security and Humanitarian Aid.  El Diario articles were shown to be classified as Security at a statistically significantly higher rate than The Daily News articles.  Dissimilarly, El Diario articles were classified as Humanitarian Aid focus at a significantly lower rate than The Daily News.  These results hold even after applying a Bonferroni correction (m = 5, for each sentiment and focus test; p < 0.01). That said, as pictured on the right, the English articles contained more focus terms for each category in general, with the average English article containing 8.48 Partisanship terms as compared to Spanish articles at 1.4 terms per article.  These results are similar across focus categories as seen in the graph. This difference may be due, in part, to imperfect translation of the terms.  Although they were directly translated using crowd-sourced translation service (Linguee) and researcher language experience (Advanced Proficient), a direct translation may not have been appropriate in this instance, because the language may use other terms to signal the focus categories.  Future research should consult native Spanish speakers and Spanish news writers to adequately translate focus terms.