I am a doctoral candidate in the Linguistics Department at the University of Hawaiʻi at Mānoa. My dissertation research focuses on language attitudes on Pohnpei, FSM. My other research projects involve language documentation and conservation, linguistic geography, acoustic phonetics, discourse analysis, quantitative approaches to linguistic research, sociolinguistics, Micronesian languages, and decolonizing linguistic methodologies. I also have a strong interest in data science, quantitative methods, and R programming.
Interests
Sociolinguistics
Language attitudes
Pohnpei, Federated States of Micronesia
Data science
R
Education
PhD in Linguistics, 2018
University of Hawaiʻi at Mānoa
MA in Linguistics, 2017
University of Hawaiʻi at Mānoa
BA in Linguistics and German Language and Literature, 2011
This article explores a more nuanced understanding of topological relations in the Pohnpeian language (Austronesian). The BowPed Toolkit (Bowerman, Melissa & Eric Pederson. 1992. Topological relations picture series. In Space stimuli kit 1.2: November 1992, 51. Nijmegen: Max Planck Institute for Psycholinguistics. http://fieldmanuals.mpi.nl/volumes/1992/bowped/) is employed as an elicitation tool with five Pohnpeian speakers. Evolutionary classification tree modeling is used as a discovery tool to find patterns in the data. The results show that the two prepositions in Pohnpeian, nan and ni, should be redefined in terms of topological relations as ‘containment’ and ‘attachment’ respectively. Likewise, the meanings of some prepositional nouns are further revised.
What to do with categorical data?
Categorical data can be challenging to analyze quantitatively.
In language research we often have data that are purely categorical.
In today’s presentation we will deal with a specific type of categorical data found in questionnaires.
Questionnaire Data
Questionnaires are frequently used in a variety of language research scenarios.
They often ask people to rate something (Likert scales) or select the most appropriate response.
Example: select the language that is most appropriate to use in a given domain.
Example: rate level of agreement with several statements.
Research Question
How do the questionnaire respondents relate to each other based on their responses?
What to do with categorical data?
Categorical data can be challenging to analyze quantitatively.
In language research we often have data that are purely categorical.
In today’s presentation we will deal with a specific type of categorical data found in questionnaires.
Questionnaire Data
Questionnaires are frequently used in a variety of language research scenarios.
They often ask people to rate something (Likert scales) or select the most appropriate response.
There are often multiple questions that have the same answer scale (same choices).
Example: select the language that is most appropriate to use in a given domain.
Example: rate level of agreement with several statements.
Research Question
Often we are interested in how the questions relate to each other in terms of their answers, as well as how the answers relate to each other based on the questions they were most used with.
Correspondence Analysis
CA is a statistical technique that shows how the questions and answers of multiple questions relate to each other.
Requires the data to have the same scale (all questions must have the same possible answers).
CA is a descriptive tool and doesn’t give p-values per se.
How does CA work?
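As a rough illustration of the mechanics behind correspondence analysis, the sketch below runs a simple CA in base R on a small question-by-answer table. The counts, the domain labels, and the answer labels are all invented for illustration and are not data from the presentation; the talk itself may well have used a dedicated package rather than this hand-rolled version.

```r
# Toy contingency table with INVENTED counts (not data from the talk):
# rows are questionnaire items (domains), columns are the shared answer choices.
counts <- matrix(
  c(20,  5,  5,
     4, 18,  8,
     6,  7, 17),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    question = c("home", "church", "school"),
    answer   = c("Pohnpeian", "English", "both")
  )
)

# Correspondence matrix and its row/column masses (marginal proportions).
P        <- counts / sum(counts)
row_mass <- rowSums(P)
col_mass <- colSums(P)

# Standardized residuals: departure from independence, scaled by the masses.
S <- diag(1 / sqrt(row_mass)) %*%
  (P - outer(row_mass, col_mass)) %*%
  diag(1 / sqrt(col_mass))

# The singular value decomposition yields the CA dimensions.
dec <- svd(S)

# Principal coordinates for questions (rows) and answers (columns).
row_coords <- diag(1 / sqrt(row_mass)) %*% dec$u %*% diag(dec$d)
col_coords <- diag(1 / sqrt(col_mass)) %*% dec$v %*% diag(dec$d)
rownames(row_coords) <- rownames(counts)
rownames(col_coords) <- colnames(counts)

# Inertia per dimension; the total equals the chi-square statistic divided by n.
inertia <- dec$d^2
round(inertia / sum(inertia), 3)
```

Plotting the first two columns of `row_coords` and `col_coords` together gives the usual CA map, in which questions and answers that sit close together tend to co-occur. Consistent with the descriptive nature of CA, nothing here produces a p-value.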
This document comes from a presentation to the UH-Mānoa data science group for linguists.
This is a top level section
R Markdown
This is an R Markdown presentation. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document.
Overview of Presentation
Bayesian vs. frequentist inference
Bayes’ Theorem
Example of Bayesian inference
Bayesian LMERs with rstanarm
  how to code models
  selecting priors
  displaying & interpreting the posterior distribution
  model diagnostics and comparisons
Bayesian vs. Frequentist Inference
Frequentist Inference
Uses only the data and compares it to an idealized model to make inferences about the data.
Example Problem: You lose your cellphone in your house. You have a friend call it and you listen for the sound to find it.
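To make the contrast with Bayes’ Theorem concrete, here is a minimal sketch of how the lost-phone problem might look once prior knowledge is added to the sound you hear. The room names and every probability below are invented for illustration; they are assumptions, not figures from the presentation.

```r
# INVENTED numbers for illustration only (not from the presentation).

# Prior: how likely the phone is to be in each room before anyone calls it,
# e.g. based on where it has turned up in the past.
prior <- c(kitchen = 0.5, bedroom = 0.3, "living room" = 0.2)

# Likelihood: probability of hearing the ring the way you heard it,
# given that the phone really is in each room.
likelihood <- c(kitchen = 0.1, bedroom = 0.7, "living room" = 0.2)

# Bayes' theorem: posterior is proportional to prior times likelihood,
# normalized so that the probabilities sum to 1.
posterior <- prior * likelihood / sum(prior * likelihood)

round(posterior, 3)
names(which.max(posterior))
```

With these made-up values the bedroom ends up with the highest posterior probability, even though the kitchen had the highest prior: the data (the sound) outweigh the prior here, but both contribute to the result.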
What is Git?
Git is…
A distributed version control system
Used to allow multiple people to collaborate on the same code
Also useful for managing your own code and larger projects
Applications for language researchers?
Coauthoring papers
Working on R scripts or other code
Sharing code with others (via GitHub or Bitbucket)
Managing large projects like a dissertation, thesis, or article
What does Git do?
What is tidy data?
Tidy data have the following characteristics:
Observations are in rows
Variables are in columns
Contained in a single dataset
An example of tidy data
Participant  Gender  Trial  Value
01           M       1      100
01           M       2      210
02           F       1      50
02           F       2      75
An example of messy data
Participant  Trial1  Trial2
M01          100     210
F02          50      75
R Packages for tidy data
library("tidyverse")
## ── Attaching packages ──────────────────────────────────────────────────────────────────────────────────── tidyverse 1.
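One way the messy example above could be reshaped into the tidy layout is sketched here with `pivot_longer()` from tidyr. The object names `messy` and `tidy` are hypothetical, and the assumption that the first character of the participant code is the gender is inferred from the two example tables rather than stated in the source.

```r
library(tidyverse)

# The messy example: one column per trial, with gender fused into the ID.
messy <- tibble(
  Participant = c("M01", "F02"),
  Trial1      = c(100, 50),
  Trial2      = c(210, 75)
)

tidy <- messy %>%
  # Stack the trial columns so that each row is a single observation.
  pivot_longer(
    cols         = starts_with("Trial"),
    names_to     = "Trial",
    names_prefix = "Trial",
    values_to    = "Value"
  ) %>%
  # Split the fused ID into its two variables (ASSUMES the first character
  # is the gender and the remainder is the participant number).
  mutate(
    Gender      = str_sub(Participant, 1, 1),
    Participant = str_sub(Participant, 2),
    Trial       = as.integer(Trial)
  ) %>%
  select(Participant, Gender, Trial, Value)

tidy
```

The result has four rows and the same four columns as the tidy example: each observation in its own row and each variable in its own column.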