Hello! My name is Mark Norris. I am currently working as an Analytical Linguist at Grammarly, where I manage the collection and quality of datasets to support Grammarly’s product offerings. Before that, I worked (on contract) as a Language Data Engineer on the Lex Data team at AWS, where I managed curation, collection, and QA of linguistic data for chatbots in a variety of languages.

I have a PhD in linguistics, and I was a faculty member in the Department of Modern Languages, Literatures, and Linguistics at the University of Oklahoma for 5 years. There, I researched and taught undergraduate courses about linguistic theory and typology. For more information on my academic career, see my Academic CV.

For journal editors: I am still available to review scholarly articles on linguistics, but it really has to be a good fit. Please contact me if you have an article on generative morphology/syntax or agreement that you think I would be a good reviewer for. If it’s a good match, I’d love to continue to help the research community.

LinkedIn • GitHub • Academic CV • OSF

Language/Linguistic database creation and analysis

  • Managed collection, curation, QA, and delivery of various supplemental datasets for Amazon Lex chatbot client
    • Variety of languages (including some I do not speak), modalities, content domains
    • Designed annotation guidelines for collections novel to the Lex Data team
  • Designing studies focused on collecting data to test specific linguistic questions
  • Building datasets for NLP: accident or not accident? (model testing or evaluation)
  • Research expertise in syntax, morphology, semantics, and pragmatics; strong knowledge of phonetics, phonology, and psycholinguistics

Qualitative research methods

  • Conducted interviews about language with Estonian speakers in Tartu, Estonia, and in Estonian-speaking communities in the SF Bay Area (i.e., linguistic fieldwork)
  • Designed interview materials for both exploratory inquiries and testing specific assumptions from linguistic literature

Quantitative research methods

  • Designed psycholinguistic experiments to investigate syntactic and morphological properties of English
  • Collected corpus examples for probabilistic analysis: Adjectives in Estonian
  • Strong knowledge of sampling for cross-linguistic analysis: Typology of nominal concord

Technical Experience

  • Excel/Sheets • regex • Python • bash/terminal • git
    • Poketext: using Python and regular expressions to scrape and clean text data from Bulbapedia
    • Python scripting for processing and QAing linguistic data
    • Python scripting for optimization of NLU dataset creation and quality
  • Experience with: SQL, R