Hello! My name is Mark Norris. I am currently working as an Analytical Linguist at Grammarly, where I manage the collection and quality of datasets that support Grammarly’s product offerings. Before that, I worked (on contract) as a Language Data Engineer on the Lex Data team at AWS, where I managed curation, collection, and QA of linguistic data for chatbots in a variety of languages.
I have a PhD in linguistics, and I was a faculty member in the Department of Modern Languages, Literatures, and Linguistics at the University of Oklahoma for 5 years. There, I researched and taught undergraduate courses about linguistic theory and typology. For more information on my academic career, see my Academic CV.
For journal editors: I am still available to review scholarly articles on linguistics, but it really has to be a good fit. Please contact me if you have an article on generative morphology/syntax or agreement that you think I would be a good reviewer for. If it’s a good match, I’d love to continue to help the research community.
LinkedIn • GitHub • Academic CV • OSF
Language/Linguistic database creation and analysis
- Managed collection, curation, QA, and delivery of various supplemental datasets for an Amazon Lex chatbot client
- Covered a variety of languages (including some I do not speak), modalities, and content domains
- Designed annotation guidelines for collections novel to the Lex Data team
- Designing studies focused on collecting data to test specific linguistic questions
- Collecting examples from online corpora of Estonian: e.g., Estonian numerals, possessor binding
- Collecting morphosyntactic data from a broad range of languages: Archived numerical data and qualitative analysis
- Building datasets for NLP: accident or not accident? (model testing or evaluation)
- Research expertise in syntax, morphology, semantics, and pragmatics; strong knowledge of phonetics, phonology, and psycholinguistics
Qualitative research methods
- Conducted interviews about language with Estonian speakers in Tartu, Estonia, and in Estonian-speaking communities in the SF Bay Area (i.e., linguistic fieldwork)
- Designed interview materials for both exploratory inquiries and testing specific assumptions from linguistic literature
Quantitative research methods
- Designed psycholinguistic experiments to investigate syntactic and morphological properties of English
- Collected corpus examples for probabilistic analysis: Adjectives in Estonian
- Strong knowledge of sampling for cross-linguistic analysis: Typology of nominal concord
Technical Experience
- Excel/Sheets • regex • Python • bash/terminal • git
- Poketext: using Python and regular expressions to scrape and clean text data from Bulbapedia
- Python scripting for processing and QAing linguistic data
- Python scripting to optimize the creation and quality of NLU datasets
- Experience with: SQL, R