Mark Norris – Linguist • San Francisco, CA • Analytical Linguist at Grammarly

Hello! My name is Mark Norris. I currently work as an Analytical Linguist at Grammarly, where I manage the collection and quality of datasets that support Grammarly’s product offerings. Before that, I worked on contract as a Language Data Engineer on the Lex Data team at AWS, where I managed curation, collection, and QA of linguistic data for chatbots in a variety of languages.

I have a PhD in linguistics, and I was a faculty member in the Department of Modern Languages, Literatures, and Linguistics at the University of Oklahoma for 5 years. There, I conducted research and taught undergraduate courses on linguistic theory and typology. For more information on my academic career, see my Academic CV.

For journal editors: I am still available to review scholarly articles on linguistics, but it really has to be a good fit. Please contact me if you have an article on generative morphology/syntax or agreement that you think I would be a good reviewer for. If it’s a good match, I’d love to continue to help the research community.

LinkedIn • GitHub • Academic CV • OSF

Language/Linguistic database creation and analysis

Managed collection, curation, QA, and delivery of various supplemental data sets for Amazon Lex chatbot client
- Variety of languages (including some I do not speak), modalities, content domains
- Designed annotation guidelines for collections novel to the Lex Data team
Designing studies focused on collecting data to test specific linguistic questions
- Collecting examples from online corpora of Estonian: e.g., Estonian numerals, possessor binding
- Collecting morphosyntactic data from a broad range of languages: Archived numerical data and qualitative analysis
Building datasets for NLP: accident or not accident? (model testing or evaluation)
Research expertise in syntax, morphology, semantics, and pragmatics; strong knowledge of phonetics, phonology, and psycholinguistics

Qualitative research methods

Conducted interviews about language with Estonian speakers in Tartu, Estonia, and in Estonian-speaking communities in the SF Bay Area (i.e., linguistic fieldwork)
Designed interview materials for both exploratory inquiries and testing specific assumptions from linguistic literature

Quantitative research methods

Designed psycholinguistic experiments to investigate syntactic and morphological properties of English
Collected corpus examples for probabilistic analysis: Adjectives in Estonian
Strong knowledge of sampling for cross-linguistic analysis: Typology of nominal concord

Technical Experience

Excel/Sheets • regex • Python • bash/terminal • git
- Poketext: using Python and regular expressions to scrape and clean text data from Bulbapedia
- Python scripting for processing and QAing linguistic data
- Python scripting for optimization of NLU data set creation and quality
Experience with: SQL, R