Staff Data Scientist at Betterworks (March 2021 - Present)
Computational Social Scientist at Humanyze (Late 2017 - Mar 2021)
Earned PhD in Computational Social Science from GMU (Late 2020)
Graduate Research Assistant in Center for Social Complexity (2016 - 2017)
Graduate Research Assistant in Machine Learning and Inference Lab (2012 - 2016)

toz <at> gmu <dot> edu

At Betterworks (BW). I joined BW as its first data scientist. My most recent project is on NLP (sequence labeling). Given a piece of feedback (that employee A gave to coworker B), my model identifies the competencies of strength and of opportunity in the feedback text. In another project, I analyzed the impact of employees' own behaviors and those of their managers on their goal completion rates. In my first project, I conceptualized and developed a PoC for the insights module (based on Project Aristotle).

At Humanyze. I worked as the computational social scientist (data scientist) and led the research area at Humanyze (a startup that was born out of Sandy Pentland's Human Dynamics group at MIT Media Lab). The main problem I worked on was how to model the indicators of employee engagement, team productivity, and organizational adaptability using the metadata (no content) of workplace technologies such as calendar, e-mail, Slack, etc., and sensors (if available). The theories and methods on which my algorithms relied are primarily from the fields of organizational behavior and social network analysis. My responsibilities included writing papers, collaborating with researchers, and supervising interns. Some of my research output:

At George Mason University. While studying towards my PhD in Computational Social Science with Andrew Crooks, I was a graduate research assistant in the Center for Social Complexity (one year) and the Machine Learning and Inference Lab (for five years). My work there includes:

  • My PhD Defense Presentation (2020): US$1 trillion is the annual cost of lower productivity due to stress (WHO, 2019). To help solve this problem, in my PhD, I proposed and showcased a strategy leveraging communication (meta)data and computational techniques to identify behaviors leading to and resulting from collective stress.
  • Computational Social Science of Disasters: Opportunities and Challenges (2019): We extensively reviewed the roles of subfields of social sciences, crisis informatics, and computational social sciences in disaster research (260 citations!). By discussing opportunities and challenges, this paper justifiably invites (i) CSS researchers to work on disasters and (ii) traditional disaster researchers to pay more attention to CSS.
  • Attribution of Blame and Responsibility - #FlintWaterCrisis (2018): Tested the generalizability of a set of theories in the sociology of disasters using online communication data. Formed and tested our hypotheses on (i) who to blame, (ii) role of partisan predisposition, (iii) concerned geographies, and (iv) contagion of complaining. An earlier version of this work was presented at the Social Web for Disaster Management (SWDM'16) workshop.
  • Generation of Realistic Mega-Cities (2017): As part of the project What Happens If a Nuclear Bomb Goes Off in Manhattan?, we proposed a method for synthesizing populations and social networks for agent-based modeling. I created a synthetic population (of two NY counties), their social networks, and road networks to simulate the responses to a nuclear attack. (Jupyter Notebook)
  • Doomsayers or Pollyannas? News Sentiment and Public Gatekeeping on Twitter: Does the public tend to favor more positive news when retweeting or not? To investigate the power of the public to reshape the news flow, we conducted sentiment analyses of published news, tweeted news, and retweeted news from eight mainstream news organizations. (Jupyter Notebook)
  • Politicians Busted while Agenda-setting on Social Media: To what extent do Members of Congress (MCs) use social media for agenda building? That is, do they talk (tweet) about some topics while avoiding others? I created the co-commentation network of MCs of the 113th Congress and detected two emergent communities in this network, which correspond to the two parties in Congress. The detected communities and the party memberships overlapped by 95%+. I presented this work at the PolNet (2015) workshop and the CSSS summit (2015).
  • AirBnB++: Search Listings by Reputation and Description: AirBnB lacks two features: (i) filtering search results by seller reputation and (ii) keyword search within listing contents. I developed a geo-web app to fix these issues. Please see the video demo towards the end of the notebook.
  • Twlets: Twitter→Excel: I think this Chrome Web Browser Extension is the most convenient way to download data from Twitter. With a single click you can download tweets, followers, etc. as an MS Excel file.
Collective Stress in the Digital Age - My PhD defense presentation