Benjamin Anderson
Projects

Machine Learning Projects

Guide to Transformers

Benjamin Anderson

Project for Regulation, Evaluation, and Governance Lab – Autumn 2022

This comprehensive guide explains the key ideas behind the Transformer architecture. It is intended for a relatively technical audience, and assumes some familiarity with neural networks and deep learning (but not with the Transformer). I created this document to assist in my own learning, and to provide a resource for others who want to understand the architecture currently powering a deep learning renaissance. [Link] [PDF]


The Connection Between Male and Female Military Sexual Assault

Benjamin Anderson

STATS205 Final Project – June 2021

Building on a substantial literature documenting the prevalence, causes, and health effects of sexual assault in the United States Armed Forces, I analyze the relationship between the risk of sexual assault for male and female members of the military. I use linear regression as a baseline and compare it to the non-parametric methods studied in the class (locally-weighted regression and splines), which are tuned using leave-one-out cross-validation. [Paper] [Code]
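
For readers unfamiliar with leave-one-out cross-validation, here is a minimal Python sketch of how it can pick a smoothing bandwidth. It is an illustration only, not the code from the project: it uses a Gaussian kernel-weighted average (a simpler, local-constant relative of locally-weighted regression), and the names `kernel_smooth`, `loocv_error`, and `best_bandwidth` are hypothetical.

```python
import math


def kernel_smooth(x0, xs, ys, bandwidth):
    """Predict y at x0 as a Gaussian-weighted average of the observed ys."""
    weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


def loocv_error(xs, ys, bandwidth):
    """Mean squared error when each point is predicted from all the others."""
    squared_errors = []
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]  # hold out point i
        rest_y = ys[:i] + ys[i + 1:]
        prediction = kernel_smooth(xs[i], rest_x, rest_y, bandwidth)
        squared_errors.append((ys[i] - prediction) ** 2)
    return sum(squared_errors) / len(squared_errors)


def best_bandwidth(xs, ys, candidates):
    """Return the candidate bandwidth with the lowest leave-one-out error."""
    return min(candidates, key=lambda h: loocv_error(xs, ys, h))
```

A very small bandwidth relative to the spacing of the data can drive every weight to zero, so candidate values should be on the same scale as the gaps between observations.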


TrollSpotting: Identifying Tweets from Russian Trolls

Benjamin Anderson

CS229 Final Project – Autumn 2020

Since the 2016 election, Twitter has identified thousands of accounts belonging to employees of the Internet Research Agency (IRA), a "troll factory" operating out of St. Petersburg, Russia. In this project, I test the classical bag-of-words approach to extract semantic content from Twitter posts, and compare it to modern feature extraction techniques based on Transformer neural networks. I apply various supervised learning algorithms to these features to distinguish Tweets written by ordinary users from Tweets written by trolls. [Paper]
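
As a rough illustration of the bag-of-words idea (a sketch, not the feature pipeline used in the paper), each Tweet can be turned into a vector of word counts over a fixed vocabulary. The function name `bag_of_words`, the tokenizing regular expression, and the example vocabulary are all assumptions made for this example.

```python
import re
from collections import Counter


def bag_of_words(text, vocabulary):
    """Count how often each vocabulary word appears in the text.

    Word order is discarded; only the counts survive, which is what
    makes this a "bag" of words.
    """
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


# Hypothetical vocabulary and Tweet, for illustration only.
features = bag_of_words("Vote now, vote early!", ["vote", "early", "late"])
```

Here `features` is `[2, 1, 0]`, and vectors like this one can be passed to any standard supervised classifier.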


Playing "Dominion" with Deep Reinforcement Learning

Benjamin Anderson and Garrick Fernandez

CS238 Final Project – Autumn 2020

In this project, we apply deep reinforcement learning to the multi-player card game "Dominion." We train an agent using the SARSA algorithm, combined with global function approximation. After training through self-play, the RL agent easily beat a computer player that selected a random action every turn. [Paper]
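
To make the SARSA update concrete, here is a minimal Python sketch of a single update step. It assumes a linear approximator, Q(s, a) = w · phi(s, a), as a stand-in for the function approximator; the approximator, features, and hyperparameters actually used in the project are not shown here, and the names `dot` and `sarsa_update` are hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length lists of numbers."""
    return sum(a * b for a, b in zip(u, v))


def sarsa_update(w, phi_sa, reward, phi_next, alpha=0.1, gamma=0.99, done=False):
    """Return the weights after one semi-gradient SARSA step.

    phi_sa   -- features of the state-action pair just taken
    phi_next -- features of the next state and the action the agent
                actually chose there (SARSA is on-policy)
    """
    q_sa = dot(w, phi_sa)
    q_next = 0.0 if done else dot(w, phi_next)  # no bootstrapping at game end
    td_error = reward + gamma * q_next - q_sa
    return [wi + alpha * td_error * xi for wi, xi in zip(w, phi_sa)]
```

Because the target uses the action the agent really takes next, rather than the greedy one, the learned values reflect the exploring policy that generated the self-play games.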


Interpretable Criminal Risk Assessment

Benjamin Anderson and Gaeun Kim

CS221 Final Project – Autumn 2019

Algorithmic criminal risk assessment has the potential to make the criminal justice system fairer, remove arbitrariness and human bias, and reduce incarceration by releasing people who are unlikely to re-offend. However, the current gold standard for criminal risk assessment (COMPAS) is severely lacking: it is an unaccountable and costly "black box" assessment that has also faced credible accusations of racial bias (most famously by ProPublica). In this project, we prototype several simple, interpretable algorithms for criminal risk assessment. We show that logistic regression, decision tree, and Bayesian network approaches all have comparable accuracy to the gold standard proprietary algorithm, while mitigating racial bias in classification error. [Paper]

Other Projects

GPTemail

Benjamin Anderson

Personal Project – Autumn 2022

Browser extension that uses GPT-3 to answer your emails. It uses context from email text selected by the user to craft a custom response. Try it yourself! All you need is an OpenAI or Cohere API key; the extension is free and open source. [Code]


HiveMind Chess

Benjamin Anderson

Personal Project – Spring 2021

Chess interface that uses the open-source Lichess database to select a move with probability proportional to its popularity, with the goal of simulating the experience of playing chess against a person. The game continues until it reaches a novel position, one that is not in the database. [Interactive Demo] [Code]
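
The move-selection rule can be sketched in a few lines of Python. This is an illustration under assumptions, not the project's code: the function name `pick_move` and the `move_counts` dictionary (mapping each move to the number of database games in which it was played from the current position) are hypothetical, and the database lookup itself is omitted.

```python
import random


def pick_move(move_counts, rng=random):
    """Sample a move with probability proportional to its popularity.

    Returns None when no games are recorded for the position, which is
    the point at which the game would end.
    """
    # Ignore moves with a zero count so the weights always sum to > 0.
    played = {move: n for move, n in move_counts.items() if n > 0}
    if not played:
        return None
    moves = list(played)
    weights = [played[move] for move in moves]
    return rng.choices(moves, weights=weights, k=1)[0]
```

With counts such as `{"e4": 3, "d4": 1}`, `pick_move` returns "e4" about three times out of four, so the reply mirrors what human players tend to choose rather than what an engine would consider best.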


Visual Explainer: The Affordable Care Act

Benjamin Anderson

CS448B Final Project – Autumn 2020

This interactive visual explainer walks the reader through the landscape of the American healthcare system since the Affordable Care Act was signed into law in 2010. Through maps and interactive charts, the reader is prompted to think about their own beliefs about how the Affordable Care Act changed the healthcare system, and then presented with the data and context. This project was built using D3, a JavaScript library for web-based data visualization. [Interactive Demo] [Code]


TimeSearcher

Benjamin Anderson

CS448B Course Project – Autumn 2020

Originally developed by Harry Hochheiser and Ben Shneiderman, TimeSearcher is an interactive tool to filter and query time-series data. For a class project, I created a web-based TimeSearcher interface for the Stanford Cable News dataset in D3, a JavaScript library for data visualization in the browser. The interface allows the user to draw boxes to filter the dataset, as well as search for persons of interest by name. [Interactive Demo] [Code]


DEET

Benjamin Anderson, Ryan Eberhardt, Armin Namavari

CS110L Class Project – Spring 2020

GDB-like debugger written in Rust that uses ptrace to get information about a running program. Built in CS110L using starter code by Ryan Eberhardt and Armin Namavari. [Code]


BalanceBeam

Benjamin Anderson, Ryan Eberhardt, Armin Namavari

CS110L Class Project – Spring 2020

Multithreaded reverse proxy / load balancer written in Rust. Uses the asynchronous Tokio library for nonblocking I/O. Implements passive and active health checks, as well as rate limiting. Built in CS110L using starter code by Ryan Eberhardt and Armin Namavari. [Code]