Prototype an approach to fine-tuning a large language model (LLM) to help diagnose areas for improvement in a specific writing product. What counts as good writing is domain-specific: scientific papers, for example, require consistent terminology, while creative writing rewards variety. Proposed steps are:
- Writing Product: Coordinate with project mentors to choose a common and important writing product, such as a position paper or an academic conference paper. Identify or gather a grading rubric and a corpus of examples.
- Inject Bad Writing: For each element of the rubric, develop prompts that instruct a generative AI model to degrade the quality of the writing along that element (i.e., make it worse). This yields a training data set that pairs each good example with versions worsened on specific characteristics.
- Fine-Tune: Students will be expected to fine-tune an LLM (e.g., Llama 2) on this synthetically generated data.
- Evaluate: Assess whether the fine-tuned model suggests better domain-specific areas for improvement than an untuned baseline.
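The "Inject Bad Writing" step above could be sketched as follows. This is a minimal illustration, not part of the original proposal: the rubric elements, the prompt template, and the `generate` callable are all placeholder assumptions that would be replaced by the mentor-chosen rubric and an actual LLM inference call.

```python
# Illustrative rubric elements; the real rubric comes from the chosen
# writing product (e.g., a conference-paper grading rubric).
RUBRIC = ["consistent terminology", "clear topic sentences", "concise phrasing"]

# Hypothetical prompt template asking a generative model to worsen one
# rubric element at a time while leaving the rest of the text intact.
DEGRADE_TEMPLATE = (
    "Rewrite the following passage so that it scores poorly on "
    "'{element}' while preserving its other qualities.\n\nPassage:\n{text}"
)

def build_degradation_prompts(text: str, rubric=RUBRIC) -> dict:
    """Return one degradation prompt per rubric element."""
    return {element: DEGRADE_TEMPLATE.format(element=element, text=text)
            for element in rubric}

def make_training_pairs(text: str, generate, rubric=RUBRIC) -> list:
    """Pair the original (good) text with LLM-degraded versions.

    `generate` is any callable mapping a prompt string to model output,
    e.g. a wrapper around a local Llama 2 instance or an inference API.
    """
    prompts = build_degradation_prompts(text, rubric)
    return [{"rubric_element": element,
             "good": text,
             "bad": generate(prompt)}
            for element, prompt in prompts.items()]
```

Running `make_training_pairs` over every document in the corpus would produce the good/degraded pairs described above, labeled by the rubric element that was deliberately worsened.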
This project aligns with ongoing work with the USC Generative AI Center.