Portfolio
These are the projects and writeups that I’m most proud of. If you like what you see, please take a look at my resume, GitHub, LinkedIn, or email me at sam@samlowe.dev.
📊 Data Science Projects
👁️ Convoluted Computer Vision • Repository
Python, Jupyter, tensorflow, matplotlib
- A variety of case studies in computer vision to classify various well-known image data sets using convolutional neural networks and modern techniques (transfer learning, image augmentation, depthwise convolution)
- Achieved accuracy of over 96% on mnist, 93% on pneumonia_mnist, and 75% on cifar-10, among others
🏆 League of Losers • Writeup • Repository
Python, Jupyter, NumPy, scikit-learn, tensorflow, matplotlib, seaborn
A 3d graph comparing model accuracy with various hyperparameters
- A case study of League of Legends ranked games to predict which teams win for insights into winning strategies
- Performed exploratory data analysis, then compared many different models including quadratic discriminant analysis, support vector machines, logistic regression, and neural networks to achieve an accuracy of >70% despite very limited data
- Writeup discusses how I iterated on each neural network to improve accuracy, why I chose the accuracy goals that I did, and what stuck out to me during my analysis
🦠 COVID Mortality Prediction • Writeup • Repository
Python, Jupyter, NumPy, scikit-learn, tensorflow, matplotlib, seaborn
- With two teammates, I explored a data set from Mexico’s Ministry of Health containing patient comorbidity and mortality data. We used seaborn and matplotlib to analyze and visualize patient demographics and find the features that correlated most strongly with patient mortality
- We used principal component analysis to reduce dimensions from 20 down to 3 and trained various models including logistic regression, random forest, artificial neural network, linear and quadratic discriminant analysis, k-nearest neighbors, and more to predict patients’ health risk
- Despite the data set missing crucial features, we still achieved >70% accuracy! The team and I finally presented our findings and methodology to the class and earned a full score
- The writeup, Lessons From My First Data Science Project, explains our process, methodology, and findings more deeply, and includes some of the matplotlib and seaborn visuals I worked so hard on!
💻 Programming Projects
🧑‍🎓 Pisa • Repository
Python, Django, TypeScript, HTML, CSS, Liquid, Lean4, uv, Git
- Pisa is a Django website for teachers to design and assign programming and proof assignments in Lean4, Microsoft’s open-source proof assistant
- Teachers can design assignments with multiple problems consisting of starter code blocks (that cannot be edited) and response blocks that can be edited, and set point values for each problem and deadlines. Students can submit their assignments to be automatically graded by a Lean4 executable on the server for instant feedback
- Multiple quality-of-life improvements including code highlighting, live code feedback from Lean, and easy student score exporting to CSV or Excel file formats
💎 Gems in the Rough • Repository
Python, Hugging Face, PySide6, uv, Git
- An image sorting tool that uses a local AI assistant, Gemma, to help users sort images on their computer with human oversight. The user specifies an input directory and folder options, and a UI will open showing each image in the directory and Gemma’s top 4 suggested folders. The user can choose which suggestion to accept (or reject all of them) by clicking UI buttons or by hitting the 1, 2, 3, 4, or 5 keys
- Alternatively, the user can disable AI mode and just specify which directories should be associated to the 1, 2, 3, and 4 keys
- Features dynamic queueing and loads images as the user moves through the directory; the user can specify the number of active threads and how far ahead the program will look before it pauses for the user to catch up. The user can also choose which version of Gemma 4 they want, with the smallest Gemma 4 model (E2B-it) being selected by default
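The look-ahead loading described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the names `prefetch`, `load`, `lookahead`, and `workers` are hypothetical, and `load` stands in for whatever function reads and decodes an image.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def prefetch(paths, load, lookahead=4, workers=2):
    """Yield (path, load(path)) in order, queueing at most `lookahead` loads ahead."""
    pending = deque()
    iterator = iter(paths)
    with ThreadPoolExecutor(max_workers=workers) as pool:

        def top_up():
            # Submit new loads until `lookahead` items are queued or paths run out.
            for path in islice(iterator, lookahead - len(pending)):
                pending.append((path, pool.submit(load, path)))

        top_up()
        while pending:
            path, future = pending.popleft()
            result = future.result()  # blocks only if this load has not finished yet
            top_up()
            # The generator pauses here until the consumer asks for the next item,
            # so no further loads are submitted while the user is still deciding.
            yield path, result
```

Because the generator only tops up the queue when the consumer requests another item, loading naturally pauses once it is `lookahead` items ahead of the user.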
🔀 Playlist Tools • Repository
Rust, YouTube API, tokio, oauth2, plotters, Git
Visualizing the video durations in one of my playlists, “0 Watch Later”
- A command-line program allowing the user to sort one of their playlists in additional ways YouTube doesn’t currently offer, such as by video duration or channel name
- The user logs into YouTube through their browser using the secure OAuth2 authentication system, then the program generates a graphic of the playlist and begins sorting
- Async programming using the popular tokio runtime improves performance when hitting many APIs; other optimizations include batch downloading video information and caching video information to conserve a limited number of allowed API calls. Successfully tested on several playlists with nearly 2,000 videos
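The batching and caching idea can be sketched as follows. This is an illustrative Python version rather than the project's Rust code: `fetch_batch` is a hypothetical stand-in for the real API call, and the batch size of 50 is an assumption about the per-request limit on video IDs.

```python
from itertools import islice

BATCH_SIZE = 50  # assumed maximum number of video IDs per API request


def batched(items, size):
    """Split items into consecutive lists of at most `size` elements."""
    iterator = iter(items)
    while batch := list(islice(iterator, size)):
        yield batch


def fetch_durations(video_ids, cache, fetch_batch):
    """Return {video_id: seconds}, calling fetch_batch only for uncached IDs."""
    # dict.fromkeys removes repeated IDs while keeping their original order.
    missing = [vid for vid in dict.fromkeys(video_ids) if vid not in cache]
    for batch in batched(missing, BATCH_SIZE):
        cache.update(fetch_batch(batch))  # one API call per batch
    return {vid: cache[vid] for vid in video_ids}


def sort_by_duration(video_ids, cache, fetch_batch):
    """Return the video IDs ordered from shortest to longest."""
    durations = fetch_durations(video_ids, cache, fetch_batch)
    return sorted(video_ids, key=durations.__getitem__)
```

With this shape, a 2,000-video playlist needs about 40 requests on the first run and none on a rerun, as long as the cache is persisted between runs.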
📜 PaperScraper • Writeup • Repository
Python, PyTest, Poetry, APIs (Reddit, Imgur, Flickr), Git
- A command-line program that lets users scrape wallpapers from Reddit, both from their own account’s saved posts and from specific subreddits
- Uses async downloading with httpx (a requests alternative) and batching to efficiently download wallpapers in a fraction of the time compared to a naive synchronous implementation
- 91% of the codebase is covered by extensive automated unit tests, written with tools like VCR.py, pytest, and GitHub Actions
- My detailed writeup, The Worst Project I Ever Finished, discusses multiple massive refactors over several years (now over 500 commits!) across industry-standard technologies including Poetry, uv, pre-commit, and Postman as well as the design patterns I iterated over, and all the lessons I learned for avoiding pitfalls in future projects
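The speedup from async downloading comes from keeping several requests in flight at once. A minimal sketch of that pattern, using only `asyncio` from the standard library: `download_all`, `fetch`, and `max_concurrent` are hypothetical names, and `fetch` stands in for whatever coroutine performs the actual HTTP request.

```python
import asyncio


async def download_all(urls, fetch, max_concurrent=8):
    """Run fetch(url) for every URL with at most max_concurrent requests in flight."""
    limiter = asyncio.Semaphore(max_concurrent)

    async def guarded(url):
        # The semaphore blocks here once max_concurrent downloads are running.
        async with limiter:
            return await fetch(url)

    # gather returns results in the same order as the input URLs.
    return await asyncio.gather(*(guarded(url) for url in urls))
```

With httpx, `fetch` could be a small coroutine that awaits `client.get(url)` on a shared `httpx.AsyncClient`; the semaphore keeps the program from opening hundreds of connections at once.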
1️⃣ There Can Only Be One • Repository
Rust, blake3, Git
- A program which finds duplicated files in a specified path, optimized to rule out or find duplicates over multiple passes
- First, all files in the directory are bucketed by file size, then a small amount of each file (based on the disk’s block size) is hashed using the blake3 hashing algorithm to quickly and inexpensively identify true negatives that can be ruled out
- In the second pass, files whose partial hashes collided in the first pass are hashed in full. Any files whose full hashes still match are reported together as a group of duplicates
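The multi-pass approach can be sketched in a few lines. This is an illustrative Python version, not the project's Rust code, and it makes two loud assumptions: it swaps in `hashlib.blake2b` from the standard library as a stand-in for blake3, and it hard-codes a 4096-byte prefix instead of querying the disk's block size. The names `hash_file`, `group_by`, and `find_duplicates` are hypothetical.

```python
import hashlib
import os
from collections import defaultdict

BLOCK_SIZE = 4096  # assumed prefix length; the real tool bases this on the disk


def hash_file(path, limit=None):
    """Hash a file with blake2b; if limit is set, hash only the first `limit` bytes."""
    digest = hashlib.blake2b()
    remaining = limit
    with open(path, "rb") as handle:
        while remaining is None or remaining > 0:
            size = 65536 if remaining is None else min(65536, remaining)
            chunk = handle.read(size)
            if not chunk:
                break
            digest.update(chunk)
            if remaining is not None:
                remaining -= len(chunk)
    return digest.hexdigest()


def group_by(paths, key):
    """Bucket paths by key(path), keeping only buckets with more than one file."""
    buckets = defaultdict(list)
    for path in paths:
        buckets[key(path)].append(path)
    return [group for group in buckets.values() if len(group) > 1]


def find_duplicates(root):
    """Return groups of files under root that have identical contents."""
    paths = [
        os.path.join(folder, name)
        for folder, _, names in os.walk(root)
        for name in names
    ]
    duplicates = []
    for same_size in group_by(paths, os.path.getsize):  # cheapest check first
        prefix_key = lambda path: hash_file(path, BLOCK_SIZE)
        for same_prefix in group_by(same_size, prefix_key):  # partial hash
            duplicates.extend(group_by(same_prefix, hash_file))  # full hash
    return duplicates
```

Each stage only examines files that survived the previous one, so most files are ruled out by a size comparison or a single small read and never need to be hashed in full.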
📻 nprcore.me • Repository
Vue.js, JavaScript, Spotify API, Git
All Things Considered tweeting their nprcore result
- A joke website I pair-programmed with a friend that compares users’ Spotify activity with National Public Radio’s recommended songs; I designed the algorithm that estimated user scores before we had enough user data to build an empirical cumulative distribution function for rating music tastes
- The site was recognized by two official NPR Twitter accounts, All Things Considered and NPR Interns, and got over 5,000 hits in 24 hours
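For readers unfamiliar with the term, an empirical cumulative distribution function rates a new score by the fraction of previously observed scores at or below it. A minimal sketch follows; the function name `ecdf_percentile` and the sample scores are hypothetical, and the site's actual scoring algorithm may work differently.

```python
from bisect import bisect_right


def ecdf_percentile(sorted_scores, score):
    """Return the fraction of observed scores less than or equal to `score`."""
    if not sorted_scores:
        raise ValueError("need at least one observed score")
    # bisect_right counts how many sorted values are <= score.
    return bisect_right(sorted_scores, score) / len(sorted_scores)


observed = sorted([10, 20, 20, 30, 40])  # made-up raw scores from earlier users
rating = ecdf_percentile(observed, 20)   # 3 of the 5 observed scores are <= 20
```

Rating users this way reports where a score falls relative to everyone else rather than as a raw number, which is why it requires real user data first.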
🐦‍⬛ Blue Raven • Repository
Rust, anyhow, regex, Git
- Automates pairing Bluetooth devices on machines that dual-boot Windows and Linux
- Discovers Windows mounts and parses multiple Bluetooth files and pairing formats to provide comprehensive support across a wide variety of device types
🏃‍♀️ KLIP • Repository
Django, SQL, Spoke API, Git
- A proof-of-concept app for a wearable device that allows runners to alert trusted contacts or emergency services if they feel unsafe while running
- Developed with a team of three other programmers at Northeastern University Generate; besides taking ownership of my part of the Django backend, I led several weekly stand-ups
👨‍💻 samlowe.dev • Repository
Jekyll, Markdown, HTML, CSS, Liquid, GitHub Actions
- The website you’re on right now, built with the Jekyll static site generator and a customized version of the Chirpy theme
- I commit to a (not so secret) private repo, which a GitHub action automatically builds and deploys to the public repository; the site is then hosted publicly via Porkbun
📐 GitHub Template Repositories
- For standard Rust projects, featuring pre-commit and anyhow
- For standard Python projects, featuring uv, pre-commit, PyTest, and GitHub actions
- For Jupyter notebooks, featuring uv, TensorFlow, (geo)pandas, scikit-learn, seaborn, etc.
📝 Explainers and Articles
- A Survey of Intermediate Python Features, which I wrote to help new Python programmers bridge the gap to more advanced parts of the language
- My undergraduate thesis on Elliptic Curves and the Birch and Swinnerton-Dyer Conjecture, written in LaTeX. It also covers some brief SageMath/Python I wrote to check elliptic curve data based on findings from the L-functions and modular forms database
- A short introduction I wrote on greedy algorithms
- Advice for learning mathematics
- Advice for learning how to program
- A list of GitHub repositories I think are cool
- Advice for learning data science (coming soon!)