Projects
This page lists projects on which I have worked.
Personal Projects
My guiding principles for building personal projects are:
- Build something that I will use myself;
- Open source code where appropriate;
- Write documentation on how to get started with my open source code; and
- Write a blog post announcing larger projects.
I like to experiment with new programming languages, especially for small, limited-scope projects.
Computer Vision
- VisionScript: An abstract programming language for computer vision.
- VisionScript features a programming language, a REPL, a cloud Notebook environment, and a cloud deployment solution integrated with the CLI and Notebook environment.
- VisionScript Documentation
ML and AI
- James Bot: An AI chatbot that references my blog, wiki, and other sources to answer questions. Powered by ChatGPT.
- personal-notebooks: Notebooks for personal experiments with machine learning and computer vision.
- swifties.me: Find how similar your voice is to Taylor Swift (WIP).
- Uses demucs for vocal isolation
- Offers two algorithms:
- SpeechBrain speaker verification
- ImageBind for audio embeddings on which similarity scores are computed
- soundbites.wtf: Compete to make a sound closest to the prompt of the day.
- Uses CLAP by LAION AI for sound comparison
- I, Spy: I, Spy mixed with a scavenger hunt. Take a photo and you'll get a label, "Warmer" or "Colder", showing how close your photo is to the prompt of the day. When you take a photo of the prompt of the day (e.g. a cat), you win!
Libraries
- indieweb-utils: A Python library with over a dozen functions useful for building IndieWeb and publishing tools, including implementations of parts of the W3C IndieAuth and Webmention specifications and the Post Type Discovery W3C Note. Functions include:
- Original post discovery implementation
- Discover feeds on a web page
- Get representative h-card on a page
- Page name discovery implementation
- Authorship discovery
- Endpoint discovery
- Get post reply URLs
- URL canonicalization
- Add hashtags and person tags to a string
- Remove URL tracking parameters
- Slugify a URL
- Generate reply contexts
- Discover IndieAuth endpoints
- Retrieve RelMeAuth links
- Retrieve a h-app item
- Discover Webmention endpoint
- Send a Webmention
- Discover a Trackback endpoint
- Send and validate trackbacks
- Really Simple Discovery (RSD) implementation
- Reduce size of a given image
- Process a Salmention
- Paginate a sequence
- pyatproto: Abstract Python functions for engaging with BlueSky and AT Protocol implementations.
- getsitemap: Retrieve URLs from a sitemap. Recursive retrieval supported.
- getsitemapurls.com: A web version of getsitemap. Export sitemap URLs to CSV.
- pysurprisal: Calculate surprisal for words in text.
- mf2py (maintainer): Python microformats2 parsing library
- wrote library documentation
- helped to manage Python 2 deprecation
- modernized test suite
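To illustrate the idea behind the pysurprisal entry above: the surprisal of a word is usually defined as the negative base-2 logarithm of its probability, so rarer words score higher. The sketch below is not pysurprisal's API. The function name `word_surprisals` is hypothetical, and estimating probabilities from the text itself is an assumption made to keep the example self-contained.

```python
import math
import re
from collections import Counter


def word_surprisals(text: str) -> dict[str, float]:
    """Return the surprisal, in bits, of each distinct word in `text`.

    Assumption for this sketch: word probabilities are estimated from the
    text itself rather than from a larger reference corpus.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = sum(counts.values())
    # surprisal(w) = -log2(p(w)), where p(w) = count(w) / total words
    return {word: -math.log2(count / total) for word, count in counts.items()}


if __name__ == "__main__":
    scores = word_surprisals("the cat sat on the mat the end")
    # Print the most surprising (rarest) words first.
    for word, bits in sorted(scores.items(), key=lambda item: -item[1]):
        print(f"{word}: {bits:.2f} bits")
```

With a larger reference corpus the same formula applies; only the source of the word counts changes.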
Specification Implementations
- Cinnamon: An implementation of the Microsub draft specification.
- Micropub: An implementation of the W3C Micropub specification used to post content on websites.
- Trackback Server: A front-end using the Trackback functionality built into indieweb-utils.
- Webmention Receiver: An implementation of the W3C Webmention specification to send and receive Webmentions.
- IndieAuth Server: Authenticate with a website using the IndieAuth protocol.
- Salmention: A playground for experimenting with the Salmention protocol.
- WebSub: An implementation of the W3C WebSub specification.
JavaScript Utilities
- commandk.js: A script to enable Command + K search on a website.
- highlight.js: Inline text highlights for web pages. Also available as a browser extension.
- seasonal.js: Change an emoji on your website for different seasonal events.
- fragmention.js: An implementation of the Fragmention specification in JavaScript.
- hovercard.js: A script to load cards when you hover over a link in an article.
- darkmode.js: Trigger dark mode and light mode on your website.
- spa.js: Turn a website into a Single-Page Web App (SPA). Not finished.
- linkaside.js: Display cards for all of the outgoing links on a web page.
Web Utilities
- bsky.link: Generate shareable, embeddable links for Bluesky posts.
- mf2.link: Generate shareable, embeddable links for Mastodon and other posts, marked up with mf2.
- linguist.link: Calculate NLP insights on an article (reading time, most surprising words, most common bigrams, and more).
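Two of the insights named in the linguist.link entry, reading time and most common bigrams, can be sketched with the Python standard library. This is an illustration only, not the code behind linguist.link. The function names, the tokenization rule, and the 200 words-per-minute reading speed are all assumptions.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def reading_time_minutes(text: str, words_per_minute: int = 200) -> float:
    """Estimate reading time; 200 wpm is an assumed average, not a standard."""
    return len(tokenize(text)) / words_per_minute


def most_common_bigrams(text: str, n: int = 3) -> list[tuple[tuple[str, str], int]]:
    """Return the `n` most frequent pairs of adjacent words with their counts."""
    words = tokenize(text)
    bigrams = zip(words, words[1:])  # pair each word with the word after it
    return Counter(bigrams).most_common(n)


if __name__ == "__main__":
    sample = "good coffee is good coffee"
    print(most_common_bigrams(sample, n=1))
    print(reading_time_minutes(sample))
```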
Documentation
- Best Practices for Packaging Python Projects: An e-book documenting best practices for packaging Python projects.
Misc.
Python
- IndieWeb Search: A search engine for the IndieWeb community. Indexed over 410,000 documents at peak.
- Elasticsearch used for storing data
- Back-end API for interfacing with Elasticsearch
- Front-end contains logic for parsing various microformats to return featured snippets
- Link graph analysis for calculating weights for ranking
- Custom-built crawler. Algorithms for:
- Identifying thin content
- Parsing link headers
- Discovering new content
- Validating whether a URL is eligible to be crawled
- Identifying canonical links
- Suspending crawling if a target server slows down notably
- Filtering nofollow links
- And more
- Microformats to Mediawiki: Turn documents marked up with microformats2 into MediaWiki markup. Posts the MediaWiki markup to a wiki instance.
- Novacast: Internal linking API powered by embeddings.
- Semantic Search with CLIP: Simple script that uses CLIP to enable semantic search on a directory of images.
- HyperText Coffee Pot: A Python implementation of the HyperText Coffee Pot Control Protocol.
- markdown-revision-extension: A proposed extension to markdown used for making inline revisions to text.
- index: A tool to create an index for my blog content using NLP.
- maps.webtools.garden: An aggregate map generator for use with microformats data.
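One of the crawler steps listed under IndieWeb Search above is filtering nofollow links. A minimal sketch of that step follows, using only the standard library. It is not the IndieWeb Search crawler's code: the class and function names are hypothetical, and it assumes "nofollow" is detected from the `rel` attribute of `<a>` tags only.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FollowableLinkParser(HTMLParser):
    """Collect links from <a> tags whose rel attribute lacks "nofollow"."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return  # skip anchors with a missing or empty href
        # rel is a space-separated token list; compare case-insensitively
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            return
        # Resolve relative links against the page URL.
        self.links.append(urljoin(self.base_url, href))


def followable_links(html: str, base_url: str) -> list[str]:
    """Return absolute URLs of links a crawler may follow on this page."""
    parser = FollowableLinkParser(base_url)
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    page = '<a href="/a">A</a> <a rel="nofollow" href="/b">B</a>'
    print(followable_links(page, "https://example.com/"))
```

A fuller crawler would likely also honour page-level `robots` meta tags and `X-Robots-Tag` headers, which this sketch ignores.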
Perl
- avtr.dev: Given a URL or email address, retrieve an avatar.
- Convert h-resume to man Document: Convert a h-resume document to a Linux man page.
- Calendar Generator: Generate a .ics file for Google algorithm updates or IANA KSK signing ceremonies.
- IndieWeb Etherpad Archiver: A tool for archiving IndieWeb event and Etherpad documents to the community wiki.
- James' Search v2: A search engine implemented with Perl.
Ruby
- MediaWiki Sparkline Generator: Generate sparklines showing the number of contributions made by a contributor on a MediaWiki instance.
- markdown-html-link-rot: A script to substitute invalid links in markdown and HTML with a link to an Internet Archive backup.
- Microsub OPML Utilities: Import OPML files into a Microsub server and export Microsub subscriptions to an OPML file.
- Static Site Webring: A webring for static websites, built with Ruby and Sinatra.
- Planet: An aggregator that shows new posts from tech blogs I follow. Built with Ruby and Sinatra.
JavaScript
- A front-end for plotting saved GeoJSON data onto OpenStreetMap maps using leaflet.js.
- Screenshots: A Node.js wrapper around puppeteer to retrieve screenshots of web pages.
- Spontaneity RSS Feed: RSS Feed for @telepathics' Spontaneity Generator.
- airport-pianos: The airportpianos.org website.
- stories.js: A HTML component that enables stories on your personal website.
- coffeerecipes.co source code: The coffeerecipes.co website.
Lisp
- lispDOT: A DOT DSL implemented in Common Lisp.
- Lisp HTML Generator: A simple HTML generator written in Lisp.
- Lisp Interpreter: A Lisp interpreter built in Python.
- Lisp Bigrams: A tool to retrieve and display the most commonly used bigrams in a text.
Go
- DNS Experiments: A DNS server using Go and the Go dns library.
- go-robotstxt-parser: A simple robots.txt parser written in Go.
- go-uptime-monitor: A Go script to poll websites and send an email if they are down.
One-offs
- Crackle Pop Implementation - Lisp
- Generate Checkins - Ruby
- Keybow Keyboard Macros - Lua
- Parse nginx logs - Bash (no longer used)
Wikis
- Breakfast and Coffee: A decentralized wiki for sharing coffee shop and breakfast spot recommendations.
- James Wiki: A personal wiki.
Web
- jamesg.blog: Published over 450 blog posts, built with a custom SSG. Features include:
- Custom coffee shop maps (https://jamesg.blog/coffee/maps)
- Date archive pages
- A coffee ratio calculator (https://jamesg.blog/ratio/)
- Live streaming on the home page
- bskyemoji.com: An analysis of Bluesky to find the most commonly-used emojis.
- letsjam: An archived version of the SSG behind jamesg.blog.
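The arithmetic behind a coffee ratio calculator like the one listed under jamesg.blog above is simple enough to sketch. The code below is an illustration, not the calculator's implementation. The function names and the default 1:16 coffee-to-water ratio are assumptions.

```python
def water_for_coffee(coffee_grams: float, ratio: float = 16.0) -> float:
    """Return grams of water for a 1:`ratio` coffee-to-water brew."""
    if coffee_grams < 0 or ratio <= 0:
        raise ValueError("coffee_grams must be >= 0 and ratio must be > 0")
    return coffee_grams * ratio


def coffee_for_water(water_grams: float, ratio: float = 16.0) -> float:
    """Return grams of coffee needed to brew `water_grams` at 1:`ratio`."""
    if water_grams < 0 or ratio <= 0:
        raise ValueError("water_grams must be >= 0 and ratio must be > 0")
    return water_grams / ratio


if __name__ == "__main__":
    print(water_for_coffee(15))   # water for 15 g of coffee at 1:16
    print(coffee_for_water(240))  # coffee for 240 g of water at 1:16
```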
IndieWeb Events
Professional Projects
Roboflow
- Contributed to the following projects:
- Started the following autodistill modules, and maintain them with the team:
- Co-created the following autodistill modules:
- CVevals: Evaluate the performance of computer vision models and prompts for zero-shot models (Grounding DINO, CLIP, BLIP, ImageBind, models hosted on Roboflow).
- Roboflow Collect: Passively collect images for computer vision datasets on the edge.
- PolygonZone: A web application that lets you calculate the coordinates for polygons and lines on an image.
- Homepage Inference Widget: Run inference on Microsoft COCO in your browser using your webcam; query three other computer vision models.
- SXSW Scavenger Hunt: A PWA scavenger hunt run for SXSW 2023. The primary gameplay interface used a computer vision model to track when you identified items on your scavenger hunt list. Over 200 players participated during the roughly one week in which the app was active.
- roboflow-tea-detector: A detector to track my caffeine intake.
- Roboflow Models v2: Upgraded version of the Roboflow Model Directory.
- How to Augment: How to Augment directory (programmatic)
- How to Label: How to Label directory (programmatic)
- How to Deploy: How to Deploy directory (programmatic)
- Autodistill Tutorials: Autodistill directory (programmatic)
- Created front-ends for Research Directory, Utilities Directory, Roboflow Templates.
- Actively contributing blog posts and editing posts from contributors. Posts I have written:
- How to Use Roboflow Models in CVAT
- How to Crop Computer Vision Model Predictions
- 5 Hobbyist Computer Vision Project Ideas
- Studying Links Between Litter and Socio-Economic Factors with Computer Vision (write up of a user's project)
- How to Use the Roboflow License Plate Detection API
- What is Image Classification? A Guide for Beginners
- How to Use the Roboflow Bird Detection API
- How to Use the Roboflow Fish Detection API
- How to Draw a Bounding Box for Computer Vision with Python
- Generate Image Augmentations with Roboflow
- How to Save Computer Vision Predictions to a Google Sheet
- Announcing the Roboflow SXSW Scavenger Hunt
- How to Blur a Bounding Box in Python
- How to Use the Roboflow People Detection API
- Launch: Use Universe Models for Label Assist and Training
- How to Deploy a YOLOv8 Model Using Roboflow and Repl.it
- How to Deploy a YOLOv8 Model to a Raspberry Pi
- Launch: Calculate Polygon Coordinates with PolygonZone
- What is a Confusion Matrix? A Beginner's Guide.
- How to Send Roboflow Model Predictions to Zapier Webhooks
- Narrate the Contents of a Room with Computer Vision
- How to Count Objects in an Image Using Python
- Monitoring My Caffeine Intake with Computer Vision
- Launch: Deploy YOLOv8 with Roboflow
- How to Deploy YOLOv5 Models with Roboflow
- Launch: Version, Export, and Train Models in the Roboflow Python Package
- How to Draw a Bounding Box Prediction Label with Python
- What is ImageBind? A Deep Dive
- Collect Images at the Edge with Roboflow Collect
- Compare Prompts for Zero-Shot Vision Detection
- How to Evaluate Computer Vision Models with CVevals
- Top 5 Use Cases for Segment Anything Model (SAM)
- How to Build a Semantic Image Search Engine with Roboflow and CLIP
- Use CLIP Zero-Shot Classification with the Roboflow Inference Server
- From Idea to Reality: Building a Computer Vision Scavenger Hunt for SXSW
- How to Detect, Monitor and Correct Computer Vision Data Drift
- How to Identify Mislabeled Images in Computer Vision Datasets
- What is DINOv2? A Deep Dive
- How to Classify Images with DINOv2
- How to Train a YOLOv8 Classification Model
- How to Count Objects in a Zone
- Distill Large Vision Models into Smaller, Efficient Models with Autodistill
- How to Evaluate Autodistill Prompts with CVevals
- How to Deploy a Roboflow Model to Lens Studio
- Comparing AI-Labeled Data to Human-Labeled Data
- Train a Segmentation Model with No Labeling
- Train an Image Classification Model with No Labeling
- How to Build a Defect Detection System
- How to Use LabelMe: A Complete Guide
- Announcing Roboflow Train 3.0
- How to Analyze and Classify Video with CLIP
- How to Build a Photo Memories App with CLIP
CK
- Internal Linking API: API powering content recommendations for over 6,000 documents deployed on a site with 500k monthly visitors.
- Static Site Generator: SSG that generated 6k blog posts from WordPress into a web page that scored green on all Core Web Vitals for desktop and mobile devices.
- Crawl Log Analysis: Used Jupyter Notebook to analyze crawl logs from AWS Athena. Merged crawl logs with NLP analysis and data from Google Analytics and Search Console to diagnose SEO issues and track performance changes to content after an SEO optimization process.
- Contributed 800+ blog posts.
- Data analysis scripts for the State of the Bootcamp Market reports.
- Wrote:
External Writing
Barista Magazine
- A Future for Packaging With Manifesto Coffee
- The Unique Police Box Cafés of Edinburgh, Scotland
- Edinburgh coffee roasters city guide (in print)
Sprudge
Steampunk Coffee
- Tasting pour-over coffee at different points of extraction
- Why do a coffee cupping at home?
- Coffee Filtration: A Guide
- Five Ways to Make Coffee at Home
- Brewing with the Aeropress in the Park
- An Aeropress glossary
- Lessons from a home coffee cupping
- How Do I Start Brewing Coffee at Home?
- Comparing the Kalita Wave and the V60
- How to Read a Coffee Label
- Make cold brew at home with NO fancy equipment
- How to Make a Cappuccino at Home (without an espresso machine)
- A Pour-Over Brewing Glossary
- My Experience Cupping Coffee with Steampunk
- Thoughts on the regular Aeropress method
- My Experience with the Aeropress
Coffee People
- Story featured in print edition