Sr Machine Learning Engineer - Reddit (June 2021 - October 2022)
I worked on Reddit's User Understanding team, whose main task was to build features for use in models, primarily recommendation models. I created specific features and established patterns for aggregating content features to the user level and for creating user embeddings. This work covered both batch and streaming pipelines.
User interests: Aggregates content labels to the user level. The project included filtering NSFW content, grouping labels, and applying decay. I designed and implemented it. The batch feature was built with an Airflow scheduler calling BigQuery scripts; the streaming feature was built with Flink. I further built a user-to-subreddit mapping using Annoy approximate nearest neighbors.
Streaming, Nearest Neighbor, SQL / Database, Clustering, Workflow manager, Design / Arch
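As an illustration only, the user-level aggregation described above might be sketched as follows. This is not the production implementation: the event tuple shape, the `label_groups` mapping, the function name, and the exponential half-life decay are all assumptions chosen for the example.

```python
from collections import defaultdict


def user_interest_scores(events, label_groups, now_day, half_life_days=30.0):
    """Aggregate content labels to per-user interest scores.

    Hypothetical sketch: `events` is an iterable of
    (user_id, label, is_nsfw, event_day) tuples, and the exponential
    half-life decay is an assumed decay scheme.
    """
    scores = defaultdict(lambda: defaultdict(float))
    for user_id, label, is_nsfw, event_day in events:
        if is_nsfw:
            continue  # drop NSFW content before aggregating
        group = label_groups.get(label, label)  # map label to its group
        age_days = now_day - event_day
        weight = 0.5 ** (age_days / half_life_days)  # older events count less
        scores[user_id][group] += weight
    return {user: dict(groups) for user, groups in scores.items()}


# Toy data, invented for the example.
events = [
    ("u1", "nba", False, 100),
    ("u1", "soccer", False, 70),
    ("u1", "adult", True, 100),
    ("u2", "python", False, 40),
]
label_groups = {"nba": "sports", "soccer": "sports", "python": "programming"}
scores = user_interest_scores(events, label_groups, now_day=100)
```

Here the two sports labels for `u1` collapse into one group, the NSFW event is skipped, and the 60-day-old event for `u2` is weighted at a quarter of a fresh one.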
Some substantial projects may be excluded due to proprietary information.