James
Cornell University, College of Engineering
May 2022 | Ithaca, NY
- Bachelor of Science, Computer Science
Westford Academy
June 2019 | Westford, MA
- Honors & Awards: National AP Scholar | Spanish Honor Society | AP Chemistry Award
Skills
- Programming Languages
- Python
- SQL
- Java
- OCaml
- Scala
- C#
- Platforms
- Databricks
- Unity
- OpenSCAD
- Logisim
- Tableau
- Libraries
- Spark MLlib
- scikit-learn
- NumPy
- pandas
- SciPy
- Stats
Relevant Experience
Databricks Contractor
Contractor
May 2019 - August 2019 | Remote
Work Included:
- Machine learning use cases
- Creating training content related to Databricks, Machine Learning, and Apache Airflow
Databricks Contractor
Contractor
May 2020 - August 2020 | Remote
Work Included:
- Expanding machine learning educational materials and solution accelerators
- Helping with projects related to machine learning and statistical use cases
- Testing and developing initial training materials for MLflow, which has become an open-source standard for organizing, training, and deploying machine learning models
Databricks Internship
Intern
May 2021 - August 2021 | Remote
Work Included:
- Creating educational content for continually evolving machine learning use cases
- Researching and developing material on machine learning in production and drift monitoring
- Helping create content to educate customers on drift monitoring
- Creating training content for MLflow and advanced MLflow techniques
- Showcasing and creating demos for new Databricks features
- Creating training material for the Microsoft machine learning team
- Developing a SQL dashboard backed by multitask automated jobs that populate Delta tables, with SQL queries driving visualizations that track the traction of Delta-related projects and isolate areas needing development
- Presenting a demo of the dashboard and its use cases to a group
- Working on PyTorch example use cases for scalable deep learning
- Creating educational material for Distributed XGBoost training
Projects
SQL Dashboard
Overview
At a high level, this project produced a dashboard of visualizations that lets a user track the traction of different Delta-related packages and identify issues in Delta-related projects that need developer attention.
Individual Contributions
At an individual level, I was responsible for:
- Writing Python code that uses a variety of APIs to gather information such as PyPI download stats, Maven download information, and GitHub issues and pull requests
- Aggregating the information into Delta tables and organizing the workflow into a multitask job so that it refreshes automatically on a daily basis
- Writing the SQL queries that pull information from the Delta tables and build the visualizations
- Arranging the visualizations into a professional dashboard
- Presenting the dashboard to a group of Delta-focused employees at Databricks
Technical details
The most challenging aspects of this project included:
- Feature engineering the raw API output into the Delta tables
- Setting up the content to refresh automatically
- Presenting the finished dashboard