Programming languages are fundamental tools, giving you the freedom to be as innovative and as impactful as you can imagine; “with software, where there’s a will there’s a way”. I have extensive commercial expertise across all the key languages in data:
Python: Development of production machine learning, data science and data engineering applications in both object-oriented and functional styles, as well as coaching & mentoring juniors on fundamentals and best practices. I have significant experience across most relevant packages & frameworks, including TensorFlow, XGBoost, scikit-learn, SciPy, statsmodels, seaborn, Matplotlib, Plotly, pandas, NumPy, Boto3 & Botocore and PyArrow, as well as less popular packages.
R: Development of bespoke, repeatable analyses & visualisations, as well as coaching & mentoring to up-skill others in R. I have significant experience with the Tidyverse packages (including ggplot2) and forecast, as well as less popular packages.
SQL: Development of bespoke and extensive data warehouses for storing and analysing large volumes of data and serving it back to the business quickly and efficiently. I have significant experience in managing complex logic, procedures and statements that use advanced features such as window functions, as well as query optimisation and most other SQL workflows.
Machine Learning and Data Science is one of the most exciting areas in data and business, and one that I am both professionally and personally passionate about.
My commercial experience in this discipline ranges from developing and deploying bespoke unsupervised machine learning applications to building bespoke customer scoring models that improve customer targeting:
Time-series: The go-to set of models for forecasting univariate data. My experience covers simpler classical methods such as Holt-Winters and ARIMA, as well as more complex methods such as Facebook’s Prophet, NeuralProphet and DeepAR.
Regression: Usually the starting place for most modelling problems, as a simpler, well-crafted regression model often beats a more complex machine learning algorithm and provides the gold standard for explainability. My experience covers complex regression modelling (Multivariate, Poisson, Logistic, Transformed linear, Lasso, Polynomial, Splines) for both predictive and descriptive problems.
Random Forests & Gradient Boosting: Some of the most performant models for describing complex relationships in tabular data. My experience covers the use of random forests and gradient boosted algorithms with XGBoost, LightGBM & scikit-learn.
Neural Networks: The go-to juggernaut for solving some of data’s most complex problems (image classification, machine vision, NLP, etc.). My commercial experience includes using deep RNN & CNN networks for predictive problems on both the TensorFlow & PyTorch frameworks.
Data engineering is the foundation of any business’s data offering. All other functions ultimately rely on the quality and timeliness of their data. My commercial experience in data engineering ranges from developing pipeline applications in the cloud to structuring data warehouses / lakehouses to managing a team of agile engineers.
MLOps: Lifecycle management for machine learning models in production. My experience covers managing training, hyperparameter tuning and inference jobs, and monitoring model accuracy, on AWS SageMaker.
Data Pipelines: Development of pipelines to extract data from APIs & endpoints, then transform and normalise it ready for loading into a data warehouse / lakehouse. My commercial experience includes development & maintenance of pipelines on AWS, including ECS, ECR, Step Functions & Glue.
Data Warehousing / Lakehousing: Development and management of data warehouse / lakehouse architectural models to structure data correctly for business querying. My commercial experience includes development and maintenance of warehouses on PostgreSQL, SQL Server, Redshift and S3.
Containerisation: Development of applications as “dockerised” containers, which makes it possible to take advantage of newer cloud services. My commercial experience covers AWS ECS & Fargate, ECR and SageMaker, with personal experience with Kubernetes.