Bernardo Galvão

ML Engineer

Work Experience

2021-10-01 to Present

MLOps Engineer (Contractor):

Operationalized training cycles and deployments of machine learning models.

Developed Docker Swarm stacks for deploying MLOps services: a model registry (MLflow), databases (Postgres and Redis), a reverse proxy (Traefik) with an authentication service, a workflow orchestrator (Prefect), and monitoring with Grafana and Prometheus.

Designed an on‐prem MLOps solution, built on open‐source software, that provides end‐to‐end services for frictionless training, validation, deployment and monitoring of ML models, including a CI/CD component for ML.

Designed a DVC‐based workflow template that lets Data Scientists develop datasets and models with confidence, with unit tests integrated into a GitLab CI/CD pipeline.

Developing a schema validation suite with Great Expectations to test data integrity from a Data Warehouse.

Wrote tests and GitLab CI pipelines to deploy models only once their data and performance have been thoroughly tested. Wrote GitLab CI pipelines for automated deployment of Docker stacks.

Deployed a Kubernetes cluster including Istio Ingress Gateway and Seldon CRDs for model deployment and monitoring.

Improved the performance of data queries and transformations using the Apache Arrow in‐memory columnar format and DuckDB.

2020-01-01 to 2021-07-01

Research Fellow:

Deployed an MLflow server for experiment tracking and experiment data collection.

Conducted research on the performance of feature selection methods applied to radiomics for clinical decision‐making.

Implemented a Target Shuffling robustness check and a Nested Cross‐Validation procedure for use by other team members.

2018-06-01 to 2020-02-01

Data Scientist:

Implemented an object detection viewer and trained a marine‐species object detector via transfer learning.

Automated dataset processing, model training and conversion of TensorFlow models to the TensorFlow.js and TensorFlow Lite formats using Docker, Python and Bash.

Provided a user‐friendly CLI for pulling, configuring and training models from TensorFlow’s Object Detection API.

Implemented a process for cooperative collection, annotation and augmentation of training images.

Analyzed social media activity levels per location, including profiling points of interest on Madeira island via topic modelling of TripAdvisor reviews with Latent Dirichlet Allocation.

Performed data wrangling and analysis to assess performance of low‐cost air quality sensors.

2017-11-01 to 2018-02-01

Software Engineering Intern:

Developed a raycasting prototype to produce an attention heatmap on a 3D object using NumPy and VTK in Python.

Education

2015-09-01 to 2017-06-01

Nova Information Management School (IMS)

MSc: Data Science and Advanced Analytics

2011-09-01 to 2014-12-15

Católica-Lisbon School of Business and Economics

BSc: Economics

Languages
  • Portuguese: Native
  • English: Proficient