Welcome to dennymarcel’s Page!

Denny!

(In case you were wondering, that’s me!)

Math enthusiast with informal experience in coding. I am passionate about developing programmatic solutions that optimize and automate the sorts of processes humans are terribly designed for. The more that part of life is taken care of, the more time we have for the things humans excel at.

Here I showcase the personal projects I managed to bring to life!

Volume adjustment simulator using hands with live video capture

Web app that captures the index finger and thumb positions to simulate hand control of volume, with a volume bar drawn over the video as feedback. This implementation leverages OpenCV (cv2), MediaPipe and Flask.
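As a rough illustration of the idea (not the app's actual code), the distance between the thumb tip and the index fingertip can be mapped linearly onto a 0-100 volume scale. The function names and the `min_dist`/`max_dist` calibration values below are assumptions made up for this sketch; the landmark indices 4 and 8 are the thumb tip and index fingertip in MediaPipe Hands.

```python
import math

# MediaPipe Hands landmark indices for the thumb tip and index fingertip.
THUMB_TIP, INDEX_TIP = 4, 8


def pinch_to_volume(thumb, index, min_dist=0.03, max_dist=0.30):
    """Map the thumb-index distance to a volume between 0 and 100.

    `thumb` and `index` are (x, y) pairs in normalized image coordinates.
    `min_dist` and `max_dist` are hypothetical calibration values.
    """
    dist = math.hypot(index[0] - thumb[0], index[1] - thumb[1])
    ratio = (dist - min_dist) / (max_dist - min_dist)
    # Clamp so fingers closer/farther than the calibrated range saturate.
    return round(100 * min(1.0, max(0.0, ratio)))


def volume_from_landmarks(landmarks):
    """Take a list of 21 (x, y) hand landmarks and return the volume."""
    return pinch_to_volume(landmarks[THUMB_TIP], landmarks[INDEX_TIP])


# Example with made-up coordinates: fingers touching, then spread apart.
closed_hand = [(0.5, 0.5)] * 21
open_hand = list(closed_hand)
open_hand[INDEX_TIP] = (0.9, 0.5)
```

In a live setting the landmark list would come from the hand tracker on every frame, and the returned value would drive both the volume and the height of the on-screen bar.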

Instantaneous translator - Portuguese to Esperanto (!)

Web app that translates from Portuguese to Esperanto instantly after each keystroke. This implementation leverages a pre-trained transformer-based model available via the transformers library, and Flask.

Clinical diagnosis

This implementation is confidential (but the work was published!). I am relying on multidimensional data from clinical exams to develop a model capable of delivering fast, easy and reliable diagnostics for COVID-19. General information at the PoLiVirUS website.

News classification (NLP Project)

This implementation is confidential. In this project, I relied on BERT Portuguese to train a model to classify news articles into one of ten possible categories. The dataset was built from scratch using the web-scraping library Beautiful Soup (I still wonder where that name came from). The trained model was containerized using Docker and turned into a web tool using Google Kubernetes Engine. ~A prototype can be checked here.~ (offline due to the costs incurred). A txt file containing one news text in Portuguese must be supplied. I suggest getting a news article from El País Brasil since the categories are expected to match.

Instacart Market Basket Analysis

IPython Notebook containing my implementation of the apriori algorithm to determine association rules based on the dataset included in a Kaggle competition that goes by the same name. The data consists of over 3 million orders in a grocery store, indexed by user. The orders contain product information and timestamp.
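For readers unfamiliar with apriori, here is a minimal pure-Python sketch of the algorithm on a toy set of baskets. It is not the notebook's code: the function names, the toy orders and the 0.5 support threshold are all assumptions chosen for illustration, and a dataset with millions of orders would need a far more efficient implementation.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain the whole itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {s for s in current if support(s) >= min_support}

    frequent = {}
    k = 1
    while current:
        for s in current:
            frequent[s] = support(s)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into size-k candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]


# Toy baskets (made up for this example).
orders = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"bread", "eggs"},
    {"milk", "eggs"},
]
frequent = apriori(orders, min_support=0.5)
milk_to_bread = confidence(frequent, {"milk"}, {"bread"})
```

With these baskets, every single item and every pair is frequent, while the three-item set is dropped because it appears in only one of the four orders.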

Northeastern SMILE Lab - Recognizing Faces in the Wild

IPython Notebook containing code for my approach to the “Can you determine if two individuals are related?” Kaggle challenge. The dataset contains face photos of individuals grouped by family. The objective is to determine from the photos if two individuals are genetically related.

FIFA 2019 Complete Player Dataset

IPython Notebook containing code for the exploration of the FIFA 2019 Complete Player Dataset from Kaggle. This dataset contains several attributes for all 18k+ FIFA soccer players. I used the technical features, and some physical features of the athletes, to group them into technical profiles. Then I investigated how these profiles were related to the players’ actual positions on the field.

NYC Taxi Fare Prediction

IPython Notebook containing code for my implementation of the NYC Taxi Fare Prediction challenge from Kaggle. This dataset contains loooots of instances for taxi rides, along with features depicting time, pickup/dropoff coordinates and number of passengers. I spent most of my energy on data exploration, data cleaning and feature engineering. It was one of the most fun datasets I have worked with thus far.
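One typical piece of feature engineering for this kind of data is turning the pickup/dropoff coordinates into a trip distance and the timestamp into hour and weekday features. The sketch below is an assumption about what such a step could look like, not the notebook's code; the dictionary keys and the sample ride are hypothetical.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def ride_features(ride):
    """Derive model features from one ride record (keys are assumed names)."""
    pickup = ride["pickup_datetime"]
    return {
        "distance_km": haversine_km(
            ride["pickup_latitude"], ride["pickup_longitude"],
            ride["dropoff_latitude"], ride["dropoff_longitude"],
        ),
        "hour": pickup.hour,
        "weekday": pickup.weekday(),  # Monday is 0
        "passenger_count": ride["passenger_count"],
    }


# A made-up ride for illustration.
sample_ride = {
    "pickup_datetime": datetime(2015, 1, 1, 8, 30),
    "pickup_latitude": 40.75, "pickup_longitude": -73.99,
    "dropoff_latitude": 40.77, "dropoff_longitude": -73.96,
    "passenger_count": 2,
}
features = ride_features(sample_ride)
```

The haversine distance ignores the street grid, so it underestimates the distance actually driven, but it is cheap to compute and is usually a strong predictor of fare.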

Human Activity Recognition Using Smartphones

IPython Notebook containing code for my implementation of the Human Activity Recognition Using Smartphones Data Set. The dataset contains features derived from movement measured by the accelerometer and gyroscope of a smartphone while volunteers were performing six activities. There are 10,299 instances and 561 features for this classification problem. I took this opportunity to explore the Scikit-Learn documentation, first to compare the supervised learning methods, then to tune the best-performing model.

Black Friday Hackathon

IPython Notebook containing code for my implementation of the Black Friday Hackathon by Analytics Vidhya. The hackathon consisted of implementing a regression model to predict Purchase Amount from customer behavior and demographic information. I took this opportunity to delve a little deeper into the tuning process.

Knowledge Discovery and Data Mining Cup 2009

IPython Notebook containing code for my implementation of the 2009 Knowledge Discovery and Data Mining challenge. The challenge comprised implementing a model to classify three behaviors of the clientele, based on a large dataset containing 50,000 instances of 230 variables, of which 190 were numerical and 40 were categorical. The output classes were unbalanced.

Flower Species Predictor

Code to use a Deep Learning approach to predict flower species from images. The project concerns downloading a pre-trained network, defining and training a classifier on a dataset containing 102 flower species and more than 6,500 images, saving the best-performing model, and using this model to make predictions on new images. This project was originally part of the PyTorch Scholarship Challenge Nanodegree Program, developed by Udacity and Facebook and hosted at Udacity.

Next Word Predictor

Code to implement a Shiny app consisting of a “next word” predictor, based on a text collection consisting of blogs, news and Twitter texts. This project was originally part of the Data Science Specialization course, developed by the Johns Hopkins University and hosted at Coursera. (This was one of the most fun challenges I ever took!)
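To give a feel for how a “next word” predictor can work, here is a minimal bigram sketch. It is written in Python rather than R, and it is an assumption about one simple approach, not the method used in the Shiny app; the function names and the tiny corpus are made up for illustration.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_bigrams(corpus):
    """Count, for each word, which words follow it and how often."""
    following = defaultdict(Counter)
    for line in corpus:
        tokens = tokenize(line)
        for prev, nxt in zip(tokens, tokens[1:]):
            following[prev][nxt] += 1
    return following


def predict_next(following, phrase, k=3):
    """Return up to k most frequent words seen after the phrase's last word."""
    tokens = tokenize(phrase)
    if not tokens or tokens[-1] not in following:
        return []  # unseen word: a fuller model would back off to unigrams
    return [word for word, _ in following[tokens[-1]].most_common(k)]


# Tiny made-up corpus standing in for the blogs/news/Twitter texts.
corpus = [
    "I love data science",
    "I love coffee",
    "I love data",
    "data science is fun",
]
model = train_bigrams(corpus)
suggestions = predict_next(model, "We love")
```

A practical predictor would typically use longer n-grams with a backoff strategy for unseen contexts, but the counting idea is the same.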

Forex Analysis Shiny

Code to implement a Shiny app that runs a summary of one or more MyFxBook accounts, enabling risk management decisions.

Mangrove Microbiome Project

Code to investigate taxonomic and functional properties of the microbiome of samples of mangrove sediment, by means of next-generation DNA sequencing technology. This work was published!