(In case you were wondering, that’s me!)
Math enthusiast with informal experience in coding. I am passionate about developing programmatic solutions that optimize and automate the sorts of processes humans are terribly designed for. The more of that part of life is taken care of, the more time we have for the things humans excel at.
Though I am officially a beginner in the field, my boat is confidently sailing towards becoming a data scientist of sorts. I am currently committed to getting some formal education in the field; for an up-to-date history of how this is going, please visit my LinkedIn page. I love working in R, but am getting acquainted with Python as well.
My intention with this humble page is to showcase the personal projects I managed to bring to life!
IPython Notebook containing code for my approach to the “Can you determine if two individuals are related?” Kaggle challenge. The dataset contains face photos of individuals grouped by family. The objective is to determine from the photos if two individuals are genetically related.
Created on July 31, 2019. Last relevant update on July 31, 2019.
IPython Notebook containing code for the exploration of the FIFA 2019 Complete Player Dataset from Kaggle. This dataset contains several attributes for all 18k+ FIFA soccer players. I used the technical features, and some physical features of the athletes, to group them into technical profiles. Then I investigated how these profiles were related to the players’ actual positions on the field.
Created on May 30, 2019. Last relevant update on May 30, 2019.
IPython Notebook containing code for my implementation of the NYC Taxi Fare Prediction challenge from Kaggle. This dataset contains loooots of instances of taxi rides, along with features describing time, pickup/dropoff coordinates and number of passengers. I spent most of my energy on data exploration, data cleaning and feature engineering. It is one of the most fun datasets I have worked with thus far.
Created on April 9, 2019. Last relevant update on April 9, 2019.
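As a rough illustration of the kind of feature that can be engineered from pickup/dropoff coordinates, here is a minimal sketch of the great-circle (haversine) distance between two points. It is not taken from the notebook; the function name `haversine_km` and the mean Earth radius of 6371 km are assumptions made for this example.

```python
import math

# Assumed mean Earth radius in kilometres (illustrative constant).
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (latitude, longitude) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    # Haversine formula: a is the squared half-chord length between the points.
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

A distance like this could be computed for every ride and used as an extra input to a fare model, alongside the time and passenger-count features.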
IPython Notebook containing code for my implementation of the Human Activity Recognition Using Smartphones Data Set. The dataset contains features derived from movement measured by the accelerometer and gyroscope of a smartphone while volunteers were performing six activities. There are 10,299 instances and 561 features for this classification problem. I took this opportunity to explore the Scikit-Learn documentation, first to compare supervised learning methods, then to tune the best-performing model.
Created on March 29, 2019. Last relevant update on March 29, 2019.
IPython Notebook containing code for my implementation of the Black Friday Hackathon by Analytics Vidhya. The hackathon consisted of implementing a regression model to predict Purchase Amount from customer behavior and demographic information. I took this opportunity to delve a little deeper into the tuning process.
Created on March 11, 2019. Last relevant update on March 11, 2019.
IPython Notebook containing code for my implementation of the 2009 Knowledge Discovery and Data Mining challenge. The challenge comprised implementing a model to classify three behaviors of the clientele, based on a large dataset containing 50,000 instances of 230 variables, of which 190 were numerical and 40 were categorical. The output classes were unbalanced.
Created on Dec 27, 2018. Last relevant update on Dec 27, 2018.
Code to use a Deep Learning approach to predict flower species from images. The project involves downloading a pre-trained network, defining and training a classifier on a dataset containing 102 flower species and more than 6,500 images, saving the best-performing model, and using this model to make predictions on new images. This project was originally part of the PyTorch Scholarship Challenge Nanodegree Program, developed by Udacity and Facebook and hosted at Udacity.
Created on Dec 18, 2018. Last relevant update on Dec 18, 2018.
Code to implement a Shiny app with a “next word” predictor, based on a text collection of blogs, news and Twitter texts. This project was originally part of the Data Science Specialization course, developed by Johns Hopkins University and hosted at Coursera.
Created on Oct 19, 2018. Last relevant update on Oct 19, 2018.
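To give a feel for how a “next word” predictor can work, here is a minimal sketch based on bigram counts. It is only an illustration, not the app's actual code: the original project is a Shiny app written in R, and the function names `train_bigrams` and `predict_next`, as well as the whitespace tokenization, are assumptions made for this example.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for each word, how often every other word follows it."""
    model = defaultdict(Counter)
    for sentence in sentences:
        # Naive tokenization: lowercase and split on whitespace (an assumption).
        tokens = sentence.lower().split()
        for prev_word, next_word in zip(tokens, tokens[1:]):
            model[prev_word][next_word] += 1
    return model


def predict_next(model, word, k=3):
    """Return up to k of the most frequent followers of `word`, most frequent first."""
    followers = model.get(word.lower(), Counter())
    return [candidate for candidate, _ in followers.most_common(k)]


corpus = ["I love data", "I love R", "I love data science"]
model = train_bigrams(corpus)
suggestions = predict_next(model, "love")
```

With this toy corpus, “data” follows “love” twice and “r” once, so “data” is suggested first. A more realistic predictor would typically use longer n-grams and some form of back-off for unseen contexts.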
Code to implement a Shiny app that runs a summary of one or more MyFxBook accounts, enabling risk management decisions.
Created on Sep 5, 2018. Last relevant update on Dec 24, 2018.
Code to investigate taxonomic and functional properties of the microbiome of samples of mangrove sediment, by means of next-generation DNA sequencing technology.
Created on Sep 4, 2018. Last relevant update on May 20, 2019.