Thalia Rodriguez


Expert in architecting data-driven solutions, with a strong foundation in analytics, big data management, and the application of AI and Machine Learning technologies.

Committed to enhancing data quality and integrity, driving business insights, and fostering innovation for impactful outcomes.

Experience


Machine Learning Engineer
UK Dementia Research Institute
2021 - 2024
• Improved clinical data quality by 5% and environmental data quality by 30% by integrating data from third-party APIs (EMR, NHS, sensors) and implementing batch ETL workflows for data validation, transformation, and advanced analytics.
• Built a custom production ML system to develop therapies for 100+ patients by engineering sleep-related features and creating models to identify causes of sleep disruption.
• Authored and updated software documentation and technical reports, covering system architectures, algorithms, analytics insights, and research methodologies.


Data Scientist
Consultant
2018 - 2020
• Reduced manual data entry by 90% by designing a website and integrating automated data collection systems.
• Boosted client’s website conversion rates from 2% to 10% through targeted A/B testing strategies.
• Optimized data processing by developing a user-friendly Python interface with a SQL backend, enabling better understanding and tracking of KPIs while reducing query execution time by 20%.


Physics Lecturer
Indiana University Purdue University Columbus
2019 - 2020
• Maintained 100% attendance in classes by engaging students with adaptive teaching methods, receiving an 89% positive evaluation rate.
• Mentored students to improve their grades by an average of 10% across physics, mathematics, and statistics courses through enhanced critical thinking and problem-solving exercises.


Research Assistant
Lab of Computational Biophysics TCU
2013 - 2018
• Created mathematical models in Python (scikit-learn, SciPy) to analyze the effect of novel antivirals in respiratory viral infections, tracking benchmark metrics and reducing experimental research costs.
• Applied statistical techniques (MCMC, bootstrapping) and regression methods (linear, ODE parameter estimation) to extract insights and maximize value from limited data.


Case competitions:


IOWA Tippie Business Analytics Competition 2018
February 2018
• Conducted an in-depth analysis of a large United Airlines customer survey dataset using R and Python. Used machine learning to identify the key factors for improving customer experience.
• Presented Tableau visualizations of my findings, along with recommendations, to a panel of industry experts.


TCU Neeley School of Business IBM Case Competition 2017
August 2017
• Detected areas of opportunity to take advantage of the emergence of AI and presented recommendations to accelerate market share growth.


Integrative Project Simulation TCU Neeley School of Business
August 2017
• Planned and managed a marketing strategy based on the data available in the simulation.
• Counseled my team members in the areas of R&D, supply chain and finance.

Computer skills:


Python:
  • Pandas
  • SciPy
  • NumPy
  • Seaborn
  • Matplotlib
  • NLTK
  • spaCy
  • scikit-learn
  • Statsmodels
Analytics:
  • SAS
  • HuggingFace
  • PyTorch
  • Tableau
  • XGBoost
  • CatBoost
  • R
Data Storage & Control:
  • Git
  • SQL
  • AWS
  • Snowflake
  • Airflow
  • Docker
Web Development:
  • HTML/CSS/JavaScript
  • Flask
  • Streamlit

Other skills:

Logical thinking:

I love puzzles and problem solving. I believe logic and critical thinking are more important than memorization.

Communication:

In college, I was invited to work as a radio host for a music show. Instead, I asked for a segment on a science-oriented show. A couple of times, people have recognized my voice on the street.

Business:

I took MBA courses during my Ph.D. I learned about Business Operations, Marketing, Supply Chain, and Finance.
I also met awesome people.

Education


Texas Christian University and Neeley School of Business

Ph.D. in Physics with Business Option

2013 - 2018

GPA: 3.6

Texas Christian University

M.S. Physics

2013 - 2016

GPA: 3.9

Universidad Autonoma de Zacatecas

B.S. in Physics

2008 - 2013

GPA: 3.1

Certifications:
AWS Cloud Practitioner (CLF-C02)

March 2025

Data Engineer by DataCamp

February 2025

Structuring Machine Learning Projects by deeplearning.ai

February 2020

Machine Learning by Stanford University on Coursera

January 2020

Level 1 Intelligence Analyst on Udemy

April 2019

Machine Learning

I am interested in using Python and technology to improve people's lives, enhance business and create impact.
Over the past few years, I have focused on healthcare.
However, I am curious about other topics as well. For example, in business school, my favorite classes were Marketing and Finance. I enjoy reading about these subjects and thinking about projects.

Recent projects:
Predicting Hotel Booking Conversion Rates in Real Zero-Inflated Data: A Dual-Model Approach

July 2024

Trivago is a meta-search website that enables advertisers to promote accommodations and allows users to compare different prices for the same accommodation. By aggregating offers from various booking sites, Trivago provides users with a comprehensive view of available options, helping them to make informed decisions and find the best deals.
The task was to build a model that predicts conversion rates for hotel bookings across advertising sites, using the provided anonymized real data to make predictions for each advertiser-hotel combination on a specific date, August 11. The challenge was compounded by sparse, zero-inflated data: many advertiser-hotel combinations had no bookings at all.
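Zero-inflated targets like the one described above are often handled with a two-part (hurdle-style) estimate: one part for whether a combination gets any booking at all, and a second part for the conversion rate when it does. The sketch below is a minimal plain-Python illustration of that general idea, not the project's actual code; the function name `fit_two_part_rates`, the `(advertiser_id, clicks, bookings)` record layout, and the `prior_strength` smoothing constant are all assumptions made for the example.

```python
from collections import defaultdict


def fit_two_part_rates(records, prior_strength=1.0):
    """Hurdle-style estimate of the expected conversion rate per advertiser.

    records: iterable of (advertiser_id, clicks, bookings) tuples, one per
    advertiser-hotel combination. The layout is hypothetical.
    """
    records = list(records)
    if not records:
        return {}

    # Global fallbacks, used to smooth advertisers with few or no bookings.
    nonzero = [(clicks, bookings) for _, clicks, bookings in records if bookings > 0]
    global_p = len(nonzero) / len(records)
    global_rate = (
        sum(b for _, b in nonzero) / sum(c for c, _ in nonzero) if nonzero else 0.0
    )

    # Per advertiser: [rows, rows with bookings, clicks and bookings on those rows]
    stats = defaultdict(lambda: [0, 0, 0, 0])
    for advertiser, clicks, bookings in records:
        s = stats[advertiser]
        s[0] += 1
        if bookings > 0:
            s[1] += 1
            s[2] += clicks
            s[3] += bookings

    estimates = {}
    for advertiser, (n, n_nonzero, clicks_nz, bookings_nz) in stats.items():
        # Part 1: smoothed probability that a combination has any booking.
        p_any = (n_nonzero + prior_strength * global_p) / (n + prior_strength)
        # Part 2: conversion rate among combinations that did convert.
        rate = bookings_nz / clicks_nz if clicks_nz else global_rate
        # Expected rate = P(any booking) * rate given a booking.
        estimates[advertiser] = p_any * rate
    return estimates
```

Calling `fit_two_part_rates([("A", 100, 0), ("A", 50, 5), ("B", 80, 0), ("B", 40, 0)])` gives advertiser "B" a small but nonzero estimate even though it has no bookings, because both parts fall back on the global figures. In practice each part would usually be a trained classifier and regressor rather than smoothed counts.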
For more info click here.


ThaliaGPT


June 2024

ThaliaGPT is a Streamlit-based app that streamlines interactions between Thalia Rodriguez and recruiters using OpenAI's advanced Large Language Models (LLMs) and personalized data. It offers an engaging chat experience, allowing users to ask questions about Thalia and receive accurate responses. Users can choose their contact method (Email or LinkedIn) and log their details. The app, deployed on an AWS EC2 instance, ensures scalable, reliable performance. This project demonstrates AI-driven chatbot capabilities and serves as a practical networking tool enhanced by personalized data ingestion. For more info click here.

Other projects:

SmartGram App: Models / App

Code here

Research Publications



Method to determine whether sleep phenotypes are driven by endogenous circadian rhythms or environmental light by combining longitudinal data and personalised mathematical models.

Quantifying the effect of trypsin and elastase on in vitro SARS-CoV infections.

Estimation of viral kinetics model parameters in young and aged SARS-CoV-2 infected macaques.

A comparison of methods for extracting influenza viral titer characteristics.
In this article, our aim is to compare the estimates of different viral titer characteristics using three different approaches. The first approach is the traditional method that uses estimates based on experimentally measured data. The second approach relies on the use of a linear model to fit the viral titer data. The third approach uses an exponential model for the fitting process and the parameters of interest are extracted from there.
Investigating different mechanisms of action in combination therapy for influenza
Here, we use a mathematical model of influenza to model combination treatment with antivirals having different mechanisms of action to measure peak viral load, infection duration, and synergy of different drug combinations.

Interests and Voluntary Work

Apart from coding, I enjoy physical challenges. When I have free time, I go to the gym to practice boxing (I have done it since high school). On weekends, I practice yoga at home.

I enjoy sci-fi movies. On Netflix, I mostly watch crime and mystery shows.

I can't decide if I'm a cat or dog person; I adore both.

I love to talk and learn about science. I spend a lot of time reading about tech.


Machine Learning Engineer, Omdena

• Collaborated on the AI challenge “Analyzing the role of connectivity on economic and human development”. The challenge was hosted by UNDP, and the goal was to build an AI-based solution for identifying the relationship between connectivity and human development indicators (life expectancy, education, and/or per capita income).

November 2020 - January 2021
Mentor at Honors Program IUPUC

Worked closely with a mentee, using statistics and mathematical models to develop biophysics simulations in Python.


Spring 2020
Volunteer at Alice Carlson Enrichment Classes

Developed a program for the enrichment classes at Alice Carlson Elementary School (Fort Worth, TX).
Taught magnetism through an interactive course to third-grade students.


Spring 2016
Science Communicator at Radio Zacatecas

Created science content for the radio show “A Ciencia cierta.” Participated as a host in the live show.


January 2011 - May 2012

Memberships

Women Who Code

2019 - Present

Tech Ladies

2018 - Present

Society for Industrial and Applied Mathematics

2013 - 2017

American Physical Society

2013 - 2017