Matheus Cardoso
@matheuscardoso
Senior Data Scientist building GenAI and governed data pipelines for risk, discovery, and real-time decisions.
What I'm looking for
I’m a Data Scientist focused on turning complex data into reliable decisions—especially where GenAI, privacy, and data governance must work together. I’ve led end-to-end solutions spanning modeling, pipeline engineering, and production-ready delivery.
At Mercado Livre, I led a financial data solution unifying receivables (future quotas, advances, and cash-flow views) to support the Financial Risk team. I also worked on Expected Credit Loss modeling for Mexico, concentrating on banking-regulation compliance with market simulators and expected loss/risk analyses.
My recent GenAI work includes building an advanced chatbot on Gemini and Claude LLMs, with NER-based data anonymization to keep user data privacy-compliant. I implemented a federated learning architecture that combines LLMs with knowledge graphs for confidentiality and personalization, and I built a vector-based Data Discovery Bot (LLM + Faiss + metadata) that generated R$2M in revenue for the business.
I bring strong data-engineering foundations for scale and reliability: real-time classification (e.g., NPS), GCP–AWS pipeline reconciliation, Airflow-based data-quality validation, and Spark pipelines detecting transformer failures in ~1B IoT records/hour to reduce downtime. I’m motivated by building governed, well-modeled systems that teams can trust and reuse.
Experience
Work history, roles, and key accomplishments
Data Scientist
Mercado Livre
May 2025 - Present (1 year)
Led development of a financial data solution unifying receivables (including future quotas), advances, and cash-flow views to support the Financial Risk team’s decision-making. Developed Expected Credit Loss (ECL) modeling for Mexico, including market simulators and expected loss/risk analyses for regulatory compliance.
Data Scientist
Cloudevs
Jan 2024 - May 2025 (1 year 4 months)
Built an advanced chatbot integrating Gemini and Claude LLMs, using NER-based data anonymization for privacy-compliant handling of user data. Implemented federated learning with LLMs and knowledge graphs to preserve confidentiality while improving personalization.
Data Scientist
Tim Brasil
Mar 2023 - May 2025 (2 years 2 months)
Built a vector-based chatbot (LLM + Faiss + metadata) as a Data Discovery Bot that generated R$2M in revenue for the company. Deployed a real-time NPS classifier integrating usage and billing data to support retention strategies.
Data Scientist and Analytics Engineer
Stone Tech
Jan 2022 - Mar 2023 (1 year 2 months)
Developed GCP-to-AWS pipelines for revenue and chargeback reconciliation. Created an Airflow-based data-quality validation system to improve robustness in high-volume pipelines.
Data Scientist
CPFL Paulista
Jan 2021 - Jan 2022 (1 year)
Applied Transformer/BERT models to streamline inspection report analysis. Built a Spark-based model to detect electrical transformer failures across ~1B IoT records per hour, reducing downtime.
Data Scientist
Robert Bosch LTDA
Feb 2019 - Dec 2020 (1 year 10 months)
Developed asset-based analysis models for failure prediction and clustering using IoT data. Built data processing workflows capable of handling over 1B rows of IoT data for downstream analytics.
Education
Degrees, certifications, and relevant coursework
Universidade Federal do Rio de Janeiro (UFRJ)
Master's in Data Science and Analytics
Master's in Data Science and Analytics, with a thesis on applying LLMs to help startups develop business plans using the Socratic method.
Pontifícia Universidade Católica de Minas Gerais (PUC Minas)
Postgraduate Program in Machine Learning and Artificial Intelligence
Postgraduate specialization in Machine Learning and Artificial Intelligence.
Centro Federal de Educação Tecnológica Celso Suckow da Fonseca (CEFET/RJ)
Electrical Engineering (Electronics emphasis)
Electrical Engineering degree with an electronics emphasis.
Tech stack
Software and tools used professionally
Azure Synapse
Apache Spark
AWS Glue
Apache Flink
Metabase
Google Cloud Platform
GitHub
Kubernetes
NumPy
Pandas
PySpark
dbt
MySQL
PostgreSQL
MongoDB
Cassandra
Hadoop
Gmail
Redis
Terraform
Java
TensorFlow
PyTorch
MLflow
scikit-learn
Keras
Kafka
Datadog
Gemini
Airflow
Luigi
SQL
LangChain
Polars
Monte Carlo
Delta Lake
Great Expectations
Faiss
Factory
Remote
Method
Jan
