Xiaotian CHEN
@xiaotianchen
Data analyst with strong statistical modeling, ML, and data engineering skills, focused on turning data into business insights.
What I'm looking for
I am a data analyst with strong training in mathematics, econometrics and data science from Université Paris Cité and Toulouse School of Economics, experienced in building predictive models and end-to-end analytics pipelines.
I have applied machine learning and time-series methods to large datasets. For example, I built a CatBoost purchase-prediction model on 10M+ records using PySpark and improved its recall through feature engineering and oversampling. My projects span causal inference, NLP (LSTM/BERT), customer churn, and BI reporting with Power BI and OLAP.
I combine technical expertise in Python, PySpark, SQL, R and Stata with clear communication and visualization skills to deliver actionable insights and robust, interpretable models for stakeholders.
Experience
Work history, roles, and key accomplishments
Data Scientist
ABC Bank
Jan 2024 - Dec 2025 (1 year 11 months)
Implemented a modeling pipeline using PCA, clustering, discriminant analysis (DA), CART, bagging and random forests with SMOTE to predict customer churn, and evaluated robustness via cross-validation.
BI Analyst
TSE
Jan 2024 - Dec 2025 (1 year 11 months)
Designed a relational database in Microsoft Access, built OLAP cubes with SAP Business Objects, wrote SQL for ETL, and created interactive Power BI dashboards for sales analysis and reporting.
Data Analyst
TSE
Jan 2024 - Dec 2025 (1 year 11 months)
Used R to clean and visualize panel data and applied fixed effects models with interactions and lags to analyze how political instability affects tourism, identifying significant negative effects and regional heterogeneity.
NLP Research Intern
Paris Cité
Oct 2025 - Dec 2025 (2 months)
Developed LSTM- and BERT-based classifiers to predict mental health status from text and applied LIME for local interpretability of model predictions.
Research Intern
Paris Cité
Oct 2025 - Dec 2025 (2 months)
Applied g-computation and bootstrap robustness testing on synthetic data to estimate Average Causal Effects and benchmarked Lasso, Boosted CART, and neural networks to optimize causal analysis and predictive performance.
Data Analysis Intern
Automotive Data of China (CATARC)
May 2025 - Aug 2025 (3 months)
Built a CatBoost purchase prediction model on 10M+ records using Spark, achieving 50% recall via time-series feature engineering and oversampling; developed dynamic scoring with SQL window functions and delivered visualized analytics.
Education
Degrees, certifications, and relevant coursework
Université Paris Cité
Master 2, Mathematics and Applications
2025 - 2026
Master 2 in Mathematics and Applications (Mathématiques et Applications), specializing in mathematical engineering and biostatistics, including coursework in machine learning, deep learning, stochastic algorithms, survival analysis, differential models and nonparametric statistics.
Toulouse School of Economics
Master, Econometrics and Statistics
2024 - 2025
Completed a Master of Econometrics and Statistics (Data Science for Social Sciences) with coursework in R, Julia, Python, SQL, Power BI, Stata, time series, Markov chains and machine learning.
Toulouse School of Economics
Bachelor of Economics, Economics
2021 - 2024
Grade: 15.16/20
Bachelor of Economics covering macroeconomics, microeconomics, econometrics and statistics, with practical experience in R, Python, Excel and SAS; achieved an average grade of 15.16/20.
