Huzaifa Arif


About Me

I am a fifth-year Ph.D. candidate with multiple first-author publications in trustworthy AI. My research focuses on exposing and mitigating privacy and safety vulnerabilities in Large Language Models, through work conducted at IBM Research and Lawrence Livermore National Laboratory (LLNL). I am seeking a full-time Research Scientist position starting December 2025, where I can apply my expertise in LLM alignment and privacy to build verifiably safe AI systems.

Professional Experience

IBM T.J. Watson Research Center - Yorktown Heights, NY

AI Research Extern - Trustworthy AI | Jun 2025–Aug 2025
Mentors: Pin-Yu Chen, Ching-Yun Ko, Keerthiram Murugesan, Payel Das

Lawrence Livermore National Laboratory - Livermore, CA

Data Science Intern | May 2024–Aug 2024
Mentors: Bhavya Kailkhura, James Diffenderfer

IBM T.J. Watson Research Center - Yorktown Heights, NY

AI Research Extern - Trustworthy AI | Jun 2023–Aug 2023
Mentors: Pin-Yu Chen, Keerthiram Murugesan, Payel Das

IBM T.J. Watson Research Center - Yorktown Heights, NY

AI Research Extern - Trustworthy AI | Jun 2022–Aug 2022
Mentor: Pin-Yu Chen

Education

Rensselaer Polytechnic Institute - Troy, NY

Ph.D. in Electrical, Computer, and Systems Engineering | Expected Dec 2025

Lahore University of Management Sciences - Lahore, Pakistan

B.S. in Electrical Engineering

Ongoing Work

Project 1: Parameter-Efficient LLM Safety Alignment: I am developing a lightweight, prefix-based method that steers LLMs toward safe behavior without retraining the entire model. By combining Supervised Fine-Tuning (SFT) with Direct Preference Optimization (DPO), the method instills safety efficiently, mitigates toxicity, and can be adapted to prevent demographic bias and PII leakage. (Under Review)
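
A minimal, self-contained sketch of the general prefix-plus-DPO idea follows (not the method under review): a small set of learnable prefix embeddings is prepended to a frozen backbone and trained with the DPO objective on safety preference pairs. The backbone ("gpt2"), hyperparameters, and toy preference pair are illustrative assumptions.

```python
# Sketch only: illustrative prefix tuning + DPO on a frozen causal LM.
# The model name, hyperparameters, and example pair are placeholders.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder backbone
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
for p in model.parameters():
    p.requires_grad_(False)  # base model stays frozen; only the prefix is trained

prefix_len, beta = 10, 0.1
emb_dim = model.get_input_embeddings().embedding_dim
prefix = torch.nn.Parameter(torch.randn(prefix_len, emb_dim) * 0.02)
opt = torch.optim.AdamW([prefix], lr=1e-3)

def seq_logprob(input_ids, use_prefix):
    """Sum of token log-probs, optionally with the trainable prefix prepended."""
    embeds = model.get_input_embeddings()(input_ids)
    if use_prefix:
        batch_prefix = prefix.unsqueeze(0).expand(embeds.size(0), -1, -1)
        embeds = torch.cat([batch_prefix, embeds], dim=1)
    logits = model(inputs_embeds=embeds).logits
    offset = prefix_len if use_prefix else 0
    logits = logits[:, offset:-1, :]  # positions that predict tokens 1..T-1
    targets = input_ids[:, 1:]
    logps = torch.log_softmax(logits, dim=-1).gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    return logps.sum(dim=-1)

def dpo_step(chosen_ids, rejected_ids):
    """One DPO update of the prefix; the frozen model without the prefix acts as the reference."""
    with torch.no_grad():
        ref_c = seq_logprob(chosen_ids, use_prefix=False)
        ref_r = seq_logprob(rejected_ids, use_prefix=False)
    pi_c = seq_logprob(chosen_ids, use_prefix=True)
    pi_r = seq_logprob(rejected_ids, use_prefix=True)
    loss = -F.logsigmoid(beta * ((pi_c - ref_c) - (pi_r - ref_r))).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Toy safety preference pair: chosen = safe refusal, rejected = unsafe compliance.
chosen = tok("I can't help with that request.", return_tensors="pt").input_ids
rejected = tok("Sure, here is exactly how to do that:", return_tensors="pt").input_ids
print(dpo_step(chosen, rejected))
```

Because only the prefix parameters receive gradients, the alignment update stays far smaller than full fine-tuning and can be swapped per deployment (e.g., one prefix for toxicity, another for PII suppression).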

Project 2: Exposing Association Leakage in LLMs: We study the privacy vulnerability of “association leakage,” in which LLMs can be prompted to reveal sensitive linked information (e.g., a name and its corresponding private data). This work introduces a new post-hoc attack that amplifies such leakage by manipulating attention heads at inference time, defining a new threat model for LLM safety audits. (Under Review)
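
As a rough illustration of the general mechanism (amplifying selected attention heads at inference time), the sketch below scales chosen heads via a forward pre-hook on GPT-2's attention output projection. The layer/head indices, amplification factor, and probe prompt are hypothetical placeholders, not the attack configuration from the paper.

```python
# Sketch only: scale the output of selected attention heads at inference time.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"            # placeholder target model
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

num_heads = model.config.n_head
head_dim = model.config.n_embd // num_heads
amplify = {5: [0, 7], 9: [3]}  # hypothetical {layer: [head indices]} to amplify
scale = 2.0                    # hypothetical amplification factor

def make_hook(heads):
    def hook(module, args):
        # args[0] holds the concatenated per-head outputs (B, T, H*d) just before
        # the attention output projection (c_proj); reshape, scale, flatten back.
        x = args[0].clone().reshape(args[0].size(0), args[0].size(1), num_heads, head_dim)
        x[:, :, heads, :] *= scale
        return (x.reshape(x.size(0), x.size(1), -1),)
    return hook

handles = [model.transformer.h[layer].attn.c_proj.register_forward_pre_hook(make_hook(heads))
           for layer, heads in amplify.items()]

prompt = "The patient record for John Doe lists his account number as"  # illustrative probe
inputs = tok(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=20, do_sample=False,
                     pad_token_id=tok.eos_token_id)
print(tok.decode(out[0], skip_special_tokens=True))

for h in handles:
    h.remove()                 # restore the unmodified model
```

Because the intervention is applied post hoc through hooks, it requires no retraining or weight modification, which is what makes this kind of audit purely an inference-time procedure.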

Publications

1. Reprogrammable-FL: Improving Utility-Privacy Tradeoff in Federated Learning via Model Reprogramming

Conference: IEEE Conference on Secure and Trustworthy Machine Learning, February 2023
Authors: Huzaifa Arif, Alex Gittens, Pin-Yu Chen
🎥 Talk | 💻 Code | 📄 Paper

2. PEEL the Layers and Find Yourself: Revisiting Inference-time Data Leakage for Residual Neural Networks

Conference: IEEE Conference on Secure and Trustworthy Machine Learning, April 2025
Authors: Huzaifa Arif, Keerthiram Murugesan, Payel Das, Alex Gittens, Pin-Yu Chen
📄 Paper

3. Group Fair Federated Learning via Stochastic Kernel Regularization

Journal: Transactions on Machine Learning Research, April 2025
Authors: Huzaifa Arif, Pin-Yu Chen, Keerthiram Murugesan, Alex Gittens
📄 Paper

4. Forecasting Fails: Unveiling Evasion Attacks in Weather Prediction Models

Workshop: AAAI Workshop on AI to Accelerate Science and Engineering
Authors: Huzaifa Arif, Pin-Yu Chen, Alex Gittens, James Diffenderfer, Bhavya Kailkhura
📄 Paper

Patents

Book Chapters

Preprints

DP-Compressed VFL is secure for Model Inversion Attacks

Paper | Code

Additional Research Work

Professional Service

Reviewer Experience

Recent News & Achievements

Contact Information

📧 Email: arifh@rpi.edu | huzaifaarif20@gmail.com
📞 Phone: (518) 961-8482