{"data":{"jobs":{"edges":[{"node":{"frontmatter":{"title":"AI Engineer","company":"Alumbra AI","range":"2026 - Present","technologies":null},"html":""}},{"node":{"frontmatter":{"title":"Research Engineering Intern","company":"Pareto AI","range":"2025 - 2026","technologies":null},"html":""}},{"node":{"frontmatter":{"title":"Machine Learning Engineer","company":"Spring Oaks Capital","range":"2022 - 2025","technologies":null},"html":"<ul>\n<li>Developed and deployed scalable ETL and modeling pipelines using Airflow and Kubernetes for text/call efforts and offer generation, incorporating ranking recommendations and constrained optimization solutions to scheduling problems.</li>\n<li>Implemented CI/CD processes including tests, automated builds, and deployments utilizing AWS ECR, CodeBuild, and GitHub Actions, ensuring seamless and efficient workflow.</li>\n<li>Assisted wrt to cloud infrastructure for the core technology stack, from containerization to resource provisioning and development environments, optimizing performance and scalability.</li>\n<li>Prepared and maintained comprehensive Sigma dashboards and streamlit apps to monitor online performance metrics, providing key stakeholders with actionable insights and facilitating data-driven decision-making.</li>\n</ul>"}},{"node":{"frontmatter":{"title":"Data Scientist","company":"Capital One (DS)","range":"2021 - 2022","technologies":[{"name":null}]},"html":"<ul>\n<li>Engineered and productionalized critical updates to the core codebase, impacting 30MM+ users, through advanced feature engineering, robust data pipelines, unit tests, and custom model architectures.</li>\n<li>Developed sequential recommendation POCs utilizing PyTorch, Huggingface, and Nvidia’s Merlin/Transformers4Rec. 
These innovations were showcased at the Nvidia GTC Fall Summit 2022, highlighting cutting-edge advancements in recommendation systems.</li>\n<li>Co-led and designed a bi-weekly lecture series on Deep Learning and Neural Recommendation, fostering knowledge sharing and upskilling within the team.</li>\n</ul>"}},{"node":{"frontmatter":{"title":"Ph.D. Internships in Data Science & Applied Research","company":"Capital One (Internships)","range":"2020 - 2021","technologies":[{"name":null}]},"html":"<ul>\n<li>Researched and implemented state-of-the-art neural recommendation solutions for adtech challenges, significantly improving ad targeting and engagement.</li>\n<li>Developed scalable and extensible data pipelines in PySpark, leveraging novel data sources to enhance model performance and insights.</li>\n<li>Provided strategic insights and recommendations for integrating neural solutions into production environments, extending the impact of summer projects.</li>\n<li>Conducted advanced research in agent-based modeling and reinforcement learning, contributing to the Center for Machine Learning (C4ML).</li>\n</ul>"}},{"node":{"frontmatter":{"title":"Data Scientist","company":"Buffalo Check LLC","range":"2015 - 2019","technologies":[{"name":null}]},"html":"<ul>\n<li>Cofounded and scaled a successful LLC, delivering innovative advertising solutions to the US military and generating over $2M in revenue.</li>\n<li>Performed detailed quantitative analysis on user engagement, enhancing advertising effectiveness and client satisfaction.</li>\n</ul>"}},{"node":{"frontmatter":{"title":"Analyst and Business Intelligence","company":"Real World Marketing","range":"2016 - 2019","technologies":[{"name":null}]},"html":"<ul>\n<li>Designed and automated interactive dashboards and ad hoc reports, driving data-driven decision-making and improving operational efficiency.</li>\n<li>Integrated and analyzed diverse data sources, using statistical techniques to uncover actionable insights and 
optimize marketing strategies.</li>\n<li>Conducted multivariate analysis and A/B testing, leading to significant improvements in site conversion rates and marketing ROI.</li>\n</ul>"}},{"node":{"frontmatter":{"title":"Optimization Analyst","company":"Voltari","range":"2012 - 2015","technologies":[{"name":null}]},"html":"<ul>\n<li>Conducted analysis of first- and second-click ad performance.</li>\n<li>Performed pricing strategy and optimization analysis.</li>\n<li>Managed a point-of-interest (POI) database via SQL.</li>\n</ul>"}}]},"publications":{"edges":[{"node":{"frontmatter":{"title":"Skill Reuse as Compression in Agentic RL","slug":"/publications/skill-reuse-2026","authors":"Zhikun Xu, Yu Feng, Jacob Dineen, Ben Zhou","date":"2026-01-03T00:00:00.000Z","venue":"Pending NeurIPS 2026","arxiv":null,"googlescholar":null,"semanticscholar":null,"paperurl":null,"code":null,"slides":null,"abstract":"","bibtex":null,"technologies":null},"html":""}},{"node":{"frontmatter":{"title":"Vocabulary Dropout for Curriculum Diversity in LLM Co-Evolution","slug":"/publications/vocab-dropout-2026","authors":"Jacob Dineen, Aswin RRV, Zhikun Xu, Ben Zhou","date":"2026-01-02T00:00:00.000Z","venue":"Pending COLM 2026","arxiv":"https://arxiv.org/abs/2604.03472","googlescholar":null,"semanticscholar":"https://www.semanticscholar.org/paper/6a758138df371e7386112005d64cd31c29cd5039","paperurl":"https://arxiv.org/pdf/2604.03472","code":null,"slides":null,"abstract":"Co-evolutionary self-play, where one language model generates problems and another solves them, promises autonomous curriculum learning without human supervision. In practice, the proposer quickly converges to a narrow distribution of problems that satisfy the reward function. This diversity collapse renders the curriculum uninformative for the solver, stalling the co-evolutionary loop. 
We introduce vocabulary dropout, a random mask applied to the proposer's output logits during both policy training and curriculum generation, as a lightweight mechanism to sustain diversity. The mask is hard and non-stationary, preventing the proposer from locking into fixed token sequences. Training Qwen3-4B and Qwen3-8B on mathematical reasoning via R-Zero, we find that vocabulary dropout sustains proposer diversity across lexical, semantic, and functional metrics throughout training, and yields solver improvements averaging +4.4 points at 8B, with the largest gains on competition-level benchmarks. Our findings suggest that explicit action-space constraints, analogous to the structural role that game rules play in classical self-play, can help sustain productive co-evolution in language. Vocabulary dropout is one simple instantiation of this principle.","bibtex":"@misc{dineen2026vocabularydropoutcurriculumdiversity,\n  title={Vocabulary Dropout for Curriculum Diversity in LLM Co-Evolution},\n  author={Jacob Dineen and Aswin RRV and Zhikun Xu and Ben Zhou},\n  year={2026},\n  eprint={2604.03472},\n  archivePrefix={arXiv},\n  primaryClass={cs.CL},\n  url={https://arxiv.org/abs/2604.03472},\n}\n","technologies":null},"html":""}},{"node":{"frontmatter":{"title":"Mid-training with Self-Generated Data Improves Reinforcement Learning in Language Models","slug":"/publications/midtraining-2026","authors":"Aswin RRV, Jacob Dineen, Divij Handa, Mihir Parmar, Ben Zhou, Chitta Baral, Swaroop Mishra","date":"2026-01-01T00:00:00.000Z","venue":"Pending ICML 2026","arxiv":null,"googlescholar":null,"semanticscholar":null,"paperurl":null,"code":null,"slides":null,"abstract":"","bibtex":null,"technologies":null},"html":""}},{"node":{"frontmatter":{"title":"RECAP: Transparent Inference-Time Emotion Alignment for Medical Dialogue Systems","slug":"/publications/recap-2025","authors":"Adarsh Srinivasan, Jacob Dineen, Muhammad Umar Afzal, Muhammad Uzair Sarfraz, Irbaz B. 
Riaz, Ben Zhou","date":"2026-01-01T00:00:00.000Z","venue":"Clinical NLP Workshop 2026 (Oral)","arxiv":"https://arxiv.org/abs/2509.10746","googlescholar":null,"semanticscholar":"https://www.semanticscholar.org/paper/RECAP%3A-Transparent-Inference-Time-Emotion-Alignment-Srinivasan-Dineen/c0486a9634c8540adba69c818b35be74c2249c00","paperurl":"https://arxiv.org/pdf/2509.10746.pdf","code":null,"slides":null,"abstract":"Large language models in healthcare often miss critical emotional cues, delivering medically sound but emotionally flat advice. This is especially problematic in clinical contexts where patients are distressed and vulnerable, and require empathic communication to support safety, adherence, and trust. We present RECAP (Reflect-Extract-Calibrate-Align-Produce), an inference-time framework that adds structured emotional reasoning without retraining. By decomposing empathy into transparent appraisal-theoretic stages and exposing per-dimension Likert signals, RECAP produces nuanced, auditable responses. Across EmoBench, SECEU, and EQ-Bench, RECAP improves emotional reasoning by 22-28% on 8B models and 10-13% on larger models over zero-shot baselines. Clinician evaluations further confirm superior empathetic communication. RECAP shows that modular, theory-grounded prompting can systematically enhance emotional intelligence in medical AI while preserving the accountability required for deployment.","bibtex":"@misc{srinivasan2025recaptransparentinferencetimeemotion,\n  title={RECAP: Transparent Inference-Time Emotion Alignment for Medical Dialogue Systems}, \n  author={Adarsh Srinivasan and Jacob Dineen and Muhammad Umar Afzal and Muhammad Uzair Sarfraz and Irbaz B. 
Riaz and Ben Zhou},\n  year={2025},\n  eprint={2509.10746},\n  archivePrefix={arXiv},\n  primaryClass={cs.CL},\n  url={https://arxiv.org/abs/2509.10746}\n}\n","technologies":null},"html":""}},{"node":{"frontmatter":{"title":"VisAnalog: A Diagnostic Suite for Visual Concept Transfer on Natural Images","slug":"/publications/visual-analogies-2025","authors":"Zhaonan Li, Kyle R. Chickering, Bangzheng Li, Jacob Dineen, Xiao Ye, Zhikun Xu, Shijie Lu, Yuxi Huang, Ming Shen, Bach Nguyen, Jaya Adithya Pavuluri, Mau Son Nguyen, Sanika Chavan, Ngoc Minh Thu Le, Muhao Chen, Ben Zhou","date":"2025-11-02T00:00:00.000Z","venue":"CVPR Workshop on Visual Concepts (VisCon), 2026","arxiv":null,"googlescholar":null,"semanticscholar":null,"paperurl":null,"code":null,"slides":null,"abstract":"","bibtex":null,"technologies":null},"html":""}},{"node":{"frontmatter":{"title":"Unbiased Visual Reasoning with Controlled Visual Inputs","slug":"/publications/unbiased-visual-reasoning-2025","authors":"Zhaonan Li, Shijie Lu, Fei Wang, Jacob Dineen, Xiao Ye, Zhikun Xu, Siyi Liu, Young Min Cho, Bangzheng Li, Daniel Chang, Kenny Nguyen, Qizheng Yang, Muhao Chen, Ben Zhou","date":"2025-11-01T00:00:00.000Z","venue":"Pending COLM 2026","arxiv":"https://arxiv.org/abs/2512.22183","googlescholar":null,"semanticscholar":"https://www.semanticscholar.org/paper/Unbiased-Visual-Reasoning-with-Controlled-Visual-Li-Lu/22f91c294c3d9ab4ed051beb2a4cec7ff22b9edc","paperurl":"https://arxiv.org/pdf/2512.22183.pdf","code":null,"slides":null,"abstract":"End-to-end Vision-language Models (VLMs) often answer visual questions by exploiting spurious correlations instead of causal visual evidence, and can become more shortcut-prone when fine-tuned. We introduce VISTA (Visual-Information Separation for Text-based Analysis), a modular framework that decouples perception from reasoning via an explicit information bottleneck. 
A frozen VLM sensor is restricted to short, objective perception queries, while a text-only LLM reasoner decomposes each question, plans queries, and aggregates visual facts in natural language. This controlled interface defines a reward-aligned environment for training unbiased visual reasoning with reinforcement learning. Instantiated with Qwen2.5-VL and Llama3.2-Vision sensors, and trained with GRPO from only 641 curated multi-step questions, VISTA significantly improves robustness to real-world spurious correlations on SpuriVerse (+16.29% with Qwen-2.5-VL-7B and +6.77% with Llama-3.2-Vision-11B), while remaining competitive on MMVP and a balanced SeedBench subset.\n","bibtex":"@article{li2025unbiased,\n  title={Unbiased Visual Reasoning with Controlled Visual Inputs},\n  author={Li, Zhaonan and Lu, Shijie and Wang, Fei and Dineen, Jacob and Ye, Xiao and Xu, Zhikun and Liu, Siyi and Cho, Young Min and Li, Bangzheng and Chang, Daniel and others},\n  journal={arXiv preprint arXiv:2512.22183},\n  year={2025}\n}\n","technologies":null},"html":""}},{"node":{"frontmatter":{"title":"Evaluating Medical LLMs by Levels of Autonomy: A Survey","slug":"/publications/medical-llms-autonomy-2025","authors":"Xiao Ye, Jacob Dineen, Zhaonan Li, Zhikun Xu, Weiyu Chen, Shijie Lu, Yuxi Huang, Ming Shen, Phu Tran, Ji-Eun Irene Yum, Muhammad Ali Khan, Muhammad Umar Afzal, Irbaz Bin Riaz, Ben Zhou","date":"2025-10-20T00:00:00.000Z","venue":"arXiv preprint","arxiv":"https://arxiv.org/abs/2510.17764","googlescholar":"https://scholar.google.com/citations?view_op=view_citation&hl=en&user=WKurvcoAAAAJ&citation_for_view=WKurvcoAAAAJ:UebtZRa9Y70C","semanticscholar":"https://www.semanticscholar.org/paper/Evaluating-Medical-LLMs-by-Levels-of-Autonomy%3A-A-to-Ye-Dineen/80b4730c8b96de7273725688552b8547cd2566ee","paperurl":"https://arxiv.org/pdf/2510.17764.pdf","code":null,"slides":null,"abstract":"Medical Large language models achieve strong scores on standard benchmarks; however, the transfer of 
those results to safe and reliable performance in clinical workflows remains a challenge. This survey reframes evaluation through a levels-of-autonomy lens (L0-L3), spanning informational tools, information transformation and aggregation, decision support, and supervised agents. We align existing benchmarks and metrics with the actions permitted at each level and their associated risks, making the evaluation targets explicit. This motivates a level-conditioned blueprint for selecting metrics, assembling evidence, and reporting claims, alongside directions that link evaluation to oversight. By centering autonomy, the survey moves the field beyond score-based claims toward credible, risk-aware evidence for real clinical use.\n","bibtex":"@misc{ye2025evaluatingmedicalllmslevels,\n      title={Evaluating Medical LLMs by Levels of Autonomy: A Survey Moving from Benchmarks to Applications}, \n      author={Xiao Ye and Jacob Dineen and Zhaonan Li and Zhikun Xu and Weiyu Chen and Shijie Lu and Yuxi Huang and Ming Shen and Phu Tran and Ji-Eun Irene Yum and Muhammad Ali Khan and Muhammad Umar Afzal and Irbaz Bin Riaz and Ben Zhou},\n      year={2025},\n      eprint={2510.17764},\n      archivePrefix={arXiv},\n      primaryClass={cs.CL},\n      url={https://arxiv.org/abs/2510.17764}, \n}\n","technologies":null},"html":""}},{"node":{"frontmatter":{"title":"ArenaBencher: Automatic Benchmark Evolution via Multi-Model Competitive Evaluation","slug":"/publications/arenabencher-2025","authors":"Qin Liu, Jacob Dineen, Yuxi Huang, Sheng Zhang, Hoifung Poon, Ben Zhou, Muhao Chen","date":"2025-10-09T00:00:00.000Z","venue":"Pending COLM 2026","arxiv":"https://arxiv.org/abs/2510.08569","googlescholar":null,"semanticscholar":"https://www.semanticscholar.org/paper/ArenaBencher%3A-Automatic-Benchmark-Evolution-via-Liu-Dineen/588205ed40fa03b87bfb4741e69f22a513265b61","paperurl":"https://arxiv.org/pdf/2510.08569.pdf","code":null,"slides":null,"abstract":"Benchmarks are central to measuring 
the capabilities of large language models and guiding model development, yet widespread data leakage from pretraining corpora undermines their validity. Models can match memorized content rather than demonstrate true generalization, which inflates scores, distorts cross-model comparisons, and misrepresents progress. We introduce ArenaBencher, a model-agnostic framework for automatic benchmark evolution that updates test cases while preserving comparability. Given an existing benchmark and a diverse pool of models to be evaluated, ArenaBencher infers the core ability of each test case, generates candidate question-answer pairs that preserve the original objective, verifies correctness and intent with an LLM as a judge, and aggregates feedback from multiple models to select candidates that expose shared weaknesses. The process runs iteratively with in-context demonstrations that steer generation toward more challenging and diagnostic cases. We apply ArenaBencher to math problem solving, commonsense reasoning, and safety domains and show that it produces verified, diverse, and fair updates that uncover new failure modes, increase difficulty while preserving test objective alignment, and improve model separability. 
The framework provides a scalable path to continuously evolve benchmarks in step with the rapid progress of foundation models.","bibtex":"@misc{liu2025arenabencherautomaticbenchmarkevolution,\n  title={ArenaBencher: Automatic Benchmark Evolution via Multi-Model Competitive Evaluation}, \n  author={Qin Liu and Jacob Dineen and Yuxi Huang and Sheng Zhang and Hoifung Poon and Ben Zhou and Muhao Chen},\n  year={2025},\n  eprint={2510.08569},\n  archivePrefix={arXiv},\n  primaryClass={cs.CL},\n  url={https://arxiv.org/abs/2510.08569}\n}\n","technologies":null},"html":""}},{"node":{"frontmatter":{"title":"ThinkTuning: Instilling Cognitive Reflections without Distillation","slug":"/publications/thinktuning-2025","authors":"Aswin RRV, Jacob Dineen, Divij Handa, Md Nayem Uddin, Mihir Parmar, Chitta Baral, Ben Zhou","date":"2025-08-11T00:00:00.000Z","venue":"EMNLP 2025","arxiv":"https://arxiv.org/abs/2508.07616","googlescholar":null,"semanticscholar":"https://www.semanticscholar.org/paper/ThinkTuning%3A-Instilling-Cognitive-Reflections-Rrv-Dineen/bd882244d2d84d7a455fdd1af5198f8fbdcdd228","paperurl":"https://arxiv.org/pdf/2508.07616.pdf","code":"https://github.com/3rdAT/ThinkTuning","slides":"/slides/ThinkTuningSlides.pdf","abstract":"Recent advances in test-time scaling have led to the emergence of thinking LLMs that exhibit self-reflective behaviors and multi-step reasoning. While RL drives this self-improvement paradigm, a recent study (Gandhi et al., 2025) shows that RL alone does not truly instill these new reasoning abilities - it merely draws out behaviors already present in the base models. This raises a question: How can we train the models that don't exhibit such thinking behavior to develop it in the first place? To this end, we propose ThinkTuning, a GRPO-based interactive training approach where we augment the rollouts of a student model with the guidance from a teacher model. 
A simple idea from classroom practice inspires our method: a teacher poses a problem, lets the student try an answer, then gives corrective feedback -- enough to point the mind in the right direction and then show the solution. Each piece of feedback reshapes the student's thoughts, leading them to arrive at the correct solution. Similarly, we find that this type of implicit supervision through feedback from a teacher model of the same size improves the reasoning capabilities of the student model. In particular, on average, our method shows a 3.85% improvement over zero-shot baselines across benchmarks, and on MATH-500, AIME and GPQA-Diamond it shows 2.08%, 2.23% and 3.99% improvements over the vanilla-GRPO baseline. Source code is available at https://github.com/3rdAT/ThinkTuning.\n","bibtex":"@misc{rrv2025thinktuninginstillingcognitivereflections,\n      title={ThinkTuning: Instilling Cognitive Reflections without Distillation}, \n      author={Aswin RRV and Jacob Dineen and Divij Handa and Md Nayem Uddin and Mihir Parmar and Chitta Baral and Ben Zhou},\n      year={2025},\n      eprint={2508.07616},\n      archivePrefix={arXiv},\n      primaryClass={cs.AI},\n      url={https://arxiv.org/abs/2508.07616}, \n}\n","technologies":null},"html":""}},{"node":{"frontmatter":{"title":"CC-LEARN: Cohort-based Consistency Learning","slug":"/publications/cc-learn-2025","authors":"Xiao Ye, Shaswat Shrivastava, Zhaonan Li, Jacob Dineen, Shijie Lu, Avneet Ahuja, Ming Shen, Zhikun Xu, Ben Zhou","date":"2025-06-18T00:00:00.000Z","venue":"Pending COLM 2026","arxiv":"https://arxiv.org/abs/2506.15662","googlescholar":null,"semanticscholar":null,"paperurl":"https://arxiv.org/pdf/2506.15662.pdf","code":null,"slides":null,"abstract":"Large language models excel at many tasks but still struggle with consistent, robust reasoning. 
We introduce Cohort-based Consistency Learning (CC-Learn), a reinforcement learning framework that improves the reliability of LLM reasoning by training on cohorts of similar questions derived from shared programmatic abstractions. To enforce cohort-level consistency, we define a composite objective combining cohort accuracy, a retrieval bonus for effective problem decomposition, and a rejection penalty for trivial or invalid lookups that reinforcement learning can directly optimize, unlike supervised fine-tuning. Optimizing this reward guides the model to adopt uniform reasoning patterns across all cohort members. Experiments on challenging reasoning benchmarks (including ARC-Challenge and StrategyQA) show that CC-Learn boosts both accuracy and reasoning stability over pretrained and SFT baselines. These results demonstrate that cohort-level RL effectively enhances reasoning consistency in LLMs.","bibtex":"@misc{ye2025cclearn,\n  title={CC-LEARN: Cohort-based Consistency Learning}, \n  author={Xiao Ye and Shaswat Shrivastava and Zhaonan Li and Jacob Dineen and Shijie Lu and Avneet Ahuja and Ming Shen and Zhikun Xu and Ben Zhou},\n  year={2025},\n  eprint={2506.15662},\n  archivePrefix={arXiv},\n  primaryClass={cs.CL},\n  url={https://arxiv.org/abs/2506.15662}\n}\n","technologies":null},"html":""}},{"node":{"frontmatter":{"title":"Training Language Models with Context-Free Reasoning Bottlenecks","slug":"/publications/bow-2025","authors":"Ming Shen, Zhikun Xu, Jacob Dineen, Xiao Ye, Ben Zhou","date":"2025-06-16T00:00:00.000Z","venue":"Pending COLM 2026","arxiv":"https://arxiv.org/abs/2506.13502","googlescholar":null,"semanticscholar":null,"paperurl":"https://arxiv.org/pdf/2506.13502.pdf","code":null,"slides":null,"abstract":"Large language models (LLMs) are typically pretrained with next-word prediction (NWP), which yields strong surface fluency but places limited pressure on models to form explicit reasoning before emitting tokens. 
We study whether shifting the supervision signal can better elicit explicit reasoning and, more broadly, strengthen models' general reasoning capability. We present BOttlenecked next-Word prediction (BOW), a RL formulation of NWP that inserts an intermediate reasoning bottleneck. Instead of predicting the next word directly from context, the policy model must first generate a next-word reasoning trajectory. A frozen scorer then assigns this trajectory a soft, distributional reward equal to the probability of the gold next token conditioned solely on the trajectory to guide the RL optimization. We also propose an optional L1-style regularizer on the reward to discourage 'name-the-answer' shortcuts. Across ten benchmarks, a brief BOW adaptation phase on Qwen2.5-7B-Instruct and Llama3.1-8B-Instruct improves zero-shot reasoning and outperforms strong continual-pretraining baselines, including an RL variant with a hard, binary reward and a supervised finetuning approach with augmented data, by nearly 5% on average, while achieving the top result in 7 of 10 intrinsic NWP evaluations. 
These results indicate that BOW is a viable alternative to vanilla NWP, inducing explicit next-word reasoning and strengthening general reasoning ability.","bibtex":"@misc{shen2025bowreinforcementlearningbottlenecked,\n  title={BOW: Reinforcement Learning for Bottlenecked Next Word Prediction}, \n  author={Ming Shen and Zhikun Xu and Jacob Dineen and Xiao Ye and Ben Zhou},\n  year={2025},\n  eprint={2506.13502},\n  archivePrefix={arXiv},\n  primaryClass={cs.CL},\n  url={https://arxiv.org/abs/2506.13502}\n}\n","technologies":null},"html":""}},{"node":{"frontmatter":{"title":"QA-LIGN: Aligning LLMs through Constitutionally Decomposed QA","slug":"/publications/qa-lign-2025","authors":"Jacob Dineen, Aswin RRV, Qin Liu, Zhikun Xu, Xiao Ye, Ming Shen, Zhaonan Li, Shijie Lu, Chitta Baral, Muhao Chen, Ben Zhou","date":"2025-01-02T00:00:00.000Z","venue":"EMNLP 2025","arxiv":"https://arxiv.org/abs/2506.08123","googlescholar":null,"semanticscholar":"https://www.semanticscholar.org/paper/QA-LIGN%3A-Aligning-LLMs-through-Constitutionally-QA-Dineen-Rrv/706b4a2e443aaf81039aa1f531ad75a4e53c7ab6","paperurl":"https://arxiv.org/pdf/2506.08123.pdf","code":null,"slides":"/slides/EMNLP 2025_Find-3576 slides.pdf","abstract":"Alignment of large language models (LLMs) with principles like helpfulness, honesty, and harmlessness typically relies on scalar rewards that obscure which objectives drive the training signal. We introduce QA-LIGN, which decomposes monolithic rewards into interpretable principle-specific evaluations through structured natural language programs. Models learn through a draft, critique, and revise pipeline, where symbolic evaluation against the rubrics provides transparent feedback for both initial and revised responses during GRPO training. 
Applied to uncensored Llama-3.1-8B-Instruct, QA-LIGN reduces attack success rates by up to 68.7% while maintaining a 0.67% false refusal rate, achieving Pareto optimal safety-helpfulness performance and outperforming both DPO and GRPO with state-of-the-art reward models given equivalent training. These results demonstrate that making reward signals interpretable and modular improves alignment effectiveness, suggesting transparency enhances LLM safety.","bibtex":"@misc{dineen2025qalignaligningllmsconstitutionally,\n      title={QA-LIGN: Aligning LLMs through Constitutionally Decomposed QA}, \n      author={Jacob Dineen and Aswin RRV and Qin Liu and Zhikun Xu and Xiao Ye and Ming Shen and Zhaonan Li and Shijie Lu and Chitta Baral and Muhao Chen and Ben Zhou},\n      year={2025},\n      eprint={2506.08123},\n      archivePrefix={arXiv},\n      primaryClass={cs.CL},\n      url={https://arxiv.org/abs/2506.08123}, \n}\n","technologies":null},"html":""}},{"node":{"frontmatter":{"title":"ToW: Thoughts of Words Improve Reasoning in Large Language Models","slug":"/publications/tow-2024","authors":"Zhikun Xu, Ming Shen, Jacob Dineen, Zhaonan Li, Xiao Ye, Shijie Lu, Aswin RRV, Chitta Baral, Ben Zhou","date":"2024-10-21T00:00:00.000Z","venue":"NAACL 2025","arxiv":"https://arxiv.org/abs/2410.16235","googlescholar":"https://scholar.google.com/citations?view_op=view_citation&hl=en&user=WKurvcoAAAAJ&citation_for_view=WKurvcoAAAAJ:Y0pCki6q_DkC","semanticscholar":"https://www.semanticscholar.org/paper/ToW%3A-Thoughts-of-Words-Improve-Reasoning-in-Large-Xu-Shen/aac83c00da794b980c2128eca1517b5c359ef923","paperurl":"https://arxiv.org/pdf/2410.16235.pdf","code":null,"slides":null,"abstract":"We introduce thoughts of words (ToW), a novel training-time data-augmentation method for next-word prediction. 
ToW views next-word prediction as a core reasoning task and injects fine-grained thoughts explaining what the next word should be and how it is related to the previous contexts in pre-training texts. Our formulation addresses two fundamental drawbacks of existing next-word prediction learning schemes: they induce factual hallucination and are inefficient for models to learn the implicit reasoning processes in raw texts. While there are many ways to acquire such thoughts of words, we explore the first step of acquiring ToW annotations through distilling from larger models. After continual pre-training with only 70K ToW annotations, we effectively improve models' reasoning performances by 7% to 9% on average and reduce model hallucination by up to 10%. At the same time, ToW is entirely agnostic to tasks and applications, introducing no additional biases on labels or semantics.","bibtex":"@article{xu2024tow,\n  title={Tow: Thoughts of words improve reasoning in large language models},\n  author={Xu, Zhikun and Shen, Ming and Dineen, Jacob and Li, Zhaonan and Ye, Xiao and Lu, Shijie and RRV, Aswin and Baral, Chitta and Zhou, Ben},\n  journal={arXiv preprint arXiv:2410.16235},\n  year={2024}\n}\n","technologies":null},"html":""}},{"node":{"frontmatter":{"title":"Unified Explanations in Machine Learning Models: A Perturbation Approach","slug":"/publications/unified-xai-2023","authors":"Jacob Dineen, Don Kridel, Daniel Dolk, David Castillo","date":"2023-1-07","venue":"HICSS 2023","arxiv":"https://arxiv.org/pdf/2405.20200","googlescholar":"https://scholar.google.com/citations?view_op=view_citation&hl=en&user=WKurvcoAAAAJ&citation_for_view=WKurvcoAAAAJ:UeHWp8X0CEIC","semanticscholar":"https://www.semanticscholar.org/paper/Unified-Explanations-in-Machine-Learning-Models%3A-A-Dineen-Kridel/24c79f9a4985d4503d7300eca987f874a7e8491e","paperurl":null,"code":"https://github.com/jacobdineen/hiccs2021","slides":null,"abstract":"A high-velocity paradigm shift towards Explainable 
Artificial Intelligence (XAI) has emerged in recent years. Highly complex Machine Learning (ML) models have flourished in many tasks of intelligence, and the questions have started to shift away from traditional metrics of validity towards something deeper; What is this model telling me about my data, and how is it arriving at these conclusions? Inconsistencies between XAI and modeling techniques can have the undesirable effect of casting doubt upon the efficacy of these explainability approaches. To address these problems, we propose a systematic, perturbation-based analysis against a popular, model-agnostic method in XAI, SHapley Additive exPlanations (Shap). We devise algorithms to generate relative feature importance in settings of dynamic inference amongst a suite of popular machine learning and deep learning methods, and metrics that allow us to quantify how well explanations generated under the static case hold. We propose a taxonomy for feature importance methodology, measure alignment, and observe quantifiable similarity amongst explanation models across several datasets.","bibtex":"@article{dineen2024unified,\ntitle={Unified Explanations in Machine Learning Models: A Perturbation Approach},\nauthor={Dineen, Jacob and Kridel, Don and Dolk, Daniel and Castillo, David},\njournal={arXiv preprint arXiv:2405.20200},\nyear={2024}\n}\n","technologies":null},"html":""}},{"node":{"frontmatter":{"title":"Formal Methods for an Iterated Volunteer's Dilemma","slug":"/publications/volunteer-dilemma-2021","authors":"Jacob Dineen, ASM Ahsan-Ul Haque, Matthew Bielskas","date":"2021-7-04","venue":"SBP-BRiMS 
2021","arxiv":"https://arxiv.org/abs/2008.12846","googlescholar":"https://scholar.google.com/citations?view_op=view_citation&hl=en&user=WKurvcoAAAAJ&citation_for_view=WKurvcoAAAAJ:9yKSN-GCB0IC","semanticscholar":"https://www.semanticscholar.org/reader/020c978cbbf92d960ce486eb12cb5d4ca0ff10c8","paperurl":"https://link.springer.com/chapter/10.1007%2F978-3-030-80387-2_8","code":"https://github.com/jacobdineen/volunteergame_","slides":null,"abstract":"Game theory provides a paradigm through which we can study the evolving communication and phenomena that occur via rational agent interaction [10]. The Volunteer’s dilemma is a vastly studied game throughout literature that models agents as cooperative, rather than selfish, entities. In this work, we design a model framework and explore the Volunteer’s dilemma with the goals of 1) modeling it as a stochastic concurrent n-player game, 2) constructing properties to verify model correctness and reachability, 3) constructing strategy synthesis graphs to understand how the game is iteratively stepped through most optimally and, 4) analyzing a series of parameters to understand correlations with expected local and global rewards over a finite time horizon.","bibtex":"@inproceedings{dineen2021formal,\ntitle={Formal Methods for an Iterated Volunteer’s Dilemma},\nauthor={Dineen, Jacob and Haque, ASM Ahsan-Ul and Bielskas, Matthew},\nbooktitle={Social, Cultural, and Behavioral Modeling: 14th International Conference, SBP-BRiMS 2021, Virtual Event, July 6--9, 2021, Proceedings 14},\npages={81--90},\nyear={2021},\norganization={Springer}\n}\n","technologies":null},"html":""}},{"node":{"frontmatter":{"title":"Reinforcement Learning for Data Poisoning on Graph Neural Networks","slug":"/publications/rl-data-poisoning-2021","authors":"Jacob Dineen, ASM Ahsan-Ul Haque, Matthew Bielskas","date":"2021-7-04","venue":"SBP-BRiMS 
2021","arxiv":"https://arxiv.org/abs/2102.06800","googlescholar":"https://scholar.google.com/citations?view_op=view_citation&hl=en&user=WKurvcoAAAAJ&citation_for_view=WKurvcoAAAAJ:d1gkVwhDpl0C","semanticscholar":"https://www.semanticscholar.org/reader/d9926dc56bd67f2f1b3f9caf18e53183cb3499ac","paperurl":"https://link.springer.com/chapter/10.1007%2F978-3-030-80387-2_14","code":"https://github.com/jacobdineen/RL4DataPoisoning","slides":null,"abstract":"Adversarial Machine Learning has emerged as a substantial subfield of Computer Science due to a lack of robustness in the models we train, along with crowdsourcing practices that enable attackers to tamper with data. In the last two years, interest has surged in adversarial attacks on graphs, yet the Graph Classification setting remains nearly untouched. Since a Graph Classification dataset consists of discrete graphs with class labels, related work has forgone direct gradient optimization in favor of an indirect Reinforcement Learning approach. 
We will study the novel problem of Data Poisoning (training-time) attacks on Neural Networks for Graph Classification using Reinforcement Learning Agents.","bibtex":"@inproceedings{dineen2021reinforcement,\ntitle={Reinforcement Learning for Data Poisoning on Graph Neural Networks},\nauthor={Dineen, Jacob and Haque, ASM Ahsan-Ul and Bielskas, Matthew},\nbooktitle={Social, Cultural, and Behavioral Modeling: 14th International Conference, SBP-BRiMS 2021, Virtual Event, July 6--9, 2021, Proceedings 14},\npages={141--150},\nyear={2021},\norganization={Springer}\n}\n","technologies":null},"html":""}},{"node":{"frontmatter":{"title":"Model Interpretation and Explainability towards Creating Transparency in Prediction Models","slug":"/publications/model-xai-2020","authors":"Daniel Dolk, Donald Kridel, Jacob Dineen, David Castillo","date":"2020-1-07","venue":"HICSS 2020","arxiv":"https://arxiv.org/pdf/2405.20794","googlescholar":"https://scholar.google.com/citations?view_op=view_citation&hl=en&user=WKurvcoAAAAJ&citation_for_view=WKurvcoAAAAJ:u5HHmVD_uO8C","semanticscholar":"https://pdfs.semanticscholar.org/5643/1886ac2ed20255ee1fa5983543b2817105d2.pdf?_gl=1*1ojfebs*_ga*MjAzNTY4OTM1NC4xNjkwNDIwMzQ5*_ga_H7P4ZT52H5*MTY5MjU2ODk4Mi4xMC4xLjE2OTI1Njg5ODQuNTguMC4w","paperurl":"https://scholarspace.manoa.hawaii.edu/handle/10125/63859","code":"https://github.com/jacobdineen/explainability","slides":null,"abstract":"Explainable AI has a counterpart in analytical modeling, which we refer to as model explainability. We tackle the issue of model explainability in the context of prediction models. 
We analyze a dataset of loans from a credit card company and apply three stages: execute and compare four different prediction methods; apply the best-known explainability techniques in the current literature to the model training sets to identify feature importance (FI) (static case); and finally cross-check whether the FI set holds up under “what if” prediction scenarios for continuous and categorical variables (dynamic case). We found inconsistency in FI identification between the static and dynamic cases. We summarize the “state of the art” in model explainability and suggest further research to advance the field.","bibtex":"@article{kridel2024model,\ntitle={Model interpretation and Explainability: towards creating transparency in prediction models},\nauthor={Kridel, Donald and Dineen, Jacob and Dolk, Daniel and Castillo, David},\njournal={arXiv preprint arXiv:2405.20794},\nyear={2024}\n}\n","technologies":null},"html":""}},{"node":{"frontmatter":{"venue":"Arizona State University","degree":"Ph.D. in Artificial Intelligence","gpa":"4.00/4.00","range":"2022 - 2026","technologies":[{"name":"Computer Systems Security"},{"name":"Software Security"},{"name":"Planning and Learning Methods in AI"},{"name":"Algorithms"},{"name":"Knowledge Representation"}]},"html":"<ul>\n<li>LLM Research at <a href=\"https://arc-asu.github.io/\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">ARC Lab</a>, advised by <a href=\"http://xuanyu.me/\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">Professor Ben Zhou</a>.</li>\n<li>Artificial Intelligence Research at <a href=\"https://sefcom.asu.edu/\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">SEFCOM</a>.</li>\n<li>Learned to hack on <a href=\"https://pwn.college/\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">pwn.college</a> (Green Belt, @jdin).</li>\n</ul>"}},{"node":{"frontmatter":{"venue":"University of Virginia","degree":"M.Sc. 
Computer Science","gpa":"4.00/4.00","range":"2019 - 2021","technologies":[{"name":"Algorithms"},{"name":"Machine Learning"},{"name":"Computer Vision"},{"name":"Formal Methods"},{"name":"Reinforcement Learning"},{"name":"Graph Mining"},{"name":"Learning Theory (Game Theory)"},{"name":"Cloud Computing"},{"name":"Research Hours"}]},"html":"<ul>\n<li>Worked at the <a href=\"https://biocomplexity.virginia.edu/\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">Biocomplexity Institute and Initiative</a> on graph dynamic systems and cooperative game theory / behavior modeling, advised by <a href=\"https://engineering.virginia.edu/faculty/madhav-marathe\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">Professor Madhav Marathe</a>.</li>\n</ul>"}},{"node":{"frontmatter":{"venue":"Syracuse University","degree":"M.S. Data Science","gpa":"4.00/4.00","range":"2017 - 2018","technologies":[{"name":"Data Analysis and Decision Making"},{"name":"Business Analytics"},{"name":"Financial Analytics"},{"name":"Marketing Analytics"},{"name":"Advanced Information Systems"},{"name":"Data Science"},{"name":"Data Warehousing"},{"name":"Text Mining"},{"name":"Scripting for Data Analysis"},{"name":"Information Policy"}]},"html":""}},{"node":{"frontmatter":{"venue":"Grand Canyon University","degree":"B.S. Finance and Economics","gpa":"3.65/4.00","range":"2012 - 2015","technologies":null},"html":""}}]}}}