Computer Scientist
Founder, Reexpress AI, Inc.
Artificial Intelligence / Natural Language Processing
Allen Schmaltz. 2025. Similarity-Distance-Magnitude Language Models. arXiv preprint arXiv:2510.26183. Code.
Abstract. We introduce Similarity-Distance-Magnitude (SDM) language models (LMs), which are sequence prediction models fine-tuned to maximize the proportion of generations in the well-calibrated, high-probability region partitioned by a final-layer SDM activation layer used for binary classification of instruction-following. We demonstrate that existing pre-trained decoder-only Transformer LMs can be readily converted into SDM LMs via supervised fine-tuning, using the final-layer SDM activation layer during training to estimate a change-of-base for a supervised next-token loss over a contrastive input encoding scheme, with additional hard negative examples generated online during training. This results in reduced abstentions (i.e., improved statistical efficiency) compared to strong supervised baselines.
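The abstention behavior described in the abstract can be illustrated with a minimal, hypothetical selective-prediction loop. This is a sketch of the general idea only, not the paper's estimator: generations whose calibrated probability of instruction-following falls below a threshold are withheld, and improved statistical efficiency corresponds to a lower abstention rate at a fixed reliability level. The function name, threshold value, and input format are all illustrative assumptions.

```python
def selective_predictions(items, threshold=0.9):
    """Toy selective prediction over (answer, calibrated_probability) pairs.

    Answers whose calibrated probability meets the threshold are emitted;
    the rest are abstentions. A better-calibrated model concentrates more
    mass above the threshold, lowering the abstention rate.
    """
    answered, abstained = [], 0
    for answer, prob in items:
        if prob >= threshold:
            answered.append(answer)
        else:
            abstained += 1
    abstention_rate = abstained / len(items)
    return answered, abstention_rate
```

For example, with three candidate generations at probabilities 0.95, 0.5, and 0.91 and a 0.9 threshold, two are emitted and one is withheld, for an abstention rate of one third.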
Allen Schmaltz. 2025. Similarity-Distance-Magnitude Activations. arXiv preprint arXiv:2509.12760. Code.
Abstract. We introduce the Similarity-Distance-Magnitude (SDM) activation function, a more robust and interpretable formulation of the standard softmax activation function, adding Similarity (i.e., correctly predicted depth-matches into training) awareness and Distance-to-training-distribution awareness to the existing output Magnitude (i.e., decision-boundary) awareness, and enabling interpretability-by-exemplar via dense matching. We further introduce the SDM estimator, based on a data-driven partitioning of the class-wise empirical CDFs via the SDM activation, to control the class- and prediction-conditional accuracy among selective classifications. When used as the final-layer activation over pre-trained language models for selective classification, the SDM estimator is more robust to covariate shifts and out-of-distribution inputs than existing calibration methods using softmax activations, while remaining informative over in-distribution data.
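As a rough intuition for how an SDM-style activation differs from a plain softmax, the toy sketch below scales the softmax temperature using placeholder similarity and distance signals, so that outputs sharpen when an input closely matches correctly predicted training examples and flatten when the input is far from the training distribution. The functional form, parameter names, and scaling are purely illustrative assumptions and are not the paper's formulation; see the paper and code linked above for the actual activation.

```python
import numpy as np

def sdm_style_activation_sketch(logits, similarity, distance, eps=1e-8):
    """Illustrative only: a softmax whose temperature grows with distance
    to the training distribution and shrinks with similarity (depth of
    correctly predicted matches into training), so the output distribution
    is sharp near well-matched training data and flat far from it."""
    logits = np.asarray(logits, dtype=float)
    temperature = (1.0 + distance) / (1.0 + similarity + eps)
    z = logits / temperature
    z = z - z.max()  # stabilize the exponentials
    p = np.exp(z)
    return p / p.sum()
```

Under this sketch, the same logits yield a more peaked distribution for a high-similarity, low-distance input than for a dissimilar, out-of-distribution one, which is the qualitative behavior the abstract describes.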
Allen Schmaltz and Danielle Rasooly. 2022. Introspection, Updatability, and Uncertainty Quantification with Transformers: Concrete Methods for AI Safety.
December 2022, ML Safety Workshop, 36th Conference on Neural Information Processing Systems (NeurIPS 2022). Poster.
Allen Schmaltz and Danielle Rasooly. 2022. Approximate Conditional Coverage & Calibration via Neural Model Approximations. arXiv preprint arXiv:2205.14310.
Spotlight talk, July 2022, Workshop on Distribution-Free Uncertainty Quantification at the Thirty-ninth International Conference on Machine Learning (ICML 2022), Baltimore, Maryland.
Allen Schmaltz. 2021. Detecting Local Insights from Global Labels: Supervised & Zero-Shot Sequence Labeling via a Convolutional Decomposition. Computational Linguistics. https://doi.org/10.1162/coli_a_00416. Online Appendix. Code.
Introduces instance-based, metric-learner approximations of neural network models and hard-attention mechanisms that can be constructed with task-specific inductive biases for effective semi-supervised learning (i.e., feature detection). These mechanisms combine to yield effective methods for interpretability-by-exemplar over the representation space of neural models.
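A minimal illustration of interpretability-by-exemplar (a generic nearest-neighbor sketch, not the paper's specific mechanism): cache labeled training-set representations and, at inference, match a query representation to its nearest training exemplar, so that each prediction can be audited against a concrete labeled example from training.

```python
import numpy as np

def nearest_exemplar(query, train_reps, train_labels):
    """Return the index, label, and distance of the training exemplar
    whose cached representation is closest (in L2 distance) to the query.
    The matched exemplar serves as a human-inspectable justification
    for the model's behavior on the query."""
    query = np.asarray(query, dtype=float)
    train_reps = np.asarray(train_reps, dtype=float)
    dists = np.linalg.norm(train_reps - query, axis=1)
    idx = int(np.argmin(dists))
    return idx, train_labels[idx], float(dists[idx])
```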
Allen Schmaltz. 2019. Learning to Order & Learning to Correct. Harvard University, Ph.D. dissertation, Computer Science.
Allen Schmaltz, Yoon Kim, Alexander Rush, and Stuart Shieber. 2017. Adapting Sequence Models for Sentence Correction. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 2807-2813, Copenhagen, Denmark, September. Association for Computational Linguistics. https://www.aclweb.org/anthology/D17-1298. (Appendix) (.bib)
Allen Schmaltz, Alexander M. Rush, and Stuart Shieber. 2016. Word Ordering Without Syntax. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 2319-2324, Austin, TX, USA, November. Association for Computational Linguistics. https://aclweb.org/anthology/D16-1255. (.bib)
Demonstrated that multi-layer networks can encode hierarchical language structures without explicit human annotations. Prior to this work, the prevailing view in NLP and computational linguistics was that neural language models would need to be trained with human-annotated syntactic structures to model syntax.
Allen Schmaltz, Yoon Kim, Alexander M. Rush, and Stuart Shieber. 2016. Sentence-Level Grammatical Error Identification as Sequence-to-Sequence Correction. In Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications, pages 242-251, San Diego, CA, USA, June. Association for Computational Linguistics. https://www.aclweb.org/anthology/W16-0528. (.bib)
Medicine and Public Health
Allen Schmaltz and Andrew L. Beam. 2020. Sharpening the Resolution on Data Matters: A Brief Roadmap for Understanding Deep Learning for Medical Data. The Spine Journal. https://doi.org/10.1016/j.spinee.2020.08.012.
Andrew L. Beam, Benjamin Kompa, Allen Schmaltz, Inbar Fried, Griffin Weber, Nathan P. Palmer, Xu Shi, Tianxi Cai, and Isaac S. Kohane. 2020. Clinical Concept Embeddings Learned from Massive Sources of Multimodal Medical Data. In Proceedings of the Pacific Symposium on Biocomputing (PSB) 25, pages 295-306. arXiv:1804.01486.
Public Policy
Allen Schmaltz. 2018. On the Utility of Lay Summaries and AI Safety Disclosures: Toward Robust, Open Research Oversight. In Proceedings of the Second ACL Workshop on Ethics in Natural Language Processing, pages 1-6, New Orleans, LA, USA, June. Association for Computational Linguistics. https://aclweb.org/anthology/W18-0801. (.bib)
Quantitative Social Science
Wenxin Jiang, Gary King, Allen Schmaltz, and Martin A. Tanner. 2019. Ecological Regression with Partial Identification. Political Analysis. https://doi.org/10.1017/pan.2019.19.
Technical Reports
Each of these papers introduced novel methods when it first appeared on arXiv, and each remains of lasting interest for the reasons described in the block quotes below.
Allen Schmaltz. 2025. Similarity-Distance-Magnitude Universal Verification. arXiv preprint arXiv:2502.20167. Code.
This was an earlier introduction to SDM activation functions. The approaches for constructing estimators over SDM activations and integrating SDM activations into sequence prediction architectures have been superseded by the simpler formulations in "Similarity-Distance-Magnitude Activations" (2025) and "Similarity-Distance-Magnitude Language Models" (2025).
Allen Schmaltz and Andrew Beam. 2020. Coarse-to-Fine Memory Matching for Joint Retrieval and Classification. arXiv preprint arXiv:2012.02287.
Introduces interpretability-by-exemplar for multi-stage retrieval and classification with a single model, including feature detection via alignment of bi-encoded sequences. It also includes a method for beam search through the search graph of bi- and cross-encoded sequences, and an early approach for constraining the output of a retrieval system based on dense matching into the support set. This is, in effect, an early example of test-time compute with a Transformer language model: instead of using reinforcement learning, multi-stage search is learned end-to-end via a contrastive loss over bi- and cross-encoded sequences. See these presentation slides from 2021 for a high-level overview.
Allen Schmaltz and Andrew Beam. 2020. Exemplar Auditing for Multi-Label Biomedical Text Classification. arXiv preprint arXiv:2004.03093.
Introduces a loss and inductive bias for a hard-attention mechanism suitable for high-dimensional multi-label classification tasks. This illustrates the generality of the hard-attention mechanism introduced in "Detecting Local Insights from Global Labels: Supervised & Zero-Shot Sequence Labeling via a Convolutional Decomposition", with which it becomes straightforward to model task-specific inductive biases suitable for effective semi-supervised learning.