Publications

2024

654.Nandan Thakur, Luiz Bonifacio, Maik Fröbe, Alexander Bondarenko, Ehsan Kamalloo, Martin Potthast, Matthias Hagen, and Jimmy Lin. Systematic Evaluation of Neural Retrieval Models on the Touché 2020 Argument Retrieval Subset of BEIR. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2024), July 2024, Washington, D.C.
653.Ronak Pradeep and Jimmy Lin. Towards Automated End-to-End Health Misinformation Free Search with a Large Language Model. Proceedings of the 46th European Conference on Information Retrieval (ECIR 2024), Part IV, pages 78-86, April 2024, Glasgow, Scotland.
652.Jasper Xian, Tommaso Teofili, Ronak Pradeep, and Jimmy Lin. Vector Search with OpenAI Embeddings: Lucene Is All You Need. Proceedings of the 17th ACM International Conference on Web Search and Data Mining (WSDM 2024), pages 1090–1093, March 2024, Mérida, México.

2023

651.Mofetoluwa Adeyemi, Akintunde Oladipo, Ronak Pradeep, and Jimmy Lin. Zero-Shot Cross-Lingual Reranking with Large Language Models for Low-Resource Languages. arXiv:2312.16159, December 2023.
650.Manveer Singh Tamber, Ronak Pradeep, and Jimmy Lin. Scaling Down, LiTting Up: Efficient Zero-Shot Listwise Reranking with Seq2seq Encoder-Decoder Models. arXiv:2312.16098, December 2023.
649.Nandan Thakur, Luiz Bonifacio, Xinyu Zhang, Odunayo Ogundepo, Ehsan Kamalloo, David Alfonso-Hermelo, Xiaoguang Li, Qun Liu, Boxing Chen, Mehdi Rezagholizadeh, and Jimmy Lin. NoMIRACL: Knowing When You Don't Know for Robust Multilingual Retrieval-Augmented Generation. arXiv:2312.11361, December 2023.
648.Mofetoluwa Adeyemi, Akintunde Oladipo, Xinyu Zhang, David Alfonso-Hermelo, Mehdi Rezagholizadeh, Boxing Chen, and Jimmy Lin. CIRAL at FIRE 2023: Cross-Lingual Information Retrieval for African Languages. Proceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation (FIRE 2023), December 2023, pages 4-6, Panjim, India.
647.Sheng-Chieh Lin, Amin Ahmad, and Jimmy Lin. mAggretriever: A Simple yet Effective Approach to Zero-Shot Multilingual Dense Retrieval. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, December 2023, pages 11688-11696, Singapore.
646.Ronak Pradeep, Kai Hui, Jai Gupta, Adam Lelkes, Honglei Zhuang, Jimmy Lin, Donald Metzler, and Vinh Tran. How Does Generative Retrieval Scale to Millions of Passages? Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, December 2023, pages 1305-1321, Singapore.
645.Akintunde Oladipo, Mofetoluwa Adeyemi, Orevaoghene Ahia, Abraham Owodunni, Odunayo Ogundepo, David Adelani, and Jimmy Lin. Better Quality Pre-training Data and T5 Models for African Languages. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, December 2023, pages 158-168, Singapore.
644.Sheng-Chieh Lin, Akari Asai, Minghan Li, Barlas Oguz, Jimmy Lin, Yashar Mehdad, Wen-tau Yih, and Xilun Chen. How to Train Your Dragon: Diverse Augmentation Towards Generalizable Dense Retrieval. Findings of the Association for Computational Linguistics: EMNLP 2023, December 2023, pages 6385-6400, Singapore.
643.Christopher Akiki, Odunayo Ogundepo, Aleksandra Piktus, Xinyu Zhang, Akintunde Oladipo, Jimmy Lin, and Martin Potthast. Spacerini: Plug-and-play Search Engines with Pyserini and Hugging Face. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, December 2023, pages 140–148, Singapore.
642.Ronak Pradeep, Sahel Sharifymoghaddam, and Jimmy Lin. RankZephyr: Effective and Robust Zero-Shot Listwise Reranking is a Breeze! arXiv:2312.02724, December 2023.
641.Xinyu Zhang, Sebastian Hofstätter, Patrick Lewis, Raphael Tang, and Jimmy Lin. Rank-without-GPT: Building GPT-Independent Listwise Rerankers on Open-Source Large Language Models. arXiv:2312.02969, December 2023.
640.Jimmy Lin and Tommaso Teofili. Searching Dense Representations with Inverted Indexes. arXiv:2312.01556, December 2023.
639.Raphael Tang, Xinyu Zhang, Jimmy Lin, and Ferhan Ture. What Do Llamas Really Think? Revealing Preference Biases in Language Model Representations. arXiv:2311.18812, November 2023.
638.Haonan Chen, Carlos Lassance, and Jimmy Lin. End-to-End Retrieval with Learned Dense and Sparse Representations Using Lucene. arXiv:2311.18503, November 2023.
637.Jheng-Hong Yang, Carlos Lassance, Rafael Sampaio de Rezende, Krishna Srinivasan, Miriam Redi, Stéphane Clinchant, and Jimmy Lin. TREC2023 AToMiC Overview. Proceedings of the Thirty-Second Text REtrieval Conference (TREC 2023), November 2023, Gaithersburg, Maryland.
636.Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Hossein A. Rahmani, Daniel Campos, Jimmy Lin, Ellen M. Voorhees, and Ian Soboroff. Overview of the TREC 2023 Deep Learning Track. Proceedings of the Thirty-Second Text REtrieval Conference (TREC 2023), November 2023, Gaithersburg, Maryland.
635.Jheng-Hong Yang and Jimmy Lin. TREC 2023: h2oloo in the Product Search Challenge. Proceedings of the Thirty-Second Text REtrieval Conference (TREC 2023), November 2023, Gaithersburg, Maryland.
634.Carlos Lassance, Ronak Pradeep, and Jimmy Lin. Naverloo @ TREC Deep Learning and NeuCLIR 2023: As Easy as Zero, One, Two, Three — Cascading Dual Encoders, Mono, Duo, and Listo for Ad-Hoc Retrieval. Proceedings of the Thirty-Second Text REtrieval Conference (TREC 2023), November 2023, Gaithersburg, Maryland.
633.Xueguang Ma, Hengxin Fun, Xusen Yin, Antonio Mallia, and Jimmy Lin. Enhancing Sparse Retrieval via Unsupervised Learning. Proceedings of the Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region (SIGIR-AP 2023), pages 150-157, November 2023.
632.Minghan Li, Honglei Zhuang, Kai Hui, Zhen Qin, Jimmy Lin, Rolf Jagerman, Xuanhui Wang, and Michael Bendersky. Generate, Filter, and Fuse: Query Expansion via Multi-Step Keyword Generation for Zero-Shot Neural Rankers. arXiv:2311.09175, November 2023.
631.Nandan Thakur, Jianmo Ni, Gustavo Hernández Ábrego, John Wieting, Jimmy Lin, and Daniel Cer. Leveraging LLMs for Synthesizing Training Data Across Many Languages in Multilingual Dense Retrieval. arXiv:2311.05800, November 2023.
630.Xueguang Ma, Tommaso Teofili, and Jimmy Lin. Anserini Gets Dense Retrieval: Integration of Lucene's HNSW Indexes. Proceedings of the 32nd International Conference on Information and Knowledge Management (CIKM 2023), October 2023, pages 5366–5370, Birmingham, the United Kingdom.
629.Xueguang Ma, Liang Wang, Nan Yang, Furu Wei, and Jimmy Lin. Fine-Tuning LLaMA for Multi-Stage Text Retrieval. arXiv:2310.08319, October 2023.
628.Raphael Tang, Xinyu Zhang, Xueguang Ma, Jimmy Lin, and Ferhan Ture. Found in the Middle: Permutation Self-Consistency Improves Listwise Ranking in Large Language Models. arXiv:2310.07712, October 2023.
627.Ronak Pradeep, Sahel Sharifymoghaddam, and Jimmy Lin. RankVicuna: Zero-Shot Listwise Document Reranking with Open-Source Large Language Models. arXiv:2309.15088, September 2023.
626.Wei Zhong, Yuqing Xie, and Jimmy Lin. Answer Retrieval for Math Questions Using Structural and Dense Retrieval. Proceedings of CLEF 2023: Experimental IR Meets Multilinguality, Multimodality, and Interaction, pages 209-223, Thessaloniki, Greece, September 2023.
625.Xinyu Zhang, Kelechi Ogueji, Xueguang Ma, and Jimmy Lin. Towards Best Practices for Training Multilingual Dense Retrieval Models. ACM Transactions on Information Systems, 42(2), Article No. 39, 2023.
624.Xinyu Zhang, Nandan Thakur, Odunayo Ogundepo, Ehsan Kamalloo, David Alfonso-Hermelo, Xiaoguang Li, Qun Liu, Mehdi Rezagholizadeh, and Jimmy Lin. MIRACL: A Multilingual Retrieval Dataset Covering 18 Diverse Languages. Transactions of the Association for Computational Linguistics, 11:1114–1131, 2023.
623.Jimmy Lin, Ronak Pradeep, Tommaso Teofili, and Jasper Xian. Vector Search with OpenAI Embeddings: Lucene Is All You Need. arXiv:2308.14963, August 2023.
622.Ehsan Kamalloo, Aref Jafari, Xinyu Zhang, Nandan Thakur, and Jimmy Lin. HAGRID: A Human-LLM Collaborative Dataset for Generative Information-Seeking with Attribution. arXiv:2307.16883, June 2023.
621.Ronak Pradeep, Kai Hui, Jai Gupta, Adam Lelkes, Honglei Zhuang, Jimmy Lin, Donald Metzler, and Vinh Tran. How Does Generative Retrieval Scale to Millions of Passages? Proceedings of the First Workshop on Generative Information Retrieval at SIGIR 2023 (Gen-IR@SIGIR23), July 2023, Taipei, Taiwan.
620.Nandan Thakur, Nils Reimers, and Jimmy Lin. Injecting Domain Adaptation with Learning-to-hash for Effective and Efficient Zero-shot Dense Retrieval. Proceedings of SIGIR 2023 Workshop on Reaching Efficiency in Neural Information Retrieval (ReNeuIR'23), July 2023, Taipei, Taiwan.
619.Wei Zhong, Sheng-Chieh Lin, Jheng-Hong Yang, and Jimmy Lin. One Blade for One Purpose: Advancing Math Information Retrieval using Hybrid Search. Proceedings of the 46th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2023), pages 141-151, July 2023, Taipei, Taiwan.
618.Minghan Li, Sheng-Chieh Lin, Xueguang Ma, and Jimmy Lin. SLIM: Sparsified Late Interaction for Multi-Vector Retrieval with Inverted Indexes. Proceedings of the 46th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2023), pages 1954–1959, July 2023, Taipei, Taiwan.
617.Chris Kamphuis, Aileen Lin, Siwen Yang, Jimmy Lin, Arjen P. de Vries, and Faegheh Hasibi. MMEAD: MS MARCO Entity Annotations and Disambiguations. Proceedings of the 46th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2023), pages 2817-2825, July 2023, Taipei, Taiwan.
616.Nandan Thakur, Kexin Wang, Iryna Gurevych, and Jimmy Lin. SPRINT: A Unified Toolkit for Evaluating and Demystifying Zero-shot Neural Sparse Retrieval. Proceedings of the 46th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2023), pages 2964-2974, July 2023, Taipei, Taiwan.
615.Jheng-Hong Yang, Carlos Lassance, Rafael Sampaio de Rezende, Krishna Srinivasan, Miriam Redi, Stéphane Clinchant, and Jimmy Lin. AToMiC: An Image/Text Retrieval Test Collection to Support Multimedia Content Creation. Proceedings of the 46th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2023), pages 2975-2984, July 2023, Taipei, Taiwan.
614.Luyu Gao, Xueguang Ma, Jimmy Lin, and Jamie Callan. Tevatron: An Efficient and Flexible Toolkit for Neural Retrieval. Proceedings of the 46th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2023), pages 3120-3124, July 2023, Taipei, Taiwan.
613.Nandan Thakur, Kexin Wang, Iryna Gurevych, and Jimmy Lin. SPRINT: A Unified Toolkit for Evaluating and Demystifying Zero-shot Neural Sparse Retrieval. arXiv:2307.10488, July 2023. (Appears at SIGIR 2023)
612.Luyu Gao, Xueguang Ma, Jimmy Lin, and Jamie Callan. Precise Zero-Shot Dense Retrieval without Relevance Labels. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1762-1777, July 2023, Toronto, Canada.
611.Raphael Tang, Linqing Liu, Akshat Pandey, Zhiying Jiang, Gefei Yang, Karun Kumar, Pontus Stenetorp, Jimmy Lin, and Ferhan Ture. What the DAAM: Interpreting Stable Diffusion Using Cross Attention. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5644–5659, July 2023, Toronto, Canada.
610.Minghan Li, Sheng-Chieh Lin, Barlas Oguz, Asish Ghoshal, Jimmy Lin, Yashar Mehdad, Wen-tau Yih, and Xilun Chen. CITADEL: Conditional Token Interaction via Dynamic Lexical Routing for Efficient and Effective Multi-Vector Retrieval. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 11891-11907, July 2023, Toronto, Canada.
609.Aleksandra Piktus, Odunayo Ogundepo, Christopher Akiki, Akintunde Oladipo, Xinyu Zhang, Hailey Schoelkopf, Stella Biderman, Martin Potthast, and Jimmy Lin. GAIA Search: Hugging Face and Pyserini Interoperability for NLP Training Data Exploration. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), pages 588–598, July 2023, Toronto, Canada.
608.Ehsan Kamalloo, Xinyu Zhang, Odunayo Ogundepo, Nandan Thakur, David Alfonso-Hermelo, Mehdi Rezagholizadeh, and Jimmy Lin. Evaluating Embedding APIs for Information Retrieval. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track), pages 518–526, July 2023, Toronto, Canada.
607.Ji Xin, Raphael Tang, Zhiying Jiang, Yaoliang Yu, and Jimmy Lin. Operator Selection and Ordering in a Pipeline Approach to Efficiency Optimizations for Transformers. Findings of the Association for Computational Linguistics: ACL 2023, pages 2870–2882, July 2023, Toronto, Canada.
606.Zhiying Jiang, Matthew Yang, Mikhail Tsirlin, Raphael Tang, Yiqin Dai, and Jimmy Lin. “Low-Resource” Text Classification: A Parameter-Free Classification Method with Compressors. Findings of the Association for Computational Linguistics: ACL 2023, pages 6810-6828, July 2023, Toronto, Canada.
605.Ehsan Kamalloo, Nandan Thakur, Carlos Lassance, Xueguang Ma, Jheng-Hong Yang, and Jimmy Lin. Resources for Brewing BEIR: Reproducible Reference Models and an Official Leaderboard. arXiv:2306.07471, June 2023.
604.Vanessa Liao, Syed Shariyar Murtaza, Yifan Nie, and Jimmy Lin. Regex-augmented Domain Transfer Topic Classification based on a Pre-trained Language Model: An application in Financial Domain. arXiv:2305.18324, June 2023.
603.Ronak Pradeep, Kai Hui, Jai Gupta, Adam D. Lelkes, Honglei Zhuang, Jimmy Lin, Donald Metzler, and Vinh Q. Tran. How Does Generative Retrieval Scale to Millions of Passages? arXiv:2305.11841, May 2023.
602.Nandan Thakur, Nils Reimers, and Jimmy Lin. Injecting Domain Adaptation with Learning-to-hash for Effective and Efficient Zero-shot Dense Retrieval. arXiv:2205.11498, May 2023. (Appears at the SIGIR 2023 ReNeuIR Workshop)
601.Sheng-Chieh Lin, Minghan Li, and Jimmy Lin. Aggretriever: A Simple Approach to Aggregate Textual Representations for Robust Dense Passage Retrieval. Transactions of the Association for Computational Linguistics, 11:436-452, 2023.
600.Josh Seltzer, Jiahua (Fiona) Pan, Kathy Cheng, Yuxiao Sun, Santosh Kolagati, Jimmy Lin, and Shi Zong. SmartProbe: A Virtual Moderator for Market Research Surveys. arXiv:2305.08271, May 2023.
599.Ehsan Kamalloo, Xinyu Zhang, Odunayo Ogundepo, Nandan Thakur, David Alfonso-Hermelo, Mehdi Rezagholizadeh, and Jimmy Lin. Evaluating Embedding APIs for Information Retrieval. arXiv:2305.06300, May 2023.
598.Xueguang Ma, Xinyu Zhang, Ronak Pradeep, and Jimmy Lin. Zero-Shot Listwise Document Reranking with a Large Language Model. arXiv:2305.02156, May 2023.
597.Xueguang Ma, Tommaso Teofili, and Jimmy Lin. Anserini Gets Dense Retrieval: Integration of Lucene's HNSW Indexes. arXiv:2304.12139, April 2023. (Appears at CIKM 2023)
596.Ronak Pradeep, Haonan Chen, Lingwei Gu, Manveer Singh Tamber, and Jimmy Lin. PyGaggle: A Gaggle of Resources for Open-Domain Question Answering. Proceedings of the 45th European Conference on Information Retrieval (ECIR 2023), Part III, pages 148-162, April 2023, Dublin, Ireland.
595.Manveer Singh Tamber, Ronak Pradeep, and Jimmy Lin. Pre-Processing Matters! Improved Wikipedia Corpora for Open-Domain Question Answering. Proceedings of the 45th European Conference on Information Retrieval (ECIR 2023), Part III, pages 163–176, April 2023, Dublin, Ireland.
594.Sheng-Chieh Lin and Jimmy Lin. A Dense Representation Framework for Lexical and Semantic Matching. ACM Transactions on Information Systems, 41(4), Article No. 110, 2023.
593.Jheng-Hong Yang, Carlos Lassance, Rafael Sampaio de Rezende, Krishna Srinivasan, Miriam Redi, Stéphane Clinchant, and Jimmy Lin. AToMiC: An Image/Text Retrieval Test Collection to Support Multimedia Content Creation. arXiv:2304.01961, April 2023.
592.Jimmy Lin, David Alfonso-Hermelo, Vitor Jeronymo, Ehsan Kamalloo, Carlos Lassance, Rodrigo Nogueira, Odunayo Ogundepo, Mehdi Rezagholizadeh, Nandan Thakur, Jheng-Hong Yang, and Xinyu Zhang. Simple Yet Effective Neural Ranking and Reranking Baselines for Cross-Lingual Information Retrieval. arXiv:2304.01019, April 2023.
591.Joel Mackenzie, Andrew Trotman, and Jimmy Lin. Efficient Document-at-a-Time and Score-at-a-Time Query Evaluation for Learned Sparse Representations. ACM Transactions on Information Systems, 41(4), Article No. 96, 2023.
590.Christopher Akiki, Odunayo Ogundepo, Aleksandra Piktus, Xinyu Zhang, Akintunde Oladipo, Jimmy Lin, and Martin Potthast. Spacerini: Plug-and-play Search Engines with Pyserini and Hugging Face. arXiv:2302.14534, February 2023.
589.Sheng-Chieh Lin, Akari Asai, Minghan Li, Barlas Oguz, Jimmy Lin, Yashar Mehdad, Wen-tau Yih, and Xilun Chen. How to Train Your DRAGON: Diverse Augmentation Towards Generalizable Dense Retrieval. arXiv:2302.07452, February 2023.
588.Xinyu Zhang, Minghan Li, and Jimmy Lin. Improving Out-of-Distribution Generalization of Neural Rerankers with Contextualized Late Interaction. arXiv:2302.06589, February 2023.
587.Minghan Li, Sheng-Chieh Lin, Xueguang Ma, and Jimmy Lin. SLIM: Sparsified Late Interaction for Multi-Vector Retrieval with Inverted Indexes. arXiv:2302.06587, February 2023.
586.Shi Zong, Josh Seltzer, Jiahua (Fiona) Pan, Kathy Cheng, and Jimmy Lin. Which Model Shall I Choose? Cost/Quality Trade-offs for Text Classification Tasks. arXiv:2301.07006, January 2023.

2022

585.Jimmy Lin. Building a Culture of Reproducibility in Academic Research. arXiv:2212.13534, December 2022.
584.Luyu Gao, Xueguang Ma, Jimmy Lin, and Jamie Callan. Precise Zero-Shot Dense Retrieval without Relevance Labels. arXiv:2212.10496, December 2022.
583.Hang Li, Shengyao Zhuang, Xueguang Ma, Jimmy Lin, and Guido Zuccon. Pseudo-Relevance Feedback with Dense Retrievers in Pyserini. Proceedings of the 26th Australasian Document Computing Symposium (ADCS 2022), December 2022, Adelaide, Australia.
582.Minghan Li, Xinyu Zhang, Ji Xin, Hongyang Zhang, and Jimmy Lin. Certified Error Control of Candidate Set Pruning for Two-Stage Relevance Ranking. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 333-345, December 2022, Abu Dhabi, United Arab Emirates.
581.Odunayo Ogundepo, Xinyu Zhang, Shuo Sun, Kevin Duh, and Jimmy Lin. AfriCLIRMatrix: Enabling Cross-Lingual Information Retrieval for African Languages. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 8721-8728, December 2022, Abu Dhabi, United Arab Emirates.
580.Raphael Tang, Karun Kumar, Gefei Yang, Akshat Pandey, Yajie Mao, Vladislav Belyaev, Madhuri Emmadi, Craig Murray, Ferhan Ture, and Jimmy Lin. SpeechNet: Weakly Supervised, End-to-End Speech Recognition at Industrial Scale. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track, December 2022, Abu Dhabi, United Arab Emirates.
579.Wei Zhong, Jheng-Hong Yang, Yuqing Xie, and Jimmy Lin. Evaluating Token-Level and Passage-Level Dense Retrieval Models for Math Information Retrieval. Findings of the Association for Computational Linguistics: EMNLP 2022, pages 1092-1102, December 2022, Abu Dhabi, United Arab Emirates.
578.Peng Shi, Rui Zhang, He Bai, and Jimmy Lin. XRICL: Cross-lingual Retrieval-Augmented In-Context Learning for Cross-lingual Text-to-SQL Semantic Parsing. Findings of the Association for Computational Linguistics: EMNLP 2022, pages 5248-5259, December 2022, Abu Dhabi, United Arab Emirates.
577.Peng Shi, Linfeng Song, Lifeng Jin, Haitao Mi, He Bai, Jimmy Lin, and Dong Yu. Cross-Lingual Text-to-SQL Semantic Parsing with Representation Mixup. Findings of the Association for Computational Linguistics: EMNLP 2022, pages 5296-5306, December 2022, Abu Dhabi, United Arab Emirates.
576.Zhiying Jiang, Yiqin Dai, Ji Xin, Ming Li, and Jimmy Lin. Few-Shot Non-Parametric Learning with Deep Latent Variable Model. Proceedings of the Thirty-Sixth Conference on Neural Information Processing Systems (NeurIPS 2022), December 2022.
575.Raphael Tang, Karun Kumar, Gefei Yang, Akshat Pandey, Yajie Mao, Vladislav Belyaev, Madhuri Emmadi, Craig Murray, Ferhan Ture, and Jimmy Lin. SpeechNet: Weakly Supervised, End-to-End Speech Recognition at Industrial Scale. arXiv:2211.11740, November 2022.
574.Minghan Li, Sheng-Chieh Lin, Barlas Oguz, Asish Ghoshal, Jimmy Lin, Yashar Mehdad, Wen-tau Yih, and Xilun Chen. CITADEL: Conditional Token Interaction via Dynamic Lexical Routing for Efficient and Effective Multi-Vector Retrieval. arXiv:2211.10411, November 2022.
573.Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Campos, Jimmy Lin, Ellen M. Voorhees, and Ian Soboroff. Overview of the TREC 2022 Deep Learning Track. Proceedings of the Thirty-First Text REtrieval Conference (TREC 2022), November 2022, Gaithersburg, Maryland.
572.Mashrekur Rahman, Jonathan M. Frame, Jimmy Lin, and Grey S. Nearing. Hydrology Research Articles Are Becoming More Topically Diverse. Journal of Hydrology, 614:128551, November 2022.
571.Peng Shi, Rui Zhang, He Bai, and Jimmy Lin. XRICL: Cross-lingual Retrieval-Augmented In-Context Learning for Cross-lingual Text-to-SQL Semantic Parsing. arXiv:2210.13693, October 2022.
570.Xinyu Zhang, Nandan Thakur, Odunayo Ogundepo, Ehsan Kamalloo, David Alfonso-Hermelo, Xiaoguang Li, Qun Liu, Mehdi Rezagholizadeh, and Jimmy Lin. Making a MIRACL: Multilingual Information Retrieval Across a Continuum of Languages. arXiv:2210.09984, October 2022.
569.Odunayo Ogundepo, Xinyu Zhang, and Jimmy Lin. Better Than Whitespace: Information Retrieval for Languages without Custom Tokenizers. arXiv:2210.05481, October 2022.
568.Raphael Tang, Akshat Pandey, Zhiying Jiang, Gefei Yang, Karun Kumar, Jimmy Lin, and Ferhan Ture. What the DAAM: Interpreting Stable Diffusion Using Cross Attention. arXiv:2210.04885, October 2022.
567.Wei Zhong, Yuqing Xie, and Jimmy Lin. Applying Structural and Dense Semantic Matching for the ARQMath Lab 2022, CLEF. Proceedings of the Working Notes of CLEF 2022 — Conference and Labs of the Evaluation Forum: CEUR Workshop Proceedings Vol-3180, pages 147-170, Bologna, Italy, September 2022. (Best paper from ARQMath-3 @ CLEF 2022)
566.Chris Kamphuis, Faegheh Hasibi, Jimmy Lin, and Arjen de Vries. REBL: Entity Linking at Scale. Proceedings of the 3rd International Conference on Design of Experimental Search & Information REtrieval Systems (DESIRES 2022), San Jose, California, August 2022.
565.Sheng-Chieh Lin, Minghan Li, and Jimmy Lin. Aggretriever: A Simple Approach to Aggregate Textual Representation for Robust Dense Passage Retrieval. arXiv:2208.00511, July 2022.
564.Ji Xin, Raphael Tang, Zhiying Jiang, Yaoliang Yu, and Jimmy Lin. Building an Efficiency Pipeline: Commutativity and Cumulativeness of Efficiency Operators for Transformers. arXiv:2208.00483, July 2022.
563.Minghan Li, Xueguang Ma, and Jimmy Lin. An Encoder Attribution Analysis for Dense Passage Retriever in Open-Domain Question Answering. Proceedings of the 2nd Workshop on Trustworthy Natural Language Processing (TrustNLP 2022), pages 1-11, July 2022, Seattle, Washington.
562.Odunayo Ogundepo, Akintunde Oladipo, Mofetoluwa Adeyemi, Kelechi Ogueji, and Jimmy Lin. AfriTeVA: Extending "Small Data" Pretraining Approaches to Sequence-to-Sequence Models. Proceedings of the Third Workshop on Deep Learning for Low-Resource Natural Language Processing, pages 126-135, July 2022, Seattle, Washington.
561.Ronak Pradeep, Yilin Li, Yuetong Wang, and Jimmy Lin. Neural Query Synthesis and Domain-Specific Ranking Templates for Multi-Stage Clinical Trial Matching. Proceedings of the 45th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2022), pages 2325–2330, July 2022, Madrid, Spain.
560.Hang Li, Shuai Wang, Shengyao Zhuang, Ahmed Mourad, Xueguang Ma, Jimmy Lin, and Guido Zuccon. To Interpolate or not to Interpolate: PRF, Dense and Sparse Retrievers. Proceedings of the 45th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2022), pages 2495-2500, July 2022, Madrid, Spain.
559.Yuqi Liu, Chengcheng Hu, and Jimmy Lin. Another Look at Information Retrieval as Statistical Translation. Proceedings of the 45th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2022), pages 2749–2754, July 2022, Madrid, Spain.
558.Jimmy Lin, Daniel Campos, Nick Craswell, Bhaskar Mitra, and Emine Yilmaz. Fostering Coopetition While Plugging Leaks: The Design and Implementation of the MS MARCO Leaderboards. Proceedings of the 45th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2022), pages 2939-2948, July 2022, Madrid, Spain.
557.Ellen M. Voorhees, Nick Craswell, and Jimmy Lin. Too Many Relevants: Whither Cranfield Test Collections? Proceedings of the 45th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2022), pages 2970–2980, July 2022, Madrid, Spain.
556.Xueguang Ma, Ronak Pradeep, Rodrigo Nogueira, and Jimmy Lin. Document Expansions and Learned Sparse Lexical Representations for MS MARCO V1 and V2. Proceedings of the 45th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2022), pages 3187–3197, July 2022, Madrid, Spain.
555.Andrew Trotman, Joel Mackenzie, Pradeesh Parameswaran, and Jimmy Lin. A Common Framework for Exploring Document-at-a-Time and Score-at-a-Time Retrieval Methods. Proceedings of the 45th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2022), pages 3229-3234, July 2022, Madrid, Spain.
554.Josh Seltzer, Kathy Cheng, Shi Zong, and Jimmy Lin. Flipping the Script: Inverse Information Seeking Dialogues for Market Research. Proceedings of the 45th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2022), pages 3380–3383, July 2022, Madrid, Spain.
553.Zhiying Jiang, Yiqin Dai, Ji Xin, Ming Li, and Jimmy Lin. Few-Shot Non-Parametric Learning with Deep Latent Variable Model. arXiv:2206.11573, June 2022.
552.Sheng-Chieh Lin and Jimmy Lin. A Dense Representation Framework for Lexical and Semantic Matching. arXiv:2206.09912, May 2022. (Later appears in ACM Transactions on Information Systems, 2023)
551.Matthew Y. R. Yang, Siwen Yang, and Jimmy Lin. Integration of Text and Geospatial Search for Hydrographic Datasets Using the Lucene Search Library. Proceedings of the 22nd ACM/IEEE Joint Conference on Digital Libraries (JCDL 2022), article no. 36, pages 1-5, June 2022, Cologne, Germany.
550.Raphael Tang, Karun Kumar, Ji Xin, Piyush Vyas, Wenyan Li, Gefei Yang, Yajie Mao, Craig Murray, and Jimmy Lin. Temporal Early Exiting for Streaming Speech Commands Recognition. Proceedings of the 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2022), pages 7567-7571, May 2022, Singapore.
549.Nandan Thakur, Nils Reimers, and Jimmy Lin. Domain Adaptation for Memory-Efficient Dense Retrieval. arXiv:2205.11498, May 2022.
548.Minghan Li, Xinyu Zhang, Ji Xin, Hongyang Zhang, and Jimmy Lin. Certified Error Control of Candidate Set Pruning for Two-Stage Relevance Ranking. arXiv:2205.09638, May 2022.
547.Akintunde Oladipo, Odunayo Ogundepo, Kelechi Ogueji, and Jimmy Lin. An Exploration of Vocabulary Size and Transfer Effects in Multilingual Language Models for African Languages. Proceedings of the 3rd Workshop on African Natural Language Processing (AfricaNLP 2022), April 2022.
546.Hang Li, Shengyao Zhuang, Ahmed Mourad, Xueguang Ma, Jimmy Lin, and Guido Zuccon. Improving Query Representations for Dense Retrieval with Pseudo Relevance Feedback: A Reproducibility Study. Proceedings of the 44th European Conference on Information Retrieval (ECIR 2022), Part I, pages 599-612, April 2022, Stavanger, Norway.
545.Xueguang Ma, Kai Sun, Ronak Pradeep, Minghan Li, and Jimmy Lin. Another Look at DPR: Reproduction of Training and Replication of Retrieval. Proceedings of the 44th European Conference on Information Retrieval (ECIR 2022), Part I, pages 613-626, April 2022, Stavanger, Norway.
544.Ronak Pradeep, Yuqi Liu, Xinyu Zhang, Yilin Li, Andrew Yates, and Jimmy Lin. Squeezing Water from a Stone: A Bag of Tricks for Further Improving Cross-Encoder Effectiveness for Reranking. Proceedings of the 44th European Conference on Information Retrieval (ECIR 2022), Part I, pages 655-670, April 2022, Stavanger, Norway.
543.Xinyu Zhang, Kelechi Ogueji, Xueguang Ma, and Jimmy Lin. Towards Best Practices for Training Multilingual Dense Retrieval Models. arXiv:2204.02363, April 2022.
542.Luyu Gao, Xueguang Ma, Jimmy Lin, and Jamie Callan. Tevatron: An Efficient and Flexible Toolkit for Dense Retrieval. arXiv:2203.05765, March 2022.
541.Wei Zhong, Jheng-Hong Yang, and Jimmy Lin. Evaluating Token-Level and Passage-Level Dense Retrieval Models for Math Information Retrieval. arXiv:2203.11163, March 2022.
540.Josh Devins, Julie Tibshirani, and Jimmy Lin. Aligning the Research and Practice of Building Search Applications: Elasticsearch and Pyserini. Proceedings of the 15th ACM International Conference on Web Search and Data Mining (WSDM 2022), pages 1573-1576, February 2022.
539.Ellen M. Voorhees, Ian Soboroff, and Jimmy Lin. Can Old TREC Collections Reliably Evaluate Modern Neural Retrieval Models? arXiv:2201.11086, January 2022.

2021

538.Jimmy Lin. A Proposed Conceptual Framework for a Representational Approach to Information Retrieval. SIGIR Forum, 55(2), Article No. 4, pages 1-29, 2021.
537.Jheng-Hong Yang, Xueguang Ma, and Jimmy Lin. Sparsifying Sparse Representations for Passage Retrieval by Top-k Masking. arXiv:2112.09628, December 2021.
536.Hang Li, Shengyao Zhuang, Ahmed Mourad, Xueguang Ma, Jimmy Lin, and Guido Zuccon. Improving Query Representations for Dense Retrieval with Pseudo Relevance Feedback: A Reproducibility Study. arXiv:2112.06400, December 2021.
535.Sheng-Chieh Lin and Jimmy Lin. Densifying Sparse Representations for Passage Retrieval by Representational Slicing. arXiv:2112.04666, December 2021.
534.Bhaleka D. Persaud, Krysha A. Dukacz, Gopal C. Saha, Amber Peterson, Laleh Moradi, Stephen O'Hearn, Erin Clary, Juliane Mai, Michael Steeleworthy, Jason J. Venkiteswaran, Homa Kheyrollah Pour, Brent B. Wolfe, Sean K. Carey, John W. Pomeroy, Chris M. DeBeer, James M. Waddington, Philippe Van Cappellen, and Jimmy Lin. Ten Best Practices to Strengthen Stewardship and Sharing of Water Science Data in Canada. Hydrological Processes, 35(11):e14385, 2021.
533.Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Campos, and Jimmy Lin. Overview of the TREC 2021 Deep Learning Track. Proceedings of the Thirtieth Text REtrieval Conference (TREC 2021), November 2021, Gaithersburg, Maryland.
532.Jimmy Lin, Rodrigo Nogueira, and Andrew Yates. Pretrained Transformers for Text Ranking: BERT and Beyond. Morgan & Claypool Publishers, 2021. [Springer]
531.Kelechi Ogueji, Yuxin Zhu, and Jimmy Lin. Small Data? No Problem! Exploring the Viability of Pretrained Multilingual Language Models for Low-Resource Languages. Proceedings of the 1st Workshop on Multilingual Representation Learning, pages 116-126, November 2021, Punta Cana, Dominican Republic.
530.Xinyu Zhang, Xueguang Ma, Peng Shi, and Jimmy Lin. Mr. TyDi: A Multi-lingual Benchmark for Dense Retrieval. Proceedings of the 1st Workshop on Multilingual Representation Learning, pages 127-137, November 2021, Punta Cana, Dominican Republic.
529.Peng Shi, Rui Zhang, He Bai, and Jimmy Lin. Cross-Lingual Training of Dense Retrievers for Document Retrieval. Proceedings of the 1st Workshop on Multilingual Representation Learning, pages 251-253, November 2021, Punta Cana, Dominican Republic.
528.Yue Zhang, Chengcheng Hu, Yuqi Liu, Hui Fang, and Jimmy Lin. Learning to Rank in the Age of Muppets: Effectiveness-Efficiency Tradeoffs in Multi-Stage Ranking. Proceedings of the Second Workshop on Simple and Efficient Natural Language Processing, pages 64-73, November 2021.
527.Zhiying Jiang, Raphael Tang, Ji Xin, and Jimmy Lin. How Does BERT Rerank Passages? An Attribution Analysis with Information Bottlenecks. Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, pages 496-509, November 2021, Punta Cana, Dominican Republic.
526.Raphael Tang, Karun Kumar, Kendra Chalkley, Ji Xin, Liming Zhang, Wenyan Li, Gefei Yang, Yajie Mao, Junho Shin, Geoffrey Murray, and Jimmy Lin. Voice Query Auto Completion. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 900-906, November 2021, Online and Punta Cana, Dominican Republic.
525.Sheng-Chieh Lin, Jheng-Hong Yang, and Jimmy Lin. Contextualized Query Embeddings for Conversational Search. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 1004-1015, November 2021, Online and Punta Cana, Dominican Republic.
524.Xueguang Ma, Minghan Li, Kai Sun, Ji Xin, and Jimmy Lin. Simple and Effective Unsupervised Redundancy Elimination to Compress Dense Vectors for Passage Retrieval. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 2854-2859, November 2021, Online and Punta Cana, Dominican Republic.
523.Minghan Li, Ming Li, Kun Xiong, and Jimmy Lin. Multi-Task Dense Retrieval via Model Uncertainty Fusion for Open-Domain Question Answering. Findings of the Association for Computational Linguistics: EMNLP 2021, pages 274-287, November 2021, Punta Cana, Dominican Republic.
522.Anup Anand Deshmukh, Qianqiu Zhang, Ming Li, Jimmy Lin, and Lili Mou. Unsupervised Chunking as Syntactic Structure Induction with a Knowledge-Transfer Approach. Findings of the Association for Computational Linguistics: EMNLP 2021, pages 3626-3634, November 2021, Punta Cana, Dominican Republic.
521.Joel Mackenzie, Andrew Trotman, and Jimmy Lin. Wacky Weights in Learned Sparse Representations and the Revenge of Score-at-a-Time Query Evaluation. arXiv:2110.11540, October 2021.
520.Minghan Li and Jimmy Lin. Encoder Adaptation of Dense Passage Retrieval for Open-Domain Question Answering. arXiv:2110.01599, October 2021.
519.Jimmy Lin. A Proposed Conceptual Framework for a Representational Approach to Information Retrieval. arXiv:2110.01529, October 2021.
518.Sheng-Chieh Lin, Jheng-Hong Yang, Rodrigo Nogueira, Ming-Feng Tsai, Chuan-Ju Wang, and Jimmy Lin. Multi-Stage Conversational Passage Retrieval: An Approach to Fusing Term Importance Estimation and Neural Query Rewriting. ACM Transactions on Information Systems, 39(4), Article No. 48, 2021.
517.Wei Zhong, Xinyu Zhang, Ji Xin, Jimmy Lin, and Richard Zanibbi. Approach Zero and Anserini at the CLEF-2021 ARQMath Track: Applying Substructure Search and BM25 on Operator Tree Path Tokens. Proceedings of the Working Notes of CLEF 2021 — Conference and Labs of the Evaluation Forum: CEUR Workshop Proceedings Vol-2936, pages 133-156, September 2021.
516.Mayank Anand, Jiarui Zhang, Shane Ding, Ji Xin, and Jimmy Lin. Serverless BM25 Search and BERT Reranking. Proceedings of the 2nd International Conference on Design of Experimental Search & Information REtrieval Systems (DESIRES 2021): CEUR Workshop Proceedings Vol-2950, pages 3-9, September 2021. [slides] [talk]
515.Jimmy Lin, Xueguang Ma, Joel Mackenzie, and Antonio Mallia. On the Separation of Logical and Physical Ranking Models for Text Retrieval Applications. Proceedings of the 2nd International Conference on Design of Experimental Search & Information REtrieval Systems (DESIRES 2021): CEUR Workshop Proceedings Vol-2950, pages 176-178, September 2021. [slides] [talk]
514.Peng Shi, Rui Zhang, He Bai, and Jimmy Lin. Cross-Lingual Training with Dense Retrieval for Document Retrieval. arXiv:2109.01628, September 2021.
513.Xinyu Zhang, Xueguang Ma, Peng Shi, and Jimmy Lin. Mr. TyDi: A Multi-lingual Benchmark for Dense Retrieval. arXiv:2108.08787, August 2021. (Later appears at the EMNLP 2021 workshop on Multilingual Representation Learning)
512.Ogundepo Odunayo, Naveela N. Sookoo, Gautam Bathla, Anthony Cavallin, Bhaleka D. Persaud, Kathy Szigeti, Philippe Van Cappellen, and Jimmy Lin. Rescuing Historical Climate Observations to Support Hydrological Research: A Case Study of Solar Radiation Data. Proceedings of the 21st ACM Symposium on Document Engineering (DocEng '21), Article No. 19, August 2021.
511.Sheng-Chieh Lin, Jheng-Hong Yang, and Jimmy Lin. In-Batch Negatives for Knowledge Distillation with Tightly-Coupled Teachers for Dense Retrieval. Proceedings of the 6th Workshop on Representation Learning for NLP (RepL4NLP-2021), pages 163-173, August 2021.
510.Xinyu Zhang, Ji Xin, Andrew Yates, and Jimmy Lin. Bag-of-Words Baselines for Semantic Code Search. Proceedings of the 1st Workshop on Natural Language Processing for Programming (NLP4Prog 2021), pages 88-94, August 2021.
509.He Bai, Peng Shi, Jimmy Lin, Luchen Tan, Kun Xiong, Wen Gao, Jie Liu, and Ming Li. Semantics of the Unwritten: The Effect of End of Paragraph and Sequence Tokens on Text Generation with GPT2. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: Student Research Workshop, pages 148-162, August 2021.
508.Ji Xin, Raphael Tang, Yaoliang Yu, and Jimmy Lin. The Art of Abstention: Selective Prediction and Error Regularization for Natural Language Processing. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 1040-1051, August 2021.
507.Kelvin Jiang, Ronak Pradeep, and Jimmy Lin. Exploring Listwise Evidence Reasoning with T5 for Fact Verification. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pages 402-410, August 2021.
506.Sebastian Hofstätter, Sheng-Chieh Lin, Jheng-Hong Yang, Jimmy Lin, and Allan Hanbury. Efficiently Teaching an Effective Dense Retriever with Balanced Topic Aware Sampling. Proceedings of the 44th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2021), pages 113-122, July 2021.
505.Nick Craswell, Bhaskar Mitra, Daniel Campos, Emine Yilmaz, and Jimmy Lin. MS MARCO: Benchmarking Ranking Models in the Large-Data Regime. Proceedings of the 44th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2021), pages 1566-1576, July 2021.
504.Ronak Pradeep, Xueguang Ma, Rodrigo Nogueira, and Jimmy Lin. Vera: Prediction Techniques for Reducing Harmful Misinformation in Consumer Health Search. Proceedings of the 44th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2021), pages 2066-2070, July 2021.
503.Jimmy Lin, Daniel Campos, Nick Craswell, Bhaskar Mitra, and Emine Yilmaz. Significant Improvements over the State of the Art? A Case Study of the MS MARCO Document Ranking Leaderboard. Proceedings of the 44th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2021), pages 2283-2287, July 2021.
502.Jimmy Lin, Xueguang Ma, Sheng-Chieh Lin, Jheng-Hong Yang, Ronak Pradeep, and Rodrigo Nogueira. Pyserini: A Python Toolkit for Reproducible Information Retrieval Research with Sparse and Dense Representations. Proceedings of the 44th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2021), pages 2356-2362, July 2021.
501.Edwin Zhang, Sheng-Chieh Lin, Jheng-Hong Yang, Ronak Pradeep, Rodrigo Nogueira, and Jimmy Lin. Chatty Goose: A Python Framework for Conversational Search. Proceedings of the 44th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2021), pages 2521-2525, July 2021.
500.Wei Zhong and Jimmy Lin. PyA0: A Python Toolkit for Accessible Math-Aware Search. Proceedings of the 44th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2021), pages 2541-2545, July 2021.
499.Andrew Yates, Rodrigo Nogueira, and Jimmy Lin. Pretrained Transformers for Text Ranking: BERT and Beyond. Proceedings of the 44th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2021), pages 2666-2668, July 2021.
498.Xiao Han, Yuqi Liu, and Jimmy Lin. The Simplest Thing That Can Possibly Work: (Pseudo-)Relevance Feedback via Text Classification. Proceedings of the 2021 ACM SIGIR International Conference on the Theory of Information Retrieval (ICTIR 2021), pages 123-129, July 2021.
497.Juliane Mai, Bryan A. Tolson, Hongren Shen, Étienne Gaborit, Vincent Fortin, Nicolas Gasset, Hervé Awoye, Tricia A. Stadnyk, Lauren M. Fry, Emily A. Bradley, Frank Seglenieks, André G. T. Temgoua, Daniel G. Princz, Shervan Gharari, Amin Haghnegahdar, Mohamed E. Elshamy, Saman Razavi, Martin Gauch, Jimmy Lin, Xiaojing Ni, Yongping Yuan, Meghan McLeod, Nandita B. Basu, Rohini Kumar, Oldrich Rakovec, Luis Samaniego, Sabine Attinger, Narayan K. Shrestha, Prasad Daggupati, Tirthankar Roy, Sungwook Wi, Tim Hunter, James R. Craig, and Alain Pietroniro. Great Lakes Runoff Intercomparison Project Phase 3: Lake Erie (GRIP-E). Journal of Hydrologic Engineering, 26(9):05021020, 2021.
496.Jimmy Lin and Xueguang Ma. A Few Brief Notes on DeepImpact, COIL, and a Conceptual Framework for Information Retrieval Techniques. arXiv:2106.14807, June 2021.
495.Andrew Yates, Rodrigo Nogueira, and Jimmy Lin. Pretrained Transformers for Text Ranking: BERT and Beyond. Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Tutorials, pages 1-4, June 2021.
494.Nick Craswell, Bhaskar Mitra, Daniel Campos, Emine Yilmaz, and Jimmy Lin. MS MARCO: Benchmarking Ranking Models in the Large-Data Regime. arXiv:2105.04021, May 2021. (Later appears at SIGIR 2021)
493.Rodrigo Nogueira, Zhiying Jiang, and Jimmy Lin. Investigating the Limitations of Transformers with Simple Arithmetic Tasks. Proceedings of the 1st Mathematical Reasoning in General Artificial Intelligence Workshop at ICLR 2021, May 2021.
492.Ji Xin, Raphael Tang, Yaoliang Yu, and Jimmy Lin. BERxiT: Early Exiting for BERT with Better Fine-Tuning and Extension to Regression. Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 91-104, April 2021.
491.Mohan Zhang, Luchen Tan, Zihang Fu, Kun Xiong, Jimmy Lin, Ming Li, and Zhengkai Tu. Don't Change Me! User-Controllable Selective Paraphrase Generation. Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 3522-3527, April 2021.
490.Ronak Pradeep, Xueguang Ma, Rodrigo Nogueira, and Jimmy Lin. Scientific Claim Verification with VerT5erini. Proceedings of the 12th International Workshop on Health Text Mining and Information Analysis, pages 94-103, April 2021.
489.Sheng-Chieh Lin, Jheng-Hong Yang, and Jimmy Lin. Contextualized Query Embeddings for Conversational Search. arXiv:2104.08707, April 2021. (Later appears at EMNLP 2021)
488.Martin Gauch, Frederik Kratzert, Daniel Klotz, Grey Nearing, Jimmy Lin, and Sepp Hochreiter. Rainfall–Runoff Prediction at Multiple Timescales with a Single Long Short-Term Memory Network. Hydrology and Earth System Sciences, 25(4):2045-2062, 2021.
487.Sebastian Hofstätter, Sheng-Chieh Lin, Jheng-Hong Yang, Jimmy Lin, and Allan Hanbury. Efficiently Teaching an Effective Dense Retriever with Balanced Topic Aware Sampling. arXiv:2104.06967, April 2021. (Later appears at SIGIR 2021)
486.Xueguang Ma, Kai Sun, Ronak Pradeep, and Jimmy Lin. A Replication Study of Dense Passage Retriever. arXiv:2104.05740, April 2021.
485.Xinyu Zhang, Andrew Yates, and Jimmy Lin. Comparing Score Aggregation Approaches for Document Retrieval with Pretrained Transformers. Proceedings of the 43rd European Conference on Information Retrieval (ECIR 2021), Part II, pages 150-163, March 2021. [local PDF]
484.Samantha Fritz, Ian Milligan, Nick Ruest, and Jimmy Lin. Fostering Community Engagement through Datathon Events: The Archives Unleashed Experience. Digital Humanities Quarterly, 15(1), 2021.
483.Andrew Yates, Rodrigo Nogueira, and Jimmy Lin. Pretrained Transformers for Text Ranking: BERT and Beyond. Proceedings of the 14th ACM International Conference on Web Search and Data Mining (WSDM 2021), pages 1154-1156, March 2021.
482.Rodrigo Nogueira, Zhiying Jiang, and Jimmy Lin. Investigating the Limitations of Transformers with Simple Arithmetic Tasks. arXiv:2102.13019, February 2021. (Later appears at the ICLR 2021 workshop on Mathematical Reasoning in General Artificial Intelligence)
481.Jimmy Lin, Daniel Campos, Nick Craswell, Bhaskar Mitra, and Emine Yilmaz. Significant Improvements over the State of the Art? A Case Study of the MS MARCO Document Ranking Leaderboard. arXiv:2102.12887, February 2021. (Slightly shorter version later appears at SIGIR 2021)
480.Jimmy Lin, Xueguang Ma, Sheng-Chieh Lin, Jheng-Hong Yang, Ronak Pradeep, and Rodrigo Nogueira. Pyserini: An Easy-to-Use Python Toolkit to Support Replicable IR Research with Sparse and Dense Representations. arXiv:2102.10073, February 2021.
479.He Bai, Peng Shi, Jimmy Lin, Yuqing Xie, Luchen Tan, Kun Xiong, Wen Gao, and Ming Li. Segatron: Segment-Aware Transformer for Language Modeling and Understanding. Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21), pages 12526-12534, February 2021.
478.Ronak Pradeep, Rodrigo Nogueira, and Jimmy Lin. The Expando-Mono-Duo Design Pattern for Text Ranking with Pretrained Sequence-to-Sequence Models. arXiv:2101.05667, January 2021.
477.Nick Ruest, Samantha Fritz, Ryan Deschamps, Jimmy Lin, and Ian Milligan. From Archive to Analysis: Accessing Web Archives at Scale Through a Cloud-Based Interface. International Journal of Digital Humanities, 2021.
476.Martin Gauch, Juliane Mai, and Jimmy Lin. The Proper Care and Feeding of CAMELS: How Limited Training Data Affects Streamflow Prediction. Environmental Modelling & Software, 135:104926, 2021.

2020

475.Rodrigo Nogueira, Zhiying Jiang, Kyunghyun Cho, and Jimmy Lin. Navigation-Based Candidate Expansion and Pretrained Language Models for Citation Recommendation. Scientometrics, 125(3):3001-3016, 2020.
474.Samantha Fritz, Ian Milligan, Nick Ruest, and Jimmy Lin. Building Community at Distance: A Datathon During COVID-19. Digital Library Perspectives, 36(4):415-428, 2020.
473.Martin Gauch, Daniel Klotz, Frederik Kratzert, Grey Nearing, Sepp Hochreiter, and Jimmy Lin. A Machine Learner's Guide to Streamflow Prediction. NeurIPS 2020 Workshop on AI for Earth Sciences, December 2020. (Revised version of June 2020 arXiv paper.)
472.Jheng-Hong Yang, Sheng-Chieh Lin, Rodrigo Nogueira, Ming-Feng Tsai, Chuan-Ju Wang, and Jimmy Lin. Designing Templates for Eliciting Commonsense Knowledge from Pretrained Sequence-to-Sequence Models. Proceedings of the 28th International Conference on Computational Linguistics, pages 3449-3453, December 2020. (Updated and expanded version of March 2020 arXiv paper.)
471.Ronak Pradeep, Xueguang Ma, Xinyu Zhang, Hang Cui, Ruizhou Xu, Rodrigo Nogueira, and Jimmy Lin. H2oloo at TREC 2020: When all you got is a hammer... Deep Learning, Health Misinformation, and Precision Medicine. Proceedings of the Twenty-Ninth Text REtrieval Conference (TREC 2020), November 2020, Gaithersburg, Maryland.
470.Rodrigo Nogueira, Zhiying Jiang, Ronak Pradeep, and Jimmy Lin. Document Ranking with a Pretrained Sequence-to-Sequence Model. Findings of the Association for Computational Linguistics: EMNLP 2020, pages 708-718, November 2020. (Updated and expanded version of March 2020 arXiv paper.)
469.Peng Shi, He Bai, and Jimmy Lin. Cross-Lingual Training of Neural Models for Document Ranking. Findings of the Association for Computational Linguistics: EMNLP 2020, pages 2768-2773, November 2020.
468.Zhiying Jiang, Raphael Tang, Ji Xin, and Jimmy Lin. Inserting Information Bottlenecks for Attribution in Transformers. Findings of the Association for Computational Linguistics: EMNLP 2020, pages 3850-3857, November 2020.
467.Ji Xin, Rodrigo Nogueira, Yaoliang Yu, and Jimmy Lin. Early Exiting BERT for Efficient Document Ranking. Proceedings of SustaiNLP: Workshop on Simple and Efficient Natural Language Processing, pages 83-88, November 2020.
466.Xinyu Zhang, Andrew Yates, and Jimmy Lin. A Little Bit Is Worse Than None: Ranking with Limited Training Data. Proceedings of SustaiNLP: Workshop on Simple and Efficient Natural Language Processing, pages 107-122, November 2020.
465.Edwin Zhang, Nikhil Gupta, Raphael Tang, Xiao Han, Ronak Pradeep, Kuang Lu, Yue Zhang, Rodrigo Nogueira, Kyunghyun Cho, Hui Fang, and Jimmy Lin. Covidex: Neural Ranking Models and Keyword Search Infrastructure for the COVID-19 Open Research Dataset. Proceedings of the 1st Workshop on Scholarly Document Processing, pages 31-41, November 2020. (Updated and expanded version of July 2020 arXiv paper.)
464.Shane Ding, Edwin Zhang, and Jimmy Lin. Cydex: Neural Search Infrastructure for the Scholarly Literature. Proceedings of the 1st Workshop on Scholarly Document Processing, pages 168-173, November 2020.
463.Raphael Tang, Jaejun Lee, Afsaneh Razi, Julia Cambre, Ian Bicking, Jofish Kaye, and Jimmy Lin. Howl: A Deployed, Open-Source Wake Word Detection System. Proceedings of the Second Workshop for NLP Open Source Software (NLP-OSS), pages 61-65, November 2020.
462.Ronak Pradeep, Xueguang Ma, Rodrigo Nogueira, and Jimmy Lin. Scientific Claim Verification with VerT5erini. arXiv:2010.11930, October 2020. (Later appears at EACL 2021 workshop on Health Text Mining and Information Analysis)
461.Sheng-Chieh Lin, Jheng-Hong Yang, and Jimmy Lin. Distilling Dense Representations for Ranking using Tightly-Coupled Teachers. arXiv:2010.11386, October 2020.
460.Minghan Li, He Bai, Luchen Tan, Kun Xiong, Ming Li, and Jimmy Lin. Latte-Mix: Measuring Sentence Semantic Similarity with Latent Categorical Mixtures. arXiv:2010.11351, October 2020.
459.Martin Gauch, Frederik Kratzert, Daniel Klotz, Grey Nearing, Jimmy Lin, and Sepp Hochreiter. Rainfall-Runoff Prediction at Multiple Timescales with a Single Long Short-Term Memory Network. arXiv:2010.07921, October 2020. (Later appears in Hydrology and Earth System Sciences, 2021)
458.Jimmy Lin, Rodrigo Nogueira, and Andrew Yates. Pretrained Transformers for Text Ranking: BERT and Beyond. arXiv:2010.06467, October 2020.
457.Andrew Yates, Kevin Martin Jose, Xinyu Zhang, and Jimmy Lin. Flexible IR Pipelines with Capreolus. Proceedings of the 29th International Conference on Information and Knowledge Management (CIKM 2020), pages 3181-3188, October 2020.
456.Zhengkai Tu, Wei Yang, Zihang Fu, Yuqing Xie, Luchen Tan, Kun Xiong, Ming Li, and Jimmy Lin. Approximate Nearest Neighbor Search and Lightweight Dense Vector Reranking in Multi-Stage Retrieval Architectures. Proceedings of the 2020 ACM SIGIR International Conference on the Theory of Information Retrieval (ICTIR 2020), pages 97-100, September 2020.
455.Venu Satuluri, Yao Wu, Xun Zheng, Yilei Qian, Brian Wichers, Qieyun Dai, Gui Ming Tang, Jerry Jiang, and Jimmy Lin. SimClusters: Community-Based Representations for Heterogeneous Recommendations at Twitter. Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD 2020), pages 3183-3193, August 2020.
454.Raphael Tang, Jaejun Lee, Afsaneh Razi, Julia Cambre, Ian Bicking, Jofish Kaye, and Jimmy Lin. Howl: A Deployed, Open-Source Wake Word Detection System. arXiv:2008.09606, August 2020. (Later appears at the EMNLP 2020 NLP-OSS workshop)
453.Mohan Zhang, Luchen Tan, Zhengkai Tu, Zihang Fu, Kun Xiong, Ming Li, and Jimmy Lin. To Paraphrase or Not To Paraphrase: User-Controllable Selective Paraphrase Generation. arXiv:2008.09290, August 2020.
452.Nick Ruest, Jimmy Lin, Ian Milligan, and Samantha Fritz. The Archives Unleashed Project: Technology, Process, and Community to Improve Scholarly Access to Web Archives. Proceedings of the 20th ACM/IEEE Joint Conference on Digital Libraries (JCDL 2020), pages 157-166, August 2020.
451.Tobi Adewoye, Xiao Han, Nick Ruest, Ian Milligan, Samantha Fritz, and Jimmy Lin. Content-Based Exploration of Archival Images Using Neural Networks. Proceedings of the 20th ACM/IEEE Joint Conference on Digital Libraries (JCDL 2020), pages 489-490, August 2020.
450.Martin Gauch, James Bai, Juliane Mai, and Jimmy Lin. An Open-Source Interface to the Canadian Surface Prediction Archive. Proceedings of the 20th ACM/IEEE Joint Conference on Digital Libraries (JCDL 2020), pages 529-530, August 2020.
449.Zeynep Akkalyoncu Yilmaz, Charles L. A. Clarke, and Jimmy Lin. A Lightweight Environment for Learning Experimental IR Research Practices. Proceedings of the 43rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2020), pages 2113-2116, July 2020.
448.Jimmy Lin, Joel Mackenzie, Chris Kamphuis, Craig Macdonald, Antonio Mallia, Michał Siedlaczek, Andrew Trotman, and Arjen de Vries. Supporting Interoperability Between Open-Source Search Engines with the Common Index File Format. Proceedings of the 43rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2020), pages 2149-2152, July 2020.
447.Edwin Zhang, Nikhil Gupta, Raphael Tang, Xiao Han, Ronak Pradeep, Kuang Lu, Yue Zhang, Rodrigo Nogueira, Kyunghyun Cho, Hui Fang, and Jimmy Lin. Covidex: Neural Ranking Models and Keyword Search Infrastructure for the COVID-19 Open Research Dataset. arXiv:2007.07846, July 2020. (Updated and expanded version of April 2020 arXiv paper.)
446.Ji Xin, Raphael Tang, Jaejun Lee, Yaoliang Yu, and Jimmy Lin. DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020), pages 2246-2251, July 2020.
445.Raphael Tang, Jaejun Lee, Ji Xin, Xinyu Liu, Yaoliang Yu, and Jimmy Lin. Showing Your Work Doesn't Always Work. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020), pages 2766-2772, July 2020.
444.Hamidreza Shahidi, Ming Li, and Jimmy Lin. Two Birds, One Stone: A Simple, Unified Model for Text Generation from Structured and Unstructured Data. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020), pages 3864-3870, July 2020.
443.Edwin Zhang, Nikhil Gupta, Rodrigo Nogueira, Kyunghyun Cho, and Jimmy Lin. Rapidly Deploying a Neural Search Engine for the COVID-19 Open Research Dataset. Proceedings of the 1st Workshop on NLP for COVID-19 at ACL 2020, July 2020. (Extended abstract version of April 2020 arXiv paper.)
442.Ashutosh Adhikari, Achyudh Ram, Raphael Tang, William L. Hamilton, and Jimmy Lin. Exploring the Limits of Simple Learners in Knowledge Distillation for Document Classification with DocBERT. Proceedings of the 5th Workshop on Representation Learning for NLP, pages 72-77, July 2020.
441.Martin Gauch and Jimmy Lin. A Data Scientist's Guide to Streamflow Prediction. arXiv:2006.12975, June 2020.
440.Mashrekur Rahman, Jonathan M. Frame, Jimmy Lin, and Grey Nearing. Hidden Stories: Topic Modeling in Hydrology Literature. EarthArXiv, doi:10.31223/osf.io/2sy7a, May 2020.
439.Siddhartha Sahu, Amine Mhedhbi, Semih Salihoglu, Jimmy Lin, and M. Tamer Özsu. The Ubiquity of Large Graphs and Surprising Challenges of Graph Processing: Extended Survey. The VLDB Journal, 29(2-3):595-618, 2020.
438.Sheng-Chieh Lin, Jheng-Hong Yang, Rodrigo Nogueira, Ming-Feng Tsai, Chuan-Ju Wang, and Jimmy Lin. Query Reformulation using Query History for Passage Retrieval in Conversational Search. arXiv:2005.02230, May 2020 (v1), updated March 2021 (v2) with title change to "Multi-Stage Conversational Passage Retrieval: An Approach to Fusing Term Importance Estimation and Neural Query Rewriting". (Later appears in the ACM Transactions on Information Systems, October 2021)
437.He Bai, Peng Shi, Jimmy Lin, Luchen Tan, Kun Xiong, Wen Gao, and Ming Li. SegaBERT: Pre-training of Segment-aware BERT for Language Understanding. arXiv:2004.14996, April 2020 (v1), updated December 2020 (v2). (Later appears at AAAI 2021)
436.Raphael Tang, Jaejun Lee, Ji Xin, Xinyu Liu, Yaoliang Yu, and Jimmy Lin. Showing Your Work Doesn't Always Work. arXiv:2004.13705, April 2020. (Later appears at ACL 2020)
435.Ji Xin, Raphael Tang, Jaejun Lee, Yaoliang Yu, and Jimmy Lin. DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference. arXiv:2004.12993, April 2020. (Later appears at ACL 2020)
434.Raphael Tang, Rodrigo Nogueira, Edwin Zhang, Nikhil Gupta, Phuong Cam, Kyunghyun Cho, and Jimmy Lin. Rapidly Bootstrapping a Question Answering Dataset for COVID-19. arXiv:2004.11339, April 2020.
433.Yuqing Xie, Wei Yang, Luchen Tan, Kun Xiong, Nicholas Jing Yuan, Baoxing Huai, Ming Li, and Jimmy Lin. Distant Supervision for Multi-Stage Fine-Tuning in Retrieval-Based Question Answering. Proceedings of The Web Conference 2020 (WWW '20), pages 2934-2940, April 2020.
432.Adrien Grand, Robert Muir, Jim Ferenczi, and Jimmy Lin. From MaxScore to Block-Max WAND: The Story of How Lucene Significantly Improved Query Evaluation Performance. Proceedings of the 42nd European Conference on Information Retrieval, Part II (ECIR 2020), pages 20-27, April 2020.
431.Chris Kamphuis, Arjen de Vries, Leonid Boytsov, and Jimmy Lin. Which BM25 Do You Mean? A Large-Scale Reproducibility Study of Scoring Variants. Proceedings of the 42nd European Conference on Information Retrieval, Part II (ECIR 2020), pages 28-34, April 2020.
430.Jimmy Lin and Qian Zhang. Reproducibility is a Process, not an Achievement: The Replicability of IR Reproducibility Experiments. Proceedings of the 42nd European Conference on Information Retrieval, Part II (ECIR 2020), pages 43-49, April 2020.
429.Rodrigo Nogueira, Zhiying Jiang, Kyunghyun Cho, and Jimmy Lin. Evaluating Pretrained Transformer Models for Citation Recommendation. Proceedings of the 10th International Workshop on Bibliometric-enhanced Information Retrieval (BIR) at ECIR 2020: CEUR Workshop Proceedings Vol-2591, pages 89-100, April 2020.
428.Edwin Zhang, Nikhil Gupta, Rodrigo Nogueira, Kyunghyun Cho, and Jimmy Lin. Rapidly Deploying a Neural Search Engine for the COVID-19 Open Research Dataset: Preliminary Thoughts and Lessons Learned. arXiv:2004.05125, April 2020.
427.He Bai, Peng Shi, Jimmy Lin, Luchen Tan, Kun Xiong, Wen Gao, Jie Liu, and Ming Li. Semantics of the Unwritten. arXiv:2004.02251, April 2020.
426.Sheng-Chieh Lin, Jheng-Hong Yang, Rodrigo Nogueira, Ming-Feng Tsai, Chuan-Ju Wang, and Jimmy Lin. Conversational Question Reformulation via Sequence-to-Sequence Architectures and Pretrained Language Models. arXiv:2004.01909, April 2020.
425.Sheng-Chieh Lin, Jheng-Hong Yang, Rodrigo Nogueira, Ming-Feng Tsai, Chuan-Ju Wang, and Jimmy Lin. TTTTTackling WinoGrande Schemas. arXiv:2003.08380, March 2020. (Later appears at COLING 2020)
424.Jimmy Lin, Joel Mackenzie, Chris Kamphuis, Craig Macdonald, Antonio Mallia, Michał Siedlaczek, Andrew Trotman, and Arjen de Vries. Supporting Interoperability Between Open-Source Search Engines with the Common Index File Format. arXiv:2003.08276, March 2020.
423.Rodrigo Nogueira, Zhiying Jiang, and Jimmy Lin. Document Ranking with a Pretrained Sequence-to-Sequence Model. arXiv:2003.06713, March 2020. (Later appears in Findings of EMNLP 2020)
422.Jimmy Lin, Ian Milligan, Douglas Oard, Nick Ruest, and Katie Shilton. We Could, but Should We? Ethical Considerations for Providing Access to GeoCities and Other Historical Digital Collections. Proceedings of the Fifth ACM SIGIR Conference on Human Information Interaction and Retrieval (CHIIR 2020), pages 135-144, March 2020.
421.Royal Sequiera, Luchen Tan, Yinan Zhang, and Jimmy Lin. Update Delivery Mechanisms for Prospective Information Needs: A Reproducibility Study. Proceedings of the Fifth ACM SIGIR Conference on Human Information Interaction and Retrieval (CHIIR 2020), pages 308-312, March 2020.
420.Ruixue Zhang, Wei Yang, Luyun Lin, Zhengkai Tu, Yuqing Xie, Zihang Fu, Yuhao Xie, Luchen Tan, Kun Xiong, and Jimmy Lin. Rapid Adaptation of BERT for Information Extraction on Domain-Specific Business Documents. arXiv:2002.01861, February 2020.
419.Jimmy Lin. A Prototype of Serverless Lucene. arXiv:2002.01447, February 2020.
418.Andrew Yates, Siddhant Arora, Xinyu Zhang, Wei Yang, Kevin Martin Jose, and Jimmy Lin. Capreolus: A Toolkit for End-to-End Neural Ad Hoc Retrieval. Proceedings of the 13th ACM International Conference on Web Search and Data Mining (WSDM 2020), pages 861-864, February 2020, Houston, Texas.
417.Rodrigo Nogueira, Zhiying Jiang, Kyunghyun Cho, and Jimmy Lin. Navigation-Based Candidate Expansion and Pretrained Language Models for Citation Recommendation. arXiv:2001.08687, January 2020. (Later appears in Scientometrics)
416.Nick Ruest, Jimmy Lin, Ian Milligan, and Samantha Fritz. The Archives Unleashed Project: Technology, Process, and Community to Improve Scholarly Access to Web Archives. arXiv:2001.05399, January 2020. (Later appears at JCDL 2020)

2019

415.Jimmy Lin. The Neural Hype, Justified! A Recantation. SIGIR Forum, 53(2):88-93, 2019.
414.Martin Gauch, Juliane Mai, Shervan Gharari, and Jimmy Lin. Streamflow Prediction with Limited Spatially-Distributed Input Data. NeurIPS 2019 Workshop on Tackling Climate Change with Machine Learning, December 2019, Vancouver, British Columbia, Canada.
413.Rodrigo Nogueira and Jimmy Lin. From doc2query to docTTTTTquery. December 2019.
412.Martin Gauch, Juliane Mai, and Jimmy Lin. The Proper Care and Feeding of CAMELS: How Limited Training Data Affects Streamflow Prediction. arXiv:1911.07249, November 2019. (Later appears in Environmental Modelling & Software, 2021)
411.Achyudh Ram, Ji Xin, Meiyappan Nagappan, Yaoliang Yu, Rocío Cabrera Lozoya, Antonino Sabetta, and Jimmy Lin. Exploiting Token and Path-based Representations of Code for Identifying Security-Relevant Commits. arXiv:1911.07620, November 2019.
410.Allison McCoy, Dean Sittig, Jimmy Lin, and Adam Wright. Identification and Ranking of Biomedical Informatics Researcher Citation Statistics through a Google Scholar Scraper. Proceedings of the 2019 Annual Symposium of the American Medical Informatics Association (AMIA 2019), pages 655-663, November 2019, Washington, D.C.
409.Jheng-Hong Yang, Sheng-Chieh Lin, Jimmy Lin, Ming-Feng Tsai, and Chuan-Ju Wang. Query and Answer Expansion from Conversation History. Proceedings of the Twenty-Eighth Text REtrieval Conference (TREC 2019), November 2019, Gaithersburg, Maryland.
408.Zeynep Akkalyoncu Yilmaz, Shengjin Wang, and Jimmy Lin. H2oloo at TREC 2019: Combining Sentence and Document Evidence in the Deep Learning Track. Proceedings of the Twenty-Eighth Text REtrieval Conference (TREC 2019), November 2019, Gaithersburg, Maryland.
407.Linqing Liu, Huan Wang, Jimmy Lin, Richard Socher, and Caiming Xiong. Attentive Student Meets Multi-Task Teacher: Improved Knowledge Distillation for Pretrained Models. arXiv:1911.03588, November 2019.
406.Jaejun Lee, Raphael Tang, and Jimmy Lin. What Would Elsa Do? Freezing Layers During Transformer Fine-Tuning. arXiv:1911.03090, November 2019.
405.Peng Shi and Jimmy Lin. Cross-Lingual Relevance Transfer for Document Retrieval. arXiv:1911.02989, November 2019.
404.Yinan Zhang, Raphael Tang, and Jimmy Lin. Explicit Pairwise Word Interaction Modeling Improves Pretrained Transformers for English Semantic Similarity Tasks. arXiv:1911.02847, November 2019.
403.Linqing Liu, Wei Yang, Jinfeng Rao, Raphael Tang, and Jimmy Lin. Incorporating Contextual and Syntactic Structures Improves Semantic Similarity Modeling. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 1204-1209, November 2019, Hong Kong, China.
402.Zeynep Akkalyoncu Yilmaz, Wei Yang, Haotian Zhang, and Jimmy Lin. Cross-Domain Modeling of Sentence-Level Evidence for Document Retrieval. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3490-3496, November 2019, Hong Kong, China.
401.Hsiu-Wei Yang, Yanyan Zou, Peng Shi, Wei Lu, Jimmy Lin, and Xu Sun. Aligning Cross-Lingual Entities with Multi-Aspect Information. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 4431-4441, November 2019, Hong Kong, China.
400.Jinfeng Rao, Linqing Liu, Yi Tay, Wei Yang, Peng Shi, and Jimmy Lin. Bridging the Gap Between Relevance Matching and Semantic Matching for Short Text Similarity Modeling. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 5370-5381, November 2019, Hong Kong, China.
399.Ji Xin, Jimmy Lin, and Yaoliang Yu. What Part of the Neural Network Does This? Understanding LSTMs by Measuring and Dissecting Neurons. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 5823-5830, November 2019, Hong Kong, China.
398.Zeynep Akkalyoncu Yilmaz, Shengjin Wang, Wei Yang, Haotian Zhang, and Jimmy Lin. Applying BERT to Document Retrieval with Birch. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations, pages 19-24, November 2019, Hong Kong, China.
397.Jaejun Lee, Raphael Tang, and Jimmy Lin. Honkling: In-Browser Personalization for Ubiquitous Keyword Spotting. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations, pages 91-96, November 2019, Hong Kong, China.
396.Ryan Clancy, Ihab F. Ilyas, and Jimmy Lin. Knowledge Graph Construction from Unstructured Text with Applications to Fact Verification and Beyond. Proceedings of the Second Workshop on Fact Extraction and VERification (FEVER), pages 39-46, November 2019, Hong Kong, China.
395.Raphael Tang, Yao Lu, and Jimmy Lin. Natural Language Generation for Effective Knowledge Distillation. Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019), pages 202-208, November 2019, Hong Kong, China.
394.Rodrigo Nogueira, Wei Yang, Kyunghyun Cho, and Jimmy Lin. Multi-Stage Document Ranking with BERT. arXiv:1910.14424, October 2019.
393.Jimmy Lin, Lori Paniak, and Gordon Boerke. The Performance Envelope of Inverted Indexing on Modern Hardware. arXiv:1910.11028, October 2019.
392.Tommaso Teofili and Jimmy Lin. Lucene for Approximate Nearest-Neighbors Search on Arbitrary Dense Vectors. arXiv:1910.10208, October 2019.
391.Hsiu-Wei Yang, Yanyan Zou, Peng Shi, Wei Lu, Jimmy Lin, and Xu Sun. Aligning Cross-Lingual Entities with Multi-Aspect Information. arXiv:1910.06575, October 2019. (Later appears at EMNLP 2019)
390.Martin Gauch, Juliane Mai, Shervan Gharari, and Jimmy Lin. Data-Driven vs. Physically-Based Streamflow Prediction Models. Proceedings of the 9th International Workshop on Climate Informatics, October 2019, Paris, France.
389.Hamidreza Shahidi, Ming Li, and Jimmy Lin. Two Birds, One Stone: A Simple, Unified Model for Text Generation from Structured and Unstructured Data. arXiv:1909.10158, September 2019. (Later appears at ACL 2020)
388.Wei Yang, Kuang Lu, Peilin Yang, and Jimmy Lin. Critically Examining the "Neural Hype": Weak Baselines and the Additivity of Effectiveness Gains from Neural Ranking Models. Proceedings of the 42nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2019), pages 1129-1132, July 2019, Paris, France. (🏆 Best Short Paper Honorable Mention)
387.Jimmy Lin and Peilin Yang. The Impact of Score Ties on Repeatability in Document Ranking. Proceedings of the 42nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2019), pages 1125-1128, July 2019, Paris, France.
386.Raphael Tang, Ferhan Ture, and Jimmy Lin. Yelling at Your TV: An Analysis of Speech Recognition Errors and Subsequent User Behavior on Entertainment Systems. Proceedings of the 42nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2019), pages 853-856, July 2019, Paris, France.
385.Ryan Clancy, Toke Eskildsen, Nick Ruest, and Jimmy Lin. Solr Integration in the Anserini Information Retrieval Toolkit. Proceedings of the 42nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2019), pages 1285-1288, July 2019, Paris, France.
384.Ryan Clancy, Jaejun Lee, Zeynep Akkalyoncu Yilmaz, and Jimmy Lin. Information Retrieval Meets Scalable Text Analytics: Solr Integration with Spark. Proceedings of the 42nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2019), pages 1313-1316, July 2019, Paris, France.
383.Ferhan Ture, Jinfeng Rao, Raphael Tang, and Jimmy Lin. Challenges and Opportunities in Understanding Spoken Queries Directed at Modern Entertainment Platforms. Proceedings of the 42nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2019), pages 1375-1376, July 2019, Paris, France.
382.Ryan Clancy, Nicola Ferro, Claudia Hauff, Jimmy Lin, Tetsuya Sakai, and Ze Zhong Wu. The SIGIR 2019 Open-Source IR Replicability Challenge (OSIRRC 2019). Proceedings of the 42nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2019), pages 1432-1434, July 2019, Paris, France.
381.Ryan Clancy, Nicola Ferro, Claudia Hauff, Jimmy Lin, Tetsuya Sakai, and Ze Zhong Wu. (Editors) Proceedings of the Open-Source IR Replicability Challenge (OSIRRC 2019): CEUR Workshop Proceedings Vol-2409, July 2019, Paris, France.
380.Ryan Clancy, Nicola Ferro, Claudia Hauff, Jimmy Lin, Tetsuya Sakai, and Ze Zhong Wu. Overview of the 2019 Open-Source IR Replicability Challenge (OSIRRC 2019). Proceedings of the Open-Source IR Replicability Challenge (OSIRRC 2019): CEUR Workshop Proceedings Vol-2409, pages 1-7, July 2019, Paris, France.
379.Ryan Clancy, Zeynep Akkalyoncu Yilmaz, Ze Zhong Wu, and Jimmy Lin. University of Waterloo Docker Images for OSIRRC at SIGIR 2019. Proceedings of the Open-Source IR Replicability Challenge (OSIRRC 2019): CEUR Workshop Proceedings Vol-2409, page 36, July 2019, Paris, France.
378.Peng Shi, Jinfeng Rao, and Jimmy Lin. Simple Attention-Based Representation Learning for Ranking Short Social Media Posts. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 2212–2217, June 2019, Minneapolis, Minnesota.
377.Ashutosh Adhikari, Achyudh Ram, Raphael Tang, and Jimmy Lin. Rethinking Complex Neural Network Architectures for Document Classification. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4046-4051, June 2019, Minneapolis, Minnesota.
376.Wei Yang, Luchen Tan, Chunwei Lu, Anqi Cui, Han Li, Xi Chen, Kun Xiong, Muzi Wang, Ming Li, Jian Pei, and Jimmy Lin. Detecting Customer Complaint Escalation with Recurrent Neural Networks and Manually-Engineered Features. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Industry Papers), pages 56-63, June 2019, Minneapolis, Minnesota.
375.Wei Yang, Yuqing Xie, Aileen Lin, Xingyu Li, Luchen Tan, Kun Xiong, Ming Li, and Jimmy Lin. End-to-End Open-Domain Question Answering with BERTserini. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations) (NAACL 2019), pages 72-77, June 2019, Minneapolis, Minnesota.
374.Ryan Deschamps, Samantha Fritz, Jimmy Lin, Ian Milligan, and Nick Ruest. The Cost of a WARC: Analyzing Web Archives in the Cloud. Proceedings of the 19th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL 2019), pages 261-264, June 2019, Urbana-Champaign, Illinois.
373.Ian Milligan, Nathalie Casemajor, Samantha Fritz, Jimmy Lin, Nick Ruest, Matthew S. Weber, and Nicholas Worby. Building Community and Tools for Analyzing Web Archives through Datathons. Proceedings of the 19th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL 2019), pages 265-268, June 2019, Urbana-Champaign, Illinois.
372.Ryan Deschamps, Nick Ruest, Jimmy Lin, Samantha Fritz, and Ian Milligan. The Archives Unleashed Notebook: Madlibs for Jumpstarting Scholarly Exploration of Web Archives. Proceedings of the 19th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL 2019), pages 337-338, June 2019, Urbana-Champaign, Illinois.
371.Hsiu-Wei Yang, Linqing Liu, Ian Milligan, Nick Ruest, and Jimmy Lin. Scalable Content-Based Analysis of Images in Web Archives with TensorFlow and the Archives Unleashed Toolkit. Proceedings of the 19th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL 2019), pages 436-437, June 2019, Urbana-Champaign, Illinois.
370.Nick Ruest, Ian Milligan, and Jimmy Lin. Warclight: A Rails Engine for Web Archive Discovery. Proceedings of the 19th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL 2019), pages 442-443, June 2019, Urbana-Champaign, Illinois.
369.Wei Yang, Kuang Lu, Peilin Yang, and Jimmy Lin. Critically Examining the "Neural Hype": Weak Baselines and the Additivity of Effectiveness Gains from Neural Ranking Models. arXiv:1904.09171, April 2019 (v1), updated September 2019 (v2). (Later appears at SIGIR 2019)
368.Jimmy Lin. The Simplest Thing That Can Possibly Work: Pseudo-Relevance Feedback Using Text Classification. arXiv:1904.08861, April 2019.
367.Ashutosh Adhikari, Achyudh Ram, Raphael Tang, and Jimmy Lin. DocBERT: BERT for Document Classification. arXiv:1904.08398, April 2019 (v1), updated August 2019 (v2, v3).
366.Rodrigo Nogueira, Wei Yang, Jimmy Lin, and Kyunghyun Cho. Document Expansion by Query Prediction. arXiv:1904.08375, April 2019 (v1), updated September 2019 (v2).
365.Wei Yang, Yuqing Xie, Luchen Tan, Kun Xiong, Ming Li, and Jimmy Lin. Data Augmentation for BERT Fine-Tuning in Open-Domain Question Answering. arXiv:1904.06652, April 2019.
364.Peilin Yang and Jimmy Lin. Reproducing and Generalizing Semantic Term Matching in Axiomatic Information Retrieval. Proceedings of the 41st European Conference on Information Retrieval, Part I (ECIR 2019), pages 369-381, April 2019, Cologne, Germany.
363.Ruifan Yu, Yuhao Xie, and Jimmy Lin. Simple Techniques for Cross-Collection Relevance Feedback. Proceedings of the 41st European Conference on Information Retrieval, Part I (ECIR 2019), pages 397-409, April 2019, Cologne, Germany. (🏆 Best Reproducibility Paper — Honourable Mention)
362.Peng Shi and Jimmy Lin. Simple BERT Models for Relation Extraction and Semantic Role Labeling. arXiv:1904.05255, April 2019.
361.Raphael Tang, Yao Lu, Linqing Liu, Lili Mou, Olga Vechtomova, and Jimmy Lin. Distilling Task-Specific Knowledge from BERT into Simple Neural Networks. arXiv:1903.12136, March 2019.
360.Wei Yang, Haotian Zhang, and Jimmy Lin. Simple Applications of BERT for Ad Hoc Document Retrieval. arXiv:1903.10972, March 2019.
359.Michael Azmy, Peng Shi, Jimmy Lin, and Ihab F. Ilyas. Matching Entities Across Different Knowledge Graphs with Graph Embeddings. arXiv:1903.06607, March 2019.
358.Jaejun Lee, Raphael Tang, and Jimmy Lin. Universal Voice-Enabled User Interfaces Using JavaScript. Proceedings of the 24th International Conference on Intelligent User Interfaces: Companion (IUI '19), pages 81-82, March 2019, Marina del Rey, California.
357.Wei Yang, Yuqing Xie, Aileen Lin, Xingyu Li, Luchen Tan, Kun Xiong, Ming Li, and Jimmy Lin. End-to-End Open-Domain Question Answering with BERTserini. arXiv:1902.01718, February 2019 (v1), updated September 2019 (v2). (Later appears at NAACL-HLT 2019)
356.Jinfeng Rao, Wei Yang, Yuhao Zhang, Ferhan Ture, and Jimmy Lin. Multi-Perspective Relevance Matching with Hierarchical ConvNets for Social Media Search. Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), pages 232-240, January 2019, Honolulu, Hawaii.

2018

355.Jimmy Lin. The Neural Hype and Comparisons Against Weak Baselines. SIGIR Forum, 52(2):40-51, 2018.
354.Raphael Tang, Gefei Yang, Hong Wei, Yajie Mao, Ferhan Ture, and Jimmy Lin. Streaming Voice Query Recognition using Causal Convolutional Recurrent Neural Networks. arXiv:1812.07754, December 2018.
353.Raphael Tang, Ashutosh Adhikari, and Jimmy Lin. FLOPs as a Direct Optimization Objective for Learning Sparse Neural Networks. Proceedings of the NIPS 2018 Workshop on Compact Deep Neural Networks with Industrial Applications, December 2018, Montreal, Quebec, Canada.
352.Ruifan Yu, Yuhao Xie, and Jimmy Lin. H2oloo at TREC 2018: Cross-Collection Relevance Transfer for the Common Core Track. Proceedings of the Twenty-Seventh Text REtrieval Conference (TREC 2018), November 2018, Gaithersburg, Maryland.
351.Peilin Yang and Jimmy Lin. Anserini at TREC 2018: CENTRE, Common Core, and News Tracks. Proceedings of the Twenty-Seventh Text REtrieval Conference (TREC 2018), November 2018, Gaithersburg, Maryland.
350.Royal Sequiera, Luchen Tan, and Jimmy Lin. Overview of the TREC 2018 Real-Time Summarization Track. Proceedings of the Twenty-Seventh Text REtrieval Conference (TREC 2018), November 2018, Gaithersburg, Maryland.
349.Peng Shi, Jinfeng Rao, and Jimmy Lin. Simple Attention-Based Representation Learning for Ranking Short Social Media Posts. arXiv:1811.01013, November 2018 (v1), updated September 2019 (v2). (Later appears at NAACL-HLT 2019)
348.Raphael Tang and Jimmy Lin. Progress and Tradeoffs in Neural Language Models. arXiv:1811.00942, November 2018.
347.Frank Hopfgartner, Allan Hanbury, Henning Müller, Ivan Eggel, Krisztian Balog, Torben Brodt, Gordon V. Cormack, Jimmy Lin, Jayashree Kalpathy-Cramer, Noriko Kando, Makoto P. Kato, Anastasia Krithara, Tim Gollub, Martin Potthast, Evelyne Viegas, and Simon Mercer. Evaluation-as-a-Service for the Computational Sciences: Overview and Outlook. Journal of Data and Information Quality, 10(4), Article 15, 2018.
346.Peilin Yang, Hui Fang, and Jimmy Lin. Anserini: Reproducible Ranking Baselines Using Lucene. Journal of Data and Information Quality, 10(4), Article 16, 2018.
345.Jaejun Lee, Raphael Tang, and Jimmy Lin. JavaScript Convolutional Neural Networks for Keyword Spotting in the Browser: An Experimental Analysis. arXiv:1810.12859, October 2018.
344.Raphael Tang and Jimmy Lin. Adaptive Pruning of Neural Language Models for Mobile Devices. arXiv:1809.10282, September 2018.
343.Jimmy Lin. Computing without Servers, V8, Rocket Ships, and Other Batsh*t Crazy Ideas in Data Systems. Proceedings of the First Biennial Conference on Design of Experimental Search & Information Retrieval Systems: CEUR Workshop Proceedings Vol-2167, pages 3-6, August 2018, Bertinoro, Italy.
342.Michael Azmy, Peng Shi, Ihab Ilyas, and Jimmy Lin. Farewell Freebase: Migrating the SimpleQuestions Dataset to DBpedia. Proceedings of the 27th International Conference on Computational Linguistics (COLING 2018), pages 2093-2103, August 2018, Santa Fe, New Mexico.
341.Jinfeng Rao, Ferhan Ture, and Jimmy Lin. Multi-Task Learning with Neural Networks for Voice Query Understanding on an Entertainment Platform. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD 2018), pages 635-645, August 2018, London, United Kingdom.
340.Jimmy Lin and Peilin Yang. Repeatability Corner Cases in Document Ranking: The Impact of Score Ties. arXiv:1807.05798, May 2018 (v1), updated September 2019 (v2). (Later appears at SIGIR 2019)
339.Jimmy Lin, Salman Mohammed, Royal Sequiera, and Luchen Tan. Update Delivery Mechanisms for Prospective Information Needs: An Analysis of Attention in Mobile Users. Proceedings of the 41st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2018), pages 785-794, July 2018, Ann Arbor, Michigan.
338.Jinfeng Rao, Ferhan Ture, and Jimmy Lin. What Do Viewers Say to Their TVs? An Analysis of Voice Queries to Entertainment Systems. Proceedings of the 41st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2018), pages 1213-1216, July 2018, Ann Arbor, Michigan.
337.Ajeet Grewal and Jimmy Lin. The Evolution of Content Analysis for Personalized Recommendations at Twitter. Proceedings of the 41st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2018), pages 1355-1356, July 2018, Ann Arbor, Michigan.
336.Ajeet Grewal, Jerry Jiang, Gary Lam, Tristan Jung, Lohith Vuddemarri, Quannan Li, Aaditya Landge, and Jimmy Lin. RecService: Distributed Real-Time Graph Processing at Twitter. Proceedings of the 10th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud '18), July 2018, Boston, Massachusetts.
335.Youngbin Kim and Jimmy Lin. Serverless Data Analytics with Flint. Proceedings of the 2018 IEEE 11th International Conference on Cloud Computing (CLOUD 2018), pages 451-455, July 2018, San Francisco, California.
334.Jimmy Lin. Scale Up or Scale Out for Graph Processing? IEEE Internet Computing, 22(3):72-78, 2018.
333.Peilin Yang, Srikanth Thiagarajan, and Jimmy Lin. Robust, Scalable, Real-Time Event Time Series Aggregation at Twitter. Proceedings of the 2018 ACM SIGMOD International Conference on Management of Data (SIGMOD 2018), pages 595-599, June 2018, Houston, Texas.
332.Salman Mohammed, Peng Shi, and Jimmy Lin. Strong Baselines for Simple Question Answering over Knowledge Graphs with and without Neural Networks. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers), pages 291-296, May 2018, New Orleans, Louisiana.
331.Zhucheng Tu, Mengping Li, and Jimmy Lin. Pay-Per-Request Deployment of Neural Network Models Using Serverless Architectures. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations, pages 6-10, May 2018, New Orleans, Louisiana.
330.Yiyun Liang, Zhucheng Tu, Laetitia Huang, and Jimmy Lin. CNNs for NLP in the Browser: Client-Side Deployment and Visualization Opportunities. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations, pages 61-65, May 2018, New Orleans, Louisiana.
329.Ryan Deschamps, Jimmy Lin, Nick Ruest, Samantha Fritz, and Ian Milligan. Usability, Accessibility, and Performance: Striking the Right Balance with the Archives Unleashed Toolkit. CSDH/SCHN Digital Humanities Conference 2018, May 2018, Toronto, Ontario, Canada.
328.Jinfeng Rao, Wei Yang, Yuhao Zhang, Ferhan Ture, and Jimmy Lin. Multi-Perspective Relevance Matching with Hierarchical ConvNets for Social Media Search. arXiv:1805.08159, May 2018 (v1), updated June 2019 (v2). (Later appears at AAAI-19)
327.Kareem El Gebaly and Jimmy Lin. In-Browser Split-Execution Support for Interactive Analytics in the Cloud. arXiv:1804.08822, April 2018.
326.Raphael Tang, Weijie Wang, Zhucheng Tu, and Jimmy Lin. An Experimental Analysis of the Power Consumption of Convolutional Neural Networks for Keyword Spotting. Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2018), pages 5479-5483, April 2018, Calgary, Alberta, Canada.
325.Raphael Tang and Jimmy Lin. Deep Residual Learning for Small-Footprint Keyword Spotting. Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2018), pages 5484-5488, April 2018, Calgary, Alberta, Canada.
324.Youngbin Kim and Jimmy Lin. Serverless Data Analytics with Flint. arXiv:1803.06354, March 2018. (Later appears at CLOUD 2018)
323.Joel Mackenzie, Shane Culpepper, Roi Blanco, Matt Crane, Charles L. A. Clarke, and Jimmy Lin. Query Driven Algorithm Selection in Early Stage Retrieval. Proceedings of the 11th ACM International Conference on Web Search and Data Mining (WSDM 2018), pages 396-404, February 2018, Marina Del Rey, California.

2017

322.Siddhartha Sahu, Amine Mhedhbi, Semih Salihoglu, Jimmy Lin, and M. Tamer Özsu. The Ubiquity of Large Graphs and Surprising Challenges of Graph Processing. Proceedings of the VLDB Endowment, 11(4):420-431, 2017. (🏆 Best Paper)
321.Salman Mohammed, Peng Shi, and Jimmy Lin. Strong Baselines for Simple Question Answering over Knowledge Graphs with and without Neural Networks. arXiv:1712.01969, December 2017 (v1), updated June 2018 (v2). (Later appears at NAACL-HLT 2018)
320.Jimmy Lin, Salman Mohammed, Royal Sequiera, Luchen Tan, Nimesh Ghelani, Mustafa Abualsaud, Richard McCreadie, Dmitrijs Milajevs, and Ellen Voorhees. Overview of the TREC 2017 Real-Time Summarization Track. Proceedings of the Twenty-Sixth Text REtrieval Conference (TREC 2017), November 2017, Gaithersburg, Maryland.
319.Gaurav Baruah, Richard McCreadie, and Jimmy Lin. A Comparison of Nuggets and Clusters for Evaluating Timeline Summaries. Proceedings of the 2017 International Conference on Information and Knowledge Management (CIKM 2017), pages 67-76, November 2017, Singapore.
318.Jinfeng Rao, Ferhan Ture, Hua He, Oliver Jojic, and Jimmy Lin. Talking to Your TV: Context-Aware Voice Search with Hierarchical Recurrent Neural Networks. Proceedings of the 2017 International Conference on Information and Knowledge Management (CIKM 2017), pages 557-566, November 2017, Singapore.
317.Raphael Tang, Weijie Wang, Zhucheng Tu, and Jimmy Lin. An Experimental Analysis of the Power Consumption of Convolutional Neural Networks for Keyword Spotting. arXiv:1711.00333, October 2017. (Later appears at ICASSP 2018)
316.Raphael Tang and Jimmy Lin. Deep Residual Learning for Small-Footprint Keyword Spotting. arXiv:1710.10361, October 2017. (Later appears at ICASSP 2018)
315.Raphael Tang and Jimmy Lin. Honk: A PyTorch Reimplementation of Convolutional Neural Networks for Keyword Spotting. arXiv:1710.06554, October 2017.
314.Jinfeng Rao, Ferhan Ture, Xing Niu, and Jimmy Lin. Mining the Temporal Statistics of Query Terms for Searching Social Media Posts. Proceedings of the 3rd ACM International Conference on the Theory of Information Retrieval (ICTIR 2017), pages 133-140, October 2017, Amsterdam, The Netherlands.
313.Matt Crane and Jimmy Lin. An Exploration of Serverless Architectures for Information Retrieval. Proceedings of the 3rd ACM International Conference on the Theory of Information Retrieval (ICTIR 2017), pages 241-244, October 2017, Amsterdam, The Netherlands.
312.Gaurav Baruah and Jimmy Lin. The Pareto Frontier of Utility Models as a Framework for Evaluating Push Notification Systems. Proceedings of the 3rd ACM International Conference on the Theory of Information Retrieval (ICTIR 2017), pages 253-256, October 2017, Amsterdam, The Netherlands. (🏆 Best Short Paper)
311.Salman Mohammed, Matt Crane, and Jimmy Lin. Quantization in Append-Only Collections. Proceedings of the 3rd ACM International Conference on the Theory of Information Retrieval (ICTIR 2017), pages 265-268, October 2017, Amsterdam, The Netherlands.
310.Jimmy Lin. The Lambda and the Kappa. IEEE Internet Computing, 21(5):60-66, 2017.
309.Siddhartha Sahu, Amine Mhedhbi, Semih Salihoglu, Jimmy Lin, and M. Tamer Özsu. The Ubiquity of Large Graphs and Surprising Challenges of Graph Processing: A User Survey. arXiv:1709.03188, September 2017. (Later appears in PVLDB, 2017)
308.Hua He, Kris Ganjam, Navendu Jain, Jessica Lundin, Ryen White, and Jimmy Lin. An Insight Extraction System on BioMedical Literature with Deep Neural Networks. Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017), pages 2691-2701, September 2017, Copenhagen, Denmark.
307.Kareem El Gebaly, Lukasz Golab, and Jimmy Lin. Portable In-Browser Data Cube Exploration. Proceedings of the KDD 2017 Workshop on Interactive Data Exploration and Analytics (IDEA), pages 35-39, August 2017, Halifax, Nova Scotia, Canada.
306.Zhucheng Tu, Matt Crane, Royal Sequiera, Junchen Zhang, and Jimmy Lin. An Exploration of Approaches to Integrating Neural Reranking Models in Multi-Stage Ranking Architectures. Proceedings of the SIGIR 2017 Workshop on Neural Information Retrieval (Neu-IR'17), August 2017, Tokyo, Japan. (Published as arXiv:1707.08275)
305.Royal Sequiera, Gaurav Baruah, Zhucheng Tu, Salman Mohammed, Jinfeng Rao, Haotian Zhang, and Jimmy Lin. Exploring the Effectiveness of Convolutional Neural Networks for Answer Selection in End-to-End Question Answering. Proceedings of the SIGIR 2017 Workshop on Neural Information Retrieval (Neu-IR'17), August 2017, Tokyo, Japan. (Published as arXiv:1707.07804)
304.Jinfeng Rao, Hua He, Haotian Zhang, Ferhan Ture, Royal Sequiera, Salman Mohammed, and Jimmy Lin. Integrating Lexical and Temporal Signals in Neural Ranking Models for Social Media Search. Proceedings of the SIGIR 2017 Workshop on Neural Information Retrieval (Neu-IR'17), August 2017, Tokyo, Japan. (Published as arXiv:1707.07792)
303.Adam Roegiest, Luchen Tan, and Jimmy Lin. Online In-Situ Interleaved Evaluation of Real-Time Push Notification Systems. Proceedings of the 40th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2017), pages 415-424, August 2017, Tokyo, Japan.
302.Luchen Tan, Gaurav Baruah, and Jimmy Lin. On the Reusability of "Living Labs" Test Collections: A Case Study of Real-Time Summarization. Proceedings of the 40th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2017), pages 793-796, August 2017, Tokyo, Japan.
301.Haotian Zhang, Jinfeng Rao, Jimmy Lin, and Mark D. Smucker. Automatically Extracting High-Quality Negative Examples for Answer Selection in Question Answering. Proceedings of the 40th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2017), pages 797-800, August 2017, Tokyo, Japan.
300.Jinfeng Rao, Hua He, and Jimmy Lin. Experiments with Convolutional Neural Network Models for Answer Selection. Proceedings of the 40th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2017), pages 1217-1220, August 2017, Tokyo, Japan.
299.Royal Sequiera and Jimmy Lin. Finally, a Downloadable Test Collection of Tweets. Proceedings of the 40th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2017), pages 1225-1228, August 2017, Tokyo, Japan.
298.Peilin Yang, Hui Fang, and Jimmy Lin. Anserini: Enabling the Use of Lucene for Information Retrieval Research. Proceedings of the 40th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2017), pages 1253-1256, August 2017, Tokyo, Japan.
297.Nimesh Ghelani, Salman Mohammed, Shine Wang, and Jimmy Lin. Event Detection on Curated Tweet Streams. Proceedings of the 40th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2017), pages 1325-1328, August 2017, Tokyo, Japan.
296.Leif Azzopardi, Matt Crane, Hui Fang, Grant Ingersoll, Jimmy Lin, Yashar Moshfeghi, Harrisen Scells, Peilin Yang, and Guido Zuccon. The Lucene for Information Access and Retrieval Research (LIARR) Workshop at SIGIR 2017. Proceedings of the 40th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2017), pages 1429-1430, August 2017, Tokyo, Japan.
295.Amy Wickner, Katie Shilton, Doug Oard, and Jimmy Lin. Protecting Secrets in Email: Archival Views on Challenges and Opportunities. PC4DS: The First International Workshop on Privacy-sensitive Collections for Digital Scholarship, August 2017, Montreal, Quebec, Canada.
294.Jimmy Lin, Ian Milligan, Jeremy Wiebe, and Alice Zhou. Warcbase: Scalable Analytics Infrastructure for Exploring Web Archives. ACM Journal on Computing and Cultural Heritage, 10(4), Article 22, 2017.
293.Ziquan Wang, Borui Lin, Ian Milligan, and Jimmy Lin. Topic Shifts Between Two US Presidential Administrations. JCDL 2017 Workshop on Web Archiving and Digital Libraries, June 2017, Toronto, Ontario, Canada.
292.Jimmy Lin and Andrew Trotman. The Role of Index Compression in Score-at-a-Time Query Evaluation. Information Retrieval, 20(3):199-220, 2017.
291.Jimmy Lin. In Defense of MapReduce. IEEE Internet Computing, 21(3):94-98, 2017.
290.Jinfeng Rao, Ferhan Ture, Hua He, Oliver Jojic, and Jimmy Lin. Talking to Your TV: Context-Aware Voice Search with Hierarchical Recurrent Neural Networks. arXiv:1705.04892, April 2017. (Later appears at CIKM 2017)
289.Kareem El Gebaly and Jimmy Lin. In-Browser Interactive SQL Analytics with Afterburner. Proceedings of the 2017 ACM SIGMOD International Conference on Management of Data (SIGMOD 2017), pages 1623-1626, May 2017, Chicago, Illinois.
288.Anil Pacaci, Alice Zhou, Jimmy Lin, and M. Tamer Özsu. Do We Need Specialized Graph Databases? Benchmarking Real-Time Social Networking Applications. Proceedings of the Fifth International Workshop on Graph Data-management Experiences & Systems (GRADES'17), Article 12, May 2017, Chicago, Illinois.
287.Salman Mohammed, Nimesh Ghelani, and Jimmy Lin. Distant Supervision for Topic Classification of Tweets in Curated Streams. arXiv:1704.06726, April 2017.
286.Joel Mackenzie, J. Shane Culpepper, Roi Blanco, Matt Crane, Charles L. A. Clarke, and Jimmy Lin. Efficient and Effective Tail Latency Minimization in Multi-Stage Retrieval Systems. arXiv:1704.03970, April 2017. (Later appears at WSDM 2018)
285.Charles L. A. Clarke, Gordon V. Cormack, Jimmy Lin, and Adam Roegiest. Ten Blue Links on Mars. Proceedings of the 26th International World Wide Web Conference (WWW 2017), pages 273-281, April 2017, Perth, Australia.
284.Yulu Wang and Jimmy Lin. Partitioning and Segment Organization Strategies for Real-Time Selective Search on Document Streams. Proceedings of the Tenth ACM International Conference on Web Search and Data Mining (WSDM 2017), pages 221-230, February 2017, Cambridge, United Kingdom.
283.Matt Crane, J. Shane Culpepper, Jimmy Lin, Joel Mackenzie, and Andrew Trotman. A Comparison of Document-at-a-Time and Score-at-a-Time Query Evaluation. Proceedings of the Tenth ACM International Conference on Web Search and Data Mining (WSDM 2017), pages 201-210, February 2017, Cambridge, United Kingdom.

2016

282.Andrew Trotman and Jimmy Lin. In Vacuo and In Situ Evaluation of SIMD Codecs. Proceedings of the 21st Australasian Document Computing Symposium (ADCS 2016), pages 1-8, November 2016, Melbourne, Australia.
281.J. Shane Culpepper, Charles L. A. Clarke, and Jimmy Lin. Dynamic Cutoff Prediction in Multi-Stage Retrieval Systems. Proceedings of the 21st Australasian Document Computing Symposium (ADCS 2016), pages 17-24, November 2016, Melbourne, Australia. (🏆 Best Paper)
280.Jimmy Lin, Adam Roegiest, Luchen Tan, Richard McCreadie, Ellen Voorhees, and Fernando Diaz. Overview of the TREC 2016 Real-Time Summarization Track. Proceedings of the Twenty-Fifth Text REtrieval Conference (TREC 2016), November 2016, Gaithersburg, Maryland.
279.Jinfeng Rao, Hua He, and Jimmy Lin. Noise-Contrastive Estimation for Answer Selection with Deep Neural Networks. Proceedings of the 25th International Conference on Information and Knowledge Management (CIKM 2016), pages 1913-1916, October 2016, Indianapolis, Indiana.
278.Gaurav Baruah, Haotian Zhang, Rakesh Guttikonda, Jimmy Lin, Mark D. Smucker, and Olga Vechtomova. Optimizing Nugget Annotations with Active Learning. Proceedings of the 25th International Conference on Information and Knowledge Management (CIKM 2016), pages 2359-2364, October 2016, Indianapolis, Indiana.
277.Charles L. A. Clarke, Gordon V. Cormack, Jimmy Lin, and Adam Roegiest. Ten Blue Links on Mars. arXiv:1610.06468, October 2016. (Later appears at WWW 2017)
276.Jimmy Lin, Zhucheng Tu, Michael Rose, and Patrick White. Prizm: A Wireless Access Point for Proxy-Based Web Lifelogging. Proceedings of the First Workshop on Lifelogging Tools and Applications (LTA 2016), pages 19-25, October 2016, Amsterdam, The Netherlands.
275.J. Shane Culpepper, Charles L. A. Clarke, and Jimmy Lin. Dynamic Trade-Off Prediction in Multi-Stage Retrieval Systems. arXiv:1610.02502, October 2016. (Later appears at ADCS 2016)
274.Charles L. A. Clarke, Gordon V. Cormack, Jimmy Lin, and Adam Roegiest. Total Recall: Blue Sky on Mars. Proceedings of the 2nd ACM International Conference on the Theory of Information Retrieval (ICTIR 2016), pages 45-48, September 2016, Newark, Delaware.
273.Jiaul H. Paik and Jimmy Lin. Retrievability in API-Based "Evaluation as a Service". Proceedings of the 2nd ACM International Conference on the Theory of Information Retrieval (ICTIR 2016), pages 91-94, September 2016, Newark, Delaware. (🏆 Best Short Paper)
272.Ahmed Elbagoury, Matt Crane, and Jimmy Lin. Rank-at-a-Time Query Processing. Proceedings of the 2nd ACM International Conference on the Theory of Information Retrieval (ICTIR 2016), pages 229-232, September 2016, Newark, Delaware.
271.Jinfeng Rao and Jimmy Lin. Temporal Query Expansion Using a Continuous Hidden Markov Model. Proceedings of the 2nd ACM International Conference on the Theory of Information Retrieval (ICTIR 2016), pages 295-298, September 2016, Newark, Delaware.
270.Jimmy Lin and Kareem El Gebaly. The Future of Big Data Is... JavaScript? IEEE Internet Computing, 20(5):82-88, 2016.
269.Aneesh Sharma, Jerry Jiang, Praveen Bommannavar, Brian Larson, and Jimmy Lin. GraphJet: Real-Time Content Recommendations at Twitter. Proceedings of the VLDB Endowment, 9(13):1281-1292, 2016.
268.Ahmed El-Roby, Khaled Ammar, Ashraf Aboulnaga, and Jimmy Lin. Sapphire: Querying RDF Data Made Simple. Proceedings of the VLDB Endowment, 9(13):1481-1484, 2016.
267.Xin Qian, Jimmy Lin, and Adam Roegiest. Interleaved Evaluation for Retrospective Summarization and Prospective Notification on Document Streams. Proceedings of the 39th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2016), pages 175-184, July 2016, Pisa, Italy.
266.Luchen Tan, Adam Roegiest, Jimmy Lin, and Charles L. A. Clarke. An Exploration of Evaluation Metrics for Mobile Push Notifications. Proceedings of the 39th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2016), pages 741-744, July 2016, Pisa, Italy.
265.Cody Buntain and Jimmy Lin. Burst Detection in Social Media Streams for Tracking Interest Profiles in Real Time. Proceedings of the 39th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2016), pages 777-780, July 2016, Pisa, Italy.
264.Haotian Zhang, Jimmy Lin, Gordon V. Cormack, and Mark D. Smucker. Sampling Strategies and Active Learning for Volume Estimation. Proceedings of the 39th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2016), pages 981-984, July 2016, Pisa, Italy.
263.Luchen Tan, Adam Roegiest, Charles L. A. Clarke, and Jimmy Lin. Simple Dynamic Emission Strategies for Microblog Filtering. Proceedings of the 39th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2016), pages 1009-1012, July 2016, Pisa, Italy.
262.Adam Roegiest, Luchen Tan, Jimmy Lin, and Charles L. A. Clarke. A Platform for Streaming Push Notifications to Mobile Assessors. Proceedings of the 39th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2016), pages 1077-1080, July 2016, Pisa, Italy.
261.Ian Milligan, Jimmy Lin, Jeremy Wiebe, and Alice Zhou. Exploring and Discovering Archive-It Collections with Warcbase. Digital Humanities 2016, July 2016, Krakow, Poland.
260.Andrew Jackson, Jimmy Lin, Ian Milligan, and Nick Ruest. Desiderata for Exploratory Search Interfaces to Web Archives in Support of Scholarly Activities. Proceedings of the 16th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL 2016), pages 103-106, June 2016, Newark, New Jersey.
259.Ian Milligan, Nick Ruest, and Jimmy Lin. Content Selection and Curation for Web Archiving: The Gatekeepers vs. the Masses. Proceedings of the 16th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL 2016), pages 107-110, June 2016, Newark, New Jersey.
258.Hua He, John Wieting, Kevin Gimpel, Jinfeng Rao, and Jimmy Lin. UMD-TTIC-UW at SemEval-2016 Task 1: Attention-Based Multi-Perspective Convolutional Neural Networks for Textual Similarity Measurement. Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pages 1103-1108, June 2016, San Diego, California.
257.Hua He and Jimmy Lin. Pairwise Word Interaction Modeling with Neural Networks for Semantic Similarity Measurement. Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2016), pages 937-948, June 2016, San Diego, California.
256.Luchen Tan, Jimmy Lin, Adam Roegiest, and Charles L. A. Clarke. The Effects of Latency Penalties in Evaluating Push Notification Systems. arXiv:1606.03066, May 2016.
255.Douglas Oard, Katie Shilton, and Jimmy Lin. Evaluating Search Among Secrets. Proceedings of the 7th International Workshop on Evaluating Information Access (EVIA 2016), pages 21-24, June 2016, Tokyo, Japan.
254.Kareem El Gebaly and Jimmy Lin. Afterburner: The Case for In-Browser Analytics. arXiv:1605.04035, May 2016. (Later appears at SIGMOD 2017)
253.Praveen Bommannavar, Jimmy Lin, and Anand Rajaraman. Estimating Topical Volume in Social Media Streams. Proceedings of the 31st ACM Symposium on Applied Computing (SAC 2016), pages 1096-1101, April 2016, Pisa, Italy.
252.Abdul Quamar, Amol Deshpande, and Jimmy Lin. NScale: Neighborhood-Centric Large-Scale Graph Analytics in the Cloud. The VLDB Journal, 25(2):125-150, 2016.
251.Jimmy Lin, Matt Crane, Andrew Trotman, Jamie Callan, Ishan Chattopadhyaya, John Foley, Grant Ingersoll, Craig Macdonald, and Sebastiano Vigna. Toward Reproducible Baselines: The Open-Source IR Reproducibility Challenge. Proceedings of the 38th European Conference on Information Retrieval (ECIR 2016), pages 408-420, March 2016, Padua, Italy.
250.Jinfeng Rao, Xing Niu, and Jimmy Lin. Compressing and Decoding Term Statistics Time Series. Proceedings of the 38th European Conference on Information Retrieval (ECIR 2016), pages 675-681, March 2016, Padua, Italy.
249.Jimmy Lin, Charles L. A. Clarke, and Gaurav Baruah. Searching from Mars. IEEE Internet Computing, 20(1):78-82, 2016.
248.Cody Buntain, Jimmy Lin, and Jennifer Golbeck. Discovering Key Moments in Social Media Streams. Proceedings of the 13th Annual IEEE Consumer Communications & Networking Conference (CCNC 2016), pages 373-381, January 2016, Las Vegas, Nevada.

2015

247.Allan Hanbury, Henning Müller, Krisztian Balog, Torben Brodt, Gordon V. Cormack, Ivan Eggel, Tim Gollub, Frank Hopfgartner, Jayashree Kalpathy-Cramer, Noriko Kando, Anastasia Krithara, Jimmy Lin, Simon Mercer, and Martin Potthast. Evaluation-as-a-Service: Overview and Outlook. arXiv:1512.07454, December 2015. (Later appears in the Journal of Data and Information Quality, 2018)
246.Jaime Arguello, Matt Crane, Fernando Diaz, Jimmy Lin, and Andrew Trotman. Report on the SIGIR 2015 Workshop on Reproducibility, Inexplicability, and Generalizability of Results (RIGOR). SIGIR Forum, 49(2):107-116, 2015.
245.Cody Buntain and Jimmy Lin. Burst Detection in Social Media Streams for Tracking Interest Profiles in Real Time. Proceedings of the Twenty-Fourth Text REtrieval Conference (TREC 2015), November 2015, Gaithersburg, Maryland.
244.Jimmy Lin, Miles Efron, Yulu Wang, Garrick Sherman, and Ellen Voorhees. Overview of the TREC-2015 Microblog Track. Proceedings of the Twenty-Fourth Text REtrieval Conference (TREC 2015), November 2015, Gaithersburg, Maryland.
243.Jimmy Lin and Andrew Trotman. Anytime Ranking for Impact-Ordered Indexes. Proceedings of the ACM International Conference on the Theory of Information Retrieval (ICTIR 2015), pages 301-304, September 2015, Northampton, Massachusetts.
242.Jimmy Lin. Building a Self-Contained Search Engine in the Browser. Proceedings of the ACM International Conference on the Theory of Information Retrieval (ICTIR 2015), pages 309-312, September 2015, Northampton, Massachusetts.
241.Yulu Wang and Jimmy Lin. The Feasibility of Brute Force Scans for Real-Time Tweet Search. Proceedings of the ACM International Conference on the Theory of Information Retrieval (ICTIR 2015), pages 321-324, September 2015, Northampton, Massachusetts.
240.Hua He, Kevin Gimpel, and Jimmy Lin. Multi-Perspective Sentence Similarity Modeling with Convolutional Neural Networks. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP 2015), pages 1576-1586, September 2015, Lisbon, Portugal.
239.Jimmy Lin. Is Big Data a Transient Problem? IEEE Internet Computing, 19(5):86-90, 2015.
238.Dean F. Sittig, Allison B. McCoy, Adam Wright, and Jimmy Lin. Developing an Open-Source Bibliometric Ranking Website Using Google Scholar Citation Profiles for Researchers in the Field of Biomedical Informatics. Proceedings of the 14th World Congress on Medical and Health Informatics, page 1004, August 2015, São Paulo, Brazil.
237.Jiaul H. Paik and Jimmy Lin. Do Multiple Listeners to the Public Twitter Sample Stream Receive the Same Tweets? Proceedings of the SIGIR 2015 Workshop on Temporal, Social and Spatially-Aware Information Access, August 2015, Santiago, Chile.
236.Yulu Wang, Garrick Sherman, Jimmy Lin, and Miles Efron. Assessor Differences and User Preferences in Tweet Timeline Generation. Proceedings of the 38th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2015), pages 615-624, August 2015, Santiago, Chile.
235.Jaime Arguello, Fernando Diaz, Jimmy Lin, and Andrew Trotman. SIGIR 2015 Workshop on Reproducibility, Inexplicability, and Generalizability of Results (RIGOR). Proceedings of the 38th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2015), pages 1147-1148, August 2015, Santiago, Chile.
234.Cody Buntain, Jimmy Lin, and Jennifer Golbeck. Learning to Discover Key Moments in Social Media Streams. arXiv:1508.00488, August 2015. (Later appears at CCNC 2016)
233.Sarah Weissman, Samet Ayhan, Joshua Bradley, and Jimmy Lin. Identifying Duplicate and Contradictory Information in Wikipedia. Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL 2015), pages 57-60, June 2015, Knoxville, Tennessee.
232.Jimmy Lin. The Sum of All Human Knowledge in Your Pocket: Full-Text Searchable Wikipedia on a Raspberry Pi. Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL 2015), pages 85-86, June 2015, Knoxville, Tennessee.
231.Frank Hopfgartner, Allan Hanbury, Henning Müller, Noriko Kando, Simon Mercer, Jayashree Kalpathy-Cramer, Martin Potthast, Tim Gollub, Anastasia Krithara, Jimmy Lin, Krisztian Balog, and Ivan Eggel. Report on the Evaluation-as-a-Service (EaaS) Expert Workshop. SIGIR Forum, 49(1):57-65, 2015.
230.Jimmy Lin. Scaling Down Distributed Infrastructure on Wimpy Machines for Personal Web Archiving. Proceedings of the 24th International World Wide Web Conference Companion (WWW 2015), pages 1351-1355, May 2015, Florence, Italy. (Temporal Web Analytics Workshop 2015)
229.Jimmy Lin. On Building Better Mousetraps and Understanding the Human Condition: Reflections on Big Data in the Social Sciences. The Annals of the American Academy of Political and Social Science, 659(1):33-47, May 2015.
228.Jinfeng Rao, Jimmy Lin, and Miles Efron. Reproducible Experiments on Lexical and Temporal Feedback for Tweet Search. Proceedings of the 37th European Conference on Information Retrieval (ECIR 2015), pages 755-767, March 2015, Vienna, Austria.
227.Hua He, Jimmy Lin, and Adam Lopez. Gappy Pattern Matching on GPUs for On-Demand Extraction of Hierarchical Translation Grammars. Transactions of the Association for Computational Linguistics, 3:87-100, 2015.

2014

226.Jimmy Lin, Miles Efron, Yulu Wang, and Garrick Sherman. Overview of the TREC-2014 Microblog Track. Proceedings of the Twenty-Third Text REtrieval Conference (TREC 2014), November 2014, Gaithersburg, Maryland.
225.Krist Wongsuphasawat and Jimmy Lin. Using Visualizations to Monitor Changes and Harvest Insights from a Global-Scale Logging Infrastructure at Twitter. Proceedings of the 2014 IEEE Conference on Visual Analytics Science and Technology (VAST 2014), pages 113-122, November 2014, Paris, France.
224.Jinfeng Rao, Jimmy Lin, and Hanan Samet. Partitioning Strategies for Spatio-Textual Similarity Join. Proceedings of the 3rd ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data (BigSpatial '14), pages 40-49, November 2014, Dallas, Texas.
223.Jimmy Lin, Jian Pei, Xiaohua Hu, Wo Chang, Raghunath Nambiar, Charu Aggarwal, Nick Cercone, Vasant Honavar, Jun Huan, Bamshad Mobasher, and Saumyadipta Pyne (editors). Proceedings of the 2014 IEEE International Conference on Big Data. New Jersey: IEEE Press, 2014.
222.Jimmy Lin. On the Feasibility and Implications of Self-Contained Search Engines in the Browser. arXiv:1410.4500, October 2014. (Later appears at ICTIR 2015)
221.Ferhan Ture and Jimmy Lin. Exploiting Representations from Statistical Machine Translation for Cross-Language Information Retrieval. ACM Transactions on Information Systems, 32(4), Article 19, 2014.
220.Nima Asadi, Jimmy Lin, and Arjen P. de Vries. Runtime Optimizations for Tree-Based Machine Learning Models. IEEE Transactions on Knowledge and Data Engineering, 26(9):2281-2292, 2014.
219.Gebrekirstos G. Gebremeskel, Jiyin He, Arjen P. de Vries, and Jimmy Lin. Cumulative Citation Recommendation: A Feature-Aware Comparison of Approaches. Proceedings of the 11th International Workshop on Text-based Information Retrieval at DEXA 2014, September 2014, Munich, Germany.
218.Pankaj Gupta, Venu Satuluri, Ajeet Grewal, Siva Gurumurthy, Volodymyr Zhabiuk, Quannan Li, and Jimmy Lin. Real-Time Twitter Recommendation: Online Motif Detection in Large Dynamic Graphs. Proceedings of the VLDB Endowment, 7(13):1379-1380, 2014.
217.Oscar Boykin, Sam Ritchie, Ian O'Connell, and Jimmy Lin. Summingbird: A Framework for Integrating Batch and Online MapReduce Computations. Proceedings of the VLDB Endowment, 7(13):1441-1451, 2014. [Slides (PDF)]
216.Abdul Quamar, Amol Deshpande, and Jimmy Lin. NScale: Neighborhood-centric Analytics on Large Graphs. Proceedings of the VLDB Endowment, 7(13):1673-1676, 2014.
215.Miles Efron, Jimmy Lin, Jiyin He, and Arjen de Vries. Temporal Feedback for Tweet Search with Non-Parametric Density Estimation. Proceedings of the 37th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2014), pages 33-42, July 2014, Gold Coast, Australia.
214.Hannes Mühleisen, Thaer Samar, Jimmy Lin, and Arjen de Vries. Old Dogs Are Great at New Tricks: Column Stores for IR Prototyping. Proceedings of the 37th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2014), pages 863-866, July 2014, Gold Coast, Australia.
213.Ellen M. Voorhees, Jimmy Lin, and Miles Efron. On Run Diversity in "Evaluation as a Service". Proceedings of the 37th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2014), pages 959-962, July 2014, Gold Coast, Australia.
212.Jimmy Lin, Kari Kraus, and Ricardo Punzalan. Supporting "Distant Reading" for Web Archives. Digital Humanities 2014, pages 239-241, July 2014, Lausanne, Switzerland.
211.Sarah Weissman, Samet Ayhan, Joshua Bradley, and Jimmy Lin. Identifying Duplicate and Contradictory Information in Wikipedia. arXiv:1406.1143, June 2014. (Later appears at JCDL 2015)
210.Abdul Quamar, Amol Deshpande, and Jimmy Lin. NScale: Neighborhood-centric Large-Scale Graph Analytics in the Cloud. arXiv:1405.1499, May 2014. (Later appears in the VLDB Journal, 2016)
209.Yulu Wang and Jimmy Lin. The Impact of Future Term Statistics in Real-Time Tweet Search. Proceedings of the 36th European Conference on Information Retrieval (ECIR 2014), pages 567-572, April 2014, Amsterdam, The Netherlands.
208.Hannes Mühleisen, Thaer Samar, Jimmy Lin, and Arjen P. de Vries. Column Stores as an IR Prototyping Tool. Proceedings of the 36th European Conference on Information Retrieval (ECIR 2014), pages 789-792, April 2014, Amsterdam, The Netherlands.
207.Seth A. Myers, Aneesh Sharma, Pankaj Gupta, and Jimmy Lin. Information Network or Social Network? The Structure of the Twitter Follow Graph. Proceedings of the 23rd International World Wide Web Conference Companion (WWW 2014), pages 493-498, April 2014, Seoul, South Korea. (Web Science Track)
206.Jimmy Lin and Miles Efron. Infrastructure Support for Evaluation as a Service. Proceedings of the 23rd International World Wide Web Conference Companion (WWW 2014), pages 79-82, April 2014, Seoul, South Korea.
205.Lidan Wang, Jimmy Lin, Donald Metzler, and Jiawei Han. Learning to Efficiently Rank on Big Data. Proceedings of the 23rd International World Wide Web Conference Companion (WWW 2014), pages 209-210, April 2014, Seoul, South Korea. (Tutorial)
204.Jimmy Lin, Milad Gholami, and Jinfeng Rao. Infrastructure for Supporting Exploration and Discovery in Web Archives. Proceedings of the 23rd International World Wide Web Conference Companion (WWW 2014), pages 851-855, April 2014, Seoul, South Korea. (Temporal Web Analytics Workshop 2014)
203.K. Ashwin Kumar, Jonathan Gluck, Amol Deshpande, and Jimmy Lin. Optimization Techniques for "Scaling Down" Hadoop on Multi-Core, Shared-Memory Systems. Proceedings of the 17th International Conference on Extending Database Technology (EDBT 2014), pages 13-24, March 2014, Athens, Greece.
202.Zhengzheng Xu, Dan Goldwasser, Benjamin B. Bederson, and Jimmy Lin. Visual Analytics of MOOCs at Maryland. Proceedings of the First ACM Conference on Learning at Scale, pages 195-196, March 2014, Atlanta, Georgia.
201.Nima Asadi and Jimmy Lin. An Exploration of Postings List Contiguity in Main-Memory Incremental Indexing. Proceedings of the WSDM 2014 Workshop on Large-Scale and Distributed Systems for Information Retrieval, February 2014, New York, New York.
200.Alan Said, Alejandro Bellogín, Jimmy Lin, and Arjen P. de Vries. Do Recommendations Matter? News Recommendation in Real Life. Proceedings of the Companion Publication of the 17th ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW 2014), pages 237-240, February 2014, Baltimore, Maryland.

2013

199.Nima Asadi and Jimmy Lin. Document Vector Representations for Feature Extraction in Multi-Stage Document Ranking. Information Retrieval, 16(6):747-768, 2013.
198.Jimmy Lin and Miles Efron. Evaluation as a Service for Information Retrieval. SIGIR Forum, 47(2):8-14, 2013.
197.Alejandro Bellogín, Gebrekirstos G. Gebremeskel, Jiyin He, Jimmy Lin, Alan Said, Thaer Samar, Arjen P. de Vries, and Jeroen B. P. Vuurens. CWI and TU Delft at TREC 2013: Contextual Suggestion, Federated Web Search, KBA, and Web Tracks. Proceedings of the Twenty-Second Text REtrieval Conference (TREC 2013), November 2013, Gaithersburg, Maryland.
196.Jimmy Lin and Miles Efron. Overview of the TREC-2013 Microblog Track. Proceedings of the Twenty-Second Text REtrieval Conference (TREC 2013), November 2013, Gaithersburg, Maryland.
195.Alan Said, Jimmy Lin, Alejandro Bellogín, and Arjen P. de Vries. A Month in the Life of a Production News Recommender System. Proceedings of the CIKM 2013 Workshop on Living Labs for Information Retrieval Evaluation (LivingLab '13), pages 7-10, November 2013, San Francisco, California.
194.K. Ashwin Kumar, Jonathan Gluck, Amol Deshpande, and Jimmy Lin. Hone: "Scaling Down" Hadoop on Shared-Memory Systems. Proceedings of the VLDB Endowment, 6(12):1354-1357, 2013.
193.Nima Asadi, Jimmy Lin, and Michael Busch. Dynamic Memory Allocation Policies for Postings in Real-Time Twitter Search. Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD 2013), pages 1186-1194, August 2013, Chicago, Illinois.
192.Vladimir Eidelman, Ke Wu, Ferhan Ture, Philip Resnik, and Jimmy Lin. Mr. MIRA: Open-Source Large-Margin Structured Learning on MapReduce. Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pages 199-204, August 2013, Sofia, Bulgaria.
191.Vladimir Eidelman, Ke Wu, Ferhan Ture, Philip Resnik, and Jimmy Lin. Towards Efficient Large-Scale Feature-Rich Statistical Machine Translation. Proceedings of the Eighth Workshop on Statistical Machine Translation, pages 128-133, August 2013, Sofia, Bulgaria.
190.Jimmy Lin and Miles Efron. Temporal Relevance Profiles for Tweet Search. Proceedings of the SIGIR 2013 Workshop on Time-Aware Information Access, August 2013, Dublin, Ireland.
189.Ferhan Ture and Jimmy Lin. Flat vs. Hierarchical Phrase-Based Translation Models for Cross-Language Information Retrieval. Proceedings of the 36th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2013), pages 813-816, July 2013, Dublin, Ireland.
188.Nima Asadi and Jimmy Lin. Effectiveness/Efficiency Tradeoffs for Candidate Generation in Multi-Stage Retrieval Architectures. Proceedings of the 36th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2013), pages 997-1000, July 2013, Dublin, Ireland.
187.Nima Asadi and Jimmy Lin. Fast Candidate Generation for Real-Time Tweet Search with Bloom Filter Chains. ACM Transactions on Information Systems, 31(3), Article 13, 2013.
186.Miguel Rios and Jimmy Lin. Visualizing the "Pulse" of World Cities on Twitter. Proceedings of the 7th International AAAI Conference on Weblogs and Social Media (ICWSM 2013), pages 717-720, July 2013, Boston, Massachusetts.
185.Gilad Mishne, Jeff Dalton, Zhenghua Li, Aneesh Sharma, and Jimmy Lin. Fast Data in the Era of Big Data: Twitter's Real-Time Related Query Suggestion Architecture. Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data (SIGMOD 2013), pages 1147-1157, June 2013, New York, New York.
184.Hua He, Jimmy Lin, and Adam Lopez. Massively Parallel Suffix Array Queries and On-Demand Phrase Extraction for Statistical Machine Translation Using GPUs. Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2013), pages 325-334, June 2013, Atlanta, Georgia.
183.Jimmy Lin. MapReduce Algorithm Design. Tutorial at 22nd International World Wide Web Conference (WWW 2013), May 2013, Rio de Janeiro, Brazil.
182.Pankaj Gupta, Ashish Goel, Jimmy Lin, Aneesh Sharma, Dong Wang, and Reza Zadeh. WTF: The Who to Follow Service at Twitter. Proceedings of the 22nd International World Wide Web Conference (WWW 2013), pages 505-514, May 2013, Rio de Janeiro, Brazil.
181.Nima Asadi and Jimmy Lin. Fast, Incremental Inverted Indexing in Main Memory for Web-Scale Collections. arXiv:1305.0699, May 2013. (Later appears at the WSDM 2014 LSDS-IR workshop)
180.Jimmy Lin. Monoidify! Monoids as a Design Principle for Efficient MapReduce Algorithms. arXiv:1304.7544, April 2013.
179.Nima Asadi and Jimmy Lin. Training Efficient Tree-Based Models for Document Ranking. Proceedings of the 35th European Conference on Information Retrieval (ECIR 2013), pages 146-157, March 2013, Moscow, Russia.
178.Jimmy Lin. MapReduce is Good Enough? If All You Have is a Hammer, Throw Away Everything That's Not a Nail! Big Data, 1(1):28-37, 2013.
177.Nima Asadi, Jimmy Lin, and Michael Busch. Dynamic Memory Allocation Policies for Postings in Real-Time Twitter Search. arXiv:1302.5302, February 2013. (Later appears at KDD 2013)

2012

176.Jimmy Lin and Dmitriy Ryaboy. Scaling Big Data Mining Infrastructure: The Twitter Experience. SIGKDD Explorations, 14(2):6-19, 2012.
175.Nima Asadi, Jimmy Lin, and Arjen P. de Vries. Runtime Optimizations for Prediction with Tree-Based Models. arXiv:1212.2287, December 2012. (Later appears in the IEEE Transactions on Knowledge and Data Engineering, 2014)
174.Ferhan Ture, Jimmy Lin, and Douglas W. Oard. Combining Statistical Translation Techniques for Cross-Language Information Retrieval. Proceedings of the 24th International Conference on Computational Linguistics (COLING 2012), pages 2685-2702, December 2012, Mumbai, India.
173.Ian Soboroff, Iadh Ounis, Craig Macdonald, and Jimmy Lin. Overview of the TREC-2012 Microblog Track. Proceedings of the Twenty-First Text REtrieval Conference (TREC 2012), November 2012, Gaithersburg, Maryland.
172.Gilad Mishne, Jeff Dalton, Zhenghua Li, Aneesh Sharma, and Jimmy Lin. Fast Data in the Era of Big Data: Twitter's Real-Time Related Query Suggestion Architecture. arXiv:1210.7350, October 2012. (Later appears at SIGMOD 2013)
171.Nima Asadi and Jimmy Lin. Fast Candidate Generation for Two-Phase Document Ranking: Postings List Intersection with Bloom Filters. Proceedings of the 21st International Conference on Information and Knowledge Management (CIKM 2012), pages 2419-2422, October 2012, Maui, Hawaii.
170.Jimmy Lin. MapReduce is Good Enough? If All You Have is a Hammer, Throw Away Everything That's Not a Nail! arXiv:1209.2191, September 2012. (Later appears in Big Data, 2013)
169.George Lee, Jimmy Lin, Chuang Liu, Andrew Lorek, and Dmitriy Ryaboy. The Unified Logging Infrastructure for Data Analytics at Twitter. Proceedings of the VLDB Endowment, 5(12):1771-1780, 2012.
168.Ferhan Ture, Jimmy Lin, and Douglas W. Oard. Looking Inside the Box: Context-Sensitive Translation for Cross-Language Information Retrieval. Proceedings of the 35th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2012), pages 1105-1106, August 2012, Portland, Oregon.
167.Richard McCreadie, Ian Soboroff, Jimmy Lin, Craig Macdonald, Iadh Ounis, and Dean McCullough. On Building a Reusable Twitter Corpus. Proceedings of the 35th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2012), pages 1113-1114, August 2012, Portland, Oregon.
166.Gilad Mishne and Jimmy Lin. Twanchor Text: A Preliminary Study of the Value of Tweets as Anchor Text. Proceedings of the 35th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2012), pages 1159-1160, August 2012, Portland, Oregon.
165.Ferhan Ture and Jimmy Lin. Why Not Grab a Free Lunch? Mining Large Corpora for Parallel Sentences to Improve Translation Modeling. Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 626-630, June 2012, Montreal, Quebec, Canada.
164.Miguel Rios and Jimmy Lin. Distilling Massive Amounts of Data into Simple Visualizations: Twitter Case Studies. Proceedings of the Workshop on Social Media Visualization (SocMedVis) at ICWSM 2012, pages 22-25, June 2012, Dublin, Ireland.
163.Ian Soboroff, Dean McCullough, Jimmy Lin, Craig Macdonald, Iadh Ounis, and Richard McCreadie. Evaluating Real-Time Search over Tweets. Proceedings of the 6th International AAAI Conference on Weblogs and Social Media (ICWSM 2012), pages 579-582, June 2012, Dublin, Ireland.
162.Jimmy Lin and Gilad Mishne. A Study of "Churn" in Tweets and Real-Time Search Queries. Proceedings of the 6th International AAAI Conference on Weblogs and Social Media (ICWSM 2012), pages 503-506, June 2012, Dublin, Ireland.
161.Jimmy Lin and Gilad Mishne. A Study of "Churn" in Tweets and Real-Time Search Queries (Extended Version). arXiv:1205.6855, May 2012.
160.Jimmy Lin and Alek Kolcz. Large-Scale Machine Learning at Twitter. Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data (SIGMOD 2012), pages 793-804, May 2012, Scottsdale, Arizona.
159.Michael Busch, Krishna Gade, Brian Larson, Patrick Lok, Samuel Luckenbill, and Jimmy Lin. Earlybird: Real-Time Search at Twitter. Proceedings of the 28th International Conference on Data Engineering (ICDE 2012), pages 1360-1369, April 2012, Washington, D.C.

2011

158.Iadh Ounis, Craig Macdonald, Jimmy Lin, and Ian Soboroff. Overview of the TREC-2011 Microblog Track. Proceedings of the Twentieth Text REtrieval Conference (TREC 2011), November 2011, Gaithersburg, Maryland.
157.Florian Leibert, Jake Mannix, Jimmy Lin, and Babak Hamadani. Automatic Management of Partitioned, Replicated Search Services. Proceedings of the 2nd ACM Symposium on Cloud Computing (SoCC 2011), October 2011, Cascais, Portugal.
156.Tamer Elsayed, Jimmy Lin, and Don Metzler. When Close Enough Is Good Enough: Approximate Positional Indexes for Efficient Ranked Retrieval. Proceedings of the 20th International Conference on Information and Knowledge Management (CIKM 2011), pages 1993-1996, October 2011, Glasgow, Scotland.
155.Jimmy Lin, Rion Snow, and William Morgan. Smoothing Techniques for Adaptive Online Language Models: Topic Tracking in Tweet Streams. Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD 2011), pages 422-429, August 2011, San Diego, California. [Slides (PDF)]
154.Jimmy Lin. Large-Scale Data Processing with MapReduce. Tutorial at the 25th AAAI Conference on Artificial Intelligence (AAAI-11), August 2011, San Francisco, California.
153.Lidan Wang, Jimmy Lin, and Donald Metzler. A Cascade Ranking Model for Efficient Ranked Retrieval. Proceedings of the 34th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2011), pages 105-114, July 2011, Beijing, China.
152.Ferhan Ture, Tamer Elsayed, and Jimmy Lin. No Free Lunch: Brute Force vs. Locality-Sensitive Hashing for Cross-lingual Pairwise Similarity. Proceedings of the 34th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2011), pages 943-952, July 2011, Beijing, China.
151.Nima Asadi, Donald Metzler, Tamer Elsayed, and Jimmy Lin. Pseudo Test Collections for Learning Web Search Ranking Functions. Proceedings of the 34th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2011), pages 1073-1082, July 2011, Beijing, China.
150.Nima Asadi, Donald Metzler, and Jimmy Lin. Cross-Corpus Relevance Projection. Proceedings of the 34th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2011), pages 1163-1164, July 2011, Beijing, China.
149.Jimmy Lin, Dmitriy Ryaboy, and Kevin Weil. Full-Text Indexing for Optimizing Selection Operations in Large-Scale Data Analytics. Proceedings of the Second International Workshop on MapReduce and Its Applications (MAPREDUCE 2011), pages 59-66, June 2011, San Jose, California.
148.Gregory Chockler, Eliezer Dekel, Joseph JaJa, and Jimmy Lin. Special Issue on Cloud Computing. Journal of Parallel and Distributed Computing, 71(6):731, 2011.
147.Earl J. Wagner and Jimmy Lin. In-Depth Accounts and Passing Mentions in the News: Connecting Readers to the Context of a News Event. Proceedings of the 2011 iConference, pages 790-791, February 2011, Seattle, Washington.

2010

146.Tamer Elsayed, Nima Asadi, Donald Metzler, Lidan Wang, and Jimmy Lin. UMD and USC/ISI: TREC 2010 Web Track Experiments with Ivory. Proceedings of the Nineteenth Text REtrieval Conference (TREC 2010), November 2010, Gaithersburg, Maryland.
145.Di-Wei Huang and Jimmy Lin. Scaling Populations of a Genetic Algorithm for Job Shop Scheduling Problems using MapReduce. Proceedings of the 2010 IEEE Second International Conference on Cloud Computing Technology and Science (CLOUDCOM 2010), pages 780-785, November 2010, Indianapolis, Indiana.
144.Tamer Elsayed, Ferhan Ture, and Jimmy Lin. Brute-Force Approaches to Batch Retrieval: Scalable Indexing with MapReduce, or Why Bother? Technical Report HCIL-2010-23, University of Maryland, College Park, October 2010.
143.Lidan Wang, Donald Metzler, and Jimmy Lin. Ranking Under Temporal Constraints. Proceedings of the 19th International Conference on Information and Knowledge Management (CIKM 2010), pages 79-88, October 2010, Toronto, Ontario, Canada.
142.Philip Resnik and Jimmy Lin. Evaluation of NLP Systems. In Alex Clark, Chris Fox, and Shalom Lappin, editors, Computational Linguistics and Natural Language Processing Handbook. Oxford, England: Blackwell Publishers, 2010.
141.Brandyn White, Tom Yeh, Jimmy Lin, and Larry Davis. Web-Scale Computer Vision using MapReduce for Multimedia Data Mining. Proceedings of the Tenth International Workshop on Multimedia Data Mining (MDMKDD 2010), pages 9:1-9:10, July 2010, Washington, D.C.
140.Jimmy Lin and Michael Schatz. Design Patterns for Efficient Graph Algorithms in MapReduce. Proceedings of the Eighth Workshop on Mining and Learning with Graphs (MLG-2010), pages 78-85, July 2010, Washington, D.C.
139.Lidan Wang, Jimmy Lin, and Donald Metzler. Learning to Efficiently Rank. Proceedings of the 33rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2010), pages 138-145, July 2010, Geneva, Switzerland.
138.Di-Wei Huang and Jimmy Lin. Scaling Populations of a Genetic Algorithm for Job Shop Scheduling Problems using MapReduce. Technical Report HCIL-2010-14, University of Maryland, College Park, June 2010. (Later appears at CLOUDCOM 2010)
137.Jimmy Lin, Nitin Madnani, and Bonnie Dorr. Putting the User in the Loop: Interactive Maximal Marginal Relevance for Query-Focused Summarization. Proceedings of the 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL/HLT 2010), pages 305-308, June 2010, Los Angeles, California.
136.Jimmy Lin and Chris Dyer. Data-Intensive Text Processing with MapReduce. Tutorial at the 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL/HLT 2010), June 2010, Los Angeles, California.
135.Alexander J. Quinn, Benjamin B. Bederson, Tom Yeh, and Jimmy Lin. CrowdFlow: Integrating Machine Learning with Mechanical Turk for Speed-Cost-Quality Flexibility. Technical Report HCIL-2010-09, University of Maryland, College Park, May 2010.
134.Jimmy Lin and Chris Dyer. Data-Intensive Text Processing with MapReduce. Morgan & Claypool Publishers, 2010.

2009

133.Ben Langmead, Michael C. Schatz, Jimmy Lin, Mihai Pop, and Steven L. Salzberg. Searching for SNPs with Cloud Computing. Genome Biology, 10:R134, 2009. Open Access
132.David Cheung, Il-Yeol Song, Wesley Chu, Xiaohua Hu, and Jimmy Lin (editors). Proceedings of the 18th ACM Conference on Information and Knowledge Management. New York: ACM Press, 2009.
131.Allison Druin, Paul T. Jaeger, Kenneth R. Fleischmann, Jennifer Golbeck, Jimmy Lin, Yan Qu, Ping Wang, and Bo Xie. The Maryland Modular Method: An Approach to Doctoral Education in Information Studies. Journal of Education for Library and Information Science, 50(4):293-301, 2009.
130.Jimmy Lin, Donald Metzler, Tamer Elsayed, and Lidan Wang. Of Ivory and Smurfs: Loxodontan MapReduce Experiments for Web Search. Proceedings of the Eighteenth Text REtrieval Conference (TREC 2009), November 2009, Gaithersburg, Maryland.
129.Jimmy Lin. Care and Feeding of Hadoop Clusters. Tutorial at the 23rd Large Installation System Administration Conference (LISA 2009), November 2009, Baltimore, Maryland.
128.Michael C. Schatz, Ben Langmead, Jimmy Lin, Mihai Pop, and Steven L. Salzberg. Whole Genome Resequencing Analysis in the Clouds. Poster at Supercomputing 2009, November 2009, Portland, Oregon.
127.Jimmy Lin. Summarization. In M. Tamer Özsu and Ling Liu, editors, Encyclopedia of Database Systems. Heidelberg, Germany: Springer-Verlag, 2009.
126.Ben Langmead, Michael C. Schatz, Jimmy Lin, Mihai Pop, and Steven L. Salzberg. Human SNPs from Short Reads in Hours using Cloud Computing. The 9th Workshop on Algorithms in Bioinformatics, September 2009, Baltimore, Maryland.
125.Timothy Hawes, Jimmy Lin, and Philip Resnik. Elements of a Computational Model for Multi-Party Discourse: The Turn-Taking Behavior of Supreme Court Justices. Journal of the American Society for Information Science and Technology, 60(8):1607-1615, 2009.
124.Jimmy Lin and W. John Wilbur. Modeling Actions of PubMed Users with N-Gram Language Models. Information Retrieval, 12(4):487-503, 2009. Open Access
123.Jimmy Lin. The Curse of Zipf and Limits to Parallelization: A Look at the Stragglers Problem in MapReduce. Proceedings of the 7th Workshop on Large-Scale Distributed Systems for Information Retrieval (LSDS-IR'09) at SIGIR 2009, pages 57-60, July 2009, Boston, Massachusetts.
122.Jimmy Lin. Brute Force and Indexed Approaches to Pairwise Document Similarity Comparisons with MapReduce. Proceedings of the 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2009), pages 155-162, July 2009, Boston, Massachusetts. (Full-size graphs: Figure 6, top/middle/bottom; Figure 7; Figure 8; Figure 9)
121.Jimmy Lin. Data-Intensive Text Processing with MapReduce. Tutorial at the 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2009), July 2009, Boston, Massachusetts.
120.Judith L. Klavans, Eileen Abels, Jimmy Lin, Rebecca Passonneau, Carolyn Sheffield, and Dagobert Soergel. Mining Texts for Image Terms: The CLiMB Project. Digital Humanities 2009, pages 184-186, June 2009, College Park, Maryland.
119.G. Craig Murray, Jimmy Lin, John Wilbur, and Zhiyong Lu. Users' Adjustments to Unsuccessful Queries in Biomedical Search. Proceedings of the 9th ACM/IEEE-CS Joint International Conference on Digital Libraries (JCDL 2009), pages 433-434, June 2009, Austin, Texas.
118.Jimmy Lin and Chris Dyer. Data-Intensive Text Processing with MapReduce. Tutorial at the 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL/HLT 2009), May 2009, Boulder, Colorado.
117.Michael Lieberman and Jimmy Lin. You Are Where You Edit: Locating Wikipedia Users Through Edit Histories. Proceedings of the Third International Conference on Weblogs and Social Media (ICWSM 2009), pages 106-113, May 2009, San Jose, California.
116.Paul T. Jaeger, Jimmy Lin, Justin Grimes, and Shannon Simmons. Where Is the Cloud? Geography, Economics, Environment, and Jurisdiction in Cloud Computing. First Monday, 14(5), 2009.
115.Judith L. Klavans, Carolyn Sheffield, Eileen Abels, Jimmy Lin, Rebecca Passonneau, Tandeep Sidhu, and Dagobert Soergel. Computational Linguistics for Metadata Building (CLiMB): Using Text Mining for the Automatic Identification, Categorization, and Disambiguation of Subject Terms for Image Metadata. Journal of Multimedia Tools and Applications, 42(1):115-138, 2009.
114.Jimmy Lin, G. Craig Murray, Bonnie J. Dorr, Jan Hajic, and Pavel Pecina. A Cost-effective Lexical Acquisition Process for Large-Scale Thesaurus Translation. Language Resources and Evaluation, 43(1):27-40, 2009.
113.Justin Grimes, Paul T. Jaeger, and Jimmy Lin. Weathering the Storm: The Policy Implications of Cloud Computing. Proceedings of the 2009 iConference, February 2009, Chapel Hill, North Carolina.
112.Jimmy Lin. Is Searching Full Text More Effective Than Searching Abstracts? BMC Bioinformatics, 10:46 (3 February 2009). Open Access
111.Jimmy Lin, Anand Bahety, Shravya Konda, and Samantha Mahindrakar. Low-Latency, High-Throughput Access to Static Global Resources within the Hadoop Framework. Technical Report HCIL-2009-01, University of Maryland, College Park, January 2009.

2008

110.Jimmy Lin, Philip Wu, and Eileen Abels. Towards Automatic Facet Analysis and Need Negotiation: Lessons from Mediated Search. ACM Transactions on Information Systems, 27(1), Article 6, 2008.
109.Paul T. Jaeger, Jimmy Lin, and Justin Grimes. Cloud Computing and Information Policy: Computing in a Policy Cloud? Journal of Information Technology & Politics, 5(3):269-283, 2008.
108.Saif Mohammad, Bonnie J. Dorr, Melissa Egan, Nitin Madnani, David Zajic, and Jimmy Lin. Multiple Alternative Sentence Compressions and Word-Pair Antonymy for Automatic Text Summarization and Recognizing Textual Entailment. Proceedings of the First Text Analysis Conference (TAC 2008), November 2008, Gaithersburg, Maryland.
107.Jimmy Lin. Scalable Language Processing Algorithms for the Masses: A Case Study in Computing Word Co-occurrence Matrices with MapReduce. Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing (EMNLP 2008), pages 419-428, October 2008, Honolulu, Hawaii.
106.Jimmy Lin, Michael DiCuccio, Vahan Grigoryan, and W. John Wilbur. Navigating Information Spaces: A Case Study of Related Article Search in PubMed. Information Processing & Management, 44(5):1771-1783, 2008.
105.Jimmy Lin and Mark D. Smucker. How Do Users Find Things with PubMed? Towards Automatic Utility Evaluation with User Simulations. Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2008), pages 19-26, July 2008, Singapore.
104.David Zajic, Bonnie Dorr, and Jimmy Lin. Single-Document and Multi-Document Summarization Techniques for Email Threads Using Sentence Compression. Information Processing & Management, 44(4):1600-1610, 2008.
103.Jimmy Lin. Scalable Language Processing Algorithms for the Masses: A Case Study in Computing Word Co-occurrence Matrices with MapReduce. Technical Report HCIL-2008-28, University of Maryland, College Park, June 2008. (Later appears at EMNLP 2008)
102.Chris Dyer, Aaron Cordova, Alex Mont, and Jimmy Lin. Fast, Easy, and Cheap: Construction of Statistical Machine Translation Models with MapReduce. Proceedings of the Third Workshop on Statistical Machine Translation, pages 199-207, June 2008, Columbus, Ohio.
101.Jimmy Lin. Exploring Large-Data Issues in the Curriculum: A Case Study with MapReduce. Proceedings of the Third Workshop on Issues in Teaching Computational Linguistics, pages 54-61, June 2008, Columbus, Ohio.
100.Tamer Elsayed, Jimmy Lin, and Douglas W. Oard. Pairwise Document Similarity in Large Collections with MapReduce. Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics, Companion Volume (ACL 2008), pages 265-268, June 2008, Columbus, Ohio.
99.Judith L. Klavans, Carolyn Sheffield, Jimmy Lin, and Tandeep Sidhu. Computational Linguistics for Metadata Building. Proceedings of the 8th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL 2008), page 427, June 2008, Pittsburgh, Pennsylvania.
98.Jimmy Lin. PageRank without Hyperlinks: Reranking with PubMed Related Article Networks for Biomedical Text Retrieval. BMC Bioinformatics, 9:270 (6 Jun 2008). Open Access
97.Judith Klavans, Carolyn Sheffield, Eileen Abels, Joan Beaudoin, Laura Jenemann, Jimmy Lin, Tom Lippincott, Rebecca Passonneau, Tandeep Sidhu, Dagobert Soergel, and Tae Yano. Computational Linguistics for Metadata Building: Aggregating Text Processing Technologies for Enhanced Image Access. Proceedings of the 2nd Workshop on Language Resources for Content-Based Image Retrieval (OntoImage 2008) at LREC 2008, pages 42-47, May 2008, Marrakech, Morocco.
96.Jimmy Lin and Mark D. Smucker. How Do Users Find Things with PubMed? Towards Automatic Utility Evaluation with User Simulations. Technical Report LAMP-TR-148/HCIL-2008-07, University of Maryland, College Park, February 2008. (Later appears at SIGIR 2008)
95.Judith L. Klavans, Tandeep Sidhu, Carolyn Sheffield, Dagobert Soergel, Jimmy Lin, Eileen Abels, and Rebecca Passonneau. Computational Linguistics for Metadata Building (CLiMB): Text Mining for the Automatic Extraction of Subject Terms for Image Metadata. Proceedings of the VISAPP 2008 Workshop on Metadata Mining for Image Understanding, pages 3-12, January 2008, Funchal, Madeira, Portugal.
94.Timothy Hawes, Jimmy Lin, and Philip Resnik. Elements of a Computational Model for Multi-Party Discourse: The Turn-Taking Behavior of Supreme Court Justices. Technical Report LAMP-TR-147/HCIL-2008-02, University of Maryland, College Park, January 2008. (Later appears in JASIST, 2009)
93.Jimmy Lin. PageRank without Hyperlinks: Reranking with Related Document Networks. Technical Report LAMP-TR-146/HCIL-2008-01, University of Maryland, College Park, January 2008. (Later appears in BMC Bioinformatics)

2007

92.Michael Evans, Wayne McIntosh, Jimmy Lin, and Cynthia Cates. Recounting the Courts? Applying Automated Content Analysis to Enhance Empirical Legal Research. Journal of Empirical Legal Studies, 4(4):1007-1039, 2007.
91.Hoa Trang Dang, Jimmy Lin, and Diane Kelly. Overview of the TREC 2007 Question Answering Track. Proceedings of the Sixteenth Text REtrieval Conference (TREC 2007), November 2007, Gaithersburg, Maryland.
90.Nitin Madnani, Jimmy Lin, and Bonnie Dorr. TREC 2007 ciQA Task: University of Maryland. Proceedings of the Sixteenth Text REtrieval Conference (TREC 2007), November 2007, Gaithersburg, Maryland.
89.Jimmy Lin and Dina Demner-Fushman. Semantic Clustering of Answers to Clinical Questions. Proceedings of the 2007 Annual Symposium of the American Medical Informatics Association (AMIA 2007), pages 458-462, November 2007, Chicago, Illinois.
88.David Zajic, Bonnie Dorr, Jimmy Lin, and Richard Schwartz. Multi-Candidate Reduction: Sentence Compression as a Tool for Document Summarization Tasks. Information Processing & Management, 43(6):1549-1570, 2007.
87.Jimmy Lin and W. John Wilbur. PubMed Related Articles: A Probabilistic Topic-based Model for Content Similarity. BMC Bioinformatics, 8:423 (30 October 2007). Open Access
86.Jimmy Lin and W. John Wilbur. Syntactic Sentence Compression in the Biomedical Domain: Facilitating Access to Related Articles. Information Retrieval, 10(4-5):393-414, 2007. Open Choice
85.Jimmy Lin, Michael DiCuccio, Vahan Grigoryan, and W. John Wilbur. Exploring the Effectiveness of Related Article Search in PubMed. Technical Report LAMP-TR-145/CS-TR-4877/UMIACS-TR-2007-36/HCIL-2007-10, University of Maryland, College Park, July 2007. (Later appears in Information Processing and Management)
84.Jimmy Lin and Pengyi Zhang. Deconstructing Nuggets: The Stability and Reliability of Complex Question Answering Evaluation. Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2007), pages 327-334, July 2007, Amsterdam, the Netherlands.
83.Hoa Trang Dang and Jimmy Lin. Different Structures for Evaluating Answers to Complex Questions: Pyramids Won't Topple, and Neither Will Human Assessors. Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL 2007), pages 768-775, June 2007, Prague, Czech Republic.
82.Tandeep Sidhu, Judith Klavans, and Jimmy Lin. Concept Disambiguation for Improved Subject Access Using Multiple Knowledge Sources. Proceedings of the Workshop on Language Technology for Cultural Heritage Data (LaTeCH 2007), pages 25-32, June 2007, Prague, Czech Republic.
81.Paul Kantor and Jimmy Lin. Presentation Schemes for Component Analysis in IR Experiments. SIGIR Forum, 41(1):34-39, 2007.
80.Diane Kelly and Jimmy Lin. Overview of the TREC 2006 ciQA Task. SIGIR Forum, 41(1):107-116, 2007.
79.Jimmy Lin. User Simulations for Evaluating Answers to Question Series. Information Processing & Management, 43(3):717-729, 2007.
78.Georg Apitz and Jimmy Lin. Interfaces to Support the Scholarly Exploration of Text Collections. Proceedings of the CHI 2007 Workshop for Exploratory Search and HCI, pages 60-63, April 2007, San Jose, California.
77.Jimmy Lin. Is Question Answering Better Than Information Retrieval? Towards a Task-Based Evaluation Framework for Question Series. Proceedings of the 2007 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL/HLT 2007), pages 212-219, April 2007, Rochester, New York.
76.Nitin Madnani, David Zajic, Bonnie Dorr, Necip Fazil Ayan, and Jimmy Lin. Multiple Alternative Sentence Compressions for Automatic Text Summarization. Proceedings of the 2007 Document Understanding Conference (DUC-2007) at NAACL/HLT 2007, April 2007, Rochester, New York.
75.Jimmy Lin. An Exploration of the Principles Underlying Redundancy-Based Factoid Question Answering. ACM Transactions on Information Systems, 25(2):1-55, 2007.
74.Dina Demner-Fushman and Jimmy Lin. Answering Clinical Questions with Knowledge-Based and Statistical Techniques. Computational Linguistics, 33(1):63-103, 2007.

2006

73.Jimmy Lin and Dina Demner-Fushman. Methods for Automatically Evaluating Answers to Complex Questions. Information Retrieval, 9(5):565-587, 2006.
72.Hoa Trang Dang, Jimmy Lin, and Diane Kelly. Overview of the TREC 2006 Question Answering Track. Proceedings of the Fifteenth Text REtrieval Conference (TREC 2006), pages 99-116, November 2006, Gaithersburg, Maryland.
71.Douglas Oard, Tamer Elsayed, Jianqiang Wang, Yejun Wu, Pengyi Zhang, Eileen Abels, Jimmy Lin, and Dagobert Soergel. TREC-2006 at Maryland: Blog, Enterprise, Legal and QA Tracks. Proceedings of the Fifteenth Text REtrieval Conference (TREC 2006), November 2006, Gaithersburg, Maryland.
70.Xiaoli Huang, Jimmy Lin, and Dina Demner-Fushman. Evaluation of PICO as a Knowledge Representation for Clinical Questions. Proceedings of the 2006 Annual Symposium of the American Medical Informatics Association (AMIA 2006), pages 359-363, November 2006, Washington, D.C.
69.G. Craig Murray, Jimmy Lin, and Abdur Chowdhury. Identification of User Sessions with Hierarchical Agglomerative Clustering. Proceedings of the 2006 Annual Meeting of the American Society for Information Science and Technology (ASIST 2006), November 2006, Austin, Texas. (poster)
68.Jimmy Lin and Dina Demner-Fushman. The Role of Knowledge in Conceptual Retrieval: A Study in the Domain of Clinical Medicine. Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2006), pages 99-106, August 2006, Seattle, Washington.
67.Jimmy Lin, Philip Wu, Dina Demner-Fushman, and Eileen Abels. Exploring the Limits of Single-Iteration Clarification Dialogs. Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2006), pages 469-476, August 2006, Seattle, Washington.
66.G. Craig Murray, Jimmy Lin, and Abdur Chowdhury. Action Modeling: Using Language Models to Predict Query Behavior. Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2006), pages 681-682, August 2006, Seattle, Washington.
65.G. Craig Murray, Bonnie J. Dorr, Jimmy Lin, Jan Hajic, and Pavel Pecina. Leveraging Reusability: Cost-Effective Lexical Acquisition for Large-scale Ontology Translation. Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics (COLING/ACL 2006), pages 945-952, July 2006, Sydney, Australia.
64.Dina Demner-Fushman and Jimmy Lin. Answer Extraction, Semantic Clustering, and Extractive Summarization for Clinical Question Answering. Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics (COLING/ACL 2006), pages 841-848, July 2006, Sydney, Australia.
63.Jimmy Lin. The Role of Information Retrieval in Answering Complex Questions. Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics (COLING/ACL 2006), Poster Sessions, pages 523-530, July 2006, Sydney, Australia.
62.Dina Demner-Fushman and Jimmy Lin. Situated Question Answering in the Clinical Domain: Selecting the Best Drug Treatment for Diseases. Proceedings of the Workshop on Task-Focused Summarization and Question Answering, pages 24-31, July 2006, Sydney, Australia.
61.G. Craig Murray, Bonnie J. Dorr, Jimmy Lin, Jan Hajic, and Pavel Pecina. Leveraging Recurrent Phrase Structure in Large-scale Ontology Translation. Proceedings of the 11th Annual Conference of the European Association for Machine Translation (EAMT), June 2006, Oslo, Norway.
60.David Zajic, Bonnie Dorr, Jimmy Lin, Dianne O'Leary, John Conroy, and Judith Schlesinger. Sentence Trimming and Selection: Mixing and Matching. Proceedings of the 2006 Document Understanding Conference (DUC-2006) at HLT/NAACL 2006, June 2006, New York, New York.
59.David Zajic, Bonnie Dorr, Jimmy Lin, and Richard Schwartz. Sentence Compression as a Component of a Multi-Document Summarization System. Proceedings of the 2006 Document Understanding Conference (DUC-2006) at HLT/NAACL 2006, June 2006, New York, New York.
58.Jimmy Lin and Dina Demner-Fushman. Will Pyramids Built of Nuggets Topple Over? Proceedings of the 2006 Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT/NAACL 2006), pages 383-390, June 2006, New York, New York.
57.Jimmy Lin, Damianos Karakos, Dina Demner-Fushman, and Sanjeev Khudanpur. Generative Content Models for Structural Analysis of Medical Abstracts. Proceedings of the HLT-NAACL BioNLP Workshop on Linking Natural Language and Biology, pages 65-72, June 2006, New York, New York.
56.Jimmy Lin and Boris Katz. Building a Reusable Test Collection for Question Answering. Journal of the American Society for Information Science and Technology, 57(7):851-861, 2006.
55.Jimmy Lin. The Role of Information Retrieval in Answering Complex Questions. Technical Report LAMP-TR-130/CS-TR-4787/UMIACS-TR-2006-11, University of Maryland, College Park, February 2006. (Later appears at ACL 2006)

2005

54.Jimmy Lin and Dina Demner-Fushman. Will Pyramids Built of Nuggets Topple Over? Technical Report LAMP-TR-127/CS-TR-4771/UMIACS-TR-2005-71, University of Maryland, College Park, December 2005. (Later appears at HLT/NAACL 2006)
53.Jimmy Lin, Eileen Abels, Dina Demner-Fushman, Douglas W. Oard, Philip Wu, and Yejun Wu. A Menagerie of Tracks at Maryland: HARD, Enterprise, QA, and Genomics, Oh My! Proceedings of the Fourteenth Text REtrieval Conference (TREC 2005), November 2005, Gaithersburg, Maryland.
52.Alan R. Aronson, Dina Demner-Fushman, Susanne M. Humphrey, Jimmy Lin, Hongfang Liu, Patrick Ruch, Miguel E. Ruiz, Lawrence H. Smith, Lorraine K. Tanabe, and W. John Wilbur. Fusion of Knowledge-Intensive and Statistical Approaches for Retrieving and Annotating Textual Genomics Documents. Proceedings of the Fourteenth Text REtrieval Conference (TREC 2005), November 2005, Gaithersburg, Maryland.
51.Jimmy Lin and Dina Demner-Fushman. "Bag of Words" Is Not Enough for Strength of Evidence Classification. Proceedings of the 2005 Annual Symposium of the American Medical Informatics Association (AMIA 2005), page 1031, October 2005, Washington, D.C.
50.David Zajic, Bonnie Dorr, Jimmy Lin, Christof Monz, and Richard Schwartz. A Sentence-Trimming Approach to Multi-Document Summarization. Proceedings of the 2005 Document Understanding Conference (DUC-2005) at HLT/EMNLP 2005, October 2005, Vancouver, British Columbia, Canada.
49.Grazia Russo-Lassner, Jimmy Lin, and Philip Resnik. A Paraphrase-Based Approach to Machine Translation Evaluation. Technical Report LAMP-TR-125/CS-TR-4754/UMIACS-TR-2005-57, University of Maryland, College Park, August 2005.
48.Jimmy Lin and Dina Demner-Fushman. Automatically Evaluating Answers to Definition Questions. Proceedings of the 2005 Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP 2005), pages 931-938, October 2005, Vancouver, British Columbia, Canada.
47.Jimmy Lin and Dina Demner-Fushman. Representation of Information Needs and the Elements of Context: A Case Study in the Domain of Clinical Medicine. Proceedings of the SIGIR 2005 Workshop on Information Retrieval in Context (IRiX 2005), pages 51-53, August 2005, Salvador, Brazil.
46.Jimmy Lin. Evaluation of Resources for Question Answering Evaluation. Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2005), pages 392-399, August 2005, Salvador, Brazil.
45.Jimmy Lin and G. Craig Murray. Assessing the Term Independence Assumption in Blind Relevance Feedback. Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2005), pages 635-636, August 2005, Salvador, Brazil.
44.Dina Demner-Fushman and Jimmy Lin. Knowledge Extraction for Clinical Question Answering: Preliminary Results. Proceedings of the AAAI-05 Workshop on Question Answering in Restricted Domains, July 2005, Pittsburgh, Pennsylvania.
43.Jimmy Lin and Dina Demner-Fushman. Evaluating Summaries and Answers: Two Sides of the Same Coin? Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, pages 41-48, June 2005, Ann Arbor, Michigan.
42.Michael Evans, Wayne McIntosh, Cynthia L. Cates, and Jimmy Lin. Recounting the Courts? Toward A Text-Centered Computational Approach to Understanding the Dynamics of the Judicial System. Proceedings of the 2005 Annual Meeting of the Midwest Political Science Association, April 2005, Chicago, Illinois.
41.Jimmy Lin and Dina Demner-Fushman. Automatically Evaluating Answers to Definition Questions. Technical Report LAMP-TR-119/CS-TR-4695/UMIACS-TR-2005-04, University of Maryland, College Park, February 2005. (Later appears at HLT/EMNLP 2005)
40.Jimmy Lin. Evaluation of Resources for Question Answering Evaluation. Technical Report LAMP-TR-118/CS-TR-4693/UMIACS-TR-2005-03, University of Maryland, College Park, February 2005. (Later appears at SIGIR 2005)

2004

39.Boris Katz, Jimmy Lin, Chris Stauffer, and Eric Grimson. Answering Questions About Moving Objects in Videos. In Mark T. Maybury, editor, New Directions in Question Answering. Cambridge, Massachusetts: MIT Press, 2004, pages 113-128.
38.Boris Katz, Sue Felshin, Jimmy Lin, and Gregory Marton. Viewing the Web as a Virtual Database for Question Answering. In Mark T. Maybury, editor, New Directions in Question Answering. Cambridge, Massachusetts: MIT Press, 2004, pages 215-226.
37.Boris Katz, Matthew Bilotti, Sue Felshin, Aaron Fernandes, Wesley Hildebrandt, Roni Katzir, Jimmy Lin, Daniel Loreto, Gregory Marton, Federico Mora, and Özlem Uzuner. Answering Multiple Questions on a Topic From Heterogeneous Resources. Proceedings of the Thirteenth Text REtrieval Conference (TREC 2004), November 2004, Gaithersburg, Maryland.
36.Jimmy Lin. Are Degree Achievements Really Achievements? Proceedings of the 9th International Symposium on Chinese Languages and Linguistics (IsCLL-9), November 2004, Taipei, Taiwan.
35.Jimmy Lin. Event Structure and the Encoding of Arguments: The Syntax of the Mandarin and English Verb Phrase. Ph.D. Thesis, Massachusetts Institute of Technology, 2004. (a version that wastes less paper)
34.Matthew W. Bilotti, Boris Katz, and Jimmy Lin. What Works Better for Question Answering: Stemming or Morphological Query Expansion? Proceedings of the SIGIR 2004 Workshop on Information Retrieval for Question Answering (IR4QA), July 2004, Sheffield, England.
33.Jimmy Lin. A Computational Framework for Non-Lexicalist Semantics. Proceedings of the Student Research Workshop at HLT-NAACL 2004, pages 19-24, May 2004, Boston, Massachusetts.
32.Jimmy Lin. Fine-Grained Lexical Semantic Representations and Compositionally-Derived Events in Mandarin Chinese. Proceedings of the Computational Lexical Semantics Workshop at HLT-NAACL 2004, pages 100-107, May 2004, Boston, Massachusetts.
31.Wesley Hildebrandt, Boris Katz, and Jimmy Lin. Answering Definition Questions with Multiple Knowledge Sources. Proceedings of the 2004 Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT/NAACL 2004), pages 49-56, May 2004, Boston, Massachusetts.

2003

30.Boris Katz, Jimmy Lin, Daniel Loreto, Wesley Hildebrandt, Matthew Bilotti, Sue Felshin, Aaron Fernandes, Gregory Marton, and Federico Mora. Integrating Web-based and Corpus-based Techniques for Question Answering. Proceedings of the Twelfth Text REtrieval Conference (TREC 2003), pages 426-435, November 2003, Gaithersburg, Maryland.
29.Jimmy Lin and Boris Katz. Question Answering from the Web Using Knowledge Annotation and Knowledge Mining Techniques. Proceedings of the Twelfth International Conference on Information and Knowledge Management (CIKM 2003), pages 116-123, November 2003, New Orleans, Louisiana.
28.Boris Katz and Jimmy Lin. Organizing and Accessing a Comprehensive Knowledge Base Using the World Wide Web. Proceedings of the International Conference on Integration of Knowledge Intensive Multi-Agent Systems (KIMAS 2003), pages 529-534, September 2003, Boston, Massachusetts.
27.Jimmy Lin, Dennis Quan, Vineet Sinha, Karun Bakshi, David Huynh, Boris Katz, and David R. Karger. What Makes a Good Answer? The Role of Context in Question Answering. Proceedings of the Ninth IFIP TC13 International Conference on Human-Computer Interaction (INTERACT 2003), pages 25-32, September 2003, Zurich, Switzerland.
26.Stefanie Tellex, Boris Katz, Jimmy Lin, Gregory Marton, and Aaron Fernandes. Quantitative Evaluation of Passage Retrieval Algorithms for Question Answering. Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2003), pages 41-47, July 2003, Toronto, Ontario, Canada. (🏆 Best Student Paper)
25.Ali Ibrahim, Boris Katz, and Jimmy Lin. Extracting Structural Paraphrases from Aligned Monolingual Corpora. Proceedings of the Second International Workshop on Paraphrasing, pages 57-64, July 2003, Sapporo, Japan.
24.Boris Katz, Roger Hurwitz, Jimmy Lin, and Özlem Uzuner. Better Public Policy Through Natural Language Information Access. Proceedings of the National Conference on Digital Government Research (DG.O 2003), pages 73-78, May 2003, Boston, Massachusetts.
23.Boris Katz, Roger Hurwitz, Jimmy Lin, and Özlem Uzuner. START: A Framework for Facilitating E-Rulemaking. Proceedings of the National Conference on Digital Government Research (DG.O 2003), page 285, May 2003, Boston, Massachusetts. (demo)
22.Jimmy Lin and Boris Katz. Question Answering Techniques for the World Wide Web. Tutorial at the 11th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2003), April 2003, Budapest, Hungary.
21.Boris Katz and Jimmy Lin. Selectively Using Relations to Improve Precision in Question Answering. Proceedings of the EACL 2003 Workshop on Natural Language Processing for Question Answering, pages 43-50, April 2003, Budapest, Hungary.
20.Jimmy Lin, Dennis Quan, Vineet Sinha, Karun Bakshi, David Huynh, Boris Katz, and David R. Karger. The Role of Context in Question Answering Systems. Extended abstracts of the 2003 Conference on Human Factors in Computing Systems (CHI 2003), pages 1006-1007, April 2003, Fort Lauderdale, Florida.
19.Boris Katz, Jimmy Lin, Chris Stauffer, and Eric Grimson. Answering Questions about Moving Objects in Surveillance Videos. Proceedings of the 2003 AAAI Spring Symposium on New Directions in Question Answering, pages 145-152, March 2003, Palo Alto, California.
18.David Karger, Boris Katz, Jimmy Lin, and Dennis Quan. Sticky Notes for the Semantic Web. Proceedings of the 2003 International Conference on Intelligent User Interfaces (IUI 2003), pages 254-256, January 2003, Miami, Florida.

2002

17.Jimmy Lin, Aaron Fernandes, Boris Katz, Gregory Marton, and Stefanie Tellex. Extracting Answers from the Web Using Knowledge Annotation and Knowledge Mining Techniques. Proceedings of the Eleventh Text REtrieval Conference (TREC 2002), pages 447-456, November 2002, Gaithersburg, Maryland.
16.Boris Katz, Jimmy Lin, and Sue Felshin. The START Multimedia Information System: Current Technology and Future Directions. Proceedings of the International Workshop on Multimedia Information Systems (MIS 2002), pages 117-123, October 2002, Tempe, Arizona.
15.Boris Katz, Jimmy Lin, and Dennis Quan. Natural Language Annotations for the Semantic Web. Proceedings of the International Conference on Ontologies, Databases, and Application of Semantics (ODBASE 2002), pages 1317-1331, September 2002, Irvine, California.
14.Boris Katz and Jimmy Lin. Annotating the Semantic Web Using Natural Language. COLING-02: The 2nd Workshop on NLP and XML (NLPXML-2002), September 2002, Taipei, Taiwan.
13.Susan Dumais, Michele Banko, Eric Brill, Jimmy Lin, and Andrew Ng. Web Question Answering: Is More Always Better? Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2002), pages 291-298, August 2002, Tampere, Finland.
12.Boris Katz, Sue Felshin, Deniz Yuret, Ali Ibrahim, Jimmy Lin, Gregory Marton, Alton Jerome McFarland, and Baris Temelkuran. Omnibase: Uniform Access to Heterogeneous Data for Question Answering. Proceedings of the 6th International Conference on Applications of Natural Language to Information Systems (NLDB 2002), pages 230-234, June 2002, Stockholm, Sweden.
11.Jimmy Lin. The Web as a Resource for Question Answering: Perspective and Challenges. Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002), pages 2120-2127, May 2002, Canary Islands, Spain.
10.Michele Banko, Eric Brill, Susan Dumais, and Jimmy Lin. AskMSR: Question Answering Using the Worldwide Web. Proceedings of the 2002 AAAI Spring Symposium on Mining Answers from Texts and Knowledge Bases, pages 7-8, March 2002, Palo Alto, California. [scanned proceedings version]

2001

9.Eric Brill, Jimmy Lin, Michele Banko, Susan Dumais, and Andrew Ng. Data-Intensive Question Answering. Proceedings of the Tenth Text REtrieval Conference (TREC 2001), pages 393-400, November 2001, Gaithersburg, Maryland.
8.Joyce Chai, Jimmy Lin, Wlodek Zadrozny, Yiming Ye, Margo Stys-Budzikowska, Veronika Horvath, Nanda Kambhatla, and Catherine Wolf. The Role of a Natural Language Conversational Interface in Online Sales: A Case Study. International Journal of Speech Technology, 4:285-295, 2001.
7.Boris Katz, Jimmy Lin, and Sue Felshin. Gathering Knowledge for a Question Answering System from Heterogeneous Information Sources. Proceedings of the ACL 2001 Workshop on Human Language Technology and Knowledge Management, July 2001, Toulouse, France.
6.Jimmy Lin. Indexing and Retrieving Natural Language Using Ternary Expressions. Master's Thesis, Massachusetts Institute of Technology, 2001.

2000

5.Boris Katz and Jimmy Lin. REXTOR: A System for Generating Relations from Natural Language. ACL-2000 Workshop on Recent Advances in Natural Language Processing and Information Retrieval, pages 67-77, October 2000, Hong Kong, China.
4.Joyce Chai, Jimmy Lin, Wlodek Zadrozny, Yiming Ye, Margo Budzikowska, Veronika Horvath, Nanda Kambhatla, and Catherine Wolf. Comparative Evaluation of a Natural Language Dialog Based System and a Menu Driven System for Information Access: A Case Study. Proceedings of the International Conference on Multimedia Information Retrieval - Volume 2 (RIAO 2000), pages 1590-1600, April 2000, Paris, France. [authors' version]
3.Joyce Chai, Jimmy Lin, Wlodek Zadrozny, Yiming Ye, Margo Budzikowska, Veronika Horvath, Nanda Kambhatla, and Catherine Wolf. Evaluation of a Natural Language Dialog Based Web Navigation System—A Case Study. Proceedings of the 2nd International Conference on Language Resources and Evaluation (LREC 2000) Interactivity Workshop, May 2000, Athens, Greece.

1999

2.Boris Katz, Deniz Yuret, Jimmy Lin, Sue Felshin, Rebecca Schulman, Adnan Ilik, Ali Ibrahim, and Philip Osafo-Kwaako. Integrating Large Lexicons and Web Resources into a Natural Language Query System. Proceedings of the International Conference on Multimedia Computing and Systems (ICMCS 1999), Volume 2, pages 255-261, June 1999, Florence, Italy. [authors' version]

1998

1.Boris Katz, Deniz Yuret, Jimmy Lin, Sue Felshin, Rebecca Schulman, and Adnan Ilik. Blitz: A Preprocessor for Detecting Context-Independent Linguistic Structures. Proceedings of the 5th Pacific Rim Conference on Artificial Intelligence (PRICAI 1998), November 1998, Singapore.