William W. Cohen's Papers: Info Extraction/Reading/QA

  1. Chung-Ching Chang, William W. Cohen, Yun-Hsuan Sung (2023): Characterizing Tradeoffs in Language Model Decoding with Informational Interpretations in progress.
  2. Tal Schuster, Adam D. Lelkes, Haitian Sun, Jai Gupta, Jonathan Berant, William W. Cohen, Donald Metzler (2024): SEMQA: Semi-Extractive Multi-Source Question Answering in NAACL-2024.
  3. Yury Zemlyanskiy, Michiel de Jong, Luke Vilnis, Santiago Ontañón, William W. Cohen, Sumit Sanghai, Joshua Ainslie (2024): MEMORY-VQ: Compression for Tractable Internet-Scale Memory in NAACL-2024.
  4. Haitian Sun, William W. Cohen, Ruslan Salakhutdinov (2023): Answering Ambiguous Questions with a Database of Questions, Answers, and Revisions in progress.
  5. Michiel de Jong, Yury Zemlyanskiy, Nicholas FitzGerald, Sumit Sanghai, William W. Cohen, Joshua Ainslie (2023): GLIMMER: generalized late-interaction memory reranker in progress.
  6. Wenhu Chen, Hexiang Hu, Yandong Li, Nataniel Ruiz, Xuhui Jia, Ming-Wei Chang, William W. Cohen (2023): Subject-driven Text-to-Image Generation via Apprenticeship Learning in NeurIPS-2023.
  7. Michiel de Jong, Yury Zemlyanskiy, Nicholas FitzGerald, Joshua Ainslie, Sumit Sanghai, Fei Sha, William W. Cohen (2023): Pre-computed memory or on-the-fly encoding? A hybrid approach to retrieval augmentation makes the most of your compute in ICML-2023.
  8. Wenhu Chen, Hexiang Hu, Xi Chen, Pat Verga, William W. Cohen (2023): MuRAG: Multimodal Retrieval-Augmented Generator for Open Question Answering over Images and Text in EACL-2023.
  9. Haitian Sun, William W. Cohen, Ruslan Salakhutdinov (2022): Reasoning over Logically Interacted Conditions for Question Answering in progress.
  10. Michiel de Jong, Yury Zemlyanskiy, Joshua Ainslie, Nicholas FitzGerald, Sumit Sanghai, Fei Sha, William Cohen (2023): FiDO: Fusion-in-Decoder optimized for stronger performance and faster inference in ACL-2023 (Findings).
  11. Wenhu Chen, Hexiang Hu, Chitwan Saharia, William W. Cohen (2023): Re-Imagen: Retrieval-Augmented Text-to-Image Generator in ICLR-2023.
  12. Julian Martin Eisenschlos, Jeremy R. Cole, Fangyu Liu, William W. Cohen (2023): WinoDict: Probing language models for in-context word acquisition in EACL-2023.
  13. John Wieting, Jonathan H. Clark, William W. Cohen, Graham Neubig, Taylor Berg-Kirkpatrick (2023): Beyond Contrastive Learning: A Variational Generative Model for Multilingual Retrieval in ACL-2023.
  14. Wenhu Chen, Xueguang Ma, Xinyi Wang, William W. Cohen (2022): Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks in progress.
  15. Wenhu Chen, William W. Cohen, Michiel De Jong, Nitish Gupta, Alessandro Presta, Pat Verga, John Wieting (2023): QA Is the New KR: Question-Answer Pairs as Knowledge Bases in AAAI-2023.
  16. Vidhisha Balachandran, Hannaneh Hajishirzi, William Cohen, Yulia Tsvetkov (2022): Correcting Diverse Factual Errors in Abstractive Summarization via Post-Editing and Language Model Infilling in EMNLP-2022.
  17. Haitian Sun, William W. Cohen, Ruslan Salakhutdinov (2023): Scenario-based Question Answering with Interacting Contextual Properties in ICLR-2023.
  18. Bernd Bohnet, Vinh Q. Tran, Pat Verga, Roee Aharoni, Daniel Andor, Livio Baldini Soares, Jacob Eisenstein, Kuzman Ganchev, Jonathan Herzig, Kai Hui, Tom Kwiatkowski, Ji Ma, Jianmo Ni, Tal Schuster, William W. Cohen, Michael Collins, Dipanjan Das, Donald Metzler, Slav Petrov, and Kellie Webster (2022): Attributed Question Answering: Evaluation and Modeling for Attributed Large Language Models in progress.
  19. Vidhisha Balachandran, Bhuwan Dhingra, Haitian Sun, Michael Collins, and William W. Cohen (2021): Investigating the Effect of Background Knowledge on Natural Questions in DeeLIO-2021.
  20. Haitian Sun, William W. Cohen, Ruslan Salakhutdinov (2021): ConditionalQA: A Complex Reading Comprehension Dataset with Conditional Answers in ACL 2022.
  21. Haitian Sun, William W. Cohen, Ruslan Salakhutdinov (2021): End-to-End Multihop Retrieval for Compositional Question Answering over Long Documents in preparation.
  22. Haitian Sun, Pat Verga, Bhuwan Dhingra, Ruslan Salakhutdinov, William W. Cohen (2021): Reasoning Over Virtual Knowledge Bases With Open Predicate Relations in ICML-2021.
  23. Wenhu Chen, Ming-Wei Chang, Eva Schlinger, William Wang, William W. Cohen (2021): Open Question Answering Over Tables and Text in ICLR-2021.
  24. Pat Verga, Haitian Sun, Livio Baldini Soares, and William W. Cohen (2021): Adaptable and Interpretable Neural Memory Over Symbolic Knowledge in NAACL-2021.
  25. Bill Yuchen Lin, Haitian Sun, Bhuwan Dhingra, Manzil Zaheer, Xiang Ren, William W. Cohen (2020): Differentiable Open-Ended Commonsense Reasoning in NAACL-2021.
  26. Pat Verga, Haitian Sun, Livio Baldini Soares, and William W. Cohen (2020): Facts as Experts: Adaptable and Interpretable Neural Memory over Symbolic Knowledge in arxiv.
  27. William W. Cohen, Haitian Sun, R. Alex Hofer, Matthew Siegler (2020): Scalable Neural Methods for Reasoning With a Symbolic Knowledge Base in ICLR-2020.
  28. Bhuwan Dhingra, Manzil Zaheer, Vidhisha Balachandran, Graham Neubig, Ruslan Salakhutdinov, William W. Cohen (2020): Differentiable Reasoning over a Virtual Knowledge Base in ICLR-2020.
  29. Qiao Jin, Bhuwan Dhingra, Zhengping Liu, William W Cohen, and Xinghua Lu (2019): PubMedQA: A Dataset for Biomedical Research Question Answering in EMNLP-2019.
  30. Bhuwan Dhingra, Manaal Faruqui, Ankur Parikh, Ming-Wei Chang, Dipanjan Das, William W. Cohen (2019): Handling Divergent Reference Texts when Evaluating Table-to-Text Generation in ACL-2019.
  31. William W. Cohen, Haitian Sun, Alex Hofer, Matthew Siegler (2019): Differentiable Representations For Multihop Inference Rules in arxiv.
  32. William W. Cohen, Matthew Siegler, Alex Hofer (2019): Neural Query Language: A Knowledge Base Query Language for Tensorflow in arxiv.
  33. Haitian Sun, Tania Bedrax-Weiss, William W. Cohen (2019): PullNet: Open Domain Question Answering with Iterative Retrieval on Knowledge Bases and Text in EMNLP-2019.
  34. Haohan Wang, Xiang Liu, Yifeng Tao, Wenting Ye, Qiao Jin, William W. Cohen and Eric P. Xing (2019): Automatic Human-like Mining and Constructing Reliable Genetic Association Database with Deep Reinforcement Learning in Biocomputing.
  35. Haitian Sun, William W. Cohen, Lidong Bing (2018): Semi-Supervised Learning with Declaratively Specified Entropy Constraints in NIPS-2018.
  36. Zhilin Yang, Jake (Junbo) Zhao, Bhuwan Dhingra, Kaiming He, William W. Cohen, Ruslan Salakhutdinov, Yann LeCun (2018): GLoMo: Unsupervisedly Learned Relational Graphs as Transferable Representations in NIPS-2018.
  37. Zhilin Yang, Peng Qi, Saizheng Zhang, Yoshua Bengio, William W. Cohen, Ruslan Salakhutdinov, Christopher D. Manning (2018): HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering in EMNLP-2018.
  38. Haitian Sun, Bhuwan Dhingra, Manzil Zaheer, Kathryn Mazaitis, Ruslan Salakhutdinov, and William W. Cohen (2018): Open Domain Question Answering Using Early Fusion of Knowledge Bases and Text in EMNLP-2018.
  39. Bhuwan Dhingra, Qiao Jin, Zhilin Yang, William W. Cohen, Ruslan Salakhutdinov (2018): Neural Models for Reasoning over Multiple Mentions using Coreference in NAACL-2018.
  40. Vidhisha Balachandran, Dheeraj Rajagopal, Rose Catherine Kanjirathinkal, and William W. Cohen (2018): Learning to Define Terms in the Software Domain in W-NUT 2018.
  41. T. Mitchell, W. Cohen, E. Hruschka, P. Talukdar, B. Yang, J. Betteridge, A. Carlson, B. Dalvi, M. Gardner, B. Kisiel, J. Krishnamurthy, N. Lao, K. Mazaitis, T. Mohamed, N. Nakashole, E. Platanios, A. Ritter, M. Samadi, B. Settles, R. Wang, D. Wijaya, A. Gupta, X. Chen, A. Saparov, M. Greaves, J. Welling (2017): Never-Ending Learning in CACM.
  42. Rose Catherine, Kathryn Mazaitis, Maxine Eskenazi, William W. Cohen (2017): Explainable Entity-based Recommendations with Knowledge Graphs (poster paper) in RecSys-2017.
  43. Bhuwan Dhingra, Kathryn Mazaitis, William W. Cohen (2017): Quasar: Datasets for Question Answering by Search and Reading in arxiv 1707.03904.
  44. Bhuwan Dhingra, Hanxiao Liu, Ruslan Salakhutdinov, and William W. Cohen (2017): A Comparative Study of Word Embeddings for Reading Comprehension in arxiv 1703.00993.
  45. Rose Catherine, William W. Cohen (2017): TransNets: Learning to Transform for Recommendation in RecSys-2017.
  46. Lidong Bing, William W. Cohen, Bhuwan Dhingra, and Richard C. Wang (2017): Using Graphs of Classifiers to Impose Constraints on Semi-supervised Relation Extraction in IJCAI 2017.
  47. Zhilin Yang, Junjie Hu, Ruslan Salakhutdinov, William W. Cohen (2017): Semi-Supervised QA with Generative Domain-Adaptive Nets in ACL-2017.
  48. Zhilin Yang, Bhuwan Dhingra, Ye Yuan, Junjie Hu, William W. Cohen, Ruslan Salakhutdinov (2017): Words or Characters? Fine-grained Gating for Reading Comprehension in ICLR 2017.
  49. Zhilin Yang, Ruslan Salakhutdinov, William W. Cohen (2017): Transfer Learning for Sequence Tagging with Hierarchical Recurrent Networks in ICLR 2017.
  50. Lidong Bing, Bhuwan Dhingra, Kathryn Mazaitis, Jong Hyuk Park, William W. Cohen (2017): Bootstrapping Distantly Supervised IE using Joint Learning and Small Well-structured Corpora in AAAI 2017.
  51. Abhinav Maurya, Kenton Murray, Yandong Liu, Chris Dyer, William W. Cohen and Daniel B. Neill (2016): Semantic Scan: Detecting Subtle, Spatially Localized Events in Text Streams in arxiv 1602.04393.
  52. Lidong Bing, William W. Cohen, Bhuwan Dhingra, and Richard C. Wang (2016): Using Graphs of Classifiers to Impose Constraints on Semi-supervised Relation Extraction in WAKBC-2016.
  53. Zhilin Yang, Ruslan Salakhutdinov, William Cohen (2016): Revisiting Semi-Supervised Learning with Graph Embeddings in ICML-2016.
  54. Zhilin Yang, Ruslan Salakhutdinov, William Cohen (2016): Multi-Task Cross-Lingual Sequence Tagging from Scratch in arxiv 1603.06270.
  55. Lidong Bing, Mingyang Ling, Richard C. Wang, William W. Cohen (2016): Distant IE by Bootstrapping Using Lists and Document Structure in AAAI-2016.
  56. Jay Pujara, Hui Miao, Lise Getoor, and William W. Cohen (2015): Using semantics and statistics to turn data into knowledge in AI Magazine 2015.
  57. Lidong Bing, Sneha Chaudhari, Richard C. Wang, and William W. Cohen (2015): Improving Distant Supervision for Information Extraction Using Label Propagation Through Lists in EMNLP-2015.
  58. Bhavana Dalvi, Einat Minkov, Partha P. Talukdar, and William W. Cohen (2015): Automatic Gloss Finding for a Knowledge Base using Ontological Constraints in WSDM-2015.
  59. T. Mitchell, W. Cohen, E. Hruschka, P. Talukdar, J. Betteridge, A. Carlson, B. Dalvi, M. Gardner, B. Kisiel, J. Krishnamurthy, N. Lao, K. Mazaitis, T. Mohamed, N. Nakashole, E. Platanios, A. Ritter, M. Samadi, B. Settles, R. Wang, D. Wijaya, A. Gupta, X. Chen, A. Saparov, M. Greaves, J. Welling (2015): Never-Ending Learning in AAAI-2015.
  60. Jay Pujara, Hui Miao, Lise Getoor and William W. Cohen (2014): Using Semantics & Statistics to Turn Data into Knowledge in AI Magazine 2014.
  61. Jay Pujara, Hui Miao, Lise Getoor, and William W. Cohen (2013): Ontology-Aware Partitioning for Knowledge Graph Identification in AKBC-2013.
  62. Bhavana Dalvi, William W. Cohen, and Jamie Callan (2013): Classifying Entities into an Incomplete Ontology in AKBC-2013.
  63. Jay Pujara, Hui Miao, Lise Getoor, and William W. Cohen (2013): Knowledge Graph Identification in ISWC-2013.
  64. Ramnath Balasubramanyan, Bhavana Dalvi and William W. Cohen (2013): From Topic Models to Semi-Supervised Learning: Biasing Mixed-membership Models to Exploit Topic-Indicative Features in Entity Clustering in ECML/PKDD-2013.
  65. Bhavana Dalvi, William W. Cohen, and Jamie Callan (2013): Exploratory Learning in ECML/PKDD-2013.
  66. Bhavana Dalvi and William W. Cohen (2013): Very Fast Similarity Queries on Semi-Structured Data from the Web in SDM-2013.
  67. Freddy Chong Tat Chua, William W. Cohen, Justin Betteridge, and Ee-Peng Lim (2012): Community-Based Classification of Noun Phrases in Twitter in CIKM-2012 (short paper).
  68. Ni Lao, Amar Subramanya, Fernando Pereira and William W. Cohen (2012): Reading The Web with Learned Syntactic-Semantic Inference Rules in EMNLP-CoNLL-2012.
  69. Bhavana Dalvi, William W. Cohen, and Jamie Callan (2012): Collectively Representing Semi-Structured Data from the Web in AKBC-2012.
  70. Dana Movshovitz-Attias and William W. Cohen (2012): Alignment-based Extraction of Abbreviations from Biomedical Text in BioNLP-2012.
  71. Dana Movshovitz-Attias and William W. Cohen (2012): Bootstrapping Biomedical Ontologies for Scientific Text using NELL in BioNLP-2012.
  72. Bhavana Dalvi, William W. Cohen, and Jamie Callan (2012): WebSets: Extracting Sets of Entities from the Web Using Unsupervised Information Extraction in WSDM-2012.
  73. Ni Lao, Tom Mitchell, and William W. Cohen (2011): Random Walk Inference and Learning in A Large Scale Knowledge Base in EMNLP-2011.
  74. Jacob Eisenstein, Tae Yano, William W. Cohen, Noah A. Smith, and Eric P. Xing (2011): Structured Databases of Named Entities from Bayesian Nonparametrics in UNSUP-2011.
  75. Bhavana Dalvi, Jamie Callan, and William W. Cohen (2011): Entity List Completion Using Set Expansion Techniques in TREC 2011.
  76. Einat Minkov and William W. Cohen (2010): Improving Graph-Walk Based Similarity with Reranking: Case Studies for Personal Information Management in TOIS-2010.
  77. L. P. Coelho, A. Ahmed, A. Arnold, J. Kangas, A.-S. Sheikh, E. Xing, W. Cohen, and R. F. Murphy (2010): Structured Literature Image Finder: Extracting Information from Text and Images in Biomedical Literature in Lecture Notes in Bioinformatics.
  78. A. Ahmed, A. Arnold, L. P. Coelho, J. Kangas, A.-S. Sheikh, E. Xing, W. Cohen, and R. F. Murphy (2010): Structured Literature Image Finder: Parsing Text and Figures in Biomedical Literature in Journal of Web Semantics.
  79. Richard Wang and William W. Cohen (2009): Character-level Analysis of Semi-Structured Documents for Set Expansion in EMNLP 2009.
  80. Richard Wang and William W. Cohen (2009): Automatic Set Instance Extraction using the Web in ACL-IJCNLP 2009.
  81. Richard Wang and William W. Cohen (2008): Iterative Set Expansion of Named Entities Using the Web in ICDM-2008.
  82. Andrew Arnold and William W. Cohen (2008): Intra-document Structural Frequency Features for Semi-Supervised Domain Adaptation in CIKM-2008.
  83. Andrew Arnold, Ramesh Nallapati and William W. Cohen (2008): Exploiting Feature Hierarchy for Transfer Learning in Named Entity Recognition in ACL-2008.
  84. Andrew Arnold, Ramesh Nallapati and William W. Cohen (2007): A Comparative Study of Methods for Transductive Transfer Learning in ICDM Workshop on Mining and Management of Biological Data.
  85. Richard Wang and William Cohen (2007): Language-Independent Set Expansion of Named Entities using the Web in ICDM-2007.
  86. Zhenzhen Kou and William W. Cohen (2007): Stacked Graphical Models for Efficient Inference in Markov Random Fields in SDM-2007.
  87. Zhenzhen Kou, William W. Cohen, and Robert F. Murphy (2007): A Stacked Graphical Model for Associating Information from Text And Images In Figures in PSB-2007.
  88. Richard C. Wang, Anthony Tomasic, Robert E. Frederking, William W. Cohen (2006): Learning to Extract Gene-Protein Names from Weakly-Labeled Text in CMU SCS Technical Report Series (CMU-LTI-08-04).
  89. Einat Minkov, Richard C. Wang, Anthony Tomasic and William W. Cohen (2006): NER Systems that Suit User's Preferences: Adjusting the Recall-Precision Trade-off for Entity Extraction in HLT/NAACL-2006 (short paper).
  90. William W. Cohen (2006): A Graph-Search Framework for GeneId Ranking (Extended Abstract) in BioNLP'06.
  91. William W. Cohen & Einat Minkov (2006): A Graph-Search Framework for Associating Gene Identifiers with Documents in BMC Bioinformatics.
  92. Einat Minkov, Richard C. Wang, and William W. Cohen (2005): Extracting Personal Names from Email: Applying Named Entity Recognition to Informal Text in EMNLP/HLT-2005.
  93. William W. Cohen, Einat Minkov & Anthony Tomasic (2005): Learning to Understand Web Site Update Requests in IJCAI-2005.
  94. Zhenzhen Kou, William W. Cohen & Robert F. Murphy (2005): High-Recall Protein Entity Recognition Using a Dictionary in ISMB-2005.
  95. Einat Minkov, Richard Wang & William Cohen (2004): Extracting Personal Names from Emails: Applying Named Entity Recognition to Informal Text in NAACL-2005.
  96. Sunita Sarawagi & William W. Cohen (2004): Semi-Markov Conditional Random Fields for Information Extraction in NIPS 2004.
  97. Robert F. Murphy, Zhenzhen Kou, Juchang Hua, Matthew Joffe, William W. Cohen (2004): Extracting and Structuring Subcellular Location Information from On-line Journal Articles: The Subcellular Location Image Finder in KSCE-2004.
  98. Anthony Tomasic, William W. Cohen, Einat Minkov (2004): Learning to Navigate Web Forms in IIWeb 2004.
  99. Vitor Carvalho & William W. Cohen (2004): Learning to Extract Signature and Reply Lines from Email in CEAS 2004.
  100. William W. Cohen & Sunita Sarawagi (2004): Exploiting Dictionaries in Named Entity Extraction: Combining Semi-Markov Extraction Processes and Data Integration Methods in KDD 2004: 89-98.
  101. William W. Cohen (2003): Learning and Discovering Structure in Web Pages in IEEE Data Eng. Bull. 26(3): 3-10 (2003).
  102. William W. Cohen, Zhenzhen Kou & Robert F. Murphy (2003): Extracting Information from Text and Images for Location Proteomics in BIOKDD 2003: 2-9.
  103. William W. Cohen, Richard Wang & Robert Murphy (2003): Understanding Captions in Biomedical Publications in KDD 2003: 499-504.
  104. William W. Cohen (2003): Infrastructure Components for Large-Scale Information Extraction Systems in IAAI 2003: 71-78.
  105. William W. Cohen (2002): Improving A Page Classifier with Anchor Extraction and Link Analysis in NIPS 2002.
  106. William W. Cohen, Matthew Hurst & Lee S. Jensen (2003): A Flexible Learning System for Wrapping Tables and Lists in HTML Documents in Web Document Analysis: Challenges and Opportunities, ed. Antonacopoulos & Hu, World Scientific Publishing.
  107. William W. Cohen, Matthew Hurst & Lee S. Jensen (2002): A Flexible Learning System for Wrapping Tables and Lists in HTML Documents in WWW 2002: 232-241.
  108. William W. Cohen (2001): Issues in Extracting Information from the Web (Extended Abstract) in IWPT 2001.
  109. William W. Cohen (2000): Extracting Information from the Web for Concept Learning and Collaborative Filtering in ALT 2000: 1-12.
  110. William W. Cohen, Andrew McCallum, Dallan Quass (2000): Learning to Understand the Web in IEEE Data Eng. Bull. 23(3): 17-24 (2000).
  111. William W. Cohen and Wei Fan (1999): Learning Page-Independent Heuristics for Extracting Data from Web Pages in Computer Networks 31(11-16): 1641-1652 (1999).
  112. William W. Cohen and Wei Fan (1999): Learning Page-Independent Heuristics for Extracting Data from Web Pages in WWW 1999.
  113. William W. Cohen (1999): Reasoning about Textual Similarity in a Web-Based Information Access System in Autonomous Agents and Multi-Agent Systems 2(1): 65-86 (1999).
  114. William W. Cohen (1999): A Demonstration of WHIRL (demonstration abstract) in SIGIR 1999: 327.
