List of Publications

Journal Articles

Year Journal Article
2025Sivilotti, S., D.M. Friday, and N. Jackson, Active learning high coverage sets of complementary reaction conditions. Digital Discovery, 2025. 4: p. 846-852. https://doi.org/10.1039/D4DD00365A
2025Shah, A.K., A. Dey, L. Luo, B. Amador, P. Philippy, M. Zhong, S. Ouyang, D.M. Friday, D. Bianchi, N. Jackson, R. Zanibbi, and J. Han, Multimodal Search in Chemical Documents and Reactions. arXiv preprint arXiv:2502.16865, 2025. https://doi.org/10.48550/arXiv.2502.16865
2025Boorla, V.S. and C.D. Maranas, CatPred: a comprehensive framework for deep learning in vitro enzyme kinetic parameters. Nat Commun, 2025. 16(1): p. 2072. https://doi.org/10.1038/s41467-025-57215-9
2025Anand, M., V. Upadhyay, and C.D. Maranas, minChemBio: Expanding chemical synthesis with chemo-enzymatic pathways using minimal transitions. ACS Synthetic Biology, 2025. https://doi.org/10.1021/acssynbio.4c00692
2025Adak, T., T. Menard, M. Albritton, F. Florit, M. Burke, K. Jensen, and S. Denmark, Catalytic allylation of native hexoses and pentoses in water with indium. Nature, 2025. https://doi.org/10.1038/s41586-025-08690-z
2025Upadhyay, V., H. Li, J. He, B.E. Ocampo, S. Cook, H. Zhao, and C.D. Maranas, Combining Chemical Catalysis with Enzymatic Steps for the Synthesis of the Artemisinin Precursor Dihydroartemisinic Acid. ACS Synthetic Biology, 2025. https://doi.org/10.1021/acssynbio.4c00707
2024Zhou, S., S. Li, Y. Meng, Y. Jiao, H. Ji, and J. Han, Establishing Knowledge Preference in Language Models. arXiv preprint arXiv:2407.13048, 2024. https://doi.org/10.48550/arXiv.2407.13048
2024Zhang, Y., R. Yang, X. Xu, R. Li, J. Xiao, J. Shen, and J. Han, TELEClass: Taxonomy enrichment and LLM-enhanced hierarchical text classification with minimal supervision. arXiv preprint arXiv:2403.00165, 2024. https://doi.org/10.48550/arXiv.2403.00165
2024Wang, W., N.H. Angello, D.J. Blair, T. Tyrikos-Ergas, W.H. Krueger, K.N. Medine, A.J. LaPorte, J.M. Berger, and M.D. Burke, Rapid automated iterative small-molecule synthesis. Nature Synthesis, 2024. 3(8): p. 1031-1038. https://doi.org/10.1038/s44160-024-00558-w
2024Upadhyay, V., M. Anand, and C.D. Maranas, novoStoic2.0: An integrated framework for pathway synthesis, thermodynamic evaluation, and enzyme selection. bioRxiv, 2024: p. 2024.09.27.615368. https://doi.org/10.1101/2024.09.27.615368
2024Strieth-Kalthoff, F., H. Hao, V. Rathore, J. Derasp, T. Gaudin, N.H. Angello, M. Seifrid, E. Trushina, M. Guy, J. Liu, X. Tang, M. Mamada, W. Wang, T. Tsagaantsooj, C. Lavigne, R. Pollice, T.C. Wu, K. Hotta, L. Bodo, S. Li, M. Haddadnia, A. Wolos, R. Roszak, C.T. Ser, C. Bozal-Ginesta, R.J. Hickman, J. Vestfrid, A. Aguilar-Granda, E.L. Klimareva, R.C. Sigerson, W. Hou, D. Gahler, S. Lach, A. Warzybok, O. Borodin, S. Rohrbach, B. Sanchez-Lengeling, C. Adachi, B.A. Grzybowski, L. Cronin, J.E. Hein, M.D. Burke, and A. Aspuru-Guzik, Delocalized, asynchronous, closed-loop discovery of organic laser emitters. Science, 2024. 384(6697): p. eadk9227. https://doi.org/10.1126/science.adk9227
2024Shved, A.S., B.E. Ocampo, E.S. Burlova, C.L. Olen, N.I. Rinehart, and S.E. Denmark, molli: A General Purpose Python Toolkit for Combinatorial Small Molecule Library Generation, Manipulation, and Feature Extraction. J Chem Inf Model, 2024. 64(21): p. 8083-8090. https://doi.org/10.1021/acs.jcim.4c00424
2024Shah, A.K., B. Amador, A. Dey, M. Creekmore, B. Ocampo, S. Denmark, and R. Zanibbi, ChemScraper: leveraging PDF graphics instructions for molecular diagram parsing. International Journal on Document Analysis and Recognition (IJDAR), 2024. 27(3): p. 395-414. https://doi.org/10.1007/s10032-024-00486-7
2024Samajdar, R., H. Yang, S. Yi, C.-I. Wang, M.A. Pence, M. Meigooni, S. Putnam, X. Liu, J. Ren, J.S. Moore, E. Tajkhorshid, J. Rodríguez-López, N.E. Jackson, and C.M. Schroeder, Dynamic formation of Au-C anchors in molecular junctions. ChemRxiv, 2024. https://doi.org/10.26434/chemrxiv-2024-g4q8t-v2
2024Liu, X., H. Li, and H. Zhao, Chemoenzymatic synthesis planning by evaluating the synthetic potential in biocatalysis and chemocatalysis. ChemRxiv, 2024. https://doi.org/10.26434/chemrxiv-2024-hnl71
2024Li, X., L. Wang, Y. Luo, C. Edwards, S. Gui, Y. Lin, H. Ji, and S. Ji, Geometry Informed Tokenization of Molecules for Language Model Generation. arXiv preprint arXiv:2408.10120, 2024. https://doi.org/10.48550/arXiv.2408.10120
2024Li, H., X. Liu, G. Jiang, and H. Zhao, Chemoenzymatic Synthesis Planning Guided by Reaction Type Score. J Chem Inf Model, 2024. 64(24): p. 9240-9248. https://doi.org/10.1021/acs.jcim.4c01525
2024Jin, B., G. Liu, C. Han, M. Jiang, H. Ji, and J. Han, Large language models on graphs: A comprehensive survey. IEEE Transactions on Knowledge and Data Engineering, 2024. https://doi.org/10.1109/TKDE.2024.3469578
2024Jiang, M., K.Z. Liu, M. Zhong, R. Schaeffer, S. Ouyang, J. Han, and S. Koyejo, Investigating data contamination for pre-training language models. arXiv preprint arXiv:2401.06059, 2024. https://doi.org/10.48550/arXiv.2401.06059
2024Ding, K., J. Luo, and Y. Luo, Leveraging conformal prediction to annotate enzyme function space with limited false positives. PLoS Comput Biol, 2024. 20(5): p. e1012135. https://doi.org/10.1371/journal.pcbi.1012135
2024Ding, K., M. Chin, Y. Zhao, W. Huang, B.K. Mai, H. Wang, P. Liu, Y. Yang, and Y. Luo, Machine learning-guided co-optimization of fitness and diversity facilitates combinatorial library design in enzyme engineering. Nat Commun, 2024. 15(1): p. 6392. https://doi.org/10.1038/s41467-024-50698-y
2024Chen, J., T.J. Dean, and D. Shukla, Contribution of Signaling Partner Association to Strigolactone Receptor Selectivity. J Phys Chem B, 2024. 128(3): p. 698-705. https://doi.org/10.1021/acs.jpcb.3c06940
2024Burke, M.D., S.E. Denmark, Y. Diao, J. Han, R. Switzky, and H. Zhao, Molecule Maker Lab Institute: Accelerating, advancing, and democratizing molecular innovation. AI Magazine, 2024. 45(1): p. 117-123. https://doi.org/10.1002/aaai.12154
2024Boob, A.G., J. Chen, and H. Zhao, Enabling pathway design by multiplex experimentation and machine learning. Metab Eng, 2024. 81: p. 70-87. https://doi.org/10.1016/j.ymben.2023.11.006
2024Argun, B.R. and A. Statt, Interplay of Spatial and Topological Defects in Polymer Networks. ACS Eng Au, 2024. 4(3): p. 351-358. https://doi.org/10.1021/acsengineeringau.3c00072
2024Argun, B.R., Y. Fu, and A. Statt, Molecular dynamics simulations of anisotropic particles accelerated by neural-net predicted interactions. J Chem Phys, 2024. 160(24). https://doi.org/10.1063/5.0206636
2024Angello, N.H., D.M. Friday, C. Hwang, S. Yi, A.H. Cheng, T.C. Torres-Flores, E.R. Jira, W. Wang, A. Aspuru-Guzik, M.D. Burke, C.M. Schroeder, Y. Diao, and N.E. Jackson, Closed-loop transfer enables artificial intelligence to yield chemical knowledge. Nature, 2024. 633(8029): p. 351-358. https://doi.org/10.1038/s41586-024-07892-1
2024Olen, C.L., A.F. Zahrt, S.W. Reilly, D. Schultz, K. Emerson, D. Candito, X. Wang, N.A. Strotman, and S.E. Denmark, Chemoinformatic Catalyst Selection Methods for the Optimization of Copper-Bis(oxazoline)-Mediated, Asymmetric, Vinylogous Mukaiyama Aldol Reactions. ACS Catal, 2024. 14: p. 2642-2655. https://doi.org/10.1021/acscatal.3c05903
2024Schnitzer, T., M. Schnurr, A.F. Zahrt, N. Sakhaee, S.E. Denmark, and H. Wennemers, Machine Learning to Develop Peptide Catalysts-Successes, Limitations and Opportunities. ACS Cent Sci, 2024. 10: p. 367-373. https://doi.org/10.1021/acscentsci.3c01284
2023Yuan, Y., C. Shi, and H. Zhao, Machine Learning-Enabled Genome Mining and Bioactivity Prediction of Natural Products. ACS Synth Biol, 2023. 12(9): p. 2650-2662. https://doi.org/10.1021/acssynbio.3c00234
2023Yu, T., H. Cui, J.C. Li, Y. Luo, G. Jiang, and H. Zhao, Enzyme function prediction using contrastive learning. Science, 2023. 379(6639): p. 1358-1363. https://doi.org/10.1126/science.adf2465
2023Yu, T., A.G. Boob, M.J. Volk, X. Liu, H. Cui, and H. Zhao, Machine learning-enabled retrobiosynthesis of molecules. Nature Catalysis, 2023. 6(2): p. 137-151. https://doi.org/10.1038/s41929-022-00909-w
2023Yu, T., A.G. Boob, N. Singh, Y. Su, and H. Zhao, In vitro continuous protein evolution empowered by machine learning and automation. Cell Syst, 2023. 14(8): p. 633-644. https://doi.org/10.1016/j.cels.2023.04.006
2023Upadhyay, V., V.S. Boorla, and C.D. Maranas, Rank-ordering of known enzymes as starting points for re-engineering novel substrate activity using a convolutional neural network. Metab Eng, 2023. 78: p. 171-182. https://doi.org/10.1016/j.ymben.2023.06.001
2023Shah, A.K., B.M. Amador, A. Dey, M. Creekmore, B. Ocampo, S.E. Denmark, and R. Zanibbi, ChemScraper: Graphics Extraction, Molecular Diagram Parsing, and Annotated Data Generation for PDF Images. CoRR, 2023. https://dblp.org/rec/journals/corr/abs-2311-12161
2023Rinehart, N.I., R.K. Saunthwal, J. Wellauer, A.F. Zahrt, L. Schlemper, A.S. Shved, R. Bigler, S. Fantasia, and S.E. Denmark, A machine-learning tool to predict substrate-adaptive conditions for Pd-catalyzed C-N couplings. Science, 2023. 381(6661): p. 965-972. https://doi.org/10.1126/science.adg2114
2023Luo, Y., Y. Liu, and J. Peng, Calibrated geometric deep learning improves kinase-drug binding predictions. Nat Mach Intell, 2023. 5(12): p. 1390-1401. https://doi.org/10.1038/s42256-023-00751-0
2023Luo, Y., Sensing the shape of functional proteins with topology. Nat Comput Sci, 2023. 3(2): p. 124-125. https://doi.org/10.1038/s43588-023-00404-7
2023Lee, J.-H., A. Khasbaatar, A.L. Jones, C. Hwang, M. Kim, J. Strzalka, E. Gann, M.L. Lee, J.R. Reynolds, and Y. Diao, Recycling the Energy of Indoor Light: Highly Efficient Organic Photovoltaics via a Ternary Strategy. ACS Applied Polymer Materials, 2023. 5(6): p. 4199-4209. https://doi.org/10.1021/acsapm.3c00408
2023Lai, T.M., C. Zhai, and H. Ji, KEBLM: Knowledge-Enhanced Biomedical Language Models. J Biomed Inform, 2023. 143: p. 104392. https://doi.org/10.1016/j.jbi.2023.104392
2023Khasbaatar, A., Z. Xu, J.H. Lee, G. Campillo-Alvarado, C. Hwang, B.N. Onusaitis, and Y. Diao, From Solution to Thin Film: Molecular Assembly of pi-Conjugated Systems and Impact on (Opto)electronic Properties. Chem Rev, 2023. 123(13): p. 8395-8487. https://doi.org/10.1021/acs.chemrev.2c00905
2023Jang, S., E.I. Hernandez Alvarez, C. Chen, B.B. Jing, C. Shen, P.V. Braun, A. Schleife, C.M. Schroeder, and C.M. Evans, Control of Lithium Salt Partitioning, Coordination, and Solvation in Vitrimer Electrolytes. Chemistry of Materials, 2023. 35(19): p. 8039-8049. https://doi.org/10.1021/acs.chemmater.3c01353
2023Ding, K., S. Wang, and Y. Luo, Supervised biological network alignment with graph neural networks. Bioinformatics, 2023. 39(Suppl 1): p. i465-i474. https://doi.org/10.1093/bioinformatics/btad241
2023Chambers, R.K., J.D. Weaver, J. Kim, J.L. Hoar, S.W. Krska, and M.C. White, A preparative small-molecule mimic of liver CYP450 enzymes in the aliphatic C-H oxidation of carbocyclic N-heterocycles. Proc Natl Acad Sci USA, 2023. 120(29): p. e2300315120. https://doi.org/10.1073/pnas.2300315120
2022Zhang, Z. and H. Zhao, Tunnel engineering enables multifaceted improvements in halogenase. Chem Catalysis, 2022. 2(10): p. 2432-2434. https://doi.org/10.1016/j.checat.2022.09.010
2022Rose, B.T., J.C. Timmerman, S.A. Bawel, S. Chin, H. Zhang, and S.E. Denmark, High-Level Data Fusion Enables the Chemoinformatically Guided Discovery of Chiral Disulfonimide Catalysts for Atropselective Iodination of 2-Amino-6-arylpyridines. J Am Chem Soc, 2022. 144(50): p. 22950-22964. https://doi.org/10.1021/jacs.2c08820
2022Fu, W. and Y. Yang, Undirected biocatalytic amination of unactivated C(sp3)-H bonds. Chem Catalysis, 2022. 2(12): p. 3287-3289. https://doi.org/10.1016/j.checat.2022.11.013
2022Bubliauskas, A., D.J. Blair, H. Powell-Davies, P.J. Kitson, M.D. Burke, and L. Cronin, Digitizing Chemical Synthesis in 3D Printed Reactionware. Angew Chem Int Ed, 2022. 61(24): p. e202116108. https://doi.org/10.1002/anie.202116108
2022Boorla, V.S., V. Upadhyay, and C.D. Maranas, ML helps predict enzyme turnover rates. Nature Catalysis, 2022. 5(8): p. 655-657. https://doi.org/10.1038/s41929-022-00827-x
2022Angello, N.H., V. Rathore, W. Beker, A. Wolos, E.R. Jira, R. Roszak, T.C. Wu, C.M. Schroeder, A. Aspuru-Guzik, B.A. Grzybowski, and M.D. Burke, Closed-loop optimization of general reaction conditions for heteroaryl Suzuki-Miyaura coupling. Science, 2022. 378(6618): p. 399-405. https://doi.org/10.1126/science.adc8743
2021Wang, Y., P. Xue, M. Cao, T. Yu, S.T. Lane, and H. Zhao, Directed Evolution: Methodologies and Applications. Chem Rev, 2021. 121(20): p. 12384-12444. https://doi.org/10.1021/acs.chemrev.1c00260
2021Wang, L., V. Upadhyay, and C.D. Maranas, dGPredictor: Automated fragmentation method for metabolic reaction free energy prediction and de novo pathway design. PLoS Comput Biol, 2021. 17(9): p. e1009448. https://doi.org/10.1371/journal.pcbi.1009448
2021Shen, W., Y. Yin, Y. Yang, J. Han, J. Wang, and X. Yuan, Toward tweet entity linking with heterogeneous information networks. IEEE Transactions on Knowledge and Data Engineering, 2021. 34(12): p. 6003-6017. https://doi.org/10.1109/TKDE.2021.3068093
2021Shen, W., Y. Li, Y. Liu, J. Han, J. Wang, and X. Yuan, Entity linking meets deep learning: Techniques and solutions. IEEE Transactions on Knowledge and Data Engineering, 2021. 35(3): p. 2556-2578. https://doi.org/10.1109/TKDE.2021.3117715
2021Luo, Y., G. Jiang, T. Yu, Y. Liu, L. Vo, H. Ding, Y. Su, W.W. Qian, H. Zhao, and J. Peng, ECNet is an evolutionary context-integrated deep learning framework for protein engineering. Nat Commun, 2021. 12(1): p. 5743. https://doi.org/10.1038/s41467-021-25976-8
2021Campos, D. and H. Ji, IMG2SMI: translating molecular structure images to simplified molecular-input line-entry system. arXiv preprint arXiv:2109.04202, 2021. https://doi.org/10.48550/arXiv.2109.04202

Conference Proceedings

Year Conference Paper
2025Pengcheng Jiang, Cao Xiao, Minhao Jiang, Parminder Bhatia, Taha Kass-Hout, Jimeng Sun, Jiawei Han, “Reasoning-Enhanced Healthcare Predictions with Knowledge Graph Community Retrieval”, in Proc. of 2025 Int. Conf. on Learning Representations (ICLR’25), April 2025
2025Bowen Jin, Jinsung Yoon, Jiawei Han, Sercan O. Arik, “Long-Context LLMs Meet RAG: Overcoming Challenges for Long Inputs in RAG”, in Proc. of 2025 Int. Conf. on Learning Representations (ICLR’25), April 2025
2025Ming Zhong, Aston Zhang, Xuewei Wang, Rui Hou, Wenhan Xiong, Chenguang Zhu, Zhengxing Chen, Liang Tan, Chloe Bi, Mike Lewis, Sravya Popuri, Sharan Narang, Melanie Kambadur, Dhruv Mahajan, Sergey Edunov, Jiawei Han, Laurens van der Maaten, “Law of the Weakest Link: Cross Capabilities of Large Language Models”, in Proc. of 2025 Int. Conf. on Learning Representations (ICLR’25), April 2025
2025Yu Zhang, Yanzhen Shen, SeongKu Kang, Xiusi Chen, Bowen Jin, Jiawei Han, “Chain-of-Factors Paper-Reviewer Matching”, in Proc. The Web Conference 2025 (WWW’25), April 2025
2025Yunyi Zhang, Ruozhen Yang, Xueqiang Xu, Rui Li, Jinfeng Xiao, Jiaming Shen, Jiawei Han, “TELEClass: Taxonomy Enrichment and LLM-Enhanced Hierarchical Text Classification with Minimal Supervision”, in Proc. The Web Conference 2025 (WWW’25), April 2025
2025Yizhu Jiao, Siru Ouyang, Ming Zhong, Yunyi Zhang, Linyi Ding, Sizhe Zhou, Jiawei Han, “Retrieval and Structuring Augmented Generation with Large Language Models for Web Applications”, (Conf. Tutorial), 2025 The Web Conference (WWW’25), April 2025
2025Pengcheng Jiang, Cao Xiao, Tianfan Fu, Parminder Bhatia, Taha Kass-Hout, Jimeng Sun, Jiawei Han, “Bi-level Contrastive Learning for Knowledge-Enhanced Molecule Representations”, Proc. of 2025 AAAI Conf. on Artificial Intelligence (AAAI’25), Feb. 2025
2025SeongKu Kang, Bowen Jin, Wonbin Kweon, Yu Zhang, Dongha Lee, Jiawei Han, Hwanjo Yu, “Improving Scientific Document Retrieval with Concept Coverage-based Query Set Generation”, in Proc. 2025 ACM Int. Conf. on Web Search and Data Mining (WSDM’25), March 2025. https://doi.org/10.1145/3701551.3703544
2025Fu, C., X. Li, B. Olson, H. Ji, and S. Ji, Fragment and Geometry Aware Tokenization of Molecules for Structure-Based Drug Design Using Language Models. 2025. Proc. The Thirteenth International Conference on Learning Representations (ICLR2025). https://openreview.net/forum?id=mMhZS7qt0U
2025Arneson, K., L. Fu, and L. Gatzke. Toward More Usable, Reproducible, and Sustainable Scientific Software: The Impact of User-Centered Design in Research Software Development. 2025. Proc. Platform for Advanced Scientific Computing.
2025Nguyen, T., K.-H. Huang, G. Liu, M.D. Burke, Y. Diao, and H. Ji, FARM: Functional Group-Aware Representations for Small Molecules. 2025. Proc. NAACL2025 Workshop on AI and Scientific Discovery: Directions and Opportunities. https://doi.org/10.48550/arXiv.2410.02082
2024Zhu, K., B.-W. Huang, B. Jin, Y. Jiao, M. Zhong, K. Chang, S.-D. Lin, and J. Han. Investigating Instruction Tuning Large Language Models on Graphs. in Conference on Language Modeling. 2024. https://doi.org/10.48550/arXiv.2408.05457
2024Zhou, S., Y. Meng, B. Jin, and J. Han. Grasping the Essentials: Tailoring Large Language Models for Zero-Shot Relation Extraction. in Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. 2024. Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.emnlp-main.747
2024Zhong, X., Y. Du, S. Ouyang, M. Zhong, T. Luo, Q. Ho, H. Peng, H. Ji, and J. Han. ActionIE: Action extraction from scientific literature with programming languages. in Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2024. Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.acl-long.683
2024Zhao, J., C. Zhang, and Y. Luo. Contrastive fitness learning: Reprogramming protein language models for low-n learning of protein fitness landscape. in International Conference on Research in Computational Molecular Biology. 2024. Springer Nature Switzerland Cham. https://doi.org/10.1007/978-1-0716-3989-4_55
2024Zhang, Y., M. Zhong, S. Ouyang, Y. Jiao, S. Zhou, L. Ding, and J. Han. Automated Mining of Structured Knowledge from Text in the Era of Large Language Models. in Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2024. https://doi.org/10.1145/3637528.3671469
2024Zhang, Y., X. Chen, B. Jin, S. Wang, S. Ji, W. Wang, and J. Han. A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery. in Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. 2024. Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.emnlp-main.498
2024Zeng, Q., M. Sidhu, H.P. Chan, L. Wang, and H. Ji. Scientific Opinion Summarization: Paper Meta-review Generation Dataset, Methods, and Evaluation. in 1st AI4Research Workshop. 2024. International Joint Conferences on Artificial Intelligence Organization. https://doi.org/10.48550/arXiv.2305.14647
2024Yan, K., X. Li, H. Ling, K. Ashen, C. Edwards, R. Arróyave, M. Zitnik, H. Ji, X. Qian, and X. Qian. Invariant Tokenization of Crystalline Materials for Language Model Enabled Generation. in Advances in Neural Information Processing Systems. 2024. https://proceedings.neurips.cc/paper_files/paper/2024/file/e23133d34964a0a09f6d076fc4b922a4-Paper-Conference.pdf
2024Xiao, J., L. Ding, J. Barry, M. Elkaref, G. De Mel, and J. Han. ORAG: Ontology-Guided Retrieval-Augmented Generation for Theme-Specific Entity Typing. in Conference on Language Modeling. 2024. https://openreview.net/forum?id=cKBmZ2PZ6c
2024Wang, Q., Z. Zhang, H. Li, X. Liu, J. Han, H. Zhao, and H. Ji. Chem-FINESE: Validating fine-grained few-shot entity extraction through text reconstruction. in Findings of the Association for Computational Linguistics: EACL 2024. 2024. Association for Computational Linguistics. https://aclanthology.org/2024.findings-eacl.1/
2024Wang, Q., C. Edwards, H. Ji, and T. Hope. Towards a human-computer collaborative scientific paper lifecycle: A pilot study and hands-on tutorial. in Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024): Tutorial Summaries. 2024. ELRA and ICCL. https://aclanthology.org/2024.lrec-tutorials.10/
2024Wang, Q., D. Downey, H. Ji, and T. Hope. SciMON: Scientific inspiration machines optimized for novelty. in Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2024. Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.acl-long.18
2024Roy, S.G. and J. Han. ILCiteR: Evidence-grounded Interpretable Local Citation Recommendation. in Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). 2024. ELRA and ICCL. https://aclanthology.org/2024.lrec-main.757/
2024Reddy, R.G., J. Doo, Y. Xu, M.A. Sultan, D. Swain, A. Sil, and H. Ji. FIRST: Faster Improved Listwise Reranking with Single Token Decoding. in Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. 2024. Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.emnlp-main.491
2024Ouyang, S., S. Wang, M. Jiang, M. Zhong, D. Yu, J. Han, and Y. Shen. Temperature-Centric Investigation of Speculative Decoding with Knowledge Distillation. in Findings of the Association for Computational Linguistics: EMNLP 2024. 2024. Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.findings-emnlp.767
2024Ouyang, S., J. Huang, P. Pillai, Y. Zhang, Y. Zhang, and J. Han. Ontology enrichment for effective fine-grained entity typing. in Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2024. https://doi.org/10.1145/3637528.3671857
2024Nguyen, T., T. Torres-Flores, C. Hwang, C. Edwards, Y. Diao, and H. Ji. GLaD: Synergizing Molecular Graphs and Language Descriptors for Enhanced Power Conversion Efficiency Prediction in Organic Photovoltaic Devices. in Proceedings of the 33rd ACM International Conference on Information and Knowledge Management. 2024. https://doi.org/10.1145/3627673.3680103
2024Liu, H., Q. Wang, P. Karisani, and H. Ji. Named Entity Recognition Under Domain Shift via Metric Learning for Life Sciences. in Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers). 2024. Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.naacl-long.1
2024Komarlu, T., M. Jiang, X. Wang, and J. Han. OntoType: Ontology-Guided and Pre-Trained Language Model Assisted Fine-Grained Entity Typing. in Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2024. https://doi.org/10.1145/3637528
2024Kang, S., Y. Zhang, P. Jiang, D. Lee, J. Han, and H. Yu. Taxonomy-guided Semantic Indexing for Academic Paper Search. in Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. 2024. Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.emnlp-main.407
2024Kang, S., S. Agarwal, B. Jin, D. Lee, H. Yu, and J. Han. Improving retrieval in theme-specific applications using a corpus topical taxonomy. in Proceedings of the ACM Web Conference 2024. 2024. https://doi.org/10.1145/3589334.3645512
2024Jin, B., Y. Zhang, S. Li, and J. Han. Bridging Text Data and Graph Data: Towards Semantics and Structure-aware Knowledge Discovery. in Proceedings of the 17th ACM International Conference on Web Search and Data Mining. 2024. https://doi.org/10.1145/3616855
2024Jin, B., C. Xie, J. Zhang, K.K. Roy, Y. Zhang, Z. Li, R. Li, X. Tang, S. Wang, Y. Meng, and J. Han. Graph Chain-of-Thought: Augmenting Large Language Models by Reasoning on Graphs. in Findings of the Association for Computational Linguistics: ACL 2024. 2024. Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.findings-acl.11
2024Jin, B., Z. Pang, B. Guo, Y.-X. Wang, J. You, and J. Han. InstructG2I: Synthesizing Images from Multimodal Attributed Graphs. in Annual Conference on Neural Information Processing Systems. 2024. https://doi.org/10.48550/arXiv.2410.07157
2024Jiao, Y., S. Li, S. Zhou, H. Ji, and J. Han. TEXT2DB: Integration-Aware Information Extraction with Large Language Model Agents. in Findings of the Association for Computational Linguistics: ACL 2024. 2024. Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.findings-acl.12
2024Edwards, C., Q. Wang, L. Zhao, and H. Ji. L+M-24: Building a Dataset for Language+Molecules @ ACL 2024. in Proceedings of the 1st Workshop on Language + Molecules (L+M 2024). 2024. Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.langmol-1.1
2024Edwards, C., Q. Wang, and H. Ji. Language + Molecules. in Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics: Tutorial Abstracts. 2024. Association for Computational Linguistics. https://aclanthology.org/2024.eacl-tutorials.3/
2024Edwards, C., A. Naik, T. Khot, M.D. Burke, H. Ji, and T. Hope. SynerGPT: In-Context Learning for Personalized Drug Synergy Prediction and Drug Design. in Conference on Language Modeling. 2024. https://doi.org/10.48550/arXiv.2307.11694
2024Ding, L., J. Xiao, S. Zhou, C. Yang, and J. Han. Topic-Oriented Open Relation Extraction with A Priori Seed Generation. in Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. 2024. Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.emnlp-main.766
2024Yu, P. and H. Ji. Information Association for Language Model Updating by Mitigating LM-Logical Discrepancy. in Proceedings of the 28th Conference on Computational Natural Language Learning. 2024. Miami, FL, USA: Association for Computational Linguistics. p. 117-129. https://doi.org/10.18653/v1/2024.conll-1.10
2024Ghaffari, S., E. Saleh, A. Schwing, Y.-X. Wang, M.D. Burke, and S. Sinha. Robust Model-Based Optimization for Challenging Fitness Landscapes. in International Conference on Learning Representations. 2024. https://openreview.net/forum?id=xhEN0kJh4q
2023Zhou, S., S. Ge, J. Shen, and J. Han. Corpus-based relation extraction by identifying and refining relation patterns. in Joint European Conference on Machine Learning and Knowledge Discovery in Databases. 2023. Springer Nature Switzerland Cham. https://doi.org/10.1007/978-3-031-43421-1_2
2023Zhong, M., S. Ouyang, Y. Jiao, P. Kargupta, L. Luo, Y. Shen, B. Zhou, X. Zhong, X. Liu, H. Li, J. Xiao, M. Jiang, X. Wang, H. Ji, M.D. Burke, H. Zhao, and J. Han. Reaction Miner: An integrated system for chemical reaction extraction from textual data. in Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. 2023. Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.emnlp-demo.36
2023Zhong, M., S. Ouyang, M. Jiang, V. Hu, Y. Jiao, X. Wang, and J. Han. ReactIE: Enhancing Chemical Reaction Extraction with Weak Supervision. in Findings of the Association for Computational Linguistics: ACL 2023. 2023. Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.findings-acl.767
2023Zhao, L., C. Edwards, and H. Ji. What a Scientific Language Model Knows and Doesn’t Know about Chemistry. in NeurIPS 2023 AI for Science Workshop. 2023. https://openreview.net/forum?id=hSmn7BQZ2v&noteId=Nr11sAV2kF
2023Zhang, Y., Y. Zhang, M. Michalski, Y. Jiang, Y. Meng, and J. Han. Effective seed-guided topic discovery by integrating multiple types of contexts. in Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining. 2023. https://doi.org/10.1145/3539597.3570475
2023Zhang, Y., Y. Zhang, and J. Han. Mining Structures from Massive Texts by Exploring the Power of Pre-trained Language Models. in EDBT. 2023. https://doi.org/10.48786/EDBT.2023.81
2023Zhang, Y., B. Jin, Q. Zhu, Y. Meng, and J. Han. The effect of metadata on scientific literature tagging: A cross-field cross-model study. in Proceedings of the ACM Web Conference 2023. 2023. https://doi.org/10.1145/3543507.3583354
2023Zhang, Y., B. Jin, X. Chen, Y. Shen, Y. Zhang, Y. Meng, and J. Han. Weakly supervised multi-label classification of full-text scientific papers. in Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2023. https://doi.org/10.1145/3580305.3599544
2023Zhang, Y., M. Jiang, Y. Meng, Y. Zhang, and J. Han. PIEClass: Weakly-supervised text classification with prompting and noise-robust iterative ensemble training. in Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. 2023. Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.emnlp-main.780
2023Yoon, S., Y. Meng, D. Lee, and J. Han. SCStory: Self-supervised and Continual Online Story Discovery. in Proceedings of the ACM Web Conference 2023. 2023. https://doi.org/10.1145/3543507.3583507
2023Yoon, S., D. Lee, Y. Zhang, and J. Han. Unsupervised story discovery from continuous news streams via scalable thematic embedding. in Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval. 2023. https://doi.org/10.1145/3539618.3591782
2023Yoon, S., H.P. Chan, and J. Han. PDSum: Prototype-driven continuous summarization of evolving multi-document sets stream. in Proceedings of the ACM Web Conference 2023. 2023. https://doi.org/10.1145/3543507.3583371
2023Sprueill, H.W., C. Edwards, M.V. Olarte, U. Sanyal, H. Ji, and S. Choudhury. Monte Carlo Thought Search: Large Language Model Querying for Complex Scientific Reasoning in Catalyst Design. in Findings of the Association for Computational Linguistics: EMNLP 2023. 2023. Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.findings-emnlp.560
2023Shah, A.K. and R. Zanibbi. Line-of-sight with graph attention parser (LGAP) for math formulas. in International Conference on Document Analysis and Recognition. 2023. Springer Nature Switzerland Cham. https://doi.org/10.1007/978-3-031-41734-4_25
2023Ouyang, S., S. Wang, Y. Liu, M. Zhong, Y. Jiao, D. Iter, R. Pryzant, C. Zhu, H. Ji, and J. Han. The Shifted and The Overlooked: A Task-oriented Investigation of User-GPT Interactions. in Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. 2023. Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.emnlp-main.146
2023Miao, S., Y. Luo, M. Liu, and P. Li. Interpretable Geometric Deep Learning via Learnable Randomness Injection. in International Conference on Learning Representations. 2023. https://doi.org/10.48550/arXiv.2210.16966
2023Meng, Y., M. Michalski, J. Huang, Y. Zhang, T. Abdelzaher, and J. Han. Tuning language models as training data generators for augmentation-enhanced few-shot learning. in International Conference on Machine Learning. 2023. PMLR. https://dl.acm.org/doi/10.5555/3618408.3619426
2023Meng, Y., J. Huang, Y. Zhang, Y. Zhang, and J. Han. Pretrained Language Representations for Text Understanding: A Weakly-Supervised Perspective. in Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2023. https://doi.org/10.1145/3580305
2023Luo, J. and Y. Luo. Contrastive learning of protein representations with graph neural networks for structural and functional annotations. in Biocomputing. 2023. World Scientific. https://doi.org/10.1142/9789811270611_0011
2023Jin, X., B. Vinzamuri, S. Venkatapathy, H. Ji, and P. Natarajan. Adversarial robustness for large language NER models using disentanglement and word attributions. in Findings of the Association for Computational Linguistics: EMNLP 2023. 2023. Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.findings-emnlp.830
2023Jin, B., Y. Zhang, Q. Zhu, and J. Han. Heterformer: Transformer-based deep node representation learning on heterogeneous text-rich networks. in Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2023. https://doi.org/10.1145/3580305
2023Jin, B., Y. Zhang, Y. Meng, and J. Han. Edgeformers: Graph-empowered transformers for representation learning on textual-edge networks. in International Conference on Learning Representations. 2023. https://doi.org/10.48550/arXiv.2302.11050
2023Jin, B., W. Zhang, Y. Zhang, Y. Meng, X. Zhang, Q. Zhu, and J. Han. Patton: Language Model Pretraining on Text-Rich Networks. in Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2023. Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.acl-long.387
2023Jiao, Y., M. Zhong, J. Shen, Y. Zhang, C. Zhang, and J. Han. Unsupervised event chain mining from multiple documents. in Proceedings of the ACM Web Conference 2023. 2023. https://doi.org/10.1145/3543507
2023Jiao, Y., M. Zhong, S. Li, R. Zhao, S. Ouyang, H. Ji, and J. Han. Instruct and Extract: Instruction Tuning for On-Demand Information Extraction. in Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. 2023. Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.emnlp-main.620
2023Jiang, P., S. Agarwal, B. Jin, X. Wang, J. Sun, and J. Han. Text Augmented Open Knowledge Graph Completion via Pre-Trained Language Models. in Findings of the Association for Computational Linguistics: ACL 2023. 2023. Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.findings-acl.709
2023Guan, J., W.W. Qian, X. Peng, Y. Su, J. Peng, and J. Ma. 3D Equivariant Diffusion for Target-Aware Molecule Generation and Affinity Prediction. in International Conference on Learning Representations. 2023. https://doi.org/10.48550/arXiv.2303.03543
2023Ge, S., J. Huang, Y. Meng, and J. Han. FineSum: Target-Oriented, Fine-Grained Opinion Summarization. in Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining. 2023. https://doi.org/10.1145/3539597.3570397
2023Chan, H.P., Q. Zeng, and H. Ji. Interpretable Automatic Fine-grained Inconsistency Detection in Text Summarization. in Findings of the Association for Computational Linguistics: ACL 2023. 2023. Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.findings-acl.402
2023Balepur, N., S. Agarwal, K.V. Ramanan, S. Yoon, D. Yang, and J. Han. DynaMiTE: Discovering explosive topic evolutions with user guidance. in Findings of the Association for Computational Linguistics: ACL 2023. 2023. Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.findings-acl.14
2022Zhong, M., Y. Liu, D. Yin, Y. Mao, Y. Jiao, P. Liu, C. Zhu, H. Ji, and J. Han. Towards a Unified Multi-Dimensional Evaluator for Text Generation. in Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. 2022. Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.emnlp-main.131
2022Zhong, M., Y. Liu, S. Ge, Y. Mao, Y. Jiao, X. Zhang, Y. Xu, C. Zhu, M. Zeng, and J. Han. Unsupervised Multi-Granularity Summarization. in Findings of the Association for Computational Linguistics: EMNLP 2022. 2022. Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.findings-emnlp.366
2022Zhang, Y., Y. Meng, X. Wang, S. Wang, and J. Han. Seed-Guided Topic Discovery with Out-of-Vocabulary Seeds. in Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2022. Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.naacl-main.21
2022Zhang, Y., F. Guo, J. Shen, and J. Han. Unsupervised key event detection from massive text corpora. in Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2022. https://doi.org/10.1145/3534678.3539395
2022Zhang, Y., S. Garg, Y. Meng, X. Chen, and J. Han. MotifClass: Weakly Supervised Text Classification with Higher-order Metadata Information. in Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining. 2022. https://doi.org/10.1145/3488560.3498384
2022Wang, X., H. Wang, H. Ji, and J. Han. New frontiers of scientific text mining: tasks, data, and tools. in Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2022. https://doi.org/10.1145/3534678.3542606
2022Wang, X., H. Wang, H. Ji, and J. Han. Modern natural language processing techniques for scientific web mining: tasks, data, and tools. in Proceedings of the ACM Web Conference 2022. 2022. https://blender.cs.illinois.edu/paper/wwwtutorial2022.pdf
2022Wang, X., V. Hu, M. Jiang, Y. Zhang, J. Xiao, D.C. Loving, H. Ji, M. Burke, and J. Han. REACTCLASS: cross-modal supervision for subword-guided reactant entity classification. in 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). 2022. IEEE. https://doi.ieeecomputersociety.org/10.1109/BIBM55620.2022.9995489
2022Wang, H., W. Li, X. Jin, K. Cho, H. Ji, J. Han, and M.D. Burke. Chemical-Reaction-Aware Molecule Representation Learning. in International Conference on Learning Representations. 2022. https://openreview.net/forum?id=6sh3pIzKS-
2022Meng, Y., Y. Zhang, J. Huang, Y. Zhang, and J. Han. Topic discovery via latent space clustering of pretrained language model representations. in Proceedings of the ACM Web Conference 2022. 2022. https://doi.org/10.1145/3485447.3512034
2022Meng, Y., J. Huang, Y. Zhang, and J. Han. Generating training data with language models: Towards zero-shot language understanding. in Advances in Neural Information Processing Systems. 2022. https://proceedings.neurips.cc/paper_files/paper/2022/file/0346c148ba1c21c6b4780a961ea141dc-Paper-Conference.pdf
2022Meng, Y., J. Huang, Y. Zhang, and J. Han. Adapting Pretrained Representations for Text Mining. in Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2022. https://doi.org/10.1145/3534678
2022Lee, D., J. Shen, S. Lee, S. Yoon, H. Yu, and J. Han. Topic Taxonomy Expansion via Hierarchy-Aware Topic Phrase Generation. in Findings of the Association for Computational Linguistics: EMNLP 2022. 2022. Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.findings-emnlp.122
2022Lee, D., J. Shen, S. Kang, S. Yoon, J. Han, and H. Yu. Taxocom: Topic taxonomy completion with hierarchical discovery of novel topic clusters. in Proceedings of the ACM Web Conference 2022. 2022. https://doi.org/10.1145/3485447.3512002
2022Jiao, Y., S. Li, Y. Xie, M. Zhong, H. Ji, and J. Han. Open-Vocabulary Argument Role Prediction For Event Extraction. in Findings of the Association for Computational Linguistics: EMNLP 2022. 2022. Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.findings-emnlp.395
2022Hwang, C., S. Yi, D. Friday, N.H. Angello, T.C. Torres-Flores, N.E. Jackson, M.D. Burke, C.M. Schroeder, and Y. Diao. Autonomous Materials Discovery for Organic Photovoltaics. in AI for Accelerated Materials Design NeurIPS 2022 Workshop. 2022. https://openreview.net/forum?id=RfJOs4EMfjj
2022Huang, J., Y. Meng, and J. Han. Few-shot fine-grained entity typing with automatic label interpretation and instance generation. in Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2022. https://doi.org/10.1145/3534678
2022Guan, J., W.W. Qian, Q. Liu, W.-Y. Ma, J. Ma, and J. Peng. Energy-inspired molecular conformation optimization. in International Conference on Learning Representations. 2022. https://openreview.net/forum?id=7QfLW-XZTl
2022Gu, X., Y. Shen, J. Shen, J. Shang, and J. Han. Phrase-aware Unsupervised Constituency Parsing. in Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2022. Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.acl-long.444
2022Edwards, C., T. Lai, K. Ros, G. Honke, K. Cho, and H. Ji. Translation between Molecules and Natural Language. in Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. 2022. Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.emnlp-main.26
2022Agarwal, S., R. Sawhney, M. Thakkar, P. Nakov, J. Han, and T. Derr. Think: Temporal hypergraph hyperbolic network. in 2022 IEEE International Conference on Data Mining (ICDM). 2022. IEEE. https://doi.ieeecomputersociety.org/10.1109/ICDM54844.2022.00096
2021Zhu, Q., C. Yang, Y. Xu, H. Wang, C. Zhang, and J. Han. Transfer learning of graph neural networks with ego-graph information maximization. in Advances in Neural Information Processing Systems. 2021. https://proceedings.neurips.cc/paper_files/paper/2021/file/0dd6049f5fa537d41753be6d37859430-Paper.pdf
2021Zhu, Q., N. Ponomareva, J. Han, and B. Perozzi. Shift-Robust GNNs: Overcoming the Limitations of Localized Graph Training Data. in Advances in Neural Information Processing Systems. 2021. https://proceedings.neurips.cc/paper_files/paper/2021/hash/eb55e369affa90f77dd7dc9e2cd33b16-Abstract.html
2021Zhang, Z., N.N. Parulian, H. Ji, A.S. Elsayed, S. Myers, and M. Palmer. Fine-grained Information Extraction from Biomedical Literature based on Knowledge-enriched Abstract Meaning Representation. in Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021. Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.acl-long.489
2021Zhang, X., C. Zhang, X.L. Dong, J. Shang, and J. Han. Minimally-supervised structure-rich text categorization via learning on text-rich networks. in Proceedings of the Web Conference 2021. 2021. https://doi.org/10.1145/3442381.3450114
2021Xie, Y., J. Shen, S. Li, Y. Mao, and J. Han. Eider: Empowering Document-level Relation Extraction with Efficient Evidence Extraction and Inference-stage Fusion. in Findings of the Association for Computational Linguistics: ACL 2022. 2021. https://doi.org/10.18653/v1/2022.findings-acl.23
2021Wang, X., V. Hu, X. Song, S. Garg, J. Xiao, and J. Han. ChemNER: fine-grained chemistry named entity recognition with ontology-guided distant supervision. in Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. 2021. Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.emnlp-main.424
2021Sun, C., W. Li, J. Xiao, N.N. Parulian, C. Zhai, and H. Ji. Fine-grained chemical entity typing with multimodal knowledge representation. in 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). 2021. IEEE. https://doi.ieeecomputersociety.org/10.1109/BIBM52615.2021.9669360
2021Shen, J., Y. Zhang, H. Ji, and J. Han. Corpus-based open-domain event type induction. in Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. 2021. Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.emnlp-main.441
2021Shah, A.K., A. Dey, and R. Zanibbi. A math formula extraction and evaluation framework for pdf documents. in Document Analysis and Recognition–ICDAR 2021: 16th International Conference, Lausanne, Switzerland, September 5–10, 2021, Proceedings, Part II 16. 2021. Springer International Publishing. https://doi.org/10.1007/978-3-030-86331-9_2
2021Meng, Y., Y. Zhang, J. Huang, X. Wang, Y. Zhang, H. Ji, and J. Han. Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training. in Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. 2021. Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.emnlp-main.810
2021Meng, Y., J. Huang, Y. Zhang, and J. Han. On the Power of Pre-Trained Text Representations: Models and Applications in Text Mining. in Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. 2021. https://doi.org/10.1145/3447548.3470810
2021Mao, Y., W. Ma, D. Lei, J. Han, and X. Ren. Extract, Denoise and Enforce: Evaluating and Improving Concept Preservation for Text-to-Text Generation. in Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. 2021. Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.emnlp-main.413
2021Lai, T., H. Ji, C. Zhai, and Q.H. Tran. Joint Biomedical Entity and Relation Extraction with Knowledge-Enhanced Collective Inference. in Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021. Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.acl-long.488
2021Lai, T., H. Ji, and C. Zhai. BERT might be Overkill: A Tiny but Effective Biomedical Entity Linker based on Residual Convolutional Neural Networks. in Findings of the Association for Computational Linguistics: EMNLP 2021. 2021. Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.findings-emnlp.140
2021Gu, X., Z. Wang, Z. Bi, Y. Meng, L. Liu, J. Han, and J. Shang. Ucphrase: Unsupervised context-aware quality phrase tagging. in Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. 2021. https://doi.org/10.1145/3447548.3467397
2021Edwards, C., C. Zhai, and H. Ji. Text2mol: Cross-modal molecule retrieval with natural language queries. in Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. 2021. Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.emnlp-main.47
2021Dey, A. and R. Zanibbi. ScanSSD-XYc: faster detection for math formulas. in Document Analysis and Recognition–ICDAR 2021 Workshops: Lausanne, Switzerland, September 5–10, 2021, Proceedings, Part I 16. 2021. Springer. https://doi.org/10.1007/978-3-030-86198-8_7