Assistant Professor
Stevens Institute of Technology
Other affiliations:
Affiliated faculty at City University of New York
Co-adjutant at Rutgers University
Congratulations, Dr. Mengjiao Zhang! | 09/2023 |
Congratulations, Dr. Xuting Tang! | 07/2023 |
Congratulations, Dr. Yu Yu! | 06/2023 |
I create methods for competitive machine translation systems, and these methods often advance the state of the art. To this end, I devise general machine learning methods, study their empirical and theoretical limitations, introduce techniques in ensemble learning and subsampling, and bring geometric techniques to the study of structured prediction.
I am an assistant professor at the Stevens Institute of Technology. Previously, I was an associate professor at the Chinese Academy of Sciences and a faculty member and Ph.D. advisor at Tsinghua University. As a graduate student supervised by Hermann Ney at RWTH Aachen, I had fruitful visits to IBM Watson and the NLP group at Microsoft Research (MSR) Redmond. My current research interests are in machine learning, with a focus on highly competitive machine translation systems. Lately, I have developed an interest in techniques that explore the underlying metric and geometric properties of machine translation systems. I publish in mainstream venues in computational linguistics and machine learning (e.g., AAAI, ICML, ACL). I often lead teams that place first (or among the top positions) in machine translation competitions.
Resilient Multi-Agent Reinforcement Learning with Dynamic Participating Agents
Xuting Tang, Jia Xu, and Shusen Wang
2023, accepted in IEEE 12th International Conference on Cloud Networking (CloudNet)
ConceptX: A Framework for Latent Concept Analysis
Firoj Alam, Fahim Dalvi, Nadir Durrani, Hassan Sajjad, Abdul Rafae Khan, and Jia Xu
2023, in Proceedings of AAAI-23 Demonstrations Program
Learning Uncertainty for Unknown Domains with Zero-Target-Assumption
Yu Yu, Hassan Sajjad, and Jia Xu
Fluent Translation Built on Giant Pre-trained Models
Abdul Rafae Khan, Hrishikesh Kanade, Girish Amar Budhrani, Preet Jhanglani, and Jia Xu
2022, in Proceedings of WMT'22 at EMNLP
Can Data Diversity Enhance Learning Generalization?
Yu Yu, Shahram Khadivi, and Jia Xu
2022, in Proceedings of COLING
Byte-based Multilingual NMT for Endangered Languages
Mengjiao Zhang and Jia Xu
2022, in Proceedings of COLING
Analyzing Encoded Concepts in Transformer Language Models
Hassan Sajjad, Nadir Durrani, Fahim Dalvi, Firoj Alam, Abdul Rafae Khan, and Jia Xu
Discovering Latent Concepts Learned in BERT
Fahim Dalvi, Abdul Rafae Khan, Firoj Alam, Nadir Durrani, Jia Xu, Hassan Sajjad
Grouping Words with Semantic Diversity
Karine Chubarian, Abdul Rafae Khan, Anastasios Sidiropoulos, and Jia Xu (alphabetically ordered)
Coding Textual Inputs Boosts the Accuracy of Neural Networks
Abdul Rafae Khan, Jia Xu, and Weiwei Sun
A Clustering Framework for Lexical Normalization of Roman Urdu
Abdul Rafae Khan, Asim Karim, Hassan Sajjad, Faisal Kamiran, and Jia Xu
2020, in Journal of Natural Language Engineering
CUNY-PKU Parser at SemEval-2019 task 1: Cross-Lingual Semantic Parsing with UCCA
Weimin Lyu, Sheng Huang, Abdul Rafae Khan, Shengqiang Zhang, Weiwei Sun, and Jia Xu
2019, in Proceedings of the 13th International Workshop on Semantic Evaluation
WCS: Robust Network Localization by Weighted Component Stitching
Tianyuan Sun, Yongcai Wang, Deying Li, Zhaoquan Gu, and Jia Xu
2018, in IEEE/ACM Transactions on Networking
Assessing Quality Estimation Models for Sentence-Level Prediction
Hoang Cuong and Jia Xu
August 2018, Proceedings of COLING
On the Efficient Online Model Adaptation by Incremental Simplex Tableau
Zhixian Lei, Xuehan Ye, Yongcai Wang, Deying Li, Jia Xu
June 2017, Proceedings of AAAI
Hunter MT: A Course for Young Researchers in WMT17
Jia Xu, Yizong Kuang, Shondell Baijoo, Jacob Lee, Uman Shahzad, Meredith Lancaster, and Chris Carlan
June 2017, Proceedings of the Second Conference on Machine Translation at EMNLP
On the Power and Limits of Distance-Based Learning
P. A. Papakonstantinou, J. Xu, G. Yang (authors alphabetically ordered)
June 2016, Proceedings of ICML
Query Lattice for Translation Retrieval
M. Dong, Y. Cheng, Y. Liu, J. Xu, M. Sun
August 2014, Proceedings of COLING
An Ant Colony Optimization Method to Detect Communities in Social Networks
S. H. Javadi, S. Khadivi, M. E. Shiri, and J. Xu
August 2014, Proceedings of ASONAM
Salient Object Detection in Image Sequences via Spatial-Temporal Cue
C. Gan, Z. Qin, J. Xu and T. Wan
November 2013, Proceedings of the Conference on Visual Communications and Image Processing
Enhancing Chinese Word Segmentation Using Unlabeled Data
W. Sun and J. Xu
July 2011, Proceedings of the Conference on Empirical Methods in Natural Language Processing
Generating Virtual Parallel Corpus: A Compatibility Centric Method
J. Xu and W. Sun
September 2011, Proceedings of the Machine Translation Summit
Improving Machine Translation Performance Using Comparable Corpora
A. Eisele and J. Xu
May 2010, Proceedings of the LREC Workshop on Building and Using Comparable Corpora
Further Experiments with Shallow Hybrid MT Systems
C. Federmann, A. Eisele, Y. Chen, S. Hunsicker, J. Xu and H. Uszkoreit
July 2010, Proceedings of the ACL Workshop on Statistical Machine Translation
Bayesian Semi-Supervised Chinese Word Segmentation for Statistical Machine Translation
J. Xu, J. Gao, K. Toutanova and H. Ney
August 2008, Proceedings of the 22nd International Conference on Computational Linguistics
Phrase Table Training for Precision and Recall: What Makes a Good Phrase and a Good Phrase Pair?
Y. Deng, J. Xu and Y. Gao
June 2008, Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics
Domain Dependent Machine Translation
J. Xu, Y. Deng, Y. Gao and H. Ney
September 2007, Proceedings of the Machine Translation Summit XI
Partitioning Parallel Documents Using Binary Segmentation
J. Xu, R. Zens and H. Ney
June 2006, Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL): Proceedings of the Workshop on Statistical Machine Translation
Error Analysis of Statistical Machine Translation Output
D. Vilar, J. Xu, L. F. D'Haro and H. Ney
May 2006, Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC)
Integrated Chinese Word Segmentation in Statistical Machine Translation
J. Xu, E. Matusov, R. Zens and H. Ney
October 2005, Proceedings of the International Workshop on Spoken Language Translation (IWSLT)
The RWTH Phrase-based Statistical Machine Translation System
R. Zens, O. Bender, S. Hasan, S. Khadivi, E. Matusov, J. Xu, Y. Zhang, and H. Ney
May 2005, Proceedings of the International Workshop on Spoken Language Translation (IWSLT)
Sentence Segmentation Using IBM Word Alignment Model 1
J. Xu, R. Zens and H. Ney
May 2005, Proceedings of the 10th Annual Conference of the European Association for Machine Translation (EAMT 2005)
Do We Need Chinese Word Segmentation for Statistical Machine Translation?
J. Xu, R. Zens and H. Ney
July 2004, Proceedings of the Third SIGHAN Workshop on Chinese Language Processing
Sequence Segmentation of Statistical Machine Translation
J. Xu
September 2010, Ph.D. dissertation in Computer Science
A Computational Model for Sound Processing in the Human Auditory System
J. Xu
November 2002, Master's thesis in Computer Science
Competition | Representing | Rank | Notes |
---|---|---|---|
WMT 2022 | SIT | 1st | Code-Mixing MT subtask 2 Hinglish->English team advisor |
WMT 2022 | SIT | 1st (w.r.t. WER) | Code-Mixing MT subtask 1 Hindi+English->Hinglish team advisor |
WMT 2018 | Hunter | 1st | French-English Biomedical track team leader |
WMT 2017 | Hunter | 1st (w.r.t. BLEU) | Finnish-English News track team leader |
NIST 2015 | ICT-DCU | 1st and 4th | 1st (academic inst.) and 4th (overall) team leader and main contributor |
WMT 2011 | DFKI | 1st (w.r.t. BLEU) | English-German News track team leader; official result |
NIST 2008 | MSR | 1st | Intern at MSR; official result |
NIST 2006 | RWTH-Aachen | 4th | Official result |
NIST 2005 | RWTH-Aachen | 4th | Official result |
NIST 2004 | RWTH-Aachen | 2nd | Official result |
GALE 2008 | RWTH-Aachen | 2nd | ranked second in NightInGale |
GALE 2007 | RWTH-Aachen | 2nd | ranked second in NightInGale |
GALE 2006 | RWTH-Aachen | 2nd | ranked second in NightInGale |
TC-Star 2006 | RWTH-Aachen | 1st | |
TC-Star 2005 | RWTH-Aachen | 1st | |
TC-Star 2004 | RWTH-Aachen | 1st | |
Source | Amount | Role | Duration | Title | Grant No. |
---|---|---|---|---|---|
Amazon grant | 250,000 USD | Principal investigator | 2023-2025 | Alexa Prize | - |
NSF CRAFT pilot | 30,000 USD | Principal investigator | 2022 | Center for Research toward Advanced Financial Technologies (CRAFT) NSF IUCRC | - |
NSF grant | 299,000 USD | Co-PI | 2018-2023 | IUCRC Phase I Rutgers, Newark: Center for Accelerated Real Time Analytics (CARTA) | 1747728 |
NSFC (NSF-China) grant | 660,000 RMB (100,000 USD) | Co-PI | 2017-2019 | Key Problems for Tightly-coupled, Multi-signal Fusion based Simultaneously Locating and Mapping | 61672524 |
ICT-CAS grant (Innovation subjects) | 500,000 RMB (83,000 USD) | Principal investigator | 2015-2017 | Ensemble learning in machine translation | 20156020 |
KLIIP-ICT-CAS grant | 200,000 RMB (33,000 USD) | Principal investigator | 2015-2016 | Novel machine learning methods | 20156020 |
NSFC grant | 660,000 RMB (100,000 USD) | Co-PI | 2014-2017 | New approaches to the limits of efficient propositional reasoning: algorithms, approximations and foundations | 20131351464 |
IIIS-Tsinghua grant | 150,000 RMB (25,000 USD) | Principal investigator | 2012-2015 | Machine learning and machine translation | NA |
Affiliation | Month |
---|---|
CUNY GC | May 2019 |
JSALT | July 2019 |
MTMA | May 2019 |
Google Research NYC | Dec. 2015 |
Columbia University | Nov. 2015 |
University of Washington | Nov. 2015 |
USC | Nov. 2015 |
CWMT | Oct. 2014 |
RWTH-Aachen University | Oct. 2014 |
Aarhus University | Oct. 2014 |
Stanford University | Apr. 2014 |
Mar. 2014 |
Name | Position | University | Starting Month |
---|---|---|---|
Abdul Khan | Postdoc | Stevens | Apr 2020 |
Mengjiao Zhang | Ph.D. | Stevens | Aug 2019 |
Xuting Tang | Ph.D. | Stevens | Aug 2019 |
Yu Yu | Ph.D. | Stevens | Aug 2020 |
Name | Position | University | Month | Thesis |
---|---|---|---|---|
Cuong Hoang | Postdoc | CUNY | 2017-2018 | (Research) Quality Estimation |
Abdul Khan | Ph.D. | CUNY GC | Jun 2019 | Robust Neural Machine Translation |
Sejal Vyas | MSc | Stevens | Aug 2020 | |
Sattvik Sahai | MSc | Stevens | Aug 2020 | |
Geliang Chen | MSc | Tsinghua | Dec 2016 | Phrase-based Language Model for Statistical Machine Translation (SMT) |
Xiaojun Zhang | MSc | U. Saarland (Co-adv. Prof. Uszkoreit) | Nov 2011 | Two-level Parallel Text Extraction from Comparable Corpora |
Shun Zheng | B.S. | BUPT | May 2014 | Improvements on Word Alignment Models in SMT |
Zhengping Che | B.S. | Tsinghua | Jun 2013 | Dirichlet Process Model for Phrase-based MT |
Yulong Zeng | B.S. | Tsinghua | Jun 2013 | A Comparative Study of Generative Model and Discriminative Model |
Zhibo Zhang | B.S. | Tsinghua | Jun 2013 | Machine Learning-based Crime Prediction |
Geliang Chen | B.S. | Peking University | Jun 2013 | A Novel Approach for Language Model |
Yong Cheng | B.S. | NJTU | Jun 2013 | Analysis of User Behaviors in Social Network |