I am a Senior Applied Scientist at Amazon Search, building knowledge-enhanced LLMs for next-generation conversational shopping experiences. I lead science efforts on NLP and text/graph mining that enable accurate and comprehensive product understanding, with a special focus on low-resource, open-world settings. My research interests lie in the fields of data mining and natural language processing. In particular, I am interested in text/graph mining and knowledge-intensive language tasks.
[2023/05] Five papers (Open-World Attribute Mining/Relational Text Augmentation/KG Geometric Embedding/Cross-Lingual Transfer/Cross-Modal Transfer) were accepted to ACL 2023.
[2023/04] One paper on Rationale Extraction was accepted to SIGIR 2023.
[2022/12] Accepted the invitation to serve on the Program Committee of ACL 2023.
[2022/09] Accepted the invitation to serve on the Program Committee of TheWebConf 2023.
We present a new task setting for attribute mining on e-commerce products, serving as a practical solution to extract open-world attributes without extensive human intervention. Our supervision comes from a high-quality seed attribute set bootstrapped from existing resources, and we aim to expand the attribute vocabulary of existing seed types and to discover new attribute types automatically. A new dataset is created to support our setting, and our approach, Amacer, is proposed specifically to tackle the limited supervision. In particular, given that no direct supervision is available for unseen new attributes, our novel formulation exploits a self-supervised heuristic and unsupervised latent attributes, which attains implicit semantic signals as additional supervision by leveraging product context. Experiments suggest that our approach surpasses various baselines by 12 F1, expanding attributes of existing types significantly by up to 12 times, and discovering values from 39% new types. Our dataset and code will be publicly available.
@inproceedings{xu2023topic,
selected = {1},
abbr = {ACL},
topic = {Knowledge Graph},
title = {Towards Open-World Product Attribute Mining: A Lightly-Supervised Approach},
author = {Xu, Liyan and Zhang, Chenwei and Li, Xian and Shang, Jingbo and Choi, Jinho D.},
booktitle = {Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics},
year = {2023},
pdf = {}
}
Unsupervised relation extraction aims to extract relationships between entities from natural language sentences without prior information on relational scope or distribution. Existing works either utilize self-supervised schemes to refine relational feature signals by iteratively leveraging adaptive clustering and classification, which provoke gradual drift problems, or adopt instance-wise contrastive learning, which unreasonably pushes apart sentence pairs that are semantically similar. To overcome these defects, we propose a novel contrastive learning framework named HiURE, which derives hierarchical signals from the relational feature space using cross-hierarchy attention and effectively optimizes the relation representation of sentences under exemplar-wise contrastive learning. Experimental results on two public datasets demonstrate the effectiveness and robustness of HiURE on unsupervised relation extraction when compared with state-of-the-art models.
@inproceedings{liu2022hierarchical,
abbr = {NAACL},
topic = {NLP},
selected = {1},
title = {HiURE: Hierarchical Exemplar Contrastive Learning for Unsupervised Relation Extraction},
author = {Liu, Shuliang and Hu, Xuming and Zhang, Chenwei and Li, Shu'ang and Wen, Lijie and Yu, Philip S.},
booktitle = {Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
year = {2022},
pdf = {https://arxiv.org/pdf/2205.02225.pdf},
code = {https://github.com/THU-BPM/HiURE}
}
Automatic extraction of product attributes from their textual descriptions is essential for the online shopping experience. One inherent challenge of this task is the emerging nature of e-commerce products: we constantly see new types of products with their own unique sets of new attributes. Most prior works on this matter mine new values for a set of known attributes but cannot handle new attributes that arise from constantly changing data. In this work, we study the attribute mining problem in an open-world setting to extract novel attributes and their values. Instead of providing comprehensive training data, the user only needs to provide a few examples for a few known attribute types as weak supervision. We propose a principled framework that first generates attribute value candidates and then groups them into clusters of attributes. The candidate generation step probes a pre-trained language model to extract phrases from product titles. Then, an attribute-aware fine-tuning method optimizes a multitask objective and shapes the language model representation to be attribute-discriminative. Finally, we discover new attributes and values through the self-ensemble of our framework, which handles the open-world challenge. We run extensive experiments on a large distantly annotated development set and a gold standard human-annotated test set that we collected. Our model significantly outperforms strong baselines and can generalize to unseen attributes and product types.
@inproceedings{zhang2022open,
abbr = {TheWebConf},
topic = {Knowledge Graph},
selected = {1},
title = {OA-Mine: Open-World Attribute Mining for E-Commerce Products with Weak Supervision},
author = {Zhang, Xinyang and Zhang, Chenwei and Li, Xian and Dong, Xin Luna and Shang, Jingbo and Faloutsos, Christos and Han, Jiawei},
booktitle = {Proceedings of the Web Conference},
year = {2022},
pdf = {https://assets.amazon.science/d5/d3/ce07fed14287b4a8c23a7d34bf59/oa-mine-open-world-attribute-mining-for-ecommerce-products-with-weak-supervision.pdf},
code = {https://github.com/xinyangz/OAMine},
video = {https://www.youtube.com/watch?v=vrDPV8EMLnA},
slides = {OA-Mine_2022TheWebConf_slides.pdf}
}
Low-resource Relation Extraction (LRE) aims to extract relation facts from limited labeled corpora when human annotation is scarce. Existing works either utilize a self-training scheme to generate pseudo labels, which causes the gradual drift problem, or leverage a meta-learning scheme, which does not solicit feedback explicitly. To alleviate selection bias due to the lack of feedback loops in existing LRE learning paradigms, we develop a Gradient Imitation Reinforcement Learning method to encourage pseudo-labeled data to imitate the gradient descent direction on labeled data and bootstrap its optimization capability through trial and error. We also propose a framework called GradLRE, which handles two major scenarios in low-resource relation extraction. Besides the scenario where unlabeled data is sufficient, GradLRE handles the situation where no unlabeled data is available by exploiting a contextualized augmentation method to generate data. Experimental results on two public datasets demonstrate the effectiveness of GradLRE on low-resource relation extraction compared with baselines.
@inproceedings{hu2021gradient,
abbr = {EMNLP},
topic = {NLP},
selected = {1},
title = {Gradient Imitation Reinforcement Learning for Low Resource Relation Extraction},
author = {Hu, Xuming and Zhang, Chenwei and Yang, Yawen and Li, Xiaohe and Lin, Li and Wen, Lijie and Yu, Philip S.},
booktitle = {Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing},
year = {2021},
pdf = {https://arxiv.org/pdf/2109.06415.pdf},
code = {https://github.com/THU-BPM/GradLRE}
}
We answer the following key questions in this tutorial: What are the unique challenges in building a product knowledge graph, and what are the solutions? Are these techniques applicable to building knowledge graphs in other domains? What are practical tips for bringing this to production?
@inproceedings{zalmout2021all,
abbr = {KDD},
topic = {Knowledge Graph},
selected = {1},
title = {All You Need to Know to Build a Product Knowledge Graph},
author = {Zalmout, Nasser and Zhang, Chenwei and Li, Xian and Liang, Yan and Dong, Xin Luna},
booktitle = {Proceedings of the 27th ACM SIGKDD International Conference on Knowledge Discovery \& Data Mining},
year = {2021},
pdf = {https://naixlee.github.io/Product_Knowledge_Graph_Tutorial_KDD2021/},
media = {https://naixlee.github.io/Product_Knowledge_Graph_Tutorial_KDD2021/}
}
Text categorization is an essential task in Web content analysis. Considering the ever-evolving Web data and newly emerging categories, instead of the laborious supervised setting, in this paper we focus on the minimally-supervised setting, which aims to categorize documents effectively with only a couple of seed documents annotated per category. We recognize that texts collected from the Web are often structure-rich, i.e., accompanied by various metadata. One can easily organize the corpus into a text-rich network, joining raw text documents with document attributes, high-quality phrases, and label surface names as nodes, and their associations as edges. Such a network provides a holistic view of the corpus’ heterogeneous data sources and enables a joint optimization for network-based analysis and deep textual model training. We therefore propose a novel framework for minimally supervised categorization by learning from the text-rich network. Specifically, we jointly train two modules with different inductive biases: a text analysis module for text understanding and a network learning module for class-discriminative, scalable network learning. Each module generates pseudo training labels from the unlabeled document set, and both modules mutually enhance each other by co-training using pooled pseudo labels. We test our model on two real-world datasets. On the challenging e-commerce product categorization dataset with 683 categories, our experiments show that given only three seed documents per category, our framework can achieve an accuracy of about 92%, significantly outperforming all compared methods; our accuracy is less than 2% below that of the supervised BERT model trained on about 50K labeled documents.
@inproceedings{zhang2021minimally,
abbr = {TheWebConf},
topic = {Graph Mining},
selected = {1},
title = {Minimally-Supervised Structure-Rich Text Categorization via Learning on Text-Rich Networks},
author = {Zhang, Xinyang and Zhang, Chenwei and Dong, Xin Luna and Shang, Jingbo and Han, Jiawei},
booktitle = {Proceedings of the Web Conference},
year = {2021},
pdf = {https://arxiv.org/pdf/2102.11479.pdf},
code = {https://github.com/xinyangz/ltrn},
video = {https://videolectures.net/www2021_zhang_minimally_supervised/}
}
Open relation extraction is the task of extracting open-domain relation facts from natural language sentences. Existing works either utilize heuristics or distant-supervised annotations to train a supervised classifier over pre-defined relations, or adopt unsupervised methods with additional assumptions that have less discriminative power. In this work, we propose a self-supervised framework named SelfORE, which exploits weak, self-supervised signals by leveraging a large pretrained language model for adaptive clustering on contextualized relational features, and bootstraps the self-supervised signals by improving contextualized features in relation classification. Experimental results on three datasets show the effectiveness and robustness of SelfORE on open-domain relation extraction compared with competitive baselines.
@inproceedings{hu2020selfore,
abbr = {EMNLP},
topic = {NLP},
selected = {1},
title = {SelfORE: Self-supervised Relational Feature Learning for Open Relation Extraction},
author = {Hu, Xuming and Zhang, Chenwei and Xu, Yusong and Wen, Lijie and Yu, Philip S.},
booktitle = {Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing},
year = {2020},
pdf = {https://arxiv.org/pdf/2004.02438.pdf},
code = {https://github.com/THU-BPM/SelfORE}
}
Can one build a knowledge graph (KG) for all products in the world? Knowledge graphs have firmly established themselves as valuable sources of information for search and question answering, and it is natural to wonder if a KG can contain information about products offered at online retail sites. There have been several successful examples of generic KGs, but organizing information about products poses many additional challenges, including sparsity and noise of structured data for products, complexity of the domain with millions of product types and thousands of attributes, heterogeneity across a large number of categories, as well as a large and constantly growing number of products. We describe AutoKnow, our automatic (self-driving) system that addresses these challenges. The system includes a suite of novel techniques for taxonomy construction, product property identification, knowledge extraction, anomaly detection, and synonym discovery. AutoKnow is (a) automatic, requiring little human intervention, (b) multi-scalable, scalable in multiple dimensions (many domains, many products, and many attributes), and (c) integrative, exploiting rich customer behavior logs. AutoKnow has been operational in collecting product knowledge for over 11K product types.
@inproceedings{dong2020autoknow,
abbr = {KDD},
topic = {Knowledge Graph},
selected = {1},
title = {AutoKnow: Self-Driving Knowledge Collection for Products of Thousands of Types},
author = {Dong, Xin Luna and He, Xiang and Kan, Andrey and Li, Xian and Liang, Yan and Ma, Jun and Xu, Yifan Ethan and Zhang, Chenwei and Zhao, Tong and Blanco Saldana, Gabriel and others},
booktitle = {Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery \& Data Mining},
pages = {2724--2734},
year = {2020},
pdf = {https://arxiv.org/pdf/2006.13473.pdf},
media = {https://www.amazon.science/blog/building-product-graphs-automatically}
}
@inproceedings{zhang2019joint,
abbr = {ACL},
topic = {NLP},
selected = {1},
title = {Joint Slot Filling and Intent Detection via Capsule Neural Networks},
author = {Zhang, Chenwei and Li, Yaliang and Du, Nan and Fan, Wei and Yu, Philip S.},
booktitle = {Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics},
pages = {5259--5267},
year = {2019},
pdf = {https://arxiv.org/pdf/1812.09471.pdf},
poster = {https://drive.google.com/file/d/1rZpP-4WY7T8AtARXde7qZd5enV53yNOL/view},
code = {https://github.com/czhang99/Capsule-NLU}
}
@inproceedings{xia2018zero,
abbr = {EMNLP},
topic = {NLP},
selected = {1},
title = {Zero-shot User Intent Detection via Capsule Neural Networks},
author = {Xia*, Congying and Zhang*, Chenwei and Yan, Xiaohui and Chang, Yi and Yu, Philip S.},
booktitle = {Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing},
pages = {3090--3099},
year = {2018},
pdf = {https://arxiv.org/pdf/1809.00385.pdf},
video = {https://vimeo.com/305945714},
code = {https://github.com/congyingxia/ZeroShotCapsule}
}