Chenwei Zhang

I am a Senior Applied Scientist at Amazon, building LLMs for next-generation shopping experiences. My research interests lie in the fields of data mining and natural language processing. In particular, I am interested in text/graph mining as well as knowledge-intensive language and sequence modeling tasks.

Contact: cwzhang910 AT gmail D0T c0m | Google Scholar | LinkedIn



Research Topics:

◦  Natural Language Processing

◦  Text/Graph Mining

◦  Large Language Models (LLMs)

  NEWS

[2024/05] One paper on Life-Long Learning for Attribute Extraction is accepted by ACL 2024.

[2024/02] One paper on Cross-Lingual Transfer Beyond Texts is accepted by LREC-Coling 2024.

[2024/02] Conversational Shopping with Amazon through our LLM: Amazon announces Rufus …

[2023/10] Two papers on Multi-Domain Prompting and Knowledge-Selective Pretraining are accepted by EMNLP 2023.

[2023/09] One paper on Web Search Augmented Open Relation Extraction is accepted by TKDE.

   WORK EXPERIENCE

[2023/04 - Now] Senior Applied Scientist at Amazon Stores Foundational AI, Seattle, WA

[2019/08 - 2023/04] Senior Applied Scientist at Amazon Product Graph, Seattle, WA

[2017/08 - 2019/05] Research Assistant at UIC Big Data and Social Computing Lab, Chicago, IL


   SELECTED PUBLICATIONS
  1. EMNLP
    CoF-CoT: Enhancing large language models with coarse-to-fine chain-of-thought prompting for multi-domain NLU tasks Hoang H Nguyen, Ye Liu, Chenwei Zhang, Tao Zhang, and Philip S. Yu In The 2023 Conference on Empirical Methods in Natural Language Processing 2023 [Abstract] [BibTex]
    While Chain-of-Thought prompting is popular in reasoning tasks, its application to Large Language Models (LLMs) in Natural Language Understanding (NLU) is under-explored. Motivated by the multi-step reasoning of LLMs, we propose a Coarse-to-Fine Chain-of-Thought (CoF-CoT) approach that breaks down NLU tasks into multiple reasoning steps, where LLMs can learn to acquire and leverage essential concepts to solve tasks at different granularities. Moreover, we propose leveraging semantic-based Abstract Meaning Representation (AMR) structured knowledge as an intermediate step to capture the nuances and diverse structures of utterances, and to understand connections between their varying levels of granularity. Our proposed approach is demonstrated to be effective in helping LLMs adapt to multi-grained NLU tasks under both zero-shot and few-shot multi-domain settings.
    @inproceedings{nguyen2023enhancing,
      selected = {1},
      abbr = {EMNLP},
      topic = {NLP},
      title = {CoF-CoT: Enhancing large language models with coarse-to-fine chain-of-thought prompting for multi-domain NLU tasks},
      author = {Nguyen, Hoang H and Liu, Ye and Zhang, Chenwei and Zhang, Tao and Yu, Philip S.},
      booktitle = {The 2023 Conference on Empirical Methods in Natural Language Processing},
      year = {2023},
      pdf = {}
    }
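The coarse-to-fine idea above can be sketched as a loop of progressively finer prompts, each conditioned on the previous, coarser answer. This is a minimal illustration and not the paper's code: `cof_cot`, `toy_llm`, and the step names are invented for the example, and a real LLM call would replace the toy stub.

```python
# Illustrative sketch of coarse-to-fine chain-of-thought prompting: each
# reasoning step conditions the model on the answers from coarser steps.
# `llm` is a hypothetical stand-in for a real model call.

def cof_cot(utterance, llm, steps=("domain", "intent", "slots")):
    """Run a coarse-to-fine chain of prompts; returns one answer per step."""
    context = f"Utterance: {utterance}"
    answers = {}
    for step in steps:
        prompt = f"{context}\nPredict the {step}:"
        answers[step] = llm(prompt)
        # Fold the coarser answer back into the context for the next step.
        context += f"\n{step}: {answers[step]}"
    return answers

# Toy rule-based "LLM" so the sketch runs end to end.
def toy_llm(prompt):
    if "Predict the domain" in prompt:
        return "alarm"
    if "Predict the intent" in prompt:
        return "set_alarm"
    return "time=7am"

result = cof_cot("wake me up at 7am", toy_llm)
```

In a real setting the AMR parse would be one of the intermediate steps, with its output folded into the context the same way.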
    
  2. EMNLP
    Knowledge-Selective Pretraining for Attribute Value Extraction Hui Liu, Qingyu Yin, Zhengyang Wang, Chenwei Zhang, Haoming Jiang, Yifan Gao, Zheng Li, Xian Li, Chao Zhang, Bing Yin, William Yang Wang, and Xiaodan Zhu In Findings of the 2023 Conference on Empirical Methods in Natural Language Processing 2023 [Abstract] [BibTex]
    Attribute Value Extraction (AVE) aims to retrieve the values of attributes from product profiles. State-of-the-art methods tackle the AVE task through a question-answering (QA) paradigm, where the value is predicted from the context (i.e., product profile) given a query (i.e., attributes). Despite the substantial advancements that have been made, the performance of existing methods on rare attributes is still far from satisfactory, and they cannot be easily extended to unseen attributes due to poor generalization ability. In this work, we propose to leverage pretraining and transfer learning to address these weaknesses. We first collect product information from various e-commerce stores and retrieve a large number of (profile, attribute, value) triples, which are used as the pretraining corpus. To utilize the retrieved corpus more effectively, we further design a Knowledge-Selective Framework (KSelF) based on query expansion that can be closely combined with the pretraining corpus to boost performance. Meanwhile, considering that the public AE-pub dataset contains considerable noise, we construct and contribute a larger benchmark, EC-AVE, collected from e-commerce websites. We evaluate on both of these datasets. The experimental results demonstrate that our proposed KSelF achieves new state-of-the-art performance without pretraining. When incorporated with the pretraining corpus, the performance of KSelF can be further improved, particularly on attributes with limited training resources.
    @inproceedings{liu2023knowledge,
      selected = {1},
      abbr = {EMNLP},
      topic = {NLP},
      title = {Knowledge-Selective Pretraining for Attribute Value Extraction},
      author = {Liu, Hui and Yin, Qingyu and Wang, Zhengyang and Zhang, Chenwei and Jiang, Haoming and Gao, Yifan and Li, Zheng and Li, Xian and Zhang, Chao and Yin, Bing and Wang, William Yang and Zhu, Xiaodan},
      booktitle = {Findings of the 2023 Conference on Empirical Methods in Natural Language Processing},
      year = {2023},
      pdf = {}
    }
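The query-expansion idea behind KSelF can be illustrated with a toy version: expand an attribute query using values observed for that attribute in a (profile, attribute, value) corpus, then match the expanded query against a product profile. This is a hypothetical sketch, not KSelF itself; the function names and the tiny corpus are made up for the example.

```python
# Minimal sketch of query expansion for attribute value extraction:
# values seen for an attribute in the pretraining triples enrich the query.

from collections import defaultdict

def build_value_index(triples):
    """Collect the values observed per attribute in the pretraining corpus."""
    index = defaultdict(set)
    for _profile, attribute, value in triples:
        index[attribute].add(value.lower())
    return index

def extract_value(profile, attribute, value_index):
    """QA-style extraction: return a known value that appears in the profile."""
    for value in value_index.get(attribute, ()):
        if value in profile.lower():
            return value
    return None

# Made-up pretraining triples for illustration only.
corpus = [
    ("leather wallet brown", "color", "brown"),
    ("steel watch silver dial", "color", "silver"),
]
index = build_value_index(corpus)
```

A real system would score candidate spans with a pretrained QA model rather than exact string matching; the sketch only shows how the expanded query narrows the search.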
    
  3. TKDE
    Reading Broadly to Open Your Mind: Improving Open Relation Extraction with Self-supervised Information in Documents Xuming Hu, Zhaochen Hong, Chenwei Zhang, Aiwei Liu, Shiao Meng, Lijie Wen, Irwin King, and Philip S. Yu In IEEE Transactions on Knowledge and Data Engineering 2023 [BibTex]
    @article{hu2023reading,
      selected = {1},
      abbr = {TKDE},
      topic = {NLP},
      title = {Reading Broadly to Open Your Mind: Improving Open Relation Extraction with Self-supervised Information in Documents},
      author = {Hu, Xuming and Hong, Zhaochen and Zhang, Chenwei and Liu, Aiwei and Meng, Shiao and Wen, Lijie and King, Irwin and Yu, Philip S.},
      journal = {IEEE Transactions on Knowledge and Data Engineering},
      year = {2023},
      pdf = {}
    }
    
  4. ACL
    Towards Open-World Product Attribute Mining: A Lightly-Supervised Approach Liyan Xu, Chenwei Zhang, Xian Li, Jingbo Shang, and Jinho D. Choi In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics 2023 [Abstract] [BibTex]
    We present a new task setting for attribute mining on e-commerce products, serving as a practical solution to extract open-world attributes without extensive human intervention. Our supervision comes from a high-quality seed attribute set bootstrapped from existing resources, and we aim to expand the attribute vocabulary of existing seed types, and also to discover any new attribute types automatically. A new dataset is created to support our setting, and our approach Amacer is proposed specifically to tackle the limited supervision. Especially, given that no direct supervision is available for those unseen new attributes, our novel formulation exploits self-supervised heuristics and unsupervised latent attributes, which attain implicit semantic signals as additional supervision by leveraging product context. Experiments suggest that our approach surpasses various baselines by 12 F1, expanding attributes of existing types significantly by up to 12 times, and discovering values from 39% new types. Our dataset and code will be publicly available.
    @inproceedings{xu2023topic,
      selected = {1},
      abbr = {ACL},
      topic = {Knowledge Graph},
      title = {Towards Open-World Product Attribute Mining: A Lightly-Supervised Approach},
      author = {Xu, Liyan and Zhang, Chenwei and Li, Xian and Shang, Jingbo and Choi, Jinho D.},
      booktitle = {Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics},
      year = {2023},
      pdf = {https://arxiv.org/pdf/2305.18350.pdf}
    }
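The seed-expansion step can be pictured with a crude distributional heuristic: a candidate term joins an attribute's vocabulary when it shares surrounding context words with the seed values in the product corpus. This is only a hypothetical sketch under that simplifying assumption; Amacer's actual formulation uses learned representations, and all names and titles below are invented.

```python
# Toy seed-based attribute expansion: candidates whose corpus contexts
# overlap the seeds' contexts are admitted into the attribute vocabulary.

def context_profile(term, titles, window=1):
    """Words appearing within `window` tokens of `term` across the corpus."""
    profile = set()
    for title in titles:
        tokens = title.lower().split()
        for i, tok in enumerate(tokens):
            if tok == term:
                lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
                profile.update(tokens[lo:i] + tokens[i + 1:hi])
    return profile

def expand_seeds(seeds, candidates, titles, min_overlap=1):
    """Keep candidates whose contexts overlap the pooled seed contexts."""
    seed_ctx = set().union(*(context_profile(s, titles) for s in seeds))
    return [c for c in candidates
            if len(context_profile(c, titles) & seed_ctx) >= min_overlap]

titles = [
    "cotton t-shirt blue",
    "linen t-shirt green",
    "blue denim jeans",
]
expanded = expand_seeds(seeds=["cotton"], candidates=["linen", "jeans"], titles=titles)
```

Here "linen" is admitted as a material value because it occurs in the same context as the seed "cotton", while "jeans" is not.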
    
  5. EMNLP
    Gradient Imitation Reinforcement Learning for Low Resource Relation Extraction Xuming Hu, Chenwei Zhang, Yawen Yang, Xiaohe Li, Li Lin, Lijie Wen, and Philip S. Yu In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing 2021 [Abstract] [BibTex] [Code]
    Low-resource Relation Extraction (LRE) aims to extract relation facts from limited labeled corpora when human annotation is scarce. Existing works either utilize a self-training scheme to generate pseudo labels, which causes the gradual drift problem, or leverage a meta-learning scheme that does not solicit feedback explicitly. To alleviate selection bias due to the lack of feedback loops in existing LRE learning paradigms, we develop a Gradient Imitation Reinforcement Learning method that encourages pseudo-labeled data to imitate the gradient descent direction on labeled data and bootstraps its optimization capability through trial and error. We also propose a framework called GradLRE, which handles two major scenarios in low-resource relation extraction. Besides the scenario where unlabeled data is sufficient, GradLRE handles the situation where no unlabeled data is available by exploiting a contextualized augmentation method to generate data. Experimental results on two public datasets demonstrate the effectiveness of GradLRE on low-resource relation extraction when compared with baselines.
    @inproceedings{hu2021gradient,
      abbr = {EMNLP},
      topic = {NLP},
      selected = {1},
      title = {Gradient Imitation Reinforcement Learning for Low Resource Relation Extraction},
      author = {Hu, Xuming and Zhang, Chenwei and Yang, Yawen and Li, Xiaohe and Lin, Li and Wen, Lijie and Yu, Philip S.},
      booktitle = {Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing},
      year = {2021},
      pdf = {https://arxiv.org/pdf/2109.06415.pdf},
      code = {https://github.com/THU-BPM/GradLRE}
    }
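The gradient imitation signal described above can be sketched in a few lines: the reward for a pseudo-labeled batch is the cosine similarity between its gradient and the gradient computed on labeled data. This is an illustration of the idea, not the GradLRE implementation; gradients are plain vectors here rather than model parameters.

```python
# Illustrative gradient imitation reward: pseudo-labeled batches whose
# descent direction matches the labeled-data gradient are rewarded.

import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def imitation_reward(pseudo_grad, labeled_grad):
    """Reward pseudo-labeled data that imitates the labeled descent direction."""
    return cosine(pseudo_grad, labeled_grad)

# A pseudo batch pointing the same way as the labeled gradient earns reward 1...
aligned = imitation_reward([0.2, 0.4], [0.1, 0.2])
# ...while an opposing direction is maximally penalized.
opposed = imitation_reward([-0.2, -0.4], [0.1, 0.2])
```

In the reinforcement learning loop, this scalar feeds back into how pseudo labels are generated, closing the feedback loop the abstract contrasts with plain self-training.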
    
  6. TheWebConf
    Minimally-Supervised Structure-Rich Text Categorization via Learning on Text-Rich Networks Xinyang Zhang, Chenwei Zhang, Xin Luna Dong, Jingbo Shang, and Jiawei Han In Proceedings of the Web Conference 2021 [Abstract] [BibTex] [Code] [Video]
    Text categorization is an essential task in Web content analysis. Considering the ever-evolving Web data and new emerging categories, instead of the laborious supervised setting, in this paper, we focus on the minimally-supervised setting that aims to categorize documents effectively, with a couple of seed documents annotated per category. We recognize that texts collected from the Web are often structure-rich, i.e., accompanied by various metadata. One can easily organize the corpus into a text-rich network, joining raw text documents with document attributes, high-quality phrases, label surface names as nodes, and their associations as edges. Such a network provides a holistic view of the corpus’ heterogeneous data sources and enables a joint optimization for network-based analysis and deep textual model training. We therefore propose a novel framework for minimally supervised categorization by learning from the text-rich network. Specifically, we jointly train two modules with different inductive biases – a text analysis module for text understanding and a network learning module for class-discriminative, scalable network learning. Each module generates pseudo training labels from the unlabeled document set, and both modules mutually enhance each other by co-training using pooled pseudo labels. We test our model on two real-world datasets. On the challenging e-commerce product categorization dataset with 683 categories, our experiments show that given only three seed documents per category, our framework can achieve an accuracy of about 92%, significantly outperforming all compared methods; our accuracy is only less than 2% away from the supervised BERT model trained on about 50K labeled documents.
    @inproceedings{zhang2021minimally,
      abbr = {TheWebConf},
      topic = {Graph Mining},
      selected = {1},
      title = {Minimally-Supervised Structure-Rich Text Categorization via Learning on Text-Rich Networks},
      author = {Zhang, Xinyang and Zhang, Chenwei and Dong, Xin Luna and Shang, Jingbo and Han, Jiawei},
      booktitle = {Proceedings of the Web Conference},
      year = {2021},
      pdf = {https://arxiv.org/pdf/2102.11479.pdf},
      code = {https://github.com/xinyangz/ltrn},
      video = {https://videolectures.net/www2021_zhang_minimally_supervised/}
    }
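The co-training step at the heart of the framework can be sketched as follows: two modules with different inductive biases each pseudo-label the unlabeled pool, and only confident, agreeing predictions are pooled to retrain both. This is a toy sketch, not the paper's system; the keyword-scoring stubs merely stand in for the text-analysis and network-learning modules.

```python
# Toy co-training pooling: a pseudo label survives only when both modules
# agree on it with confidence above a threshold.

def pool_pseudo_labels(docs, module_a, module_b, threshold=0.8):
    """Return (doc, label) pairs both modules confidently agree on."""
    pooled = []
    for doc in docs:
        label_a, conf_a = module_a(doc)
        label_b, conf_b = module_b(doc)
        if label_a == label_b and min(conf_a, conf_b) >= threshold:
            pooled.append((doc, label_a))
    return pooled

# Stub modules returning (label, confidence) from keyword evidence.
def text_module(doc):
    return ("electronics", 0.9) if "usb" in doc else ("apparel", 0.6)

def network_module(doc):
    return ("electronics", 0.85) if "cable" in doc else ("apparel", 0.9)

docs = ["usb cable 2m", "cotton shirt", "usb charger"]
pooled = pool_pseudo_labels(docs, text_module, network_module)
```

Only the first document survives pooling: the second is agreed on but with low confidence, and the modules disagree on the third.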
    
  7. EMNLP
    SelfORE: Self-supervised Relational Feature Learning for Open Relation Extraction Xuming Hu, Chenwei Zhang, Yusong Xu, Lijie Wen, and Philip S. Yu In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing 2020 [Abstract] [BibTex] [Code]
    Open relation extraction is the task of extracting open-domain relation facts from natural language sentences. Existing works either utilize heuristics or distant-supervised annotations to train a supervised classifier over pre-defined relations, or adopt unsupervised methods with additional assumptions that have less discriminative power. In this work, we propose a self-supervised framework named SelfORE, which exploits weak, self-supervised signals by leveraging a large pretrained language model for adaptive clustering on contextualized relational features, and bootstraps the self-supervised signals by improving contextualized features in relation classification. Experimental results on three datasets show the effectiveness and robustness of SelfORE on open-domain relation extraction when compared with competitive baselines.
    @inproceedings{hu2020selfore,
      abbr = {EMNLP},
      topic = {NLP},
      selected = {1},
      title = {SelfORE: Self-supervised Relational Feature Learning for Open Relation Extraction},
      author = {Hu, Xuming and Zhang, Chenwei and Xu, Yusong and Wen, Lijie and Yu, Philip S.},
      booktitle = {Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing},
      year = {2020},
      pdf = {https://arxiv.org/pdf/2004.02438.pdf},
      code = {https://github.com/THU-BPM/SelfORE}
    }
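SelfORE's self-supervision loop can be pictured with a tiny stand-in: cluster relational feature vectors, then treat the cluster assignments as pseudo relation labels for the classifier. The nearest-centroid step below is only an illustrative proxy for the paper's adaptive clustering over contextualized features; the 2-D toy features are invented.

```python
# Toy version of clustering-as-pseudo-labeling: assign each "relational
# feature" to its nearest centroid, then refit the centroids.

def assign(features, centroids):
    """Pseudo-label each feature vector with its nearest centroid's index."""
    def dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return [min(range(len(centroids)), key=lambda k: dist(f, centroids[k]))
            for f in features]

def update(features, labels, k):
    """Recompute each centroid as the mean of its pseudo-labeled cluster."""
    centroids = []
    for c in range(k):
        members = [f for f, lab in zip(features, labels) if lab == c]
        centroids.append([sum(col) / len(members) for col in zip(*members)])
    return centroids

# Toy 2-D "relational features" forming two obvious relation clusters.
feats = [[0.0, 0.1], [0.1, 0.0], [1.0, 0.9], [0.9, 1.0]]
labels = assign(feats, centroids=[[0.0, 0.0], [1.0, 1.0]])
centroids = update(feats, labels, k=2)
```

In SelfORE the pseudo labels then supervise a relation classifier whose improved features feed the next clustering round, bootstrapping the signal.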
    
  8. KDD
    AutoKnow: Self-Driving Knowledge Collection for Products of Thousands of Types Xin Luna Dong, Xiang He, Andrey Kan, Xian Li, Yan Liang, Jun Ma, Yifan Ethan Xu, Chenwei Zhang, Tong Zhao, Gabriel Blanco Saldana, and others In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining 2020 [Abstract] [BibTex] [Media]
    Can one build a knowledge graph (KG) for all products in the world? Knowledge graphs have firmly established themselves as valuable sources of information for search and question answering, and it is natural to wonder if a KG can contain information about products offered at online retail sites. There have been several successful examples of generic KGs, but organizing information about products poses many additional challenges, including the sparsity and noise of structured data for products, the complexity of a domain with millions of product types and thousands of attributes, heterogeneity across a large number of categories, and a large and constantly growing number of products. We describe AutoKnow, our automatic (self-driving) system that addresses these challenges. The system includes a suite of novel techniques for taxonomy construction, product property identification, knowledge extraction, anomaly detection, and synonym discovery. AutoKnow is (a) automatic, requiring little human intervention, (b) multi-scalable, scalable in multiple dimensions (many domains, many products, and many attributes), and (c) integrative, exploiting rich customer behavior logs. AutoKnow has been operational in collecting product knowledge for over 11K product types.
    @inproceedings{dong2020autoknow,
      abbr = {KDD},
      topic = {Knowledge Graph},
      selected = {1},
      title = {AutoKnow: Self-Driving Knowledge Collection for Products of Thousands of Types},
      author = {Dong, Xin Luna and He, Xiang and Kan, Andrey and Li, Xian and Liang, Yan and Ma, Jun and Xu, Yifan Ethan and Zhang, Chenwei and Zhao, Tong and Blanco Saldana, Gabriel and others},
      booktitle = {Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery \& Data Mining},
      pages = {2724--2734},
      year = {2020},
      pdf = {https://arxiv.org/pdf/2006.13473.pdf},
      media = {https://www.amazon.science/blog/building-product-graphs-automatically}
    }
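The suite of techniques the abstract lists can be thought of as stages in a knowledge-collection pipeline over product records. The sketch below is hypothetical and highly simplified, not Amazon's system: the stage stubs and product data are invented purely to show the chaining pattern.

```python
# Toy pipeline chaining named knowledge-collection stages over product records.

def run_pipeline(products, stages):
    """Apply each stage in order to the evolving product records."""
    for _name, stage in stages:
        products = [stage(p) for p in products]
    return products

# Stub stages: tag a product type, then extract one attribute value.
def tag_type(p):
    return {**p, "type": "coffee" if "coffee" in p["title"] else "other"}

def extract_flavor(p):
    return {**p, "flavor": "vanilla"} if "vanilla" in p["title"] else p

records = run_pipeline(
    [{"title": "vanilla coffee pods"}],
    stages=[("taxonomy", tag_type), ("extraction", extract_flavor)],
)
```

A production system would add cleaning (anomaly detection) and synonym-discovery stages after extraction, each consuming the previous stage's enriched records.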
    
  9. ACL
    Joint Slot Filling and Intent Detection via Capsule Neural Networks Chenwei Zhang, Yaliang Li, Nan Du, Wei Fan, and Philip S. Yu In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019 [BibTex] [Poster] [Code]
    @inproceedings{zhang2019joint,
      abbr = {ACL},
      topic = {NLP},
      selected = {1},
      title = {Joint Slot Filling and Intent Detection via Capsule Neural Networks},
      author = {Zhang, Chenwei and Li, Yaliang and Du, Nan and Fan, Wei and Yu, Philip S.},
      booktitle = {Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics},
      pages = {5259--5267},
      year = {2019},
      pdf = {https://arxiv.org/pdf/1812.09471.pdf},
      poster = {https://drive.google.com/file/d/1rZpP-4WY7T8AtARXde7qZd5enV53yNOL/view},
      code = {https://github.com/czhang99/Capsule-NLU}
    }
    
  10. EMNLP
    Zero-shot User Intent Detection via Capsule Neural Networks Congying Xia*, Chenwei Zhang*, Xiaohui Yan, Yi Chang, and Philip S. Yu In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing 2018 [BibTex] [Code] [Video]
    @inproceedings{xia2018zero,
      abbr = {EMNLP},
      topic = {NLP},
      selected = {1},
      title = {Zero-shot User Intent Detection via Capsule Neural Networks},
      author = {Xia*, Congying and Zhang*, Chenwei and Yan, Xiaohui and Chang, Yi and Yu, Philip S.},
      booktitle = {Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing},
      pages = {3090--3099},
      year = {2018},
      pdf = {https://arxiv.org/pdf/1809.00385.pdf},
      video = {https://vimeo.com/305945714},
      code = {https://github.com/congyingxia/ZeroShotCapsule}
    }
    


  EDUCATION

[2014/08 - 2019/05] Ph.D. in Computer Science, University of Illinois at Chicago. Advisor: Prof. Philip S. Yu

[2010/09 - 2014/05] B.Eng. in Computer Science and Technology, Southwest University, China.

   PROFESSIONAL SERVICES

Program Committee: ACL, EMNLP, NAACL, KDD, AAAI, WSDM, TheWebConf, RecSys, CIKM, AACL

Reviewer: TKDE, TKDD, VLDB, NeuroComputing, TBD, TOIS, KDD, WSDM, ICDM, PAKDD, ICWSM, ASONAM

Organizing Committee: KR2ML@NeurIPS 2020