Yingyi Zhang
PhD Candidate
Joint PhD program of DUT and CityU (2022 to present)
Hi! Welcome to my homepage. I’m Yingyi Zhang (张颖异), a PhD candidate in the joint program between Dalian University of Technology (DUT) and City University of Hong Kong (CityU). My doctoral research focuses on consumer behavior in e-commerce and recommender systems, and I am jointly supervised by Prof. Xianneng Li (at DUT) and Prof. Xiangyu Zhao (at CityU). My interests lie in leveraging techniques such as user behavior modeling and deep learning to address management challenges in e-commerce.
Curriculum Vitae
Yingyi Zhang, Pengyue Jia, Xianneng Li#, Xiangyu Zhao#, et al. (# corresponding author)
KDD 2025 Research Track 2025
Cloud-device collaboration leverages on-cloud Large Language Models (LLMs) for handling public user queries and on-device Small Language Models (SLMs) for processing private user data, collectively forming a powerful and privacy-preserving solution. However, existing approaches often fail to fully leverage the scalable problem-solving capabilities of on-cloud LLMs while underutilizing the advantage of on-device SLMs in accessing and processing personalized data. This leads to two interconnected issues: 1) Limited utilization of the problem-solving capabilities of on-cloud LLMs, which fail to align with personalized user-task needs, and 2) Inadequate integration of user data into on-device SLM responses, resulting in mismatches in contextual user information. In this paper, we propose a Leader-Subordinate Retrieval framework for Privacy-preserving cloud-device collaboration (LSRP), a novel solution that bridges these gaps by: 1) enhancing on-cloud LLM guidance to on-device SLM through a dynamic selection of task-specific leader strategies named as user-to-user retrieval-augmented generation (U-U-RAG), and 2) integrating the data advantages of on-device SLMs through small model feedback Direct Preference Optimization (SMFB-DPO) for aligning the on-cloud LLM with the on-device SLM. Experiments on two datasets demonstrate that LSRP consistently outperforms state-of-the-art baselines, significantly improving question-answer relevance and personalization, while preserving user privacy through efficient on-device retrieval.
Zhipeng Li, Binglin Wu, Yingyi Zhang, Xianneng Li# (# corresponding author)
WWW 2025 Competition Track 2025
The increasing complexity of e-commerce customer service (CS) scenarios, driven by rapid product evolution and user base growth, presents unique challenges for intent recognition. Unlike generic user-generated content (UGC), CS-UGC exhibits multimodal complexity (e.g., product inquiries, return requests) that traditional methods struggle to address due to (1) limited CS domain-specific knowledge, which hampers the ability of large language models (LLMs) to handle multimodal CS data, and (2) the complexity and noise in CS-UGC, which undermines the robustness of traditional approaches. In this paper, we propose Customer Service Augmented LLM Merge (CuSMer), a novel framework integrating semi-supervised learning with model merging techniques through dual pipelines: (i) pseudo-labeling → fine-tuning → LLM merging and (ii) image augmentation → fine-tuning → LLM merging. These pipelines enhance the robustness of LLMs against noisy, out-of-distribution data while improving their multimodal understanding of CS scenarios. Evaluated on Alibaba's real-world datasets, CuSMer demonstrates superior robustness in noisy environments and enhanced multimodal understanding compared to baseline LLMs. It achieved third place in the first round and first place in the final round in the WWW25 - Competition: Multimodal Dialogue System Intent Recognition Challenge, validating its scalability and effectiveness for industrial CS applications.
Yingyi Zhang, Zhipeng Li, Zhewei Zhi, Xianneng Li# (# corresponding author)
KDD 2024 Workshop Amazon KDD Cup 2024
In this paper, we investigate how to improve large language models (LLMs) on the user behavior alignment task, which is constrained by input confusion and process uncertainty. We propose a novel framework that employs input-level model cooperation and model-level parameter optimization. Specifically, in input-level model cooperation, we use small language models to provide supplementary information to the LLM from both chain-of-thought and semantic similarity perspectives. In model-level parameter optimization, we first use data selection methods to train different models and then hybridize them to obtain the best one. The proposed framework was verified in the KDD Cup 2024 and achieved rank-2 performance, with code open-sourced here.
Yudi Xiao, Yingyi Zhang, Xianneng Li# (# corresponding author)
KDD 2024 Workshop Amazon KDD Cup 2024
Users tend to rely on the numerical information presented with recommendations on a web page when judging the recommended items, a tendency that corresponds to a classic psychological concept, the anchoring effect. Learning users' psychology from explicit behaviors has been widely applied in recommender systems (RS) and performs well in capturing user preferences and guiding recommendation prediction tasks. Recent studies have empirically shown that the anchoring effect can mislead users into clicking or purchasing items that they do not actually like, which introduces bias and noise into behavior data. However, the vast majority of existing recommendation algorithms trained on behavior data ignore the anchoring bias, which results in suboptimal recommendations. In this paper, we propose a novel method named Variational Anchoring Effect Encoder (VAEE) to model the anchoring effect and mitigate the anchoring bias in recommender systems. The proposed method mainly includes two steps: 1) a User Anchoring Effect Module, which aims to reconstruct the unanchored user preferences with a Variational Autoencoder (VAE)-based deep structure, and 2) a User Anchoring Debias Module, which generates the recommendation results from the reconstructed unbiased user representations. Extensive experiments on real-world datasets demonstrate that reducing the anchoring effect improves AUC and that the proposed VAEE can be attached to most existing recommendation models. We also compare the recommendation quality when using different anchoring feature subsets, which indicates that the learned representation of the anchoring effect is authentic and effective in restoring users’ true preferences.
Zerong Lan, Yingyi Zhang, Xianneng Li# (# corresponding author)
In Proceedings of the 17th ACM Conference on Recommender Systems (RecSys ’23) 2023
Users in recommender systems exhibit multiple behaviors in multiple business scenarios on real-world e-commerce platforms. A crucial challenge in such systems is to make recommendations for each business scenario at the same time. On top of this, multiple predictions (e.g., Click Through Rate and Conversion Rate) need to be made simultaneously in order to improve platform revenue. Research on making recommendations for several business scenarios belongs to the field of Multi-Scenario Recommendation (MSR), while Multi-Task Recommendation (MTR) mainly attempts to solve the problems that arise in collaboratively executing different recommendation tasks. However, existing research has paid attention to either MSR or MTR, ignoring the integration of MSR and MTR, which faces the issue of conflict between scenarios and tasks. To address this issue, we propose a Meta-based Multi-scenario Multi-task RECommendation framework (M3REC) to serve multiple tasks in multiple business scenarios with a unified model. However, integrating MSR and MTR in a proper manner is non-trivial due to: 1) Unified representation problem: users’ and items’ representations are non-i.i.d. across different scenarios and tasks, which introduces inconsistency into recommendations. 2) Synchronous optimization problem: task distributions vary across scenarios, and a unified optimization method is needed to optimize multiple tasks in multiple scenarios. Thus, to represent users and items in a unified way, we design a Meta-Item-Embedding Generator (MIEG) and a User-Preference Transformer (UPT). The MIEG module generates initialized item embeddings from item features through meta-learning, and the UPT module transfers user preferences from other scenarios. Besides, the M3REC framework uses a specifically designed backbone network together with a task-specific aggregate gate to optimize multiple tasks in multiple business scenarios within one model. Experiments on two public datasets show that M3REC outperforms the compared state-of-the-art MSR and MTR methods.
Yingyi Zhang, Xianneng Li#, Yahe Yu, et al. (# corresponding author)
In Companion Proceedings of the ACM Web Conference 2023 (WWW’23 Companion - Industry Track) 2023
Large-scale e-commerce platforms usually contain multiple business fields, which require industrial algorithms to characterize user intents across multiple domains. Numerous efforts have been made in user multi-domain intent modeling to achieve state-of-the-art performance. However, existing methods mainly focus on domains with rich user information, so applying them to domains with sparse or rare user behavior has met with mixed success. Hence, in this paper, we propose a novel method named Meta-generator enhanced multi-Domain model (MetaDomain) to address the above issue. MetaDomain mainly includes two steps: 1) users’ multi-domain intent representation and 2) users’ multi-domain intent fusion. Specifically, in users’ multi-domain intent representation, we use the gradient information from a domain intent extractor to train the domain intent meta-generator. The domain intent extractor takes users’ sequence features as input, while the domain intent meta-generator takes users’ basic features as input and is therefore capable of generating the intent of users with sparse behavior. Afterward, in users’ multi-domain intent fusion, a domain graph is used to represent the high-order multi-domain connectivity. Extensive experiments have been carried out on Meituan, a real-world industrial platform. Both offline and rigorous online A/B tests at the billion-level data scale demonstrate the superiority of the proposed MetaDomain method over the state-of-the-art baselines. Furthermore, compared with the method using multi-domain sequence features, MetaDomain can reduce the serving latency by 20%. Currently, MetaDomain has been deployed at Meituan, one of the largest Online-to-Offline (O2O) platforms worldwide.
Yingyi Zhang, Xianneng Li#, Yanhong Guo, Xiaogang Li, Shuang Zheng (# corresponding author)
Journal of Management Sciences in China 2023
The essence of recommender systems is to model the implicit preferences in consumer behavior. Human behavior is inseparable from psychology, and there are rich internal motives behind superficial behavior. However, current studies mainly focus on behavioral data modeling and rarely involve the internal psychological activities and information processing in decision-making. Therefore, this paper studies a new idea for recommender systems by introducing the AIDMA decision model from the perspective of the consumer decision journey. This paper proposes a new deep review-based recommender system, which applies the AIDMA decision journey within a deep learning framework. Experiments show that the recommendation performance of the proposed system is significantly better than that of the state-of-the-art methods. This paper follows the big data-driven research paradigm of "model-driven + data-driven", realizing in-depth method innovation with theoretical support.
Yingyi Zhang, Xianneng Li#, Yahe Yu, et al. (# corresponding author)
DL4SR’22: Workshop on Deep Learning for Search and Recommendation, co-located with the 31st ACM International Conference on Information and Knowledge Management (CIKM) 2022
Predicting users’ conversion rate (CVR) is essential for ranking systems in industrial Online-to-Offline (O2O) applications. Numerous efforts have been made in CVR modeling to achieve state-of-the-art performance. However, existing methods mainly focus on the Business-to-Customer (B2C) scenario, so applying them to O2O has met with mixed success. This can be seen in several scenario-specific challenges. For example, O2O users in different locations generally encounter different candidates of surrounding stores. This makes users’ behavioral regularity especially prominent. Besides, O2O users’ conversion includes a two-stage cost, i.e., online order cost and offline transportation cost. This suggests that users’ location sensitivity deserves additional attention compared with conventional scenarios. Motivated by these characteristics, we propose a novel CVR prediction method for the O2O scenario, named Entire Cost enhanced Multi-task Model (ECMM): i) users’ historical behavior sequences across different locations are modeled to capture the users’ preference of behavioral regularity; ii) both online order cost and offline transportation cost are modeled to predict the users’ aggregated preference for conversion. By designing two novel attention mechanisms, i.e., convert attention and sliding window attention, ECMM can be trained end-to-end to appropriately fit O2O characteristics. Extensive experiments have been carried out on Meituan, a real-world industrial O2O platform. Both offline and rigorous online A/B tests at the billion-level data scale demonstrate the superiority of the proposed ECMM over the highly optimized state-of-the-art baselines.