Paper34. DDTCDR: Deep Dual Transfer Cross Domain Recommendation



Abstract

Cross domain recommender systems have been increasingly valuable for helping consumers identify the most satisfying items from different categories. However, previously proposed cross-domain models did not take into account bidirectional latent relations between users and items. In addition, they do not explicitly model information of user and item features, while utilizing only user ratings information for recommendations. To address these concerns, in this paper we propose a novel approach to cross-domain recommendations based on the mechanism of dual learning that transfers information between two related domains in an iterative manner until the learning process stabilizes. We develop a novel latent orthogonal mapping to extract user preferences over multiple domains while preserving relations between users across different latent spaces. Combining with autoencoder approach to extract the latent essence of feature information, we propose Deep Dual Transfer Cross Domain Recommendation (DDTCDR) model to provide recommendations in respective domains. We test the proposed method on a large dataset containing three domains of movies, book and music items and demonstrate that it consistently and significantly outperforms several state-of-the-art baselines and also classical transfer learning approaches.

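The DDTCDR (Deep Dual Transfer Cross Domain Recommendation) model proposed in this paper additionally uses information from the source domain to address the cold-start problem in the target domain.
What the paper emphasizes is that it uses not only domain-specific information but also cross-domain information. That is, it extracts the information that is common across domains and the information that is specific to each domain, and uses both in the prediction.

The point to note is that the cross-domain mappings are $X$ and $X^T$ with $XX^T = I$, i.e. the mapping is orthogonal. An orthogonal matrix has orthonormal rows, so its directions do not overlap, and it preserves inner products and distances, so users who are similar before the mapping stay similar after it.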

Limitation

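The paper has the following three limitations.

Experiments and results are reported for two domains at a time only. Extension to three or more domains (multiple-domain) is proposed as a possibility, but its performance is not measured or validated.
The focus is on improving inter-domain performance. That is, the method can be called better whether intra-domain performance rises by 3% or inter-domain performance rises by 2.0 in each of the two domains.
The method applies only to the full user overlap setting.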

Introduction

In this paper, we make the following contributions:
• We propose to apply the combination of dual transfer learning mechanism and latent embedding approach to the problem of cross domain recommendations.
• We empirically demonstrate that the proposed model outperforms the state-of-the-art approaches and improves recommendation accuracy across multiple domains and experimental settings.
• We theoretically demonstrate the convergence condition for the simplified case of our model and empirically show that the proposed model stabilizes and converges after several iterations.
• We illustrate that the proposed model can be easily extended for multiple-domain recommendation applications.

The DDTCDR model proposed in the paper thus makes the four contributions listed above.
In short, it represents domain-specific and cross-domain information as latent embeddings and exploits the orthogonality property.

Method

Notation

$u$: User Features
$i$: Item Features
$W_u$: User Feature Embeddings
$W_i$: Item Feature Embeddings
$\gamma$: Learning Rate
$r$: Estimated Ratings
$r_{within}$: Within-Domain Estimated Ratings
$r_{cross}$: Cross-Domain Estimated Ratings
$X$: Latent Orthogonal Mapping
$X^T$: Transpose of Latent Orthogonal Mapping
$RS$: Domain-Specific Recommender System
$AE$: Domain-Specific Autoencoder
$\alpha$: Hyperparameter in Hybrid Utility Function
$MLP$: Multi-layer perceptron
$MLP_{dec}$: Decoder of AutoEncoder
$MLP_{enc}$: Encoder of AutoEncoder

Architecture

[Figure: DDTCDR architecture]

Feature Embedding

First, feature embeddings are needed as input to the DNN. The paper obtains them with autoencoders, one for users and one for items. (To be checked: whether these are trained end-to-end with the rest of the model.)

Given user a and item b, the reconstruction loss for each can be written as an MSE, as below.

$$L = \|u_a - MLP_{dec}(MLP_{enc}(u_a)) \|$$

$$u_a = \{ u_{a_1}, \ldots, u_{a_m} \}, \text{ where } m = \text{number of user feature dimensions}$$

One caveat when training these autoencoders: to prevent information leakage between domains, the user and item autoencoders are trained separately for each domain.

※) Personally, I think this feature embedding step can be skipped when there are not many features.
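The reconstruction loss above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the one-layer encoder and decoder, the random (untrained) weights, the sizes `m` and `k`, and the feature vector `u_a` are all assumptions made for the example. It computes the mean squared reconstruction error and keeps one independent autoencoder per domain, so no weights are shared across domains.

```python
import random


def linear(weights, vec):
    """Multiply a weight matrix (a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def relu(vec):
    return [max(0.0, v) for v in vec]


def reconstruction_mse(u, enc_w, dec_w):
    """Mean squared error between u and MLP_dec(MLP_enc(u))."""
    latent = relu(linear(enc_w, u))   # MLP_enc: m features -> k latent dims
    u_hat = linear(dec_w, latent)     # MLP_dec: k latent dims -> m features
    return sum((a - b) ** 2 for a, b in zip(u, u_hat)) / len(u)


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
m, k = 6, 3  # hypothetical sizes: 6 user features, 3 latent dimensions

# One independent (encoder, decoder) pair per domain: nothing is shared,
# which is what keeps information from leaking between domains.
autoencoders = {
    domain: (random_matrix(k, m, rng), random_matrix(m, k, rng))
    for domain in ("A", "B")
}

u_a = [1.0, 0.0, 0.0, 1.0, 0.5, 0.2]  # made-up user feature vector
enc_A, dec_A = autoencoders["A"]
loss_A = reconstruction_mse(u_a, enc_A, dec_A)
print(f"untrained reconstruction MSE in domain A: {loss_A:.4f}")
```

Training would adjust `enc_A` and `dec_A` to drive this loss down; the latent vector produced by the encoder is what the rest of the model consumes as the user embedding.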

Latent Orthogonal Mapping

Although the actual implementation still needs to be checked, the paper defines $X$ as the Latent Orthogonal Mapping. Roughly, one can think of it as $X = MLP(MLP_{enc}(u_a))$.

With this definition, the procedure is as follows.

1. In Domain A, compute User Embedding A and Item Embedding A.
2. Map User Embedding A through $X$. The constraint is that $X$ must be an orthogonal matrix.
3. Compute the cross-domain prediction from $X$, User Embedding A, and Item Embedding A.
4. Compute the within-domain prediction from User Embedding A and Item Embedding A.
5. Ensemble the results of steps 3 and 4 to obtain the Domain A prediction.
6. Take $X^T$ and apply the same procedure in Domain B.

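The assumption here is that users who made similar purchases in Domain A will also make similar purchases in Domain B.

For example, customers with similar features in Domain A will buy similar items (collaborative filtering) -> their values remain similar after the orthogonal latent mapping ($X$) -> and similar users still have similar values after the transpose is applied.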
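The claim that similar users stay similar under an orthogonal mapping can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a 2D rotation matrix as a stand-in for $X$ (in the paper $X$ is learned), and the three user embeddings are made up. It shows that distances between users are unchanged by $X$ and that $X^T$ maps a user back to the original embedding.

```python
import math


def matvec(mat, vec):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in mat]


def transpose(mat):
    return [list(col) for col in zip(*mat)]


def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Assumption: a rotation matrix stands in for the learned orthogonal mapping X.
theta = 0.7
X = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]
X_T = transpose(X)

# Made-up user embeddings: user_1 and user_2 are similar, user_3 is not.
user_1 = [1.0, 0.2]
user_2 = [0.9, 0.3]
user_3 = [-1.0, 2.0]

mapped_1, mapped_2, mapped_3 = (matvec(X, u) for u in (user_1, user_2, user_3))

near_before = distance(user_1, user_2)
near_after = distance(mapped_1, mapped_2)
far_before = distance(user_1, user_3)
far_after = distance(mapped_1, mapped_3)

# Applying X^T after X returns the original embedding because X^T X = I.
round_trip = matvec(X_T, mapped_1)

print(f"similar pair:    {near_before:.4f} -> {near_after:.4f}")
print(f"dissimilar pair: {far_before:.4f} -> {far_after:.4f}")
print("round trip:", [round(v, 6) for v in round_trip])
```

Because the mapping preserves every pairwise distance, the neighborhood structure that collaborative filtering relies on in one domain carries over to the other latent space.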

Deep Dual Transfer Learning

Written out compactly as a formula and an algorithm, the procedure described above looks as follows.

[Figure: DDTCDR training algorithm]

$$r^{'}_A = (1-\alpha) RS_A (W_{u_A}, W_{i_A}) + \alpha RS_B (X * W_{u_A}, W_{i_A})$$

Put simply, $\alpha$ controls the relative weight of the within-domain prediction and the cross-domain prediction, and the cross-domain term applies the other domain's recommender $RS_B$ to the user embedding mapped through $X$.
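The hybrid prediction can be sketched as follows. This is an illustration under assumptions, not the paper's code: `rs_a` and `rs_b` are hypothetical stand-ins for the domain-specific recommenders (the paper uses neural networks), $X$ is a fixed permutation matrix chosen because it is trivially orthogonal, and the embeddings are made up. The Domain B function follows step 6 of the procedure by swapping the roles of the two recommenders and using $X^T$.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(mat, vec):
    return [dot(row, vec) for row in mat]


def transpose(mat):
    return [list(col) for col in zip(*mat)]


# Hypothetical stand-ins for RS_A and RS_B.
def rs_a(user_emb, item_emb):
    return dot(user_emb, item_emb)


def rs_b(user_emb, item_emb):
    return 0.5 * dot(user_emb, item_emb) + 1.0


def hybrid_rating(alpha, rs_within, rs_cross, mapping, w_user, w_item):
    """(1 - alpha) * within-domain rating + alpha * cross-domain rating."""
    r_within = rs_within(w_user, w_item)
    r_cross = rs_cross(matvec(mapping, w_user), w_item)
    return (1 - alpha) * r_within + alpha * r_cross


X = [[0.0, 1.0],
     [1.0, 0.0]]  # assumption: a permutation matrix, so X X^T = I

w_u_A, w_i_A = [1.0, 2.0], [3.0, 4.0]  # made-up Domain A embeddings
w_u_B, w_i_B = [0.5, 1.5], [2.0, 1.0]  # made-up Domain B embeddings

alpha = 0.25
r_A = hybrid_rating(alpha, rs_a, rs_b, X, w_u_A, w_i_A)
r_B = hybrid_rating(alpha, rs_b, rs_a, transpose(X), w_u_B, w_i_B)
print("r'_A =", r_A)
print("r'_B =", r_B)
```

Setting `alpha` to 0 reduces the model to the plain within-domain recommender, which makes $\alpha$ a direct measure of how much the other domain contributes.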

Appendix. Multiple Domain-Loss Function

The formula below gives the prediction when several domains are used instead of just two.

$$r^{'}_k = (1-\alpha) RS_k (W_{u_k}, W_{i_k}) + \frac{\alpha}{n-1} \sum_{j=1;j \neq k}^n RS_j (X_{jk} * W_{u_k}, W_{i_k})$$

The formula shows that as the number of domains grows, a corresponding number of orthogonal mapping matrices $X_{jk}$ has to be created and applied.
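A small sketch makes the multi-domain formula and the growth in mapping matrices concrete. Everything here is assumed for illustration: the recommenders are toy functions, every mapping is the identity matrix, and the count in `num_mappings` assumes that $X_{kj} = X_{jk}^T$, so one matrix is stored per unordered pair of domains.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(mat, vec):
    return [dot(row, vec) for row in mat]


def make_recommender(offset):
    """Hypothetical stand-in for a domain-specific recommender RS_j."""
    def recommend(user_emb, item_emb):
        return dot(user_emb, item_emb) + offset
    return recommend


def multi_domain_rating(k, alpha, recommenders, mappings, w_user, w_item):
    """Within-domain rating for domain k plus the averaged cross-domain ratings."""
    n = len(recommenders)
    r_within = recommenders[k](w_user, w_item)
    r_cross = sum(
        recommenders[j](matvec(mappings[(j, k)], w_user), w_item)
        for j in range(n)
        if j != k
    )
    return (1 - alpha) * r_within + alpha / (n - 1) * r_cross


def num_mappings(n):
    """Mappings to learn for n domains, assuming X_kj is the transpose of X_jk."""
    return n * (n - 1) // 2


identity = [[1.0, 0.0], [0.0, 1.0]]
n_domains = 3
recommenders = [make_recommender(float(j)) for j in range(n_domains)]
mappings = {
    (j, k): identity
    for j in range(n_domains)
    for k in range(n_domains)
    if j != k
}

rating = multi_domain_rating(0, 0.2, recommenders, mappings, [1.0, 2.0], [3.0, 4.0])
print("r'_0 =", round(rating, 4))
print("mappings needed:", {n: num_mappings(n) for n in (2, 3, 5, 10)})
```

With two domains a single mapping is enough, which matches the dual formula; under the transpose assumption the count is n(n-1)/2, so ten domains would already need 45 matrices.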

Appendix. Convergence Analysis.

The paper proves that the formulation above converges (according to the Introduction, for a simplified case of the model). -> To be looked at later.

Result

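Perhaps because this is early work on cross-domain recommendation, it comes with many constraints and seems hard to use in practice: (1) it requires full user overlap, and (2) the number of orthogonal mapping matrices grows quickly with the number of domains.
Training assumes that similar users share the same characteristics even across different domains -> need to check later whether the orthogonality idea can be reused.
Need to check the results as a function of alpha -> the MSE loss varies with alpha, and a very small value is used -> only a little of the other domain's information can be used (see the figure below).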

[Figure: results for different values of $\alpha$]


JeongYong Hwang