ZhangbuDong
  • Joined on Aug 19, 2022

ZhangbuDong commented on issue zeizei/OpenI_Learning#633

["Open-Source Leaderboard Craze" Round 5] OpenBandit dataset upload

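# Bandit Dataset

Open Bandit Dataset is a public dataset of logged bandit feedback collected in the real world. It is provided by ZOZO, Inc., the largest fashion e-commerce company in Japan, with a market capitalization of over 5 billion US dollars (as of May 2020). The company uses multi-armed bandit algorithms to recommend fashion items to users on ZOZOTOWN, its large-scale fashion e-commerce platform.

- Official site: https://research.zozo.com/data.html
- Papers with Code: https://paperswithcode.com/dataset/obp
- GitHub: https://github.com/st-tech/zr-obp

## Dataset Overview

Open Bandit Dataset was built from an A/B test of two multi-armed bandit policies on ZOZOTOWN, a large fashion e-commerce platform. It currently consists of 26 million rows in total. Each row represents a user impression with some feature values, the selected item as the action, the true propensity score, and a click indicator as the outcome. This makes the dataset especially suitable for evaluating off-policy evaluation (OPE) methods, which attempt to estimate the counterfactual performance of a hypothetical algorithm using data generated by a different algorithm.

## Fields

The fields are described in detail below. They are comma-separated in the CSV files, which are named {behavior_policy}/{campaign}.csv (behavior_policy in (bts, random), campaign in (all, men, women)).

- timestamp: timestamp of the impression.
- item_id: index of the item as an arm (the index ranges over 0-80 in the "all" campaign, 0-33 in the "men" campaign, and 0-46 in the "women" campaign).
- position: position of the recommended item (1, 2, and 3 correspond to the left, center, and right positions of the ZOZOTOWN recommendation interface).
- click: target variable indicating whether an item was clicked (1) or not (0).
- propensity_score: the probability of an item being recommended at each position.
- user feature 0-4: feature values related to the user.
- user-item affinity 0-: user-item affinity scores derived from the number of past clicks observed between each user-item pair.

## Paper and Citation

Paper: Yuta Saito, Shunsuke Aihara, Megumi Matsutani, Yusuke Narita. Large-scale Open Dataset, Pipeline, and Benchmark for Bandit Algorithms. https://arxiv.org/abs/2008.07146

When using this dataset, please cite the paper with the following BibTeX entry:

```
@article{saito2020large,
  title={Large-scale Open Dataset, Pipeline, and Benchmark for Bandit Algorithms},
  author={Saito, Yuta and Aihara, Shunsuke and Matsutani, Megumi and Narita, Yusuke},
  journal={arXiv preprint arXiv:2008.07146},
  year={2020}
}
```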
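As a rough illustration of how the click and propensity_score fields support off-policy evaluation, the sketch below computes an inverse propensity weighting (IPW) estimate using only the Python standard library. It assumes the CSV column names match the field names listed in this comment. The sample rows, the `ipw_estimate` helper, and the `uniform_two_items` evaluation policy are all made up for illustration and are not part of the dataset or of the zr-obp library.

```python
import csv
import io


def ipw_estimate(rows, eval_prob):
    """Inverse propensity weighting estimate of an evaluation policy's click rate.

    rows: iterable of dicts with "item_id", "position", "click" and
        "propensity_score" keys (column names are an assumption).
    eval_prob: hypothetical callable (item_id, position) -> probability that
        the evaluation policy recommends that item at that position.
    """
    total = 0.0
    count = 0
    for row in rows:
        item_id = int(row["item_id"])
        position = int(row["position"])
        click = float(row["click"])
        propensity = float(row["propensity_score"])
        # Reweight each logged click by how much more (or less) likely the
        # evaluation policy is to take the logged action than the logging policy.
        total += eval_prob(item_id, position) / propensity * click
        count += 1
    if count == 0:
        raise ValueError("no rows to evaluate")
    return total / count


# Made-up rows for illustration only; real files hold many more columns and rows.
SAMPLE_CSV = """timestamp,item_id,position,click,propensity_score
2019-11-24 00:00:01,0,1,1,0.5
2019-11-24 00:00:02,1,2,0,0.5
2019-11-24 00:00:03,0,3,0,0.5
2019-11-24 00:00:04,1,1,1,0.5
"""


def uniform_two_items(item_id, position):
    # Hypothetical evaluation policy: picks one of two items uniformly at random.
    return 0.5


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
estimate = ipw_estimate(rows, uniform_two_items)
print(estimate)
```

To run this on the real data, replace the in-memory sample with an opened {behavior_policy}/{campaign}.csv file. The estimate has high variance whenever propensity scores are small, which is one reason more elaborate estimators exist.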

16 hours ago

ZhangbuDong opened issue zeizei/OpenI_Learning#633

["Open-Source Leaderboard Craze" Round 5] OpenBandit dataset upload

16 hours ago

ZhangbuDong upload dataset template_with_training.zip

16 hours ago

ZhangbuDong commented on issue zeizei/OpenI_Learning#629

["Open-Source Leaderboard Craze" Round 5] Walmart Product Details dataset upload

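# Walmart Product Details

This dataset contains Walmart product information from January 2020 to March 2020, with 30K records in total.

## Contents

The dataset includes the following:

- Total records: 370322
- Domain: walmart.com
- Date range: January 1, 2020 to March 31, 2020
- Available fields: Uniq ID, Crawl Timestamp, Product Url, Product Name, Description, List Price, Sale Price, Brand, Item Number, Gtin, Package Size, Category, Postal Code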

21 hours ago

ZhangbuDong opened issue zeizei/OpenI_Learning#629

["Open-Source Leaderboard Craze" Round 5] Walmart Product Details dataset upload

21 hours ago

ZhangbuDong pushed to master at ZhangbuDong/沃尔玛产品详细信息

1 day ago

ZhangbuDong upload dataset WalmartProductDetails.zip

1 day ago

ZhangbuDong pushed to master at ZhangbuDong/HotelRec酒店推荐数据集

1 day ago

ZhangbuDong pushed to master at ZhangbuDong/HotelRec酒店推荐数据集

1 day ago

ZhangbuDong pushed to master at ZhangbuDong/Bandit数据集

1 day ago

ZhangbuDong created repository ZhangbuDong/Bandit数据集

1 day ago

ZhangbuDong commented on issue zeizei/OpenI_Learning#620

["Open-Source Leaderboard Craze" Round 5] ReDial dataset upload

# ReDial Recommendation Dialogues Dataset

ReDial (Recommendation Dialogues) is an annotated dataset of dialogues in which users recommend movies to each other. It was collected by a team of researchers working at Polytechnique Montréal, MILA – Quebec AI Institute, Microsoft Research Montréal, HEC Montreal, and Element AI.

## Dataset Overview

The dataset contains more than 10,000 conversations centered on providing movie recommendations.

- Official site: https://redialdata.github.io/website/
- Papers with Code: https://paperswithcode.com/dataset/redial

## Structure

The dataset is published in the "jsonl" format, i.e., as a text file where each line corresponds to a Dialogue given as a valid JSON document.

A Dialogue contains these fields:

- conversationId: an integer
- initiatorWorkerId: an integer identifying the worker initiating the conversation (the recommendation seeker)
- respondentWorkerId: an integer identifying the worker responding to the initiator (the recommender)
- messages: a list of Message objects
- movieMentions: a dict mapping movie IDs mentioned in this dialogue to movie names
- initiatorQuestions: a dictionary mapping movie IDs to the labels supplied by the initiator. Each label indicates whether the initiator has said they saw the movie, liked it, or suggested it.
- respondentQuestions: a dictionary mapping movie IDs to the labels supplied by the respondent. Each label indicates whether the initiator has said they saw the movie, liked it, or suggested it.

Each Message contains these fields:

- messageId: a unique ID for this message
- text: a string with the actual message. The string may contain a token starting with @ followed by an integer. This is a movie ID which can be looked up in the movieMentions field of the Dialogue object.
- timeOffset: time since start of dialogue in seconds
- senderWorkerId: the ID of the worker sending the message, either initiatorWorkerId or respondentWorkerId.

The labels in initiatorQuestions and respondentQuestions have the following meaning:

- suggested: 0 if it was mentioned by the seeker, 1 if it was a suggestion from the recommender
- seen: 0 if the seeker has not seen the movie, 1 if they have seen it, 2 if they did not say
- liked: 0 if the seeker did not like the movie, 1 if they liked it, 2 if they did not say

## Dataset Size

The dataset contains a total of 11348 dialogues, 10006 for training and model selection, and 1342 for testing.

## Citation

If you use ReDial in your research, please cite the paper using the following BibTeX entry:

```
@inproceedings{li2018conversational,
  title={Towards Deep Conversational Recommendations},
  author={Li, Raymond and Kahou, Samira Ebrahimi and Schulz, Hannes and Michalski, Vincent and Charlin, Laurent and Pal, Chris},
  booktitle={Advances in Neural Information Processing Systems 31 (NIPS 2018)},
  year={2018}
}
```
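The jsonl layout and the @-prefixed movie tokens described above can be handled with a few lines of standard-library Python. This is a minimal sketch, not official tooling: the sample line, its IDs, and the movie name are invented for illustration, and `resolve_mentions` is a hypothetical helper. It assumes movie IDs appear as string keys in movieMentions, which is how JSON object keys are parsed.

```python
import json
import re

# Matches a token such as "@111": an @ sign followed by an integer movie ID.
MENTION = re.compile(r"@(\d+)")


def resolve_mentions(dialogue):
    """Return the message texts of one Dialogue with @ID tokens replaced by movie names."""
    movies = dialogue.get("movieMentions")
    if not isinstance(movies, dict):
        # Defensive assumption: treat a missing or non-dict value as "no mentions".
        movies = {}

    def replace(match):
        # Leave the token untouched when the ID is not listed in movieMentions.
        return movies.get(match.group(1), match.group(0))

    return [MENTION.sub(replace, message["text"]) for message in dialogue["messages"]]


# One made-up Dialogue line in the documented shape (not taken from the dataset).
SAMPLE_JSONL = (
    '{"conversationId": 1, "initiatorWorkerId": 10, "respondentWorkerId": 11, '
    '"movieMentions": {"111": "Example Movie (2000)"}, '
    '"messages": [{"messageId": 1, "text": "Have you seen @111 ?", '
    '"timeOffset": 0, "senderWorkerId": 10}], '
    '"initiatorQuestions": {}, "respondentQuestions": {}}\n'
)

# Each non-empty line of a jsonl file is parsed as its own JSON document.
dialogues = [json.loads(line) for line in SAMPLE_JSONL.splitlines() if line.strip()]
resolved = resolve_mentions(dialogues[0])
print(resolved)
```

For the real files, iterate over the lines of the opened jsonl file instead of `SAMPLE_JSONL.splitlines()`.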

1 day ago

ZhangbuDong opened issue zeizei/OpenI_Learning#620

["Open-Source Leaderboard Craze" Round 5] ReDial dataset upload

1 day ago

ZhangbuDong upload dataset ReDial.zip

1 day ago

ZhangbuDong pushed to master at ZhangbuDong/ReDial推荐对话数据集

1 day ago

ZhangbuDong created CPU/GPU type debugging task test_similarity

1 day ago