Session on Causal Inference Analysis for Information Retrieval, INFORMS 2021

Schedule:

Hosted online on October 25, 2021, 11:00 am - 12:30 pm. For questions, please contact the session chair, Da Xu, at Daxu5180@gmail.com.

Register:

Please visit the INFORMS 2021 Website for registration information.

Recordings (YouTube links) and abstracts for the invited talks are provided below in the Speakers section.

Introduction

Information retrieval (IR) systems have made extraordinary progress over the past decade, fueled by deep learning. The success of neural networks has brought tremendous opportunities to model highly complex patterns in collected data for prediction; however, the critical transition from model prediction to final decision making in IR is far from trivial. What distinguishes IR from other domains such as computer vision and natural language processing is that it interacts directly with users and inherently involves making many complex decisions to satisfy information and user needs; merely predicting relevance or classifying content is not enough. Besides accuracy, IR systems should possess many other desirable properties, such as robustness (stability), limited potential for negative impact, long-term utility, and the satisfaction of the various parties involved. Historically, there have always been gaps between pattern prediction and decision making, and many algorithmic approaches make oversimplified assumptions about human behavior.

We are therefore hosting this timely session to unite researchers and practitioners from various backgrounds to identify emerging challenges, discover connections, and study promising solutions through the lens of causal inference. Fortunately, many wonderful scientists have devoted themselves to exploring the frontier of these cross-domain challenges, and we are very lucky to have four of them here to help us learn and discuss.

Speakers

Da Xu, Machine Learning Manager @ Walmart Labs (Session Chair)

Yuting Ye, Assistant Professor @ SUSTech (Speaker 1)

Chuanwei Ruan, Senior Machine Learning Engineer @ Instacart (Speaker 2)

Fengshi Niu, Postdoctoral Scholar @ Stanford (Speaker 3)

Zenan Wang, Research Scientist @ JD.com (Speaker 4)

  • Opening Remarks [YouTube] [from Da]
  •      How does causal inference reshape the landscape of information retrieval?

  • Topic 1: Towards Robust Off-policy Learning and Evaluation for Runtime Uncertainty [YouTube] [from Yuting]
  •      Abstract: Off-policy learning plays a pivotal role in optimizing and evaluating policies prior to online deployment. However, during real-time serving, we observe a variety of interventions and constraints that cause inconsistency between the online and offline settings, which we summarize and term runtime uncertainty. Such uncertainty cannot be learned from the logged data because of its abnormal and rare nature. To ensure a certain level of robustness, we perturb the off-policy estimators along an adversarial direction in view of the runtime uncertainty. This allows the resulting estimators to be robust not only to observed but also to unexpected runtime uncertainties. Leveraging this idea, we bring runtime-uncertainty robustness to three methods: the inverse propensity score method, the reward-model method, and the doubly robust method. We theoretically justify the robustness of our methods to runtime uncertainty, and demonstrate their effectiveness using both simulation and real-world online experiments.
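
For readers new to the three estimators named in this abstract, the following is a minimal sketch of their standard (non-robust) forms. It does not implement the adversarial perturbation described in the talk. The function names, argument layout, and toy numbers are illustrative assumptions, not taken from the speakers' work.

```python
from typing import Sequence


def ips_estimate(rewards: Sequence[float],
                 target_probs: Sequence[float],
                 logging_probs: Sequence[float]) -> float:
    """Inverse propensity score: mean of (pi(a|x) / mu(a|x)) * r over the log."""
    n = len(rewards)
    return sum(p / q * r
               for r, p, q in zip(rewards, target_probs, logging_probs)) / n


def dm_estimate(model_values: Sequence[float]) -> float:
    """Reward-model (direct) method.

    model_values[i] is the model's expected reward under the target policy
    for logged context i, i.e. sum over actions of pi(a|x_i) * r_hat(x_i, a).
    """
    return sum(model_values) / len(model_values)


def dr_estimate(rewards: Sequence[float],
                target_probs: Sequence[float],
                logging_probs: Sequence[float],
                model_values: Sequence[float],
                model_at_logged: Sequence[float]) -> float:
    """Doubly robust: model value plus an importance-weighted residual."""
    n = len(rewards)
    total = 0.0
    for r, p, q, v, m in zip(rewards, target_probs, logging_probs,
                             model_values, model_at_logged):
        total += v + (p / q) * (r - m)
    return total / n


if __name__ == "__main__":
    # Hypothetical toy log: two actions, uniform logging policy (mu = 0.5),
    # target policy plays action 1 with probability 0.8.
    rewards = [1.0, 0.0, 1.0, 1.0]
    target_probs = [0.8, 0.2, 0.8, 0.2]      # pi(logged action | context)
    logging_probs = [0.5, 0.5, 0.5, 0.5]     # mu(logged action | context)
    model_at_logged = [0.9, 0.5, 0.9, 0.5]   # r_hat at the logged action
    model_values = [0.8 * 0.9 + 0.2 * 0.5] * 4

    print("IPS:", ips_estimate(rewards, target_probs, logging_probs))
    print("DM: ", dm_estimate(model_values))
    print("DR: ", dr_estimate(rewards, target_probs, logging_probs,
                              model_values, model_at_logged))
```

The importance weights assume the logging propensities are known and nonzero for every logged action; in practice they are often estimated or clipped.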

  • Topic 2: Ranking Grocery Items with Quality Constraints using Counterfactual Modelling [YouTube] [from Chuanwei]
  •      Abstract: Observational data with a transparent intervention mechanism is usually impractical to obtain in real-world applications. However, there exists abundant feedback data from unknown interventions - any recommendation made by the system potentially changes the environment for users’ decision making. We will present a comprehensive framework for using counterfactual modelling to effectively rank grocery items under quality constraints.

  • Topic 3: Using Auction Throttling to Measure the Effect of Online Advertising [YouTube] [from Fengshi]
  •      Abstract: Causally identifying the effect of digital advertising is challenging, because experimentation is expensive, and observational data lacks random variation. This presentation aims to identify a pervasive source of naturally occurring, quasi-experimental variation in user-level ad-exposure in digital advertising campaigns. It shows how this variation can be utilized by ad-publishers to identify the causal effect of advertising campaigns.

  • Topic 4: Adaptive Experimentation with Delayed Feedback [YouTube] [from Zenan]
  •      Abstract: The classical experiment framework expects analysis to be done after the expected sample size is reached. However, peeking at the results is prevalent among practitioners running online experiments. This talk will discuss some possibilities for addressing the issues raised by continuous monitoring. First, what adjustment is needed to prevent type I error inflation? Second, how should delayed responses be handled?
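
To illustrate why peeking inflates the type I error, the following is a small simulation sketch of an A/A test (no true effect) that is checked at several interim looks. It only demonstrates the problem; it does not implement the adjustments or the delayed-feedback handling discussed in the talk. The function name, the known-unit-variance z-test, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def simulate_peeking(n_sims: int = 1000,
                     n_per_arm: int = 500,
                     n_looks: int = 10,
                     z_crit: float = 1.96,
                     seed: int = 0) -> tuple:
    """Return (final_look_rate, any_look_rate) of false positives.

    final_look_rate: reject only once, after all samples are collected.
    any_look_rate:   reject if |z| exceeds z_crit at any interim or final look.
    """
    rng = random.Random(seed)
    look_every = max(1, n_per_arm // n_looks)
    final_rejects = 0
    any_rejects = 0
    for _ in range(n_sims):
        diff_sum = 0.0
        rejected_at_some_look = False
        for i in range(1, n_per_arm + 1):
            # A/A test: both arms are N(0, 1), so the true effect is zero.
            diff_sum += rng.gauss(0.0, 1.0) - rng.gauss(0.0, 1.0)
            if i % look_every == 0:
                # The sum of i differences has variance 2 * i.
                z = diff_sum / math.sqrt(2.0 * i)
                if abs(z) > z_crit:
                    rejected_at_some_look = True
        z_final = diff_sum / math.sqrt(2.0 * n_per_arm)
        final_reject = abs(z_final) > z_crit
        final_rejects += final_reject
        any_rejects += rejected_at_some_look or final_reject
    return final_rejects / n_sims, any_rejects / n_sims


if __name__ == "__main__":
    final_rate, any_rate = simulate_peeking()
    print("False positive rate, single final look:", final_rate)
    print("False positive rate, stop at any look: ", any_rate)
```

Under these assumptions the single-look rate should sit near the nominal 5%, while the rate under repeated looks is noticeably higher; sequential testing procedures are designed to close that gap.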