Humberto Castejón.
Humberto explained that the current practice in many companies is to optimize “processes” locally, namely by increasing the immediate gains from marketing activities. For example, a company may send out a direct mailing (DM) offer to a customer with the aim of maximizing the gain from that sale, without considering whether the offer might harm that customer’s long-run relationship with the firm and thereby reduce her CLV. A useful tool for global optimization is reinforcement learning. Reinforcement learning searches for the optimal policy, that is, a rule specifying which action to take (e.g., sending out a DM) in each state (e.g., just after the customer has received another DM offer), such that the cumulative reward over time is maximized. Consequently, the “artificial” agent can evaluate the long-term effects of alternative marketing activities and choose the optimal strategy. Although reinforcement learning is not new, its applications have increased significantly owing to the availability of large datasets, greater computing power, and the growing complexity of models.
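The contrast between local and global optimization can be illustrated with a minimal sketch. It is not taken from the talk: the two states, the two actions, the reward figures, and the discount factor are all hypothetical, and the customer's response is assumed to be known and deterministic. Under those assumptions the optimal policy can be computed by value iteration, a dynamic-programming method that underlies many reinforcement learning algorithms. A real application would have to learn these quantities from customer data.

```python
# Toy example with made-up numbers: should the firm send a direct mailing (DM)?
GAMMA = 0.9  # assumed discount factor for future rewards
STATES = ("rested", "just_mailed")
ACTIONS = ("wait", "send_dm")

# Hypothetical model: (state, action) -> (immediate reward, next state).
# A DM to a customer who was just mailed still earns a little, but keeps
# her in the low-response "just_mailed" state.
MODEL = {
    ("rested", "send_dm"): (10.0, "just_mailed"),
    ("rested", "wait"): (0.0, "rested"),
    ("just_mailed", "send_dm"): (2.0, "just_mailed"),
    ("just_mailed", "wait"): (0.0, "rested"),
}


def q_value(state, action, values):
    """Immediate reward plus discounted value of the resulting state."""
    reward, next_state = MODEL[(state, action)]
    return reward + GAMMA * values[next_state]


def value_iteration(tol=1e-9):
    """Repeat the Bellman optimality update until the values stop changing."""
    values = {s: 0.0 for s in STATES}
    while True:
        new_values = {
            s: max(q_value(s, a, values) for a in ACTIONS) for s in STATES
        }
        if max(abs(new_values[s] - values[s]) for s in STATES) < tol:
            return new_values
        values = new_values


def greedy_policy(values):
    """Pick, in each state, the action with the highest long-run value."""
    return {s: max(ACTIONS, key=lambda a: q_value(s, a, values)) for s in STATES}


# Local optimization: maximize the immediate reward only.
myopic_policy = {s: max(ACTIONS, key=lambda a: MODEL[(s, a)][0]) for s in STATES}

# Global optimization: maximize the discounted cumulative reward.
optimal_values = value_iteration()
optimal_policy = greedy_policy(optimal_values)

print("myopic: ", myopic_policy)
print("optimal:", optimal_policy)
```

With these illustrative numbers, the myopic rule mails the customer in both states, whereas the long-run policy holds back right after a mailing so that the next offer reaches a more receptive customer.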