Núria Agell, ESADE, Universitat Ramon Llull Av. Pedralbes, 62 E-08034, Barcelona (Spain) agell@esade.es

Juan Carlos Aguado, Automatic Control Dept. (ESAII) Technical University of Catalonia Pau Gargallo, 5 E-08028, Barcelona (Spain) jaguado@esaii.upc.es


Abstract: This paper presents some aspects of a larger collaborative project among ESADE, the UPC and a retail chain, aimed at developing a data mining tool. This software will provide marketing professionals with useful information extracted from the huge volume of data a supermarket chain can collect on its customers’ behavior. Retail companies now face a new problem: many distribution companies have invested heavily in recent years to gather large amounts of information about their customers’ transactions, yet most of them have neither the skills nor the equipment necessary to process the raw data stored in their data warehouses.

The project introduced in this paper is based on a self-learning classification technique called LAMDA, which relies on the generalizing power of fuzzy logic and the interpolation capability of hybrid logical connectives. With it, retailers may gain an important competitive advantage through a much deeper knowledge of their customer base. The special nature of LAMDA’s algorithms enables it to cope simultaneously with numerical and qualitative information. Consequently, LAMDA has a clear advantage over many other classification techniques, which must either transform qualitative inputs into binary codes or, conversely, make decisions only on categorical partitions. The classification algorithms based on the hybrid connectives are closely related to the way neural networks operate, but LAMDA’s explanation capabilities make it a much more useful tool than neural networks when it comes to analyzing the outputs obtained.
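As an illustration only, the following minimal Python sketch shows how a classifier of this kind might score one customer described by a numerical and a qualitative descriptor. The abstract gives no formulas, so everything here is an assumption: a fuzzy binomial function for numerical descriptors normalized to [0, 1], category frequencies for qualitative descriptors, and a hybrid connective that interpolates linearly between min (a t-norm) and max (a t-conorm). The names marginal_adequacy, global_adequacy, classify and alpha, the class labels, and all parameter values are hypothetical.

```python
def marginal_adequacy(value, param):
    """Degree to which one descriptor value fits one class (assumed forms)."""
    if isinstance(value, str):
        # Qualitative descriptor: param maps each category to its
        # (hypothetical) relative frequency within the class.
        return param.get(value, 0.0)
    # Numerical descriptor normalized to [0, 1]: fuzzy binomial function
    # with class parameter param in (0, 1).
    return (param ** value) * ((1.0 - param) ** (1.0 - value))


def global_adequacy(degrees, alpha):
    """Hybrid connective: linear interpolation between min and max."""
    return alpha * min(degrees) + (1.0 - alpha) * max(degrees)


def classify(customer, classes, alpha=0.7):
    """Return the class with the highest global adequacy, plus all scores."""
    scores = {}
    for name, params in classes.items():
        degrees = [marginal_adequacy(v, p) for v, p in zip(customer, params)]
        scores[name] = global_adequacy(degrees, alpha)
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical class descriptions: [visit-frequency parameter, payment-method frequencies]
classes = {
    "loyal": [0.8, {"card": 0.7, "cash": 0.3}],
    "at_risk": [0.3, {"card": 0.2, "cash": 0.8}],
}
customer = [0.9, "card"]  # normalized visit frequency, payment method

label, scores = classify(customer, classes)
print(label, scores)
```

In this sketch the qualitative descriptor is used directly, without binary encoding, and the per-descriptor degrees remain available for inspection, which is the kind of explanation capability the text refers to. The parameter alpha controls how strict the aggregation is: 1.0 reduces it to a pure min, 0.0 to a pure max.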

Some preliminary tests have been carried out to assess the viability of the project. The first study aims to help retail managers discover which customers are more likely to stop being loyal before a competitor sets up a new retail outlet in the same city. In this case, a supervised learning approach will be used to obtain a tool that can make this kind of prediction. A second study, still in the definition stage, addresses an increasingly important fear in the “new economy”: the possibility that some clients may abandon traditional stores for newly developed commercial websites. There, the unsupervised learning capability will be used to try to discover which customers could be lost.

The tests carried out in this project are based on data gathered from the customers’ cards of a Spanish grocer: Supermercats Pujol, S.A. (“Plus Fresc” chain), winner of the 1998 Global Electronic Marketing Award, www.plusfresc.es.

Keywords: Data mining, hybrid techniques, learning and classification, marketing