Domain Adaptation
Most theoretical analyses of machine learning classification tasks address a setup in which the training and test data are randomly generated by the same data distribution. While this may be a good approximation of reality in some machine learning tasks, in many practical applications the assumption cannot be justified: the data-generating distribution might change over time, or there might simply not be any labeled data available from the relevant target domain to train a classifier on. The task of learning when the training data is generated differently from the data one aims to predict is referred to as Domain Adaptation (DA) learning. Domain Adaptation tasks occur in many practical situations and are frequently addressed in experimental research; however, achieving a solid theoretical understanding of them remains a challenge.
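The setup above can be illustrated with a minimal sketch: a classifier trained on labeled source-domain data is applied to a target domain whose feature distribution has shifted. All distributions, parameters, and function names here are hypothetical choices for illustration, not from any of the papers below.

```python
import random

random.seed(0)

def sample(means, n):
    """Draw n labeled points: label y is fair-coin, feature x ~ N(means[y], 1)."""
    data = []
    for _ in range(n):
        y = random.randint(0, 1)
        data.append((random.gauss(means[y], 1.0), y))
    return data

def fit_threshold(data):
    """A simple 1-D learner: threshold at the midpoint of the class means."""
    xs0 = [x for x, y in data if y == 0]
    xs1 = [x for x, y in data if y == 1]
    return (sum(xs0) / len(xs0) + sum(xs1) / len(xs1)) / 2

def accuracy(theta, data):
    return sum((x > theta) == (y == 1) for x, y in data) / len(data)

# Source domain: classes centered at 0 and 2 (hypothetical example).
source = sample({0: 0.0, 1: 2.0}, 2000)
theta = fit_threshold(source)

# Target domain: same labeling structure, but every feature shifted by +1.5,
# so the threshold learned on the source no longer sits between the classes.
target = sample({0: 1.5, 1: 3.5}, 2000)

acc_source = accuracy(theta, source)
acc_target = accuracy(theta, target)
print(acc_source, acc_target)  # accuracy drops on the shifted target domain
```

The sketch is only meant to show why the same-distribution assumption matters: the learned rule is near-optimal on the source domain yet degrades on the target, even though the underlying task is unchanged.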
Publications
- On the Hardness of Domain Adaptation (And the Utility of Unlabeled Target Samples)
  Shai Ben-David and Ruth Urner. ALT 2012
- Domain Adaptation--Can Quantity Compensate for Quality?
  Shai Ben-David, Shai Shalev-Shwartz, and Ruth Urner. ISAIM 2012
- Impossibility Theorems for Domain Adaptation
  Shai Ben-David, Tyler Lu, Teresa Luu, and David Pal. AISTATS 2010
- A theory of learning from different domains
  Shai Ben-David, John Blitzer, Koby Crammer, Alex Kulesza, Fernando Pereira, and Jennifer Wortman Vaughan. Machine Learning 79(1-2): 151-175 (2010)
- A notion of task relatedness yielding provable multiple-task learning guarantees
  Shai Ben-David and Reba Schuller Borbely. Machine Learning 73(3): 273-287 (2008)
- Data Representation Framework Addressing the Training/Test Distributions Gap
  Shai Ben-David. Book chapter in Dataset Shift in Machine Learning, J. Quiñonero-Candela, N. Lawrence, A. Schwaighofer, and M. Sugiyama (Eds.), MIT Press, 2008
- Analysis of Representations for Domain Adaptation
  Shai Ben-David, John Blitzer, Koby Crammer, and Fernando Pereira. NIPS 2006