[1a]	Reinforcement Learning 1 - Domain Representation
	571	An Object-Oriented Representation for Efficient Reinforcement Learning. Carlos Diuk, Andre Cohen, and Michael Littman

	544	Hierarchical Model-Based Reinforcement Learning: R-max + MAXQ. Nicholas Jong and Peter Stone

	682	On the Hardness of Finding Symmetries in Markov Decision Processes. Shravan Narayanamurthy and Balaraman Ravindran

	197	Efficiently Learning Linear-Linear Exponential Family Predictive Representations of State. David Wingate and Satinder Singh

[2a]	Reinforcement Learning 2 - Value Representation
	341	Online Kernel Selection for Bayesian Reinforcement Learning. Joseph Reisinger, Peter Stone, and Risto Miikkulainen

	125	A Worst-Case Comparison Between Temporal Difference and Residual Gradient with Linear Function Approximation. Lihong Li

	581	An Analysis of Linear Models, Linear Value-Function Approximation, and Feature Selection for Reinforcement Learning. Ronald Parr, Lihong Li, Gavin Taylor, Christopher Painter-Wakefield, and Michael Littman

	429	A Semi-parametric Statistical Approach to Model-free Policy Evaluation. Tsuyoshi Ueno, Motoaki Kawanabe, Takeshi Mori, Shin-Ichi Maeda, and Shin Ishii

[4a]	Reinforcement Learning 3
	259	Non-Parametric Policy Gradients: A Unified Treatment of Propositional and Relational Domains. Kristian Kersting and Kurt Driessens

	488	Space-indexed Dynamic Programming: Learning to Follow Trajectories. J. Zico Kolter, Adam Coates, Andrew Ng, Yi Gu, and Charles DuHadway

	335	Privacy-Preserving Reinforcement Learning. Jun Sakuma, Shigenobu Kobayashi, and Rebecca Wright

	645	Apprenticeship Learning Using Linear Programming. Umar Syed, Michael Bowling, and Robert Schapire

	257	Learning All Optimal Policies with Multiple Criteria. Leon Barrett and Srinivas Narayanan

[5a]	Reinforcement Learning 4 - Active Learning
	290	Active Reinforcement Learning. Arkady Epshteyn, Adam Vogel, and Gerald DeJong

	627	Knows What It Knows: A Framework For Self-Aware Learning. Lihong Li, Michael Littman, and Thomas Walsh

	487	Reinforcement Learning with Limited Reinforcement: Using Bayes Risk for Active Learning in POMDPs. Finale Doshi, Joelle Pineau, and Nicholas Roy

	490	The Many Faces of Optimism: a Unifying Approach. Istvan Szita and Andras Lorincz

	479	Transfer of Samples in Batch Reinforcement Learning. Alessandro Lazaric, Marcello Restelli, and Andrea Bonarini

[6a]	Reinforcement Learning 5
	452	Learning for Control from Multiple Demonstrations. Adam Coates, Pieter Abbeel, and Andrew Ng

	580	Reinforcement Learning in the Presence of Rare Events. Jordan Frank, Shie Mannor, and Doina Precup

	317	On-line Discovery of Temporal-Difference Networks. Takaki Makino and Toshihisa Takagi

	111	Preconditioned Temporal Difference Learning. Hengshuai Yao and Zhi-Qiang Liu

[7a]	Reinforcement Learning 6
	458	Automatic Discovery and Transfer of MAXQ Hierarchies. Neville Mehta, Soumya Ray, Prasad Tadepalli, and Thomas Dietterich

	652	An Analysis of Reinforcement Learning with Function Approximation. Francisco Melo, Sean Meyn, and Isabel Ribeiro

	519	Exploration Scavenging. John Langford, Alexander Strehl, and Jennifer Wortman

	564	Sample-Based Learning and Search with Permanent and Transient Memories. David Silver, Richard Sutton, and Martin Mueller

[8a]	Transfer Learning and Games
	412	Learning to Learn Implicit Queries from Gaze Patterns. Kai Puolamäki, Antti Ajanki, and Samuel Kaski

	520	Multi-Task Learning for HIV Therapy Screening. Steffen Bickel, Jasmina Bogojeska, Thomas Lengauer, and Tobias Scheffer

	229	Manifold Alignment using Procrustes Analysis. Chang Wang and Sridhar Mahadevan

	542	No-Regret Learning in Convex Games. Geoffrey J. Gordon, Amy Greenwald, and Casey Marks

	655	Strategy Evaluation in Extensive Games with Importance Sampling. Michael Bowling, Michael Johanson, Neil Burch, and Duane Szafron

[1b]	Kernels
	377	Tailoring Density Estimation via Reproducing Kernel Moment Matching. Le Song, Xinhua Zhang, Alex Smola, Arthur Gretton, and Bernhard Schoelkopf

	277	Nonextensive Entropic Kernels. Andre F. T. Martins, Mario A. T. Figueiredo, Pedro M. Q. Aguiar, Noah A. Smith, and Eric P. Xing

	216	Nu-Support Vector Machine as Conditional Value-at-Risk Minimization. Akiko Takeda and Masashi Sugiyama

	643	A Generalization of Haussler's Convolution Kernel - Mapping Kernel. Kilho Shin and Tetsuji Kuboyama

[2b]	Active Learning and Experimental Design
	448	Optimizing Estimated Loss Reduction for Active Sampling in Rank Learning. Pinar Donmez and Jaime Carbonell

	324	Hierarchical sampling for active learning. Sanjoy Dasgupta and Daniel Hsu

	437	Active Kernel Learning. Steven C.H. Hoi and Rong Jin

	687	Actively Learning Level-Sets of Composite Functions. Brent Bryan and Jeff Schneider

[4b]	Kernel - Including Kernel Learning
	158	Localized Multiple Kernel Learning. Mehmet Gonen and Ethem Alpaydin

	665	Composite Kernel Learning. Marie Szafranski, Yves Grandvalet, and Alain Rakotomamonjy

	531	Training SVM with Indefinite Kernels. Jianhui Chen and Jieping Ye

	449	Robust Matching and Recognition using Context-Dependent Kernels. Hichem Sahbi, Jean-Yves Audibert, Jaonary Rabarisoa, and Renaud Keriven

	641	An RKHS for Multi-View Learning and Manifold Co-Regularization. Vikas Sindhwani and David Rosenberg

[5b]	Gaussian Processes
	151	Fast Gaussian Process Methods for Point Process Intensity Estimation. John Cunningham, Krishna Shenoy, and Maneesh Sahani

	241	Gaussian Process Product Models for Nonparametric Nonstationarity. Ryan Adams and Oliver Stegle

	599	Sparse Multiscale Gaussian Process Regression. Christian Walder, Kwang In Kim, and Bernhard Schoelkopf

	371	Topologically-Constrained Latent Variable Models. Raquel Urtasun, David Fleet, Andreas Geiger, Jovan Popovic, Trevor Darrell, and Neil Lawrence

	399	Bi-Level Path Following for Cross Validated Solution of Kernel Quantile Regression. Saharon Rosset

[6b]	Boosting
	258	Random Classification Noise Defeats All Convex Potential Boosters. Philip M. Long and Rocco A. Servedio

	676	ManifoldBoost: Stagewise Function Approximation for Fully-, Semi- and Un-supervised Learning. Nicolas Loeff, David Forsyth, and Deepak Ramachandran

	331	Boosting with Incomplete Information. Gholamreza Haffari, Yang Wang, Shaojun Wang, Greg Mori, and Feng Jiao

	362	Maximum Likelihood Rule Ensembles. Wojciech Kotlowski, Krzysztof Dembczynski, and Roman Slowinski

[7b]	Online learning
	367	Rank Minimization via Online Learning. Raghu Meka, Prateek Jain, Constantine Caramanis, and Inderjit Dhillon

	322	Confidence-Weighted Linear Classification. Mark Dredze, Koby Crammer, and Fernando Pereira

	355	The Projectron: a Bounded Kernel-Based Perceptron. Francesco Orabona, Joseph Keshet, and Barbara Caputo

	511	Efficient Bandit Algorithms for Online Multiclass Prediction. Sham M. Kakade, Shai Shalev-Shwartz, and Ambuj Tewari

	242	Prediction with Expert Advice for the Brier Game. Vladimir Vovk and Fedor Zhdanov

[8b]	Kernels - Including scalability
	166	A Dual Coordinate Descent Method for Large-scale Linear SVM. Cho-Jui Hsieh, Kai-Wei Chang, Chih-Jen Lin, S. Sathiya Keerthi, and S. Sundararajan

	411	Optimized Cutting Plane Algorithm for Support Vector Machines. Vojtech Franc and Soeren Sonnenburg

	491	Fast Support Vector Machine Training and Classification on Graphics Processors. Bryan Catanzaro, Narayanan Sundaram, and Kurt Keutzer

	266	SVM Optimization: Inverse Dependence on Training Set Size. Shai Shalev-Shwartz and Nathan Srebro

	476	Improved Nystrom Low-Rank Approximation and Error Analysis. Kai Zhang, Ivor Tsang, and James Kwok

[1c]	Clustering
	628	A Rate-Distortion One-Class Model and its Applications to Clustering. Koby Crammer, Partha Pratim Talukdar, and Fernando Pereira

	196	Estimating Local Optimums in EM Algorithm over Gaussian Mixture Model. Zhenjie Zhang, Bing Tian Dai, and Anthony K.H. Tung

	236	A Decoupled Approach to Exemplar-based Unsupervised Learning. Sebastian Nowozin and Gökhan Bakir

	168	Efficient MultiClass Maximum Margin Clustering. Bin Zhao, Fei Wang, and Changshui Zhang

[2c]	Distance learning and Efficient Use
	215	Fast Solvers and Efficient Implementations for Distance Metric Learning. Kilian Weinberger and Lawrence Saul

	178	Nearest Hyperdisk Methods for High-Dimensional Classification. Hakan Cevikalp, Bill Triggs, and Robi Polikar

	400	Fast Nearest Neighbor Retrieval for Bregman Divergences. Lawrence Cayton

	340	Deep Learning via Semi-Supervised Embedding. Jason Weston, Frederic Ratle, and Ronan Collobert

[4c]	Semi-supervised Learning - Embeddings and Transduction
	611	Semi-supervised Learning of Compact Document Representations with Deep Networks. Marc'Aurelio Ranzato and Martin Szummer

	382	Large Scale Manifold Transduction. Michael Karlen, Jason Weston, Ayse Erkan, and Ronan Collobert

	296	Graph Transduction via Alternating Minimization. Jun Wang, Tony Jebara, and Shih-Fu Chang

	254	Stability of Transductive Regression Algorithms. Corinna Cortes, Mehryar Mohri, Dmitry Pechyony, and Ashish Rastogi

	383	On Multi-View Active Learning and the Combination with Semi-Supervised Learning. Wei Wang and Zhi-Hua Zhou

[5c]	Semi-supervised clustering and classification
	337	Estimating Labels from Label Proportions. Novi Quadrianto, Alex Smola, Tiberio Caetano, and Quoc Viet Le

	432	Self-taught Clustering. Wenyuan Dai, Qiang Yang, Gui-Rong Xue, and Yong Yu

	172	Spectral Clustering with Inconsistent Advice. Tom Coleman, James Saunderson, and Anthony Wirth

	145	Pairwise Constraint Propagation by Semidefinite Programming for Semi-Supervised Classification. Zhenguo Li, Jianzhuang Liu, and Xiaoou Tang

	528	The Asymptotics of Semi-Supervised Learning in Discriminative Probabilistic Models. Nataliya Sokolovska, Olivier Cappé, and François Yvon

[6c]	Discriminative vs Generative, and Energy-Based Learning
	415	Discriminative Parameter Learning for Bayesian Networks. Jiang Su, Harry Zhang, Charles X. Ling, and Stan Matwin

	588	An Asymptotic Analysis of Generative, Discriminative, and Pseudolikelihood Estimators. Percy Liang and Michael Jordan

	601	Classification using Discriminative Restricted Boltzmann Machines. Hugo Larochelle and Yoshua Bengio

	573	On the Quantitative Analysis of Deep Belief Networks. Ruslan Salakhutdinov and Iain Murray

	638	Training Restricted Boltzmann Machines using Approximations to the Likelihood Gradient. Tijmen Tieleman

[7c]	Embeddings
	163	Uncorrelated Multilinear Principal Component Analysis through Successive Variance Maximization. Haiping Lu, Konstantinos Plataniotis, and Anastasios Venetsanopoulos

	484	Expectation-Maximization for Sparse and Non-Negative PCA. Christian David Sigg and Joachim M. Buhmann

	551	ICA and ISA Using Schweizer-Wolff Measure of Dependence. Sergey Kirshner and Barnabás Póczos

	600	Bayesian Probabilistic Matrix Factorization using Markov Chain Monte Carlo. Ruslan Salakhutdinov and Andriy Mnih

[8c]	Embeddings
	270	A Least Squares Formulation for Canonical Correlation Analysis. Liang Sun, Shuiwang Ji, and Jieping Ye

	668	Closed-form Supervised Dimensionality Reduction with Generalized Linear Models. Irina Rish, Genady Grabarnik, Guillermo Cecchi, Francisco Pereira, and Geoffrey J. Gordon

	312	Grassmann Discriminant Analysis: a Unifying View on Subspace-Based Learning. Jihun Hamm and Daniel Lee

	582	Metric Embedding for Kernel Classification Rules. Bharath Sriperumbudur, Omer Lang, and Gert Lanckriet

	592	Extracting and Composing Robust Features with Denoising Autoencoders. Pascal Vincent, Hugo Larochelle, Yoshua Bengio, and Pierre-Antoine Manzagol

[1d]	Hidden Markov Models
	182	Inverting the Viterbi Algorithm: an Abstract Framework for Structure Design. Michael Schnall-Levin, Leonid Chindelevitch, and Bonnie Berger

	305	An HDP-HMM for Systems with State Persistence. Emily Fox, Erik Sudderth, Michael Jordan, and Alan Willsky

	413	Modeling Interleaved Hidden Processes. Niels Landwehr

	679	Beam Sampling for the Infinite Hidden Markov Model. Jurgen Van Gael, Yunus Saatci, Yee Whye Teh, and Zoubin Ghahramani

[2d]	Mixture models, Dirichlet processes
	460	Statistical Models for Partial Membership. Katherine Heller, Sinead Williamson, and Zoubin Ghahramani

	538	The Dynamic Hierarchical Dirichlet Process. Lu Ren, David B. Dunson, and Lawrence Carin

	554	Hierarchical Kernel Stick-Breaking Process for Multi-Task Image Analysis. Qi An, Chunping Wang, Ivo Shterev, Eric Wang, Lawrence Carin, and David B. Dunson

	502	Data Spectroscopy: Learning Mixture Models using Eigenspaces of Convolution Operators. Tao Shi, Mikhail Belkin, and Bin Yu

[4d]	Ranking and Classification with Sampling
	489	Democratic Approximation of Lexicographic Preference Models. Fusun Yaman, Thomas Walsh, Michael Littman, and Marie desJardins

	343	Unsupervised Rank Aggregation with Distance-Based Models. Alexandre Klementiev, Dan Roth, and Kevin Small

	392	Learning Dissimilarities by Ranking: From SDP to QP. Hua Ouyang and Alexander Gray

	523	Empirical Bernstein Stopping. Volodymyr Mnih, Csaba Szepesvari, and Jean-Yves Audibert

	614	Pointwise Exact Bootstrap Distributions of Cost Curves. Charles Dugas and David Gadoury

[5d]	Sequence Data
	278	A Distance Model for Rhythms. Jean-Francois Paiement, Yves Grandvalet, Samy Bengio, and Douglas Eck

	318	A Reproducing Kernel Hilbert Space Framework for Pairwise Time Series Distances. Zhengdong Lu, Todd K. Leen, Yonghong Huang, and Deniz Erdogmus

	440	Sequence Kernels for Predicting Protein Essentiality. Cyril Allauzen, Mehryar Mohri, and Ameet Talwalkar

	180	Local Likelihood Modeling of Temporal Text Streams. Guy Lebanon and Yang Zhao

	160	Causal Modelling Combining Instantaneous and Lagged Effects: an Identifiable Model Based on Non-Gaussianity. Aapo Hyvarinen, Shohei Shimizu, and Patrik Hoyer

[6d]	Ranking and IR
	167	Listwise Approach to Learning to Rank - Theory and Algorithm. Fen Xia, Tie-Yan Liu, Jue Wang, Wensheng Zhang, and Hang Li

	179	Query-Level Stability and Generalization in Learning to Rank. Yanyan Lan, Tie-Yan Liu, Tao Qin, Zhiming Ma, and Hang Li

	470	Predicting Diverse Subsets Using Structural SVMs. Yisong Yue and Thorsten Joachims

	264	Learning Diverse Rankings with Multi-Armed Bandits. Filip Radlinski, Robert Kleinberg, and Thorsten Joachims

[7d]	Topic models
	562	mStruct: A New Admixture Model for Inference of Population Structure in Light of Both Genetic Admixing and Allele Mutations. Suyash Shringarpure and Eric Xing

	419	Memory Bounded Inference in Topic Models. Ryan Gomes, Max Welling, and Pietro Perona

	667	Nonnegative Matrix Factorization via Rank-One Downdate. Michael Biggs, Ali Ghodsi, and Stephen Vavasis

	129	Dirichlet Component Analysis: Feature Extraction for Compositional Data. Hua-Yan Wang, Qiang Yang, Hong Qin, and Hongbin Zha

[8d]	NLP
	304	Learning to Sportscast: A Test of Grounded Language Acquisition. David Chen and Raymond Mooney

	391	A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning. Ronan Collobert and Jason Weston

	398	Modified MMI/MPE: a Direct Evaluation of the Margin in Speech Recognition. Georg Heigold, Thomas Deselaers, Ralf Schlueter, and Hermann Ney

	311	Fully Distributed EM for Very Large Datasets. Jason Wolfe, Aria Haghighi, and Dan Klein

	673	Structure Compilation: Trading Structure for Features. Percy Liang, Hal Daume, and Dan Klein

[1e]	Graphs
	379	Graph Kernels Between Point Clouds. Francis Bach

	396	The Skew Spectrum of Graphs. Risi Kondor and Karsten Borgwardt

	681	Message-passing for Graph-structured Linear Programs: Proximal Projections, Convergence and Rounding Schemes. Pradeep Ravikumar, Alekh Agarwal, and Martin J. Wainwright

	565	Fast Incremental Proximity Search in Large Graphs. Purnamrita Sarkar, Andrew Moore, and Amit Prakash

[2e]	Optimization
	327	Efficiently Solving Convex Relaxations for MAP Estimation. Pawan Kumar Mudigonda and Philip Torr

	461	A Quasi-Newton Approach to Nonsmooth Convex Optimization. Jin Yu, S.V.N. Vishwanathan, Simon Guenter, and Nicol Schraudolph

	497	Stopping Conditions for Exact Computation of Leave-One-Out Error in Support Vector Machines. Vojtech Franc, Pavel Laskov, and Klaus-R. Mueller

	260	On Partial Optimality in Multi-label MRFs. Pushmeet Kohli, Alexander Shekhovtsov, Carsten Rother, Vladimir Kolmogorov, and Philip Torr

[4e]	Structured output, ILP and Sparsity
	402	Accurate Max-margin Training for Structured Output Spaces. Sunita Sarawagi and Rahul Gupta

	279	Training Structural SVMs when Exact Inference is Intractable. Thomas Finley and Thorsten Joachims

	530	Discriminative Structure and Parameter Learning for Markov Logic Networks. Tuyen Huynh and Raymond Mooney

	237	Laplace Maximum Margin Markov Networks. Jun Zhu, Eric Xing, and Bo Zhang

	503	Fast Estimation of Relational Pattern Coverage through Randomization and Maximum Likelihood. Ondrej Kuzelka and Filip Zelezny

[5e]	Feature selection and sparsity
	630	Detecting Statistical Interactions with Additive Groves of Trees. Daria Sorokina, Rich Caruana, Mirek Riedewald, and Daniel Fink

	574	Sparse Bayesian Nonparametric Regression. Francois Caron and Arnaud Doucet

	390	Bolasso: Model Consistent Lasso Estimation through the Bootstrap. Francis Bach

	113	The GroupLASSO for Generalized Linear Models: Uniqueness of Solutions and Efficient Algorithms. Volker Roth and Bernd Fischer

	323	On the Chance Accuracies of Large Collections of Classifiers. Mark Palatucci and Andrew Carlson

[6e]	Compressed Sensing and Projections
	121	Autonomous Geometric Precision Error Estimation in Low-level Computer Vision Tasks. Andrés Corrada-Emmanuel and Howard Schultz

	209	Multi-Task Compressive Sensing with Dirichlet Process Priors. Yuting Qi, Dehong Liu, David Dunson, and Lawrence Carin

	459	Compressed Sensing and Bayesian Experimental Design. Matthias Seeger and Hannes Nickisch

	361	Efficient Projections onto the L1-Ball for Learning in High Dimensions. John Duchi, Shai Shalev-Shwartz, Yoram Singer, and Tushar Chandra

[7e]	Classification
	455	Bayes Optimal Classification for Decision Trees. Siegfried Nijssen

	536	Multi-Classification by Categorical Features via Clustering. Yevgeny Seldin and Naftali Tishby

	150	Cost-Sensitive Multi-class Classification from Probability Estimates. Deirdre O'Brien, Maya Gupta, and Robert Gray

	513	Polyhedral Classifier for Target Detection: A Case Study: Colorectal Cancer. Murat Dundar, Matthias Wolf, Sarang Lakare, Marcos Salganicoff, and Vikas C. Raykar

[8e]	Multiple Instance Learning and Learning with Missing Features
	130	Adaptive p-Posterior Mixture-Model Kernels for Multiple Instance Learning. Hua-Yan Wang, Qiang Yang, and Hongbin Zha

	552	Multiple Instance Ranking. Charles Bergeron, Jed Zaretzki, Curt Breneman, and Kristin Bennett

	587	Bayesian Multiple Instance Learning: Automatic Feature Selection and Inductive Transfer. Vikas Raykar, Balaji Krishnapuram, Jinbo Bi, Murat Dundar, and R. Bharat Rao

	202	Learning to Classify with Missing and Corrupted Features. Ofer Dekel and Ohad Shamir

	272	Learning from Incomplete Data with Infinite Imputations. Uwe Dick, Peter Haider, and Tobias Scheffer