1996 ~2020 年历届 AAAI 最佳论文汇总
本文汇总了从1996年~2020年AAAI所有的最佳论文，同时附上作者和论文链接以及论文介绍（点击论文题目即可跳转）, 文末有最佳论文合集的 下载链接 ~
AAAI的英文全称是 the Association for the Advance of Artificial Intelligence，即美国人工智能协会。
美国人工智能协会（American Association for Artificial Intelligence）美国人工智能协会是人工智能领域的主要学术组织之一。该协会主办的年会（AAAI, The National Conference on Artificial Intelligence）是一个主要的人工智能学术会议。
WinoGrande: An Adversarial Winograd Schema Challenge at Scaleh
作者：Keisuke Sakaguchi、Ronan Le Bras、Chandra Bhagavatula、Yejin Choi
论文介绍：维诺格拉德模式挑战赛（Winograd Schema Challenge：WSC）是一个用于常识推理的基准测试，该测试有 273 个专家编写的问题，专门应对依赖选择偏好和词语联想的统计学模型。但是近来，许多模型在该基准测试的性能已达到 90%。因此，研究者希望了解，这些模型是否真正获得了鲁棒的常识能力。
因此，研究者提出了 WINOGRANDE，一个有着 44k 个问题的大规模数据集。该数据集在规模和难度上较之前的数据集更大。该数据集的构建包括两个步骤：首先使用众包的方式设计问题，然后使用一个新的 AFLITE 算法缩减系统偏见（systematic bias），使得人类可以察觉到的词汇联想转换成机器可以检测到的嵌入联想（embedding association）。现在最好的 SOTA 模型可以达到的性能是 59.4 – 79.1%，比人脸性能水平（94%）低 15-35%（绝对值）。这种性能波动取决于训练数据量（2% 到 100%）。
此外，研究者还在 5 个相关的基准数据集上进行了测试，取得了以下结果：WSC (→ 90.1%)、DPR (→ 93.1%)、COPA(→ 90.6%)、KnowRef (→ 85.6%) 和 Winogender (→ 97.1%)。
这说明，一方面 WINOGRANDE 是一个很好的迁移学习的资源；但另一方面，这说明我们现在高估了模型的常识推理的能力。研究者希望通过这项研究能够让学界重视减少算法的偏见。
作者：Yonathan Efroni, Gal Dalal, Bruno Scherrer, Shie Mannor
有限时域前瞻策略(Finite-horizon lookahead policies)已经在强化学习中得到广泛应用，并取得了令人印象深刻的实证成果。通常，前瞻性策略是通过特定的规划方法实现的，例如蒙特卡洛树搜索(Monte Carlo Tree Search)，AlphaZero正是应用了该方法。将规划问题视为树搜索，实现上的一种合理做法是只备份叶节点上的值，而根节点上获得的信息只用于更新策略。
论文提出一种简单明了的增强方法：使用最优树路径的返回值来备份根节点的后代的值。为了实现结果，作者引入一个称为多步贪婪一致性(multiple-step greedy consistency)的概念。然后，在树搜索阶段和值估计阶段同时注入噪声的情况下，为上述增强方法的两个算法实例提供收敛速度。
Memory-Augmented Monte Carlo Tree Search
作者：Chenjun Xiao, University of Alberta；
Jincheng Mei, University of Alberta；
Martin Müller, University of Alberta
论文介绍：本文中提出记忆增强的蒙特卡洛树搜索（Memory-Augmented Monte Carlo Tree Search，M-MCTS）并对其进行了评估，提供了利用在线实时搜索的泛化能力的新方法。M-MCTS 的核心思想是将 MCTS 结合一种记忆结构，其中每一项记录包含一个特定状态的信息。通过结合相似状态的估计，这些记忆被用于生成一个近似值估计。我们在本文中表明基于记忆的值逼近在温和条件下高概率地优于原始的蒙特卡洛评估方法。我们在围棋中评估了 M-MCTS。实验结果表明 M-MCTS 在相同的模拟次数下优于原始的 MCTS。
Label-Free Supervision of Neural Networks with Physics and DomainKnowledge
作者： Russell Stewart & Stefano Ermon, Stanford University
Bidirectional Search That Is Guaranteed to Meet in the Middle
作者：Robert C. Holte, University of Alberta；
Ariel Felner, Ben-Gurion University；
Guni Sharon, Ben-Gurion University；
Nathan R. Sturtevant, University of Denver
摘要：We present MM, the first bidirectional heuristic search algo- rithm whose forward and backward searches are guaranteed to “meet in the middle”, i.e. never expand a node beyond the solution midpoint. We also present a novel framework for comparing MM, A*, and brute-force search, and identify conditions favoring each algorithm. Finally, we present ex- perimental results that support our theoretical analysis.
From Non-Negative to General Operator Cost Partitioning
作者：Florian Pommerening, University of Basel；
Malte Helmert, University of Basel；
Gabriele Röger, University of Basel；
Jendrik Seipp, University of Basel
摘要：Operator cost partitioning is a well-known technique to make admissible heuristics additive by distributing the operator costs among individual heuristics. Planning tasks are usu- ally defined with non-negative operator costs and therefore it appears natural to demand the same for the distributed costs. We argue that this requirement is not necessary and demonstrate the benefit of using general cost partitioning. We show that LP heuristics for operator-counting constraints are cost-partitioned heuristics and that the state equation heuris- tic computes a cost partitioning over atomic projections. We also introduce a new family of potential heuristics and show their relationship to general cost partitioning.
Recovering from Selection Bias in Causal and Statistical Inference
作者：Elias Bareinboim, University of California Los Angeles；
Jin Tian, Iowa State University；
Judea Pearl, University of California Los Angeles
摘要：Selection bias is caused by preferential exclusion of units from the samples and represents a major obstacle to valid causal and statistical inferences; it cannot be removed by randomized experiments and can rarely be detected in ei- ther experimental or observational studies. In this paper, we provide complete graphical and algorithmic conditions for recovering conditional probabilities from selection biased data. We also provide graphical conditions for recoverabil- ity when unbiased data is available over a subset of the vari- ables. Finally, we provide a graphical condition that gener- alizes the backdoor criterion and serves to recover causal ef- fects when the data is collected under preferential selection.
HC-Search: Learning Heuristics and Cost Functions for Structured Prediction
作者：Janardhan Rao Doppa, Oregon State University；
Alan Fern, Oregon State University；
Prasad Tadepalli, Oregon State University
摘要：Structured prediction is the problem of learning a func- tion from structured inputs to structured outputs. In- spired by the recent successes of search-based struc- tured prediction, we introduce a new framework for structured prediction called HC-Search. Given a struc- tured input, the framework uses a search procedure guided by a learned heuristic H to uncover high qual- ity candidate outputs and then uses a separate learned cost function C to select a final prediction among those outputs. We can decompose the regret of the overall ap- proach into the loss due to H not leading to high qual- ity outputs, and the loss due to C not selecting the best among the generated outputs. Guided by this decompo- sition, we minimize the overall regret in a greedy stage- wise manner by first training H to quickly uncover high quality outputs via imitation learning, and then training C to correctly rank the outputs generated via H accord- ing to their true losses. Experiments on several bench- mark domains show that our approach significantly out- performs the state-of-the-art methods.
SMILe: Shufﬂed Multiple-Instance Learning
作者：Gary Doran & Soumya Ray, Case Western Reserve University
Learning SVM Classifiers with Indefinite Kernels
作者：Suicheng Gu & Yuhong Guo, Temple University
摘要：Recently, training support vector machines with indef- inite kernels has attracted great attention in the ma- chine learning community. In this paper, we tackle this problem by formulating a joint optimization model over SVM classifications and kernel principal compo- nent analysis. We first reformulate the kernel principal component analysis as a general kernel transformation framework, and then incorporate it into the SVM clas- sification to formulate a joint optimization model. The proposed model has the advantage of making consistent kernel transformations over training and test samples. It can be used for both binary classification and multi- class classification problems. Our experimental results on both synthetic data sets and real world data sets show the proposed model can significantly outperform related approaches.
Document Summarization Based on Data Reconstruction
作者：Zhanying He, Zhejiang University；
Chun Chen, Zhejiang University；
Jiajun Bu, Zhejiang University；
Can Wang, Zhejiang University；
Lijan Zhang, Zhejiang University；
Deng Cai, Zhejiang University；
Xiaofei He, Zhejiang University
摘要：Document summarization is of great value to many real world applications, such as snippets generation for search results and news headlines generation. Tradition- ally, document summarization is implemented by ex- tracting sentences that cover the main topics of a doc- ument with a minimum redundancy. In this paper, we take a different perspective from data reconstruction and propose a novel framework named Document Summa- rization based on Data Reconstruction (DSDR). Specif- ically, our approach generates a summary which consist of those sentences that can best reconstruct the original document. To model the relationship among sentences, we introduce two objective functions: (1) linear recon- struction, which approximates the document by linear combinations of the selected sentences; (2) nonnega- tive linear reconstruction, which allows only additive, not subtractive, linear combinations. In this framework, the reconstruction error becomes a natural criterion for measuring the quality of the summary. For each objec- tive function, we develop an efficient algorithm to solve the corresponding optimization problem. Extensive ex- periments on summarization benchmark data sets DUC 2006 and DUC 2007 demonstrate the effectiveness of our proposed approach.
Dynamic Resource Allocation in Conservation Planning
作者：Daniel Golovin, California Institute of Technology；
Andreas Krause, ETH Zurich；
Beth Gardner, North Carolina State University；
Sarah J. Converse, Patuxent Wildlife Research Center；
Steve Morey, U.S. Fish and Wildlife Service
摘要：Consider the problem of protecting endangered species by selecting patches of land to be used for conservation purposes. Typically, the availability of patches changes over time, and recommendations must be made dynamically. This is a chal- lenging prototypical example of a sequential optimization problem under uncertainty in computational sustainability. Ex- isting techniques do not scale to problems of realistic size. In this paper, we develop an efficient algorithm for adaptively making recommendations for dynamic conservation planning, and prove that it obtains near-optimal performance. We further evaluate our approach on a detailed reserve design case study of conservation planning for three rare species in the Pacific Northwest of the United States.
Complexity of and Algorithms for Borda Manipulation
作者：Jessica Davies, University of Toronto；
George Katsirelos, Université Paris-Sud；
Nina Narodytska, University of New South Wales；
Toby Walsh, NICTA
摘要：：We prove that it is NP-hard for a coalition of two manipu- lators to compute how to manipulate the Borda voting rule. This resolves one of the last open problems in the computa- tional complexity of manipulating common voting rules. Be- cause of this NP-hardness, we treat computing a manipula- tion as an approximation problem where we try to minimize the number of manipulators. Based on ideas from bin pack- ing and multiprocessor scheduling, we propose two new ap- proximation methods to compute manipulations of the Borda rule. Experiments show that these methods significantly out- perform the previous best known approximation method. We are able to find optimal manipulations in almost all the ran- domly generated elections tested. Our results suggest that, whilst computing a manipulation of the Borda rule by a coali- tion is NP-hard, computational complexity may provide only a weak barrier against manipulation in practice.
How Incomplete Is Your Semantic Web Reasoner?
作者：Giorgos Stoilos, Oxford University；
Bernardo Cuenca Grau, Oxford University；
Ian Horrocks, Oxford University
摘要：Conjunctive query answering is a key reasoning service for many ontology-based applications. In order to improve scal- ability, many Semantic Web query answering systems give up completeness (i.e., they do not guarantee to return all query answers). It may be useful or even critical to the designers and users of such systems to understand how much and what kind of information is (potentially) being lost. We present a method for generating test data that can be used to provide at least partial answers to these questions, a purpose for which existing benchmarks are not well suited. In addition to devel- oping a general framework that formalises the problem, we describe practical data generation algorithms for some pop- ular ontology languages, and present some very encouraging results from our preliminary evaluation.
A Novel Transition Based Encoding Scheme for Planning as Satisfiability
作者：Ruoyun Huang, Washington University in St. Louis；
Yixin Chen, Washington University in St. Louis；
Weixiong Zhang, Washington University in St. Louis
摘要：Planning as satisfiability is a principal approach to plan- ning with many eminent advantages. The existing plan- ning as satisfiability techniques usually use encodings compiled from the STRIPS formalism. We introduce a novel SAT encoding scheme based on the SAS+ formal- ism. It exploits the structural information in the SAS+ formalism, resulting in more compact SAT instances and reducing the number of clauses by up to 50 fold. Our results show that this encoding scheme improves upon the STRIPS-based encoding, in terms of both time and memory efficiency.
How Good is Almost Perfect?
作者：Malte Helmert & Gabriele Röger, Albert-Ludwigs-Universität Freiburg
摘要：Heuristic search using algorithms such as A∗ and IDA∗ is the prevalent method for obtaining optimal sequential solu- tions for classical planning tasks. Theoretical analyses of these classical search algorithms, such as the well-known re- sults of Pohl, Gaschnig and Pearl, suggest that such heuristic search algorithms can obtain better than exponential scaling behaviour, provided that the heuristics are accurate enough. Here, we show that for a number of common planning bench- mark domains, including ones that admit optimal solution in polynomial time, general search algorithms such as A∗ must necessarily explore an exponential number of search nodes even under the optimistic assumption of almost per- fect heuristic estimators, whose heuristic error is bounded by a small additive constant.
Our results shed some light on the comparatively bad per- formance of optimal heuristic search approaches in “simple” planning domains such as GRIPPER. They suggest that in many applications, further improvements in run-time require changes to other parts of the search algorithm than the heuris- tic estimator.
Optimal False-Name-Proof Voting Rules with Costly Voting
作者：Liad Wagman & Vincent Conitzer, Duke University
One way for agents to reach a joint decision is to vote over the alternatives. In open, anonymous settings such as the Inter- net, an agent can vote more than once without being detected. A voting rule is false-name-proof if no agent ever benefits from casting additional votes. Previous work has shown that all false-name-proof voting rules are unresponsive to agents’ preferences. However, that work implicitly assumes that cast- ing additional votes is costless. In this paper, we consider what happens if there is a cost to casting additional votes. We characterize the optimal (most responsive) false-name-proof- with-costs voting rule for 2 alternatives. In sharp contrast to the costless setting, we prove that as the voting population grows larger, the probability that this rule selects the major- ity winner converges to 1. We also characterize the optimal group false-name-proof rule for 2 alternatives, which is ro- bust to coalitions of agents sharing the costs of additional votes. Unfortunately, the probability that this rule chooses the majority winner as the voting population grows larger is relatively low. We derive an analogous rule in a setting with 3 alternatives, and provide bounding results and computational approaches for settings with 4 or more alternatives.
PLOW: A Collaborative Task Learning Agent
作者：James Allen, Institute for Human and Machine Cognition；
Nathanael Chambers, Stanford University；
George Ferguson, University of Rochester；
Lucian Galescu, Institute for Human and Machine Cognition；
Hyuckchul Jung, Institute for Human and Machine Cognition；
Mary Swift, University of Rochester；
William Taysom, Institute for Human and Machine Cognition
摘要：To be effective, an agent that collaborates with humans needs to be able to learn new tasks from humans they work with. This paper describes a system that learns executable task models from a single collaborative learning session consisting of demonstration, explanation and dialogue. To accomplish this, the system integrates a range of AI tech- nologies: deep natural language understanding, knowledge representation and reasoning, dialogue systems, plan- ning/agent-based systems and machine learning. A formal evaluation shows the approach has great promise.
Thresholded Rewards: Acting Optimally in Timed, Zero-Sum Games
作者：Colin McMillen & Manuela Veloso, Carnegie Mellon University
摘要：In timed, zero-sum games, the goal is to maximize the prob- ability of winning, which is not necessarily the same as max- imizing our expected reward. We consider cumulative inter- mediate reward to be the difference between our score and our opponent’s score; the “true” reward of a win, loss, or tie is determined at the end of a game by applying a thresh- old function to the cumulative intermediate reward. We in- troduce thresholded-rewards problems to capture this depen- dency of the final reward outcome on the cumulative interme- diate reward. Thresholded-rewards problems reflect different real-world stochastic planning domains, especially zero-sum games, in which time and score need to be considered. We investigate the application of thresholded rewards to finite- horizon Markov Decision Processes (MDPs). In general, the optimal policy for a thresholded-rewards MDP will be non- stationary, depending on the number of time steps remain- ing and the cumulative intermediate reward. We introduce an efficient value iteration algorithm that solves thresholded- rewards MDPs exactly, but with running time quadratic on the number of states in the MDP and the length of the time horizon. We investigate a number of heuristic-based tech- niques that efficiently find approximate solutions for MDPs with large state spaces or long time horizons.
Model Counting: A New Strategy for Obtaining Good Bounds
作者：Carla P. Gomes, Cornell University；
Ashish Sabharwal, Cornell University；
Bart Selman, Cornell University
Model counting is the classical problem of computing the number of solutions of a given propositional formula. It vastly generalizes the NP-complete problem of propositional satisfiability, and hence is both highly useful and extremely expensive to solve in practice. We present a new approach to model counting that is based on adding a carefully cho- sen number of so-called streamlining constraints to the input formula in order to cut down the size of its solution space in a controlled manner. Each of the additional constraints is a randomly chosen XOR or parity constraint on the prob- lem variables, represented either directly or in the standard CNF form. Inspired by a related yet quite different theoretical study of the properties of XOR constraints, we provide a for- mal proof that with high probability, the number of XOR con- straints added in order to bring the formula to the boundary of being unsatisfiable determines with high precision its model count. Experimentally, we demonstrate that this approach can be used to obtain good bounds on the model counts for for- mulas that are far beyond the reach of exact counting meth- ods. In fact, we obtain the first non-trivial solution counts for very hard, highly structured combinatorial problem instances. Note that unlike other counting techniques, such as Markov Chain Monte Carlo methods, we are able to provide high- confidence guarantees on the quality of the counts obtained.
Towards an Axiom System for Default Logic
作者：Gerhard Lakemeyer, RWTH Aachen University；
Hector J. Levesque, University of Toronto
摘要：Recently, Lakemeyer and Levesque proposed a logic of only- knowing which precisely captures three forms of nonmono- tonic reasoning: Moore’s Autoepistemic Logic, Konolige’s variant based on moderately grounded expansions, and Rei- ter’s default logic. Defaults have a uniform representation under all three interpretations in the new logic. Moreover, the logic itself is monotonic, that is, nonmonotonic reasoning is cast in terms of validity in the classical sense. While Lake- meyer and Levesque gave a model-theoretic account of their logic, a proof-theoretic characterization remained open. This paper fills that gap for the propositional subset: a sound and complete axiom system in the new logic for all three varieties of default reasoning. We also present formal derivations for some examples of default reasoning. Finally we present evi- dence that it is unlikely that a complete axiom system exists in the first-order case, even when restricted to the simplest forms of default reasoning.
The Max K- Armed Bandit: A New Model of Exploration Applied to Search Heuristic Selection
作者：Vincent A. Cicirello, Drexel University；
Stephen F. Smith, Carnegie Mellon University
摘要：The multiarmed bandit is often used as an analogy for the tradeoff between exploration and exploitation in search prob- lems. The classic problem involves allocating trials to the arms of a multiarmed slot machine to maximize the expected sum of rewards. We pose a new variation of the multiarmed bandit—the Max K-Armed Bandit—in which trials must be allocated among the arms to maximize the expected best sin- gle sample reward of the series of trials. Motivation for the Max K-Armed Bandit is the allocation of restarts among a set of multistart stochastic search algorithms. We present an analysis of this Max K-Armed Bandit showing under certain assumptions that the optimal strategy allocates trials to the observed best arm at a rate increasing double exponentially relative to the other arms. This motivates an exploration strat- egy that follows a Boltzmann distribution with an exponen- tially decaying temperature parameter. We compare this ex- ploration policy to policies that allocate trials to the observed best arm at rates faster (and slower) than double exponen- tially. The results confirm, for two scheduling domains, that the double exponential increase in the rate of allocations to the observed best heuristic outperforms the other approaches.
Learning and Inferring Transportation Routines
作者：Lin Liao, University of Washington；
Dieter Fox, University of Washington；
Henry Kautz, University of Washington
摘要：This paper introduces a hierarchical Markov model that can learn and infer a user’s daily movements through the commu- nity. The model uses multiple levels of abstraction in order to bridge the gap between raw GPS sensor measurements and high level information such as a user’s mode of transporta- tion or her goal. We apply Rao-Blackwellised particle filters for efficient inference both at the low level and at the higher levels of the hierarchy. Significant locations such as goals or locations where the user frequently changes mode of trans- portation are learned from GPS data logs without requiring any manual labeling. We show how to detect abnormal be- haviors (e.g. taking a wrong bus) by concurrently tracking his activities with a trained and a prior model. Experiments show that our model is able to accurately predict the goals of a per- son and to recognize situations in which the user performs un- known activities.
On Computing All Abductive Explanations
作者：Thomas Eiter, Technische Universität Wien；
Kazuhisa Makino, Osaka University
We consider the computation of all respectively a polynomial subset of the explanations of an abductive query from a Horn theory, and pay particular attention to whether the query is a positive or negative letter, the explanation is based on lit- erals from an assumption set, and the Horn theory is rep- resented in terms of formulas or characteristic models. We derive tractability results, one of which refutes a conjecture by Selman and Levesque, as well as intractability results, and furthermore also semi-tractability results in terms of solvabil- ity in quasi-polynomial time. Our results complement previ- ous results in the literature, and elucidate the computational complexity of generating the set of explanations.
The Game of Hex: An Automatic Theorem-Proving Approach to Game Programming
作者：Vadim V. Anshelevich, Vanshel Consulting
摘要：The game of Hex is a two-player game with simple rules, a deep underlying mathematical beauty, and a strategic complexity comparable to that of Chess and Go. The massive game-tree search techniques developed mostly for Chess, and successfully used for Checkers, Othello, and a number of other games, become less useful for games with large branching factors like Go and Hex. We offer a new approach, which results in superior playing strength. This approach emphasizes deep analysis of relatively few game positions. In order to reach this goal, we develop an automatic theorem proving technique for topological analysis of Hex positions. We also discuss in detail an idea of modeling Hex positions with electrical resistor circuits. We explain how this approach is implemented in Hexy - the strongest known Hex-playing computer program, able to compete with best human players.
PROVERB: The Probabilistic Cruciverbalist
作者：Greg A. Keim, Duke University；
Noam M. Shazeer, Duke University；
Michael L. Littman, Duke University；
Sushant Agarwal, Duke University；
Catherine M. Cheves, Duke University；
Joseph Fitzgerald, Duke University；
Jason Grosland, Duke University；
Fan Jiang, Duke University；
Shannon Pollard, Duke University；
Karl Weinmeister, Duke University
Learning Evaluation Functions for Global Optimization and Boolean Satisfiability
作者：Justin A. Boyan & Andrew W. Moore, Carnegie Mellon University
Acceleration Methods for Numeric CSPs
作者：Yahia Lebbah & Olivier Lhomme, Ecole des Mines de Nantes
The Interactive Museum Tour-Guide Robot
作者：Wolfram Burgard, University of Bonn；
Armin B. Cremers, University of Bonn；
Dieter Fox, University of Bonn；
Dirk Hähnel, University of Bonn；
Gerhard Lakemeyer, Aachen University of Technology；
Dirk Schulz, University of Bonn；
Walter Steiner, University of Bonn；
Sebastian Thrun, Carnegie Mellon University
Statistical Parsing with a Context-Free Grammar and Word Statistics
作者：Eugene Charniak, Brown University
A Practical Algorithm for Finding Optimal Triangulations
作者：Krill Shoikhet & Dan Geiger, Technion
Fast Context Switching in Real-Time Propositional Reasoning
作者：P. Pandurang Nayak & Brian C. Williams, NASA Ames Research Center
Building Concept Representations from Reusable Components
作者：Peter Clark, Boeing；
Bruce Porter, University of Texas at Austin
Verification of Knowledge Bases Based on Containment Checking
作者：Alon Y. Levy, AT&T Laboratories；
Marie-Christine Rousset, Université Paris-Sud
A Novel Application of Theory Refinement to Student Modeling
作者：Paul T. Baffes, SciComp；
Raymond J. Mooney, University of Texas at Austin
Pushing the Envelope: Planning, Propositional Logic, and Stochastic Search
作者：Henry Kautz & Bart Selman, AT&T Laboratories