Queer in AI Social (II) Sat 18 Jul 09:00 a.m.
Interact with other attendees and sponsors in a virtual world and video chat with those near you, using the Gather.town platform. The time hopefully works for everyone from Africa to Asia.
Link to join: Queer in AI Code of Conduct.
If you join the social, please fill in our check-in survey! Thanks :-)
Workshops Sat 18 Jul 10:00 a.m.
One of the most significant and challenging open problems in Artificial Intelligence (AI) is the problem of Lifelong Learning. Lifelong Machine Learning considers systems that can continually learn many tasks (from one or more domains) over a lifetime. A lifelong learning system efficiently and effectively:
1. retains the knowledge it has learned from different tasks;
2. selectively transfers knowledge (from previously learned tasks) to facilitate the learning of new tasks;
3. ensures the effective and efficient interaction between (1) and (2).
Lifelong Learning introduces several fundamental challenges in training models that generally do not arise in a single-task batch learning setting. These include problems such as catastrophic forgetting and capacity saturation. This workshop aims to explore solutions for these problems in both supervised learning and reinforcement learning settings.
Normalizing flows are explicit likelihood models using invertible neural networks to construct flexible probability distributions of high-dimensional data. Compared to other generative models, the main advantage of normalizing flows is that they can offer exact and efficient likelihood computation and data generation. Since their recent introduction, flow-based models have seen a significant resurgence of interest in the machine learning community. As a result, powerful flow-based models have been developed, with successes in density estimation, variational inference, and generative modeling of images, audio and video.
This workshop is the 2nd iteration of the ICML 2019 workshop on Invertible Neural Networks and Normalizing Flows. While the main goal of last year’s workshop was to make flow-based models more accessible to the general machine learning community, as the field is moving forward, we believe there is now a need to consolidate recent progress and connect ideas from related fields. In light of the interpretation of latent variable models and autoregressive models as flows, this year we expand the scope of the workshop and consider likelihood-based models more broadly, including flow-based models, latent variable models and autoregressive models. We encourage researchers to use these models in conjunction to exploit their benefits at …
One proposed solution towards the goal of designing machines that can extrapolate experience across environments and tasks is the use of inductive biases. Providing algorithms with inductive biases from the start might help them learn invariances, e.g. a causal graph structure, which in turn will allow the agent to generalize across environments and tasks.
While some inductive biases are already available and correspond to common knowledge, one key requirement for learning inductive biases from data seems to be the possibility to perform and learn from interventions. This assumption is partially motivated by the accepted hypothesis in psychology about the need to experiment in order to discover causal relationships. This corresponds to a reinforcement learning environment, where the agent can discover causal factors by performing interventions and observing their effects.
We believe that one reason that has hampered progress on building intelligent agents is the limited availability of good inductive biases. Learning inductive biases from data is difficult since this corresponds to an interactive learning setting, which, compared to classical regression or classification frameworks, is far less understood; for example, even formal definitions of generalization in RL have not been developed. While Reinforcement Learning has already achieved impressive results, the sample complexity required to achieve consistently good …
Models of negative dependence and submodularity are increasingly important in machine learning. Whether selecting training data, finding an optimal experimental design, exploring in reinforcement learning and Bayesian optimization, or designing recommender systems, selecting high-quality yet diverse items has become a core challenge. This workshop aims to bring together researchers who, using theoretical or applied techniques, leverage negative dependence and submodularity in their work. Expanding upon last year's workshop, we will highlight recent developments in the rich mathematical theory of negative dependence, cover novel critical applications, and discuss the most promising directions for future research.
Training machine learning models in a centralized fashion often faces significant challenges due to regulatory and privacy concerns in real-world use cases. These include distributed training data, computational resources to create and maintain a central data repository, and regulatory guidelines (GDPR, HIPAA) that restrict sharing sensitive data. Federated learning (FL) is a new paradigm in machine learning that can mitigate these challenges by training a global model using distributed data, without the need for data sharing. The extensive application of machine learning to analyze and draw insight from real-world, distributed, and sensitive data necessitates familiarization with and adoption of this relevant and timely topic among the scientific community.
Despite the advantages of federated learning, and its successful application in certain industry-based cases, this field is still in its infancy due to new challenges that are imposed by limited visibility of the training data, potential lack of trust among participants training a single model, potential privacy inferences, and in some cases, limited or unreliable connectivity.
The goal of this workshop is to bring together researchers and practitioners interested in FL. This day-long event will facilitate interaction among students, scholars, and industry professionals from around the world to understand the topic, identify technical challenges, …
Machine learning is increasingly being applied to problems in the healthcare domain. However, there is a risk that the development of machine learning models for improving health remains focused on areas and diseases that are more economically incentivised and resourced. This presents the risk that, as research and technological entities aim to develop machine-learning-assisted consumer healthcare devices or bespoke algorithms for their populations within a certain geographical region, the challenges of healthcare in resource-constrained settings will be overlooked. The predominant research focus of machine learning for healthcare in the “economically advantaged” world means that there is a skew in our current knowledge of how machine learning can be used to improve health on a more global scale – for everyone. This workshop aims to draw attention to the ways that machine learning can be used for problems in global health, and to promote research on problems outside high-resource environments.
Deep learning has achieved great success in a variety of tasks such as recognizing objects in images, predicting the sentiment of sentences, or image/speech synthesis by training on a large amount of data. However, most existing successes focus mainly on perceptual tasks, which are also known as System I intelligence. In the real world, many complicated tasks, such as autonomous driving, public policy decision making, and multi-hop question answering, require understanding the relationship between high-level variables in the data to perform logical reasoning, which is known as System II intelligence. Integrating System I and System II intelligence lies at the core of artificial intelligence and machine learning.
Graphs are an important structure for System II intelligence, with a universal representation ability to capture the relationship between different variables and to support interpretability, causality, and transferability / inductive generalization. Traditional logic and symbolic reasoning over graphs has relied on methods and tools which are very different from deep learning models, such as the Prolog language, SMT solvers, constrained optimization and discrete algorithms. Is such a methodology separation between System I and System II intelligence necessary? How can we build a flexible, effective and efficient bridge to smoothly connect these two systems, and create higher order artificial intelligence? …
Machine learning has achieved considerable successes in recent years, but this success often relies on human experts, who construct appropriate features, design learning architectures, set their hyperparameters, and develop new learning algorithms. Driven by the demand for off-the-shelf machine learning methods from an ever-growing community, the research area of AutoML targets the progressive automation of machine learning aiming to make effective methods available to everyone. Hence, the workshop targets a broad audience ranging from core machine learning researchers in different fields of ML connected to AutoML, such as neural architecture search, hyperparameter optimization, meta-learning, and learning to learn, to domain experts aiming to apply machine learning to new types of problems.
The schedule is given in CEST (i.e., the time zone of Vienna).
Although data is considered to be the “new oil”, it is very hard to price. Raw use of data has been invaluable in several sectors such as advertising, healthcare, etc., but often in violation of people’s privacy. Labeled data has also been extremely valuable for the training of machine learning models (e.g., in the driverless car industry). This is also indicated by the growth of annotation companies such as Figure8 and Scale.AI, especially in the image space. Yet, it is not clear what the right pricing is for data workers who annotate the data or for the individuals who contribute their personal data while using digital services. In the latter case, it is very unclear how the value of the services offered compares to the private data exchanged. While the first data marketplaces have appeared, such as AWS, Narattive.io, nitrogen.ai, etc., they suffer from a lack of good pricing models. They also fail to maintain the right of the data owners to define how their own data will be used. There have been numerous suggestions for sharing data while maintaining privacy, such as training generative models that preserve original data statistics.
Language is one of the most impressive human accomplishments and is believed to be core to our ability to learn, teach, reason and interact with others. Yet, current state-of-the-art reinforcement learning agents are unable to use or understand human language at all. The ability to integrate and learn from language, in addition to rewards and demonstrations, has the potential to improve the generalization, scope and sample efficiency of agents. Furthermore, many real-world tasks, including personal assistants and general household robots, require agents to process language by design, whether to enable interaction with humans, or simply to use existing interfaces. The aim of our workshop is to advance this emerging field of research by bringing together researchers from several diverse communities to discuss recent developments in relevant research areas such as instruction following and embodied language learning, and to identify the most important challenges and promising research avenues.
This workshop aims to bring together researchers from academia and industry to discuss major challenges, outline recent advances, and highlight future directions pertaining to novel and existing large-scale real-world experiment design and active learning problems. We aim to highlight new and emerging research opportunities for the machine learning community that arise from the evolving need to make experiment design and active learning procedures theoretically and practically relevant for realistic applications.
The intended audience and participants include everyone whose research interests, activities, and applications involve experiment design, active learning, bandit/Bayesian optimization, efficient exploration, and parameter search methods and techniques. We expect the workshop to attract substantial interest from researchers working in both academia and industry. The research of our invited speakers spans both theory and applications, and represents a diverse range of domains where experiment design and active learning are of fundamental importance (including robotics & control, biology, physical sciences, crowdsourcing, citizen science, etc.).
The schedule is given in UTC (i.e., Coordinated Universal Time).
In situations where a task can be cleanly formulated and data is plentiful, modern machine learning (ML) techniques have achieved impressive (and often super-human) results. Here, “plentiful data” can mean labels from humans, access to a simulator and well designed reward function, or other forms of interaction and supervision.
On the other hand, in situations where tasks cannot be cleanly formulated and plentifully supervised, ML has not yet shown the same progress. We still seem far from flexible agents that can learn without human engineers carefully designing or collating their supervision. This is problematic in many settings where machine learning is or will be applied in real world settings, where these agents have to interact with human users and may be used in settings that go beyond any initial clean training data used during system development. A key open question is how to make machine learning effective and robust enough to operate in real world open domains.
Artificial open worlds are ideal laboratories for studying how to extend the successes of ML to build such agents.
Open worlds are characterized by:
- Large (or perhaps infinite) collections of tasks, often not specified till test time; or lack of well defined tasks …
Artificial Intelligence (AI), and Machine Learning systems in particular, often depend on the information provided by multiple agents. The best-known example is federated learning, but others include sensor data, crowdsourced human computation, and human trajectory inputs for inverse reinforcement learning. However, eliciting accurate data can be costly, either due to the effort invested in obtaining it, as in crowdsourcing, or due to the need to maintain automated systems, as in distributed sensor systems. Low-quality data not only degrades the performance of AI systems, but may also pose safety concerns. Thus, it becomes important to verify the correctness of data, to be smart in how data is aggregated, and to provide incentives that promote effort and high-quality data. During the recent workshop on Federated Learning at NeurIPS 2019, 4 of 6 panel members mentioned incentives as the most important open issue.
This workshop is proposed to understand this aspect of Machine Learning, both theoretically and empirically. We particularly encourage contributions on the following aspects:
- How to collect high quality and credible data for machine learning systems from self-interested and possibly malicious agents, considering the game-theoretical properties of the problem?
- How to evaluate the quality of data supplied by self-interested …
The ever-increasing size and accessibility of vast media libraries has created a greater demand than ever for AI-based systems that are capable of organizing, recommending, and understanding such complex data.
While this topic has received only limited attention within the core machine learning community, it has been an area of intense focus within the applied communities such as the Recommender Systems (RecSys), Music Information Retrieval (MIR), and Computer Vision communities. At the same time, these domains have surfaced nebulous problem spaces and rich datasets that are of tremendous potential value to machine learning and the AI communities at large.
This year's Machine Learning for Media Discovery (ML4MD) aims to build upon the five previous Machine Learning for Music Discovery editions at ICML, broadening the topic area from music discovery to media discovery. The added topic diversity is aimed at having a broader conversation with the machine learning community and at offering cross-pollination across the various media domains.
One of the largest areas of focus in the media discovery space is on the side of content understanding. The recommender systems community has made great advances in terms of collaborative feedback recommenders, but these approaches suffer strongly from the cold-start problem. As …
Recent years have witnessed the rising need for learning agents that can interact with humans. Such agents usually involve applications in computer vision, natural language processing, human computer interaction, and robotics. Creating and running such agents calls for interdisciplinary research in artificial intelligence, machine learning, and software engineering design, which we abstract as Human in the Loop Learning (HILL). HILL is a modern machine learning paradigm of significant practical and theoretical interest. For HILL, models and humans engage in a two-way dialog to facilitate more accurate and interpretable learning. The workshop aims to bring together researchers and practitioners working on the broad areas of human in the loop learning, ranging from interactive/active learning algorithm designs for real-world decision making systems (e.g., autonomous driving vehicles, robotic systems, etc.) to models with strong explainability and interactive system designs (e.g., data visualization, annotation systems, etc.). In particular, we aim to elicit new connections among these diverse fields, identifying theory, tools and design principles tailored to practical machine learning workflows. The target audience for the workshop includes people who are interested in using machines to solve problems by having a human be an integral part of the learning process. In this year’s …