Grzegorz Głuch

EPFL, INJ 116 · Lausanne, 1015, Switzerland · grzegorz.gluch@EPFL.CH

Fifth-year PhD student at EPFL, co-advised by Rüdiger Urbanke and Michael Kapralov. Before that I studied at the University of Wrocław, working with Jerzy Marcinkowski.

Theoretical computer scientist working on machine learning, quantum computing, and (recently) their connections to physics. The common theme is studying systems resistant to adversaries.

Trying to make an impact.


How hard is it to fake entanglement? A complexity theoretic view of nonlocality and its applications to delegating quantum computation

Under Review
Abstract Is it possible to operationally distinguish every entangled state from all separable states? This is a long-standing open question in quantum information. More concretely, assuming that two non-communicating parties interact classically with a verifier, can the verifier distinguish the following two cases: (i) the parties have access to an entangled state, (ii) they have access to a separable state only (a local hidden variable model). In this work, we define a computational version of state non-locality, and show that if every entangled state exhibits such non-locality then 𝙱𝚀𝙿≠𝙿𝙿. Surprisingly, we demonstrate how this result implies that if a one-round delegation of quantum computation (DQC) exists then 𝙱𝚀𝙿≠𝙿𝙿. This gives a complexity-theoretic assumption necessary for the existence of such DQC. Our proof technique essentially builds a framework that allows one to give stronger lower bounds for DQC by proving upper bounds on the complexity of local-hidden-variable models.

Bayes Complexity of Learners vs Overfitting

Under Review
Abstract We introduce a new notion of complexity of functions and we show that it has the following properties: (i) it governs a PAC Bayes-like generalization bound, (ii) for neural networks it relates to natural notions of complexity of functions (such as the variation), and (iii) it explains the generalization gap between neural networks and linear schemes. While there are many papers describing bounds that have each such property in isolation, and even some that have two, as far as we know this is the first notion that satisfies all three of them. Moreover, in contrast to previous works, our notion naturally generalizes to neural networks with several layers. Even though computing our complexity is nontrivial in general, an upper bound is often easy to derive, even for larger numbers of layers and for functions with structure, such as periodic functions. An upper bound we derive allows us to show a separation, in the number of samples needed for good generalization, between 2- and 4-layer neural networks for periodic functions.

Breaking a Classical Barrier for Classifying Arbitrary Test Examples in the Quantum Model

AISTATS 2023 (Poster at QIP 2022)
Informal Abstract This paper presents a surprising connection between recent advances in delegation of quantum computation and adversarial robustness. Imagine an adversary A sitting between a classifier C and a source S producing samples to be classified. Can we design an interactive protocol between C and A in which A convinces C that the samples being sent for classification really came from S? Using ideas from delegation of quantum computation, we show this is possible if A receives quantum states from S and communicates with C classically. Why do we need quantum mechanics? Assume that S is uniform on {0,1}^n. If A received classical samples from S, how could A convince C that the samples he sent really came from S and were not handcrafted by A? This seems to us an almost impossible problem to solve, as we discuss in the paper.

Noether: The more things change, the more they stay the same

Under Review
Informal Abstract The field of theoretical machine learning has managed to produce some initial explanations for why neural networks perform so well in practice. Unfortunately, when reading these papers it is hard to see a common theme or a unifying message. In this paper we show how to view many of the recent advances through one lens: the lens of symmetries. We show how this concept explains many previous results and how it helps to obtain new ones. To do all of that, we study a connection to Noether's famous theorem from physics.

Exponential Separation between Two Learning Models and Adversarial Robustness

NeurIPS 2021 (spotlight)
Informal Abstract There are two sides to this paper: the technical one, and potential applications to adversarial robustness. On the technical side we show an exponential separation between two models: the standard PAC-learning model and the Equivalence-Query model, in which samples are acquired through an interaction between a teacher and a learner, with the teacher providing counterexamples to hypotheses given by the learner. It is not surprising that one needs fewer samples in the Equivalence-Query model, but proving it formally is tricky, and the reduction and the proof are quite involved. The second part, i.e. the implications for adversarial robustness, was in fact our motivation to study these two models. Our reasoning is as follows. Every adversarial attack in particular provides counterexamples to a given classifier and thus fits the form of the EQ model. The idea is then that these adversarial examples can be used to learn faster. This is not a new idea: adversarial training does exactly that. Our contribution is in PROVING that if the adversarial examples have a particular structure then they can be used to improve learning over the standard approach.

Query Complexity of Adversarial Attacks

ICML 2021
Informal Abstract How hard is it to adversarially attack a network to which the adversary has only black-box access? That, of course, depends on what the adversary knows. If he knew the learning algorithm, the training data, and the randomness used internally by the algorithm, then the model becomes white-box from his point of view. In this paper we analyze what happens if the adversary knows neither the training data nor the internal randomness. We show that the number of queries needed to reliably attack depends on the amount of "entropy" in the decision boundaries. Moreover, we prove that some learning algorithms, e.g. k-NN, are inherently more secure than others, e.g. linear classifiers.
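To give a feel for query counting in the black-box setting, here is a minimal, purely illustrative sketch: locating a decision boundary along a segment by bisection, while counting queries. The function names and the toy threshold classifier are assumptions for the example; this is not the paper's attack model or analysis.

```python
def boundary_search(classify, lo, hi, tol=1e-6):
    """Binary-search a decision boundary of a black-box classifier along
    the segment [lo, hi], where classify(lo) != classify(hi).
    Returns an approximate crossing point and the number of queries used."""
    queries = 0

    def q(x):
        nonlocal queries
        queries += 1
        return classify(x)

    label_lo, label_hi = q(lo), q(hi)
    assert label_lo != label_hi, "segment must cross a boundary"
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if q(mid) == label_lo:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2, queries

# Toy threshold classifier with unknown threshold 0.3:
# roughly log2(1/tol) queries suffice to pin down the boundary.
point, n_queries = boundary_search(lambda x: x > 0.3, 0.0, 1.0)
print(point, n_queries)
```

The point of the sketch is that a single "low-entropy" boundary is cheap to find with few queries; the paper's lower bounds concern how this cost grows with the entropy of the boundary distribution.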

Spectral Clustering Oracles in Sublinear Time

SODA 2021
Abstract Given a graph G that can be partitioned into k disjoint expanders with outer conductance upper bounded by ε ≪ 1, can we efficiently construct a small-space data structure that allows quickly classifying vertices of G according to the expander (cluster) they belong to? Formally, we would like an efficient local computation algorithm that misclassifies at most an O(ε) fraction of vertices in every expander. We refer to such a data structure as a spectral clustering oracle. Our main result is a spectral clustering oracle with query time O^*(n^(1/2+O(ε))) and preprocessing time 2^(O((1/ε) k^4 log^2(k))) n^(1/2+O(ε)) that provides misclassification error O(ε log k) per cluster for any ε ≪ 1/log k. More generally, query time can be reduced at the expense of increasing the preprocessing time appropriately (as long as the product is about n^(1+O(ε))); this in particular gives a nearly linear time spectral clustering primitive. The main technical contribution is a sublinear time oracle that provides dot product access to the spectral embedding of G by estimating distributions of short random walks from vertices in G. The distributions themselves provide a poor approximation to the spectral embedding, but we show that an appropriate linear transformation can be used to achieve high precision dot product access. We give an estimator for this linear transformation and analyze it using spectral perturbation bounds and a novel upper bound on the leverage scores of the spectral embedding matrix of a k-clusterable graph. We then show that dot product access to the spectral embedding is sufficient to design a clustering oracle. At a high level our approach amounts to hyperplane partitioning in the spectral embedding of G, but crucially operates on a nested sequence of carefully defined subspaces in the spectral embedding to achieve per-cluster recovery guarantees.
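The basic primitive the abstract mentions, estimating the distribution of short random walks from a vertex, can be sketched by plain Monte-Carlo sampling. This is an illustrative toy, not the paper's estimator; the graph, the lazy-walk convention, and the parameters `t` and `n_walks` are assumptions for the example.

```python
import random
from collections import Counter

def estimate_walk_distribution(adj, start, t=4, n_walks=2000, rng=None):
    """Monte-Carlo estimate of the t-step lazy random walk distribution
    from `start`: the kind of primitive from which sublinear dot-product
    access to the spectral embedding is built."""
    rng = random.Random(rng)
    counts = Counter()
    for _ in range(n_walks):
        v = start
        for _ in range(t):
            if rng.random() < 0.5:      # lazy step: stay put w.p. 1/2
                continue
            v = rng.choice(adj[v])      # otherwise move to a uniform neighbor
        counts[v] += 1
    return {v: c / n_walks for v, c in counts.items()}

# Two triangles joined by the single edge (2, 3): a short walk started at
# vertex 0 keeps most of its mass inside 0's cluster {0, 1, 2}.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
dist = estimate_walk_distribution(adj, 0, rng=0)
print(sum(dist.get(v, 0.0) for v in (0, 1, 2)))
```

As the abstract notes, such raw distributions are only a poor proxy for the spectral embedding; the paper's contribution is the linear transformation that turns them into high-precision dot-product access.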

Constructing a provably adversarially-robust classifier from a high accuracy one

Abstract Modern machine learning models with very high accuracy have been shown to be vulnerable to small, adversarially chosen perturbations of the input. Given black-box access to a high-accuracy classifier f, we show how to construct a new classifier g that has high accuracy and is also robust to adversarial L2-bounded perturbations. Our algorithm builds upon the framework of randomized smoothing, which has recently been shown to outperform all previous defenses against L2-bounded adversaries. Using techniques like random partitions and doubling dimension, we are able to bound the adversarial error of g in terms of the optimum error. In this paper we focus on our conceptual contribution, but we do present two examples to illustrate our framework. We argue that, under some assumptions, our bounds are optimal for these cases.
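For readers unfamiliar with the randomized smoothing framework the abstract builds on, here is a minimal sketch of the basic idea: classify by majority vote of the base classifier over Gaussian perturbations of the input. The function names and parameters (`sigma`, `n_samples`) are illustrative assumptions; this is the generic smoothing idea, not the paper's construction.

```python
import numpy as np
from collections import Counter

def smoothed_classify(base_classifier, x, sigma=0.5, n_samples=1000, rng=None):
    """Majority vote of the base classifier over Gaussian perturbations of x:
    the basic randomized-smoothing idea. Larger sigma trades accuracy for
    robustness to L2-bounded perturbations."""
    rng = np.random.default_rng(rng)
    votes = Counter(
        base_classifier(x + rng.normal(scale=sigma, size=x.shape))
        for _ in range(n_samples)
    )
    return votes.most_common(1)[0][0]

# Toy base classifier: sign of the first coordinate.
f = lambda z: int(z[0] > 0)
print(smoothed_classify(f, np.array([2.0, 0.0]), rng=0))
```

A point x far from the decision boundary (relative to sigma) gets the same vote under almost every perturbation, which is what makes the smoothed prediction certifiably stable under small L2 shifts of x.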
Pre-PhD work

The first order truth behind undecidability of regular path queries determinacy

ICDT 2019
Abstract In our paper [Głuch, Marcinkowski, Ostropolski-Nalewaja, LICS ACM, 2018] we solved an old problem stated in [Calvanese, De Giacomo, Lenzerini, Vardi, PODS ACM, 2000], showing that query determinacy is undecidable for Regular Path Queries. Here a strong generalisation of this result is shown, and -- we think -- a very unexpected one. We prove that no regularity is needed: determinacy remains undecidable even for finite unions of conjunctive path queries.

Can One Escape Red Chains? Regular Path Queries Determinacy is Undecidable

LICS 2018
Abstract For a given set of queries (which are expressions in some query language) Q = {Q1, Q2, ... Qk} and for another query Q0 we say that Q determines Q0 if -- informally speaking -- for every database D, the information contained in the views Q(D) is sufficient to compute Q0(D). Query Determinacy Problem is the problem of deciding, for given Q and Q0, whether Q determines Q0. Many versions of this problem, for different query languages, were studied in database theory. In this paper we solve a problem stated in [CGLV02] and show that Query Determinacy Problem is undecidable for the Regular Path Queries -- the paradigmatic query language of graph databases.

4/3 Rectangle Tiling lower bound

Abstract The problem we consider is the following: given an n×n array A of positive numbers, find a tiling using at most p rectangles (meaning that each array element must be covered by some rectangle and no two rectangles may overlap) that minimizes the maximum weight of any rectangle (the weight of a rectangle is the sum of the elements it covers). We prove that it is NP-hard to approximate this problem to within a factor of 4/3 (the previous best result was 5/4).
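To make the objective concrete, here is a small checker for the quantity being minimized: it verifies that a set of rectangles is a valid tiling and returns the weight of its heaviest rectangle. The coordinate convention and helper name are illustrative assumptions, not part of the paper.

```python
def max_tile_weight(A, tiles):
    """Weight of the heaviest rectangle in a tiling of the n x n array A.
    Each tile is (top, left, bottom, right), inclusive. Also checks that the
    tiles cover every cell exactly once, i.e. that they form a tiling."""
    n = len(A)
    covered = [[0] * n for _ in range(n)]
    heaviest = 0
    for (top, left, bottom, right) in tiles:
        weight = 0
        for i in range(top, bottom + 1):
            for j in range(left, right + 1):
                covered[i][j] += 1
                weight += A[i][j]
        heaviest = max(heaviest, weight)
    assert all(c == 1 for row in covered for c in row), "not a valid tiling"
    return heaviest

A = [[1, 2], [3, 4]]
print(max_tile_weight(A, [(0, 0, 0, 1), (1, 0, 1, 1)]))  # rows as tiles -> 7
print(max_tile_weight(A, [(0, 0, 1, 0), (0, 1, 1, 1)]))  # columns as tiles -> 6
```

The hardness result says that even approximating the best achievable value of this maximum, over all tilings with at most p rectangles, within a factor of 4/3 is NP-hard.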


Videos from a course on "Quantum Interactive Proofs" I taught at EPFL can be found here


École Polytechnique Fédérale de Lausanne (EPFL)

Doctor of Philosophy
Computer Science
2018 - Present

University of Wrocław

Master of Science
Computer Science
2015 - 2018

University of Wrocław

Bachelor of Science
Computer Science
2012 - 2015