How did we arrive here? When did it become the norm that by entering a semi-public space such as a bar we implicitly agree to be filmed, identified and tracked? Has it in fact become the norm?

Every technology is adopted gradually. First there were automatic border gates at airports. They seemed harmless: we had to stand in a specific spot and remove our glasses, the camera came with its own special white light, and each check took seemingly forever. Combined, these “features” made it clear to us when we were and when we were not being surveilled. That gave us a sense of control. Even though we may not have had a choice about whether to engage with the technology, the illusion of control was there.

Then there was Facebook’s tagging of photos. At first Facebook merely detected faces in uploaded photos and we had to supply the names manually. Then, once face recognition was advanced enough, Facebook could tell not only where a face was, but also whose face it was. Still, we were not overly worried. Face recognition was used only when we posted photos to Facebook, only when we decided to actively engage with the technology. We could compartmentalise the use of face recognition and not worry about it the rest of the time.

It is more difficult to compartmentalise Apple’s FaceID since it is used every time we pick up our iPhones. But here we have Apple’s promise that all processing happens on the device; our faces never leave the phone. Again, this provides us with a sense of control: we decide to use the technology when we pick up the phone and at any time we can delete the stored face image and go back to using a PIN to unlock the phone. And yet, when reading that Apple used one billion images to train the AI behind FaceID, we should pause and ask ourselves, where do these images come from?

This was the past. The future is different. With the new generation of face-recognition-powered products and services the balance shifts: face recognition can now be used continuously and video streams can be analysed in real time, making compartmentalisation more difficult; and with cameras installed in public or semi-public spaces we lose the choice of whether or not to use face recognition. That choice is made for us.

But do we need to be alarmed? Is there a difference between face recognition and other modes of surveillance? Surely browser cookies have been tracking our internet browsing and our phones have been tracking our location for years now (even when we think we have disabled location tracking)? Why should we be more concerned about face recognition? There is a difference: face recognition tracks who we are rather than something we possess or something we do.

To avoid tracking when surfing the internet we can use incognito mode, connect to the internet via a VPN or even use the Tor browser. We can turn off our phones and leave them at home, or buy a second phone if required. But there is no incognito mode for our faces, we cannot leave our face at home to avoid tracking for a day, and covering our face in public leads to instant suspicion. With face recognition, tracking and surveillance are no longer opt-in or opt-out, but simply non-optional.

Right now we still have a choice. As a technology face recognition, even though far from perfect, is mature enough to be deployed in products at scale. And while it is being adopted gradually, by businesses, the police and the government, it is not yet ubiquitous. But, given the pace of technological change, soon it may well be. In China it probably already is. Which leads us to the simple question: do we want it to be? And the more complicated follow-up: what will we do next?

Then we moved on to examples where AI actually does influence our lives, but I struggled to find good ones. I wasn’t looking for examples where AI is used; there are plenty of those, such as Alexa, Siri, Cortana, Facebook’s news feed, Netflix recommendations and Google translate. I was looking for examples where AI has changed our interaction with a system beyond making the system more convenient to use. This was surprisingly difficult, and I wondered why.

I now think this is in part because our everyday understanding is based on stories, while AI works at the level of statistics. We tell each other stories, we remember stories, we make sense of the world via stories. Stories in which we ourselves participate we call experience. The most useful stories are those that are at once particular and universal: when I was twelve, my front wheel got caught in a tram rail while I was cycling, and I fell and broke my wrist. A simple story detailing a particular event. But beyond this one event there is also a universal side: tram rails can be dangerous for cyclists (apparently particularly so in Edinburgh). It is certainly a lesson I learned. The specifics of a story hook it into our memory while the universal aspects add to our understanding of the world.

With AI such stories are much more difficult to find. We can tell particular stories: an ad for cruise vacations placed next to a video of the sinking cruise ship Costa Concordia. But is the story universal? Does the story tell us why this happened and under what circumstances it will happen again? Does it only happen with Google ads or also with Facebook ads? We learn that algorithmic advertising sometimes leads to inappropriate ad placements. With access to data we could quantify how often this happens and tell a statistical story. But we rarely have this data, as it is jealously guarded and kept out of sight by the tech companies.

Our troubles with storytelling don’t end here: algorithms change, making stories outdated. For a while Google translate would translate the English sentence “She is a doctor” into Turkish as “O bir doktor”, but when translated back into English it would become “*He* is a doctor”. Thus this story served as an example of gender bias in Google’s systems. Until, in 2018, Google addressed the issue; it now provides both English translations, “He is…” and “She is…”. This does not mean that Google translate is now free of gender bias. In fact we don’t know whether it is or not. It is probably less biased than it was before, but ultimately we don’t know. The particular is gone, the fate of the universal is unknown, and our ability to comprehend it is reduced.

The book Algorithms of Oppression by Safiya Noble grapples with the same problem. In it the author uses particular stories about Google search results to make wider points about the impact of algorithmic decisions. While the universal points remain valid, the particulars of some stories had become outdated even before the book was published. Searching for “jews” no longer returns anti-semitic websites as top search results and searching for “black girls” no longer returns links to porn websites. But this does not mean Google has magically solved its problems of bias and misrepresentation. It means that we need to find new stories to represent the bias. An image search for “CEO” still returns mostly male CEOs.

When looking for stories to explain what AI does, we quickly run into structural obstacles. As Safiya Noble put it: “It’s impossible to know the specifics of what influences the design of proprietary algorithms, other than that human beings are designing them, that profit models are driving them, and that they are not up for public discussion.”

We can see the same in other areas: climate change is so difficult to comprehend because it started its life as a statistic. *On average*, the surface temperature is rising. We should *expect* more extreme weather events. No single hot day, wildfire or hurricane is due to climate change, but the statistical aggregate is. Although with every heatwave and every new heat record, climate change is now a story as well as a statistic. 38°C in London is hard to ignore.

At this point, after presenting the problem, I should suggest what to do about it. Well, I don’t know. No amount of artificial intelligence will change human nature, and so we will remain in need of stories. Stories can be powerful. Maybe we just need to find the right ones.

Everything’s got a story in it. Change the story, change the world. —Pratchett

It made me realise that the article reflects my own struggle, when it comes to the digital economy, to find words that accurately describe what is happening and at the same time allow us to understand it intuitively and emotionally. For example, writing “Amazon is willing to pay people $10 to get access to their browsing histories” is my attempt to accurately describe the transaction. Is it successful? Not quite. First, the word “access” suggests a temporary right that can be revoked, but the browsing data that Amazon collects will remain with Amazon; there is no revoking of “access”. Second, “browsing data” is an innocent-looking term that can hide a lot: we should not think of browsing data as simply a list of websites that we visited. Rather, it also includes when we visited each website, down to the second, its contents, how we navigated the site, how much time we spent there, and our physical location when we looked at it.

None of this meaning is captured by the expression “access to our browsing histories”. And even in this form the expression feels sterile, carefully crafted to convey facts, not meaning. I would much rather say, “Amazon is paying $10 for our data.” This leaves open the question, “What data?”, but at least the question is out in the open, rather than hidden under “browsing history”, a term that promises transparency without delivering it. And we have the personal pronoun, indicating that we are giving away something that belongs to us. This too is inaccurate. There is no data before the transaction. Only after we install the Amazon Assistant and allow it to look over our shoulder while we browse is the data created. Before Amazon there is only our browsing, our behaviour; the data exists only after we engage with Amazon. Furthermore, we never see the data that we are giving away. In most cases we don’t even know the extent of it. Is our location being stored? Probably. The accelerometer readings from our phone? Maybe. Finally, throughout the transaction the data is never “ours”. The behaviour is ours, but the data is not. Whether or not we find it morally right, this is the legal position. In Europe, under GDPR, the user has some rights regarding the use and accuracy of the data, but ownership remains with the company.

Maybe a better phrasing would be, “Amazon is paying $10 to surveil us browsing the internet.” Now we capture the all-encompassing nature of the information capture: not just what we do, but the very process of doing it is being observed. Still, it is not perfect, because the permanent record of our behaviour created by the surveillance remains hidden in the shadows: not denied, but also not explicitly mentioned.

Why the obsession with language? How we use language influences how we think. The term “sharing economy” conjures images of someone renting out a spare bedroom to visiting tourists, but in fact 55% of all Airbnb listings in London are for whole apartments and 46% are by hosts with multiple listings on the site. Shoshana Zuboff uses the term “surveillance capitalism” to describe the business model of making money by surveilling human behaviour and using the data to sell targeted ads. The similar terms “digital economy” or “internet economy” don’t have the same intrusive, totalitarian connotations.

Despite following the tech news and working in machine learning, I struggle to emotionally comprehend the vastness and pervasiveness of digital surveillance, and how much our lives are changing with new technologies and new economic realities. Finding the right language to talk about it, and to think about it, will be essential if we are to persevere in the struggle. I have to thank Sidney Fussell for writing the original article and helping me think through these questions.


Several parts of the report invite comments, but I will focus here on one particular aspect: the notion of ‘impactful’ mathematics. The report wants to overcome the traditional division of mathematics into ‘pure’ and ‘applied’, and so it creates a new category—impactful mathematics.

What is impactful mathematics? The report mentions several well-known examples intended to show that pure mathematics can be impactful. Graph theory is used to analyse social networks, harmonic analysis underlies much modern signal processing and number theory is the basis of modern encryption methods.

The problem with the label ‘impactful’ is that it can only be applied in retrospect. Sometimes decades pass between a mathematical discovery and its impact. Elliptic curves, for example, which are used in the encryption and signature algorithms underlying Bitcoin, make their first appearance in the work of Diophantus, and that they form an Abelian group was known at the time of Poincaré. The use of elliptic curves in cryptography was first proposed in 1985, independently by Neal Koblitz and Victor S. Miller, but only in the 2000s did their use become widespread. The situation is similar for graph theory and signal processing. Impact often takes time.

The report states that

We are often able to predict that a mathematical breakthrough will be important – but not always. G.H. Hardy, for example, famously boasted in his ‘A Mathematician’s Apology’ of the uselessness of his great love, number theory. Seventy years later, number theory lies at the heart of internet and e-commerce security, fundamental to the functioning of the world economy and of worldwide communications.

Two comments jump to mind. First, we may be able to predict the usefulness of a breakthrough once it has happened, but the research-grant-oriented landscape we all live in requires us to predict the usefulness of *future* breakthroughs. There our track record is much worse. Breakthroughs often happen serendipitously, without much planning or anticipation, and they certainly don’t come with a pre-written ‘Pathways to Impact’ statement as required by EPSRC.

Second, Hardy’s views on the usefulness of mathematics are often misrepresented. Hardy did not so much boast of the uselessness of number theory as take solace in it. Hardy was well aware that some mathematics is useful or impactful. (All following quotes are from ‘A Mathematician’s Apology’.)

Now some mathematics is certainly useful in this way; the engineers could not do their job without a fair working knowledge of mathematics, and mathematics is beginning to find applications even in physiology. —Hardy §19

But then he made the conscious decision that this was not the mathematics he himself was interested in. For Hardy the pursuit of mathematics was an aesthetic pursuit; mathematics is to be judged by its beauty and depth. Interestingly, Hardy also anticipated the notion of impactful mathematics and the fact that it differs from both pure and applied mathematics.

There is another misconception against which we must guard. It is quite natural to suppose that there is a great difference in utility between ‘pure’ and ‘applied’ mathematics. This is a delusion: there is a sharp distinction between the two kinds of mathematics, […], but it hardly affects their utility. —Hardy §22

While the Bond report gives examples of pure mathematics that has found impact, Hardy gives examples of applied mathematics that—in his time at least—had no usefulness.

I count Maxwell and Einstein, Eddington and Dirac, among ‘real’ mathematicians. The great modern achievements of applied mathematics have been in relativity and quantum mechanics, and these subjects are, at present at any rate, almost as ‘useless’ as the theory of numbers. —Hardy §25

Hardy is also aware that his views might well be swept away by the tides of time.

It is the dull and elementary parts of applied mathematics, as it is the dull and elementary parts of pure mathematics, that work for good or ill. Time may change all this. No one foresaw the applications of matrices and groups and other purely mathematical theories to modern physics, and it may be that some of the ‘highbrow’ applied mathematics will become ‘useful’ in as unexpected a way; but the evidence so far points to the conclusion that, in one subject as in the other, it is what is commonplace and dull that counts for practical life. —Hardy §25

Hardy certainly did not boast about the ‘uselessness’ of number theory. In fact he wrote the exact opposite.

But here I must deal with a misconception. It is sometimes suggested that pure mathematicians glory in the uselessness of their work, and make it a boast that it has no practical applications. […] If the theory of numbers could be employed for any practical and obviously honourable purpose, if it could be turned directly to the furtherance of human happiness or the relief of human suffering, as physiology and even chemistry can, then surely neither Gauss nor any other mathematician would have been so foolish as to decry or regret such applications. —Hardy §21

And now we come to the difficult part: one can apply mathematics for good as well as for evil. Rockets that brought man to the moon also enable man to deliver a nuclear warhead anywhere in the world. The technology that enables Facebook to automatically tag people in photos also enables police to automatically identify people on CCTV. And so Hardy continues:

But science works for evil as well as for good (and particularly, of course, in time of war); and both Gauss and lesser mathematicians may be justified in rejoicing that there is one science at any rate, and that their own, whose very remoteness from ordinary human activities should keep it gentle and clean. —Hardy §21

Today mathematics has found many applications, and with the rise of artificial intelligence and machine learning there will certainly be many more. We are living in a time in which mathematics can be used for both good and evil in our everyday lives. Cathy O’Neil recently wrote a book, ‘Weapons of Math Destruction’, highlighting the potential of mathematics to cause harm if employed without care and reflection. Mathematics has certainly lost the innocence and harmlessness it still enjoyed in Hardy’s time.


I learned two things during this day: First, every department is struggling to adapt to students who are less prepared for a mathematics degree than the department is used to teaching. Second, the introduction of subject-level TEF evaluations in 2019/20 is going to be a really big deal.

The issue of adaptation is an interesting one, and it is not specific to one class of universities. Universities that lowered entry requirements in recent years from ABB to BBB have to rethink what they are teaching and how they are teaching it. Sometimes student engagement is a problem, sometimes material has to be moved from year 1 to year 2. But Russell Group universities, too, are finding that teaching methods and assessment structures that worked in the past have become less effective.

I came away from the day with the feeling that, to some extent, everyone is struggling. As lecturers and professors we all know mathematics and we all want to impart this knowledge to the next generation. But certainty seems to be draining away. People are becoming unsure what to teach, whom to teach it to, how to teach it and what the purpose of the teaching is. Few students who study mathematics will become mathematicians; for many a mathematics degree is a prerequisite to getting a job. Independent of admission standards, there is pressure in every department to keep dropout rates low. And then there are NSS scores, which often enough come not only to measure but to define the quality of teaching.

This, ultimately, is the environment in which teaching happens and in which decisions are made about what to teach and how to teach it. It is also an environment that is alien to mathematics itself. And so uncertainty creeps in. If students are not learning mathematics to become mathematicians, what should we teach them? Is the $\epsilon$-$\delta$ notion of convergence really necessary for a job in X? What about Galois theory? Galois theory may have provided the proof that the general quintic cannot be solved by radicals and become the foundation of modern algebra, but is it not too difficult for an undergraduate? If a department is measured by its dropout rate and its NSS scores, maybe we can ease the students’ workload a bit; sacrifice a bit of rigour to gain a bit of happiness?

How should we teach mathematics? There were lively debates at the Education Day on this subject. Are lectures a thing of the past? Should we abandon lectures for more active modes of learning? The paper by Freeman et al. was quoted several times. It is a meta-analysis of studies comparing traditional lectures with “active learning” methods, and it comes out strongly in favour of active learning. The department in Edinburgh uses the flipped-classroom methodology for all first-year and most second-year teaching. Other departments have not gone this far, but have made steps in the same direction. The question should not be: lecture or active learning? The question should be: where is the right balance between lecturing and active learning? Maybe not the flipped classroom, but just a tilted one—which is, incidentally, the title of a brilliant article by Lara Alcock.


Why should this be a problem? I will talk about mathematics, because this is the field I know best, and I can only guess how much of the following applies to other fields. First, there are essential activities that are not captured by markers of esteem. The most important is the reviewing of papers. Ideally, a paper that is submitted to a journal is reviewed by one or two other mathematicians, who read the paper in detail and check that the proofs are correct. Reading a mathematical paper is hard work and takes time, and each hour spent reviewing a paper is an hour not spent writing your own papers. And often enough we see that reviewers do not take the time to read a paper in detail. In the prevailing publish-or-perish atmosphere we spend less time polishing and proofreading our own papers and also less time reviewing papers written by others. In consequence I claim, with no evidence beyond the anecdotal, that the overall quality of research papers is diminishing.

Second, the hunt for esteem leads to the search for that magical creature, *the least publishable unit*. In the beginning scientists are motivated by the pursuit of knowledge, by the desire to answer questions whose answers are not known. What happens when the questions turn out to be difficult? This is when mathematics becomes interesting, this is where research becomes exciting. But it also means that I am spending time “unproductively”, because I am not writing a paper. Half a year spent working on a problem is half a year not spent writing papers. And so it can be tempting to chip off a small subproblem that I can solve and write a paper about it. And then perhaps chip off another subproblem. And if after some chipping the main problem is still too big, there are always other chippable problems to be found.

Third, measuring mathematics in terms of esteem means that when discussing other mathematicians we stop asking the questions: What is he or she researching? What result has he or she proven? Instead we ask the other kind of question: How many papers in the Annals of Mathematics or Inventiones Mathematicae have they published? How many NSF or EPSRC grants do they have? This is because the latter kind of question is easier to answer. It doesn’t require us to think about actual mathematics or to make judgements about whether a given subdiscipline is important or what the point of a theorem is. It even gives us the illusion that we can compare someone working on the analysis of PDEs with someone doing algebraic topology without having to know much about either area.

Having said this, how robust is the scientific process if we treat science as a sport instead of pursuing it to increase our knowledge? It is a difficult question, because we are all pushed in this direction to some extent. In practice academic hiring and promotion are tied to markers of esteem: citations, publications and grants. And so the more appropriate question is: How much should we swim against the tide? How much time should we spend doing what is important for the community, for students and for mathematics but will not be measured in numbers? This encompasses many things: writing research monographs, developing high-quality teaching materials, reading other research papers in detail. I don’t have an answer to this question, but there are hints—studies in psychology that cannot be reproduced, or debates about foundational work in symplectic geometry—that point to cracks in the facade of science.


Two things have changed. First, professionally MOOCs are “the competition”: they provide higher education at a fraction of the cost of a university. This is particularly true in England, where the cost of a university degree is £9,000 per year and rising. Even a distance-learning institution such as the Open University is significantly more expensive than most MOOCs. So I wanted to experience what a MOOC feels like for learners: to see how they use technology, how they pace videos and design programming exercises, and what we as lecturers can learn from them. Second, academics in the UK were striking for 14 days to avoid steep cuts to their pensions, and hence I found myself with spare time on my hands.

Thus I decided to dive into the MOOC experience. After some research, I chose the grandfather of MOOCs, Andrew Ng’s Coursera course on machine learning.

The course lasts 11 weeks and each week usually covers one specific algorithm. The course starts with linear regression, continues with logistic regression and neural networks, covers support vector machines, K-means and PCA, and finally talks about optimising large-scale problems via batch and stochastic gradient descent. Each week has about 1.5 to 2 hours of video content of Andrew Ng explaining the mathematics behind the algorithm of the week. Then there is a multiple-choice test and finally a programming exercise that is submitted and automatically graded online.

• Andrew Ng succeeds in presenting a large amount of mathematics with a minimum of mathematical prerequisites. As mathematicians we tend to think that to understand a topic we have to understand it down to the last detail; it is ingrained in how we were taught. In the mathematics curriculum, analysis, with its epsilon-delta notion of convergence, is seen as the bedrock upon which calculus is built. In this course, however, Andrew Ng explains and uses the gradient descent algorithm without assuming any knowledge of calculus! Seeing how this is done is fascinating.

• Each lecture is followed by a multiple choice quiz which tests understanding of the material. I really liked the questions. They take the material of the lectures and then go just a little bit further to check whether we actually thought about what Andrew Ng said in the video.

• In the programming exercises we implement all the basic machine learning algorithms from scratch in Matlab (or Octave): linear and logistic regression, k-means clustering, neural networks. For this to work, a lot of scaffolding is provided by the course. One can argue that creating the scaffolding—helper functions to read in data, visualise the results, etc.—is at least as challenging as coding the core of the algorithm. But of course too much freedom would make it impossible to automatically grade the programming exercises.

• Machine learning. I had seen some of the algorithms before, but seeing a systematic exposition of the material was helpful. Particularly useful, however, were the nuggets of wisdom on the practical aspects: how to divide a dataset into training and test sets, how to go about optimising learning rates and other hyper-parameters, how to diagnose bias and variance. All the things that are needed to make the theory work.

• Focus. Everything in the course is directed towards a goal. The course starts with the basics—cost functions, nonlinear optimisation and gradient descent—but everything is introduced only to the extent that it is strictly necessary: maximum likelihood is mentioned only once in passing, as is convexity; the conjugate gradient method is used in the programming exercises, but only as an optimisation black box. In a 50-minute lecture there is the temptation to wander and explore side avenues because they are interesting or because they provide “a more complete picture”, but if the lecture is split into 10-minute videos, focus becomes essential.
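The core of those programming exercises is genuinely small. As an illustration—in Python with NumPy rather than the course’s Matlab, and as my own sketch rather than actual course code—batch gradient descent for linear regression fits in a dozen lines:

```python
import numpy as np

# A minimal sketch (my own, not course material) of batch gradient descent
# for linear regression with the mean-squared-error cost function.
def gradient_descent(X, y, alpha=0.1, iterations=5000):
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iterations):
        errors = X @ theta - y         # h_theta(x) - y for every example
        gradient = (X.T @ errors) / m  # gradient of the cost J(theta)
        theta -= alpha * gradient      # take one step downhill
    return theta

# Recover y = 1 + 2x from noise-free synthetic data
# (first column of ones plays the role of the intercept feature).
X = np.column_stack([np.ones(100), np.linspace(0, 1, 100)])
y = X @ np.array([1.0, 2.0])
theta = gradient_descent(X, y)         # converges to approximately [1, 2]
```

The scaffolding the course provides—data loading, plotting, grading—is everything around such a core; the learner only fills in the update step.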

All in all it was both an enjoyable and instructive experience. I am looking forward to continuing with the spiritual successor, Andrew Ng’s second MOOC on deep learning.

In the end I settled on showing how to evaluate

$$\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}.$$

There are many proofs of this identity—fourteen have been collected by Robin Chapman—and it is often done as an application of the theory of Fourier series. One of the proofs, however, uses double integrals: using the geometric series one can show that

$$\int_0^1 \int_0^1 \frac{dx\,dy}{1-xy} = \sum_{n=1}^{\infty} \frac{1}{n^2}.$$

Evaluating this integral is an instructive exercise because it confounds many of the assumptions about double integrals students may have: the integration domain is as simple as one might hope, yet the right approach is to perform a change of coordinates; the resulting domain can be parametrized as one piece, yet it is better to split the domain in two; non-trivial trigonometric identities are used to simplify the integrands.
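For the curious, here is a condensed outline of the standard change-of-coordinates argument (only a sketch; the substitutions and identities deserve the careful treatment they get in full write-ups):

```latex
% Rotate coordinates by 45 degrees: x = u - v, y = u + v, so that
% 1 - xy = 1 - u^2 + v^2 and dx dy = 2 du dv.  The rotated square is
% symmetric in v, which splits the integral into two pieces:
\[
\int_0^1\!\!\int_0^1 \frac{dx\,dy}{1-xy}
  = 4\int_0^{1/2}\!\!\int_0^{u} \frac{dv\,du}{1-u^2+v^2}
  + 4\int_{1/2}^{1}\!\!\int_0^{1-u} \frac{dv\,du}{1-u^2+v^2}.
\]
% The inner integrals are arctangents; the substitution u = \sin\theta in
% the first piece and u = \cos\theta in the second, together with the
% half-angle identity \tan(\theta/2) = \sqrt{(1-\cos\theta)/(1+\cos\theta)},
% reduce the two terms to \pi^2/18 and \pi^2/9, which sum to \pi^2/6.
```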

Whether or not the students were entertained remains unknown, but as a result I wrote some notes explaining the calculation in great, perhaps excessive, detail.


There are genuine arguments for controlled immigration and “taking back control of the border”. There is also merit in finding and deporting those who are in the country illegally. But what can possibly be the point of making life hell for those who have done nothing wrong? What purpose is served by casting the net of suspicion so wide that it causes collateral damage?

Unfortunately, this is not an isolated incident. There is the “unfortunate error” made by the Home Office when it sent out up to 100 letters to EU nationals ordering them to leave the country. There are cases of people who have lived in the UK for more than 50 years and came close to being deported because they could not prove that they moved to the UK as children in the 1960s. There are stories of EU nationals applying for permanent residence in the months after the Brexit referendum whose applications were rejected on technicalities. And then there is today’s case of a man being detained for two months and threatened with deportation after reporting a crime.

What does the future hold? Starting in January, banks will be helping Theresa May create a “hostile environment” for illegal immigrants by carrying out immigration checks on their customers. It would be a miracle if these checks didn’t generate false positives—legal immigrants who are mistakenly flagged by the system and whose bank accounts are closed or threatened with closure. There is the general uncertainty about the rights and status of EU citizens in the UK after Brexit and about the process they will have to follow in order to continue living their lives.

And while one part of the country seems intent on making life as hard as possible for immigrants, another part seems unable to do without them. Proposals for a “barista visa” have been floated and farms are already complaining about a shortage of migrant workers.

I would like to look optimistically into the future, but after hearing Philip Hammond admit that the cabinet has not yet discussed the government’s preferred “end state position” after Brexit, it is hard to shake off doubts.

The document below contains some of these nuggets, written in a form that may be useful to students writing a bachelor thesis or final-year project in mathematics or some other technical subject. I am grateful to my colleagues who read it and helped improve it, and to my students who provided me with the necessary experience to write it.
