A little while ago, Google cracked the Go problem (see reference 1), a hard problem but a closed problem, in that once you know the very small number of simple rules you are on the road, so to speak. They are still working on the much more open car driving problem – open in that all kinds of much fuzzier considerations come into play – and a problem which prompts the question: ‘what is the difference between the seeing that a computer does and the seeing that a person does?’. So we use the computer that drives Google’s driverless cars (see reference 2) as an example, from which we derive the rather disorganised jottings which follow.
Jottings which will maybe tell us something about what consciousness is for, something which people have struggled with for some time now. What human function or activity can we identify that an unconscious computer is going to struggle with?
Data
Google have put a lot of time into building up a database about the areas in which their cars are to drive, say, for example, Mountain View in California – Google’s home town and an area which, if the illustration above is a fair sample, is not very complicated compared with, say, central London.
A database which, I assume, combines map flavoured data with street furniture and street scene flavoured data. Some of this may well go further than the raw geometry of that world, taking in the image data from Street View. Certainly in principle, one could extract pictures of things from Street View – for example a roadside letter box – and paste them into one’s more geometrically flavoured database. One could extract data from the pictures from the cameras on the driverless cars themselves. All of this adding up to a comprehensive database about Mountain View, a database which is being continuously updated and which is continuously available to all the driverless cars. A rather communistic arrangement, a rather different arrangement to that of humans, whereby each human has to take responsibility for his or her own data and updates.
The driverless car is then driving about in Mountain View. I suppose that each such car maintains its own real time database about where it is, combining data from the central database built up over time with data arriving in real time from its own half a dozen or so roof mounted cameras.
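By way of illustration only – I have no idea how Google actually organise things – a little Python sketch of the sort of two-layer picture I have in mind, a static layer from the central database overlaid with a live layer from the car’s own sensors. All the names are my own invention.

```python
from dataclasses import dataclass, field

@dataclass
class Feature:
    kind: str            # e.g. "letter box", "stop sign", "cyclist"
    x: float             # position in metres, in some shared map frame
    y: float
    moving: bool = False

@dataclass
class WorldModel:
    # Loaded once from the central Mountain View database.
    static_layer: list[Feature] = field(default_factory=list)
    # Built up in real time from the roof cameras and range finders.
    live_layer: list[Feature] = field(default_factory=list)

    def observe(self, f: Feature) -> None:
        """Fold a fresh sensor observation into the live layer."""
        self.live_layer.append(f)

    def everything_near(self, x: float, y: float, radius: float) -> list[Feature]:
        """All features, stored or live, within radius metres of (x, y)."""
        return [f for f in self.static_layer + self.live_layer
                if (f.x - x) ** 2 + (f.y - y) ** 2 <= radius ** 2]
```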
Which will include data about possibly temporary changes to road layout and road furniture and, more importantly, about other road and sidewalk users. Pedestrians, cyclists, ice-cream carts, cars, vans and lorries. The odd duck wandering across the road.
The software in the driverless car is clever enough to be able to identify such things and to know something about how each sort of such thing is likely to behave. Having, for example, noticed a cyclist, the driverless car will make a point of tracking that cyclist until he or she is out of harm’s way.
There will also be half a dozen or so roof mounted range finders, capable of measuring the distance to designated points in sub-second time to millimetre accuracy. A capability which humans do not have: they have to make do with what they can get from moving their heads about and from stereoscopic vision.
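Assuming these range finders work by laser time of flight – which is how such devices usually do – the arithmetic is simple enough: the pulse goes out and comes back, and the distance is half the round-trip time multiplied by the speed of light. Millimetre accuracy then amounts to timing the round trip to better than ten picoseconds or so. A sketch:

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second

def time_of_flight_distance(round_trip_seconds: float) -> float:
    """One-way distance from a laser pulse's round-trip time: d = c * t / 2."""
    return SPEED_OF_LIGHT * round_trip_seconds / 2.0

# A round trip of about 66.7 nanoseconds puts the target roughly 10 m away;
# a millimetre of distance corresponds to under 7 picoseconds of round trip.
print(time_of_flight_distance(66.7e-9))  # ≈ 9.998
```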
Functions
The software will also need to be able to do most of the stuff which follows.
Strip out weather. So that, for example, the rain on the camera lenses does not get into the camera images. What with eyelids and brains, humans are quite good at this.
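One standard trick here – whether or not it is Google’s – is to take a median over a short run of frames: a raindrop blots out any given pixel for a frame or two, and the clear frames outvote it. A minimal sketch:

```python
import numpy as np

def strip_transients(frames: np.ndarray) -> np.ndarray:
    """Remove short-lived occlusions such as raindrops on the lens.

    frames: array of shape (n_frames, height, width). A pixel blotted
    out by a drop in one or two frames is outvoted by the clear ones,
    so the per-pixel median over time recovers the scene behind it.
    """
    return np.median(frames, axis=0)
```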
Strip out unusual lighting. So that images are adjusted to reflect normal, even lighting conditions. Strip out shadows. This should, inter alia, help with comparing one image with another, either from another place or from another time, looking for changes.
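Again purely by way of illustration, one textbook way of stripping out a colour cast – the orange of sodium street lighting, say – is the ‘gray world’ correction, which scales each colour channel to a common mean. Stripping out shadows proper is a much harder business, not attempted here:

```python
import numpy as np

def gray_world_normalise(image: np.ndarray) -> np.ndarray:
    """Crude lighting correction under the 'gray world' assumption.

    image: float array of shape (height, width, 3), values in [0, 1].
    Each colour channel is scaled so that its mean matches the overall
    mean, cancelling a uniform colour cast across the scene.
    """
    channel_means = image.reshape(-1, 3).mean(axis=0)   # mean of R, G, B
    corrected = image * (channel_means.mean() / channel_means)
    return np.clip(corrected, 0.0, 1.0)
```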
There is also the business of the night, when less visual information is going to be available. Perhaps much less colour – although I don’t know whether the colour vision of cameras degrades in the dark in the same way as that of humans. Perhaps the Google cars are not yet allowed out after dark.
Steady the image. Maybe convert the many frames per second coming from the cameras into an average frame, one per second. This should, inter alia, cut out a lot of insignificant change at the level of pixels, leaving change that is more worth checking out. It would probably also smooth out, for example, a butterfly flying across a building – but not a goshawk. Perhaps about the right balance for this sort of application, although something that a human is perhaps better at, perhaps applying more context sensitivity to the smoothing out process than the computer can manage – so far.
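A sketch of the averaging and of the sort of change detection it might feed, the threshold being, of course, an invented number:

```python
import numpy as np

def one_second_summary(frames: np.ndarray) -> np.ndarray:
    """Average a second's worth of frames into one steadier frame."""
    return frames.mean(axis=0)

def significant_change(prev: np.ndarray, curr: np.ndarray,
                       threshold: float = 0.1) -> np.ndarray:
    """Mask of pixels whose averaged value moved by more than threshold.

    Pixel-level flicker and the passing butterfly are smoothed away by
    the averaging; a bus pulling out across two summaries is not.
    """
    return np.abs(curr - prev) > threshold
```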
Snap the features of the real-time image onto those of the stored image, in or implicit in the on-board database. A tidying up process. Rather in the way that PowerPoint can snap objects in its image to its grid.
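A sketch of such snapping, the half-metre tolerance being my own invention:

```python
import numpy as np

def snap_to_landmarks(observed: np.ndarray, stored: np.ndarray,
                      tolerance: float = 0.5) -> np.ndarray:
    """Snap observed feature positions onto nearby stored landmarks.

    observed, stored: arrays of shape (n, 2) and (m, 2), in metres.
    Each observed point lying within `tolerance` of some stored
    landmark is replaced by that landmark, PowerPoint-grid fashion;
    points with no close match are left where they were seen.
    """
    snapped = observed.copy()
    for i, point in enumerate(observed):
        distances = np.linalg.norm(stored - point, axis=1)
        nearest = distances.argmin()
        if distances[nearest] <= tolerance:
            snapped[i] = stored[nearest]
    return snapped
```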
Camera and range-finder control. While it may be that a lot of the time these devices can just be left to quietly scan the world under their own steam, there will be times when the computer needs some information quickly and it will need to be able to direct one of these devices to get it. An activity analogous in some ways to the brain controlled saccades of the human eye. Sometimes conscious, sometimes not.
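A toy version of such direction, using a priority queue: routine scanning carries on by default, while an urgent request – range that cyclist, now – jumps to the front:

```python
import heapq
from typing import Optional

class SensorScheduler:
    """Queue of requests for pointing a camera or range finder.

    Lower priority numbers are more urgent; a counter breaks ties in
    arrival order. Loosely analogous to the brain commandeering the
    eyes for a saccade while routine scanning idles in the background.
    """
    def __init__(self) -> None:
        self._queue: list = []
        self._counter = 0

    def request(self, priority: int, target: str) -> None:
        heapq.heappush(self._queue, (priority, self._counter, target))
        self._counter += 1

    def next_target(self) -> Optional[str]:
        return heapq.heappop(self._queue)[2] if self._queue else None

scheduler = SensorScheduler()
scheduler.request(5, "routine scan of the kerb ahead")
scheduler.request(1, "range the cyclist on the left")
print(scheduler.next_target())  # the cyclist comes first
```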
Know where the car is, to millimetre accuracy, that is to say a lot better than satnav can manage, what direction it is pointing in and at what speed it is travelling. Plus all kinds of engine and car management stuff.
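One standard way of doing better than satnav alone – not necessarily Google’s way – is to blend the satnav estimate with fixes taken against surveyed landmarks, weighting each by how much it is trusted: the minimum-variance blend beloved of the Kalman filtering people. A one-dimensional sketch:

```python
def fuse_estimates(satnav_pos: float, satnav_var: float,
                   landmark_pos: float, landmark_var: float) -> tuple:
    """Inverse-variance weighted blend of two position estimates.

    Satnav alone is good to a few metres; a fix against surveyed
    street furniture is far tighter. Each estimate is weighted by
    the inverse of its variance, the standard minimum-variance blend.
    """
    w1, w2 = 1.0 / satnav_var, 1.0 / landmark_var
    fused = (w1 * satnav_pos + w2 * landmark_pos) / (w1 + w2)
    return fused, 1.0 / (w1 + w2)

# Satnav says 104.0 m (variance 4.0); a landmark fix says 102.5 m
# (variance 0.01). The blend lands almost on the landmark fix.
print(fuse_estimates(104.0, 4.0, 102.5, 0.01))  # ≈ (102.504, 0.00998)
```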
A moving object identification and orientation system. Once the computer has detected an object moving on or near the road it was on, it would be good if it was able to tell what sort of object it was and which way it was pointing. Am I following the bus or is it coming straight at me? The visual cues from the front of the bus would thus complement & confirm those coming from the range finders. More redundancy in such matters equals more safety.
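A sketch of the range finders’ side of the bargain: a short run of range readings gives a closing speed, whose sign and size say whether the bus is being followed or is coming straight at us:

```python
def closing_speed(ranges_m: list, dt_s: float) -> float:
    """Rate at which the gap to a tracked object shrinks, in m/s.

    Near zero while sitting behind a bus doing our own speed;
    strongly positive if the bus is in fact head-on. Complements the
    visual cue of seeing the bus's front rather than its back.
    """
    return (ranges_m[0] - ranges_m[-1]) / (dt_s * (len(ranges_m) - 1))

# Five readings a tenth of a second apart: the gap closes 2 m in 0.4 s.
print(closing_speed([20.0, 19.5, 19.0, 18.5, 18.0], 0.1))  # 5.0 m/s
```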
A white line tracking system. An important, labour-saving device in that white lines substantially reduce the amount of effort & work needed to drive on open roads, such as motorways. But what about interpreting the various white line conventions? The dash as opposed to the dotted as opposed to the continuous? How to be sure that the old white lines which have been carelessly taken up really have been taken up? Does the tracking system go so far as to read the odd signs, symbols and messages which might be written down in white line? As indeed they have been in the illustration above, including what appear to be decoys.
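Leaving the interpretation of the conventions entirely to one side, a sketch of the tracking part, fitting a gentle curve through detected white-line points:

```python
import numpy as np

def fit_lane_line(ahead_m: np.ndarray, offset_m: np.ndarray) -> np.ndarray:
    """Fit a gentle curve through detected white-line points.

    ahead_m: distances ahead of the car; offset_m: lateral offsets,
    both in metres. A quadratic is enough for the curvature of a
    normal road; the coefficients let the car steer so as to hold a
    fixed offset from the line.
    """
    return np.polyfit(ahead_m, offset_m, deg=2)

# Points drifting slowly leftwards; predict the offset 25 m ahead.
coeffs = fit_lane_line(np.array([5.0, 10.0, 15.0, 20.0]),
                       np.array([1.50, 1.55, 1.65, 1.80]))
print(np.polyval(coeffs, 25.0))
```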
A street furniture tracking system. Which keeps an eye on the street furniture, in particular the direction signs, which can be used to confirm that the car is where the computer thinks it is. Does the system go so far as to read the words on the signs and to use that information to confirm where the car is going?
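A sketch of such confirmation: a sign just read by the camera is looked up in the central database and checked against where the computer thinks it is. Sign text and numbers are, of course, invented:

```python
def confirm_position(sign_text: str, estimated_pos: tuple,
                     sign_db: dict, tolerance_m: float = 50.0) -> bool:
    """Does a sign just read agree with where we think we are?

    sign_db maps sign text to its surveyed (x, y) position from the
    central database. If the sign is on file within tolerance_m of
    the current estimate, that estimate gains independent support.
    """
    pos = sign_db.get(sign_text)
    if pos is None:
        return False
    dx, dy = pos[0] - estimated_pos[0], pos[1] - estimated_pos[1]
    return (dx * dx + dy * dy) ** 0.5 <= tolerance_m

signs = {"Castro St 1/2 mile": (120.0, 40.0)}
print(confirm_position("Castro St 1/2 mile", (100.0, 35.0), signs))  # True
```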
Humans
So given all this stuff that the Google computer is probably doing, what might a person, perhaps a driver of a regular car, be doing in the same sort of situation with a lot of the same sort of data?
In round terms, a person needs to be conscious to drive a car. We leave aside such phenomena as arriving at one’s destination, quite safely, but without any idea of how one got there, the conscious mind having been thinking about something else altogether. On the other hand, no-one is suggesting that the Google on-board computer is conscious, or even that the one back at the Googleplex is. So what is different about what the person is doing?
Let us suppose that the person is well practised in mind control and is not thinking about anything else, other than processing the visual scene and driving the car. But even this processing will, at least in most people, go well beyond what is strictly necessary to drive the car. Most people take an interest in their surroundings, and even if they are not consciously articulating any thoughts on the matter, there will be plenty going on subconsciously, goings on which will, inter alia, give emotional colour to the scene. In this, people are much less disciplined than a computer, with an upside being that they might well take on board all kinds of information which might just prove useful in the future – say that there is an upcoming clearance sale at a local furniture shop. The first generation of Google cars are not going to do this, never mind emotional colour; they will have their work cut out just to drive the car.
Nor will they be much good at spotting the helicopter gunship coming down out of the sun to strafe the street. Humans are quite good at dealing with the unexpected, without crashing their cars, while the Google car driving computer will have been instructed to stick strictly to the expected.
Getting more anatomical, there are well documented links between the seeing process and the vestibular processes which go on behind the ears, the processes which detect movement and the orientation of the head with respect to vertical – at least in normal earth bound conditions. The computer probably does not bother with much of this, being content to judge its own position and movement by reference to its surroundings.
There are well documented links between the seeing process which goes on inside the brain and the motor processes which control the movements of the head and of the eyes. The computer will be doing something of this sort, albeit in a rather different way, reflecting its rather different way of doing things.
We leave aside all the signals the brain is getting from the body, apart from those to do with vision. For example, the feel of the hands on the steering wheel, of the feet on the pedals and of the sun on the face. The sounds of passing vehicles and chattering passengers. All the stuff which contributes to the conscious and more or less continuous sense of self. Which may, occasionally, result in the withdrawal of resources from the vision system, possibly with untoward results and certainly something which the Google car has been instructed not to do, probably cannot do, short of breaking down altogether.
Slightly different is the fact that the eyes of a human are in the head of a vertebrate animal sitting inside the car, while the cameras of the Google car are sitting on top of the roof of the car. And while there is a computer inside, that is not quite the same as a free-standing – or rather sitting – animal. Then the eyes will take in all kinds of stuff which the cameras on the roof do not. For example, the arms on the steering wheel, the occasional glimpse of the nose. Stuff which can be linked with other stuff about these bits of body, all the other stuff coming in on the other channels of the nervous system mentioned above. The instruments in the instrument panel. The position of the gear stick. The eyes and the brain are tracking what is going on according to at least two frames of reference, one corresponding to the interior of the car and the other to the car moving about in the outside world.
Absence of conclusions
So there are all kinds of differences. But as yet nothing which seems to go to the essence of the matter, the essence of the difference between the car and the human. We can’t say much more than that the brain is massively parallel and much more weakly – or perhaps leakily – structured than the computer. That said, one does suspect that a lot of what the brain does is done with a small number of general purpose processes, quite possibly rather statistical in a Bayesian sort of way, rather than with the computer’s whole catalogue of specialised and deterministic functions, with their equally deterministic interactions. Part of the thinking here being that we have not been down from the trees all that long, in evolutionary terms, and we have not had time to evolve a lot of fancy new equipment. It is not as if we were prize pigs being bred up on purpose for our handsome chops; evolution is not at all purposeful and much, much slower.
And we don’t seem to have made any progress at all with the question of what consciousness is for. What is our subjective experience of the world good for? What can it do for us that Google can’t do?
But I do offer one closing thought. That is that the computer will usually be running a number of processes at any one time, processes of roughly equal standing, peer processes. Processes which can get on with their own affairs without bothering over much about what the other processes are up to. While a human is only paying attention to one thing at a time. My conscious attention is, perhaps, taken by a sweet wrapper which has, irritatingly, fallen to the floor of the car. A significant withdrawal of resources from the business of driving the car, perhaps for the odd second, perhaps, far more dangerously, for a number of seconds. An ability which has its pluses and its minuses.
Work in progress!
Reference 1: http://psmv3.blogspot.co.uk/2016/03/high-finance_9.html.
Reference 2: https://www.google.com/selfdrivingcar/.