
Berkeley Simons Institute Launches Online Video Shorts With An Eye-Opening Look At Perception


The Simons Institute for the Theory of Computing at the University of California, Berkeley is a highly regarded venue, established in 2012, that brings together computer scientists and other allied experts to explore deep and unsolved computational problems.

Besides conducting various workshops and symposia, along with publishing research papers, the Simons Institute has recently launched a new online video documentary series called Theory Shorts.

For general information about the Simons Institute, see this link here.

For free viewing of the new Theory Shorts online video documentary series, use this link here.

An excellent opener for the video series, and well worth watching, the first video is entitled “Perception as Inference: The Brain and Computation.” It is both informative and engaging, offering a vivid and telling exploration of how the eye captures visual images, along with the vital intertwining of the brain and the mind in making sense of what the eye is observing.

It is jam-packed with fascinating and readily explicated background on the topic, mixing theory and practice, and its riveting style clocks in at less than 20 minutes, moving along at a steady pace while covering remarkably substantive content in such a short time-frame.

Kudos go to Dr. Bruno Olshausen, Professor in the Helen Wills Neuroscience Institute and the School of Optometry, with an affiliated appointment in Electrical Engineering and Computer Science (EECS), who serves as the expert extraordinaire for this enlightening video about the eye-brain duality.

I’ll provide some tidbits from the video, whetting your appetite to view it, and also offer some thoughts on an allied use case involving the advent of Autonomous Vehicles (AV) and especially the emergence of AI-based self-driving cars.

The Eyes As Visual Sensory Devices

We often take our eyes for granted and give little attention to how they work and why they work.

In many respects, it is a kind of biological and neural sensory miracle that our eyes and our brain are able to seamlessly work hand-in-hand, an intertwining of our vision system with our cognitive capabilities that enables us to perceive the world and make sense of what we are viewing (most of the time, one hopes).

As stated in the video, our sense-making of visual inputs can lead to hallucinations, though, as aptly pointed out, one could ingeniously argue that we are perhaps always hallucinating (for some, a startling and yet useful way to understand the matter). Fortunately, most of the time we manage to do a prudent job of turning the raw imagery into rational meaning for operating in our day-to-day lives.

If you were to separate the eyes from the brain (ugh, sorry if that seems untoward), you would essentially have a standalone sensory device that is collecting visual data with no place to go, and for which there would be no viable sense-making, no embellishing of the data, and no leveraging of the data for comprehension purposes.

This is worth contemplating because many people make a mistaken or ill-informed analogy to electronic cameras that take pictures and record video. It is important to set the record straight: though you might assert that such cameras are the equivalent of the eye, doing so egregiously downplays the role of the brain and mind in the essential duality involved.

In short, a camera can provide images that we then as humans can watch and make sense from the visualizations proffered using our brains, but the camera itself is not particularly doing any sense-making per se.

In the use case of AVs and AI-based self-driving cars, you can place cameras onto a car and collect as much visual data as you might seek, yet the ability to make sense of the images, such as determining that a pedestrian is about to cross the street and might get hit, comes from the AI side of the computational effort.

We obviously already know how to produce cameras (which do keep improving), while it’s another matter altogether to figure out how to process the visual data and turn it into anything nearing the brain’s ability to transform visual imagery into cogent thoughts and actions.

That’s part of the so-called “hold-up” in the crafting of self-driving cars, namely, how to transform the rudimentary visual data into cognitive-like comprehension that an AI system might embody and then translate into appropriately and safely steering the wheels of the car, along with operating the accelerator and brake pedals.
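
To make this concrete, here is a minimal, hypothetical Python sketch of how raw detections from a camera might be turned into a crude braking decision. The Detection structure, the thresholds, and the plan_control routine are my own illustrative assumptions, not a depiction of how any actual self-driving stack works.

```python
# Hypothetical sketch: turn noisy camera detections into a crude control command.
# Assumes a pretrained pedestrian detector already produced the detections.
from dataclasses import dataclass

@dataclass
class Detection:
    label: str         # e.g., "pedestrian"
    confidence: float   # detector score in [0, 1]
    distance_m: float   # estimated range to the object in meters

def plan_control(detections: list[Detection],
                 brake_distance_m: float = 25.0,
                 min_confidence: float = 0.6) -> dict:
    """Map detections to a simplistic throttle/brake command (illustrative only)."""
    for det in detections:
        if (det.label == "pedestrian"
                and det.confidence >= min_confidence
                and det.distance_m <= brake_distance_m):
            # A likely pedestrian close ahead: release throttle, apply brakes.
            return {"throttle": 0.0, "brake": 1.0}
    # Otherwise maintain gentle throttle and no braking.
    return {"throttle": 0.3, "brake": 0.0}

# Example: one frame yields a likely pedestrian 18 meters ahead.
frame_detections = [Detection("pedestrian", confidence=0.82, distance_m=18.0)]
print(plan_control(frame_detections))  # -> {'throttle': 0.0, 'brake': 1.0}
```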

It is indisputably a hard problem and poses great challenges (see my discussion here).

Another important facet is that the eyes register visual imagery that inherently contains clutter and noise, is rife with distortions, and is frequently unstable and non-uniformly sampled. Simply stated, the visual imagery is not at all pristine, and the neural processing has to overcome and transform this raw input accordingly.

You could liken this somewhat to the data provided by the cameras on a self-driving car.

Imagine that a self-driving car is rocketing down the freeway in the rain. The video data streaming into the sensors is partially occluded and further altered by the jostling and motion of the vehicle, and is chock-full of noise and distortions, all of which the AI-based driving system must try to screen and computationally analyze.
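
As a rough illustration of coping with such noisy input, here is a small Python sketch that smooths successive camera frames with an exponential moving average. The function and the synthetic rain-like noise are purely illustrative assumptions, not a depiction of any production perception pipeline.

```python
# Illustrative sketch: tame frame-to-frame camera noise with an exponential
# moving average over successive frames.
import numpy as np

def smooth_frames(frames: np.ndarray, alpha: float = 0.3) -> np.ndarray:
    """Blend each new frame with the running average to suppress noise.

    frames: array of shape (num_frames, height, width) with pixel intensities.
    alpha:  weight given to the newest frame (smaller means heavier smoothing).
    """
    smoothed = frames[0].astype(float)
    for frame in frames[1:]:
        smoothed = alpha * frame + (1.0 - alpha) * smoothed
    return smoothed

# Example: ten synthetic 4x4 "frames" of a constant scene plus rain-like noise.
rng = np.random.default_rng(0)
scene = np.full((4, 4), 100.0)
noisy = np.stack([scene + rng.normal(0, 20, scene.shape) for _ in range(10)])
# Mean absolute error of the smoothed result is well below the raw noise level.
print(np.abs(smooth_frames(noisy) - scene).mean())
```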

Human drivers are generally able to drive a car primarily by using their eyes to sense the road, doing so seemingly effortlessly and with little awareness of the eye-mind dual processing at play.

Attempts to replicate this kind of human-led activity by the use of cameras and AI-processing in self-driving vehicles are making rudimentary progress, though much remains to be solved (see my analysis at this link here).

Humans, of course, have two eyes, though my parents always seemed to have a third eye in the back of their heads, while the Jumping Spider has 8 eyes and the Box Jellyfish has an astounding 24 eyes.

Presumably, in a Darwinian fashion, creatures on this planet have evolved toward some befitting number and style of eyes that align with their environment and their base survival. For my explication of bio-mimicry and robo-mimicry, see this link here.

In the case of self-driving cars, there are arguments aplenty regarding how many cameras, and which types of cameras, would best be used to achieve true self-driving.

Furthermore, there are ongoing acrimonious debates about whether other sensory devices such as radar and LIDAR should be used; some suggest that since humans don’t have innate radar or LIDAR capabilities, perhaps self-driving should be done entirely via cameras, without requiring such additional sensory elements (for more on this dispute, see the link here).

There are also many computationally inspired debates on these AV topics.

For example, the human eye and the brain seem to operate on a basis to which one could apply Bayesian inference (quite well explained in the video), and as such the same modeling technique might be used for the AI processing in self-driving cars.
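
For a flavor of what that probabilistic reasoning looks like, here is a toy Bayesian update in Python that revises a belief about a pedestrian being present as noisy detections arrive. The detector hit rate and false-alarm rate are made-up numbers for illustration only.

```python
# Toy Bayesian update: revise P(pedestrian present) as detections arrive.

def bayes_update(prior: float, likelihood_true: float, likelihood_false: float) -> float:
    """Posterior P(pedestrian | detection) via Bayes' rule."""
    evidence = likelihood_true * prior + likelihood_false * (1.0 - prior)
    return (likelihood_true * prior) / evidence

# Prior belief that a pedestrian is in the crosswalk before looking: 10%.
belief = 0.10
# Assumed detector behavior: fires with P=0.9 if a pedestrian is present,
# and with P=0.2 as a false alarm in rain or glare.
for _ in range(3):  # three consecutive frames each report a detection
    belief = bayes_update(belief, likelihood_true=0.9, likelihood_false=0.2)
    print(round(belief, 3))  # prints 0.333, then 0.692, then 0.91 as evidence mounts
```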

Some, though, are uncomfortable with incorporating probabilistic computations into AI driving systems and believe that the AI somehow needs to deal only in absolute certainties, a seemingly incongruous notion when compared to the realities of driving (see my analysis here).

Conclusion

By deftly crafting increasingly sophisticated computational models of the eye and the eye-brain duality, we can attempt to emulate the actions of this miracle of nature and dramatically improve our computer-based mechanizations accordingly.

Meanwhile, such modeling is indubitably and inexorably going to teasingly reveal the deep dark secrets underlying this remarkable wetware and get us step-wise closer to one day cracking their enigmatic code. 

This first video in the Simons Institute series is a go-getter that, right out of the gate, provides an invigorating look at a crucial slice of perception and addresses numerous computational aspects that are bound to get you thinking.

Keep your eyes open and be on the watch for more such videos.

Your eye-mind duality will thank you for doing so.


