Multiplying monochrome images as matrices: A*B and softmax

We can interpret monochrome images as K×M and M×N matrices and combine them via matrix multiplication. It turns out that the results are often visually interesting, especially if we normalize the rows of the left-hand-side matrix and the columns of the right-hand-side matrix with softmax before taking the product.
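
The following Julia sketch shows this core operation on two generic matrices; the helper functions `softmax_rows` and `softmax_cols` and the matrix sizes are illustrative choices, not code from the notebook.

```julia
# A minimal sketch, assuming A and B are monochrome images already converted
# to Float64 matrices of compatible sizes K×M and M×N.
softmax_rows(A) = exp.(A) ./ sum(exp.(A); dims=2)   # each row sums to 1
softmax_cols(B) = exp.(B) ./ sum(exp.(B); dims=1)   # each column sums to 1

A, B = rand(256, 256), rand(256, 256)               # stand-ins for image matrices
plain_product      = A * B
normalized_product = softmax_rows(A) * softmax_cols(B)
```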


It would be great to understand the properties of matrix multiplication better. This seems particularly worthwhile because matrix multiplication plays a prominent role in Transformers, a class of machine learning models introduced in 2017 and responsible for many recent breakthroughs, including GPT-3. There, the rows of the left-hand-side matrix are sometimes softmax-normalized so that they can be used as probabilities.

One can interpret a monochrome image as a matrix (the size of the matrix depends on the resolution of the image, so one should rescale images as needed). I decided to explore whether matrix products of monochrome images are visually interesting. I used JuliaImages packages, the Julia LinearAlgebra facilities, and Julia Jupyter notebooks.

I looked at standard Julia test images, such as "mandrill" and "jetplane", and discovered that there is plenty of visually interesting information in their matrix products. I used the same scaling of pixel values that the ImageView.imshow() methods use.
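
A hedged sketch of this setup (the exact notebook code may differ) using the TestImages and Images packages:

```julia
using Images, TestImages

A = Float64.(Gray.(testimage("mandrill")))             # color test image as a grayscale matrix
B = Float64.(Gray.(testimage("jetplane")))              # grayscale test image as a matrix
B = imresize(B, size(A))                                 # make the sizes compatible for A * B

P = A * B
# Rescale the product to [0, 1] for display, in the spirit of the
# min/max contrast scaling that ImageView.imshow() applies.
P_scaled = (P .- minimum(P)) ./ (maximum(P) - minimum(P))
Gray.(P_scaled)                                          # renders as an image in a Jupyter notebook
```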

It turned out that the matrix products were particularly informative and had a lot of visible fine structure if one softmax-normalized the rows of the left-hand-side matrix and the columns of the right-hand-side matrix before taking the product. The normalized images themselves looked slightly toned down and striped, but not too different visually; their products, however, were drastically different. Note that in Transformer models one usually applies softmax only on one side, which turns out to be insufficient for this visual exploration of matrix products.

I like the resulting images as visual art, and I think this might point to some interesting novel ways to obtain visual art by mathematical transformations.

I also hope this might eventually be of help as people try to achieve a better understanding and more fine-grained control of machine learning models.

The markdown file commenting on the Julia notebook and elaborating on the machine learning connections is posted at https://github.com/anhinga/julia-notebooks/blob/main/images-as-matrices/presentation/commentary.md

After this proposal was submitted, I explored composing matrix multiplications with other image transformations. The resulting compact neural machines produce visually interesting results; a hypothetical example of such a composition is sketched below.
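
Purely as an illustration of what such a composition might look like (the actual machines in the repository may be quite different), here is a hypothetical pipeline mixing a normalized matrix product with a simple pointwise transformation and a transpose:

```julia
# Hypothetical composition, for illustration only: a softmax-normalized matrix
# product followed by a pointwise nonlinearity and a transpose.
softmax_rows(A) = exp.(A) ./ sum(exp.(A); dims=2)
softmax_cols(B) = exp.(B) ./ sum(exp.(B); dims=1)

compact_machine(A, B) = permutedims(sqrt.(softmax_rows(A) * softmax_cols(B)))

A, B = rand(256, 256), rand(256, 256)
out = compact_machine(A, B)
```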

I have also conducted first experiments in solving machine learning problems formulated in terms of these compact machines, taking advantage of the flexibility of differentiable programming in Julia Flux.
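
As a hedged illustration of what such a Flux-based experiment could look like (the concrete tasks in the repository are likely different), one can treat the left-hand matrix as a trainable parameter and fit the normalized product to a target image by gradient descent:

```julia
using Flux   # gradient (via Zygote) and softmax (via NNlib) come with Flux

# Loss: squared distance between the normalized product and a target image.
loss(A, B, target) = sum(abs2, softmax(A; dims=2) * softmax(B; dims=1) .- target)

A, B   = rand(64, 64), rand(64, 64)
target = rand(64, 64)

for _ in 1:200
    gA, = gradient(a -> loss(a, B, target), A)   # differentiate w.r.t. the left matrix
    A .-= 0.1 .* gA                              # a plain gradient-descent step
end
```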

I have created a repository containing materials relevant to this poster: https://github.com/anhinga/JuliaCon2021-poster