He came from Google to give a talk about using semantic context to improve visual systems. It was quite good.
Basically, he combined a deep visual network with one of these semantic networks. He built a mapping between the two that projects visual cues into a continuous semantic space. He could then train the visual network on hand-labeled image classes (from ImageNet(?)), and it would learn the mapping between the visual cues and the semantic space. The semantic space itself was learned from Wikipedia, just like next-word predictors (which I think are also deep). The visual network could then recognize completely new visual categories, not just new images within a learned category, because the semantic space links the categories (the category relationships are learned from Wikipedia). His metaphor: the visual system has seen a desk chair but never a rocking chair, while the semantic system has read about a rocking chair and its relation to a desk chair and other chairs. When the system sees a rocking chair for the first time, it can infer through its semantic knowledge that it is indeed a rocking chair.
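To make that concrete, here is a tiny sketch of how such a visual-semantic embedding could work, assuming you already have image features from a trained visual net, a learned linear map into the word-vector space, and word vectors for the candidate labels. All the names, shapes, and numbers below are made up for illustration, not his actual model.

```python
import numpy as np

def embed_image(visual_features, W):
    """Project a visual feature vector into the continuous semantic space."""
    return W @ visual_features

def zero_shot_classify(visual_features, W, word_vectors):
    """Pick the label whose word embedding is closest (cosine) to the image embedding.
    The candidate labels can include classes never seen during visual training."""
    v = embed_image(visual_features, W)
    v = v / np.linalg.norm(v)
    best_label, best_score = None, -np.inf
    for label, w in word_vectors.items():
        score = np.dot(v, w / np.linalg.norm(w))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy usage with random stand-in vectors. In the real system the word vectors come
# from text, so "rocking chair" sits near "desk chair" in the semantic space even
# though it was never one of the visual training classes.
rng = np.random.default_rng(0)
word_vectors = {"desk chair": rng.normal(size=50),
                "rocking chair": rng.normal(size=50),
                "dog": rng.normal(size=50)}
W = rng.normal(size=(50, 4096)) * 0.01      # hypothetical visual->semantic map
features = rng.normal(size=4096)            # hypothetical deep visual features
print(zero_shot_classify(features, W, word_vectors))
```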
There is also a trick that uses what is basically a hash table to quickly pull out approximate filter responses in a deep network; this speeds up the system by almost 20,000x.
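I don't know the details of his hashing scheme, so the sketch below just shows the general idea with random sign projections (a standard locality-sensitive hash): hash the filters into buckets once, then at run time only score the few filters that share a bucket with the input patch instead of taking a dot product with every filter. Everything here is a made-up toy, not his code.

```python
import numpy as np

class FilterHash:
    def __init__(self, filters, n_bits=12, seed=0):
        rng = np.random.default_rng(seed)
        self.filters = filters                                # shape (n_filters, dim)
        self.planes = rng.normal(size=(n_bits, filters.shape[1]))
        self.table = {}
        # Hash every filter once; each bucket key is n_bits of projection signs.
        for i, f in enumerate(filters):
            self.table.setdefault(self._key(f), []).append(i)

    def _key(self, x):
        return tuple((self.planes @ x > 0).tolist())

    def approx_candidates(self, patch):
        """Indices of filters in the same bucket as the patch -- the only ones we score."""
        return self.table.get(self._key(patch), [])

# Toy usage: with 10,000 filters and 12 hash bits, each lookup touches only a handful
# of filters instead of all of them.
rng = np.random.default_rng(1)
filters = rng.normal(size=(10_000, 64))
fh = FilterHash(filters)
patch = filters[7] + 0.01 * rng.normal(size=64)   # a patch close to filter 7
candidates = fh.approx_candidates(patch)
print(len(candidates), 7 in candidates)           # usually a small list containing 7
```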
Here is the paper
I can't find many of his other papers. He was in EJ's lab and has a lot of papers with Greg Fields and Jeff Gauthier.
And there is this -- the code for a convolutional net run on a GPU:
https://code.google.com/p/cuda-convnet/
And this, which was used to learn the word vectors from Wikipedia:
https://code.google.com/p/word2vec/
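That link is the original C tool. As a rough illustration of what it produces, here is a sketch using the gensim reimplementation; the parameter names (vector_size, window, min_count, epochs) are gensim's, and the toy sentences stand in for the Wikipedia text he actually used.

```python
from gensim.models import Word2Vec

# Stand-in corpus; the real semantic space was trained on Wikipedia.
sentences = [
    ["a", "rocking", "chair", "is", "a", "kind", "of", "chair"],
    ["a", "desk", "chair", "is", "a", "kind", "of", "chair"],
    ["the", "dog", "sat", "near", "the", "chair"],
]

# The original tool exposes the same knobs as command-line flags
# (-size, -window, -min-count).
model = Word2Vec(sentences, vector_size=50, window=5, min_count=1, epochs=50)

# Every word now has a continuous embedding; related words end up nearby,
# which is what lets the visual net land on categories it has never seen.
print(model.wv["chair"].shape)
print(model.wv.most_similar("chair", topn=3))
```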